
Updated to remove dropping of failing guards and just focus
on the specifics of recording, storing, and learning
circuitbuildtimeout parameters.



svn:r16511

Mike Perry, 16 years ago
Parent
Current commit
5166e5ff55
1 file changed, 79 insertions and 59 deletions

+ 79 - 59
doc/spec/proposals/151-path-selection-improvements.txt

@@ -10,8 +10,8 @@ Overview
 
 
   The performance of paths selected can be improved by adjusting the
   CircuitBuildTimeout and avoiding failing guard nodes. This proposal
-  describes a method of tracking buildtime statistics, and using those
-  statistics to adjust the CircuitBuildTimeout and the number of guards.
+  describes a method of tracking buildtime statistics at the client, and 
+  using those statistics to adjust the CircuitBuildTimeout.
 
 
 Motivation
 
 
@@ -22,71 +22,91 @@ Motivation
 
 
 Implementation
 
 
+  Storing Build Times
+
+    Circuit build times will be stored in the circular array
+    'circuit_build_times' consisting of uint16_t elements as milliseconds.
+    The total size of this array will be based on the number of circuits
+    it takes to converge on a good fit of the long term distribution of
+    the circuit builds for a fixed link. We do not want this value to be
+    too large, because it will make it difficult for clients to adapt to
+    moving between different links.
+
+    From our initial observations, this value appears to be on the order 
+    of 1000, but will be configurable in a #define NCIRCUITS_TO_OBSERVE.
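A minimal sketch of the circular array described above. Only 'circuit_build_times', the uint16_t element type, and NCIRCUITS_TO_OBSERVE come from the proposal; the struct, the helper names, and the choice to saturate values that do not fit in 16 bits (about 65.5 seconds) are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NCIRCUITS_TO_OBSERVE 1000 /* order of magnitude given in the proposal */

/* Hypothetical container; only the array name and element type are from
 * the proposal. */
typedef struct {
  uint16_t circuit_build_times[NCIRCUITS_TO_OBSERVE]; /* milliseconds */
  int next_idx; /* slot that the next sample will overwrite */
  int total;    /* number of valid samples, capped at the array size */
} build_times_t;

static void build_times_init(build_times_t *bt)
{
  assert(bt != NULL);
  bt->next_idx = 0;
  bt->total = 0;
}

/* Record one build time. Values above UINT16_MAX ms are saturated
 * (an assumption; the proposal does not say how to handle them). */
static void build_times_add(build_times_t *bt, uint32_t ms)
{
  assert(bt != NULL);
  if (ms > UINT16_MAX)
    ms = UINT16_MAX;
  bt->circuit_build_times[bt->next_idx] = (uint16_t)ms;
  bt->next_idx = (bt->next_idx + 1) % NCIRCUITS_TO_OBSERVE;
  if (bt->total < NCIRCUITS_TO_OBSERVE)
    bt->total++;
}
```

Once the array is full, each new sample overwrites the oldest one, which is what lets a client adapt after moving to a different link.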
+ 
+  Long Term Storage
+
+    The long-term storage representation will be implemented by storing a 
+    histogram with BUILDTIME_BIN_WIDTH millisecond buckets (default 50) when 
+    writing out the statistics to disk. The format of this histogram on disk 
+    is yet to be finalized, but it will likely be of the format 
+    'CircuitBuildTimeBin <bin> <count>'.
+    Example:
+
+    CircuitBuildTimeBin 1 100
+    CircuitBuildTimeBin 2 50
+    ...
+
+    Reading the histogram in will entail multiplying each bin by the 
+    BUILDTIME_BIN_WIDTH and then inserting <count> values into the 
+    circuit_build_times array each with the value of
+    <bin>*BUILDTIME_BIN_WIDTH.
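The round trip between the in-memory array and the binned form can be sketched as follows. BUILDTIME_BIN_WIDTH is from the proposal; NBINS, the function names, and the convention that <bin> is the floor of the build time divided by the bin width are assumptions, since the on-disk format is not yet final.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BUILDTIME_BIN_WIDTH 50 /* milliseconds, default from the proposal */
/* Hypothetical: enough bins to cover every uint16_t value. */
#define NBINS (UINT16_MAX / BUILDTIME_BIN_WIDTH + 1)

/* Count how many build times fall into each bin. */
static void build_histogram(const uint16_t *times, size_t n,
                            uint32_t counts[NBINS])
{
  assert(times != NULL && counts != NULL);
  memset(counts, 0, NBINS * sizeof(counts[0]));
  for (size_t i = 0; i < n; i++)
    counts[times[i] / BUILDTIME_BIN_WIDTH]++;
}

/* Expand the histogram back into build times: <count> entries per bin,
 * each with the value <bin>*BUILDTIME_BIN_WIDTH. Returns the number of
 * entries written, at most cap. */
static size_t load_histogram(const uint32_t counts[NBINS],
                             uint16_t *times, size_t cap)
{
  size_t n = 0;
  assert(counts != NULL && times != NULL);
  for (size_t bin = 0; bin < NBINS; bin++)
    for (uint32_t c = 0; c < counts[bin] && n < cap; c++)
      times[n++] = (uint16_t)(bin * BUILDTIME_BIN_WIDTH);
  return n;
}
```

The round trip is lossy: under this convention every reloaded value is rounded down to the lower edge of its bin, so a stored 149 ms comes back as 100 ms.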
+
   Learning the CircuitBuildTimeout
 
 
     Based on studies of build times, we found that the distribution of
-    circuit buildtimes appears to be a Pareto distribution. The number
-    of circuits to observe (ncircuits_to_cutoff) before changing the
-    CircuitBuildTimeout will be tunable. From our measurements, 
-    ncircuits_to_cutoff appears to be on the order of 100.
- 
-	In addition, the total number of circuits gathered
-    (ncircuits_to_observe) will also be tunable. It is likely that
-    ncircuits_to_observe will be somewhere on the order of 1000. The values
-    can be represented compactly in Tor in milliseconds as a circular array
-    of 16 bit integers. More compact long-term storage representations can
-    be implemented by simply storing a histogram with 50 millisecond buckets
-    when writing out the statistics to disk.
-
-  Calculating the preferred CircuitBuildTimeout
-
-    Circuits that have longer buildtimes than some x% of the estimated
-    CDF of the Pareto distribution will be excluded. x will be tunable
-    as well.
-
-  Circuit timeouts
-
-    In the event of a timeout, backoff values should include the 100-x%
-    of expected CDF of timeouts.  Also, in the event of network failure,
-    the observation mechanism should stop collecting timeout data.
-
-  Dropping Failed Guards
-
-    In addition, we have noticed that some entry guards are much more
-    failure prone than others. In particular, the circuit failure rates for
-    the fastest entry guards was approximately 20-25%, where as slower
-    guards exhibit failure rates as high as 45-50%. In [1], it was
-    demonstrated that failing guard nodes can deliberately bias path
-    selection to improve their success at capturing traffic. For both these
-    reasons, failing guards should be avoided. 
-    
-    We propose increasing the number of entry guards to five, and gathering
-    circuit failure statistics on each entry guard. Any guards that exceed
-    the average failure rate of all guards by 10% after we have
-    gathered ncircuits_to_observe circuits will be replaced.
-    
+    circuit buildtimes appears to be a Pareto distribution. 
 
 
-Issues
+    We will calculate the parameters for a Pareto distribution
+    fitting the data using the estimators at
+    http://en.wikipedia.org/wiki/Pareto_distribution#Parameter_estimation.
 
 
-  Impact on anonymity
+    The timeout itself will be calculated by solving the CDF for the
+    percentile cutoff BUILDTIME_PERCENT_CUTOFF. This value
+    represents the percentage of paths the Tor client will accept out of
+    the total number of paths. We have not yet determined a good
+    cutoff for this mathematically, but 85% seems a good choice for now.
 
 
-    Since this follows a Pareto distribution, large reductions on the
-    timeout can be achieved without cutting off a great number of the
-    total paths.  However, hard statistics on which cutoff percentage
-    gives optimal performance have not yet been gathered.
+    From http://en.wikipedia.org/wiki/Pareto_distribution#Definition,
+    the CDF is F(x) = 1 - pow(Xm/x, k), so the calculation we need is
+    Xm*pow(1.0 - BUILDTIME_PERCENT_CUTOFF/100.0, -1.0/k).
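As a worked check of the timeout calculation, the standard Pareto definition and its maximum-likelihood estimators can be written out and the CDF inverted at the cutoff. The symbols q, t, and n, and the sample numbers in the last line, are illustrative assumptions and not measurements from the proposal.

```latex
% Pareto CDF with scale x_m and shape k
F(x) = 1 - \left(\frac{x_m}{x}\right)^{k}, \qquad x \ge x_m

% Maximum-likelihood estimators from n observed build times x_1, \dots, x_n
\hat{x}_m = \min_i x_i, \qquad
\hat{k} = \frac{n}{\sum_{i=1}^{n} \left( \ln x_i - \ln \hat{x}_m \right)}

% Solve F(t) = q for the timeout t, with q = BUILDTIME_PERCENT_CUTOFF / 100
1 - \left(\frac{x_m}{t}\right)^{k} = q
\;\Longrightarrow\;
t = x_m \,(1 - q)^{-1/k}

% Hypothetical numbers: x_m = 500 ms, k = 2, q = 0.85
t = \frac{500}{\sqrt{0.15}} \approx 1291 \text{ ms}
```

The result has units of milliseconds and is never smaller than x_m, which is a quick sanity check on any implementation of the formula.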
+
+  When to Begin Calculation
+
+    The number of circuits to observe (NCIRCUITS_TO_CUTOFF) before 
+    changing the CircuitBuildTimeout will be tunable via a #define. From 
+    our measurements, a good value for NCIRCUITS_TO_CUTOFF appears to be 
+    on the order of 100.
+
+  Dealing with Timeouts
+
+    Timeouts should be counted as the expectation of the region
+    of the Pareto distribution beyond the cutoff. The proposal will
+    be updated with this value soon.
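The proposal leaves this value open. As a hedged sketch only, a standard property of the Pareto distribution gives the mean of the region beyond a cutoff c in closed form when k > 1. Whether this is the quantity the authors intend to record for a timed-out circuit is an assumption, and the numbers in the last line are hypothetical.

```latex
% Beyond a cutoff c \ge x_m, the tail is again Pareto with scale c and shape k
P(X > x \mid X > c) = \left(\frac{c}{x}\right)^{k}, \qquad x \ge c

% Its mean is finite only for k > 1
E[X \mid X > c] = \frac{k\,c}{k - 1}, \qquad k > 1

% Hypothetical numbers: k = 2, c = 1291 ms
E[X \mid X > c] = \frac{2 \cdot 1291}{2 - 1} = 2582 \text{ ms}
```

For k <= 1 the expectation diverges, so an implementation using this form would need a fallback in that case.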
+
+    Also, in the event of network failure, the observation mechanism 
+    should stop collecting timeout data.
 
 
-  Guard Turnover
+    Circuits that time out will be destroyed, as this indicates one
+    or more of their respective nodes are currently overloaded.
 
 
-    We contend that the risk from failing guards biasing path selection
-    outweighs the risk of exposure to larger portions of the network
-    for the first hop. Furthermore, from our observations, it appears
-    that circuit failure is strongly correlated to node load. Allowing
-    clients to migrate away from failing guards should naturally
-    rebalance the network, and eventually clients should converge on
-    a stable set of reliable guards. It is also likely that once clients
-    begin to migrate away from failing guards, their load should go
-    down, causing their failure rates to drop as well.
+  Client Hints
 
 
+    Some research still needs to be done to provide initial values
+    for CircuitBuildTimeout based on values learned from modem
+    users, DSL users, Cable Modem users, and dedicated links. A 
+    radio button in Vidalia should eventually be provided that
+    sets CircuitBuildTimeout to one of these values and also
+    provides the option of purging all learned data, should any exist.
 
 
-[1] http://www.crhc.uiuc.edu/~nikita/papers/relmix-ccs07.pdf
+    These values can either be published in the directory, or
+    shipped hardcoded for a particular Tor version.
+    
+Issues
+
+  Impact on anonymity
 
 
+    Since build times follow a Pareto distribution, large reductions in the
+    timeout can be achieved without cutting off a great number of the
+    total paths. This will eliminate a great deal of the performance
+    variation of Tor usage.