151-path-selection-improvements.txt 3.7 KB

1234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556575859606162636465666768697071727374757677787980818283848586878889909192
  1. Filename: 151-path-selection-improvements.txt
  2. Title: Improving Tor Path Selection
  3. Version:
  4. Last-Modified:
  5. Author: Fallon Chen, Mike Perry
  6. Created: 5-Jul-2008
  7. Status: Draft
  8. Overview
  9. The performance of paths selected can be improved by adjusting the
  10. CircuitBuildTimeout and avoiding failing guard nodes. This proposal
  11. describes a method of tracking buildtime statistics, and using those
  12. statistics to adjust the CircuitBuildTimeout and the number of guards.
  13. Motivation
  14. Tor's performance can be improved by excluding those circuits that
  15. have long buildtimes (and by extension, high latency). For those Tor
  16. users who require better performance and have lower requirements for
  17. anonymity, this would be a very useful option to have.
  18. Implementation
  19. Learning the CircuitBuildTimeout
  20. Based on studies of build times, we found that the distribution of
  21. circuit buildtimes appears to be a Pareto distribution. The number
  22. of circuits to observe (ncircuits_to_cutoff) before changing the
  23. CircuitBuildTimeout will be tunable. From out measurements,
  24. ncircuits_to_cuttoff appears to be on the order of 100.
  25. In addition, the total number of circuits gathered
  26. (ncircuits_to_observe) will also be tunable. It is likely that
  27. ncircuits_to_observe will be somewhere on the order of 1000. The values
  28. can be represented compactly in Tor in milliseconds as a circular array
  29. of 16 bit integers. More compact long-term storage representations can
  30. be implemented by simply storing a histogram with 50 millisecond buckets
  31. when writing out the statistics to disk.
  32. Calculating the preferred CircuitBuildTimeout
  33. Circuits that have longer buildtimes than some x% of the estimated
  34. CDF of the Pareto distribution will be excluded. x will be tunable
  35. as well.
  36. Circuit timeouts
  37. In the event of a timeout, backoff values should include the 100-x%
  38. of expected CDF of timeouts. Also, in the event of network failure,
  39. the observation mechanism should stop collecting timeout data.
  40. Dropping Failed Guards
  41. In addition, we have noticed that some entry guards are much more
  42. failure prone than others. In particular, the circuit failure rates for
  43. the fastest entry guards was approximately 20-25%, where as slower
  44. guards exhibit failure rates as high as 45-50%. In [1], it was
  45. demonstrated that failing guard nodes can deliberately bias path
  46. selection to improve their success at capturing traffic. For both these
  47. reasons, failing guards should be avoided.
  48. We propose increasing the number of entry guards to five, and gathering
  49. circuit failure statistics on each entry guard. Any guards that exceed
  50. the average failure rate of all guards by 10% after we have
  51. gathered ncircuits_to_observe circuits will be replaced.
  52. Issues
  53. Impact on anonymity
  54. Since this follows a Pareto distribution, large reductions on the
  55. timeout can be achieved without cutting off a great number of the
  56. total paths. However, hard statistics on which cutoff percentage
  57. gives optimal performance have not yet been gathered.
  58. Guard Turnover
  59. We contend that the risk from failing guards biasing path selection
  60. outweighs the risk of exposure to larger portions of the network
  61. for the first hop. Furthermore, from our observations, it appears
  62. that circuit failure is strongly correlated to node load. Allowing
  63. clients to migrate away from failing guards should naturally
  64. rebalance the network, and eventually clients should converge on
  65. a stable set of reliable guards. It is also likely that once clients
  66. begin to migrate away from failing guards, their load should go
  67. down, causing their failure rates to drop as well.
  68. [1] http://www.crhc.uiuc.edu/~nikita/papers/relmix-ccs07.pdf