161-computing-bandwidth-adjustments.txt 6.6 KB

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165
  1. Title: Computing Bandwidth Adjustments
  2. Filename: 161-computing-bandwidth-adjustments.txt
  3. Author: Mike Perry
  4. Created: 12-May-2009
  5. Target: 0.2.2.x
  6. Status: Open
  7. 1. Motivation
  8. There is high variance in the performance of the Tor network. Despite
  9. our efforts to balance load evenly across the Tor nodes, some nodes are
  10. significantly slower and more overloaded than others.
  11. Proposal 160 describes how we can augment the directory authorities to
  12. vote on measured bandwidths for routers. This proposal describes what
  13. goes into the measuring process.
  14. 2. Measurement Selection
  15. The general idea is to determine a load factor representing the ratio
  16. of the capacity of measured nodes to the rest of the network. This load
  17. factor could be computed from three potentially relevant statistics:
  18. circuit failure rates, circuit extend times, or stream capacity.
  19. Circuit failure rates and circuit extend times appear to be
  20. non-linearly proportional to node load. We've observed that the same
  21. nodes when scanned at US nighttime hours (when load is presumably
  22. lower) exhibit almost no circuit failure, and significantly faster
  23. extend times than when scanned during the day.
  24. Stream capacity, however, is much more uniform, even during US
  25. nighttime hours. Moreover, it is a more intuitive representation of
  26. node capacity, and also less dependent upon distance and latency
  27. if amortized over large stream fetches.
  28. 3. Average Stream Bandwidth Calculation
  29. The average stream bandwidths are obtained by dividing the network into
  30. slices of 50 nodes each, grouped according to advertised node bandwidth.
  31. Two hop circuits are built using nodes from the same slice, and a large
  32. file is downloaded via these circuits. For nodes in the first 15% of the
  33. network, a 500K file will be used. For nodes in the next 15%, a 250K file
  34. will be used. For nodes in next 15%, a 100K file will be used. The
  35. remainder of the nodes will fetch a 75K file.[1]
  36. This process is repeated 250 times, and average stream capacities are
  37. assigned to each node from these results.
  38. In the future, a node generator type can be created to ensure that
  39. each node is chosen to participate in an equal number of circuits,
  40. and the selection will continue until every live node is chosen
  41. to participate in at least 7 circuits.
  42. 4. Ratio Calculation Options
  43. There are two options for deriving the ratios themselves. They can
  44. be obtained by dividing each nodes' average stream capacity by
  45. either the average for the slice, or the average for the network as a
  46. whole.
  47. Dividing by the network-wide average has the advantage that it will
  48. account for issues related to unbalancing between higher vs lower
  49. capacity, such as Steven Murdoch's queuing theory weighting result.
  50. Dividing by the slice average has the advantage that many scans can
  51. be run in parallel from a single authority, and that results are
  52. typically available sooner after a given scan takes place.
  53. 5. Ratio Filtering
  54. After the base ratios are calculated, a second pass is performed
  55. to remove any streams with nodes of ratios less than X=0.5 from
  56. the results of other nodes. In addition, all outlying streams
  57. with capacity of one standard deviation below a node's average
  58. are also removed.
  59. The final ratio result will be calculated as the maximum of
  60. these two resulting ratios if both are less than 1.0, the minimum
  61. if both are greater than 1.0, and the mean if one is greater
  62. and one is less than 1.0.
  63. 6. Pseudocode for Ratio Calculation Algorithm
  64. Here is the complete pseudocode for the ratio algorithm:
  65. Slices = {S | S is 50 nodes of similar consensus capacity}
  66. for S in Slices:
  67. while exists node N in S with circ_chosen(N) < 7:
  68. fetch_slice_file(build_2hop_circuit(N, (exit in S)))
  69. for N in S:
  70. BW_measured(N) = MEAN(b | b is bandwidth of a stream through N)
  71. Bw_stddev(N) = STDDEV(b | b is bandwidth of a stream through N)
  72. Bw_avg(S) = MEAN(b | b = BW_measured(N) for all N in S)
  73. Normal_Routers(S) = {N | Bw_measured(N)/Bw_avg(S) > 0.5 }
  74. for N in S:
  75. Normal_Streams(N) =
  76. {stream via N | all nodes in stream not in {Normal_Routers(S)-N}
  77. and bandwidth > BW_measured(N)-Bw_stddev(N)}
  78. BW_Norm_measured(N) = MEAN(b | b is a bandwidth of Normal_Streams(N))
  79. Bw_net_avg(Slices) = MEAN(BW_measured(N) for all N in Slices)
  80. Bw_Norm_net_avg(Slices) = MEAN(BW_Norm_measured(N) for all N in Slices)
  81. for N in all Slices:
  82. Bw_net_ratio(N) = Bw_measured(N)/Bw_net_avg(Slices)
  83. Bw_Norm_net_ratio(N) = Bw_measured2(N)/Bw_Norm_net_avg(Slices)
  84. if Bw_net_ratio(N) < 1.0 and Bw_Norm_net_ratio(N) < 1.0:
  85. ResultRatio(N) = MAX(Bw_net_ratio(N), Bw_Norm_net_ratio(N))
  86. else if Bw_net_ratio(N) > 1.0 and Bw_Norm_net_ratio(N) > 1.0:
  87. ResultRatio(N) = MIN(Bw_net_ratio(N), Bw_Norm_net_ratio(N))
  88. else:
  89. ResultRatio(N) = MEAN(Bw_net_ratio(N), Bw_Norm_net_ratio(N))
  90. 7. Security implications
  91. The ratio filtering will deal with cases of sabotage by dropping
  92. both very slow outliers in stream average calculations, as well
  93. as dropping streams that used very slow nodes from the calculation
  94. of other nodes.
  95. This scheme will not address nodes that try to game the system by
  96. providing better service to scanners. The scanners can be detected
  97. at the entry by IP address, and at the exit by the destination fetch.
  98. Measures can be taken to obfuscate and separate the scanners' source
  99. IP address from the directory authority IP address. For instance,
  100. scans can happen offsite and the results can be rsynced into the
  101. authorities. The destination fetch can also be obscured by using SSL
  102. and periodically changing the large document that is fetched.
  103. Neither of these methods are foolproof, but such nodes can already
  104. lie about their bandwidth to attract more traffic, so this solution
  105. does not set us back any in that regard.
  106. 8. Integration with Proposal 160
  107. The final results will be produced for the voting mechanism
  108. described in Proposal 160 by multiplying the derived ratio by
  109. the average published consensus bandwidth during the course of the
  110. scan, and taking the weighted average with the previous consensus
  111. bandwidth:
  112. Bw_new = (Bw_current * Alpha + Bw_scan_avg*Bw_ratio)/(Alpha + 1)
  113. The Alpha parameter is a smoothing parameter intended to prevent
  114. rapid oscillation between loaded and unloaded conditions.
  115. This will produce a new bandwidth value that will be output into a
  116. file consisting of lines of the form:
  117. node_id=<idhex> SP bw=<Bw_new> NL
  118. This file can be either copied or rsynced into a directory readable
  119. by the directory authority.