123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165 |
- Filename: 112-bring-back-pathlencoinweight.txt
- Title: Bring Back Pathlen Coin Weight
- Version: $Revision$
- Last-Modified: $Date$
- Author: Mike Perry
- Created:
- Status: Superseded
- Superseded-By: 115
- Overview:
- The idea is that users should be able to choose a weight which
- probabilistically chooses their path lengths to be 2 or 3 hops. This
- weight will essentially be a biased coin that indicates an
- additional hop (beyond 2) with probability P. The user should be
- allowed to choose 0 for this weight to always get 2 hops and 1 to
- always get 3.
- This value should be modifiable from the controller, and should be
- available from Vidalia.
- Motivation:
- The Tor network is slow and overloaded. Increasingly often I hear
- stories about friends and friends of friends who are behind firewalls,
- annoying censorware, or under surveillance that interferes with their
- productivity and Internet usage, or chills their speech. These people
- know about Tor, but they choose to put up with the censorship because
- Tor is too slow to be usable for them. In fact, to download a fresh,
- complete copy of levine-timing.pdf for the Anonymity Implications
- section of this proposal over Tor took me 3 tries.
- There are many ways to improve the speed problem, and of course we
- should and will implement as many as we can. Johannes's GSoC project
- and my reputation system are longer term, higher-effort things that
- will still provide benefit independent of this proposal.
- However, reducing the path length to 2 for those who do not need the
- (questionable) extra anonymity 3 hops provide not only improves
- their Tor experience but also reduces their load on the Tor network by
- 33%, and can be done in less than 10 lines of code. That's not just
- Win-Win, it's Win-Win-Win.
- Furthermore, when blocking resistance measures insert an extra relay
- hop into the equation, 4 hops will certainly be completely unusable
- for these users, especially since it will be considerably more
- difficult to balance the load across a dark relay net than balancing
- the load on Tor itself (which today is still not without its flaws).
- Anonymity Implications:
- It has long been established that timing attacks against mixed
- networks are extremely effective, and that regardless of path
- length, if the adversary has compromised your first and last
- hop of your path, you can assume they have compromised your
- identity for that connection.
- In [1], it is demonstrated that for all but the slowest, lossiest
- networks, error rates for false positives and false negatives were
- very near zero. Only for constant streams of traffic over slow and
- (more importantly) extremely lossy network links did the error rate
- hit 20%. For loss rates typical to the Internet, even the error rate
- for slow nodes with constant traffic streams was 13%.
- When you take into account that most Tor streams are not constant,
- but probably much more like their "HomeIP" dataset, which consists
- mostly of web traffic that exists over finite intervals at specific
- times, error rates drop to fractions of 1%, even for the "worst"
- network nodes.
- Therefore, the user has little benefit from the extra hop, assuming
- the adversary does timing correlation on their nodes. The real
- protection is the probability of getting both the first and last hop,
- and this is constant whether the client chooses 2 hops, 3 hops, or 42.
- Partitioning attacks form another concern. Since Tor uses telescoping
- to build circuits, it is possible to tell a user is constructing only
- two hop paths at the entry node. It is questionable if this data is
- actually worth anything though, especially if the majority of users
- have easy access to this option, and do actually choose their path
- lengths semi-randomly.
- Nick has postulated that exits may also be able to tell that you are
- using only 2 hops by the amount of time between sending their
- RELAY_CONNECTED cell and the first bit of RELAY_DATA traffic they
- see from the OP. I doubt that they will be able to make much use
- of this timing pattern, since it will likely vary widely depending
- upon the type of node selected for that first hop, and the user's
- connection rate to that first hop. It is also questionable if this
- data is worth anything, especially if many users are using this
- option (and I imagine many will).
- Perhaps most seriously, two hop paths do allow malicious guards
- to easily fail circuits if they do not extend to their colluding peers
- for the exit hop. Since guards can detect the number of hops in a
- path, they could always fail the 3 hop circuits and focus on
- selectively failing the two hop ones until a peer was chosen.
- I believe currently guards are rotated if circuits fail, which does
- provide some protection, but this could be changed so that an entry
- guard is completely abandoned after a certain ratio of extend or
- general circuit failures with respect to non-failed circuits. This
- could possibly be gamed to increase guard turnover, but such a game
- would be much more noticeable than an individual guard failing circuits,
- though, since it would affect all clients, not just those who chose
- a particular guard.
- Why not fix Pathlen=2?:
- The main reason I am not advocating that we always use 2 hops is that
- in some situations, timing correlation evidence by itself may not be
- considered as solid and convincing as an actual, uninterrupted, fully
- traced path. Are these timing attacks as effective on a real network
- as they are in simulation? Would an extralegal adversary or authoritarian
- government even care? In the face of these situation-dependent unknowns,
- it should be up to the user to decide if this is a concern for them or not.
- It should probably also be noted that even a false positive
- rate of 1% for a 200k concurrent-user network could mean that for a
- given node, a given stream could be confused with something like 10
- users, assuming ~200 nodes carry most of the traffic (ie 1000 users
- each). Though of course to really know for sure, someone needs to do
- an attack on a real network, unfortunately.
- Implementation:
- new_route_len() can be modified directly with a check of the
- PathlenCoinWeight option (converted to percent) and a call to
- crypto_rand_int(0,100) for the weighted coin.
- The entry_guard_t structure could have num_circ_failed and
- num_circ_succeeded members such that if it exceeds N% circuit
- extend failure rate to a second hop, it is removed from the entry list.
- N should be sufficiently high to avoid churn from normal Tor circuit
- failure as determined by TorFlow scans.
- The Vidalia option should be presented as a boolean, to minimize confusion
- for the user. Something like a radiobutton with:
-
- * "I use Tor for Censorship Resistance, not Anonymity. Speed is more
- important to me than Anonymity."
- * "I use Tor for Anonymity. I need extra protection at the cost of speed."
-
- and then some explanation in the help for exactly what this means, and
- the risks involved with eliminating the adversary's need for timing attacks
- wrt to false positives, etc.
- Migration:
- Phase one: Experiment with the proper ratio of circuit failures
- used to expire garbage or malicious guards via TorFlow.
- Phase two: Re-enable config and modify new_route_len() to add an
- extra hop if coin comes up "heads".
- Phase three: Make radiobutton in Vidalia, along with help entry
- that explains in layman's terms the risks involved.
- [1] http://www.cs.umass.edu/~mwright/papers/levine-timing.pdf
|