Filename: 115-two-hop-paths.txt
Title: Two Hop Paths
Version: $Revision$
Last-Modified: $Date$
Author: Mike Perry
Created:
Status: Open
Supersedes: 112
 
Overview:

  The idea is that users should be able to choose whether they would like
  to have two or three hop paths through the Tor network.

  Let us be clear: the users who would choose this option should be
  those who are concerned with IP obfuscation only, i.e. they would not be
  targets of a resource-intensive multi-node attack. It is sometimes said
  that these users should find some network other than Tor to use.
  This is a foolish suggestion: more users improve security for everyone,
  and the current small userbase size is a critical hindrance to
  anonymity, as is discussed below and in [1].

  This value should be modifiable from the controller, and should be
  available from Vidalia.
 
Motivation:

  The Tor network is slow and overloaded. Increasingly often I hear
  stories about friends and friends of friends who are behind firewalls,
  annoying censorware, or under surveillance that interferes with their
  productivity and Internet usage, or chills their speech. These people
  know about Tor, but they choose to put up with the censorship because
  Tor is too slow to be usable for them. In fact, downloading a fresh,
  complete copy of levine-timing.pdf for the Theoretical Argument
  section of this proposal over Tor took me three tries.

  Furthermore, the biggest current problem with Tor's anonymity for
  those who really need it is not someone attacking the network to
  discover who they are. It is instead the extreme danger that so few
  people use Tor, because it is so slow, that those who do use it have
  essentially no confusion set.

  The recent case where the professor and the rogue Tor user were the
  only Tor users on campus, and thus both came under suspicion in an
  incident involving Tor at that university, underscores this point:
  "That was why the police had come to see me. They told me that only
  two people on our campus were using Tor: me and someone they suspected
  of engaging in an online scam. The detectives wanted to know whether
  the other user was a former student of mine, and why I was using
  Tor"[1].

  Not only does Tor provide no anonymity if you use it to be anonymous
  while being obviously from a certain institution, location, or
  circumstance; it is also dangerous to use, because you risk being
  accused of having something significant enough to hide that you are
  willing to put up with the horrible performance rather than use some
  weaker alternative.

  There are many ways to improve the speed problem, and of course we
  should and will implement as many as we can. Johannes's GSoC project
  and my reputation system are longer-term, higher-effort things that
  will still provide benefit independent of this proposal.

  However, reducing the path length to 2 for those who do not need the
  extra anonymity 3 hops provide not only improves their Tor experience
  but also reduces their load on the Tor network by 33%, and should
  increase adoption of Tor by a good deal. That's not just Win-Win, it's
  Win-Win-Win.
 
Who will enable this option?

  This is the crux of the proposal. Admittedly, there is some anonymity
  loss, and some decrease in the investment required on the part of
  the adversary to attack 2 hop users versus 3 hop users, even if it is
  minimal and limited mostly to up-front costs and false positives.

  The key questions are:

  1. Are these users in a class such that their risk is significantly
     less than the amount of this anonymity loss?

  2. Are these users able to identify themselves?

  Many, many users of Tor are not at risk from an adversary capturing c/n
  nodes of the network just to see what they do. These users use Tor to
  circumvent aggressive content filters, or simply to keep their IP out of
  marketing and search engine databases. Most content filters have no
  interest in running Tor nodes to catch violators, and marketers
  certainly would never consider such a thing, both on a cost basis and a
  legal one.

  In a sense, this represents an alternate threat model for these
  users, who are not at risk under Tor's normal threat model.

  It should be evident to these users that they fall into this class. All
  that should be needed is a radio button:

   * "I use Tor for local content filter circumvention and/or IP
      obfuscation, not anonymity. Speed is more important to me than
      high anonymity. No one will make considerable efforts to determine
      my real IP."

   * "I use Tor for anonymity and/or to circumvent national-level,
      legally enforced censorship. It is possible effort will be taken
      to identify me, including but not limited to network surveillance.
      I need more protection."

  and then some explanation in the help for exactly what this means, and
  the risks involved in eliminating the adversary's need for timing
  attacks, with respect to false positives. Ultimately, however, the
  decision is a simple one that can be made without this information. The
  user does not need Paul Syverson to instruct them on the deep magic of
  Onion Routing to make this decision. They just need to know why they use
  Tor. If they use it just to stay out of marketing databases and/or to
  bypass a local content filter, two hops is plenty. This is likely the
  vast majority of Tor users, and many non-users we would like to bring on
  board.

  So, having established this class of users, let us now go on to
  examine the theoretical and practical risks we place them at, and
  determine whether these risks violate the users' needs, or introduce
  additional risk to node operators who may be subject to requests from
  law enforcement to track users who need 3 hops, but use 2 because they
  enjoy the thrill of Russian roulette.
 
Theoretical Argument:

  It has long been established that timing attacks against mixed
  and onion networks are extremely effective, and that regardless
  of path length, if the adversary has compromised the first and
  last hops of your path, you can assume they have compromised your
  identity for that connection.

  In fact, it was demonstrated that for all but the slowest, lossiest
  networks, error rates for false positives and false negatives were
  very near zero[2]. Only for constant streams of traffic over slow and
  (more importantly) extremely lossy network links did the error rate
  hit 20%. For loss rates typical of the Internet, even the error rate
  for slow nodes with constant traffic streams was 13%.

  When you take into account that most Tor streams are not constant,
  but probably much more like their "HomeIP" dataset, which consists
  mostly of web traffic that exists over finite intervals at specific
  times, error rates drop to fractions of 1%, even for the "worst"
  network nodes.

  Therefore, the user gains little benefit from the extra hop, assuming
  the adversary performs timing correlation on their nodes. Since timing
  correlation is simply an implementation issue and is most likely
  a single up-front cost (and one that is likely quite a bit cheaper
  than the cost of the machines purchased to host the nodes to mount
  an attack), the real protection is the low probability of getting
  both the first and last hop of a client's stream.
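
  To make that last point concrete (this is my own back-of-the-envelope
  framing, not a result from [2]): if an adversary controls a fraction
  c/n of the entry and exit bandwidth, and timing correlation is assumed
  to work with negligible error, then the chance that a given circuit is
  deanonymized is roughly the chance of owning both ends:

      P(first and last hop compromised) ~= (c/n) * (c/n) = (c/n)^2

  The middle hop of a 3 hop path does not change this figure; it only
  helps when timing correlation itself is unreliable.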
 
Practical Issues:

  Theoretical issues aside, there are several practical issues with the
  implementation of Tor that need to be addressed to ensure that
  identity information is not leaked by the implementation.

  Exit policy issues:

  If a client chooses an exit with a very restrictive exit policy
  (such as an IP or IP range), the first hop then knows a good deal
  about the destination. For this reason, clients should not select
  exits that match their destination IP with anything other than "*".
 
  Partitioning:

  Partitioning attacks form another concern. Since Tor uses telescoping
  to build circuits, it is possible to tell that a user is constructing
  only two hop paths at the entry node and on the local network. An
  external adversary can potentially differentiate 2 and 3 hop users, and
  decide that all IP addresses connecting to Tor and using 3 hops have
  something to hide, and should be scrutinized more closely or outright
  apprehended.

  One solution to this is to use the "leaky-circuit" method of attaching
  streams: the user always creates 3-hop circuits, but if the option
  is enabled, they always exit from their 2nd hop. The ideal solution
  would be to create a RELAY_SHISHKABOB cell which contains onion
  skins for every host along the path, but this would require protocol
  changes at the nodes.
 
  Guard nodes:

  Since guard nodes can rotate due to client relocation, network
  failure, node upgrades, and other issues, if you amortize the risk a
  mobile, dialup, or otherwise intermittently connected user is exposed to
  over any reasonable duration of Tor usage (on the order of a year), it
  is the same with or without guard nodes. Assuming an adversary controls
  a fraction c/n of the network bandwidth, and guards rotate on average
  with period R, then statistically speaking it is merely a question of
  whether the user wishes their risk to be concentrated with probability
  c/n over an expected period of R*c, and probability 0 over an expected
  period of R*(n-c), versus a continuous risk of (c/n)^2. So statistically
  speaking, guards only create a time-tradeoff of risk over the long run
  for normal Tor usage. Rotating guards do not reduce risk for normal
  client usage long term.[3]
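
  A back-of-the-envelope version of that time-tradeoff (my own sketch,
  not the proof referenced in [3]): with a fraction c/n of guard
  bandwidth compromised,

      fraction of time spent behind a compromised guard  ~= c/n
      per-circuit risk while behind a compromised guard  ~= c/n  (the exit)
      per-circuit risk while behind an honest guard       = 0

      long-run average risk ~= (c/n)*(c/n) + (1 - c/n)*0 = (c/n)^2

  which is the same continuous (c/n)^2 risk a guardless client faces;
  guards only change when that risk is incurred, not how much of it
  accumulates over the long run.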
 
  On the other hand, assuming a more stable method of guard selection
  and preservation is devised, or a more stable client-side network than
  my own is typical (mine rotates guards frequently due to network issues
  and moving about), guard nodes provide a tradeoff in the form of c/n of
  the users being "sacrificial users" who are exposed to a high risk
  O(c/n) of identification, while the rest of the network is exposed to
  zero risk.
 
  The nature of Tor makes it likely an adversary will take a "shock and
  awe" approach to suppressing Tor by rounding up a few users whose
  browsing activity has been observed and making examples of them, in an
  attempt to prove that Tor is not perfect.

  Since this "shock and awe" attack can be applied with or without guard
  nodes, stable guard nodes do offer a measure of accountability of sorts.
  If a user was using a small set of guard nodes and knows them well, and
  then is suddenly apprehended as a result of Tor usage, having a fixed
  set of entry points to suspect is a lot better than suspecting the whole
  network. Conversely, it can also give non-apprehended users comfort
  that they are likely to remain safe indefinitely with their set of (now
  presumably trusted) guards. This is probably the most beneficial
  property of reliable guards: they deter the adversary from mounting
  "shock and awe" attacks because the surviving users will not be
  intimidated, but instead made more confident. Of course, guards need to
  be made much more stable and users need to be encouraged to know their
  guards for this property to really take effect.
 
  This beneficial property of client vigilance also carries over to an
  active adversary, except in this case, instead of relying on the user
  to remember their guard nodes and somehow communicate them after
  apprehension, the code can alert them to the presence of an active
  adversary before they are apprehended. But only if they use guard nodes.
 
  So let's consider the active adversary: two hop paths allow malicious
  guards to gain considerably more benefit from failing circuits if they
  do not extend to their colluding peers for the exit hop. Since guards
  can detect the number of hops in a path via either timing or statistical
  analysis of the exit policy of the 2nd hop, they can perform this attack
  predominantly against 2 hop users.

  This can be addressed by completely abandoning an entry guard after a
  certain ratio of extend or general circuit failures with respect to
  non-failed circuits. The proper value for this ratio can be determined
  experimentally with TorFlow. There is the possibility that the local
  network can abuse this feature to cause certain guards to be dropped,
  but it can do that anyway with the current Tor by just making guards
  it doesn't like unreachable. With this mechanism, Tor will complain
  loudly if any guard's failure rate exceeds the expected rate for any
  failure case, local or remote.
 
  Eliminating guards entirely would not actually address this issue, due
  to the time-tradeoff nature of the risk. In fact, it would just make it
  worse. Without guard nodes, it becomes much more difficult for clients
  to become alerted to Tor entry points that fail circuits so as to
  devote their bandwidth only to traffic for streams where they can
  observe both ends. Yet the rogue entry points are still able to
  significantly increase their success rates by failing circuits.

  For this reason, guard nodes should remain enabled for 2 hop users,
  at least until an IP-independent, undetectable guard scanner can
  be created. TorFlow can scan for failing guards, but after a while,
  its unique behavior gives away the fact that its IP is a scanner, and
  it can be given selective service.
 
  Consideration of risks for node operators:

  There is a serious risk for two hop users in the form of guard
  profiling. If an adversary running an exit node notices that a
  particular site is always visited from a fixed previous hop, it is
  likely that this is a two hop user using a certain guard, which could be
  monitored to determine their identity. Thus, for the protection of both
  2 hop users and node operators, 2 hop users should limit their guard
  duration to a number of days sufficient to verify the reliability of a
  node, but not much more. This duration can be determined experimentally
  by TorFlow.
 
  Considering that a Tor client builds on average 144 circuits/day (10
  minutes per circuit), if the adversary owns c/n% of the exits on the
  network, they can expect to see 144*c/n circuits from this user, or
  about 14 minutes of usage per day per percentage of network penetration.
  Since it will take several occurrences of user-linkable exit content
  from the same predecessor hop for the adversary to have any confidence
  that this is a 2 hop user, it is very unlikely that any sort of demands
  made upon the predecessor node would be guaranteed to be effective (i.e.
  that it actually was a guard), let alone be executed in time to
  apprehend the user before they rotated guards.
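
  For concreteness, the arithmetic behind the 14 minute figure (assuming
  10-minute circuit lifetimes and uniform exit selection):

      circuits per day            = 24*60 / 10        = 144
      circuits seen by adversary  = 144 * (c/n)
      minutes seen by adversary   = 10 * 144 * (c/n)  = 1440 * (c/n)
                                 ~= 14.4 minutes per day per 1% of
                                    the exits owned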
 
  The reverse risk also warrants consideration. If a malicious guard has
  orders to surveil Mike Perry, it can determine that Mike Perry is using
  two hops by observing his tendency to choose a 2nd hop with a viable
  exit policy. This can be done relatively quickly, unfortunately, and
  indicates that Mike Perry should spend some of his time building real 3
  hop circuits through the same guards, to require them to at least wait
  for him to actually use Tor to determine his style of operation, rather
  than collect this information from his passive building patterns.

  However, to actively determine where Mike Perry is going, the guard
  will need to arrange logging ahead of time at the multiple exit nodes
  he may use over the course of the few days while he is at that guard,
  and correlate the usage times of the exit nodes with Mike Perry's
  activity at that guard for the few days he uses it. At this point, the
  adversary is mounting a scale and method of attack (widespread logging,
  timing attacks) that works pretty much just as effectively against 3
  hops, so exit node operators are exposed to no more danger than they
  normally are.
 
Why not fix Pathlen=2?:

  The main reason I am not advocating that we always use 2 hops is that
  in some situations, timing correlation evidence by itself may not be
  considered as solid and convincing as an actual, uninterrupted, fully
  traced path. Are these timing attacks as effective on a real network as
  they are in simulation? Maybe the circuit multiplexing of Tor can serve
  to frustrate them to a degree? Would an extralegal adversary or
  authoritarian government even care? In the face of these
  situation-dependent unknowns, it should be up to the user to decide
  whether this is a concern for them or not.
 
  It should probably also be noted that even a false positive
  rate of 1% for a 200k concurrent-user network could mean that, for a
  given node, a given stream could be confused with something like 10
  users, assuming ~200 nodes carry most of the traffic (i.e. 1000 users
  each). Though of course to really know for sure, someone needs to do
  an attack on a real network, unfortunately.
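
  The rough arithmetic behind that estimate, as I read the numbers above
  (an illustration, not a measurement):

      200,000 concurrent users / ~200 busy nodes  ~= 1,000 users per node
      1% false positive rate * 1,000 users        ~= 10 plausible
                                                      candidates per stream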
 
  Additionally, at some point cover traffic schemes may be implemented to
  frustrate timing attacks on the first hop. It is possible some expert
  users may do this ad hoc already, and may wish to continue using 3 hops
  for this reason.
 
Implementation:

  new_route_len() can be modified directly with a check of the
  Pathlen option. However, circuit construction logic should be
  altered so that both 2 hop and 3 hop users build the same types of
  circuits, and the option should ultimately govern circuit selection,
  not construction. This improves coverage against guard nodes being
  able to passively profile clients that are building circuits but not
  actually using Tor at the time. PathlenCoinWeight, anyone? :)
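
  A minimal sketch of what selection-time control could look like (the
  type, option name, and helper below are illustrative stand-ins of my
  own, not existing Tor code; the real change would live around
  new_route_len() and the circuit selection logic):

      /* Everyone *builds* 3-hop circuits; the coin weight only biases
       * the effective length chosen when a circuit is put to use.
       * A pure 2-hop user sets the weight to 0.0, a 3-hop user to 1.0. */
      #include <stdlib.h>

      typedef struct {
        double pathlen_coin_weight;  /* hypothetical torrc option */
      } path_options_t;

      static int
      effective_path_len(const path_options_t *opt)
      {
        double coin = (double)rand() / ((double)RAND_MAX + 1.0);
        return (coin < opt->pathlen_coin_weight) ? 3 : 2;
      }

  A weight strictly between 0 and 1 would let a user probabilistically
  mix 2 and 3 hop usage, which also blurs the passive profiling signature
  discussed above.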
 
  The exit policy hack is a bit trickier. compare_addr_to_addr_policy
  needs to return an alternate ADDR_POLICY_ACCEPTED_WILDCARD or
  ADDR_POLICY_ACCEPTED_SPECIFIC return value for use in
  circuit_is_acceptable.
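
  A simplified, self-contained sketch of the distinction (the rule struct
  and helper below are stand-ins made up for illustration; the real
  change would be inside compare_addr_to_addr_policy and its callers):

      #include <stdint.h>

      typedef enum {
        ADDR_POLICY_REJECTED,
        ADDR_POLICY_ACCEPTED_WILDCARD,  /* matched an "accept *:port" rule */
        ADDR_POLICY_ACCEPTED_SPECIFIC   /* matched a narrow IP/range rule  */
      } addr_policy_result_sketch_t;

      typedef struct {
        int accept;           /* 1 = accept, 0 = reject                  */
        uint32_t addr, mask;  /* mask == 0 means "*" (match any address) */
        uint16_t port;        /* 0 means any port                        */
      } policy_rule_sketch_t;

      static addr_policy_result_sketch_t
      check_exit_policy(const policy_rule_sketch_t *rules, int n_rules,
                        uint32_t addr, uint16_t port)
      {
        for (int i = 0; i < n_rules; i++) {
          const policy_rule_sketch_t *r = &rules[i];
          if ((addr & r->mask) != (r->addr & r->mask))
            continue;
          if (r->port && r->port != port)
            continue;
          if (!r->accept)
            return ADDR_POLICY_REJECTED;
          return r->mask ? ADDR_POLICY_ACCEPTED_SPECIFIC
                         : ADDR_POLICY_ACCEPTED_WILDCARD;
        }
        return ADDR_POLICY_REJECTED;  /* default-reject for the sketch */
      }

  A 2 hop client would then refuse to attach streams to circuits whose
  match was the SPECIFIC variant, per the exit policy rule above.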
 
  The leaky exit is trickier still. handle_control_attachstream
  does allow paths to exit at a given hop. Presumably something similar
  can be done in connection_ap_handshake_process_socks, and elsewhere?
  Circuit construction would also have to be performed such that the
  2nd hop's exit policy is what is considered, not the 3rd's.
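
  A sketch of the attach-time decision (again with made-up standalone
  types; the real logic would sit where streams are attached to circuits,
  e.g. near connection_ap_handshake_process_socks):

      typedef struct {
        int two_hop_paths;  /* hypothetical option from this proposal */
      } leaky_options_t;

      /* Return the hop a new stream should exit from on an n-hop
       * circuit.  With the option set, a fully built 3-hop circuit is
       * used "leakily" by exiting at its 2nd hop, so construction looks
       * identical for 2 and 3 hop users. */
      static int
      chosen_exit_hop(const leaky_options_t *opt, int circ_len)
      {
        if (opt->two_hop_paths && circ_len >= 3)
          return 2;
        return circ_len;
      }

  For this to be useful, the circuit must have been built so that the
  2nd hop's exit policy covers the destination, as noted above.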
 
  The entry_guard_t structure could have num_circ_failed and
  num_circ_succeeded members, such that if a guard exceeds an F% circuit
  extend failure rate to the second hop, it is removed from the entry
  list. F should be sufficiently high to avoid churn from normal Tor
  circuit failure, as determined by TorFlow scans.
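
  A sketch of that check (the struct below is a standalone stand-in for
  the two proposed entry_guard_t members, and the threshold and minimum
  sample size are placeholders to be measured with TorFlow):

      typedef struct {
        unsigned num_circ_failed;     /* extends to a 2nd hop that failed    */
        unsigned num_circ_succeeded;  /* extends to a 2nd hop that succeeded */
      } guard_stats_t;

      /* Return 1 if this guard's extend-failure ratio exceeds the
       * threshold F (e.g. 0.30) over at least min_total attempts and it
       * should be dropped from the entry list with a loud warning. */
      static int
      guard_should_be_dropped(const guard_stats_t *g, double F,
                              unsigned min_total)
      {
        unsigned total = g->num_circ_failed + g->num_circ_succeeded;
        if (total < min_total)
          return 0;  /* too little data; avoid churn from normal failures */
        return (double)g->num_circ_failed / (double)total > F;
      }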
 
  The Vidalia option should be presented as a radio button.
 
Migration:

  Phase 1: Adjust exit policy checks when Pathlen is set, implement the
  leaky circuit ability, and add the 2-3 hop circuit selection logic
  governed by Pathlen.

  Phase 2: Experiment to determine the proper ratio of circuit
  failures used to expire garbage or malicious guards via TorFlow
  (pending Bug #440 backport+adoption).

  Phase 3: Implement guard expiration code to kick off failure-prone
  guards and warn the user. Cap 2 hop guard duration to a number
  of days determined sufficient to establish guard reliability (to be
  determined by TorFlow).

  Phase 4: Add the radio button to Vidalia, along with a help entry
  that explains the risks involved in layman's terms.

  Phase 5: Allow users to specify path length by HTTP URL suffix.
 
[1] http://p2pnet.net/story/11279
[2] http://www.cs.umass.edu/~mwright/papers/levine-timing.pdf
[3] Proof available upon request ;)
 
 