|
@@ -1351,11 +1351,6 @@ acknowledge his existence.
|
|
|
\Section{Attacks and Defenses}
|
|
|
\label{sec:attacks}
|
|
|
|
|
|
-% XXX In sec9 we should talk about bandwidth classes, which will
|
|
|
-% enable us to accept a lot more ORs than if we continue to
|
|
|
-% require 10mbit connections for all ORs. -RD
|
|
|
-
|
|
|
-
|
|
|
Below we summarize a variety of attacks, and discuss how well our
|
|
|
design withstands them.
|
|
|
|
|
@@ -1647,236 +1642,157 @@ by the session key shared by the client and server.
|
|
|
\Section{Open Questions in Low-latency Anonymity}
|
|
|
\label{sec:maintaining-anonymity}
|
|
|
|
|
|
-% There must be a better intro than this! -NM
|
|
|
In addition to the open problems discussed in
|
|
|
-Section~\ref{subsec:non-goals}, many other questions remain to be
|
|
|
-solved by future research before we can be confident of our security.
|
|
|
-
|
|
|
-Many of these open issues are questions of balance. For example,
|
|
|
-how often should users rotate to fresh circuits? Too-frequent
|
|
|
-rotation is inefficient, expensive, and may lead to intersection attacks
|
|
|
-and predecessor attacks \cite{wright03},
|
|
|
-but too-infrequent rotation
|
|
|
-makes the user's traffic linkable. Along with opening a fresh
|
|
|
-circuit, clients can also limit linkability by exiting from a middle point
|
|
|
-of the circuit, or by truncating and re-extending the circuit; but
|
|
|
-more analysis is needed to determine the proper trade-off.
|
|
|
-
|
|
|
-A similar question surrounds timing of directory operations:
|
|
|
-how often should directories be updated? With too-infrequent
|
|
|
-updates clients receive an inaccurate picture of the network; with
|
|
|
-too-frequent updates the directory servers are overloaded.
|
|
|
-
|
|
|
-%do different exit policies at different exit nodes trash anonymity sets,
|
|
|
-%or not mess with them much?
|
|
|
-%% Why would they? By routing traffic to certain nodes preferentially?
|
|
|
-
|
|
|
-%[XXX Choosing paths and path lengths: I'm not writing this bit till
|
|
|
-% Arma's pathselection stuff is in. -NM]
|
|
|
-%Alice always chooses her path to contain at least
|
|
|
-%three nodes unrelated to herself and her destination, choosing the
|
|
|
-%number of nodes beyond the third from a geometric distribution;
|
|
|
-%explain why. -NM
|
|
|
-
|
|
|
-%%%% Roger said that he'd put a path selection paragraph into section
|
|
|
-%%%% 4 that would replace this.
|
|
|
-%
|
|
|
-%I probably should have noted that this means loops will be on at least
|
|
|
-%five hop routes, which should be rare given the distribution. I'm
|
|
|
-%realizing that this is reproducing some of the thought that led to a
|
|
|
-%default of five hops in the original onion routing design. There were
|
|
|
-%some different assumptions, which I won't spell out now. Note that
|
|
|
-%enclave level protections really change these assumptions. If most
|
|
|
-%circuits are just two hops, then just a single link observer will be
|
|
|
-%able to tell that two enclaves are communicating with high probability.
|
|
|
-%So, it would seem that enclaves should have a four node minimum circuit
|
|
|
-%to prevent trivial circuit insider identification of the whole circuit,
|
|
|
-%and three hop minimum for circuits from an enclave to some nonclave
|
|
|
-%responder. But then... we would have to make everyone obey these rules
|
|
|
-%or a node that through timing inferred it was on a four hop circuit
|
|
|
-%would know that it was probably carrying enclave to enclave traffic.
|
|
|
-%Which... if there were even a moderate number of bad nodes in the
|
|
|
-%network would make it advantageous to break the connection to conduct
|
|
|
-%a reformation intersection attack. Ahhh! I gotta stop thinking
|
|
|
-%about this and work on the paper some before the family wakes up.
|
|
|
-%On Sat, Oct 25, 2003 at 06:57:12AM -0400, Paul Syverson wrote:
|
|
|
-%> Which... if there were even a moderate number of bad nodes in the
|
|
|
-%> network would make it advantageous to break the connection to conduct
|
|
|
-%> a reformation intersection attack. Ahhh! I gotta stop thinking
|
|
|
-%> about this and work on the paper some before the family wakes up.
|
|
|
-%This is the sort of issue that should go in the 'maintaining anonymity
|
|
|
-%with tor' section towards the end. :)
|
|
|
-%Email from between roger and me to beginning of section above. Fix and move.
|
|
|
+Section~\ref{subsec:non-goals}, many other questions must be solved
|
|
|
+before we can be confident of Tor's security.
|
|
|
+
|
|
|
+Many of these open issues are questions of balance. For example,
|
|
|
+how often should users rotate to fresh circuits? Frequent rotation
|
|
|
+is inefficient, expensive, and may lead to intersection attacks and
|
|
|
+predecessor attacks \cite{wright03}, but infrequent rotation makes the
|
|
|
+user's traffic linkable. Along with opening a fresh circuit, clients can
|
|
|
+also limit linkability by exiting from a middle point of the circuit,
|
|
|
+or by truncating and re-extending the circuit; but more analysis is
|
|
|
+needed to determine the proper trade-off.
|
|
|
+
|
|
|
+A similar question surrounds timing of directory operations: how often
|
|
|
+should directories be updated? Clients that update infrequently receive
|
|
|
+an inaccurate picture of the network, but frequent updates can overload
|
|
|
+the directory servers. More generally, we must find more
|
|
|
+decentralized yet practical ways to distribute up-to-date snapshots of
|
|
|
+network status without introducing new attacks.
|
|
|
+
|
|
|
+How should we choose path lengths? If she uses only two hops, then both
|
|
|
+these nodes are certain that by colluding they will learn about Alice
|
|
|
+and Bob. Our current approach is that Alice always chooses at least three
|
|
|
+nodes unrelated to herself and her destination. Thus normally she chooses
|
|
|
+three nodes, but if she is running an OR and her destination is on an OR,
|
|
|
+she uses five. Should Alice choose a nondeterministic path length (say,
|
|
|
+increasing it from a geometric distribution), to foil an attacker who
|
|
|
+uses timing to learn that he is the fifth hop and thus concludes that
|
|
|
+both Alice and the responder are on ORs?
|
|
|
|
|
|
Throughout this paper, we have assumed that end-to-end traffic
|
|
|
confirmation will immediately and automatically defeat a low-latency
|
|
|
-anonymity system. Even high-latency anonymity
|
|
|
-systems can be vulnerable to end-to-end traffic confirmation, if the
|
|
|
-traffic volumes are high enough, and if users' habits are sufficiently
|
|
|
-distinct \cite{limits-open,statistical-disclosure}. \emph{Can
|
|
|
- anything be done to make low-latency systems resist these attacks as
|
|
|
- well as high-latency systems?}
|
|
|
-Tor already makes some effort to conceal the starts and
|
|
|
-ends of streams by wrapping all long-range control commands in
|
|
|
-identical-looking relay cells, but more analysis is needed. Link
|
|
|
-padding could frustrate passive observers who count packets; long-range
|
|
|
-padding could work against observers who own the first hop in a
|
|
|
-circuit. But more research needs to be done in order to find an
|
|
|
-efficient and practical approach. Volunteers prefer not to run
|
|
|
-constant-bandwidth padding; but more sophisticated traffic shaping
|
|
|
-approaches remain somewhat unanalyzed.
|
|
|
-%[XXX is this so?]
|
|
|
-Recent work
|
|
|
-on long-range padding \cite{defensive-dropping} shows promise. One
|
|
|
-could also try to reduce correlation in packet timing by batching and
|
|
|
-re-ordering packets, but it is unclear whether this could improve
|
|
|
-anonymity without introducing so much latency as to render the
|
|
|
+anonymity system. Even high-latency anonymity systems can be
|
|
|
+vulnerable to end-to-end traffic confirmation, if the traffic volumes
|
|
|
+are high enough, and if users' habits are sufficiently distinct
|
|
|
+\cite{limits-open,statistical-disclosure}. Can anything be done to
|
|
|
+make low-latency systems resist these attacks as well as high-latency
|
|
|
+systems? Tor already makes some effort to conceal the starts and ends of
|
|
|
+streams by wrapping all long-range control commands in identical-looking
|
|
|
+relay cells. Link padding could frustrate passive observers who count
|
|
|
+packets; long-range padding could work against observers who own the
|
|
|
+first hop in a circuit. But more research remains to find an efficient
|
|
|
+and practical approach. Volunteers prefer not to run constant-bandwidth
|
|
|
+padding; but no convincing traffic shaping approach has ever been
|
|
|
+specified. Recent work on long-range padding \cite{defensive-dropping}
|
|
|
+shows promise. One could also try to reduce correlation in packet timing
|
|
|
+by batching and re-ordering packets, but it is unclear whether this could
|
|
|
+improve anonymity without introducing so much latency as to render the
|
|
|
network unusable.
|
|
|
|
|
|
-Even if passive timing attacks were wholly solved, active timing
|
|
|
-attacks would remain. \emph{What can
|
|
|
- be done to address attackers who can introduce timing patterns into
|
|
|
- a user's traffic?} % [XXX mention likely approaches]
|
|
|
-
|
|
|
-%%% I think we cover this by framing the problem as ``Can we make
|
|
|
-%%% end-to-end characteristics of low-latency systems as good as
|
|
|
-%%% those of high-latency systems?'' Eliminating long-term
|
|
|
-%%% intersection is a hard problem.
|
|
|
-%
|
|
|
-%Even regardless of link padding from Alice to the cloud, there will be
|
|
|
-%times when Alice is simply not online. Link padding, at the edges or
|
|
|
-%inside the cloud, does not help for this.
|
|
|
-
|
|
|
-In order to scale to many users, and to prevent an
|
|
|
-attacker from observing the whole network at once, it may be necessary
|
|
|
-for low-latency anonymity systems to support far more servers than Tor
|
|
|
-currently anticipates. This introduces several issues. First, if
|
|
|
-approval by a centralized set of directory servers is no longer
|
|
|
-feasible, what mechanism should be used to prevent adversaries from
|
|
|
-signing up many spurious servers?
|
|
|
-Second, if clients can no longer have a complete
|
|
|
-picture of the network at all times, how can they perform
|
|
|
-discovery while preventing attackers from manipulating or exploiting
|
|
|
-gaps in client knowledge? Third, if there are too many servers
|
|
|
-for every server to constantly communicate with every other, what kind
|
|
|
-of non-clique topology should the network use? Restricted-route
|
|
|
-topologies promise comparable anonymity with better scalability
|
|
|
-\cite{danezis-pets03}, but whatever topology we choose, we need some
|
|
|
-way to keep attackers from manipulating their position within it.
|
|
|
-Fourth, since no centralized authority is tracking server reliability,
|
|
|
-How do we prevent unreliable servers from rendering the network
|
|
|
-unusable? Fifth, do clients receive so much anonymity benefit from
|
|
|
-running their own servers that we should expect them all to do so, or
|
|
|
-do we need to find another incentive structure to motivate them?
|
|
|
-(Tarzan and MorphMix present possible solutions.)
|
|
|
-
|
|
|
-% [[ XXX how to approve new nodes (advogato, sybil, captcha (RTT));]
|
|
|
-
|
|
|
-Alternatively, it may be the case that one of these problems proves
|
|
|
-intractable, or that the drawbacks to many-server systems prove
|
|
|
-greater than the benefits. Nevertheless, we may still do well to
|
|
|
-consider non-clique topologies. A cascade topology may provide more
|
|
|
-defense against traffic confirmation.
|
|
|
-% XXX Why would it? Cite. -NM
|
|
|
-%
|
|
|
-% Huh? Do you mean for simple attacks just because of larger anonymity
|
|
|
-% sets? -PS
|
|
|
-Does the hydra topology (many input nodes, few output nodes) work
|
|
|
-better? Are we going to get a hydra anyway because most nodes will be
|
|
|
-middleman nodes?
|
|
|
-
|
|
|
-As mentioned in Section~\ref{subsec:dos}, Tor could improve its
|
|
|
-robustness against node failure by buffering transmitted stream data
|
|
|
-at the network's edges until the data has been acknowledged by the
|
|
|
-other end of the stream. The efficacy of this approach remains to be
|
|
|
-tested, however, and there may be more effective means for ensuring
|
|
|
-reliable connections in the presence of unreliable nodes.
|
|
|
-
|
|
|
-%%% Keeping this original paragraph for a little while, since it
|
|
|
-%%% is not the same as what's written there now.
|
|
|
-%
|
|
|
-%Because Tor depends on TLS and TCP to provide a reliable transport,
|
|
|
-%when one of the servers goes down, all the circuits (and thus streams)
|
|
|
-%traveling over that server must break. This reduces anonymity because
|
|
|
-%everybody needs to reconnect right then (does it? how much?) and
|
|
|
-%because exit connections all break at the same time, and it also harms
|
|
|
-%usability. It seems the problem is even worse in a peer-to-peer
|
|
|
-%environment, because so far such systems don't really provide an
|
|
|
-%incentive for nodes to stay connected when they're done browsing, so
|
|
|
-%we would expect a much higher churn rate than for onion routing.
|
|
|
-%there ways of allowing streams to survive the loss of a node in the
|
|
|
-%path?
|
|
|
-
|
|
|
-% Roger or Paul suggested that we say something about incentives,
|
|
|
-% too, but I think that's a better candidate for our future work
|
|
|
-% section. After all, we will doubtlessly learn very much about why
|
|
|
-% people do or don't run and use Tor in the near future. -NM
|
|
|
-
|
|
|
-%We should run a squid at each exit node, to provide comparable anonymity
|
|
|
-%to private exit nodes for cache hits, to speed everything up, and to
|
|
|
-%have a buffer for funny stuff coming out of port 80.
|
|
|
-% on the other hand, it hampers PFS, because ORs have pages in the cache.
|
|
|
-%I previously elsewhere suggested bulk transfer proxies to carve
|
|
|
-%up big things so that they could be downloaded in less noticeable
|
|
|
-%pieces over several normal looking connections. We could suggest
|
|
|
-%similarly one or a handful of squid nodes that might serve up
|
|
|
-%some of the more sensitive but common material, especially if
|
|
|
-%the relevant sites didn't want to or couldn't run their own OR.
|
|
|
-%This would be better than having everyone run a squid which would
|
|
|
-%just help identify after the fact the different history of that
|
|
|
-%node's activity. All this kind of speculation needs to move to
|
|
|
-%future work section I guess. -PS]
|
|
|
+Common wisdom suggests that Alice should run her own onion router for best
|
|
|
+anonymity, because traffic coming through her node could plausibly have
|
|
|
+come from elsewhere. How much mixing do we need before this is actually
|
|
|
+effective, or is it immediately beneficial because many real-world
|
|
|
+adversaries won't be able to observe Alice's router?
|
|
|
+
|
|
|
+To scale to many users, and to prevent an attacker from observing the
|
|
|
+whole network at once, it may be necessary for low-latency anonymity
|
|
|
+systems to support far more servers than Tor currently anticipates.
|
|
|
+This introduces several issues. First, if approval by a centralized set
|
|
|
+of directory servers is no longer feasible, what mechanism should be used
|
|
|
+to prevent adversaries from signing up many colluding servers? Second,
|
|
|
+if clients can no longer have a complete picture of the network at all
|
|
|
+times, how can they perform discovery while preventing attackers from
|
|
|
+manipulating or exploiting gaps in client knowledge? Third, if there
|
|
|
+are too many servers for every server to constantly communicate with
|
|
|
+every other, what kind of non-clique topology should the network use?
|
|
|
+Restricted-route topologies promise comparable anonymity with better
|
|
|
+scalability \cite{danezis-pets03}, but whatever topology we choose, we
|
|
|
+need some way to keep attackers from manipulating their position within
|
|
|
+it \cite{casc-rep}. Fourth, since no centralized authority is tracking
|
|
|
+server reliability, How do we prevent unreliable servers from rendering
|
|
|
+the network unusable? Fifth, do clients receive so much anonymity benefit
|
|
|
+from running their own servers that we should expect them all to do so
|
|
|
+\cite{econymics}, or do we need to find another incentive structure to
|
|
|
+motivate them? Tarzan and MorphMix present possible solutions.
|
|
|
+
|
|
|
+% advogato, captcha
|
|
|
+
|
|
|
+A cascade topology with long-range padding and mixing may provide more
|
|
|
+defense against traffic confirmation against a large adversary, because
|
|
|
+it aggregates many users. Does the hydra topology (many input nodes,
|
|
|
+few output nodes) work better against some adversaries? Are we going to
|
|
|
+get a hydra anyway because most nodes will be middleman nodes?
|
|
|
+
|
|
|
+When a Tor node goes down, all its circuits (and thus streams) must break.
|
|
|
+Do users abandon the system because of this brittleness? How well
|
|
|
+does the method in Section~\ref{subsec:dos} allow streams to survive
|
|
|
+node failure? If affected users rebuild circuits immediately, how much
|
|
|
+anonymity is lost? It seems the problem is even worse in a peer-to-peer
|
|
|
+environment---so far such systems don't provide an incentive for peers to
|
|
|
+stay connected when they're done retrieving content, so we would expect
|
|
|
+a higher churn rate.
|
|
|
|
|
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
|
|
|
|
\Section{Future Directions}
|
|
|
\label{sec:conclusion}
|
|
|
|
|
|
-Tor brings together many innovations into
|
|
|
-a unified deployable system. But there are still several attacks that
|
|
|
-work quite well, as well as a number of sustainability and run-time
|
|
|
-issues remaining to be ironed out. In particular:
|
|
|
-
|
|
|
-% Many of these (Scalability, cover traffic, morphmix)
|
|
|
-% are duplicates from open problems.
|
|
|
-%
|
|
|
-
|
|
|
-\emph{Scalability:} Tor's emphasis on design simplicity and
|
|
|
-deployability has led us to adopt a clique topology, a
|
|
|
-semi-centralized model for directories and trusts, and a
|
|
|
-full-network-visibility model for client knowledge. None of these
|
|
|
-properties will scale to more than a few hundred servers.
|
|
|
-Promising approaches to better scalability exist (see
|
|
|
-Section~\ref{sec:maintaining-anonymity}), but more deployment
|
|
|
-experience would be helpful in learning the relative importance of
|
|
|
-these bottlenecks.
|
|
|
-
|
|
|
-\emph{Incentives:} Volunteers may want to run nodes for publicity
|
|
|
-or better anonymity \cite{econymics}.
|
|
|
-more users -> more anonymity
|
|
|
-
|
|
|
-\emph{Cover traffic:} Currently we avoid cover traffic because
|
|
|
-whereas its costs in performance and bandwidth are clear, and because its
|
|
|
-security benefits are not well understood. With more research
|
|
|
-\cite{SS03,defensive-dropping}, this price/value ratio may change,
|
|
|
-both for link-level cover traffic and also long-range cover traffic.
|
|
|
-
|
|
|
-\emph{Better directory distribution:} Even with the threshold
|
|
|
-directory agreement algorithm described in Section~\ref{subsec:dirservers},
|
|
|
-directory distribution is still performance-critical. We must find more
|
|
|
-decentralized yet practical ways to distribute up-to-date snapshots of
|
|
|
-network status without introducing new attacks. Also, directory
|
|
|
-retrieval presents a scaling problem, since clients currently
|
|
|
-download a description of the entire network state every 15
|
|
|
-minutes. As the state grows larger and clients more numerous, we
|
|
|
-may need to move to a solution in which clients only receive
|
|
|
-incremental updates to directory state.
|
|
|
-
|
|
|
-\emph{Implementing location-hidden servers:} While
|
|
|
-Section~\ref{sec:rendezvous} describes a design for rendezvous
|
|
|
-points and location-hidden servers, these features have not yet been
|
|
|
-implemented. While doing so we are likely to encounter additional
|
|
|
-issues that must be resolved, both in terms of usability and anonymity.
|
|
|
+Tor brings together many innovations into a unified deployable system. The
|
|
|
+immediate next steps include:
|
|
|
+
|
|
|
+\emph{Scalability:} Tor's emphasis on design simplicity and deployability
|
|
|
+has led us to adopt a clique topology, a semi-centralized model for
|
|
|
+directories and trusts, and a full-network-visibility model for client
|
|
|
+knowledge. These properties will not scale past a few hundred servers.
|
|
|
+Section~\ref{sec:maintaining-anonymity} describes some promising
|
|
|
+approaches, but more deployment experience will be helpful in learning
|
|
|
+the relative importance of these bottlenecks.
|
|
|
+
|
|
|
+\emph{Bandwidth classes:} In this paper we assume all onion routers have
|
|
|
+good bandwidth and latency. We should adapt the Morphmix model,
|
|
|
+where nodes advertise their bandwidth level (DSL, T1, T3), and
|
|
|
+Alice avoids bottlenecks in her path by choosing nodes that match or
|
|
|
+exceed her bandwidth. In this way DSL users can join the Tor network.
|
|
|
+
|
|
|
+\emph{Incentives:} Volunteers who run nodes are rewarded with publicity
|
|
|
+and possibly better anonymity \cite{econymics}. More nodes means increased
|
|
|
+scalability, and more users can mean more anonymity. We need to continue
|
|
|
+examining the incentive structures for participating in Tor.
|
|
|
+
|
|
|
+\emph{Cover traffic:} Currently Tor avoids cover traffic because its costs
|
|
|
+in performance and bandwidth are clear, whereas its security benefits are
|
|
|
+not well-understood. We must pursue more research on both link-level cover
|
|
|
+traffic and long-range cover traffic to determine some simple padding
|
|
|
+schemes that offer provable protection against our chosen adversary.
|
|
|
+
|
|
|
+%%\emph{Offer two relay cell sizes:} Traffic on the Internet tends to be
|
|
|
+%%large for bulk transfers and small for interactive traffic. One cell
|
|
|
+%%size cannot be optimal for both types of traffic.
|
|
|
+% This should go in the spec and todo, but not the paper yet. -RD
|
|
|
+
|
|
|
+\emph{Caching at exit nodes:} We should run a caching web proxy at each
|
|
|
+exit node, to provide anonymity for cached pages (Alice's request never
|
|
|
+leaves the Tor network), to improve speed, and to reduce bandwidth cost.
|
|
|
+%XXX and to have a layer to block to block funny stuff out of port 80.
|
|
|
+% is that a useful thing to say?
|
|
|
+On the other hand, forward security is weakened because routers have the
|
|
|
+pages in their cache. We must find the right balance between usability
|
|
|
+and security.
|
|
|
+
|
|
|
+\emph{Better directory distribution:} Directory retrieval presents
|
|
|
+a scaling problem, since clients currently download a description of
|
|
|
+the entire network state every 15 minutes. As the state grows larger
|
|
|
+and clients more numerous, we may need to move to a solution in which
|
|
|
+clients only receive incremental updates to directory state.
|
|
|
+
|
|
|
+\emph{Implement location-hidden services:} The design in
|
|
|
+Section~\ref{sec:rendezvous} has not yet been implemented. While doing
|
|
|
+so we are likely to encounter additional issues that must be resolved,
|
|
|
+both in terms of usability and anonymity.
|
|
|
|
|
|
\emph{Further specification review:} Although we have a public,
|
|
|
byte-level specification for the Tor protocols, this document has
|
|
@@ -1889,7 +1805,6 @@ designer of MorphMix to make the common elements of our two systems
|
|
|
share a common specification and implementation. So far, this seems
|
|
|
to be relatively straightforward. Interoperability will allow testing
|
|
|
and direct comparison of the two designs for trust and scalability.
|
|
|
-% XXXX Bandwidth classes.
|
|
|
|
|
|
\emph{Wider-scale deployment:} The original goal of Tor was to
|
|
|
gain experience in deploying an anonymizing overlay network, and
|
|
@@ -1900,7 +1815,6 @@ able to evaluate some of our design decisions, including our
|
|
|
robustness/latency trade-offs, our performance trade-offs (including
|
|
|
cell size), our abuse-prevention mechanisms, and
|
|
|
our overall usability.
|
|
|
-% XXX large and small cells on same network.
|
|
|
|
|
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
|
|