|
|
@@ -476,6 +476,7 @@ Tor's evolution.
|
|
|
\end{description}
|
|
|
|
|
|
\SubSection{Non-goals}
|
|
|
+\label{subsec:non-goals}
|
|
|
In favoring conservative, deployable designs, we have explicitly deferred
|
|
|
a number of goals. Many of these goals are desirable in anonymity systems,
|
|
|
but we choose to defer them either because they are solved elsewhere,
|
|
|
@@ -1539,124 +1540,161 @@ Mention jurisdictional arbitrage.
|
|
|
|
|
|
Pull attacks and defenses into analysis as a subsection
|
|
|
|
|
|
-\Section{Maintaining anonymity in Tor}
|
|
|
+\Section{Open Questions in Low-latency Anonymity}
|
|
|
\label{sec:maintaining-anonymity}
|
|
|
|
|
|
-\footnote{The first Onion Routing design \cite{or-ih96} protected against
|
|
|
-this threat to some
|
|
|
-extent by requiring users to hide network access behind an onion
|
|
|
-router/firewall that was also forwarding traffic from other nodes.
|
|
|
-However, it is desirable for users to
|
|
|
-benefit from Onion Routing even when they can't run their own
|
|
|
-onion routers.
|
|
|
-%Such users, especially if they engage in certain unusual
|
|
|
-%communication behaviors, may be identifiable \cite{wright03}.
|
|
|
-%To
|
|
|
-%complicate the possibility of such attacks Tor multiplexes many
|
|
|
-%stream down each circuit, but still rotates the circuit
|
|
|
-%periodically to avoid too much linkability from requests on a single
|
|
|
-%circuit.
|
|
|
-}
|
|
|
-
|
|
|
-I probably should have noted that this means loops will be on at least
|
|
|
-five hop routes, which should be rare given the distribution. I'm
|
|
|
-realizing that this is reproducing some of the thought that led to a
|
|
|
-default of five hops in the original onion routing design. There were
|
|
|
-some different assumptions, which I won't spell out now. Note that
|
|
|
-enclave level protections really change these assumptions. If most
|
|
|
-circuits are just two hops, then just a single link observer will be
|
|
|
-able to tell that two enclaves are communicating with high probability.
|
|
|
-So, it would seem that enclaves should have a four node minimum circuit
|
|
|
-to prevent trivial circuit insider identification of the whole circuit,
|
|
|
-and three hop minimum for circuits from an enclave to some nonclave
|
|
|
-responder. But then... we would have to make everyone obey these rules
|
|
|
-or a node that through timing inferred it was on a four hop circuit
|
|
|
-would know that it was probably carrying enclave to enclave traffic.
|
|
|
-Which... if there were even a moderate number of bad nodes in the
|
|
|
-network would make it advantageous to break the connection to conduct
|
|
|
-a reformation intersection attack. Ahhh! I gotta stop thinking
|
|
|
-about this and work on the paper some before the family wakes up.
|
|
|
-On Sat, Oct 25, 2003 at 06:57:12AM -0400, Paul Syverson wrote:
|
|
|
-> Which... if there were even a moderate number of bad nodes in the
|
|
|
-> network would make it advantageous to break the connection to conduct
|
|
|
-> a reformation intersection attack. Ahhh! I gotta stop thinking
|
|
|
-> about this and work on the paper some before the family wakes up.
|
|
|
-This is the sort of issue that should go in the 'maintaining anonymity
|
|
|
-with tor' section towards the end. :)
|
|
|
-Email from between roger and me to beginning of section above. Fix and move.
|
|
|
-
|
|
|
-
|
|
|
-[Put as much of this as a part of open issues as is possible.]
|
|
|
-
|
|
|
-[what's an anonymity set?]
|
|
|
-
|
|
|
-packet counting attacks work great against initiators. need to do some
|
|
|
-level of obfuscation for that. standard link padding for passive link
|
|
|
-observers. long-range padding for people who own the first hop. are
|
|
|
-we just screwed against people who insert timing signatures into your
|
|
|
-traffic?
|
|
|
-
|
|
|
-Even regardless of link padding from Alice to the cloud, there will be
|
|
|
-times when Alice is simply not online. Link padding, at the edges or
|
|
|
-inside the cloud, does not help for this.
|
|
|
-
|
|
|
-how often should we pull down directories? how often send updated
|
|
|
-server descs?
|
|
|
-
|
|
|
-when we start up the client, should we build a circuit immediately,
|
|
|
-or should the default be to build a circuit only on demand? should we
|
|
|
-fetch a directory immediately?
|
|
|
-
|
|
|
-would we benefit from greater synchronization, to blend with the other
|
|
|
-users? would the reduced speed hurt us more?
|
|
|
-
|
|
|
-does the "you can't see when i'm starting or ending a stream because
|
|
|
-you can't tell what sort of relay cell it is" idea work, or is just
|
|
|
-a distraction?
|
|
|
-
|
|
|
-does running a server actually get you better protection, because traffic
|
|
|
-coming from your node could plausibly have come from elsewhere? how
|
|
|
-much mixing do you need before this is actually plausible, or is it
|
|
|
-immediately beneficial because many adversary can't see your node?
|
|
|
-
|
|
|
-do different exit policies at different exit nodes trash anonymity sets,
|
|
|
-or not mess with them much?
|
|
|
-
|
|
|
-do we get better protection against a realistic adversary by having as
|
|
|
-many nodes as possible, so he probably can't see the whole network,
|
|
|
-or by having a small number of nodes that mix traffic well? is a
|
|
|
-cascade topology a more realistic way to get defenses against traffic
|
|
|
-confirmation? does the hydra (many inputs, few outputs) topology work
|
|
|
-better? are we going to get a hydra anyway because most nodes will be
|
|
|
+% There must be a better intro than this! -NM
|
|
|
+In addition to the open problems discussed in
|
|
|
+section~\ref{subsec:non-goals}, many other questions remain to be
|
|
|
+solved by future research before we can be truly confident that we
|
|
|
+have built a secure low-latency anonymity service.
|
|
|
+
|
|
|
+Many of these open issues are questions of balance. For example,
|
|
|
+how often should users rotate to fresh circuits? Too-frequent
|
|
|
+rotation is inefficient and expensive, but too-infrequent rotation
|
|
|
+makes the user's traffic linkable. Instead of opening a fresh
|
|
|
+circuit, clients can also limit linkability by exiting from a middle point
|
|
|
+of the circuit, or by truncating and re-extending the circuit, but
|
|
|
+more analysis is needed to determine the proper trade-off.
|
|
|
+[XXX mention predecessor attacks?]
|
|
|
+
|
|
|
+A similar question surrounds timing of directory operations:
|
|
|
+how often should directories be updated? With too-infrequent
|
|
|
+updates clients receive an inaccurate picture of the network; with
|
|
|
+too-frequent updates the directory servers are overloaded.
|
|
|
+
|
|
|
+%do different exit policies at different exit nodes trash anonymity sets,
|
|
|
+%or not mess with them much?
|
|
|
+%
|
|
|
+%% Why would they? By routing traffic to certain nodes preferentially?
|
|
|
+
|
|
|
+[XXX Choosing paths and path lengths: I'm not writing this bit till
|
|
|
+ Arma's pathselection stuff is in. -NM]
|
|
|
+
|
|
|
+%%%% Roger said that he'd put a path selection paragraph into section
|
|
|
+%%%% 4 that would replace this.
|
|
|
+%
|
|
|
+%I probably should have noted that this means loops will be on at least
|
|
|
+%five hop routes, which should be rare given the distribution. I'm
|
|
|
+%realizing that this is reproducing some of the thought that led to a
|
|
|
+%default of five hops in the original onion routing design. There were
|
|
|
+%some different assumptions, which I won't spell out now. Note that
|
|
|
+%enclave level protections really change these assumptions. If most
|
|
|
+%circuits are just two hops, then just a single link observer will be
|
|
|
+%able to tell that two enclaves are communicating with high probability.
|
|
|
+%So, it would seem that enclaves should have a four node minimum circuit
|
|
|
+%to prevent trivial circuit insider identification of the whole circuit,
|
|
|
+%and three hop minimum for circuits from an enclave to some nonclave
|
|
|
+%responder. But then... we would have to make everyone obey these rules
|
|
|
+%or a node that through timing inferred it was on a four hop circuit
|
|
|
+%would know that it was probably carrying enclave to enclave traffic.
|
|
|
+%Which... if there were even a moderate number of bad nodes in the
|
|
|
+%network would make it advantageous to break the connection to conduct
|
|
|
+%a reformation intersection attack. Ahhh! I gotta stop thinking
|
|
|
+%about this and work on the paper some before the family wakes up.
|
|
|
+%On Sat, Oct 25, 2003 at 06:57:12AM -0400, Paul Syverson wrote:
|
|
|
+%> Which... if there were even a moderate number of bad nodes in the
|
|
|
+%> network would make it advantageous to break the connection to conduct
|
|
|
+%> a reformation intersection attack. Ahhh! I gotta stop thinking
|
|
|
+%> about this and work on the paper some before the family wakes up.
|
|
|
+%This is the sort of issue that should go in the 'maintaining anonymity
|
|
|
+%with tor' section towards the end. :)
|
|
|
+%Email from between roger and me to beginning of section above. Fix and move.
|
|
|
+
|
|
|
+Throughout this paper, we have assumed that end-to-end traffic
|
|
|
+analysis cannot yet be defeated. But even high-latency anonymity
|
|
|
+systems can be vulnerable to end-to-end traffic analysis, if the
|
|
|
+traffic volumes are high enough, and if users' habits are sufficiently
|
|
|
+distinct \cite{disclosure,statistical-disclosure}. \emph{What can be
|
|
|
+ done to limit the effectiveness of these attacks against low-latency
|
|
|
+ systems?} Tor already makes some effort to conceal the starts and
|
|
|
+ends of streams by wrapping all long-range control commands in
|
|
|
+identical-looking relay cells, but more analysis is needed. Link
|
|
|
+padding could frustrate passive observers who count packets; long-range
|
|
|
+padding could work against observers who own the first hop in a
|
|
|
+circuit. But more research needs to be done in order to find an
|
|
|
+efficient and practical approach. Volunteers prefer not to run
|
|
|
+constant-bandwidth padding, but more sophisticated traffic shaping
|
|
|
+approaches remain somewhat unanalyzed. [XXX is this so?] Recent work
|
|
|
+on long-range padding \cite{long-range-padding} shows promise. One
|
|
|
+could also try to reduce correlation in packet timing by batching and
|
|
|
+re-ordering packets, but it is unclear whether this could improve
|
|
|
+anonymity without introducing so much latency as to render the
|
|
|
+network unusable.
|
|
|
+
|
|
|
+Even if passive timing attacks were wholly solved, active timing
|
|
|
+attacks would remain. \emph{What can
|
|
|
+ be done to address attackers who can introduce timing patterns into
|
|
|
+ a user's traffic?} [XXX mention likely approaches]
|
|
|
+
|
|
|
+%%% I think we cover this by framing the problem as ``Can we make
|
|
|
+%%% end-to-end characteristics of low-latency systems as good as
|
|
|
+%%% those of high-latency systems?'' Eliminating long-term
|
|
|
+%%% intersection is a hard problem.
|
|
|
+%
|
|
|
+%Even regardless of link padding from Alice to the cloud, there will be
|
|
|
+%times when Alice is simply not online. Link padding, at the edges or
|
|
|
+%inside the cloud, does not help for this.
|
|
|
+
|
|
|
+In order to scale to large numbers of users, and to prevent an
|
|
|
+attacker from observing the whole network at once, it may be necessary
|
|
|
+for low-latency anonymity systems to support far more servers than Tor
|
|
|
+currently anticipates. This introduces several issues. First, if
|
|
|
+approval by a centralized set of directory servers is no longer
|
|
|
+feasible, what mechanism should be used to prevent adversaries from
|
|
|
+signing up many spurious servers? (Tarzan and MorphMix present
|
|
|
+possible solutions.) Second, if clients can no longer have a complete
|
|
|
+picture of the network at all times, how do we prevent attackers from
|
|
|
+manipulating client knowledge? Third, if there are too many servers
|
|
|
+for every server to constantly communicate with every other, what kind
|
|
|
+of non-clique topology should the network use? [XXX cite george's
|
|
|
+ restricted-routes paper] (Whatever topology we choose, we need some
|
|
|
+way to keep attackers from manipulating their position within it.)
|
|
|
+Fourth, since no centralized authority is tracking server reliability,
|
|
|
+how do we prevent unreliable servers from rendering the network
|
|
|
+unusable? Fifth, do clients receive so much anonymity benefit from
|
|
|
+running their own servers that we should expect them all to do so, or
|
|
|
+do we need to find another incentive structure to motivate them?
|
|
|
+
|
|
|
+Alternatively, it may be the case that one of these problems proves
|
|
|
+intractable, or that the drawbacks to many-server systems prove
|
|
|
+greater than the benefits. Nevertheless, we may still do well to
|
|
|
+consider non-clique topologies. A cascade topology may provide more
|
|
|
+defense against traffic confirmation.
|
|
|
+% Why would it? Cite. -NM
|
|
|
+Does the hydra (many inputs, few outputs) topology work
|
|
|
+better? Are we going to get a hydra anyway because most nodes will be
|
|
|
middleman nodes?
|
|
|
|
|
|
-using a circuit many times is good because it's less cpu work.
|
|
|
- good because of predecessor attacks with path rebuilding.
|
|
|
- bad because predecessor attacks can be more likely to link you with a
|
|
|
- previous circuit since you're so verbose.
|
|
|
- bad because each thing you do on that circuit is linked to the other
|
|
|
- things you do on that circuit.
|
|
|
- how often to rotate?
|
|
|
- how to decide when to exit from middle?
|
|
|
- when to truncate and re-extend versus when to start new circuit?
|
|
|
-
|
|
|
-Because Tor runs over TCP, when one of the servers goes down it seems
|
|
|
-that all the circuits (and thus streams) going over that server must
|
|
|
-break. This reduces anonymity because everybody needs to reconnect
|
|
|
-right then (does it? how much?) and because exit connections all break
|
|
|
-at the same time, and it also reduces usability. It seems the problem
|
|
|
-is even worse in a p2p environment, because so far such systems don't
|
|
|
-really provide an incentive for nodes to stay connected when they're
|
|
|
-done browsing, so we would expect a much higher churn rate than for
|
|
|
-onion routing. Are there ways of allowing streams to survive the loss
|
|
|
-of a node in the path?
|
|
|
-
|
|
|
-discuss topologies. Cite George's non-freeroutes paper. Maybe this
|
|
|
-graf goes elsewhere.
|
|
|
-
|
|
|
-discuss attracting users; incentives; usability.
|
|
|
-
|
|
|
-Choosing paths and path lengths.
|
|
|
+%%% Do more with this paragraph once the TCP-over-TCP paragraph is
|
|
|
+%%% more integrated into Related works.
|
|
|
+%
|
|
|
+As mentioned in section~\ref{where-is-it-now}, Tor could improve its
|
|
|
+robustness against node failure by buffering stream data at the
|
|
|
+network's edges, and performing end-to-end acknowledgments. The
|
|
|
+efficacy of this approach remains to be tested, however, and there
|
|
|
+may be more effective means for ensuring reliable connections in the
|
|
|
+presence of unreliable nodes.
|
|
|
+
|
|
|
+%%% Keeping this original paragraph for a little while, since it
|
|
|
+%%% is not the same as what's written there now.
|
|
|
+%
|
|
|
+%Because Tor depends on TLS and TCP to provide a reliable transport,
|
|
|
+%when one of the servers goes down, all the circuits (and thus streams)
|
|
|
+%traveling over that server must break. This reduces anonymity because
|
|
|
+%everybody needs to reconnect right then (does it? how much?) and
|
|
|
+%because exit connections all break at the same time, and it also harms
|
|
|
+%usability. It seems the problem is even worse in a peer-to-peer
|
|
|
+%environment, because so far such systems don't really provide an
|
|
|
+%incentive for nodes to stay connected when they're done browsing, so
|
|
|
+%we would expect a much higher churn rate than for onion routing.
|
|
|
+%Are there ways of allowing streams to survive the loss of a node in the
|
|
|
+%path?
|
|
|
+
|
|
|
+% Roger or Paul suggested that we say something about incentives,
|
|
|
+% too, but I think that's a better candidate for our future work
|
|
|
+% section. After all, we will doubtlessly learn very much about why
|
|
|
+% people do or don't run and use Tor in the near future. -NM
|
|
|
|
|
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
|
|