Retitle and write section 8.

svn:r702
Nick Mathewson 22 years ago
c826c5a95c
1 changed files with 152 additions and 114 deletions

+ 152 - 114
doc/tor-design.tex

@@ -476,6 +476,7 @@ Tor's evolution.
 \end{description}
 
 \SubSection{Non-goals}
+\label{subsec:non-goals}
 In favoring conservative, deployable designs, we have explicitly deferred
 a number of goals. Many of these goals are desirable in anonymity systems,
 but we choose to defer them either because they are solved elsewhere,
@@ -1539,124 +1540,161 @@ Mention jurisdictional arbitrage.
 
 
 Pull attacks and defenses into analysis as a subsection
 
-\Section{Maintaining anonymity in Tor}
+\Section{Open Questions in Low-latency Anonymity}
 \label{sec:maintaining-anonymity}
 
-\footnote{The first Onion Routing design \cite{or-ih96} protected against
-this threat to some
-extent by requiring users to hide network access behind an onion
-router/firewall that was also forwarding traffic from other nodes.
-However, it is desirable for users to
-benefit from Onion Routing even when they can't run their own
-onion routers.
-%Such users, especially if they engage in certain unusual
-%communication behaviors, may be identifiable \cite{wright03}.
-%To
-%complicate the possibility of such attacks Tor multiplexes many
-%stream down each circuit, but still rotates the circuit
-%periodically to avoid too much linkability from requests on a single
-%circuit.
-}
-
-I probably should have noted that this means loops will be on at least
-five hop routes, which should be rare given the distribution.  I'm    
-realizing that this is reproducing some of the thought that led to a  
-default of five hops in the original onion routing design.  There were
-some different assumptions, which I won't spell out now.  Note that   
-enclave level protections really change these assumptions.  If most   
-circuits are just two hops, then just a single link observer will be  
-able to tell that two enclaves are communicating with high probability.
-So, it would seem that enclaves should have a four node minimum circuit
-to prevent trivial circuit insider identification of the whole circuit,
-and three hop minimum for circuits from an enclave to some nonclave    
-responder. But then... we would have to make everyone obey these rules 
-or a node that through timing inferred it was on a four hop circuit    
-would know that it was probably carrying enclave to enclave traffic.   
-Which... if there were even a moderate number of bad nodes in the      
-network would make it advantageous to break the connection to conduct  
-a reformation intersection attack. Ahhh! I gotta stop thinking         
-about this and work on the paper some before the family wakes up.  
-On Sat, Oct 25, 2003 at 06:57:12AM -0400, Paul Syverson wrote:
-> Which... if there were even a moderate number of bad nodes in the
-> network would make it advantageous to break the connection to conduct
-> a reformation intersection attack. Ahhh! I gotta stop thinking
-> about this and work on the paper some before the family wakes up. 
-This is the sort of issue that should go in the 'maintaining anonymity
-with tor' section towards the end. :)
-Email from between roger and me to beginning of section above. Fix and move.
-
-
-[Put as much of this as a part of open issues as is possible.]
-
-[what's an anonymity set?]
-
-packet counting attacks work great against initiators. need to do some
-level of obfuscation for that. standard link padding for passive link
-observers. long-range padding for people who own the first hop. are
-we just screwed against people who insert timing signatures into your
-traffic?
-
-Even regardless of link padding from Alice to the cloud, there will be
-times when Alice is simply not online. Link padding, at the edges or
-inside the cloud, does not help for this.
-
-how often should we pull down directories? how often send updated
-server descs?
-
-when we start up the client, should we build a circuit immediately,
-or should the default be to build a circuit only on demand? should we
-fetch a directory immediately?
-
-would we benefit from greater synchronization, to blend with the other
-users? would the reduced speed hurt us more?
-
-does the "you can't see when i'm starting or ending a stream because
-you can't tell what sort of relay cell it is" idea work, or is just
-a distraction?
-
-does running a server actually get you better protection, because traffic
-coming from your node could plausibly have come from elsewhere? how
-much mixing do you need before this is actually plausible, or is it
-immediately beneficial because many adversary can't see your node?
-
-do different exit policies at different exit nodes trash anonymity sets,
-or not mess with them much?
-
-do we get better protection against a realistic adversary by having as
-many nodes as possible, so he probably can't see the whole network,
-or by having a small number of nodes that mix traffic well? is a
-cascade topology a more realistic way to get defenses against traffic
-confirmation? does the hydra (many inputs, few outputs) topology work
-better? are we going to get a hydra anyway because most nodes will be
+% There must be a better intro than this! -NM
+In addition to the open problems discussed in
+section~\ref{subsec:non-goals}, many other questions remain to be
+solved by future research before we can be truly confident that we
+have built a secure low-latency anonymity service.
+
+Many of these open issues are questions of balance.  For example,
+how often should users rotate to fresh circuits?  Too-frequent
+rotation is inefficient and expensive, but too-infrequent rotation
+makes the user's traffic linkable.  Instead of opening a fresh
+circuit, clients can also limit linkability by exiting from a middle
+point of the circuit, or by truncating and re-extending the circuit, but
+more analysis is needed to determine the proper trade-off.
+[XXX mention predecessor attacks?]
+
+A similar question surrounds timing of directory operations:
+how often should directories be updated?  With too-infrequent
+updates clients receive an inaccurate picture of the network; with
+too-frequent updates the directory servers are overloaded.
+
+%do different exit policies at different exit nodes trash anonymity sets,
+%or not mess with them much?
+%
+%% Why would they?  By routing traffic to certain nodes preferentially?
+
+[XXX Choosing paths and path lengths: I'm not writing this bit till
+  Arma's pathselection stuff is in. -NM]
+
+%%%% Roger said that he'd put a path selection paragraph into section
+%%%% 4 that would replace this.
+%
+%I probably should have noted that this means loops will be on at least
+%five hop routes, which should be rare given the distribution.  I'm    
+%realizing that this is reproducing some of the thought that led to a  
+%default of five hops in the original onion routing design.  There were
+%some different assumptions, which I won't spell out now.  Note that   
+%enclave level protections really change these assumptions.  If most   
+%circuits are just two hops, then just a single link observer will be  
+%able to tell that two enclaves are communicating with high probability.
+%So, it would seem that enclaves should have a four node minimum circuit
+%to prevent trivial circuit insider identification of the whole circuit,
+%and three hop minimum for circuits from an enclave to some nonclave    
+%responder. But then... we would have to make everyone obey these rules 
+%or a node that through timing inferred it was on a four hop circuit    
+%would know that it was probably carrying enclave to enclave traffic.   
+%Which... if there were even a moderate number of bad nodes in the      
+%network would make it advantageous to break the connection to conduct  
+%a reformation intersection attack. Ahhh! I gotta stop thinking         
+%about this and work on the paper some before the family wakes up.  
+%On Sat, Oct 25, 2003 at 06:57:12AM -0400, Paul Syverson wrote:
+%> Which... if there were even a moderate number of bad nodes in the
+%> network would make it advantageous to break the connection to conduct
+%> a reformation intersection attack. Ahhh! I gotta stop thinking
+%> about this and work on the paper some before the family wakes up. 
+%This is the sort of issue that should go in the 'maintaining anonymity
+%with tor' section towards the end. :)
+%Email from between roger and me to beginning of section above. Fix and move.
+
+Throughout this paper, we have assumed that end-to-end traffic
+analysis cannot yet be defeated.  But even high-latency anonymity
+systems can be vulnerable to end-to-end traffic analysis, if the
+traffic volumes are high enough, and if users' habits are sufficiently
+distinct \cite{disclosure,statistical-disclosure}.  \emph{What can be
+  done to limit the effectiveness of these attacks against low-latency
+  systems?}  Tor already makes some effort to conceal the starts and
+ends of streams by wrapping all long-range control commands in
+identical-looking relay cells, but more analysis is needed.  Link
+padding could frustrate passive observers who count packets; long-range
+padding could work against observers who own the first hop in a
+circuit.  But more research needs to be done in order to find an
+efficient and practical approach.  Volunteers prefer not to run
+constant-bandwidth padding, but more sophisticated traffic shaping
+approaches remain somewhat unanalyzed. [XXX is this so?] Recent work
+on long-range padding \cite{long-range-padding} shows promise.  One
+could also try to reduce correlation in packet timing by batching and
+re-ordering packets, but it is unclear whether this could improve
+anonymity without introducing so much latency as to render the
+network unusable.
+
+Even if passive timing attacks were wholly solved, active timing
+attacks would remain.  \emph{What can
+  be done to address attackers who can introduce timing patterns into
+  a user's traffic?}  [XXX mention likely approaches]
+
+%%% I think we cover this by framing the problem as ``Can we make 
+%%% end-to-end characteristics of low-latency systems as good as
+%%% those of high-latency systems?''  Eliminating long-term
+%%% intersection is a hard problem.
+%
+%Even regardless of link padding from Alice to the cloud, there will be
+%times when Alice is simply not online. Link padding, at the edges or
+%inside the cloud, does not help for this.
+
+In order to scale to large numbers of users, and to prevent an
+attacker from observing the whole network at once, it may be necessary
+for low-latency anonymity systems to support far more servers than Tor
+currently anticipates.  This introduces several issues.  First, if
+approval by a centralized set of directory servers is no longer
+feasible, what mechanism should be used to prevent adversaries from
+signing up many spurious servers?  (Tarzan and MorphMix present
+possible solutions.)  Second, if clients can no longer have a complete
+picture of the network at all times, how do we prevent attackers from
+manipulating client knowledge?  Third, if there are too many servers
+for every server to constantly communicate with every other, what kind
+of non-clique topology should the network use?  [XXX cite george's
+  restricted-routes paper] (Whatever topology we choose, we need some
+way to keep attackers from manipulating their position within it.)
+Fourth, since no centralized authority is tracking server reliability,
+how do we prevent unreliable servers from rendering the network
+unusable?  Fifth, do clients receive so much anonymity benefit from
+running their own servers that we should expect them all to do so, or
+do we need to find another incentive structure to motivate them?
+
+Alternatively, it may be the case that one of these problems proves
+intractable, or that the drawbacks to many-server systems prove
+greater than the benefits.  Nevertheless, we may still do well to
+consider non-clique topologies.  A cascade topology may provide more
+defense against traffic confirmation.
+% Why would it?   Cite.  -NM
+Does the hydra (many inputs, few outputs) topology work
+better? Are we going to get a hydra anyway because most nodes will be
 middleman nodes?
 
-using a circuit many times is good because it's less cpu work.
-  good because of predecessor attacks with path rebuilding.
-  bad because predecessor attacks can be more likely to link you with a
-    previous circuit since you're so verbose.
-  bad because each thing you do on that circuit is linked to the other
-    things you do on that circuit.
-  how often to rotate?
-  how to decide when to exit from middle?
-  when to truncate and re-extend versus when to start new circuit?
-
-Because Tor runs over TCP, when one of the servers goes down it seems
-that all the circuits (and thus streams) going over that server must
-break. This reduces anonymity because everybody needs to reconnect
-right then (does it? how much?) and because exit connections all break
-at the same time, and it also reduces usability. It seems the problem
-is even worse in a p2p environment, because so far such systems don't
-really provide an incentive for nodes to stay connected when they're
-done browsing, so we would expect a much higher churn rate than for
-onion routing. Are there ways of allowing streams to survive the loss
-of a node in the path?
-
-discuss topologies. Cite George's non-freeroutes paper.  Maybe this
-graf goes elsewhere.
-
-discuss attracting users; incentives; usability.
-
-Choosing paths and path lengths.
+%%% Do more with this paragraph once The TCP-over-TCP paragraph is
+%%% more integrated into Related works.
+%
+As mentioned in section~\ref{where-is-it-now}, Tor could improve its
+robustness against node failure by buffering stream data at the
+network's edges, and performing end-to-end acknowledgments.  The
+efficacy of this approach remains to be tested, however, and there
+may be more effective means for ensuring reliable connections in the
+presence of unreliable nodes.
+
+%%% Keeping this original paragraph for a little while, since it 
+%%% is not the same as what's written there now.
+%
+%Because Tor depends on TLS and TCP to provide a reliable transport,
+%when one of the servers goes down, all the circuits (and thus streams)
+%traveling over that server must break.  This reduces anonymity because
+%everybody needs to reconnect right then (does it? how much?)  and
+%because exit connections all break at the same time, and it also harms
+%usability. It seems the problem is even worse in a peer-to-peer
+%environment, because so far such systems don't really provide an
+%incentive for nodes to stay connected when they're done browsing, so
+%we would expect a much higher churn rate than for onion routing.
+%Are there ways of allowing streams to survive the loss of a node in the
+%path?
+
+% Roger or Paul suggested that we say something about incentives,
+% too, but I think that's a better candidate for our future work
+% section.  After all, we will doubtlessly learn very much about why
+% people do or don't run and use Tor in the near future. -NM
 
 
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%