circuits, streams, and tagging, o my!

svn:r682
Roger Dingledine, 22 years ago
parent commit b2c225eab7
1 changed file with 101 additions and 10 deletions
doc/tor-design.tex

@@ -528,7 +528,6 @@ The basic adversary components we consider are:
 %  same. I reworded above, I'm thinking we should leave other concerns
 %  for later. -PS
 
-  
 \item{Hostile Tor node:} can arbitrarily manipulate the
   connections under its control, as well as creating new connections
   (that pass through itself).
@@ -653,7 +652,10 @@ run local software called an onion proxy (OP) that fetches directories,
 establishes paths (called \emph{virtual circuits}) over the network,
 and handles connections from the user applications. Onion proxies accept
 TCP streams and multiplex them across the virtual circuit. The onion
-router on the other side of the circuit connects to the destinations of
+router on the other side 
+% I don't mean other side, I mean wherever it is on the circuit. But
+% don't want to introduce complexity this early? Hm. -RD
+of the circuit connects to the destinations of
 the TCP streams and relays data.
 
 Onion routers have three types of keys. The first key is the identity
@@ -693,7 +695,8 @@ or \emph{destroy} (to tear down a circuit).
 Relay cells have an additional header (the relay header) after the
 cell header, which specifies the stream identifier (many streams can
 be multiplexed over a circuit), an end-to-end checksum for integrity
-checking, and a relay command. Relay commands can be one of: \emph{relay
+checking, the length of the relay payload, and a relay command. Relay
+commands can be one of: \emph{relay
 data} (for data flowing down the stream), \emph{relay begin} (to open a
 stream), \emph{relay end} (to close a stream), \emph{relay connected}
 (to notify the OP that a relay begin has succeeded), \emph{relay
@@ -791,16 +794,103 @@ relay cells are encrypted. Similarly, if a node on the circuit goes down,
 the adjacent node can send a relay truncated back to Alice. Thus the
 ``break a node and see which circuits go down'' attack is weakened.
 
 
-\SubSection{Tagging attacks on streams}
-
-end-to-end integrity checking.  (Mention tagging.)
-
 \SubSection{Opening and closing streams}
 \label{subsec:tcp}
 
 
-Describe how TCP connections get opened.  (Mention DNS issues)
-Descibe closing TCP connections and 2-END handshake to mirror TCP
-close handshake.
+When Alice's application wants to open a TCP connection to a given
+address and port, it asks the OP (via SOCKS) to make the connection. The
+OP chooses the newest open circuit (or creates one if none is available),
+chooses a suitable OR on that circuit to be the exit node (usually the
+last node, but maybe others due to exit policy conflicts; see Section
+\ref{sec:exit-policies}), chooses a new random stream ID for this stream,
+and delivers a relay begin cell to that exit node. It uses a stream ID
+of zero for the begin cell (so the OR will recognize it), and the relay
+payload lists the new stream ID and the destination address and port.
+Once the exit node completes the connection to the remote host, it
+responds with a relay connected cell through the circuit. Upon receipt,
+the OP notifies the application that it can begin talking.
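
The stream-opening steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `RelayCell` type, the 16-bit stream ID range, and the `id:host:port` payload encoding are hypothetical and are not taken from the Tor source or specification. What the sketch keeps from the text is that the begin cell itself carries stream ID zero, while the freshly chosen random stream ID travels in the payload along with the destination.

```python
import secrets
from dataclasses import dataclass
from typing import Set, Tuple


@dataclass
class RelayCell:
    """Hypothetical, simplified relay cell (not Tor's wire format)."""
    stream_id: int   # stream ID in the relay header
    command: str     # e.g. "begin", "connected", "data", "end"
    payload: bytes


def make_relay_begin(host: str, port: int,
                     open_streams: Set[int]) -> Tuple[int, RelayCell]:
    """Pick a fresh random stream ID and build the relay begin cell."""
    while True:
        # Assumed 16-bit ID space; zero is reserved for begin cells.
        new_id = secrets.randbelow(2**16 - 1) + 1
        if new_id not in open_streams:
            break
    open_streams.add(new_id)
    # Assumed payload encoding: "<stream id>:<host>:<port>".
    payload = f"{new_id}:{host}:{port}".encode("ascii")
    # The begin cell uses stream ID zero so the exit OR recognizes it.
    return new_id, RelayCell(stream_id=0, command="begin", payload=payload)
```

An OP would call `make_relay_begin("example.com", 80, streams)` once per SOCKS request and then wait for a relay connected cell before telling the application to start sending.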
+
+There's a catch to using SOCKS, though -- some applications hand the
+alphanumeric address to the proxy, while others resolve it into an IP
+address first and then hand the IP to the proxy. When the application
+does the DNS resolution first, Alice broadcasts her destination. Common
+applications like Mozilla and ssh have this flaw.
+
+In the case of Mozilla, we're fine: the filtering web proxy called Privoxy
+does the SOCKS call safely, and Mozilla talks to Privoxy safely. But a
+portable general solution, such as for ssh, is an open problem. We could
+modify the local nameserver, but this approach is invasive, brittle, and
+not portable. We could encourage the resolver library to do resolution
+via TCP rather than UDP, but this approach is hard to do right, and also
+has portability problems. Our current answer is to encourage the use of
+privacy-aware proxies like Privoxy wherever possible, and also provide
+a tool similar to \emph{dig} that can do a private lookup through the
+Tor network.
+
+Ending a Tor stream is analogous to ending a TCP stream: it uses a
+two-step handshake for normal operation, or a one-step handshake for
+errors. If one side of the stream closes abnormally, that node simply
+sends a relay teardown cell, and tears down the stream. If one side
+% Nick: mention relay teardown in 'cell' subsec? good enough name? -RD
+of the stream closes the connection normally, that node sends a relay
+end cell down the circuit. When the other side has sent back its own
+relay end, the stream can be torn down. This two-step handshake allows
+for TCP-based applications that, for example, close a socket for writing
+but are still willing to read.
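
The closing handshake described above can be pictured as a small state machine. The following Python sketch is an assumption-laden illustration, not the Tor implementation: the state names, the `Stream` class, and the `outbox` list standing in for the circuit are all invented here. It shows the two-step close (each side sends its own relay end) and the one-step abort (a single relay teardown).

```python
from enum import Enum, auto


class StreamState(Enum):
    OPEN = auto()
    END_SENT = auto()       # we sent relay end, still willing to read
    END_RECEIVED = auto()   # peer sent relay end, we may still write
    CLOSED = auto()


class Stream:
    """Hypothetical model of one edge of a Tor stream."""

    def __init__(self):
        self.state = StreamState.OPEN
        self.outbox = []    # relay commands queued for the circuit

    def close_normally(self):
        """Local side finished writing: send relay end."""
        if self.state is StreamState.OPEN:
            self.outbox.append("end")
            self.state = StreamState.END_SENT
        elif self.state is StreamState.END_RECEIVED:
            self.outbox.append("end")
            self.state = StreamState.CLOSED

    def abort(self):
        """Abnormal close: one relay teardown, then drop the stream."""
        if self.state is not StreamState.CLOSED:
            self.outbox.append("teardown")
            self.state = StreamState.CLOSED

    def on_relay(self, command: str):
        """Handle a relay end or relay teardown from the other edge."""
        if command == "teardown":
            self.state = StreamState.CLOSED
        elif command == "end":
            if self.state is StreamState.OPEN:
                self.state = StreamState.END_RECEIVED
            elif self.state is StreamState.END_SENT:
                self.state = StreamState.CLOSED
```

The intermediate `END_SENT` state is what lets an application close its socket for writing while continuing to read, mirroring a TCP half-close.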
+
+\SubSection{Tagging attacks on streams}
+
+In the old Onion Routing design, traffic was vulnerable to a malleability
+attack: since there was no integrity checking, an adversary could
+guess some of the plaintext of a cell, xor it out, and xor in his own
+plaintext. Even an external adversary could do this despite the link
+encryption!
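
A short Python sketch may make the xor trick concrete. It assumes a generic stream cipher with no integrity check; the SHA-256 counter keystream below is a stand-in chosen so the example needs only the standard library, and it is not the cipher used by Onion Routing or Tor. The function names are hypothetical.

```python
import hashlib


def keystream(key: bytes, n: int) -> bytes:
    """Toy keystream (stand-in for a real stream cipher)."""
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def stream_crypt(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt: xor the data with the keystream."""
    return xor(data, keystream(key, len(data)))


def tamper(ciphertext: bytes, guessed: bytes, replacement: bytes) -> bytes:
    """Adversary without the key: xor out the guess, xor in new text."""
    if not (len(ciphertext) == len(guessed) == len(replacement)):
        raise ValueError("all three inputs must have the same length")
    return xor(ciphertext, xor(guessed, replacement))
```

If the adversary guesses that a cell carries `b"dir     "` and calls `tamper(ct, b"dir     ", b"delete *")`, the receiver decrypts the forged ciphertext to `b"delete *"` and has no way to notice the change.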
+
+Some examples of this attack might be to change a create cell to a
+destroy cell, to change the destination address in a relay begin cell
+to the adversary's webserver, or to change a user on an ftp connection
+from typing ``dir'' to typing ``delete *''. Any node or observer along
+the path can introduce such corruption in a stream.
+
+Tor solves the tagging attack with respect to external adversaries simply
+by using TLS. Addressing the insider tagging attack is more complex.
+
+Rather than doing integrity checking of the relay cells at each hop
+(like Mixminion \cite{minion-design}), which would increase packet size
+by a function of path length\footnote{This is also the argument against
+using recent cipher modes like EAX \cite{eax} --- we don't want the added
+message-expansion overhead at each hop, and we don't want to leak the path
+length}, we choose to accept passive timing attacks, and do integrity
+checking only at the edges of the circuit. When Alice negotiates a key
+with a hop, they both start a SHA-1 with some derivative of that key,
+thus starting out with randomness that only the two of them know. From
+then on they each incrementally add all the data bytes flowing across
+the stream to the SHA-1, and each relay cell includes the first 4 bytes
+of the current value of the hash.
+
+The attacker must be able to guess all previous bytes between Alice
+and Bob on that circuit (including the pseudorandomness from the key
+negotiation), plus the bytes in the current cell, to remove or modify
+the cell. The computational overhead is modest compared to the AES
+operation performed at each hop in the circuit. We use only four bytes
+per cell to minimize overhead; the chance that an adversary will
+correctly guess a valid hash, plus the payload of the current cell, is
+acceptably low, given that Alice or Bob tears down the circuit upon
+receiving a bad hash.
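
The running digest described above can be sketched with Python's `hashlib`. This is a minimal illustration, not Tor's code: the `EdgeDigest` class, the `b"integrity-seed"` prefix used as the "derivative of that key", and the method names are assumptions made for the example. Both edges seed a SHA-1 with the shared secret, feed it every data payload, and attach or check the first 4 bytes of the current hash value.

```python
import hashlib
import hmac


class EdgeDigest:
    """Hypothetical running SHA-1 kept by each edge of a circuit."""

    TAG_LEN = 4  # bytes of the hash carried in each relay cell

    def __init__(self, key_material: bytes):
        # Assumed key derivation: a fixed prefix plus the negotiated key.
        self._h = hashlib.sha1(b"integrity-seed" + key_material)

    def tag(self, data: bytes) -> bytes:
        """Absorb a cell's data and return the truncated current hash."""
        self._h.update(data)
        # digest() does not finalize the object, so updates can continue.
        return self._h.digest()[:self.TAG_LEN]

    def check(self, data: bytes, received_tag: bytes) -> bool:
        """Receiver side: absorb the data and compare tags."""
        return hmac.compare_digest(self.tag(data), received_tag)
```

Because the hash state depends on every earlier byte, a receiver whose `check` fails cannot resynchronize, which fits the rule that the circuit is torn down on a bad hash.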
+
+%% probably don't need to even mention this, because the randomness
+%% covers it:
+%The fun SHA1 attack where the bad guy can incrementally add to a hash
+%to get a new valid hash doesn't apply to us, because we never show any
+%hashes to anybody.
+
+\SubSection{Website fingerprinting attacks}
+
+The old Onion Routing design is vulnerable to website fingerprinting
+attacks like David Martin's from USENIX Security and Drew's from PET 2002,
+and so is Tor. We need to send some padding, including long-range padding
+(to foil the first hop), to solve this. We hope that a followup to
+\cite{defensive-dropping} will tell us exactly what to do and why it
+helps.
 
 
 \SubSection{Congestion control and fairness}
 \label{subsec:congestion}
@@ -824,6 +914,7 @@ anonnet stuff.
 \Section{Other design decisions}
 
 \SubSection{Resource management and DoS prevention}
+\label{subsec:dos}
 
 Describe DoS prevention. cookies before tls begins, rate limiting of
 create cells, link-to-link rate limiting, etc.