|
@@ -380,7 +380,7 @@ Eternity and Free~Haven.
|
|
|
\Section{Design goals and assumptions}
|
|
|
\label{sec:assumptions}
|
|
|
|
|
|
-\noindent {\large Goals}\\
|
|
|
+\noindent{\large\bf Goals}\\
|
|
|
Like other low-latency anonymity designs, Tor seeks to frustrate
|
|
|
attackers from linking communication partners, or from linking
|
|
|
multiple communications to or from a single user. Within this
|
|
@@ -429,7 +429,7 @@ deployability, readability, and ease of security analysis. Tor aims to
|
|
|
deploy a simple and stable system that integrates the best well-understood
|
|
|
approaches to protecting anonymity.\\
|
|
|
|
|
|
-\noindent {\large Non-goals}\\
|
|
|
+\noindent{\large\bf Non-goals}\\
|
|
|
\label{subsec:non-goals}
|
|
|
In favoring simple, deployable designs, we have explicitly deferred
|
|
|
several possible goals, either because they are solved elsewhere, or because
|
|
@@ -515,11 +515,12 @@ each of these attacks.
|
|
|
\Section{The Tor Design}
|
|
|
\label{sec:design}
|
|
|
|
|
|
-The Tor network is an overlay network; onion routers run as normal
|
|
|
-user-level processes without needing any special privileges.
|
|
|
+The Tor network is an overlay network; each onion router (OR)
|
|
|
+runs as a normal
|
|
|
+user-level processes without any special privileges.
|
|
|
Each onion router maintains a long-term TLS \cite{TLS}
|
|
|
connection to every other onion router.
|
|
|
-%(We further discuss this clique-topology assumption in
|
|
|
+%(We discuss alternatives to this clique-topology assumption in
|
|
|
%Section~\ref{sec:maintaining-anonymity}.)
|
|
|
% A subset of the ORs also act as
|
|
|
%directory servers, tracking which routers are in the network;
|
|
@@ -528,42 +529,41 @@ Each user
|
|
|
runs local software called an onion proxy (OP) to fetch directories,
|
|
|
establish circuits across the network,
|
|
|
and handle connections from user applications. These onion proxies accept
|
|
|
-TCP streams and multiplex them across the circuit. The onion
|
|
|
+TCP streams and multiplex them across the circuits. The onion
|
|
|
router on the other side
|
|
|
of the circuit connects to the destinations of
|
|
|
the TCP streams and relays data.
|
|
|
|
|
|
Each onion router uses three public keys: a long-term identity key, a
|
|
|
short-term onion key, and a short-term link key. The identity
|
|
|
-(signing) key is used to sign TLS certificates, to sign its router
|
|
|
-descriptor (a summary of its keys, address, bandwidth, exit policy,
|
|
|
-etc), and to sign directories if it is a directory server. Changing
|
|
|
+key is used to sign TLS certificates, to sign the OR's \emph{router
|
|
|
+descriptor} (a summary of its keys, address, bandwidth, exit policy,
|
|
|
+and so on), and (by directory servers) to sign directories. Changing
|
|
|
the identity key of a router is considered equivalent to creating a
|
|
|
-new router. The onion (decryption) key is used for decrypting requests
|
|
|
+new router. The onion key is used to decrypt requests
|
|
|
from users to set up a circuit and negotiate ephemeral keys. Finally,
|
|
|
link keys are used by the TLS protocol when communicating between
|
|
|
onion routers. Each short-term key is rotated periodically and
|
|
|
independently, to limit the impact of key compromise.
|
|
|
|
|
|
-Section~\ref{subsec:cells} discusses the structure of the fixed-size
|
|
|
+Section~\ref{subsec:cells} discusses the fixed-size
|
|
|
\emph{cells} that are the unit of communication in Tor. We describe
|
|
|
in Section~\ref{subsec:circuits} how circuits are
|
|
|
built, extended, truncated, and destroyed. Section~\ref{subsec:tcp}
|
|
|
-describes how TCP streams are routed through the network, and finally
|
|
|
+describes how TCP streams are routed through the network. We address
|
|
|
+integrity checking in Section~\ref{subsec:integrity-checking},
|
|
|
+and resource limiting in Section~\ref{subsec:rate-limit}.
|
|
|
+Finally,
|
|
|
Section~\ref{subsec:congestion} talks about congestion control and
|
|
|
fairness issues.
|
|
|
-% NICK
|
|
|
-% XXX \ref{subsec:integrity-checking} is missing
|
|
|
-% XXX \ref{xubsec:rate-limit is missing.
|
|
|
|
|
|
\SubSection{Cells}
|
|
|
\label{subsec:cells}
|
|
|
|
|
|
-Onion routers communicate with one another, and with users' OPs, via TLS
|
|
|
-connections with ephemeral keys. This prevents an attacker from
|
|
|
-impersonating an OR, conceals the contents of the connection with
|
|
|
-perfect forward secrecy, and prevents an attacker from modifying data
|
|
|
-on the wire.
|
|
|
+Onion routers communicate with one another, and with users' OPs, via
|
|
|
+TLS connections with ephemeral keys. Using TLS conceals the data on
|
|
|
+the connection with perfect forward secrecy, and prevents an attacker
|
|
|
+from modifying data on the wire or impersonating an OR.
|
|
|
|
|
|
Traffic passes along these connections in fixed-size cells. Each cell
|
|
|
is 256 bytes (but see Section~\ref{sec:conclusion} for a discussion of
|
|
@@ -582,7 +582,7 @@ padding); \emph{create} or \emph{created} (used to set up a new circuit);
|
|
|
and \emph{destroy} (to tear down a circuit).
|
|
|
|
|
|
Relay cells have an additional header (the relay header) after the
|
|
|
-cell header, containing the stream identifier (many streams can
|
|
|
+cell header, containing a stream identifier (many streams can
|
|
|
be multiplexed over a circuit); an end-to-end checksum for integrity
|
|
|
checking; the length of the relay payload; and a relay command.
|
|
|
The entire contents of the relay header and the relay cell payload
|
|
@@ -607,7 +607,7 @@ We describe each of these cell types and commands in more detail below.
|
|
|
|
|
|
Onion Routing originally built one circuit for each
|
|
|
TCP stream. Because building a circuit can take several tenths of a
|
|
|
-second (due to public-key cryptography delays and network latency),
|
|
|
+second (due to public-key cryptography and network latency),
|
|
|
this design imposed high costs on applications like web browsing that
|
|
|
open many TCP streams.
|
|
|
|
|
@@ -617,23 +617,23 @@ among their streams, users' OPs build a new circuit
|
|
|
periodically if the previous one has been used,
|
|
|
and expire old used circuits that no longer have any open streams.
|
|
|
OPs consider making a new circuit once a minute: thus
|
|
|
-even heavy users spend a negligible amount of time and CPU in
|
|
|
+even heavy users spend a negligible amount of time
|
|
|
building circuits, but only a limited number of requests can be linked
|
|
|
to each other through a given exit node. Also, because circuits are built
|
|
|
in the background, OPs can recover from failed circuit creation
|
|
|
without delaying streams and thereby harming user experience.\\
|
|
|
|
|
|
-\noindent {\large Constructing a circuit}\\
|
|
|
+\noindent{\large\bf Constructing a circuit}\\
|
|
|
%\subsubsection{Constructing a circuit}
|
|
|
\label{subsubsec:constructing-a-circuit}
|
|
|
%
|
|
|
-A user's OP constructs a circuit incrementally, negotiating a
|
|
|
+A user's OP constructs circuits incrementally, negotiating a
|
|
|
symmetric key with each OR on the circuit, one hop at a time. To begin
|
|
|
creating a new circuit, the OP (call her Alice) sends a
|
|
|
\emph{create} cell to the first node in her chosen path (call him Bob).
|
|
|
(She chooses a new
|
|
|
circID $C_{AB}$ not currently used on the connection from her to Bob.)
|
|
|
-This cell's
|
|
|
+The \emph{create} cell's
|
|
|
payload contains the first half of the Diffie-Hellman handshake
|
|
|
($g^x$), encrypted to the onion key of the OR (call him Bob). Bob
|
|
|
responds with a \emph{created} cell containing the second half of the
|
|
@@ -664,44 +664,43 @@ extend one hop further.
|
|
|
|
|
|
This circuit-level handshake protocol achieves unilateral entity
|
|
|
authentication (Alice knows she's handshaking with the OR, but
|
|
|
-the OR doesn't care who is opening the circuit---Alice has no key
|
|
|
+the OR doesn't care who is opening the circuit---Alice uses no public key
|
|
|
and is trying to remain anonymous) and unilateral key authentication
|
|
|
(Alice and the OR agree on a key, and Alice knows the OR is the
|
|
|
-only other entity who should know it). It also achieves forward
|
|
|
+only other entity who knows it). It also achieves forward
|
|
|
secrecy and key freshness. More formally, the protocol is as follows
|
|
|
(where $E_{PK_{Bob}}(\cdot)$ is encryption with Bob's public key,
|
|
|
$H$ is a secure hash function, and $|$ is concatenation):
|
|
|
-
|
|
|
-\begin{equation}
|
|
|
+\begin{equation*}
|
|
|
\begin{aligned}
|
|
|
\mathrm{Alice} \rightarrow \mathrm{Bob}&: E_{PK_{Bob}}(g^x) \\
|
|
|
\mathrm{Bob} \rightarrow \mathrm{Alice}&: g^y, H(K | \mathrm{``handshake"}) \\
|
|
|
\end{aligned}
|
|
|
-\end{equation}
|
|
|
+\end{equation*}
|
|
|
|
|
|
-In the second step, Bob proves that it was he who who received $g^x$,
|
|
|
-and who came up with $y$. We use PK encryption in the first step
|
|
|
+In the second step, Bob proves that it was he who received $g^x$,
|
|
|
+and who chose $y$. We use PK encryption in the first step
|
|
|
(rather than, say, using the first two steps of STS, which has a
|
|
|
signature in the second step) because a single cell is too small to
|
|
|
hold both a public key and a signature. Preliminary analysis with the
|
|
|
-NRL protocol analyzer \cite{meadows96} shows the above protocol to be
|
|
|
-secure (including providing perfect forward secrecy) under the
|
|
|
+NRL protocol analyzer \cite{meadows96} shows this protocol to be
|
|
|
+secure (including perfect forward secrecy) under the
|
|
|
traditional Dolev-Yao model.\\
|
|
|
|
|
|
-\noindent {\large Relay cells}\\
|
|
|
+\noindent{\large\bf Relay cells}\\
|
|
|
%\subsubsection{Relay cells}
|
|
|
%
|
|
|
Once Alice has established the circuit (so she shares keys with each
|
|
|
OR on the circuit), she can send relay cells. Recall that every relay
|
|
|
-cell has a streamID in the relay header that indicates to which
|
|
|
+cell has a streamID that indicates to which
|
|
|
stream the cell belongs. This streamID allows a relay cell to be
|
|
|
-addressed to any of the ORs on the circuit. Upon receiving a relay
|
|
|
+addressed to any OR on the circuit. Upon receiving a relay
|
|
|
cell, an OR looks up the corresponding circuit, and decrypts the relay
|
|
|
-header and payload with the appropriate session key for that circuit.
|
|
|
-If the cell is headed downstream (away from Alice) it then checks
|
|
|
+header and payload with the session key for that circuit.
|
|
|
+If the cell is headed downstream (away from Alice) the OR then checks
|
|
|
whether the decrypted streamID is recognized---either because it
|
|
|
-corresponds to an open stream at this OR for the circuit, or because
|
|
|
-it is equal to the control streamID (zero). If the OR recognizes the
|
|
|
+corresponds to an open stream at this OR for the given circuit, or because
|
|
|
+it is the control streamID (zero). If the OR recognizes the
|
|
|
streamID, it accepts the relay cell and processes it as described
|
|
|
below. Otherwise,
|
|
|
the OR looks up the circID and OR for the
|
|
@@ -711,7 +710,7 @@ of the circuit receives an unrecognized relay cell, an error has
|
|
|
occurred, and the cell is discarded.)
|
|
|
|
|
|
OPs treat incoming relay cells similarly: they iteratively unwrap the
|
|
|
-relay header and payload with the session key shared with each
|
|
|
+relay header and payload with the session keys shared with each
|
|
|
OR on the circuit, from the closest to farthest. (Because we use a
|
|
|
stream cipher, encryption operations may be inverted in any order.)
|
|
|
If at any stage the OP recognizes the streamID, the cell must have
|
|
@@ -732,11 +731,11 @@ This \emph{leaky pipe} circuit topology
|
|
|
allows Alice's streams to exit at different ORs on a single circuit.
|
|
|
Alice may choose different exit points because of their exit policies,
|
|
|
or to keep the ORs from knowing that two streams
|
|
|
-originate at the same person.
|
|
|
+originate from the same person.
|
|
|
|
|
|
-When an OR later replies to Alice with a relay cell, it only needs to
|
|
|
-encrypt the cell's relay header and payload with the single key it
|
|
|
-shares with Alice, and send the cell back toward Alice along the
|
|
|
+When an OR later replies to Alice with a relay cell, it
|
|
|
+encrypts the cell's relay header and payload with the single key it
|
|
|
+shares with Alice, and sends the cell back toward Alice along the
|
|
|
circuit. Subsequent ORs add further layers of encryption as they
|
|
|
relay the cell back to Alice.
|
|
|
|
|
@@ -744,12 +743,12 @@ To tear down a whole circuit, Alice sends a \emph{destroy} control
|
|
|
cell. Each OR in the circuit receives the \emph{destroy} cell, closes
|
|
|
all open streams on that circuit, and passes a new \emph{destroy} cell
|
|
|
forward. But just as circuits are built incrementally, they can also
|
|
|
-be torn down incrementally: Alice can instead send a \emph{relay
|
|
|
-truncate} cell to a single OR on the circuit. That node then sends a
|
|
|
+be torn down incrementally: Alice can send a \emph{relay
|
|
|
+truncate} cell to a single OR on the circuit. That OR then sends a
|
|
|
\emph{destroy} cell forward, and acknowledges with a
|
|
|
\emph{relay truncated} cell. Alice can then extend the circuit to
|
|
|
different nodes, all without signaling to the intermediate nodes (or
|
|
|
-somebody observing them) that she has changed her circuit.
|
|
|
+an observer) that she has changed her circuit.
|
|
|
Similarly, if a node on the circuit goes down, the adjacent
|
|
|
node can send a \emph{relay truncated} cell back to Alice. Thus the
|
|
|
``break a node and see which circuits go down'' attack
|
|
@@ -758,19 +757,19 @@ node can send a \emph{relay truncated} cell back to Alice. Thus the
|
|
|
\SubSection{Opening and closing streams}
|
|
|
\label{subsec:tcp}
|
|
|
|
|
|
-When Alice's application wants to open a TCP connection to a given
|
|
|
+When Alice's application wants a TCP connection to a given
|
|
|
address and port, it asks the OP (via SOCKS) to make the
|
|
|
connection. The OP chooses the newest open circuit (or creates one if
|
|
|
-none is available), chooses a suitable OR on that circuit to be the
|
|
|
+none is available), and chooses a suitable OR on that circuit to be the
|
|
|
exit node (usually the last node, but maybe others due to exit policy
|
|
|
-conflicts; see Section~\ref{subsec:exitpolicies}), chooses a new
|
|
|
-random streamID for the stream, and sends a \emph{relay begin} cell
|
|
|
-to that exit node. The OP uses a streamID of zero for this cell
|
|
|
-(so the OR will recognize it), and uses the new streamID, destination
|
|
|
-address, and port as the contents of the cell's relay payload. Once the
|
|
|
+conflicts; see Section~\ref{subsec:exitpolicies}. The OP then opens
|
|
|
+the stream by sending a \emph{relay begin} cell to the exit node,
|
|
|
+using a streamID of zero (so the OR will recognize it), containing as
|
|
|
+its relay payload a new randomly generated streamID, the destination
|
|
|
+address, and the destination port. Once the
|
|
|
exit node completes the connection to the remote host, it responds
|
|
|
with a \emph{relay connected} cell. Upon receipt, the OP sends a
|
|
|
-SOCKS reply to the application notifying it of success. The OP
|
|
|
+SOCKS reply to notify the application of its success. The OP
|
|
|
now accepts data from the application's TCP stream, packaging it into
|
|
|
\emph{relay data} cells and sending those cells along the circuit to
|
|
|
the chosen OR.
|
|
@@ -778,18 +777,18 @@ the chosen OR.
|
|
|
There's a catch to using SOCKS, however---some applications pass the
|
|
|
alphanumeric hostname to the proxy, while others resolve it into an IP
|
|
|
address first and then pass the IP address to the proxy. If the
|
|
|
-application does the DNS resolution first, Alice will thereby
|
|
|
-broadcast her destination to the DNS server. Common applications
|
|
|
+application does DNS resolution first, Alice will thereby
|
|
|
+reveal her destination to the DNS server. Common applications
|
|
|
like Mozilla and SSH have this flaw.
|
|
|
|
|
|
-In the case of Mozilla, the flaw is easy to address: the filtering web
|
|
|
+In the case of Mozilla, the flaw is easy to address: the filtering HTTP
|
|
|
proxy called Privoxy does the SOCKS call safely, and Mozilla talks to
|
|
|
Privoxy safely. But a portable general solution, such as is needed for
|
|
|
SSH, is
|
|
|
an open problem. Modifying or replacing the local nameserver
|
|
|
can be invasive, brittle, and not portable. Forcing the resolver
|
|
|
library to do resolution via TCP rather than UDP is
|
|
|
-hard, and also has portability problems. We could provide a
|
|
|
+hard, and also has portability problems. We could also provide a
|
|
|
tool similar to \emph{dig} to perform a private lookup through the
|
|
|
Tor network. Our current answer is to encourage the use of
|
|
|
privacy-aware proxies like Privoxy wherever possible.
|
|
@@ -799,28 +798,29 @@ two-step handshake for normal operation, or a one-step handshake for
|
|
|
errors. If the stream closes abnormally, the adjacent node simply sends a
|
|
|
\emph{relay teardown} cell. If the stream closes normally, the node sends
|
|
|
a \emph{relay end} cell down the circuit. When the other side has sent
|
|
|
-back its own \emph{relay end}, the stream can be torn down. Because
|
|
|
+back its own \emph{relay end} cell, the stream can be torn down. Because
|
|
|
all relay cells use layered encryption, only the destination OR knows
|
|
|
that a given relay cell is a request to close a stream. This two-step
|
|
|
-handshake allows for TCP-based applications that use half-closed
|
|
|
-connections, such as broken HTTP clients that close their side of the
|
|
|
-stream after writing but are still willing to read.
|
|
|
+handshake allows Tor to support TCP-based applications that use half-closed
|
|
|
+connections.
|
|
|
+% such as broken HTTP clients that close their side of the
|
|
|
+%stream after writing but are still willing to read.
|
|
|
|
|
|
\SubSection{Integrity checking on streams}
|
|
|
\label{subsec:integrity-checking}
|
|
|
|
|
|
Because the old Onion Routing design used a stream cipher, traffic was
|
|
|
-vulnerable to a malleability attack: even though the attacker could not
|
|
|
-decrypt cells, he could make changes to an encrypted
|
|
|
-cell to create corresponding changes to the data leaving the network.
|
|
|
+vulnerable to a malleability attack: though the attacker could not
|
|
|
+decrypt cells, any changes to encrypted data
|
|
|
+would create corresponding changes to the data leaving the network.
|
|
|
(Even an external adversary could do this, despite link encryption, by
|
|
|
inverting bits on the wire.)
|
|
|
|
|
|
This weakness allowed an adversary to change a padding cell to a destroy
|
|
|
-cell; change the destination address in a relay begin cell to the
|
|
|
-adversary's webserver; or change a user on an ftp connection from
|
|
|
-typing ``dir'' to typing ``delete~*''. Any node or external adversary
|
|
|
-along the circuit could introduce such corruption in a stream---if it
|
|
|
+cell; change the destination address in a \emph{relay begin} cell to the
|
|
|
+adversary's webserver; or change an FTP command from
|
|
|
+{\tt dir} to {\tt rm~*}. Any OR or external adversary
|
|
|
+along the circuit could introduce such corruption in a stream, if it
|
|
|
knew or could guess the encrypted content.
|
|
|
|
|
|
Tor prevents external adversaries from mounting this attack by
|
|
@@ -841,13 +841,13 @@ is vulnerable to end-to-end timing attacks; tagging attacks performed
|
|
|
within the circuit provide no additional information to the attacker.
|
|
|
|
|
|
Thus, we check integrity only at the edges of each stream. When Alice
|
|
|
-negotiates a key with a new hop, they both initialize a pair of SHA-1
|
|
|
-digests with a derivative of that key,
|
|
|
+negotiates a key with a new hop, they each initialize a SHA-1
|
|
|
+digest with a derivative of that key,
|
|
|
thus beginning with randomness that only the two of them know. From
|
|
|
-then on they each incrementally add to the SHA-1 digests the contents of
|
|
|
-all relay cells they create or accept (one digest is for cells
|
|
|
-created; one is for cells accepted), and include with each relay cell
|
|
|
-the first 4 bytes of the current value of the hash of cells created.
|
|
|
+then on they each incrementally add to the SHA-1 digest the contents of
|
|
|
+all relay cells they create, and include with each relay cell the
|
|
|
+first four bytes of the current digest. Each also keeps a SHA-1
|
|
|
+digest of data received, to verify that the received hashes are correct.
|
|
|
|
|
|
To be sure of removing or modifying a cell, the attacker must be able
|
|
|
to either deduce the current digest state (which depends on all
|
|
@@ -858,7 +858,9 @@ end-to-end encrypted across the circuit. The computational overhead
|
|
|
of computing the digests is minimal compared to doing the AES
|
|
|
encryption performed at each hop of the circuit. We use only four
|
|
|
bytes per cell to minimize overhead; the chance that an adversary will
|
|
|
-correctly guess a valid hash, plus the payload the current cell, is
|
|
|
+correctly guess a valid hash
|
|
|
+%, plus the payload the current cell,
|
|
|
+is
|
|
|
acceptably low, given that Alice or Bob tear down the circuit if they
|
|
|
receive a bad hash.
|
|
|
|
|
@@ -866,7 +868,7 @@ receive a bad hash.
|
|
|
\label{subsec:rate-limit}
|
|
|
|
|
|
Volunteers are generally more willing to run services that can limit
|
|
|
-their bandwidth usage. To accommodate them, Tor servers use a
|
|
|
+their own bandwidth usage. To accommodate them, Tor servers use a
|
|
|
token bucket approach \cite{tannenbaum96} to
|
|
|
enforce a long-term average rate of incoming bytes, while still
|
|
|
permitting short-term bursts above the allowed bandwidth. Current bucket
|
|
@@ -893,9 +895,9 @@ Further, inspired by Rennhard et al's design in \cite{anonnet}, a
|
|
|
circuit's edges heuristically distinguish interactive streams from bulk
|
|
|
streams by comparing the frequency with which they supply cells. We can
|
|
|
provide good latency for interactive streams by giving them preferential
|
|
|
-service, while still getting good overall throughput to the bulk
|
|
|
+service, while still giving good overall throughput to the bulk
|
|
|
streams. Such preferential treatment presents a possible end-to-end
|
|
|
-attack, but an adversary who can observe both
|
|
|
+attack, but an adversary observing both
|
|
|
ends of the stream can already learn this information through timing
|
|
|
attacks.
|
|
|
|
|
@@ -905,13 +907,14 @@ attacks.
|
|
|
Even with bandwidth rate limiting, we still need to worry about
|
|
|
congestion, either accidental or intentional. If enough users choose the
|
|
|
same OR-to-OR connection for their circuits, that connection can become
|
|
|
-saturated. For example, an adversary could make a large HTTP PUT request
|
|
|
-through the onion routing network to a webserver he runs, and then
|
|
|
+saturated. For example, an attacker could send a large file
|
|
|
+through the Tor network to a webserver he runs, and then
|
|
|
refuse to read any of the bytes at the webserver end of the
|
|
|
circuit. Without some congestion control mechanism, these bottlenecks
|
|
|
can propagate back through the entire network. We don't need to
|
|
|
reimplement full TCP windows (with sequence numbers,
|
|
|
-the ability to drop cells when we're full and retransmit later, etc),
|
|
|
+the ability to drop cells when we're full and retransmit later, and so
|
|
|
+on),
|
|
|
because TCP already guarantees in-order delivery of each
|
|
|
cell.
|
|
|
%But we need to investigate further the effects of the current
|
|
@@ -922,7 +925,7 @@ We describe our response below.
|
|
|
\textbf{Circuit-level throttling:}
|
|
|
To control a circuit's bandwidth usage, each OR keeps track of two
|
|
|
windows. The \emph{packaging window} tracks how many relay data cells the OR is
|
|
|
-allowed to package (from outside TCP streams) for transmission back to the OP,
|
|
|
+allowed to package (from incoming TCP streams) for transmission back to the OP,
|
|
|
and the \emph{delivery window} tracks how many relay data cells it is willing
|
|
|
to deliver to TCP streams outside the network. Each window is initialized
|
|
|
(say, to 1000 data cells). When a data cell is packaged or delivered,
|
|
@@ -960,14 +963,14 @@ can't send a \emph{relay sendme} cell when its packaging window is empty.
|
|
|
\SubSection{Resource management and denial-of-service}
|
|
|
\label{subsec:dos}
|
|
|
|
|
|
-Providing Tor as a public service provides many opportunities for an
|
|
|
-attacker to mount denial-of-service attacks against the network. While
|
|
|
+Providing Tor as a public service provides many opportunities for
|
|
|
+denial-of-service attacks against the network. While
|
|
|
flow control and rate limiting (discussed in
|
|
|
Section~\ref{subsec:congestion}) prevent users from consuming more
|
|
|
bandwidth than routers are willing to provide, opportunities remain for
|
|
|
users to
|
|
|
consume more network resources than their fair share, or to render the
|
|
|
-network unusable for other users.
|
|
|
+network unusable for others.
|
|
|
|
|
|
First of all, there are several CPU-consuming denial-of-service
|
|
|
attacks wherein an attacker can force an OR to perform expensive
|
|
@@ -1022,18 +1025,18 @@ at the exit OR.
|
|
|
We stress that Tor does not enable any new class of abuse. Spammers
|
|
|
and other attackers already have access to thousands of misconfigured
|
|
|
systems worldwide, and the Tor network is far from the easiest way
|
|
|
-to launch these antisocial or illegal attacks.
|
|
|
+to launch antisocial or illegal attacks.
|
|
|
%Indeed, because of its limited
|
|
|
%anonymity, Tor is probably not a good way to commit crimes.
|
|
|
But because the
|
|
|
onion routers can easily be mistaken for the originators of the abuse,
|
|
|
and the volunteers who run them may not want to deal with the hassle of
|
|
|
-repeatedly explaining anonymity networks, we must block or limit attacks
|
|
|
-and other abuse that travel through the Tor network.
|
|
|
+repeatedly explaining anonymity networks, we must block or limit
|
|
|
+the abuse that travels through the Tor network.
|
|
|
|
|
|
To mitigate abuse issues, in Tor, each onion router's \emph{exit policy}
|
|
|
-describes to which external addresses and ports the router will permit
|
|
|
-stream connections. On one end of the spectrum are \emph{open exit}
|
|
|
+describes to which external addresses and ports the router will
|
|
|
+connect. On one end of the spectrum are \emph{open exit}
|
|
|
nodes that will connect anywhere. On the other end are \emph{middleman}
|
|
|
nodes that only relay traffic to other Tor nodes, and \emph{private exit}
|
|
|
nodes that only connect to a local host or network. Using a private
|
|
@@ -1042,7 +1045,10 @@ given host or network---an external adversary cannot eavesdrop traffic
|
|
|
between the private exit and the final destination, and so is less sure of
|
|
|
Alice's destination and activities. Most onion routers will function as
|
|
|
\emph{restricted exits} that permit connections to the world at large,
|
|
|
-but prevent access to certain abuse-prone addresses and services. In
|
|
|
+but prevent access to certain abuse-prone addresses and services.
|
|
|
+% XXX This next sentence makes no sense to me in context; must
|
|
|
+% XXX revisit. -NM
|
|
|
+In
|
|
|
general, nodes can require a variety of forms of traffic authentication
|
|
|
\cite{or-discex00}.
|
|
|
|
|
@@ -1053,7 +1059,7 @@ general, nodes can require a variety of forms of traffic authentication
|
|
|
%can be assumed for important traffic.
|
|
|
|
|
|
Many administrators will use port restrictions to support only a
|
|
|
-limited set of well-known services, such as HTTP, SSH, or AIM.
|
|
|
+limited set of services, such as HTTP, SSH, or AIM.
|
|
|
This is not a complete solution, of course, since abuse opportunities for these
|
|
|
protocols are still well known.
|
|
|
|
|
@@ -1064,16 +1070,16 @@ vulnerabilities) can be detected in a straightforward manner.
|
|
|
Similarly, one could run automatic spam filtering software (such as
|
|
|
SpamAssassin) on email exiting the OR network.
|
|
|
|
|
|
-ORs may also choose to rewrite exiting traffic in order to append
|
|
|
-headers or other information to indicate that the traffic has passed
|
|
|
+ORs may also rewrite exiting traffic to append
|
|
|
+headers or other information indicating that the traffic has passed
|
|
|
through an anonymity service. This approach is commonly used
|
|
|
-by email-only anonymity systems. When possible, ORs can also
|
|
|
-run on servers with hostnames such as {\it anonymous}, to further
|
|
|
+by email-only anonymity systems. ORs can also
|
|
|
+run on servers with hostnames like {\tt anonymous} to further
|
|
|
alert abuse targets to the nature of the anonymous traffic.
|
|
|
|
|
|
-A mixture of open and restricted exit nodes will allow the most
|
|
|
-flexibility for volunteers running servers. But while many
|
|
|
-middleman nodes help provide a large and robust network,
|
|
|
+A mixture of open and restricted exit nodes allows the most
|
|
|
+flexibility for volunteers running servers. But while having many
|
|
|
+middleman nodes provides a large and robust network,
|
|
|
having only a few exit nodes reduces the number of points
|
|
|
an adversary needs to monitor for traffic analysis, and places a
|
|
|
greater burden on the exit nodes. This tension can be seen in the
|
|
@@ -1089,7 +1095,7 @@ Section~\ref{sec:conclusion}.
|
|
|
Finally, we note that exit abuse must not be dismissed as a peripheral
|
|
|
issue: when a system's public image suffers, it can reduce the number
|
|
|
and diversity of that system's users, and thereby reduce the anonymity
|
|
|
-of the system itself. Like usability, public perception is also a
|
|
|
+of the system itself. Like usability, public perception is a
|
|
|
security parameter. Sadly, preventing abuse of open exit nodes is an
|
|
|
unsolved problem, and will probably remain an arms race for the
|
|
|
forseeable future. The abuse problems faced by Princeton's CoDeeN
|
|
@@ -1103,30 +1109,31 @@ in-band network status updates: each router flooded a signed statement
|
|
|
to its neighbors, which propagated it onward. But anonymizing networks
|
|
|
have different security goals than typical link-state routing protocols.
|
|
|
For example, delays (accidental or intentional)
|
|
|
-that can cause different parts of the network to have different pictures
|
|
|
-of link-state and topology are not only inconvenient---they give
|
|
|
+that can cause different parts of the network to have different views
|
|
|
+of link-state and topology are not only inconvenient: they give
|
|
|
attackers an opportunity to exploit differences in client knowledge.
|
|
|
We also worry about attacks to deceive a
|
|
|
client about the router membership list, topology, or current network
|
|
|
state. Such \emph{partitioning attacks} on client knowledge help an
|
|
|
adversary to efficiently deploy resources
|
|
|
-when attacking a target \cite{minion-design}.
|
|
|
+against a target \cite{minion-design}.
|
|
|
|
|
|
|
|
|
Tor uses a small group of redundant, well-known onion routers to
|
|
|
track changes in network topology and node state, including keys and
|
|
|
-exit policies. Each such \emph{directory server} also acts as an HTTP
|
|
|
+exit policies. Each such \emph{directory server} acts as an HTTP
|
|
|
server, so participants can fetch current network state and router
|
|
|
-lists (a \emph{directory}), and so other onion routers can upload
|
|
|
-their router descriptors. Onion routers periodically publish signed
|
|
|
+lists, and so other ORs can upload
|
|
|
+state information. Onion routers periodically publish signed
|
|
|
statements of their state to each directory server, which combines this
|
|
|
state information with its own view of network liveness, and generates
|
|
|
-a signed description of the entire network state. Client software is
|
|
|
+a signed description (a \emph{directory}) of the entire network
|
|
|
+state. Client software is
|
|
|
pre-loaded with a list of the directory servers and their keys; it uses
|
|
|
this information to bootstrap each client's view of the network.
|
|
|
|
|
|
-When a directory server receives a signed statement from an onion
|
|
|
-router, it recognizes the onion router by its identity key. Directory
|
|
|
+When a directory server receives a signed statement for an OR, it
|
|
|
+checks whether the OR's identity key is recognized. Directory
|
|
|
servers do not automatically advertise unrecognized ORs. (If they did,
|
|
|
an adversary could take over the network by creating many servers
|
|
|
\cite{sybil}.) Instead, new nodes must be approved by the directory
|
|
@@ -1135,14 +1142,15 @@ node approval are an area of active research, and are discussed more
|
|
|
in Section~\ref{sec:maintaining-anonymity}.
|
|
|
|
|
|
Of course, a variety of attacks remain. An adversary who controls
|
|
|
-a directory server can track certain clients by providing different
|
|
|
+a directory server can track clients by providing them different
|
|
|
information---perhaps by listing only nodes under its control, or by
|
|
|
informing only certain clients about a given node. Even an external
|
|
|
adversary can exploit differences in client knowledge: clients who use
|
|
|
a node listed on one directory server but not the others are vulnerable.
|
|
|
|
|
|
-Thus these directory servers must be synchronized and redundant.
|
|
|
-Directories are valid if they are signed by a threshold of the directory
|
|
|
+Thus these directory servers must be synchronized and redundant, so
|
|
|
+that they can agree on a common directory. Clients should only trust
|
|
|
+this directory if it is signed by a threshold of the directory
|
|
|
servers.
|
|
|
|
|
|
The directory servers in Tor are modeled after those in Mixminion
|
|
@@ -1184,9 +1192,10 @@ must build circuits and use them to anonymously test router reliability
|
|
|
\cite{mix-acc}.
|
|
|
|
|
|
Using directory servers is simpler and more flexible than flooding.
|
|
|
-For example, flooding complicates the analysis when we
|
|
|
-start experimenting with non-clique network topologies. And because
|
|
|
-the directories are signed, they can be cached by other onion routers.
|
|
|
+Flooding is expensive, and complicates the analysis when we
|
|
|
+start experimenting with non-clique network topologies. Signed
|
|
|
+directories are less expensive, because they can be cached by other
|
|
|
+onion routers.
|
|
|
Thus directory servers are not a performance
|
|
|
bottleneck when we have many users, and do not aid traffic analysis by
|
|
|
forcing clients to periodically announce their existence to any
|
|
@@ -1224,44 +1233,46 @@ points. He may do this on any robust efficient
|
|
|
key-value lookup system with authenticated updates, such as a
|
|
|
distributed hash table (DHT) like CFS \cite{cfs:sosp01}\footnote{
|
|
|
Rather than rely on an external infrastructure, the Onion Routing network
|
|
|
-can run the DHT; to begin, we can run a simple lookup system on the
|
|
|
+can run the DHT itself. At first, we can simply run a simple lookup
|
|
|
+system on the
|
|
|
directory servers.} Alice, the client, chooses an OR as her
|
|
|
\emph{rendezvous point}. She connects to one of Bob's introduction
|
|
|
-points, informs him about her rendezvous point, and then waits for him
|
|
|
+points, informs him of her rendezvous point, and then waits for him
|
|
|
to connect to the rendezvous point. This extra level of indirection
|
|
|
helps Bob's introduction points avoid problems associated with serving
|
|
|
-unpopular files directly (for example, if Bob chooses
|
|
|
-an introduction point in Texas to serve anti-ranching propaganda,
|
|
|
+unpopular files directly (for example, if Bob serves
|
|
|
+material that the introduction point's neighbors find objectionable,
|
|
|
or if Bob's service tends to get attacked by network vandals).
|
|
|
The extra level of indirection also allows Bob to respond to some requests
|
|
|
and ignore others.
|
|
|
|
|
|
-We give an overview of the steps of a rendezvous. These steps are
|
|
|
-performed on behalf of Alice and Bob by their local onion proxies;
|
|
|
+We give an overview of the steps of a rendezvous. These are
|
|
|
+performed on behalf of Alice and Bob by their local OPs;
|
|
|
application integration is described more fully below.
|
|
|
|
|
|
\begin{tightlist}
|
|
|
\item Bob chooses some introduction points, and advertises them on
|
|
|
the DHT. He can add more later.
|
|
|
-\item Bob establishes a Tor circuit to each of his introduction points,
|
|
|
- and waits. No data is transmitted until a request is received.
|
|
|
+\item Bob builds a circuit to each of his introduction points,
|
|
|
+ and waits. No data is yet transmitted.
|
|
|
\item Alice learns about Bob's service out of band (perhaps Bob told her,
|
|
|
or she found it on a website). She retrieves the details of Bob's
|
|
|
service from the DHT.
|
|
|
-\item Alice chooses an OR to serve as the rendezvous point (RP) for this
|
|
|
- transaction. She establishes a circuit to RP, and gives it a
|
|
|
- rendezvous cookie, which it will use to recognize Bob.
|
|
|
+\item Alice chooses an OR to be the rendezvous point (RP) for this
|
|
|
+ transaction. She builds a circuit to RP, and gives it a
|
|
|
+ rendezvous cookie that it will use to recognize Bob.
|
|
|
\item Alice opens an anonymous stream to one of Bob's introduction
|
|
|
- points, and gives it a message (encrypted to Bob's public key) which tells him
|
|
|
+ points, and gives it a message (encrypted to Bob's public key)
|
|
|
+ which tells him
|
|
|
about herself, her chosen RP and the rendezvous cookie, and the
|
|
|
- first half of an ephemeral
|
|
|
- key handshake. The introduction point sends the message to Bob.
|
|
|
-\item If Bob wants to talk to Alice, he builds a new circuit to Alice's
|
|
|
- RP and provides the rendezvous cookie and the second half of the DH
|
|
|
- handshake (along with a hash of the session
|
|
|
- key they now share---by the same argument as in
|
|
|
+ first half of a DH
|
|
|
+ handshake. The introduction point sends the message to Bob.
|
|
|
+\item If Bob wants to talk to Alice, he builds a circuit to Alice's
|
|
|
+ RP and provides the rendezvous cookie, the second half of the DH
|
|
|
+ handshake, and a hash of the session
|
|
|
+ key they now share. By the same argument as in
|
|
|
Section~\ref{subsubsec:constructing-a-circuit}, Alice knows she
|
|
|
- shares the key only with the intended Bob).
|
|
|
+ shares the key only with Bob.
|
|
|
\item The RP connects Alice's circuit to Bob's. Note that RP can't
|
|
|
recognize Alice, Bob, or the data they transmit.
|
|
|
\item Alice now sends a \emph{relay begin} cell along the circuit. It
|
|
@@ -1319,9 +1330,11 @@ can choose whether to respond.
|
|
|
The authentication tokens can be used to provide selective access:
|
|
|
important users get tokens to ensure uninterrupted access to the
|
|
|
service. During normal situations, Bob's service might simply be offered
|
|
|
-directly from mirrors, and Bob gives out tokens to high-priority users. If
|
|
|
-the mirrors are knocked down by distributed DoS attacks or even
|
|
|
-physical attack, those users can switch to accessing Bob's service via
|
|
|
+directly from mirrors, while Bob gives out tokens to high-priority users. If
|
|
|
+the mirrors are knocked down,
|
|
|
+%by distributed DoS attacks or even
|
|
|
+%physical attack,
|
|
|
+those users can switch to accessing Bob's service via
|
|
|
the Tor rendezvous system.
|
|
|
|
|
|
Since Bob's introduction points might themselves be subject to DoS he
|
|
@@ -1333,7 +1346,7 @@ are not advertised in the DHT\@. This is most likely to be practical
|
|
|
if there is a relatively stable and large group of introduction points
|
|
|
generally available. Alternatively, Bob could give secret public keys
|
|
|
to selected users for consulting the DHT\@. All of these approaches
|
|
|
-have the advantage of limiting the damage that can be done even if
|
|
|
+have the advantage of limiting exposure even when
|
|
|
some of the selected high-priority users collude in the DoS\@.
|
|
|
|
|
|
\SubSection{Integration with user applications}
|
|
@@ -1341,18 +1354,19 @@ some of the selected high-priority users collude in the DoS\@.
|
|
|
Bob configures his onion proxy to know the local IP address and port of his
|
|
|
service, a strategy for authorizing clients, and a public key. Bob
|
|
|
publishes the public key, an expiration time (``not valid after''), and
|
|
|
-the current introduction points for his service into the DHT, all indexed
|
|
|
-by the hash of the public key. Note that Bob's webserver is unmodified,
|
|
|
+the current introduction points for his service into the DHT, indexed
|
|
|
+by the hash of the public key. Bob's webserver is unmodified,
|
|
|
and doesn't even know that it's hidden behind the Tor network.
|
|
|
|
|
|
Alice's applications also work unchanged---her client interface
|
|
|
remains a SOCKS proxy. We encode all of the necessary information
|
|
|
into the fully qualified domain name Alice uses when establishing her
|
|
|
connection. Location-hidden services use a virtual top level domain
|
|
|
-called `.onion': thus hostnames take the form x.y.onion where x is the
|
|
|
-authentication cookie, and y encodes the hash of PK. Alice's onion proxy
|
|
|
+called {\tt .onion}: thus hostnames take the form {\tt x.y.onion} where
|
|
|
+{\tt x} is the authentication cookie, and {\tt y} encodes the hash of
|
|
|
+the public key. Alice's onion proxy
|
|
|
examines addresses; if they're destined for a hidden server, it decodes
|
|
|
-the PK and starts the rendezvous as described in the table above.
|
|
|
+the key and starts the rendezvous as described above.
|
|
|
|
|
|
\subsection{Previous rendezvous work}
|
|
|
|
|
@@ -1368,8 +1382,8 @@ points for low-latency Internet connections was by Ian Goldberg
|
|
|
ours in three ways. First, Goldberg suggests that Alice should manually
|
|
|
hunt down a current location of the service via Gnutella; our approach
|
|
|
makes lookup transparent to the user, as well as faster and more robust.
|
|
|
-Second, in Tor the client and server negotiate ephemeral keys
|
|
|
-via Diffie-Hellman, so plaintext is not exposed at any point. Third,
|
|
|
+Second, in Tor the client and server negotiate session keys
|
|
|
+via Diffie-Hellman, so plaintext is not exposed at the rendezvous point. Third,
|
|
|
our design tries to minimize the exposure associated with running the
|
|
|
service, to encourage volunteers to offer introduction and rendezvous
|
|
|
point services. Tor's introduction points do not output any bytes to the
|
|
@@ -1385,7 +1399,7 @@ acknowledge his existence.
|
|
|
%Below we summarize a variety of attacks, and discuss how well our
|
|
|
%design withstands them.\\
|
|
|
|
|
|
-\noindent{\large Passive attacks}\\
|
|
|
+\noindent{\large\bf Passive attacks}\\
|
|
|
\emph{Observing user traffic patterns.} Observing the connection
|
|
|
from the user will not reveal her destination or data, but it will
|
|
|
reveal traffic patterns (both sent and received). Profiling via user
|
|
@@ -1453,7 +1467,7 @@ these are in principle feasible and surprises are always possible,
|
|
|
these constitute a much more complicated attack, and there is no
|
|
|
current evidence of their practicality.}\\
|
|
|
|
|
|
-\noindent {\large Active attacks}\\
|
|
|
+\noindent{\large\bf Active attacks}\\
|
|
|
\emph{Compromise keys.} An attacker who learns the TLS session key can
|
|
|
see control cells and encrypted relay cells on every circuit on that
|
|
|
connection; learning a circuit
|
|
@@ -1580,7 +1594,7 @@ releases in source code form, encourage source audits, and
|
|
|
frequently warn our users never to trust any software (even from
|
|
|
us!) that comes without source.\\
|
|
|
|
|
|
-\noindent{\large Directory attacks}\\
|
|
|
+\noindent{\large\bf Directory attacks}\\
|
|
|
\emph{Destroy directory servers.} If a few directory
|
|
|
servers drop out of operation, the others still arrive at a final
|
|
|
directory. So long as any directory servers remain in operation,
|
|
@@ -1628,7 +1642,7 @@ servers must actively test ORs by building circuits and streams as
|
|
|
appropriate. The tradeoffs of a similar approach are discussed in
|
|
|
\cite{mix-acc}.\\
|
|
|
|
|
|
-\noindent {\large Attacks against rendezvous points}\\
|
|
|
+\noindent{\large\bf Attacks against rendezvous points}\\
|
|
|
\emph{Make many introduction requests.} An attacker could
|
|
|
try to deny Bob service by flooding his Introduction Point with
|
|
|
requests. Because the introduction point can block requests that
|