@@ -52,7 +52,7 @@
\begin{abstract}
We present Tor, a circuit-based low-latency anonymous communication
system. Tor is the successor to Onion Routing
-and addresses many limitations in the original Onion Routing design.
+and addresses various limitations in the original Onion Routing design.
Tor works in a real-world Internet environment, requires no special
privileges such as root- or kernel-level access,
requires little synchronization or coordination between nodes, and
@@ -388,7 +388,8 @@ they avoid the well-known inefficiencies of tunneling TCP over TCP

Distributed-trust anonymizing systems need to prevent attackers from
adding too many servers and thus compromising too many user paths.
-Tor relies on a centrally maintained set of well-known servers. Tarzan
+Tor relies on a small set of well-known servers to make
+decisions about which nodes can join. Tarzan
and MorphMix allow unknown users to run servers, and limit an attacker
from becoming too much of the network based on a limited resource such
as number of IPs controlled. Crowds suggests requiring written, notarized
@@ -440,13 +441,13 @@ so that it can serve as a test-bed for future research in low-latency
anonymity systems. Many of the open problems in low-latency anonymity
networks, such as generating dummy traffic or preventing Sybil attacks
\cite{sybil}, may be solvable independently from the issues solved by
-Tor. Hopefully future systems will not need to reinvent Tor's design
-decisions. (But note that while a flexible design benefits researchers,
+Tor. Hopefully future systems will not need to reinvent Tor's design.
+(But note that while a flexible design benefits researchers,
there is a danger that differing choices of extensions will make users
distinguishable. Experiments should be run on a separate network.)

-\textbf{Conservative design:} The protocol's design and security
-parameters must be conservative. Additional features impose implementation
+\textbf{Simple design:} The protocol's design and security
+parameters must be well-understood. Additional features impose implementation
and complexity costs; adding unproven techniques to the design threatens
deployability, readability, and ease of security analysis. Tor aims to
deploy a simple and stable system that integrates the best well-understood
@@ -454,14 +455,15 @@ approaches to protecting anonymity.

\SubSection{Non-goals}
\label{subsec:non-goals}
-In favoring conservative, deployable designs, we have explicitly deferred
+In favoring simple, deployable designs, we have explicitly deferred
a number of goals, either because they are solved elsewhere, or because
they are an open research question.

\textbf{Not Peer-to-peer:} Tarzan and MorphMix aim to scale to completely
decentralized peer-to-peer environments with thousands of short-lived
servers, many of which may be controlled by an adversary. This approach
-is appealing, but still has many open problems.
+is appealing, but still has many open problems
+\cite{tarzan:ccs02,morphmix:fc04}.

\textbf{Not secure against end-to-end attacks:} Tor does not claim
to provide a definitive solution to end-to-end timing or intersection
@@ -522,9 +524,10 @@ network and correlating traffic entering and leaving the network---either
because of relationships in packet timing; relationships in the volume
of data sent; or relationships in any externally visible user-selected
options. The adversary can also mount active attacks by compromising
-routers or keys; by replaying traffic; by selectively DoSing trustworthy
-routers to encourage users to send their traffic through compromised
-routers, or DoSing users to see if the traffic elsewhere in the
+routers or keys; by replaying traffic; by selectively denying service
+to trustworthy routers to encourage users to send their traffic through
+compromised routers, or denying service to users to see if the traffic
+elsewhere in the
network stops; or by introducing patterns into traffic that can later be
detected. The adversary might attack the directory servers to give users
differing views of network state. Additionally, he can try to decrease
@@ -587,8 +590,10 @@ fairness issues.
% I think we should describe connections before cells. -NM

Traffic passes from one OR to another, or between a user's OP and an OR,
-in fixed-size cells. Each cell is 256
-bytes, and consists of a header and a payload. The header includes an
+in fixed-size cells. Each cell is 256 bytes (but see
+Section~\ref{sec:conclusion}
+for a discussion of allowing large cells and small cells on the same
+network), and consists of a header and a payload. The header includes an
anonymous circuit identifier (ACI) that specifies which circuit the
% Should we replace ACI with circID ? What is this 'anonymous circuit'
% thing anyway? -RD
@@ -611,7 +616,8 @@ be multiplexed over a circuit); an end-to-end checksum for integrity
checking; the length of the relay payload; and a relay command. Relay
commands can be one of: \emph{relay
data} (for data flowing down the stream), \emph{relay begin} (to open a
-stream), \emph{relay end} (to close a stream), \emph{relay connected}
+stream), \emph{relay end} (to close a stream cleanly), \emph{relay
+teardown} (to close a broken stream), \emph{relay connected}
(to notify the OP that a relay begin has succeeded), \emph{relay
extend} and \emph{relay extended} (to extend the circuit by a hop,
and to acknowledge), \emph{relay truncate} and \emph{relay truncated}
@@ -621,9 +627,6 @@ implement long-range dummies).

We describe each of these cell types in more detail below.

-% Nick: should there have been a table here? -RD
-% Maybe. -NM
-
\SubSection{Circuits and streams}
\label{subsec:circuits}
@@ -638,8 +641,9 @@ open many TCP streams.
In Tor, each circuit can be shared by many TCP streams. To avoid
delays, users construct circuits preemptively. To limit linkability
among the streams, users rotate connections by building a new circuit
-periodically (currently every minute) if the previous one has been
-used, and expire old used circuits that are no longer in use. Thus
+periodically if the previous one has been used,
+and expire old used circuits that are no longer in use. Tor considers
+making a new circuit once a minute: thus
even heavy users spend a negligible amount of time and CPU in
building circuits, but only a limited number of requests can be linked
to each other by a given exit node. Also, because circuits are built
@@ -745,25 +749,25 @@ applications like Mozilla and ssh have this flaw.

In the case of Mozilla, we're fine: the filtering web proxy called Privoxy
does the SOCKS call safely, and Mozilla talks to Privoxy safely. But a
-portable general solution, such as for ssh, is an open problem. We could
+portable general solution, such as for ssh, is an open problem. We can
modify the local nameserver, but this approach is invasive, brittle, and
-not portable. We could encourage the resolver library to do resolution
+not portable. We can encourage the resolver library to do resolution
via TCP rather than UDP, but this approach is hard to do right, and also
-has portability problems. Our current answer is to encourage the use of
-privacy-aware proxies like Privoxy wherever possible, and also provide
-a tool similar to \emph{dig} that can do a private lookup through the
-Tor network.
+has portability problems. We can provide a tool similar to \emph{dig} that
+can do a private lookup through the Tor network. Our current answer is to
+encourage the use of privacy-aware proxies like Privoxy wherever possible.

Ending a Tor stream is analogous to ending a TCP stream: it uses a
two-step handshake for normal operation, or a one-step handshake for
errors. If one side of the stream closes abnormally, that node simply
sends a relay teardown cell, and tears down the stream. If one side
-% Nick: mention relay teardown in 'cell' subsec? good enough name? -RD
of the stream closes the connection normally, that node sends a relay
end cell down the circuit. When the other side has sent back its own
relay end, the stream can be torn down. This two-step handshake allows
for TCP-based applications that, for example, close a socket for writing
-but are still willing to read.
+but are still willing to read. Remember that all relay cells use layered
+encryption, so only the destination OR knows what type of relay cell
+it is.

\SubSection{Integrity checking on streams}
@@ -815,6 +819,7 @@ that Alice or Bob tear down the circuit if they receive a bad hash.
Volunteers are generally more willing to run services that can limit
their bandwidth usage. To accommodate them, Tor servers use a token
bucket approach to limit the number of bytes they
+% XXX cite token bucket?
receive. Tokens are added to the bucket each second (when the bucket is
full, new tokens are discarded). Each token represents permission to
receive one byte from the network---to receive a byte, the connection
@@ -947,17 +952,6 @@ to slow down other users when they build new circuits.

% What about link-to-link rate limiting?

-More worrisome are distributed denial of service attacks wherein an
-attacker uses a large number of compromised hosts throughout the network
-to consume the Tor network's resources. Although these attacks are not
-new to the networking literature, some proposed approaches are a poor
-fit to anonymous networks. For example, solutions based on backtracking
-harmful traffic \cite{XXX} could allow an anonymity-breaking
-adversary to exploit the backtracking mechanism.
-% XXX I don't see how you would do DDoS through Tor. And even if you
-% did, it seems ok to track you down. Should we remove this
-% paragraph? -RD
-
Attackers also have an opportunity to attack the Tor network by mounting
attacks on its hosts and network links. Disrupting a single circuit or
link breaks all currently open streams passing along that part of the
@@ -1001,7 +995,7 @@ network. (Using a private exit (if one exists) is a more secure way
for a client to connect to a given host or network---an external
adversary cannot eavesdrop traffic between the private exit and the
final destination, and so is less sure of Alice's destination and
-activities.) is less sure of Alice's destination. More generally,
+activities.) In general,
nodes can require a variety of forms of traffic authentication
\cite{or-discex00}.
@@ -1187,7 +1181,7 @@ but refuses to relay traffic from other routers, the directory servers
must build circuits and use them to anonymously test router reliability
\cite{mix-acc}.

-When a client Alice retrieves a consensus directory, she uses it if it
+When Alice retrieves a consensus directory, she uses it if it
is signed by a majority of the directory servers she knows.
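
As an illustration of the acceptance rule above, the following Python sketch checks whether a consensus directory carries valid signatures from a majority of the directory servers Alice knows. The function name, the use of plain identifiers for servers, and the assumption that signature verification has already happened elsewhere are all hypothetical; the text specifies only the majority rule.

```python
def accept_directory(valid_signers, known_servers):
    """Return True if a strict majority of the known directory servers signed.

    valid_signers: identifiers of servers whose signatures already verified
                   (verification itself is assumed to happen elsewhere).
    known_servers: identifiers of the directory servers the client knows.
    """
    known = set(known_servers)
    if not known:
        # With no known servers there is nothing to trust.
        return False
    # Signatures from servers the client does not know are ignored.
    counted = known & set(valid_signers)
    return 2 * len(counted) > len(known)
```

In this sketch a directory signed by two of three known servers is accepted, while one signed by two of four is not, since exactly half is not a majority.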

Using directory servers rather than flooding provides simplicity and
@@ -1221,8 +1215,9 @@ Our design for location-hidden servers has the following properties:
simply by sending many requests to talk to Bob. Thus, Bob needs a
way to filter incoming requests.
\item[Robust:] Bob should be able to maintain a long-term pseudonymous
- identity even in the presence of router failure. Thus, Bob's identity
- must not be tied to a single OR.
+ identity even in the presence of router failure. Thus, Bob's service
+ must not be tied to a single OR, and Bob must be able to tie his service
+ to new ORs.
\item[Smear-resistant:] An attacker should not be able to use rendezvous
points to smear an OR. That is, if a social attacker tries to host a
location-hidden service that is illegal or disreputable, it should not
@@ -1327,8 +1322,8 @@ remains a SOCKS proxy. Thus we must encode all of the necessary
information into the fully qualified domain name Alice uses when
establishing her connections. Location-hidden services use a virtual
top level domain called `.onion': thus hostnames take the form
-x.y.onion where x encodes the hash of PK, and y is the authentication
-cookie. Alice's onion proxy examines hostnames and recognizes when
+x.y.onion where x is the authentication cookie, and y encodes the hash
+of PK. Alice's onion proxy examines hostnames and recognizes when
they're destined for a hidden server. If so, it decodes the PK and
starts the rendezvous as described in the table above.
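
The hostname convention above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name is hypothetical, the sketch assumes exactly two labels before the onion suffix, and it leaves both labels undecoded because the encoding of the hash of PK is not specified here.

```python
def split_onion_hostname(hostname):
    """Split 'x.y.onion' into (cookie, encoded_pk_hash), or return None.

    Hypothetical helper: x is treated as the authentication cookie and
    y as the encoded hash of PK, following the order given in the text.
    """
    # Tolerate a trailing dot, as in a fully qualified domain name.
    labels = hostname.rstrip(".").split(".")
    if len(labels) != 3 or labels[2].lower() != "onion":
        return None  # not a hidden-service hostname in this sketch
    cookie, encoded_pk_hash = labels[0], labels[1]
    if not cookie or not encoded_pk_hash:
        return None  # empty labels are rejected
    return cookie, encoded_pk_hash
```

A proxy built along these lines would hand ordinary hostnames (for which the function returns None) to the normal stream-opening path and start the rendezvous only when both labels are present.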
@@ -1342,7 +1337,7 @@ self-authenticating, and so the client can recognize the same service
with confidence later on. His design also differs from ours in the
following ways: First, Goldberg suggests that the client should
manually hunt down a current location of the service via Gnutella;
-whereas our use of the DHT makes lookup faster, more robust, and
+whereas our use of CFS makes lookup faster, more robust, and
transparent to the user. Second, in Tor the client and server
negotiate ephemeral keys via Diffie-Hellman, so at no point in the
path is the plaintext exposed. Third, our design tries to minimize the
@@ -1546,7 +1541,9 @@ them.
traffic once the circuits have been closed.) Additionally, building
circuits that cross jurisdictions can make legal coercion
harder---this phenomenon is commonly called ``jurisdictional
- arbitrage.''
+ arbitrage.'' The JAP project recently experienced this issue, when
+ the German government successfully ordered them to add a backdoor to
+ all of their nodes.


\item \emph{Run a recipient.} By running a Web server, an adversary
@@ -1890,7 +1887,8 @@ issues remaining to be ironed out. In particular:

%% commented out for anonymous submission
%\Section{Acknowledgments}
-% Peter Palfrader for editing
+% Peter Palfrader, Geoff Goodell, Adam Shostack, Joseph Sokol-Margolis
+% for editing and comments
% Bram Cohen for congestion control discussions
% Adam Back for suggesting telescoping circuits