|
@@ -81,7 +81,7 @@ build a \emph{circuit}, in which each node (or ``onion router'' or ``OR'')
|
|
|
in the path knows its predecessor and successor, but no other nodes in
|
|
|
the circuit. Traffic flowing down the circuit is sent in fixed-size
|
|
|
\emph{cells}, which are unwrapped by a symmetric key at each node
|
|
|
-(like the layers of an onion) and relayed downstream. The
|
|
|
+(like the layers of an onion) and relayed downstream. The
|
|
|
Onion Routing project published several design and analysis papers
|
|
|
\cite{or-ih96,or-jsac98,or-discex00,or-pet00}. While a wide area Onion
|
|
|
Routing network was deployed briefly, the only long-running and
|
|
@@ -144,7 +144,7 @@ streams along each circuit to improve efficiency and anonymity.
|
|
|
|
|
|
\textbf{Leaky-pipe circuit topology:} Through in-band signaling
|
|
|
within the circuit, Tor initiators can direct traffic to nodes partway
|
|
|
-down the circuit. This novel approach
|
|
|
+down the circuit. This novel approach
|
|
|
allows traffic to exit the circuit from the middle---possibly
|
|
|
frustrating traffic shape and volume attacks based on observing the end
|
|
|
of the circuit. (It also allows for long-range padding if
|
|
@@ -257,7 +257,7 @@ difficult for them to prevent an attacker who can eavesdrop both ends of the
|
|
|
communication from correlating the timing and volume
|
|
|
of traffic entering the anonymity network with traffic leaving it. These
|
|
|
protocols are also vulnerable to active attacks in which an
|
|
|
-adversary introduces timing patterns into traffic entering the network and
|
|
|
+adversary introduces timing patterns into traffic entering the network and
|
|
|
looks
|
|
|
for correlated patterns among exiting traffic.
|
|
|
Although some work has been done to frustrate
|
|
@@ -274,7 +274,7 @@ confirmation (cf.\ Section~\ref{subsec:threat-model}).
|
|
|
The simplest low-latency designs are single-hop proxies such as the
|
|
|
{\bf Anonymizer} \cite{anonymizer}, wherein a single trusted server strips the
|
|
|
data's origin before relaying it. These designs are easy to
|
|
|
-analyze, but users must trust the anonymizing proxy.
|
|
|
+analyze, but users must trust the anonymizing proxy.
|
|
|
Concentrating the traffic to a single point increases the anonymity set
|
|
|
(the people a given user is hiding among), but it is vulnerable if the
|
|
|
adversary can observe all traffic going into and out of the proxy.
|
|
@@ -294,7 +294,7 @@ The {\bf Java Anon Proxy} (also known as JAP or Web MIXes) uses fixed shared
|
|
|
routes known as \emph{cascades}. As with a single-hop proxy, this
|
|
|
approach aggregates users into larger anonymity sets, but again an
|
|
|
attacker only needs to observe both ends of the cascade to bridge all
|
|
|
-the system's traffic. The Java Anon Proxy's design
|
|
|
+the system's traffic. The Java Anon Proxy's design
|
|
|
calls for padding between end users and the head of the cascade
|
|
|
\cite{web-mix}. However, it has not been demonstrated whether the current
|
|
|
implementation's padding policy improves anonymity.
|
|
@@ -340,7 +340,7 @@ Tor, they may accept TCP streams and relay the data in those streams
|
|
|
along the circuit, ignoring the breakdown of that data into TCP segments
|
|
|
\cite{morphmix:fc04,anonnet}. Finally, they may accept application-level
|
|
|
protocols (such as HTTP) and relay the application requests themselves
|
|
|
-along the circuit.
|
|
|
+along the circuit.
|
|
|
Making this protocol-layer decision requires a compromise between flexibility
|
|
|
and anonymity. For example, a system that understands HTTP, such as Crowds,
|
|
|
can strip
|
|
@@ -449,7 +449,7 @@ normalization} like Privoxy or the Anonymizer. If anonymization from
|
|
|
the responder is desired for complex and variable
|
|
|
protocols like HTTP, Tor must be layered with a filtering proxy such
|
|
|
as Privoxy to hide differences between clients, and expunge protocol
|
|
|
-features that leak identity.
|
|
|
+features that leak identity.
|
|
|
Note that by this separation Tor can also provide services that
|
|
|
are anonymous to the network yet authenticated to the responder, like
|
|
|
SSH. Similarly, Tor does not currently integrate
|
|
@@ -473,7 +473,7 @@ compromise some fraction of the onion routers.
|
|
|
In low-latency anonymity systems that use layered encryption, the
|
|
|
adversary's typical goal is to observe both the initiator and the
|
|
|
responder. By observing both ends, passive attackers can confirm a
|
|
|
-suspicion that Alice is
|
|
|
+suspicion that Alice is
|
|
|
talking to Bob if the timing and volume patterns of the traffic on the
|
|
|
connection are distinct enough; active attackers can induce timing
|
|
|
signatures on the traffic to force distinct patterns. Rather
|
|
@@ -509,7 +509,7 @@ each of these attacks.
|
|
|
\Section{The Tor Design}
|
|
|
\label{sec:design}
|
|
|
|
|
|
-The Tor network is an overlay network; each onion router (OR)
|
|
|
+The Tor network is an overlay network; each onion router (OR)
|
|
|
runs as a normal
|
|
|
user-level process without any special privileges.
|
|
|
Each onion router maintains a long-term TLS \cite{TLS}
|
|
@@ -524,7 +524,7 @@ runs local software called an onion proxy (OP) to fetch directories,
|
|
|
establish circuits across the network,
|
|
|
and handle connections from user applications. These onion proxies accept
|
|
|
TCP streams and multiplex them across the circuits. The onion
|
|
|
-router on the other side
|
|
|
+router on the other side
|
|
|
of the circuit connects to the destinations of
|
|
|
the TCP streams and relays data.
|
|
|
|
|
@@ -578,8 +578,8 @@ and \emph{destroy} (to tear down a circuit).
|
|
|
Relay cells have an additional header (the relay header) after the
|
|
|
cell header, containing a stream identifier (many streams can
|
|
|
be multiplexed over a circuit); an end-to-end checksum for integrity
|
|
|
-checking; the length of the relay payload; and a relay command.
|
|
|
-The entire contents of the relay header and the relay cell payload
|
|
|
+checking; the length of the relay payload; and a relay command.
|
|
|
+The entire contents of the relay header and the relay cell payload
|
|
|
are encrypted or decrypted together as the relay cell moves along the
|
|
|
circuit, using the 128-bit AES cipher in counter mode to generate a
|
|
|
cipher stream.
|
|
@@ -622,7 +622,7 @@ without delaying streams and thereby harming user experience.\\
|
|
|
A user's OP constructs circuits incrementally, negotiating a
|
|
|
symmetric key with each OR on the circuit, one hop at a time. To begin
|
|
|
creating a new circuit, the OP (call her Alice) sends a
|
|
|
-\emph{create} cell to the first node in her chosen path (call him Bob).
|
|
|
+\emph{create} cell to the first node in her chosen path (call him Bob).
|
|
|
(She chooses a new
|
|
|
circID $C_{AB}$ not currently used on the connection from her to Bob.)
|
|
|
The \emph{create} cell's
|
|
@@ -694,7 +694,7 @@ whether the decrypted streamID is recognized---either because it
|
|
|
corresponds to an open stream at this OR for the given circuit, or because
|
|
|
it is the control streamID (zero). If the OR recognizes the
|
|
|
streamID, it accepts the relay cell and processes it as described
|
|
|
-below. Otherwise,
|
|
|
+below. Otherwise,
|
|
|
the OR looks up the circID and OR for the
|
|
|
next step in the circuit, replaces the circID as appropriate, and
|
|
|
sends the decrypted relay cell to the next OR. (If the OR at the end
|
|
@@ -713,19 +713,19 @@ encrypts the cell payload (that is, the relay header and payload) with
|
|
|
the symmetric key of each hop up to that OR. Because the streamID is
|
|
|
encrypted to a different value at each step, only at the targeted OR
|
|
|
will it have a meaningful value.\footnote{
|
|
|
- % Should we just say that 2^56 is itself negligible?
|
|
|
- % Assuming 4-hop circuits with 10 streams per hop, there are 33
|
|
|
+ % Should we just say that 2^56 is itself negligible?
|
|
|
+ % Assuming 4-hop circuits with 10 streams per hop, there are 33
|
|
|
% possible bad streamIDs before the last circuit. This still
|
|
|
% gives an error only once every 2 million terabytes (approx).
|
|
|
With 56 bits of streamID per cell, the probability of an accidental
|
|
|
collision is far lower than the chance of hardware failure.}
|
|
|
This \emph{leaky pipe} circuit topology
|
|
|
-allows Alice's streams to exit at different ORs on a single circuit.
|
|
|
+allows Alice's streams to exit at different ORs on a single circuit.
|
|
|
Alice may choose different exit points because of their exit policies,
|
|
|
or to keep the ORs from knowing that two streams
|
|
|
originate from the same person.
|
|
|
|
|
|
-When an OR later replies to Alice with a relay cell, it
|
|
|
+When an OR later replies to Alice with a relay cell, it
|
|
|
encrypts the cell's relay header and payload with the single key it
|
|
|
shares with Alice, and sends the cell back toward Alice along the
|
|
|
circuit. Subsequent ORs add further layers of encryption as they
|
|
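The leaky-pipe behaviour described in this hunk can be sketched as follows. This is a minimal illustration, not Tor's implementation. The Python standard library has no AES, so the sketch swaps in a SHA-256-based keystream for 128-bit AES in counter mode, and it restarts that keystream for every cell, which a real stream cipher must never do. The 7-byte streamID follows the 56-bit figure in the footnote; all function and variable names are hypothetical.

```python
import hashlib

STREAM_ID_LEN = 7                          # 56-bit streamID, per the footnote
CONTROL_STREAM_ID = bytes(STREAM_ID_LEN)   # the control streamID (zero)


def keystream(key: bytes, length: int) -> bytes:
    """Stand-in keystream: SHA-256 over key || counter. This is NOT AES-CTR."""
    blocks = bytearray()
    counter = 0
    while len(blocks) < length:
        blocks += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(blocks[:length])


def apply_layer(key: bytes, body: bytes) -> bytes:
    """XOR one layer on or off (the same operation for a stream cipher)."""
    return bytes(b ^ k for b, k in zip(body, keystream(key, len(body))))


def wrap_for_hop(hop_keys, target, body):
    """OP side: add one layer for every hop up to and including `target`."""
    for key in reversed(hop_keys[: target + 1]):
        body = apply_layer(key, body)
    return body


def relay(hop_keys, open_streams, body):
    """OR side: each hop strips its layer and checks the streamID.

    Returns (index of the OR that recognized the cell, decrypted body),
    or (None, body) if no OR on the circuit recognized it.
    """
    for index, key in enumerate(hop_keys):
        body = apply_layer(key, body)
        stream_id = body[:STREAM_ID_LEN]
        if stream_id == CONTROL_STREAM_ID or stream_id in open_streams[index]:
            return index, body
    return None, body
```

A cell wrapped for the second hop of a three-hop circuit is recognized there and never reaches the third hop. At the first hop the streamID still carries one layer of encryption, so it matches an open stream only by accident.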
@@ -836,7 +836,7 @@ Thus, we check integrity only at the edges of each stream. When Alice
|
|
|
negotiates a key with a new hop, they each initialize a SHA-1
|
|
|
digest with a derivative of that key,
|
|
|
thus beginning with randomness that only the two of them know. From
|
|
|
-then on they each incrementally add to the SHA-1 digest the contents of
|
|
|
+then on they each incrementally add to the SHA-1 digest the contents of
|
|
|
all relay cells they create, and include with each relay cell the
|
|
|
first four bytes of the current digest. Each also keeps a SHA-1
|
|
|
digest of data received, to verify that the received hashes are correct.
|
|
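The running-digest check described in this hunk can be sketched with Python's standard hashlib module. This is a minimal sketch under stated assumptions: both directions are seeded with the same key derivative, the bytes fed to the digest are simply whatever the caller passes as the cell contents, and the class and method names are hypothetical. The paper does not specify these details here.

```python
import hashlib
import hmac


class RelayIntegrity:
    """One endpoint's view of the running SHA-1 digests for a single hop."""

    def __init__(self, key_derivative: bytes):
        # Both digests start from randomness only the two endpoints know.
        # Assumption: one shared seed for both directions.
        self._created = hashlib.sha1(key_derivative)
        self._received = hashlib.sha1(key_derivative)

    def tag_outgoing(self, cell_contents: bytes) -> bytes:
        """Add a created cell to the digest; return its first four bytes."""
        self._created.update(cell_contents)
        return self._created.digest()[:4]

    def verify_incoming(self, cell_contents: bytes, tag: bytes) -> bool:
        """Add a received cell to the digest and compare against its tag."""
        self._received.update(cell_contents)
        return hmac.compare_digest(self._received.digest()[:4], tag)
```

Because each tag depends on every earlier cell, a cell that is altered, dropped, or reordered makes this and all later checks fail. That matches the policy of tearing down the circuit on the first bad hash.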
@@ -851,7 +851,7 @@ of computing the digests is minimal compared to doing the AES
|
|
|
encryption performed at each hop of the circuit. We use only four
|
|
|
bytes per cell to minimize overhead; the chance that an adversary will
|
|
|
correctly guess a valid hash
|
|
|
-%, plus the payload the current cell,
|
|
|
+%, plus the payload the current cell,
|
|
|
is
|
|
|
acceptably low, given that Alice or Bob tear down the circuit if they
|
|
|
receive a bad hash.
|
|
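As a rough worked figure for the four-byte hash, assume the four digest bytes look uniformly random to an adversary who does not know the key-seeded digest state. A single blind guess is then accepted with probability
\[
  \Pr[\mbox{guess accepted}] \;=\; 2^{-32} \;\approx\; 2.3 \times 10^{-10}.
\]
Because the circuit is torn down on the first bad hash, the adversary gets only one such attempt per circuit.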
@@ -861,7 +861,7 @@ receive a bad hash.
|
|
|
|
|
|
Volunteers are generally more willing to run services that can limit
|
|
|
their own bandwidth usage. To accommodate them, Tor servers use a
|
|
|
-token bucket approach \cite{tannenbaum96} to
|
|
|
+token bucket approach \cite{tannenbaum96} to
|
|
|
enforce a long-term average rate of incoming bytes, while still
|
|
|
permitting short-term bursts above the allowed bandwidth. Current bucket
|
|
|
sizes are set to ten seconds' worth of traffic.
|
|
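The token bucket approach in this hunk can be sketched as follows. This is a minimal sketch, assuming the bucket holds ten seconds' worth of traffic at the configured rate and that the caller supplies the clock. The class name, method names, and byte-granularity accounting are illustrative, not Tor's.

```python
class TokenBucket:
    """Enforce a long-term average byte rate while allowing short bursts."""

    def __init__(self, rate_bytes_per_sec: float, burst_seconds: float = 10.0):
        self.rate = rate_bytes_per_sec
        # Bucket size: ten seconds' worth of traffic by default.
        self.capacity = rate_bytes_per_sec * burst_seconds
        self.tokens = self.capacity
        self.last_refill = 0.0

    def refill(self, now: float) -> None:
        """Add tokens for the time elapsed since the last refill, up to capacity."""
        elapsed = max(0.0, now - self.last_refill)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last_refill = now

    def try_consume(self, nbytes: int, now: float) -> bool:
        """Spend tokens for nbytes of incoming data, or refuse if short."""
        self.refill(now)
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False
```

A server that has been idle can absorb a burst of up to the bucket size at once. Sustained traffic above the configured rate drains the bucket and is then held to the long-term average.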
@@ -908,7 +908,7 @@ reimplement full TCP windows (with sequence numbers,
|
|
|
the ability to drop cells when we're full and retransmit later, and so
|
|
|
on),
|
|
|
because TCP already guarantees in-order delivery of each
|
|
|
-cell.
|
|
|
+cell.
|
|
|
%But we need to investigate further the effects of the current
|
|
|
%parameters on throughput and latency, while also keeping privacy in mind;
|
|
|
%see Section~\ref{sec:maintaining-anonymity} for more discussion.
|
|
@@ -950,9 +950,9 @@ Currently, non-data relay cells do not affect the windows. Thus we
|
|
|
avoid potential deadlock issues, for example, arising because a stream
|
|
|
can't send a \emph{relay sendme} cell when its packaging window is empty.
|
|
|
|
|
|
-These arbitrarily chosen parameters
|
|
|
+These arbitrarily chosen parameters
|
|
|
%are probably not optimal; more
|
|
|
-%research remains to find which parameters
|
|
|
+%research remains to find which parameters
|
|
|
seem to give tolerable throughput and delay; more research remains.
|
|
|
|
|
|
\Section{Other design decisions}
|
|
@@ -1042,7 +1042,7 @@ given host or network---an external adversary cannot eavesdrop traffic
|
|
|
between the private exit and the final destination, and so is less sure of
|
|
|
Alice's destination and activities. Most onion routers will function as
|
|
|
\emph{restricted exits} that permit connections to the world at large,
|
|
|
-but prevent access to certain abuse-prone addresses and services.
|
|
|
+but prevent access to certain abuse-prone addresses and services.
|
|
|
Additionally, in some cases the OR can authenticate clients to
|
|
|
prevent exit abuse without harming anonymity \cite{or-discex00}.
|
|
|
|
|
@@ -1134,7 +1134,7 @@ an adversary could take over the network by creating many servers
|
|
|
server administrator before they are included. Mechanisms for automated
|
|
|
node approval are an area of active research, and are discussed more
|
|
|
in Section~\ref{sec:maintaining-anonymity}.
|
|
|
-
|
|
|
+
|
|
|
Of course, a variety of attacks remain. An adversary who controls
|
|
|
a directory server can track clients by providing them different
|
|
|
information---perhaps by listing only nodes under its control, or by
|
|
@@ -1214,7 +1214,7 @@ identity even in the presence of router failure. Bob's service must
|
|
|
not be tied to a single OR, and Bob must be able to tie his service
|
|
|
to new ORs. \textbf{Smear-resistant:}
|
|
|
A social attacker who offers an illegal or disreputable location-hidden
|
|
|
-service should not be able to ``frame'' a rendezvous router by
|
|
|
+service should not be able to ``frame'' a rendezvous router by
|
|
|
making observers believe the router created that service.
|
|
|
%slander-resistant? defamation-resistant?
|
|
|
\textbf{Application-transparent:} Although we require users
|
|
@@ -1257,7 +1257,7 @@ application integration is described more fully below.
|
|
|
rendezvous cookie that it will use to recognize Bob.
|
|
|
\item Alice opens an anonymous stream to one of Bob's introduction
|
|
|
points, and gives it a message (encrypted to Bob's public key)
|
|
|
- which tells him
|
|
|
+ which tells him
|
|
|
about herself, her chosen RP and the rendezvous cookie, and the
|
|
|
first half of a DH
|
|
|
handshake. The introduction point sends the message to Bob.
|
|
@@ -1296,7 +1296,7 @@ service. During normal situations, Bob's service might simply be offered
|
|
|
directly from mirrors, while Bob gives out tokens to high-priority users. If
|
|
|
the mirrors are knocked down,
|
|
|
%by distributed DoS attacks or even
|
|
|
-%physical attack,
|
|
|
+%physical attack,
|
|
|
those users can switch to accessing Bob's service via
|
|
|
the Tor rendezvous system.
|
|
|
|
|
@@ -1369,7 +1369,7 @@ reveal traffic patterns (both sent and received). Profiling via user
|
|
|
connection patterns requires further processing, because multiple
|
|
|
application streams may be operating simultaneously or in series over
|
|
|
a single circuit.
|
|
|
-
|
|
|
+
|
|
|
\emph{Observing user content.} While content at the user end is encrypted,
|
|
|
connections to responders may not be (indeed, the responding website
|
|
|
itself may be hostile). While filtering content is not a primary goal
|
|
@@ -1394,20 +1394,20 @@ by running the OP on the Tor node or behind a firewall. This approach
|
|
|
requires an observer to separate traffic originating at the onion
|
|
|
router from traffic passing through it: a global observer can do this,
|
|
|
but it might be beyond a limited observer's capabilities.
|
|
|
-
|
|
|
+
|
|
|
\emph{End-to-end size correlation.} Simple packet counting
|
|
|
will also be effective in confirming
|
|
|
endpoints of a stream. However, even without padding, we have some
|
|
|
limited protection: the leaky pipe topology means different numbers
|
|
|
of packets may enter one end of a circuit than exit at the other.
|
|
|
-
|
|
|
+
|
|
|
\emph{Website fingerprinting.} All the effective passive
|
|
|
attacks above are traffic confirmation attacks,
|
|
|
which puts them outside our design goals. There is also
|
|
|
a passive traffic analysis attack that is potentially effective.
|
|
|
Rather than searching exit connections for timing and volume
|
|
|
correlations, the adversary may build up a database of
|
|
|
-``fingerprints'' containing file sizes and access patterns for
|
|
|
+``fingerprints'' containing file sizes and access patterns for
|
|
|
targeted websites. He can later confirm a user's connection to a given
|
|
|
site simply by consulting the database. This attack has
|
|
|
been shown to be effective against SafeWeb \cite{hintz-pet02}.
|
|
@@ -1415,7 +1415,7 @@ It may be less effective against Tor, since
|
|
|
streams are multiplexed within the same circuit, and
|
|
|
fingerprinting will be limited to
|
|
|
the granularity of cells (currently 256 bytes). Additional
|
|
|
-defenses could include
|
|
|
+defenses could include
|
|
|
larger cell sizes, padding schemes to group websites
|
|
|
into large sets, and link
|
|
|
padding or long-range dummies.\footnote{Note that this fingerprinting
|
|
@@ -1464,7 +1464,7 @@ connection. There is also a danger that application
|
|
|
protocols and associated programs can be induced to reveal information
|
|
|
about the initiator. Tor depends on Privoxy and similar protocol cleaners
|
|
|
to solve this latter problem.
|
|
|
-
|
|
|
+
|
|
|
\emph{Run an onion proxy.} It is expected that end users will
|
|
|
nearly always run their own local onion proxy. However, in some
|
|
|
settings, it may be necessary for the proxy to run
|
|
@@ -1478,7 +1478,7 @@ of the Tor network can increase the value of this traffic
|
|
|
by attacking non-observed nodes to shut them down, reduce
|
|
|
their reliability, or persuade users that they are not trustworthy.
|
|
|
The best defense here is robustness.
|
|
|
-
|
|
|
+
|
|
|
\emph{Run a hostile OR.} In addition to being a local observer,
|
|
|
an isolated hostile node can create circuits through itself, or alter
|
|
|
traffic patterns to affect traffic at other nodes. Nonetheless, a hostile
|
|
@@ -1488,8 +1488,8 @@ run multiple ORs, and can persuade the directory servers
|
|
|
that those ORs are trustworthy and independent, then occasionally
|
|
|
some user will choose one of those ORs for the start and another
|
|
|
as the end of a circuit. If an adversary
|
|
|
-controls $m>1$ out of $N$ nodes, he should be able to correlate at most
|
|
|
-$\left(\frac{m}{N}\right)^2$ of the traffic in this way---although an
|
|
|
+controls $m>1$ out of $N$ nodes, he should be able to correlate at most
|
|
|
+$\left(\frac{m}{N}\right)^2$ of the traffic in this way---although an
|
|
|
adversary
|
|
|
could possibly attract a disproportionately large amount of traffic
|
|
|
by running an OR with an unusually permissive exit policy, or by
|
|
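To make the bound concrete, take the illustrative values $N = 100$ and $m = 10$ (these numbers are not from any measurement), and assume clients pick the first and last hop independently and uniformly at random. Then
\[
  \left(\frac{m}{N}\right)^2 \;=\; \left(\frac{10}{100}\right)^2 \;=\; 0.01,
\]
so such an adversary would expect to sit at both ends of about one circuit in a hundred. Doubling his share of the network to $m = 20$ quadruples that figure to $0.04$.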
@@ -1497,7 +1497,7 @@ degrading the reliability of other routers.
|
|
|
|
|
|
\emph{Introduce timing into messages.} This is simply a stronger
|
|
|
version of passive timing attacks already discussed earlier.
|
|
|
-
|
|
|
+
|
|
|
\emph{Tagging attacks.} A hostile node could ``tag'' a
|
|
|
cell by altering it. If the
|
|
|
stream were, for example, an unencrypted request to a Web site,
|
|
@@ -1506,14 +1506,14 @@ the association. However, integrity checks on cells prevent
|
|
|
this attack.
|
|
|
|
|
|
\emph{Replace contents of unauthenticated protocols.} When
|
|
|
-relaying an unauthenticated protocol like HTTP, a hostile exit node
|
|
|
+relaying an unauthenticated protocol like HTTP, a hostile exit node
|
|
|
can impersonate the target server. Clients
|
|
|
should prefer protocols with end-to-end authentication.
|
|
|
|
|
|
\emph{Replay attacks.} Some anonymity protocols are vulnerable
|
|
|
to replay attacks. Tor is not; replaying one side of a handshake
|
|
|
will result in a different negotiated session key, and so the rest
|
|
|
-of the recorded session can't be used.
|
|
|
+of the recorded session can't be used.
|
|
|
|
|
|
\emph{Smear attacks.} An attacker could use the Tor network for
|
|
|
socially disapproved acts, to bring the
|
|
@@ -1558,7 +1558,7 @@ ORs in the final directory as he wishes. We must ensure that directory
|
|
|
server operators are independent and attack-resistant.
|
|
|
|
|
|
\emph{Encourage directory server dissent.} The directory
|
|
|
-agreement protocol assumes that directory server operators agree on
|
|
|
+agreement protocol assumes that directory server operators agree on
|
|
|
the set of directory servers. An adversary who can persuade some
|
|
|
of the directory server operators to distrust one another could
|
|
|
split the quorum into mutually hostile camps, thus partitioning
|
|
@@ -1567,7 +1567,7 @@ this attack.
|
|
|
|
|
|
\emph{Trick the directory servers into listing a hostile OR.}
|
|
|
Our threat model explicitly assumes directory server operators will
|
|
|
-be able to filter out most hostile ORs.
|
|
|
+be able to filter out most hostile ORs.
|
|
|
% If this is not true, an
|
|
|
% attacker can flood the directory with compromised servers.
|
|
|
|
|
@@ -1579,7 +1579,7 @@ accepting TLS connections from ORs but ignoring all cells. Directory
|
|
|
servers must actively test ORs by building circuits and streams as
|
|
|
appropriate. The tradeoffs of a similar approach are discussed in
|
|
|
\cite{mix-acc}.\\
|
|
|
-
|
|
|
+
|
|
|
\noindent{\large\bf Attacks against rendezvous points}\\
|
|
|
\emph{Make many introduction requests.} An attacker could
|
|
|
try to deny Bob service by flooding his introduction points with
|
|
@@ -1587,7 +1587,7 @@ requests. Because the introduction points can block requests that
|
|
|
lack authorization tokens, however, Bob can restrict the volume of
|
|
|
requests he receives, or require a certain amount of computation for
|
|
|
every request he receives.
|
|
|
-
|
|
|
+
|
|
|
\emph{Attack an introduction point.} An attacker could
|
|
|
disrupt a location-hidden service by disabling its introduction
|
|
|
points. But because a service's identity is attached to its public
|
|
@@ -1612,7 +1612,7 @@ with a session key shared by Alice and Bob.
|
|
|
|
|
|
\Section{Open Questions in Low-latency Anonymity}
|
|
|
\label{sec:maintaining-anonymity}
|
|
|
-
|
|
|
+
|
|
|
In addition to the non-goals in
|
|
|
Section~\ref{subsec:non-goals}, many other questions must be solved
|
|
|
before we can be confident of Tor's security.
|
|
@@ -1645,7 +1645,7 @@ three nodes unrelated to herself and her destination.
|
|
|
%
|
|
|
%Thus normally she chooses
|
|
|
%three nodes, but if she is running an OR and her destination is on an OR,
|
|
|
-%she uses five.
|
|
|
+%she uses five.
|
|
|
Should Alice choose a nondeterministic path length (say,
|
|
|
increasing it from a geometric distribution) to foil an attacker who
|
|
|
uses timing to learn that he is the fifth hop and thus concludes that
|
|
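One way to read the geometric-distribution suggestion in this hunk is sketched below. This is a hypothetical sketch, not a proposal from the paper. The base length of three hops, the continuation probability, and the function name are all assumptions chosen for illustration.

```python
import random


def choose_path_length(base_hops: int = 3, continue_prob: float = 0.5, rng=None) -> int:
    """Return base_hops plus a geometrically distributed number of extra hops."""
    if not 0.0 <= continue_prob < 1.0:
        raise ValueError("continue_prob must be in [0, 1)")
    rng = rng or random.Random()
    extra = 0
    # Keep adding a hop with probability continue_prob each time.
    while rng.random() < continue_prob:
        extra += 1
    return base_hops + extra
```

With a continuation probability of 0.5 the expected number of extra hops is one. The trade-off is added latency on the longer circuits.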
@@ -1684,7 +1684,7 @@ immediately beneficial because of real-world adversaries that can't
|
|
|
observe Alice's router, but can run routers of their own?
|
|
|
|
|
|
To scale to many users, and to prevent an attacker from observing the
|
|
|
-whole network at once, it may be necessary
|
|
|
+whole network at once, it may be necessary
|
|
|
to support far more servers than Tor currently anticipates.
|
|
|
This introduces several issues. First, if approval by a centralized set
|
|
|
of directory servers is no longer feasible, what mechanism should be used
|
|
@@ -1724,7 +1724,7 @@ Tor brings together many innovations into a unified deployable system. The
|
|
|
next immediate steps include:
|
|
|
|
|
|
\emph{Scalability:} Tor's emphasis on deployability and design simplicity
|
|
|
-has led us to adopt a clique topology, semi-centralized
|
|
|
+has led us to adopt a clique topology, semi-centralized
|
|
|
directories, and a full-network-visibility model for client
|
|
|
knowledge. These properties will not scale past a few hundred servers.
|
|
|
Section~\ref{sec:maintaining-anonymity} describes some promising
|
|
@@ -1831,7 +1831,7 @@ our overall usability.
|
|
|
% 'Cypherpunk', 'Cypherpunks', 'Cypherpunk remailer'
|
|
|
% 'Onion Routing design', 'onion router' [note capitalization]
|
|
|
% 'SOCKS'
|
|
|
-% Try not to use \cite as a noun.
|
|
|
+% Try not to use \cite as a noun.
|
|
|
% 'Authorizating' sounds great, but it isn't a word.
|
|
|
% 'First, second, third', not 'Firstly, secondly, thirdly'.
|
|
|
% 'circuit', not 'channel'
|