- \documentclass[times,10pt,twocolumn]{article}
- \usepackage{latex8}
- %\usepackage{times}
- \usepackage{url}
- \usepackage{graphics}
- \usepackage{amsmath}
- \pagestyle{empty}
- \renewcommand\url{\begingroup \def\UrlLeft{<}\def\UrlRight{>}\urlstyle{tt}\Url}
- \newcommand\emailaddr{\begingroup \def\UrlLeft{<}\def\UrlRight{>}\urlstyle{tt}\Url}
- % If an URL ends up with '%'s in it, that's because the line *in the .bib/.tex
- % file* is too long, so break it there (it doesn't matter if the next line is
- % indented with spaces). -DH
- %\newif\ifpdf
- %\ifx\pdfoutput\undefined
- % \pdffalse
- %\else
- % \pdfoutput=1
- % \pdftrue
- %\fi
- \newenvironment{tightlist}{\begin{list}{$\bullet$}{
- \setlength{\itemsep}{0mm}
- \setlength{\parsep}{0mm}
- % \setlength{\labelsep}{0mm}
- % \setlength{\labelwidth}{0mm}
- % \setlength{\topsep}{0mm}
- }}{\end{list}}
- \begin{document}
- %% Use dvipdfm instead. --DH
- %\ifpdf
- % \pdfcompresslevel=9
- % \pdfpagewidth=\the\paperwidth
- % \pdfpageheight=\the\paperheight
- %\fi
- \title{Tor: Design of a Next-Generation Onion Router}
- %\author{Anonymous}
- %\author{Roger Dingledine \\ The Free Haven Project \\ arma@freehaven.net \and
- %Nick Mathewson \\ The Free Haven Project \\ nickm@freehaven.net \and
- %Paul Syverson \\ Naval Research Lab \\ syverson@itd.nrl.navy.mil}
- \maketitle
- \thispagestyle{empty}
- \begin{abstract}
- We present Tor, a connection-based low-latency anonymous communication
- system. It is intended as an update and replacement for onion routing
- and addresses many limitations in the original onion routing design.
- Tor works in a real-world Internet environment,
- requires little synchronization or coordination between nodes, and
- protects against known anonymity-breaking attacks as well
- as or better than other systems with similar design parameters.
- \end{abstract}
- %\begin{center}
- %\textbf{Keywords:} anonymity, peer-to-peer, remailer, nymserver, reply block
- %\end{center}
- %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
- \Section{Overview}
- \label{sec:intro}
- Onion routing is a distributed overlay network designed to anonymize
- low-latency TCP-based applications such as web browsing, secure shell,
- and instant messaging. Users choose a path through the network and
- build a \emph{virtual circuit}, in which each node in the path knows its
- predecessor and successor, but no others. Traffic flowing down the circuit
- is sent in fixed-size \emph{cells}, which are unwrapped by a symmetric key
- at each node, revealing the downstream node. The original onion routing
- project published several design and analysis papers
- \cite{or-jsac98,or-discex00,or-ih96,or-pet00}. While there was briefly
- a wide area onion routing network,
- the only long-running and publicly accessible
- implementation was a fragile proof-of-concept that ran on a single
- machine. Many critical design and deployment issues were never resolved,
- and the design has not been updated in several years.
- Here we describe Tor, a protocol for asynchronous, loosely
- federated onion routers that provides the following improvements over
- the old onion routing design:
- \begin{tightlist}
- \item \textbf{Perfect forward secrecy:} The original onion routing
- design is vulnerable to a single hostile node recording traffic and later
- forcing successive nodes in the circuit to decrypt it. Rather than using
- onions to lay the circuits, Tor uses an incremental or \emph{telescoping}
- path-building design, where the initiator negotiates session keys with
- each successive hop in the circuit. Onion replay detection is no longer
- necessary, and the network as a whole is more reliable to boot, since
- the initiator knows which hop failed and can try extending to a new node.
- \item \textbf{Applications talk to the onion proxy via Socks:}
- The original onion routing design required a separate proxy for each
- supported application protocol, resulting in a lot of extra code (most
- of which was never written) and also meaning that a lot of TCP-based
- applications were not supported. Tor uses the unified and standard Socks
- \cite{socks4,socks5} interface, allowing us to support any TCP-based
- program without modification.
- \item \textbf{Many applications can share one circuit:} The original
- onion routing design built one circuit for each request. Aside from the
- performance issues of doing public key operations for every request, it
- also turns out that regular communications patterns mean building lots
- of circuits, which can endanger anonymity.
- The very first onion routing design \cite{or-ih96} protected against
- this to some extent by hiding network access behind an onion
- router/firewall that was also forwarding traffic from other nodes.
- However, even if this meant complete protection, many users who could
- benefit from onion routing find neither running their own node
- nor such firewall configurations convenient enough to be
- feasible. Those users, especially if they engage in certain unusual
- communication behaviors, may be identifiable \cite{wright03}. To
- complicate such attacks, Tor multiplexes many
- connections down each circuit, but still rotates the circuit
- periodically to avoid too much linkability from requests on a single
- circuit.
- \item \textbf{No mixing or traffic shaping:} The original onion routing
- design called for full link padding both between onion routers and between
- onion proxies (that is, users) and onion routers \cite{or-jsac98}. The
- later analysis paper \cite{or-pet00} suggested \emph{traffic shaping}
- to provide similar protection but use less bandwidth, but did not go
- into detail. However, recent research \cite{econymics} and deployment
- experience \cite{freedom21-security} indicate that this level of resource
- use is not practical or economical; and even full link padding is still
- vulnerable to active attacks \cite{defensive-dropping}.
- %[An upcoming FC04 paper. I'll add a cite when it's out. -RD]
- \item \textbf{Leaky pipes:} Through in-band signalling within the
- circuit, Tor initiators can direct traffic to nodes partway down the
- circuit. This allows for long-range padding to frustrate traffic
- shape and volume attacks at the initiator \cite{defensive-dropping},
- but because circuits are used by more than one application, it also
- allows traffic to exit the circuit from the middle -- thus
- frustrating traffic shape and volume attacks based on observing exit
- points.
- %Or something like that. hm. Tone this down maybe? Or support it. -RD
- %How's that? -PS
- \item \textbf{Congestion control:} Earlier anonymity designs do not
- address traffic bottlenecks. Unfortunately, typical approaches to load
- balancing and flow control in overlay networks involve inter-node control
- communication and global views of traffic. Our decentralized ack-based
- congestion control maintains reasonable anonymity while allowing nodes
- at the edges of the network to detect congestion or flooding attacks
- and send less data until the congestion subsides.
- \item \textbf{Directory servers:} Rather than attempting to flood
- link-state information through the network, which can be unreliable and
- open to partitioning attacks or outright deception, Tor takes a simplified
- view towards distributing link-state information. Certain more trusted
- onion routers also serve as directory servers; they provide signed
- \emph{directories} describing all routers they know about, and which
- are currently up. Users periodically download these directories via HTTP.
- \item \textbf{End-to-end integrity checking:} Without integrity checking
- on traffic going through the network, an onion router can change the
- contents of cells as they pass by, e.g. by redirecting a connection on
- the fly so it connects to a different webserver, or by tagging encrypted
- traffic and looking for traffic at the network edges that has been
- tagged \cite{minion-design}.
- \item \textbf{Robustness to node failure:} router twins
- \item \textbf{Exit policies:}
- Tor provides a consistent mechanism for each node to specify and
- advertise an exit policy.
- \item \textbf{Rendezvous points:}
- location-protected servers
- \end{tightlist}
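The layered unwrapping of fixed-size cells along a telescoped circuit can be sketched as follows. This is only an illustration under stated assumptions: the cell size, the length-prefix framing, and the hash-based XOR keystream are placeholders chosen for brevity and are not part of the design described here. The keystream in particular is a toy and offers no real security.

```python
import hashlib
from typing import List

CELL_SIZE = 256  # hypothetical fixed cell size in bytes; the text gives no figure


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 in counter mode. Illustration only, not secure."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor_layer(key: bytes, cell: bytes) -> bytes:
    """Add or remove one layer; XOR with a keystream is its own inverse."""
    return bytes(a ^ b for a, b in zip(cell, keystream(key, len(cell))))


def make_cell(payload: bytes) -> bytes:
    """Pad a payload to the fixed cell size (2-byte length prefix, zero padding)."""
    if len(payload) > CELL_SIZE - 2:
        raise ValueError("payload too large for one cell")
    body = len(payload).to_bytes(2, "big") + payload
    return body.ljust(CELL_SIZE, b"\x00")


def read_cell(cell: bytes) -> bytes:
    """Strip the length prefix and padding from a fully unwrapped cell."""
    length = int.from_bytes(cell[:2], "big")
    return cell[2:2 + length]


def wrap(hop_keys: List[bytes], payload: bytes) -> bytes:
    """Initiator side: apply the last hop's layer first, the first hop's last."""
    cell = make_cell(payload)
    for key in reversed(hop_keys):
        cell = xor_layer(key, cell)
    return cell


# Session keys the initiator negotiated hop by hop (placeholder values).
hop_keys = [b"key-hop-1", b"key-hop-2", b"key-hop-3"]
cell = wrap(hop_keys, b"GET / HTTP/1.0")

# Each node on the path removes exactly one layer, in path order.
for key in hop_keys:
    cell = xor_layer(key, cell)

recovered = read_cell(cell)
```

Because every hop holds its own session key, a failed extension identifies the hop that failed, and the initiator can retry with a different node without rebuilding the earlier layers.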
- We review previous work in Section \ref{sec:background}, describe
- our goals and assumptions in Section \ref{sec:assumptions},
- and then address the above list of improvements in Sections
- \ref{sec:design}--\ref{sec:maintaining-anonymity}. We then summarize
- how our design stands up to known attacks, and conclude with a list of
- open problems.
- %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
- \Section{Background and threat model}
- \label{sec:background}
- \SubSection{Related work}
- \label{sec:related-work}
- Modern anonymity designs date to Chaum's Mix-Net\cite{chaum-mix} design of
- 1981. Chaum proposed hiding sender-recipient connections by wrapping
- messages in several layers of public key cryptography, and relaying them
- through a path composed of Mix servers. Mix servers in turn decrypt, delay,
- and re-order messages, before relaying them along the path towards their
- destinations.
- Subsequent relay-based anonymity designs have diverged in two
- principal directions. Some have attempted to maximize anonymity at
- the cost of introducing comparatively large and variable latencies,
- for example, Babel\cite{babel}, Mixmaster\cite{mixmaster-spec}, and
- Mixminion\cite{minion-design}. Because of this
- decision, such \emph{high-latency} networks are well-suited for anonymous
- email, but introduce too much lag for interactive tasks such as web browsing,
- internet chat, or SSH connections.
- Tor belongs to the second category: \emph{low-latency} designs that
- attempt to anonymize interactive network traffic. Because such
- traffic tends to involve a relatively large number of packets, it is
- difficult to prevent an attacker who can eavesdrop entry and exit
- points from correlating packets entering the anonymity network with
- packets leaving it. Although some work has been done to frustrate
- these attacks, most designs protect primarily against traffic analysis
- rather than traffic confirmation \cite{or-jsac98}. One can pad and
- limit communication to a constant rate or at least control the
- variation in traffic shape. This can have prohibitive bandwidth costs
- and/or performance limitations. One can also use a cascade (fixed
- shared route) with a relatively fixed set of users. This assumes a
- significant degree of agreement and provides an easier target for an active
- attacker since the endpoints are generally known. However, a practical
- network with both of these features has been run for many years
- (the Java Anon Proxy, aka Web MIXes, \cite{web-mix}).
- Another low-latency design that was proposed independently and at
- about the same time as onion routing was PipeNet \cite{pipenet}.
- This provided anonymity protections that were stronger than onion routing's,
- but at the cost of allowing a single user to shut down the network simply
- by not sending. It was also never implemented or formally published.
- The simplest low-latency designs are single-hop proxies such as the
- Anonymizer \cite{anonymizer}, wherein a single trusted server removes
- users' identifying data before relaying it. These designs are easy to
- analyze, but require end-users to trust the anonymizing proxy.
- More complex are distributed-trust, channel-based anonymizing systems. In
- these designs, a user establishes one or more medium-term bidirectional
- end-to-end tunnels to exit servers, and uses those tunnels to deliver a
- number of low-latency packets to and from one or more destinations per
- tunnel. Establishing tunnels is comparatively expensive and typically
- requires public-key cryptography, whereas relaying packets along a tunnel is
- comparatively inexpensive. Because a tunnel crosses several servers, no
- single server can learn the user's communication partners.
- Systems such as earlier versions of Freedom and onion routing
- build the anonymous channel all at once (using an onion). Later
- designs of Freedom and onion routing as described herein build
- the channel in stages as does AnonNet
- \cite{anonnet}. Amongst other things, this makes perfect forward
- secrecy feasible.
- Some systems, such as Crowds \cite{crowds-tissec}, do not rely on the
- changing appearance of packets to hide the path; rather they employ
- mechanisms so that an intermediary cannot be sure when it is
- receiving from/sending to the ultimate initiator. There is no public-key
- encryption needed for Crowds, but the responder and all data are
- visible to all nodes on the path so that anonymity of connection
- initiator depends on filtering all identifying information from the
- data stream. Crowds is also designed only for HTTP traffic.
- Hordes \cite{hordes-jcs} is based on Crowds but also uses multicast
- responses to hide the initiator. Herbivore \cite{herbivore} and
- P5 \cite{p5} go even further, requiring broadcast.
- They each use broadcast in very different ways, and tradeoffs are made to
- make broadcast more practical. Both Herbivore and P5 are designed primarily
- for communication between peers, although Herbivore
- permits external connections by requesting a peer to serve as a proxy.
- Allowing easy connections to nonparticipating responders or recipients
- is a practical requirement for many users, e.g., to visit
- nonparticipating Web sites or to exchange mail with nonparticipating
- recipients.
- Distributed-trust anonymizing systems differ in how they prevent attackers
- from controlling too many servers and thus compromising too many user paths.
- Some protocols rely on a centrally maintained set of well-known anonymizing
- servers. The current Tor design falls into this category.
- Others (such as Tarzan and MorphMix) allow unknown users to run
- servers, while using a limited resource (DHT space for Tarzan; IP space for
- MorphMix) to prevent an attacker from owning too much of the network.
- Crowds uses a centralized ``blender'' to enforce Crowd membership
- policy. For small crowds it is suggested that familiarity with all
- members is adequate. For large diverse crowds, limiting accounts in
- control of any one party is more difficult:
- ``(e.g., the blender administrator sets up an account for a user only
- after receiving a written, notarized request from that user) and each
- account to one jondo, and by monitoring and limiting the number of
- jondos on any one network (using IP address), the attacker would be
- forced to launch jondos using many different identities and on many
- different networks to succeed'' \cite{crowds-tissec}.
- [XXX I'm considering the subsection as ended here for now. I'm leaving the
- following notes in case we want to revisit any of them. -PS]
- There are also many systems which are intended for anonymous
- and/or censorship resistant file sharing. [XXX Should we list all these
- or just say it's out of scope for the paper?
- eternity, gnunet, freenet, freehaven, publius, tangler, taz/rewebber]
- Channel-based anonymizing systems also differ in their use of dummy traffic.
- [XXX]
- Finally, several systems provide low-latency anonymity without channel-based
- communication. Crowds and [XXX] provide anonymity for HTTP requests; [...]
- [XXX Mention error recovery?]
- anonymizer%
- pipenet%
- freedom v1%
- freedom v2%
- onion routing v1%
- isdn-mixes%
- crowds%
- real-time mixes, web mixes%
- anonnet (marc rennhard's stuff)%
- morphmix%
- P5%
- gnunet%
- rewebbers%
- tarzan%
- herbivore%
- hordes%
- cebolla (?)%
- [XXX Close by mentioning where Tor fits.]
- \SubSection{Our threat model}
- \label{subsec:threat-model}
- Like all practical low-latency systems, Tor is broken against a global
- passive adversary, the most commonly assumed adversary for analysis of
- theoretical anonymous communication designs. The adversary we assume
- is weaker than global with respect to distribution, but it is not
- merely passive. We assume a threat model derived largely from that of
- \cite{or-pet00}.
- [XXX The following is cut in from the OR analysis paper from PET 2000.
- I've already changed it a little, but didn't get very far.
- And, much if not all will eventually
- go. But I thought it a useful starting point. -PS]
- The basic adversary components we consider are:
- \begin{description}
- \item[Observer:] can observe a connection (e.g., a sniffer on an
- Internet router), but cannot initiate connections.
- \item[Disrupter:] can delay (indefinitely) or corrupt traffic on a
- link.
- \item[Hostile initiator:] can initiate (or destroy) connections with
- specific routes as well as varying the timing and content of traffic
- on the connections it creates.
- \item[Hostile responder:] can vary the traffic on the connections made
- to it including refusing them entirely, intentionally modifying what
- it sends and at what rate, and selectively closing them.
- \item[Compromised Tor-node:] can arbitrarily manipulate the connections
- under its control, as well as creating new connections (that pass
- through itself).
- \end{description}
- All feasible adversaries can be composed out of these basic
- adversaries. This includes combinations such as one or more
- compromised network nodes cooperating with disrupters of links to
- which those nodes are not adjacent, or such as combinations of hostile
- outsiders and observers. However, we are able to restrict our
- analysis of adversaries to just one class, the compromised Tor-node.
- We now justify this claim.
- Especially in light of our assumption that the network forms a clique,
- a hostile outsider can perform a subset of the actions that a
- compromised COR can do. Also, while a compromised COR cannot disrupt
- or observe a link unless it is adjacent to it, any adversary that
- replaces some or all observers and/or disrupters with a compromised
- COR adjacent to the relevant link is more powerful than the adversary
- it replaces. And, in the presence of adequate link padding or bandwidth
- limiting even collaborating observers can gain no useful information about
- connections within the network. They may be able to gain information
- by observing connections to the network (in the remote-COR configuration),
- but again this is less than what the COR to which such connection is made
- can learn. Thus, by considering adversaries consisting of
- collections of compromised CORs we cover the worst case of all
- combinations of basic adversaries. Our analysis focuses on this most
- capable adversary, one or more compromised CORs.
- The possible distributions of adversaries are
- \begin{itemize}
- \item{\bf single adversary}
- \item{\bf multiple adversary:} A fixed, randomly distributed subset of
- Tor-nodes is compromised.
- \item{\bf roving adversary:} A fixed-bound size subset of Tor-nodes is
- compromised at any one time. At specific intervals, other CORs can
- become compromised or uncompromised.
- \item{\bf global adversary:} All nodes are compromised.
- \end{itemize}
- Onion Routing provides no protection against a global adversary. If
- all the CORs are compromised, they can know exactly who is talking to
- whom. The content of what was sent will be revealed as it emerges
- from the OR network, unless it has been end-to-end encrypted outside the
- OR network. Even a firewall-to-firewall connection is exposed
- if, as assumed above, our goal is to hide which local-COR is talking to
- which local-COR.
- \SubSection{Known attacks against low-latency anonymity systems}
- \label{subsec:known-attacks}
- We discuss each of these attacks in more detail below, along with the
- aspects of the Tor design that provide defense. We provide a summary
- of the attacks and our defenses against them in Section \ref{sec:attacks}.
- Passive attacks:
- simple observation,
- timing correlation,
- size correlation,
- option distinguishability,
- Active attacks:
- key compromise,
- iterated subpoena,
- run recipient,
- run a hostile node,
- compromise entire path,
- selectively DOS servers,
- introduce timing into messages,
- directory attacks,
- tagging attacks
- \Section{Design goals and assumptions}
- \label{sec:assumptions}
- \subsection{Goals}
- % Are these really our goals? ;) -NM
- Like other low-latency anonymity designs, Tor seeks to frustrate
- attackers from linking communication partners, or from linking
- multiple communications to or from a single point. Within this
- overriding goal, however, several design considerations have directed
- Tor's evolution.
- First, we have tried to build a {\bf deployable} system. [XXX why?]
- This requirement precludes designs that are expensive to run (for
- example, by requiring more bandwidth than volunteers can easily
- provide); designs that place a heavy liability burden on operators
- (for example, by allowing attackers to implicate operators in illegal
- activities); and designs that are difficult or expensive to implement
- (for example, by requiring kernel patches to many operating systems,
- or ).
- Second, the system must be {\bf usable}. A hard-to-use system has
- fewer users---and because anonymity systems hide users among users, a
- system with fewer users provides less anonymity. Thus, usability is
- not only a convenience, but is a security requirement for anonymity
- systems.
- Third, the protocol must be {\bf extensible}, so that it can serve as
- a test-bed for future research in low-latency anonymity systems.
- (Note that while an extensible protocol benefits researchers, there is
- a danger that differing choices of extensions will render users
- distinguishable. Thus, implementations should not permit different
- protocol extensions to coexist in a single deployed network.)
- The protocol's design and security parameters must be {\bf
- conservative}. Additional features impose implementation and
- complexity costs. [XXX Say that we don't want to try to come up with
- speculative solutions to problems we don't KNOW how to solve? -NM]
- [XXX mention something about robustness? But we really aren't that
- robust. We just assume that tunneled protocols tolerate connection
- loss. -NM]
- \subsection{Non-goals}
- In favoring conservative, deployable designs, we have explicitly
- deferred a number of goals, not because they are undesirable in
- anonymity systems, but because they are either solved
- elsewhere or remain an area of active research without a generally accepted
- solution.
- Unlike Tarzan or MorphMix, Tor does not attempt to scale to completely
- decentralized peer-to-peer environments with thousands of short-lived
- servers.
- Tor does not claim to provide a definitive solution to end-to-end
- timing or intersection attacks for users who do not run their own
- Onion Routers.
- Tor does not provide ``protocol normalization'' like the Anonymizer,
- Privoxy, or XXX. In order to provide client indistinguishability for
- complex and variable protocols such as HTTP, Tor must be layered with
- a proxy such as Privoxy or XXX. Similarly, Tor does not currently
- integrate tunneling for non-stream-based protocols; this too must be
- provided by an external service.
- Tor is not steganographic. It doesn't try to conceal which users are
- sending or receiving communications via Tor.
- \subsection{Assumptions}
- - Threat model
- - Mostly reliable nodes: not trusted.
- - Small group of trusted dirserv ops
- - Many users of diff bandwidth come and go.
- [XXX what else?]
- %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
- \Section{The Tor Design}
- \label{sec:design}
- \Section{Other design decisions}
- \SubSection{Exit policies and abuse}
- \label{subsec:exitpolicies}
- \SubSection{Directory Servers}
- \label{subsec:dir-servers}
- \Section{Rendezvous points: location privacy}
- \label{sec:rendezvous}
- Rendezvous points are a building block for \emph{location-hidden services}
- (aka responder anonymity) in the Tor network. Location-hidden
- services means Bob can offer a TCP service, such as an Apache webserver,
- without revealing the IP address of that service.
- We provide this censorship resistance for Bob by allowing him to
- advertise several onion routers (his \emph{Introduction Points}) as his
- public location. Alice, the client, chooses a node for her \emph{Meeting
- Point}. She connects to one of Bob's introduction points, informs him
- about her meeting point, and then waits for him to connect to the meeting
- point. This extra level of indirection means Bob's introduction points
- don't open themselves up to abuse by serving files directly, e.g., if Bob
- chooses a node in France to serve material distasteful to the French. The
- extra level of indirection also allows Bob to respond to some requests
- and ignore others.
- We provide the necessary glue so that Alice can view webpages from Bob's
- location-hidden webserver with minimal invasive changes. Both Alice and
- Bob must run local onion proxies.
- The steps of a rendezvous:
- \begin{tightlist}
- \item Bob chooses some Introduction Points, and advertises them on a
- Distributed Hash Table (DHT).
- \item Bob establishes onion routing connections to each of his
- Introduction Points, and waits.
- \item Alice learns about Bob's service out of band (perhaps Bob told her,
- or she found it on a website). She looks up the details of Bob's
- service from the DHT.
- \item Alice chooses and establishes a Meeting Point (MP) for this
- transaction.
- \item Alice goes to one of Bob's Introduction Points, and gives it a blob
- (encrypted for Bob) which tells him about herself, the Meeting Point
- she chose, and the first half of an ephemeral key handshake. The
- Introduction Point sends the blob to Bob.
- \item Bob chooses whether to ignore the blob, or to onion route to MP.
- Let's assume the latter.
- \item MP plugs together Alice and Bob. Note that MP can't recognize Alice,
- Bob, or the data they transmit (they share a session key).
- \item Alice sends a Begin cell along the circuit. It arrives at Bob's
- onion proxy. Bob's onion proxy connects to Bob's webserver.
- \item Data goes back and forth as usual.
- \end{tightlist}
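The roles of the introduction point and the meeting point in the steps above can be sketched as a small in-memory simulation. The class names, the circuit labels, and the dictionary-shaped blob are illustrative assumptions. No onion routing or encryption is performed here, and in the design the blob would be encrypted for Bob.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class MeetingPoint:
    """Joins two circuits that present the same cookie; learns nothing else."""
    waiting: dict = field(default_factory=dict)  # cookie -> Alice's circuit label
    joined: list = field(default_factory=list)   # (alice_circuit, bob_circuit)

    def register(self, cookie: bytes, alice_circuit: str) -> None:
        self.waiting[cookie] = alice_circuit

    def connect(self, cookie: bytes, bob_circuit: str) -> bool:
        # A cookie can be redeemed once; unknown or reused cookies are refused.
        alice_circuit = self.waiting.pop(cookie, None)
        if alice_circuit is None:
            return False
        self.joined.append((alice_circuit, bob_circuit))
        return True


@dataclass
class IntroductionPoint:
    """Forwards opaque blobs to Bob over the circuit Bob opened earlier."""
    inbox_for_bob: list = field(default_factory=list)

    def introduce(self, blob: dict) -> None:
        self.inbox_for_bob.append(blob)


intro = IntroductionPoint()          # Bob has advertised this node and waits
mp = MeetingPoint()                  # Alice chooses a meeting point
cookie = secrets.token_bytes(16)
mp.register(cookie, "alice-circuit")

# Alice hands the introduction point a blob naming her meeting point.
intro.introduce({"meeting_point": "mp", "cookie": cookie})

# Bob reads the blob and decides to respond by connecting to the meeting point.
blob = intro.inbox_for_bob.pop()
connected = mp.connect(blob["cookie"], "bob-circuit")
replayed = mp.connect(blob["cookie"], "bob-circuit-2")
```

The introduction point only forwards the blob and the meeting point only matches cookies, which mirrors the separation of roles in the list above.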
- When establishing an introduction point, Bob provides the onion router
- with a public ``introduction'' key. The hash of this public key
- identifies a unique service, and (since Bob is required to sign his
- messages) prevents anybody else from usurping Bob's introduction point
- in the future. Bob uses the same public key when establishing the other
- introduction points for that service.
- The blob that Alice gives the introduction point includes a hash of Bob's
- public key to identify the service, an optional initial authentication
- token (the introduction point can do prescreening, e.g., to block replays),
- and (encrypted to Bob's public key) the location of the meeting point,
- a meeting cookie Bob should tell the meeting point so he gets connected to
- Alice, an optional authentication token so Bob can choose whether to respond,
- and the first half of a DH key exchange. When Bob connects to the meeting
- point and gets connected to Alice's pipe, his first cell contains the
- other half of the DH key exchange.
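The two halves of the DH key exchange can be illustrated with a minimal sketch. The group parameters below (a 127-bit Mersenne prime and base 3) and the use of SHA-256 to derive a key are assumptions made purely for illustration. They are far too weak for real use and are not taken from the design.

```python
import hashlib
import secrets
from typing import Tuple

# Toy parameters for illustration only; a real deployment would use a
# standardized and much larger group.
P = 2**127 - 1
G = 3


def dh_keypair() -> Tuple[int, int]:
    """Return (private exponent, public half) for one side of the exchange."""
    private = secrets.randbelow(P - 3) + 2  # uniformly in [2, P - 2]
    return private, pow(G, private, P)


def dh_shared_key(private: int, peer_public: int) -> bytes:
    """Derive a session key from our private exponent and the peer's half."""
    shared = pow(peer_public, private, P)
    return hashlib.sha256(shared.to_bytes(16, "big")).digest()


# Alice places her half in the part of the blob encrypted to Bob.
alice_private, alice_half = dh_keypair()

# Bob's first cell through the meeting point carries his half.
bob_private, bob_half = dh_keypair()

alice_key = dh_shared_key(alice_private, bob_half)
bob_key = dh_shared_key(bob_private, alice_half)
```

Both sides end with the same session key, while the meeting point, which only relays the second half, cannot derive it.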
- \subsection{Integration with user applications}
- For each service Bob offers, he configures his local onion proxy to know
- the local IP and port of the server, a strategy for authorizing Alices,
- and a public key. We assume the existence of a robust decentralized
- efficient lookup system which allows authenticated updates, e.g.,
- \cite{cfs:sosp01}. (Each onion router could run a node in this lookup
- system; also note that as a stopgap measure, we can just run a simple
- lookup system on the directory servers.) Bob publishes into the DHT
- (indexed by the hash of the public key) the public key, an expiration
- time (``not valid after''), and the current introduction points for that
- service. Note that Bob's webserver is completely oblivious to the fact
- that it's hidden behind the Tor network.
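What Bob publishes can be sketched as a record stored under the hash of his public key. The sketch assumes SHA-1 as the hash and uses a local dictionary in place of the DHT. It omits signatures and authenticated updates, and all class and field names are hypothetical.

```python
import hashlib
import time
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ServiceDescriptor:
    public_key: bytes
    not_valid_after: float        # expiration time, as Unix seconds
    introduction_points: tuple    # current introduction points for the service


class LookupTable:
    """Stand-in for the DHT: a local dict, with no signatures or replication."""

    def __init__(self) -> None:
        self._records = {}

    @staticmethod
    def index_for(public_key: bytes) -> str:
        # The choice of SHA-1 is an assumption; the text only says "hash".
        return hashlib.sha1(public_key).hexdigest()

    def publish(self, desc: ServiceDescriptor) -> str:
        key = self.index_for(desc.public_key)
        self._records[key] = desc
        return key

    def lookup(self, key: str, now: Optional[float] = None) -> Optional[ServiceDescriptor]:
        now = time.time() if now is None else now
        desc = self._records.get(key)
        if desc is None or now > desc.not_valid_after:
            return None
        # The index is self-authenticating: it must equal the hash of the key.
        if self.index_for(desc.public_key) != key:
            return None
        return desc


table = LookupTable()
descriptor = ServiceDescriptor(
    public_key=b"bob-public-key-bytes",
    not_valid_after=1000.0,
    introduction_points=("intro-router-1", "intro-router-2"),
)
index = table.publish(descriptor)
fresh = table.lookup(index, now=500.0)
expired = table.lookup(index, now=2000.0)
```

Indexing by the hash of the key lets a client check that a returned record really belongs to the service it asked for.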
- As far as Alice's experience goes, we require that her client interface
- remain a SOCKS proxy, and we require that she shouldn't have to modify
- her applications. Thus we encode all of the necessary information into
- the hostname (more correctly, fully qualified domain name) that Alice
- uses, e.g., when clicking on a URL in her browser. Location-hidden services
- use the special top level domain called `.onion': thus hostnames take the
- form x.y.onion where x encodes the hash of PK, and y is the authentication
- cookie. Alice's onion proxy examines hostnames and recognizes when they're
- destined for a hidden server. If so, it decodes the PK and starts the
- rendezvous as described in the steps above.
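How an onion proxy might recognize and split such a hostname is sketched below. The concrete encoding is an assumption, since the text only says that x encodes the hash of PK. Here x is taken to be the lowercase base32 form of a SHA-1 digest, and the function names are hypothetical.

```python
import base64
import hashlib
from typing import Optional, Tuple


def encode_onion_hostname(public_key: bytes, cookie: str) -> str:
    """Build x.y.onion, with x the base32 SHA-1 hash of the key (assumed encoding)."""
    digest = hashlib.sha1(public_key).digest()
    x = base64.b32encode(digest).decode("ascii").lower()
    return f"{x}.{cookie}.onion"


def parse_onion_hostname(hostname: str) -> Optional[Tuple[bytes, str]]:
    """Return (key hash, cookie) for x.y.onion names, or None for anything else."""
    labels = hostname.rstrip(".").split(".")
    if len(labels) != 3 or labels[2].lower() != "onion":
        return None
    x, y = labels[0], labels[1]
    try:
        digest = base64.b32decode(x.upper())
    except ValueError:  # binascii.Error is a subclass of ValueError
        return None
    return digest, y


host = encode_onion_hostname(b"bob-public-key-bytes", "cookie123")
parsed = parse_onion_hostname(host)
ordinary = parse_onion_hostname("www.example.com")
```

Names that do not match are passed through unchanged, so ordinary destinations are handled as usual.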
- \subsection{Previous rendezvous work}
- Ian Goldberg developed a similar notion of rendezvous points for
- low-latency anonymity systems \cite{ian-thesis}. His ``service tag''
- is the same concept as our ``hash of service's public key''. We make it
- a hash of the public key so it can be self-authenticating, and so the
- client can recognize the same service with confidence later on. His
- design differs from ours in the following ways though. Firstly, Ian
- suggests that the client should manually hunt down a current location of
- the service via Gnutella; whereas our use of the DHT makes lookup faster,
- more robust, and transparent to the user. Secondly, in our design the client and server
- negotiate ephemeral DH keys, so at no point in the path is the plaintext
- exposed. Thirdly, our design is much more practical for deployment in a
- volunteer network, in terms of getting volunteers to offer introduction
- and meeting point services. The introduction points do not output any
- bytes to the clients. And the meeting points don't know the client,
- the server, or the stuff being transmitted. The indirection scheme
- is also designed with authentication/authorization in mind -- if the
- client doesn't include the right cookie with its request for service,
- the server doesn't even acknowledge its existence.
- \Section{Maintaining anonymity sets}
- \label{sec:maintaining-anonymity}
- Packet counting attacks work well against initiators, so we need some
- level of obfuscation: standard link padding against passive link
- observers, and long-range padding against adversaries who own the first hop.
- Do we have any defense against adversaries who insert timing signatures
- into the traffic?
- Regardless of link padding from Alice to the cloud, there will be
- times when Alice is simply not online. Link padding, at the edges or
- inside the cloud, does not help with this.
- How often should we pull down directories? How often should we send
- updated server descriptors?
- When we start up the client, should we build a circuit immediately,
- or should the default be to build a circuit only on demand? Should we
- fetch a directory immediately?
- Would we benefit from greater synchronization, to blend with the other
- users? Would the reduced speed hurt us more?
- Does the ``you cannot see when I am starting or ending a stream because
- you cannot tell what sort of relay cell it is'' idea work, or is it just
- a distraction?
- Does running a server actually get you better protection, because traffic
- coming from your node could plausibly have come from elsewhere? How
- much mixing do you need before this is actually plausible, or is it
- immediately beneficial because many adversaries cannot see your node?
- Do different exit policies at different exit nodes trash anonymity sets,
- or not affect them much?
- Do we get better protection against a realistic adversary by having as
- many nodes as possible, so he probably cannot see the whole network,
- or by having a small number of nodes that mix traffic well? Is a
- cascade topology a more realistic way to get defenses against traffic
- confirmation? Does the hydra (many inputs, few outputs) topology work
- better? Are we going to get a hydra anyway because most nodes will be
- middleman nodes?
- Using a circuit many times is good because it is less CPU work, and
- good because rebuilding paths invites predecessor attacks. It is
- bad because predecessor attacks can be more likely to link you with a
- previous circuit since you are so verbose, and
- bad because each thing you do on that circuit is linked to the other
- things you do on that circuit.
- Because Tor runs over TCP, when one of the servers goes down it seems
- that all the circuits (and thus streams) going over that server must
- break. This reduces anonymity because everybody needs to reconnect
- right then (does it? how much?) and because exit connections all break
- at the same time, and it also reduces usability. It seems the problem
- is even worse in a p2p environment, because so far such systems do not
- really provide an incentive for nodes to stay connected when they are
- done browsing, so we would expect a much higher churn rate than for
- onion routing. Are there ways of allowing streams to survive the loss
- of a node in the path?
- %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
- \Section{Attacks and Defenses}
- \label{sec:attacks}
- Below we summarize a variety of attacks and how well our design withstands
- them.
- \begin{enumerate}
- \item \textbf{Passive attacks}
- \begin{itemize}
- \item \emph{Simple observation.}
- \item \emph{Timing correlation.}
- \item \emph{Size correlation.}
- \item \emph{Option distinguishability.}
- \end{itemize}
- \item \textbf{Active attacks}
- \begin{itemize}
- \item \emph{Key compromise.}
- \item \emph{Iterated subpoena.}
- \item \emph{Run recipient.}
- \item \emph{Run a hostile node.}
- \item \emph{Compromise entire path.}
- \item \emph{Selectively DoS servers.}
- \item \emph{Introduce timing into messages.}
- \item \emph{Tagging attacks.}
- \end{itemize}
- \item \textbf{Directory attacks}
- \begin{itemize}
- \item foo
- \end{itemize}
- \end{enumerate}
- %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
- \Section{Future Directions and Open Problems}
- \label{sec:conclusion}
- Tor brings together many innovations into
- a unified deployable system. But there are still several attacks that
- work quite well, as well as a number of sustainability and run-time
- issues remaining to be ironed out. In particular:
- \begin{itemize}
- \item \emph{Scalability:} Since Tor's emphasis currently is on simplicity
- of design and deployment, the current design will not easily handle more
- than a few hundred servers, because of its clique topology. Restricted
- route topologies \cite{danezis:pet2003} promise comparable anonymity
- with much better scaling properties, but we must solve problems like
- how to randomly form the network without introducing new attacks.
- % cascades are a restricted route topology too. we must mention
- % earlier why we're not satisfied with the cascade approach.
- \item \emph{Cover traffic:} Currently we avoid cover traffic because
- it introduces clear performance and bandwidth costs, and because its
- security properties are not well understood. With more research
- \cite{SS03,defensive-dropping}, the price/value ratio may change, both for
- link-level cover traffic and also long-range cover traffic. In particular,
- we expect restricted route topologies to reduce the cost of cover traffic
- because there are fewer links to cover.
- \item \emph{Better directory distribution:} Even with the threshold
- directory agreement algorithm described in Section~\ref{sec:dirservers},
- the directory servers are still trust bottlenecks. We must find more
- decentralized yet practical ways to distribute up-to-date snapshots of
- network status without introducing new attacks.
- \item \emph{Implementing location-hidden servers:} While
- Section~\ref{sec:rendezvous} provides a design for rendezvous points and
- location-hidden servers, this feature has not yet been implemented.
- We will likely encounter additional issues, both in terms of usability
- and anonymity, that must be resolved.
- \item \emph{Wider-scale deployment:} The original goal of Tor was to
- gain experience in deploying an anonymizing overlay network, and learn
- from having actual users. We are now at the point where we can start
- deploying a wider network. We will see what happens!
- % ok, so that's hokey. fix it. -RD
- \end{itemize}
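To make the scalability item above concrete, here is a rough count under an assumption of ours that the text does not state explicitly: that a clique topology means every pair of the $n$ servers keeps a link open. The number of links is then
\[
  L(n) = \binom{n}{2} = \frac{n(n-1)}{2},
\]
so $L(100) = 4950$ and $L(300) = 44850$, and each server maintains $n-1$ connections. By contrast, in a restricted-route topology where each server keeps at most $d$ links (with $d$ a hypothetical degree bound introduced only for this illustration), the total is at most $nd/2$, which grows linearly in $n$ for fixed $d$. The same count suggests why such topologies would have fewer links to cover with link-level cover traffic.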
- %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
- %\Section{Acknowledgments}
- %% commented out for anonymous submission
- %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
- \bibliographystyle{latex8}
- \bibliography{tor-design}
- \end{document}
- % Style guide:
- % U.S. spelling
- % avoid contractions (it's, can't, etc.)
- % 'mix', 'mixes' (as noun)
- % 'mix-net'
- % 'mix', 'mixing' (as verb)
- % 'Mixminion Project'
- % 'Mixminion' (meaning the protocol suite or the network)
- % 'Mixmaster' (meaning the protocol suite or the network)
- % 'middleman' [Not with a hyphen; the hyphen has been optional
- % since Middle English.]
- % 'nymserver'
- % 'Cypherpunk', 'Cypherpunks', 'Cypherpunk remailer'
- %
- % 'Whenever you are tempted to write 'Very', write 'Damn' instead, so
- % your editor will take it out for you.' -- Misquoted from Mark Twain