|
@@ -14,7 +14,7 @@
|
|
|
|
|
|
\begin{document}
|
|
|
|
|
|
-\title{Challenges in bringing low-latency stream anonymity to the masses (DRAFT)}
|
|
|
+\title{Challenges in practical low-latency stream anonymity (DRAFT)}
|
|
|
|
|
|
\author{Roger Dingledine and Nick Mathewson}
|
|
|
\institute{The Free Haven Project\\
|
|
@@ -29,12 +29,10 @@ foo
|
|
|
|
|
|
\section{Introduction}
|
|
|
|
|
|
-Anonymous communication on the Internet today
|
|
|
-
|
|
|
-
|
|
|
Tor is a low-latency anonymous communication overlay network
|
|
|
-\cite{tor-design}. We have been operating a publicly deployed Tor network
|
|
|
-since October 2003.
|
|
|
+\cite{tor-design} designed to be practical and usable for securing TCP
|
|
|
+streams over the Internet. We have been operating a publicly deployed
|
|
|
+Tor network since October 2003.
|
|
|
|
|
|
Tor aims to resist observers and insiders by distributing each transaction
|
|
|
over several nodes in the network. This ``distributed trust'' approach
|
|
@@ -48,38 +46,39 @@ who don't want to reveal information to their competitors, and law
|
|
|
enforcement and government intelligence agencies who need
|
|
|
to do operations on the Internet without being noticed.
|
|
|
|
|
|
-Tor has been funded by both the U.S. Navy, for use in securing government
|
|
|
-communications, and also the Electronic Frontier Foundation, for use in
|
|
|
-maintain civil liberties for ordinary citizens online.
|
|
|
-The Tor protocol is one of the leading choices
|
|
|
+Tor has been funded by the U.S. Navy, for use in securing government
|
|
|
+communications, and also by the Electronic Frontier Foundation, for use
|
|
|
+in maintaining civil liberties for ordinary citizens online. The Tor
|
|
|
+protocol is one of the leading choices
|
|
|
to be the anonymizing layer in the European Union's PRIME directive to
|
|
|
help maintain privacy in Europe. The University of Dresden in Germany
|
|
|
has integrated an independent implementation of the Tor protocol into
|
|
|
-their popular Java Anon Proxy anonymizing client. This wide variety of
|
|
|
+their popular Java Anon Proxy anonymizing client. This wide variety of
|
|
|
interests helps maintain both the stability and the security of the
|
|
|
network.
|
|
|
|
|
|
-
|
|
|
-
|
|
|
-
|
|
|
-We deployed this thing called Tor. it's got all these different types of
|
|
|
-users. it's been backed by navy and eff, and prime and anonymizer looked at
|
|
|
-it. Because we're this cool, you should believe us when we tell you stuff.
|
|
|
-
|
|
|
-In this paper we give the reader an understanding of Tor's context
|
|
|
-in the anonymity space and then we go on to describe the
|
|
|
-practical challenges that stand in the way of moving from a practical
|
|
|
-useful network to a practical useful anonymous network.
|
|
|
-
|
|
|
-
|
|
|
-
|
|
|
-
|
|
|
-
|
|
|
-
|
|
|
-
|
|
|
-
|
|
|
+Tor has a weaker threat model than many anonymity designs in the
|
|
|
+literature. This is because our primary requirements are to have a
|
|
|
+practical and useful network, and from there we aim to provide as much
|
|
|
+anonymity as we can.
|
|
|
+
|
|
|
+
|
|
|
+
|
|
|
+
|
|
|
+
|
|
|
+
|
|
|
+This paper aims to give the reader enough information to understand the
|
|
|
+technical and policy issues that Tor faces as we continue deployment,
|
|
|
+and to lay a research agenda for others to help in addressing some of
|
|
|
+these issues. Section \ref{sec:what-is-tor} gives an overview of the Tor
|
|
|
+design and our goals. We go on in Section \ref{sec:related} to describe
|
|
|
+Tor's context in the anonymity space. Sections \ref{sec:crossroads-policy}
|
|
|
+and \ref{sec:crossroads-design} describe the practical challenges,
|
|
|
+policy and technical respectively, that stand in the way of moving
|
|
|
+from a practical useful network to a practical useful anonymous network.
|
|
|
|
|
|
\section{What Is Tor}
|
|
|
+\label{sec:what-is-tor}
|
|
|
|
|
|
\subsection{Distributed trust: safety in numbers}
|
|
|
|
|
@@ -153,6 +152,7 @@ Tor has the following goals.
|
|
|
and we made these assumptions when trying to design the thing.
|
|
|
|
|
|
\section{Tor's position in the anonymity field}
|
|
|
+\label{sec:related}
|
|
|
|
|
|
There are many other classes of systems: single-hop proxies, open proxies,
|
|
|
jap, mixminion, flash mixes, freenet, i2p, mute/ants/etc, tarzan,
|
|
@@ -160,49 +160,14 @@ morphmix, freedom. Give brief descriptions and brief characterizations
|
|
|
of how we differ. This is not the breakthrough stuff and we only have
|
|
|
a page or two for it.
|
|
|
|
|
|
+have a serious discussion of morphmix's assumptions, since they would
|
|
|
+seem to be the direct competition. in fact tor is a flexible architecture
|
|
|
+that would encompass morphmix, and they're nearly identical except for
|
|
|
+path selection and node discovery. and the trust system morphmix has
|
|
|
+seems overkill (and/or insecure) based on the threat model we've picked.
|
|
|
|
|
|
-\section{Crossroads}
|
|
|
-
|
|
|
-Discuss each item that Tor hasn't solved yet that isn't just coding
|
|
|
-work. Perhaps we'll have so many that we can pick out the best ones to
|
|
|
-discuss, so it's a bit less of a laundry list. Maybe they'll even fit
|
|
|
-into categories. The trick to making the paper good will be to find
|
|
|
-the right balance between going into depth and breadth of coverage.
|
|
|
-
|
|
|
-
|
|
|
-Peer-to-peer / practical issues:
|
|
|
-
|
|
|
-Network discovery, sybil, node admission, scaling. It seems that the code
|
|
|
-will ship with something and that's our trust root. We could try to get
|
|
|
-people to build a web of trust, but no. Where we go from here depends
|
|
|
-on what threats we have in mind. Really decentralized if your threat is
|
|
|
-RIAA; less so if threat is to application data or individuals or...
|
|
|
-
|
|
|
-Making use of servers with little bandwidth. How to handle hammering by
|
|
|
-certain applications.
|
|
|
-
|
|
|
-Handling servers that are far away from the rest of the network, e.g. on
|
|
|
-the continents that aren't North America and Europe. High latency,
|
|
|
-often high packet loss.
|
|
|
-
|
|
|
-Running Tor servers behind NATs, behind great-firewalls-of-China, etc.
|
|
|
-Restricted routes. How to propagate to everybody the topology? BGP
|
|
|
-style doesn't work because we don't want just *one* path. Point to
|
|
|
-Geoff's stuff.
|
|
|
-
|
|
|
-Routing-zones. It seems that our threat model comes down to diversity and
|
|
|
-dispersal. But hard for Alice to know how to act. Many questions remain.
|
|
|
-
|
|
|
-The China problem. We have lots of users in Iran and similar (we stopped
|
|
|
-logging, so it's hard to know now, but many Persian sites on how to use
|
|
|
-Tor), and they seem to be doing ok. But the China problem is bigger. Cite
|
|
|
-Stefan's paper, and talk about how we need to route through clients,
|
|
|
-and we maybe we should start with a time-release IP publishing system +
|
|
|
-advogato based reputation system, to bound the number of IPs leaked to the
|
|
|
-adversary.
|
|
|
-
|
|
|
-
|
|
|
-Policy issues:
|
|
|
+\section{Crossroads: Policy issues}
|
|
|
+\label{sec:crossroads-policy}
|
|
|
|
|
|
Bittorrent and dmca. Should we add an IDS to autodetect protocols and
|
|
|
snipe them? Takedowns and efnet abuse and wikipedia complaints and irc
|
|
@@ -212,45 +177,94 @@ servers want to?
|
|
|
Image: substantial non-infringing uses. Image is a security parameter,
|
|
|
since it impacts user base and perceived sustainability.
|
|
|
|
|
|
+good uses are kept private, bad uses are publicized. not good.
|
|
|
+
|
|
|
Sustainability. Previous attempts have been commercial which we think
|
|
|
adds a lot of unnecessary complexity and accountability. Freedom didn't
|
|
|
collect enough money to pay its servers; JAP bandwidth is supported by
|
|
|
continued money, and they periodically ask what they will do when it
|
|
|
dries up.
|
|
|
|
|
|
+How much should Tor aim to do? Applications that leak data. We can say
|
|
|
+they're not our problem, but they're somebody's problem.
|
|
|
+
|
|
|
Logging. Making logs not revealing. A happy coincidence that verbose
|
|
|
logging is our \#2 performance bottleneck. Is there a way to detect
|
|
|
modified servers, or to have them volunteer the information that they're
|
|
|
logging verbosely? Would that actually solve any attacks?
|
|
|
|
|
|
+\section{Crossroads: Scaling and Design choices}
|
|
|
+\label{sec:crossroads-design}
|
|
|
+
|
|
|
+\subsection{Transporting the stream vs transporting the packets}
|
|
|
+
|
|
|
+We periodically run into ZKS people who tell us that the process of
|
|
|
+anonymizing IPs should ``obviously'' be done at the IP layer. Here are
|
|
|
+the issues that need to be resolved before we'll be ready to switch Tor
|
|
|
+over to arbitrary IP traffic.
|
|
|
+
|
|
|
+1: we still need to do IP-level packet normalization, to stop things
|
|
|
+like IP fingerprinting. This is doable.
|
|
|
+2: we still need to be easy to integrate with user-level applications,
|
|
|
+so they can do application-level scrubbing. So we will still need
|
|
|
+application-specific proxies.
|
|
|
+3: we need a block-level encryption approach that can provide security despite
|
|
|
+packet loss and out-of-order delivery. Freedom allegedly had one, but it was
|
|
|
+never publicly specified. (We also believe that the Freedom and Cebolla designs
|
|
|
+are vulnerable to tagging attacks.)
|
|
|
+4: we still need to play with parameters for throughput, congestion control,
|
|
|
+etc -- since we need sequence numbers and maybe more to do replay detection,
|
|
|
+and just to handle duplicate frames. So we would be reimplementing some subset of TCP
|
|
|
+anyway.
|
|
|
+5: TLS over UDP is not implemented or even specified.
|
|
|
+6: exit policies over arbitrary IP packets seem to be an IDS-hard problem. I
|
|
|
+don't want to build an IDS into Tor.
|
|
|
+7: certain protocols are going to leak information at the IP layer anyway. For
|
|
|
+example, if we anonymize your DNS requests, but they still go to Comcast's DNS servers,
|
|
|
+that's bad.
|
|
|
+8: hidden services, .exit addresses, etc are broken unless we have some way to
|
|
|
+reach into the application-level protocol and decide the hostname it's trying to get.
|
|
|
+
|
|
|
+\subsection{Mid-latency}
|
|
|
+
|
|
|
+Mid-latency. Can we do traffic shaping to get any defense against George's
|
|
|
+PET2004 paper? Will padding or long-range dummies do anything then? Will
|
|
|
+it kill the user base or can we get both approaches to play well together?
|
|
|
|
|
|
-Anonymity issues:
|
|
|
|
|
|
-Transporting the stream vs transporting the packets.
|
|
|
|
|
|
-The DNS problem in practice.
|
|
|
+
|
|
|
|
|
|
-Applications that leak data. We can say they're not our problem, but
|
|
|
-they're somebody's problem.
|
|
|
+\subsection{Measuring performance and capacity}
|
|
|
|
|
|
How to measure performance without letting people selectively deny service
|
|
|
by distinguishing pings. Heck, just how to measure performance at all. In
|
|
|
practice people have funny firewalls that don't match up to their exit
|
|
|
policies and Tor doesn't deal.
|
|
|
|
|
|
-Mid-latency. Can we do traffic shape to get any defense against George's
|
|
|
-PET2004 paper? Will padding or long-range dummies do anything then? Will
|
|
|
-it kill the user base or can we get both approaches to play well together?
|
|
|
+Network investigation: Is all this bandwidth publishing thing a good idea?
|
|
|
+How can we collect stats better? Note weasel's smokeping, at
|
|
|
+http://seppia.noreply.org/cgi-bin/smokeping.cgi?target=Tor
|
|
|
+which probably gives George and Steven enough info to break Tor?
|
|
|
+
|
|
|
+\subsection{Plausible deniability}
|
|
|
|
|
|
Does running a server help you or harm you? George's Oakland attack.
|
|
|
Plausible deniability -- without even running your traffic through Tor! We
|
|
|
have to pick the path length so adversary can't distinguish client from
|
|
|
server (how many hops is good?).
|
|
|
|
|
|
+\subsection{Helper nodes}
|
|
|
+
|
|
|
When does fixing your entry or exit node help you?
|
|
|
Helper nodes in the literature don't deal with churn, and
|
|
|
especially active attacks to induce churn.
|
|
|
|
|
|
+Do general DoS attacks have anonymity implications? See e.g. Adam
|
|
|
+Back's IH paper, but I think there's more to be pointed out here.
|
|
|
+
|
|
|
+\subsection{Location-hidden services}
|
|
|
+
|
|
|
Survivable services are new in practice, yes? Hidden services seem
|
|
|
less hidden than we'd like, since they stay in one place and get used
|
|
|
a lot. They're the epitome of the need for helper nodes. This means
|
|
@@ -259,7 +273,11 @@ hard. Also, they're brittle in terms of intersection and observation
|
|
|
attacks. Would be nice to have hot-swap services, but hard to design.
|
|
|
|
|
|
|
|
|
-P2P + anonymity issues:
|
|
|
+
|
|
|
+
|
|
|
+
|
|
|
+
|
|
|
+
|
|
|
|
|
|
Incentives. Copy the page I wrote for the NSF proposal, and maybe extend
|
|
|
it if we're feeling smart.
|
|
@@ -270,55 +288,40 @@ A Tor gui, how jap's gui is nice but does not reflect the security
|
|
|
they provide.
|
|
|
Public perception, and thus advertising, is a security parameter.
|
|
|
|
|
|
-Network investigation: Is all this bandwidth publishing thing a good idea?
|
|
|
-How can we collect stats better? Note weasel's smokeping, at
|
|
|
-http://seppia.noreply.org/cgi-bin/smokeping.cgi?target=Tor
|
|
|
-which probably gives george and steven enough info to break tor?
|
|
|
-
|
|
|
-Do general DoS attacks have anonymity implications? See e.g. Adam
|
|
|
-Back's IH paper, but I think there's more to be pointed out here.
|
|
|
-
|
|
|
-
|
|
|
-
|
|
|
-have a serious discussion of morphmix's assumptions, since they would
|
|
|
-seem to be the direct competition. in fact tor is a flexible architecture
|
|
|
-that would encompass morphmix, and they're nearly identical except for
|
|
|
-path selection and node discovery. and the trust system morphmix has
|
|
|
-seems overkill (and/or insecure) based on the threat model we've picked.
|
|
|
+Peer-to-peer / practical issues:
|
|
|
|
|
|
-need to discuss how we take the approach of building the thing, and then
|
|
|
-assuming that, how much anonymity can we get. we're not here to model or
|
|
|
-to simulate or to produce equations and formulae. but those have their
|
|
|
-roles too.
|
|
|
+Network discovery, sybil, node admission, scaling. It seems that the code
|
|
|
+will ship with something and that's our trust root. We could try to get
|
|
|
+people to build a web of trust, but no. Where we go from here depends
|
|
|
+on what threats we have in mind. Really decentralized if your threat is
|
|
|
+RIAA; less so if threat is to application data or individuals or...
|
|
|
|
|
|
+Making use of servers with little bandwidth. How to handle hammering by
|
|
|
+certain applications.
|
|
|
|
|
|
+Handling servers that are far away from the rest of the network, e.g. on
|
|
|
+the continents that aren't North America and Europe. High latency,
|
|
|
+often high packet loss.
|
|
|
|
|
|
+Running Tor servers behind NATs, behind great-firewalls-of-China, etc.
|
|
|
+Restricted routes. How to propagate to everybody the topology? BGP
|
|
|
+style doesn't work because we don't want just *one* path. Point to
|
|
|
+Geoff's stuff.
|
|
|
|
|
|
+Routing-zones. It seems that our threat model comes down to diversity and
|
|
|
+dispersal. But hard for Alice to know how to act. Many questions remain.
|
|
|
|
|
|
-
|
|
|
+The China problem. We have lots of users in Iran and similar (we stopped
|
|
|
+logging, so it's hard to know now, but many Persian sites on how to use
|
|
|
+Tor), and they seem to be doing ok. But the China problem is bigger. Cite
|
|
|
+Stefan's paper, and talk about how we need to route through clients,
|
|
|
+and maybe we should start with a time-release IP publishing system +
|
|
|
+Advogato-based reputation system, to bound the number of IPs leaked to the
|
|
|
+adversary.
|
|
|
|
|
|
+\section{The Future}
|
|
|
+\label{sec:conclusion}
|
|
|
|
|
|
-TCP vs UDP
|
|
|
-argument 1: we need to do IP-level packet normalization, to block things like ip
|
|
|
-fingerprinting.
|
|
|
-argument 2: we still need to be easy to integrate with applications, so they can do
|
|
|
-application-level scrubbing.
|
|
|
-argument 3: we need a block-level encryption approach that can provide security despite
|
|
|
-packet loss and out-of-order delivery. i believe you that such a thing can be created,
|
|
|
-but no thing has yet been specified. so specify it for me if you want me to believe it.
|
|
|
-(freedom and cebolla are vulnerable to tagging and malleability attacks i believe.)
|
|
|
-argument 4: we still need to play with parameters for throughput, congestion control,
|
|
|
-etc -- since we need sequence numbers and maybe more to do replay detection,
|
|
|
-and just to handle duplicate frames. so we would be reimplementing some subset of tcp
|
|
|
-anyway.
|
|
|
-argument 5: tls over udp is not implemented or even specified.
|
|
|
-argument 6: exit policies over arbitrary IP packets seems to be an IDS-hard problem. i
|
|
|
-don't want to build an IDS into tor.
|
|
|
-argument 7: certain protocols are going to leak information at the IP layer anyway. for
|
|
|
-example, if we anonymizer your dns requests, but they still go to comcast's dns servers,
|
|
|
-that's bad.
|
|
|
-argument 8: hidden services, .exit addresses, etc are broken unless we have some way to
|
|
|
-reach into the application-level protocol and decide the hostname it's trying to get.
|
|
|
|
|
|
\bibliographystyle{plain} \bibliography{tor-design}
|
|
|
|