- \documentclass{llncs}
- \usepackage{url}
- \usepackage{amsmath}
- \usepackage{epsfig}
- \newenvironment{tightlist}{\begin{list}{$\bullet$}{
- \setlength{\itemsep}{0mm}
- \setlength{\parsep}{0mm}
- % \setlength{\labelsep}{0mm}
- % \setlength{\labelwidth}{0mm}
- % \setlength{\topsep}{0mm}
- }}{\end{list}}
- \begin{document}
- \title{Challenges in bringing low-latency stream anonymity to the masses (DRAFT)}
- \author{Roger Dingledine and Nick Mathewson}
- \institute{The Free Haven Project\\
- \email{\{arma,nickm\}@freehaven.net}}
- \section{Introduction}
- We have deployed Tor, and it now serves many different types of
- users. It has been backed by the Navy and the EFF, and PRIME and Anonymizer have
- looked at it. That deployment experience is why you should believe what we tell you here.
- In this paper we give the reader an understanding of Tor's context
- in the anonymity space and then we go on to describe the variety of
- practical challenges that stand in the way of moving from a practical
- useful network to a practical useful anonymous network.
- % The goal of the paper is to get the PET-audience reader up to speed
- % on all the issues we have with Tor, so he can, if he wants,
- % * understand the technical and policy and legal issues and why they're
- % tricky in practice
- % * help us out with answering some of the technical decisions
- % (and in writing it, we'll clarify our own opinions about them)
- % * help us out with answering some of the anonymity questions
- \section{What Is Tor}
- Tor works like this.
- weasel's graph of \# nodes and of bandwidth, ideally from week 0.
- Tor has the following goals.
- We made these assumptions when designing it.
- \section{Tor's position in the anonymity field}
- There are many other classes of systems: single-hop proxies, open proxies,
- JAP, Mixminion, flash mixes, Freenet, I2P, MUTE/ANts/etc., Tarzan,
- MorphMix, Freedom. Give brief descriptions and brief characterizations
- of how we differ. This is not the breakthrough stuff, and we only have
- a page or two for it.
- \section{Crossroads}
- Discuss each item that Tor hasn't solved yet that isn't just coding
- work. Perhaps we'll have so many that we can pick out the best ones to
- discuss, so it's a bit less of a laundry list. Maybe they'll even fit
- into categories. The trick to making the paper good will be to find
- the right balance between going into depth and breadth of coverage.
- Peer-to-peer / practical issues:
- Network discovery, Sybil attacks, node admission, scaling. It seems that the code
- will ship with something, and that's our trust root. We could try to get
- people to build a web of trust, but no. Where we go from here depends
- on what threats we have in mind: really decentralized if your threat is
- the RIAA; less so if the threat is to application data or individuals or...
- Making use of servers with little bandwidth. How to handle hammering by
- certain applications.
- Handling servers that are far away from the rest of the network, e.g. on
- the continents that aren't North America and Europe. High latency,
- often high packet loss.
- Running Tor servers behind NATs, behind great-firewalls-of-China, etc.
- Restricted routes. How do we propagate the topology to everybody? BGP
- style doesn't work because we don't want just \emph{one} path. Point to
- Geoff's stuff.
- Routing zones. It seems that our threat model comes down to diversity and
- dispersal. But it is hard for Alice to know how to act. Many questions remain.
- The China problem. We have lots of users in Iran and similar places (we stopped
- logging, so it's hard to know now, but there are many Persian sites on how to use
- Tor), and they seem to be doing ok. But the China problem is bigger. Cite
- Stefan's paper, and talk about how we need to route through clients,
- and maybe we should start with a time-release IP publishing system +
- Advogato-based reputation system, to bound the number of IPs leaked to the
- adversary.
- Policy issues:
- BitTorrent and the DMCA. Should we add an IDS to autodetect protocols and
- snipe them? Takedowns and EFnet abuse and Wikipedia complaints and IRC
- networks. Should we allow revocation of anonymity if a threshold of
- servers wants to?
- Image: substantial non-infringing uses. Image is a security parameter,
- since it impacts user base and perceived sustainability.
- Sustainability. Previous attempts have been commercial, which we think
- adds a lot of unnecessary complexity and accountability. Freedom didn't
- collect enough money to pay its servers; JAP bandwidth is supported by
- continued funding, and they periodically ask what they will do when it
- dries up.
- Logging. Making logs not revealing. A happy coincidence that verbose
- logging is our \#2 performance bottleneck. Is there a way to detect
- modified servers, or to have them volunteer the information that they're
- logging verbosely? Would that actually solve any attacks?
- Anonymity issues:
- Transporting the stream vs transporting the packets.
- The DNS problem in practice.
- Applications that leak data. We can say they're not our problem, but
- they're somebody's problem.
- How to measure performance without letting people selectively deny service
- by distinguishing pings. Heck, just how to measure performance at all. In
- practice people have funny firewalls that don't match up to their exit
- policies, and Tor doesn't handle that.
- Mid-latency. Can we do traffic shaping to get any defense against George's
- PET2004 paper? Will padding or long-range dummies do anything then? Will
- it kill the user base, or can we get both approaches to play well together?
- Does running a server help you or harm you? George's Oakland attack.
- Plausible deniability -- without even running your traffic through Tor! We
- have to pick the path length so that the adversary can't distinguish client from
- server (how many hops is good?).
- When does fixing your entry or exit node help you?
- Helper nodes in the literature don't deal with churn, and
- especially active attacks to induce churn.
- Survivable services are new in practice, yes? Hidden services seem
- less hidden than we'd like, since they stay in one place and get used
- a lot. They're the epitome of the need for helper nodes. This means
- that using Tor as a building block for Free Haven is going to be really
- hard. Also, they're brittle in terms of intersection and observation
- attacks. Would be nice to have hot-swap services, but hard to design.
- P2P + anonymity issues:
- Incentives. Copy the page I wrote for the NSF proposal, and maybe extend
- it if we're feeling smart.
- Usability: the FC03 paper was great, except that the lower your latency is, the
- less useful it seems to be.
- A Tor GUI: JAP's GUI is nice, but it does not reflect the security
- they provide.
- Public perception, and thus advertising, is a security parameter.
- Network investigation: Is all this bandwidth publishing a good idea?
- How can we collect stats better? Note weasel's smokeping, at
- \url{http://seppia.noreply.org/cgi-bin/smokeping.cgi?target=Tor}
- which probably gives George and Steven enough info to break Tor?
- Do general DoS attacks have anonymity implications? See e.g. Adam
- Back's IH paper, but I think there's more to be pointed out here.
- % need to do somewhere in the paper:
- Have a serious discussion of MorphMix's assumptions, since they would
- seem to be the direct competition. In fact Tor is a flexible architecture
- that would encompass MorphMix, and they're nearly identical except for
- path selection and node discovery. And the trust system MorphMix has
- seems overkill (and/or insecure) based on the threat model we've picked.
- We need to discuss how we take the approach of building the thing and then,
- assuming that, asking how much anonymity we can get. We're not here to model or
- to simulate or to produce equations and formulae. But those have their
- roles too.
- %%%
- TCP vs UDP
- Argument 1: we need to do IP-level packet normalization, to block things like IP
- fingerprinting.
- Argument 2: we still need to be easy to integrate with applications, so they can do
- application-level scrubbing.
- Argument 3: we need a block-level encryption approach that can provide security despite
- packet loss and out-of-order delivery. I believe you that such a thing can be created,
- but no such thing has yet been specified. So specify it for me if you want me to believe it.
- (Freedom and Cebolla are vulnerable to tagging and malleability attacks, I believe.)
- Argument 4: we still need to play with parameters for throughput, congestion control,
- etc., since we need sequence numbers and maybe more to do replay detection,
- and just to handle duplicate frames. So we would be reimplementing some subset of TCP
- anyway.
- Argument 5: TLS over UDP is not implemented or even specified.
- Argument 6: exit policies over arbitrary IP packets seem to be an IDS-hard problem. I
- don't want to build an IDS into Tor.
- Argument 7: certain protocols are going to leak information at the IP layer anyway. For
- example, if we anonymize your DNS requests, but they still go to Comcast's DNS servers,
- that's bad.
- Argument 8: hidden services, .exit addresses, etc.\ are broken unless we have some way to
- reach into the application-level protocol and decide the hostname it's trying to get.
- \bibliographystyle{plain} \bibliography{tor-design}
- \end{document}