\documentclass{llncs}

\usepackage{url}
\usepackage{amsmath}
\usepackage{epsfig}

\newenvironment{tightlist}{\begin{list}{$\bullet$}{
  \setlength{\itemsep}{0mm}
  \setlength{\parsep}{0mm}
  % \setlength{\labelsep}{0mm}
  % \setlength{\labelwidth}{0mm}
  % \setlength{\topsep}{0mm}
}}{\end{list}}

\begin{document}

\title{Challenges in bringing low-latency stream anonymity to the masses (DRAFT)}

\author{Roger Dingledine and Nick Mathewson}

\institute{The Free Haven Project\\
\email{\{arma,nickm\}@freehaven.net}}

\section{Introduction}

We deployed this thing called Tor. It's got all these different types of
users. It's been backed by the Navy and the EFF, and PRIME and Anonymizer
looked at it. Because we're this cool, you should believe us when we tell
you stuff.

In this paper we give the reader an understanding of Tor's context
in the anonymity space, and then we go on to describe the variety of
practical challenges that stand in the way of moving from a practical
useful network to a practical useful anonymous network.

% The goal of the paper is to get the PET-audience reader up to speed
% on all the issues we have with Tor, so he can, if he wants,
% * understand the technical and policy and legal issues and why they're
%   tricky in practice
% * help us out with answering some of the technical decisions
%   (and in writing it, we'll clarify our own opinions about them)
% * help us out with answering some of the anonymity questions

\section{What Is Tor}

Tor works like this.

Weasel's graph of the number of nodes and of bandwidth, ideally from week 0.

Tor has the following goals.

And we made these assumptions when trying to design the thing.

\section{Tor's position in the anonymity field}

There are many other classes of systems: single-hop proxies, open proxies,
JAP, Mixminion, flash mixes, Freenet, I2P, MUTE/ANTS/etc, Tarzan,
MorphMix, Freedom. Give brief descriptions and brief characterizations
of how we differ. This is not the breakthrough stuff and we only have
a page or two for it.

\section{Crossroads}

Discuss each item that Tor hasn't solved yet that isn't just coding
work. Perhaps we'll have so many that we can pick out the best ones to
discuss, so it's a bit less of a laundry list. Maybe they'll even fit
into categories. The trick to making the paper good will be to find
the right balance between going into depth and breadth of coverage.

Peer-to-peer / practical issues:

Network discovery, Sybil, node admission, scaling. It seems that the code
will ship with something and that's our trust root. We could try to get
people to build a web of trust, but no. Where we go from here depends
on what threats we have in mind. Really decentralized if your threat is
the RIAA; less so if the threat is to application data or individuals or...

Making use of servers with little bandwidth. How to handle hammering by
certain applications.

Handling servers that are far away from the rest of the network, e.g. on
the continents that aren't North America and Europe. High latency,
often high packet loss.

Running Tor servers behind NATs, behind great-firewalls-of-China, etc.
Restricted routes. How to propagate the topology to everybody? BGP
style doesn't work because we don't want just \emph{one} path. Point to
Geoff's stuff.

Routing-zones. It seems that our threat model comes down to diversity and
dispersal. But it is hard for Alice to know how to act. Many questions remain.

The China problem. We have lots of users in Iran and similar (we stopped
logging, so it's hard to know now, but there are many Persian sites on how
to use Tor), and they seem to be doing ok. But the China problem is bigger.
Cite Stefan's paper, and talk about how we need to route through clients,
and maybe we should start with a time-release IP publishing system +
Advogato-based reputation system, to bound the number of IPs leaked to the
adversary.

Policy issues:

BitTorrent and the DMCA. Should we add an IDS to autodetect protocols and
snipe them? Takedowns and EFnet abuse and Wikipedia complaints and IRC
networks. Should we allow revocation of anonymity if a threshold of
servers want to?

Image: substantial non-infringing uses. Image is a security parameter,
since it impacts user base and perceived sustainability.

Sustainability. Previous attempts have been commercial, which we think
adds a lot of unnecessary complexity and accountability. Freedom didn't
collect enough money to pay its servers; JAP bandwidth is supported by
continued money, and they periodically ask what they will do when it
dries up.

Logging. Making logs not revealing. A happy coincidence that verbose
logging is our \#2 performance bottleneck. Is there a way to detect
modified servers, or to have them volunteer the information that they're
logging verbosely? Would that actually solve any attacks?

Anonymity issues:

Transporting the stream vs transporting the packets.

The DNS problem in practice.

Applications that leak data. We can say they're not our problem, but
they're somebody's problem.

How to measure performance without letting people selectively deny service
by distinguishing pings. Heck, just how to measure performance at all. In
practice people have funny firewalls that don't match up to their exit
policies and Tor doesn't deal.

Mid-latency. Can we do traffic shaping to get any defense against George's
PET2004 paper? Will padding or long-range dummies do anything then? Will
it kill the user base or can we get both approaches to play well together?

Does running a server help you or harm you? George's Oakland attack.
Plausible deniability -- without even running your traffic through Tor! We
have to pick the path length so the adversary can't distinguish client from
server (how many hops is good?).

When does fixing your entry or exit node help you?
Helper nodes in the literature don't deal with churn, and
especially active attacks to induce churn.
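
One rough way to frame the fixed-entry question is a back-of-the-envelope
model. The symbols $f$ and $k$ and the independence assumptions here are
ours, introduced only for illustration; they are not measurements of the
deployed network. Suppose the adversary controls a fraction $f$ of the
nodes, Alice picks nodes uniformly and independently, and a circuit counts
as compromised exactly when both its entry and its exit are hostile. If
Alice picks a fresh entry and a fresh exit for each of $k$ circuits, then
\[
  \Pr[\mbox{some circuit compromised}] = 1 - (1 - f^2)^k ,
\]
which tends to $1$ as $k$ grows. If she instead fixes a single entry node
and only picks a fresh exit each time, then
\[
  \Pr[\mbox{some circuit compromised}] = f \left( 1 - (1 - f)^k \right) \le f .
\]
With $f = 0.1$ and $k = 1000$, the first expression is above $0.999$ while
the second stays at about $0.1$. The sketch ignores churn: every time the
fixed entry disappears and Alice has to pick a new one, she takes another
chance $f$ of landing on a hostile entry, so an adversary who can induce
churn erodes the bound.
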

Survivable services are new in practice, yes? Hidden services seem
less hidden than we'd like, since they stay in one place and get used
a lot. They're the epitome of the need for helper nodes. This means
that using Tor as a building block for Free Haven is going to be really
hard. Also, they're brittle in terms of intersection and observation
attacks. Would be nice to have hot-swap services, but hard to design.

P2P + anonymity issues:

Incentives. Copy the page I wrote for the NSF proposal, and maybe extend
it if we're feeling smart.

Usability: the FC03 paper was great, except that the lower-latency you are,
the less useful it seems to be.
A Tor GUI; how JAP's GUI is nice but does not reflect the security
they provide.
Public perception, and thus advertising, is a security parameter.

Network investigation: Is all this bandwidth publishing thing a good idea?
How can we collect stats better? Note Weasel's smokeping, at
\url{http://seppia.noreply.org/cgi-bin/smokeping.cgi?target=Tor}
which probably gives George and Steven enough info to break Tor?

Do general DoS attacks have anonymity implications? See e.g. Adam
Back's IH paper, but I think there's more to be pointed out here.

% need to do somewhere in the paper:
Have a serious discussion of MorphMix's assumptions, since they would
seem to be the direct competition. In fact Tor is a flexible architecture
that would encompass MorphMix, and they're nearly identical except for
path selection and node discovery. And the trust system MorphMix has
seems overkill (and/or insecure) based on the threat model we've picked.

Need to discuss how we take the approach of building the thing, and then
assuming that, how much anonymity can we get. We're not here to model or
to simulate or to produce equations and formulae. But those have their
roles too.

%%%

TCP vs UDP

Argument 1: we need to do IP-level packet normalization, to block things
like IP fingerprinting.

Argument 2: we still need to be easy to integrate with applications, so
they can do application-level scrubbing.

Argument 3: we need a block-level encryption approach that can provide
security despite packet loss and out-of-order delivery. I believe you that
such a thing can be created, but no such thing has yet been specified. So
specify it for me if you want me to believe it. (Freedom and Cebolla are
vulnerable to tagging and malleability attacks, I believe.)

Argument 4: we still need to play with parameters for throughput,
congestion control, etc -- since we need sequence numbers and maybe more to
do replay detection, and just to handle duplicate frames. So we would be
reimplementing some subset of TCP anyway.

Argument 5: TLS over UDP is not implemented or even specified.

Argument 6: exit policies over arbitrary IP packets seem to be an IDS-hard
problem. I don't want to build an IDS into Tor.

Argument 7: certain protocols are going to leak information at the IP layer
anyway. For example, if we anonymize your DNS requests, but they still go
to Comcast's DNS servers, that's bad.

Argument 8: hidden services, .exit addresses, etc are broken unless we have
some way to reach into the application-level protocol and decide the
hostname it's trying to get.

\bibliographystyle{plain} \bibliography{tor-design}

\end{document}