\documentclass{article}

\newenvironment{tightlist}{\begin{list}{$\bullet$}{
  \setlength{\itemsep}{0mm}
  \setlength{\parsep}{0mm}
  % \setlength{\labelsep}{0mm}
  % \setlength{\labelwidth}{0mm}
  % \setlength{\topsep}{0mm}
}}{\end{list}}
\newcommand{\tmp}[1]{{\bf #1} [......] \\}

\begin{document}

\title{Tor Development Roadmap: Wishlist for Nov 2006--Dec 2007}
\author{Roger Dingledine \and Nick Mathewson \and Shava Nerad}
\maketitle
\pagestyle{plain}
\section{Introduction}

Hi, Roger! Hi, Shava. This paragraph should get deleted soon. Right now,
this document goes into about as much detail as I'd like to go into for a
technical audience, since that's the audience I know best. It doesn't have
time estimates everywhere. It isn't well prioritized, and it doesn't
distinguish well between things that need lots of research and things that
don't. The breakdowns don't all make sense. There are lots of things where
I don't make it clear how they fit into larger goals, and lots of larger
goals that don't break down into little things. It isn't all stuff we can do
for sure, and it isn't even all stuff we can do for sure in 2007. The
tmp\{\} macro indicates stuff I haven't said enough about. That said, here
goes...

Tor (the software) and Tor (the overall software/network/support/document
suite) are now experiencing all the crises of success. Over the next year,
we're probably going to grow more in terms of users, developers, and funding
than before. This gives us the opportunity to perform long-neglected
maintenance tasks.
\section{Code and design infrastructure}

\subsection{Protocol revision}

To maintain backward compatibility, we've postponed major protocol
changes and redesigns for a long time. Because of this, there are a number
of sensible revisions we've been putting off until we could deploy several of
them at once. To do each of these, we first need to discuss design
alternatives with cryptographers and other outside collaborators to
make sure that our choices are secure.

First of all, our protocol needs better {\bf versioning support} so that we
can make backward-incompatible changes to our core protocol. There are
difficult anonymity issues here, since many naive designs would make it easy
to tell clients apart based on their supported versions.
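
As a rough illustration of the simplest kind of version negotiation, the
Python sketch below picks the highest version that both sides support. The
function \verb|negotiate_version| and the version numbers are hypothetical,
and this is not Tor's actual handshake. It is exactly the kind of naive
design warned about above: the advertised list itself can distinguish one
client from another.

```python
def negotiate_version(ours, theirs):
    """Return the highest version both sides support, or None if none match."""
    common = set(ours) & set(theirs)
    return max(common) if common else None


# Hypothetical version numbers, for illustration only.
client_versions = [1, 2, 3]
server_versions = [2, 3, 4]
chosen = negotiate_version(client_versions, server_versions)
```
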

With protocol versioning support would come the ability to {\bf future-proof
our ciphersuites}. For example, not only our OR protocol, but also our
directory protocol, is pretty firmly tied to the SHA-1 hash function, which,
though not insecure for our purposes, has begun to show its age. We should
remove assumptions throughout our design that public keys, secret keys, or
digests will remain any particular size indefinitely.

A new protocol could support {\bf multiple cell sizes}. Right now, all data
passes through the Tor network divided into 512-byte cells. This is
efficient for high-bandwidth protocols, but inefficient for protocols
like SSH or AIM that send information in small chunks. Of course, we need to
investigate the extent to which multiple sizes could make it easier for an
adversary to fingerprint a traffic pattern.
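
To make the cost of fixed-size cells concrete, the following Python sketch
estimates what fraction of the transmitted bytes is useful data. The
512-byte cell size comes from the text above. The 14-byte per-cell header
overhead and the 64-byte alternative cell size are assumptions chosen for
illustration.

```python
import math

CELL_SIZE = 512   # bytes per cell, as stated in the text
HEADER_SIZE = 14  # ASSUMED per-cell header overhead in bytes; adjust as needed


def cells_needed(message_len, cell_size=CELL_SIZE, header_size=HEADER_SIZE):
    """Number of fixed-size cells needed to carry message_len bytes."""
    payload = cell_size - header_size
    return max(1, math.ceil(message_len / payload))


def efficiency(message_len, cell_size=CELL_SIZE, header_size=HEADER_SIZE):
    """Fraction of transmitted bytes that are useful data."""
    sent = cells_needed(message_len, cell_size, header_size) * cell_size
    return message_len / sent
```

Under these assumptions, a 32-byte interactive message fills about 6 percent
of a 512-byte cell, but half of a hypothetical 64-byte cell.
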

Our OR {\bf authentication protocol}, though provably
secure\cite{goldberg-tap}, relies more on particular aspects of RSA and our
implementation thereof than we had initially believed. To future-proof
against changes, we should replace it with a less delicate approach.

\subsection{Scalability}

\subsubsection{Improved directory performance}

Right now, clients download a statement of the {\bf network status} made by
each directory authority. We could reduce network bandwidth significantly by
having the authorities jointly sign a statement reflecting their vote on the
current network status. This would save clients up to 160K per hour, and
make their view of the network more uniform. Of course, we'd need to make
sure the voting process was secure and resilient to failures in the network.
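
One simple way to picture a joint statement is a majority vote over the
routers each authority lists. The Python sketch below does only that. The
authority and router names are made up, and a real voting scheme would also
need signatures, agreement on per-router details, and handling for
authorities that fail or misbehave.

```python
from collections import Counter


def consensus(votes):
    """Return the sorted routers listed by a strict majority of authorities.

    votes maps an authority name to the set of router names it lists.
    """
    counts = Counter(router for listed in votes.values() for router in listed)
    threshold = len(votes) // 2 + 1
    return sorted(r for r, n in counts.items() if n >= threshold)


# Hypothetical authorities and routers, for illustration only.
votes = {
    "auth1": {"routerA", "routerB"},
    "auth2": {"routerA", "routerC"},
    "auth3": {"routerA", "routerB"},
}
agreed = consensus(votes)
```

Clients would then fetch one agreed list rather than one statement per
authority.
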

We should {\bf shorten router descriptors}, since the current format includes
a great deal of information that's only of interest to the directory
authorities, and not of interest to clients. We can do this by having each
router upload a short-form and a long-form signed descriptor, and having
clients download only the short form. Even a naive version of this would
save about 40\% of the bandwidth currently spent on descriptors.

We should {\bf have routers upload their descriptors even less often}, so
that clients do not need to download replacements every 18 hours whether any
information has changed or not. (As of Tor 0.1.2.3-alpha, clients tolerate
routers that don't upload often, but routers still upload at least every 18
hours to support older clients.)

\subsubsection{Non-clique topology}

Our current network design achieves a certain amount of its anonymity by
making clients act like each other through the simple expedient of making
sure that all clients know all servers, and that any server can talk to any
other server. But as the number of servers increases to serve an
ever-greater number of clients, these assumptions become impractical.

At worst, if these scalability issues become troubling before a solution is
found, we can design and build a solution to {\bf split the network into
multiple slices} until a better solution comes along. This is not ideal,
since rather than looking like all other users from the point of view of path
selection, users would ``only'' look like 200,000--300,000 other users.

We are in the process of designing {\bf improved schemes for network
scalability}. Some approaches focus on limiting what an adversary can know
about what a user knows; others focus on reducing the extent to which an
adversary can exploit this knowledge. These are currently in their infancy,
and will probably not be needed in 2007, but they must be designed in 2007 if
they are to be deployed in 2008.

\subsubsection{Relay incentives}

\tmp{We need incentives to relay.}

\subsection{Portability}

Our {\bf Windows implementation}, though much improved, continues to lag
behind Unix and Mac OS X, especially when running as a server. We hope to
merge promising patches from Mike Chiussi to address this point, and bring
Windows performance on par with other platforms.

We should have {\bf better support for portable devices}, including modes of
operation that require less RAM, and that write to disk less frequently (to
avoid wearing out flash RAM).

\subsection{Performance: resource usage}

\tmp{Use less RAM when we have little. Make buffer code smarter}
\tmp{Allow separate bandwidth buckets for different bandwidth classes} This
gets us more users happy to run servers.
\tmp{Write-limiting for directory servers}
\tmp{Don't use so many sockets} We can save some for hidden services and for
encrypted directories.
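
The idea of separate bandwidth buckets can be sketched with one token bucket
per traffic class, as in the Python below. The class names, rates, and burst
sizes are hypothetical, and the text above does not say which classes Tor
would distinguish. Elapsed time is passed in explicitly to keep the example
deterministic.

```python
class TokenBucket:
    """Token bucket: refills at 'rate' bytes/second, holds at most 'burst'."""

    def __init__(self, rate, burst):
        self.rate = rate
        self.burst = burst
        self.tokens = burst  # start with a full bucket

    def refill(self, elapsed_seconds):
        """Add tokens for the elapsed time, capped at the burst size."""
        self.tokens = min(self.burst, self.tokens + self.rate * elapsed_seconds)

    def consume(self, nbytes):
        """Spend nbytes if available; return True on success."""
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False


# Hypothetical traffic classes and limits, for illustration only.
buckets = {
    "relayed": TokenBucket(rate=20_000, burst=40_000),
    "local": TokenBucket(rate=50_000, burst=100_000),
}
```

Because each class draws from its own bucket, exhausting the relayed-traffic
allowance leaves the operator's own traffic unaffected.
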

\subsection{Performance: network usage}

\tmp{Do research to figure out how well capacity is actually used.}
\tmp{Tune pathgen algorithms to use it better.}

\subsection{Blue-sky: UDP}

\section{Blocking resistance}

\subsection{Design for blocking resistance}

We have written a design document explaining our general approach to blocking
resistance. We should workshop it with other experts in the field to get
their ideas about how we can improve Tor's efficacy as an anti-censorship
tool.

\subsection{Implementation: client-side and bridges-side}

Our anticensorship design calls for some nodes to act as ``bridges'' that can
circumvent a national firewall, and others inside the firewall to act as pure
clients. The design here is quite clear-cut; we're probably ready to begin
implementing it. To implement bridges, we need only to have servers publish
themselves as limited-availability relays to a special bridge authority if
they judge they'd make good servers. Clients need a flexible interface to
learn about bridges and to act on knowledge of bridges.

Clients also need to {\bf use the encrypted directory variant} added in Tor
0.1.2.3-alpha. This will let them retrieve directory information over Tor
once they've got their initial bridges.

Bridges will want to be able to {\bf listen on multiple addresses and ports}
if they can, to give the adversary more ports to block.

Additionally, we should {\bf resist content-based filters}. Though an
adversary can't see what users are saying, some aspects of our protocol are
easy to fingerprint {\em as} Tor. We should correct this where possible.

\subsection{Implementation: bridge authorities}

Our design anticipates an arms race between discovery methods and censors.
We need to begin building the infrastructure on our side quickly, preferably
in a flexible language like Python, so we can adapt quickly to censorship.

\section{Security}

\subsection{Security research projects}

\tmp{Mixed-latency}
\tmp{long-distance padding}
\tmp{router-zones}
\tmp{defenses against end-to-end correlation} We don't expect any to work
right now, but it would be useful to learn that one did. Alternatively,
proving that one didn't would free up researchers in the field to go work on
other things.

\subsection{Implementation security}

\tmp{Encrypt more keys}
\tmp{Talk Coverity or somebody with a copy of vs2005 into running tools on
our code}
\tmp{Directory guards}

\subsection{Detect corrupt exits and other servers}

\tmp{Improved feedback mechanism for tools like SOAT to use}
\tmp{More tools like SOAT: check for routers that bork SSL, routers that
sniff (and use) passwords...}
\tmp{Add a way for authorities to declare families.}
\tmp{Make authority administration simpler so authority ops spend less time
on random junk and more time on care and feeding of the network.}
\tmp{Authorities should measure Stable (and maybe Fast) themselves, and not
just believe declared router uptime.}

\subsection{Protocol security}

\tmp{Build in hooks for DoS-resistance: when we need it, we'll really need
it.}

\section{Development infrastructure}

\subsection{Build farm}

We've begun to deploy a cross-platform distributed build farm of hosts
that build and test the Tor source every time it changes in our development
repository.

We need to {\bf get more participants}, so that we can test a larger variety
of platforms. (Previously, we've found out that our code had broken on
obscure platforms only when somebody got around to building it.)

We also need to {\bf add our dependencies} to the build farm, so that we can
ensure that libraries we need (especially libevent) do not stop working on
any important platform between one release and the next.

\subsection{Improved testing harness}

Currently, our {\bf unit tests} cover only about XX\% of the code base. This
is uncomfortably low; we should write more and switch to a more flexible
testing framework.

We should also write flexible {\bf automated single-host deployment tests} so
we can more easily verify that the current codebase works with the network.

\subsection{Centralized build system}

We currently rely on a separate packager to maintain the packaging system and
to build Tor on each platform for which we distribute binaries. Having
separate package maintainers is sensible, but having separate package
builders has meant long turnaround times between source releases and package
releases. We should create the necessary infrastructure for us to produce
binaries for all major packages within an hour or so of source release.

\subsection{Improved metrics}

\tmp{We'd like to know how the network is doing.}
\tmp{We'd like to know where users are in an even less intrusive way.}
\tmp{We'd like to know how much of the network is getting used.}

\subsection{Controller library}

\tmp{release a general-purpose controller library}

\section{User experience}

\subsection{Get blocked less, get blocked less hard}

\tmp{Implement and publicize blind-signature based credential scheme}
\tmp{Maybe make a minimal RBL thing}

\subsection{All-in-one bundle}

\tmp{a.k.a.\ ``Torpedo'', but rename this.}

\subsection{LiveCD Tor}

\tmp{a.k.a.\ anonym.os done right}

\subsection{Interface improvements}

\tmp{Allow controllers to manipulate server status.}

\subsection{Firewall-level deployment}

\tmp{Make our new TransPort logic more portable and tested}
\tmp{Write logic for Tor to act as a DNS server}
\tmp{Write necessary glue code, scripts, and docs so users who want to use
Tor as a firewall-like thing can. Consider a livecd.}

\subsection{Localization}

Right now, most of our user-facing code is internationalized. We need to
internationalize the last few hold-outs (like the Tor installer), and get
more translations for the parts that are already internationalized.

Also, we should look into a {\bf unified translator's solution}. Currently,
since different tools have been internationalized using the
framework-appropriate method, different tools require translators to localize
them via different interfaces. As far as possible, we should let translators
use a single tool to translate the whole Tor suite.

\section{Documentation}

\subsection{Unified documentation scheme}

\tmp{Keep track of all the docs we've got}
\tmp{Unify the docs into a single book-like thing} This will also help us
identify what sections of the ``book'' are missing.

\subsection{Missing technical documentation}

\tmp{Revised design paper, or design paper plus errata}
\tmp{``How to play nice with Tor''}

\end{document}