- \documentclass{article}
- \newenvironment{tightlist}{\begin{list}{$\bullet$}{
- \setlength{\itemsep}{0mm}
- \setlength{\parsep}{0mm}
- % \setlength{\labelsep}{0mm}
- % \setlength{\labelwidth}{0mm}
- % \setlength{\topsep}{0mm}
- }}{\end{list}}
- \newcommand{\tmp}[1]{{\bf #1} [......] \\}
- \begin{document}
- \title{Tor Development Roadmap: Wishlist for Nov 2006--Dec 2007}
- \author{Roger Dingledine \and Nick Mathewson \and Shava Nerad}
- \maketitle
- \pagestyle{plain}
- \section{Introduction}
- Hi, Roger! Hi, Shava. This paragraph should get deleted soon. Right now,
- this document goes into about as much detail as I'd like to go into for a
- technical audience, since that's the audience I know best. It doesn't have
- time estimates everywhere. It isn't well prioritized, and it doesn't
- distinguish well between things that need lots of research and things that
- don't. The breakdowns don't all make sense. There are lots of things where
- I don't make it clear how they fit into larger goals, and lots of larger
- goals that don't break down into little things. It isn't all stuff we can do
- for sure, and it isn't even all stuff we can do for sure in 2007. The
- tmp\{\} macro indicates stuff I haven't said enough about. That said, here
- goes...
- Tor (the software) and Tor (the overall software/network/support/document
- suite) are now experiencing all the crises of success. Over the next year,
- we're probably going to grow more in terms of users, developers, and funding
- than before. This gives us the opportunity to perform long-neglected
- maintenance tasks.
- \section{Code and design infrastructure}
- \subsection{Protocol revision}
- To maintain backward compatibility, we've postponed major protocol
- changes and redesigns for a long time. Because of this, there are a number
- of sensible revisions we've been putting off until we could deploy several of
- them at once. To do each of these, we first need to discuss design
- alternatives with other cryptographers and outside collaborators to
- make sure that our choices are secure.
- First of all, our protocol needs better {\bf versioning support} so that we
- can make backward-incompatible changes to our core protocol. There are
- difficult anonymity issues here, since many naive designs would make it easy
- to tell clients apart (and then track them) based on their supported versions.
- With protocol versioning support would come the ability to {\bf future-proof
- our ciphersuites}. For example, not only our OR protocol, but also our
- directory protocol, is pretty firmly tied to the SHA-1 hash function, which,
- though not yet known to be insecure for our purposes, has begun to show
- its age. We should
- remove assumptions throughout our design that public
- keys, secret keys, or digests will remain any particular size indefinitely.
- A new protocol could support {\bf multiple cell sizes}. Right now, all data
- passes through the Tor network divided into 512-byte cells. This is
- efficient for high-bandwidth protocols, but inefficient for protocols
- like SSH or AIM that send information in small chunks. Of course, we need to
- investigate the extent to which multiple sizes could make it easier for an
- adversary to fingerprint a traffic pattern.
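As a rough illustration of the overhead at stake, the following Python sketch estimates what fraction of the bytes on the wire is application data when messages are packed into fixed-size cells. The 512-byte cell size comes from the text above; the per-cell overhead of 14 bytes and the 128-byte alternative in the usage note are assumptions chosen for illustration, not part of any proposed design.

```python
import math

CELL_SIZE = 512        # bytes per cell, as stated in the text
ASSUMED_OVERHEAD = 14  # assumed per-cell header bytes (illustrative only)


def cells_needed(message_len: int, cell_size: int = CELL_SIZE,
                 overhead: int = ASSUMED_OVERHEAD) -> int:
    """Number of fixed-size cells needed to carry message_len bytes."""
    capacity = cell_size - overhead
    if capacity <= 0:
        raise ValueError("cell size must exceed per-cell overhead")
    # Even a tiny message occupies one whole cell on the wire.
    return max(1, math.ceil(message_len / capacity))


def efficiency(message_len: int, cell_size: int = CELL_SIZE,
               overhead: int = ASSUMED_OVERHEAD) -> float:
    """Fraction of bytes on the wire that are application data."""
    wire_bytes = cells_needed(message_len, cell_size, overhead) * cell_size
    return message_len / wire_bytes


if __name__ == "__main__":
    # Compare a small interactive message with a bulk transfer.
    for size in (1, 48, 498, 4096):
        print(size, cells_needed(size), round(efficiency(size), 3))
```

Under these assumptions, a 48-byte message uses under a tenth of a 512-byte cell, while a hypothetical 128-byte cell would carry the same message with much less padding. The sketch says nothing about the fingerprinting risk, which would need separate analysis.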
- Our OR {\bf authentication protocol}, though provably
- secure\cite{tap:pet2006}, relies more on particular aspects of RSA and our
- implementation thereof than we had initially believed. To future-proof
- against changes, we should replace it with a less delicate approach.
- \tmp{Stream migration?}
- \tmp{Use a better AES mode that has built-in integrity checking,
- doesn't grow with the number of hops, is not patented, and
- is implemented and maintained by smart people.}
- \subsection{Scalability}
- \subsubsection{Improved directory efficiency}
- Right now, clients download a statement of the {\bf network status} made by
- each directory authority. We could reduce network bandwidth significantly by
- having the authorities jointly sign a statement reflecting their vote on the
- current network status. This would save clients up to 160K per hour, and
- make their view of the network more uniform. Of course, we'd need to make
- sure the voting process was secure and resilient to failures in the network.
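To make the idea of a joint statement concrete, here is a minimal Python sketch of one possible tallying rule: a router appears in the combined status only if a strict majority of authorities list it. The authority names, router names, and the majority rule itself are assumptions for illustration; the sketch leaves out signatures, per-router flags, and failure handling, all of which a real voting process would need.

```python
from collections import Counter
from typing import Dict, Iterable, Set


def consensus_routers(votes: Dict[str, Iterable[str]]) -> Set[str]:
    """Return the routers listed by a strict majority of authorities.

    votes maps a (hypothetical) authority name to the router
    identities that authority believes are running.
    """
    if not votes:
        return set()
    tally: Counter = Counter()
    for routers in votes.values():
        # set() ensures one authority cannot count a router twice.
        tally.update(set(routers))
    threshold = len(votes) // 2 + 1
    return {router for router, count in tally.items() if count >= threshold}


if __name__ == "__main__":
    example_votes = {
        "auth1": ["A", "B", "C"],
        "auth2": ["A", "B"],
        "auth3": ["A", "D"],
    }
    print(sorted(consensus_routers(example_votes)))
```

With the three example votes, routers A and B reach the two-vote threshold and C and D do not. Clients would then fetch one combined statement rather than one statement per authority.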
- We should {\bf shorten router descriptors}, since the current format includes
- a great deal of information that's only of interest to the directory
- authorities, and not of interest to clients. We can do this by having each
- router upload a short-form and a long-form signed descriptor, and having
- clients download only the short form. Even a naive version of this would
- save about 40\% of the bandwidth currently spent by clients downloading
- descriptors.
- We should {\bf have routers upload their descriptors even less often}, so
- that clients do not need to download replacements every 18 hours whether any
- information has changed or not. (As of Tor 0.1.2.3-alpha, clients tolerate
- routers that don't upload often, but routers still upload at least every 18
- hours to support older clients.)
- \subsubsection{Non-clique topology}
- Our current network design achieves a certain amount of its anonymity by
- making clients act like each other through the simple expedient of making
- sure that all clients know all servers, and that any server can talk to any
- other server. But as the number of servers increases to serve an
- ever-greater number of clients, these assumptions become impractical.
- At worst, if these scalability issues become troubling before a solution is
- found, we can design and build a solution to {\bf split the network into
- multiple slices} until a better solution comes along. This is not ideal,
- since rather than looking like all other users from the point of view of path
- selection, users would ``only'' look like 200,000--300,000 other users.
- We are in the process of designing {\bf improved schemes for network
- scalability}. Some approaches focus on limiting what an adversary can know
- about what a user knows; others focus on reducing the extent to which an
- adversary can exploit this knowledge. These are currently in their infancy,
- and will probably not be needed in 2007, but they must be designed in 2007 if
- they are to be deployed in 2008.
- \subsubsection{Relay incentives}
- \tmp{We need incentives to relay.}
- \subsection{Portability}
- Our {\bf Windows implementation}, though much improved, continues to lag
- behind Unix and Mac OS X, especially when running as a server. We hope to
- merge promising patches from Mike Chiussi to address this point, and bring
- Windows performance on par with other platforms.
- We should have {\bf better support for portable devices}, including modes of
- operation that require less RAM, and that write to disk less frequently (to
- avoid wearing out flash RAM).
- \subsection{Performance: resource usage}
- \tmp{Use less RAM when we have little. Make buffer code smarter}
- \tmp{Allow separate bandwidth buckets for different bandwidth classes} This
- gets us more users happy to run servers.
- \tmp{Write-limiting for directory servers}
- \tmp{Don't use so many sockets} We can save some for hidden services and for
- encrypted directories.
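One way to picture separate bandwidth buckets is as one token bucket per traffic class, each with its own rate and burst size. The Python sketch below is a generic token bucket, not a description of Tor's rate-limiting code; the class names ``relayed'' and ``local'' and the rates shown are hypothetical.

```python
class TokenBucket:
    """A simple token bucket: refills at rate bytes/second, holds at most burst bytes."""

    def __init__(self, rate: float, burst: float, now: float = 0.0) -> None:
        self.rate = rate
        self.burst = burst
        self.tokens = burst  # start with a full bucket
        self.last = now

    def refill(self, now: float) -> None:
        """Add tokens for the time elapsed since the last refill, capped at burst."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now

    def try_consume(self, nbytes: int, now: float) -> bool:
        """Spend nbytes of tokens if available; return whether it succeeded."""
        self.refill(now)
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False


# Hypothetical traffic classes, each limited independently.
buckets = {
    "relayed": TokenBucket(rate=20_000, burst=40_000),
    "local": TokenBucket(rate=100_000, burst=200_000),
}

if __name__ == "__main__":
    # Exhausting the relayed bucket leaves the local bucket untouched.
    print(buckets["relayed"].try_consume(40_000, now=0.0))
    print(buckets["relayed"].try_consume(1, now=0.0))
    print(buckets["local"].try_consume(1, now=0.0))
```

Because each class draws from its own bucket, an operator could cap relayed traffic tightly without slowing down their own use of the connection.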
- \subsection{Performance: network usage}
- \tmp{Do research to figure out how well capacity is actually used.}
- \tmp{Adapt to congestion better. Dynamic SENDME window sizes.}
- \tmp{Tune pathgen algorithms to use it better.}
- \subsection{Performance: one Tor client, many users}
- \tmp{Many organizations want to manage a single Tor client on their
- firewall for many users, rather than having each user install a separate
- Tor client.} Nobody has tried this before, and we bet it will scale
- really poorly.
- We should also do other stress-testing, and fix the bottlenecks we find.
- \subsection{Tor servers on asymmetric bandwidth}
- \subsection{Running Tor as both client and server}
- There are many performance tradeoffs and balances here that need more attention.
- \subsection{Blue-sky: UDP}
- \tmp{support udp traffic}
- \tmp{Use udp as a transport}
- \section{Blocking resistance}
- \subsection{Design for blocking resistance}
- We have written a design document explaining our general approach to blocking
- resistance. We should workshop it with other experts in the field to get
- their ideas about how we can improve Tor's efficacy as an anti-censorship
- tool.
- \subsection{Implementation: client-side and bridges-side}
- Our anticensorship design calls for some nodes to act as ``bridges'' that can
- circumvent a national firewall, and others inside the firewall to act as pure
- clients. This part of the design is quite clear-cut; we're probably ready to begin
- implementing it. To implement bridges, we need only to have servers publish
- themselves as limited-availability relays to a special bridge authority if
- they judge they'd make good servers. Clients need a flexible interface to
- learn about bridges and to act on knowledge of bridges.
- Clients also need to {\bf use the encrypted directory variant} added in Tor
- 0.1.2.3-alpha. This will let them retrieve directory information over Tor
- once they've got their initial bridges.
- Bridges will want to be able to {\bf listen on multiple addresses and ports}
- if they can, to give the adversary more ports to block.
- Additionally, we should {\bf resist content-based filters}. Though an
- adversary can't see what users are saying, some aspects of our protocol are
- easy to fingerprint {\em as} Tor. We should correct this where possible.
- \subsection{Implementation: bridge authorities}
- The design here is also reasonably clear-cut: we need to run some
- directory authorities with a slightly modified protocol that doesn't leak
- the entire list of bridges. Thus users can learn up-to-date information
- for bridges they already know about, but they can't learn about arbitrary
- new bridges.
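The property described above, that clients can refresh bridges they already know but cannot enumerate the rest, can be sketched as a directory that answers lookups by identity and offers no listing operation. This Python sketch is purely illustrative: the class and method names are hypothetical, and it assumes bridge identities are hard to guess, which a real design would also have to ensure alongside rate limiting and authentication.

```python
from typing import Dict, Optional


class BridgeDirectory:
    """Answers queries about known bridges without exposing the full list."""

    def __init__(self) -> None:
        # Private mapping from bridge identity to its latest descriptor.
        self._descriptors: Dict[str, str] = {}

    def publish(self, identity: str, descriptor: str) -> None:
        """Record (or replace) the descriptor a bridge has uploaded."""
        self._descriptors[identity] = descriptor

    def lookup(self, identity: str) -> Optional[str]:
        """Return the descriptor for an identity the client already knows, else None."""
        return self._descriptors.get(identity)

    # Deliberately no method that lists or iterates over all bridges.


if __name__ == "__main__":
    directory = BridgeDirectory()
    directory.publish("bridge-id-1", "descriptor v1")
    directory.publish("bridge-id-1", "descriptor v2")
    print(directory.lookup("bridge-id-1"))   # latest descriptor
    print(directory.lookup("unknown-id"))    # None
```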
- \subsection{Implementation: how users discover bridges}
- Our design anticipates an arms race between discovery methods and censors.
- We need to begin the infrastructure on our side quickly, preferably in a
- flexible language like Python, so we can adapt quickly to censorship.
- \subsection{The Tor website, docs, and mirrors}
- They're the first to be blocked. How do users learn about Tor in the
- first place, and how do they fetch a genuine copy of Tor?
- \section{Security}
- \subsection{Security research projects}
- \tmp{Mixed-latency}
- \tmp{long-distance padding}
- \tmp{router-zones}
- \tmp{defenses against end-to-end correlation} We don't expect any to work
- right now, but it would be useful to learn that one did. Alternatively,
- proving that one didn't would free up researchers in the field to go work on
- other things.
- \tmp{website fingerprinting} These attacks work great in simulations, but in
- practice we hear they don't work nearly as well. We should get some actual
- numbers on both sides of the issue, and figure out what's going on.
- \subsection{Implementation security}
- \tmp{Encrypt more keys}
- \tmp{Talk Coverity or somebody with a copy of vs2005 into running tools on
- our code} And figure out a way to get our code checked periodically rather
- than just once.
- \tmp{Directory guards}
- \subsection{Detect corrupt exits and other servers}
- \tmp{Improved feedback mechanism for tools like SOAT to use}
- \tmp{More tools like SOAT: check for routers that bork SSL, routers that
- sniff (and use) passwords...}
- \tmp{Add a way for authorities to declare families.}
- \tmp{Make authority administration simpler so authority ops spend less time
- on random junk and more time on care and feeding of the network.}
- \tmp{Authorities should measure Stable (and maybe Fast) themselves, and not
- just believe declared router uptime.}
- \subsection{Protocol security}
- \tmp{Build in hooks for DoS-resistance: when we need it, we'll really need
- it.}
- \section{Development infrastructure}
- \subsection{Build farm}
- We've begun to deploy a cross-platform distributed build farm of hosts
- that build and test the Tor source every time it changes in our development
- repository.
- We need to {\bf get more participants}, so that we can test a larger variety
- of platforms. (Previously, we only found out that our code had broken on
- obscure platforms when somebody got around to building it.)
- We also need to {\bf add our dependencies} to the build farm, so that we can
- ensure that libraries we need (especially libevent) do not stop working on
- any important platform between one release and the next.
- \subsection{Improved testing harness}
- Currently, our {\bf unit tests} cover only about XX\% of the code base. This
- is uncomfortably low; we should write more and switch to a more flexible
- testing framework.
- We should also write flexible {\bf automated single-host deployment tests} so
- we can more easily verify that the current codebase works with the network.
- \subsection{Centralized build system}
- We currently rely on a separate packager to maintain the packaging system and
- to build Tor on each platform for which we distribute binaries. Having separate
- package maintainers is sensible, but having separate package builders has meant
- long turnaround times between source releases and package releases. We
- should create the necessary infrastructure for us to produce binaries for all
- major packages within an hour or so of source release.
- \subsection{Improved metrics}
- \tmp{We'd like to know how the network is doing.}
- \tmp{We'd like to know where users are in an even less intrusive way.}
- \tmp{We'd like to know how much of the network is getting used.}
- \subsection{Controller library}
- We've done lots of design and development on our controller interface, which
- allows UI applications and other tools to interact with Tor. We could
- encourage the development of more such tools by releasing a {\bf
- general-purpose controller library}, ideally with API support for several
- popular programming languages.
- \section{User experience}
- \subsection{Get blocked less, get blocked less broadly}
- Right now, some services block connections from the Tor network because
- they don't have a better
- way to keep vandals from abusing them than blocking IP addresses associated
- with vandalism. Our approach so far has been to educate them about better
- solutions that currently exist, but we should also {\bf create better
- solutions for limiting vandalism by anonymous users} like credential and
- blind-signature based implementations, and encourage their use. Other
- promising starting points include writing a patch and explanation for
- Wikipedia, and helping Freenode to document, maintain, and expand its
- current Tor-friendly position.
- Those who do block Tor users also block overbroadly, sometimes blacklisting
- operators of Tor servers that do not permit exit to their services. We could
- obviate innocent reasons for doing so by designing a {\bf narrowly-targeted Tor
- RBL service} so that those who wanted to overblock Tor could no longer
- plead incompetence.
- \subsection{All-in-one bundle}
- \tmp{a.k.a. ``Torpedo'', but rename this.}
- \subsection{LiveCD Tor}
- \tmp{a.k.a. anonym.os done right}
- \subsection{A Tor client in a VM}
- \tmp{a.k.a. JanusVM} This is closely related to the firewall-level deployment
- section below.
- \subsection{Interface improvements}
- \tmp{Allow controllers to manipulate server status.}
- (Why is this in the User Experience section?)
- \subsection{Firewall-level deployment}
- Another useful deployment mode for some users is using {\bf Tor in a firewall
- configuration}, and directing all their traffic through Tor. This can be a
- little tricky to set up currently, but it's an effective way to make sure no
- traffic leaves the host un-anonymized. To achieve this, we need to {\bf
- improve and port our new TransPort} feature which allows Tor to be used
- without SOCKS support; to {\bf add an anonymizing DNS proxy} feature to Tor;
- and to {\bf construct a recommended set of firewall configurations} to redirect
- traffic to Tor.
- This is an area where {\bf deployment via a livecd}, or an installation
- targeted at specialized home routing hardware, could be useful.
- \subsection{Assess software and configurations for anonymity risks}
- We should determine which Firefox extensions to use and which to avoid, and
- document best practices for how to torify each class of application.
- We should also clean up our own bundled software:
- e.g., merge the good features of Foxtor into Torbutton.
- \subsection{Localization}
- Right now, most of our user-facing code is internationalized. We need to
- internationalize the last few hold-outs (like the Tor installer), and get
- more translations for the parts that are already internationalized.
- [Do you mean the Vidalia bundle installer, or the Tor-installer-for-experts? -RD]
- Also, we should look into a {\bf unified translator's solution}. Currently,
- since different tools have been internationalized using the
- framework-appropriate method, different tools require translators to localize
- them via different interfaces. Inasmuch as possible, we should make
- translators only need to use a single tool to translate the whole Tor suite.
- \section{Support}
- It would be nice to set up some actual user support infrastructure, especially
- focusing on server operators and on coordinating volunteers.
- \section{Documentation}
- \subsection{Unified documentation scheme}
- We need to {\bf inventory our documentation.} Our documentation so far has
- been mostly produced on an {\it ad hoc} basis, in response to particular
- needs and requests. We should figure out what documentation we have, which of
- it (if any) should get priority, and whether we can't put it all into a
- single format.
- We could {\bf unify the docs} into a single book-like thing. This will also
- help us identify what sections of the ``book'' are missing.
- \subsection{Missing technical documentation}
- We should {\bf revise our design paper} to reflect the new decisions and
- research we've made since it was published in 2004. This will help other
- researchers evaluate and suggest improvements to Tor's current design.
- Other projects sometimes implement the client side of our protocol. We
- encourage this, but we should write {\bf a document about how to avoid
- excessive resource use}, so we don't need to worry that they will do so
- without regard to the effect of their choices on server resources.
- \subsection{Missing user documentation}
- \tmp{Discursive and comprehensive docs}
- \bibliographystyle{plain} \bibliography{tor-design}
- \end{document}