							- \documentclass{article}
 
- \usepackage{url}
 
- \usepackage{fullpage}
 
- \newenvironment{tightlist}{\begin{list}{$\bullet$}{
 
-   \setlength{\itemsep}{0mm}
 
-     \setlength{\parsep}{0mm}
 
-     %  \setlength{\labelsep}{0mm}
 
-     %  \setlength{\labelwidth}{0mm}
 
-     %  \setlength{\topsep}{0mm}
 
-     }}{\end{list}}
 
- \newcommand{\tmp}[1]{{\bf #1} [......] \\}
 
- \newcommand{\plan}[1]{ {\bf (#1)}}
 
- \begin{document}
 
- \title{Tor Development Roadmap: Wishlist for 2008 and beyond}
 
- \author{Roger Dingledine \and Nick Mathewson}
 
- \date{}
 
- \maketitle
 
- \pagestyle{plain}
 
- \section{Introduction}
 
- Tor (the software) and Tor (the overall software/network/support/document
 
- suite) are now experiencing all the crises of success.  Over the next
 
- years, we're probably going to grow even more in terms of users, developers,
 
- and funding than before. This document attempts to lay out all the
 
- well-understood next steps that Tor needs to take. We should periodically
 
- reorganize it to reflect current and intended priorities.
 
- \section{Everybody can be a relay}
 
- We've made a lot of progress towards letting an ordinary Tor client also
 
- serve as a Tor relay. But these issues remain.
 
- \subsection{UPNP}
 
- We should teach Vidalia how to speak UPNP to automatically open and
 
- forward ports on common (e.g. Linksys) routers. There are some promising
 
- Qt-based UPNP libs out there, and in any case there are others (e.g. in
 
- Perl) that we can base it on.
 
- \subsection{``ORPort auto'' to look for a reachable port}
 
- Vidalia defaults to port 443 on Windows and port 8080 elsewhere. But if
 
- that port is already in use, or the ISP filters incoming connections
 
- on that port (some cablemodem providers filter 443 inbound), the user
 
- needs to learn how to notice this, and then pick a new one and type it
 
- into Vidalia.
 
- We should add a new option ``auto'' that cycles through a set of preferred
 
- ports, testing bindability and reachability for each of them, and only
 
- complains to the user once it's given up on the common choices.
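
As a rough illustration of the ``auto'' behavior described above, the following Python sketch walks a list of candidate ports and returns the first one that passes both checks. Everything here is hypothetical rather than Vidalia or Tor code: the function names, the candidate list beyond 443 and 8080, and in particular the \texttt{is\_reachable} callback, which stands in for an external reachability test that cannot be done from the local machine alone.

```python
import socket
from typing import Callable, Iterable, Optional

# Hypothetical preference order; only 443 and 8080 come from the text above.
PREFERRED_PORTS = (443, 8080, 80, 9001)


def can_bind(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if a TCP socket can currently be bound to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
            return True
        except OSError:
            # Port already in use, or we lack permission to bind it.
            return False


def pick_or_port(
    ports: Iterable[int] = PREFERRED_PORTS,
    is_bindable: Callable[[int], bool] = can_bind,
    is_reachable: Callable[[int], bool] = lambda port: True,
) -> Optional[int]:
    """Return the first port that is both bindable and reachable, else None.

    is_reachable is an assumed hook for an external reachability check;
    the default accepts every port and is only a placeholder.
    """
    for port in ports:
        if is_bindable(port) and is_reachable(port):
            return port
    return None
```

Returning \texttt{None} corresponds to the point where the software has given up on the common choices and would need to ask the user.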
 
- \subsection{Incentives design}
 
- Roger has been working with researchers at Rice University to simulate
 
- and analyze a new design where the directory authorities assign gold
 
- stars to well-behaving relays, and then all the relays give priority
 
- to traffic from gold-starred relays. The great feature of the design is
 
- that not only does it provide the (explicit) incentive to run a relay,
 
- but it also aims to grow the overall capacity of the network, so even
 
- non-relays will benefit.
 
- It needs more analysis, and perhaps more design work, before we try
 
- deploying it.
 
- \subsection{Windows libevent}
 
- Tor relays still don't work well or reliably on Windows XP or Windows
 
- Vista, because we don't use the Windows-native ``overlapped IO''
 
- approach. Christian King made a good start at teaching libevent about
 
- overlapped IO during Google Summer of Code 2007, and next steps are
 
- to a) finish that, b) teach Tor to do openssl calls on buffers rather
 
- than directly to the network, and c) teach Tor to use the new libevent
 
- buffers approach.
 
- \subsection{Network scaling}
 
- If we attract many more relays, we will need to handle the growing pains
 
- in terms of getting all the directory information to all the users.
 
- The first piece of this issue is a practical question: since the
 
- directory size scales linearly with more relays, at some point it
 
- will no longer be practical for every client to learn about every
 
- relay. We can try to reduce the amount of information each client needs
 
- to fetch (e.g. based on fetching less information preemptively as in
 
- Section~\ref{subsec:fewer-descriptor-fetches} below), but eventually
 
- clients will need to learn about only a subset of the network, and we
 
- will need to design good ways to divide up the network information.
 
- The second piece is an anonymity question that arises from this
 
- partitioning: if Tor's security comes from having all the clients
 
- behaving in similar ways, yet we are now giving different clients
 
- different directory information, how can we minimize the new anonymity
 
- attacks we introduce?
 
- \subsection{Using fewer sockets}
 
- Since in the current network every Tor relay can reach every other Tor
 
- relay, and we have many times more users than relays, pretty much every
 
- possible link in the network is in use. That is, the current network
 
- is a clique in practice.
 
- And since each of these connections requires a TCP socket, it's going
 
- to be hard for the network to grow much larger: many systems come with
 
- a default of 1024 file descriptors allowed per process, and raising
 
- that ulimit is hard for end users. Worse, many low-end gateway/firewall
 
- routers can't handle this many connections in their routing table.
 
- One approach is a restricted-route topology~\cite{danezis:pet2003}:
 
- predefine which relays can reach which other relays, and communicate
 
- these restrictions to the relays and the clients. We need to compute
 
- which links are acceptable in a way that's decentralized yet scalable,
 
- and in a way that achieves a small-worlds property; and we
 
- need an efficient (compact) way to characterize the topology information
 
- so all the users could keep up to date.
 
- Another approach would be to switch to UDP-based transport between
 
- relays, so we don't need to keep the TCP sockets open at all. Needs more
 
- investigation too.
 
- \subsection{Auto bandwidth detection and rate limiting, especially for
 
-       asymmetric connections.}
 
- \subsection{Better algorithms for giving priority to local traffic}
 
- Proposal 111 made a lot of progress at separating local traffic from
 
- relayed traffic, so Tor users can rate limit the relayed traffic at a
 
- stricter level. But since we want to pass both traffic classes over the
 
- same TCP connection, we can't keep them entirely separate. The current
 
- compromise is that we treat all bytes to/from a given connection as
 
- local traffic if any of the bytes within the past N seconds were local
 
- bytes.  But a) we could use some more intelligent heuristics, and b)
 
- this leaks information to an active attacker about when local traffic
 
- was sent/received.
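
The current compromise can be sketched in a few lines of Python. This is only an illustration of the heuristic as described above, not Tor's implementation; the class and method names are made up, and \texttt{window} plays the role of $N$, whose value the text does not specify.

```python
from typing import Optional


class ConnectionClassifier:
    """Treat a connection as 'local' for `window` seconds after any local bytes.

    Hypothetical sketch: `window` corresponds to N in the text above.
    """

    def __init__(self, window: float) -> None:
        self.window = window
        # Timestamp of the most recent local bytes, or None if never seen.
        self.last_local: Optional[float] = None

    def note_bytes(self, now: float, is_local: bool) -> None:
        """Record that bytes were seen on this connection at time `now`."""
        if is_local:
            self.last_local = now

    def counts_as_local(self, now: float) -> bool:
        """True if any local bytes were seen within the past `window` seconds."""
        if self.last_local is None:
            return False
        return now - self.last_local <= self.window
```

The sketch also makes concern (b) visible: whether a connection is currently classified as local depends only on when local bytes last appeared, so an attacker who can observe the change in rate limiting learns something about that timing.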
 
- \subsection{Tolerate absurdly wrong clocks, even for relays}
 
- Many of our users are on Windows, running with a clock several days or
 
- even several years off from reality. Some of them are even intentionally
 
- in this state so they can run software that will only run in the past.
 
- Before Tor 0.1.1.x, Tor clients would still function if their clock was
 
- wildly off --- they simply got a copy of the directory and believed it.
 
- Starting in Tor 0.1.1.x (and even more so in Tor 0.2.0.x), the clients
 
- only use networkstatus documents that they believe to be recent, so
 
- clients with extremely wrong clocks no longer work. (This bug has been
 
- an unending source of vague and confusing bug reports.)
 
- The first step is for clients to recognize when all the directory material
 
- they're fetching has roughly the same offset from their current time,
 
- and then automatically correct for it.
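
One way to make this first step concrete is sketched below in Python. It is a guess at a reasonable heuristic, not a specified design: the function name, the thresholds \texttt{max\_spread} and \texttt{min\_docs}, and the choice of the median are all assumptions. Note also that a document's publication time is always somewhat in the past, so the estimate is biased by the age of the documents.

```python
from statistics import median
from typing import Optional, Sequence, Tuple


def estimate_clock_offset(
    samples: Sequence[Tuple[float, float]],
    max_spread: float = 300.0,
    min_docs: int = 3,
) -> Optional[float]:
    """Estimate how far the local clock is from the directory's notion of time.

    Each sample is (published_time, local_receive_time) in seconds.
    Returns the median offset (published - local) if the samples roughly
    agree with each other, or None if there are too few samples or they
    disagree by more than max_spread seconds.
    """
    if len(samples) < min_docs:
        return None
    offsets = [published - received for published, received in samples]
    if max(offsets) - min(offsets) > max_spread:
        # The documents do not share a common offset; do not correct.
        return None
    return median(offsets)
```

A client could then add the returned offset to its local clock reading before deciding whether a networkstatus document is recent.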
 
- Once that's working well, clients who opt to become bridge relays should
 
- be able to use the same approach to serve accurate directory information
 
- to their bridge users.
 
- \subsection{Risks from being a relay}
 
- Three different research
 
- papers~\cite{back01,clog-the-queue,attack-tor-oak05} describe ways to
 
- identify the nodes in a circuit by running traffic through candidate nodes
 
- and looking for dips in the traffic while the circuit is active. These
 
- clogging attacks are not that scary in the Tor context so long as relays
 
- are never clients too. But if we're trying to encourage more clients to
 
- turn on relay functionality too (whether as bridge relays or as normal
 
- relays), then we need to understand this threat better and learn how to
 
- mitigate it.
 
- One promising research direction is to investigate the RelayBandwidthRate
 
- feature that lets Tor rate limit relayed traffic differently from local
 
- traffic. Since the attacker's ``clogging'' traffic is not in the same
 
- bandwidth class as the traffic initiated by the user, it may be harder
 
- to detect interference. Or it may not be.
 
- \subsection{First a bridge, then a public relay?}
 
- Once enough of the items in this section are done, I want all clients
 
- to start out automatically detecting their reachability and opting
 
- to be bridge relays.
 
- Then if they realize they have enough consistency and bandwidth, they
 
- should automatically upgrade to being non-exit relays.
 
- What metrics should we use for deciding when we're fast enough
 
- and stable enough to switch? Given that the list of bridge relays needs
 
- to be kept secret, it doesn't make much sense to switch back.
 
- \section{Tor on low resources / slow links}
 
- \subsection{Reducing directory fetches further}
 
- \label{subsec:fewer-descriptor-fetches}
 
- \subsection{AvoidDiskWrites}
 
- \subsection{Using less ram}
 
- \subsection{Better DoS resistance for tor servers / authorities}
 
- \section{Blocking resistance}
 
- \subsection{Better bridge-address-distribution strategies}
 
- \subsection{Get more volunteers running bridges}
 
- \subsection{Handle multiple bridge authorities}
 
- \subsection{Anonymity for bridge users: second layer of entry guards, etc?}
 
- \subsection{More TLS normalization}
 
- \subsection{Harder to block Tor software distribution}
 
- \subsection{Integration with Psiphon}
 
- \section{Packaging}
 
- \subsection{Switch Privoxy out for Polipo}
 
-       - Make Vidalia able to launch more programs itself
 
- \subsection{Continue Torbutton improvements}
 
-       especially better docs
 
- \subsection{Vidalia and stability (especially wrt ongoing Windows problems)}
 
-       learn how to get useful crash reports (tracebacks) from Windows users
 
- \subsection{Polipo support on Windows}
 
- \subsection{Auto update for Tor, Vidalia, others}
 
- \subsection{Tor browser bundle for USB and standalone use}
 
- \subsection{LiveCD solution}
 
- \subsection{VM-based solution}
 
- \subsection{Tor-on-enclave-firewall configuration}
 
- \subsection{General tutorials on what common applications are Tor-friendly}
 
- \subsection{Controller libraries (torctl) plus documentation}
 
- \subsection{Localization and translation (Vidalia, Torbutton, web pages)}
 
- \section{Interacting better with Internet sites}
 
- \subsection{Make tordnsel (tor exitlist) better and more well-known}
 
- \subsection{Nymble}
 
- \subsection{Work with Wikipedia, Slashdot, Google(, IRC networks)}
 
- \subsection{IPv6 support for exit destinations}
 
- \section{Network health}
 
- \subsection{torflow / soat to detect bad relays}
 
- \subsection{make authorities more automated}
 
- \subsection{torstatus pages and better trend tracking}
 
- \subsection{better metrics for assessing network health / growth}
 
-       - geoip usage-by-country reporting and aggregation
 
-         (Once that's working, switch to Directory guards)
 
- \section{Performance research}
 
- \subsection{Load balance better}
 
- \subsection{Improve our congestion control algorithms}
 
- \subsection{Two-hops vs Three-hops}
 
- \subsection{Transport IP packets end-to-end}
 
- \section{Outreach and user education}
 
- \subsection{``Who uses Tor'' use cases}
 
- \subsection{Law enforcement contacts}
 
-       - ``Was this IP address a Tor relay recently?'' database
 
- \subsection{Commercial/enterprise outreach. Help them use Tor well and
 
-       not fear it.}
 
- \subsection{NGO outreach and training.}
 
-       - ``How to be a safe blogger''
 
- \subsection{More activist coordinators, more people to answer user questions}
 
- \subsection{More people to hold hands of server operators}
 
- \subsection{Teaching the media about Tor}
 
- \subsection{The-dangers-of-plaintext awareness}
 
- \subsection{check.torproject.org and other ``privacy checkers''}
 
- \subsection{Stronger legal FAQ for US}
 
- \subsection{Legal FAQs for other countries}
 
- \section{Anonymity research}
 
- \subsection{estimate relay bandwidth more securely}
 
- \subsection{website fingerprinting attacks}
 
- \subsection{safer e2e defenses}
 
- \subsection{Using Tor when you really need anonymity. Can you compose it
 
-       with other steps, like more trusted guards or separate proxies?}
 
- \subsection{Topology-aware routing; routing-zones, steven's pet2007 paper.}
 
- \subsection{Exactly what do guard nodes provide?}
 
- Entry guards seem to defend against all sorts of attacks. Can we work
 
- through all the benefits they provide? Papers like Nikita's CCS 2007
 
- paper make me think their value is not well-understood by the research
 
- community.
 
- \section{Organizational growth and stability}
 
- \subsection{A contingency plan if Roger gets hit by a bus}
 
-       - Get a new executive director
 
- \subsection{More diversity of funding}
 
-       - Don't rely on any one funder as much
 
-       - Don't rely on any sector or funder category as much
 
- \subsection{More Tor-funded people who are skilled at peripheral apps like
 
-       Vidalia, Torbutton, Polipo, etc}
 
- \subsection{More coordinated media handling and strategy}
 
- \subsection{Clearer and more predictable trademark behavior}
 
- \subsection{More outside funding for internships, etc e.g. GSoC.}
 
- \section{Hidden services}
 
- \subsection{Scaling: how to handle many hidden services}
 
- \subsection{Performance: how to rendezvous with them quickly}
 
- \subsection{Authentication/authorization: how to tolerate DoS / load}
 
- \section{Tor as a general overlay network}
 
- \subsection{Choose paths / exit by country}
 
- \subsection{Easier to run your own private servers and have Tor use them
 
-       anywhere in the path}
 
- \subsection{Easier to run an independent Tor network}
 
- \section{Code security/correctness}
 
- \subsection{veracode}
 
- \subsection{code audit}
 
- \subsection{more fuzzing tools}
 
- \subsection{build farm, better testing harness}
 
- \subsection{Long-overdue code refactoring and cleanup}
 
- \section{Protocol security}
 
- \subsection{safer circuit handshake}
 
- \subsection{protocol versioning for future compatibility}
 
- \subsection{cell sizes}
 
- \subsection{adapt to new key sizes, etc}
 
- \bibliographystyle{plain} \bibliography{tor-design}
 
- \end{document}
 
- \section{Code and design infrastructure}
 
- \subsection{Protocol revision}
 
- To maintain backward compatibility, we've postponed major protocol
 
- changes and redesigns for a long time.  Because of this, there are a number
 
- of sensible revisions we've been putting off until we could deploy several of
 
- them at once.  To do each of these, we first need to discuss design
 
- alternatives with other cryptographers and outside collaborators to
 
- make sure that our choices are secure.
 
- First of all, our protocol needs better {\bf versioning support} so that we
 
- can make backward-incompatible changes to our core protocol.  There are
 
- difficult anonymity issues here, since many naive designs would make it easy
 
- to tell clients apart (and then track them) based on their supported versions.
 
- With protocol versioning support would come the ability to {\bf future-proof
 
-   our ciphersuites}.  For example, not only our OR protocol, but also our
 
- directory protocol, is pretty firmly tied to the SHA-1 hash function, which
 
- though not yet known to be insecure for our purposes, has begun to show
 
- its age.  We should remove every assumption in our design that public
 
- keys, secret keys, or digests will remain any particular size indefinitely.
 
- Our OR {\bf authentication protocol}, though provably
 
- secure~\cite{tap:pet2006}, relies more on particular aspects of RSA and our
 
- implementation thereof than we had initially believed.  To future-proof
 
- against changes, we should replace it with a less delicate approach.
 
- \plan{For all the above: 2 person-months to specify, spread over several
 
-   months with time for interaction with external participants.  One
 
-   person-month to implement.  Start specifying in early 2007.}
 
- We might design a {\bf stream migration} feature so that streams tunneled
 
- over Tor could be more resilient to dropped connections and changed IPs.
 
- \plan{Not in 2007.}
 
- A new protocol could support {\bf multiple cell sizes}.  Right now, all data
 
- passes through the Tor network divided into 512-byte cells.  This is
 
- efficient for high-bandwidth protocols, but inefficient for protocols
 
- like SSH or AIM that send information in small chunks.  Of course, we need to
 
- investigate the extent to which multiple sizes could make it easier for an
 
- adversary to fingerprint a traffic pattern. \plan{Not in 2007.}
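
To put a rough number on the inefficiency for small writes, the Python sketch below computes the fraction of bytes on the wire that carry application data when a message is packed into fixed-size cells. The 512-byte cell size is from the text; the per-cell header overhead \texttt{header\_len} is an illustrative assumption, and TLS and TCP overhead are ignored.

```python
import math

CELL_SIZE = 512  # bytes per cell, as stated in the text


def cells_needed(message_len: int, payload_per_cell: int) -> int:
    """Number of fixed-size cells needed to carry message_len bytes.

    Even an empty or one-byte message occupies one full cell.
    """
    return max(1, math.ceil(message_len / payload_per_cell))


def wire_efficiency(
    message_len: int,
    cell_size: int = CELL_SIZE,
    header_len: int = 14,  # assumed per-cell header overhead, for illustration
) -> float:
    """Fraction of transmitted cell bytes that are application data."""
    payload_per_cell = cell_size - header_len
    n_cells = cells_needed(message_len, payload_per_cell)
    return message_len / (n_cells * cell_size)
```

Under these assumptions a one-byte keystroke still occupies a full 512-byte cell, which is the case that multiple cell sizes would improve, while a message that exactly fills the payload wastes only the header bytes.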
 
- As a part of our design, we should investigate possible {\bf cipher modes}
 
- other than counter mode.  For example, a mode with built-in integrity
 
- checking, error propagation, and random access could simplify our protocol
 
- significantly.  Sadly, many of these are patented and unavailable for us.
 
- \plan{Not in 2007.}
 
- \subsection{Scalability}
 
- \subsubsection{Improved directory efficiency}
 
- We should {\bf have routers upload their descriptors even less often}, so
 
- that clients do not need to download replacements every 18 hours whether any
 
- information has changed or not.  (As of Tor 0.1.2.3-alpha, clients tolerate
 
- routers that don't upload often, but routers still upload at least every 18
 
- hours to support older clients.) \plan{Must do, but not until 0.1.1.x is
 
- deprecated in mid 2007. 1 week.}
 
- \subsubsection{Non-clique topology}
 
- Our current network design achieves a certain amount of its anonymity by
 
- making clients act like each other through the simple expedient of making
 
- sure that all clients know all servers, and that any server can talk to any
 
- other server.  But as the number of servers increases to serve an
 
- ever-greater number of clients, these assumptions become impractical.
 
- At worst, if these scalability issues become troubling before a solution is
 
- found, we can design and build a solution to {\bf split the network into
 
- multiple slices} until a better solution comes along.  This is not ideal,
 
- since rather than looking like all other users from a point of view of path
 
- selection, users would ``only'' look like 200,000--300,000 other
 
- users.\plan{Not unless needed.}
 
- We are in the process of designing {\bf improved schemes for network
 
-   scalability}.  Some approaches focus on limiting what an adversary can know
 
- about what a user knows; others focus on reducing the extent to which an
 
- adversary can exploit this knowledge.  These are currently in their infancy,
 
- and will probably not be needed in 2007, but they must be designed in 2007 if
 
- they are to be deployed in 2008.\plan{Design in 2007; unknown difficulty.
 
-   Write a paper.}
 
- \subsubsection{Relay incentives}
 
- To support more users on the network, we need to get more servers.  So far,
 
- we've relied on volunteerism to attract server operators, and so far it's
 
- served us well.  But in the long run, we need to {\bf design incentives for
 
-   users to run servers} and relay traffic for others.  Most obviously, we
 
- could try to build the network so that servers offered improved service for
 
- other servers, but we would need to do so without weakening anonymity and
 
- making it obvious which connections originate from users running servers.  We
 
- have some preliminary designs~\cite{incentives-txt,tor-challenges},
 
- but need to perform
 
- some more research to make sure they would be safe and effective.\plan{Write
 
-   a draft paper; 2 person-months.}
 
- (XXX we did that)
 
- \subsection{Portability}
 
- Our {\bf Windows implementation}, though much improved, continues to lag
 
- behind Unix and Mac OS X, especially when running as a server.  We hope to
 
- merge promising patches from Christian King to address this point, and bring
 
- Windows performance on par with other platforms.\plan{Do in 2007; 1.5 months
 
-   to integrate not counting Mike's work.}
 
- We should have {\bf better support for portable devices}, including modes of
 
- operation that require less RAM, and that write to disk less frequently (to
 
- avoid wearing out flash RAM).\plan{Optional; 2 weeks.}
 
- \subsection{Performance: resource usage}
 
- We've been working on {\bf using less RAM}, especially on servers.  This has
 
- paid off a lot for directory caches in the 0.1.2.x series, which in some cases are
 
- using 90\% less memory than they used to require.  But we can do better,
 
- especially in the area around our buffer management algorithms, by using an
 
- approach more like the BSD and Linux kernels use instead of our current ring
 
- buffer approach.  (For OR connections, we can just use queues of cell-sized
 
- chunks produced with a specialized allocator.)  This could potentially save
 
- around 25 to 50\% of the memory currently allocated for network buffers, and
 
- make Tor a more attractive proposition for restricted-memory environments
 
- like old computers, mobile devices, and the like.\plan{Do in 2007; 2-3 weeks
 
-   plus one week measurement.} (XXX We did this, but we need to do something
 
- more/else.)
 
- \subsection{Performance: network usage}
 
- We know too little about how well our current path
 
- selection algorithms actually spread traffic around the network in practice.
 
- We should {\bf research the efficacy of our traffic allocation} and either
 
- assure ourselves that it is close enough to optimal as to need no improvement
 
- (unlikely) or {\bf identify ways to improve network usage}, and get more
 
- users' traffic delivered faster.  Performing this research will require
 
- careful thought about anonymity implications.
 
- We should also {\bf examine the efficacy of our congestion control
 
-   algorithm}, and see whether we can improve client performance in the
 
- presence of a congested network through dynamic `sendme' window sizes or
 
- other means.  This will have anonymity implications too if we aren't careful.
 
- \plan{For both of the above: research, design and write
 
-   a measurement tool in 2007: 1 month.  See if we can interest a graduate
 
-   student.}
 
- We should work on making Tor's cell-based protocol  perform better on
 
- networks with low bandwidth
 
- and high packet loss.\plan{Do in 2007 if we're funded to do it; 4-6 weeks.}
 
- \subsection{Performance scenario: one Tor client, many users}
 
- We should {\bf improve Tor's performance when a single Tor handles many
 
-   clients}.  Many organizations want to manage a single Tor client on their
 
- firewall for many users, rather than having each user install a separate
 
- Tor client.  We haven't optimized for this scenario, and it is likely that
 
- there are some code paths in the current implementation that become
 
- inefficient when a single Tor is servicing hundreds or thousands of client
 
- connections.  (Additionally, it is likely that such clients have interesting
 
- anonymity requirements that we should investigate.)  We should profile Tor
 
- under appropriate loads, identify bottlenecks, and fix them.\plan{Do in 2007
 
-   if we're funded to do it; 4-8 weeks.}
 
- \subsection{Tor servers on asymmetric bandwidth}
 
- Tor should work better on servers that have asymmetric connections like cable
 
- or DSL.  Because Tor has separate TCP connections between each
 
- hop, if the incoming bytes are arriving just fine and the outgoing bytes are
 
- all getting dropped on the floor, the TCP push-back mechanisms don't really
 
- transmit this information back to the incoming streams.\plan{Do in 2007 since
 
-   related to bandwidth limiting.  3-4 weeks.}
 
- \subsection{Running Tor as both client and server}
 
- There are many performance tradeoffs and balances that might need more attention.
 
- We first need to track and fix whatever bottlenecks emerge; but we also
 
- need to invent good algorithms for prioritizing the client's traffic
 
- without starving the server's traffic too much.\plan{No idea; try
 
- profiling and improving things in 2007.}
 
- \subsection{Protocol redesign for UDP}
 
- Tor has relayed only TCP traffic since its first versions, and has used
 
- TLS-over-TCP to do so.  This approach has proved reliable and flexible, but
 
- in the long term we will need to allow UDP traffic on the network, and switch
 
- some or all of the network to using a UDP transport.  {\bf Supporting UDP
 
-   traffic} will make Tor more suitable for protocols that require UDP, such
 
- as many VOIP protocols.  {\bf Using a UDP transport} could greatly reduce
 
- resource limitations on servers, and make the network far less interruptible
 
- by lossy connections.  Either of these protocol changes would require a great
 
- deal of design work, however.  We hope to be able to enlist the aid of a few
 
- talented graduate students to assist with the initial design and
 
- specification, but the actual implementation will require significant testing
 
- of different reliable transport approaches.\plan{Maybe do a design in 2007 if
 
- we find an interested academic.  Ian or Ben L might be good partners here.}
 
- \section{Blocking resistance}
 
- \subsection{Design for blocking resistance}
 
- We have written a design document explaining our general approach to blocking
 
- resistance.  We should workshop it with other experts in the field to get
 
- their ideas about how we can improve Tor's efficacy as an anti-censorship
 
- tool.
 
- \subsection{Implementation: client-side and bridges-side}
 
- Bridges will want to be able to {\bf listen on multiple addresses and ports}
 
- if they can, to give the adversary more ports to block.
 
- \subsection{Research: anonymity implications from becoming a bridge}
 
- See arma's bridge proposal; e.g. should bridge users use a second layer of
 
- entry guards?
 
- \subsection{Implementation: bridge authority}
 
- We run some
 
- directory authorities with a slightly modified protocol that doesn't leak
 
- the entire list of bridges. Thus users can learn up-to-date information
 
- for bridges they already know about, but they can't learn about arbitrary
 
- new bridges.
 
- We need a design for distributing the bridge authority over more than one
 
- server.
 
- \subsection{Normalizing the Tor protocol on the wire}
 
- Additionally, we should {\bf resist content-based filters}.  Though an
 
- adversary can't see what users are saying, some aspects of our protocol are
 
- easy to fingerprint {\em as} Tor.  We should correct this where possible.
 
- Look like Firefox; or look like nothing?
 
- Future research: investigate timing similarities with other protocols.
 
- \subsection{Research: scanning-resistance}
 
- \subsection{Research/Design/Impl: how users discover bridges}
 
- Our design anticipates an arms race between discovery methods and censors.
 
- We need to begin the infrastructure on our side quickly, preferably in a
 
- flexible language like Python, so we can adapt quickly to censorship.
 
- phase one: personal bridges
 
- phase two: families of personal bridges
 
- phase three: more structured social network
 
- phase four: bag of tricks
 
- Research: phase five...
 
- Integration with Psiphon, etc?
 
- \subsection{Document best practices for users}
 
- Document best practices for various activities common among
 
- blocked users (e.g. WordPress use).
 
- \subsection{Research: how to know if a bridge has been blocked?}
 
- \subsection{GeoIP maintenance, and ``private'' user statistics}
 
- How to know if the whole idea is working?
 
- \subsection{Research: hiding whether the user is reading or publishing?}
 
- \subsection{Research: how many bridges do you need to know to maintain
 
- reachability?}
 
- \subsection{Resisting censorship of the Tor website, docs, and mirrors}
 
- We should take some effort to consider {\bf initial distribution of Tor and
 
-   related information} in countries where the Tor website and mirrors are
 
- censored.  (Right now, most countries that block access to Tor block only the
 
- main website and leave mirrors and the network itself untouched.)  Falling
 
- back on word-of-mouth is always a good last resort, but we should also take
 
- steps to make sure it's relatively easy for users to get ahold of a copy.
 
- \section{Security}
 
- \subsection{Security research projects}
 
- We should investigate approaches with some promise to help Tor resist
 
- end-to-end traffic correlation attacks.  It's an open research question
 
- whether (and to what extent) {\bf mixed-latency} networks, {\bf low-volume
 
-   long-distance padding}, or other approaches can resist these attacks, which
 
- are currently some of the most effective against careful Tor users.  We
 
- should research these questions and perform simulations to identify
 
- opportunities for strengthening our design without dropping performance to
 
- unacceptable levels. %Cite something
 
- \plan{Start doing this in 2007; write a paper.  8-16 weeks.}
 
- We've got some preliminary results suggesting that {\bf a topology-aware
 
-   routing algorithm}~\cite{feamster:wpes2004} could reduce Tor users'
 
- vulnerability against local or ISP-level adversaries, by ensuring that they
 
- are never in a position to watch both ends of a connection.  We need to
 
- examine the effects of this approach in more detail and consider side-effects
 
- on anonymity against other kinds of adversaries.  If the approach still looks
 
- promising, we should investigate ways for clients to implement it (or an
 
- approximation of it) without having to download routing tables for the whole
 
- Internet. \plan{Not in 2007 unless a graduate student wants to do it.}
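As a rough illustration of the idea, the sketch below keeps only those entry/exit pairs for which no autonomous system (AS) appears on both the client-to-entry path and the exit-to-destination path.  It is a minimal sketch, not the algorithm from the cited paper: the function \texttt{safe\_paths}, the node names, and the AS numbers are all hypothetical, and it assumes the client already knows which ASes lie on each path, which is exactly the hard part discussed above.

```python
from itertools import product


def safe_paths(entries, exits, client_side_ases, dest_side_ases):
    """Return (entry, exit) pairs where no single AS can watch both ends.

    client_side_ases[entry] is the set of ASes on the client->entry path.
    dest_side_ases[exit] is the set of ASes on the exit->destination path.
    Both mappings are assumed inputs; obtaining them is out of scope here.
    """
    safe = []
    for entry, exit_ in product(entries, exits):
        # An AS present on both sides could correlate traffic end to end.
        if client_side_ases[entry].isdisjoint(dest_side_ases[exit_]):
            safe.append((entry, exit_))
    return safe


if __name__ == "__main__":
    # Made-up example data: AS numbers are placeholders.
    client_side = {"A": {1, 2}, "B": {3}}
    dest_side = {"X": {2, 4}, "Y": {5}}
    print(safe_paths(["A", "B"], ["X", "Y"], client_side, dest_side))
```

In this example the pair (A, X) is rejected because the placeholder AS 2 sits on both sides of the connection.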
 
- %\tmp{defenses against end-to-end correlation}  We don't expect any to work
 
- %right now, but it would be useful to learn that one did.  Alternatively,
 
- %proving that one didn't would free up researchers in the field to go work on
 
- %other things.
 
- %
 
- % See above; I think I got this.
 
- We should research the efficacy of {\bf website fingerprinting} attacks,
 
- wherein an adversary tries to match the distinctive traffic and timing
 
- pattern of the resources constituting a given website to the traffic pattern
 
- of a user's client.  These attacks work great in simulations, but in
 
- practice we hear they don't work nearly as well.  We should get some actual
 
- numbers to investigate the issue, and figure out what's going on.  If we
 
- resist these attacks, or can improve our design to resist them, we should.
 
- % add cites
 
- \plan{Possibly part of end-to-end correlation paper.  Otherwise, not in 2007
 
-   unless a graduate student is interested.}
 
- \subsection{Implementation security}
 
- We should also {\bf mark RAM that holds key material as non-swappable} so
 
- that there is no risk of recovering key material from a hard disk
 
- compromise.  This would require submitting patches upstream to OpenSSL, where
 
- support for marking memory as sensitive is currently in a very preliminary
 
- state.\plan{Nice to do, but not in immediate Tor scope.}
 
- There are numerous tools for identifying trouble spots in code (such as
 
- Coverity or even VS2005's code analysis tool) and we should convince somebody
 
- to run some of them against the Tor codebase.  Ideally, we could figure out a
 
- way to get our code checked periodically rather than just once.\plan{Almost
 
-   no time once we talk somebody into it.}
 
- We should try {\bf protocol fuzzing} to identify errors in our
 
- implementation.\plan{Not in 2007 unless we find a grad student or
 
-   undergraduate who wants to try.}
 
- Our guard nodes make it harder for an attacker to become a chosen

- client's entry point: each client chooses a few favorite entry points

- as ``guards'' and sticks to them.   We should implement a {\bf directory
 
-   guards} feature to keep adversaries from enumerating Tor users by acting as
 
- a directory cache.\plan{Do in 2007; 2 weeks.}
 
- \subsection{Detect corrupt exits and other servers}
 
- With the success of our network, we've attracted servers in many locations,
 
- operated by many kinds of people.  Unfortunately, some of these locations
 
- have compromised or defective networks, and some of these people are
 
- untrustworthy or incompetent.  Our current design relies on authority
 
- administrators to identify bad nodes and mark them as nonfunctioning.  We
 
- should {\bf automate the process of identifying malfunctioning nodes} as
 
- follows:
 
- We should create a generic {\bf feedback mechanism for add-on tools} like
 
- Mike Perry's ``Snakes on a Tor'' to report failing nodes to authorities.
 
- \plan{Do in 2006; 1-2 weeks.}
 
- We should write tools to {\bf detect more kinds of innocent node failure},
 
- such as nodes whose network providers intercept SSL, nodes whose network
 
- providers censor popular websites, and so on.  We should also try to detect
 
- {\bf routers that snoop traffic}; we could do this by launching connections
 
- to throwaway accounts, and seeing which accounts get used.\plan{Do in 2007;
 
-   ask Mike Perry if he's interested.  4-6 weeks.}
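The throwaway-account idea can be sketched as simple bookkeeping: use a distinct account through each exit, then see which accounts are later used by someone else.  This is a minimal sketch under assumptions, not an existing tool: the helper names \texttt{make\_decoys} and \texttt{suspicious\_exits} are hypothetical, and how the accounts are created, exercised through each exit, and monitored for later use is left out.

```python
import secrets


def make_decoys(exit_ids):
    """Assign one unique, random throwaway account name to each exit."""
    return {f"decoy-{secrets.token_hex(8)}": exit_id for exit_id in exit_ids}


def suspicious_exits(decoys, used_accounts):
    """Return the exits whose throwaway account was later used.

    `used_accounts` is an assumed input: account names seen logging in
    after our own single test connection through the matching exit.
    """
    return sorted({decoys[name] for name in used_accounts if name in decoys})


if __name__ == "__main__":
    decoys = make_decoys(["exit1", "exit2", "exit3"])
    # Pretend the account sent through exit2 was later used by a third party.
    leaked = [name for name, exit_id in decoys.items() if exit_id == "exit2"]
    print(suspicious_exits(decoys, leaked))
```

Because each account is used through exactly one exit, any later use of that account points at that exit or at its network.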
 
- We should add {\bf an efficient way for authorities to mark a set of servers
 
-   as probably collaborating} though not necessarily otherwise dishonest.
 
- This happens when an administrator starts multiple routers, but doesn't mark
 
- them as belonging to the same family.\plan{Do during v2.1 directory protocol
 
-   redesign; 1-2 weeks to implement.}
 
- To avoid attacks where an adversary claims good performance in order to
 
- attract traffic, we should {\bf have authorities measure node performance}
 
- (including stability and bandwidth) themselves, and not simply believe what
 
- they're told. We also measure stability by tracking MTBF.  Measuring
 
- bandwidth will be tricky, since it's hard to distinguish between a server with
 
- low capacity, and a high-capacity server with most of its capacity in
 
- use. See also Nikita's NDSS 2008 paper.\plan{Do it if we can interest
 
- a grad student.}
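For the stability half of this, mean time between failures (MTBF) can be estimated from an authority's own reachability probes.  The sketch below is one simple way to do so, assuming a node that was up at one probe stayed up until the next probe; the function name \texttt{mtbf} and the observation format are hypothetical, and this is not a description of what the authorities currently compute.

```python
def mtbf(observations):
    """Estimate mean time between failures from reachability probes.

    `observations` is a time-sorted list of (timestamp_seconds, is_up).
    Returns observed uptime divided by the number of up->down
    transitions, or None if no failure was observed.
    """
    uptime = 0
    failures = 0
    for (t0, up0), (t1, up1) in zip(observations, observations[1:]):
        if up0:
            # Assumption: the node stayed up for the whole interval.
            uptime += t1 - t0
            if not up1:
                failures += 1
    return uptime / failures if failures else None


if __name__ == "__main__":
    probes = [(0, True), (100, True), (200, False), (300, True), (500, False)]
    print(mtbf(probes))  # 400 seconds of uptime over 2 failures -> 200.0
```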
 
- {\bf Operating a directory authority should be easier.}  We rely on authority
 
- operators to keep the network running well, but right now their job involves
 
- too much busywork and administrative overhead.  A better interface for them
 
- to use could free their time to work on exception cases rather than on
 
- adding named nodes to the network.\plan{Do in 2007; 4-5 weeks.}
 
- \subsection{Protocol security}
 
- In addition to other protocol changes discussed above,
 
- % And should we move some of them down here? -NM
 
- we should add {\bf hooks for denial-of-service resistance}; we have some
 
- preliminary designs, but we shouldn't postpone them until we really need them.
 
- If somebody tries a DDoS attack against the Tor network, we won't want to
 
- wait for all the servers and clients to upgrade to a new
 
- version.\plan{Research project; do this in 2007 if funded.}
 
- \section{Development infrastructure}
 
- \subsection{Build farm}
 
- We've begun to deploy a cross-platform distributed build farm of hosts
 
- that build and test the Tor source every time it changes in our development
 
- repository.
 
- We need to {\bf get more participants}, so that we can test a larger variety
 
- of platforms.  (Previously, we found out that our code had broken on

- obscure platforms only when somebody got around to building it there.)
 
- We also need to {\bf add our dependencies} to the build farm, so that we can
 
- ensure that libraries we need (especially libevent) do not stop working on
 
- any important platform between one release and the next.
 
- \plan{This is ongoing as more buildbots arrive.}
 
- \subsection{Improved testing harness}
 
- Currently, our {\bf unit tests} cover only about 20\% of the code base.  This
 
- is uncomfortably low; we should write more and switch to a more flexible
 
- testing framework.\plan{Ongoing basis, time permitting.}
 
- We should also write flexible {\bf automated single-host deployment tests} so
 
- we can more easily verify that the current codebase works with the
 
- network.\plan{Worthwhile in 2007; would save lots of time.  2-4 weeks.}
 
- We should build automated {\bf stress testing} frameworks so we can see which
 
- realistic loads cause Tor to perform badly, and regularly profile Tor against
 
- these loads.  This would give us {\it in vitro} performance values to
 
- supplement our deployment experience.\plan{Worthwhile in 2007; 2-6 weeks.}
 
- We should improve our memory profiling code.\plan{...}
 
- \subsection{Centralized build system}
 
- We currently rely on a separate packager to maintain the packaging system and
 
- to build Tor on each platform for which we distribute binaries.  Having separate

- package maintainers is sensible, but having separate package builders has meant
 
- long turnaround times between source releases and package releases.  We
 
- should create the necessary infrastructure for us to produce binaries for all
 
- major packages within an hour or so of source release.\plan{We should
 
-   brainstorm this at least in 2007.}
 
- \subsection{Improved metrics}
 
- We need a way to {\bf measure the network's health, capacity, and degree of
 
-   utilization}.  Our current means for doing this are ad hoc and not
 
- completely accurate.
 
- We need better ways to {\bf tell which countries our users are coming from,
 
-   and how many there are}.  A good perspective of the network helps us
 
- allocate resources and identify trouble spots, but our current approaches
 
- will work less and less well as we make it harder for adversaries to
 
- enumerate users.  We'll probably want to shift to a smarter, statistical
 
- approach rather than our current ``count and extrapolate'' method.
 
- \plan{All of this in 2007 if funded; 4-8 weeks}
 
- % \tmp{We'd like to know how much of the network is getting used.}
 
- % I think this is covered above -NM
 
- \subsection{Controller library}
 
- We've done lots of design and development on our controller interface, which
 
- allows UI applications and other tools to interact with Tor.  We could
 
- encourage the development of more such tools by releasing a {\bf
 
-   general-purpose controller library}, ideally with API support for several
 
- popular programming languages.\plan{2006 or 2007; 1-2 weeks.}
 
- \section{User experience}
 
- \subsection{Get blocked less, get blocked less broadly}
 
- Right now, some services block connections from the Tor network because
 
- they don't have a better
 
- way to keep vandals from abusing them than blocking IP addresses associated
 
- with vandalism.  Our approach so far has been to educate them about better
 
- solutions that currently exist, but we should also {\bf create better
 
- solutions for limiting vandalism by anonymous users} like credential and
 
- blind-signature based implementations, and encourage their use. Other
 
- promising starting points include writing a patch and explanation for
 
- Wikipedia, and helping Freenode to document, maintain, and expand its
 
- current Tor-friendly position.\plan{Do a writeup here in 2007; 1-2 weeks.}
 
- Those who do block Tor users also block overbroadly, sometimes blacklisting
 
- operators of Tor servers that do not permit exit to their services.  We could
 
- obviate innocent reasons for doing so by designing a {\bf narrowly-targeted Tor
 
-   RBL service} so that those who wanted to overblock Tor could no longer
 
- plead incompetence.\plan{Possibly in 2007 if we decide it's a good idea; 3
 
-   weeks.}
 
- \subsection{All-in-one bundle}
 
- We need a well-tested, well-documented bundle of Tor and supporting
 
- applications configured to use it correctly.  We have an initial
 
- implementation well under way, but it will need additional work in
 
- identifying requisite Firefox extensions, identifying security threats,
 
- improving user experience, and so on.  This will need significantly more work
 
- before it's ready for a general public release.
 
- \subsection{LiveCD Tor}
 
- We need a nice bootable LiveCD containing a minimal OS and a few applications

- configured to use Tor correctly.  The Anonym.OS project demonstrated that this
 
- is quite feasible, but their project is not currently maintained.
 
- \subsection{A Tor client in a VM}
 
- \tmp{a.k.a JanusVM} which is quite related to the firewall-level deployment
 
- section below. JanusVM is a Linux kernel running in VMware. It gets an IP
 
- address from the network, and serves as a DHCP server for its host Windows
 
- machine. It intercepts all outgoing traffic and redirects it into Privoxy,
 
- Tor, etc. This Linux-in-Windows approach may help us with scalability in
 
- the short term, and it may also be a good long-term solution rather than
 
- accepting all security risks in Windows.
 
- %\subsection{Interface improvements}
 
- %\tmp{Allow controllers to manipulate server status.}
 
- % (Why is this in the User Experience section?) -RD
 
- % I think it's better left to a generic ``make controller iface better'' item.
 
- \subsection{Firewall-level deployment}
 
- Another useful deployment mode for some users is using {\bf Tor in a firewall
 
-   configuration}, and directing all their traffic through Tor.  This can be a
 
- little tricky to set up currently, but it's an effective way to make sure no
 
- traffic leaves the host un-anonymized.  To achieve this, we need to {\bf
 
-   improve and port our new TransPort} feature which allows Tor to be used
 
- without SOCKS support; to {\bf add an anonymizing DNS proxy} feature to Tor;
 
- and to {\bf construct a recommended set of firewall configurations} to redirect
 
- traffic to Tor.
 
- This is an area where {\bf deployment via a livecd}, or an installation
 
- targeted at specialized home routing hardware, could be useful.
 
- \subsection{Assess software and configurations for anonymity risks}
 
- Right now, users and packagers are more or less on their own when selecting
 
- Firefox extensions.  We should {\bf assemble a recommended list of browser
 
-   extensions} through experiment, and include this in the application bundles
 
- we distribute.
 
- We should also describe {\bf best practices for using Tor with each class of
 
-   application}. For example, Ethan Zuckerman has written a detailed
 
- tutorial on how to use Tor, Firefox, Gmail, and WordPress to blog with
 
- improved safety. There are many other cases on the Internet where anonymity
 
- would be helpful, and there are a lot of ways to screw up using Tor.
 
- The Foxtor and Torbutton extensions serve similar purposes; we should pick a
 
- favorite, and merge in the useful features of the other.
 
- %\tmp{clean up our own bundled software:
 
- %E.g. Merge the good features of Foxtor into Torbutton}
 
- %
 
- % What else did you have in mind? -NM
 
- \subsection{Localization}
 
- Right now, most of our user-facing code is internationalized.  We need to
 
- internationalize the last few hold-outs (like the Tor expert installer), and get
 
- more translations for the parts that are already internationalized.
 
- Also, we should look into a {\bf unified translator's solution}.  Currently,
 
- since different tools have been internationalized using the
 
- framework-appropriate method, different tools require translators to localize
 
- them via different interfaces.  Inasmuch as possible, we should make
 
- translators only need to use a single tool to translate the whole Tor suite.
 
- \section{Support}
 
- It would be nice to set up some {\bf user support infrastructure} and
 
- {\bf contributor support infrastructure}, especially focusing on server
 
- operators and on coordinating volunteers.
 
- This includes intuitive and easy ticket systems for bug reports and
 
- feature suggestions (not just mailing lists with a half dozen people
 
- and no clear roles for who answers what), but it also includes a more
 
- personalized and efficient framework for interaction so we keep the
 
- attention and interest of the contributors, and so we make them feel
 
- helpful and wanted.
 
- \section{Documentation}
 
- \subsection{Unified documentation scheme}
 
- We need to {\bf inventory our documentation.}  Our documentation so far has
 
- been mostly produced on an {\it ad hoc} basis, in response to particular
 
- needs and requests.  We should figure out what documentation we have, which of
 
- it (if any) should get priority, and whether we can't put it all into a
 
- single format.
 
- We could {\bf unify the docs} into a single book-like thing.  This will also
 
- help us identify what sections of the ``book'' are missing.
 
- \subsection{Missing technical documentation}
 
- We should {\bf revise our design paper} to reflect the decisions we've made

- and the research we've done since it was published in 2004.  This will help other
 
- researchers evaluate and suggest improvements to Tor's current design.
 
- Other projects sometimes implement the client side of our protocol.  We
 
- encourage this, but we should write {\bf a document about how to avoid
 
- excessive resource use}, so we don't need to worry that they will do so
 
- without regard to the effect of their choices on server resources.
 
- \subsection{Missing user documentation}
 
- Our documentation falls into two broad categories: some is `discursive' and
 
- explains in detail why users should take certain actions, and other
 
- documentation is `comprehensive' and describes all of Tor's features.  Right
 
- now, we have no document that is at once deep, readable, and thorough.  We
 
- should correct this by identifying missing spots in our documentation.
 
- \bibliographystyle{plain} \bibliography{tor-design}
 
- \end{document}
 
 
  |