- \documentclass{llncs}
 
- \usepackage{url}
 
- \usepackage{amsmath}
 
- \usepackage{epsfig}
 
- %\setlength{\textwidth}{5.9in}
 
- %\setlength{\textheight}{8.4in}
 
- %\setlength{\topmargin}{.5cm}
 
- %\setlength{\oddsidemargin}{1cm}
 
- %\setlength{\evensidemargin}{1cm}
 
- \newenvironment{tightlist}{\begin{list}{$\bullet$}{
 
-   \setlength{\itemsep}{0mm}
 
-     \setlength{\parsep}{0mm}
 
-     %  \setlength{\labelsep}{0mm}
 
-     %  \setlength{\labelwidth}{0mm}
 
-     %  \setlength{\topsep}{0mm}
 
-     }}{\end{list}}
 
- \begin{document}
 
- \title{Design of a blocking-resistant anonymity system}
 
- %\author{Roger Dingledine\inst{1} \and Nick Mathewson\inst{1}}
 
- \author{Roger Dingledine \and Nick Mathewson}
 
- \institute{The Free Haven Project\\
 
- \email{\{arma,nickm\}@freehaven.net}}
 
- \maketitle
 
- \pagestyle{plain}
 
- \begin{abstract}
 
- Websites around the world are increasingly being blocked by
 
- government-level firewalls. Many people use anonymizing networks like
 
- Tor to contact sites without letting an attacker trace their activities,
 
- and as an added benefit they are no longer affected by local censorship.
 
- But if the attacker simply denies access to the Tor network itself,
 
- blocked users can no longer benefit from the security Tor offers.
 
- Here we describe a design that builds upon the current Tor network
 
- to provide an anonymizing network that resists blocking
 
- by government-level attackers.
 
- \end{abstract}
 
- \section{Introduction and Goals}
 
- Anonymizing networks such as Tor~\cite{tor-design} bounce traffic around
 
- a network of relays. They aim to hide not only what is being said, but
 
- also who is communicating with whom, which users are using which websites,
 
- and so on. These systems have a broad range of users, including ordinary
 
- citizens who want to avoid being profiled for targeted advertisements,
 
- corporations who don't want to reveal information to their competitors,
 
- and law enforcement and government intelligence agencies who need to do
 
- operations on the Internet without being noticed.
 
- Historically, research on anonymizing systems has assumed a passive
 
- attacker who monitors the user (call her Alice) and tries to discover her
 
- activities, yet lets her reach any piece of the network. In more modern
 
- threat models such as Tor's, the adversary is allowed to perform active
 
- attacks such as modifying communications in hopes of tricking Alice
 
- into revealing her destination, or intercepting some of her connections
 
- to run a man-in-the-middle attack. But these systems still assume that
 
- Alice can eventually reach the anonymizing network.
 
- An increasing number of users are making use of the Tor software
 
- not so much for its anonymity properties but for its censorship
 
- resistance properties -- if they access Internet sites like Wikipedia
 
- and Blogspot via Tor, they are no longer affected by local censorship
 
- and firewall rules. In fact, an informal user study (described in
 
- Appendix~\ref{app:geoip}) showed China as the third largest user base
 
- for Tor clients, with perhaps ten thousand people accessing the Tor
 
- network from China each day.
 
- The current Tor design is easy to block if the attacker controls Alice's
 
- connection to the Tor network --- by blocking the directory authorities,
 
- by blocking all the server IP addresses in the directory, or by filtering
 
- based on the signature of the Tor TLS handshake. Here we describe a
 
- design that builds upon the current Tor network to provide an anonymizing
 
- network that also resists this blocking. Specifically,
 
- Section~\ref{sec:adversary} discusses our threat model --- that is,
 
- the assumptions we make about our adversary; Section~\ref{sec:current-tor}
 
- describes the components of the current Tor design and how they can be
 
- leveraged for a new blocking-resistant design; Section~\ref{sec:related}
 
- explains the features and drawbacks of the currently deployed solutions;
 
- and ...
 
- %And adding more different classes of users and goals to the Tor network
 
- %improves the anonymity for all Tor users~\cite{econymics,usability:weis2006}.
 
- \section{Adversary assumptions}
 
- \label{sec:adversary}
 
- The history of blocking-resistance designs is littered with conflicting
 
- assumptions about what adversaries to expect and what problems are
 
- in the critical path to a solution. Here we try to enumerate our best
 
- understanding of the current situation around the world.
 
- In the traditional security style, we aim to describe a strong attacker
 
- --- if we can defend against this attacker, we inherit protection
 
- against weaker attackers as well. After all, we want a general design
 
- that will work for people in China, people in Iran, people in Thailand,
 
- whistleblowers in firewalled corporate networks, and people in whatever
 
- turns out to be the next oppressive situation. In fact, by designing with
 
- a variety of adversaries in mind, we can take advantage of the fact that
 
- adversaries will be in different stages of the arms race at each location.
 
- We assume there are three main network attacks in use by censors
 
- currently~\cite{clayton:pet2006}:
 
- \begin{tightlist}
 
- \item Block destination by automatically searching for certain strings
 
- in TCP packets.
 
- \item Block destination by manually listing its IP address at the
 
- firewall.
 
- \item Intercept DNS requests and give bogus responses for certain
 
- destination hostnames.
 
- \end{tightlist}
 
- We assume the network firewall has very limited CPU per
 
- connection~\cite{clayton:pet2006}. Against an adversary who spends
 
- hours looking through the contents of each packet, we would need
 
- some stronger mechanism such as steganography, which introduces its
 
- own problems~\cite{active-wardens,tcpstego,bar}.
 
- More broadly, we assume that the chance that the authorities try to
 
- block a given system grows as its popularity grows. That is, a system
 
- used by only a few users will probably never be blocked, whereas a
 
- well-publicized system with many users will receive much more scrutiny.
 
- We assume that readers of blocked content are not in as much danger
 
- as publishers. So far in places like China, the authorities mainly go
 
- after people who publish materials and coordinate organized movements
 
- against the state~\cite{mackinnon}. If they find that a user happens
 
- to be reading a site that should be blocked, the typical response is
 
- simply to block the site. Of course, even with an encrypted connection,
 
- the adversary may be able to distinguish readers from publishers by
 
- observing whether Alice is mostly downloading bytes or mostly uploading
 
- them --- we discuss this issue more in Section~\ref{subsec:upload-padding}.
 
- We assume that while various different regimes can coordinate and share
 
- notes, there will be a significant time lag between one attacker learning
 
- how to overcome a facet of our design and other attackers picking it up.
 
- Similarly, we assume that in the early stages of deployment the insider
 
- threat isn't as high a risk, because no attackers have put serious
 
- effort into breaking the system yet.
 
- We assume that government-level attackers are not always uniform across
 
- the country. For example, there is no single centralized place in China
 
- that coordinates its censorship decisions and steps.
 
- We assume that our users have control over their hardware and
 
- software --- they don't have any spyware installed, there are no
 
- cameras watching their screen, etc. Unfortunately, in many situations
 
- these threats are very real~\cite{zuckerman-threatmodels}; yet
 
- software-based security systems like ours are poorly equipped to handle
 
- a user who is entirely observed and controlled by the adversary. See
 
- Section~\ref{subsec:cafes-and-livecds} for more discussion of what little
 
- we can do about this issue.
 
- We assume that widespread access to the Internet is economically and/or
 
- socially valuable in each deployment country. After all, if censorship
 
- is more important than Internet access, the firewall administrators have
 
- an easy job: they should simply block everything. The corollary to this
 
- assumption is that we should design so that increased blocking of our
 
- system results in increased economic damage or public outcry.
 
- We assume that the user will be able to fetch a genuine
 
- version of Tor, rather than one supplied by the adversary; see
 
- Section~\ref{subsec:trust-chain} for discussion on helping the user
 
- confirm that he has a genuine version and that he can connect to the
 
- real Tor network.
 
- \section{Components of the current Tor design}
 
- \label{sec:current-tor}
 
- Tor is popular and sees a lot of use: it is the largest anonymity
- network of its kind, and has attracted more than 800 routers from around
- the world. Tor clients build multi-hop encrypted circuits through these
- volunteer-run routers, so that no single router (and no local network
- observer) can link a user to her destination.
 
- In this section, we examine some of the reasons why Tor has taken off,
 
- with particular emphasis on how we can take advantage of these properties
 
- for a blocking-resistance design.
 
- Tor aims to provide three security properties:
 
- \begin{tightlist}
 
- \item A local network attacker can't learn, or influence, your
- destination.
- \item No single router in the Tor network can link you to your
- destination.
- \item The destination, or somebody watching the destination,
- can't learn your location.
 
- \end{tightlist}
 
- For blocking-resistance, we care most clearly about the first
 
- property. But as the arms race progresses, the second property
 
- will become important --- for example, to discourage an adversary
 
- from volunteering a relay in order to learn that Alice is reading
 
- or posting to certain websites. The third property is not so clearly
 
- important in this context, but we believe it will turn out to be helpful:
 
- consider websites and other Internet services that have been pressured
 
- recently into treating clients differently depending on their network
 
- location~\cite{google-geolocation}.
 
- % and cite{goodell-syverson06} once it's finalized.
 
- The Tor design provides other advantages over manual or ad
- hoc circumvention techniques as well.
 
- Firstly, the Tor directory authorities automatically aggregate, test,
 
- and publish signed summaries of the available Tor routers. Tor clients
 
- can fetch these summaries to learn which routers are available and
 
- which routers have desired properties. Directory information is cached
 
- throughout the Tor network, so once clients have bootstrapped they never
 
- need to interact with the authorities directly. (To tolerate a minority
 
- of compromised directory authorities, we use a threshold trust scheme ---
 
- see Section~\ref{subsec:trust-chain} for details.)
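
- To make the threshold idea concrete, here is a minimal sketch (in Python,
- with a placeholder \texttt{verify} function standing in for real signature
- checking; the names are ours, not Tor's) of accepting a directory only when
- a majority of the known authorities have signed it:

- \begin{verbatim}
# Threshold trust sketch: accept a directory only if a majority of the
# authorities we ship with have signed it.  "verify" is a placeholder,
# not Tor's real signature-checking interface.

def verify(public_key, document, signature):
    """Stand-in for a real public-key signature check."""
    raise NotImplementedError

def directory_is_trusted(document, signatures, authority_keys):
    """signatures: {authority_name: signature};
       authority_keys: {authority_name: public_key} shipped with the client."""
    valid = 0
    for name, public_key in authority_keys.items():
        sig = signatures.get(name)
        if sig is not None and verify(public_key, document, sig):
            valid += 1
    return valid > len(authority_keys) // 2
- \end{verbatim}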
 
- Secondly, Tor clients can be configured to use any directory authorities
 
- they want. They use the default authorities if no others are specified,
 
- but it's easy to start a separate (or even overlapping) Tor network just
 
- by running a different set of authorities and convincing users to prefer
 
- a modified client. For example, we could launch a distinct Tor network
 
- inside China; some users could even use an aggregate network made up of
 
- both the main network and the China network. But we should not be too
 
- quick to create other Tor networks --- part of Tor's anonymity comes from
 
- users behaving like other users, and there are many unsolved anonymity
 
- questions if different users know about different pieces of the network.
 
- Thirdly, in addition to automatically learning from the chosen directories
 
- which Tor routers are available and working, Tor takes care of building
 
- paths through the network and rebuilding them as needed. So the user
 
- never has to know how paths are chosen, never has to manually pick
 
- working proxies, and so on. More generally, at its core the Tor protocol
 
- is simply a tool that can build paths given a set of routers. Tor is
 
- quite flexible about how it learns about the routers and how it chooses
 
- the paths. Harvard's Blossom project~\cite{blossom-thesis} makes this
 
- flexibility more concrete: Blossom makes use of Tor not for its security
 
- properties but for its reachability properties. It runs a separate set
 
- of directory authorities, its own set of Tor routers (called the Blossom
 
- network), and uses Tor's flexible path-building to let users view Internet
 
- resources from any point in the Blossom network.
 
- Fourthly, Tor separates the role of \emph{internal relay} from the
 
- role of \emph{exit relay}. That is, some volunteers choose just to relay
 
- traffic between Tor users and Tor routers, and others choose to also allow
 
- connections to external Internet resources. Because we don't force all
 
- volunteers to play both roles, we end up with more relays. This increased
 
- diversity in turn is what gives Tor its security: the more options the
 
- user has for her first hop, and the more options she has for her last hop,
 
- the less likely it is that a given attacker will be watching both ends
 
- of her circuit~\cite{tor-design}. As a bonus, because our design attracts
 
- more internal relays that want to help out but don't want to deal with
 
- being an exit relay, we end up with more options for the first hop ---
 
- the one most critical to being able to reach the Tor network.
 
- Fifthly, Tor is sustainable. Zero-Knowledge Systems offered the commercial
 
- but now-defunct Freedom Network~\cite{freedom21-security}, a design with
 
- security comparable to Tor's, but its funding model relied on collecting
 
- money from users to pay relays. Modern commercial proxy systems similarly
 
- need to keep collecting money to support their infrastructure. On the
 
- other hand, Tor has built a self-sustaining community of volunteers who
 
- donate their time and resources. This community trust is rooted in Tor's
 
- open design: we tell the world exactly how Tor works, and we provide all
 
- the source code. Users can decide for themselves, or pay any security
 
- expert to decide, whether it is safe to use. Further, Tor's modularity
 
- as described above, along with its open license, mean that its impact
 
- will continue to grow.
 
- Sixthly, Tor has an established user base of hundreds of
 
- thousands of people from around the world. This diversity of
 
- users contributes to sustainability as above: Tor is used by
 
- ordinary citizens, activists, corporations, law enforcement, and
 
- even governments and militaries~\cite{tor-use-cases}, and they can
 
- only achieve their security goals by blending together in the same
 
- network~\cite{econymics,usability:weis2006}. This user base also provides
 
- something else: hundreds of thousands of different and often-changing
 
- addresses that we can leverage for our blocking-resistance design.
 
- We discuss and adapt these components further in
 
- Section~\ref{sec:bridges}. But first we examine the strengths and
 
- weaknesses of other blocking-resistance approaches, so we can expand
 
- our repertoire of building blocks and ideas.
 
- \section{Current proxy solutions}
 
- \label{sec:related}
 
- Relay-based blocking-resistance schemes generally have two main
 
- components: a relay component and a discovery component. The relay part
 
- encompasses the process of establishing a connection, sending traffic
 
- back and forth, and so on --- everything that's done once the user knows
 
- where he's going to connect. Discovery is the step before that: the
 
- process of finding one or more usable relays.
 
- For example, we described several pieces of Tor in the previous section,
 
- but we can divide them into the process of building paths and sending
 
- traffic over them (relay) and the process of learning from the directory
 
- servers about what routers are available (discovery). With this distinction
 
- in mind, we now examine several categories of relay-based schemes.
 
- \subsection{Centrally-controlled shared proxies}
 
- Existing commercial anonymity solutions (like Anonymizer.com) are based
 
- on a set of single-hop proxies. In these systems, each user connects to
 
- a single proxy, which then relays the user's traffic. These public proxy
 
- systems are typically characterized by two features: they control and
 
- operate the proxies centrally, and many different users get assigned
 
- to each proxy.
 
- In terms of the relay component, single proxies provide weak security
 
- compared to systems that distribute trust over multiple relays, since a
 
- compromised proxy can trivially observe all of its users' actions, and
 
- an eavesdropper only needs to watch a single proxy to perform timing
 
- correlation attacks against all its users' traffic. Worse, all users
 
- need to trust the proxy company to have good security itself as well as
 
- to not reveal user activities.
 
- On the other hand, single-hop proxies are easier to deploy, and they
 
- can provide better performance than distributed-trust designs like Tor,
 
- since traffic only goes through one relay. They're also more convenient
 
- from the user's perspective --- since users entirely trust the proxy,
 
- they can just use their web browser directly.
 
- Whether public proxy schemes are more or less scalable than Tor is
 
- still up for debate: commercial anonymity systems can use some of their
 
- revenue to provision more bandwidth as they grow, whereas volunteer-based
 
- anonymity systems can attract thousands of fast relays to spread the load.
 
- The discovery piece can take several forms. Most commercial anonymous
 
- proxies have one or a handful of commonly known websites, and their users
 
- log in to those websites and relay their traffic through them. When
 
- these websites get blocked (generally soon after the company becomes
 
- popular), if the company cares about users in the blocked areas, they
 
- start renting lots of disparate IP addresses and rotating through them
 
- as they get blocked. They notify their users of new addresses by email,
 
- for example. It's an arms race, since attackers can sign up to receive the
 
- email too, but they have one nice trick available to them: because they
 
- have a list of paying subscribers, they can notify certain subscribers
 
- about updates earlier than others.
 
- Access control systems on the proxy let them provide service only to
 
- users with certain characteristics, such as paying customers or people
 
- from certain IP address ranges.
 
- Discovery despite a government-level firewall is a complex and unsolved
 
- topic, and we're stuck in this same arms race ourselves; we explore it
 
- in more detail in Section~\ref{sec:discovery}. But first we examine the
 
- other end of the spectrum --- getting volunteers to run the proxies,
 
- and telling only a few people about each proxy.
 
- \subsection{Independent personal proxies}
 
- Personal proxies such as Circumventor~\cite{circumventor} and
 
- CGIProxy~\cite{cgiproxy} use the same technology as the public ones as
 
- far as the relay component goes, but they use a different strategy for
 
- discovery. Rather than managing a few centralized proxies and constantly
 
- getting new addresses for them as the old addresses are blocked, they
 
- aim to have a large number of entirely independent proxies, each managing
 
- its own (much smaller) set of users.
 
- As the Circumventor site~\cite{circumventor} explains, ``You don't
 
- actually install the Circumventor \emph{on} the computer that is blocked
 
- from accessing Web sites. You, or a friend of yours, has to install the
 
- Circumventor on some \emph{other} machine which is not censored.''
 
- This tactic has great advantages in terms of blocking-resistance ---
 
- recall our assumption in Section~\ref{sec:adversary} that the attention
 
- a system attracts from the attacker is proportional to its number of
 
- users and level of publicity. If each proxy only has a few users, and
 
- there is no central list of proxies, most of them will never get noticed.
 
- On the other hand, there's a huge scalability question that so far has
 
- prevented these schemes from being widely useful: how does the fellow
 
- in China find a person in Ohio who will run a Circumventor for him? In
 
- some cases he may know and trust some people on the outside, but in many
 
- cases he's just out of luck. Just as hard, how does a new volunteer in
 
- Ohio find a person in China who needs it?
 
- %discovery is also hard because the hosts keep vanishing if they're
 
- %on dynamic ip. But not so bad, since they can use dyndns addresses.
 
- This challenge leads to a hybrid design --- centrally-distributed
 
- personal proxies --- which we will investigate in more detail in
 
- Section~\ref{sec:discovery}.
 
- \subsection{Open proxies}
 
- Yet another currently used approach to bypassing firewalls is to locate
 
- open and misconfigured proxies on the Internet. A quick Google search
 
- for ``open proxy list'' yields a wide variety of freely available lists
 
- of HTTP, HTTPS, and SOCKS proxies. Many small companies have sprung up
 
- providing more refined lists to paying customers.
 
- There are some downsides to using these open proxies though. Firstly,
 
- the proxies are of widely varying quality in terms of bandwidth and
 
- stability, and many of them are entirely unreachable. Secondly, unlike
 
- networks of volunteers like Tor, the legality of routing traffic through
 
- these proxies is questionable: it's widely believed that most of them
 
- don't realize what they're offering, and probably wouldn't allow it if
 
- they realized. Thirdly, in many cases the connection to the proxy is
 
- unencrypted, so firewalls that filter based on keywords in IP packets
 
- will not be hindered. And lastly, many users are suspicious that some
 
- open proxies are a little \emph{too} convenient: are they run by the
 
- adversary, in which case they get to monitor all the user's requests
 
- just as single-hop proxies can?
 
- A distributed-trust design like Tor resolves each of these issues for
 
- the relay component, but a constantly changing set of thousands of open
 
- relays is clearly a useful idea for a discovery component. For example,
 
- users might be able to make use of these proxies to bootstrap their
 
- first introduction into the Tor network.
 
- \subsection{JAP}
 
- K\"opsell and Hillig's WPES paper on blocking resistance for
- JAP~\cite{koepsell:wpes2004} is probably the closest related work, and is
- the starting point for the design in this paper.
 
- \subsection{Steganography}
- 
- Infranet hides requests for censored content inside seemingly normal web
- traffic to cooperating servers.
 
- \subsection{Breaking sensitive strings across multiple TCP packets;
- ignoring RSTs}
- 
- Clayton et al.~\cite{clayton:pet2006} observe that the Chinese firewall
- enforces keyword blocking by injecting forged TCP reset packets; if both
- endpoints ignore these RSTs (and keep sensitive strings from appearing in
- a single packet), the filtering can be evaded.
 
- \subsection{Internal caching networks}
 
- Freenet is deployed inside China and caches outside content.
 
- \subsection{Skype}
 
- Skype hops between ports and encrypts its traffic, and voice communications
- are not so susceptible to keystroke loggers (even graphical ones).
 
- \subsection{Tor itself}
 
- And lastly, we include Tor itself in the list of current solutions
 
- to firewalls. Tens of thousands of people use Tor from countries that
 
- routinely filter their Internet. Tor's website has been blocked in most
 
- of them. But why hasn't the Tor network been blocked yet?
 
- We have several theories. The first is the most straightforward: tens of
 
- thousands of people are simply too few to matter. It may help that Tor is
 
- perceived to be for experts only, and thus not worth attention yet. The
 
- more subtle variant on this theory is that we've positioned Tor in the
 
- public eye as a tool for retaining civil liberties in more free countries,
 
- so perhaps blocking authorities don't view it as a threat. (We revisit
 
- this idea when we consider whether and how to publicize a Tor variant
 
- that improves blocking-resistance --- see Section~\ref{subsec:publicity}
 
- for more discussion.)
 
- The broader explanation is that most government-level filters are not
 
- created by people setting out to block all possible ways to bypass
 
- them. They're created by people who want to do a good enough job that
 
- they can still appear in control. They realize that there will always
 
- be ways for a few people to get around the firewall, and as long as Tor
 
- has not publicly threatened their control, they see no urgent need to
 
- block it yet.
 
- We should recognize that we're \emph{already} in the arms race. These
 
- constraints can give us insight into the priorities and capabilities of
 
- our various attackers.
 
- \section{The relay component of our blocking-resistant design}
 
- \label{sec:bridges}
 
- Section~\ref{sec:current-tor} describes many reasons why Tor is
 
- well-suited as a building block in our context, but several changes will
 
- allow the design to resist blocking better. The most critical changes are
 
- to get more relay addresses, and to distribute them to users differently.
 
- %We need to address three problems:
 
- %- adapting the relay component of Tor so it resists blocking better.
 
- %- Discovery.
 
- %- Tor's network signature.
 
- %Here we describe the new pieces we need to add to the current Tor design.
 
- \subsection{Bridge relays}
 
- Hundreds of thousands of people around the world use Tor. We can leverage
 
- our already self-selected user base to produce a list of thousands of
 
- often-changing IP addresses. Specifically, we can give them a little
 
- button in the GUI that says ``Tor for Freedom'', and users who click
 
- the button will turn into \emph{bridge relays}, or just \emph{bridges}
 
- for short. They can rate limit relayed connections to 10 KB/s (almost
 
- nothing for a broadband user in a free country, but plenty for a user
 
- who otherwise has no access at all), and since they are just relaying
 
- bytes back and forth between blocked users and the main Tor network, they
 
- won't need to make any external connections to Internet sites. Because
 
- of this separation of roles, and because we're making use of software
 
- that the volunteers have already installed for their own use, we expect
 
- our scheme to attract and maintain more volunteers than previous schemes.
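
- As a rough illustration of the 10 KB/s rate limiting, here is a simple
- token-bucket sketch (our own illustration, not the bandwidth-limiting code
- Tor actually uses):

- \begin{verbatim}
import time

class TokenBucket:
    """Allow roughly `rate` bytes per second, with bursts up to `burst`
    bytes.  Illustration of the 10 KB/s cap, not Tor's implementation."""

    def __init__(self, rate=10 * 1024, burst=10 * 1024):
        self.rate = rate
        self.capacity = burst
        self.tokens = float(burst)
        self.last = time.monotonic()

    def allow(self, nbytes):
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False        # caller should wait before relaying more bytes

bucket = TokenBucket()      # a bridge would consult bucket.allow(len(cell))
- \end{verbatim}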
 
- As usual, there are new anonymity and security implications from running a
 
- bridge relay, particularly from letting people relay traffic through your
 
- Tor client; but we leave this discussion for Section~\ref{sec:security}.
 
- %...need to outline instructions for a Tor config that will publish
 
- %to an alternate directory authority, and for controller commands
 
- %that will do this cleanly.
 
- \subsection{The bridge directory authority (BDA)}
 
- How do the bridge relays advertise their existence to the world? We
 
- introduce a second new component of the design: a specialized directory
 
- authority that aggregates and tracks bridges. Bridge relays periodically
 
- publish server descriptors (summaries of their keys, locations, etc,
 
- signed by their long-term identity key), just like the relays in the
 
- ``main'' Tor network, but in this case they publish them only to the
 
- bridge directory authorities.
 
- The main difference between bridge authorities and the directory
 
- authorities for the main Tor network is that the main authorities give
 
- out a list of every known relay, but the bridge authorities only give
 
- out a server descriptor if you already know its identity key. That is,
 
- you can keep up-to-date on a bridge's location and other information
 
- once you know about it, but you can't just grab a list of all the bridges.
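
- A minimal sketch of this ``lookup but don't enumerate'' behavior follows
- (the class and method names are hypothetical, not the bridge authority's
- real interface):

- \begin{verbatim}
class BridgeAuthoritySketch:
    """Descriptors are keyed by identity fingerprint.  A caller who
    already knows a fingerprint can fetch its current descriptor, but
    there is deliberately no way to list everything."""

    def __init__(self):
        self._descriptors = {}          # fingerprint -> signed descriptor

    def publish(self, fingerprint, descriptor):
        # Called (over Tor) when a bridge uploads its descriptor.
        self._descriptors[fingerprint] = descriptor

    def lookup(self, fingerprint):
        # Returns None unless the caller already knew the fingerprint.
        return self._descriptors.get(fingerprint)
- \end{verbatim}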
 
- The identity keys, IP address, and directory port for the bridge
 
- authorities ship by default with the Tor software, so the bridge relays
 
- can be confident they're publishing to the right location, and the
 
- blocked users can establish an encrypted authenticated channel. See
 
- Section~\ref{subsec:trust-chain} for more discussion of the public key
 
- infrastructure and trust chain.
 
- Bridges use Tor to publish their descriptors privately and securely,
 
- so even an attacker monitoring the bridge directory authority's network
 
- can't make a list of all the addresses contacting the authority and
 
- track them that way.
 
- %\subsection{A simple matter of engineering}
 
- %
 
- %Although we've described bridges and bridge authorities in simple terms
 
- %above, some design modifications and features are needed in the Tor
 
- %codebase to add them. We describe the four main changes here.
 
- %
 
- %Firstly, we need to get smarter about rate limiting:
 
- %Bandwidth classes
 
- %
 
- %Secondly, while users can in fact configure which directory authorities
 
- %they use, we need to add a new type of directory authority and teach
 
- %bridges to fetch directory information from the main authorities while
 
- %publishing server descriptors to the bridge authorities. We're most of
 
- %the way there, since we can already specify attributes for directory
 
- %authorities:
 
- %add a separate flag named ``blocking''.
 
- %
 
- %Thirdly, need to build paths using bridges as the first
 
- %hop. One more hole in the non-clique assumption.
 
- %
 
- %Lastly, since bridge authorities don't answer full network statuses,
 
- %we need to add a new way for users to learn the current status for a
 
- %single relay or a small set of relays --- to answer such questions as
 
- %``is it running?'' or ``is it behaving correctly?'' We describe in
 
- %Section~\ref{subsec:enclave-dirs} a way for the bridge authority to
 
- %publish this information without resorting to signing each answer
 
- %individually.
 
- \subsection{Putting them together}
 
- If a blocked user knows the identity keys of a set of bridge relays, and
 
- he has correct address information for at least one of them, he can use
 
- that one to make a secure connection to the bridge authority and update
 
- his knowledge about the other bridge relays. He can also use it to make
 
- secure connections to the main Tor network and directory servers, so he
 
- can build circuits and connect to the rest of the Internet. All of these
 
- updates happen in the background: from the blocked user's perspective,
 
- he just accesses the Internet via his Tor client like always.
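
- The bootstrapping flow might look roughly like the sketch below; every
- helper name here is a hypothetical placeholder for a step the Tor client
- performs internally:

- \begin{verbatim}
# Hypothetical sketch of the blocked client's bootstrap flow.  None of
# these helpers exist in Tor; they stand in for internal steps.

def tls_connect(address, orport, fingerprint):
    """Open a TLS connection and check the peer against `fingerprint`."""
    raise NotImplementedError

def refresh_bridge_descriptor(conn, fingerprint):
    """Ask the bridge authority, through `conn`, for a fresh descriptor."""
    raise NotImplementedError

def fetch_main_directory(conn):
    """Fetch main-network directory information through the bridge."""
    raise NotImplementedError

def bootstrap(known_bridges):
    """known_bridges: [(address, orport, fingerprint)] obtained out of
    band.  Returns a usable first-hop connection, or raises."""
    for address, orport, fingerprint in known_bridges:
        try:
            conn = tls_connect(address, orport, fingerprint)
        except OSError:
            continue                    # unreachable; try the next bridge
        for _, _, fp in known_bridges:  # update our view of the others
            refresh_bridge_descriptor(conn, fp)
        fetch_main_directory(conn)      # now circuits can be built as usual
        return conn
    raise RuntimeError("no working bridge; need a new one out of band")
- \end{verbatim}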
 
- So now we've reduced the problem from how to circumvent the firewall
 
- for all transactions (and how to know that the pages you get have not
 
- been modified by the local attacker) to how to learn about a working
 
- bridge relay.
 
- There's another catch though. We need to make sure that the network
 
- traffic we generate by simply connecting to a bridge relay doesn't stand
 
- out too much.
 
- %The following section describes ways to bootstrap knowledge of your first
 
- %bridge relay, and ways to maintain connectivity once you know a few
 
- %bridge relays.
 
- % (See Section~\ref{subsec:first-bridge} for a discussion
 
- %of exactly what information is sufficient to characterize a bridge relay.)
 
- \section{Hiding Tor's network signatures}
 
- \label{sec:network-signature}
 
- \label{subsec:enclave-dirs}
 
- Currently, Tor uses two protocols for its network communications. The
 
- main protocol uses TLS for encrypted and authenticated communication
 
- between Tor instances. The second protocol is standard HTTP, used for
 
- fetching directory information. All Tor servers listen on their ``ORPort''
 
- for TLS connections, and some of them opt to listen on their ``DirPort''
 
- as well, to serve directory information. Tor servers choose whatever port
 
- numbers they like; the server descriptor they publish to the directory
 
- tells users where to connect.
 
- One format for communicating address information about a bridge relay is
 
- its IP address and DirPort. From there, the user can ask the bridge's
 
- directory cache for an up-to-date copy of its server descriptor, and
 
- learn its current circuit keys, its ORPort, and so on.
 
- However, connecting directly to the directory cache involves a plaintext
 
- HTTP request. A censor could create a network signature for the request
 
- and/or its response, thus preventing these connections. To resolve this
 
- vulnerability, we've modified the Tor protocol so that users can connect
 
- to the directory cache via the main Tor port --- they establish a TLS
 
- connection with the bridge as normal, and then send a special ``begindir''
 
- relay command to establish an internal connection to its directory cache.
 
- Therefore a better way to summarize a bridge's address is by its IP
 
- address and ORPort, so all communications between the client and the
 
- bridge will use ordinary TLS. But there are other details that need
 
- more investigation.
 
- What port should bridges pick for their ORPort? We currently recommend
 
- that they listen on port 443 (the default HTTPS port) if they want to
 
- be most useful, because clients behind standard firewalls will have
 
- the best chance to reach them. Is this the best choice in all cases,
 
- or should we encourage some fraction of them to pick random ports, or other
 
- ports commonly permitted on firewalls like 53 (DNS) or 110 (POP)? We need
 
- more research on our potential users, and their current and anticipated
 
- firewall restrictions.
 
- Furthermore, we need to look at the specifics of Tor's TLS handshake.
 
- Right now Tor uses some predictable strings in its TLS handshakes. For
 
- example, it sets the X.509 organizationName field to ``Tor'', and it puts
 
- the Tor server's nickname in the certificate's commonName field. We
 
- should tweak the handshake protocol so it doesn't rely on any details
 
- in the certificate headers, yet it remains secure. Should we replace
 
- it with blank entries for each field, or should we research the common
 
- values that Firefox and Internet Explorer use and try to imitate those?
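
- To illustrate the kind of normalization we have in mind, the sketch below
- uses the Python \texttt{cryptography} package to build a self-signed
- certificate whose subject looks like a generic web server rather than
- advertising ``Tor'' and a nickname; this is only an illustration, not how
- Tor generates its link certificates:

- \begin{verbatim}
# Sketch: a self-signed certificate whose subject looks like a generic
# web server instead of advertising "Tor" and a router nickname.
# Illustration only; Tor builds its link certificates internally.
import datetime
from cryptography import x509
from cryptography.x509.oid import NameOID
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa

key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
name = x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, u"www.example.com")])
now = datetime.datetime.utcnow()
cert = (x509.CertificateBuilder()
        .subject_name(name)
        .issuer_name(name)              # self-signed
        .public_key(key.public_key())
        .serial_number(x509.random_serial_number())
        .not_valid_before(now)
        .not_valid_after(now + datetime.timedelta(days=30))
        .sign(key, hashes.SHA256()))
- \end{verbatim}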
 
- Worse, Tor's TLS handshake involves sending two certificates in each
 
- direction: one certificate contains the self-signed identity key for
 
- the router, and the second contains the current link key, signed by the
 
- identity key. We use these to authenticate that we're talking to the right
 
- router, and also to establish perfect forward secrecy for that link.
 
- How much will these extra certificates make Tor's TLS handshake stand
 
- out? We have to work on normalizing our appearance not just in terms
 
- of the fields used in each certificate, but also in the number of
 
- certificates we present for each side.
 
- % Nick, I need help with the above paragraph. What are the two certs
 
- % for really, and how much work would it be to start acting like a normal
 
- % browser? -RD
 
- Lastly, what if the adversary starts observing the network traffic even
 
- more closely? Even if our TLS handshake looks innocent, our traffic timing
 
- and volume still look different than a user making a secure web connection
 
- to his bank. The same techniques used in the growing trend to build tools
 
- to recognize encrypted Bittorrent traffic~\cite{bt-traffic-shaping}
 
- could be used to identify Tor communication and recognize bridge
 
- relays. Rather than trying to look like encrypted web traffic, we may be
 
- better off trying to blend with some other encrypted network protocol. The
 
- first step is to compare typical network behavior for a Tor client to
 
- typical network behavior for various other protocols. This statistical
 
- cat-and-mouse game is made more complex by the fact that Tor transports a
 
- variety of protocols, and we'll want to automatically handle web browsing
 
- differently from, say, instant messaging.
 
- \subsection{Identity keys as part of addressing information}
 
- We have described a way for the blocked user to bootstrap into the
 
- network once he knows the IP address and ORPort of a bridge. What about
 
- local spoofing attacks? That is, since we never learned an identity
 
- key fingerprint for the bridge, a local attacker could intercept our
 
- connection and pretend to be the bridge we had in mind. It turns out
 
- that giving false information isn't that bad --- since the Tor client
 
- ships with trusted keys for the bridge directory authority and the Tor
 
- network directory authorities, the user can learn whether he's being
 
- given a real connection to the bridge authorities or not. (After all,
 
- if the adversary intercepts every connection the user makes and gives
 
- him a bad connection each time, there's nothing we can do.)
 
- What about anonymity-breaking attacks from observing traffic, if the
 
- blocked user doesn't start out knowing the identity key of his intended
 
- bridge? The vulnerabilities aren't so bad in this case either ---
 
- the adversary could do the same attacks just by monitoring the network
 
- traffic.
 
- Once the Tor client has fetched the bridge's server descriptor, it should
 
- remember the identity key fingerprint for that bridge relay. Thus if
 
- the bridge relay moves to a new IP address, the client can query the
 
- bridge directory authority to look up a fresh server descriptor using
 
- this fingerprint.
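
- A sketch of the client-side bookkeeping this implies (a hypothetical
- structure; the real Tor client stores this state differently):

- \begin{verbatim}
class KnownBridge:
    """Client-side record for one bridge, keyed by identity fingerprint
    so that a change of IP address does not lose the bridge."""

    def __init__(self, fingerprint, address, orport):
        self.fingerprint = fingerprint
        self.address = address
        self.orport = orport

    def refresh(self, bridge_authority):
        """If the bridge seems unreachable, look up a fresh descriptor
        by fingerprint (bridge_authority.lookup is a placeholder) and
        adopt its new address."""
        descriptor = bridge_authority.lookup(self.fingerprint)
        if descriptor is not None:
            self.address = descriptor["address"]
            self.orport = descriptor["orport"]
- \end{verbatim}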
 
- So we've shown that it's \emph{possible} to bootstrap into the network
 
- just by learning the IP address and ORPort of a bridge, but are there
 
- situations where it's more convenient or more secure to learn the bridge's
 
- identity fingerprint as well, or instead, while bootstrapping? We keep
 
- that question in mind as we next investigate bootstrapping and discovery.
 
- \section{Discovering and maintaining working bridge relays}
 
- \label{sec:discovery}
 
- Tor's modular design means that we can develop a better relay component
 
- independently of developing the discovery component. This modularity's
 
- great promise is that we can pick any discovery approach we like; but the
 
- unfortunate fact is that we have no magic bullet for discovery. We're
 
- in the same arms race as all the other designs we described in
 
- Section~\ref{sec:related}.
 
- We see three main options for the discovery component:
- \begin{tightlist}
- \item Independent proxies: users just tell their friends.
- \item Public proxies: given out like Circumventors, or via all sorts of
- other rate-limiting mechanisms.
- \item A social network scheme, with accounts and reputations.
- \end{tightlist}
- 
- In the first subsection we describe how a user finds his first bridge,
- so that he can reach the BDA. From there we either assume a social
- network or other mechanism for learning IP:dirport or key fingerprints
- as above, or we assume an account server that allows us to limit the
- number of new bridge relays an external attacker can discover.
- 
- Discovery is going to be an arms race, so we need a bag of tricks. It is
- hard to say which ones will work, and we shouldn't spend them all at once.
 
- \subsection{Bootstrapping: finding your first bridge}
 
- \label{subsec:first-bridge}
 
- Most government firewalls are not perfect. They allow connections to
 
- Google cache or some open proxy servers, or they let file-sharing or
 
- Skype or World-of-Warcraft connections through.
 
- For users who can't use any of these techniques, hopefully they know
 
- a friend who can --- for example, perhaps the friend already knows some
 
- bridge relay addresses.
 
- (If they can't get around it at all, then we can't help them --- they
 
- should go meet more people.)
 
- Some techniques are sufficient to get us an IP address and a port,
 
- and others can get us IP:port:key. Below we lay out some plausible options
- for how users can bootstrap into learning their first bridge.
 
- In the first round of the arms race, plausible options include:
- \begin{tightlist}
- \item The bridge authority hands some bridges out directly.
- \item The user gets one from a friend.
- \item The user sends us mail from a unique account and gets an
- automated answer.
- \end{tightlist}
- 
- In the second round we move to a social network scheme. One attack to
- keep in mind: the adversary can reconstruct a user's social network by
- learning who knows which bridges.
 
- \subsection{Centrally-distributed personal proxies}
 
- Circumventor, realizing that its adoption will remain limited if would-be
 
- users can't connect with volunteers, has started a mailing list to
 
- distribute new proxy addresses every few days. From experimentation
 
- it seems they have concluded that sending updates every 3 or 4 days is
 
- sufficient to stay ahead of the current attackers.
 
- If there are many volunteer proxies and many interested users, a central
 
- watering hole to connect them is a natural solution. On the other hand,
 
- at first glance it appears that we've inherited the \emph{bad} parts of
 
- each of the above designs: not only do we have to attract many volunteer
 
- proxies, but the users also need to get to a single site that is sure
 
- to be blocked.
 
- There are two reasons why we're in better shape. Firstly, the users don't
 
- actually need to reach the watering hole directly: it can respond to
 
- email, for example. Secondly, 
 
- % In fact, the JAP
 
- %project~\cite{web-mix,koepsell:wpes2004} suggested an alternative approach
 
- %to a mailing list: new users email a central address and get an automated
 
- %response listing a proxy for them.
 
- % While the exact details of the
 
- %proposal are still to be worked out, the idea of giving out
 
- \subsection{Discovery based on social networks}
 
- The core idea is a token that can be exchanged at the BDA (assuming you
- can reach it) for a new IP:dirport or server descriptor.
- The account server that manages these tokens
- runs as a Tor controller for the bridge authority.
- 
- Users can establish reputations, perhaps based on social network
- connectivity, perhaps based on not getting their bridge relays blocked.
 
- Probably the most critical lesson learned in past work on reputation
 
- systems in privacy-oriented environments~\cite{p2p-econ} is the need for
 
- verifiable transactions. That is, the entity computing and advertising
 
- reputations for participants needs to actually learn in a convincing
 
- way that a given transaction was successful or unsuccessful.
 
- (A further lesson from designing reputation systems~\cite{p2p-econ}: it is
- easy to reward good behavior, but hard to punish bad behavior.)
 
- \subsection{How to allocate bridge addresses to users}
 
- Hold a fraction in reserve, in case our currently deployed tricks
 
- all fail at once --- so we can move to new approaches quickly.
 
- (Bridges that sign up and don't get used yet will be sad; but this
 
- is a transient problem --- if bridges are on by default, nobody will
 
- mind not being used.)
 
- Perhaps each bridge should be known by a single bridge directory
 
- authority. This makes it easier to trace which users have learned about
 
- it, so easier to blame or reward. It also makes things more brittle,
 
- since loss of that authority means its bridges aren't advertised until
 
- they switch, and means its bridge users are sad too.
 
- (Need a slick hash algorithm that will map our identity key to a
 
- bridge authority, in a way that's sticky even when we add bridge
 
- directory authorities, but isn't sticky when our authority goes
 
- away. Does this exist?)
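
- One plausible candidate (our suggestion, not something the design commits
- to) is rendezvous or ``highest random weight'' hashing: each bridge goes to
- whichever authority scores highest on a hash of the bridge and authority
- identities together. Adding an authority moves only the bridges it now
- wins, and removing one reassigns only its own bridges:

- \begin{verbatim}
import hashlib

def assign_authority(bridge_fingerprint, authority_ids):
    """Rendezvous (highest-random-weight) hashing: deterministic and
    mostly sticky as authorities come and go.  Illustrative sketch."""
    def score(auth_id):
        data = (bridge_fingerprint + ":" + auth_id).encode()
        return hashlib.sha256(data).hexdigest()
    return max(authority_ids, key=score)

# Removing "auth2" later would only move the bridges "auth2" had won.
print(assign_authority("ABCD1234EF", ["auth1", "auth2", "auth3"]))
- \end{verbatim}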
 
- Divide bridges into buckets based on their identity key.
 
- [Design question: need an algorithm to deterministically map a bridge's
 
- identity key into a category that isn't too gameable. Take a keyed
 
- hash of the identity key plus a secret the bridge authority keeps?
 
- An adversary signing up bridges won't easily be able to learn what
 
- category he's been put in, so it's slow to attack.]
 
- One portion of the bridges is the public bucket. If you ask the
 
- bridge account server for a public bridge, it will give you a random
 
- one of these. We expect they'll be the first to be blocked, but they'll
 
- help the system bootstrap until it \emph{does} get blocked, and remember that
 
- we're dealing with different blocking regimes around the world that will
 
- progress at different rates.
 
- The generalization of the public bucket is a bucket based on the bridge
 
- user's IP address: you can learn a random entry only from the subbucket
 
- your IP address (actually, your /24) maps to.
 
- Another portion of the bridges can be sectioned off to be given out in
 
- a time-release basis. The bucket is partitioned into pieces which are
 
- deterministically available only in certain time windows.
 
- And of course another portion is made available for the social network
 
- design above.
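
- A sketch of how these allocation strategies might fit together; the bucket
- names, the one-week window, and the use of the first HMAC byte are all
- assumptions made for illustration:

- \begin{verbatim}
import hashlib, hmac, random, time

BUCKETS = ["public", "ip-keyed", "time-release", "social"]   # assumption
WINDOW = 7 * 24 * 3600        # one time-release piece per week (assumption)

def bucket_for_bridge(identity_key, authority_secret):
    """Keyed hash of the identity key (both arguments are bytes), so a
    bridge operator can't predict or choose his bucket."""
    digest = hmac.new(authority_secret, identity_key, hashlib.sha256).digest()
    return BUCKETS[digest[0] % len(BUCKETS)]

def public_bridge(public_bucket):
    """Hand out a uniformly random bridge from the public bucket."""
    return random.choice(public_bucket)

def ip_keyed_bridge(subbuckets, client_ip):
    """The client's /24 picks a subbucket; it only ever learns bridges
    from that one subbucket."""
    slash24 = ".".join(client_ip.split(".")[:3])
    index = int(hashlib.sha256(slash24.encode()).hexdigest(), 16) % len(subbuckets)
    return random.choice(subbuckets[index])

def time_release_bridge(pieces, now=None):
    """Only one deterministic piece of the bucket is available per window."""
    now = time.time() if now is None else now
    return random.choice(pieces[int(now // WINDOW) % len(pieces)])
- \end{verbatim}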
 
- Captchas or similar puzzles can also rate limit how quickly any one
- requester can harvest bridge addresses from these buckets.
 
- Is it useful to load balance which bridges are handed out? The above
 
- bucket concept makes some bridges wildly popular and others less so.
 
- But I guess that's the point.
 
- \subsection{How do we know if a bridge relay has been blocked?}
 
- We need some mechanism for testing reachability from inside the
 
- blocked area.
 
- The easiest answer is for certain users inside the area to sign up as
 
- testing relays, and then we can route through them and see if it works.
 
- The first problem is that different network areas block different net masks,
 
- and it will likely be hard to know which users are in which areas. So
 
- if a bridge relay isn't reachable, is that because of a network block
 
- somewhere, because of a problem at the bridge relay, or just a temporary
 
- outage?
 
- The second problem is that if we pick random users to test random relays,
- the adversary can sign up users on the inside and enumerate the relays
 
- we test. But it seems dangerous to just let people come forward and
 
- declare that things are blocked for them, since they could be tricking
 
- us. (This matters even more so if our reputation system above relies on
 
- whether things get blocked to punish or reward.)
 
- Another answer is not to measure directly, but rather let the bridges
 
- report whether they're being used. If they periodically report to their
 
- bridge directory authority how much use they're seeing, the authority
 
- can make smart decisions from there.
 
- If they install a geoip database, they can periodically report to their
 
- bridge directory authority which countries they're seeing use from. This
 
- might help us to track which countries are making use of Ramp, and can
 
- also let us learn about new steps the adversary has taken in the arms
 
- race. (If the bridges don't want to install a whole geoip subsystem, they
 
- can report samples of the /24 network for their users, and the authorities
 
- can do the geoip work. This tradeoff has clear downsides though.)
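
- A sketch of the bridge-side reporting; \texttt{country\_of} is a
- placeholder for whatever geoip lookup the bridge installs, and the signing
- and upload steps are left out:

- \begin{verbatim}
from collections import Counter

def country_of(ip_address, geoip_db):
    """Placeholder geoip lookup: map an address to a country code."""
    return geoip_db.get(ip_address, "??")

def daily_usage_report(client_ips, geoip_db):
    """Count distinct client addresses per country over one day; the
    bridge would sign this summary and send it to its bridge authority."""
    return dict(Counter(country_of(ip, geoip_db) for ip in set(client_ips)))

# Example output: {'cn': 2, 'ir': 1}
print(daily_usage_report(["1.2.3.4", "1.2.3.4", "5.6.7.8", "9.8.7.6"],
                         {"1.2.3.4": "cn", "5.6.7.8": "cn", "9.8.7.6": "ir"}))
- \end{verbatim}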
 
- Worry: adversary signs up a bunch of already-blocked bridges. If we're
 
- stingy giving out bridges, users in that country won't get useful ones.
 
- (Worse, we'll blame the users when the bridges report they're not
 
- being used?)
 
- Worry: the adversary could choose not to block bridges but just record
 
- connections to them. So be it, I guess.
 
- \subsection{How to learn how well the whole idea is working}
 
- We need some feedback mechanism to learn how much use the bridge network
 
- as a whole is actually seeing. Part of the reason for this is so we can
 
- respond and adapt the design; part is because the funders expect to see
 
- progress reports.
 
- The above geoip-based approach to detecting blocked bridges gives us a
 
- solution though.
 
- \section{Security considerations}
 
- \label{sec:security}
 
- \subsection{Observers can tell who is publishing and who is reading}
 
- \label{subsec:upload-padding}
 
- Should bridge users sometimes send bursts of long-range drop cells?
 
- \subsection{Anonymity effects from becoming a bridge relay}
 
- Against some attacks, becoming a bridge relay can improve anonymity. The
 
- simplest example is an attacker who owns a small number of Tor servers. He
 
- will see a connection from the bridge, but he won't be able to know
 
- whether the connection originated there or was relayed from somebody else.
 
- There are some cases where it doesn't seem to help: if an attacker can
 
- watch all of the bridge's incoming and outgoing traffic, then it's easy
 
- to learn which connections were relayed and which started there. (In this
 
- case he still doesn't know the final destinations unless he is watching
 
- them too, but in this case bridges are no better off than if they were
 
- an ordinary client.)
 
- There are also some potential downsides to running a bridge. First, while
 
- we try to make it hard to enumerate all bridges, it's still possible to
 
- learn about some of them, and for some people just the fact that they're
 
- running one might signal to an attacker that they place a high value
 
- on their anonymity. Second, there are some more esoteric attacks on Tor
 
- relays that are not as well-understood or well-tested --- for example, an
 
- attacker may be able to ``observe'' whether the bridge is sending traffic
 
- even if he can't actually watch its network, by relaying traffic through
 
- it and noticing changes in traffic timing~\cite{attack-tor-oak05}. On
 
- the other hand, it may be that limiting the bandwidth the bridge is
 
- willing to relay will allow this sort of attacker to determine if it's
 
- being used as a bridge but not whether it is adding traffic of its own.
 
- It is an open research question whether the benefits outweigh the risks. A
 
- lot of the decision rests on which attacks users are most worried
 
- about. For most users, we don't think running a bridge relay will be
 
- that damaging.
 
- \subsection{Trusting local hardware: Internet cafes and LiveCDs}
 
- \label{subsec:cafes-and-livecds}
 
- Assuming that users have their own trusted hardware is not
 
- always reasonable.
 
- For Internet cafe Windows computers that let you attach your own USB key,
 
- a USB-based Tor image would be smart. There's Torpark, and hopefully
 
- there will be more options down the road. Hardware or software keyloggers,
- other spyware, and physical surveillance remain worries in this setting.
 
- If the system lets you boot from a CD or from a USB key, you can gain
 
- a bit more security by bringing a privacy LiveCD with you. Hardware
 
- keyloggers and physical surveillance are still a worry. LiveCDs are also useful
 
- if it's your own hardware, since it's easier to avoid leaving breadcrumbs
 
- everywhere.
 
- \subsection{Forward compatibility and retiring bridge authorities}
 
- Eventually we'll want to change the identity key and/or location
 
- of a bridge authority. How do we do this mostly cleanly?
 
- \subsection{The trust chain}
 
- \label{subsec:trust-chain}
 
- Tor's ``public key infrastructure'' provides a chain of trust to
 
- let users verify that they're actually talking to the right servers.
 
- There are four pieces to this trust chain.
 
- Firstly, when Tor clients are establishing circuits, at each step
 
- they demand that the next Tor server in the path prove knowledge of
 
- its private key~\cite{tor-design}. This step prevents the first node
 
- in the path from just spoofing the rest of the path. Secondly, the
 
- Tor directory authorities provide a signed list of servers along with
 
- their public keys --- so unless the adversary can control a threshold
 
- of directory authorities, he can't trick the Tor client into using other
 
- Tor servers. Thirdly, the location and keys of the directory authorities,
 
- in turn, are hard-coded in the Tor source code --- so as long as the user
 
- got a genuine version of Tor, he can know that he is using the genuine
 
- Tor network. And lastly, the source code and other packages are signed
 
- with the GPG keys of the Tor developers, so users can confirm that they
 
- did in fact download a genuine version of Tor.
 
- But how can a user in an oppressed country know that he has the correct
 
- key fingerprints for the developers? As with other security systems, it
 
- ultimately comes down to human interaction. The keys are signed by dozens
 
- of people around the world, and we have to hope that our users have met
 
- enough people in the PGP web of trust~\cite{pgp-wot} that they can learn
 
- the correct keys. For users that aren't connected to the global security
 
- community, though, this question remains a critical weakness.
 
- % XXX make clearer the trust chain step for bridge directory authorities
 
- \subsection{Security through obscurity: publishing our design}
 
- Many other schemes like dynaweb use the typical arms race strategy of
 
- not publishing their plans. Our goal here is to produce a design ---
 
- a framework --- that can be public and still secure. Where's the tradeoff?
 
- \section{Performance improvements}
 
- \label{sec:performance}
 
- \subsection{Fetch server descriptors just-in-time}
 
- I guess we should encourage most places to do this, so blocked
 
- users don't stand out.
 
- Related open items: network-status and directory optimizations, better
- caching, and possible partitioning issues.
 
- \section{Maintaining reachability}
 
- \subsection{How many bridge relays should you know about?}
 
- If they're ordinary Tor users on cable modem or DSL, many of them will
 
- disappear and/or move periodically. How many bridge relays should a blocked
- user know about before he's likely to have at least one reachable at any
- given point? How do we factor in a parameter for ``speed that his bridges
- get discovered and blocked''?
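
- As a crude first model, suppose each bridge is independently reachable
- with probability $p$ at any given time (ignoring the dynamics of discovery
- and blocking). Then knowing $k$ bridges gives
- \[
-   \Pr[\text{at least one bridge reachable}] = 1 - (1-p)^k,
- \]
- so a target availability of $1-\epsilon$ requires roughly
- $k \ge \log\epsilon / \log(1-p)$ bridges. With $p=0.75$, for example,
- knowing three bridges already gives better than 98\% availability.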
 
- The related question is: if the bridge relays change IP addresses
 
- periodically, how often does the bridge user need to "check in" in order
 
- to keep from being cut out of the loop?
 
- \subsection{Cablemodem users don't provide important websites}
 
- \label{subsec:block-cable}
 
- ...so our adversary could just block all DSL and cablemodem networks,
 
- and for the most part only our bridge relays would be affected.
 
- The first answer is to aim to get volunteers both from traditionally
 
- ``consumer'' networks and also from traditionally ``producer'' networks.
 
- The second answer (not so good) would be to encourage more use of consumer
 
- networks for popular and useful websites.
 
- Other attack: China pressures Verizon to discourage its users from
 
- running bridges.
 
- \subsection{Scanning-resistance}
 
- If it's trivial to verify that we're a bridge, and we run on a predictable
 
- port, then it's conceivable our attacker would scan the whole Internet
 
- looking for bridges. (In fact, he can just scan likely networks like
 
- cablemodem and DSL services --- see Section~\ref{subsec:block-cable} for a related
 
- attack.) It would be nice to slow down this attack. It would
 
- be even nicer to make it hard to learn whether we're a bridge without
 
- first knowing some secret.
 
- One approach is to password protect the bridges: we provide a password to
- the bridge user, and he proves knowledge of it (say, with a nonced hash)
- when he connects. We'd need to give him an ID key for the
- bridge too, and wait to present the password until we've TLSed, else the
- adversary can pretend to be the bridge and MITM him to learn the password.
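
- One concrete way to do the nonced hash is a standard HMAC
- challenge-response, sketched below; the exact framing, and making sure it
- happens only after the TLS and identity-key checks, would still need
- careful design:

- \begin{verbatim}
import hashlib, hmac, os

def make_challenge():
    """Bridge sends a fresh random nonce after the TLS handshake."""
    return os.urandom(16)

def client_response(password, nonce):
    """Client proves knowledge of the password without revealing it."""
    return hmac.new(password.encode(), nonce, hashlib.sha256).digest()

def bridge_accepts(password, nonce, response):
    """Bridge-side check, using a constant-time comparison."""
    expected = hmac.new(password.encode(), nonce, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)

nonce = make_challenge()
assert bridge_accepts("s3kr1t", nonce, client_response("s3kr1t", nonce))
- \end{verbatim}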
 
- \subsection{How to motivate people to run bridge relays}
 
- One of the traditional ways to get people to run software that benefits
 
- others is to give them motivation to install it themselves.  An often
 
- suggested approach is to install it as a stunning screensaver so everybody
 
- will be pleased to run it. We take a similar approach here, by leveraging
 
- the fact that these users are already interested in protecting their
 
- own Internet traffic, so they will install and run the software.
 
- Eventually we may want all reachable Tor users to become bridges by default;
- this needs more work on usability first, but we're making progress.
 
- Also, we can make a snazzy network graph with Vidalia that emphasizes
 
- the connections the bridge user is currently relaying. (Minor anonymity
 
- implications, but hey.) (In many cases there won't be much activity,
 
- so this may backfire. Or it may be better suited to full-fledged Tor
 
- servers.)
 
- \subsection{What if the clients can't install software?}
 
- What can we do for bridge users who cannot install a Tor client?
 
- Bridge relays could always open their socks proxy. This is bad though,
 
- firstly
 
- because they learn the bridge users' destinations, and secondly because
 
- we've learned that open socks proxies tend to attract abusive users who
 
- have no idea they're using Tor.
 
- Bridges could require passwords in the socks handshake (not supported
 
- by most software including Firefox). Or they could run web proxies
 
- that require authentication and then pass the requests into Tor. This
 
- approach is probably a good way to help bootstrap the Psiphon network,
 
- if one of its barriers to deployment is a lack of volunteers willing
 
- to exit directly to websites. But it clearly drops some of the nice
 
- anonymity features Tor provides.
 
- \subsection{Publicity attracts attention}
 
- \label{subsec:publicity}
 
- Publicity cuts both ways: it attracts the volunteers and users we need,
- but (per our assumptions in Section~\ref{sec:adversary}) it also attracts
- the attention of the censors.
 
- \subsection{The Tor website: how to get the software}
 
- \section{Future designs}
 
- \subsection{Bridges inside the blocked network too}
 
- Assuming actually crossing the firewall is the risky part of the
 
- operation, can we have some bridge relays inside the blocked area too,
 
- and more established users can use them as relays so they don't need to
 
- communicate over the firewall directly at all? A simple example here is
 
- to make new blocked users into internal bridges also -- so they sign up
 
- on the BDA as part of doing their query, and we give out their addresses
 
- rather than (or along with) the external bridge addresses. This design
 
- is a lot trickier because it brings in the complexity of whether the
 
- internal bridges will remain available, can maintain reachability with
 
- the outside world, etc.
 
- Hidden services as bridges. Hidden services as bridge directory authorities.
 
- \bibliographystyle{plain} \bibliography{tor-design}
 
- \appendix
 
- \section{Counting Tor users by country}
 
- \label{app:geoip}
 
- \end{document}
 
- ship geoip db to bridges. they look up users who tls to them in the db,
 
- and upload a signed list of countries and number-of-users each day. the
 
- bridge authority aggregates them and publishes stats.
 
- bridge relays have buddies
 
- they ask a user to test the reachability of their buddy.
 
- leaks O(1) bridges, but not O(n).
 
- we should not be blockable by ordinary cisco censorship features.
 
- that is, if they want to block our new design, they will need to
 
- add a feature to block exactly this.
 
- strategically speaking, this may come in handy.
 
- hash identity key + secret that bridge authority knows. start
 
- out dividing into 2^n buckets, where n starts at 0, and we choose
 
- which bucket you're in based on the first n bits of the hash.
 
- Bridges come in clumps of 4 or 8 or whatever. If you know one bridge
 
- in a clump, the authority will tell you the rest. Now bridges can
 
- ask users to test reachability of their buddies.
 
- Giving out clumps helps with dynamic IP addresses too. Whether it
 
- should be 4 or 8 depends on our churn.
 
- the account server. let's call it a database, it doesn't have to
 
- be a thing that a human interacts with.
 
- rate limiting mechanisms:
 
- energy spent. captchas. relaying traffic for others?
 
- send us $10, we'll give you an account
 
- so how do we reward people for being good?
 
 