|
@@ -423,7 +423,7 @@ financial health as well as network security.
|
|
|
% this para should probably move to the scalability / directory system. -RD
|
|
|
% Nope. Cut for space, except for small comment added above -PFS
|
|
|
|
|
|
-\section{Policy issues}
|
|
|
+\section{Social challenges}
|
|
|
|
|
|
Many of the issues the Tor project needs to address extend beyond
|
|
|
system design and technology development. In particular, the
|
|
@@ -498,7 +498,7 @@ accurately communicating security levels to its users.
|
|
|
|
|
|
On the other hand, while the number of active concurrent users may not
|
|
|
matter as much as we'd like, it still helps to have some other users
|
|
|
-who use the network. We investigate this issue in the next section.
|
|
|
+on the network. We investigate this issue next.
|
|
|
|
|
|
\subsection{Reputability and perceived social value}
|
|
|
Another factor impacting the network's security is its reputability:
|
|
@@ -803,8 +803,8 @@ time.
|
|
|
|
|
|
\section{Design choices}
|
|
|
|
|
|
-In addition to social issues, Tor also faces some design challenges that must
|
|
|
-be addressed as the network develops.
|
|
|
+In addition to social issues, Tor also faces some design tradeoffs that must
|
|
|
+be investigated as the network develops.
|
|
|
|
|
|
\subsection{Transporting the stream vs transporting the packets}
|
|
|
\label{subsec:stream-vs-packet}
|
|
@@ -915,54 +915,6 @@ reduce usability. Further, if we let clients label certain circuits as
|
|
|
mid-latency as they are constructed, we could handle both types of traffic
|
|
|
on the same network, giving users a choice between speed and security.
|
|
|
|
|
|
-\subsection{Measuring performance and capacity}
|
|
|
-\label{subsec:performance}
|
|
|
-
|
|
|
-One of the paradoxes with engineering an anonymity network is that we'd like
|
|
|
-to learn as much as we can about how traffic flows so we can improve the
|
|
|
-network, but we want to prevent others from learning how traffic flows in
|
|
|
-order to trace users' connections through the network. Furthermore, many
|
|
|
-mechanisms that help Tor run efficiently
|
|
|
-require measurements about the network.
|
|
|
-
|
|
|
-Currently, nodes try to deduce their own available bandwidth (based on how
|
|
|
-much traffic they have been able to transfer recently) and include this
|
|
|
-information in the descriptors they upload to the directory. Clients
|
|
|
-choose servers weighted by their bandwidth, neglecting really slow
|
|
|
-servers and capping the influence of really fast ones.
|
|
|
-
|
|
|
-This is, of course, eminently cheatable. A malicious node can get a
|
|
|
-disproportionate amount of traffic simply by claiming to have more bandwidth
|
|
|
-than it does. But better mechanisms have their problems. If bandwidth data
|
|
|
-is to be measured rather than self-reported, it is usually possible for
|
|
|
-nodes to selectively provide better service for the measuring party, or
|
|
|
-sabotage the measured value of other nodes. Complex solutions for
|
|
|
-mix networks have been proposed, but do not address the issues
|
|
|
-completely~\cite{mix-acc,casc-rep}.
|
|
|
-
|
|
|
-Even with no cheating, network measurement is complex. It is common
|
|
|
-for views of a node's latency and/or bandwidth to vary wildly between
|
|
|
-observers. Further, it is unclear whether total bandwidth is really
|
|
|
-the right measure; perhaps clients should instead be considering nodes
|
|
|
-based on unused bandwidth or observed throughput.
|
|
|
-% XXXX say more here?
|
|
|
-
|
|
|
-%How to measure performance without letting people selectively deny service
|
|
|
-%by distinguishing pings. Heck, just how to measure performance at all. In
|
|
|
-%practice people have funny firewalls that don't match up to their exit
|
|
|
-%policies and Tor doesn't deal.
|
|
|
-
|
|
|
-%Network investigation: Is all this bandwidth publishing thing a good idea?
|
|
|
-%How can we collect stats better? Note weasel's smokeping, at
|
|
|
-%http://seppia.noreply.org/cgi-bin/smokeping.cgi?target=Tor
|
|
|
-%which probably gives george and steven enough info to break tor?
|
|
|
-
|
|
|
-Even if we can collect and use this network information effectively, we need
|
|
|
-to make sure that it is not more useful to attackers than to us. While it
|
|
|
-seems plausible that bandwidth data alone is not enough to reveal
|
|
|
-sender-recipient connections under most circumstances, it could certainly
|
|
|
-reveal the path taken by large traffic flows under low-usage circumstances.
|
|
|
-
|
|
|
\subsection{Running a Tor node, path length, and helper nodes}
|
|
|
\label{subsec:helper-nodes}
|
|
|
|
|
@@ -1111,79 +1063,119 @@ of the Tor project and their support for privacy, and secondly to offer
|
|
|
a way for their users, using unmodified software, to get end-to-end
|
|
|
encryption and end-to-end authentication to their website.
|
|
|
|
|
|
-\subsection{Trust and discovery}
|
|
|
-\label{subsec:trust-and-discovery}
|
|
|
+\subsection{Location diversity and ISP-class adversaries}
|
|
|
+\label{subsec:routing-zones}
|
|
|
|
|
|
-The published Tor design adopted a deliberately simplistic design for
|
|
|
-authorizing new nodes and informing clients about Tor nodes and their status.
|
|
|
-In the early Tor designs, all nodes periodically uploaded a signed description
|
|
|
-of their locations, keys, and capabilities to each of several well-known {\it
|
|
|
- directory servers}. These directory servers constructed a signed summary
|
|
|
-of all known Tor nodes (a ``directory''), and a signed statement of which
|
|
|
-nodes they
|
|
|
-believed to be operational at any given time (a ``network status''). Clients
|
|
|
-periodically downloaded a directory in order to learn the latest nodes and
|
|
|
-keys, and more frequently downloaded a network status to learn which nodes are
|
|
|
-likely to be running. Tor nodes also operate as directory caches, in order to
|
|
|
-lighten the bandwidth on the authoritative directory servers.
|
|
|
+Anonymity networks have long relied on diversity of node location for
|
|
|
+protection against attacks---typically an adversary who can observe a
|
|
|
+larger fraction of the network can launch a more effective attack. One
|
|
|
+way to achieve dispersal involves growing the network so a given adversary
|
|
|
+sees less. Alternately, we can arrange the topology so traffic can enter
|
|
|
+or exit at many places (for example, by using a free-route network
|
|
|
+like Tor rather than a cascade network like JAP). Lastly, we can use
|
|
|
+distributed trust to spread each transaction over multiple jurisdictions.
|
|
|
+But how do we decide whether two nodes are in related locations?
|
|
|
|
|
|
-In order to prevent Sybil attacks (wherein an adversary signs up many
|
|
|
-purportedly independent nodes in order to increase her chances of observing
|
|
|
-a stream as it enters and leaves the network), the early Tor directory design
|
|
|
-required the operators of the authoritative directory servers to manually
|
|
|
-approve new nodes. Unapproved nodes were included in the directory,
|
|
|
-but clients
|
|
|
-did not use them at the start or end of their circuits. In practice,
|
|
|
-directory administrators performed little actual verification, and tended to
|
|
|
-approve any Tor node whose operator could compose a coherent email.
|
|
|
-This procedure
|
|
|
-may have prevented trivial automated Sybil attacks, but would do little
|
|
|
-against a clever attacker.
|
|
|
+Feamster and Dingledine defined a \emph{location diversity} metric
|
|
|
+in~\cite{feamster:wpes2004}, and began investigating a variant of location
|
|
|
+diversity based on the fact that the Internet is divided into thousands of
|
|
|
+independently operated networks called {\em autonomous systems} (ASes).
|
|
|
+The key insight from their paper is that while we typically think of a
|
|
|
+connection as going directly from the Tor client to her first Tor node,
|
|
|
+actually it traverses many different ASes on each hop. An adversary at
|
|
|
+any of these ASes can monitor or influence traffic. Specifically, given
|
|
|
+plausible initiators and recipients, and random path selection,
|
|
|
+some ASes in the simulation were able to observe 10\% to 30\% of the
|
|
|
+transactions (that is, learn both the origin and the destination) on
|
|
|
+the deployed Tor network (33 nodes as of June 2004).
|
|
|
|
|
|
-There are a number of flaws in this system that need to be addressed as we
|
|
|
-move forward. They include:
|
|
|
-\begin{tightlist}
|
|
|
-\item Each directory server represents an independent point of failure; if
|
|
|
- any one were compromised, it could immediately compromise all of its users
|
|
|
- by recommending only compromised nodes.
|
|
|
-\item The more nodes join the network, the more unreasonable it
|
|
|
- becomes to expect clients to know about them all. Directories
|
|
|
- become infeasibly large, and downloading the list of nodes becomes
|
|
|
- burdensome.
|
|
|
-\item The validation scheme may do as much harm as it does good. It is not
|
|
|
- only incapable of preventing clever attackers from mounting Sybil attacks,
|
|
|
- but may deter node operators from joining the network. (For instance, if
|
|
|
- they expect the validation process to be difficult, or if they do not share
|
|
|
- any languages in common with the directory server operators.)
|
|
|
-\end{tightlist}
|
|
|
+The paper concludes that for best protection against the AS-level
|
|
|
+adversary, nodes should be in ASes that have the most links to other ASes:
|
|
|
+Tier-1 ISPs such as AT\&T and Abovenet. Further, a given transaction
|
|
|
+is safest when it starts or ends in a Tier-1 ISP. Therefore, assuming
|
|
|
+initiator and responder are both in the U.S., it actually \emph{hurts}
|
|
|
+our location diversity to add far-flung nodes in continents like Asia
|
|
|
+or South America.
|
|
|
|
|
|
-We could try to move the system in several directions, depending on our
|
|
|
-choice of threat model and requirements. If we did not need to increase
|
|
|
-network capacity in order to support more users, we could simply
|
|
|
- adopt even stricter validation requirements, and reduce the number of
|
|
|
-nodes in the network to a trusted minimum.
|
|
|
-But, we can only do that if can simultaneously make node capacity
|
|
|
-scale much more than we anticipate feasible soon, and if we can find
|
|
|
-entities willing to run such nodes, an equally daunting prospect.
|
|
|
+Many open questions remain. First, it will be an immense engineering
|
|
|
+challenge to get an entire BGP routing table to each Tor client, or to
|
|
|
+summarize it sufficiently. Without a local copy, clients won't be
|
|
|
+able to safely predict what ASes will be traversed on the various paths
|
|
|
+through the Tor network to the final destination. Tarzan~\cite{tarzan:ccs02}
|
|
|
+and MorphMix~\cite{morphmix:fc04} suggest that we compare IP prefixes to
|
|
|
+determine location diversity; but the above paper showed that in practice
|
|
|
+many of the Mixmaster nodes that share a single AS have entirely different
|
|
|
+IP prefixes. When the network has scaled to thousands of nodes, does IP
|
|
|
+prefix comparison become a more useful approximation?
|
|
|
+%
|
|
|
+Second, we can take advantage of caching certain content at the
|
|
|
+exit nodes, to limit the number of requests that need to leave the
|
|
|
+network at all. What about taking advantage of caches like Akamai or
|
|
|
+Google~\cite{shsm03}? (Note that they're also well-positioned as global
|
|
|
+adversaries.)
|
|
|
+%
|
|
|
+Third, if we follow the paper's recommendations and tailor path selection
|
|
|
+to avoid choosing endpoints in similar locations, how much are we hurting
|
|
|
+anonymity against larger real-world adversaries who can take advantage
|
|
|
+of knowing our algorithm?
|
|
|
+%
|
|
|
+Lastly, can we use this knowledge to figure out which gaps in our network
|
|
|
+would most improve our robustness to this class of attack, and go recruit
|
|
|
+new nodes with those ASes in mind?
|
|
|
|
|
|
+%Tor's security relies in large part on the dispersal properties of its
|
|
|
+%network. We need to be more aware of the anonymity properties of various
|
|
|
+%approaches so we can make better design decisions in the future.
|
|
|
|
|
|
-In order to address the first two issues, it seems wise to move to a system
|
|
|
-including a number of semi-trusted directory servers, no one of which can
|
|
|
-compromise a user on its own. Ultimately, of course, we cannot escape the
|
|
|
-problem of a first introducer: since most users will run Tor in whatever
|
|
|
-configuration the software ships with, the Tor distribution itself will
|
|
|
-remain a potential single point of failure so long as it includes the seed
|
|
|
-keys for directory servers, a list of directory servers, or any other means
|
|
|
-to learn which nodes are on the network. But omitting this information
|
|
|
-from the Tor distribution would only delegate the trust problem to the
|
|
|
-individual users, most of whom are presumably less informed about how to make
|
|
|
-trust decisions than the Tor developers.
|
|
|
+\subsection{The China problem}
|
|
|
+\label{subsec:china}
|
|
|
|
|
|
-%Network discovery, sybil, node admission, scaling. It seems that the code
|
|
|
-%will ship with something and that's our trust root. We could try to get
|
|
|
-%people to build a web of trust, but no. Where we go from here depends
|
|
|
-%on what threats we have in mind. Really decentralized if your threat is
|
|
|
-%RIAA; less so if threat is to application data or individuals or...
|
|
|
+Citizens in a variety of countries, such as most recently China and
|
|
|
+Iran, are periodically blocked from accessing various sites outside
|
|
|
+their country. These users try to find any tools available to allow
|
|
|
+them to get around these firewalls. Some anonymity networks, such as
|
|
|
+Six-Four~\cite{six-four}, are designed specifically with this goal in
|
|
|
+mind; others like the Anonymizer~\cite{anonymizer} are paid by sponsors
|
|
|
+such as Voice of America to set up a network to encourage Internet
|
|
|
+freedom. Even though Tor wasn't
|
|
|
+designed with ubiquitous access to the network in mind, thousands of
|
|
|
+users across the world are trying to use it for exactly this purpose.
|
|
|
+% Academic and NGO organizations, peacefire, \cite{berkman}, etc
|
|
|
+
|
|
|
+Anti-censorship networks hoping to bridge country-level blocks face
|
|
|
+a variety of challenges. One of these is that they need to find enough
|
|
|
+exit nodes---servers on the `free' side that are willing to relay
|
|
|
+arbitrary traffic from users to their final destinations. Anonymizing
|
|
|
+networks including Tor are well-suited to this task, since we have
|
|
|
+already gathered a set of exit nodes that are willing to tolerate some
|
|
|
+political heat.
|
|
|
+
|
|
|
+The other main challenge is to distribute a list of reachable relays
|
|
|
+to the users inside the country, and give them software to use them,
|
|
|
+without letting the authorities also enumerate this list and block each
|
|
|
+relay. Anonymizer solves this by buying lots of seemingly-unrelated IP
|
|
|
+addresses (or having them donated), abandoning old addresses as they are
|
|
|
+`used up', and telling a few users about the new ones. Distributed
|
|
|
+anonymizing networks again have an advantage here, in that we already
|
|
|
+have tens of thousands of separate IP addresses whose users might
|
|
|
+volunteer to provide this service since they've already installed and use
|
|
|
+the software for their own privacy~\cite{koepsell:wpes2004}. Because
|
|
|
+the Tor protocol separates routing from network discovery~\cite{tor-design},
|
|
|
+volunteers could configure their Tor clients
|
|
|
+to generate node descriptors and send them to a special directory
|
|
|
+server that gives them out to dissidents who need to get around blocks.
|
|
|
+
|
|
|
+Of course, this still doesn't prevent the adversary
|
|
|
+from enumerating all the volunteer relays and blocking them preemptively.
|
|
|
+Perhaps a tiered-trust system could be built where a few individuals are
|
|
|
+given relays' locations, and they recommend other individuals by telling them
|
|
|
+those addresses, thus providing a built-in incentive to avoid letting the
|
|
|
+adversary intercept them. Max-flow trust algorithms~\cite{advogato}
|
|
|
+might help to bound the number of IP addresses leaked to the adversary. Groups
|
|
|
+like the W3C are looking into using Tor as a component in an overall system to
|
|
|
+help address censorship; we wish them luck.
|
|
|
+
|
|
|
+%\cite{infranet}
|
|
|
|
|
|
\section{Scaling}
|
|
|
\label{sec:scaling}
|
|
@@ -1282,119 +1274,127 @@ further study.
|
|
|
%efficiency over baseline, and also to determine how far we are from
|
|
|
%optimal efficiency (what we could get if we ignored the anonymity goals).
|
|
|
|
|
|
-\subsection{Location diversity and ISP-class adversaries}
|
|
|
-\label{subsec:routing-zones}
|
|
|
+\subsection{Trust and discovery}
|
|
|
+\label{subsec:trust-and-discovery}
|
|
|
|
|
|
-Anonymity networks have long relied on diversity of node location for
|
|
|
-protection against attacks---typically an adversary who can observe a
|
|
|
-larger fraction of the network can launch a more effective attack. One
|
|
|
-way to achieve dispersal involves growing the network so a given adversary
|
|
|
-sees less. Alternately, we can arrange the topology so traffic can enter
|
|
|
-or exit at many places (for example, by using a free-route network
|
|
|
-like Tor rather than a cascade network like JAP). Lastly, we can use
|
|
|
-distributed trust to spread each transaction over multiple jurisdictions.
|
|
|
-But how do we decide whether two nodes are in related locations?
|
|
|
+The published Tor design adopted a deliberately simplistic approach for
|
|
|
+authorizing new nodes and informing clients about Tor nodes and their status.
|
|
|
+In the early Tor designs, all nodes periodically uploaded a signed description
|
|
|
+of their locations, keys, and capabilities to each of several well-known {\it
|
|
|
+ directory servers}. These directory servers constructed a signed summary
|
|
|
+of all known Tor nodes (a ``directory''), and a signed statement of which
|
|
|
+nodes they
|
|
|
+believed to be operational at any given time (a ``network status''). Clients
|
|
|
+periodically downloaded a directory in order to learn the latest nodes and
|
|
|
+keys, and more frequently downloaded a network status to learn which nodes are
|
|
|
+likely to be running. Tor nodes also operate as directory caches, in order to
|
|
|
+lighten the bandwidth load on the authoritative directory servers.
|
|
|
|
|
|
-Feamster and Dingledine defined a \emph{location diversity} metric
|
|
|
-in \cite{feamster:wpes2004}, and began investigating a variant of location
|
|
|
-diversity based on the fact that the Internet is divided into thousands of
|
|
|
-independently operated networks called {\em autonomous systems} (ASes).
|
|
|
-The key insight from their paper is that while we typically think of a
|
|
|
-connection as going directly from the Tor client to her first Tor node,
|
|
|
-actually it traverses many different ASes on each hop. An adversary at
|
|
|
-any of these ASes can monitor or influence traffic. Specifically, given
|
|
|
-plausible initiators and recipients and path random path selection,
|
|
|
-some ASes in the simulation were able to observe 10\% to 30\% of the
|
|
|
-transactions (that is, learn both the origin and the destination) on
|
|
|
-the deployed Tor network (33 nodes as of June 2004).
|
|
|
+In order to prevent Sybil attacks (wherein an adversary signs up many
|
|
|
+purportedly independent nodes in order to increase her chances of observing
|
|
|
+a stream as it enters and leaves the network), the early Tor directory design
|
|
|
+required the operators of the authoritative directory servers to manually
|
|
|
+approve new nodes. Unapproved nodes were included in the directory,
|
|
|
+but clients
|
|
|
+did not use them at the start or end of their circuits. In practice,
|
|
|
+directory administrators performed little actual verification, and tended to
|
|
|
+approve any Tor node whose operator could compose a coherent email.
|
|
|
+This procedure
|
|
|
+may have prevented trivial automated Sybil attacks, but would do little
|
|
|
+against a clever attacker.
|
|
|
|
|
|
-The paper concludes that for best protection against the AS-level
|
|
|
-adversary, nodes should be in ASes that have the most links to other ASes:
|
|
|
-Tier-1 ISPs such as AT\&T and Abovenet. Further, a given transaction
|
|
|
-is safest when it starts or ends in a Tier-1 ISP. Therefore, assuming
|
|
|
-initiator and responder are both in the U.S., it actually \emph{hurts}
|
|
|
-our location diversity to add far-flung nodes in continents like Asia
|
|
|
-or South America.
|
|
|
+There are a number of flaws in this system that need to be addressed as we
|
|
|
+move forward. They include:
|
|
|
+\begin{tightlist}
|
|
|
+\item Each directory server represents an independent point of failure; if
|
|
|
+ any one were compromised, it could immediately compromise all of its users
|
|
|
+ by recommending only compromised nodes.
|
|
|
+\item The more nodes join the network, the more unreasonable it
|
|
|
+ becomes to expect clients to know about them all. Directories
|
|
|
+ become infeasibly large, and downloading the list of nodes becomes
|
|
|
+ burdensome.
|
|
|
+\item The validation scheme may do as much harm as it does good. It is not
|
|
|
+ only incapable of preventing clever attackers from mounting Sybil attacks,
|
|
|
+ but may deter node operators from joining the network. (For instance, if
|
|
|
+ they expect the validation process to be difficult, or if they do not share
|
|
|
+ any languages in common with the directory server operators.)
|
|
|
+\end{tightlist}
|
|
|
|
|
|
-Many open questions remain. First, it will be an immense engineering
|
|
|
-challenge to get an entire BGP routing table to each Tor client, or to
|
|
|
-summarize it sufficiently. Without a local copy, clients won't be
|
|
|
-able to safely predict what ASes will be traversed on the various paths
|
|
|
-through the Tor network to the final destination. Tarzan~\cite{tarzan:ccs02}
|
|
|
-and MorphMix~\cite{morphmix:fc04} suggest that we compare IP prefixes to
|
|
|
-determine location diversity; but the above paper showed that in practice
|
|
|
-many of the Mixmaster nodes that share a single AS have entirely different
|
|
|
-IP prefixes. When the network has scaled to thousands of nodes, does IP
|
|
|
-prefix comparison become a more useful approximation?
|
|
|
-%
|
|
|
-Second, we can take advantage of caching certain content at the
|
|
|
-exit nodes, to limit the number of requests that need to leave the
|
|
|
-network at all. What about taking advantage of caches like Akamai or
|
|
|
-Google~\cite{shsm03}? (Note that they're also well-positioned as global
|
|
|
-adversaries.)
|
|
|
-%
|
|
|
-Third, if we follow the paper's recommendations and tailor path selection
|
|
|
-to avoid choosing endpoints in similar locations, how much are we hurting
|
|
|
-anonymity against larger real-world adversaries who can take advantage
|
|
|
-of knowing our algorithm?
|
|
|
-%
|
|
|
-Lastly, can we use this knowledge to figure out which gaps in our network
|
|
|
-would most improve our robustness to this class of attack, and go recruit
|
|
|
-new nodes with those ASes in mind?
|
|
|
+We could try to move the system in several directions, depending on our
|
|
|
+choice of threat model and requirements. If we did not need to increase
|
|
|
+network capacity in order to support more users, we could simply
|
|
|
+ adopt even stricter validation requirements, and reduce the number of
|
|
|
+nodes in the network to a trusted minimum.
|
|
|
+But we can only do that if we can simultaneously make node capacity
|
|
|
+scale much more than we anticipate feasible soon, and if we can find
|
|
|
+entities willing to run such nodes, an equally daunting prospect.
|
|
|
|
|
|
-%Tor's security relies in large part on the dispersal properties of its
|
|
|
-%network. We need to be more aware of the anonymity properties of various
|
|
|
-%approaches so we can make better design decisions in the future.
|
|
|
|
|
|
-\subsection{The China problem}
|
|
|
-\label{subsec:china}
|
|
|
+In order to address the first two issues, it seems wise to move to a system
|
|
|
+including a number of semi-trusted directory servers, no one of which can
|
|
|
+compromise a user on its own. Ultimately, of course, we cannot escape the
|
|
|
+problem of a first introducer: since most users will run Tor in whatever
|
|
|
+configuration the software ships with, the Tor distribution itself will
|
|
|
+remain a potential single point of failure so long as it includes the seed
|
|
|
+keys for directory servers, a list of directory servers, or any other means
|
|
|
+to learn which nodes are on the network. But omitting this information
|
|
|
+from the Tor distribution would only delegate the trust problem to the
|
|
|
+individual users, most of whom are presumably less informed about how to make
|
|
|
+trust decisions than the Tor developers.
|
|
|
|
|
|
-Citizens in a variety of countries, such as most recently China and
|
|
|
-Iran, are periodically blocked from accessing various sites outside
|
|
|
-their country. These users try to find any tools available to allow
|
|
|
-them to get-around these firewalls. Some anonymity networks, such as
|
|
|
-Six-Four~\cite{six-four}, are designed specifically with this goal in
|
|
|
-mind; others like the Anonymizer~\cite{anonymizer} are paid by sponsors
|
|
|
-such as Voice of America to set up a network to encourage Internet
|
|
|
-freedom. Even though Tor wasn't
|
|
|
-designed with ubiquitous access to the network in mind, thousands of
|
|
|
-users across the world are trying to use it for exactly this purpose.
|
|
|
-% Academic and NGO organizations, peacefire, \cite{berkman}, etc
|
|
|
+%Network discovery, sybil, node admission, scaling. It seems that the code
|
|
|
+%will ship with something and that's our trust root. We could try to get
|
|
|
+%people to build a web of trust, but no. Where we go from here depends
|
|
|
+%on what threats we have in mind. Really decentralized if your threat is
|
|
|
+%RIAA; less so if threat is to application data or individuals or...
|
|
|
|
|
|
-Anti-censorship networks hoping to bridge country-level blocks face
|
|
|
-a variety of challenges. One of these is that they need to find enough
|
|
|
-exit nodes---servers on the `free' side that are willing to relay
|
|
|
-arbitrary traffic from users to their final destinations. Anonymizing
|
|
|
-networks including Tor are well-suited to this task, since we have
|
|
|
-already gathered a set of exit nodes that are willing to tolerate some
|
|
|
-political heat.
|
|
|
+\subsection{Measuring performance and capacity}
|
|
|
+\label{subsec:performance}
|
|
|
|
|
|
-The other main challenge is to distribute a list of reachable relays
|
|
|
-to the users inside the country, and give them software to use them,
|
|
|
-without letting the authorities also enumerate this list and block each
|
|
|
-relay. Anonymizer solves this by buying lots of seemingly-unrelated IP
|
|
|
-addresses (or having them donated), abandoning old addresses as they are
|
|
|
-`used up', and telling a few users about the new ones. Distributed
|
|
|
-anonymizing networks again have an advantage here, in that we already
|
|
|
-have tens of thousands of separate IP addresses whose users might
|
|
|
-volunteer to provide this service since they've already installed and use
|
|
|
-the software for their own privacy~\cite{koepsell:wpes2004}. Because
|
|
|
-the Tor protocol separates routing from network discovery \cite{tor-design},
|
|
|
-volunteers could configure their Tor clients
|
|
|
-to generate node descriptors and send them to a special directory
|
|
|
-server that gives them out to dissidents who need to get around blocks.
|
|
|
+One of the paradoxes with engineering an anonymity network is that we'd like
|
|
|
+to learn as much as we can about how traffic flows so we can improve the
|
|
|
+network, but we want to prevent others from learning how traffic flows in
|
|
|
+order to trace users' connections through the network. Furthermore, many
|
|
|
+mechanisms that help Tor run efficiently
|
|
|
+require measurements about the network.
|
|
|
|
|
|
-Of course, this still doesn't prevent the adversary
|
|
|
-from enumerating all the volunteer relays and blocking them preemptively.
|
|
|
-Perhaps a tiered-trust system could be built where a few individuals are
|
|
|
-given relays' locations, and they recommend other individuals by telling them
|
|
|
-those addresses, thus providing a built-in incentive to avoid letting the
|
|
|
-adversary intercept them. Max-flow trust algorithms~\cite{advogato}
|
|
|
-might help to bound the number of IP addresses leaked to the adversary. Groups
|
|
|
-like the W3C are looking into using Tor as a component in an overall system to
|
|
|
-help address censorship; we wish them luck.
|
|
|
+Currently, nodes try to deduce their own available bandwidth (based on how
|
|
|
+much traffic they have been able to transfer recently) and include this
|
|
|
+information in the descriptors they upload to the directory. Clients
|
|
|
+choose servers weighted by their bandwidth, neglecting really slow
|
|
|
+servers and capping the influence of really fast ones.
|
|
|
|
|
|
-%\cite{infranet}
|
|
|
+This is, of course, eminently cheatable. A malicious node can get a
|
|
|
+disproportionate amount of traffic simply by claiming to have more bandwidth
|
|
|
+than it does. But better mechanisms have their problems. If bandwidth data
|
|
|
+is to be measured rather than self-reported, it is usually possible for
|
|
|
+nodes to selectively provide better service for the measuring party, or
|
|
|
+sabotage the measured value of other nodes. Complex solutions for
|
|
|
+mix networks have been proposed, but do not address the issues
|
|
|
+completely~\cite{mix-acc,casc-rep}.
|
|
|
+
|
|
|
+Even with no cheating, network measurement is complex. It is common
|
|
|
+for views of a node's latency and/or bandwidth to vary wildly between
|
|
|
+observers. Further, it is unclear whether total bandwidth is really
|
|
|
+the right measure; perhaps clients should instead be considering nodes
|
|
|
+based on unused bandwidth or observed throughput.
|
|
|
+% XXXX say more here?
|
|
|
+
|
|
|
+%How to measure performance without letting people selectively deny service
|
|
|
+%by distinguishing pings. Heck, just how to measure performance at all. In
|
|
|
+%practice people have funny firewalls that don't match up to their exit
|
|
|
+%policies and Tor doesn't deal.
|
|
|
+
|
|
|
+%Network investigation: Is all this bandwidth publishing thing a good idea?
|
|
|
+%How can we collect stats better? Note weasel's smokeping, at
|
|
|
+%http://seppia.noreply.org/cgi-bin/smokeping.cgi?target=Tor
|
|
|
+%which probably gives george and steven enough info to break tor?
|
|
|
+
|
|
|
+Even if we can collect and use this network information effectively, we need
|
|
|
+to make sure that it is not more useful to attackers than to us. While it
|
|
|
+seems plausible that bandwidth data alone is not enough to reveal
|
|
|
+sender-recipient connections under most circumstances, it could certainly
|
|
|
+reveal the path taken by large traffic flows under low-usage circumstances.
|
|
|
|
|
|
\subsection{Non-clique topologies}
|
|
|
|
|
@@ -1493,7 +1493,7 @@ coexist with the variety of Internet services and their established
|
|
|
authentication mechanisms. We can't just keep escalating the blacklist
|
|
|
standoff forever.
|
|
|
%
|
|
|
-Fourth, as described in Section~\ref{sec:scaling}, the current Tor
|
|
|
+Fourth, the current Tor
|
|
|
architecture does not scale even to handle current user demand. We must
|
|
|
find designs and incentives to let clients relay traffic too, without
|
|
|
sacrificing too much anonymity.
|