19 years ago · 2557555cd4
--- a/doc/design-paper/blocking.tex
+++ b/doc/design-paper/blocking.tex
@@ -49,7 +49,7 @@ by government-level attackers.
 
				 
			
 
				 Anonymizing networks like Tor~\cite{tor-design} bounce traffic around a
			
 
				 network of encrypting relays.  Unlike encryption, which hides only {\it what}
			
 
				-is said, these network also aim to hide who is communicating with whom, which
			
 
				+is said, these networks also aim to hide who is communicating with whom, which
			
 
				 users are using which websites, and similar relations.  These systems have a
			
 
				 broad range of users, including ordinary citizens who want to avoid being
			
 
				 profiled for targeted advertisements, corporations who don't want to reveal
			
@@ -71,8 +71,9 @@ less for its anonymity properties than for its censorship
 
				 resistance properties---if they use Tor to access Internet sites like
			
 
				 Wikipedia
			
 
				 and Blogspot, they are no longer affected by local censorship
			
 
				-and firewall rules. In fact, an informal user study (described in
			
 
				-Appendix~\ref{app:geoip}) showed China as the third largest user base
			
 
				+and firewall rules. In fact, an informal user study
			
 
				+%(described in Appendix~\ref{app:geoip})
			
 
				+showed China as the third largest user base
			
 
				 for Tor clients, with perhaps ten thousand people accessing the Tor
			
 
				 network from China each day.
			
 
				 
			
@@ -112,7 +113,7 @@ security implications; ..... %write the rest.
 
				 To design an effective anticensorship tool, we need a good model for the
			
 
				 goals and resources of the censors we are evading.  Otherwise, we risk
			
 
				 spending our effort on keeping the adversaries from doing things they have no
			
 
				-interest in doing and thwarting techniques they do not use.
			
 
				+interest in doing, and thwarting techniques they do not use.
			
 
				 The history of blocking-resistance designs is littered with conflicting
			
 
				 assumptions about what adversaries to expect and what problems are
			
 
				 in the critical path to a solution. Here we describe our best
			
@@ -123,7 +124,7 @@ attacker---if we can defend against this attacker, we inherit protection
 
				 against weaker attackers as well.  After all, we want a general design
			
 
				 that will work for citizens of China, Iran, Thailand, and other censored
			
 
				 countries; for
			
 
				-whistleblowers in firewalled corporate network; and for people in
			
 
				+whistleblowers in firewalled corporate networks; and for people in
			
 
				 unanticipated oppressive situations. In fact, by designing with
			
 
				 a variety of adversaries in mind, we can take advantage of the fact that
			
 
				 adversaries will be in different stages of the arms race at each location,
			
@@ -131,7 +132,7 @@ so a server blocked in one locale can still be useful in others.
 
				 
			
 
				 We assume that the attackers' goals are somewhat complex.
			
 
				 \begin{tightlist}
			
 
				-\item The attacker would like to restrict the flow of certain kinds
			
 
				+\item The attacker would like to restrict the flow of certain kinds of
			
 
				   information, particularly when this information is seen as embarrassing to
			
 
				   those in power (such as information about rights violations or corruption),
			
 
				   or when it enables or encourages others to oppose them effectively (such as
			
@@ -142,10 +143,11 @@ We assume that the attackers' goals are somewhat complex.
 
				 \item Usually, censors make a token attempt to block a few sites for
			
 
				   obscenity, blasphemy, and so on, but their efforts here are mainly for
			
 
				   show.
			
 
				-\item Complete blocking (where nobody at all can ever download) is not a
			
 
				+\item Complete blocking (where nobody at all can ever download censored
			
 
				+  content) is not a
			
 
				   goal. Attackers typically recognize that perfect censorship is not only
			
 
				   impossible, but unnecessary: if ``undesirable'' information is known only
			
 
				-  to a small few, resources can be focused elsewhere
			
 
				+  to a small few, further censoring efforts can be focused elsewhere.
			
 
				 \item Similarly, the censors are not attempting to shut down or block {\it
			
 
				   every} anticensorship tool---merely the tools that are popular and
			
 
				   effective (because these tools impede the censors' information restriction
			
@@ -167,8 +169,9 @@ We assume that the attackers' goals are somewhat complex.
 
				   greater danger than consumers; the attacker would like to not only block
			
 
				   their work, but identify them for reprisal.
			
 
				 \item The censors (or their governments) would like to have a working, useful
			
 
				-  Internet. Otherwise, they could simply ``censor'' the Internet by outlawing
			
 
				-  it entirely, or blocking access to all but a tiny list of sites.
			
 
				+  Internet. There are economic, political, and social factors that prevent
			
 
				+  them from ``censoring'' the Internet by outlawing it entirely, or by
			
 
				+  blocking access to all but a tiny list of sites.
			
 
				   Nevertheless, the censors {\it are} willing to block innocuous content
			
 
				   (like the bulk of a newspaper's reporting) in order to censor other content
			
 
				   distributed through the same channels (like that newspaper's coverage of
			
@@ -194,7 +197,7 @@ connection~\cite{clayton:pet2006}.  Against an adversary who could carefully
 
				 examine the contents of every packet and correlate the packets in every
			
 
				 stream on the network, we would need some stronger mechanism such as
			
 
				 steganography, which introduces its own
			
 
				-problems~\cite{active-wardens,tcpstego,bar}.  But we make a ``weak
			
 
				+problems~\cite{active-wardens,tcpstego}.  But we make a ``weak
			
 
				 steganography'' assumption here: to remain unblocked, it is necessary to
			
 
				 remain unobservable only by computational resources on par with a modern
			
 
				 router, firewall, proxy, or IDS.
			
@@ -203,7 +206,7 @@ We assume that while various different regimes can coordinate and share
 
				 notes, there will be a time lag between one attacker learning how to overcome
			
 
				 a facet of our design and other attackers picking it up.  (The most common
			
 
				 vector of transmission seems to be commercial providers of censorship tools:
			
 
				-once a provider add a feature to meet one country's needs or requests, the
			
 
				+once a provider adds a feature to meet one country's needs or requests, the
			
 
				 feature is available to all of the provider's customers.)  Conversely, we
			
 
				 assume that insider attacks become a higher risk only after the early stages
			
 
				 of network development, once the system has reached a certain level of
			
@@ -225,7 +228,8 @@ we can do about this issue.
 
				 We assume that the attacker may be able to use political and economic
			
 
				 resources to secure the cooperation of extraterritorial or multinational
			
 
				 corporations and entities in investigating information sources.  For example,
			
 
				-the censors can threaten the hosts of troublesome blogs with economic
			
 
				+the censors can threaten the service providers of troublesome blogs
			
 
				+with economic
			
 
				 reprisals if they do not reveal the authors' identities.
			
 
				 
			
 
				 We assume that the user will be able to fetch a genuine
			
@@ -266,15 +270,17 @@ from volunteering a relay in order to learn that Alice is reading
 
				 or posting to certain websites. The third property helps keep users safe from
			
 
				 collaborating websites: consider websites and other Internet services 
			
 
				 that have been pressured
			
 
				-recently into revealing the identity of bloggers~\cite{arrested-bloggers}
			
 
				+recently into revealing the identity of bloggers
			
 
				+%~\cite{arrested-bloggers}
			
 
				 or treating clients differently depending on their network
			
 
				-location~\cite{google-geolocation}.
			
 
				-% and cite{goodell-syverson06} once it's finalized.
			
 
				+location~\cite{goodell-syverson06}.
			
 
				+%~\cite{google-geolocation}.
			
 
				 
			
 
				 The Tor design provides other features as well that are not typically
			
 
				 present in manual or ad hoc circumvention techniques.
			
 
				 
			
 
				-First, Tor has a fairly mature way to distribute information about servers.
			
 
				+First, Tor has a well-analyzed and well-understood way to distribute
			
 
				+information about servers.
			
 
				 Tor directory authorities automatically aggregate, test,
			
 
				 and publish signed summaries of the available Tor routers. Tor clients
			
 
				 can fetch these summaries to learn which routers are available and
			
@@ -340,7 +346,8 @@ Sixth, Tor has an established user base of hundreds of
 
				 thousands of people from around the world. This diversity of
			
 
				 users contributes to sustainability as above: Tor is used by
			
 
				 ordinary citizens, activists, corporations, law enforcement, and
			
 
				-even government and military users~\cite{tor-use-cases}, and they can
			
 
				+even government and military users\footnote{http://tor.eff.org/overview},
			
 
				+and they can
			
 
				 only achieve their security goals by blending together in the same
			
 
				 network~\cite{econymics,usability:weis2006}. This user base also provides
			
 
				 something else: hundreds of thousands of different and often-changing
			
@@ -351,9 +358,10 @@ single server from linking users to their communication partners.  Despite
 
				 initial appearances, {\it distributed-trust anonymity is critical for
			
 
				 anticensorship efforts}.  If any single server can expose dissident bloggers
			
 
				 or compile a list of users' behavior, the censors can profitably compromise
			
 
				-that server's operator applying economic pressure to their employers,
			
 
				+that server's operator, perhaps by  applying economic pressure to their
			
 
				+employers,
			
 
				 breaking into their computer, pressuring their family (if they have relatives
			
 
				-in the censored area), or so on.  Furthermore, in systems where any relay can
			
 
				+in the censored area), or so on.  Furthermore, in designs where any relay can
			
 
				 expose its users, the censors can spread suspicion that they are running some
			
 
				 of the relays and use this belief to chill use of the network.
			
 
				 
			
@@ -497,10 +505,12 @@ first introduction into the Tor network.
 
				 
			
 
				 \subsection{Blocking resistance and JAP}
			
 
				 
			
 
				-K\"{o}psell's Blocking Resistance design~\cite{koepsell:wpes2004} is probably
			
 
				+K\"{o}psell and Hilling's Blocking Resistance
			
 
				+design~\cite{koepsell:wpes2004} is probably
			
 
				 the closest related work, and is the starting point for the design in this
			
 
				-paper.  In this design, the JAP anonymity system is used as a base instead of
			
 
				-Tor.  Volunteers operate a large number of access points to the core JAP
			
 
				+paper.  In this design, the JAP anonymity system~\cite{web-mix} is used
			
 
				+as a base instead of Tor.  Volunteers operate a large number of access
			
 
				+points that relay traffic to the core JAP
			
 
				 network, which in turn anonymizes users' traffic.  The software to run these
			
 
				 relays is, as in our design, included in the JAP client software and enabled
			
 
				 only when the user decides to enable it.  Discovery is handled with a
			
@@ -539,17 +549,20 @@ about relays also allows the censor to do so, he can trivially discover and
 
				 block their addresses, even if the steganography would prevent mere traffic
			
 
				 observation from revealing the relays' addresses.
			
 
				 
			
 
				-\subsection{RST-evasion}
			
 
				+\subsection{RST-evasion and other packet-level tricks}
			
 
				+
			
 
				 In their analysis of China's firewall's content-based blocking, Clayton,
			
 
				 Murdoch and Watson discovered that rather than blocking all packets in a TCP
			
 
				 streams once a forbidden word was noticed, the firewall was simply forging
			
 
				 RST packets to make the communicating parties believe that the connection was
			
 
				-closed~\cite{clayton:pet2006}.  Two mechanisms were proposed: altering
			
 
				-operating systems to ignore forged RST packets, and ensuring that sensitive
			
 
				-words are split across multiple TCP packets so that the censors' firewalls
			
 
				-can't notice them without performing expensive stream reconstruction.  The
			
 
				-later technique relies on the same insight as our weak steganography
			
 
				-assumption.
			
 
				+closed~\cite{clayton:pet2006}. They proposed altering operating systems
			
 
				+to ignore forged RST packets.
			
 
				+
			
 
				+Other packet-level responses to filtering include splitting
			
 
				+sensitive words across multiple TCP packets, so that the censors'
			
 
				+firewalls can't notice them without performing expensive stream
			
 
				+reconstruction~\cite{ptacek98insertion}. This technique relies on the
			
 
				+same insight as our weak steganography assumption.
			
 
				 
			
 
				 \subsection{Internal caching networks}
			
 
				 
			
@@ -557,15 +570,16 @@ Freenet~\cite{freenet-pets00} is an anonymous peer-to-peer data store.
 
				 Analyzing Freenet's security can be difficult, as its design is in flux as
			
 
				 new discovery and routing mechanisms are proposed, and no complete
			
 
				 specification has (to our knowledge) been written.  Freenet servers relay
			
 
				-requests for specific content (indexed by a digest of the content) to the
			
 
				-server that hosts it, and then caches the content as it works its way back to
			
 
				+requests for specific content (indexed by a digest of the content)
			
 
				+``toward'' the server that hosts it, and then cache the content as it
			
 
				+follows the same path back to
			
 
				 the requesting user.  If Freenet's routing mechanism is successful in
			
 
				 allowing nodes to learn about each other and route correctly even as some
			
 
				 node-to-node links are blocked by firewalls, then users inside censored areas
			
 
				 can ask a local Freenet server for a piece of content, and get an answer
			
 
				 without having to connect out of the country at all.  Of course, operators of
			
 
				 servers inside the censored area can still be targeted, and the addresses of
			
 
				-external serves can still be blocked.
			
 
				+external servers can still be blocked.
			
 
				 
			
 
				 \subsection{Skype}
			
 
				 
			
@@ -573,10 +587,10 @@ The popular Skype voice-over-IP software uses multiple techniques to tolerate
 
				 restrictive networks, some of which allow it to continue operating in the
			
 
				 presence of censorship.  By switching ports and using encryption, Skype
			
 
				 attempts to resist trivial blocking and content filtering.  Even if no
			
 
				-encryption were used, it would still be quite expensive to scan all voice
			
 
				+encryption were used, it would still be expensive to scan all voice
			
 
				 traffic for sensitive words.  Also, most current keyloggers are unable to
			
 
				 store voice traffic.  Nevertheless, Skype can still be blocked, especially at
			
 
				-it central directory service.
			
 
				+its central directory service.
			
 
				 
			
 
				 \subsection{Tor itself}
			
 
				 
			
@@ -1295,7 +1309,7 @@ Tor encrypts traffic on the local network, and it obscures the eventual
 
				 destination of the communication, but it doesn't do much to obscure the
			
 
				 traffic volume. In particular, a user publishing a home video will have a
			
 
				 different network signature than a user reading an online news article.
			
 
				-Based on our assumption in Section~\ref{sec:assumptions} that users who
			
 
				+Based on our assumption in Section~\ref{sec:adversary} that users who
			
 
				 publish material are in more danger, should we work to improve Tor's
			
 
				 security in this situation?
			
 
				 
			
@@ -1510,7 +1524,7 @@ If it's trivial to verify that a given address is operating as a bridge,
 
				 and most bridges run on a predictable port, then it's conceivable our
			
 
				 attacker could scan the whole Internet looking for bridges. (In fact, he
			
 
				 can just concentrate on scanning likely networks like cablemodem and DSL
			
 
				-services---see Section~\ref{block-cable} above for related attacks.) It
			
 
				+services---see Section~\ref{subsec:block-cable} above for related attacks.) It
			
 
				 would be nice to slow down this attack. It would be even nicer to make
			
 
				 it hard to learn whether we're a bridge without first knowing some
			
 
				 secret. We call this general property \emph{scanning resistance}.
			
--- a/doc/design-paper/tor-design.bib
+++ b/doc/design-paper/tor-design.bib
@@ -1374,6 +1374,25 @@ Stefan Katzenbeisser and Fernando P\'{e}rez-Gonz\'{a}lez},
 
				    note = {\url{http://nms.lcs.mit.edu/~feamster/papers/usenixsec2002.pdf}},
			
 
				 }
			
 
				 
			
 
				+@techreport{ ptacek98insertion,
			
 
				+  author = "Thomas H. Ptacek and Timothy N. Newsham",
			
 
				+  title = "Insertion, Evasion, and Denial of Service: Eluding Network Intrusion Detection",
			
 
				+  institution = "Secure Networks, Inc.",
			
 
				+  address = "Suite 330, 1201 5th Street S.W, Calgary, Alberta, Canada, T2R-0Y6",
			
 
				+  year = "1998",
			
 
				+  url = "citeseer.ist.psu.edu/ptacek98insertion.html",
			
 
				+}
			
 
				+
			
 
				+@inproceedings{active-wardens,
			
 
				+  author = "Gina Fisk and Mike Fisk and Christos Papadopoulos and Joshua Neil",
			
 
				+  title = "Eliminating Steganography in Internet Traffic with Active Wardens",
			
 
				+  booktitle = {Information Hiding Workshop (IH 2002)},
			
 
				+  year = {2002},
			
 
				+  month = {October},
			
 
				+  editor = {Fabien Petitcolas},
			
 
				+  publisher = {Springer-Verlag, LNCS 2578},
			
 
				+}
			
 
				+
			
 
				 %%% Local Variables:
			
 
				 %%% mode: latex
			
 
				 %%% TeX-master: "tor-design"