22 years ago · a91c6d27bf
--- a/doc/tor-design.tex
+++ b/doc/tor-design.tex
@@ -1524,6 +1524,7 @@ In this section, we discuss how well Tor meets our stated design goals
 
				 and its resistance to attacks.
			
 
				 
			
 
				 \SubSection{Meeting Basic Goals}
			
 
				+% None of these seem to say very much.  Should this subsection be removed?
			
 
				 \begin{tightlist}
			
 
				 \item [Basic Anonymity:] Because traffic is encrypted, changing in
			
 
				   appearance, and can flow from anywhere to anywhere within the
			
@@ -1532,9 +1533,8 @@ and its resistance to attacks.
 
				   the network will not be able to link the initiator and responder.
			
 
				   Nor is it possible to directly correlate any two communication
			
 
				   sessions as coming from a single source without additional
			
 
				-  information. Resistance to specific anonymity threats will be discussed
			
 
				-  below.
			
 
				-  
			
 
				+  information. Resistance to more sophisticated anonymity threats is
			
 
				+  discussed below.
			
 
				 \item[Deployability:] Tor requires no specialized hardware. Tor
			
 
				   requires no kernel modifications; it runs in user space (currently
			
 
				   on Linux, various BSDs, and Windows). All of these imply a low
			
@@ -1542,17 +1542,21 @@ and its resistance to attacks.
 
				   Tor nodes have good relatively persistent net connectivity
			
 
				   (currently T1 or better);
			
 
				 % Is that reasonable to say? We haven't really discussed it -P.S.
			
 
				+% Roger thinks otherwise; he will fix this. -NM
			
 
				   however, there is no padding overhead, and operators can limit
			
 
				   bandwidth on any link.  Tor is freely available under the modified
			
 
				-  BSD license, and operators are able to choose there own exit
			
 
				-  strategies. These reduce legal and social liability barriers to
			
 
				+  BSD license, and operators are able to choose their own exit
			
 
				+  policies, thus reducing legal and social barriers to
			
 
				   running a node.
			
 
				   
			
 
				 \item[Usability:] As noted, Tor runs in user space. So does the onion
			
 
				-  proxy, which is easy to install and run. And SOCKS aware
			
 
				-  applications require nothing more than to be pointed at this proxy.
			
 
				+  proxy, which is comparatively easy to install and run. SOCKS-aware
			
 
				+  applications require nothing more than to be pointed at the onion
			
 
				+  proxy; other applications can be redirected to use SOCKS for their
			
 
				+  outgoing TCP connections by drop-in libraries such as tsocks.
			
 
				   
			
 
				-\item[Flexibility:] Tor's design and implementation is modular.  So,
			
 
				+\item[Flexibility:] Tor's design and implementation are fairly modular,
			
 
				+  so that,
			
 
				   for example, a scalable P2P replacement for the directory servers
			
 
				   would not substantially impact other aspects of the system.  Tor
			
 
				   runs on top of TCP, so design options that could not easily do so
			
@@ -1562,26 +1566,28 @@ and its resistance to attacks.
 
				   two systems, which seems to be relatively straightforward. This will
			
 
				   allow testing and direct comparison of the two rather different
			
 
				   designs.
			
 
				-
			
 
				+  % Do we want to say this?  I don't think we should talk about this
			
 
				+  % kind of discussion till we have more positive results.
			
 
				   
			
 
				 \item[Conservative design:] Tor opts for practicality when there is no
			
 
				   clear resolution of anonymity tradeoffs or practical means to
			
 
				   achieve resolution. Thus, we do not currently pad or mix; although
			
 
				   it would be easy to add either of these. Indeed, our system allows
			
 
				-  longrange and variable padding if this should ever be shown to have
			
 
				+  long-range and variable padding if this should ever be shown to have
			
 
				   a clear advantage.  Similarly, we do not currently attempt to
			
 
				-  resolve such issues as pseudospoofing to dominate the network except
			
 
				+  resolve such issues as Sybil attacks to dominate the network except
			
 
				   by such direct means as personal familiarity of director operators
			
 
				   with all node operators.
			
 
				 \end{tightlist}
			
 
				 
			
 
				-
			
 
				 \SubSection{Attacks and Defenses}
			
 
				 \label{sec:attacks}
			
 
				 
			
 
				 Below we summarize a variety of attacks and how well our design withstands
			
 
				 them.
			
 
				 
			
 
				+[XXX Note that some of these attacks are outside our threat model! -NM]
			
 
				+
			
 
				 \subsubsection*{Passive attacks}
			
 
				 \begin{tightlist}
			
 
				 \item \emph{Observing user traffic patterns.} Observations of connection
			
@@ -1599,143 +1605,157 @@ them.
 
				   websites may not be. Further, a responding website may itself be
			
 
				   considered an adversary. Filtering content is not a primary goal of
			
 
				   Onion Routing; nonetheless, Tor can directly make use of Privoxy and
			
 
				-  related services via SOCKS and thus provide their application data
			
 
				-  stream anonymization.
			
 
				-
			
 
				+  related filtering services via SOCKS and thus anonymize their
			
 
				+  application data streams.
			
 
				 
			
 
				 \item \emph{Option distinguishability.} Configuration options can be a
			
 
				   source of distinguishable patterns. In general there is economic
			
 
				   incentive to allow preferential services \cite{econymics}, and some
			
 
				   degree of configuration choice is a factor in attracting large
			
 
				-  numbers of users to provide anonymity. We offer a standardized set
			
 
				-  of client option configurations to maximize attractiveness of the
			
 
				-  system while minimizing affect on anonymity set size.
			
 
				-% This needs to go into the spec at least, yes? How else are we
			
 
				-% making this true? -PS
			
 
				+  numbers of users to provide anonymity.  So far, however, we have
			
 
				+  not found a compelling use case in Tor for any client-configurable
			
 
				+  options.  Thus, clients are currently distinguishable only by their
			
 
				+  behavior.
			
 
				   
			
 
				-\item \emph{End-to-end Timing correlation.} Onion Routing only
			
 
				-  minimally hides end-to-end timing correlations. If an attacker
			
 
				-  suspects communication between a given initiator and responder, and
			
 
				-  can watch patterns of traffic at the initiator end and the responder
			
 
				-  end, then he will be able to confirm the correspondence with high
			
 
				-  probability. The greatest protection currently against such
			
 
				-  confirmation is if the connection between the onion proxy and the
			
 
				-  first Tor node is hidden, e.g., because it is local or behind a
			
 
				-  firewall. Except for obscuring multiple users behind one such
			
 
				-  firewall, this just requires the observer to separate the traffic
			
 
				-  that terminates at the onion router from that which passes through
			
 
				-  it, and to filter the greater volume of terminating traffic than a
			
 
				-  single initiator would multiplex. We do not expect that to be a
			
 
				-  large problem for an attacker who can observe traffic at both ends
			
 
				-  of an application connection.
			
 
				+\item \emph{End-to-end Timing correlation.}  Tor only minimally hides
			
 
				+  end-to-end timing correlations. If an attacker can watch patterns of
			
 
				+  traffic at the initiator end and the responder end, then he will be
			
 
				+  able to confirm the correspondence with high probability. The
			
 
				+  greatest protection currently against such confirmation is if the
			
 
				+  connection between the onion proxy and the first Tor node is hidden,
			
 
				+  possibly because it is local or behind a firewall.  This approach
			
 
				+  requires an observer to separate traffic originating at the onion
			
 
				+  router from traffic that passes through it.  We still do not, however,
			
 
				+  predict this approach to be a large problem for an attacker who can
			
 
				+  observe traffic at both ends of an application connection.
			
 
				   
			
 
				 \item \emph{End-to-end Size correlation.} Simple packet counting
			
 
				-  without timing consideration will also be somewhat effective in
			
 
				-  confirming endpoints of a connection through Onion Routing; although
			
 
				-  slightly less so. This is because, even without padding, the leaky
			
 
				-  pipe topology means different numbers of packets may enter one end
			
 
				-  of a circuit than exit at the other.
			
 
				+  without timing consideration will also be effective in confirming
			
 
				+  endpoints of a connection through Onion Routing; although slightly
			
 
				+  less so. This is because, even without padding, the leaky pipe
			
 
				+  topology means different numbers of packets may enter one end of a
			
 
				+  circuit than exit at the other.
			
 
				   
			
 
				 \item \emph{Website fingerprinting.} All the above passive
			
 
				   attacks that are at all effective are traffic confirmation attacks.
			
 
				   This puts them outside our general design goals. There is also
			
 
				-  passive traffic analysis attack that is potentially effective.
			
 
				-  Instead of searching far end connections for timing and volume
			
 
				+  a passive traffic analysis attack that is potentially effective.
			
 
				+  Instead of searching exit connections for timing and volume
			
 
				   correlations it is possible to build up a database of
			
 
				-  ``fingerprints'' for large numbers of websites. If one now wants to
			
 
				+  ``fingerprints'' containing file sizes and access patterns for a
			
 
				+  large number of interesting websites. If one now wants to
			
 
				   monitor the activity of a user, it may be possible to confirm a
			
 
				-  connection to a site simply by consulting the database. This has
			
 
				+  connection to a site simply by consulting the database. This attack has
			
 
				   been shown to be effective against SafeWeb \cite{hintz-pet02}. Onion
			
 
				   Routing is not as vulnerable as SafeWeb to this attack: There is the
			
 
				   possibility that multiple streams are exiting the circuit at
			
 
				-  different places concurrently.  Also, fingerprinting is limited to
			
 
				+  different places concurrently.  Also, fingerprinting will be limited to
			
 
				   the granularity of cells, currently 256 bytes. Larger cell sizes
			
 
				   and/or minimal padding schemes that group websites into large sets
			
 
				-  are possible responses.  But this remains an open problem. Note that
			
 
				+  are possible responses.  But this remains an open problem.  Link
			
 
				+  padding or long-range dummies may also make fingerprints harder to
			
 
				+  detect. (Note that
			
 
				   such fingerprinting should not be confused with the latency attacks
			
 
				   of \cite{back01}. Those require a fingerprint of the latencies of
			
 
				   all circuits through the network, combined with those from the
			
 
				-  network edges to the targetted user and the responder website. While
			
 
				+  network edges to the targeted user and the responder website. While
			
 
				   these are in principle feasible and surprises are always possible,
			
 
				   these constitute a much more complicated attack, and there is no
			
 
				-  current evidence of their practicality.
			
 
				-
			
 
				+  current evidence of their practicality.)
			
 
				 
			
 
				-\item Content analysis. Not our main thing, but, Privoxy to
			
 
				-  anonymization of data stream.
			
 
				+\item \emph{Content analysis.}  Tor explicitly provides no content
			
 
				+  rewriting for any protocol at a higher level than TCP.  When
			
 
				+  protocol cleaners are available, however (as Privoxy is for HTTP),
			
 
				+  Tor can integrate them in order to address these attacks.
			
 
				 
			
 
				 \end{tightlist}
			
 
				 
			
 
				 \subsubsection*{Active attacks}
			
 
				 \begin{tightlist}
			
 
				-\item \emph{Key compromise.} Onion Routing makes use of several kinds
			
 
				-  of keys.  Links between Tor nodes are protected by TLS negotiated
			
 
				-  session keys over which all traffic is multiplexed.  Long-term
			
 
				-  signature keys sign information about Tor nodes, directory servers
			
 
				-  and the like. Medium-term encryption keys are used to send a
			
 
				-  Diffie-Hellman key from an onion proxy to an onion router. And,
			
 
				-  session keys encrypt traffic between onion routers and the onion
			
 
				-  proxy. Session key compromise will obviate for the lifetime of the
			
 
				-  circuit the change in appearance of cells on a circuit passing
			
 
				-  through a specific onion router if that compromise is done by the
			
 
				-  immediate neighboring onion routers in a circuit. Compromise of the
			
 
				-  mid-term keys will result in a similar compromise of all session
			
 
				-  keys until the mid-term key changes. Note that, because of perfect
			
 
				-  forward secrecy, this does not affect previously established keys or
			
 
				-  indeed any session keys unless the node is also compromised.
			
 
				-  Compromise of a long-term key means that all information about a
			
 
				-  node can be forged following the compromise. This includes what the
			
 
				-  correct mid-term keys are, and in the case of directory servers,
			
 
				-  information about which nodes are in the network, which keys they
			
 
				-  are current for those nodes, etc.
			
 
				-
			
 
				+\item \emph{Key compromise.}  We consider the impact of a compromise
			
 
				+  for each type of key in turn, from the shortest- to the
			
 
				+  longest-lived.  If a circuit session key is compromised, the
			
 
				+  attacker can unwrap a single layer of encryption from the relay
			
 
				+  cells traveling along that circuit.  (Only nodes on the circuit can
			
 
				+  see these cells.)  If a TLS session key is compromised, an attacker
			
 
				+  can view all the cells on the TLS connection until the key is
			
 
				+  renegotiated.  (These cells are themselves encrypted.)  If a TLS
			
 
				+  private key is compromised, the attacker can fool others into
			
 
				+  thinking that he is the affected OR, but still cannot accept any
			
 
				+  connections.  If an onion private key is compromised, the attacker
			
 
				+  can impersonate the OR in circuits, but only if the attacker has
			
 
				+  also compromised the OR's TLS private key, or is running the
			
 
				+  previous OR in the circuit.  (This compromise affects newly created
			
 
				+  circuits, but because of perfect forward secrecy, the attacker
			
 
				+  cannot hijack old circuits without compromising their session keys.)
			
 
				+  In any case, an attacker can only take advantage of a compromise in
			
 
				+  these mid-term private keys until they expire.  Only by
			
 
				+  compromising a node's identity key can an attacker replace that
			
 
				+  node indefinitely, by sending new forged mid-term keys to the
			
 
				+  directories.  Finally, an attacker who can compromise a
			
 
				+  \emph{directory's} identity key can influence every client's view
			
 
				+  of the network---but only to the degree made possible by gaining a
			
 
				+  vote with the rest of the directory servers.
			
 
				+
			
 
				+\item \emph{Iterated compromise.} A roving adversary who can
			
 
				+  compromise ORs (by system intrusion, legal coercion, or extralegal
			
 
				+  coercion) could march down the length of a circuit compromising the
			
 
				+  nodes until he reaches the end.  Unless the adversary can complete
			
 
				+  this attack within the lifetime of the circuit, however, the ORs
			
 
				+  will have discarded the necessary information before the attack can
			
 
				+  be completed.  (Thanks to the perfect forward secrecy of session
			
 
				+  keys, the attacker cannot force nodes to decrypt recorded
			
 
				+  traffic once the circuits have been closed.)
			
 
				   
			
 
				-\item \emph{Iterated subpoena.} A roving adversary can march down the
			
 
				-  length of a circuit compromising the nodes until he reaches both of
			
 
				-  the endpoints.  In \cite{or-pet00} the algorithmic structure of this
			
 
				-  attack was described. But, only the unlikely case of compromise
			
 
				-  during the lifetime of a circuit was considered. Far more likely is
			
 
				-  that nodes in a circuit will be compromised after the fact, by legal
			
 
				-  means, rubber-hose cryptanalysis, etc. Perfect forward secrecy of
			
 
				-  session keys makes this attack unaffective against Tor as long as
			
 
				-  Diffie-Hellman keys are discarded as soon as they are no longer
			
 
				-  needed.
			
 
				-  
			
 
				-\item \emph{Run recipient.} By running a Web server, an adversary can
			
 
				-  try to identify the initiator of connections to it and possibly also
			
 
				-  attrack users to itself by providing attractive content. There is
			
 
				-  always a danger that the application protocols and associated
			
 
				-  programs can be induced to reveal information about the initiator's
			
 
				-  system. This is not directly in Onion Routing's protection area, so,
			
 
				-  to the extent it is a concern, we are dependent on Privoxy and
			
 
				-  others to keep up with the issue. A Web server can also attempt to
			
 
				-  provide recognizable volume and timing signatures. This is simply a
			
 
				-  stronger version of the passive confirmation adversary against which
			
 
				-  we already acknowledged vulnerability.
			
 
				+\item \emph{Run a recipient.} By running a Web server, an adversary
			
 
				+  trivially learns the timing patterns of those connecting to it, and
			
 
				+  can introduce arbitrary patterns in its responses.  This can greatly
			
 
				+  facilitate end-to-end attacks: If the adversary can induce certain
			
 
				+  users to connect to his webserver (perhaps by providing
			
 
				+  content targeted at those users), he now holds one end of their
			
 
				+  connection.  Additionally, there is a danger that the application
			
 
				+  protocols and associated programs can be induced to reveal
			
 
				+  information about the initiator.  This is not directly in Onion
			
 
				+  Routing's protection area, so we are dependent on Privoxy and
			
 
				+  similar protocol cleaners to solve the problem.
			
 
				   
			
 
				 \item \emph{Run an onion proxy.} It is expected that end users will
			
 
				   nearly always run their own local onion proxy. However, in some
			
 
				-  settings, it may be necessary for the proxy to run remotely.
			
 
				-  Typically this would be in a secure setting where it was necessary
			
 
				-  to monitor the activity of those connecting to the proxy. But, if
			
 
				-  the onion proxy is compromised, then all future connections through
			
 
				-  it are completely compromised.
			
 
				+  settings, it may be necessary for the proxy to run
			
 
				+  remotely---typically, in an institutional setting where it was
			
 
				+  necessary to monitor the activity of those connecting to the proxy.
			
 
				+  The drawback, of course, is that if the onion proxy is compromised,
			
 
				+  then all future connections through it are completely compromised.
			
 
				+
			
 
				+\item \emph{DoS non-observed nodes.} An observer who can observe some
			
 
				+  of the Tor network can increase the value of this traffic analysis
			
 
				+  if it can attack non-observed nodes to shut them down, reduce
			
 
				+  their reliability, or persuade users that they are not trustworthy.
			
 
				+  The best defense here is robustness.
			
 
				   
			
 
				-\item \emph{Run a hostile node.} A hostile node can reveal everything
			
 
				-  about circuits passing through it. It can also create circuits
			
 
				-  through itself to affect traffic at other nodes. Its ability to
			
 
				-  directly DoS a neighbor is now limited by bandwidth throttling. It
			
 
				-  can enhance the amount of network traffic it can see by attacking
			
 
				-  other nodes sufficiently to shut them down or greatly reduce their
			
 
				-  service. Nonetheless, in terms of compromising anonymity of the
			
 
				-  endpoints of a circuit by its observations, a hostile node is only
			
 
				-  significant if it is immediately adjacent to that endpoint.
			
 
				+\item \emph{Run a hostile node.}  In addition to the abilities of a
			
 
				+  local observer, an isolated hostile node can create circuits through
			
 
				+  itself, or alter traffic patterns, in order to affect traffic at
			
 
				+  other nodes. Its ability to directly DoS a neighbor is now limited
			
 
				+  by bandwidth throttling. Nonetheless, in order to compromise the
			
 
				+  anonymity of the endpoints of a circuit by its observations, a
			
 
				+  hostile node is only significant if it is immediately adjacent to
			
 
				+  that endpoint. 
			
 
				   
			
 
				+\item \emph{Run multiple hostile nodes.}  If an adversary is able to
			
 
				+  run multiple ORs, and is able to persuade the directory servers
			
 
				+  that those ORs are trustworthy and independent, then occasionally
			
 
				+  some user will choose one of those ORs for the start and another of
			
 
				+  those ORs as the end of a circuit.  When this happens, the user's
			
 
				+  anonymity is compromised for those circuits.  If an adversary can
			
 
				+  control $m$ out of $N$ nodes, he will be able to correlate at most 
			
 
				+  $\frac{m}{N}$ of the traffic in this way.
			
 
				+
			
 
				 \item \emph{Compromise entire path.} Anyone compromising both
			
 
				   endpoints of a circuit can confirm this with high probability. If
			
 
				   the entire path is compromised, this becomes a certainty; however,
			
 
				-  the added benefit to the adversary of such an attack is such that it
			
 
				-  is most likely only as a coincidence.
			
 
				+  the added benefit to the adversary of such an attack is small in
			
 
				+  relation to the difficulty.
			
 
				   
			
 
				 \item \emph{Run a hostile directory server.} Directory servers control
			
 
				   admission to the network. However, because the network directory
			
@@ -1746,9 +1766,7 @@ them.
 
				   bandwidth limited; however, it is possible to open up sufficient
			
 
				   numbers of circuits that converge at a single onion router to
			
 
				   overwhelm its network connection, its ability to process new
			
 
				-  circuits or both. This threat is diminished by router twins since
			
 
				-  now the attack must be run on all twins of the attacked node to be
			
 
				-  successful.
			
 
				+  circuits or both.
			
 
				 
			
 
				 %OK so I noticed that twins are completely removed from the paper above,
			
 
				 % but it's after 5 so I'll leave that problem to you guys. -PS
			
@@ -1758,11 +1776,10 @@ them.
 
				   
			
 
				 \item \emph{Tagging attacks.} A hostile node could try to ``tag'' a
			
 
				   cell by altering it. This would render it unreadable, but if the
			
 
				-  connection is, e.g., an unencrypted one to a Web site, the garbled
			
 
				-  content coming out at the appropriate time could confirm the
			
 
				-  association. However, integrity checks on cells will prevent this
			
 
				-  from succeeding.
			
 
				-
			
 
				+  connection is, for example, an unencrypted request to a Web site,
			
 
				+  the garbled content coming out at the appropriate time could confirm
			
 
				+  the association. However, integrity checks on cells prevent
			
 
				+  this attack from succeeding.
			
 
				 
			
 
				 [XXXX Damn it's 5:10. So, I'm stopping here. Good luck with what's left
			
 
				 tonight. Hopefully less than it looks. -PS]
			
@@ -1827,9 +1844,6 @@ Pull attacks and defenses into analysis as a subsection
 
				 \Section{Open Questions in Low-latency Anonymity}
			
 
				 \label{sec:maintaining-anonymity}
			
 
				  
			
 
				-
			
 
				-
			
 
				-
			
 
				 % There must be a better intro than this! -NM
			
 
				 In addition to the open problems discussed in
			
 
				 section~\ref{subsec:non-goals}, many other questions remain to be