|
@@ -0,0 +1,115 @@
|
|
|
+Filename: 163-detecting-clients.txt
|
|
|
+Title: Detecting whether a connection comes from a client
|
|
|
+Author: Nick Mathewson
|
|
|
+Created: 22-May-2009
|
|
|
+Target: 0.2.2
|
|
|
+Status: Open
|
|
|
+
|
|
|
+
|
|
|
+Overview:
|
|
|
+
|
|
|
+ Some aspects of Tor's design require relays to distinguish
|
|
|
+ connections from clients from connections that come from relays.
|
|
|
+ The existing means for doing this is easy to spoof. We propose
|
|
|
+ a better approach.
|
|
|
+
|
|
|
+Motivation:
|
|
|
+
|
|
|
+ There are at least two reasons for which Tor servers want to tell
|
|
|
+ which connections come from clients and which come from other
|
|
|
+ servers:
|
|
|
+
|
|
|
+ 1) Some exits, proposal 152 notwithstanding, want to disallow
|
|
|
+ their use as single-hop proxies.
|
|
|
+ 2) Some performance-related proposals involve prioritizing
|
|
|
+ traffic from relays, or limiting traffic per client (but not
|
|
|
+ per relay).
|
|
|
+
|
|
|
+ Right now, we detect client vs server status based on how the
|
|
|
+ client opens circuits. (Check out the code that implements the
|
|
|
+ AllowSingleHopExits option if you want all the details.) This
|
|
|
+ method is depressingly easy to fake, though. This document
|
|
|
+ proposes better means.
|
|
|
+
|
|
|
+Goals:
|
|
|
+
|
|
|
+ To make grabbing relay privileges at least as difficult as just
|
|
|
+ running a relay.
|
|
|
+
|
|
|
+ In the analysis below, "using server privileges" means taking any
|
|
|
+ action that only servers are supposed to take, like delivering a
|
|
|
+ BEGIN cell to an exit node that doesn't allow single hop exits,
|
|
|
+ or claiming server-like amounts of bandwidth.
|
|
|
+
|
|
|
+Passive detection:
|
|
|
+
|
|
|
+ A connection is definitely a client connection if it uses one of
|
|
|
+ the TLS methods during setup that does not establish an identity
|
|
|
+ key.
|
|
|
+
|
|
|
+ A circuit is definitely a client circuit if it is initiated with
|
|
|
+ a CREATE_FAST cell, though the node could be a client or a server.
|
|
|
+
|
|
|
+ A node that's listed in a recent consensus is probably a server.
|
|
|
+
|
|
|
+ A node to which we have successfully extended circuits from
|
|
|
+ multiple origins is probably a server.
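The four passive rules above can be sketched as a small classifier. This is only an illustration, not Tor's implementation: the names `PeerObservations`, `classify_connection`, and `classify_circuit` are hypothetical, and the default of two origins for "multiple" is an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    DEFINITELY_CLIENT = "definitely a client"
    PROBABLY_SERVER = "probably a server"
    UNKNOWN = "unknown"


@dataclass
class PeerObservations:
    # Hypothetical field names; each mirrors one passive rule.
    tls_established_identity_key: bool
    in_recent_consensus: bool
    distinct_extend_origins: int  # origins whose circuits we extended to this node


def classify_connection(obs: PeerObservations, min_origins: int = 2) -> Verdict:
    """Apply the connection-level passive rules in order of certainty."""
    if not obs.tls_established_identity_key:
        return Verdict.DEFINITELY_CLIENT
    if obs.in_recent_consensus:
        return Verdict.PROBABLY_SERVER
    if obs.distinct_extend_origins >= min_origins:
        return Verdict.PROBABLY_SERVER
    return Verdict.UNKNOWN


def classify_circuit(obs: PeerObservations, started_with_create_fast: bool,
                     min_origins: int = 2) -> Verdict:
    """A CREATE_FAST circuit is a client circuit even if the node is a server."""
    if started_with_create_fast:
        return Verdict.DEFINITELY_CLIENT
    return classify_connection(obs, min_origins)
```

The circuit-level check runs first because a CREATE_FAST circuit is a client circuit regardless of what the node itself turns out to be.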
|
|
|
+
|
|
|
+Active detection:
|
|
|
+
|
|
|
+ If a node doesn't try to use server privileges at all, we never
|
|
|
+ need to care whether it's a server.
|
|
|
+
|
|
|
+ When a node or circuit tries to use server privileges, if it is
|
|
|
+ "definitely a client" as per above, we can refuse it immediately.
|
|
|
+
|
|
|
+ If it's "probably a server" as per above, we can accept it.
|
|
|
+
|
|
|
+ Otherwise, we have either a client, or a server that is neither
|
|
|
+ listed in any consensus nor used by any other clients -- in other
|
|
|
+ words, a new or private server.
|
|
|
+
|
|
|
+ For these nodes, we should attempt to build one or more test
|
|
|
+ circuits through them. If enough of the circuits succeed, the
|
|
|
+ node is a real relay. If not, it is probably a client.
|
|
|
+
|
|
|
+ While we are waiting for the test circuits to succeed, we should
|
|
|
+ allow a short grace period in which server privileges are
|
|
|
+ permitted. When a test is done, we should remember its outcome
|
|
|
+ for a while, so we don't need to do it again.
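As a rough sketch of the decision procedure described in this section, the function below maps a passive verdict and an optional test result to an action. All names (`Passive`, `Decision`, `decide`) are hypothetical, and the default of two required circuits is an assumption, since the proposal leaves the threshold open.

```python
from enum import Enum
from typing import Optional


class Passive(Enum):
    DEFINITELY_CLIENT = 1
    PROBABLY_SERVER = 2
    UNKNOWN = 3


class Decision(Enum):
    REFUSE = "refuse"
    ACCEPT = "accept"
    ACCEPT_AND_TEST = "accept provisionally and launch test circuits"


def decide(passive: Passive,
           circuits_succeeded: Optional[int] = None,
           circuits_needed: int = 2) -> Decision:
    """Decide what to do when a node tries to use server privileges.

    circuits_succeeded stays None until the test circuits have finished.
    """
    if passive is Passive.DEFINITELY_CLIENT:
        return Decision.REFUSE
    if passive is Passive.PROBABLY_SERVER:
        return Decision.ACCEPT
    if circuits_succeeded is None:
        # New or private server, or a client: test it, allowing a grace period.
        return Decision.ACCEPT_AND_TEST
    if circuits_succeeded >= circuits_needed:
        return Decision.ACCEPT
    return Decision.REFUSE
```

Nodes that never request server privileges never reach this function, which matches the observation that we do not need to classify them at all.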
|
|
|
+
|
|
|
+Why it's hard to do good testing:
|
|
|
+
|
|
|
+ Doing a test circuit starting with an unlisted router requires
|
|
|
+ only that we have an open connection to it. Doing a test
|
|
|
+ circuit starting elsewhere _through_ an unlisted router -- though
|
|
|
+ more reliable -- would require that we have a known address, port,
|
|
|
+ identity key, and onion key for the router. Only the address and
|
|
|
+ identity key are easily available via the current Tor protocol in
|
|
|
+ all cases.
|
|
|
+
|
|
|
+ We could fix this part by requiring that all servers support
|
|
|
+ BEGIN_DIR and support downloading at least a current descriptor
|
|
|
+ for themselves.
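To make the information gap concrete, the sketch below lists which kinds of test circuit are possible given what we know about a router. The `RouterInfo` type, the function name, and the test labels are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RouterInfo:
    # Everything we might know about an unlisted router; None means unknown.
    address: Optional[str] = None
    or_port: Optional[int] = None
    identity_key: Optional[bytes] = None
    onion_key: Optional[bytes] = None


def feasible_tests(info: RouterInfo, have_open_connection: bool) -> List[str]:
    """Return the kinds of test circuit we have enough information to build."""
    tests = []
    if have_open_connection:
        # Starting a circuit at the router needs only the open connection.
        tests.append("start_at")
    needed = (info.address, info.or_port, info.identity_key, info.onion_key)
    if all(value is not None for value in needed):
        # Extending from elsewhere needs all four pieces of information.
        tests.append("extend_through")
    return tests
```

With only an address and identity key, `feasible_tests` returns just `["start_at"]`; fetching the router's own descriptor would fill in the port and onion key and so enable the more reliable test.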
|
|
|
+
|
|
|
+Open questions:
|
|
|
+
|
|
|
+ What are the thresholds for the number of circuits needed
|
|
|
+ for us to decide that a node is a relay?
|
|
|
+
|
|
|
+ [Suggested answer: two circuits from two distinct hosts.]
|
|
|
+
|
|
|
+ How do we pick grace periods? How long do we remember the
|
|
|
+ outcome of a test?
|
|
|
+
|
|
|
+ [Suggested answer: 10 minute grace period; 48 hour memory of
|
|
|
+ test outcomes.]
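One possible way to apply the suggested 10 minute grace period and 48 hour memory is sketched below. The `ProbeRecord` type and `privileges_allowed` function are hypothetical names, and the choice to refuse once the grace period lapses without a result is an assumption rather than something the proposal specifies.

```python
from dataclasses import dataclass
from typing import Optional

GRACE_PERIOD_SECS = 10 * 60       # suggested answer: 10 minute grace period
TEST_MEMORY_SECS = 48 * 60 * 60   # suggested answer: 48 hour memory


@dataclass
class ProbeRecord:
    started_at: float                    # when the test circuits were launched
    finished_at: Optional[float] = None  # None while the test is still running
    passed: Optional[bool] = None        # outcome, once finished


def privileges_allowed(rec: Optional[ProbeRecord], now: float) -> Optional[bool]:
    """Return True or False when we can decide, or None if a new test is needed."""
    if rec is None:
        return None
    if rec.finished_at is not None:
        if now - rec.finished_at <= TEST_MEMORY_SECS:
            return rec.passed  # remembered outcome is still fresh
        return None            # outcome expired; test again
    # Test still running: permit privileges only within the grace period.
    return now - rec.started_at <= GRACE_PERIOD_SECS
```

Times are passed in explicitly as seconds so the logic can be exercised without a real clock.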
|
|
|
+
|
|
|
+ If we can build circuits starting at a suspect node, but we don't
|
|
|
+ have enough information to try extending circuits elsewhere
|
|
|
+ through the node, should we conclude that the node is
|
|
|
+ "server-like" or not?
|
|
|
+
|
|
|
+ [Suggested answer: for now, just try making circuits through
|
|
|
+ the node. Extend this to extending circuits as needed.]
|
|
|
+
|