15 years ago · 1fb3a60f54
--- a/doc/spec/proposals/ideas/xxx-pluggable-transport.txt
+++ b/doc/spec/proposals/ideas/xxx-pluggable-transport.txt
@@ -0,0 +1,430 @@
 
				+Filename: xxx-pluggable-transport.txt
			
 
				+Title: Pluggable transports for circumvention
			
 
				+Author: Jacob Appelbaum, Nick Mathewson
			
 
				+Created: 15-Oct-2010
			
 
				+Status: Draft
			
 
				+
			
 
				+Overview
			
 
				+
			
 
				+  This is a document about transport plugins; it does not cover
			
 
				+  discovery, or bridgedb improvements. Each transport plugin
			
 
				+  specification should make clear any external requirements but those
			
 
				+  are generally out of scope if they fall into discovery or
			
 
				+  infrastructure components.
			
 
				+
			
 
				+  We should include a description of how to write a good set of plugins,
			
 
				+  how to evaluate and how to classify a plugin. For example, if a plugin
			
 
				+  is said to be hard to detect on the wire even if you know what it is
			
 
				+  and how it works, it should say so. If it is easy to detect, it may
			
 
				+  still work on a given network, but it is perhaps not well hidden, or
			
 
				+  is filtered automatically. Detection and blocking are not always the same
			
 
				+  thing right off. In both cases, a plugin should be quite clear about
			
 
				+  its security claims.
			
 
				+
			
 
				+Target use-cases[a][b]
			
 
				+
			
 
				+  Here's some stuff we want to be able to support.  We're listing these
			
 
				+  in the draft to try to define the problem space.  We won't put this
			
 
				+  section in the final version.
			
 
				+
			
 
				+  1. The 'obfuscated SSH' superencipherment:
			
 
				+   http://github.com/brl/obfuscated-openssh/blob/master/README.obfuscation
			
 
				+
			
 
				+  2. Big P2P-network style transports where instead of connecting to a
			
 
				+   bridge at a known IP, you connect to a bridge by a username, a public
			
 
				+   key, or whatever.
			
 
				+
			
 
				+     1. We need the ability to have two kinds of proxies - one for
			
 
				+       incoming connections and one for outgoing connections. [Sure, but
			
 
				+       that's about how we implement stuff arg arg dumb touchpad -NM]
			
 
				+
			
 
				+        1. Probably we want to have the ability to get connections
			
 
				+           any way we'll take them
			
 
				+
			
 
				+       2. So, bridges use the incoming kind, and clients use the outgoing
			
 
				+          kind? Sounds right. -N
			
 
				+           1. Probably also we're a multi-plexed incoming kind of Tor
			
 
				+           relay - so we should take connections from say localhost's
			
 
				+           little helper and also, we should take connections from
			
 
				+           external ips. This would be useful to identify though. I think
			
 
				+           this is how we would already work as of today.
			
 
				+
			
 
				+            1. You mean, regular non-bridge relays should support this
			
 
				+            too?  I hadn't considered that.  It has seemed pointless
			
 
				+            because of IP blocking, but if we have a p2p transport, it
			
 
				+            would be useful for regular relays to allow it.  Yes -io
			
 
				+
			
 
				+              1. Also it would be nice for stats purposes to ensure that
			
 
				+              we know what kinds of connections we're handling, even if
			
 
				+              we basically treat them exactly the same. Perhaps Karsten
			
 
				+              wants to weigh in on how we should have Tor handle these
			
 
				+              things? I guess we'll really fuck up his stats collection
			
 
				+              if all of a sudden he's getting lots of connections from
			
 
				+              127.0.0.1...
			
 
				+
			
 
				+   1. Various protocol-impersonation tools
			
 
				+      1. NSTX, iodine, OzymanDNS or such, for the lulz.
			
 
				+          1. DNS tunneling of many types - eg: TXT records or the NULL
			
 
				+             protocol trick
			
 
				+      1. HTTP -- many kinds are possible, some may even be right
			
 
				+         1. HTTP POST requests are implemented in Firepass
			
 
				+      1. FTP
			
 
				+         1. Perhaps some kind of anonymous ftp login with sending and
			
 
				+           receiving of data would be useful?
			
 
				+            1. Lots to think about before designing off the cuff crappy
			
 
				+               protocol covert channels
			
 
				+      1. NTP
			
 
				+        1. Hardly anyone knows about NTP these days - it's almost always
			
 
				+           outbound allowed and it's usually not well inspected
			
 
				+           1. That makes it good for short-term circumvention, but bad
			
 
				+              for long-term hiding.
			
 
				+      1. Triangle-boy
			
 
				+      2. IPSec look-alike
			
 
				+      3. UDP
			
 
				+      4. IPv6
			
 
				+    1. A forged-RST-ignoring tool
			
 
				+       1. A forged-RST-ignoring tool that pretends that it is getting all
			
 
				+         of its connections closed and retrying all the time, when really
			
 
				+         it is just carrying on with business as usual.  Hooray for
			
 
				+         crypto.
			
 
				+         1. Perhaps it's a good idea to mention CCTT?
			
 
				+    1. What else goes here?
			
 
				+      1. We should ask Nextgens about protocol filters from Freenet
			
 
				+      2. http://gray-world.net/papers.shtml
			
 
				+      3. http://gray-world.net/pr_cook_cc.shtml
			
 
				+      4. http://gray-world.net/pr_firepass.shtml
			
 
				+      5. We should ensure we cover the topics and lessons learned from
			
 
				+        "FIREWALL RESISTANCE TO METAFEROGRAPHY IN NETWORK COMMUNICATIONS"
			
 
				+      - see
			
 
				+      https://ritdml.rit.edu/bitstream/handle/1850/12272/RSavacoolThesis5-21-2010.pdf
			
 
				+
			
 
				+  Here's some stuff that seems out-of-scope:
			
 
				+
			
 
				+   1. A generic firewall-breaker that works with all Tor nodes and
			
 
				+    bridges.  Like, if you're using a VPN to get through your firewall,
			
 
				+    and it lets you connect to any Tor node, you can just use it without
			
 
				+    any special plug-in support.  I think this spec is just for stuff
			
 
				+    that requires buy-in from the server side of the connection.  Agreed?
			
 
				+
			
 
				+  1. Yeah - I think we should simply codify the proxy stuff to ensure
			
 
				+    that we plan to remain pluggable for incoming and outgoing connections
			
 
				+    in some formal way.
			
 
				+
			
 
				+  I'm uncertain if we want to support stuff like:
			
 
				+
			
 
				+  1. An ssh tunnel that uses openssh to tunnel raw tor packets, with no
			
 
				+  actual TLS going on underneath.  Promising, but risky. -NM
			
 
				+
			
 
				+  1. I think there isn't much to gain by doing this but perhaps so - we
			
 
				+  are too dependent on TLS and our certs are trivial to fingerprint -io
			
 
				+
			
 
				+  1. Also, Tor-over-TLS-tunneled-over-SSH looks even weirder than
			
 
				+   Tor-over-SSH. -N
			
 
				+
			
 
				+  2. It might be nice to allow certs [cn] fields to be configurable by
			
 
				+  bridge nodes? -io
			
 
				+
			
 
				+   1. If we allowed "raw traffic" transports, a transport could get this
			
 
				+   trivially by implementing TLS with the right certs. -NM
			
 
				+
			
 
				+   1. perhaps we just want a "raw traffic port" where we connect to pass
			
 
				+   around cells? thoughts?
			
 
				+
			
 
				+  1. A bridge-discovery-and-round-robin p2p tool that connects you to a
			
 
				+   randomly chosen one of an unknown number of bridges.
			
 
				+
			
 
				+  1. Stackable plugins
			
 
				+     1. Tor over DNS over HTTP Post over Obfuscated Tor to reach the Tor
			
 
				+       network to read a copy of uncensored Google News.
			
 
				+        1. Christ, what the fuck world are we building? Or even more,
			
 
				+        what kind of world are we resisting?
			
 
				+     1. More like RST-drop plus sshobfs over HTTP over VPN.
			
 
				+
			
 
				+
			
 
				+Goals & Motivation
			
 
				+
			
 
				+  Frequently, people want to try a novel circumvention method to help
			
 
				+  users connect to Tor bridges.  Some of these methods are already
			
 
				+  pretty easy to deploy: if the user knows an unblocked VPN or open
			
 
				+  SOCKS proxy, they can just use that with the Tor client today.
			
 
				+
			
 
				+  Less easy to deploy are methods that require participation by both the
			
 
				+  client and the bridge.  In order of increasing sophistication, we
			
 
				+  might want to support:
			
 
				+
			
 
				+  1. A protocol obfuscation tool that transforms the output of a TLS
			
 
				+     connection into something that looks like HTTP as it leaves the client,
			
 
				+     and back to TLS as it arrives at the bridge.
			
 
				+  2. An additional authentication step that a client would need to
			
 
				+     perform for a given bridge before being allowed to connect.
			
 
				+  3. An information passing system that uses a side-channel in some
			
 
				+     existing protocol to convey traffic between a client and a bridge
			
 
				+     without the two of them ever communicating directly.
			
 
				+  4. A set of clients to tunnel client->bridge traffic over an existing
			
 
				+     large p2p network, such that the bridge is known by an identifier
			
 
				+     in that network rather than by an IP address.
			
 
				+
			
 
				+  We could in theory support these almost fine with Tor as it stands
			
 
				+  today: every Tor client can take a SOCKS proxy to use for its outgoing
			
 
				+  traffic, so a suitable client proxy could handle the client's traffic
			
 
				+  and connections on its behalf, while a corresponding program on the
			
 
				+  bridge side could handle the bridge's side of the protocol
			
 
				+  transformation.  Nevertheless, there are some reasons to add support
			
 
				+  for transport plugins to Tor itself:
			
 
				+
			
 
				+  1. It would be good for bridges to have a standard way to advertise
			
 
				+     which transports they support, so that clients can have multiple
			
 
				+     local transport proxies, and automatically use the right one for
			
 
				+     the right bridge.
			
 
				+
			
 
				+  2. There are some changes to our architecture that we'll need for a
			
 
				+     system like this to work.  For testing purposes, if a bridge blocks
			
 
				+     off its regular ORPort and instead has an obfuscated ORPort, the
			
 
				+     bridge authority has no way to test it.  Also, unless the bridge
			
 
				+     has some way to tell that the bridge-side proxy at 127.0.0.1 is not
			
 
				+     the origin of all the connections it is relaying, it might decide
			
 
				+     that there are too many connections from 127.0.0.1, and start
			
 
				+     paring them down to avoid a DoS.
			
 
				+
			
 
				+  3.
			
 
				+  4. (what else?)
			
 
				+
			
 
				+Non-Goals
			
 
				+
			
 
				+  We're not going to talk about automatic verification of plugin
			
 
				+  correctness and safety via sandboxing, proof-carrying code, or
			
 
				+  whatever.
			
 
				+
			
 
				+  We need to do more with discovery and distribution, but that's not
			
 
				+  what this proposal is about.  We're pretty convinced that the problems
			
 
				+  are sufficiently orthogonal that we should be fine so long as we don't
			
 
				+  preclude a single program from implementing both transport and
			
 
				+  discovery extensions.
			
 
				+
			
 
				+  This proposal is not about what transport plugins are the best ones
			
 
				+  for people to write.
			
 
				+
			
 
				+  We've considered issues involved with completely replacing Tor's TLS
			
 
				+  with another encryption layer, rather than layering it inside the
			
 
				+  obfuscation layer.  We describe how to do this in an appendix to the
			
 
				+  current proposal, though we are not currently sure whether it's a good
			
 
				+  idea to implement.
			
 
				+
			
 
				+Design overview
			
 
				+
			
 
				+  Clients run one or more "Transport client" programs that act like
			
 
				+  SOCKS proxies.  They accept connections on localhost on different
			
 
				+  ports. Each one implements one or more transport methods.  Parameters
			
 
				+  are passed from Tor inside the regular username/password parts of the
			
 
				+  SOCKS protocol.
			
 
				+
			
 
				+  Bridges (and maybe relays) run one or more programs that act like
			
 
				+  stunnel-server (or whatever the option is): they get connections from
			
 
				+  the network (typically by listening for connections on the network)
			
 
				+  and relay them to the Bridge's real ORPort.
			
 
				+
			
 
				+  1. The bridge needs to know which methods these servers support
			
 
				+
			
 
				+  1. The bridge needs to advertise this fact some way that the clients
			
 
				+  will find out about it--probably by sticking it in its bridge
			
 
				+  descriptor so that the bridgedb can find out and see that the clients
			
 
				+  get informed.
			
 
				+
			
 
				+  2. Somebody needs to launch these programs
			
 
				+
			
 
				+  3. The bridge may want to just not have a public ORPort at all.
			
 
				+
			
 
				+  4. The bridge may not want to advertise a real IP at all
			
 
				+
			
 
				+  5. The bridge will want to find out from the program any client
			
 
				+  identification information it can get (IP, etc) to implement rules
			
 
				+  about max clients at once
			
 
				+
			
 
				+  Any methods that are wildly successful, we can bake into Tor.
			
 
				+
			
 
				+Proposed terminology:
			
 
				+
			
 
				+  Transport protocol:
			
 
				+  Transport proxy:
			
 
				+
			
 
				+Specifications: Client behavior
			
 
				+
			
 
				+  Bridge lines can now follow the extended format "bridge method
			
 
				+  address:port [[keyid=]id-fingerprint] [k=v] [k=v] [k=v]". To connect
			
 
				+  to such a bridge, a client must open a local connection to the SOCKS
			
 
				+  proxy for "method", and ask it to connect to address:port.  If
			
 
				+  [id-fingerprint] is provided, it should expect the public identity key
			
 
				+  on the TLS connection to match the digest provided in
			
 
				+  [id-fingerprint].  If any [k=v] items are provided, they are
			
 
				+  configuration parameters for the proxy: Tor should separate them with
			
 
				+  NUL bytes and put them in the user and password fields of the request,
			
 
				+  splitting them across the fields as necessary.  The "id-fingerprint"
			
 
				+  field is always provided in a field named "keyid", if it was given.
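
  A minimal sketch of how a client might split such a bridge line into its
  parts. The names BridgeLine and parse_bridge_line are hypothetical, and
  the sketch assumes that a bare token (one without "=") is only allowed in
  the first optional position, where it is read as the id-fingerprint.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class BridgeLine:
    method: str
    address: str
    port: int
    keyid: Optional[str] = None
    params: Dict[str, str] = field(default_factory=dict)


def parse_bridge_line(line: str) -> BridgeLine:
    """Parse 'bridge method address:port [[keyid=]id-fingerprint] [k=v]...'."""
    tokens = line.split()
    if len(tokens) < 3 or tokens[0].lower() != "bridge":
        raise ValueError("expected: bridge method address:port ...")
    method = tokens[1]
    # rpartition keeps any earlier colons (e.g. a bracketed IPv6 literal)
    # in the address part.
    address, sep, port = tokens[2].rpartition(":")
    if not sep or not address or not port.isdigit():
        raise ValueError("bad address:port: %r" % tokens[2])
    parsed = BridgeLine(method, address, int(port))
    for index, token in enumerate(tokens[3:]):
        key, eq, value = token.partition("=")
        if not eq:
            # Assumption: a bare token is the fingerprint, first slot only.
            if index != 0:
                raise ValueError("unexpected bare token: %r" % token)
            parsed.keyid = token
        elif key == "keyid":
            parsed.keyid = value
        else:
            parsed.params[key] = value
    return parsed
```

  The draft does not say how duplicate keys or malformed fingerprints should
  be handled; this sketch lets a later duplicate overwrite an earlier one and
  does not validate the fingerprint at all.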
			
 
				+
			
 
				+
			
 
				+  example: if the bridge line is "bridge trebuchet www.example.com:3333
			
 
				+  rocks=20 height=5.6m" then, if the Tor client knows that the
			
 
				+  'trebuchet' method is provided by a SOCKS5 proxy on 127.0.0.1:19999,
			
 
				+  it should connect to that proxy, ask it to connect to www.example.com:3333,
			
 
				+  and provide the string "rocks=20\0height=5.6m" as the username, the
			
 
				+  password, or split across the username and password.
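
  The packing step might look roughly like the sketch below. The function
  name encode_socks_args is hypothetical. The 255-byte limit per field comes
  from the SOCKS5 username/password sub-negotiation (RFC 1929), not from this
  draft, and putting keyid first is just a choice made here. The draft also
  does not say what to send when the password half ends up empty, so the
  sketch returns an empty byte string and leaves that to the caller.

```python
from typing import Dict, Optional, Tuple

# RFC 1929 encodes each field length in one octet, so 255 bytes at most.
SOCKS5_FIELD_MAX = 255


def encode_socks_args(params: Dict[str, str],
                      keyid: Optional[str] = None) -> Tuple[bytes, bytes]:
    """Return (username, password) carrying NUL-separated k=v items."""
    items = []
    if keyid is not None:
        # The fingerprint always travels in a field named "keyid".
        items.append("keyid=" + keyid)
    items.extend("%s=%s" % (key, value) for key, value in params.items())
    blob = "\0".join(items).encode("utf-8")
    if len(blob) > 2 * SOCKS5_FIELD_MAX:
        raise ValueError("parameters do not fit in username + password")
    # Fill the username first, then spill the remainder into the password.
    return blob[:SOCKS5_FIELD_MAX], blob[SOCKS5_FIELD_MAX:]
```

  For the trebuchet example, the username would be "rocks=20\0height=5.6m"
  and the password would be empty.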
			
 
				+
			
 
				+
			
 
				+  There are two ways to tell Tor clients about protocol proxies:
			
 
				+  external proxies and managed proxies.  An external proxy is configured
			
 
				+  with "Transport trebuchet socks5 127.0.0.1:19999".  This tells Tor that
			
 
				+  another program is already running to handle 'trebuchet' connections,
			
 
				+  and Tor doesn't need to worry about it.  A managed proxy is configured
			
 
				+  with "Transport trebuchet /usr/libexec/tor-proxies/trebuchet
			
 
				+  [options]", and tells Tor to launch an external program on-demand to
			
 
				+  provide a socks proxy for 'trebuchet' connections. The Tor client only
			
 
				+  launches one instance of each external program, even if the same
			
 
				+  executable is listed for more than one method.
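
  One way to tell the two forms of Transport line apart, sketched under the
  assumption that a third token of "socks4" or "socks5" marks an external
  proxy and anything else is the path of a managed one. TransportConfig,
  parse_transport_line and programs_to_launch are hypothetical names. The
  draft does not say what happens when two lines name the same executable
  with different options; this sketch groups on the path alone.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional

SOCKS_KEYWORDS = {"socks4", "socks5"}  # assumption: these mark "external"


@dataclass
class TransportConfig:
    method: str
    kind: str                            # "external" or "managed"
    socks_version: Optional[str] = None  # external only
    address: Optional[str] = None        # external only, "host:port"
    executable: Optional[str] = None     # managed only
    options: List[str] = field(default_factory=list)


def parse_transport_line(line: str) -> TransportConfig:
    # Plain whitespace splitting; quoted options are not handled here.
    tokens = line.split()
    if len(tokens) < 3 or tokens[0].lower() != "transport":
        raise ValueError("expected: Transport method ...")
    method, third = tokens[1], tokens[2]
    if third.lower() in SOCKS_KEYWORDS:
        if len(tokens) != 4:
            raise ValueError("external proxy needs exactly one address:port")
        return TransportConfig(method, "external",
                               socks_version=third.lower(), address=tokens[3])
    return TransportConfig(method, "managed",
                           executable=third, options=tokens[3:])


def programs_to_launch(configs: Iterable[TransportConfig]) -> Dict[str, List[str]]:
    """Map each managed executable to the methods it serves (one launch each)."""
    by_executable: Dict[str, List[str]] = {}
    for cfg in configs:
        if cfg.kind == "managed":
            by_executable.setdefault(cfg.executable, []).append(cfg.method)
    return by_executable
```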
			
 
				+
			
 
				+  The same program can implement a managed or an external proxy: it just
			
 
				+  needs to take an argument saying which one to be.
			
 
				+
			
 
				+  [I don't like the terminology here. We should pick better words before
			
 
				+  this "external/managed" stuff catches on.  Also, to most users a
			
 
				+  "proxy" is a computer that relays stuff for them, not a local program
			
 
				+  on their computer. -NM I think we should go with Helper of some kind
			
 
				+  as it's less technically overloaded and more friendly feeling - io
			
 
				+  "Helper" is too overloaded already. -NM]
			
 
				+
			
 
				+Client proxy behavior
			
 
				+
			
 
				+   When launched from the command-line by a Tor client, a transport
			
 
				+   proxy needs to tell Tor which methods and ports it supports.  It does
			
 
				+   this by printing one or more CMETHOD: lines to its stdout.  These look
			
 
				+   like CMETHOD: trebuchet SOCKS5 127.0.0.1:19999 ARGS:rocks,height
			
 
				+   OPT-ARGS:tensile-strength
			
 
				+
			
 
				+   The ARGS field lists mandatory parameters that must appear in every
			
 
				+   bridge line for this method. The OPT-ARGS field lists optional
			
 
				+   parameters.  If no ARGS or OPT-ARGS field is provided, Tor should not
			
 
				+   check the parameters in bridge lines for this method.
			
 
				+
			
 
				+   The proxy should print a single "METHODS:DONE" line after it is
			
 
				+   finished telling Tor about the methods it provides.
			
 
				+
			
 
				+   [Should methods be versionable? Can they be? -nm I think probably?
			
 
				+   -io Then how? -nm]
			
 
				+
			
 
				+   The transport proxy MUST exit cleanly when it receives a SIGTERM from
			
 
				+   Tor.
			
 
				+
			
 
				+   The Tor client MUST ignore lines beginning with a keyword and a colon
			
 
				+   if it does not recognize the keyword.
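
   A rough sketch of how Tor might read a managed proxy's stdout under the
   rules above. ClientMethod, parse_proxy_output and params_ok are
   hypothetical names. The sketch assumes each CMETHOD: announcement arrives
   on a single line, that lines with no colon are skipped, and that a bridge
   line with a parameter listed in neither ARGS nor OPT-ARGS should be
   rejected; the draft does not settle any of these.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass
class ClientMethod:
    name: str
    socks_version: str
    address: str
    args: Optional[List[str]] = None      # None means no ARGS field was given
    opt_args: Optional[List[str]] = None  # None means no OPT-ARGS field


def parse_proxy_output(lines: Iterable[str]) -> List[ClientMethod]:
    """Collect CMETHOD: lines until METHODS:DONE is seen."""
    methods: List[ClientMethod] = []
    for raw in lines:
        keyword, colon, rest = raw.strip().partition(":")
        if not colon:
            continue  # assumption: non "keyword:" lines are skipped
        if keyword == "METHODS" and rest.strip() == "DONE":
            return methods
        if keyword != "CMETHOD":
            continue  # unrecognized keyword: ignored, as required above
        fields = rest.split()
        if len(fields) < 3:
            raise ValueError("short CMETHOD line: %r" % raw)
        method = ClientMethod(fields[0], fields[1], fields[2])
        for extra in fields[3:]:
            name, _, value = extra.partition(":")
            names = [n for n in value.split(",") if n]
            if name == "ARGS":
                method.args = names
            elif name == "OPT-ARGS":
                method.opt_args = names
        methods.append(method)
    raise ValueError("proxy output ended before METHODS:DONE")


def params_ok(method: ClientMethod, params: Dict[str, str]) -> bool:
    """Check bridge-line parameters against what the proxy announced."""
    if method.args is None and method.opt_args is None:
        return True  # nothing announced, so nothing is checked
    required = set(method.args or [])
    allowed = required | set(method.opt_args or [])
    given = set(params)
    return required <= given and given <= allowed
```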
			
 
				+
			
 
				+   In the future, if we need a control mechanism, we can use the
			
 
				+   stdin/stdout from Tor to the transport proxy.
			
 
				+
			
 
				+Transport proxy requirements
			
 
				+
			
 
				+   A transport proxy MUST handle SOCKS connect requests using the SOCKS
			
 
				+   version it advertises.
			
 
				+
			
 
				+Server proxy behavior
			
 
				+
			
 
				+   [So, we can have this work like client proxies, where the bridge
			
 
				+   launches some programs, and they tell the bridge, "I am giving you
			
 
				+   method X with parameters Y"?  Do you have to take all the methods? If
			
 
				+   not, which do you specify?]
			
 
				+
			
 
				+   [Do we allow programs that get started independently?]
			
 
				+
			
 
				+   [We'll need to figure out how this works with port forwarding.  Is
			
 
				+   port forwarding the bridge's problem, the proxy's problem, or some
			
 
				+   combination of the two?]
			
 
				+
			
 
				+   [If we're using the bridge authority/bridgedb system for distributing
			
 
				+   bridge info, the right place to advertise bridge lines is probably
			
 
				+   the extrainfo document.  We also need a way to tell the bridge
			
 
				+   authority "don't give out a default bridge line for me"]
			
 
				+
			
 
				+Server behavior
			
 
				+
			
 
				+Bridge authority behavior
			
 
				+
			
 
				+Implementation plan
			
 
				+
			
 
				+   Finish the design work here.
			
 
				+   Clean up all the inline conversations to just get summarized by the
			
 
				+   conclusions they arrived at.
			
 
				+
			
 
				+   Turn this into a draft proposal
			
 
				+
			
 
				+   Circulate and discuss on or-dev
			
 
				+
			
 
				+   (Use Cinderblock Of Loving Correction to reeducate anybody who tries
			
 
				+   to divert discussion of how pluggable transports should work into
			
 
				+   discussion of what is the best possible transport, or whatever.)
			
 
				+
			
 
				+   We should ship a couple of null plugin implementations in one or two
			
 
				+   popular, portable languages so that people get an idea of how to
			
 
				+   write the stuff.
			
 
				+
			
 
				+   1. We should have one that's just a proof of concept that does
			
 
				+      nothing but transfer bytes back and forth.
			
 
				+
			
 
				+   1. We should not do a rot13 one.
			
 
				+
			
 
				+   2. We should implement a basic proxy that does not transform the bytes at all
			
 
				+
			
 
				+   1. We should implement DNS or HTTP using other software (as goodell
			
 
				+      did years ago with DNS) as an example of wrapping existing code into
			
 
				+      our plugin model.
			
 
				+
			
 
				+   2. The obfuscated-ssh superencipherment is pretty trivial and pretty
			
 
				+   useful.  It makes the protocol stringwise unfingerprintable.
			
 
				+
			
 
				+      1. Nick needs to be told firmly not to bikeshed the obfuscated-ssh
			
 
				+        superencipherment too badly
			
 
				+
			
 
				+         1. Go ahead, bikeshed my day
			
 
				+
			
 
				+   1. If we do a raw-traffic proxy, openssh tunnels would be the logical choice.
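
   For the "does nothing but transfer bytes" example, the core could be as
   small as the sketch below. The names pump and relay are hypothetical, and
   the sketch leaves out the SOCKS handshake, method announcement and error
   handling that a real null plugin would still need.

```python
import socket
import threading
from typing import Tuple


def pump(src: socket.socket, dst: socket.socket, bufsize: int = 4096) -> None:
    """Copy bytes from src to dst unchanged until src reaches EOF."""
    while True:
        chunk = src.recv(bufsize)
        if not chunk:
            break
        dst.sendall(chunk)
    try:
        # Pass the EOF along so the far side sees the stream end.
        dst.shutdown(socket.SHUT_WR)
    except OSError:
        pass  # the other side may already be closed


def relay(a: socket.socket, b: socket.socket) -> Tuple[threading.Thread, threading.Thread]:
    """Shuttle bytes in both directions between two connected sockets."""
    forward = threading.Thread(target=pump, args=(a, b), daemon=True)
    backward = threading.Thread(target=pump, args=(b, a), daemon=True)
    forward.start()
    backward.start()
    return forward, backward
```

   A client-side null proxy would call relay() with the socket accepted from
   Tor and a socket connected to the bridge; the bridge-side one would do the
   same between the network and the real ORPort.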
			
 
				+
			
 
				+Appendix: recommendations for transports
			
 
				+
			
 
				+  Be free/open-source software.  Also, if you think your code might
			
 
				+  someday do so well at circumvention that it should be implemented
			
 
				+  inside Tor, it should use the same license as Tor.
			
 
				+
			
 
				+  Use libraries that Tor already requires. (You can rely on openssl and
			
 
				+  libevent being present if current Tor is present.)
			
 
				+
			
 
				+  Be portable: most Tor users are on Windows, and most Tor developers
			
 
				+  are not, so designing your code for just one of these platforms will
			
 
				+  make it either get a small userbase, or poor auditing.
			
 
				+
			
 
				+  Think secure: if your code is in a C-like language, and it's hard to
			
 
				+  read it and become convinced it's safe, then it's probably not safe.
			
 
				+
			
 
				+  Think small: we want to minimize the bytes that a Windows user needs
			
 
				+  to download for a transport client.
			
 
				+
			
 
				+  Specify: if you can't come up with a good explanation
			
 
				+
			
 
				+  Avoid security-through-obscurity if possible.  Specify.
			
 
				+
			
 
				+  Resist trivial fingerprinting: There should be no good string or regex
			
 
				+  to search for to distinguish your protocol from protocols permitted by
			
 
				+  censors.
			
 
				+
			
 
				+  Imitate a real profile: There are many ways to implement most
			
 
				+  protocols -- and in many cases, most possible variants of a given
			
 
				+  protocol won't actually exist in the wild.
			
 
				+
			
 
				+Appendix: Raw-traffic transports
			
 
				+
			
 
				+  This section describes an optional extension to the proposal above.
			
 
				+
			
 
				+
			
 
				+[a]I agree that we should remove this section - perhaps we should also save the links and move them to the possible plugin examples? - ioerror
			
 
				+
			
 
				+[b]This whole section should get removed from the final thing. I tried to summarize broad themes in the Motivations section below. - NM
			
 
				+
			
 
				+[c]That doesn't really help - does it? Or do you mean that Tor should set the CN to be, say, the IP or hostname of the relay? - ioerror
			
 
				+
			
 
				+The "Address" field when we have it. After that, the hostname if we know it.  After that, do a PTR lookup on our IP.  After that, use our IP. -NM