tor-spec-udp.txt 17 KB
  1. [This proposed Tor extension has not been implemented yet. It is currently
  2. in request-for-comments state. -RD]
  3. Tor Unreliable Datagram Extension Proposal
  4. Marc Liberatore
  5. Abstract
  6. Contents
  7. 0. Introduction
  8. Tor is a distributed overlay network designed to anonymize low-latency
  9. TCP-based applications. The current tor specification supports only
  10. TCP-based traffic. This limitation prevents the use of tor to anonymize
  11. other important applications, notably voice over IP software. This document
  12. is a proposal to extend the tor specification to support UDP traffic.
  13. The basic design philosophy of this extension is to add support for
  14. tunneling unreliable datagrams through tor with as few modifications to the
  15. protocol as possible. As currently specified, tor cannot directly support
  16. such tunneling, as connections between nodes are built using transport layer
  17. security (TLS) atop TCP. The latency incurred by TCP is likely unacceptable
  18. to the operation of most UDP-based application level protocols.
  19. Thus, we propose the addition of links between nodes using datagram
  20. transport layer security (DTLS). These links allow packets to traverse a
  21. route through tor quickly, but their unreliable nature requires minor
  22. changes to the tor protocol. This proposal outlines the necessary
  23. additions and changes to the tor specification to support UDP traffic.
  24. We note that a separate set of DTLS links between nodes creates a second
  25. overlay, distinct from that composed of TLS links. This separation and
  26. resulting decrease in each anonymity set's size will make certain attacks
  27. easier. However, it is our belief that VoIP support in tor will
  28. dramatically increase its appeal, and correspondingly, the size of its user
  29. base, number of deployed nodes, and total traffic relayed. These increases
  30. should help offset the loss of anonymity that two distinct networks imply.
  31. 1. Overview of Tor-UDP and its complications
  32. As described above, this proposal extends the Tor specification to support
  33. UDP with as few changes as possible. Tor's overlay network is managed
  34. through TLS based connections; we will re-use this control plane to set up
  35. and tear down circuits that relay UDP traffic. These circuits will be built atop
  36. DTLS, in a fashion analogous to how Tor currently sends TCP traffic over
  37. TLS.
  38. The unreliability of DTLS circuits creates problems for Tor at two levels:
  39. 1. Tor's encryption of the relay layer does not allow independent
  40. decryption of individual records. If record N is not received, then
  41. record N+1 will not decrypt correctly, as the counter for AES/CTR is
  42. maintained implicitly.
  43. 2. Tor's end-to-end integrity checking works under the assumption that
  44. all RELAY cells are delivered. This assumption is invalid when cells
  45. are sent over DTLS.
  46. The fix for the first problem is straightforward: add an explicit sequence
  47. number to each cell. To fix the second problem, we introduce a
  48. system of nonces and hashes to RELAY packets.
  49. In the following sections, we mirror the layout of the Tor Protocol
  50. Specification, presenting the necessary modifications to the Tor protocol as
  51. a series of deltas.
  52. 2. Connections
  53. Tor-UDP uses DTLS for encryption of some links. All DTLS links must have
  54. corresponding TLS links, as all control messages are sent over TLS. All
  55. implementations MUST support the DTLS ciphersuite "[TODO]".
  56. DTLS connections are formed using the same protocol as TLS connections.
  57. This occurs upon request, following a CREATE_UDP or CREATE_FAST_UDP cell,
  58. as detailed in section 4.6.
  59. Once a paired TLS/DTLS connection is established, the two sides send cells
  60. to one another. All but two types of cells are sent over TLS links. RELAY
  61. cells containing the commands RELAY_DATA_UDP and RELAY_DROP_UDP, specified
  62. below, are sent over DTLS links. [Should all cells still be 512 bytes long?
  63. Perhaps upon completion of a preliminary implementation, we should do a
  64. performance evaluation for some class of UDP traffic, such as VoIP. - ML]
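The split between the two link types can be sketched as a small lookup. This is only an illustration: the function name link_for_cell is hypothetical, the numeric values are taken from the command tables in sections 3 and 5.1, and we assume the sender can see the relay command (as an edge node can). How a middle node, which sees only the encrypted payload, chooses the outgoing link is not covered here.

```python
RELAY = 3             # cell command, from the table in section 3
RELAY_DATA_UDP = 14   # relay commands, from the table in section 5.1
RELAY_DROP_UDP = 17


def link_for_cell(cell_command, relay_command=None):
    """Return "DTLS" or "TLS" for an outgoing cell (hypothetical helper).

    Only RELAY cells carrying RELAY_DATA_UDP or RELAY_DROP_UDP use the
    DTLS link; every other cell stays on the TLS link.
    """
    if cell_command == RELAY and relay_command in (RELAY_DATA_UDP, RELAY_DROP_UDP):
        return "DTLS"
    return "TLS"
```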
  65. Cells may be sent embedded in TLS or DTLS records of any size or divided
  66. across such records. The framing of these records MUST NOT leak any more
  67. information than the above differentiation on the basis of cell type. [I am
  68. uncomfortable with this leakage, but don't see any simple, elegant way
  69. around it. -ML]
  70. As with TLS connections, DTLS connections are not permanent.
  71. 3. Cell format
  72. Each cell contains the following fields:
  73. CircID [2 bytes]
  74. Command [1 byte]
  75. Sequence Number [2 bytes]
  76. Payload (padded with 0 bytes) [507 bytes]
  77. [Total size: 512 bytes]
  78. The 'Command' field holds one of the following values:
  79. 0 -- PADDING (Padding) (See Sec 6.2)
  80. 1 -- CREATE (Create a circuit) (See Sec 4)
  81. 2 -- CREATED (Acknowledge create) (See Sec 4)
  82. 3 -- RELAY (End-to-end data) (See Sec 5)
  83. 4 -- DESTROY (Stop using a circuit) (See Sec 4)
  84. 5 -- CREATE_FAST (Create a circuit, no PK) (See Sec 4)
  85. 6 -- CREATED_FAST (Circuit created, no PK) (See Sec 4)
  86. 7 -- CREATE_UDP (Create a UDP circuit) (See Sec 4)
  87. 8 -- CREATED_UDP (Acknowledge UDP create) (See Sec 4)
  88. 9 -- CREATE_FAST_UDP (Create a UDP circuit, no PK) (See Sec 4)
  89. 10 -- CREATED_FAST_UDP (UDP circuit created, no PK) (See Sec 4)
  90. The sequence number allows for AES/CTR decryption of RELAY cells
  91. independently of one another; this functionality is required to support
  92. cells sent over DTLS. The sequence number is described in more detail in
  93. section 4.5.
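As a rough sketch of the cell layout above, the following packs and unpacks a 512-byte cell. The helper names are hypothetical, and big-endian (network) byte order for the multi-byte fields is an assumption, since the text does not state it.

```python
import struct

CELL_LEN = 512
PAYLOAD_LEN = 507
# CircID [2 bytes], Command [1 byte], Sequence Number [2 bytes].
# Big-endian field order is an assumption, not stated in the proposal.
HEADER = struct.Struct(">HBH")


def pack_cell(circ_id, command, seq, payload=b""):
    """Build one fixed-size cell, padding the payload with 0 bytes."""
    if len(payload) > PAYLOAD_LEN:
        raise ValueError("payload exceeds 507 bytes")
    return HEADER.pack(circ_id, command, seq) + payload.ljust(PAYLOAD_LEN, b"\x00")


def unpack_cell(cell):
    """Split a cell into (circ_id, command, seq, padded payload)."""
    if len(cell) != CELL_LEN:
        raise ValueError("cell must be exactly 512 bytes")
    circ_id, command, seq = HEADER.unpack_from(cell)
    return circ_id, command, seq, cell[HEADER.size:]
```

For example, pack_cell(1, 3, 65535, b"hi") yields a 512-byte RELAY cell whose payload is "hi" followed by 505 zero bytes.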
  94. [Should the sequence number only appear in RELAY packets? The overhead is
  95. small, and I'm hesitant to force more code paths on the implementor.]
  96. [Having separate commands for UDP circuits seems necessary, unless we can
  97. assume a flag day event for a large number of tor nodes.]
  98. 4. Circuit management
  99. 4.2. Setting circuit keys
  100. Keys are set up for UDP circuits in the same fashion as for TCP circuits.
  101. Each UDP circuit shares keys with its corresponding TCP circuit.
  102. 4.3. Creating circuits
  103. UDP circuits are created in the same way as TCP circuits, using the *_UDP cells as
  104. appropriate.
  105. 4.4. Tearing down circuits
  106. UDP circuits are torn down in the same way as TCP circuits, using the *_UDP cells as
  107. appropriate.
  108. 4.5. Routing relay cells
  109. When an OR receives a RELAY cell, it checks the cell's circID and
  110. determines whether it has a corresponding circuit along that
  111. connection. If not, the OR drops the RELAY cell.
  112. Otherwise, if the OR is not at the OP edge of the circuit (that is,
  113. either an 'exit node' or a non-edge node), it de/encrypts the payload
  114. with AES/CTR, as follows:
  115. 'Forward' relay cell (same direction as CREATE):
  116. Use Kf as key; decrypt, using sequence number to synchronize
  117. ciphertext and keystream.
  118. 'Back' relay cell (opposite direction from CREATE):
  119. Use Kb as key; encrypt, using sequence number to synchronize
  120. ciphertext and keystream.
  121. Note that in counter mode, decrypt and encrypt are the same operation.
  122. Each stream encrypted by a Kf or Kb has a corresponding unique state,
  123. captured by a sequence number; the originator of each such stream chooses
  124. the initial sequence number randomly, and increments it only with RELAY
  125. cells. [This counts cells; unlike, say, TCP, tor uses fixed-size cells, so
  126. there's no need for counting bytes directly. Right? - ML]
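To illustrate how a per-cell sequence number lets a receiver locate its place in the keystream without having seen earlier cells, here is a minimal sketch. Several things in it are assumptions: that each cell consumes 507 keystream bytes (the payload size from section 3), that the 2-byte sequence number wraps modulo 65536, and the helper names themselves. The keystream below is a SHA-256-based stand-in so that the example runs with the standard library alone; it is NOT AES/CTR.

```python
import hashlib

PAYLOAD_LEN = 507   # payload bytes per cell, from section 3
BLOCK_LEN = 16      # AES block size in bytes


def keystream_offset(initial_seq, seq):
    """Return (counter block index, byte offset inside that block)."""
    cells_before = (seq - initial_seq) % 65536   # assumed wraparound rule
    return divmod(cells_before * PAYLOAD_LEN, BLOCK_LEN)


def toy_keystream(key, initial_seq, seq):
    """Keystream bytes for one cell. Stand-in construction, not AES/CTR."""
    block, skip = keystream_offset(initial_seq, seq)
    out = b""
    while len(out) < skip + PAYLOAD_LEN:
        out += hashlib.sha256(key + block.to_bytes(8, "big")).digest()[:BLOCK_LEN]
        block += 1
    return out[skip:skip + PAYLOAD_LEN]


def crypt(key, initial_seq, seq, payload):
    """XOR a payload with its keystream; encrypt and decrypt are identical."""
    stream = toy_keystream(key, initial_seq, seq)
    return bytes(a ^ b for a, b in zip(payload, stream))
```

With this arrangement, a receiver that never sees the cell numbered N can still decrypt cell N+1, because the keystream position depends only on the sequence number carried in the cell.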
  127. The OR then decides whether it recognizes the relay cell, by
  128. inspecting the payload as described in section 5.1 below. If the OR
  129. recognizes the cell, it processes the contents of the relay cell.
  130. Otherwise, it passes the decrypted relay cell along the circuit if
  131. the circuit continues. If the OR at the end of the circuit
  132. encounters an unrecognized relay cell, an error has occurred: the OR
  133. sends a DESTROY cell to tear down the circuit.
  134. When a relay cell arrives at an OP, the OP decrypts the payload
  135. with AES/CTR as follows:
  136. OP receives data cell:
  137. For I=N...1,
  138. Decrypt with Kb_I, using the sequence number as above. If the
  139. payload is recognized (see section 5.1), then stop and process
  140. the payload.
  141. For more information, see section 5 below.
  142. 4.6. CREATE_UDP and CREATED_UDP cells
  143. Users set up UDP circuits incrementally. The procedure is similar to that
  144. for TCP circuits, as described in section 4.1. In addition to the TLS
  145. connection to the first node, the OP also attempts to open a DTLS
  146. connection. If this succeeds, the OP sends a CREATE_UDP cell, with a
  147. payload in the same format as a CREATE cell. To extend a UDP circuit past
  148. the first hop, the OP sends an EXTEND_UDP relay cell (see section 5) which
  149. instructs the last node in the circuit to send a CREATE_UDP cell to extend
  150. the circuit.
  151. The relay payload for an EXTEND_UDP relay cell consists of:
  152. Address [4 bytes]
  153. TCP port [2 bytes]
  154. UDP port [2 bytes]
  155. Onion skin [186 bytes]
  156. Identity fingerprint [20 bytes]
  157. The address field and ports denote the IPv4 address and ports of the next OR
  158. in the circuit.
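The EXTEND_UDP relay payload above could be assembled roughly as follows. The function name is hypothetical, and big-endian port encoding is an assumption; the field widths come from the list above and total 214 bytes.

```python
import socket
import struct

# Address [4], TCP port [2], UDP port [2], Onion skin [186], Fingerprint [20].
# Big-endian port encoding is an assumption.
EXTEND_UDP = struct.Struct(">4sHH186s20s")


def pack_extend_udp(ipv4, tcp_port, udp_port, onion_skin, fingerprint):
    """Build the relay payload of an EXTEND_UDP cell (hypothetical helper)."""
    if len(onion_skin) != 186:
        raise ValueError("onion skin must be 186 bytes")
    if len(fingerprint) != 20:
        raise ValueError("identity fingerprint must be 20 bytes")
    return EXTEND_UDP.pack(
        socket.inet_aton(ipv4), tcp_port, udp_port, onion_skin, fingerprint
    )
```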
  159. The payload for a CREATED_UDP cell or the relay payload for a
  160. RELAY_EXTENDED_UDP cell is identical to that of the corresponding CREATED or
  161. RELAY_EXTENDED cell. Both circuits are established using the same key.
  162. Note that the existence of a UDP circuit implies the
  163. existence of a corresponding TCP circuit, sharing keys, sequence numbers,
  164. and any other relevant state.
  165. 4.6.1 CREATE_FAST_UDP/CREATED_FAST_UDP cells
  166. As above, the OP must successfully connect using DTLS before attempting to
  167. send a CREATE_FAST_UDP cell. Otherwise, the procedure is the same as in
  168. section 4.1.1.
  169. 5. Application connections and stream management
  170. 5.1. Relay cells
  171. Within a circuit, the OP and the exit node use the contents of RELAY cells
  172. to tunnel end-to-end commands, TCP connections ("Streams"), and UDP packets
  173. across circuits. End-to-end commands and UDP packets can be initiated by
  174. either edge; streams are initiated by the OP.
  175. The payload of each unencrypted RELAY cell consists of:
  176. Relay command [1 byte]
  177. 'Recognized' [2 bytes]
  178. StreamID [2 bytes]
  179. Digest [4 bytes]
  180. Length [2 bytes]
  181. Data [496 bytes]
  182. The relay commands are:
  183. 1 -- RELAY_BEGIN [forward]
  184. 2 -- RELAY_DATA [forward or backward]
  185. 3 -- RELAY_END [forward or backward]
  186. 4 -- RELAY_CONNECTED [backward]
  187. 5 -- RELAY_SENDME [forward or backward]
  188. 6 -- RELAY_EXTEND [forward]
  189. 7 -- RELAY_EXTENDED [backward]
  190. 8 -- RELAY_TRUNCATE [forward]
  191. 9 -- RELAY_TRUNCATED [backward]
  192. 10 -- RELAY_DROP [forward or backward]
  193. 11 -- RELAY_RESOLVE [forward]
  194. 12 -- RELAY_RESOLVED [backward]
  195. 13 -- RELAY_BEGIN_UDP [forward]
  196. 14 -- RELAY_DATA_UDP [forward or backward]
  197. 15 -- RELAY_EXTEND_UDP [forward]
  198. 16 -- RELAY_EXTENDED_UDP [backward]
  199. 17 -- RELAY_DROP_UDP [forward or backward]
  200. Commands labelled as "forward" must only be sent by the originator
  201. of the circuit. Commands labelled as "backward" must only be sent by
  202. other nodes in the circuit back to the originator. Commands marked
  203. as either can be sent either by the originator or other nodes.
  204. The 'recognized' field in any unencrypted relay payload is always set to
  205. zero.
  206. The 'digest' field can have two meanings. For all cells sent over TLS
  207. connections (that is, all commands and all non-UDP RELAY data), it is
  208. computed as the first four bytes of the running SHA-1 digest of all the
  209. bytes that have been sent reliably and have been destined for this hop of
  210. the circuit or originated from this hop of the circuit, seeded from Df or Db
  211. respectively (obtained in section 4.2 above), and including this RELAY
  212. cell's entire payload (taken with the digest field set to zero). Cells sent
  213. over DTLS connections do not affect this running digest. Each cell sent
  214. over DTLS (that is, RELAY_DATA_UDP and RELAY_DROP_UDP) has the digest field
  215. set to the SHA-1 digest of the current RELAY cell's entire payload, with the
  216. digest field set to zero. Coupled with a randomly-chosen streamID, this
  217. provides per-cell integrity checking on UDP cells.
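The per-cell digest for UDP cells might be computed along these lines. The helper names are hypothetical. Two assumptions are made: the multi-byte fields are big-endian, and the 20-byte SHA-1 output is truncated to its first four bytes to fit the 4-byte digest field (the text does not say how the digest is shortened). The total payload length is passed in by the caller rather than fixed here.

```python
import hashlib
import struct

# Relay command [1], 'Recognized' [2], StreamID [2], Digest [4], Length [2].
RELAY_HEADER = struct.Struct(">BHH4sH")   # big-endian is an assumption
DIGEST_OFFSET = 5                          # bytes preceding the digest field
ZERO_DIGEST = b"\x00" * 4


def build_udp_relay_payload(relay_command, stream_id, data, payload_len):
    """Build an unencrypted relay payload with a per-cell digest."""
    room = payload_len - RELAY_HEADER.size
    if stream_id == 0:
        raise ValueError("UDP tunnels may not use streamID zero")
    if len(data) > room:
        raise ValueError("data does not fit in one cell")
    body = RELAY_HEADER.pack(relay_command, 0, stream_id, ZERO_DIGEST, len(data))
    body += data.ljust(room, b"\x00")
    # Hash the whole payload with the digest field zeroed; keeping only
    # the first four bytes is an assumption.
    digest = hashlib.sha1(body).digest()[:4]
    return body[:DIGEST_OFFSET] + digest + body[DIGEST_OFFSET + 4:]


def udp_payload_recognized(payload):
    """Check 'recognized' == 0 and that the per-cell digest matches."""
    _, recognized, _, digest, _ = RELAY_HEADER.unpack_from(payload)
    zeroed = payload[:DIGEST_OFFSET] + ZERO_DIGEST + payload[DIGEST_OFFSET + 4:]
    return recognized == 0 and hashlib.sha1(zeroed).digest()[:4] == digest
```

Because the digest covers only the cell that carries it, a lost cell does not invalidate the check on any later cell, unlike the running digest used over TLS.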
  218. When the 'recognized' field of a RELAY cell is zero, and the digest
  219. is correct, the cell is considered "recognized" for the purposes of
  220. decryption (see section 4.5 above).
  221. (The digest does not include any bytes from relay cells that do
  222. not start or end at this hop of the circuit. That is, it does not
  223. include forwarded data. Therefore if 'recognized' is zero but the
  224. digest does not match, the running digest at that node should
  225. not be updated, and the cell should be forwarded on.)
  226. All RELAY cells pertaining to the same tunneled TCP stream have the
  227. same streamID. Such streamIDs are chosen arbitrarily by the OP. RELAY
  228. cells that affect the entire circuit rather than a particular
  229. stream use a StreamID of zero.
  230. All RELAY cells pertaining to the same UDP tunnel have the same streamID.
  231. This streamID is chosen randomly by the OP, but cannot be zero.
  232. The 'Length' field of a relay cell contains the number of bytes in
  233. the relay payload which contain real payload data. The remainder of
  234. the payload is padded with NUL bytes.
  235. If the RELAY cell is recognized but the relay command is not
  236. understood, the cell must be dropped and ignored. Its contents
  237. still count with respect to the digests, though. [Before
  238. 0.1.1.10, Tor closed circuits when it received an unknown relay
  239. command. Perhaps this will be more forward-compatible. -RD]
  240. 5.2.1. Opening UDP tunnels and transferring data
  241. To open a new anonymized UDP connection, the OP chooses an open
  242. circuit to an exit that may be able to connect to the destination
  243. address, selects a random streamID not yet used on that circuit,
  244. and constructs a RELAY_BEGIN_UDP cell with a payload encoding the address
  245. and port of the destination host. The payload format is:
  246. ADDRESS | ':' | PORT | [00]
  247. where ADDRESS can be a DNS hostname, or an IPv4 address in
  248. dotted-quad format, or an IPv6 address surrounded by square brackets;
  249. and where PORT is encoded in decimal.
  250. [What is the [00] for? -NM]
  251. [It's so the payload is easy to parse out with string funcs -RD]
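A minimal sketch of encoding and parsing the RELAY_BEGIN_UDP payload follows. The function names are hypothetical, and ASCII encoding of the address is an assumption. The parser splits on the last colon so that a bracketed IPv6 address is kept intact.

```python
def encode_begin_udp(address, port):
    """Encode ADDRESS ':' PORT followed by a single NUL byte."""
    if ":" in address and not address.startswith("["):
        address = "[" + address + "]"      # bare IPv6 literal: add brackets
    return f"{address}:{port}".encode("ascii") + b"\x00"


def decode_begin_udp(payload):
    """Return (address, port) from a NUL-terminated payload."""
    text, nul, _ = payload.partition(b"\x00")
    if not nul:
        raise ValueError("missing NUL terminator")
    address, sep, port = text.decode("ascii").rpartition(":")
    if not sep or not address or not port.isdigit():
        raise ValueError("malformed ADDRESS:PORT")
    return address, int(port)
```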
  252. Upon receiving this cell, the exit node resolves the address as necessary.
  253. If the address cannot be resolved, the exit node replies with a RELAY_END
  254. cell. (See 5.4 below.) Otherwise, the exit node replies with a
  255. RELAY_CONNECTED cell, whose payload is in one of the following formats:
  256. The IPv4 address to which the connection was made [4 octets]
  257. A number of seconds (TTL) for which the address may be cached [4 octets]
  258. or
  259. Four zero-valued octets [4 octets]
  260. An address type (6) [1 octet]
  261. The IPv6 address to which the connection was made [16 octets]
  262. A number of seconds (TTL) for which the address may be cached [4 octets]
  263. [XXXX Versions of Tor before 0.1.1.6 ignore and do not generate the TTL
  264. field. No version of Tor currently generates the IPv6 format.]
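The two RELAY_CONNECTED layouts can be told apart by the first four octets, as in the sketch below. The function name is hypothetical and a big-endian TTL is an assumption. Note that an IPv4 address of 0.0.0.0 cannot be represented in the first layout, since four zero octets select the second.

```python
import ipaddress
import struct


def parse_connected(payload):
    """Return (address, ttl_seconds) from a RELAY_CONNECTED payload."""
    if len(payload) < 8:
        raise ValueError("payload too short")
    if payload[:4] != b"\x00" * 4:
        # IPv4 form: address [4 octets], TTL [4 octets].
        (ttl,) = struct.unpack_from(">I", payload, 4)
        return ipaddress.IPv4Address(payload[:4]), ttl
    if len(payload) >= 25 and payload[4] == 6:
        # IPv6 form: zeros [4], type 6 [1], address [16], TTL [4].
        (ttl,) = struct.unpack_from(">I", payload, 21)
        return ipaddress.IPv6Address(payload[5:21]), ttl
    raise ValueError("unrecognized RELAY_CONNECTED payload")
```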
  265. The OP waits for a RELAY_CONNECTED cell before sending any data.
  266. Once a connection has been established, the OP and exit node
  267. package UDP data in RELAY_DATA_UDP cells, and upon receiving such
  268. cells, echo their contents to the corresponding socket.
  269. RELAY_DATA_UDP cells sent to unrecognized streams are dropped.
  270. RELAY_DROP_UDP cells are long-range dummies; upon receiving such
  271. a cell, the OR or OP must drop it.
  272. 5.3. Closing streams
  273. UDP tunnels are closed in a fashion corresponding to TCP connections.
  274. 6. Flow Control
  275. UDP streams are not subject to flow control.
  276. 7.2. Router descriptor format.
  277. The items' formats are as follows:
  278. "router" nickname address ORPort SocksPort DirPort UDPPort
  279. Indicates the beginning of a router descriptor. "address" must be
  280. an IPv4 address in dotted-quad format. The last four numbers
  281. indicate the ports at which this OR exposes
  282. functionality. ORPort is a port at which this OR accepts TLS
  283. connections for the main OR protocol; SocksPort is deprecated and
  284. should always be 0; DirPort is the port at which this OR accepts
  285. directory-related HTTP connections; and UDPPort is a port at which
  286. this OR accepts DTLS connections for UDP data. If any port is not
  287. supported, the value 0 is given instead of a port number.
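A descriptor consumer might read the extended "router" line as sketched here. The RouterLine type and parse_router_line function are hypothetical names, and the nickname and address in the usage note are made-up examples.

```python
from typing import NamedTuple


class RouterLine(NamedTuple):
    nickname: str
    address: str
    or_port: int
    socks_port: int
    dir_port: int
    udp_port: int     # 0 means the OR does not accept DTLS connections


def parse_router_line(line):
    """Parse: "router" nickname address ORPort SocksPort DirPort UDPPort."""
    parts = line.split()
    if len(parts) != 7 or parts[0] != "router":
        raise ValueError("not an extended router line")
    ports = [int(p) for p in parts[3:]]
    if any(not 0 <= p <= 65535 for p in ports):
        raise ValueError("port out of range")
    return RouterLine(parts[1], parts[2], *ports)
```

For instance, parsing "router example 192.0.2.1 9001 0 9030 9002" gives a udp_port of 9002, while a value of 0 in that position would signal that the OR offers no DTLS service.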