123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366367368369370371372373374375376377378379380381382383384385386387388389390391392393394395396397398399400401402403404405406407408409410411412413414415416417418419420421422423424425426427428429430431432433434435436437438439440441442443444445446447448449450451452453454455456457458459460461462463464465466467468469470471472473474475476477478479480481482483484485486487488489490491492493494495496497498499500501502503504505506507508509510511512513514515516517518519520521522523524525526527528529 |
- $Id$
- TOR (The Onion Router) Spec
- Note: This is an attempt to specify TOR as it exists as implemented in
- early March, 2003. It is not recommended that others implement this
- design as it stands; future versions of TOR will implement improved
- protocols.
- 0. Notation:
- PK -- a public key.
- SK -- a private key
- K -- a key for a symmetric cypher
- a|b -- concatenation of 'a' with 'b'.
- a[i:j] -- Bytes 'i' through 'j'-1 (inclusive) of the string a.
- All numeric values are encoded in network (big-endian) order.
- Unless otherwise specified, all symmetric ciphers are DES in OFB
- mode, with an IV of all 0 bytes. All asymmetric ciphers are RSA
- with 1024-bit keys, and exponents of 65537.
- [We will move to AES once we can assume everybody will have it. -RD]
- 1. System overview
- [Something to start with here. Do feel free to change/expand. -RD]
- Tor is an implementation of version 2 of Onion Routing.
- Onion Routing is a connection-oriented anonymizing communication
- service. Users build a layered block of asymmetric encryptions
- (an "onion") which describes a source-routed path through a set of
- nodes. Those nodes build a "virtual circuit" through the network, in which
- each node knows its predecessor and successor, but no others. Traffic
- flowing down the circuit is unwrapped by a symmetric key at each node,
- which reveals the downstream node.
- 2. Connections
- 2.1. Establishing OR-to-OR connections
- When one onion router opens a connection to another, the initiating
- OR (called the 'client') and the listening OR (called the 'server')
- perform the following handshake.
- Before the handshake begins, the client and server know one
- another's (1024-bit) public keys, IPV4 addresses, and ports.
- 1. Client connects to server:
- The client generates a pair of 8-byte symmetric keys (one
- [K_f] for the 'forward' stream from client to server, and one
- [K_b] for the 'backward' stream from server to client.
- The client then generates a 'Client authentication' message [M]
- containing:
- The client's published IPV4 address [4 bytes]
- The client's published port [2 bytes]
- The server's published IPV4 address [4 bytes]
- The server's published port [2 bytes]
- The forward key (K_f) [8 bytes]
- The backward key (K_f) [8 bytes]
- The maximum bandwidth (bytes/s) [4 bytes]
- [Total: 36 bytes]
- The client then RSA-encrypts the message with the server's
- public key, and PKCS1 padding to given an encrypted message
- [Commentary: 1024 bytes is probably too short, and this protocol can't
- support IPv6. -NM]
- [1024 is too short for a high-latency remailer; but perhaps it's
- fine for us, given our need for speed and also given our greater
- vulnerability to other attacks? Onions are infrequent enough now
- that maybe we could handle it; but I worry it will impact
- scalability, and handling more users is important.-RD]
-
- The client then opens a TCP connection to the server, sends
- the 128-byte RSA-encrypted data to the server, and waits for a
- reply.
- 2. Server authenticates to client:
- Upon receiving a TCP connection, the server waits to receive
- 128 bytes from the client. It decrypts the message with its
- private key, and checks the PKCS1 padding. If the padding is
- incorrect, or if the message's length is other than 32 bytes,
- the server closes the TCP connection and stops handshaking.
- The server then checks the list of known ORs for one with the
- address and port given in the client's authentication. If no
- such OR is known, or if the server is already connected to
- that OR, the server closes the current TCP connection and
- stops handshaking.
- For later use, the server sets its keys for this connection,
- setting K_f to the client's K_b, and K_b to the client's K_f.
- The server then creates a server authentication message[M2] as
- follows:
- Modified client authentication [32 bytes]
- A random nonce [N] [8 bytes]
- [Total: 40 bytes]
- The client authentication is generated from M by replacing
- the client's preferred bandwidth [B_c] with the server's
- preferred bandwidth [B_s], if B_s < B_c.
- The server encrypts M2 with the client's public key (found
- from the list of known routers), using PKCS1 padding.
- The server sends the 128-byte encrypted message to the client,
- and waits for a reply.
- 3. Client authenticates to server.
- Once the client has received 128 bytes, it decrypts them with
- its public key, and checks the PKCS1 padding. If the padding
- is invalid, or the decrypted message's length is other than 40
- bytes, the client closes the TCP connection.
- The client checks that the addresses and keys in the reply
- message are the same as the ones it originally sent. If not,
- it closes the TCP connection.
- The client updates the connection's bandwidth to that set by
- the server, and generates the following authentication message [M3]:
- The client's published IPV4 address [4 bytes]
- The client's published port [2 bytes]
- The server's published IPV4 address [4 bytes]
- The server's published port [2 bytes]
- The server-generated nonce [N] [8 bytes]
- [Total: 20 bytes]
- Once again, the client encrypts this message using the
- server's public key and PKCS1 padding, and sends the resulting
- 128-byte message to the server.
- 4. Server checks client authentication
- The server once again waits to receive 128 bytes from the
- client, decrypts the message with its private key, and checks
- the PKCS1 padding. If the padding is incorrect, or if the
- message's length is other than 20 bytes, the server closes the
- TCP connection and stops handshaking.
- If the addresses in the decrypted message M3 match those in M
- and M2, and if the nonce in M3 is the same as in M2, the
- handshake is complete, and the client and server begin sending
- cells to one another. Otherwise, the server closes the TCP
- connection.
- 2.2. Establishing OP-to-OR connections
- When an Onion Proxy (OP) needs to establish a connection to an OR,
- the handshake is simpler because the OR does not need to verify the
- OP's identity. The OP and OR establish the following steps:
- 1. OP connects to OR:
- First, the OP generates a pair of 8-byte symmetric keys (one
- [K_f] for the 'forward' stream from OP to OR, and one
- [K_b] for the 'backward' stream from OR to OP.
- The OP generates a message [M] in the following format:
- Maximum bandwidth (bytes/s) [4 bytes]
- Forward key [K_f] [8 bytes]
- Backward key [K_b] [8 bytes]
- [Total: 20 bytes]
- The OP encrypts M with the OR's public key and PKCS1 padding,
- opens a TCP connection to the OR's TCP port, and sends the
- resulting 128-byte encrypted message to the OR.
- 2. OR receives keys:
- When the OR receives a connection from an OP [This is on a
- different port, right? How does it know the difference? -NM],
- [Correct. The 'or_port' config variable specifies the OR port,
- and the op_port variable specified the OP port. -RD]
- it waits for 128 bytes of data, and decrypts the resulting
- data with its private key, checking the PKCS1 padding. If the
- padding is invalid, or the message is not 20 bytes long, the
- OR closes the connection.
- Otherwise, the connection is established, and the O is ready
- to receive cells.
- The server sets its keys for this connection, setting K_f to
- the client's K_b, and K_b to the client's K_f.
- 2.3. Sending cells and link encryption
- Once the handshake is complete, the ORs or OR and OP send cells
- (specified below) to one another. Cells are sent serially,
- encrypted with the DES-OFB keystream specified by the handshake
- protocol. Over a connection, communicants encrypt outgoing cells
- with the connection's K_f, and decrypt incoming cells with the
- connection's K_b.
- [Commentary: This means that OR/OP->OR connections are malleable; I
- can flip bits in cells as they go across the wire, and see flipped
- bits coming out the cells as they are decrypted at the next
- server. I need to look more at the data format to see whether
- this is exploitable, but if there's no integrity checking there
- either, I suspect we may have an attack here. -NM]
- [Yes, this protocol is open to tagging attacks. The payloads are
- encrypted inside the network, so it's only at the edge node and beyond
- that it's a worry. But adversaries can already count packets and
- observe/modify timing. It's not worth putting in hashes; indeed, it
- would be quite hard, because one of the sides of the circuit doesn't
- know the keys that are used for de/encrypting at each hop, so couldn't
- craft hashes anyway. See the Bandwidth Throttling (threat model)
- thread on http://archives.seul.org/or/dev/Jul-2002/threads.html. -RD]
- [Even if I don't control both sides of the connection, I can still
- do evil stuff. For instance, if I can guess that a cell is a
- TOPIC_COMMAND_BEGIN cell to www.slashdot.org:80 , I can change the
- address and port to point to a machine I control. -NM]
- 3. Cell Packet format
- The basic unit of communication for onion routers and onion
- proxies is a fixed-width "Cell." Each Cell contains the following
- fields:
- ACI (anonymous circuit identifier) [2 bytes]
- Command [1 byte]
- Length [1 byte]
- Sequence number (unused, set to 0) [4 bytes]
- Payload (padded with 0 bytes) [120 bytes]
- [Total size: 128 bytes]
- The 'Command' field holds one of the following values:
- 0 -- PADDING (Padding) (See Sec 6.2)
- 1 -- CREATE (Create a circuit) (See Sec 4)
- 2 -- DATA (End-to-end data) (See Sec 5)
- 3 -- DESTROY (Stop using a circuit) (See Sec 4)
- 4 -- SENDME (For flow control) (See Sec 6.1)
- The interpretation of 'Length' and 'Payload' depend on the type of
- the cell.
- PADDING: Length is 0; Payload is 120 bytes of 0's.
- CREATE: Length is a value between 1 and 120; the first 'length'
- bytes of payload contain a portion of an onion.
- DATA: Length is a value between 4 and 120; the first 'length'
- bytes of payload contain useful data.
- DESTROY: Neither field is used.
- SENDME: Length encodes a window size, payload is unused.
- Unused fields are filled with 0 bytes. The payload is padded with
- 0 bytes.
- PADDING cells are currently used to implement connection
- keepalive. ORs and OPs send one another a PADDING cell every few
- minutes.
- CREATE and DESTROY cells are used to manage circuits; see section
- 4 below.
- DATA cells are used to send commands and data along a circuit; see
- section 5 below.
- SENDME cells are used for flow control; see section 6 below.
- 4. Onions and circuit management
- 4.1. Setting up circuits
- An onion is a multi-layered structure, with one layer for each node
- in a circuit. Each (unencrypted) layer has the following fields:
- Version [1 byte]
- Back cipher [4 bits]
- Forward cipher [4 bits]
- Port [2 bytes]
- Address [4 bytes]
- Expiration time [4 bytes]
- Key seed material [16 bytes]
- [Total: 28 bytes]
- The value of Version is currently 2.
- The forward and backward ciphers fields can take the following values:
- 0: Identity
- 1: Single DES in OFB
- 2: RC4
- The port and address field denote the IPV4 address and port of
- the next onion router in the circuit, or are set to 0 for the
- last hop.
- The expiration time is a number of seconds since the epoch (1
- Jan 1970); by default, it is set to the current time plus one
- day.
- When constructing an onion to create a circuit from OR_1,
- OR_2... OR_N, the onion creator performs the following steps:
- 1. Let M = 100 random bytes.
- 2. For I=N downto 1:
-
- A. Create an onion layer L, setting Version=2,
- BackCipher=DES/OFB(1), ForwardCipher=DES/OFB(2),
- ExpirationTime=now + 1 day, and Seed=16 random bytes.
- If I=N, set Port=Address=0. Else, set Port and Address to
- the IPV4 port and address of OR_{I+1}.
- B. Let M = L | M.
- C. Let K1_I = SHA1(Seed).
- Let K2_I = SHA1(K1_I).
- Let K3_I = SHA1(K2_I).
- D. Encrypt the first 128 bytes of M with the RSA key of
- OR_I, using no padding. Encrypt the remaining portion of
- M with DES/OFB, using K1_I as a key and an all-0 IV.
- 3. M is now the onion.
- To create a connection using the onion M, an OP or OR performs the
- following steps:
- 1. If not already connected to the first router in the chain,
- open a new connection to that router.
- 2. Choose an ACI not already in use on the connection with the
- first router in the chain. If our address/port pair is
- numerically higher than the address/port pair of the other
- side, then let the high bit of the ACI be 1, else 0.
- 3. To send M over the wire, prepend a 4-byte integer containing
- Len(M). Call the result M'. Let N=ceil(Len(M')/120).
- Divide M' into N chunks, such that:
- Chunk_I = M'[(I-1)*120:I*120] for 1 <= I <= N-1
- Chunk_N = M'[(N-1)*120:Len(M')]
- 4. Send N CREATE cells along the connection, setting the ACI
- on each to the selected ACI, setting the payload on each to
- the corresponding 'Chunk_I', and setting the length on each
- to the length of the payload.
- Upon receiving a CREATE cell along a connection, an OR performs
- the following steps:
- 1. If we already have an 'open' circuit along this connection
- with this ACI, drop the cell.
- Otherwise, if we have no circuit along this connection with
- this ACI, let L = the integer value of the first 4 bytes of
- the payload. Create a half-open circuit with this ACI, and
- begin queueing CREATE cells for this circuit.
- Otherwise, we have a half-open circuit. If the total
- payload length of the CREATE cells for this circuit is at
- least equal to the onion length in the first cell (minus
- 4), then process the onion.
-
- 2. Once we have a complete onion, decrypt the first 128 bytes
- of the onion with this OR's RSA private key, and extract
- the outmost onion layer. If the version, back cipher, or
- forward cipher is unrecognized, or the expiration time is
- in the past, then tear down the circuit (see section 4.2).
- Compute K1 through K3 as above. Use K1 to decrypt the rest
- of the onion using DES/OFB.
- If we are not the exit node, remove the first layer from the
- decrypted onion, and send the remainder to the next OR
- on the circuit, as specified above. (Note that we'll
- choose a different ACI for this circuit on the connection
- with the next OR.)
- As an optimization, OR implementations may delay processing onions
- until a break in traffic allows time to do so without harming
- network latency too greatly.
- 4.2. Tearing down circuits
- Circuits are torn down when an unrecoverable error occurs along
- the circuit, or when all topics on a circuit are closed and the
- circuit's intended lifetime is over.
- To tear down a circuit, an OR or OP sends a DESTROY cell with that
- direction's ACI to the adjacent nodes on that circuit.
- Upon receiving a DESTROY cell, an OR frees resources associated
- with the corresponding circuit. If it's not the start or end of the
- circuit, it sends a DESTROY cell for that circuit to the next OR in
- the circuit. If the node is the start or end of the circuit, then
- it tears down any associated edge connections (see section 5.1).
- After a DESTROY cell has been processed, an OR ignores all data or
- destroy cells for the corresponding circuit.
- 4.3. Routing data cells
- When an OR receives a DATA cell, it checks the cell's ACI and
- determines whether it has a corresponding circuit along that
- connection. If not, the OR drops the DATA cell.
- Otherwise, if the OR is not at the OP edge of the circuit (that is,
- either an 'exit node' or a non-edge node), it de/encrypts the length
- field and the payload with DES/OFB, as follows:
- 'Forward' data cell (same direction as onion):
- Use K2 as key; encrypt.
- 'Back' data cell (opposite direction from onion):
- Use K3 as key; decrypt.
- Otherwise, if the data cell has arrived to the OP edge of the circuit,
- the OP de/encrypts the length and payload fields with DES/OFB as
- follows:
- OP sends data cell:
- For I=1...N, decrypt with K2_I.
- OP receives data cell:
- For I=N...1, encrypt with K3_I.
- Edge nodes process the length and payload fields of DATA cells as
- described in section 5 below.
- 5. Application connections and topic management
- 5.1. Topics and TCP streams
- Within a circuit, the OP and the exit node use the contents of DATA
- packets to tunnel TCP connections ("Topics") across circuits.
- These connections are initiated by the OP.
- The first 4 bytes of each data cell are reserved as follows:
- Topic command [1 byte]
- Unused, set to 0. [1 byte]
- Topic ID [2 bytes]
- The recognized topic commands are:
- 1 -- TOPIC_BEGIN
- 2 -- TOPIC_DATA
- 3 -- TOPIC_END
- 4 -- TOPIC_CONNECTED
- 5 -- TOPIC_SENDME
- All DATA cells pertaining to the same tunneled connection have the
- same topic ID.
- To create a new anonymized TCP connection, the OP sends a
- TOPIC_BEGIN data cell with a payload encoding the address and port
- of the destination host. The payload format is:
- ADDRESS | ',' | PORT | '\000'
- where ADDRESS may be a DNS hostname, or an IPv4 address in
- dotted-quad format; and where PORT is encoded in decimal.
- Upon receiving this packet, the exit node resolves the address as
- necessary, and opens a new TCP connection to the target port. If
- the address cannot be resolved, or a connection can't be
- established, the exit node replies with a TOPIC_END cell.
- Otherwise, the exit node replies with a TOPIC_CONNECTED cell.
- The OP waits for a TOPIC_CONNECTED cell before sending any data.
- Once a connection has been established, the OP and exit node
- package stream data in TOPIC_DATA cells, and upon receiving such
- cells, echo their contents to the corresponding TCP stream.
- When one side of the TCP stream is closed, the corresponding edge
- node sends a TOPIC_END cell along the circuit; upon receiving a
- TOPIC_END cell, the edge node closes the corresponding TCP stream.
- [This should probably become:
- When one side of the TCP stream is closed, the corresponding edge
- node sends a TOPIC_END cell along the circuit; upon receiving a
- TOPIC_END cell, the edge node closes its side of the corresponding
- TCP stream (by sending a FIN packet), but continues to accept and
- package incoming data until both sides of the TCP stream are
- closed. At that point, the edge node sends a second TOPIC_END
- cell, and drops its record of the topic. -NM]
- 6. Flow control
- 6.1. Link throttling
- As discussed above in section 2.1, ORs and OPs negotiate a maximum
- bandwidth upon startup. The communicants only read up to that
- number of bytes per second on average, though they may smooth the
- number of bytes read over a 10-second window.
- [???? more detail? -NM]
- Communicants rely on TCP flow control to prevent the bandwidth
- from being exceeded.
- 6.2. Link padding
- On every cell connection, every ????/bandwidth seconds, if less
- than MIN(bandwidth/(100*128), 10) cells are waiting to be sent
- along a connection, nodes add a single padding cell to the cells
- they will send along the connection.
- 6.3. Circuit flow control
- To control a circuit's bandwidth usage, each node keeps track of
- how many cells it is allowed to send to the next hop in the circuit
- before queueing cells. This 'window' value is initially set to
- 1000 cells in each direction. Each edge node on a circuit sends a
- SENDME cell (with length=100) every time it has received 100 cells
- on the circuit. When a node receives a SENDME cell for a circuit,
- it increases the circuit's window in the corresponding by the value
- of the cell's length field, and (if not an edge node) passes an
- equivalent SENDME cell to the next node in the circuit.
- If a window value ever reaches 0, the OR queues cells for the
- corresponding circuit and direction until it receives an
- appropriate SENDME cell.
- 6.4. Topic flow control
- Edge nodes use TOPIC_SENDME data cells to implement end-to-end flow
- control for individual connections across circuits. As with
- circuit flow control, edge nodes begin with a window of cells (500)
- per topic, and increment the window by a fixed value (50) upon
- receiving a TOPIC_SENDME cell. Edge nodes create and additional
- TOPIC_SENDME cells when [????] -NM
- 7. Directories and routers
- [????]
|