- $Id$
- Tor Spec
- Note: This is an attempt to specify Tor as it exists as implemented in
- early June, 2003. It is not recommended that others implement this
- design as it stands; future versions of Tor will implement improved
- protocols.
- TODO: (very soon)
- - Specify truncate/truncated payloads?
- - Specify RELAY_END payloads. [It's 1 byte of reason, then X bytes of
- data, right?]
- - Sendme w/stream0 is circuit sendme
- - Integrate -NM and -RD comments
- - EXTEND cells should have hostnames or nicknames, so that OPs never
- resolve OR hostnames. Else DNS servers can give different answers to
- different OPs, and compromise their anonymity.
- EVEN LATER:
- - Do TCP-style sequencing and ACKing of DATA cells so that we can afford
- to lose some data cells.
- 0. Notation:
- PK -- a public key.
- SK -- a private key.
- K -- a key for a symmetric cipher.
- a|b -- concatenation of 'a' with 'b'.
- All numeric values are encoded in network (big-endian) order.
- Unless otherwise specified, all symmetric ciphers are AES in counter
- mode, with an IV of all 0 bytes. Asymmetric ciphers are either RSA
- with 1024-bit keys and exponents of 65537, or DH with the safe prime
- from RFC 2409, section 6.2, whose hex representation is:
- "FFFFFFFFFFFFFFFFC90FDAA22168C234C4C6628B80DC1CD129024E08"
- "8A67CC74020BBEA63B139B22514A08798E3404DDEF9519B3CD3A431B"
- "302B0A6DF25F14374FE1356D6D51C245E485B576625E7EC6F44C42E9"
- "A637ED6B0BFF5CB6F406B7EDEE386BFB5A899FA5AE9F24117C4B1FE6"
- "49286651ECE65381FFFFFFFFFFFFFFFF"
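- As a rough illustration of the DH parameters above, the following Python
- sketch assembles the prime and runs a toy key agreement. The generator
- value 2 comes from RFC 2409 for this group and is not stated in this
- document; the names DH_P, DH_G, dh_keypair and dh_shared are hypothetical.

```python
import secrets

# 1024-bit safe prime from RFC 2409, section 6.2, as quoted above.
P_HEX = (
    "FFFFFFFFFFFFFFFFC90FDAA22168C234C4C6628B80DC1CD129024E08"
    "8A67CC74020BBEA63B139B22514A08798E3404DDEF9519B3CD3A431B"
    "302B0A6DF25F14374FE1356D6D51C245E485B576625E7EC6F44C42E9"
    "A637ED6B0BFF5CB6F406B7EDEE386BFB5A899FA5AE9F24117C4B1FE6"
    "49286651ECE65381FFFFFFFFFFFFFFFF"
)
DH_P = int(P_HEX, 16)
DH_G = 2  # assumption: generator taken from RFC 2409, not from this spec


def dh_keypair():
    """Return (private exponent x, public value g^x mod p)."""
    x = secrets.randbelow(DH_P - 3) + 2  # uniformly chosen in [2, p-2]
    return x, pow(DH_G, x, DH_P)


def dh_shared(private, peer_public):
    """Return g^xy mod p from our private exponent and the peer's public value."""
    return pow(peer_public, private, DH_P)
```

- Both sides obtain the same value: dh_shared(x, gy) equals dh_shared(y, gx)
- for keypairs (x, gx) and (y, gy).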
- 1. System overview
- Tor is a connection-oriented anonymizing communication service. Users
- build a path known as a "virtual circuit" through the network, in which
- each node knows its predecessor and successor, but no others. Traffic
- flowing down the circuit is unwrapped by a symmetric key at each node,
- which reveals the downstream node.
- 2. Connections
- There are two ways to connect to an onion router (OR). The first is
- as an onion proxy (OP), which allows the OP to authenticate the OR
- without authenticating itself. The second is as another OR, which
- allows mutual authentication.
- Tor uses TLS for link encryption, using the cipher suite
- "TLS_DHE_RSA_WITH_AES_128_CBC_SHA". An OR always sends a
- self-signed X.509 certificate whose commonName is the server's
- nickname, and whose public key is in the server directory.
-
- All parties receiving certificates must confirm that the public
- key is as it appears in the server directory, and close the
- connection if it is not.
- Once a TLS connection is established, the two sides send cells
- (specified below) to one another. Cells are sent serially. All
- cells are 256 bytes long. Cells may be sent embedded in TLS
- records of any size or divided across TLS records, but the framing
- of TLS records should not leak information about the type or
- contents of the cells.
- OR-to-OR connections are never deliberately closed. An OP should
- close a connection to an OR if there are no circuits running over
- the connection, and an amount of time (KeepalivePeriod, defaults to
- 5 minutes) has passed.
- 3. Cell Packet format
- The basic unit of communication for onion routers and onion
- proxies is a fixed-width "cell". Each cell contains the following
- fields:
- CircID [2 bytes]
- Command [1 byte]
- Length [1 byte]
- Sequence number (unused, set to 0) [4 bytes]
- Payload (padded with 0 bytes) [248 bytes]
- [Total size: 256 bytes]
- The 'Command' field holds one of the following values:
- 0 -- PADDING (Padding) (See Sec 6.2)
- 1 -- CREATE (Create a circuit) (See Sec 4)
- 2 -- CREATED (Acknowledge create) (See Sec 4)
- 3 -- RELAY (End-to-end data) (See Sec 5)
- 4 -- DESTROY (Stop using a circuit) (See Sec 4)
- The interpretation of 'Length' and 'Payload' depends on the type of
- the cell.
- PADDING: Neither field is used.
- CREATE: Length is 144; the payload contains the first phase of the
- DH handshake.
- CREATED: Length is 128; the payload contains the second phase of
- the DH handshake.
- RELAY: Length is a value between 8 and 248; the first 'length'
- bytes of payload contain useful data.
- DESTROY: Neither field is used.
- Unused fields are filled with 0 bytes. The payload is padded with
- 0 bytes.
- PADDING cells are currently used to implement connection
- keepalive. ORs and OPs send one another a PADDING cell every few
- minutes.
- CREATE and DESTROY cells are used to manage circuits; see section
- 4 below.
- RELAY cells are used to send commands and data along a circuit; see
- section 5 below.
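- The fixed 256-byte cell layout described in this section can be sketched
- in Python with the struct module. This is only an illustration: the
- function and constant names are hypothetical, and setting Length to the
- payload size for every command is a simplification.

```python
import struct

CELL_LEN = 256
PAYLOAD_LEN = 248
# CircID (2 bytes), Command (1), Length (1), Sequence number (4), Payload (248),
# all in network (big-endian) order with no alignment padding.
CELL_FORMAT = ">HBBI248s"

CMD_PADDING, CMD_CREATE, CMD_CREATED, CMD_RELAY, CMD_DESTROY = range(5)


def pack_cell(circ_id, command, payload=b""):
    """Build one 256-byte cell; the sequence number is unused and set to 0."""
    if len(payload) > PAYLOAD_LEN:
        raise ValueError("payload does not fit in one cell")
    # The '248s' field pads short payloads with zero bytes.
    return struct.pack(CELL_FORMAT, circ_id, command, len(payload), 0, payload)


def unpack_cell(cell):
    """Split a cell into (circ_id, command, useful payload bytes)."""
    if len(cell) != CELL_LEN:
        raise ValueError("cells are exactly 256 bytes long")
    circ_id, command, length, _seq, payload = struct.unpack(CELL_FORMAT, cell)
    return circ_id, command, payload[:length]
```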
- 4. Circuit management
- 4.1. CREATE and CREATED cells
- Users set up circuits incrementally, one hop at a time. To create
- a new circuit, users send a CREATE cell to the first node, with the
- first half of the DH handshake; that node responds with a CREATED cell
- with the second half of the DH handshake. To extend a circuit past
- the first hop, the user sends an EXTEND relay cell (see section 5)
- which instructs the last node in the circuit to send a CREATE cell
- to extend the circuit.
- The payload for a CREATE cell is an 'onion skin', consisting of:
- RSA-encrypted data [128 bytes]
- Symmetrically-encrypted data [16 bytes]
- The RSA-encrypted portion contains:
- Symmetric key [16 bytes]
- First part of DH data (g^x) [112 bytes]
- The symmetrically encrypted portion contains:
- Second part of DH data (g^x) [16 bytes]
- The two parts of the DH data, once decrypted and concatenated, form
- g^x as calculated by the client.
- The relay payload for an EXTEND relay cell consists of:
- Address [4 bytes]
- Port [2 bytes]
- Onion skin [144 bytes]
- The port and address fields denote the IPv4 address and port of the
- next onion router in the circuit.
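- A minimal sketch of packing and unpacking the EXTEND relay payload
- (address, port, onion skin) in Python. The helper names are hypothetical,
- and the onion skin is treated as an opaque 144-byte string.

```python
import socket
import struct

ONION_SKIN_LEN = 144


def pack_extend_payload(ipv4, port, onion_skin):
    """Address [4 bytes] | Port [2 bytes] | Onion skin [144 bytes]."""
    if len(onion_skin) != ONION_SKIN_LEN:
        raise ValueError("onion skin must be 144 bytes")
    return socket.inet_aton(ipv4) + struct.pack(">H", port) + onion_skin


def unpack_extend_payload(payload):
    """Return (dotted-quad address, port, onion skin)."""
    address = socket.inet_ntoa(payload[:4])
    (port,) = struct.unpack(">H", payload[4:6])
    return address, port, payload[6:6 + ONION_SKIN_LEN]
```

- The result is 150 bytes long, which leaves room for the 8-byte relay
- header inside a 248-byte cell payload.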
- 4.2. Setting circuit keys
- Once the handshake between the OP and an OR is completed, both
- sides can calculate g^xy with ordinary DH. From the base key
- material g^xy, they compute two 16-byte keys, called Kf and Kb, as
- follows. First, each side represents g^xy as a big-endian
- unsigned integer. Next, it computes 40 bytes of key data
- as K = SHA1(g^xy | [00]) | SHA1(g^xy | [01]) where "00" is a single
- octet whose value is zero, and "01" is a single octet whose value
- is one. The first 16 bytes of K form Kf, and the next 16 bytes of
- K form Kb.
- Kf is used to encrypt the stream of data going from the OP to the
- OR, whereas Kb is used to encrypt the stream of data going from the
- OR to the OP.
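- The key derivation above can be sketched in Python as follows. The text
- does not say how many bytes are used to represent g^xy; this sketch
- assumes a fixed 128-byte (1024-bit) encoding, and derive_circuit_keys is
- a hypothetical name.

```python
import hashlib


def derive_circuit_keys(gxy, width=128):
    """Return (Kf, Kb) derived from the shared DH value g^xy.

    Assumption: g^xy is encoded as a fixed-width big-endian integer of
    `width` bytes; the spec only says "big-endian unsigned integer".
    """
    base = gxy.to_bytes(width, "big")
    # K = SHA1(g^xy | [00]) | SHA1(g^xy | [01]), 40 bytes in total.
    k = hashlib.sha1(base + b"\x00").digest() + hashlib.sha1(base + b"\x01").digest()
    return k[:16], k[16:32]
```

- Note that Kb spans the last 4 bytes of the first digest and the first 12
- bytes of the second; the final 8 bytes of K are unused.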
- 4.3. Creating circuits
- When creating a circuit through the network, the circuit creator
- performs the following steps:
- 1. Choose a chain of N onion routers (R_1...R_N) to constitute
- the path, such that no router appears in the path twice.
- [this is wrong, see October 2003 discussion on or-dev]
- 2. If not already connected to the first router in the chain,
- open a new connection to that router.
- 3. Choose a circID not already in use on the connection with the
- first router in the chain. If we are an onion router and our
- nickname is lexicographically greater than the nickname of the
- other side, then let the high bit of the circID be 1, else 0.
- 4. Send a CREATE cell along the connection, to be received by
- the first onion router.
- 5. Wait until a CREATED cell is received; finish the handshake
- and extract the forward key Kf_1 and the back key Kb_1.
- 6. For each subsequent onion router R (R_2 through R_N), extend
- the circuit to R.
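- Step 3 above (choosing a circID) might look like the following Python
- sketch. The random selection strategy and the lowercase comparison of
- nicknames are assumptions made for illustration, and choose_circ_id is a
- hypothetical name.

```python
import secrets

HIGH_BIT = 0x8000  # top bit of the 2-byte circID


def choose_circ_id(in_use, is_onion_router, my_nickname="", peer_nickname=""):
    """Pick a circID not in `in_use` for one connection.

    An OR whose nickname sorts after the peer's sets the high bit; an OP, or
    an OR with the lesser nickname, leaves it clear.
    Assumption: nicknames are compared after lowercasing.
    """
    high = 0
    if is_onion_router and my_nickname.lower() > peer_nickname.lower():
        high = HIGH_BIT
    taken = sum(1 for c in in_use if (c & HIGH_BIT) == high)
    if taken >= HIGH_BIT:
        raise RuntimeError("no free circID on this connection")
    while True:
        circ_id = high | secrets.randbelow(HIGH_BIT)
        if circ_id not in in_use:
            return circ_id
```

- Because the two ORs on a connection draw from disjoint halves of the
- circID space, they cannot pick the same circID at the same time.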
- To extend the circuit by a single onion router R_M, the circuit
- creator performs these steps:
- 1. Create an onion skin, encrypting the RSA-encrypted part with
- R_M's public key.
- 2. Encrypt and send the onion skin in a relay EXTEND cell along
- the circuit (see section 5).
- 3. When a relay EXTENDED cell is received, calculate the shared
- keys. The circuit is now extended.
- When an onion router receives an EXTEND relay cell, it sends a
- CREATE cell to the next onion router, with the enclosed onion skin
- as its payload. The initiating onion router chooses some circID not
- yet used on the connection between the two onion routers. (But see
- step 3 of the circuit creation steps above, concerning choosing circIDs.)
- As an extension (called router twins), if the desired next onion
- router R in the circuit is down, and some other onion router R'
- has the same key as R, then it's ok to extend to R' rather than R.
- When an onion router receives a CREATE cell, if it already has a
- circuit on the given connection with the given circID, it drops the
- cell. Otherwise, sometime after receiving the CREATE cell, it completes
- the DH handshake, and replies with a CREATED cell, containing g^y
- as its [128 byte] payload. Upon receiving a CREATED cell, an onion
- router packs its payload into an EXTENDED relay cell (see section 5),
- and sends that cell up the circuit. Upon receiving the EXTENDED
- relay cell, the OP can retrieve g^y.
- (As an optimization, OR implementations may delay processing onions
- until a break in traffic allows time to do so without harming
- network latency too greatly.)
- 4.4. Tearing down circuits
- Circuits are torn down when an unrecoverable error occurs along
- the circuit, or when all streams on a circuit are closed and the
- circuit's intended lifetime is over. Circuits may be torn down
- either completely or hop-by-hop.
- To tear down a circuit completely, an OR or OP sends a DESTROY
- cell to the adjacent nodes on that circuit, using the appropriate
- direction's circID.
- Upon receiving an outgoing DESTROY cell, an OR frees resources
- associated with the corresponding circuit. If it's not the end of
- the circuit, it sends a DESTROY cell for that circuit to the next OR
- in the circuit. If the node is the end of the circuit, then it tears
- down any associated edge connections (see section 5.1).
- After a DESTROY cell has been processed, an OR ignores all data or
- destroy cells for the corresponding circuit.
- To tear down part of a circuit, the OP sends a RELAY_TRUNCATE cell
- signaling a given OR (Stream ID zero). That OR sends a DESTROY
- cell to the next node in the circuit, and replies to the OP with a
- RELAY_TRUNCATED cell.
- When an unrecoverable error occurs along one connection in a
- circuit, the nodes on either side of the connection should, if they
- are able, act as follows: the node closer to the OP should send a
- RELAY_TRUNCATED cell towards the OP; the node farther from the OP
- should send a DESTROY cell down the circuit.
- [We'll have to reevaluate this section once we figure out cleaner
- circuit/connection killing conventions. -RD]
- 4.5. Routing data cells
- When an OR receives a RELAY cell, it checks the cell's circID and
- determines whether it has a corresponding circuit along that
- connection. If not, the OR drops the RELAY cell.
- Otherwise, if the OR is not at the OP edge of the circuit (that is,
- either an 'exit node' or a non-edge node), it de/encrypts the length
- field and the payload with AES/CTR, as follows:
- 'Forward' relay cell (same direction as CREATE):
- Use Kf as key; encrypt.
- 'Back' relay cell (opposite direction from CREATE):
- Use Kb as key; decrypt.
- If the OR recognizes the stream ID on the cell (it is either the ID
- of an open stream or the signaling (zero) ID), the OR processes the
- contents of the relay cell. Otherwise, it passes the decrypted
- relay cell along the circuit if the circuit continues, or drops the
- cell if it's the end of the circuit. [Getting an unrecognized
- relay cell at the end of the circuit must be allowed for now;
- we can reexamine this once we've designed full tcp-style close
- handshakes. -RD]
- Otherwise, if the data cell is coming from the OP edge of the
- circuit, the OP decrypts the length and payload fields with AES/CTR as
- follows:
- OP sends data cell to node R_M:
- For I=1...M, decrypt with Kf_I.
- Otherwise, if the data cell is arriving at the OP edge of the
- circuit, the OP encrypts the length and payload fields with AES/CTR as
- follows:
- OP receives data cell:
- For I=N...1,
- Encrypt with Kb_I. If the stream ID is a recognized
- stream for R_I, or if the stream ID is the signaling
- ID (zero), then stop and process the payload.
- For more information, see section 5 below.
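- The layering described in this section can be illustrated with a toy
- Python sketch. Python's standard library has no AES, so a SHA-1-based
- keystream stands in for AES/CTR here; it is NOT the cipher the spec
- requires. The sketch also restarts the keystream for every cell, which is
- a simplification. All names are hypothetical.

```python
import hashlib


def keystream_xor(key, data):
    """Stand-in stream cipher (not AES/CTR): XOR data with SHA-1(key | counter)."""
    out = bytearray()
    for start in range(0, len(data), 20):
        pad = hashlib.sha1(key + (start // 20).to_bytes(4, "big")).digest()
        out.extend(b ^ p for b, p in zip(data[start:start + 20], pad))
    return bytes(out)


def op_wrap_forward(forward_keys, body):
    """OP side: apply one layer per hop, using Kf_1 ... Kf_M."""
    for kf in forward_keys:
        body = keystream_xor(kf, body)
    return body


def or_process_forward(kf, body):
    """OR side: apply this hop's Kf, removing exactly one layer."""
    return keystream_xor(kf, body)
```

- With a stream cipher, encrypting and decrypting are the same XOR, so each
- OR on the path strips its own layer and the original bytes reappear only
- at R_M.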
- 5. Application connections and stream management
- 5.1. Streams
- Within a circuit, the OP and the exit node use the contents of
- RELAY packets to tunnel end-to-end commands and TCP connections
- ("Streams") across circuits. End-to-end commands can be initiated
- by either edge; streams are initiated by the OP.
- The first 8 bytes of each relay cell are reserved as follows:
- Relay command [1 byte]
- Stream ID [7 bytes]
- The relay commands are:
- 1 -- RELAY_BEGIN
- 2 -- RELAY_DATA
- 3 -- RELAY_END
- 4 -- RELAY_CONNECTED
- 5 -- RELAY_SENDME
- 6 -- RELAY_EXTEND
- 7 -- RELAY_EXTENDED
- 8 -- RELAY_TRUNCATE
- 9 -- RELAY_TRUNCATED
- 10 -- RELAY_DROP
- All RELAY cells pertaining to the same tunneled stream have the
- same stream ID. Stream IDs are chosen randomly by the OP. A
- stream ID is considered "recognized" on a circuit C by an OP or an
- OR if it already has an existing stream established on that
- circuit, or if the stream ID is equal to the signaling stream ID,
- which is all zero: [00 00 00 00 00 00 00]
- To create a new anonymized TCP connection, the OP sends a
- RELAY_BEGIN data cell with a payload encoding the address and port
- of the destination host. The stream ID is zero. The payload format is:
- NEWSTREAMID | ADDRESS | ':' | PORT | '\000'
- where NEWSTREAMID is the newly generated Stream ID to use for
- this stream, ADDRESS may be a DNS hostname, or an IPv4 address in
- dotted-quad format; and where PORT is encoded in decimal.
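- As an illustration of the RELAY_BEGIN layout, the Python sketch below
- builds and parses the relay payload: the 8-byte relay header (command 1,
- signaling stream ID zero) followed by
- NEWSTREAMID | ADDRESS | ':' | PORT | '\000'. The function names are
- hypothetical.

```python
RELAY_BEGIN = 1
STREAM_ID_LEN = 7
SIGNALING_STREAM_ID = bytes(STREAM_ID_LEN)  # seven zero bytes
RELAY_HEADER_LEN = 1 + STREAM_ID_LEN


def build_relay_begin(new_stream_id, address, port):
    """Return the relay payload for a RELAY_BEGIN cell."""
    if len(new_stream_id) != STREAM_ID_LEN:
        raise ValueError("stream IDs are 7 bytes long")
    header = bytes([RELAY_BEGIN]) + SIGNALING_STREAM_ID
    target = address.encode("ascii") + b":" + str(port).encode("ascii") + b"\x00"
    return header + new_stream_id + target


def parse_relay_begin(relay_payload):
    """Return (new_stream_id, address, port) from a RELAY_BEGIN payload."""
    body = relay_payload[RELAY_HEADER_LEN:]
    new_stream_id = body[:STREAM_ID_LEN]
    # Search for the terminator after the stream ID, which may itself contain zeros.
    end = body.index(b"\x00", STREAM_ID_LEN)
    address, _, port = body[STREAM_ID_LEN:end].rpartition(b":")
    return new_stream_id, address.decode("ascii"), int(port)
```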
- Upon receiving this packet, the exit node resolves the address as
- necessary, and opens a new TCP connection to the target port. If
- the address cannot be resolved, or a connection can't be
- established, the exit node replies with a RELAY_END cell.
- Otherwise, the exit node replies with a RELAY_CONNECTED cell.
- The OP waits for a RELAY_CONNECTED cell before sending any data.
- Once a connection has been established, the OP and exit node
- package stream data in RELAY_DATA cells, and upon receiving such
- cells, echo their contents to the corresponding TCP stream.
- Relay RELAY_DROP cells are long-range dummies; upon receiving such
- a cell, the OR or OP must drop it.
- 5.2. Closing streams
- [Note -- TCP streams can only be half-closed for reading. Our
- Bickford's conversation was incorrect. -NM]
- Because TCP connections can be half-open, we follow an equivalent
- to TCP's FIN/FIN-ACK/ACK protocol to close streams.
- An exit connection can have a TCP stream in one of three states:
- 'OPEN', 'DONE_PACKAGING', and 'DONE_DELIVERING'. For the purposes
- of modeling transitions, we treat 'CLOSED' as a fourth state,
- although connections in this state are not, in fact, tracked by the
- onion router.
- A stream begins in the 'OPEN' state. Upon receiving a 'FIN' from
- the corresponding TCP connection, the edge node sends a 'RELAY_END'
- cell along the circuit and changes its state to 'DONE_PACKAGING'.
- Upon receiving a 'RELAY_END' cell, an edge node sends a 'FIN' to
- the corresponding TCP connection (e.g., by calling
- shutdown(SHUT_WR)) and changes its state to 'DONE_DELIVERING'.
- When a stream already in 'DONE_DELIVERING' receives a 'FIN', it
- also sends a 'RELAY_END' along the circuit, and changes its state
- to 'CLOSED'. When a stream already in 'DONE_PACKAGING' receives a
- 'RELAY_END' cell, it sends a 'FIN' and changes its state to
- 'CLOSED'.
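- The four transitions just described can be written as a small table. The
- Python sketch below is only a model of that state machine; the names
- TRANSITIONS and step, and the choice to raise an error on unlisted
- combinations, are assumptions.

```python
# (current state, event) -> (next state, action taken by the edge node)
TRANSITIONS = {
    ("OPEN", "FIN"): ("DONE_PACKAGING", "send RELAY_END"),
    ("OPEN", "RELAY_END"): ("DONE_DELIVERING", "send FIN"),
    ("DONE_DELIVERING", "FIN"): ("CLOSED", "send RELAY_END"),
    ("DONE_PACKAGING", "RELAY_END"): ("CLOSED", "send FIN"),
}


def step(state, event):
    """Return (next_state, action) for a 'FIN' or 'RELAY_END' event."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        # Assumption: the text does not say what happens in other cases.
        raise ValueError(f"no transition from {state} on {event}") from None
```

- For example, a stream that sees a TCP FIN and later a RELAY_END goes
- OPEN -> DONE_PACKAGING -> CLOSED.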
- [Note: Please rename 'RELAY_END2'. :) -NM ]
- If an edge node encounters an error on any stream, it sends a
- 'RELAY_END2' cell along the circuit (if possible) and closes the
- TCP connection immediately. If an edge node receives a
- 'RELAY_END2' cell for any stream, it closes the TCP connection
- completely, and sends nothing along the circuit.
- 6. Flow control
- 6.1. Link throttling
- Each node should do appropriate bandwidth throttling to keep its
- user happy.
- Communicants rely on TCP's default flow control to push back when they
- stop reading.
- 6.2. Link padding
- Currently nodes are not required to do any sort of link padding or
- dummy traffic. Because strong attacks exist even with link padding,
- and because link padding greatly increases the bandwidth requirements
- for running a node, we plan to leave out link padding until this
- tradeoff is better understood.
- 6.3. Circuit-level flow control
- To control a circuit's bandwidth usage, each OR keeps track of
- two 'windows', consisting of how many RELAY_DATA cells it is
- allowed to package for transmission, and how many RELAY_DATA cells
- it is willing to deliver to streams outside the network.
- Each 'window' value is initially set to 1000 data cells
- in each direction (cells that are not data cells do not affect
- the window). When an OR is willing to deliver more cells, it sends a
- RELAY_SENDME cell towards the OP, with Stream ID zero. When an OR
- receives a RELAY_SENDME cell with stream ID zero, it increments its
- packaging window.
- Each of these cells increments the corresponding window by 100.
- The OP behaves identically, except that it must track a packaging
- window and a delivery window for every OR in the circuit.
-
- An OR or OP sends cells to increment its delivery window when the
- corresponding window value falls under some threshold (900).
- If a packaging window reaches 0, the OR or OP stops reading from
- TCP connections for all streams on the corresponding circuit, and
- sends no more RELAY_DATA cells until receiving a RELAY_SENDME cell.
- [this stuff is badly worded; copy in the tor-design section -RD]
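- One way to read the circuit-level window rules is sketched below in
- Python. The class and method names are hypothetical, and the sketch
- models a single pair of windows as kept by one OR.

```python
CIRCWINDOW_START = 1000     # initial value of each window
CIRCWINDOW_INCREMENT = 100  # added per RELAY_SENDME
SENDME_THRESHOLD = 900      # send a RELAY_SENDME when the delivery window falls under this


class CircuitWindows:
    def __init__(self):
        self.package_window = CIRCWINDOW_START
        self.deliver_window = CIRCWINDOW_START

    def can_package(self):
        """False means: stop reading from TCP and send no more RELAY_DATA."""
        return self.package_window > 0

    def on_data_cell_packaged(self):
        if not self.can_package():
            raise RuntimeError("packaging window is empty")
        self.package_window -= 1

    def on_data_cell_delivered(self):
        """Account for one delivered RELAY_DATA cell; return SENDMEs to send."""
        self.deliver_window -= 1
        sendmes = 0
        while self.deliver_window < SENDME_THRESHOLD:
            self.deliver_window += CIRCWINDOW_INCREMENT
            sendmes += 1
        return sendmes

    def on_sendme_received(self):
        """A RELAY_SENDME with stream ID zero reopens the packaging window."""
        self.package_window += CIRCWINDOW_INCREMENT
```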
- 6.4. Stream-level flow control
- Edge nodes use RELAY_SENDME cells to implement end-to-end flow
- control for individual connections across circuits. Similarly to
- circuit-level flow control, edge nodes begin with a window of cells
- (500) per stream, and increment the window by a fixed value (50)
- upon receiving a RELAY_SENDME cell. Edge nodes initiate RELAY_SENDME
- cells when both a) the window is <= 450, and b) there are fewer than
- ten cell payloads remaining to be flushed at that edge.
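- The two-part condition for initiating a stream-level RELAY_SENDME can be
- expressed as a single predicate. This Python sketch uses hypothetical
- names and takes the number of unflushed cell payloads as an input.

```python
STREAMWINDOW_START = 500      # initial per-stream window
STREAMWINDOW_INCREMENT = 50   # added per RELAY_SENDME
STREAM_SENDME_THRESHOLD = 450
MAX_PENDING_PAYLOADS = 10


def should_send_stream_sendme(window, pending_cell_payloads):
    """True when a) the window is <= 450 and b) fewer than ten payloads remain to flush."""
    return (window <= STREAM_SENDME_THRESHOLD
            and pending_cell_payloads < MAX_PENDING_PAYLOADS)
```

- Condition b) keeps an edge from asking for more data while it still has
- a backlog of its own to write out.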
- 7. Directories and routers
- 7.1. Router descriptor format.
- (Unless otherwise noted, tokens on the same line are space-separated.)
- Router ::= Router-Line Date-Line Onion-Key Link-Key Signing-Key Exit-Policy Router-Signature NL
- Router-Line ::= "router" nickname address ORPort SocksPort DirPort bandwidth NL
- Date-Line ::= "published" YYYY-MM-DD HH:MM:SS NL
- Onion-Key ::= "onion-key" NL a public key in PEM format NL
- Link-Key ::= "link-key" NL a public key in PEM format NL
- Signing-Key ::= "signing-key" NL a public key in PEM format NL
- Exit-Policy ::= Exit-Line*
- Exit-Line ::= ("accept"|"reject") string NL
- Router-Signature ::= "router-signature" NL Signature
- Signature ::= "-----BEGIN SIGNATURE-----" NL
- Base-64-encoded-signature NL "-----END SIGNATURE-----" NL
- ORPort ::= port where the router listens for routers/proxies (speaking cells)
- SocksPort ::= where the router listens for applications (speaking socks)
- DirPort ::= where the router listens for directory download requests
- bandwidth ::= maximum bandwidth, in bytes/s
- nickname ::= between 1 and 32 alphanumeric characters. case-insensitive.
- Example:
- router moria1 moria.mit.edu 9001 9021 9031 100000
- published 2003-09-24 19:36:05
- -----BEGIN RSA PUBLIC KEY-----
- MIGJAoGBAMBBuk1sYxEg5jLAJy86U3GGJ7EGMSV7yoA6mmcsEVU3pwTUrpbpCmwS
- 7BvovoY3z4zk63NZVBErgKQUDkn3pp8n83xZgEf4GI27gdWIIwaBjEimuJlEY+7K
- nZ7kVMRoiXCbjL6VAtNa4Zy1Af/GOm0iCIDpholeujQ95xew7rQnAgMA//8=
- -----END RSA PUBLIC KEY-----
- signing-key
- -----BEGIN RSA PUBLIC KEY-----
- 7BvovoY3z4zk63NZVBErgKQUDkn3pp8n83xZgEf4GI27gdWIIwaBjEimuJlEY+7K
- MIGJAoGBAMBBuk1sYxEg5jLAJy86U3GGJ7EGMSV7yoA6mmcsEVU3pwTUrpbpCmwS
- f/GOm0iCIDpholeujQ95xew7rnZ7kVMRoiXCbjL6VAtNa4Zy1AQnAgMA//8=
- -----END RSA PUBLIC KEY-----
- reject 18.0.0.0/24
- Note: The extra newline at the end of the router block is intentional.
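- A minimal Python sketch of parsing the Router-Line, using the grammar
- above. The RouterLine tuple and parse_router_line are hypothetical names,
- and restricting nicknames to ASCII alphanumerics is an assumption.

```python
from collections import namedtuple

RouterLine = namedtuple(
    "RouterLine", "nickname address or_port socks_port dir_port bandwidth"
)


def parse_router_line(line):
    """Parse: "router" nickname address ORPort SocksPort DirPort bandwidth."""
    tokens = line.split()
    if len(tokens) != 7 or tokens[0] != "router":
        raise ValueError("not a router line")
    nickname, address = tokens[1], tokens[2]
    # Nicknames are 1 to 32 alphanumeric characters (assumed ASCII here).
    if not (1 <= len(nickname) <= 32 and nickname.isascii() and nickname.isalnum()):
        raise ValueError("bad nickname")
    or_port, socks_port, dir_port, bandwidth = (int(t) for t in tokens[3:])
    return RouterLine(nickname, address, or_port, socks_port, dir_port, bandwidth)
```

- Applied to the first line of the example above, this yields nickname
- "moria1", ORPort 9001, SocksPort 9021, DirPort 9031 and a bandwidth of
- 100000 bytes/s.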
- 7.2. Directory format
- Directory ::= Directory-Header Directory-Router Router* Directory-Signature
- Directory-Header ::= "signed-directory" NL Software-Line NL
- Software-Line ::= "recommended-software" comma-separated-version-list
- Directory-Router ::= Router
- Directory-Signature ::= "directory-signature" NL Signature
- Signature ::= "-----BEGIN SIGNATURE-----" NL
- Base-64-encoded-signature NL "-----END SIGNATURE-----" NL
- Note: The router block for the directory server must appear first.
- The signature is computed by computing the SHA-1 hash of the
- directory, from the characters "signed-directory", through the newline
- after "directory-signature". This digest is then padded with PKCS.1,
- and signed with the directory server's signing key.
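- The span of text covered by the digest can be made concrete with a short
- Python sketch. It computes only the SHA-1 digest; padding and signing are
- left out. The name directory_digest and the ASCII encoding are
- assumptions.

```python
import hashlib


def directory_digest(directory_text):
    """SHA-1 over "signed-directory" through the newline after "directory-signature"."""
    start = directory_text.index("signed-directory")
    marker = "directory-signature"
    marker_pos = directory_text.index(marker, start)
    # Include the newline that follows the marker, but nothing after it.
    end = directory_text.index("\n", marker_pos + len(marker)) + 1
    return hashlib.sha1(directory_text[start:end].encode("ascii")).digest()
```

- Since the signature block itself lies outside the hashed span, the digest
- can be computed before the signature is appended.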
- 7.3. Behavior of a directory server
- A directory server lists the nodes that are currently connected.
- It speaks HTTP on a socket and returns the directory on request.
- -----------
- (for emacs)
- Local Variables:
- mode:text
- indent-tabs-mode:nil
- fill-column:77
- End: