- Tor Protocol Specification
- Roger Dingledine
- Nick Mathewson
- 0. Preliminaries
- THIS SPECIFICATION IS OBSOLETE.
- This document specifies the Tor directory protocol as used in version
- 0.1.0.x and earlier. See dir-spec.txt for a current version.
- 1. Basic operation
- There is a small number of directory authorities, and a larger number of
- caches. Clients and servers know public keys for the directory authorities.
- Tor servers periodically upload self-signed "router descriptors" to the
- directory authorities. Each authority publishes a self-signed "directory"
- (containing all the router descriptors it knows, and a statement on which
- are running) and a self-signed "running routers" document containing only
- the statement on which routers are running.
- All Tors periodically download these documents, downloading the directory
- less frequently than they do the "running routers" document. Clients
- preferentially download from caches rather than authorities.
- 1.1. Document format
- Router descriptors, directories, and running-routers documents all obey the
- following lightweight extensible information format.
- The highest level object is a Document, which consists of one or more
- Items. Every Item begins with a KeywordLine, followed by zero or more
- Objects. A KeywordLine begins with a Keyword, optionally followed by
- whitespace and more non-newline characters, and ends with a newline. A
- Keyword is a sequence of one or more characters in the set [A-Za-z0-9-].
- An Object is a block of encoded data in pseudo-Open-PGP-style
- armor. (cf. RFC 2440)
- More formally:
- Document ::= (Item | NL)+
- Item ::= KeywordLine Object*
- KeywordLine ::= Keyword NL | Keyword WS ArgumentChar+ NL
- Keyword ::= KeywordChar+
- KeywordChar ::= 'A' ... 'Z' | 'a' ... 'z' | '0' ... '9' | '-'
- ArgumentChar ::= any printing ASCII character except NL.
- WS ::= (SP | TAB)+
- Object ::= BeginLine Base-64-encoded-data EndLine
- BeginLine ::= "-----BEGIN " Keyword "-----" NL
- EndLine ::= "-----END " Keyword "-----" NL
- The BeginLine and EndLine of an Object must use the same keyword.
- When interpreting a Document, software MUST reject any document containing a
- KeywordLine that starts with a keyword it doesn't recognize.
- The "opt" keyword is reserved for non-critical future extensions. All
- implementations MUST ignore any item of the form "opt keyword ....." when
- they would not recognize "keyword ....."; and MUST treat "opt keyword ....."
- as synonymous with "keyword ......" when keyword is recognized.
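As an illustration of the format above, the following Python sketch splits a Document into Items and applies the "opt" rule. It is not taken from any Tor implementation: the names parse_document and known_keywords are hypothetical, and the object label pattern also admits spaces, which is an assumption beyond the Keyword grammar (PEM labels such as "RSA PUBLIC KEY" contain them).

```python
import base64
import re

# Keyword, optionally followed by whitespace and arguments.
KEYWORD_RE = re.compile(r"([A-Za-z0-9-]+)(?:[ \t]+(.*))?")
# Assumption: labels may contain spaces, as PEM labels do.
BEGIN_RE = re.compile(r"-----BEGIN ([A-Za-z0-9 -]+)-----")


def parse_document(text, known_keywords):
    """Return a list of (keyword, arguments, objects) tuples.

    Raises ValueError for an unrecognized keyword, unless the item is
    prefixed with "opt", in which case the item is silently skipped.
    """
    items = []
    lines = text.split("\n")
    i = 0
    while i < len(lines):
        line = lines[i]
        i += 1
        if line == "":
            continue  # Document ::= (Item | NL)+
        m = KEYWORD_RE.fullmatch(line)
        if m is None:
            raise ValueError("malformed keyword line: %r" % line)
        keyword, args = m.group(1), m.group(2) or ""
        optional = keyword == "opt"
        if optional:
            # "opt keyword ..." is treated like "keyword ...".
            inner = KEYWORD_RE.fullmatch(args)
            if inner is None:
                raise ValueError("malformed opt line: %r" % line)
            keyword, args = inner.group(1), inner.group(2) or ""
        # Collect the Objects (zero or more) that follow the keyword line.
        objects = []
        while i < len(lines) and (b := BEGIN_RE.fullmatch(lines[i])):
            label = b.group(1)
            end_line = "-----END %s-----" % label  # must match BeginLine
            i += 1
            body = []
            while i < len(lines) and lines[i] != end_line:
                body.append(lines[i])
                i += 1
            if i == len(lines):
                raise ValueError("unterminated object: %r" % label)
            i += 1  # skip the EndLine
            objects.append((label, base64.b64decode("".join(body))))
        if keyword not in known_keywords:
            if optional:
                continue  # unrecognized "opt" items are ignored
            raise ValueError("unrecognized keyword: %r" % keyword)
        items.append((keyword, args, objects))
    return items
```

A caller would pass the set of keywords it understands; an unrecognized keyword without "opt" causes the whole document to be rejected, as required above.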
- 2. Router descriptor format
- Every router descriptor MUST start with a "router" Item; MUST end with a
- "router-signature" Item and an extra NL; and MUST contain exactly one
- instance of each of the following Items: "published" "onion-key" "link-key"
- "signing-key" "bandwidth". Additionally, a router descriptor MAY contain
- any number of "accept", "reject", "fingerprint", "uptime", and "opt" Items.
- Other than "router" and "router-signature", the items may appear in any
- order.
- The items' formats are as follows:
- "router" nickname address ORPort SocksPort DirPort
- Indicates the beginning of a router descriptor. "address"
- must be an IPv4 address in dotted-quad format. The last
- three numbers indicate the TCP ports at which this OR exposes
- functionality. ORPort is a port at which this OR accepts TLS
- connections for the main OR protocol; SocksPort is deprecated and
- should always be 0; and DirPort is the port at which this OR accepts
- directory-related HTTP connections. If any port is not supported,
- the value 0 is given instead of a port number.
- "bandwidth" bandwidth-avg bandwidth-burst bandwidth-observed
- Estimated bandwidth for this router, in bytes per second. The
- "average" bandwidth is the volume per second that the OR is willing
- to sustain over long periods; the "burst" bandwidth is the volume
- that the OR is willing to sustain in very short intervals. The
- "observed" value is an estimate of the capacity this server can
- handle. The server remembers the maximum output bandwidth it sustained
- over any ten-second period in the past day, and likewise the maximum
- sustained input bandwidth. The "observed" value is the lesser of these
- two numbers.
- "platform" string
- A human-readable string describing the system on which this OR is
- running. This MAY include the operating system, and SHOULD include
- the name and version of the software implementing the Tor protocol.
- "published" YYYY-MM-DD HH:MM:SS
- The time, in GMT, when this descriptor was generated.
- "fingerprint"
- A fingerprint (a HASH_LEN-byte hash of the ASN.1-encoded public key,
- encoded in hex, with a single space after every 4 characters) for this router's
- identity key. A descriptor is considered invalid (and MUST be
- rejected) if the fingerprint line does not match the public key.
- [We didn't start parsing this line until Tor 0.1.0.6-rc; it should
- be marked with "opt" until earlier versions of Tor are obsolete.]
- "hibernating" 0|1
- If the value is 1, then the Tor server was hibernating when the
- descriptor was published, and shouldn't be used to build circuits.
- [We didn't start parsing this line until Tor 0.1.0.6-rc; it should
- be marked with "opt" until earlier versions of Tor are obsolete.]
- "uptime"
- The number of seconds that this OR process has been running.
- "onion-key" NL a public key in PEM format
- This key is used to encrypt EXTEND cells for this OR. The key MUST
- be accepted for at least XXXX hours after any new key is published in
- a subsequent descriptor.
- "signing-key" NL a public key in PEM format
- The OR's long-term identity key.
- "accept" exitpattern
- "reject" exitpattern
- These lines, in order, describe the rules that an OR follows when
- deciding whether to allow a new stream to a given address. The
- 'exitpattern' syntax is described below.
- "router-signature" NL Signature NL
- The "SIGNATURE" object contains a signature of the PKCS1-padded
- hash of the entire router descriptor, taken from the beginning of the
- "router" line, through the newline after the "router-signature" line.
- The router descriptor is invalid unless the signature is performed
- with the router's identity key.
- "contact" info NL
- Describes a way to contact the server's administrator, preferably
- including an email address and a PGP key fingerprint.
- "family" names NL
- 'Names' is a whitespace-separated list of server nicknames. If two ORs
- list one another in their "family" entries, then OPs should treat them
- as a single OR for the purpose of path selection.
- For example, if node A's descriptor contains "family B", and node B's
- descriptor contains "family A", then node A and node B should never
- be used on the same circuit.
- "read-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM,NUM,NUM... NL
- "write-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM,NUM,NUM... NL
- Declare how much bandwidth the OR has used recently. Usage is divided
- into intervals of NSEC seconds. The YYYY-MM-DD HH:MM:SS field defines
- the end of the most recent interval. The numbers are the number of
- bytes used in the most recent intervals, ordered from oldest to newest.
- [We didn't start parsing these lines until Tor 0.1.0.6-rc; they should
- be marked with "opt" until earlier versions of Tor are obsolete.]
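To make the interval arithmetic of the "read-history" and "write-history" lines concrete, here is a minimal Python sketch that parses the arguments of such a line and pairs each byte count with the end time of its interval. The function name parse_history is hypothetical, and the timestamp is assumed to be in GMT like the "published" time; the text above does not state this.

```python
import re
from datetime import datetime, timedelta, timezone

HISTORY_RE = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \((\d+) s\) (\d+(?:,\d+)*)"
)


def parse_history(args):
    """Parse 'YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,...'.

    Returns a list of (interval_end, bytes_used), oldest first.
    """
    m = HISTORY_RE.fullmatch(args.strip())
    if m is None:
        raise ValueError("malformed history line: %r" % args)
    # Assumption: the timestamp is GMT, as for "published".
    end = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
    end = end.replace(tzinfo=timezone.utc)
    nsec = int(m.group(2))
    counts = [int(n) for n in m.group(3).split(",")]
    last = len(counts) - 1
    # The newest interval ends at `end`; each older one ends NSEC earlier.
    return [
        (end - timedelta(seconds=nsec * (last - i)), count)
        for i, count in enumerate(counts)
    ]
```

For the arguments "2005-06-01 12:00:00 (900 s) 100,200,300", the three counts are attributed to intervals ending at 11:30:00, 11:45:00, and 12:00:00 respectively.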
- 2.1. Nonterminals in router descriptors
- nickname ::= between 1 and 19 alphanumeric characters, case-insensitive.
- exitpattern ::= addrspec ":" portspec
- portspec ::= "*" | port | port "-" port
- port ::= an integer between 1 and 65535, inclusive.
- addrspec ::= "*" | ip4spec | ip6spec
- ip4spec ::= ip4 | ip4 "/" num_ip4_bits | ip4 "/" ip4mask
- ip4 ::= an IPv4 address in dotted-quad format
- ip4mask ::= an IPv4 mask in dotted-quad format
- num_ip4_bits ::= an integer between 0 and 32
- ip6spec ::= ip6 | ip6 "/" num_ip6_bits
- ip6 ::= an IPv6 address, surrounded by square brackets.
- num_ip6_bits ::= an integer between 0 and 128
- Ports are required; if they are not included in the router
- line, they must appear in the "ports" lines.
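The "accept" and "reject" lines are evaluated in order against the exitpattern grammar above. The following Python sketch shows one way to do this for IPv4 destinations only; ip6spec handling is omitted for brevity. The names pattern_matches and first_match are hypothetical, and returning None when no rule matches is a choice made here, since this document does not define a default.

```python
import ipaddress


def pattern_matches(exitpattern, addr, port):
    """Return True if the IPv4 addr:port falls within the exitpattern."""
    # exitpattern ::= addrspec ":" portspec
    addrspec, _, portspec = exitpattern.rpartition(":")
    if addrspec != "*":
        # Accepts ip4, ip4/num_ip4_bits and ip4/ip4mask forms.
        net = ipaddress.ip_network(addrspec, strict=False)
        if ipaddress.ip_address(addr) not in net:
            return False
    if portspec == "*":
        return True
    # portspec ::= port | port "-" port
    low, _, high = portspec.partition("-")
    high = high or low
    return int(low) <= port <= int(high)


def first_match(policy, addr, port):
    """policy is a list of ("accept" | "reject", exitpattern) in order.

    Returns the action of the first matching rule, or None if no rule
    matches (assumption: the caller decides the default).
    """
    for action, pattern in policy:
        if pattern_matches(pattern, addr, port):
            return action
    return None
```

Because the first matching line decides, a broad "reject *:*" is only meaningful as the last entry of a policy.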
- 3. Directory format
- A Directory begins with a "signed-directory" item, followed by one each of
- the following, in any order: "recommended-software", "published",
- "router-status", "dir-signing-key". It may include any number of "opt"
- items. After these items, a directory includes any number of router
- descriptors, and a single "directory-signature" item.
- "signed-directory"
- Indicates the start of a directory.
- "published" YYYY-MM-DD HH:MM:SS
- The time at which this directory was generated and signed, in GMT.
- "dir-signing-key"
- The key used to sign this directory; see "signing-key" for format.
- "recommended-software" comma-separated-version-list
- A list of which versions of which implementations are currently
- believed to be secure and compatible with the network.
- "running-routers" whitespace-separated-list
- A description of which routers are currently believed to be up or
- down. Every entry consists of an optional "!", followed by either an
- OR's nickname, or "$" followed by a hexadecimal encoding of the hash
- of an OR's identity key. If the "!" is included, the router is
- believed not to be running; otherwise, it is believed to be running.
- If a router's nickname is given, exactly one router of that nickname
- will appear in the directory, and that router is "approved" by the
- directory server. If a hashed identity key is given, that OR is not
- "approved". [XXXX The 'running-routers' line is only provided for
- backward compatibility. New code should parse 'router-status'
- instead.]
- "router-status" whitespace-separated-list
- A description of which routers are currently believed to be up or
- down, and which are verified or unverified. Contains one entry for
- every router that the directory server knows. Each entry is of the
- format:
- !name=$digest [Verified router, currently not live.]
- name=$digest [Verified router, currently live.]
- !$digest [Unverified router, currently not live.]
- or $digest [Unverified router, currently live.]
- (where 'name' is the router's nickname and 'digest' is a hexadecimal
- encoding of the hash of the router's identity key).
- When parsing this line, clients should only mark a router as
- 'verified' if its nickname AND digest match the one provided.
- "directory-signature" nickname-of-dirserver NL Signature
- The signature is computed by computing the digest of the
- directory, from the characters "signed-directory", through the newline
- after "directory-signature". This digest is then padded with PKCS1,
- and signed with the directory server's signing key.
- If software encounters an unrecognized keyword in a single router descriptor,
- it MUST reject only that router descriptor, and continue using the
- others. Because this mechanism is used to add 'critical' extensions to
- future versions of the router descriptor format, implementations should treat
- it as a normal occurrence and not, for example, report it to the user as an
- error. [Versions of Tor prior to 0.1.1 did this.]
- If software encounters an unrecognized keyword in the directory header,
- it SHOULD reject the entire directory.
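The four "router-status" entry forms described in this section can be decoded as in the Python sketch below. This is an illustration only: RouterStatus, parse_router_status, and is_verified are hypothetical names, and normalizing digests to upper case is an assumption made so that hexadecimal comparison is case-insensitive.

```python
from typing import NamedTuple, Optional


class RouterStatus(NamedTuple):
    nickname: Optional[str]  # None for unverified routers
    digest: str              # hex digest of the identity key, upper case
    live: bool
    verified: bool


def parse_router_status(args):
    """Parse the whitespace-separated list of a "router-status" line."""
    statuses = []
    for entry in args.split():
        live = not entry.startswith("!")
        if not live:
            entry = entry[1:]
        name, sep, digest = entry.partition("=")
        if not sep:
            # No "name=" part: an unverified router given by digest only.
            name, digest = None, entry
        if not digest.startswith("$"):
            raise ValueError("malformed entry: %r" % entry)
        statuses.append(
            RouterStatus(name, digest[1:].upper(), live, name is not None)
        )
    return statuses


def is_verified(statuses, nickname, digest):
    """A router is verified only if nickname AND digest both match."""
    return any(
        s.verified
        and s.nickname.lower() == nickname.lower()  # case-insensitive
        and s.digest == digest.upper()
        for s in statuses
    )
```

Checking both fields in is_verified follows the rule above that a nickname match alone must not be enough to mark a router as verified.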
- 4. Network-status descriptor
- A "network-status" (a.k.a. "running-routers") document is a truncated
- directory that contains only the current status of a list of nodes, not
- their actual descriptors. It contains exactly one of each of the following
- entries.
- "network-status"
- Must appear first.
- "published" YYYY-MM-DD HH:MM:SS
- (see section 3 above)
- "router-status" list
- (see section 3 above)
- "directory-signature" NL signature
- (see section 3 above)
- 5. Behavior of a directory server
- A directory server lists the nodes that are currently connected. It speaks
- HTTP on a socket and returns the directory on request.
- Directory servers listen on a certain port (the DirPort), and speak a
- limited version of HTTP/1.0. Clients send either GET or POST commands.
- The basic interactions are:
- "%s %s HTTP/1.0\r\nContent-Length: %lu\r\nHost: %s\r\n\r\n",
- command, url, content-length, host.
- Get "/tor/" to fetch a full directory.
- Get "/tor/dir.z" to fetch a compressed full directory.
- Get "/tor/running-routers" to fetch a network-status descriptor.
- Post "/tor/" to post a server descriptor, with the body of the
- request containing the descriptor.
- "host" is used to specify the address:port of the dirserver, so
- the request can survive going through HTTP proxies.
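The request template above can be filled in as shown in this short Python sketch. The function name build_request is hypothetical, and the address used in the usage note is a documentation placeholder, not a real directory server.

```python
def build_request(command, url, host, body=b""):
    """Build a directory request following the template above.

    command is "GET" or "POST"; host is the address:port of the dirserver;
    body is the descriptor being posted, or empty for GET.
    """
    head = "%s %s HTTP/1.0\r\nContent-Length: %d\r\nHost: %s\r\n\r\n" % (
        command,
        url,
        len(body),
        host,
    )
    return head.encode("ascii") + body
```

For example, build_request("GET", "/tor/dir.z", "192.0.2.1:9030") yields a request for the compressed full directory, and build_request("POST", "/tor/", "192.0.2.1:9030", descriptor) uploads a server descriptor held in the bytes object descriptor.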