- $Id$
- Tor directory protocol for 0.1.1.x series
- 0. Scope and preliminaries
- This document should eventually be merged into tor-spec.txt and replace
- the existing notes on directories.
- This is not a finalized version; what we actually wind up implementing
- may be very different from the system described here.
- 0.1. Goals
- There are several problems with the way Tor handles directories right
- now:
- 1. Directories are very large and use a lot of bandwidth.
- 2. Every directory server is a single point of failure.
- 3. Requiring every client to know every server won't scale.
- 4. Requiring every directory cache to know every server won't scale.
- 5. Our current "verified server" system is kind of nonsensical.
- 6. Getting more directory servers adds more points of failure and
- worsens possible partitioning attacks.
- This design tries to solve every problem except problems 3 and 4, and to
- be compatible with likely eventual solutions to problems 3 and 4.
- 1. Outline
- There is no longer any such thing as a "signed directory". Instead,
- directory servers sign a very compressed 'network status' object that
- lists the current descriptors and their status, and router descriptors
- continue to be self-signed by servers. Clients download network status
- listings periodically, and download router descriptors as needed. ORs
- upload descriptors relatively infrequently.
- There are multiple directory servers. Rather than doing anything
- complicated to coordinate themselves, clients simply rotate through them
- in order, and only use servers that most of the last several directory
- servers like.
- 2. Router descriptors
- The router descriptor format is unchanged from tor-spec.txt.
- ORs SHOULD generate a new router descriptor whenever any of the
- following events have occurred:
- - A period of time (18 hrs by default) has passed since the last
- time a descriptor was generated.
- - A descriptor field other than bandwidth or uptime has changed.
- - Bandwidth has changed by more than +/- 50% from the last time a
- descriptor was generated, and at least a given interval of time
- (20 mins by default) has passed since then.
- - Uptime has been reset.
- After generating a descriptor, ORs upload it to every directory
- server they know.
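- The regeneration rules above can be sketched as a single predicate. This is
- only an illustration, not the Tor implementation: the DescriptorState type,
- the constant names, and the should_regenerate() helper are hypothetical, and
- the sketch assumes times are plain seconds and that "other fields" can be
- compared as a tuple.

```python
from dataclasses import dataclass

# Defaults taken from the list above; the constant names are hypothetical.
MAX_DESCRIPTOR_AGE = 18 * 60 * 60      # 18 hours, in seconds
MIN_BANDWIDTH_INTERVAL = 20 * 60       # 20 minutes, in seconds
BANDWIDTH_CHANGE_FRACTION = 0.5        # +/- 50%


@dataclass
class DescriptorState:
    """What an OR remembers about the last descriptor it generated."""
    generated_at: float    # time of generation, in seconds
    bandwidth: int         # bandwidth advertised in that descriptor
    other_fields: tuple    # every field except bandwidth and uptime


def should_regenerate(last, now, bandwidth, other_fields, uptime_reset):
    """Return True if any of the four regeneration events has occurred."""
    age = now - last.generated_at
    if age >= MAX_DESCRIPTOR_AGE:
        return True
    if other_fields != last.other_fields:
        return True
    if uptime_reset:
        return True
    # The bandwidth rule only applies once the minimum interval has passed.
    if age >= MIN_BANDWIDTH_INTERVAL:
        threshold = BANDWIDTH_CHANGE_FRACTION * last.bandwidth
        if abs(bandwidth - last.bandwidth) > threshold:
            return True
    return False
```

- A caller would then upload the new descriptor to every directory server it
- knows whenever should_regenerate() returns True.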
- 3. Network status
- Directory servers generate, sign, and compress a network-status document
- as needed. As an optimization, they may rate-limit the number of such
- documents generated to once every few seconds. Directory servers should
- rate-limit at least to the point where these documents are generated no
- faster than once per second.
- The network status document contains a preamble, a set of router status
- entries, and a signature, in that order.
- We use the same meta-format as used for directories and router descriptors
- in "tor-spec.txt".
- The preamble contains:
- "network-status-version" -- A document format version. For this
- specification, the version is "2".
- "dir-source" -- The hostname, current IP address, and directory
- port of the directory server, separated by spaces.
- "fingerprint" -- A base16-encoded hash of the signing key's
- fingerprint, with no additional spaces added.
- "contact" -- An arbitrary string describing how to contact the
- directory server's administrator. Administrators should include at
- least an email address and a PGP fingerprint.
- "dir-signing-key" -- The directory server's public signing key.
- "client-versions" -- A comma-separated list of recommended client versions.
- "server-versions" -- A comma-separated list of recommended server versions.
- "published" -- The publication time for this network-status object.
- "dir-options" -- A set of flags separated by spaces:
- "Names" if this directory server performs name bindings.
- "Versions" if this directory server recommends software versions.
- The dir-options entry is optional. The "-versions" entries are required if
- the "Versions" flag is present. The other entries are required and must
- appear exactly once. The "network-status-version" entry must appear first;
- the others may appear in any order.
- For each router, the router entry contains: (This format is designed for
- conciseness.)
- "r" -- followed by the following elements, separated by spaces:
- - The OR's nickname,
- - A hash of its identity key, encoded in base64, with trailing =
- signs removed.
- - A hash of its most recent descriptor, encoded in base64, with
- trailing = signs removed. (The hash is calculated as for
- computing the signature of a descriptor.)
- - The publication time of its most recent descriptor.
- - An IP
- - An OR port
- - A directory port (or "0" for none)
- "s" -- A series of space-separated status flags:
- "Exit" if the router is useful for building general-purpose exit
- circuits.
- "Stable" if the router tends to stay up for a long time.
- "Fast" if the router has high bandwidth.
- "Running" if the router is currently usable.
- "Named" if the router's identity-nickname mapping is canonical.
- "Valid" if the router has been 'validated'.
- "Authority" if the router is a directory authority.
- The "r" entry for each router must appear first and is required. The
- "s" entry is optional. Unrecognized flags, and extra elements on the
- "r" line, must be ignored.
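- A minimal sketch of parsing one router status entry follows. It is not the
- Tor parser. The RouterStatus type and the function names are hypothetical.
- The sketch assumes the publication time occupies two space-separated tokens
- ("YYYY-MM-DD HH:MM:SS"), which this document does not state, and it restores
- the stripped "=" padding before decoding the base64 fields.

```python
import base64
from dataclasses import dataclass

# Flags defined in the list above.
KNOWN_FLAGS = frozenset(
    {"Exit", "Stable", "Fast", "Running", "Named", "Valid", "Authority"}
)


def decode_unpadded_base64(text):
    """Decode base64 whose trailing '=' signs were removed."""
    return base64.b64decode(text + "=" * (-len(text) % 4))


@dataclass
class RouterStatus:
    nickname: str
    identity_hash: bytes
    descriptor_hash: bytes
    published: str
    address: str
    or_port: int
    dir_port: int      # 0 means the router has no directory port


def parse_r_line(line):
    """Parse an "r" line; extra trailing elements are ignored."""
    parts = line.split()
    if len(parts) < 9 or parts[0] != "r":
        raise ValueError("malformed r line")
    # Assumption: the publication time is a date token plus a time token.
    nickname, ident, digest, date, clock, address, or_port, dir_port = parts[1:9]
    return RouterStatus(
        nickname=nickname,
        identity_hash=decode_unpadded_base64(ident),
        descriptor_hash=decode_unpadded_base64(digest),
        published=f"{date} {clock}",
        address=address,
        or_port=int(or_port),
        dir_port=int(dir_port),
    )


def parse_s_line(line):
    """Parse an "s" line, dropping flags this sketch does not recognize."""
    parts = line.split()
    if not parts or parts[0] != "s":
        raise ValueError("malformed s line")
    return frozenset(parts[1:]) & KNOWN_FLAGS
```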
- The signature section contains:
- "directory-signature" -- A signature of the rest of the document using
- the directory server's signing key.
- We compress the network status list with zlib before transmitting it.
- 4. Directory server operation
- By default, directory servers remember all non-expired, non-superseded OR
- descriptors that they have seen.
- For each OR, a directory server remembers whether the OR was running and
- functional the last time it tried to connect to it, and possibly other
- liveness information.
- Directory server administrators may label some servers or IPs as
- blacklisted, and elect not to include them in their network-status lists.
- Thus, the network-status list includes all non-blacklisted,
- non-expired, non-superseded descriptors for ORs that the directory has
- observed at least once to be running.
- Directory server administrators may decide to support name binding. If
- they do, then they must maintain a file of nickname-to-identity-key
- mappings, and try to keep this file consistent with other directory
- servers. If they don't, they act as clients, and report bindings made by
- other directory servers (name X is bound to identity Y if at least one
- binding directory lists it, and no directory binds X to some other Y'.)
- The authoritative network-status published by a host should be available at:
- http://<hostname>/tor/status/authority.z
- An authoritative network-status published by another host with fingerprint
- <F> should be available at:
- http://<hostname>/tor/status/fp/<F>.z
- The authoritative network-status documents published by other hosts with
- fingerprints <F1>,<F2>,<F3> should be available at:
- http://<hostname>/tor/status/fp/<F1>+<F2>+<F3>.z
- The most recent network-status documents from all known authoritative
- directories, concatenated, should be available at:
- http://<hostname>/tor/status/all.z
- The most recent descriptor for a server whose identity key has a
- fingerprint of <F> should be available at:
- http://<hostname>/tor/server/fp/<F>.z
- The most recent descriptors for servers with fingerprints <F1>,<F2>,<F3>
- should be available at:
- http://<hostname>/tor/server/fp/<F1>+<F2>+<F3>.z
- The descriptor for a server whose digest (in hex) is <D> should be
- available at:
- http://<hostname>/tor/server/d/<D>.z
- The most recent descriptors with digests <D1>,<D2>,<D3> should be
- available at:
- http://<hostname>/tor/server/d/<D1>+<D2>+<D3>.z
- The most recent descriptor for this server should be at:
- http://<hostname>/tor/server/authority.z
- A concatenated set of the most recent descriptors for all known servers
- should be available at:
- http://<hostname>/tor/server/all.z
- For debugging, directories MAY expose non-compressed objects at URLs like
- the above, but without the final ".z".
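- The URL layout above is regular enough to build mechanically. The helper
- below is a hypothetical sketch (directory_url() and its parameters are not
- part of any Tor API) that joins multiple fingerprints or digests with "+"
- and appends ".z" unless the uncompressed debugging form is wanted.

```python
def directory_url(hostname, kind, selector, keys=(), compressed=True):
    """Build a directory URL following the layout listed above.

    kind     -- "status" (network-status) or "server" (router descriptors)
    selector -- "authority", "all", "fp" (fingerprints) or "d" (digests)
    keys     -- fingerprints or hex digests, used with "fp" and "d" only
    """
    if kind not in ("status", "server"):
        raise ValueError("unknown kind")
    if selector in ("authority", "all"):
        path = selector
    elif selector == "fp" or (selector == "d" and kind == "server"):
        if not keys:
            raise ValueError("at least one key is required")
        path = selector + "/" + "+".join(keys)
    else:
        raise ValueError("unsupported selector for this kind")
    suffix = ".z" if compressed else ""
    return f"http://{hostname}/tor/{kind}/{path}{suffix}"
```

- For example, directory_url("dir.example", "server", "fp", ["F1", "F2"])
- yields a URL of the "/tor/server/fp/<F1>+<F2>.z" form.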
- Clients MUST handle compressed concatenated information in two forms:
- - A concatenated list of zlib-compressed objects.
- - A zlib-compressed concatenated list of objects.
- Directory servers MAY generate either format: the former requires less
- CPU, but the latter requires less bandwidth.
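- One way a client could accept both forms with the same code path is to
- decompress one zlib stream at a time and continue with whatever bytes
- follow it. The sketch below uses Python's standard zlib module; the function
- name is hypothetical, and it returns the concatenated plaintext without
- splitting it into individual objects.

```python
import zlib


def decompress_concatenated(payload):
    """Decompress one or more back-to-back zlib streams.

    A single compressed concatenation is simply the one-stream case.
    """
    pieces = []
    remaining = payload
    while remaining:
        stream = zlib.decompressobj()
        pieces.append(stream.decompress(remaining))
        if not stream.eof:
            raise ValueError("truncated zlib stream")
        # Bytes after the end of this stream belong to the next object.
        remaining = stream.unused_data
    return b"".join(pieces)
```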
- 4.1. Caching
- Directory caches (most ORs) regularly download network status documents,
- and republish them at a URL based on the directory server's identity key:
- http://<hostname>/tor/status/<identity fingerprint>.z
- A concatenated list of all network-status documents should be available at:
- http://<hostname>/tor/status/all.z
- 4.2. Compression
- 5. Client operation
- Every OP or OR, including directory servers, acts as a client to the
- directory protocol.
- Each client maintains a list of trusted directory servers. Periodically
- (currently every 20 minutes), the client downloads a new network status. It
- chooses the directory server from which its current information is most
- out-of-date, and retries on failure until it finds a running server.
- When choosing ORs to build circuits, clients proceed as follows:
- - A server is "listed" if it is listed by more than half of the "live"
- network status documents the client has downloaded. (A network
- status is "live" if it is the most recently downloaded network status
- document for a given directory server, and the server is a directory
- server trusted by the client, and the network-status document is no
- more than D (say, 10) days old.)
- - A server is "valid" if it is listed as valid by more than half of the
- "live" downloaded network-status documents.
- - A server is "running" if it is listed as running by more than
- half of the "recent" downloaded network-status documents.
- (A network status is "recent" if it was published in the last
- 60 minutes. If there are fewer than 3 such documents, the most
- recently published 3 are "recent." If there are fewer than 3 in all,
- all are "recent.")
- Clients store network status documents so long as they are live.
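- The "listed", "valid", and "running" rules above amount to majority votes
- over two sets of documents. The following sketch is illustrative only: the
- NetworkStatus type and helper names are hypothetical, times are plain
- seconds, and it assumes that "recent" documents are drawn from the "live"
- set, which this section does not state explicitly.

```python
from dataclasses import dataclass


@dataclass
class NetworkStatus:
    authority: str      # directory server that signed the document
    published: float    # publication time, in seconds
    downloaded: float   # time the client downloaded it, in seconds
    routers: dict       # router identity -> set of status flags


def live_statuses(statuses, trusted, now, max_age=10 * 86400):
    """Most recently downloaded document per trusted server, at most D days old."""
    newest = {}
    for status in statuses:
        if status.authority not in trusted:
            continue
        current = newest.get(status.authority)
        if current is None or status.downloaded > current.downloaded:
            newest[status.authority] = status
    return [s for s in newest.values() if now - s.published <= max_age]


def recent_statuses(live, now, window=60 * 60, minimum=3):
    """Documents published in the last hour, padded to the newest three."""
    by_age = sorted(live, key=lambda s: s.published, reverse=True)
    recent = [s for s in by_age if now - s.published <= window]
    if len(recent) < minimum:
        recent = by_age[:minimum]   # all of them if fewer than three exist
    return recent


def more_than_half(docs, identity, flag=None):
    """True if over half of docs list the router (with flag, if given)."""
    votes = sum(
        1 for s in docs
        if identity in s.routers and (flag is None or flag in s.routers[identity])
    )
    return votes * 2 > len(docs)
```

- With these helpers, a router is "listed" if more_than_half(live, id),
- "valid" if more_than_half(live, id, "Valid"), and "running" if
- more_than_half(recent, id, "Running").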
- 5.1. Scheduling network status downloads
- This download scheduling algorithm implements the approach described above
- in a relatively low-state fashion. It reflects the current Tor
- implementation.
- Clients maintain a list of authorities; each client tries to keep the same
- list, in the same order.
- Periodically, on startup, and on HUP, clients check whether they need to
- download fresh network status documents. The approach is as follows:
- - If we have under X network status documents newer than OLD, we choose a
- member of the list at random and try to download XX documents starting
- with that member's.
- - Otherwise, if we have no network status documents newer than NEW, we
- check to see which authority's document we retrieved most recently,
- and try to retrieve the next authority's document. If we can't, we
- try the next authority in sequence, and so on.
- 5.2. Managing naming
- In order to provide human-memorable names for individual server
- identities, some directory servers bind names to IDs. Clients handle
- names in two ways:
- If a client is encountering a name it has not mapped before:
- If all the "binding" network-status documents the client has so far
- received claim that the name binds to some identity X, and the
- client has received at least three network-status documents, the client
- maps the name to X.
- If a client is encountering a name it has mapped before:
- It uses the last-mapped identity value, unless all of the "binding"
- network status documents bind the name to some other identity.
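- The two naming cases can be sketched as follows. This is a reading of the
- rules above, not the Tor implementation. The function names are
- hypothetical. The sketch assumes "claims" holds one entry per binding
- network-status document (None when that document does not bind the name),
- and that when every binding document agrees on a different identity the
- client switches to it.

```python
def unanimous_claim(claims):
    """Return the identity every binding document agrees on, else None."""
    distinct = set(claims)
    if len(distinct) == 1:
        (only,) = distinct
        return only          # None if no document binds the name
    return None


def map_name(claims, statuses_received, previous=None, minimum_statuses=3):
    """Return the identity a name maps to, or None if it stays unmapped.

    claims            -- identity claimed by each binding document
    statuses_received -- number of network-status documents received
    previous          -- identity this name was last mapped to, if any
    """
    agreed = unanimous_claim(claims)
    if previous is None:
        # First encounter: require unanimity and at least three documents.
        if agreed is not None and statuses_received >= minimum_statuses:
            return agreed
        return None
    # Seen before: keep the old mapping unless all binders disagree with it.
    if agreed is not None and agreed != previous:
        return agreed
    return previous
```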
- 5.3. Notes on what we do now.
- THIS SECTION SHOULD BE FOLDED INTO THE EARLIER SECTIONS; THEY ARE WRONG;
- THIS IS RIGHT.
- All downloaded networkstatuses are discarded once they are 10 days old (by
- published date).
- Authdirs download each others' networkstatus every
- AUTHORITY_NS_CACHE_INTERVAL minutes (currently 10).
- Directory caches download authorities' networkstatus every
- NONAUTHORITY_NS_CACHE_INTERVAL minutes (currently 10).
- Clients always try to replace any networkstatus received over
- NETWORKSTATUS_MAX_VALIDITY ago (currently 2 days). Also, when the most
- recently received networkstatus is more than
- NETWORKSTATUS_CLIENT_DL_INTERVAL (30 minutes) old, and we do not have any
- open directory connections fetching a networkstatus, clients try to
- download the networkstatus on their list after the most recently received
- networkstatus, skipping failed networkstatuses. A networkstatus is
- "failed" if NETWORKSTATUS_N_ALLOWABLE_FAILURES (3) attempts in a row have
- all failed.
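- The client rotation described above can be sketched like this. The constant
- names come from the text; the function names and the data layout are
- hypothetical, and the sketch assumes the rotation wraps around to the start
- of the list, which the text does not spell out.

```python
NETWORKSTATUS_CLIENT_DL_INTERVAL = 30 * 60     # seconds
NETWORKSTATUS_N_ALLOWABLE_FAILURES = 3


def should_download(now, last_received_at, fetch_in_progress):
    """True if the newest networkstatus is stale and no fetch is open."""
    stale = now - last_received_at > NETWORKSTATUS_CLIENT_DL_INTERVAL
    return stale and not fetch_in_progress


def next_authority(authorities, last_received, failures):
    """Pick the authority after the most recently received one.

    authorities   -- ordered list of authority identifiers
    last_received -- authority whose networkstatus arrived most recently
    failures      -- authority -> number of consecutive failed attempts
    Returns None if every networkstatus is currently "failed".
    """
    count = len(authorities)
    if last_received in authorities:
        start = authorities.index(last_received) + 1
    else:
        start = 0
    for offset in range(count):
        candidate = authorities[(start + offset) % count]
        if failures.get(candidate, 0) < NETWORKSTATUS_N_ALLOWABLE_FAILURES:
            return candidate
    return None
```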
- We do not update router statuses if we have less than half of the
- networkstatuses.
- A networkstatus is "live" if it is the most recent we have received signed
- by a given trusted authority.
- A networkstatus is "recent" if it is "live" and:
- - it was received in the last DEFAULT_RUNNING_INTERVAL (currently 60
- minutes)
- OR - it was one of the MIN_TO_INFLUENCE_RUNNING (3) most recently received
- networkstatuses.
- Authorities always believe their own opinion as to a router's status. For
- other tors:
- - a router is valid if more than half of the live networkstatuses think
- it's valid.
- - a router is named if more than half of the live networkstatuses from
- naming authorities think it's named, and they all think it has the
- same name.
- - a router is running if more than half of the recent networkstatuses
- think it's running.
- Everyone downloads router descriptors as follows:
- - If any networkstatus lists a more recently published routerdesc with a
- different descriptor digest, and no more than
- MAX_ROUTERDESC_DOWNLOAD_FAILURES attempts to retrieve that routerdesc
- have failed, then that routerdesc is "downloadable".
- - Every DirFetchInterval, or whenever a request for routerdescs returns
- no routerdescs, we launch a set of requests for all downloadable
- routerdescs. We divide the downloadable routerdescs into groups of no
- more than DL_PER_REQUEST, and send a request for each group to
- directory servers chosen independently.
- - We also launch a request as above when a request for routerdescs
- fails and we have no directory connections fetching routerdescs.
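- Splitting the downloadable routerdescs into requests might look like the
- sketch below. The value of DL_PER_REQUEST is not given in this document, so
- it is a required parameter here; plan_requests() is a hypothetical name, and
- "chosen independently" is interpreted as a uniform random choice per group.

```python
import random


def plan_requests(downloadable, dirservers, dl_per_request, rng=random):
    """Group descriptor digests and pick a directory server for each group.

    downloadable   -- iterable of descriptor digests to fetch
    dirservers     -- non-empty list of candidate directory servers
    dl_per_request -- maximum digests per request (DL_PER_REQUEST)
    rng            -- object with a choice() method, for repeatable tests
    Returns a list of (server, digests) pairs.
    """
    if dl_per_request < 1:
        raise ValueError("dl_per_request must be at least 1")
    digests = sorted(downloadable)
    groups = [
        digests[i:i + dl_per_request]
        for i in range(0, len(digests), dl_per_request)
    ]
    # Each group's server is chosen without regard to the other groups.
    return [(rng.choice(dirservers), group) for group in groups]
```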
- TODO Specify here:
- - When to 0-out failure count for networkstatus?
- - Drop fallback to download-all. Also, always split download.
- - For versions: if you're listed by more than half of live versioning
- networkstatuses, good. If less than half of networkstatuses are live,
- don't do anything. If half are live, and half or less of the
- versioning ones list you, warn. Only warn once every 24 hours.
- - For names: warn if an unnamed router is specified by nickname.
- Rate-limit these warnings.
- - Also, don't believe N->K if another naming authdir says N->K'.
- - Revise naming rule: N->K is true if any naming directory says N->K,
- and no other naming directory says N->K' or N'->K.
- - Minimum info to build circuits.
- - Revise: always split requests when we have too little info to build
- circuits.
- - Describe when router is "out of date". (Any dirserver says so.)
- - Change rule from "do not launch new connections when one exists" to
- "do not request any fingerprint that we're currently requesting."
- - Launch new connections every minute, plus whenever a download fails.
- - Reset routerdesc failure count after 60 minutes, or when
- the network comes back on after an absence.
- - Make "I didn't get the one I thought was most recent" a failure.
- - Retry these every 5 minutes if you're a client.
- - Mirrors should retry these harder and more often.
- - If we have a routerdesc for Bob, and he says, "I'm 0.1.0.x", don't
- fetch a new one if it was published in the last 2 hours. (??)
- - Describe what we do with old server versions.
- - If we have less than 16 to download, do not download unless 10 minutes
- have passed since last download.
- - Which descriptors do directory servers remember?
- 6. Remaining issues
- Client-knowledge partitioning is worrisome. Most versions of this don't
- seem to be worse than the Danezis-Murdoch tracing attack, since an
- attacker can't do more than deduce probable exits from entries (or vice
- versa). But what about when the client connects to A and B but in a
- different order? How badly can it be partitioned based on its knowledge?