- Filename: 158-microdescriptors.txt
- Title: Clients download consensus + microdescriptors
- Author: Roger Dingledine
- Created: 17-Jan-2009
- Status: Open
- 0. History
- 15 May 2009: Substantially revised based on discussions on or-dev
- from late January. Removed the notion of voting on how to choose
- microdescriptors; made it just a function of the consensus method.
- (This lets us avoid the possibility of "desynchronization.")
- Added suggestion to use a new consensus flavor. Specified use of
- SHA256 for new hashes. -nickm
- 15 June 2009: Cleaned up based on comments from Roger. -nickm
- 1. Overview
- This proposal replaces section 3.2 of proposal 141, which was
- called "Fetching descriptors on demand". Rather than modifying the
- circuit-building protocol to fetch a server descriptor inline at each
- circuit extend, we instead put all of the information that clients need
- either into the consensus itself, or into a new set of data about each
- relay called a microdescriptor.
- Descriptor elements that are small and frequently changing should go
- in the consensus itself, and descriptor elements that are small and
- relatively static should go in the microdescriptor. If we ever end up
- with descriptor elements that aren't small yet clients need to know
- them, we'll need to resume considering some design like the one in
- proposal 141.
- Note also that any descriptor element which clients need to use to
- decide which servers to fetch info about, or which servers to fetch
- info from, needs to stay in the consensus.
- 2. Motivation
- See
- http://archives.seul.org/or/dev/Nov-2008/msg00000.html and
- http://archives.seul.org/or/dev/Nov-2008/msg00001.html and especially
- http://archives.seul.org/or/dev/Nov-2008/msg00007.html
- for a discussion of the options and why this is currently the best
- approach.
- 3. Design
- There are three pieces to the proposal. First, authorities will list in
- their votes (and thus in the consensus) the expected hash of the
- microdescriptor for each relay. Second, authorities will serve
- microdescriptors, and directory mirrors will cache and serve
- them. Third, clients will ask for them and cache them.
- 3.1. Consensus changes
- If the authorities choose a consensus method of a given version or
- later, a microdescriptor format is implicit in that version.
- A microdescriptor should in every case be a pure function of the
- router descriptor and the consensus method.
- In votes, we need to include the hash of each expected microdescriptor
- in the routerstatus section. I suggest a new "m" line for each stanza,
- with the base64 of the SHA256 hash of the router's microdescriptor.
- For every consensus method that an authority supports, it includes a
- separate "m" line in each router section of its vote, containing:
- "m" SP methods 1*(SP AlgorithmName "=" digest) NL
- where methods is a comma-separated list of the consensus methods
- that the authority believes will produce "digest".
- (As with the base64 encoding of SHA1 hashes in consensuses, let's
- omit the trailing "=" padding.)
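- As a rough illustration of the digest encoding and the vote "m" line
- described above, here is a minimal Python sketch. The function names and
- the consensus method numbers in the usage note are hypothetical; only the
- line grammar, the SHA256 hash, and the unpadded base64 come from this
- proposal.

```python
import base64
import hashlib


def microdesc_digest(microdesc: bytes) -> str:
    """Return the base64 SHA256 digest with trailing '=' padding removed."""
    raw = hashlib.sha256(microdesc).digest()
    return base64.b64encode(raw).decode("ascii").rstrip("=")


def format_m_line(methods, digests) -> str:
    """Build a vote line: "m" SP methods 1*(SP AlgorithmName "=" digest) NL.

    methods: iterable of consensus method numbers (joined with commas).
    digests: non-empty list of (algorithm_name, digest) pairs.
    """
    if not digests:
        raise ValueError("an m line needs at least one digest")
    method_list = ",".join(str(m) for m in methods)
    pairs = " ".join(f"{name}={digest}" for name, digest in digests)
    return f"m {method_list} {pairs}\n"
```

- For example, format_m_line([8, 9], [("sha256", microdesc_digest(body))])
- would produce one line covering two (made-up) consensus methods that
- yield the same digest.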
- The consensus microdescriptor-elements and "m" lines are then computed
- as described in Section 3.1.2 below.
- (This means we need a new consensus-method that knows
- how to compute the microdescriptor-elements and add "m" lines.)
- The microdescriptor consensus uses the directory-signature format from
- proposal 162, with the "sha256" algorithm.
- 3.1.1. Descriptor elements to include for now
- In the first version, the microdescriptor should contain the
- onion-key element, and the family element from the router descriptor,
- and the exit policy summary as currently specified in dir-spec.txt.
- 3.1.2. Computing consensus for microdescriptor-elements and "m" lines
- When we are generating a consensus, we use whichever m line
- unambiguously corresponds to the descriptor digest that will be
- included in the consensus.
- (If different votes have different microdescriptor digests for a
- single <descriptor-digest, consensus-method> pair, then at least one
- of the authorities is broken. If this happens, the consensus should
- contain whichever microdescriptor digest is most common. If there is
- no winner, we break ties in favor of the lexically earliest digest.
- Either way, we should log a warning: there is definitely a bug.)
- The "m" lines in a consensus contain only the digest, not a list of
- consensus methods.
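- The selection rule in this section can be sketched as follows. This is
- only an illustration: the function name is hypothetical, and it assumes
- that "lexically earliest" means ordinary string ordering of the encoded
- digests.

```python
import logging
from collections import Counter


def choose_microdesc_digest(voted_digests):
    """Pick the consensus digest for one <descriptor-digest, method> pair.

    voted_digests: one digest string per vote that listed this pair.
    Returns None when no vote listed a digest.
    """
    if not voted_digests:
        return None
    counts = Counter(voted_digests)
    if len(counts) > 1:
        # Votes disagree, so at least one authority is broken.
        logging.warning("conflicting microdescriptor digests: %s",
                        sorted(counts))
    best = max(counts.values())
    # Most common digest wins; ties go to the lexically earliest
    # (assumed here to be plain string comparison).
    return min(d for d, c in counts.items() if c == best)
```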
- 3.1.3. A new flavor of consensus
- Rather than inserting "m" lines in the current consensus format,
- they should be included in a new consensus flavor (see proposal
- 162).
- This flavor can safely omit descriptor digests.
- When we implement this voting method, we can remove the exit policy
- summaries from the current "ns" flavor of consensus, since no current
- clients use them, and they take up about 5% of the compressed
- consensus.
- This new consensus flavor should be signed with the sha256 signature
- format as documented in proposal 162.
- 3.2. Directory mirrors fetch, cache, and serve microdescriptors
- Directory mirrors should fetch, cache, and serve each microdescriptor
- from the authorities. (They need to continue to serve normal relay
- descriptors too, to handle old clients.)
- The microdescriptors with base64 hashes <D1>,<D2>,<D3> should be
- available at:
- http://<hostname>/tor/micro/d/<D1>-<D2>-<D3>.z
- (We use base64 for size and for consistency with the consensus
- format. We use -s instead of +s to separate these items, since
- the + character is used in base64 encoding.)
- All the microdescriptors from the current consensus should also be
- available at:
- http://<hostname>/tor/micro/all.z
- so a client that's bootstrapping doesn't need to send a 70KB URL just
- to name every microdescriptor it's looking for.
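- A small sketch of how a client might build the two request URLs above.
- The helper name is hypothetical, and the digests are assumed to be
- already encoded exactly as they appear in the consensus.

```python
def microdesc_url(hostname, digests=None):
    """Return the fetch URL for specific microdescriptors, or for all.

    digests: list of base64 digest strings, or None to request 'all'.
    """
    if digests is None:
        return f"http://{hostname}/tor/micro/all.z"
    if not digests:
        raise ValueError("need at least one digest to request")
    # Digests are joined with '-' because '+' can occur inside base64.
    joined = "-".join(digests)
    return f"http://{hostname}/tor/micro/d/{joined}.z"
```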
- Microdescriptors have no header or footer.
- The hash of the microdescriptor is simply the hash of the concatenated
- elements.
- Directory mirrors should check to make sure that the microdescriptors
- they're about to serve match the right hashes (either the hashes from
- the fetch URL or the hashes from the consensus, respectively).
- We will probably want to consider some sort of smart data structure to
- be able to quickly convert microdescriptor hashes into the appropriate
- microdescriptor. Clients will want this anyway when they load their
- microdescriptor cache and want to match it up with the consensus to
- see what's missing.
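- One simple way to combine the hash check with a hash-to-microdescriptor
- lookup is a dictionary keyed by digest, as sketched below. The names are
- hypothetical, and a real implementation may well prefer a different
- data structure.

```python
import base64
import hashlib


def digest_of(microdesc: bytes) -> str:
    """Base64 SHA256 of the concatenated elements, without '=' padding."""
    raw = hashlib.sha256(microdesc).digest()
    return base64.b64encode(raw).decode("ascii").rstrip("=")


def verify_microdescs(microdescs, expected_digests):
    """Keep only microdescriptors whose digest was expected.

    expected_digests: digests taken from the fetch URL or the consensus.
    Returns a dict mapping digest -> microdescriptor body.
    """
    expected = set(expected_digests)
    verified = {}
    for body in microdescs:
        digest = digest_of(body)
        if digest in expected:
            verified[digest] = body
        # Anything else does not match a requested hash and is dropped.
    return verified
```

- The returned dictionary doubles as the lookup table: the digests in
- expected_digests that are absent from it are the ones still missing.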
- 3.3. Clients fetch them and cache them
- When a client gets a new consensus, it looks to see if there are any
- microdescriptors it needs to learn. If it needs to learn more than
- some threshold of the microdescriptors (half?), it requests 'all',
- else it requests only the missing ones. Clients MAY try to
- determine whether the upload bandwidth for listing the
- microdescriptors they want is more or less than the download
- bandwidth for the microdescriptors they do not want.
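- The fetch decision above might look like the following sketch. The
- function name and return shape are hypothetical, and the default
- threshold of one half is just the tentative value suggested above.

```python
def plan_fetch(consensus_digests, cached_digests, threshold=0.5):
    """Decide what to request after receiving a new consensus.

    Returns ("none", []), ("all", []), or ("some", [missing digests]).
    """
    wanted = set(consensus_digests)
    missing = wanted - set(cached_digests)
    if not missing:
        return ("none", [])
    # Request everything when more than the threshold share is missing.
    if len(missing) > threshold * len(wanted):
        return ("all", [])
    return ("some", sorted(missing))
```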
- Clients maintain a cache of microdescriptors along with metadata such as
- when each one was last referenced by a consensus, and which identity key
- it corresponds to. They keep a microdescriptor
- until it hasn't been mentioned in any consensus for a week. Future
- clients might cache them for longer or shorter times.
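- A minimal sketch of this cache policy, assuming a dictionary keyed by
- microdescriptor digest. The class and function names are hypothetical,
- and timestamps are plain seconds.

```python
from dataclasses import dataclass

ONE_WEEK = 7 * 24 * 60 * 60  # seconds


@dataclass
class CachedMicrodesc:
    body: bytes
    identity_key: str   # which relay identity this entry corresponds to
    last_listed: float  # when a consensus last referenced this entry


def note_consensus(cache, consensus_digests, now):
    """Refresh last_listed for every cached entry the consensus mentions."""
    for digest in consensus_digests:
        entry = cache.get(digest)
        if entry is not None:
            entry.last_listed = now


def expire_old(cache, now, max_age=ONE_WEEK):
    """Drop entries not mentioned by any consensus for max_age seconds."""
    stale = [d for d, e in cache.items() if now - e.last_listed > max_age]
    for digest in stale:
        del cache[digest]
    return stale
```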
- 3.3.1. Information leaks from clients
- If a client asks you for a set of microdescs, then you know she didn't
- have them cached before. How much does that leak? What about when
- we're all using our entry guards as directory guards, and we've seen
- that user make a bunch of circuits already?
- Fetching "all" when you need at least half is a good first-order fix,
- but might not be all there is to it.
- Another future option would be to fetch some of the microdescriptors
- anonymously (via a Tor circuit).
- Another crazy option (Roger's phrasing) is to do decoy fetches as
- well.
- 4. Transition and deployment
- Phase one, the directory authorities should start voting on
- microdescriptors, and putting them in the consensus.
- Phase two, directory mirrors should learn how to serve them, and learn
- how to read the consensus to find out what they should be serving.
- Phase three, clients should start fetching and caching them instead
- of normal descriptors.