- Filename: 158-microdescriptors.txt
 
- Title: Clients download consensus + microdescriptors
 
- Author: Roger Dingledine
 
- Created: 17-Jan-2009
 
- Status: Open
 
- 0. History
 
-   15 May 2009: Substantially revised based on discussions on or-dev
 
-   from late January.  Removed the notion of voting on how to choose
 
-   microdescriptors; made it just a function of the consensus method.
 
-   (This lets us avoid the possibility of "desynchronization.")
 
-   Added suggestion to use a new consensus flavor.  Specified use of
 
-   SHA256 for new hashes. -nickm
 
-   15 June 2009: Cleaned up based on comments from Roger. -nickm
 
- 1. Overview
 
-   This proposal replaces section 3.2 of proposal 141, which was
 
-   called "Fetching descriptors on demand". Rather than modifying the
 
-   circuit-building protocol to fetch a server descriptor inline at each
 
-   circuit extend, we instead put all of the information that clients need
 
-   either into the consensus itself, or into a new set of data about each
 
-   relay called a microdescriptor.
 
-   Descriptor elements that are small and frequently changing should go
 
-   in the consensus itself, and descriptor elements that are small and
 
-   relatively static should go in the microdescriptor. If we ever end up
 
-   with descriptor elements that aren't small yet clients need to know
 
-   them, we'll need to resume considering some design like the one in
 
-   proposal 141.
 
-   Note also that any descriptor element which clients need to use to
 
-   decide which servers to fetch info about, or which servers to fetch
 
-   info from, needs to stay in the consensus.
 
- 2. Motivation
 
-   See
 
-   http://archives.seul.org/or/dev/Nov-2008/msg00000.html and
 
-   http://archives.seul.org/or/dev/Nov-2008/msg00001.html and especially
 
-   http://archives.seul.org/or/dev/Nov-2008/msg00007.html
 
-   for a discussion of the options and why this is currently the best
 
-   approach.
 
- 3. Design
 
-   There are three pieces to the proposal. First, authorities will list in
 
-   their votes (and thus in the consensus) the expected hash of the

-   microdescriptor for each relay. Second, authorities will serve

-   microdescriptors, and directory mirrors will cache and serve

-   them. Third, clients will ask for them and cache them.
 
- 3.1. Consensus changes
 
-   If the authorities choose a consensus method of a given version or
 
-   later, a microdescriptor format is implicit in that version.
 
-   A microdescriptor should in every case be a pure function of the
 
-   router descriptor and the consensus method.
 
-   In votes, we need to include the hash of each expected microdescriptor
 
-   in the routerstatus section. I suggest a new "m" line for each stanza,
 
-   with the base64 of the SHA256 hash of the router's microdescriptor.
 
-   For every consensus method that an authority supports, it includes a
 
-   separate "m" line in each router section of its vote, containing:
 
-     "m" SP methods 1*(SP AlgorithmName "=" digest) NL
 
-   where methods is a comma-separated list of the consensus methods
 
-   that the authority believes will produce "digest".
 
-   (As with base64 encoding of SHA1 hashes in consensuses, let's
 
-   omit the trailing =s)
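
-   As an illustration only, the vote "m" line described above could be

-   produced roughly as follows. The function names, the example method

-   numbers, and the choice of "sha256" as the AlgorithmName are

-   assumptions made for this sketch, not part of the proposal.

```python
import base64
import hashlib


def microdesc_digest(microdesc: bytes) -> str:
    """Base64 of the SHA256 hash, with the trailing '=' padding omitted."""
    raw = hashlib.sha256(microdesc).digest()
    return base64.b64encode(raw).decode("ascii").rstrip("=")


def vote_m_line(methods, microdesc: bytes) -> str:
    """Format: "m" SP methods SP AlgorithmName "=" digest NL.

    `methods` are the consensus methods the authority believes will
    produce this digest (hypothetical values in this sketch).
    """
    method_list = ",".join(str(m) for m in sorted(methods))
    return "m {} sha256={}\n".format(method_list, microdesc_digest(microdesc))
```

-   A SHA256 hash is 32 bytes, so the unpadded base64 form is always 43

-   characters long.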
 
-   The consensus microdescriptor-elements and "m" lines are then computed
 
-   as described in Section 3.1.2 below.
 
-   (This means we need a new consensus-method that knows
 
-   how to compute the microdescriptor-elements and add "m" lines.)
 
-   The microdescriptor consensus uses the directory-signature format from
 
-   proposal 162, with the "sha256" algorithm.
 
- 3.1.1. Descriptor elements to include for now
 
-   In the first version, the microdescriptor should contain the
 
-   onion-key element, and the family element from the router descriptor,
 
-   and the exit policy summary as currently specified in dir-spec.txt.
 
- 3.1.2. Computing consensus for microdescriptor-elements and "m" lines
 
-   When we are generating a consensus, we use whichever "m" line
 
-   unambiguously corresponds to the descriptor digest that will be
 
-   included in the consensus.
 
-   (If different votes have different microdescriptor digests for a
 
-   single <descriptor-digest, consensus-method> pair, then at least one
 
-   of the authorities is broken.  If this happens, the consensus should
 
-   contain whichever microdescriptor digest is most common.  If there is
 
-   no winner, we break ties in favor of the lexically earliest.
 
-   Either way, we should log a warning: there is definitely a bug.)
 
-   The "m" lines in a consensus contain only the digest, not a list of
 
-   consensus methods.
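
-   The selection rule above can be sketched as below. This is a minimal

-   illustration, assuming that "lexically earliest" means ordinary string

-   ordering of the base64 digests; the function name and return shape are

-   hypothetical.

```python
from collections import Counter


def choose_microdesc_digest(voted_digests):
    """Pick the consensus digest for one <descriptor-digest, method> pair.

    Returns (digest, disagreement). `disagreement` is True when the votes
    list more than one digest, in which case the caller should log a
    warning, since at least one authority is broken.
    """
    counts = Counter(voted_digests)
    if not counts:
        return None, False
    top = max(counts.values())
    # Most common digest wins; ties go to the lexically earliest.
    winners = sorted(d for d, n in counts.items() if n == top)
    return winners[0], len(counts) > 1
```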
 
- 3.1.3. A new flavor of consensus
 
-   Rather than inserting "m" lines in the current consensus format,
 
-   they should be included in a new consensus flavor (see proposal
 
-   162).
 
-   This flavor can safely omit descriptor digests.
 
-   When we implement this voting method, we can remove the exit policy
 
-   summaries from the current "ns" flavor of consensus, since no current
 
-   clients use them, and they take up about 5% of the compressed
 
-   consensus.
 
-   This new consensus flavor should be signed with the sha256 signature
 
-   format as documented in proposal 162.
 
- 3.2. Directory mirrors fetch, cache, and serve microdescriptors
 
-   Directory mirrors should fetch, cache, and serve each microdescriptor
 
-   from the authorities.  (They need to continue to serve normal relay
 
-   descriptors too, to handle old clients.)
 
-   The microdescriptors with base64 hashes <D1>,<D2>,<D3> should be
 
-   available at:
 
-     http://<hostname>/tor/micro/d/<D1>-<D2>-<D3>.z
 
-   (We use base64 for size and for consistency with the consensus
 
-   format. We use -s instead of +s to separate these items, since
 
-   the + character is used in base64 encoding.)
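
-   For illustration, a request URL in the form above could be assembled

-   like this. The helper name and the example hostname in the usage note

-   are assumptions for this sketch.

```python
def micro_fetch_url(hostname, digests):
    """Build the fetch URL for a list of base64 microdescriptor hashes."""
    digests = list(digests)
    if not digests:
        raise ValueError("need at least one digest")
    for d in digests:
        # "-" is the separator, so it must not appear inside a digest.
        if "-" in d:
            raise ValueError("digest contains the separator: " + d)
    return "http://{}/tor/micro/d/{}.z".format(hostname, "-".join(digests))
```

-   For example, micro_fetch_url("mirror.example", ["AAA", "BBB"]) yields

-   http://mirror.example/tor/micro/d/AAA-BBB.z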
 
-   All the microdescriptors from the current consensus should also be
 
-   available at:
 
-     http://<hostname>/tor/micro/all.z
 
-   so a client that's bootstrapping doesn't need to send a 70KB URL just
 
-   to name every microdescriptor it's looking for.
 
-   Microdescriptors have no header or footer.
 
-   The hash of the microdescriptor is simply the hash of the concatenated
 
-   elements.
 
-   Directory mirrors should check to make sure that the microdescriptors
 
-   they're about to serve match the right hashes (either the hashes from
 
-   the fetch URL or the hashes from the consensus, respectively).
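
-   One possible shape for this check is sketched below, assuming the

-   hash is the unpadded base64 SHA256 of the microdescriptor body as

-   described in 3.1. The function names are hypothetical.

```python
import base64
import hashlib


def microdesc_digest(microdesc: bytes) -> str:
    """Unpadded base64 of the SHA256 hash of the microdescriptor body."""
    raw = hashlib.sha256(microdesc).digest()
    return base64.b64encode(raw).decode("ascii").rstrip("=")


def verified_microdescs(expected_digests, candidates):
    """Keep only the candidate bodies whose hash is one we expect.

    `expected_digests` comes from the fetch URL or from the consensus;
    anything that does not match is dropped rather than served.
    """
    wanted = set(expected_digests)
    good = {}
    for body in candidates:
        digest = microdesc_digest(body)
        if digest in wanted:
            good[digest] = body
    return good
```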
 
-   We will probably want to consider some sort of smart data structure to
 
-   be able to quickly convert microdescriptor hashes into the appropriate
 
-   microdescriptor. Clients will want this anyway when they load their
 
-   microdescriptor cache and want to match it up with the consensus to
 
-   see what's missing.
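
-   The simplest candidate for such a structure is a hash table keyed by

-   digest. The sketch below is only one option, with hypothetical names,

-   and again assumes unpadded base64 SHA256 digests.

```python
import base64
import hashlib


def microdesc_digest(microdesc: bytes) -> str:
    """Unpadded base64 of the SHA256 hash of the microdescriptor body."""
    raw = hashlib.sha256(microdesc).digest()
    return base64.b64encode(raw).decode("ascii").rstrip("=")


def build_index(microdescs):
    """Map digest -> microdescriptor body for constant-time average lookup."""
    return {microdesc_digest(body): body for body in microdescs}


def missing_digests(consensus_digests, index):
    """Digests listed in the consensus that have no cached microdescriptor."""
    return [d for d in consensus_digests if d not in index]
```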
 
- 3.3. Clients fetch them and cache them
 
-   When a client gets a new consensus, it looks to see if there are any
 
-   microdescriptors it needs to learn. If it needs to learn more than
 
-   some threshold of the microdescriptors (half?), it requests 'all',
 
-   else it requests only the missing ones.  Clients MAY try to
 
-   determine whether the upload bandwidth for listing the
 
-   microdescriptors they want is more or less than the download
 
-   bandwidth for the microdescriptors they do not want.
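
-   A minimal sketch of the threshold rule follows. The default of one

-   half mirrors the tentative "(half?)" above and is not a settled

-   value; the function name and the "all" sentinel are assumptions.

```python
def plan_fetch(missing, total_listed, threshold=0.5):
    """Decide between requesting 'all' and requesting specific digests.

    missing:      digests in the new consensus that are not cached
    total_listed: number of microdescriptors the consensus lists
    threshold:    fraction above which fetching everything is preferred
    """
    missing = list(missing)
    if total_listed > 0 and len(missing) > threshold * total_listed:
        return "all"
    return missing
```

-   A client following the MAY clause could replace the fixed fraction

-   with a comparison of estimated upload and download byte counts.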
 
-   Clients maintain a cache of microdescriptors along with metadata like
 
-   when each was last referenced by a consensus, and which identity key
 
-   it corresponds to.  They keep a microdescriptor
 
-   until it hasn't been mentioned in any consensus for a week. Future
 
-   clients might cache them for longer or shorter times.
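
-   The cache entry and expiry rule might look like the sketch below. The

-   field names are hypothetical, and the one-week lifetime is kept as a

-   parameter since future clients might choose a different value.

```python
from dataclasses import dataclass

ONE_WEEK = 7 * 24 * 60 * 60  # seconds


@dataclass
class CachedMicrodesc:
    body: bytes          # the microdescriptor itself
    identity: str        # identity key of the relay it corresponds to
    last_listed: float   # time it was last referenced by a consensus


def expire_cache(cache, now, max_age=ONE_WEEK):
    """Return the entries that some consensus listed within `max_age`.

    `cache` maps digest -> CachedMicrodesc; `now` is a timestamp in
    seconds on the same clock as `last_listed`.
    """
    return {
        digest: entry
        for digest, entry in cache.items()
        if now - entry.last_listed < max_age
    }
```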
 
- 3.3.1. Information leaks from clients
 
-   If a client asks you for a set of microdescs, then you know she didn't
 
-   have them cached before. How much does that leak? What about when
 
-   we're all using our entry guards as directory guards, and we've seen
 
-   that user make a bunch of circuits already?
 
-   Fetching "all" when you need at least half is a good first-order fix,
 
-   but might not be all there is to it.
 
-   Another future option would be to fetch some of the microdescriptors
 
-   anonymously (via a Tor circuit).
 
-   Another crazy option (Roger's phrasing) is to do decoy fetches as
 
-   well.
 
- 4. Transition and deployment
 
-   Phase one, the directory authorities should start voting on
 
-   microdescriptors, and putting them in the consensus.
 
-   Phase two, directory mirrors should learn how to serve them, and learn
 
-   how to read the consensus to find out what they should be serving.
 
-   Phase three, clients should start fetching and caching them instead
 
-   of normal descriptors.
 
 