@@ -1,11 +1,20 @@
Filename: 158-microdescriptors.txt
Title: Clients download consensus + microdescriptors
-Version: $Revision$
-Last-Modified: $Date$
Author: Roger Dingledine
Created: 17-Jan-2009
Status: Open

+0. History
+
+ 15 May 2009: Substantially revised based on discussions on or-dev
+ from late January. Removed the notion of voting on how to choose
+ microdescriptors; made it just a function of the consensus method.
+ (This lets us avoid the possibility of "desynchronization.")
+ Added suggestion to use a new consensus flavor. Specified use of
+ SHA256 for new hashes. -nickm
+
+ 15 June 2009: Cleaned up based on comments from Roger. -nickm
+
1. Overview

This proposal replaces section 3.2 of proposal 141, which was
@@ -13,9 +22,7 @@ Status: Open
circuit-building protocol to fetch a server descriptor inline at each
circuit extend, we instead put all of the information that clients need
either into the consensus itself, or into a new set of data about each
- relay called a microdescriptor. The microdescriptor is a direct
- transform from the relay descriptor, so relays don't even need to know
- this is happening.
+ relay called a microdescriptor.

Descriptor elements that are small and frequently changing should go
in the consensus itself, and descriptor elements that are small and
@@ -24,6 +31,10 @@ Status: Open
them, we'll need to resume considering some design like the one in
proposal 141.

+ Note also that any descriptor element which clients need to use to
+ decide which servers to fetch info about, or which servers to fetch
+ info from, needs to stay in the consensus.
+
2. Motivation

See
@@ -36,99 +47,91 @@ Status: Open
3. Design

There are three pieces to the proposal. First, authorities will list in
- their votes (and thus in the consensus) what relay descriptor elements
- are included in the microdescriptor, and also list the expected hash
- of microdescriptor for each relay. Second, directory mirrors will serve
- microdescriptors. Third, clients will ask for them and cache them.
+ their votes (and thus in the consensus) the expected hash of the
+ microdescriptor for each relay. Second, authorities will serve
+ microdescriptors, and directory mirrors will cache and serve
+ them. Third, clients will ask for them and cache them.

3.1. Consensus changes

- V3 votes should include a new line:
- microdescriptor-elements bar baz foo
- listing each descriptor element (sorted alphabetically) that authority
- included when it calculated its expected microdescriptor hashes.
+ If the authorities choose a consensus method of a given version or
+ later, a microdescriptor format is implicit in that version.
+ A microdescriptor should in every case be a pure function of the
+ router descriptor and the consensus method.
+
+ In votes, we need to include the hash of each expected microdescriptor
+ in the routerstatus section. I suggest a new "m" line for each stanza,
+ with the base64 of the SHA256 hash of the router's microdescriptor.
+
+ For every consensus method that an authority supports, it includes a
+ separate "m" line in each router section of its vote, containing:
+ "m" SP methods 1*(SP AlgorithmName "=" digest) NL
+ where methods is a comma-separated list of the consensus methods
+ that the authority believes will produce "digest".

- We also need to include the hash of each expected microdescriptor in
- the routerstatus section. I suggest a new "m" line for each stanza,
- with the base64 of the hash of the elements that the authority voted
- for above.
+ (As with base64 encoding of SHA1 hashes in consensuses, let's
+ omit the trailing =s.)
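The vote "m" line described above can be sketched in Python as follows. This is only an illustration under assumptions: the function names are hypothetical, the AlgorithmName is assumed to be spelled "sha256", and the consensus method numbers in the usage note are made up.

```python
import base64
import hashlib


def microdesc_digest(microdesc: bytes) -> str:
    """Return the base64 SHA256 digest with the trailing '=' padding removed."""
    raw = hashlib.sha256(microdesc).digest()
    return base64.b64encode(raw).decode("ascii").rstrip("=")


def vote_m_line(methods, microdesc: bytes) -> str:
    """Format one vote "m" line for the consensus methods that share a digest.

    Assumption: the algorithm name is written "sha256"; the grammar above
    only says AlgorithmName.
    """
    method_list = ",".join(str(m) for m in methods)
    return "m {} sha256={}\n".format(method_list, microdesc_digest(microdesc))
```

For example, vote_m_line([8, 9], body) would yield a line beginning "m 8,9 sha256=" followed by a 43-character digest, since a 32-byte hash encodes to 44 base64 characters, one of which is padding.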

The consensus microdescriptor-elements and "m" lines are then computed
as described in Section 3.1.2 below.

- I believe that means we need a new consensus-method "6" that knows
- how to compute the microdescriptor-elements and add "m" lines.
+ (This means we need a new consensus-method that knows
+ how to compute the microdescriptor-elements and add "m" lines.)

-3.1.1. Descriptor elements to include for now
+ The microdescriptor consensus uses the directory-signature format from
+ proposal 162, with the "sha256" algorithm.

- To start, the element list that authorities suggest should be
- family onion-key

- (Note that the or-dev posts above only mention onion-key, but if
- we don't also include family then clients will never learn it. It
- seemed like it should be relatively static, so putting it in the
- microdescriptor is smarter than trying to fit it into the consensus.)
+3.1.1. Descriptor elements to include for now

- We could imagine a config option "family,onion-key" so authorities
- could change their voted preferences without needing to upgrade.
+ In the first version, the microdescriptor should contain the
+ onion-key element, and the family element from the router descriptor,
+ and the exit policy summary as currently specified in dir-spec.txt.

3.1.2. Computing consensus for microdescriptor-elements and "m" lines

- One approach is for the consensus microdescriptor-elements line to
- include every element listed by a majority of authorities, sorted. The
- problem here is that it will no longer be deterministic what the correct
- hash for the "m" line should be. We could imagine telling the authority
- to go look in its descriptor and produce the right hash itself, but
- we don't want consensus calculation to be based on external data like
- that. (Plus, the authority may not have the descriptor that everybody
- else voted to use.)
-
- The better approach is to take the exact set that has the most votes
- (breaking ties by the set that has the most elements, and breaking
- ties after that by whichever is alphabetically first). That will
- increase the odds that we actually get a microdescriptor hash that
- is both a) for the descriptor we're putting in the consensus, and b)
- over the elements that we're declaring it should be for.
-
- Then the "m" line for a given relay is the one that gets the most votes
- from authorities that both a) voted for the microdescriptor-elements
- line we're using, and b) voted for the descriptor we're using.
-
- (If there's a tie, use the smaller hash. But really, if there are
- multiple such votes and they differ about a microdescriptor, we caught
- one of them lying or being buggy. We should log it to track down why.)
-
- If there are no such votes, then we leave out the "m" line for that
- relay. That means clients should avoid it for this time period. (As
- an extension it could instead mean that clients should fetch the
- descriptor and figure out its microdescriptor themselves. But let's
- not get ahead of ourselves.)
-
- It would be nice to have a more foolproof way to agree on what
- microdescriptor hash each authority should vote for, so we can avoid
- missing "m" lines. Just switching to a new consensus-method each time
- we change the set of microdescriptor-elements won't help though, since
- each authority will still have to decide what hash to vote for before
- knowing what consensus-method will be used.
-
- Here's one way we could do it. Each vote / consensus includes
- the microdescriptor-elements that were used to compute the hashes,
- and also a preferred-microdescriptor-elements set. If an authority
- has a consensus from the previous period, then it should use the
- consensus preferred-microdescriptor-elements when computing its votes
- for microdescriptor-elements and the appropriate hashes in the upcoming
- period. (If it has no previous consensus, then it just writes its
- own preferences in both lines.)
-
-3.2. Directory mirrors serve microdescriptors
-
- Directory mirrors should then read the microdescriptor-elements line
- from the consensus, and learn how to answer requests. (Directory mirrors
- continue to serve normal relay descriptors too, a) to serve old clients
- and b) to be able to construct microdescriptors on the fly.)
-
- The microdescriptors with hashes <D1>,<D2>,<D3> should be available at:
- http://<hostname>/tor/micro/d/<D1>+<D2>+<D3>.z
+ When we are generating a consensus, we use whichever "m" line
+ unambiguously corresponds to the descriptor digest that will be
+ included in the consensus.
+
+ (If different votes have different microdescriptor digests for a
+ single <descriptor-digest, consensus-method> pair, then at least one
+ of the authorities is broken. If this happens, the consensus should
+ contain whichever microdescriptor digest is most common. If there is
+ no winner, we break ties in favor of the lexically earliest.
+ Either way, we should log a warning: there is definitely a bug.)
+
+ The "m" lines in a consensus contain only the digest, not a list of
+ consensus methods.
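The selection rule above (most common digest, ties going to the lexically earliest, with a warning on any disagreement) can be sketched in Python. The function name and the input shape are assumptions for illustration: voted_digests is taken to be the list of microdescriptor digests that the votes give for one <descriptor-digest, consensus-method> pair.

```python
import logging
from collections import Counter


def choose_microdesc_digest(voted_digests):
    """Pick the consensus microdescriptor digest from the voted digests.

    Returns None when no authority voted a digest for this pair.
    """
    if not voted_digests:
        return None
    counts = Counter(voted_digests)
    if len(counts) > 1:
        # Authorities disagree about a pure function of its inputs: a bug.
        logging.warning("conflicting microdescriptor digests: %s",
                        sorted(counts))
    top = max(counts.values())
    # Among the most common digests, take the lexically earliest.
    return min(d for d, n in counts.items() if n == top)
```

Ordinary string comparison is used here for "lexically earliest"; whether the comparison should be over the base64 text or the raw digest bytes is not stated above, so this is one possible reading.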
+
+3.1.3. A new flavor of consensus
+
+ Rather than inserting "m" lines in the current consensus format,
+ they should be included in a new consensus flavor (see proposal
+ 162).
+
+ This flavor can safely omit descriptor digests.
+
+ When we implement this voting method, we can remove the exit policy
+ summary from the current "ns" flavor of consensus, since no current
+ clients use them, and they take up about 5% of the compressed
+ consensus.
+
+ This new consensus flavor should be signed with the sha256 signature
+ format as documented in proposal 162.
+
+3.2. Directory mirrors fetch, cache, and serve microdescriptors
+
+ Directory mirrors should fetch, cache, and serve each microdescriptor
+ from the authorities. (They need to continue to serve normal relay
+ descriptors too, to handle old clients.)
+
+ The microdescriptors with base64 hashes <D1>,<D2>,<D3> should be
+ available at:
+ http://<hostname>/tor/micro/d/<D1>-<D2>-<D3>.z
+ (We use base64 for size and for consistency with the consensus
+ format. We use -s instead of +s to separate these items, since
+ the + character is used in base64 encoding.)
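A minimal Python sketch of building the request URL described above. The function name is hypothetical, and the sketch assumes the digests are already base64 strings with the trailing "=" removed; it applies no further URL escaping, since none is specified here.

```python
def microdesc_url(hostname, digests):
    """Build the request URL for the given base64 microdescriptor digests."""
    if not digests:
        raise ValueError("at least one digest is required")
    for d in digests:
        # Standard base64 never contains '-', and padding is omitted,
        # so either character signals a malformed digest.
        if "-" in d or "=" in d:
            raise ValueError("unexpected character in digest: {!r}".format(d))
    return "http://{}/tor/micro/d/{}.z".format(hostname, "-".join(digests))
```

Because "-" cannot appear in standard base64 output, splitting the path component on "-" recovers the individual digests unambiguously on the serving side.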

All the microdescriptors from the current consensus should also be
available at:
@@ -136,24 +139,9 @@ Status: Open
so a client that's bootstrapping doesn't need to send a 70KB URL just
to name every microdescriptor it's looking for.

- The format of a microdescriptor is the header line
- "microdescriptor-header"
- followed by each element (keyword and body), alphabetically. There's
- no need to mention what hash it's for, since it's self-identifying:
- you can hash the elements to learn this.
-
- (Do we need a footer line to show that it's over, or is the next
- microdescriptor line or EOF enough of a hint? A footer line wouldn't
- hurt much. Also, no fair voting for the microdescriptor-element
- "microdescriptor-header".)
-
+ Microdescriptors have no header or footer.
The hash of the microdescriptor is simply the hash of the concatenated
- elements -- not counting the header line or hypothetical footer line.
- Unless you prefer that?
-
- Is there a reasonable way to version these things? We could say that
- the microdescriptor-header line can contain arguments which clients
- must ignore if they don't understand them. Any better ways?
+ elements.

Directory mirrors should check to make sure that the microdescriptors
they're about to serve match the right hashes (either the hashes from
@@ -170,10 +158,14 @@ Status: Open
When a client gets a new consensus, it looks to see if there are any
microdescriptors it needs to learn. If it needs to learn more than
some threshold of the microdescriptors (half?), it requests 'all',
- else it requests only the missing ones.
+ else it requests only the missing ones. Clients MAY try to
+ determine whether the upload bandwidth for listing the
+ microdescriptors they want is more or less than the download
+ bandwidth for the microdescriptors they do not want.

Clients maintain a cache of microdescriptors along with metadata like
- when it was last referenced by a consensus. They keep a microdescriptor
+ when it was last referenced by a consensus, and which identity key
+ it corresponds to. They keep a microdescriptor
until it hasn't been mentioned in any consensus for a week. Future
clients might cache them for longer or shorter times.
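The client behavior in this section can be sketched in Python as below. All names here are hypothetical, the one-half threshold is only the value floated above with a question mark, and the cache is modeled as a plain dict keyed by microdescriptor digest.

```python
from dataclasses import dataclass

WEEK_SECONDS = 7 * 24 * 60 * 60


@dataclass
class CachedMicrodesc:
    body: bytes
    identity: str        # identity key this microdescriptor corresponds to
    last_listed: float   # when a consensus last referenced it (Unix time)


def plan_fetch(wanted, cache, threshold=0.5):
    """Return "all", or the list of digests still missing from the cache."""
    missing = [d for d in wanted if d not in cache]
    if wanted and len(missing) > threshold * len(wanted):
        return "all"
    return missing


def expire(cache, now):
    """Keep only entries mentioned by some consensus within the last week."""
    return {d: e for d, e in cache.items()
            if now - e.last_listed < WEEK_SECONDS}
```

A client following the MAY clause could replace the fixed threshold with a comparison of estimated upload bytes for the digest list against download bytes for the unwanted microdescriptors; that refinement is not shown.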

@@ -190,18 +182,17 @@ Status: Open
Another future option would be to fetch some of the microdescriptors
anonymously (via a Tor circuit).

+ Another crazy option (Roger's phrasing) is to do decoy fetches as
+ well.
+
4. Transition and deployment

Phase one, the directory authorities should start voting on
- microdescriptors and microdescriptor elements, and putting them in the
- consensus. This should happen during the 0.2.1.x series, and should
- be relatively easy to do.
+ microdescriptors, and putting them in the consensus.

Phase two, directory mirrors should learn how to serve them, and learn
- how to read the consensus to find out what they should be serving. This
- phase could be done either in 0.2.1.x or early in 0.2.2.x, depending
- on how messy it turns out to be and how quickly we get around to it.
+ how to read the consensus to find out what they should be serving.

Phase three, clients should start fetching and caching them instead
- of normal descriptors. This should happen post 0.2.1.x.
+ of normal descriptors.