Browse Source

Add proposal 140-consensus-diffs.txt

svn:r15293
Nick Mathewson 17 years ago
parent
commit
9cfe70cb69
2 changed files with 150 additions and 0 deletions
  1. 2 0
      doc/spec/proposals/000-index.txt
  2. 148 0
      doc/spec/proposals/140-consensus-diffs.txt

+ 2 - 0
doc/spec/proposals/000-index.txt

@@ -62,6 +62,7 @@ Proposals by number:
 137  Keep controllers informed as Tor bootstraps [OPEN]
 137  Keep controllers informed as Tor bootstraps [OPEN]
 138  Remove routers that are not Running from consensus documents [CLOSED]
 138  Remove routers that are not Running from consensus documents [CLOSED]
 139  Download consensus documents only when it will be trusted [CLOSED]
 139  Download consensus documents only when it will be trusted [CLOSED]
+140  Provide diffs between consensuses [OPEN]
 
 
 
 
 Proposals by status:
 Proposals by status:
@@ -77,6 +78,7 @@ Proposals by status:
    120  Shutdown descriptors when Tor servers stop
    120  Shutdown descriptors when Tor servers stop
    121  Hidden Service Authentication
    121  Hidden Service Authentication
    137  Keep controllers informed as Tor bootstraps
    137  Keep controllers informed as Tor bootstraps
+   140  Provide diffs between consensuses
  NEEDS-REVISION:
  NEEDS-REVISION:
    110  Avoiding infinite length circuits
    110  Avoiding infinite length circuits
    117  IPv6 exits
    117  IPv6 exits

+ 148 - 0
doc/spec/proposals/140-consensus-diffs.txt

@@ -0,0 +1,148 @@
+Filename: 140-consensus-diffs.txt
+Title: Provide diffs between consensuses
+Version: $Revision$
+Last-Modified: $Date$
+Author: Peter Palfrader
+Created: 13-Jun-2008
+Status: Open
+
+1. Overview.
+
+  Tor clients and servers need a list of which relays are on the
+  network.  This list, the consensus, is created by authorities
+  hourly and clients fetch a copy of it, with some delay, hourly.
+
+  This proposal suggests that clients download diffs of consensuses
+  once they have a consensus instead of hourly downloading a full
+  consensus.
+
+2. Numbers
+
+  After implementing proposal 138 which removes nodes that are not
+  running from the list a consensus document is about 92 kilobytes
+  in size after compression.
+
+  The diff between two consecutive consensus, in ed format, is on
+  average 13 kilobytes compressed.
+
+3. Proposal
+
+3.1 Clients
+
+  If a client has a consensus that is recent enough it SHOULD
+  try to download a diff to get the latest consensus rather than
+  fetching a full one.
+
+  [XXX: what is recent enough?
+	time delta in hours / size of compressed diff
+	 0	20
+	 1	9650
+	 2	17011
+	 3	23150
+	 4	29813
+	 5	36079
+	 6	39455
+	 7	43903
+	 8	48907
+	 9	54549
+	10	60057
+	11	67810
+	12	71171
+	13	73863
+	14	76048
+	15	80031
+	16	84686
+	17	89862
+	18	94760
+	19	94868
+	20	94223
+	21	93921
+	22	92144
+	23	90228
+	[ size of gzip compressed "diff -e" between the consensus on
+	  2008-06-01-00:00:00 and the following consensuses that day.
+	  Consensuses have been modified to exclude down routers per
+	  proposal 138. ]
+
+   Data suggests that for the first few hours diffs are very useful,
+   saving about 60% for the first three hours, 30% for the first 10,
+   and almost nothing once we are past 16 hours.
+  ]
+
+3.2 Servers
+
+  Directory authorities and servers need to keep up to X [XXX: depends
+  on how long clients try to download diffs per above] old consensus
+  documents so they can build diffs.  They should offer a diff to the
+  most recent consensus at the URL
+
+  http://tor.noreply.org/tor/status-vote/current/consensus/diff/<HASH>/<FPRLIST>
+
+  where hash is the full digest of the consensus the client currently
+  has, and FPRLIST is a list of (abbreviated) fingerprints of
+  authorities the client trusts.
+
+  Servers will only return a consensus if more than half of the requested
+  authorities have signed the document, otherwise a 404 error will be sent
+  back.  The fingerprints can be shortened to a length of any multiple of
+  two, using only the leftmost part of the encoded fingerprint.  Tor uses
+  3 bytes (6 hex characters) of the fingerprint.  (This is just like the
+  conditional consensus downloads that Tor supports starting with
+  0.1.2.1-alpha.)
+
+  If a server cannot offer a diff from the consensus identified by the
+  hash but has a current consensus it MUST return the full consensus.
+
+  [XXX: what should we do when the client already has the latest
+  consensus?  I can think of the following options:
+    - send back 3xx not modified
+    - send back 200 ok and an empty diff
+    - send back 404 nothing newer here.
+
+    I currently lean towards the empty diff.]
+
+4. Diff Format
+
+  Diffs start with the token "network-status-diff-version" followed by a
+  space and the version number, currently "1".
+
+  If a document does not start with network-status-diff it is assumed
+  to be a full consensus download and would therefore currently start
+  with "network-status-version 3".
+
+  Following the network-status-diff header line is a diff, or patch, in
+  limited ed format.  We choose this format because it is easy to create
+  and process with standard tools (patch, diff -e, ed).  This will help
+  us in developing and testing this proposal and it should make future
+  debugging easier.
+
+  [ If at one point in the future we decide that the space benefits from
+    a custom diff format outweighs these benefits we can always
+    introduce a new diff format and offer it at for instance
+    ../diff2/... ]
+
+  We support the following ed commands, each on a line by itself:
+   - "<n1>d"          Delete line n1
+   - "<n1>,<n2>d"     Delete lines n1 through n2, including
+   - "<n1>c"          Replace line n1 with the following block
+   - "<n1>,<n2>c"     Replace lines n1 through n2, including, with the
+                      following block.
+   - "<n1>a"          Append the following block after line n1.
+   - "a"              Append the following block after the current line.
+   - "s/.//"          Remove the first character in the current line.
+
+  Note that line numbers always apply to the file after all previous
+  commands have already been applied.
+
+  The "current line" is either the first line of the file, if this is
+  the first command, the last line of a block we added in an append or
+  change command, or the line immediate following a set of lines we just
+  deleted (or the last line of the file if there are no lines after
+  that).
+
+  The replace and append command take blocks.  These blocks are simply
+  appended to the diff after the line with the command.  A line with
+  just a period (".") ends the block (and is not part of the lines
+  to add).  Note that it is impossible to insert a line with just
+  a single dot.  Recommended procedure is to insert a line with
+  two dots, then remove the first character of that line using s/.//.