| 123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149 | 
							- Filename: 140-consensus-diffs.txt
 
- Title: Provide diffs between consensuses
 
- Version: $Revision$
 
- Last-Modified: $Date$
 
- Author: Peter Palfrader
 
- Created: 13-Jun-2008
 
- Status: Accepted
 
- Target: 0.2.2.x
 
- 1. Overview.
 
-   Tor clients and servers need a list of which relays are on the
 
-   network.  This list, the consensus, is created by authorities
 
-   hourly and clients fetch a copy of it, with some delay, hourly.
 
-   This proposal suggests that clients download diffs of consensuses
 
-   once they have a consensus instead of hourly downloading a full
 
-   consensus.
 
- 2. Numbers
 
-   After implementing proposal 138 which removes nodes that are not
 
-   running from the list a consensus document is about 92 kilobytes
 
-   in size after compression.
 
-   The diff between two consecutive consensus, in ed format, is on
 
-   average 13 kilobytes compressed.
 
- 3. Proposal
 
- 3.1 Clients
 
-   If a client has a consensus that is recent enough it SHOULD
 
-   try to download a diff to get the latest consensus rather than
 
-   fetching a full one.
 
-   [XXX: what is recent enough?
 
- 	time delta in hours / size of compressed diff
 
- 	 0	20
 
- 	 1	9650
 
- 	 2	17011
 
- 	 3	23150
 
- 	 4	29813
 
- 	 5	36079
 
- 	 6	39455
 
- 	 7	43903
 
- 	 8	48907
 
- 	 9	54549
 
- 	10	60057
 
- 	11	67810
 
- 	12	71171
 
- 	13	73863
 
- 	14	76048
 
- 	15	80031
 
- 	16	84686
 
- 	17	89862
 
- 	18	94760
 
- 	19	94868
 
- 	20	94223
 
- 	21	93921
 
- 	22	92144
 
- 	23	90228
 
- 	[ size of gzip compressed "diff -e" between the consensus on
 
- 	  2008-06-01-00:00:00 and the following consensuses that day.
 
- 	  Consensuses have been modified to exclude down routers per
 
- 	  proposal 138. ]
 
-    Data suggests that for the first few hours diffs are very useful,
 
-    saving about 60% for the first three hours, 30% for the first 10,
 
-    and almost nothing once we are past 16 hours.
 
-   ]
 
- 3.2 Servers
 
-   Directory authorities and servers need to keep up to X [XXX: depends
 
-   on how long clients try to download diffs per above] old consensus
 
-   documents so they can build diffs.  They should offer a diff to the
 
-   most recent consensus at the URL
 
-   http://tor.noreply.org/tor/status-vote/current/consensus/diff/<HASH>/<FPRLIST>
 
-   where hash is the full digest of the consensus the client currently
 
-   has, and FPRLIST is a list of (abbreviated) fingerprints of
 
-   authorities the client trusts.
 
-   Servers will only return a consensus if more than half of the requested
 
-   authorities have signed the document, otherwise a 404 error will be sent
 
-   back.  The fingerprints can be shortened to a length of any multiple of
 
-   two, using only the leftmost part of the encoded fingerprint.  Tor uses
 
-   3 bytes (6 hex characters) of the fingerprint.  (This is just like the
 
-   conditional consensus downloads that Tor supports starting with
 
-   0.1.2.1-alpha.)
 
-   If a server cannot offer a diff from the consensus identified by the
 
-   hash but has a current consensus it MUST return the full consensus.
 
-   [XXX: what should we do when the client already has the latest
 
-   consensus?  I can think of the following options:
 
-     - send back 3xx not modified
 
-     - send back 200 ok and an empty diff
 
-     - send back 404 nothing newer here.
 
-     I currently lean towards the empty diff.]
 
- 4. Diff Format
 
-   Diffs start with the token "network-status-diff-version" followed by a
 
-   space and the version number, currently "1".
 
-   If a document does not start with network-status-diff it is assumed
 
-   to be a full consensus download and would therefore currently start
 
-   with "network-status-version 3".
 
-   Following the network-status-diff header line is a diff, or patch, in
 
-   limited ed format.  We choose this format because it is easy to create
 
-   and process with standard tools (patch, diff -e, ed).  This will help
 
-   us in developing and testing this proposal and it should make future
 
-   debugging easier.
 
-   [ If at one point in the future we decide that the space benefits from
 
-     a custom diff format outweighs these benefits we can always
 
-     introduce a new diff format and offer it at for instance
 
-     ../diff2/... ]
 
-   We support the following ed commands, each on a line by itself:
 
-    - "<n1>d"          Delete line n1
 
-    - "<n1>,<n2>d"     Delete lines n1 through n2, including
 
-    - "<n1>c"          Replace line n1 with the following block
 
-    - "<n1>,<n2>c"     Replace lines n1 through n2, including, with the
 
-                       following block.
 
-    - "<n1>a"          Append the following block after line n1.
 
-    - "a"              Append the following block after the current line.
 
-    - "s/.//"          Remove the first character in the current line.
 
-   Note that line numbers always apply to the file after all previous
 
-   commands have already been applied.
 
-   The "current line" is either the first line of the file, if this is
 
-   the first command, the last line of a block we added in an append or
 
-   change command, or the line immediate following a set of lines we just
 
-   deleted (or the last line of the file if there are no lines after
 
-   that).
 
-   The replace and append command take blocks.  These blocks are simply
 
-   appended to the diff after the line with the command.  A line with
 
-   just a period (".") ends the block (and is not part of the lines
 
-   to add).  Note that it is impossible to insert a line with just
 
-   a single dot.  Recommended procedure is to insert a line with
 
-   two dots, then remove the first character of that line using s/.//.
 
 
  |