dir-voting.txt 17 KB

  1. $Id: /tor/branches/eventdns/doc/dir-spec.txt 9469 2006-11-01T23:56:30.179423Z nickm $
  2. Voting on the Tor Directory System
  3. 0. Scope and preliminaries
  4. This document describes a consensus voting scheme for Tor directories.
  5. Once it's accepted, it should be merged with dir-spec.txt. Some
  6. preliminaries for authority and caching support should be done during
  7. the 0.1.2.x series; the main deployment should come during the 0.1.3.x
  8. series.
  9. 0.1. Goals and motivation: voting.
  10. The current directory system relies on clients downloading separate
  11. network status statements, each signed by one authority, from the caches.
  12. Clients download a new statement every 30 minutes or so, choosing to
  13. replace the oldest statement they currently have.
  14. This creates a partitioning problem: different clients have different
  15. "most recent" networkstatus sources, and different versions of each
  16. (since authorities change their statements often).
  17. It also creates a scaling problem: most of the downloaded networkstatus
  18. documents are probably quite similar, and the redundancy grows as we add
  19. more authorities.
  20. So if we have clients only download a single multiply signed consensus
  21. network status statement, we can:
  22. - Save bandwidth.
  23. - Reduce client partitioning
  24. - Reduce client-side and cache-side storage
  25. - Simplify client-side voting code (by moving voting away from the
  26. client)
  27. We should try to do this without:
  28. - Assuming that client-side or cache-side clocks are more correct
  29. than we assume now.
  30. - Assuming that authority clocks are perfectly correct.
  31. - Degrading badly if a few authorities die or are offline for a bit.
  32. We do not have to perform well if:
  33. - No clique of more than half the authorities can agree about who
  34. the authorities are.
  35. 1. The idea.
  36. Instead of publishing a network status whenever something changes,
  37. each authority instead publishes a fresh network status only once per
  38. "period" (say, 60 minutes). Authorities either upload this network
  39. status (or "vote") to every other authority, or download every other
  40. authority's "vote" (see 3.1 below for discussion on push vs pull).
  41. After an authority has (or has become convinced that it won't be able to
  42. get) every other authority's vote, it deterministically computes a
  43. consensus networkstatus, and signs it. Authorities download (or are
  44. sent; see 3.1) one another's signatures, and form a multiply-signed
  45. consensus. This multiply-signed consensus is what caches cache and what
  46. clients download.
  47. If an authority is down, the others vote based on what they *can*
  48. download or have uploaded to them.
  49. If an authority is "a little" down and only some authorities can reach
  50. it, authorities try to get its info from other authorities.
  51. If an authority computes the consensus wrong, its signature isn't
  52. included on the consensus.
  53. Clients use a consensus if it is "trusted": signed by more than half the
  54. authorities they recognize. If clients can't find any such consensus,
  55. they use the most recent trusted consensus they have. If they don't
  56. have any trusted consensus, they warn the user and refuse to operate
  57. (and if DirServers is not the default, beg the user to adapt the list
  58. of authorities).
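The client-side trust rule above can be sketched as follows. This is a minimal illustration, not Tor's implementation: the function name is_trusted and the use of plain sets of authority identifiers are assumptions, and the signatures are taken to have been verified already.

```python
def is_trusted(valid_signers, recognized_authorities):
    """Return True if more than half of the recognized authorities signed.

    valid_signers: identifiers of authorities whose signatures on the
        consensus have already been checked (hypothetical input).
    recognized_authorities: identifiers of every authority this client
        recognizes.
    """
    recognized = set(recognized_authorities)
    # Signatures from authorities the client does not recognize are ignored.
    count = len(set(valid_signers) & recognized)
    # "More than half" is a strict majority: 3 of 5 passes, 2 of 4 does not.
    return 2 * count > len(recognized)
```

Note that the denominator is the number of recognized authorities, not the number of signatures present, so a consensus signed by two of four recognized authorities is not trusted.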
  59. 2. Details.
  60. 2.1. Vote specifications
  61. Votes in v2.1 are similar to v2 network status documents. We add these
  62. fields to the preamble:
  63. "vote-status" -- the word "vote".
  64. "valid-until" -- the time when this authority expects to publish its
  65. next vote.
  66. "known-flags" -- a space-separated list of flags that will sometimes
  67. be included on "s" lines later in the vote.
  68. "dir-source" -- as before, except the "hostname" part MUST be the
  69. authority's nickname, which MUST be unique among authorities, and
  70. MUST match the nickname in the "directory-signature" entry.
  71. Authorities SHOULD cache their most recently generated votes so they
  72. can persist them across restarts. Authorities SHOULD NOT generate
  73. another document until valid-until has passed.
  74. Router entries in the vote MUST be sorted in ascending order by router
  75. identity digest. The flags in "s" lines MUST appear in alphabetical
  76. order.
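The two ordering rules above could be applied like this before a vote is serialized. This is only a sketch: the function name and the (digest, flags) tuple layout are assumptions, digests are compared as raw bytes, and "alphabetical" is taken to mean plain code-point order, which the text does not spell out.

```python
def order_vote_entries(entries):
    """Sort router entries by identity digest and each entry's flags by name.

    entries: iterable of (identity_digest, flags) pairs, where
        identity_digest is a bytes object and flags is an iterable of
        flag names (hypothetical in-memory layout).
    """
    ordered = []
    # Ascending order by identity digest, comparing the raw bytes.
    for digest, flags in sorted(entries, key=lambda entry: entry[0]):
        # Duplicate flags are dropped; the rest are sorted alphabetically.
        ordered.append((digest, sorted(set(flags))))
    return ordered
```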
  77. Votes SHOULD be synchronized to half-hour publication intervals (one
  78. hour? XXX say more; be more precise.)
  79. XXXX some way to request older networkstatus docs?
  80. 2.2. Consensus directory specifications
  81. Consensuses are like v2.1 votes, except for the following fields:
  82. "vote-status" -- the word "consensus".
  83. "published" is the latest of all the published times on the votes.
  84. "valid-until" is the earliest of all the valid-until times on the
  85. votes.
  86. "dir-source" and "fingerprint" and "dir-signing-key" and "contact"
  87. are included for each authority that contributed to the vote.
  88. "vote-digest" is included for each authority that contributed to the
  89. vote; it is the digest of that authority's vote, calculated the same
  90. way as the digest covered by the signature on the vote.
  91. "client-versions" and "server-versions" are sorted in ascending
  92. order based on version-spec.txt.
  93. "dir-options" and "known-flags" are not included.
  94. [XXX really? why not list the ones that are used in the consensus?
  95. For example, right now BadExit is in use, but no servers would be
  96. labelled BadExit, and it's still worth knowing that it was considered
  97. by the authorities. -RD]
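The "published" and "valid-until" rules in the list above amount to a max and a min over the votes. A minimal sketch, assuming each vote has already been parsed into a dict with datetime values under those two keys (the parsing step and the dict layout are not part of this document):

```python
from datetime import datetime


def consensus_times(votes):
    """Return (published, valid_until) for the consensus.

    votes: non-empty list of dicts holding datetime values under the
        keys "published" and "valid-until" (hypothetical layout).
    """
    # "published" is the latest published time among the votes.
    published = max(vote["published"] for vote in votes)
    # "valid-until" is the earliest valid-until time among the votes.
    valid_until = min(vote["valid-until"] for vote in votes)
    return published, valid_until
```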
  98. The fields MUST occur in the following order:
  99. "network-status-version"
  100. "vote-status"
  101. "published"
  102. "valid-until"
  103. For each authority, sorted in ascending order of nickname, case-
  104. insensitively:
  105. "dir-source", "fingerprint", "contact", "dir-signing-key",
  106. "vote-digest".
  107. "client-versions"
  108. "server-versions"
  109. The signatures at the end of the document appear as multiple instances
  110. of directory-signature, sorted in ascending order by nickname,
  111. case-insensitively.
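The required field order, with the per-authority block repeated in case-insensitive nickname order, might be generated as below. This sketch only yields (keyword, nickname) pairs and does not format the lines themselves; the function and constant names are assumptions, and the nicknames in the usage note are made up.

```python
PREAMBLE = ["network-status-version", "vote-status", "published", "valid-until"]
PER_AUTHORITY = ["dir-source", "fingerprint", "contact", "dir-signing-key",
                 "vote-digest"]
VERSIONS = ["client-versions", "server-versions"]


def consensus_keyword_order(nicknames):
    """List (keyword, nickname) pairs in the order the fields must occur.

    nickname is None for fields that are not tied to one authority.
    """
    order = [(keyword, None) for keyword in PREAMBLE]
    # Authorities are sorted by nickname, ignoring case.
    for nickname in sorted(nicknames, key=str.lower):
        order.extend((keyword, nickname) for keyword in PER_AUTHORITY)
    order.extend((keyword, None) for keyword in VERSIONS)
    return order
```

With the made-up nicknames "alpha" and "Bravo", the "alpha" block comes first, whereas a case-sensitive sort would put "Bravo" first. The same sort key would apply to the directory-signature entries at the end of the document.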
  112. A router entry should be included in the result if it is included by more
  113. than half of the authorities (total authorities, not just those whose votes
  114. we have). A router entry has a flag set if that flag is set by more than
  115. half of the authorities who care about that flag. [XXXX this creates an
  116. incentive for attackers to DoS authorities whose votes they don't like.
  117. Can we remember what flags people set the last time we saw them? -NM]
  118. [Which 'we' are we talking here? The end-users never learn which
  119. authority sets which flags. So you're thinking the authorities
  120. should record the last vote they saw from each authority and if it's
  121. within a week or so, count all the flags that it advertised as 'no'
  122. votes? Plausible. -RD]
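The inclusion and flag rules above can be sketched as follows. Two things here are assumptions rather than settled text: an authority is treated as "caring about" a flag if the flag is in its known-flags list, and (as the open question above notes) only authorities whose votes we actually hold are counted for flags. The function name and dict layout are hypothetical.

```python
def compute_entries(votes, total_authorities):
    """Map each included router to its sorted list of consensus flags.

    votes: list of dicts with keys
        "known_flags": iterable of flag names the authority may set
        "routers": dict mapping router identity -> set of flags set
    total_authorities: number of authorities, including any whose
        votes are missing.
    """
    listed = {}
    all_flags = set()
    for vote in votes:
        all_flags.update(vote["known_flags"])
        for router_id in vote["routers"]:
            listed[router_id] = listed.get(router_id, 0) + 1

    consensus = {}
    for router_id, count in listed.items():
        # Include a router only if more than half of ALL authorities list it.
        if 2 * count <= total_authorities:
            continue
        flags = []
        for flag in sorted(all_flags):
            carers = [v for v in votes if flag in v["known_flags"]]
            setters = [v for v in carers
                       if flag in v["routers"].get(router_id, set())]
            # Set the flag if more than half of the carers set it.
            if 2 * len(setters) > len(carers):
                flags.append(flag)
        consensus[router_id] = flags
    return consensus
```

Because the flag threshold is relative to the votes present, a missing vote changes the outcome for flags but not for router inclusion, which is the incentive problem raised in the note above.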
  123. The signature hash covers from the "network-status-version" line through
  124. the characters "directory-signature" in the first "directory-signature"
  125. line.
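To make the covered range concrete, here is a rough sketch that extracts the signed portion of a consensus and hashes it. The text above does not name the hash function, so SHA-1 is an assumption here, as are the function names and the requirement that both keywords start a line.

```python
import hashlib

START_KEYWORD = "network-status-version"
END_KEYWORD = "directory-signature"


def _find_line_start(text, keyword, begin=0):
    """Return the index of the first line at or after begin that starts with keyword."""
    pos = text.find(keyword, begin)
    while pos >= 0:
        if pos == 0 or text[pos - 1] == "\n":
            return pos
        pos = text.find(keyword, pos + 1)
    raise ValueError("keyword not found at start of a line: " + keyword)


def signed_portion(document):
    """Return the bytes covered by the signature hash.

    The range runs from the "network-status-version" line through the
    characters "directory-signature" on the first such line.
    """
    start = _find_line_start(document, START_KEYWORD)
    end = _find_line_start(document, END_KEYWORD, start)
    return document[start:end + len(END_KEYWORD)].encode("ascii")


def consensus_digest(document):
    # SHA-1 is assumed; substitute whatever hash the spec settles on.
    return hashlib.sha1(signed_portion(document)).hexdigest()
```

Since the covered range stops before the first signature's nickname, every authority hashes the same bytes no matter how many signatures have been appended.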
  126. Consensus directories SHOULD be rejected if they are not signed by more
  127. than half of the known authorities.
  128. 2.2.1. Detached signatures
  129. Assuming full connectivity, every authority should compute and sign the
  130. same consensus directory in each period. Therefore, it isn't necessary to
  131. download the consensus computed by each authority; instead, the authorities
  132. only push/fetch each other's signatures. A "detached signature" document
  133. contains a single "consensus-digest" entry and one or more
  134. directory-signature entries. [XXXX specify more.]
  135. 2.3. URLs and timelines
  136. 2.3.1. URLs and timeline used for agreement
  137. An authority SHOULD publish its vote immediately at the start of each voting
  138. period. It does this by making it available at
  139. http://<hostname>/tor/status-vote/current/authority.z
  140. and sending it in an HTTP POST request to each other authority at the URL
  141. http://<hostname>/tor/post/vote
  142. If, N minutes after the voting period has begun, an authority does not have
  143. a current statement from another authority, the first authority retrieves
  144. the other's statement.
  145. Once an authority has a vote from another authority, it makes it available
  146. at
  147. http://<hostname>/tor/status-vote/current/<fp>.z
  148. where <fp> is the fingerprint of the other authority's identity key.
  149. The consensus network status, along with as many signatures as the server
  150. currently knows, should be available at
  151. http://<hostname>/tor/status-vote/current/consensus.z
  152. All of the detached signatures it knows for the consensus status should be
  153. available at:
  154. http://<hostname>/tor/status-vote/current/consensus-signatures.z
  155. Once an authority has computed and signed a consensus network status, it
  156. should send its detached signature to each other authority in an HTTP POST
  157. request to the URL:
  158. http://<hostname>/tor/post/consensus-signature
  159. [XXXX Store votes to disk.]
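For reference, the URLs used during agreement can be collected in one helper. The paths are the ones listed in this section; the function name, the dictionary keys, and the idea of passing fingerprints through unchanged (their encoding is not specified here) are assumptions.

```python
def agreement_urls(hostname, other_fingerprints=()):
    """Build the agreement-phase URLs served or posted to on one authority.

    hostname: the authority's hostname, substituted for <hostname>.
    other_fingerprints: identity-key fingerprints of other authorities
        whose votes this authority holds, substituted for <fp> as given.
    """
    base = "http://%s/tor" % hostname
    urls = {
        "own-vote": base + "/status-vote/current/authority.z",
        "post-vote": base + "/post/vote",
        "consensus": base + "/status-vote/current/consensus.z",
        "consensus-signatures":
            base + "/status-vote/current/consensus-signatures.z",
        "post-consensus-signature": base + "/post/consensus-signature",
    }
    for fingerprint in other_fingerprints:
        urls["vote:" + fingerprint] = (
            base + "/status-vote/current/%s.z" % fingerprint)
    return urls
```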
  160. 2.3.2. Serving a consensus directory
  161. Once the authority is done getting signatures on the consensus directory,
  162. it should serve it from:
  163. http://<hostname>/tor/status/consensus.z
  164. Caches SHOULD download consensus directories from an authority and serve
  165. them from the same URL.
  166. 2.3.3. Timeline and synchronization
  167. [XXXX]
  168. 2.4. Distributing routerdescs between authorities
  169. Consensus will be more meaningful if authorities take steps to make sure
  170. that they all have the same set of descriptors _before_ the voting
  171. starts. This is safe, since all descriptors are self-certified and
  172. timestamped: it's always okay to replace a signed descriptor with a more
  173. recent one signed by the same identity.
  174. In the long run, we might want some kind of sophisticated process here.
  175. For now, since authorities already download one another's networkstatus
  176. documents and use them to determine what descriptors to download from one
  177. another, we can rely on this existing mechanism to keep authorities up to
  178. date.
  179. [We should do a thorough read-through of dir-spec again to make sure
  180. that the authorities converge on which descriptor to "prefer" for
  181. each router. Right now the decision happens at the client, which is
  182. no longer the right place for it. -RD]
  183. 3. Questions and concerns
  184. 3.1. Push or pull?
  185. The URLs above define a push mechanism for publishing votes and consensus
  186. signatures via HTTP POST requests, and a pull mechanism for downloading
  187. these documents via HTTP GET requests. As specified, every authority will
  188. post to every other. The "download if no copy has been received" mechanism
  189. exists only as a fallback.
  190. 3.2. Dropping "opt".
  191. The "opt" keyword in Tor's directory formats was originally intended to
  192. mean, "it is okay to ignore this entry if you don't understand it"; the
  193. default behavior has been "discard a routerdesc if it contains entries you
  194. don't recognize."
  195. But so far, every new flag we have added has been marked 'opt'. It would
  196. probably make sense to change the default behavior to "ignore unrecognized
  197. fields", and add the statement that clients SHOULD ignore fields they don't
  198. recognize. As a meta-principle, we should say that clients and servers
  199. MUST NOT have to understand new fields in order to use directory documents
  200. correctly.
  201. Of course, this will make it impossible to say, "The format has changed a
  202. lot; discard this quietly if you don't understand it." We could do that by
  203. adding a version field.
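The proposed default of ignoring unrecognized fields could look like this in a parser. This is a simplified sketch: it treats every line as a keyword followed by space-separated arguments, which glosses over multi-line objects such as keys and signatures, and the function name is made up.

```python
def parse_entries(text, known_keywords):
    """Return (keyword, args) pairs for the recognized entries in text.

    A leading "opt" is accepted and stripped, and entries with keywords
    the caller does not recognize are skipped rather than causing the
    whole document to be discarded.
    """
    recognized = []
    for line in text.splitlines():
        tokens = line.split()
        # Older documents may still prefix optional entries with "opt".
        if tokens and tokens[0] == "opt":
            tokens = tokens[1:]
        if not tokens:
            continue
        keyword, args = tokens[0], tokens[1:]
        if keyword in known_keywords:
            recognized.append((keyword, args))
        # Otherwise: silently ignore the unrecognized field.
    return recognized
```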
  204. 3.3. Multilevel keys.
  205. Replacing a directory authority's identity key in the event of a compromise
  206. would be tremendously annoying. We'd need to tell every client to switch
  207. their configuration, or update to a new version with an updated list. So
  208. long as some weren't upgraded, they'd be at risk from whoever had
  209. compromised the key.
  210. With this in mind, it's a shame that our current protocol forces us to
  211. store identity keys unencrypted in RAM. We need some kind of signing key
  212. stored unencrypted, since we need to generate new descriptors/directories
  213. and rotate link and onion keys regularly. (And since, of course, we can't
  214. ask server operators to be on-hand to enter a passphrase every time we
  215. want to rotate keys or sign a descriptor.)
  216. The obvious solution seems to be to have a signing-only key that lives
  217. indefinitely (months or longer) and signs descriptors and link keys, and a
  218. separate identity key that's used to sign the signing key. Tor servers
  219. could run in one of several modes:
  220. 1. Identity key stored encrypted. You need to pick a passphrase when
  221. you enable this mode, and re-enter this passphrase every time you
  222. rotate the signing key.
  223. 1'. Identity key stored separate. You save your identity key to a
  224. floppy, and use the floppy when you need to rotate the signing key.
  225. 2. All keys stored unencrypted. In this case, we might not want to even
  226. *have* a separate signing key. (We'll need to support no-separate-
  227. signing-key mode anyway to keep old servers working.)
  228. 3. All keys stored encrypted. You need to enter a passphrase to start
  229. Tor.
  230. (Of course, we might not want to implement all of these.)
  231. Case 1 is probably most usable and secure, if we assume that people don't
  232. forget their passphrases or lose their floppies. We could mitigate this a
  233. bit by encouraging people to PGP-encrypt their passphrases to themselves,
  234. or keep a cleartext copy of their secret key secret-split into a few
  235. pieces, or something like that.
  236. Migration presents another difficulty, especially with the authorities. If
  237. we use the current set of identity keys as the new identity keys, we're in
  238. the position of having sensitive keys that have been stored on
  239. media-of-dubious-encryption up to now. Also, we need to keep old clients
  240. (who will expect descriptors to be signed by the identity keys they know
  241. and love, and who will not understand signing keys) happy.
  242. I'd enumerate designs here, but I'm hoping that somebody will come up with
  243. a better one, so I'll try not to prejudice them with more ideas yet.
  244. Oh, and of course, we'll want to make sure that the keys are
  245. cross-certified. :)
  246. Ideas? -NM
  247. 3.4. Long and short descriptors
  248. Some of the costliest fields in the current directory protocol are ones
  249. that no client actually uses. In particular, the "read-history" and
  250. "write-history" fields are used only by the authorities for monitoring the
  251. status of the network. If we took them out, the size of a compressed list
  252. of all the routers would fall by about 60%. (No other disposable field
  253. would save more than 2%.)
  254. One possible solution here is that routers should generate and upload a
  255. short-form and long-form descriptor. Only the short-form descriptor should
  256. ever be used by anybody for routing. The long-form descriptor should be
  257. used only for analytics and other tools. (If we allowed people to route with
  258. long descriptors, we'd have to ensure that they stayed in sync with the
  259. short ones somehow.) We can ensure that the short descriptors are used by
  260. only recommending those in the network statuses.
  261. Another possible solution would be to drop these fields from descriptors,
  262. and have them uploaded as a part of a separate "bandwidth report" to the
  263. authorities. This could help prevent the mistake of using long descriptors
  264. in the place of short ones.
  265. Thoughts? -NM
  266. 3.5. Compression
  267. Gzip would be easier to work with than zlib; bzip2 would result in smaller
  268. data lengths. [Concretely, we're looking at about 10-15% space savings at
  269. the expense of 3-5x longer compression time for using bzip2.] Doing
  270. on-the-fly gzip requires zlib 1.2 or later; doing bzip2 requires bzlib.
  271. Pre-compressing status documents in multiple formats would force us to use
  272. more memory to hold them.
  273. 4. Migration
  274. For directory voting:
  275. * It would be cool if caches could get ready to download consensus
  276. status docs, verify enough signatures, and serve them now. That way
  277. once stuff works all we need to do is upgrade the authorities. Caches
  278. don't need to verify the correctness of the format so long as it's
  279. signed (or maybe multisigned?). We need to make sure that caches back
  280. off very quickly from downloading consensus docs until they're
  281. actually implemented.
  282. For dropping the "opt" requirement:
  283. * stopped requiring it as of 0.1.2.5-alpha. Stop generating it once
  284. earlier formats are obsolete.
  285. For multilevel keys:
  286. * no idea
  287. For long/short descriptors:
  288. * In 0.1.2.x:
  289. * Authorities should accept both, now, and silently drop short
  290. descriptors.
  291. * Routers should upload both once authorities accept them.
  292. * There should be a "long descriptor" url and the current "normal" URL.
  293. Authorities should serve long descriptors from both URLs.
  294. * Once tools that want long descriptors support fetching them from the
  295. "long descriptor" URL:
  296. * Have authorities remember short descriptors, and serve them from the
  297. 'normal' URL.