dir-spec.txt 8.5 KB

  1. $Id$
  2. Tor network discovery protocol
  3. 0. Scope
  4. This document proposes a way of doing more distributed network discovery
  5. while maintaining some amount of admission control. We don't recommend
  6. you implement this as-is; it needs more discussion.
  7. Terminology:
  8. - Client: The Tor component that chooses paths.
  9. - Server: A relay node that passes traffic along.
  10. 1. Goals.
  11. We want more decentralized discovery for network topology and status.
  12. In particular:
  13. 1a. We want to let clients learn about new servers from anywhere
  14. and build circuits through them if they wish. This means that
  15. Tor nodes need to be able to Extend to nodes they don't already
  16. know about. This is already implemented, but see the 'Extend policy'
  17. issue below.
  18. 1b. We want to provide a robust (available) and not-too-centralized
  19. mechanism for tracking network status (which nodes are up and working)
  20. and admission (which nodes are "recommended" for certain uses).
  21. 1c. [optional] We want to permit servers that can't route to all other
  22. servers, e.g. because they're behind NAT or otherwise firewalled.*
  23. 2. Assumptions.
  24. People get the code from us, and they trust us (or our gpg keys, or
  25. something down the trust chain that's equivalent).
  26. Even if the software allows humans to change the client configuration,
  27. most of them will use the default that's provided, so we should provide
  28. one that is the right balance of robust and safe.
  29. Assume that Sybil attackers can produce only a limited number of
  30. independent-looking nodes.
  31. Roger has only a limited amount of time for approving nodes, and doesn't
  32. want to be the time bottleneck anyway.
  33. We can trust servers to accurately report their characteristics (uptime,
  34. capacity, exit policies, etc), as long as we have some mechanism for
  35. notifying clients when we notice that they're lying.
  36. There exists a "main" core Internet in which most locations can access
  37. most locations. We'll focus on it first.
  38. 3. Some notes on how to achieve this.
  39. We ship with S (e.g. 3) seed keys.
  40. We ship with N (e.g. 20) introducer locations and fingerprints.
  41. We ship with some set of signed timestamped certs for those introducers.
  42. Introducers serve signed network-status pages, listing their opinions
  43. of network status and which routers are good.
  44. They also serve descriptors in some way. These don't need to be signed by
  45. the introducers, since they're self-signed and timestamped by each server.
  46. A DHT is not so appropriate for distributing server descriptors as long
  47. as we expect each client to plan to collect all of them periodically. It
  48. would seem that each introducer might as well just keep its own
  49. big pile of descriptors, and they synchronize (pull) from each other
  50. periodically. Clients then get network-status pages from a threshold of
  51. introducers, fetch enough of the server descriptors to make them happy,
  52. and proceed as now. Anything wrong with this?
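
The pull-synchronization and threshold steps described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the document does not say how descriptors are compared or how a client combines status pages, so the "newest timestamp wins" rule, the "listed as good by at least `threshold` introducers" rule, the `published` field, and both function names are hypothetical. Signature checking is omitted entirely.

```python
from collections import Counter


def merge_descriptors(mine, theirs):
    """Pull-sync sketch: keep the newest descriptor per server.

    Both arguments map a server ID to a descriptor dict carrying a
    'published' timestamp (a hypothetical field name).
    """
    merged = dict(mine)
    for server_id, desc in theirs.items():
        current = merged.get(server_id)
        if current is None or desc["published"] > current["published"]:
            merged[server_id] = desc
    return merged


def routers_meeting_threshold(status_pages, threshold):
    """Return the routers listed as good by at least `threshold` introducers.

    status_pages maps an introducer name to the routers it considers good.
    Assumption: each page's signature was already verified elsewhere.
    """
    votes = Counter()
    for good_routers in status_pages.values():
        votes.update(set(good_routers))  # one vote per introducer per router
    return {router for router, count in votes.items() if count >= threshold}


pages = {
    "intro1": ["A", "B"],
    "intro2": ["B", "C"],
    "intro3": ["B", "A"],
}
print(sorted(routers_meeting_threshold(pages, 2)))  # ['A', 'B']
```

A client would then fetch descriptors only for the routers in the returned set.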
  53. Notice that this doesn't preclude other approaches to discovering
  54. different concurrent Tor networks. For example, a Tor network inside
  55. China could ship Tor with a different torrc and poof, they're using
  56. a different set of seed keys and a different set of introducers. Some
  57. smarter clients could be made to learn about both networks, and be told
  58. which nodes bridge the networks.
  59. 4. Unresolved:
  60. - What new features need to be added to server descriptors so they
  61. remain compact yet support new functionality?
  62. - How do we compactly describe seeds, introducers, and certs? Does
  63. Tor have built-in defaults still, that can be overridden?
  64. - How much cert functionality do we want in our PKI? Can we revoke
  65. introducers, or is that done by releasing a new version of the code?
  66. - By what mechanism will new servers contact the humans who run
  67. introducers, so they can be approved?
  68. - Is our network growing because of people's trust in Roger? Will it
  69. grow the same way, or as robustly, or more robustly, with no
  70. figurehead?
  71. - 'Extend policies' -- middleman doesn't really mean middleman, alas.
  72. ----------
  73. (*) Regarding "Blossom: an unstructured overlay network for end-to-end
  74. connectivity."
  75. In this section we address possible solutions to the problem of how to allow
  76. Tor routers in different transport domains to communicate.
  77. First, we presume that for every interface between transport domains A and B,
  78. one Tor router T_A exists in transport domain A, one Tor router T_B exists in
  79. transport domain B, and (without loss of generality) T_A can open a persistent
  80. connection to T_B. Any Tor traffic between the two routers will occur over
  81. this connection, which effectively renders the routers equal partners in
  82. bridging between the two transport domains. We refer to the established link
  83. between two transport domains as a "bridge" (we use this term because there is
  84. no serious possibility of confusion with the notion of a layer 2 bridge).
  85. Next, suppose that the universe consists of transport domains connected by
  86. persistent connections in this manner. An individual router can open multiple
  87. connections to routers within the same foreign transport domain, and it can
  88. establish separate connections to routers within multiple foreign transport
  89. domains.
  90. As in regular Tor, each Blossom router pushes its descriptor to directory
  91. servers. These directory servers can be within the same transport domain, but
  92. they need not be. The trick is that if a directory server is in another
  93. transport domain, then that directory server must know through which Tor
  94. routers to send messages destined for the Tor router in question. Descriptors
  95. for Blossom routers held by the directory server must contain a special field
  96. for specifying a path through the overlay (i.e. an ordered list of router
  97. names/IDs) to a router in a foreign transport domain. (This field may be a set
  98. of paths rather than a single path.) A new router publishing to a directory
  99. server in a foreign transport domain should include a list of routers. This list
  100. should be either:
  101. a. ...a list of routers to which the router has persistent connections, or, if
  102. the new router does not have any persistent connections,
  103. b. ...a (not necessarily exhaustive) list of fellow routers that are in the
  104. same transport domain.
  105. The directory server will be able to use this information to derive a path to
  106. the new router, as follows. If the new router used approach (a), then the
  107. directory server will take the path(s) already defined in the descriptors for
  108. the router(s) specified in the list and append the corresponding specified
  109. router to each path. If the new router used approach (b), then the directory
  110. server will reuse the path(s) already defined in the descriptors for the
  111. routers specified in the list. The directory server will then insert the newly defined
  112. path into the descriptor from the router.
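
One possible reading of the path derivation above, as a minimal sketch. The data layout is an assumption: paths are tuples of router IDs, `known_paths` maps each router the directory server already knows to its path(s), and a directly reachable router has the empty path `()`. The function name `derive_paths` is hypothetical.

```python
def derive_paths(known_paths, listed_routers, approach):
    """Derive overlay paths to a newly published router.

    approach "a": the listed routers hold persistent connections to the
                  new router, so each listed router is appended to its
                  own known path(s).
    approach "b": the listed routers share the new router's transport
                  domain, so their known path(s) are reused unchanged.
    """
    if approach not in ("a", "b"):
        raise ValueError("approach must be 'a' or 'b'")
    derived = []
    for router in listed_routers:
        for path in known_paths.get(router, []):
            candidate = path + (router,) if approach == "a" else path
            if candidate not in derived:  # drop duplicates, keep order
                derived.append(candidate)
    return derived


# The directory server reaches T_B through T_A.
known = {"T_B": [("T_A",)]}
print(derive_paths(known, ["T_B"], "a"))  # [('T_A', 'T_B')]
print(derive_paths(known, ["T_B"], "b"))  # [('T_A',)]
```

The resulting list is what the directory server would insert into the new router's descriptor.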
  113. If all directory servers are within the same transport domain, then the problem
  114. is solved: routers can exist within multiple transport domains, and as long as
  115. the network of transport domains is fully connected by bridges, any router will
  116. be able to access any other router in a foreign transport domain simply by
  117. extending along the path specified by the directory server. However, we want
  118. the system to be truly decentralized, which means not electing any particular
  119. transport domain to be the master domain in which entries are published.
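
The condition that the transport domains be connected by bridges can be checked with a simple graph traversal. This sketch assumes "fully connected" means every domain is reachable from every other over some chain of bridges, and that a bridge can carry traffic in both directions once established. The function name and the input format are hypothetical.

```python
from collections import deque


def domains_connected(domains, bridges):
    """Return True if every transport domain is reachable from every other.

    domains: iterable of domain names.
    bridges: iterable of (domain_a, domain_b) pairs, treated as undirected.
    """
    domains = set(domains)
    if not domains:
        return True
    adjacency = {d: set() for d in domains}
    for a, b in bridges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    start = next(iter(domains))
    seen = {start}
    queue = deque([start])
    while queue:  # breadth-first search over the bridge graph
        current = queue.popleft()
        for neighbor in adjacency[current]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return domains <= seen


print(domains_connected(["A", "B", "C"], [("A", "B"), ("B", "C")]))  # True
print(domains_connected(["A", "B", "C"], [("A", "B")]))              # False
```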
  120. Generally speaking, directory servers share information with each other about
  121. routers. In order for a directory server to share information with a directory
  122. server in a foreign transport domain to which it cannot speak directly, it must
  123. use Tor, which means referring to the other directory server by using a router
  124. in the foreign transport domain. However, in order to use Tor, it must be able
  125. to reach that router, which means that a descriptor for that router must exist
  126. in its table, along with a means of reaching it. Therefore, in order for a
  127. mutual exchange of information between routers in transport domain A and those
  128. in transport domain B to be possible, when routers in transport domain A cannot
  129. establish direct connections with routers in transport domain B, then some
  130. router in transport domain B must have pushed its descriptor to a directory
  131. server in transport domain A, so that the directory server in transport domain
  132. A can use that router to reach the directory server in transport domain B.
  133. When confronted with the choice of multiple different paths to reach the same
  134. router, the Blossom nodes may use a route selection protocol similar in design
  135. to that used by BGP (this may be a simple distance-vector route selection procedure
  136. that only takes into account path length, or may be more complex to avoid
  137. loops, cache results, etc.) in order to choose the best one.
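
As an illustration of the simplest variant mentioned above, the following sketch picks the shortest candidate path after discarding any path that visits the same router twice. It is not BGP and not a specified Blossom procedure. Treating "best" as "fewest hops", breaking ties by input order, and the function name `choose_path` are all assumptions.

```python
def choose_path(candidate_paths):
    """Pick the shortest loop-free path, or None if none qualifies.

    candidate_paths: iterable of paths, each a tuple of router IDs.
    A path containing a repeated router is treated as a loop and skipped.
    """
    loop_free = [p for p in candidate_paths if len(set(p)) == len(p)]
    if not loop_free:
        return None
    return min(loop_free, key=len)  # ties go to the earliest candidate


paths = [
    ("A", "B", "A", "C"),  # loops through A, rejected
    ("A", "D", "C"),
    ("E", "C"),
]
print(choose_path(paths))  # ('E', 'C')
```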