- $Id$
- Tor network discovery protocol
- 0. Scope
- This document proposes a way of doing more distributed network discovery
- while maintaining some amount of admission control. We don't recommend
- you implement this as-is; it needs more discussion.
- Terminology:
- - Client: The Tor component that chooses paths.
- - Server: A relay node that passes traffic along.
- 1. Goals.
- We want more decentralized discovery for network topology and status.
- In particular:
- 1a. We want to let clients learn about new servers from anywhere
- and build circuits through them if they wish. This means that
- Tor nodes need to be able to Extend to nodes they don't already
- know about. This is already implemented, but see the 'Extend policy'
- issue below.
- 1b. We want to provide a robust (available) and not-too-centralized
- mechanism for tracking network status (which nodes are up and working)
- and admission (which nodes are "recommended" for certain uses).
- 1c. [optional] We want to permit servers that can't route to all other
- servers, e.g. because they're behind NAT or otherwise firewalled.*
- 2. Assumptions.
- People get the code from us, and they trust us (or our gpg keys, or
- something down the trust chain that's equivalent).
- Even if the software allows humans to change the client configuration,
- most of them will use the default that's provided, so we should provide
- one that is the right balance of robust and safe.
- Assume that Sybil attackers can produce only a limited number of
- independent-looking nodes.
- Roger has only a limited amount of time for approving nodes, and doesn't
- want to be the time bottleneck anyway.
- We can trust servers to accurately report their characteristics (uptime,
- capacity, exit policies, etc), as long as we have some mechanism for
- notifying clients when we notice that a server is lying.
- There exists a "main" core Internet in which most locations can access
- most locations. We'll focus on it first.
- 3. Some notes on how to achieve this.
- We ship with S (e.g. 3) seed keys.
- We ship with N (e.g. 20) introducer locations and fingerprints.
- We ship with some set of signed timestamped certs for those introducers.
- Introducers serve signed network-status pages, listing their opinions
- of network status and which routers are good.
- They also serve descriptors in some way. These don't need to be signed by
- the introducers, since they're self-signed and timestamped by each server.
- A DHT is not so appropriate for distributing server descriptors as long
- as we expect each client to plan to collect all of them periodically. It
- would seem that each introducer might as well just keep its own
- big pile of descriptors, and they synchronize (pull) from each other
- periodically. Clients then get network-status pages from a threshold of
- introducers, fetch enough of the server descriptors to make them happy,
- and proceed as now. Anything wrong with this?
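
As a rough illustration of the client side of this idea, the sketch below
combines network-status pages from several introducers. The proposal does not
say how opinions are combined, so the rule used here (a router counts as good
once at least `threshold` introducers list it) is an assumption, as are the
function name and the data layout. Signature and timestamp checks on the pages
are assumed to have happened already.

```python
from collections import Counter
from typing import Dict, Set


def routers_meeting_threshold(status_pages: Dict[str, Set[str]],
                              threshold: int) -> Set[str]:
    """Return the routers listed as good by at least `threshold` introducers.

    status_pages maps an introducer name to the set of router names that
    its (already verified) network-status page lists as good.
    Hypothetical helper: the combining rule is an assumption, not part of
    the proposal.
    """
    if len(status_pages) < threshold:
        # The client has not yet heard from a threshold of introducers.
        raise ValueError("not enough network-status pages")

    votes: Counter = Counter()
    for good_routers in status_pages.values():
        votes.update(good_routers)  # one vote per introducer per router

    return {router for router, count in votes.items() if count >= threshold}
```

A client would then fetch server descriptors only for the routers returned
here, and proceed as now.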
- Notice that this doesn't preclude other approaches to discovering
- different concurrent Tor networks. For example, a Tor network inside
- China could ship Tor with a different torrc and poof, they're using
- a different set of seed keys and a different set of introducers. Some
- smarter clients could be made to learn about both networks, and be told
- which nodes bridge the networks.
- 4. Unresolved:
- - What new features need to be added to server descriptors so they
- remain compact yet support new functionality?
- - How do we compactly describe seeds, introducers, and certs? Does
- Tor have built-in defaults still, that can be overridden?
- - How much cert functionality do we want in our PKI? Can we revoke
- introducers, or is that done by releasing a new version of the code?
- - By what mechanism will new servers contact the humans who run
- introducers, so they can be approved?
- - Is our network growing because of people's trust in Roger? Will it
- grow the same way, or as robustly, or more robustly, with no
- figurehead?
- - 'Extend policies' -- middleman doesn't really mean middleman, alas.
- ----------
- (*) Regarding "Blossom: an unstructured overlay network for end-to-end
- connectivity."
- In this section we address possible solutions to the problem of how to allow
- Tor routers in different transport domains to communicate.
- First, we presume that for every interface between transport domains A and B,
- one Tor router T_A exists in transport domain A, one Tor router T_B exists in
- transport domain B, and (without loss of generality) T_A can open a persistent
- connection to T_B. Any Tor traffic between the two routers will occur over
- this connection, which effectively renders the routers equal partners in
- bridging between the two transport domains. We refer to the established link
- between two transport domains as a "bridge" (we use this term because there is
- no serious possibility of confusion with the notion of a layer 2 bridge).
- Next, suppose that the universe consists of transport domains connected by
- persistent connections in this manner. An individual router can open multiple
- connections to routers within the same foreign transport domain, and it can
- establish separate connections to routers within multiple foreign transport
- domains.
- As in regular Tor, each Blossom router pushes its descriptor to directory
- servers. These directory servers can be within the same transport domain, but
- they need not be. The trick is that if a directory server is in another
- transport domain, then that directory server must know through which Tor
- routers to send messages destined for the Tor router in question. Descriptors
- for Blossom routers held by the directory server must contain a special field
- for specifying a path through the overlay (i.e. an ordered list of router
- names/IDs) to a router in a foreign transport domain. (This field may be a set
- of paths rather than a single path.) A new router publishing to a directory
- server in a foreign transport domain should include a list of routers. This list
- should be either:
- a. ...a list of routers to which the router has persistent connections, or, if
- the new router does not have any persistent connections,
- b. ...a (not necessarily exhaustive) list of fellow routers that are in the
- same transport domain.
- The directory server will be able to use this information to derive a path to
- the new router, as follows. If the new router used approach (a), then the
- directory server will define the same path(s) in the descriptors for the
- router(s) specified in the list, with the corresponding specified router
- appended to each path. If the new router used approach (b), then the directory
- server will define the same path(s) in the descriptors for the routers
- specified in the list. The directory server will then insert the newly defined
- path into the descriptor from the router.
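
The derivation above can be sketched as follows. This is only one reading of
the text: a path is taken to be the ordered list of routers traversed before
reaching the target (so a directly reachable router has the empty path), and
`derive_paths`, `known_paths`, and the "a"/"b" labels are hypothetical names
chosen for illustration.

```python
from typing import Dict, List

Path = List[str]


def derive_paths(known_paths: Dict[str, List[Path]],
                 listed_routers: List[str],
                 approach: str) -> List[Path]:
    """Derive overlay paths to a newly published router.

    known_paths: paths the directory server already holds, per router.
    listed_routers: the list the new router included when publishing.
    approach: "a" if the list names routers the new router has persistent
              connections to, "b" if it names routers in its own domain.
    """
    if approach not in ("a", "b"):
        raise ValueError("approach must be 'a' or 'b'")

    derived: List[Path] = []
    for router in listed_routers:
        for path in known_paths.get(router, []):
            if approach == "a":
                # Reach the listed router, then hop across its connection.
                candidate = path + [router]
            else:
                # Same transport domain: the listed router's path also works.
                candidate = list(path)
            if candidate not in derived:
                derived.append(candidate)
    return derived
```

For example, if the directory server reaches T_B through T_A, a new router
that lists T_B under approach (b) gets the path [T_A], while one that lists
T_B under approach (a) gets [T_A, T_B].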
- If all directory servers are within the same transport domain, then the problem
- is solved: routers can exist within multiple transport domains, and as long as
- the network of transport domains is fully connected by bridges, any router will
- be able to access any other router in a foreign transport domain simply by
- extending along the path specified by the directory server. However, we want
- the system to be truly decentralized, which means not electing any particular
- transport domain to be the master domain in which entries are published.
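
The connectivity condition in this paragraph can be checked with a simple
graph search. The sketch below assumes that "fully connected by bridges" means
every transport domain can be reached from every other over some chain of
bridges, and that a bridge is usable in both directions (the routers at each
end being equal partners). The function and parameter names are hypothetical.

```python
from collections import deque
from typing import Dict, Iterable, Set, Tuple


def domains_connected(domains: Iterable[str],
                      bridges: Iterable[Tuple[str, str]]) -> bool:
    """Return True if every domain is reachable from every other via bridges."""
    wanted: Set[str] = set(domains)
    if not wanted:
        return True

    # Build an undirected adjacency map: each bridge links two domains.
    adjacent: Dict[str, Set[str]] = {d: set() for d in wanted}
    for a, b in bridges:
        adjacent.setdefault(a, set()).add(b)
        adjacent.setdefault(b, set()).add(a)

    # Breadth-first search from an arbitrary domain.
    start = next(iter(wanted))
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbor in adjacent[current]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)

    return wanted <= seen
```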
- Generally speaking, directory servers share information with each other about
- routers. In order for a directory server to share information with a directory
- server in a foreign transport domain to which it cannot speak directly, it must
- use Tor, which means referring to the other directory server by using a router
- in the foreign transport domain. However, in order to use Tor, it must be able
- to reach that router, which means that a descriptor for that router must exist
- in its table, along with a means of reaching it. Therefore, when routers in
- transport domain A cannot establish direct connections with routers in
- transport domain B, a mutual exchange of information between the two domains
- is possible only if some router in transport domain B has pushed its
- descriptor to a directory server in transport domain A, so that the directory
- server in transport domain A can use that router to reach the directory
- server in transport domain B.
- When confronted with a choice among multiple paths to the same router,
- Blossom nodes may choose the best one using a route selection protocol
- similar in design to that used by BGP. This may be a simple distance-vector
- procedure that takes only path length into account, or it may be more
- complex in order to avoid loops, cache results, etc.
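
A minimal sketch of the simplest variant mentioned here, choosing by path
length alone, might look like the following. Discarding paths that visit the
same router twice is an added assumption standing in for loop avoidance, and
`choose_path` is a hypothetical name; nothing here is meant as the actual
Blossom selection procedure.

```python
from typing import List, Optional

Path = List[str]


def choose_path(candidates: List[Path]) -> Optional[Path]:
    """Pick the shortest loop-free path, or None if there is none.

    A path counts as loop-free when no router name appears in it twice.
    Ties are broken in favor of the earliest candidate.
    """
    loop_free = [p for p in candidates if len(set(p)) == len(p)]
    if not loop_free:
        return None
    return min(loop_free, key=len)
```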