- Filename: 114-distributed-storage.txt
- Title: Distributed Storage for Tor Hidden Service Descriptors
- Version: $Revision$
- Last-Modified: $Date$
- Author: Karsten Loesing
- Created: 13-May-2007
- Status: Closed
- Implemented-In: 0.2.0.x
- Change history:
- 13-May-2007 Initial proposal
- 14-May-2007 Added changes suggested by Lasse Øverlier
- 30-May-2007 Changed descriptor format, key length discussion, typos
- 09-Jul-2007 Incorporated suggestions by Roger, added status of specification
- and implementation for upcoming GSoC mid-term evaluation
- 11-Aug-2007 Updated implementation statuses, included non-consecutive
- replication to descriptor format
- 20-Aug-2007 Renamed config option HSDir as HidServDirectoryV2
- 02-Dec-2007 Closed proposal
- Overview:
- The basic idea of this proposal is to distribute the tasks of storing and
- serving hidden service descriptors from currently three authoritative
- directory nodes among a large subset of all onion routers. The three
- reasons to do this are better robustness (availability), better
- scalability, and improved security properties. Further,
- this proposal suggests changes to the hidden service descriptor format to
- prevent new security threats coming from decentralization and to gain even
- better security properties.
- Status:
- As of December 2007, the new hidden service descriptor format is implemented
- and usable. However, servers and clients do not yet make use of descriptor
- cookies, because there are open usability issues of this feature that might
- be resolved in proposal 121. Further, hidden service directories do not
- perform replication by themselves, because (unauthorized) replica fetch
- requests would allow any attacker to fetch all hidden service descriptors in
- the system. As neither issue is critical to the functioning of v2
- descriptors and their distribution, this proposal is considered as Closed.
-
- Motivation:
- The current design of hidden services exhibits the following performance and
- security problems:
- First, the three hidden service authoritative directories constitute a
- performance bottleneck in the system. The directory nodes are responsible for
- storing and serving all hidden service descriptors. As of May 2007 there are
- about 1000 descriptors at a time, but this number is assumed to increase in
- the future. Further, there is no replication protocol for descriptors between
- the three directory nodes, so that hidden services must ensure the
- availability of their descriptors by manually publishing them on all
- directory nodes. Whenever a fourth or fifth hidden service authoritative
- directory is added, hidden services will need to maintain an equally
- increasing number of replicas. These scalability issues have an impact on the
- current usage of hidden services and put an even higher burden on the
- development of new kinds of applications for hidden services that might
- require storing even more descriptors.
- Second, besides posing a limitation to scalability, storing all hidden
- service descriptors on three directory nodes also constitutes a security
- risk. The directory node operators could easily analyze the publish and fetch
- requests to derive information on service activity and usage and read the
- descriptor contents to determine which onion routers work as introduction
- points for a given hidden service and need to be attacked or threatened to
- shut it down. Furthermore, the contents of a hidden service descriptor offer
- only minimal security properties to the hidden service. Whoever becomes aware of
- the service ID can easily find out whether the service is active at the
- moment and which introduction points it has. This applies to (former)
- clients, (former) introduction points, and of course to the directory nodes.
- All it takes is requesting the descriptor for the given service ID, which
- anyone can do anonymously.
- This proposal suggests two major changes to approach the described
- performance and security problems:
- The first change affects the storage location for hidden service descriptors.
- Descriptors are distributed among a large subset of all onion routers instead
- of three fixed directory nodes. Each storing node is responsible for a subset
- of descriptors for a limited time only. It is not able to choose which
- descriptors it stores at a certain time, because this is determined by its
- onion ID which is hard to change frequently and in time (only routers which
- are stable for a given time are accepted as storing nodes). In order to
- resist single node failures and untrustworthy nodes, descriptors are
- replicated among a certain number of storing nodes. A first replication
- protocol makes sure that descriptors don't get lost when the node population
- changes; therefore, a storing node periodically requests the descriptors from
- its siblings. A second replication protocol distributes descriptors among
- non-consecutive nodes of the ID ring to prevent a group of adversaries from
- generating new onion keys until they have consecutive IDs to create a 'black
- hole' in the ring and make random services unavailable. Connections to
- storing nodes are established by extending existing circuits by one hop to
- the storing node. This also ensures that contents are encrypted. The effect
- of this first change is that the probability that a single node operator
- learns about a certain hidden service is very small and that it is very hard
- to track a service over time, even when it collaborates with other node
- operators.
-
- The second change concerns the content of hidden service descriptors.
- Obviously, security problems cannot be solved only by decentralizing storage;
- in fact, they could also get worse if done without caution. First, a
- descriptor ID needs to change periodically in order to be stored on changing
- nodes over time. Next, the descriptor ID needs to be computable only for the
- service's clients, but should be unpredictable for all other nodes. Further,
- the storing node needs to be able to verify that the hidden service is the
- true originator of the descriptor with the given ID even though it is not a
- client. Finally, a storing node should learn as little information as
- necessary by storing a descriptor, because it might not be as trustworthy as
- a directory node; for example it does not need to know the list of
- introduction points. Therefore, a second key is applied that is only known to
- the hidden service provider and its clients and that is not included in the
- descriptor. It is used to calculate descriptor IDs and to encrypt the
- introduction points. This second key can either be given to all clients
- together with the hidden service ID, or to a group or a single client as
- an authentication token. In the future this second key could be the result of
- some key agreement protocol between the hidden service and one or more
- clients. A new text-based format is proposed for descriptors instead of an
- extension of the existing binary format for reasons of future extensibility.
- Design:
- The proposed design is described by the required changes to the current
- design. These requirements are grouped by content, rather than by affected
- specification documents or code files, and numbered for reference below.
- Hidden service clients, servers, and directories:
- /1/ Create routing list
- All participants can filter the consensus status document received from the
- directory authorities to one routing list containing only those servers
- that store and serve hidden service descriptors and that have been running
- for at least 24 hours. A participant only trusts its own routing list and never
- learns about routing information from other parties.
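- The filtering step in /1/ can be sketched as follows. This is a minimal
- illustration, not the Tor implementation: the RouterStatus type and its field
- names are hypothetical stand-ins for whatever a parsed consensus entry
- provides, and sorting by ID is an assumption made so that the result can be
- used directly as the ID ring.

```python
from dataclasses import dataclass

MIN_UPTIME = 24 * 60 * 60  # "running for at least 24 hours", in seconds


@dataclass(frozen=True)
class RouterStatus:
    """Hypothetical, simplified view of one consensus entry."""
    identity: bytes               # router ID used for placement on the ring
    serves_hs_descriptors: bool   # router offers hidden service directory duty
    uptime_seconds: int


def build_routing_list(consensus):
    """Keep only eligible hidden service directories, sorted by ID."""
    eligible = [
        r for r in consensus
        if r.serves_hs_descriptors and r.uptime_seconds >= MIN_UPTIME
    ]
    # Sorting by ID yields the ring order used for responsibility lookups.
    return sorted(eligible, key=lambda r: r.identity)
```

- Each participant would run this on its own copy of the consensus, which
- matches the rule that routing information is never learned from other parties.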
- /2/ Determine responsible hidden service directory
- All participants can determine the hidden service directory that is
- responsible for storing and serving a given ID, as well as the hidden
- service directories that replicate its content. Every hidden service
- directory is responsible for the descriptor IDs in the interval from
- its predecessor, exclusive, to its own ID, inclusive. Further, a hidden
- service directory holds replicas for its n predecessors, where n denotes
- the number of consecutive replicas. (requires /1/)
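- The lookup in /2/ can be sketched with a binary search over the sorted ring.
- This is an illustration under assumptions: IDs are compared as raw byte
- strings, and the parameter "copies" (hypothetical name) is the total number
- of consecutive nodes holding a descriptor, i.e. the responsible node plus the
- successors that replicate it.

```python
import bisect


def responsible_directories(ring, descriptor_id, copies=3):
    """Return the responsible directory ID followed by its successors.

    ring is a sorted list of directory IDs (bytes). A directory is
    responsible for the interval from its predecessor (exclusive) to its
    own ID (inclusive), so we look for the first ID >= descriptor_id.
    """
    if not ring:
        return []
    start = bisect.bisect_left(ring, descriptor_id)
    count = min(copies, len(ring))
    # Modulo wraps around the end of the ID ring.
    return [ring[(start + i) % len(ring)] for i in range(count)]
```

- For a ring with IDs 10, 20, 30, and 40, a descriptor ID of 15 maps to the
- nodes 20, 30, and 40, while a descriptor ID of 45 wraps around to 10, 20,
- and 30.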
- [/3/ and /4/ were requirements to use BEGIN_DIR cells for directory
- requests which have not been fulfilled in the course of the implementation
- of this proposal, but elsewhere.]
- Hidden service directory nodes:
-
- /5/ Advertise hidden service directory functionality
- Every onion router that has its directory port open can decide whether it
- wants to store and serve hidden service descriptors by setting a new config
- option "HidServDirectoryV2" 0|1 to 1. An onion router with this config
- option being set includes the flag "hidden-service-dir" in its router
- descriptors that it sends to directory authorities.
- /6/ Accept v2 publish requests, parse and store v2 descriptors
- Hidden service directory nodes accept publish requests for hidden service
- descriptors and store them in local memory. (It is not necessary to
- make descriptors persistent, because after disconnecting, the onion router
- would not be accepted as storing node anyway, because it has not been
- running for at least 24 hours.) All requests and replies are formatted as
- HTTP messages. Requests are directed to the router's directory port and are
- contained within BEGIN_DIR cells. A hidden service directory node stores a
- descriptor only when it thinks that it is responsible for storing that
- descriptor based on its own routing table. Every hidden service directory
- node is responsible for the descriptor IDs in the interval of its n-th
- predecessor in the ID circle up to its own ID (n denotes the number of
- consecutive replicas). (requires /1/)
- /7/ Accept v2 fetch requests
- Same as /6/, but with fetch requests for hidden service descriptors.
- (requires /2/)
- /8/ Replicate descriptors with neighbors
- A hidden service directory node replicates descriptors from its two
- predecessors by downloading them once an hour. Further, it checks its
- routing table periodically for changes. Whenever it realizes that a
- predecessor has left the network, it establishes a connection to the new
- n-th predecessor and requests its stored descriptors in the interval of its
- (n+1)-th predecessor and the requested n-th predecessor. Whenever it
- realizes that a new onion router has joined with an ID higher than its
- former n-th predecessor, it adds it to its predecessors and discards all
- descriptors in the interval of its (n+1)-th and its n-th predecessor.
- (requires /1/)
- [Dec 02: This function has not been implemented, because arbitrary nodes
- would have been able to download the entire set of v2 descriptors. An
- authorized replication request would be necessary. For the moment, the
- system runs without any directory-side replication. -KL]
- Authoritative directory nodes:
- /9/ Confirm a router's hidden service directory functionality
- Directory nodes include a new flag "HSDir" for routers that decided to
- provide storage for hidden service descriptors and that have been running for at
- least 24 hours. The last requirement prevents a node from frequently
- changing its onion key to become responsible for an identifier it wants to
- target.
- Hidden service provider:
- /10/ Configure v2 hidden service
- Each hidden service provider that has set the config option
- "PublishV2HidServDescriptors" 0|1 to 1 is configured to publish v2
- descriptors and conform to the v2 connection establishment protocol. When
- configuring a hidden service, a hidden service provider checks if it has
- already created a random secret_cookie and a hostname2 file; if not, it
- creates both of them. (requires /2/)
- /11/ Establish introduction points with fresh key
- If configured to publish only v2 descriptors and no v0/v1 descriptors any
- more, a hidden service provider that is setting up the hidden service at
- introduction points does not pass its own public key, but the public key
- of a freshly generated key pair. It also includes these fresh public keys
- in the hidden service descriptor together with the other introduction point
- information. The reason is that the introduction point does not need to and
- therefore should not know for which hidden service it works, so as to
- prevent it from tracking the hidden service's activity. (If a hidden
- service provider supports both v0/v1 and v2 descriptors, v0/v1 clients
- rely on the fact that all introduction points accept the same public key,
- so that this new feature cannot be used.)
- /12/ Encode v2 descriptors and send v2 publish requests
- If configured to publish v2 descriptors, a hidden service provider
- publishes a new descriptor whenever its content changes or a new
- publication period starts for this descriptor. If the current publication
- period would only last for less than 60 minutes (= 2 x 30 minutes to allow
- the server to be 30 minutes behind and the client 30 minutes ahead), the
- hidden service provider publishes both a current descriptor and one for
- the next period. Publication is performed by sending the descriptor to all
- hidden service directories that are responsible for keeping replicas for
- the descriptor ID. This includes two non-consecutive replicas that are
- stored at 3 consecutive nodes each. (requires /1/ and /2/)
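- The rule for publishing a second descriptor near the end of a period can be
- sketched as below. This is a simplified illustration: the per-service
- "offset" parameter (hypothetical name) stands for the shift that makes
- periods of different services change at different times of day, and how it is
- derived is left open here.

```python
PERIOD = 24 * 60 * 60   # descriptor IDs change every 24 hours
OVERLAP = 60 * 60       # 2 x 30 minutes of tolerated clock skew


def periods_to_publish(now, offset=0):
    """Return the time-period numbers a service should publish for.

    now is a Unix timestamp in seconds; offset is the per-service shift
    in [0, PERIOD).
    """
    shifted = now + offset
    current = shifted // PERIOD
    remaining = PERIOD - shifted % PERIOD
    if remaining < OVERLAP:
        # Less than 60 minutes left: also publish for the next period.
        return [current, current + 1]
    return [current]
```

- For each returned period, the service would then compute the descriptor IDs
- of both non-consecutive replicas and upload to the directories responsible
- for them.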
- Hidden service client:
- /13/ Send v2 fetch requests
- A hidden service client that has set the config option
- "FetchV2HidServDescriptors" 0|1 to 1 handles SOCKS requests for v2 onion
- addresses by requesting a v2 descriptor from a randomly chosen hidden
- service directory that is responsible for keeping a replica for the
- descriptor ID. In total there are six replicas of which the first and the
- last three are stored on consecutive nodes. The probability of picking one
- of the three consecutive replicas is 1/6, 2/6, and 3/6 to incorporate the
- fact that the availability will be the highest on the node with next higher
- ID. A hidden service client relies on the hidden service provider to store
- two sets of descriptors to compensate clock skew between service and
- client. (requires /1/ and /2/)
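- The weighted choice in /13/ can be sketched as follows. Two points are
- assumptions of this sketch rather than statements of the proposal: the client
- first picks one of the non-consecutive replica sets uniformly at random, and
- the weights 1/6, 2/6, and 3/6 are applied to the three consecutive nodes in
- the order in which they are passed in.

```python
import random

WEIGHTS = (1, 2, 3)  # relative weights for 1/6, 2/6, and 3/6


def pick_directory(replica_sets, rng=random):
    """Pick one hidden service directory to fetch a descriptor from.

    replica_sets is a list of non-consecutive replica sets; each set is
    a list of three directory IDs stored on consecutive nodes.
    """
    nodes = rng.choice(replica_sets)              # uniform over the sets
    return rng.choices(nodes, weights=WEIGHTS, k=1)[0]
```

- Passing a seeded random.Random instance as rng makes the choice
- reproducible, which is convenient for testing.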
- /14/ Process v2 fetch reply and parse v2 descriptors
- A hidden service client that has sent a request for a v2 descriptor can
- parse it and store it to the local cache of rendezvous service descriptors.
- /15/ Establish connection to v2 hidden service
- A hidden service client can establish a connection to a hidden service
- using a v2 descriptor. This includes using the secret cookie for decrypting
- the introduction points contained in the descriptor. When contacting an
- introduction point, the client does not use the public key of the hidden
- service provider, but the freshly-generated public key that is included in
- the hidden service descriptor. Whether or not a fresh key is used instead
- of the key of the hidden service depends on the available protocol versions
- that are included in the descriptor; by this, connection establishment is
- to a certain extent decoupled from fetching the descriptor.
- Hidden service descriptor:
- (Requirements concerning the descriptor format are contained in /6/ and /7/.)
-
- The new v2 hidden service descriptor format looks like this:
- onion-address = h(public-key) + cookie
- descriptor-id = h(h(public-key) + h(time-period + cookie + replica))
- descriptor-content = {
- descriptor-id,
- version,
- public-key,
- h(time-period + cookie + replica),
- timestamp,
- protocol-versions,
- { introduction-points } encrypted with cookie
- } signed with private-key
- The "descriptor-id" needs to change periodically in order for the
- descriptor to be stored on changing nodes over time. It may only be
- computable by a hidden service provider and all of his clients to prevent
- unauthorized nodes from tracking the service activity by periodically
- checking whether there is a descriptor for this service. Finally, the
- hidden service directory needs to be able to verify that the hidden service
- provider is the true originator of the descriptor with the given ID.
-
- Therefore, "descriptor-id" is derived from the "public-key" of the hidden
- service provider, the current "time-period" which changes every 24 hours,
- a secret "cookie" shared between hidden service provider and clients, and
- a "replica" denoting the number of this non-consecutive replica. (The
- "time-period" is constructed in a way that time periods do not change at
- the same moment for all descriptors by deriving a value between 0:00 and
- 23:59 hours from h(public-key) and making the descriptors of this hidden
- service provider expire at that time of the day.) The "descriptor-id" is
- defined to be 160 bits long. [extending the "descriptor-id" length
- suggested by LØ]
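- The derivation of "descriptor-id" can be sketched as below. Several details
- are assumptions of this sketch: SHA-1 is used for h because it yields the
- required 160 bits, "+" is read as byte concatenation, the time period is
- encoded as a 4-byte big-endian integer, the replica number as a single byte,
- and the per-service offset is taken from the first four bytes of
- h(public-key). The proposal itself does not fix these encodings.

```python
import hashlib
import struct

PERIOD = 24 * 60 * 60  # the time period changes every 24 hours


def h(data):
    """Hash function h; SHA-1 is an assumption (160-bit output)."""
    return hashlib.sha1(data).digest()


def time_period(public_key, now):
    """Period number, shifted per service so periods do not all roll over at once."""
    offset = int.from_bytes(h(public_key)[:4], "big") % PERIOD
    return (now + offset) // PERIOD


def descriptor_id(public_key, cookie, replica, now):
    """descriptor-id = h(h(public-key) + h(time-period + cookie + replica))."""
    period = struct.pack(">I", time_period(public_key, now))
    secret_part = h(period + cookie + bytes([replica]))
    return h(h(public_key) + secret_part)
```

- A client that knows the public key and the cookie can compute the same IDs as
- the service, while a directory that sees only one descriptor cannot predict
- the ID for the next period.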
-
- Only the hidden service provider and the clients are able to generate
- future "descriptor-id"s. To make this possible, the "onion-address" is
- extended from the hash value of "public-key" alone to that hash value
- followed by the secret "cookie". The hash of "public-key" is defined to be
- 80 bits long, whereas the "cookie" is dimensioned to be 120 bits long. This
- makes a total of 200 bits or 40 base32 chars, which is quite a lot for a
- human to handle, but necessary to provide sufficient protection against an
- adversary generating a key pair with the same "public-key" hash or guessing
- the "cookie".
-
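- The length arithmetic for the extended onion address can be checked with a
- short sketch. It assumes that the 80-bit key part is the first 10 bytes of a
- SHA-1 hash of the public key and that the address is the lowercase base32
- encoding of key part and cookie concatenated; both are assumptions made for
- illustration.

```python
import base64
import hashlib

KEY_PART_BYTES = 10   # 80 bits of h(public-key)
COOKIE_BYTES = 15     # 120 bits of secret cookie


def onion_address(public_key, cookie):
    """onion-address = h(public-key) + cookie, encoded in base32."""
    if len(cookie) != COOKIE_BYTES:
        raise ValueError("cookie must be 120 bits (15 bytes)")
    key_part = hashlib.sha1(public_key).digest()[:KEY_PART_BYTES]
    # 25 bytes = 200 bits = 40 base32 characters, with no padding needed.
    return base64.b32encode(key_part + cookie).decode("ascii").lower()
```

- Since base32 encodes 5 bits per character, 200 bits map to exactly 40
- characters.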
- A hidden service directory can verify that a descriptor was created by the
- hidden service provider by checking if the "descriptor-id" corresponds to
- the "public-key" and if the signature can be verified with the
- "public-key".
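- The first of these two checks can be sketched as follows. The directory does
- not know the cookie, but the descriptor carries h(time-period + cookie +
- replica), so the directory can recompute the ID from published fields. SHA-1
- as h and byte concatenation for "+" are assumptions of this sketch, and the
- signature check is omitted because it depends on the key format.

```python
import hashlib
import hmac


def descriptor_id_matches(claimed_id, public_key, secret_id_part):
    """Check that descriptor-id == h(h(public-key) + secret_id_part).

    secret_id_part is the value h(time-period + cookie + replica) taken
    from the descriptor itself.
    """
    key_hash = hashlib.sha1(public_key).digest()
    expected = hashlib.sha1(key_hash + secret_id_part).digest()
    return hmac.compare_digest(claimed_id, expected)
```

- A descriptor that fails this check, or whose signature does not verify
- against the included public key, would be rejected by the directory.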
- The "introduction-points" that are included in the descriptor are encrypted
- using the same "cookie" that is shared between hidden service provider and
- clients. [correction to use another key than h(time-period + cookie) as
- encryption key for introduction points made by LØ]
- A new text-based format is proposed for descriptors instead of an extension
- of the existing binary format for reasons of future extensibility.
- Security implications:
- The security implications of the proposed changes are grouped by the roles of
- nodes that could perform attacks or on which attacks could be performed.
- Attacks by authoritative directory nodes
- Authoritative directory nodes are no longer the single places in the
- network that know about a hidden service's activity and introduction
- points. Thus, they cannot perform attacks using this information, e.g.
- track a hidden service's activity or usage pattern or attack its
- introduction points. Formerly, it would only require a single corrupted
- authoritative directory operator to perform such an attack.
- Attacks by hidden service directory nodes
- A hidden service directory node could misuse a stored descriptor to track a
- hidden service's activity and usage pattern by clients. Though there is no
- countermeasure against this kind of attack, it is very expensive to track a
- certain hidden service over time. An attacker would need to run a large
- number of stable onion routers that work as hidden service directory nodes
- to have a good probability to become responsible for its changing
- descriptor IDs. For each period, the probability is:
- 1-(N-c choose r)/(N choose r) for N-c>=r and 1 otherwise, with N as the
- total number of hidden service directories, c as the number of compromised
- nodes, and r as the number of replicas.
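- The probability above can be evaluated directly. The following sketch
- implements the stated formula with exact fractions; the function name is
- chosen for illustration only.

```python
from fractions import Fraction
from math import comb


def p_responsible(N, c, r):
    """Probability that at least one of r replicas lands on a compromised node.

    N: total hidden service directories, c: compromised nodes,
    r: number of replicas. Implements 1 - C(N-c, r) / C(N, r).
    """
    if N - c < r:
        return Fraction(1)
    return 1 - Fraction(comb(N - c, r), comb(N, r))
```

- With N = 10, c = 1, and r = 1, the result is 1/10, as expected for a single
- replica placed on one of ten nodes.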
- The hidden service directory nodes could try to make a certain hidden
- service unavailable to its clients. To do so, they could discard all
- stored descriptors for that hidden service and reply to clients that there
- is no descriptor for the given ID or return an old or false descriptor
- content. The client would detect a false descriptor, because it could not
- contain a correct signature. But an old content or an empty reply could
- confuse the client. Therefore, the countermeasure is to replicate
- descriptors among a small number of hidden service directories, e.g. 5.
- The probability of a group of collaborating nodes to make a hidden service
- completely unavailable is in each period:
- (c choose r)/(N choose r) for c>=r and N>=r, and 0 otherwise, with N as the
- total number of hidden service directories, c as the number of compromised
- nodes, and r as the number of replicas.
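- This second probability can be evaluated in the same way. The sketch below
- implements the stated formula with exact fractions; the function name is
- chosen for illustration only.

```python
from fractions import Fraction
from math import comb


def p_unavailable(N, c, r):
    """Probability that all r replicas land on compromised nodes.

    N: total hidden service directories, c: compromised nodes,
    r: number of replicas. Implements C(c, r) / C(N, r).
    """
    if c < r or N < r:
        return Fraction(0)
    return Fraction(comb(c, r), comb(N, r))
```

- With N = 10, c = 3, and r = 3, the result is 1/120: only one of the 120
- possible sets of three nodes consists entirely of compromised nodes.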
- A hidden service directory could try to find out which introduction points
- are working on behalf of a hidden service. In contrast to the previous
- design, this is not possible anymore, because this information is encrypted
- to the clients of a hidden service.
- Attacks on hidden service directory nodes
- An anonymous attacker could try to swamp a hidden service directory with
- false descriptors for a given descriptor ID. This is prevented by requiring
- that descriptors are signed.
- Anonymous attackers could swamp a hidden service directory with correct
- descriptors for non-existing hidden services. There is no countermeasure
- against this attack. However, the creation of valid descriptors is more
- expensive than verification and storage in local memory. This should make
- this kind of attack unattractive.
- Attacks by introduction points
- Current or former introduction points could try to gain information on the
- hidden service they serve. But due to the fresh key pair that is used by
- the hidden service, this attack is not possible anymore.
- Attacks by clients
- Current or former clients could track a hidden service's activity, attack
- its introduction points, or determine the responsible hidden service
- directory nodes and attack them. There is nothing that could prevent them
- from doing so, because honest clients need the full descriptor content to
- establish a connection to the hidden service. At the moment, the only
- countermeasure against dishonest clients is to change the secret cookie and
- pass it only to the honest clients.
- Compatibility:
- The proposed design is meant to replace the current design for hidden service
- descriptors and their storage in the long run.
- There should be a first transition phase in which both the current design
- and the proposed design are served in parallel. Onion routers should start
- serving as hidden service directories, and hidden service providers and
- clients should make use of the new design if both sides support it. Hidden
- service providers should be allowed to publish descriptors of the current
- format in parallel, and authoritative directories should continue storing and
- serving these descriptors.
- After the first transition phase, hidden service providers should stop
- publishing descriptors on authoritative directories, and hidden service
- clients should not try to fetch descriptors from the authoritative
- directories. However, the authoritative directories should continue serving
- hidden service descriptors for a second transition phase. As of this point,
- all v2 config options should be set to a default value of 1.
- After the second transition phase, the authoritative directories should stop
- serving hidden service descriptors.