| 123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366367368369370371372373374375376377378379380381382383384385386387388389390391392393394395396397398399400401402403404405406407408409410411412413414415416417418419420421422423424425426427428429430431432433434435436437438439440441 | 
							- Filename: 114-distributed-storage.txt
 
- Title: Distributed Storage for Tor Hidden Service Descriptors
 
- Version: $Revision$
 
- Last-Modified: $Date$
 
- Author: Karsten Loesing
 
- Created: 13-May-2007
 
- Status: Closed
 
- Implemented-In: 0.2.0.x
 
- Change history:
 
-   13-May-2007  Initial proposal
 
-   14-May-2007  Added changes suggested by Lasse Øverlier
 
-   30-May-2007  Changed descriptor format, key length discussion, typos
 
-   09-Jul-2007  Incorporated suggestions by Roger, added status of specification
 
-                and implementation for upcoming GSoC mid-term evaluation
 
-   11-Aug-2007  Updated implementation statuses, included non-consecutive
 
-                replication to descriptor format
 
-   20-Aug-2007  Renamed config option HSDir as HidServDirectoryV2
 
-   02-Dec-2007  Closed proposal
 
- Overview:
 
-   The basic idea of this proposal is to distribute the tasks of storing and
 
-   serving hidden service descriptors from currently three authoritative
 
-   directory nodes among a large subset of all onion routers. The three
 
-   reasons to do this are better robustness (availability), better
 
-   scalability, and improved security properties. Further,
 
-   this proposal suggests changes to the hidden service descriptor format to
 
-   prevent new security threats coming from decentralization and to gain even
 
-   better security properties.
 
- Status:
 
-   As of December 2007, the new hidden service descriptor format is implemented
 
-   and usable. However, servers and clients do not yet make use of descriptor
 
-   cookies, because there are open usability issues of this feature that might
 
-   be resolved in proposal 121. Further, hidden service directories do not
 
-   perform replication by themselves, because (unauthorized) replica fetch
 
-   requests would allow any attacker to fetch all hidden service descriptors in
 
-   the system. As neither issue is critical to the functioning of v2
 
-   descriptors and their distribution, this proposal is considered as Closed.
 
-   
 
- Motivation:
 
-   The current design of hidden services exhibits the following performance and
 
-   security problems:
 
-   First, the three hidden service authoritative directories constitute a
 
-   performance bottleneck in the system. The directory nodes are responsible for
 
-   storing and serving all hidden service descriptors. As of May 2007 there are
 
-   about 1000 descriptors at a time, but this number is assumed to increase in
 
-   the future. Further, there is no replication protocol for descriptors between
 
-   the three directory nodes, so that hidden services must ensure the
 
-   availability of their descriptors by manually publishing them on all
 
-   directory nodes. Whenever a fourth or fifth hidden service authoritative
 
-   directory is added, hidden services will need to maintain an equally
 
-   increasing number of replicas. These scalability issues have an impact on the
 
-   current usage of hidden services and put an even higher burden on the
 
-   development of new kinds of applications for hidden services that might
 
-   require storing even more descriptors.
 
-   Second, besides posing a limitation to scalability, storing all hidden
 
-   service descriptors on three directory nodes also constitutes a security
 
-   risk. The directory node operators could easily analyze the publish and fetch
 
-   requests to derive information on service activity and usage and read the
 
-   descriptor contents to determine which onion routers work as introduction
 
-   points for a given hidden service and need to be attacked or threatened to
 
-   shut it down. Furthermore, the contents of a hidden service descriptor offer
 
-   only minimal security properties to the hidden service. Whoever gets aware of
 
-   the service ID can easily find out whether the service is active at the
 
-   moment and which introduction points it has. This applies to (former)
 
-   clients, (former) introduction points, and of course to the directory nodes.
 
-   It requires only to request the descriptor for the given service ID, which
 
-   can be performed by anyone anonymously.
 
-   This proposal suggests two major changes to approach the described
 
-   performance and security problems:
 
-   The first change affects the storage location for hidden service descriptors.
 
-   Descriptors are distributed among a large subset of all onion routers instead
 
-   of three fixed directory nodes. Each storing node is responsible for a subset
 
-   of descriptors for a limited time only. It is not able to choose which
 
-   descriptors it stores at a certain time, because this is determined by its
 
-   onion ID which is hard to change frequently and in time (only routers which
 
-   are stable for a given time are accepted as storing nodes). In order to
 
-   resist single node failures and untrustworthy nodes, descriptors are
 
-   replicated among a certain number of storing nodes. A first replication
 
-   protocol makes sure that descriptors don't get lost when the node population
 
-   changes; therefore, a storing node periodically requests the descriptors from
 
-   its siblings. A second replication protocol distributes descriptors among
 
-   non-consecutive nodes of the ID ring to prevent a group of adversaries from
 
-   generating new onion keys until they have consecutive IDs to create a 'black
 
-   hole' in the ring and make random services unavailable. Connections to
 
-   storing nodes are established by extending existing circuits by one hop to
 
-   the storing node. This also ensures that contents are encrypted. The effect
 
-   of this first change is that the probability that a single node operator
 
-   learns about a certain hidden service is very small and that it is very hard
 
-   to track a service over time, even when it collaborates with other node
 
-   operators.
 
-   
 
-   The second change concerns the content of hidden service descriptors.
 
-   Obviously, security problems cannot be solved only by decentralizing storage;
 
-   in fact, they could also get worse if done without caution. At first, a
 
-   descriptor ID needs to change periodically in order to be stored on changing
 
-   nodes over time. Next, the descriptor ID needs to be computable only for the
 
-   service's clients, but should be unpredictable for all other nodes. Further,
 
-   the storing node needs to be able to verify that the hidden service is the
 
-   true originator of the descriptor with the given ID even though it is not a
 
-   client. Finally, a storing node should learn as little information as
 
-   necessary by storing a descriptor, because it might not be as trustworthy as
 
-   a directory node; for example it does not need to know the list of
 
-   introduction points. Therefore, a second key is applied that is only known to
 
-   the hidden service provider and its clients and that is not included in the
 
-   descriptor. It is used to calculate descriptor IDs and to encrypt the
 
-   introduction points. This second key can either be given to all clients
 
-   together with the hidden service ID, or to a group or a single client as
 
-   an authentication token. In the future this second key could be the result of
 
-   some key agreement protocol between the hidden service and one or more
 
-   clients. A new text-based format is proposed for descriptors instead of an
 
-   extension of the existing binary format for reasons of future extensibility.
 
- Design:
 
-   The proposed design is described by the required changes to the current
 
-   design. These requirements are grouped by content, rather than by affected
 
-   specification documents or code files, and numbered for reference below.
 
-   Hidden service clients, servers, and directories:
 
-   /1/ Create routing list
 
-     All participants can filter the consensus status document received from the
 
-     directory authorities to one routing list containing only those servers
 
-     that store and serve hidden service descriptors and which are running for
 
-     at least 24 hours. A participant only trusts its own routing list and never
 
-     learns about routing information from other parties.
 
-   /2/ Determine responsible hidden service directory
 
-     All participants can determine the hidden service directory that is
 
-     responsible for storing and serving a given ID, as well as the hidden
 
-     service directories that replicate its content. Every hidden service
 
-     directory is responsible for the descriptor IDs in the interval from
 
-     its predecessor, exclusive, to its own ID, inclusive. Further, a hidden
 
-     service directory holds replicas for its n predecessors, where n denotes
 
-     the number of consecutive replicas. (requires /1/)
 
-   [/3/ and /4/ were requirements to use BEGIN_DIR cells for directory
 
-    requests which have not been fulfilled in the course of the implementation
 
-    of this proposal, but elsewhere.]
 
-   Hidden service directory nodes:
 
-     
 
-   /5/ Advertise hidden service directory functionality
 
-     Every onion router that has its directory port open can decide whether it
 
-     wants to store and serve hidden service descriptors by setting a new config
 
-     option "HidServDirectoryV2" 0|1 to 1. An onion router with this config
 
-     option being set includes the flag "hidden-service-dir" in its router
 
-     descriptors that it sends to directory authorities.
 
-   /6/ Accept v2 publish requests, parse and store v2 descriptors
 
-     Hidden service directory nodes accept publish requests for hidden service
 
-     descriptors and store them to their local memory. (It is not necessary to
 
-     make descriptors persistent, because after disconnecting, the onion router
 
-     would not be accepted as storing node anyway, because it has not been
 
-     running for at least 24 hours.) All requests and replies are formatted as
 
-     HTTP messages. Requests are directed to the router's directory port and are
 
-     contained within BEGIN_DIR cells. A hidden service directory node stores a
 
-     descriptor only when it thinks that it is responsible for storing that
 
-     descriptor based on its own routing table. Every hidden service directory
 
-     node is responsible for the descriptor IDs in the interval of its n-th
 
-     predecessor in the ID circle up to its own ID (n denotes the number of
 
-     consecutive replicas). (requires /1/)
 
-   /7/ Accept v2 fetch requests
 
-     Same as /6/, but with fetch requests for hidden service descriptors.
 
-     (requires /2/)
 
-   /8/ Replicate descriptors with neighbors
 
-     A hidden service directory node replicates descriptors from its two
 
-     predecessors by downloading them once an hour. Further, it checks its
 
-     routing table periodically for changes. Whenever it realizes that a
 
-     predecessor has left the network, it establishes a connection to the new
 
-     n-th predecessor and requests its stored descriptors in the interval of its
 
-     (n+1)-th predecessor and the requested n-th predecessor. Whenever it
 
-     realizes that a new onion router has joined with an ID higher than its
 
-     former n-th predecessor, it adds it to its predecessors and discards all
 
-     descriptors in the interval of its (n+1)-th and its n-th predecessor.
 
-     (requires /1/)
 
-     [Dec 02: This function has not been implemented, because arbitrary nodes
 
-      what have been able to download the entire set of v2 descriptors. An
 
-      authorized replication request would be necessary. For the moment, the
 
-      system runs without any directory-side replication. -KL]
 
-   Authoritative directory nodes:
 
-   /9/ Confirm a router's hidden service directory functionality
 
-     Directory nodes include a new flag "HSDir" for routers that decided to
 
-     provide storage for hidden service descriptors and that are running for at
 
-     least 24 hours. The last requirement prevents a node from frequently
 
-     changing its onion key to become responsible for an identifier it wants to
 
-     target.
 
-   Hidden service provider:
 
-   /10/ Configure v2 hidden service
 
-     Each hidden service provider that has set the config option
 
-     "PublishV2HidServDescriptors" 0|1 to 1 is configured to publish v2
 
-     descriptors and conform to the v2 connection establishment protocol. When
 
-     configuring a hidden service, a hidden service provider checks if it has
 
-     already created a random secret_cookie and a hostname2 file; if not, it
 
-     creates both of them. (requires /2/)
 
-   /11/ Establish introduction points with fresh key
 
-     If configured to publish only v2 descriptors and no v0/v1 descriptors any
 
-     more, a hidden service provider that is setting up the hidden service at
 
-     introduction points does not pass its own public key, but the public key
 
-     of a freshly generated key pair. It also includes these fresh public keys
 
-     in the hidden service descriptor together with the other introduction point
 
-     information. The reason is that the introduction point does not need to and
 
-     therefore should not know for which hidden service it works, so as to
 
-     prevent it from tracking the hidden service's activity. (If a hidden
 
-     service provider supports both, v0/v1 and v2 descriptors, v0/v1 clients
 
-     rely on the fact that all introduction points accept the same public key,
 
-     so that this new feature cannot be used.)
 
-   /12/ Encode v2 descriptors and send v2 publish requests
 
-     If configured to publish v2 descriptors, a hidden service provider
 
-     publishes a new descriptor whenever its content changes or a new
 
-     publication period starts for this descriptor. If the current publication
 
-     period would only last for less than 60 minutes (= 2 x 30 minutes to allow
 
-     the server to be 30 minutes behind and the client 30 minutes ahead), the
 
-     hidden service provider publishes both a current descriptor and one for
 
-     the next period. Publication is performed by sending the descriptor to all
 
-     hidden service directories that are responsible for keeping replicas for
 
-     the descriptor ID. This includes two non-consecutive replicas that are
 
-     stored at 3 consecutive nodes each. (requires /1/ and /2/)
 
-   Hidden service client:
 
-   /13/ Send v2 fetch requests
 
-     A hidden service client that has set the config option
 
-     "FetchV2HidServDescriptors" 0|1 to 1 handles SOCKS requests for v2 onion
 
-     addresses by requesting a v2 descriptor from a randomly chosen hidden
 
-     service directory that is responsible for keeping replica for the
 
-     descriptor ID. In total there are six replicas of which the first and the
 
-     last three are stored on consecutive nodes. The probability of picking one
 
-     of the three consecutive replicas is 1/6, 2/6, and 3/6 to incorporate the
 
-     fact that the availability will be the highest on the node with next higher
 
-     ID. A hidden service client relies on the hidden service provider to store
 
-     two sets of descriptors to compensate clock skew between service and
 
-     client. (requires /1/ and /2/)
 
-   /14/ Process v2 fetch reply and parse v2 descriptors
 
-     A hidden service client that has sent a request for a v2 descriptor can
 
-     parse it and store it to the local cache of rendezvous service descriptors.
 
-   /15/ Establish connection to v2 hidden service
 
-     A hidden service client can establish a connection to a hidden service
 
-     using a v2 descriptor. This includes using the secret cookie for decrypting
 
-     the introduction points contained in the descriptor. When contacting an
 
-     introduction point, the client does not use the public key of the hidden
 
-     service provider, but the freshly-generated public key that is included in
 
-     the hidden service descriptor. Whether or not a fresh key is used instead
 
-     of the key of the hidden service depends on the available protocol versions
 
-     that are included in the descriptor; by this, connection establishment is
 
-     to a certain extend decoupled from fetching the descriptor.
 
-   Hidden service descriptor:
 
-   (Requirements concerning the descriptor format are contained in /6/ and /7/.)
 
-   
 
-     The new v2 hidden service descriptor format looks like this:
 
-       onion-address = h(public-key) + cookie
 
-       descriptor-id = h(h(public-key) + h(time-period + cookie + relica))
 
-       descriptor-content = {
 
-         descriptor-id,
 
-         version,
 
-         public-key,
 
-         h(time-period + cookie + replica),
 
-         timestamp,
 
-         protocol-versions,
 
-         { introduction-points } encrypted with cookie
 
-       } signed with private-key
 
-     The "descriptor-id" needs to change periodically in order for the
 
-     descriptor to be stored on changing nodes over time. It may only be
 
-     computable by a hidden service provider and all of his clients to prevent
 
-     unauthorized nodes from tracking the service activity by periodically
 
-     checking whether there is a descriptor for this service. Finally, the
 
-     hidden service directory needs to be able to verify that the hidden service
 
-     provider is the true originator of the descriptor with the given ID.
 
-     
 
-     Therefore, "descriptor-id" is derived from the "public-key" of the hidden
 
-     service provider, the current "time-period" which changes every 24 hours,
 
-     a secret "cookie" shared between hidden service provider and clients, and
 
-     a "replica" denoting the number of this non-consecutive replica. (The
 
-     "time-period" is constructed in a way that time periods do not change at
 
-     the same moment for all descriptors by deriving a value between 0:00 and
 
-     23:59 hours from h(public-key) and making the descriptors of this hidden
 
-     service provider expire at that time of the day.) The "descriptor-id" is
 
-     defined to be 160 bits long. [extending the "descriptor-id" length
 
-     suggested by LØ]
 
-     
 
-     Only the hidden service provider and the clients are able to generate
 
-     future "descriptor-ID"s. Hence, the "onion-address" is extended from now 
 
-     the hash value of "public-key" by the secret "cookie". The "public-key" is
 
-     determined to be 80 bits long, whereas the "cookie" is dimensioned to be
 
-     120 bits long. This makes a total of 200 bits or 40 base32 chars, which is
 
-     quite a lot to handle for a human, but necessary to provide sufficient
 
-     protection against an adversary from generating a key pair with same
 
-     "public-key" hash or guessing the "cookie".
 
-     
 
-     A hidden service directory can verify that a descriptor was created by the
 
-     hidden service provider by checking if the "descriptor-id" corresponds to
 
-     the "public-key" and if the signature can be verified with the
 
-     "public-key".
 
-     The "introduction-points" that are included in the descriptor are encrypted
 
-     using the same "cookie" that is shared between hidden service provider and
 
-     clients. [correction to use another key than h(time-period + cookie) as
 
-     encryption key for introduction points made by LØ]
 
-     A new text-based format is proposed for descriptors instead of an extension
 
-     of the existing binary format for reasons of future extensibility.
 
- Security implications:
 
-   The security implications of the proposed changes are grouped by the roles of
 
-   nodes that could perform attacks or on which attacks could be performed.
 
-   Attacks by authoritative directory nodes
 
-     Authoritative directory nodes are no longer the single places in the
 
-     network that know about a hidden service's activity and introduction
 
-     points. Thus, they cannot perform attacks using this information, e.g.
 
-     track a hidden service's activity or usage pattern or attack its
 
-     introduction points. Formerly, it would only require a single corrupted
 
-     authoritative directory operator to perform such an attack.
 
-   Attacks by hidden service directory nodes
 
-     A hidden service directory node could misuse a stored descriptor to track a
 
-     hidden service's activity and usage pattern by clients. Though there is no
 
-     countermeasure against this kind of attack, it is very expensive to track a
 
-     certain hidden service over time. An attacker would need to run a large
 
-     number of stable onion routers that work as hidden service directory nodes
 
-     to have a good probability to become responsible for its changing
 
-     descriptor IDs. For each period, the probability is:
 
-       1-(N-c choose r)/(N choose r) for N-c>=r and 1 otherwise, with N
 
-       as total
 
-       number of hidden service directories, c as compromised nodes, and r as
 
-       number of replicas
 
-     The hidden service directory nodes could try to make a certain hidden
 
-     service unavailable to its clients. Therefore, they could discard all
 
-     stored descriptors for that hidden service and reply to clients that there
 
-     is no descriptor for the given ID or return an old or false descriptor
 
-     content. The client would detect a false descriptor, because it could not
 
-     contain a correct signature. But an old content or an empty reply could
 
-     confuse the client. Therefore, the countermeasure is to replicate
 
-     descriptors among a small number of hidden service directories, e.g. 5.
 
-     The probability of a group of collaborating nodes to make a hidden service
 
-     completely unavailable is in each period:
 
-       (c choose r)/(N choose r) for c>=r and N>=r, and 0 otherwise,
 
-       with N as total
 
-       number of hidden service directories, c as compromised nodes, and r as
 
-       number of replicas
 
-     A hidden service directory could try to find out which introduction points
 
-     are working on behalf of a hidden service. In contrast to the previous
 
-     design, this is not possible anymore, because this information is encrypted
 
-     to the clients of a hidden service.
 
-   Attacks on hidden service directory nodes
 
-     An anonymous attacker could try to swamp a hidden service directory with
 
-     false descriptors for a given descriptor ID. This is prevented by requiring
 
-     that descriptors are signed.
 
-     Anonymous attackers could swamp a hidden service directory with correct
 
-     descriptors for non-existing hidden services. There is no countermeasure
 
-     against this attack. However, the creation of valid descriptors is more
 
-     expensive than verification and storage in local memory. This should make
 
-     this kind of attack unattractive.
 
-   Attacks by introduction points
 
-     Current or former introduction points could try to gain information on the
 
-     hidden service they serve. But due to the fresh key pair that is used by
 
-     the hidden service, this attack is not possible anymore.
 
-   Attacks by clients
 
-     Current or former clients could track a hidden service's activity, attack
 
-     its introduction points, or determine the responsible hidden service
 
-     directory nodes and attack them. There is nothing that could prevent them
 
-     from doing so, because honest clients need the full descriptor content to
 
-     establish a connection to the hidden service. At the moment, the only
 
-     countermeasure against dishonest clients is to change the secret cookie and
 
-     pass it only to the honest clients.
 
- Compatibility:
 
-   The proposed design is meant to replace the current design for hidden service
 
-   descriptors and their storage in the long run.
 
-   There should be a first transition phase in which both, the current design
 
-   and the proposed design are served in parallel. Onion routers should start
 
-   serving as hidden service directories, and hidden service providers and
 
-   clients should make use of the new design if both sides support it. Hidden
 
-   service providers should be allowed to publish descriptors of the current
 
-   format in parallel, and authoritative directories should continue storing and
 
-   serving these descriptors.
 
-   After the first transition phase, hidden service providers should stop
 
-   publishing descriptors on authoritative directories, and hidden service
 
-   clients should not try to fetch descriptors from the authoritative
 
-   directories. However, the authoritative directories should continue serving
 
-   hidden service descriptors for a second transition phase. As of this point,
 
-   all v2 config options should be set to a default value of 1.
 
-   After the second transition phase, the authoritative directories should stop
 
-   serving hidden service descriptors.
 
 
  |