| 123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174 | 
							
- How to hand out bridges.
 
- Divide bridges into 'strategies' as they come in.  Do this uniformly
 
- at random for now.
 
- For each strategy, we'll hand out bridges in a different way to
 
- clients.  This document describes two strategies: email-based and
 
- IP-based.
 
- 0. Notation:
 
-    HMAC(k,v) : an HMAC of v using the key k.
 
-    A|B: The string A concatenated with the string B.
 
- 1. Email-based.
 
-   Goal: bootstrap based on one or more popular email service's sybil
 
-      prevention algorithms.
 
-   Parameters:
 
-      HMAC -- an HMAC function
 
-      P -- a time period
 
-      K -- the number of bridges to send in a period.
 
-   Setup: Generate two nonces, N and M.
 
-   As bridges arrive, put them into a ring according to HMAC(N,ID)
 
-   where ID is the bridges's identity digest.
 
-   Divide time into divisions of length P.
 
-   When we get an email:
 
-      If it's not from a supported email service, reject it.
 
-      If we already sent a response to that email address (normalized)
 
-      in this period, send _exactly_ the same response.
 
-      If it is from a supported service, generate X = HMAC(M,PS|E) where E
 
-      is the lowercased normalized email address for the user, and
 
-      where PS is the start of the currrent period.  Send
 
-      the first K bridges in the ring after point X.
 
-      [If we want to make sure that repeat queries are given exactly the
 
-       same results, then we can't let the ring change during the
 
-       time period. For a long time period like a month, that's quite a
 
-       hassle. How about instead just keeping a replay cache of addresses
 
-       that have been answered, and sending them a "sorry, you already got
 
-       your addresses for the time period; perhaps you should try these
 
-       other fine distribution strategies while you wait?" response? This
 
-       approach would also resolve the "Make sure you can't construct a
 
-       distinct address to match an existing one" note below. -RD]
 
-         [I think, if we get a replay, we need to sen back the same
 
-         answer as we did the first time, not say "try again."
 
-         Otherwise we need to worry that an attacker can keep people
 
-         from getting bridges by preemtively asking for them,
 
-         or that an attacker may force them to prove they haven't
 
-         gotten any bridges by asking. -NM]
 
-      [While we're at it, if we do the replay cache thing and don't need
 
-       repeatable answers, we could just pick K random answers from the
 
-       pool. Is it beneficial that a bridge user who knows about a clump of
 
-       nodes will be sharing them with other users who know about a similar
 
-       (overlapping) clump? One good aspect is against an adversary who
 
-       learns about a clump this way and watches those bridges to learn
 
-       other users and discover *their* bridges: he doesn't learn about
 
-       as many new bridges as he might if they were randomly distributed.
 
-       A drawback is against an adversary who happens to pick two email
 
-       addresses in P that include overlapping answers: he can measure
 
-       the difference in clumps and estimate how quickly the bridge pool
 
-       is growing. -RD]
 
-         [Random is one more darn thing to implement; rings are already
 
-          there. -NM]
 
-      [If we make the period P be mailbox-specific, and make it a random
 
-       value around some mean, then we make it harder for an attacker to
 
-       know when to try using his small army of gmail addresses to gather
 
-       another harvest. But we also make it harder for users to know when
 
-       they can try again. -RD]
 
-         [Letting the users know about when they can try again seems
 
-          worthwhile.  Otherwise users and attackers will all probe and
 
-          probe and probe until they get an answer.  No additional
 
-          security will be achieved, but bandwidth will be lost. -NM]
 
-   To normalize an email address:
 
-      Start with the RFC822 address.  Consider only the mailbox {???}
 
-      portion of the address (username@domain).  Put this into lowercase
 
-      ascii.
 
-   Questions:
 
-      What to do with weird character encodings?  Look up the RFC.
 
-   Notes:
 
-      Make sure that you can't force a single email address to appear
 
-      in lots of different ways.  IOW, if nickm@freehaven.net and
 
-      NICKM@freehaven.net aren't treated the same, then I can get lots
 
-      more bridges than I should.
 
-      Make sure you can't construct a distinct address to match an
 
-      existing one.  IOW, if we treat nickm@X and nickm@Y as the same
 
-      user, then anybody can register nickm@Z and use it to tell which
 
-      bridges nickm@X got (or would get).
 
-      Make sure that we actually check headers so we can't be trivially
 
-      used to spam people.
 
- 2. IP-based.
 
-   Goal: avoid handing out all the bridges to users in a similar IP
 
-   space and time.
 
-   Parameters:
 
-      T_Flush -- how long it should take a user on a single network to
 
-         see a whole cluster of bridges.
 
-      N_C
 
-      K -- the number of bridges we hand out in response to a single
 
-      request.
 
-   Setup: using an AS map or a geoip map or some other flawed input
 
-   source, divide IP space into "areas" such that surveying a large
 
-   collection of "areas" is hard.  For v0, use /24 address blocks.
 
-   Group areas into N_C clusters.
 
-   Generate secrets L, M, N.
 
-   Set the period P such that P*(bridges-per-cluster/K) = T_flush.
 
-   Don't set P to greater than a week, or less than three hours.
 
-   When we get a bridge:
 
-      Based on HMAC(L,ID), assign the bridge to a cluster.  Within each
 
-      cluster, keep the bridges in a ring based on HMAC(M,ID).
 
-      [Should we re-sort the rings for each new time period, so the ring
 
-       for a given cluster is based on HMAC(M,PS|ID)? -RD]
 
-   When we get a connection:
 
-      If it's http, redirect it to https.
 
-      Let area be the incoming IP network.  Let PS be the current
 
-      period.  Compute X = HMAC(N, PS|area).  Return the next K bridges
 
-      in the ring after X.
 
-      [Don't we want to compute C = HMAC(key, area) to learn what cluster
 
-       to answer from, and then X = HMAC(key, PS|area) to pick a point in
 
-       that ring? -RD]
 
-   Need to clarify that some HMACs are for rings, and some are for
 
-   partitions. How rings scale is clear. How do we grow the number of
 
-   partitions? Looking at successive bits from the HMAC output is one way.
 
- 3. Open issues
 
-    Denial of service attacks
 
-    A good view of network topology
 
- at some point we should learn some reliability stats on our bridges. when
 
- we say above 'give out k bridges', we might give out 2 reliable ones and
 
- k-2 others. we count around the ring the same way we do now, to find them.
 
 
  |