Browse Source

r8826@totoro: nickm | 2006-10-01 17:58:45 -0400
Disprefer exit nodes for entry, middle positions (fixes bug 200). Also, switch to using a uint64_t to hold "total bandwidth for all nodes" under consideration; crypt_rand_int would have died at 2GB/s network capacity.


svn:r8571

Nick Mathewson 19 years ago
parent
commit
57ffca883d
7 changed files with 89 additions and 47 deletions
  1. 5 0
      ChangeLog
  2. 4 6
      doc/TODO
  3. 18 9
      doc/path-spec.txt
  4. 6 6
      src/or/circuitbuild.c
  5. 5 2
      src/or/or.h
  6. 1 1
      src/or/rendservice.c
  7. 50 23
      src/or/routerlist.c

+ 5 - 0
ChangeLog

@@ -47,6 +47,9 @@ Changes in version 0.1.2.2-alpha - 2006-10-??
       and including the function name only seems to confuse users.
       and including the function name only seems to confuse users.
     - Fix CIRC controller events so that controllers can learn the identity
     - Fix CIRC controller events so that controllers can learn the identity
       digests of non-Named servers used in circuit paths. (Fixes bug 336.)
       digests of non-Named servers used in circuit paths. (Fixes bug 336.)
+    - Avoid choosing Exit nodes for entry or middle hops when the bandwidth
+      available in non-Exit nodes is much higher then the bandwidth available
+      in Exit nodes. (Fixes bug 200.)
 
 
   o Security Fixes, minor:
   o Security Fixes, minor:
     - If a client asked for a server by name, and we didn't have a
     - If a client asked for a server by name, and we didn't have a
@@ -93,6 +96,8 @@ Changes in version 0.1.2.2-alpha - 2006-10-??
       With the old code, if a guard was unreachable by us but listed as
       With the old code, if a guard was unreachable by us but listed as
       running, it would clog our guard list forever.
       running, it would clog our guard list forever.
     - Make eventdns give strings for DNS errors, not just error numbers.
     - Make eventdns give strings for DNS errors, not just error numbers.
+    - Be prepared in case we ever have a network with  more than 2GB per
+      second total advertised capacity.
 
 
   o Documentation
   o Documentation
     - Documented (and renamed) ServerDNSSearchDomains and
     - Documented (and renamed) ServerDNSSearchDomains and

+ 4 - 6
doc/TODO

@@ -35,12 +35,10 @@ x - If the client's clock is too far in the past, it will drop (or
       D The right thing here is to revamp our node selection implementation.
       D The right thing here is to revamp our node selection implementation.
         (Deferred until oprofile says this matters.)
         (Deferred until oprofile says this matters.)
     o make it configurable, so people can turn it on or off.
     o make it configurable, so people can turn it on or off.
-N - Add script to grep for identical log msgs.
+  o Add script to grep for identical log msgs.
-N - Bug 200: disprefer exit nodes for entry, middle.
+  o Bug 200: disprefer exit nodes for entry, middle.
-    - If less than 1/3 support port X, only use as exit.
+    o Specify
-    - If 2/3 support port X, weight exits 1/2; weight non-exits 1.
+    o Implement
-    - (Exit fraction - 1/3):Non-exit fraction
-    - (e - 1/3)/(1-e)
   o Bug 303: block exit from circuits created with create-fast
   o Bug 303: block exit from circuits created with create-fast
     o Specify and document
     o Specify and document
     o Implement
     o Implement

+ 18 - 9
doc/path-spec.txt

@@ -157,15 +157,24 @@ of their choices.
        below)
        below)
      - XXXX Choosing the length
      - XXXX Choosing the length
 
 
-   For circuits that are not "fast", when choosing among multiple
+   For circuits that do not need to be not "fast", when choosing among
-   candidates for a path element, we choose randomly. For "fast" circuits,
+   multiple candidates for a path element, we choose randomly.
-   we choose
+
-   a given router with probability proportional to its advertised bandwidth
+   For "fast" circuits, we a given router as an exit with probability
-   [the smaller of the 'rate' and 'observed' arguments to the "bandwidth"
+   proportional to its advertised bandwidth [the smaller of the 'rate' and
-   element in its descriptor].  If a router's advertised bandwidth is greater
+   'observed' arguments to the "bandwidth" element in its descriptor].  If a
-   than MAX_BELIEVEABLE_BANDWIDTH (1.5 MB/sec), we clip to that value.
+   router's advertised bandwidth is greater than MAX_BELIEVEABLE_BANDWIDTH
-
+   (1.5 MB/sec), we clip to that value.
-   (XXXX We should do something to shift traffic away from exit nodes.)
+
+   For non-exit positions on "fast" circuits, we pick routers as above, but
+   we weight the clipped advertised bandwidth of Exit-flagged nodes depending
+   on the fraction of bandwidth available from non-Exit nodes.  Call the
+   total clipped advertised bandwidth for Exit nodes under consideration E,
+   and the total clipped advertised bandwidth for non-Exit nodes under
+   consideration N.  If E<N/2, we do not consider Exit-flagged nodes.
+   Otherwise, we weight their bandwidth with the factor (E-N/2)/(N+E-N/2) ==
+   (2E - N)/(2E + N).  This ensures that bandwidth is evenly distributed over
+   nodes in 3-hop paths.
 
 
    Additionally, we may be building circuits with one or more requests in
    Additionally, we may be building circuits with one or more requests in
    mind.  Each kind of request puts certain constraints on paths:
    mind.  Each kind of request puts certain constraints on paths:

+ 6 - 6
src/or/circuitbuild.c

@@ -1199,7 +1199,7 @@ choose_good_exit_server_general(routerlist_t *dir, int need_uptime,
     smartlist_subtract(sl,excludedexits);
     smartlist_subtract(sl,excludedexits);
     if (options->StrictExitNodes || smartlist_overlap(sl,preferredexits))
     if (options->StrictExitNodes || smartlist_overlap(sl,preferredexits))
       smartlist_intersect(sl,preferredexits);
       smartlist_intersect(sl,preferredexits);
-    router = routerlist_sl_choose_by_bandwidth(sl);
+    router = routerlist_sl_choose_by_bandwidth(sl, 1);
   } else {
   } else {
     /* Either there are no pending connections, or no routers even seem to
     /* Either there are no pending connections, or no routers even seem to
      * possibly support any of them.  Choose a router at random that satisfies
      * possibly support any of them.  Choose a router at random that satisfies
@@ -1238,7 +1238,7 @@ choose_good_exit_server_general(routerlist_t *dir, int need_uptime,
         smartlist_intersect(sl,preferredexits);
         smartlist_intersect(sl,preferredexits);
         /* XXX sometimes the above results in null, when the requested
         /* XXX sometimes the above results in null, when the requested
          * exit node is down. we should pick it anyway. */
          * exit node is down. we should pick it anyway. */
-      router = routerlist_sl_choose_by_bandwidth(sl);
+      router = routerlist_sl_choose_by_bandwidth(sl, 1);
       if (router)
       if (router)
         break;
         break;
     }
     }
@@ -1282,14 +1282,14 @@ choose_good_exit_server(uint8_t purpose, routerlist_t *dir,
       if (is_internal) /* pick it like a middle hop */
       if (is_internal) /* pick it like a middle hop */
         return router_choose_random_node(NULL, get_options()->ExcludeNodes,
         return router_choose_random_node(NULL, get_options()->ExcludeNodes,
                NULL, need_uptime, need_capacity, 0,
                NULL, need_uptime, need_capacity, 0,
-               get_options()->_AllowInvalid & ALLOW_INVALID_MIDDLE, 0);
+               get_options()->_AllowInvalid & ALLOW_INVALID_MIDDLE, 0, 0);
       else
       else
         return choose_good_exit_server_general(dir,need_uptime,need_capacity);
         return choose_good_exit_server_general(dir,need_uptime,need_capacity);
     case CIRCUIT_PURPOSE_C_ESTABLISH_REND:
     case CIRCUIT_PURPOSE_C_ESTABLISH_REND:
       return router_choose_random_node(
       return router_choose_random_node(
                options->RendNodes, options->RendExcludeNodes,
                options->RendNodes, options->RendExcludeNodes,
                NULL, need_uptime, need_capacity, 0,
                NULL, need_uptime, need_capacity, 0,
-               options->_AllowInvalid & ALLOW_INVALID_RENDEZVOUS, 0);
+               options->_AllowInvalid & ALLOW_INVALID_RENDEZVOUS, 0, 0);
   }
   }
   log_warn(LD_BUG,"Bug: unhandled purpose %d", purpose);
   log_warn(LD_BUG,"Bug: unhandled purpose %d", purpose);
   tor_fragile_assert();
   tor_fragile_assert();
@@ -1508,7 +1508,7 @@ choose_good_middle_server(uint8_t purpose,
   choice = router_choose_random_node(preferred,
   choice = router_choose_random_node(preferred,
            options->ExcludeNodes, excluded,
            options->ExcludeNodes, excluded,
            state->need_uptime, state->need_capacity, 0,
            state->need_uptime, state->need_capacity, 0,
-           options->_AllowInvalid & ALLOW_INVALID_MIDDLE, 0);
+           options->_AllowInvalid & ALLOW_INVALID_MIDDLE, 0, 0);
   if (preferred)
   if (preferred)
     tor_free(preferred);
     tor_free(preferred);
   smartlist_free(excluded);
   smartlist_free(excluded);
@@ -1571,7 +1571,7 @@ choose_good_entry_server(uint8_t purpose, cpath_build_state_t *state)
            excluded, state ? state->need_uptime : 0,
            excluded, state ? state->need_uptime : 0,
            state ? state->need_capacity : 0,
            state ? state->need_capacity : 0,
            state ? 0 : 1,
            state ? 0 : 1,
-           options->_AllowInvalid & ALLOW_INVALID_ENTRY, 0);
+           options->_AllowInvalid & ALLOW_INVALID_ENTRY, 0, 0);
   smartlist_free(excluded);
   smartlist_free(excluded);
   return choice;
   return choice;
 }
 }

+ 5 - 2
src/or/or.h

@@ -918,6 +918,7 @@ typedef struct {
   unsigned int is_fast:1; /** Do we think this is a fast OR? */
   unsigned int is_fast:1; /** Do we think this is a fast OR? */
   unsigned int is_stable:1; /** Do we think this is a stable OR? */
   unsigned int is_stable:1; /** Do we think this is a stable OR? */
   unsigned int is_possible_guard:1; /**< Do we think this is an OK guard? */
   unsigned int is_possible_guard:1; /**< Do we think this is an OK guard? */
+  unsigned int is_exit:1; /**< Do we think this is an OK exit? */
 
 
 /** Tor can use this desc for circuit-building. */
 /** Tor can use this desc for circuit-building. */
 #define ROUTER_PURPOSE_GENERAL 0
 #define ROUTER_PURPOSE_GENERAL 0
@@ -2562,13 +2563,15 @@ routerinfo_t *router_find_exact_exit_enclave(const char *address,
 int router_is_unreliable(routerinfo_t *router, int need_uptime,
 int router_is_unreliable(routerinfo_t *router, int need_uptime,
                          int need_capacity, int need_guard);
                          int need_capacity, int need_guard);
 uint32_t router_get_advertised_bandwidth(routerinfo_t *router);
 uint32_t router_get_advertised_bandwidth(routerinfo_t *router);
-routerinfo_t *routerlist_sl_choose_by_bandwidth(smartlist_t *sl);
+routerinfo_t *routerlist_sl_choose_by_bandwidth(smartlist_t *sl, int for_exit);
+
 routerinfo_t *router_choose_random_node(const char *preferred,
 routerinfo_t *router_choose_random_node(const char *preferred,
                                         const char *excluded,
                                         const char *excluded,
                                         smartlist_t *excludedsmartlist,
                                         smartlist_t *excludedsmartlist,
                                         int need_uptime, int need_bandwidth,
                                         int need_uptime, int need_bandwidth,
                                         int need_guard,
                                         int need_guard,
-                                        int allow_invalid, int strict);
+                                        int allow_invalid, int strict,
+                                        int weight_for_exit);
 routerinfo_t *router_get_by_nickname(const char *nickname,
 routerinfo_t *router_get_by_nickname(const char *nickname,
                                      int warn_if_unnamed);
                                      int warn_if_unnamed);
 routerinfo_t *router_get_by_hexdigest(const char *hexdigest);
 routerinfo_t *router_get_by_hexdigest(const char *hexdigest);

+ 1 - 1
src/or/rendservice.c

@@ -1005,7 +1005,7 @@ rend_services_introduce(void)
       router = router_choose_random_node(service->intro_prefer_nodes,
       router = router_choose_random_node(service->intro_prefer_nodes,
                service->intro_exclude_nodes, exclude_routers, 1, 0, 0,
                service->intro_exclude_nodes, exclude_routers, 1, 0, 0,
                get_options()->_AllowInvalid & ALLOW_INVALID_INTRODUCTION,
                get_options()->_AllowInvalid & ALLOW_INVALID_INTRODUCTION,
-               0);
+               0, 0);
       if (!router) {
       if (!router) {
         log_warn(LD_REND,
         log_warn(LD_REND,
                  "Could only establish %d introduction points for %s.",
                  "Could only establish %d introduction points for %s.",

+ 50 - 23
src/or/routerlist.c

@@ -884,46 +884,70 @@ router_get_advertised_bandwidth(routerinfo_t *router)
  * the advertised bandwidth of each router.
  * the advertised bandwidth of each router.
  */
  */
 routerinfo_t *
 routerinfo_t *
-routerlist_sl_choose_by_bandwidth(smartlist_t *sl)
+routerlist_sl_choose_by_bandwidth(smartlist_t *sl, int for_exit)
 {
 {
   int i;
   int i;
   routerinfo_t *router;
   routerinfo_t *router;
-  smartlist_t *bandwidths;
+  uint32_t *bandwidths;
-  uint32_t this_bw, tmp, total_bw=0, rand_bw;
+  uint32_t this_bw;
-  uint32_t *p;
+  uint64_t total_nonexit_bw = 0, total_exit_bw = 0, total_bw = 0;
+  uint64_t rand_bw, tmp;
+  double exit_weight;
 
 
   /* First count the total bandwidth weight, and make a smartlist
   /* First count the total bandwidth weight, and make a smartlist
    * of each value. */
    * of each value. */
-  bandwidths = smartlist_create();
+  bandwidths = tor_malloc(sizeof(uint32_t)*smartlist_len(sl));
   for (i = 0; i < smartlist_len(sl); ++i) {
   for (i = 0; i < smartlist_len(sl); ++i) {
     router = smartlist_get(sl, i);
     router = smartlist_get(sl, i);
     this_bw = router_get_advertised_bandwidth(router);
     this_bw = router_get_advertised_bandwidth(router);
     /* if they claim something huge, don't believe it */
     /* if they claim something huge, don't believe it */
     if (this_bw > MAX_BELIEVABLE_BANDWIDTH)
     if (this_bw > MAX_BELIEVABLE_BANDWIDTH)
       this_bw = MAX_BELIEVABLE_BANDWIDTH;
       this_bw = MAX_BELIEVABLE_BANDWIDTH;
-    p = tor_malloc(sizeof(uint32_t));
+    bandwidths[i] = this_bw;
-    *p = this_bw;
+    if (router->is_exit)
-    smartlist_add(bandwidths, p);
+      total_exit_bw += this_bw;
-    total_bw += this_bw;
+    else
+      total_nonexit_bw += this_bw;
   }
   }
-  if (!total_bw) {
+  if (!(total_exit_bw+total_nonexit_bw)) {
-    SMARTLIST_FOREACH(bandwidths, uint32_t*, p, tor_free(p));
+    tor_free(bandwidths);
-    smartlist_free(bandwidths);
     return smartlist_choose(sl);
     return smartlist_choose(sl);
   }
   }
+  if (for_exit) {
+    exit_weight = 1.0;
+    total_bw = total_exit_bw + total_nonexit_bw;
+  } else if (total_exit_bw < total_nonexit_bw / 2) {
+    exit_weight = 0.0;
+    total_bw = total_nonexit_bw;
+  } else {
+    uint64_t leftover = (total_exit_bw - total_nonexit_bw / 2);
+    exit_weight = U64_TO_DBL(leftover) /
+      U64_TO_DBL(leftover + total_nonexit_bw);
+    total_bw =  total_nonexit_bw +
+      DBL_TO_U64(exit_weight * U64_TO_DBL(total_exit_bw));
+  }
+  /*
+  log_debug(LD_CIRC, "Total bw = "U64_FORMAT", total exit bw = "U64_FORMAT
+            ", total nonexit bw = "U64_FORMAT", exit weight = %lf "
+            "(for exit == %d)",
+            U64_PRINTF_ARG(total_bw), U64_PRINTF_ARG(total_exit_bw),
+            U64_PRINTF_ARG(total_nonexit_bw), exit_weight, for_exit);
+  */
+
   /* Second, choose a random value from the bandwidth weights. */
   /* Second, choose a random value from the bandwidth weights. */
-  rand_bw = crypto_rand_int(total_bw);
+  rand_bw = crypto_rand_uint64(total_bw);
   /* Last, count through sl until we get to the element we picked */
   /* Last, count through sl until we get to the element we picked */
   tmp = 0;
   tmp = 0;
-  for (i=0; ; i++) {
+  for (i=0; i < smartlist_len(sl); i++) {
-    tor_assert(i < smartlist_len(sl));
+    router = smartlist_get(sl, i);
-    p = smartlist_get(bandwidths, i);
+    if (router->is_exit)
-    tmp += *p;
+      tmp += ((uint64_t)(bandwidths[i] * exit_weight));
+    else
+      tmp += bandwidths[i];
     if (tmp >= rand_bw)
     if (tmp >= rand_bw)
       break;
       break;
   }
   }
-  SMARTLIST_FOREACH(bandwidths, uint32_t*, p, tor_free(p));
+  tor_free(bandwidths);
-  smartlist_free(bandwidths);
   return (routerinfo_t *)smartlist_get(sl, i);
   return (routerinfo_t *)smartlist_get(sl, i);
 }
 }
 
 
@@ -944,7 +968,8 @@ router_choose_random_node(const char *preferred,
                           smartlist_t *excludedsmartlist,
                           smartlist_t *excludedsmartlist,
                           int need_uptime, int need_capacity,
                           int need_uptime, int need_capacity,
                           int need_guard,
                           int need_guard,
-                          int allow_invalid, int strict)
+                          int allow_invalid, int strict,
+                          int weight_for_exit)
 {
 {
   smartlist_t *sl, *excludednodes;
   smartlist_t *sl, *excludednodes;
   routerinfo_t *choice = NULL;
   routerinfo_t *choice = NULL;
@@ -975,8 +1000,8 @@ router_choose_random_node(const char *preferred,
       smartlist_subtract(sl,excludedsmartlist);
       smartlist_subtract(sl,excludedsmartlist);
     routerlist_sl_remove_unreliable_routers(sl, need_uptime,
     routerlist_sl_remove_unreliable_routers(sl, need_uptime,
                                             need_capacity, need_guard);
                                             need_capacity, need_guard);
-    if (need_capacity)
+    if (need_capacity) /* XXXX Is this documented in path spec. -NM */
-      choice = routerlist_sl_choose_by_bandwidth(sl);
+      choice = routerlist_sl_choose_by_bandwidth(sl, weight_for_exit);
     else
     else
       choice = smartlist_choose(sl);
       choice = smartlist_choose(sl);
     smartlist_free(sl);
     smartlist_free(sl);
@@ -989,7 +1014,8 @@ router_choose_random_node(const char *preferred,
                need_uptime?", stable":"",
                need_uptime?", stable":"",
                need_guard?", guard":"");
                need_guard?", guard":"");
       choice = router_choose_random_node(
       choice = router_choose_random_node(
-        NULL, excluded, excludedsmartlist, 0, 0, 0, allow_invalid, 0);
+        NULL, excluded, excludedsmartlist,
+        0, 0, 0, allow_invalid, 0, weight_for_exit);
     }
     }
   }
   }
   smartlist_free(excludednodes);
   smartlist_free(excludednodes);
@@ -3422,6 +3448,7 @@ routers_update_status_from_networkstatus(smartlist_t *routers,
       router->is_fast = rs->status.is_fast;
       router->is_fast = rs->status.is_fast;
       router->is_stable = rs->status.is_stable;
       router->is_stable = rs->status.is_stable;
       router->is_possible_guard = rs->status.is_possible_guard;
       router->is_possible_guard = rs->status.is_possible_guard;
+      router->is_exit =  rs->status.is_exit;
     }
     }
     if (router->is_running && ds) {
     if (router->is_running && ds) {
       ds->n_networkstatus_failures = 0;
       ds->n_networkstatus_failures = 0;