Updated HACKING and README docs

HACKING now explains bandwidth throttling, congestion control,
and router twins. Read it and see if it makes sense.


svn:r68
Roger Dingledine 22 years ago
parent
commit
86eb8db0f0
3 changed files with 72 additions and 18 deletions
  HACKING      +63 -11
  Makefile.am  +2 -1
  README       +7 -6

+ 63 - 11
HACKING

@@ -8,30 +8,41 @@ Read the README file first, so you can get familiar with the basics.
 
 1. The pieces.
 
-1.1 Connections. A connection is a long-standing tcp socket between
+1.1. Routers. Onion routers, as far as the 'or' program is concerned,
+are a bunch of data items that are loaded into the router_array when
+the program starts. After it's loaded, the router information is never
+changed. When a new OR connection is started (see below), the relevant
+information is copied from the router struct to the connection struct.
+
+1.2. Connections. A connection is a long-standing tcp socket between
 nodes. A connection is named based on what it's connected to -- an "OR
 connection" has an onion router on the other end, an "OP connection" has
 an onion proxy on the other end, an "exit connection" has a website or
 other server on the other end, and an "AP connection" has an application
 proxy (and thus a user) on the other end.
 
-1.2. Circuits. A circuit is a single conversation between two
+1.3. Circuits. A circuit is a single conversation between two
 participants over the onion routing network. One end of the circuit has
 an AP connection, and the other end has an exit connection. AP and exit
-connections have only one circuit associated with them, whereas OP and
-OR connections multiplex many circuits at once.
+connections have only one circuit associated with them (and thus these
+connection types are closed when the circuit is closed), whereas OP and
+OR connections multiplex many circuits at once, and stay standing even
+when there are no circuits running over them.
 
-1.3. Cells. Some connections, specifically OR and OP connections, speak
+1.4. Cells. Some connections, specifically OR and OP connections, speak
 "cells". This means that data over that connection is bundled into 128
 byte packets (8 bytes of header and 120 bytes of payload). Each cell has
 a type, or "command", which indicates what it's for.
 
 
+2. Important parameters in the code.
+
+2.1. Role.
 
 
-2. Other features.
+3. Robustness features.
 
-2.1. Bandwidth throttling. Each cell-speaking connection has a maximum
+3.1. Bandwidth throttling. Each cell-speaking connection has a maximum
 bandwidth it can use, as specified in the routers.or file. Bandwidth
 throttling occurs on both the sender side and the receiving side. The
 sending side sends cells at regularly spaced intervals (e.g., a connection
@@ -53,8 +64,49 @@ The bandwidth throttling uses TCP to push back when we stop reading.
 We extend it with token buckets to allow more flexibility for traffic
 bursts.
 
-2.2. Data congestion control. 
-
-
-
+3.2. Data congestion control. Even with the above bandwidth throttling,
+we still need to worry about congestion, either accidental or intentional.
+If a lot of people make circuits into the same node, and they all come out
+through the same connection, then that connection may become saturated
+(be unable to send out data cells as quickly as it wants to). An adversary
+can make a 'put' request through the onion routing network to a webserver
+he owns, and then refuse to read any of the bytes at the webserver end
+of the circuit. These bottlenecks can propagate back through the entire
+network, mucking up everything.
+
+To handle this congestion, each circuit starts out with a receive
+window at each node of 100 cells -- it is willing to receive at most 100
+cells on that circuit. (It handles each direction separately; so that's
+really 100 cells forward and 100 cells back.) The edge of the circuit
+is willing to create at most 100 cells from data coming from outside the
+onion routing network. Nodes in the middle of the circuit will tear down
+the circuit if a data cell arrives when the receive window is 0. When
+data has traversed the network, the edge node buffers it on its outbuf,
+and evaluates whether to respond with a 'sendme' acknowledgement: if its
+outbuf is not too full, and its receive window is less than 90, then it
+queues a 'sendme' cell backwards in the circuit. Each node that receives
+the sendme increments its window by 10 and passes the cell onward.
+
+In practice, all the nodes in the circuit maintain a receive window
+close to 100 except the exit node, which stays around 0, periodically
+receiving a sendme and reading 10 more data cells from the webserver.
+In this way we can use pretty much all of the available bandwidth for
+data, but gracefully back off when faced with multiple circuits (a new
+sendme arrives only after some cells have traversed the entire network),
+stalled network connections, or attacks.
+
+We don't need to reimplement full tcp windows, with sequence numbers,
+the ability to drop cells when we're full, etc., because the tcp streams
+already guarantee in-order delivery of each cell. Rather than trying
+to build some sort of tcp-on-tcp scheme, we implement this minimal data
+congestion control; so far it's enough.
+
+3.3. Router twins. In many cases when we ask for a router with a given
+address and port, we really mean a router that knows a given key. Router
+twins are two or more routers that all share the same private key. We thus
+give routers extra flexibility in choosing the next hop in the circuit: if
+some of the twins are down or slow, they can choose the more available ones.
+
+Currently the code tries for the primary router first, and if it's down,
+chooses the first available twin.
 
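The receive-window scheme that section 3.2 of HACKING describes can be sketched roughly as follows. This is a simplified, synchronous Python model and not the 'or' program's C code: the names Node, deliver, and maybe_sendme are hypothetical, cells move along the whole path in one step, and the "outbuf is not too full" test is reduced to a boolean. Only the numbers (a window of 100, a threshold of 90, an increment of 10) come from the text.

```python
from dataclasses import dataclass

WINDOW_START = 100     # initial receive window per direction, in cells
SENDME_THRESHOLD = 90  # the edge considers a sendme once its window is below this
SENDME_INCREMENT = 10  # each sendme reopens a node's window by this many cells


class CircuitTornDown(Exception):
    """A data cell arrived at a node whose receive window was 0."""


@dataclass
class Node:
    """Per-circuit, per-direction state at one node (hypothetical model)."""
    window: int = WINDOW_START

    def on_data_cell(self) -> None:
        if self.window == 0:
            raise CircuitTornDown("data cell arrived with receive window 0")
        self.window -= 1

    def on_sendme(self) -> None:
        self.window += SENDME_INCREMENT


def maybe_sendme(path, outbuf_has_room):
    """Receiving edge (path[-1]) decides whether to acknowledge.

    The sendme travels backwards; every node on the path reopens its window.
    """
    if outbuf_has_room and path[-1].window < SENDME_THRESHOLD:
        for node in reversed(path):
            node.on_sendme()
        return True
    return False


def deliver(path, outbuf_has_room=True):
    """Carry one data cell from the packaging edge (path[0]) to path[-1].

    Returns False when the packaging edge has used up its window and so
    stops reading from outside the network instead of creating a cell.
    """
    if path[0].window == 0:
        return False
    for node in path:
        node.on_data_cell()
    maybe_sendme(path, outbuf_has_room)
    return True
```

In this model a healthy circuit keeps cycling between windows of 90 and 99, while a stalled reader lets exactly 100 cells through before the packaging edge stops creating cells; a later call to maybe_sendme(path, True) reopens the windows by 10. Because the model is synchronous, it does not reproduce the in-flight effect the text mentions, where the exit node's window hovers near 0.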

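The twin fallback described in section 3.3 of HACKING (try the primary router, then the first available twin) might look something like the sketch below. The Router fields and the choose_next_hop function are assumptions made for illustration; in particular, key_id stands in for "shares the same private key", and is_up stands in for however the real code decides that a router is reachable.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Router:
    """Hypothetical router entry; field names are not from the source."""
    address: str
    port: int
    key_id: str   # routers with equal key_id are treated as twins
    is_up: bool


def choose_next_hop(routers: Sequence[Router], address: str,
                    port: int) -> Optional[Router]:
    """Return the primary router if it is up, else the first twin that is."""
    primary = next(
        (r for r in routers if (r.address, r.port) == (address, port)),
        None,
    )
    if primary is None:
        return None
    if primary.is_up:
        return primary
    # Primary is down: fall back to the first available router with its key.
    for r in routers:
        if r is not primary and r.key_id == primary.key_id and r.is_up:
            return r
    return None
```

Taking the first available twin in list order mirrors the "currently" behaviour stated in the text; choosing the least loaded twin would be a different policy than the one described.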
+ 2 - 1
Makefile.am

@@ -3,4 +3,5 @@ SUBDIRS = src
 
 DIST_SUBDIRS = src
 
-EXTRA_DIST = TODO
+EXTRA_DIST = TODO HACKING FAQ
+

+ 7 - 6
README

@@ -13,7 +13,8 @@ If you got the source from cvs:
 
 If you got the source from a tarball:
 
-  Run ./configure, make, make install as usual.
+  Run ./configure and make as usual. There isn't much point in 
+  'make install' yet.
 
 If this doesn't work for you:
 
@@ -23,7 +24,6 @@ If this doesn't work for you:
   we'll see what we can do.
 
 Once you've got it compiled:
-  (these notes assume you started with source from cvs)
 
   It's a bit hard to figure out what to do with the binaries. If you
   want to set up your own test network, go into src/config/ and look
@@ -54,8 +54,9 @@ Once you've got it compiled:
   then ^z the wget a little bit in. The onion routers will continue
   talking for a while, queueing around 500k in the kernel-level buffers.
   When the kernel buffers are full, and the outbuf for the AP connection
-  also fills, the internal congestion control will kick in and the
-  exit connection will stop reading from the webserver. The circuit
-  will wait until you fg the wget -- and other circuits will work just
-  fine throughout.
+  also fills, the internal congestion control will kick in and the exit
+  connection will stop reading from the webserver. The circuit will
+  wait until you fg the wget -- and other circuits will work just fine
+  throughout. Then try ^z'ing the onion routers, and watch how well it
+  recovers. Then try ^z'ing several of them at once. :)