@@ -8,16 +8,20 @@ Read the README file first, so you can get familiar with the basics.
 
 1. The programs.
 
-1.1. "or". This is the main program here. It functions as both a server
-and a client, depending on which config file you give it. ...
+1.1. "or". This is the main program here. It functions as either a server
+or a client, depending on which config file you give it.
+
+1.2. "orkeygen". Use "orkeygen file-for-privkey file-for-pubkey" to
+generate key files for an onion router.
 
 2. The pieces.
 
 2.1. Routers. Onion routers, as far as the 'or' program is concerned,
 are a bunch of data items that are loaded into the router_array when
-the program starts. After it's loaded, the router information is never
-changed. When a new OR connection is started (see below), the relevant
-information is copied from the router struct to the connection struct.
+the program starts. Periodically it downloads a new set of routers
+from a directory server, and updates the router_array. When a new OR
+connection is started (see below), the relevant information is copied
+from the router struct to the connection struct.
 
 2.2. Connections. A connection is a long-standing tcp socket between
 nodes. A connection is named based on what it's connected to -- an "OR
@@ -26,34 +30,36 @@ an onion proxy on the other end, an "exit connection" has a website or
 other server on the other end, and an "AP connection" has an application
 proxy (and thus a user) on the other end.
 
-2.3. Circuits. A circuit is a single conversation between two
-participants over the onion routing network. One end of the circuit has
-an AP connection, and the other end has an exit connection. AP and exit
+2.3. Circuits. A circuit is a path over the onion routing
+network. Applications can connect to one end of the circuit, and can
+create exit connections at the other end of the circuit. AP and exit
 connections have only one circuit associated with them (and thus these
 connection types are closed when the circuit is closed), whereas OP and
 OR connections multiplex many circuits at once, and stay standing even
 when there are no circuits running over them.
 
+2.4. Topics. Topics are specific conversations between an AP and an exit.
+Topics are multiplexed over circuits.
+
 2.4. Cells. Some connections, specifically OR and OP connections, speak
-"cells". This means that data over that connection is bundled into 128
-byte packets (8 bytes of header and 120 bytes of payload). Each cell has
+"cells". This means that data over that connection is bundled into 256
+byte packets (8 bytes of header and 248 bytes of payload). Each cell has
 a type, or "command", which indicates what it's for.
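
As an illustration of the cell sizes described in the Cells paragraph,
here is a minimal C sketch. The sizes (256 bytes per cell, 8 bytes of
header, 248 bytes of payload) come from the added lines of the diff; the
type and function names are hypothetical and are not taken from the 'or'
source. The header is kept opaque because its field layout is not
described here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CELL_HEADER_SIZE   8    /* from the text */
#define CELL_PAYLOAD_SIZE  248  /* from the text */
#define CELL_SIZE (CELL_HEADER_SIZE + CELL_PAYLOAD_SIZE)  /* 256 */

/* Hypothetical in-memory view of one cell. */
typedef struct {
  unsigned char header[CELL_HEADER_SIZE];   /* layout not specified here */
  unsigned char payload[CELL_PAYLOAD_SIZE];
} example_cell_t;

/* Number of cells needed to carry len bytes of data (ceiling division). */
size_t example_cells_needed(size_t len)
{
  return (len + CELL_PAYLOAD_SIZE - 1) / CELL_PAYLOAD_SIZE;
}

/* Zero the cell, then copy at most one payload's worth of data into it.
 * Returns the number of bytes consumed from data. */
size_t example_cell_fill(example_cell_t *cell,
                         const unsigned char *data, size_t len)
{
  size_t n = len < CELL_PAYLOAD_SIZE ? len : CELL_PAYLOAD_SIZE;
  assert(cell != NULL && data != NULL);
  memset(cell, 0, sizeof(*cell));
  memcpy(cell->payload, data, n);
  return n;
}
```

A caller would invoke example_cell_fill repeatedly, advancing the data
pointer by the returned count, until all the data has been packed.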
 
 
 3. Important parameters in the code.
 
-3.1. Role.
 
 
 4. Robustness features.
 
 4.1. Bandwidth throttling. Each cell-speaking connection has a maximum
 bandwidth it can use, as specified in the routers.or file. Bandwidth
-throttling occurs on both the sender side and the receiving side. The
-sending side sends cells at regularly spaced intervals (e.g., a connection
-with a bandwidth of 12800B/s would queue a cell every 10ms). The receiving
-side protects against misbehaving servers that send cells more frequently,
-by using a simple token bucket:
+throttling can occur on both the sender side and the receiving side. If
+the LinkPadding option is on, the sending side sends cells at regularly
+spaced intervals (e.g., a connection with a bandwidth of 25600B/s would
+queue a cell every 10ms). The receiving side protects against misbehaving
+servers that send cells more frequently, by using a simple token bucket:
 
 Each connection has a token bucket with a specified capacity. Tokens are
 added to the bucket each second (when the bucket is full, new tokens
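
The pacing arithmetic and the token bucket from the bandwidth throttling
paragraph can be sketched in C as follows. This is an illustration under
stated assumptions, not the code in the 'or' program: all names are
hypothetical, one token is assumed to pay for one byte, and tokens that
arrive when the bucket is full are assumed to be discarded (this hunk
ends before the text says what happens to them).

```c
#include <assert.h>
#include <stddef.h>

#define EXAMPLE_CELL_SIZE 256  /* bytes per cell, from the Cells paragraph */

/* Milliseconds between cells for a link limited to bytes_per_sec. */
unsigned long example_cell_interval_ms(unsigned long bytes_per_sec)
{
  unsigned long cells_per_sec;
  assert(bytes_per_sec >= EXAMPLE_CELL_SIZE);
  cells_per_sec = bytes_per_sec / EXAMPLE_CELL_SIZE;
  return 1000UL / cells_per_sec;
}

/* Hypothetical per-connection bucket. */
typedef struct {
  unsigned long capacity;  /* maximum tokens the bucket can hold */
  unsigned long tokens;    /* tokens currently available */
  unsigned long refill;    /* tokens added each second */
} example_bucket_t;

/* Called once per second: add tokens without exceeding capacity.
 * ASSUMPTION: tokens that do not fit are discarded. */
void example_bucket_tick(example_bucket_t *b)
{
  unsigned long room = b->capacity - b->tokens;
  b->tokens += (b->refill < room) ? b->refill : room;
}

/* ASSUMPTION: one token per byte. Spends tokens and returns how many
 * of the want bytes may be read from the connection right now. */
unsigned long example_bucket_take(example_bucket_t *b, unsigned long want)
{
  unsigned long n = (want < b->tokens) ? want : b->tokens;
  b->tokens -= n;
  return n;
}
```

With the figures from the text, 25600 B/s divided by 256 bytes per cell
gives 100 cells per second, so example_cell_interval_ms(25600) yields the
10 ms interval quoted there.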
@@ -79,22 +85,12 @@ he owns, and then refuse to read any of the bytes at the webserver end
 of the circuit. These bottlenecks can propagate back through the entire
 network, mucking up everything.
 
-To handle this congestion, each circuit starts out with a receive
-window at each node of 100 cells -- it is willing to receive at most 100
-cells on that circuit. (It handles each direction separately; so that's
-really 100 cells forward and 100 cells back.) The edge of the circuit
-is willing to create at most 100 cells from data coming from outside the
-onion routing network. Nodes in the middle of the circuit will tear down
-the circuit if a data cell arrives when the receive window is 0. When
-data has traversed the network, the edge node buffers it on its outbuf,
-and evaluates whether to respond with a 'sendme' acknowledgement: if its
-outbuf is not too full, and its receive window is less than 90, then it
-queues a 'sendme' cell backwards in the circuit. Each node that receives
-the sendme increments its window by 10 and passes the cell onward.
+(See the tor-spec.txt document for details of how congestion control
+works.)
 
 In practice, all the nodes in the circuit maintain a receive window
-close to 100 except the exit node, which stays around 0, periodically
-receiving a sendme and reading 10 more data cells from the webserver.
+close to maximum except the exit node, which stays around 0, periodically
+receiving a sendme and reading more data cells from the webserver.
 In this way we can use pretty much all of the available bandwidth for
 data, but gracefully back off when faced with multiple circuits (a new
 sendme arrives only after some cells have traversed the entire network),
@@ -108,7 +104,7 @@ congestion control; so far it's enough.
 
 4.3. Router twins. In many cases when we ask for a router with a given
 address and port, we really mean a router who knows a given key. Router
-twins are two or more routers that all share the same private key. We thus
+twins are two or more routers that share the same private key. We thus
 give routers extra flexibility in choosing the next hop in the circuit: if
 some of the twins are down or slow, it can choose the more available ones.
 
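
The router twins idea can be sketched in C as a lookup by key rather
than by address and port. Everything in this sketch is hypothetical: the
struct, its fields (key_id stands in for whatever identifies the shared
key, is_up for whatever liveness information the router has), and the
function name are illustrative only and are not taken from the 'or'
source.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified router entry. */
typedef struct {
  const char *address;
  unsigned short port;
  const char *key_id;  /* stand-in for the router's key */
  int is_up;           /* stand-in for a liveness check */
} example_router_t;

/* Prefer the requested router; if it is down, fall back to the first
 * reachable router that shares its key. Returns NULL if no twin is up. */
const example_router_t *example_pick_twin(const example_router_t *routers,
                                          size_t n,
                                          const example_router_t *want)
{
  size_t i;
  if (want->is_up)
    return want;
  for (i = 0; i < n; i++) {
    if (routers[i].is_up && strcmp(routers[i].key_id, want->key_id) == 0)
      return &routers[i];
  }
  return NULL;
}
```

A real selection would probably also weigh how slow each twin is, as the
text suggests; this sketch only distinguishes up from down.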