@@ -6,108 +6,113 @@ the code, add features, fix bugs, etc.
 
 Read the README file first, so you can get familiar with the basics.
 
-1. The programs.
-
-1.1. "or". This is the main program here. It functions as either a server
-or a client, depending on which config file you give it.
-
-1.2. "orkeygen". Use "orkeygen file-for-privkey file-for-pubkey" to
-generate key files for an onion router.
-
-2. The pieces.
-
-2.1. Routers. Onion routers, as far as the 'or' program is concerned,
-are a bunch of data items that are loaded into the router_array when
-the program starts. Periodically it downloads a new set of routers
-from a directory server, and updates the router_array. When a new OR
-connection is started (see below), the relevant information is copied
-from the router struct to the connection struct.
-
-2.2. Connections. A connection is a long-standing tcp socket between
-nodes. A connection is named based on what it's connected to -- an "OR
-connection" has an onion router on the other end, an "OP connection" has
-an onion proxy on the other end, an "exit connection" has a website or
-other server on the other end, and an "AP connection" has an application
-proxy (and thus a user) on the other end.
-
-2.3. Circuits. A circuit is a path over the onion routing
-network. Applications can connect to one end of the circuit, and can
-create exit connections at the other end of the circuit. AP and exit
-connections have only one circuit associated with them (and thus these
-connection types are closed when the circuit is closed), whereas OP and
-OR connections multiplex many circuits at once, and stay standing even
-when there are no circuits running over them.
-
-2.4. Topics. Topics are specific conversations between an AP and an exit.
-Topics are multiplexed over circuits.
-
-2.4. Cells. Some connections, specifically OR and OP connections, speak
-"cells". This means that data over that connection is bundled into 256
-byte packets (8 bytes of header and 248 bytes of payload). Each cell has
-a type, or "command", which indicates what it's for.
-
-
-3. Important parameters in the code.
-
-
-
-4. Robustness features.
-
-4.1. Bandwidth throttling. Each cell-speaking connection has a maximum
-bandwidth it can use, as specified in the routers.or file. Bandwidth
-throttling can occur on both the sender side and the receiving side. If
-the LinkPadding option is on, the sending side sends cells at regularly
-spaced intervals (e.g., a connection with a bandwidth of 25600B/s would
-queue a cell every 10ms). The receiving side protects against misbehaving
-servers that send cells more frequently, by using a simple token bucket:
-
-Each connection has a token bucket with a specified capacity. Tokens are
-added to the bucket each second (when the bucket is full, new tokens
-are discarded.) Each token represents permission to receive one byte
-from the network --- to receive a byte, the connection must remove a
-token from the bucket. Thus if the bucket is empty, that connection must
-wait until more tokens arrive. The number of tokens we add enforces a
-longterm average rate of incoming bytes, yet we still permit short-term
-bursts above the allowed bandwidth. Currently bucket sizes are set to
-ten seconds worth of traffic.
-
-The bandwidth throttling uses TCP to push back when we stop reading.
-We extend it with token buckets to allow more flexibility for traffic
-bursts.
-
-4.2. Data congestion control. Even with the above bandwidth throttling,
-we still need to worry about congestion, either accidental or intentional.
-If a lot of people make circuits into same node, and they all come out
-through the same connection, then that connection may become saturated
-(be unable to send out data cells as quickly as it wants to). An adversary
-can make a 'put' request through the onion routing network to a webserver
-he owns, and then refuse to read any of the bytes at the webserver end
-of the circuit. These bottlenecks can propagate back through the entire
-network, mucking up everything.
-
-(See the tor-spec.txt document for details of how congestion control
-works.)
-
-In practice, all the nodes in the circuit maintain a receive window
-close to maximum except the exit node, which stays around 0, periodically
-receiving a sendme and reading more data cells from the webserver.
-In this way we can use pretty much all of the available bandwidth for
-data, but gracefully back off when faced with multiple circuits (a new
-sendme arrives only after some cells have traversed the entire network),
-stalled network connections, or attacks.
-
-We don't need to reimplement full tcp windows, with sequence numbers,
-the ability to drop cells when we're full etc, because the tcp streams
-already guarantee in-order delivery of each cell. Rather than trying
-to build some sort of tcp-on-tcp scheme, we implement this minimal data
-congestion control; so far it's enough.
-
-4.3. Router twins. In many cases when we ask for a router with a given
-address and port, we really mean a router who knows a given key. Router
-twins are two or more routers that share the same private key. We thus
-give routers extra flexibility in choosing the next hop in the circuit: if
-some of the twins are down or slow, it can choose the more available ones.
-
-Currently the code tries for the primary router first, and if it's down,
-chooses the first available twin.
+The pieces.
+
+  Routers. Onion routers, as far as the 'tor' program is concerned,
+  are a bunch of data items that are loaded into the router_array when
+  the program starts. Periodically it downloads a new set of routers
+  from a directory server, and updates the router_array. When a new OR
+  connection is started (see below), the relevant information is copied
+  from the router struct to the connection struct.
+
+  Connections. A connection is a long-standing tcp socket between
+  nodes. A connection is named based on what it's connected to -- an "OR
+  connection" has an onion router on the other end, an "OP connection" has
+  an onion proxy on the other end, an "exit connection" has a website or
+  other server on the other end, and an "AP connection" has an application
+  proxy (and thus a user) on the other end.
+
+  Circuits. A circuit is a path over the onion routing
+  network. Applications can connect to one end of the circuit, and can
+  create exit connections at the other end of the circuit. AP and exit
+  connections have only one circuit associated with them (and thus these
+  connection types are closed when the circuit is closed), whereas OP and
+  OR connections multiplex many circuits at once, and stay standing even
+  when there are no circuits running over them.
+
+  Streams. Streams are specific conversations between an AP and an exit.
+  Streams are multiplexed over circuits.
+
+  Cells. Some connections, specifically OR and OP connections, speak
+  "cells". This means that data over that connection is bundled into 256
+  byte packets (8 bytes of header and 248 bytes of payload). Each cell has
+  a type, or "command", which indicates what it's for.
+
+Robustness features.
+
+[XXX no longer up to date]
+ Bandwidth throttling. Each cell-speaking connection has a maximum
+  bandwidth it can use, as specified in the routers.or file. Bandwidth
+  throttling can occur on both the sender side and the receiving side. If
+  the LinkPadding option is on, the sending side sends cells at regularly
+  spaced intervals (e.g., a connection with a bandwidth of 25600B/s would
+  queue a cell every 10ms). The receiving side protects against misbehaving
+  servers that send cells more frequently, by using a simple token bucket:
+
+  Each connection has a token bucket with a specified capacity. Tokens are
+  added to the bucket each second (when the bucket is full, new tokens
+  are discarded.) Each token represents permission to receive one byte
+  from the network --- to receive a byte, the connection must remove a
+  token from the bucket. Thus if the bucket is empty, that connection must
+  wait until more tokens arrive. The number of tokens we add enforces a
+  longterm average rate of incoming bytes, yet we still permit short-term
+  bursts above the allowed bandwidth. Currently bucket sizes are set to
+  ten seconds worth of traffic.
+
+  The bandwidth throttling uses TCP to push back when we stop reading.
+  We extend it with token buckets to allow more flexibility for traffic
+  bursts.
+
+ Data congestion control. Even with the above bandwidth throttling,
+  we still need to worry about congestion, either accidental or intentional.
+  If a lot of people make circuits into same node, and they all come out
+  through the same connection, then that connection may become saturated
+  (be unable to send out data cells as quickly as it wants to). An adversary
+  can make a 'put' request through the onion routing network to a webserver
+  he owns, and then refuse to read any of the bytes at the webserver end
+  of the circuit. These bottlenecks can propagate back through the entire
+  network, mucking up everything.
+
+  (See the tor-spec.txt document for details of how congestion control
+  works.)
+
+  In practice, all the nodes in the circuit maintain a receive window
+  close to maximum except the exit node, which stays around 0, periodically
+  receiving a sendme and reading more data cells from the webserver.
+  In this way we can use pretty much all of the available bandwidth for
+  data, but gracefully back off when faced with multiple circuits (a new
+  sendme arrives only after some cells have traversed the entire network),
+  stalled network connections, or attacks.
+
+  We don't need to reimplement full tcp windows, with sequence numbers,
+  the ability to drop cells when we're full etc, because the tcp streams
+  already guarantee in-order delivery of each cell. Rather than trying
+  to build some sort of tcp-on-tcp scheme, we implement this minimal data
+  congestion control; so far it's enough.
+
+ Router twins. In many cases when we ask for a router with a given
+  address and port, we really mean a router who knows a given key. Router
+  twins are two or more routers that share the same private key. We thus
+  give routers extra flexibility in choosing the next hop in the circuit: if
+  some of the twins are down or slow, it can choose the more available ones.
+
+  Currently the code tries for the primary router first, and if it's down,
+  chooses the first available twin.
+
+Coding conventions:
+
+ Log convention: use only these four log severities.
+
+  ERR is if something fatal just happened.
+  WARNING is something bad happened, but we're still running. The
+    bad thing is either a bug in the code, an attack or buggy
+    protocol/implementation of the remote peer, etc. The operator should
+    examine the bad thing and try to correct it.
+  (No error or warning messages should be expected. I expect most people
+    to run on -l warning eventually. If a library function is currently
+    called such that failure always means ERR, then the library function
+    should log WARNING and let the caller log ERR.)
+  INFO means something happened (maybe bad, maybe ok), but there's nothing
+    you need to (or can) do about it.
+  DEBUG is for everything louder than INFO.
 
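Reviewer note on the "Bandwidth throttling" paragraph in the hunk above: the token bucket and the LinkPadding interval it describes can be sketched roughly as follows. This is only an illustration of the prose, not the code in the tree. The struct, the function names, the BURST_SECONDS constant, and the choice to start with a full bucket are all assumptions made for this sketch. The 256-byte cell size, the once-per-second refill, and the ten-second bucket size come from the text.

```c
#include <assert.h>
#include <stddef.h>

#define CELL_LEN 256       /* bytes per cell, as stated in the "Cells" paragraph */
#define BURST_SECONDS 10   /* bucket holds ten seconds' worth of traffic */

/* Hypothetical per-connection state; not the real connection struct. */
struct token_bucket {
  size_t rate;      /* tokens (bytes) added per second */
  size_t capacity;  /* maximum number of tokens the bucket can hold */
  size_t tokens;    /* tokens currently available */
};

static void bucket_init(struct token_bucket *b, size_t bytes_per_sec) {
  assert(bytes_per_sec > 0);
  b->rate = bytes_per_sec;
  b->capacity = bytes_per_sec * BURST_SECONDS;
  b->tokens = b->capacity;  /* assumption: a new connection starts full */
}

/* Call once per second: add tokens, discarding whatever does not fit. */
static void bucket_refill(struct token_bucket *b) {
  size_t room = b->capacity - b->tokens;
  b->tokens += (b->rate < room) ? b->rate : room;
}

/* Spend one token per byte. Returns how many of the `want` bytes may be
 * read right now; 0 means the connection must wait for the next refill. */
static size_t bucket_consume(struct token_bucket *b, size_t want) {
  size_t n = (want < b->tokens) ? want : b->tokens;
  b->tokens -= n;
  return n;
}

/* Sender side with LinkPadding on: milliseconds between queued cells. */
static unsigned padding_interval_ms(size_t bytes_per_sec) {
  assert(bytes_per_sec > 0);
  return (unsigned)(1000u * CELL_LEN / bytes_per_sec);
}
```

With the 25600 B/s figure from the text, padding_interval_ms(25600) works out to 256000 / 25600 = 10, which matches the "every 10ms" example, and the bucket capacity comes to 256000 bytes, or 1000 cells of burst.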