@@ -56,7 +56,7 @@ the distant future, stuff may have changed.)
   
    [General-purpose modules]
 
-     or.h -- Common header file: includes everything, define everything.
+     or.h -- Common header file: include everything, define everything.
 
      buffers.c -- Implements a generic buffer interface.  Buffers are 
         fairly opaque string holders that can read to or flush from:
@@ -65,7 +65,7 @@ the distant future, stuff may have changed.)
         Also implements parsing functions to read HTTP and SOCKS commands
         from buffers.
 
-     tree.h -- A splay tree implementatio by Niels Provos.  Used only by
+     tree.h -- A splay tree implementation by Niels Provos.  Used only by
         dns.c.
 
      config.c -- Code to parse and validate the configuration file.
@@ -88,7 +88,7 @@ the distant future, stuff may have changed.)
         results; clients use routers.c to parse them.
 
      dirserv.c -- Code to manage directory contents and generate
-        directories. [Directory only] 
+        directories. [Directory server only] 
 
      routers.c -- Code to parse directories and router descriptors; and to
         generate a router descriptor corresponding to this OR's
@@ -109,7 +109,7 @@ the distant future, stuff may have changed.)
 
      connection_edge.c -- Code used only by edge connections.
 
-     command.c -- Code to handle specific cell types. [OR only]
+     command.c -- Code to handle specific cell types.
 
      connection_or.c -- Code to implement cell-speaking connections.
 
@@ -151,29 +151,29 @@ the distant future, stuff may have changed.)
      [Edge connections]
        CONN_TYPE_EXIT -- A TCP connection from an onion router to a
           Stream's destination. [OR only]
-       CONN_TYPE_AP -- A SOCKS proxy connection from the end user to the
-          onion proxy.  [OP only]
+       CONN_TYPE_AP -- A SOCKS proxy connection from the end user
+          application to the onion proxy.  [OP only]
 
      [Listeners]
        CONN_TYPE_OR_LISTENER [OR only]
        CONN_TYPE_AP_LISTENER [OP only]
-       CONN_TYPE_DIR_LISTENER [Directory only]
+       CONN_TYPE_DIR_LISTENER [Directory server only]
           -- Bound network sockets, waiting for incoming connections.
 
      [Internal]
        CONN_TYPE_DNSWORKER -- Connection from the main process to a DNS
-          worker. [OR only]
+          worker process. [OR only]
        
        CONN_TYPE_CPUWORKER -- Connection from the main process to a CPU
-          worker. [OR only]
+          worker process. [OR only]
 
    Connection states are documented in or.h.
 
    Every connection has two associated input and output buffers.
-   Listeners don't use them.  With other connections, incoming data is
-   appended to conn->inbuf, and outgoing data is taken from the front of
-   conn->outbuf.  Connections differ primarily in the functions called
-   to fill and drain these buffers.
+   Listeners don't use them.  For non-listener connections, incoming
+   data is appended to conn->inbuf, and outgoing data is taken from the
+   front of conn->outbuf.  Connections differ primarily in the functions
+   called to fill and drain these buffers.
 
 1.3. All about circuits.
 
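The inbuf/outbuf paragraph in the hunk above describes data being appended at the tail of one buffer and taken from the front of another. A minimal sketch of that first-in, first-out behavior follows. The names demo_buf_t, demo_buf_append, and demo_buf_drain are hypothetical and invented for this illustration; they are not the interface that buffers.c exposes.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical FIFO byte buffer for illustration only; this is not the
 * buffer type implemented in buffers.c. */
typedef struct demo_buf_t {
  char *mem;   /* heap storage, NULL while empty and unallocated */
  size_t len;  /* bytes currently held */
  size_t cap;  /* bytes allocated */
} demo_buf_t;

/* Append n bytes at the tail, the way incoming data is appended to an
 * inbuf.  Returns 0 on success, -1 if memory could not be allocated. */
static int demo_buf_append(demo_buf_t *b, const char *data, size_t n)
{
  assert(b != NULL);
  if (n == 0)
    return 0;
  if (b->len + n > b->cap) {
    size_t newcap = b->cap ? b->cap : 64;
    char *p;
    while (newcap < b->len + n)
      newcap *= 2;
    p = realloc(b->mem, newcap);
    if (p == NULL)
      return -1;
    b->mem = p;
    b->cap = newcap;
  }
  memcpy(b->mem + b->len, data, n);
  b->len += n;
  return 0;
}

/* Remove up to n bytes from the front, the way outgoing data is taken
 * from the front of an outbuf.  Returns the number of bytes copied. */
static size_t demo_buf_drain(demo_buf_t *b, char *out, size_t n)
{
  assert(b != NULL);
  if (n > b->len)
    n = b->len;
  if (n == 0)
    return 0;
  memcpy(out, b->mem, n);
  /* Shift the remaining bytes down so the next drain starts at offset 0. */
  memmove(b->mem, b->mem + n, b->len - n);
  b->len -= n;
  return n;
}
```

Shifting the remainder with memmove keeps the sketch short; a production buffer would more likely track a read offset to avoid copying on every drain.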
@@ -192,9 +192,10 @@ the distant future, stuff may have changed.)
 
 1.4. Asynchronous IO and the main loop.
 
-   Tor uses the poll(2) system call [or a substitute based on select(2)]
-   to handle nonblocking (asynchonous) IO.  If you're not familiar with
-   nonblocking IO, check out the links at the end of this document.
+   Tor uses the poll(2) system call (or it wraps select(2) to act like
+   poll, if poll is not available) to handle nonblocking (asynchronous)
+   IO.  If you're not familiar with nonblocking IO, check out the links
+   at the end of this document.
         
    All asynchronous logic is handled in main.c.  The functions
    'connection_add', 'connection_set_poll_socket', and 'connection_remove'
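For readers new to poll(2), the following minimal sketch shows the basic pattern the hunk above refers to: ask the kernel whether a descriptor is readable, and only then call read(2). The helper names demo_wait_readable and demo_read_if_ready are hypothetical, and the sketch watches a single descriptor, whereas a real main loop polls an array of them and also handles EINTR.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: wait up to timeout_ms for fd to become readable.
 * Returns 1 if POLLIN was reported, 0 if it was not (for example on
 * timeout), and -1 if poll() itself failed. */
static int demo_wait_readable(int fd, int timeout_ms)
{
  struct pollfd pfd;
  int r;
  assert(fd >= 0);
  pfd.fd = fd;
  pfd.events = POLLIN;   /* we only care about readability here */
  pfd.revents = 0;
  r = poll(&pfd, 1, timeout_ms);
  if (r < 0)
    return -1;
  if (r == 0)
    return 0;
  return (pfd.revents & POLLIN) ? 1 : 0;
}

/* Hypothetical helper: read only when poll() says data is waiting, so
 * the caller never blocks inside read().  Returns the byte count from
 * read(), -1 on error, or -2 if nothing became readable in time. */
static ssize_t demo_read_if_ready(int fd, char *out, size_t n, int timeout_ms)
{
  int ready = demo_wait_readable(fd, timeout_ms);
  if (ready < 0)
    return -1;
  if (ready == 0)
    return -2;
  return read(fd, out, n);
}
```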
@@ -205,18 +206,23 @@ the distant future, stuff may have changed.)
    individual connections.)
 
    To trap read and write events, connections call the functions
-   'connection_{is|stop|start}_{reading|writing}'.
-
-   When connections get events, main.c calls conn_read and conn_write.
-   These functions dispatch events to connection_handle_read and
-   connection_handle_write as appropriate.
-
-   When connection need to be closed, they can respond in two ways.  Most
-   simply, they can make connection_handle_* to return an error (-1),
-   which will make conn_{read|write} close them.  But if the connection
-   needs to stay around [XXXX explain why] until the end of the current
-   iteration of the main loop, it marks itself for closing by setting
-   conn->connection_marked_for_close.
+   'connection_{is|stop|start}_{reading|writing}'. If you want
+   to completely reset the events you're watching for, use
+   'connection_watch_events'.
+
+   Every time poll() finishes, main.c calls conn_read and conn_write on
+   every connection. These functions dispatch events that have something
+   to read to connection_handle_read, and events that have something to
+   write to connection_handle_write, respectively.
+
+   When connections need to be closed, they can respond in two ways.  Most
+   simply, they can make connection_handle_* return an error (-1),
+   which will make conn_{read|write} close them.  But if it's not
+   convenient to return -1 (for example, processing one connection causes
+   you to realize that a second one should close), then you can also
+   mark a connection to close by setting conn->marked_for_close. Marked
+   connections will be closed at the end of the current iteration of
+   the main loop.
 
    The main loop handles several other operations: First, it checks
    whether any signals have been received that require a response (HUP,
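The two ways of closing a connection described in the hunk above (returning -1 from a handler, or setting marked_for_close and letting the loop sweep later) can be sketched as below. Everything here is a simplified assumption made for illustration: demo_conn_t, demo_loop_iteration, and the handler signature are hypothetical, and the choice to skip already-marked connections during the pass is a decision of this sketch, not something the text states.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified connection record. */
typedef struct demo_conn_t {
  int id;
  int marked_for_close;  /* set to 1 to request closing at end of pass */
  int closed;            /* set by the loop once the connection is closed */
} demo_conn_t;

/* Assumed handler shape: return -1 to have the caller close conn right
 * away.  The handler may also mark other connections for closing. */
typedef int (*demo_handler_fn)(demo_conn_t *conn, demo_conn_t *all, size_t n);

/* One pass of a toy main loop.  Returns how many connections it closed. */
static size_t demo_loop_iteration(demo_conn_t *conns, size_t n,
                                  demo_handler_fn handle)
{
  size_t i, closed_now = 0;
  assert(conns != NULL || n == 0);
  for (i = 0; i < n; ++i) {
    if (conns[i].closed || conns[i].marked_for_close)
      continue;  /* sketch choice: do not service doomed connections */
    if (handle(&conns[i], conns, n) < 0) {
      conns[i].closed = 1;  /* first way: handler returned an error */
      ++closed_now;
    }
  }
  /* Second way: close everything that was marked during this pass. */
  for (i = 0; i < n; ++i) {
    if (conns[i].marked_for_close && !conns[i].closed) {
      conns[i].closed = 1;
      ++closed_now;
    }
  }
  return closed_now;
}

/* Example handler: connection 0 decides connection 2 must go, and
 * connection 1 hits an error of its own. */
static int demo_example_handler(demo_conn_t *conn, demo_conn_t *all, size_t n)
{
  if (conn->id == 0 && n > 2)
    all[2].marked_for_close = 1;
  if (conn->id == 1)
    return -1;
  return 0;
}
```

Deferring the close to the end of the pass means a handler never frees a connection that the loop is still about to visit.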
@@ -227,23 +233,26 @@ the distant future, stuff may have changed.)
    that were blocking for more bandwidth, and maintaining statistics.
 
    A word about TLS: Using TLS on OR connections complicates matters in
-   two ways.  First, a TLS stream has its own read buffer independent of
-   the connection's read buffer.  (TLS needs to read an entire frame from
+   two ways.
+   First, a TLS stream has its own read buffer independent of the
+   connection's read buffer.  (TLS needs to read an entire frame from
    the network before it can decrypt any data.  Thus, trying to read 1
-   byte from TLS can require that several KB be read from the network and
-   decrypted.  The extra data is stored in TLS's decrypt buffer.)  Second,
-   the TLS stream's events do not correspond directly to network events:
-   sometimes, before a TLS stream can read, the network must be ready to
-   write -- or vice versa.
-
-   [XXXX describe the consequences of this for OR connections.]
+   byte from TLS can require that several KB be read from the network
+   and decrypted.  The extra data is stored in TLS's decrypt buffer.)
+   Because the data hasn't been read by tor (it's still inside the TLS),
+   this means that sometimes a connection "has stuff to read" even when
+   poll() didn't return POLLIN. The tor_tls_get_pending_bytes function is
+   used in main.c to detect TLS objects with non-empty internal buffers.
+   Second, the TLS stream's events do not correspond directly to network
+   events: sometimes, before a TLS stream can read, the network must be
+   ready to write -- or vice versa.
 
 1.5. How data flows (An illustration.)
 
-   Suppose an OR receives 50 bytes along an OR connection.  These 50 bytes
-   complete a data relay cell, which gets decrypted and delivered to an
-   edge connection.  Here we give a possible call sequence for the
-   delivery of this data.
+   Suppose an OR receives 256 bytes along an OR connection.  These 256
+   bytes turn out to be a data relay cell, which gets decrypted and
+   delivered to an edge connection.  Here we give a possible call sequence
+   for the delivery of this data.
 
    (This may be outdated quickly.)
 
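The TLS paragraph in the hunk above says a connection can have data to read even when poll() reported no POLLIN, because decrypted bytes may already sit inside the TLS layer. A minimal sketch of that readiness test follows. It deliberately does not call any real TLS code: demo_tls_t and its pending field are hypothetical stand-ins for whatever tor_tls_get_pending_bytes reports, and demo_has_stuff_to_read is an invented name.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>

/* Hypothetical stand-in for a TLS object.  'pending' plays the role of
 * the count of already-decrypted bytes buffered inside the TLS layer. */
typedef struct demo_tls_t {
  size_t pending;
} demo_tls_t;

/* Returns 1 if the connection should be treated as readable: either the
 * socket reported POLLIN, or the TLS layer already holds decrypted data
 * that poll() cannot see.  Pass tls == NULL for a non-TLS connection. */
static int demo_has_stuff_to_read(const struct pollfd *pfd,
                                  const demo_tls_t *tls)
{
  assert(pfd != NULL);
  if (pfd->revents & POLLIN)
    return 1;
  if (tls != NULL && tls->pending > 0)
    return 1;
  return 0;
}
```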
@@ -264,22 +273,29 @@ the distant future, stuff may have changed.)
                  makes sure the circuit is live, then passes the cell to:
            circuit_deliver_relay_cell -- Passes the cell to each of: 
             relay_crypt -- Strips a layer of encryption from the cell and
-                 notice that the cell is for local delivery.
+                 notices that the cell is for local delivery.
             connection_edge_process_relay_cell -- extracts the cell's
                  relay command, and makes sure the edge connection is
                  open.  Since it has a DATA cell and an open connection,
                  calls:
-             circuit_consider_sending_sendme -- [XXX]
+             circuit_consider_sending_sendme -- check if the total number
+                 of cells received by all streams on this circuit is
+                 enough that we should send back an acknowledgement
+                 (requesting that more cells be sent to any stream).
              connection_write_to_buf -- To place the data on the outgoing
                  buffer of the correct edge connection, by calling:
               connection_start_writing -- To tell the main poll loop about
                  the pending data.
               write_to_buf -- To actually place the outgoing data on the
                  edge connection.
-             connection_consider_sending_sendme -- [XXX]
+             connection_consider_sending_sendme -- if the outbuf waiting
+                 to flush to the exit connection is not too full, check
+                 if the total number of cells received on this stream
+                 is enough that we should send back an acknowledgement
+                 (requesting that more cells be sent to this stream).
 
-   [In a subsequent iteration, main notices that the edge connection is
-    ready for writing.]
+   In a subsequent iteration, main notices that the edge connection is
+   ready for writing:
 
    do_main_loop -- Calls poll(2), receives a POLLOUT event on a struct
                  pollfd, then calls:
@@ -294,7 +310,12 @@ the distant future, stuff may have changed.)
                  calls:
         connection_stop_writing -- Tells the main poll loop that this
                  connection has no more data to write.
-        connection_consider_sending_sendme -- [XXX]
+        connection_consider_sending_sendme -- now that the outbuf
+                 is empty, check again if the total number of cells
+                 received on this stream is enough that we should send
+                 back an acknowledgement (requesting that more cells be
+                 sent to this stream).
+
 
 1.6. Routers, descriptors, and directories
 
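The sendme steps in the call sequence above amount to window-based flow control: count received cells, and once enough have arrived (and the outbuf is not too full) send an acknowledgement that invites more. The sketch below illustrates that idea only. The names are hypothetical, and the window size of 500 and increment of 50 are illustrative values chosen for this example, not constants taken from the text above.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative values only; assumed for this sketch. */
#define DEMO_WINDOW_START     500
#define DEMO_WINDOW_INCREMENT 50

/* Hypothetical per-stream (or per-circuit) receive window. */
typedef struct demo_window_t {
  int deliver_window;  /* cells the peer may still send before an ack */
} demo_window_t;

/* Call once for every data cell received. */
static void demo_note_cell_received(demo_window_t *w)
{
  assert(w != NULL);
  --w->deliver_window;
}

/* Decide whether to acknowledge.  If the outbuf holds more than
 * outbuf_limit bytes, send nothing, so the peer is not invited to send
 * data we cannot flush yet.  Otherwise, send one acknowledgement per
 * full increment consumed and reopen the window by that amount.
 * Returns the number of acknowledgements that should be sent. */
static int demo_consider_sending_sendme(demo_window_t *w,
                                        size_t outbuf_len,
                                        size_t outbuf_limit)
{
  int sent = 0;
  assert(w != NULL);
  if (outbuf_len > outbuf_limit)
    return 0;
  while (w->deliver_window <= DEMO_WINDOW_START - DEMO_WINDOW_INCREMENT) {
    w->deliver_window += DEMO_WINDOW_INCREMENT;
    ++sent;
  }
  return sent;
}
```

Calling the check again after the outbuf drains, as the last step of the sequence does, lets a stream that was held back by a full buffer catch up on its acknowledgements.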
@@ -302,7 +323,7 @@ the distant future, stuff may have changed.)
    several reasons:
        - OPs need to establish connections and circuits to ORs.
        - ORs need to establish connections to other ORs.
-       - OPs and ORs need to fetch directories from a directory servers.
+       - OPs and ORs need to fetch directories from a directory server.
        - ORs need to upload their descriptors to directory servers.
        - Directory servers need to know which ORs are allowed onto the
          network, what the descriptors are for those ORs, and which of
@@ -321,8 +342,8 @@ the distant future, stuff may have changed.)
    'desc_routerinfo' and 'descriptor' static variables in routers.c.
 
    Additionally, a directory server keeps track of a list of the
-   router descriptors it knows in a separte list in dirserv.c.  It
-   uses this list, plus the open connections in main.c, to build
+   router descriptors it knows in a separate list in dirserv.c.  It
+   uses this list, checking which OR connections are open, to build
    directories.
 
 1.7. Data model
@@ -372,14 +393,14 @@ the distant future, stuff may have changed.)
   Log convention: use only these four log severities.
 
     ERR is if something fatal just happened.
-    WARNING is something bad happened, but we're still running. The
+    WARN if something bad happened, but we're still running. The
       bad thing is either a bug in the code, an attack or buggy
       protocol/implementation of the remote peer, etc. The operator should
       examine the bad thing and try to correct it.
     (No error or warning messages should be expected during normal OR or OP
-      operation.. I expect most people to run on -l warning eventually. If a
+      operation. I expect most people to run on -l warn eventually. If a
       library function is currently called such that failure always means
-      ERR, then the library function should log WARNING and let the caller
+      ERR, then the library function should log WARN and let the caller
       log ERR.)
     INFO means something happened (maybe bad, maybe ok), but there's nothing
       you need to (or can) do about it.
@@ -397,7 +418,7 @@ the distant future, stuff may have changed.)
 
     See http://freehaven.net/tor/
         http://freehaven.net/tor/cvs/doc/tor-spec.txt
-         http://freehaven.net/tor/cvs/doc/tor-dessign.tex
+         http://freehaven.net/tor/cvs/doc/tor-design.tex
         http://freehaven.net/tor/cvs/doc/FAQ
 
  About anonymity