| 123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312 | 
							
- 0. Useful tools.
 
- 0.0 The buildbot.
 
-   http://tor-buildbot.freehaven.net:8010/
 
-   - Down because nickm isn't running services at home any more. ioerror says
 
-     he will resurrect it.
 
- 0.1. Useful command-lines that are non-trivial to reproduce but can
 
- help with tracking bugs or leaks.
 
- 0.1.1. Dmalloc
 
- dmalloc -l ~/dmalloc.log
 
- (run the commands it tells you)
 
- ./configure --with-dmalloc
 
- 0.2.2. Valgrind
 
- valgrind --leak-check=yes --error-limit=no --show-reachable=yes src/or/tor
 
- (Note that if you get a zillion openssl warnings, you will also need to
 
-  pass --undef-value-errors=no to valgrind, or rebuild your openssl
 
-  with -DPURIFY.)
 
- 0.2. Running gcov for unit test coverage
 
-   make clean
 
-   make CFLAGS='-g -fprofile-arcs -ftest-coverage'
 
-   ./src/or/test
 
-   cd src/common; gcov *.[ch]
 
-   cd ../or; gcov *.[ch]
 
-   Then, look at the .gcov files.  '-' before a line means that the
 
-   compiler generated  no code for that line.  '######' means that the
 
-   line was never reached.  Lines with numbers were called that number
 
-   of times.
 
- 1. Coding conventions
 
- 1.0. Whitespace and C conformance
 
-   Invoke "make check-spaces" from time to time, so it can tell you about
 
-   deviations from our C whitespace style.  Generally, we use:
 
-     - Unix-style line endings
 
-     - K&R-style indentation
 
-     - No space before newlines
 
-     - A blank line at the end of each file
 
-     - Never more than one blank line in a row
 
-     - Always spaces, never tabs
 
-     - No more than 79-columns per line.
 
-     - Two spaces per indent.
 
-     - A space between control keywords and their corresponding paren
 
-       "if (x)", "while (x)", and "switch (x)", never "if(x)", "while(x)", or
 
-       "switch(x)".
 
-     - A space between anything and an open brace.
 
-     - No space between a function name and an opening paren. "puts(x)", not
 
-       "puts (x)".
 
-     - Function declarations at the start of the line.
 
-   We try hard to build without warnings everywhere.  In particular, if you're
 
-   using gcc, you should invoke the configure script with the option
 
-   "--enable-gcc-warnings".  This will give a bunch of extra warning flags to
 
-   the compiler, and help us find divergences from our preferred C style.
 
- 1.0.1. Getting emacs to edit Tor source properly.
 
-   Hi, folks!  Nick here.  I like to put the following snippet in my .emacs
 
-   file:
 
-     (add-hook 'c-mode-hook
 
-           (lambda ()
 
-             (font-lock-mode 1)
 
-             (set-variable 'show-trailing-whitespace t)
 
-             (let ((fname (expand-file-name (buffer-file-name))))
 
-               (cond
 
-                ((string-match "^/home/nickm/src/libevent" fname)
 
-                 (set-variable 'indent-tabs-mode t)
 
-                 (set-variable 'c-basic-offset 4)
 
-                 (set-variable 'tab-width 4))
 
-                ((string-match "^/home/nickm/src/tor" fname)
 
-                 (set-variable 'indent-tabs-mode nil)
 
-                 (set-variable 'c-basic-offset 2))
 
-                ((string-match "^/home/nickm/src/openssl" fname)
 
-                 (set-variable 'indent-tabs-mode t)
 
-                 (set-variable 'c-basic-offset 8)
 
-                 (set-variable 'tab-width 8))
 
-             ))))
 
-   You'll note that it defaults to showing all trailing whitespace.  The
 
-   "cond" test detects whether the file is one of a few C free software
 
-   projects that I often edit, and sets up the indentation level and tab
 
-   preferences to match what they want.
 
-   If you want to try this out, you'll need to change the filename regex
 
-   patterns to match where you keep your Tor files.
 
-   If you *only* use emacs to edit Tor, you could always just say:
 
-     (add-hook 'c-mode-hook
 
-           (lambda ()
 
-             (font-lock-mode 1)
 
-             (set-variable 'show-trailing-whitespace t)
 
-             (set-variable 'indent-tabs-mode nil)
 
-             (set-variable 'c-basic-offset 2)))
 
-   There is probably a better way to do this.  No, we are probably not going
 
-   to clutter the files with emacs stuff.
 
- 1.1. Details
 
-   Use tor_malloc, tor_free, tor_strdup, and tor_gettimeofday instead of their
 
-   generic equivalents.  (They always succeed or exit.)
 
-   You can get a full list of the compatibility functions that Tor provides by
 
-   looking through src/common/util.h and src/common/compat.h.  You can see the
 
-   available containers in src/common/containers.h.  You should probably
 
-   familiarize yourself with these modules before you write too much code,
 
-   or else you'll wind up reinventing the wheel.
 
-   Use 'INLINE' instead of 'inline', so that we work properly on Windows.
 
- 1.2. Calling and naming conventions
 
-   Whenever possible, functions should return -1 on error and 0 on success.
 
-   For multi-word identifiers, use lowercase words combined with
 
-   underscores. (e.g., "multi_word_identifier").  Use ALL_CAPS for macros and
 
-   constants.
 
-   Typenames should end with "_t".
 
-   Function names should be prefixed with a module name or object name.  (In
 
-   general, code to manipulate an object should be a module with the same
 
-   name as the object, so it's hard to tell which convention is used.)
 
-   Functions that do things should have imperative-verb names
 
-   (e.g. buffer_clear, buffer_resize); functions that return booleans should
 
-   have predicate names (e.g. buffer_is_empty, buffer_needs_resizing).
 
-   If you find that you have four or more possible return code values, it's
 
-   probably time to create an enum.  If you find that you are passing three or
 
-   more flags to a function, it's probably time to create a flags argument
 
-   that takes a bitfield.
 
- 1.3. What To Optimize
 
-   Don't optimize anything if it's not in the critical path.  Right now,
 
-   the critical path seems to be AES, logging, and the network itself.
 
-   Feel free to do your own profiling to determine otherwise.
 
- 1.4. Log conventions
 
-   http://wiki.noreply.org/noreply/TheOnionRouter/TorFAQ#LogLevels
 
-   No error or warning messages should be expected during normal OR or OP
 
-   operation.
 
-   If a library function is currently called such that failure always
 
-   means ERR, then the library function should log WARN and let the caller
 
-   log ERR.
 
-   [XXX Proposed convention: every message of severity INFO or higher should
 
-   either (A) be intelligible to end-users who don't know the Tor source; or
 
-   (B) somehow inform the end-users that they aren't expected to understand
 
-   the message (perhaps with a string like "internal error").  Option (A) is
 
-   to be preferred to option (B). -NM]
 
- 1.5. Doxygen
 
-   We use the 'doxygen' utility to generate documentation from our
 
-   source code. Here's how to use it:
 
-   1. Begin every file that should be documented with
 
-          /**
 
-           * \file filename.c
 
-           * \brief Short description of the file.
 
-           **/
 
-      (Doxygen will recognize any comment beginning with /** as special.)
 
-   2. Before any function, structure, #define, or variable you want to
 
-      document, add a comment of the form:
 
-         /** Describe the function's actions in imperative sentences.
 
-          *
 
-          * Use blank lines for paragraph breaks
 
-          *   - and
 
-          *   - hyphens
 
-          *   - for
 
-          *   - lists.
 
-          *
 
-          * Write <b>argument_names</b> in boldface.
 
-          *
 
-          * \code
 
-          *     place_example_code();
 
-          *     between_code_and_endcode_commands();
 
-          * \endcode
 
-          */
 
-   3. Make sure to escape the characters "<", ">", "\", "%" and "#" as "\<",
 
-      "\>", "\\", "\%", and "\#".
 
-   4. To document structure members, you can use two forms:
 
-        struct foo {
 
-          /** You can put the comment before an element; */
 
-          int a;
 
-          int b; /**< Or use the less-than symbol to put the comment
 
-                  * after the element. */
 
-        };
 
-   5. To generate documentation from the Tor source code, type:
 
-      $ doxygen -g
 
-      To generate a file called 'Doxyfile'.  Edit that file and run
 
-      'doxygen' to generate the API documentation.
 
-   6. See the Doxygen manual for more information; this summary just
 
-      scratches the surface.
 
- 1.5.1. Doxygen comment conventions
 
-   Say what functions do as a series of one or more imperative sentences, as
 
-   though you were telling somebody how to be the function.  In other words,
 
-   DO NOT say:
 
-      /** The strtol function parses a number.
 
-       *
 
-       * nptr -- the string to parse.  It can include whitespace.
 
-       * endptr -- a string pointer to hold the first thing that is not part
 
-       *    of the number, if present.
 
-       * base -- the numeric base.
 
-       * returns: the resulting number.
 
-       */
 
-      long strtol(const char *nptr, char **nptr, int base);
 
-   Instead, please DO say:
 
-      /** Parse a number in radix <b>base</b> from the string <b>nptr</b>,
 
-       * and return the result.  Skip all leading whitespace.  If
 
-       * <b>endptr</b> is not NULL, set *<b>endptr</b> to the first character
 
-       * after the number parsed.
 
-       **/
 
-      long strtol(const char *nptr, char **nptr, int base);
 
-   Doxygen comments are the contract in our abstraction-by-contract world: if
 
-   the functions that call your function rely on it doing something, then your
 
-   function should mention that it does that something in the documentation.
 
-   If you rely on a function doing something beyond what is in its
 
-   documentation, then you should watch out, or it might do something else
 
-   later.
 
- 2. Code notes
 
- 2.1. Dataflows
 
- 2.1.1. How Incoming data is handled
 
- There are two paths for data arriving at Tor over the network: regular
 
- TCP data, and DNS.
 
- 2.1.1.1. TCP.
 
- When Tor takes information over the network, it uses the functions
 
- read_to_buf() and read_to_buf_tls() in buffers.c.  These read from a
 
- socket or an SSL* into a buffer_t, which is an mbuf-style linkedlist
 
- of memory chunks.
 
- read_to_buf() and read_to_buf_tls() are called only from
 
- connection_read_to_buf() in connection.c.  It takes a connection_t
 
- pointer, and reads data into it over the network, up to the
 
- connection's current bandwidth limits.  It places that data into the
 
- "inbuf" field of the connection, and then:
 
-   - Adjusts the connection's want-to-read/want-to-write status as
 
-     appropriate.
 
-   - Increments the read and written counts for the connection as
 
-     appropriate.
 
-   - Adjusts bandwidth buckets as appropriate.
 
- connection_read_to_buf() is called only from connection_handle_read().
 
- The connection_handle_read() function is called whenever libevent
 
- decides (based on select, poll, epoll, kqueue, etc) that there is data
 
- to read from a connection.  If any data is read,
 
- connection_handle_read() calls connection_process_inbuf() to see if
 
- any of the data can be processed.  If the connection was closed,
 
- connection_handle_read() calls connection_reached_eof().
 
- Connection_process_inbuf() and connection_reached_eof() both dispatch
 
- based on the connection type to determine what to do with the data
 
- that's just arrived on the connection's inbuf field.  Each type of
 
- connection has its own version of these functions.  For example,
 
- directory connections process incoming data in
 
- connection_dir_process_inbuf(), while OR connections process incoming
 
- data in connection_or_process_inbuf().  These
 
- connection_*_process_inbuf() functions extract data from the
 
- connection's inbuf field (a buffer_t), using functions from buffers.c.
 
- Some of these accessor functions are straightforward data extractors
 
- (like fetch_from_buf()); others do protocol-specific parsing.
 
- 2.1.1.2. DNS
 
- Tor launches (and optionally accepts) DNS requests using the code in
 
- eventdns.c, which is a copy of libevent's evdns.c.  (We don't use
 
- libevent's version because it is not yet in the versions of libevent
 
- all our users have.)  DNS replies are read in nameserver_read();
 
- DNS queries are read in server_port_read().
 
 
  |