HACKING 24 KB

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366367368369370371372373374375376377378379380381382383384385386387388389390391392393394395396397398399400401402403404405406407408409410411412413414415416417418419420421422423424425426427428429430431432433434435436437438439440441442443444445446447448449450451452453454455456457458459460461462463464465466467468469470471472473474475476477478479480481482483484485486487488489490491492493494495496497498499500501502503504505506507508509510511512513514515516517518519520521522523524525526527528529530531532533534535536537538539540541542543544545546547548549550551552553554555556557558559560561562563564565566567568569570571572573574575576577578579580
  1. Guide to Hacking Tor
  2. (As of 8 October 2003, this was all accurate. If you're reading this in
  3. the distant future, stuff may have changed.)
  4. 0. Intro and required reading
  5. Onion Routing is still very much in development stages. This document
  6. aims to get you started in the right direction if you want to understand
  7. the code, add features, fix bugs, etc.
  8. Read the README file first, so you can get familiar with the basics of
  9. installing and running an onion router.
  10. Then, skim some of the introductory materials in tor-design.pdf,
  11. tor-spec.txt, and the Tor FAQ to learn more about how the Tor protocol
  12. is supposed to work. This document will assume you know about Cells,
  13. Circuits, Streams, Connections, Onion Routers, and Onion Proxies.
  14. 1. Code organization
  15. 1.1. The modules
  16. The code is divided into two directories: ./src/common and ./src/or.
  17. The "common" directory contains general purpose utility functions not
  18. specific to onion routing. The "or" directory implements all
  19. onion-routing and onion-proxy specific functionality.
  20. Files in ./src/common:
  21. aes.[ch] -- Implements the AES cipher (with 128-bit keys and blocks),
  22. and a counter-mode stream cipher on top of AES. This code is
  23. taken from the main Rijndael distribution. (We include this
  24. because many people are running older versions of OpenSSL without
  25. AES support.)
  26. compat.[ch] -- Wrappers to make calls more portable. This code defines
  27. functions such as tor_malloc, tor_snprintf, get/set various data types,
  28. renaming, setting socket options, switching user IDs. It is basically
  29. where the non-portable items are conditionally included depending on
  30. the platform.
  31. container.[ch] -- Implements a smart list which is a resizable array along
  32. with helper functions to use on these lists. Also includes a
  33. splay-tree implementation of the string-to-void* map.
  34. crypto.[ch] -- Wrapper functions to present a consistent interface to
  35. public-key and symmetric cryptography operations from OpenSSL.
  36. log.[ch] -- Tor's logging subsystem.
  37. strlcat.c -- Safer, size-bounded string concatenation. Use this instead
  38. of strncat because it has a safer API. Included for platforms that
  39. that don't already ship this code.
  40. strlcpy.c -- Safer, size-bounded string copying. Use this instead of
  41. strncpy because it is a safer API which guarantees to NUL terminate.
  42. Included for platforms that don't already ship this code.
  43. test.h -- Macros used by unit tests.
  44. torgzip.[ch] -- A simple in-memory gzip implementation.
  45. torint.h -- Provides missing [u]int*_t types for environments that
  46. don't have stdint.h.
  47. tortls.[ch] -- Wrapper functions to present a consistent interface to
  48. TLS, SSL, and X.509 functions from OpenSSL.
  49. util.[ch] -- Miscellaneous portability and convenience functions.
  50. Files in ./src/or:
  51. [General-purpose modules]
  52. or.h -- Common header file: include everything, define everything.
  53. buffers.c -- Implements a generic buffer interface. Buffers are
  54. fairly opaque string holders that can read to or flush from:
  55. memory, file descriptors, or TLS connections.
  56. Also implements parsing functions to read HTTP and SOCKS commands
  57. from buffers.
  58. tree.h -- A splay tree implementation by Niels Provos. Used by
  59. dns.c for dns caching at exits, and by connection_edge.c for dns
  60. caching at clients.
  61. config.c -- Code to parse and validate the configuration file.
  62. [Background processing modules]
  63. cpuworker.c -- Implements a farm of 'CPU worker' processes to perform
  64. CPU-intensive tasks in the background, so as not interrupt the
  65. onion router. (OR only)
  66. dns.c -- Implements a farm of 'DNS worker' processes to perform DNS
  67. lookups for onion routers and cache the results. [This needs to
  68. be done in the background because of the lack of a good,
  69. ubiquitous asynchronous DNS implementation.] (OR only)
  70. [Directory-related functionality.]
  71. directory.c -- Code to send and fetch directories and router
  72. descriptors via HTTP. Directories use dirserv.c to generate the
  73. results; clients use routers.c to parse them.
  74. dirserv.c -- Code to manage directory contents and generate
  75. directories. [Directory server only]
  76. router.c -- Code to parse directories and router descriptors; and to
  77. generate a router descriptor corresponding to this OR's
  78. capabilities. Also presents some high-level interfaces for
  79. managing an OR or OP's view of the directory.
  80. [Circuit-related modules.]
  81. circuitbuild.c -- Creates circuits.
  82. circuitlist.c -- Manage the global circuit list.
  83. circuituse.c -- Launch the right type of circuits and attach streams
  84. to them.
  85. onion.c -- Code to generate and respond to "onion skins".
  86. relay.c -- Handle relay cell encryption/decryption along with packaging
  87. and receiving from circuits.
  88. [Core protocol implementation.]
  89. command.c -- Code to handle specific cell types.
  90. connection.c -- Code used in common by all connection types. See
  91. 1.2. below for more general information about connections.
  92. connection_edge.c -- Code used only by edge connections.
  93. connection_or.c -- Code to implement cell-speaking connections.
  94. [Hidden services]
  95. rendclient.c -- Client code to access location-hidden services. This
  96. allows clients and servers to run services and have people connect
  97. without either end knowing who they are connecting to.
  98. rendcommon.c -- Rendevzous implementation: Shared code between
  99. introducers, services, clients, and rendezvous points.
  100. rendmid.c -- Implement introduction and rendezvous points.
  101. rendservice.c -- Hidden-service side of rendezvous functionality.
  102. [Reputation]
  103. rephist.c -- Basic history functionality for reputation module.
  104. [Router lists]
  105. routerlist.c -- Code to maintain and access global list of routerinfos for
  106. known servers.
  107. routerparse.c -- Code to parse and validate router descriptors and
  108. directories.
  109. [Bandwidth and GUI]
  110. control.c -- Implementation of Tor's control socket interface. Useful
  111. for designing GUIs to interact with Tor.
  112. hibernate.c -- Functions to close listeners, stop allowing new circuits,
  113. and so on in preparation of closing down or going dormant. Also used
  114. to track bandwidth and time intervals to know when to hibernate.
  115. [Toplevel modules.]
  116. main.c -- Toplevel module. Initializes keys, handles signals,
  117. multiplexes between connections, implements main loop, and drives
  118. scheduled events.
  119. tor_main.c -- Stub module containing a main() function. Allows unit
  120. test binary to link against main.c
  121. [Unit tests]
  122. test.c -- Contains unit tests for many pieces of the lower level Tor
  123. modules.
  124. 1.2. All about connections
  125. All sockets in Tor are handled as different types of nonblocking
  126. 'connections'. (What the Tor spec calls a "Connection", the code refers
  127. to as a "Cell-speaking" or "OR" connection.)
  128. Connections are implemented by the connection_t struct, defined in or.h.
  129. Not every kind of connection uses all the fields in connection_t; see
  130. the comments in or.h and the assertions in assert_connection_ok() for
  131. more information.
  132. Every connection has a type and a state. Connections never change their
  133. type, but can go through many state changes in their lifetime.
  134. The connection types break down as follows:
  135. [Cell-speaking connections]
  136. CONN_TYPE_OR -- A bidirectional TLS connection transmitting a
  137. sequence of cells. May be from an OR to an OR, or from an OP to
  138. an OR.
  139. [Edge connections]
  140. CONN_TYPE_EXIT -- A TCP connection from an onion router to a
  141. Stream's destination. [OR only]
  142. CONN_TYPE_AP -- A SOCKS proxy connection from the end user
  143. application to the onion proxy. [OP only]
  144. [Listeners]
  145. CONN_TYPE_OR_LISTENER [OR only]
  146. CONN_TYPE_AP_LISTENER [OP only]
  147. CONN_TYPE_DIR_LISTENER [Directory server only]
  148. -- Bound network sockets, waiting for incoming connections.
  149. [Internal]
  150. CONN_TYPE_DNSWORKER -- Connection from the main process to a DNS
  151. worker process. [OR only]
  152. CONN_TYPE_CPUWORKER -- Connection from the main process to a CPU
  153. worker process. [OR only]
  154. Connection states are documented in or.h.
  155. Every connection has two associated input and output buffers.
  156. Listeners don't use them. For non-listener connections, incoming
  157. data is appended to conn->inbuf, and outgoing data is taken from the
  158. front of conn->outbuf. Connections differ primarily in the functions
  159. called to fill and drain these buffers.
  160. 1.3. All about circuits.
  161. A circuit_t structure fills two roles. First, a circuit_t links two
  162. connections together: either an edge connection and an OR connection,
  163. or two OR connections. (When joined to an OR connection, a circuit_t
  164. affects only cells sent to a particular circID on that connection. When
  165. joined to an edge connection, a circuit_t affects all data.)
  166. Second, a circuit_t holds the cipher keys and state for sending data
  167. along a given circuit. At the OP, it has a sequence of ciphers, each
  168. of which is shared with a single OR along the circuit. Separate
  169. ciphers are used for data going "forward" (away from the OP) and
  170. "backward" (towards the OP). At the OR, a circuit has only two stream
  171. ciphers: one for data going forward, and one for data going backward.
  172. 1.4. Asynchronous IO and the main loop.
  173. Tor uses the poll(2) system call (or it wraps select(2) to act like
  174. poll, if poll is not available) to handle nonblocking (asynchronous)
  175. IO. If you're not familiar with nonblocking IO, check out the links
  176. at the end of this document.
  177. All asynchronous logic is handled in main.c. The functions
  178. 'connection_add', 'connection_set_poll_socket', and 'connection_remove'
  179. manage an array of connection_t*, and keep in synch with the array of
  180. struct pollfd required by poll(2). (This array of connection_t* is
  181. accessible via get_connection_array, but users should generally call
  182. one of the 'connection_get_by_*' functions in connection.c to look up
  183. individual connections.)
  184. To trap read and write events, connections call the functions
  185. 'connection_{is|stop|start}_{reading|writing}'. If you want
  186. to completely reset the events you're watching for, use
  187. 'connection_watch_events'.
  188. Every time poll() finishes, main.c calls conn_read and conn_write on
  189. every connection. These functions dispatch events that have something
  190. to read to connection_handle_read, and events that have something to
  191. write to connection_handle_write, respectively.
  192. When connections need to be closed, they can respond in two ways. Most
  193. simply, they can make connection_handle_* return an error (-1),
  194. which will make conn_{read|write} close them. But if it's not
  195. convenient to return -1 (for example, processing one connection causes
  196. you to realize that a second one should close), then you can also
  197. mark a connection to close by setting conn->marked_for_close. Marked
  198. connections will be closed at the end of the current iteration of
  199. the main loop.
  200. The main loop handles several other operations: First, it checks
  201. whether any signals have been received that require a response (HUP,
  202. KILL, USR1, CHLD). Second, it calls prepare_for_poll to handle recurring
  203. tasks and compute the necessary poll timeout. These recurring tasks
  204. include periodically fetching the directory, timing out unused
  205. circuits, incrementing flow control windows and re-enabling connections
  206. that were blocking for more bandwidth, and maintaining statistics.
  207. A word about TLS: Using TLS on OR connections complicates matters in
  208. two ways.
  209. First, a TLS stream has its own read buffer independent of the
  210. connection's read buffer. (TLS needs to read an entire frame from
  211. the network before it can decrypt any data. Thus, trying to read 1
  212. byte from TLS can require that several KB be read from the network
  213. and decrypted. The extra data is stored in TLS's decrypt buffer.)
  214. Because the data hasn't been read by tor (it's still inside the TLS),
  215. this means that sometimes a connection "has stuff to read" even when
  216. poll() didn't return POLLIN. The tor_tls_get_pending_bytes function is
  217. used in main.c to detect TLS objects with non-empty internal buffers.
  218. Second, the TLS stream's events do not correspond directly to network
  219. events: sometimes, before a TLS stream can read, the network must be
  220. ready to write -- or vice versa.
  221. 1.5. How data flows (An illustration.)
  222. Suppose an OR receives 256 bytes along an OR connection. These 256
  223. bytes turn out to be a data relay cell, which gets decrypted and
  224. delivered to an edge connection. Here we give a possible call sequence
  225. for the delivery of this data.
  226. (This may be outdated quickly.)
  227. do_main_loop -- Calls poll(2), receives a POLLIN event on a struct
  228. pollfd, then calls:
  229. conn_read -- Looks up the corresponding connection_t, and calls:
  230. connection_handle_read -- Calls:
  231. connection_read_to_buf -- Notices that it has an OR connection so:
  232. read_to_buf_tls -- Pulls data from the TLS stream onto conn->inbuf.
  233. connection_process_inbuf -- Notices that it has an OR connection so:
  234. connection_or_process_inbuf -- Checks whether conn is open, and calls:
  235. connection_process_cell_from_inbuf -- Notices it has enough data for
  236. a cell, then calls:
  237. connection_fetch_from_buf -- Pulls the cell from the buffer.
  238. cell_unpack -- Decodes the raw cell into a cell_t
  239. command_process_cell -- Notices it is a relay cell, so calls:
  240. command_process_relay_cell -- Looks up the circuit for the cell,
  241. makes sure the circuit is live, then passes the cell to:
  242. circuit_deliver_relay_cell -- Passes the cell to each of:
  243. relay_crypt -- Strips a layer of encryption from the cell and
  244. notices that the cell is for local delivery.
  245. connection_edge_process_relay_cell -- extracts the cell's
  246. relay command, and makes sure the edge connection is
  247. open. Since it has a DATA cell and an open connection,
  248. calls:
  249. circuit_consider_sending_sendme -- check if the total number
  250. of cells received by all streams on this circuit is
  251. enough that we should send back an acknowledgement
  252. (requesting that more cells be sent to any stream).
  253. connection_write_to_buf -- To place the data on the outgoing
  254. buffer of the correct edge connection, by calling:
  255. connection_start_writing -- To tell the main poll loop about
  256. the pending data.
  257. write_to_buf -- To actually place the outgoing data on the
  258. edge connection.
  259. connection_consider_sending_sendme -- if the outbuf waiting
  260. to flush to the exit connection is not too full, check
  261. if the total number of cells received on this stream
  262. is enough that we should send back an acknowledgement
  263. (requesting that more cells be sent to this stream).
  264. In a subsequent iteration, main notices that the edge connection is
  265. ready for writing:
  266. do_main_loop -- Calls poll(2), receives a POLLOUT event on a struct
  267. pollfd, then calls:
  268. conn_write -- Looks up the corresponding connection_t, and calls:
  269. connection_handle_write -- This isn't a TLS connection, so calls:
  270. flush_buf -- Delivers data from the edge connection's outbuf to the
  271. network.
  272. connection_wants_to_flush -- Reports that all data has been flushed.
  273. connection_finished_flushing -- Notices the connection is an exit,
  274. and calls:
  275. connection_edge_finished_flushing -- The connection is open, so it
  276. calls:
  277. connection_stop_writing -- Tells the main poll loop that this
  278. connection has no more data to write.
  279. connection_consider_sending_sendme -- now that the outbuf
  280. is empty, check again if the total number of cells
  281. received on this stream is enough that we should send
  282. back an acknowledgement (requesting that more cells be
  283. sent to this stream).
  284. 1.6. Routers, descriptors, and directories
  285. All Tor processes need to keep track of a list of onion routers, for
  286. several reasons:
  287. - OPs need to establish connections and circuits to ORs.
  288. - ORs need to establish connections to other ORs.
  289. - OPs and ORs need to fetch directories from a directory server.
  290. - ORs need to upload their descriptors to directory servers.
  291. - Directory servers need to know which ORs are allowed onto the
  292. network, what the descriptors are for those ORs, and which of
  293. those ORs are currently live.
  294. Thus, every Tor process keeps track of a list of all the ORs it knows
  295. in a static variable 'directory' in the routers.c module. This
  296. variable contains a routerinfo_t object for each known OR. On startup,
  297. the directory is initialized to a list of known directory servers (via
  298. router_get_list_from_file()). Later, the directory is updated via
  299. router_get_dir_from_string(). (OPs and ORs retrieve fresh directories
  300. from directory servers; directory servers generate their own.)
  301. Every OR must periodically regenerate a router descriptor for itself.
  302. The descriptor and the corresponding routerinfo_t are stored in the
  303. 'desc_routerinfo' and 'descriptor' static variables in routers.c.
  304. Additionally, a directory server keeps track of a list of the
  305. router descriptors it knows in a separate list in dirserv.c. It
  306. uses this list, checking which OR connections are open, to build
  307. directories.
  308. 1.7. Data model
  309. [XXX]
  310. 1.8. Flow control
  311. [XXX]
  312. 2. Coding conventions
  313. 2.1. Details
  314. Use tor_malloc, tor_free, tor_snprintf, tor_strdup, and tor_gettimeofday
  315. instead of their generic equivalents. (They always succeed or exit.)
  316. Use INLINE instead of 'inline', so that we work properly on windows.
  317. 2.2. Calling and naming conventions
  318. Whenever possible, functions should return -1 on error and and 0 on
  319. success.
  320. For multi-word identifiers, use lowercase words combined with
  321. underscores. (e.g., "multi_word_identifier"). Use ALL_CAPS for macros and
  322. constants.
  323. Typenames should end with "_t".
  324. Function names should be prefixed with a module name or object name. (In
  325. general, code to manipulate an object should be a module with the same
  326. name as the object, so it's hard to tell which convention is used.)
  327. Functions that do things should have imperative-verb names
  328. (e.g. buffer_clear, buffer_resize); functions that return booleans should
  329. have predicate names (e.g. buffer_is_empty, buffer_needs_resizing).
  330. 2.3. What To Optimize
  331. Don't optimize anything if it's not in the critical path. Right now,
  332. the critical path seems to be AES, logging, and the network itself.
  333. Feel free to do your own profiling to determine otherwise.
  334. 2.4. Log conventions
  335. Log convention: use only these four log severities.
  336. ERR is if something fatal just happened.
  337. WARN if something bad happened, but we're still running. The
  338. bad thing is either a bug in the code, an attack or buggy
  339. protocol/implementation of the remote peer, etc. The operator should
  340. examine the bad thing and try to correct it.
  341. NOTICE if it's something the operator will want to know about.
  342. (No error or warning messages should be expected during normal OR or OP
  343. operation. I expect most people to run on -l notice eventually. If a
  344. library function is currently called such that failure always means
  345. ERR, then the library function should log WARN and let the caller
  346. log ERR.)
  347. INFO means something happened (maybe bad, maybe ok), but there's nothing
  348. you need to (or can) do about it.
  349. DEBUG is for everything louder than INFO.
  350. [XXX Proposed convention: every messages of severity INFO or higher should
  351. either (A) be intelligible to end-users who don't know the Tor source; or
  352. (B) somehow inform the end-users that they aren't expected to understand
  353. the message (perhaps with a string like "internal error"). Option (A) is
  354. to be preferred to option (B). -NM]
  355. 2.5. Doxygen
  356. We use the 'doxygen' utility to generate documentation from our source code.
  357. Here's how to use it:
  358. 1. Begin every file that should be documented with
  359. /**
  360. * \file filename.c
  361. * \brief Short desccription of the file
  362. */
  363. (Doxygen will recognize any comment beginning with /** as special.)
  364. 2. Before any function, structure, #define, or variable you want to
  365. document, add a comment of the form:
  366. /** Describe the function's actions in imperative sentences.
  367. *
  368. * Use blank lines for paragraph breaks
  369. * - and
  370. * - hyphens
  371. * - for
  372. * - lists.
  373. *
  374. * Write <b>argument_names</b> in boldface.
  375. *
  376. * \code
  377. * place_example_code();
  378. * between_code_and_endcode_commands();
  379. * \endcode
  380. */
  381. 3. Make sure to escape the characters "<", ">", "\", "%" and "#" as "\<",
  382. "\>", "\\", "\%", and "\#".
  383. 4. To document structure members, you can use two forms:
  384. struct foo {
  385. /** You can put the comment before an element; */
  386. int a;
  387. int b; /**< Or use the less-than symbol to put the comment after the element. */
  388. };
  389. 5. To generate documentation from the Tor source code, type:
  390. $ doxygen -g
  391. To generate a file called 'Doxyfile'. Edit that file and run 'doxygen' to
  392. generate the aPI documentation.
  393. 6. See the Doxygen manual for more information; this summary just scratches
  394. the surface.
  395. 3. References
  396. About Tor
  397. See http://tor.eff.org/
  398. http://tor.eff.org/cvs/doc/tor-spec.txt
  399. http://tor.eff.org/cvs/doc/tor-design.tex
  400. http://tor.eff.org/cvs/doc/FAQ
  401. About anonymity
  402. See http://freehaven.net/anonbib/
  403. About nonblocking IO
  404. [XXX insert references]
  405. # ======================================================================
  406. # Old HACKING document; merge into the above, move into tor-design.tex,
  407. # or delete.
  408. # ======================================================================
  409. The pieces.
  410. Routers. Onion routers, as far as the 'tor' program is concerned,
  411. are a bunch of data items that are loaded into the router_array when
  412. the program starts. Periodically it downloads a new set of routers
  413. from a directory server, and updates the router_array. When a new OR
  414. connection is started (see below), the relevant information is copied
  415. from the router struct to the connection struct.
  416. Connections. A connection is a long-standing tcp socket between
  417. nodes. A connection is named based on what it's connected to -- an "OR
  418. connection" has an onion router on the other end, an "OP connection" has
  419. an onion proxy on the other end, an "exit connection" has a website or
  420. other server on the other end, and an "AP connection" has an application
  421. proxy (and thus a user) on the other end.
  422. Circuits. A circuit is a path over the onion routing
  423. network. Applications can connect to one end of the circuit, and can
  424. create exit connections at the other end of the circuit. AP and exit
  425. connections have only one circuit associated with them (and thus these
  426. connection types are closed when the circuit is closed), whereas OP and
  427. OR connections multiplex many circuits at once, and stay standing even
  428. when there are no circuits running over them.
  429. Streams. Streams are specific conversations between an AP and an exit.
  430. Streams are multiplexed over circuits.
  431. Cells. Some connections, specifically OR and OP connections, speak
  432. "cells". This means that data over that connection is bundled into 512
  433. byte packets (14 bytes of header and 498 bytes of payload). Each cell has
  434. a type, or "command", which indicates what it's for.