TODO

Legend:
SPEC!!  - Not specified
SPEC    - Spec not finalized
NICK    - nick claims
ARMA    - arma claims
        - Not done
        * Top priority
        . Partially done
        o Done
        D Deferred
        X Abandoned

For scalability:
        - Slightly smarter bandwidth management: use link capacity
          intelligently.
        - Handle full buffers without totally borking

For dtor:

NICK    pre1:
        o make all ORs serve the directory too.
          o "AuthoritativeDir 1" for dirservers
          o non-authoritative servers with dirport publish opt dircacheport
          o make clients read that and use it.
          o make clients able to read a normal dirport from non-trusted OR too
          o make ORs parse-and-keep-and-serve the directory they pull down
          o authoritativedirservers should pull down directories from
            other authdirservers, to merge descriptors.
        D Have clients and dirservers preserve reputation info over
          reboots.
          [Deferred until we know what reputation info we actually want to
          maintain. Our current algorithm Couldn't Possibly Work.]
        . allow dirservers to serve running-router list separately.
          o "get /running-routers" will fetch just this.
          o actually make the clients use this sometimes.
          o distinguish directory-is-dirty from runninglist-is-dirty
          - ORs keep this too, and serve it
          - Design: do we need running and non-running lists?
        o tor remembers descriptor-lists across reboots.
        . Packages define datadir as /var/lib/tor/. If no datadir is defined,
          then choose, make, and secure ~/.tor as datadir.
          o Adjust tor
          o Change torrc.sample
          D Change packages (not till 0.0.8 packages!)
          - Look in ~/.torrc if no */etc/torrc is found?
        o Contact info, pgp fingerprint, comments in router desc.
          o Add a ContactInfo line to torrc, which gets published in
            descriptor (as opt)
        o write tor version at the top of each log file

        pre2:
        - refer to things by key:
          o extend cells need ip:port:identitykeyhash.
          . Lookup routers and connections by key digest; accept hex
            key digest in place of nicknames.
            . Audit all uses of lookup-by-hostname and lookup-by-addr-port
              to search by digest when appropriate.
            - Rep-hist functions
          - also use this in intro points and rendezvous points, and
            hidserv descs. [XXXX This isn't enough.]
          - figure out what to do about ip:port:differentkey
ARMA    - ORs connect on demand. attach circuits to new connections, keep
          create cells around somewhere, send destroy if fail.
        - nickname defaults to first piece of hostname
        - running-routers list refers to nickname if verified, else
          hash-base64'ed.
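
A minimal sketch of the "lookup by key digest; accept hex key digest in place of nicknames" item above, written in Python rather than Tor's C. Every name here (`RouterTable`, `key_digest`, `lookup`) is hypothetical. It assumes the digest is a SHA-1 hash of the identity key written as 40 hex characters, and that nicknames are never 40 hex characters long; neither is stated in this list.

```python
import hashlib

DIGEST_HEX_LEN = 40  # assumption: SHA-1 digest (20 bytes) written as hex
HEX_CHARS = set("0123456789abcdefABCDEF")


def key_digest(identity_key: bytes) -> str:
    """Return an upper-case hex digest that identifies a router by its key."""
    return hashlib.sha1(identity_key).hexdigest().upper()


def looks_like_hex_digest(name: str) -> bool:
    """True if name has the shape of a hex key digest rather than a nickname."""
    return len(name) == DIGEST_HEX_LEN and all(c in HEX_CHARS for c in name)


class RouterTable:
    """Index routers by key digest, keeping nicknames as a secondary index."""

    def __init__(self):
        self._by_digest = {}
        self._by_nickname = {}

    def add(self, nickname, identity_key, addr, port):
        digest = key_digest(identity_key)
        router = {"nickname": nickname, "digest": digest,
                  "addr": addr, "port": port}
        self._by_digest[digest] = router
        self._by_nickname[nickname.lower()] = router
        return router

    def lookup(self, name):
        """Accept either a hex key digest or a nickname; return None if unknown."""
        if looks_like_hex_digest(name):
            return self._by_digest.get(name.upper())
        return self._by_nickname.get(name.lower())


# Illustrative data only: a made-up nickname, key, and documentation address.
table = RouterTable()
r = table.add("examplenode", b"example-identity-key", "192.0.2.1", 9001)
```

Keying the primary index on the digest means two routers that share an ip:port but present different keys stay distinct entries, which is the case the "ip:port:differentkey" item still has to decide on.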

        pre3:
        - users can set their bandwidth, or we auto-detect it:
          - advertised bandwidth defaults to 10KB
          - advertised bandwidth is the min of max seen in each direction
            in the past N seconds.
          - not counting "local" connections
          - round detected bandwidth up to nearest 10KB
        - client software does not upload descriptor until:
          - you've been running for an hour
          - it's sufficiently satisfied with its bandwidth
          - it decides it is reachable
        - start counting again if your IP ever changes.
        - never regenerate identity keys, for now.
        - you can set a bit for not-being-an-OR.
        - clients choose nodes proportional to advertised bandwidth
        - authdirserver includes descriptor and lists as running iff:
          - he can connect to you
          - he has successfully extended to you
          - he has sufficient mean-time-between-failures
        - add new "Middleman 1" config variable?
        - if torrc not found, exitpolicy reject *:*
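
The auto-detection rules in pre3 can be sketched as follows. This is an illustration in Python, not Tor code: the sample tuple layout, the function name `advertised_bandwidth`, and reading "10KB" as 10 * 1024 bytes per second are all assumptions, as is using the 10KB default as a floor when nothing has been measured.

```python
DEFAULT_BW = 10 * 1024   # "defaults to 10KB"; 1KB == 1024 bytes is an assumption
ROUND_TO = 10 * 1024     # "round detected bandwidth up to nearest 10KB"


def advertised_bandwidth(samples, now, window_secs):
    """Compute the bandwidth to advertise.

    samples: iterable of (timestamp, read_rate, write_rate, is_local) tuples,
             with rates in bytes per second (hypothetical layout).
    """
    # Keep only samples from the past N seconds, ignoring "local" connections.
    recent = [s for s in samples
              if now - s[0] <= window_secs and not s[3]]
    if not recent:
        return DEFAULT_BW

    # Min of the max seen in each direction.
    max_read = max(s[1] for s in recent)
    max_write = max(s[2] for s in recent)
    detected = min(max_read, max_write)

    # Integer round-up to the next multiple of 10KB.
    rounded = (detected + ROUND_TO - 1) // ROUND_TO * ROUND_TO
    return max(rounded, DEFAULT_BW)  # assumption: never advertise below the default


samples = [
    (100, 50000, 23000, False),
    (110, 30000, 40000, False),
    (115, 900000, 900000, True),    # local connection, ignored
    (10, 999999, 999999, False),    # older than the window, ignored
]
result = advertised_bandwidth(samples, now=120, window_secs=60)
```

Taking the minimum across directions keeps a node that can receive quickly but send slowly from advertising more than it can actually relay.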

        ongoing:
        . rename/rearrange functions for what file they're in
        - generalize our transport: add transport.c in preparation for
          http, airhook, etc transport.

For September:
NICK    . Windows port
          o works as client
            - deal with pollhup / reached_eof on all platforms
          . robust as a client
          . works as server
            - can be configured
          - robust as a server
          . Usable as NT service
          - docs for building in win
          - installer
        - Docs
          - FAQ
          o overview of tor. how does it work, what's it do, pros and
            cons of using it, why should I use it, etc.
          - a howto tutorial with examples
          o tutorial: how to set up your own tor network
            - (need to not hardcode dirservers file in config.c)
          . correct, update, polish spec
          - document the exposed function api?
          - document what we mean by socks.
NICK    . packages
          . rpm
          - find a long-term rpm maintainer
        - code
          - better warn/info messages
          o let tor do resolves.
            o extend socks4 to do resolves?
            o make script to ask tor for resolves
          - tsocks
            - gather patches, submit to maintainer
            - intercept gethostbyname and others, do resolve via tor
          - redesign and thorough code revamp, with particular eye toward:
            - support half-open tcp connections
            - conn key rotation
            - other transports -- http, airhook
            - modular introduction mechanism
            - allow non-clique topology

Other details and small and hard things:
        - tor should be able to have a pool of outgoing IP addresses
          that it is able to rotate through. (maybe)
        - tie into squid
        - buffer size pool, to let a few buffers grow huge or many buffers
          grow a bit
        - hidserv offerers shouldn't need to define a SocksPort
        - when the client fails to pick an intro point for a hidserv,
          it should refetch the hidserv desc.
        . should maybe make clients exit(1) when bad things happen?
          e.g. clock skew.
        - should retry exitpolicy end streams even if the end cell didn't
          resolve the address for you
        - Add '[...truncated]' or similar to truncated log entries (like the directory
          in connection_dir_process_inbuf()).
        . Make logs handle it better when writing to them fails.
        o Dirserver shouldn't put you in running-routers list if you haven't
          uploaded a descriptor recently
        . Refactor: add own routerinfo to routerlist. Right now, only
          router_get_by_nickname knows about 'this router', as a hack to
          get circuit_launch_new to do the right thing.
        . Scrubbing proxies
          - Find an smtp proxy?
          . Get socks4a support into Mozilla
        - Extend by hostname, not by IP.
        - Need a relay teardown cell, separate from one-way ends.
        - Make it harder to circumvent bandwidth caps: look at number of bytes
          sent across sockets, not number sent inside TLS stream.
        - fix router_get_by_* functions so they can get ourselves too,
          and audit everything to make sure rend and intro points are
          just as likely to be us as not.
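
For the "[...truncated]" log item in the list above, one possible shape is sketched below in Python. The function name `format_log_entry` and the `max_len` parameter are hypothetical; Tor's real logging code is C and may limit entries differently.

```python
TRUNCATED_MARK = "[...truncated]"


def format_log_entry(message, max_len):
    """Return message unchanged if it fits, else cut it and append a marker.

    Assumes max_len is at least len(TRUNCATED_MARK); for smaller limits the
    result is just the marker and therefore longer than max_len.
    """
    if len(message) <= max_len:
        return message
    keep = max(max_len - len(TRUNCATED_MARK), 0)
    return message[:keep] + TRUNCATED_MARK


short = format_log_entry("fetched directory", 40)
long_entry = format_log_entry("signed-directory " + "x" * 200, 40)
```

Reserving room for the marker inside the limit keeps every entry within the buffer size while still telling the reader that text was dropped.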

***************************Future tasks:****************************

Rendezvous and hidden services:
  make it fast:
    - preemptively build and start rendezvous circs.
    - preemptively build n-1 hops of intro circs?
    - cannibalize general circs?
  make it reliable:
    - standby/hotswap/redundant services.
    - store stuff to disk? dirservers forget service descriptors when
      they restart; nodes offering hidden services forget their chosen
      intro points when they restart.
  make it robust:
    - auth mechanisms to let midpoint and bob selectively choose
      connection requests.
  make it scalable:
    - right now the hidserv store/lookup system is run by the dirservers;
      this won't scale.

Tor scalability:
  Relax clique assumptions.
  Redesign how directories are handled.
    - Separate running-routers lookup from descriptor list lookup.
    - Resolve directory agreement somehow.
    - Cache directory on all servers.
  Find and remove bottlenecks
    - Address linear searches on e.g. circuit and connection lists.
  Reputation/memory system, so dirservers can measure people,
    and so other people can verify their measurements.
    - Need to measure via relay, so it's not distinguishable.
  Bandwidth-aware path selection. So people with T3's are picked
    more often than people with DSL.
  Reliability-aware node selection. So people who are stable are
    preferred for long-term circuits such as intro and rend circs,
    and general circs for irc, aim, ssh, etc.
  Let dissidents get to Tor servers via Tor users. ("Backbone model")
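
Bandwidth-aware path selection, as described above, could start from a plain weighted random choice. The Python sketch below is an illustration only: `choose_node`, the node names, and the bandwidth figures are made up, and a real selector would also have to account for exit policies and reliability.

```python
import random
from collections import Counter


def choose_node(nodes, rng=random):
    """Pick one node with probability proportional to advertised bandwidth.

    nodes: dict mapping nickname -> advertised bandwidth in bytes/sec
           (hypothetical representation).
    """
    names = list(nodes)
    weights = [nodes[n] for n in names]
    return rng.choices(names, weights=weights, k=1)[0]


# Illustrative figures: one fast node and one slow node, 100:1.
nodes = {"t3_node": 5_000_000, "dsl_node": 50_000}
rng = random.Random(0)  # seeded so the demo is repeatable
picks = Counter(choose_node(nodes, rng) for _ in range(10_000))
```

With these weights the fast node is chosen on the large majority of draws, while the slow node still sees occasional use instead of being excluded outright.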

Anonymity improvements:
  Is abandoning the circuit the only option when an extend fails, or
    can we do something without impacting anonymity too much?
  Is exiting from the middle of the circuit always a bad idea?
  Helper nodes. Decide how to use them to improve safety.
  DNS resolution: need to make tor support resolve requests. Need to write
    a script and an interface (including an extension to the socks
    protocol) so we can ask it to do resolve requests. Need to patch
    tsocks to intercept gethostbyname, else we'll continue leaking it.
  Improve path selection algorithms based on routing-zones paper. Be sure
    to start and end circuits in different ASs. Ideally, consider AS of
    source and destination -- maybe even enter and exit via nearby AS.
  Intermediate model, with some delays and mixing.
  Add defensive dropping regime?

Make it more correct:
  Handle half-open connections: right now we don't support all TCP
    streams, at least according to the protocol. But we handle all that
    we've seen in the wild.
  Support IPv6.

Efficiency/speed/robustness:
  Congestion control. Is our current design sufficient once we have heavy
    use? Need to measure and tweak, or maybe overhaul.
  Allow small cells and large cells on the same network?
  Cell buffering and resending. This will allow us to handle broken
    circuits as long as the endpoints don't break, plus will allow
    connection (tls session key) rotation.
  Implement Morphmix, so we can compare its behavior, complexity, etc.
  Use cpuworker for more heavy lifting.
    - Signing (and verifying) hidserv descriptors
    - Signing (and verifying) intro/rend requests
    - Signing (and verifying) router descriptors
    - Signing (and verifying) directories
    - Doing TLS handshake (this is very hard to separate out, though)
  Buffer size pool: allocate a maximum size for all buffers, not
    a maximum size for each buffer. So we don't have to give up as
    quickly (and kill the thickpipe!) when there's congestion.
  Exit node caching: tie into squid or other caching web proxy.
  Other transport. HTTP, udp, rdp, airhook, etc. May have to do our own
    link crypto, unless we can bully openssl into it.
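
The buffer size pool item above amounts to charging every buffer against one shared budget. A minimal Python sketch follows; the `BufferPool` and `PooledBuffer` classes and the byte limits are hypothetical, and it leaves open how the real code should choose which connection to slow down once the pool is full.

```python
class BufferPool:
    """Shared byte budget: the cap applies to the sum of all buffers."""

    def __init__(self, total_limit):
        self.total_limit = total_limit
        self.used = 0

    def reserve(self, nbytes):
        """Claim nbytes from the shared budget; return False if it would overflow."""
        if self.used + nbytes > self.total_limit:
            return False
        self.used += nbytes
        return True

    def release(self, nbytes):
        self.used -= nbytes


class PooledBuffer:
    """A per-connection buffer with no cap of its own, only the pool's."""

    def __init__(self, pool):
        self.pool = pool
        self.data = bytearray()

    def write(self, chunk):
        """Append chunk if the pool allows it; return False when the pool is full."""
        if not self.pool.reserve(len(chunk)):
            return False
        self.data += chunk
        return True

    def read(self, nbytes):
        """Remove and return up to nbytes, giving the space back to the pool."""
        out = bytes(self.data[:nbytes])
        del self.data[:nbytes]
        self.pool.release(len(out))
        return out


pool = BufferPool(total_limit=100)
busy, idle = PooledBuffer(pool), PooledBuffer(pool)
```

One busy connection can then grow well past what a fixed per-buffer cap would allow, as long as the other buffers are mostly empty.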

P2P Tor:
  Do all the scalability stuff above, first.
  Incentives to relay. Not so hard.
  Incentives to allow exit. Possibly quite hard.
  Sybil defenses without having a human bottleneck.
  How to gather random sample of nodes.
  How to handle nodelist recommendations.
  Consider incremental switches: a p2p tor with only 50 users has
    different anonymity properties than one with 10k users, and should
    be treated differently.