Filename: 127-dirport-mirrors-downloads.txt
Title: Relaying dirport requests to Tor download site / website
Version: $Revision$
Last-Modified: $Date$
Author: Roger Dingledine
Created: 2007-12-02
Status: Draft

1. Overview

  Some countries and networks block connections to the Tor website. As
  time goes by, this will remain a problem and it may even become worse.

  We have a big pile of mirrors (google for "Tor mirrors"), but few of
  our users think to try a search like that. Also, many of these mirrors
  might be automatically blocked since their pages contain words that
  might cause them to get banned. And lastly, we can imagine a future
  where the blockers are aware of the mirror list too.

  Here we describe a new set of URLs for Tor's DirPort that will relay
  connections from users to the official Tor download site. Rather than
  trying to cache a bunch of new Tor packages (which is a hassle in terms
  of keeping them up to date, and a hassle in terms of drive space used),
  we instead just proxy the requests directly to Tor's /dist page.

  Specifically, we should support

    GET /tor/dist/$1

  and

    GET /tor/website/$1
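
  As a rough illustration of the two URL forms above, here is a minimal
  Python sketch of how a DirPort handler might translate a request path
  into an upstream URL. The upstream host "upstream.example", the
  function name map_mirror_request(), and the path-traversal check are
  all assumptions made for illustration; none of them is specified by
  this proposal.

```python
from typing import Optional
from urllib.parse import unquote

# Assumption: placeholder upstream locations, not taken from the proposal.
UPSTREAMS = {
    "/tor/dist/": "https://upstream.example/dist/",
    "/tor/website/": "https://upstream.example/",
}


def map_mirror_request(path: str) -> Optional[str]:
    """Return the upstream URL for a mirror request path, or None.

    None means the path is not one of the two mirror prefixes (so the
    normal directory code should handle it) or that it was rejected.
    """
    for prefix, base in UPSTREAMS.items():
        if path.startswith(prefix):
            rest = path[len(prefix):]
            # Refuse anything that could climb out of the mirrored tree,
            # including percent-encoded ".." segments.
            if ".." in unquote(rest).split("/"):
                return None
            return base + rest
    return None
```

  With these placeholder values, a request for /tor/dist/foo.tar.gz
  would be relayed to https://upstream.example/dist/foo.tar.gz, while
  ordinary directory URLs fall through untouched.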

2. Direct connections, one-hop circuits, or three-hop circuits?

  We could relay the connections directly to the download site -- but
  this produces recognizable outgoing traffic on the bridge or cache's
  network, which will probably surprise our nice volunteers. (Is this
  a good enough reason to discard the direct connection idea?)

  Even if we don't do direct connections, should we do a one-hop
  begindir-style connection to the mirror site (make a one-hop circuit
  to it, then send a 'begindir' cell down the circuit), or should we do
  a normal three-hop anonymized connection?

  If these mirrors are mainly bridges, doing either a direct or a one-hop
  connection creates another way to enumerate bridges. That would argue
  for three-hop. On the other hand, downloading a 10+ megabyte installer
  through a normal Tor circuit can't be fun. But if you're already getting
  throttled a lot because you're in the "relayed traffic" bucket, you're
  going to have to accept a slow transfer anyway. So three-hop it is.

  Speaking of which, we would want to label this connection
  as "relay" traffic for the purposes of rate limiting; see
  connection_counts_as_relayed_traffic() and or_conn->client_used. This
  will be a bit tricky though, because these connections will use the
  bridge's guards.
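
  To make the rate-limiting point concrete, here is a toy Python model
  of what labeling a connection as "relay" traffic could mean: such a
  connection is capped by a separate relayed-traffic bucket as well as
  by the global one. This is only a sketch under our own assumptions.
  The class names and byte counts are hypothetical, and it does not
  reproduce what connection_counts_as_relayed_traffic() does in the
  Tor source.

```python
from dataclasses import dataclass


@dataclass
class Bucket:
    tokens: int  # bytes that may still be sent in this interval

    def take(self, wanted: int) -> int:
        """Remove up to `wanted` tokens and return how many were granted."""
        granted = min(wanted, self.tokens)
        self.tokens -= granted
        return granted


class RateLimiter:
    """Toy model: relayed traffic is limited by both buckets."""

    def __init__(self, global_bytes: int, relayed_bytes: int) -> None:
        self.global_bucket = Bucket(global_bytes)
        self.relayed_bucket = Bucket(relayed_bytes)

    def allowance(self, wanted: int, counts_as_relayed: bool) -> int:
        if counts_as_relayed:
            # Never ask for more than the relayed bucket can cover.
            wanted = min(wanted, self.relayed_bucket.tokens)
        granted = self.global_bucket.take(wanted)
        if counts_as_relayed:
            self.relayed_bucket.take(granted)
        return granted
```

  In this model a mirror fetch flagged as relayed competes with other
  relayed traffic and cannot use up the bandwidth the operator has
  reserved for their own use.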

3. Scanning resistance

  One other goal we'd like to achieve, or at least not hinder, is making
  it hard to scan large swaths of the Internet to look for responses
  that indicate a bridge.

  In general this is a really hard problem, so we shouldn't demand to
  solve it here. But we can note that some bridges should open their
  DirPort (and offer this functionality), and others shouldn't. Then
  some bridges provide a download mirror while others can remain
  scanning-resistant.

4. Integrity checking

  If we serve this stuff in plaintext from the bridge, anybody in between
  the user and the bridge can intercept and modify it. The bridge can too.

  If we do an anonymized three-hop connection, the exit node can also
  intercept and modify the exe it sends back.

  Are we setting ourselves up for rogue exit relays, or rogue bridges,
  that trojan our users?

  Answer #1: Users need to do PGP signature checking. Not a very good
  answer, a) because it's complex, and b) because they don't know the
  right signing keys in the first place.

  Answer #2: The mirrors could exit from a specific Tor relay, using the
  '.exit' notation. This would make connections a bit more brittle, but
  would resolve the rogue exit relay issue. We could even round-robin
  among several, and the list could be dynamic -- for example, all the
  relays with an Authority flag that allow exits to the Tor website.

  Answer #3: The mirrors should connect to the main distribution site
  via SSL. That way the exit relay can't influence anything.

  Answer #4: We could suggest that users only use trusted bridges for
  fetching a copy of Tor. Hopefully they heard about the bridge from a
  trusted source rather than from the adversary.

  Answer #5: What if the adversary is trawling for Tor downloads by
  network signature -- either by looking for known bytes in the binary,
  or by looking for "GET /tor/dist/"? It would be nice to encrypt the
  connection from the bridge user to the bridge. And we can! The bridge
  already supports TLS. Rather than initiating a TLS renegotiation after
  connecting to the ORPort, the user should actually request a URL. Then
  the ORPort can either pass the connection off as a linked conn to the
  dirport, or renegotiate and become a Tor connection, depending on how
  the client behaves.
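
  The choice described in Answer #5 can be sketched as a small decision
  function that runs once the TLS handshake on the ORPort has finished.
  This is an illustration only. The Handoff enum, the accepted HTTP
  methods, and the "close" fallback for unrecognized input are our
  assumptions and are not part of the proposal.

```python
from enum import Enum


class Handoff(Enum):
    DIRPORT = "dirport"      # hand off as a linked conn to the dirport
    OR_PROTOCOL = "or"       # continue as a normal Tor connection
    CLOSE = "close"          # assumption: drop anything unrecognized


# Assumption: which request methods count as "requesting a URL".
HTTP_METHODS = (b"GET ", b"HEAD ")


def choose_handoff(renegotiated: bool, first_bytes: bytes) -> Handoff:
    """Decide what the ORPort does based on how the client behaves.

    `renegotiated` is True if the client started a TLS renegotiation;
    `first_bytes` is the first application data it sent otherwise.
    """
    if renegotiated:
        return Handoff.OR_PROTOCOL
    if first_bytes.startswith(HTTP_METHODS):
        return Handoff.DIRPORT
    return Handoff.CLOSE
```

  Since the request line travels inside TLS, an observer looking for
  "GET /tor/dist/" on the wire would not see it in the clear.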

5. Linked connections: at what level should we proxy?

  Check out the connection_ap_make_link() function, as called from
  directory.c. Tor clients use this to create a "fake" socks connection
  back to themselves, and then they attach a directory request to it,
  so they can launch directory fetches via Tor. We can piggyback on
  this feature.

  We need to decide if we're going to be passing the bytes back and
  forth between the web browser and the main distribution site, or if
  we're going to be actually acting like a proxy (parsing out the file
  they want, fetching that file, and serving it back).

  Advantages of proxying without looking inside:
    - We don't need to build any sort of HTTP support (including
      continues, partial fetches, etc.).

  Disadvantages:
    - If the browser thinks it's speaking HTTP, are there easy ways
      to pass the bytes to an HTTPS server and have everything work
      correctly? At the least, it would seem that the browser would
      complain about the cert. More generally, SSL wants to be negotiated
      before the URL and headers are sent, yet we need to read the URL
      and headers to know that this is a mirror request; so we have an
      ordering problem here.
    - Makes it harder to do caching later on, if we don't look at what
      we're relaying. (It might be useful down the road to cache the
      answers to popular requests, so we don't have to keep getting
      them again.)
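
  To show why caching requires looking inside the requests, here is a
  minimal Python sketch of a cache keyed on the requested path. It only
  works if the mirror parses out which file is being asked for. The
  MirrorCache class, its fetch callback, and the entry limit are
  hypothetical, and the sketch deliberately ignores expiry, so it would
  have the keeping-packages-up-to-date problem noted in the overview.

```python
from collections import OrderedDict
from typing import Callable


class MirrorCache:
    """Tiny least-recently-used cache in front of an upstream fetch."""

    def __init__(self, fetch: Callable[[str], bytes], max_entries: int = 8) -> None:
        self._fetch = fetch          # called only on a cache miss
        self._max = max_entries
        self._entries: "OrderedDict[str, bytes]" = OrderedDict()

    def get(self, path: str) -> bytes:
        if path in self._entries:
            # Cache hit: mark as most recently used and serve locally.
            self._entries.move_to_end(path)
            return self._entries[path]
        body = self._fetch(path)
        self._entries[path] = body
        if len(self._entries) > self._max:
            # Evict the least recently used entry to bound memory use.
            self._entries.popitem(last=False)
        return body
```

  A byte-level relay has no path to use as a key, so it could not offer
  this; that is the trade-off the second disadvantage describes.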

6. Outstanding problems

  1) HTTP proxies already exist. Why waste our time cloning one
  badly? When we clone existing stuff, we usually regret it.

  2) It's overbroad. We only seem to need a secure get-a-tor feature,
  and instead we're contemplating building a locked-down HTTP proxy.

  3) It's going to add a fair bit of complexity to our code. We do
  not currently implement HTTPS. We'd need to refactor lots of the
  low-level connection stuff so that "SSL" and "Cell-based" were no
  longer synonymous.

  4) It's still unclear how effective this proposal would be in
  practice. You need to know that this feature exists, which means
  somebody needs to tell you about a bridge (mirror) address and tell
  you how to use it. And if they're doing that, they could (e.g.) tell
  you about a gmail autoresponder address just as easily, and then you'd
  get better authentication of the Tor program to boot.