|
- Filename: 127-dirport-mirrors-downloads.txt
 
- Title: Relaying dirport requests to Tor download site / website
 
- Version: $Revision$
 
- Last-Modified: $Date$
 
- Author: Roger Dingledine
 
- Created: 2007-12-02
 
- Status: Draft
 
- 1. Overview
 
-   Some countries and networks block connections to the Tor website. As
 
-   time goes by, this will remain a problem and it may even become worse.
 
-   We have many mirrors (search for "Tor mirrors"), but few of
 
-   our users think to try a search like that. Also, many of these mirrors
 
-   may be blocked automatically because their pages contain keywords that
 
-   trigger filters. Lastly, we can imagine a future where the blockers
 
-   are aware of the mirror list too.
 
-   Here we describe a new set of URLs for Tor's DirPort that will relay
 
-   connections from users to the official Tor download site. Rather than
 
-   trying to cache a bunch of new Tor packages (which is a hassle in terms
 
-   of keeping them up to date, and a hassle in terms of drive space used),
 
-   we instead just proxy the requests directly to Tor's /dist page.
 
-   Specifically, we should support
 
-     GET /tor/dist/$1
 
-   and
 
-     GET /tor/website/$1
 
-   where $1 is the remainder of the requested path.
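
As a rough illustration of the two URL forms above, the sketch below maps a request path to the upstream URL a mirror would fetch. The base URLs and the function name map_mirror_request are assumptions made for this example; the proposal does not fix them, and the traversal check is deliberately minimal.

```python
from typing import Optional

# Hypothetical upstream locations. The proposal only says "Tor's /dist page"
# and the Tor website, so these base URLs are assumptions.
UPSTREAM_BASES = {
    "/tor/dist/": "https://www.torproject.org/dist/",
    "/tor/website/": "https://www.torproject.org/",
}


def map_mirror_request(path: str) -> Optional[str]:
    """Return the upstream URL for a mirror request path, or None."""
    for prefix, base in UPSTREAM_BASES.items():
        if path.startswith(prefix):
            rest = path[len(prefix):]  # this is the "$1" part
            # Refuse obvious attempts to leave the mirrored tree.
            # This is not an exhaustive check (e.g. no percent-decoding).
            if rest.startswith("/") or "\\" in rest or ".." in rest.split("/"):
                return None
            return base + rest
    # Not a mirror request; ordinary directory handling would apply.
    return None
```

A request for /tor/dist/ followed by a file name would map to the same file name under the assumed /dist base, while any other /tor/ path returns None and is left to the existing DirPort code.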
 
- 2. Direct connections, one-hop circuits, or three-hop circuits?
 
-   We could relay the connections directly to the download site -- but
 
-   this produces recognizable outgoing traffic on the bridge or cache's
 
-   network, which will probably surprise our nice volunteers. (Is this
 
-   a good enough reason to discard the direct connection idea?)
 
-   Even if we don't do direct connections, should we do a one-hop
 
-   begindir-style connection to the mirror site (make a one-hop circuit
 
-   to it, then send a 'begindir' cell down the circuit), or should we do
 
-   a normal three-hop anonymized connection?
 
-   If these mirrors are mainly bridges, doing either a direct or a one-hop
 
-   connection creates another way to enumerate bridges. That would argue
 
-   for three-hop. On the other hand, downloading a 10+ megabyte installer
 
-   through a normal Tor circuit can't be fun. But if you're already getting
 
-   throttled a lot because you're in the "relayed traffic" bucket, you're
 
-   going to have to accept a slow transfer anyway. So three-hop it is.
 
-   Speaking of which, we would want to label this connection
 
-   as "relay" traffic for the purposes of rate limiting; see
 
-   connection_counts_as_relayed_traffic() and or_conn->client_used. This
 
-   will be a bit tricky though, because these connections will use the
 
-   bridge's guards.
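
To make the rate-limiting point concrete, here is a minimal sketch of two token buckets in which traffic flagged as relayed is charged against both the global bucket and a smaller relayed-traffic bucket. This is a generic illustration, not Tor's implementation: the class names, the rates, and the counts_as_relayed flag (standing in for whatever connection_counts_as_relayed_traffic() would report) are all hypothetical.

```python
class TokenBucket:
    """A plain token bucket measured in bytes."""

    def __init__(self, rate: float, burst: float) -> None:
        self.rate = rate      # bytes added per second
        self.burst = burst    # maximum bytes that may accumulate
        self.tokens = burst   # start full

    def refill(self, seconds: float) -> None:
        self.tokens = min(self.burst, self.tokens + self.rate * seconds)


class RateLimiter:
    """Global bucket plus a separate bucket for relayed traffic."""

    def __init__(self, global_rate: float, global_burst: float,
                 relayed_rate: float, relayed_burst: float) -> None:
        self.global_bucket = TokenBucket(global_rate, global_burst)
        self.relayed_bucket = TokenBucket(relayed_rate, relayed_burst)

    def refill(self, seconds: float) -> None:
        self.global_bucket.refill(seconds)
        self.relayed_bucket.refill(seconds)

    def allowance(self, wanted: int, counts_as_relayed: bool) -> int:
        """Return how many bytes may be moved now, and charge the buckets."""
        limit = min(wanted, self.global_bucket.tokens)
        if counts_as_relayed:
            # Relayed traffic is bounded by the tighter of the two buckets.
            limit = min(limit, self.relayed_bucket.tokens)
        granted = int(limit)
        self.global_bucket.tokens -= granted
        if counts_as_relayed:
            self.relayed_bucket.tokens -= granted
        return granted
```

Under this model a mirror download labeled as relayed is throttled by the smaller bucket even when the global bucket still has room, which is the slow-transfer trade-off accepted above.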
 
- 3. Scanning resistance
 
-   One other goal we'd like to achieve, or at least not hinder, is making
 
-   it hard to scan large swaths of the Internet to look for responses
 
-   that indicate a bridge.
 
-   In general this is a really hard problem, so we shouldn't demand to
 
-   solve it here. But we can note that some bridges should open their
 
-   DirPort (and offer this functionality), and others shouldn't. Then
 
-   some bridges provide a download mirror while others can remain
 
-   scanning-resistant.
 
- 4. Integrity checking
 
-   If we serve this stuff in plaintext from the bridge, anybody in between
 
-   the user and the bridge can intercept and modify it. The bridge can too.
 
-   If we do an anonymized three-hop connection, the exit node can also
 
-   intercept and modify the executable it sends back.
 
-   Are we setting ourselves up for rogue exit relays, or rogue bridges,
 
-   that trojan our users?
 
-   Answer #1: Users need to do PGP signature checking. This is not a
 
-   very good answer: a) it's complex, and b) users don't know the
 
-   right signing keys in the first place.
 
-   Answer #2: The mirrors could exit from a specific Tor relay, using the
 
-   '.exit' notation. This would make connections a bit more brittle, but
 
-   would resolve the rogue exit relay issue. We could even round-robin
 
-   among several, and the list could be dynamic -- for example, all the
 
-   relays with an Authority flag that allow exits to the Tor website.
 
-   Answer #3: The mirrors should connect to the main distribution site
 
-   via SSL. That way the exit relay can't influence anything.
 
-   Answer #4: We could suggest that users only use trusted bridges for
 
-   fetching a copy of Tor. Hopefully they heard about the bridge from a
 
-   trusted source rather than from the adversary.
 
-   Answer #5: What if the adversary is trawling for Tor downloads by
 
-   network signature -- either by looking for known bytes in the binary,
 
-   or by looking for "GET /tor/dist/"? It would be nice to encrypt the
 
-   connection from the bridge user to the bridge. And we can! The bridge
 
-   already supports TLS. Rather than initiating a TLS renegotiation after
 
-   connecting to the ORPort, the user should actually request a URL. Then
 
-   the ORPort can either pass the connection off as a linked conn to the
 
-   dirport, or renegotiate and become a Tor connection, depending on how
 
-   the client behaves.
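
One way to picture the decision in Answer #5: once TLS is up, the bridge peeks at the first application bytes and hands the connection to the directory code only if they look like a mirror request. The sketch below covers just that recognition step; the function name and return labels are assumptions, and how a real ORPort would detect the renegotiation case is left out.

```python
# Request prefixes taken from the URL forms proposed in the overview.
MIRROR_PREFIXES = (b"GET /tor/dist/", b"GET /tor/website/")


def classify_first_bytes(data: bytes) -> str:
    """Classify the first bytes read from a client after the TLS handshake.

    Returns "mirror" if they start a mirror request, "need-more" if they
    could still turn into one, and "not-mirror" otherwise.
    """
    for prefix in MIRROR_PREFIXES:
        if data.startswith(prefix):
            return "mirror"
    # A short read may have cut a mirror request off mid-prefix.
    if any(prefix.startswith(data) for prefix in MIRROR_PREFIXES):
        return "need-more"
    return "not-mirror"
```

A caller would keep reading while the answer is "need-more", link the connection to the directory code on "mirror", and otherwise treat the peer as an ordinary Tor client.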
 
- 5. Linked connections: at what level should we proxy?
 
-   Check out the connection_ap_make_link() function, as called from
 
-   directory.c. Tor clients use this to create a "fake" socks connection
 
-   back to themselves, and then they attach a directory request to it,
 
-   so they can launch directory fetches via Tor. We can piggyback on
 
-   this feature.
 
-   We need to decide if we're going to be passing the bytes back and
 
-   forth between the web browser and the main distribution site, or if
 
-   we're going to be actually acting like a proxy (parsing out the file
 
-   they want, fetching that file, and serving it back).
 
-   Advantages of proxying without looking inside:
 
-     - We don't need to build any sort of HTTP support (including
 
-       continues, partial fetches, and so on).
 
-   Disadvantages:
 
-     - If the browser thinks it's speaking HTTP, are there easy ways
 
-       to pass the bytes to an HTTPS server and have everything work
 
-       correctly? At the least, it would seem that the browser would
 
-       complain about the certificate. More generally, SSL wants to be
 
-       negotiated before the URL and headers are sent, yet we need to read
 
-       the URL and headers to know that this is a mirror request; so we
 
-       have an ordering problem here.
 
-     - Makes it harder to do caching later on, if we don't look at what
 
-       we're relaying. (It might be useful down the road to cache the
 
-       answers to popular requests, so we don't have to keep getting
 
-       them again.)
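
If we did parse requests, the caching mentioned above could start as small as a size-bounded least-recently-used map from request path to response body. The following is a sketch under that assumption only; ResponseCache and its byte budget are hypothetical, and it ignores expiry, which matters because stale packages are exactly the hassle the overview wants to avoid.

```python
from collections import OrderedDict
from typing import Optional


class ResponseCache:
    """Size-bounded LRU cache keyed by request path (illustrative only)."""

    def __init__(self, max_bytes: int) -> None:
        self.max_bytes = max_bytes
        self.used = 0
        self.entries: "OrderedDict[str, bytes]" = OrderedDict()

    def get(self, path: str) -> Optional[bytes]:
        body = self.entries.get(path)
        if body is not None:
            # Mark as most recently used.
            self.entries.move_to_end(path)
        return body

    def put(self, path: str, body: bytes) -> None:
        if len(body) > self.max_bytes:
            return  # too large to cache at all
        old = self.entries.pop(path, None)
        if old is not None:
            self.used -= len(old)
        self.entries[path] = body
        self.used += len(body)
        # Evict least recently used entries until we fit the budget.
        while self.used > self.max_bytes:
            _, evicted = self.entries.popitem(last=False)
            self.used -= len(evicted)
```

None of this is possible when bytes are passed through unread, since the mirror never learns which path a given response belongs to.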
 
- 6. Outstanding problems
 
-   1) HTTP proxies already exist.  Why waste our time cloning one
 
-   badly? When we clone existing stuff, we usually regret it.
 
-   2) It's overbroad.  We only seem to need a secure get-a-tor feature,
 
-   and instead we're contemplating building a locked-down HTTP proxy.
 
-   3) It's going to add a fair bit of complexity to our code.  We do
 
-   not currently implement HTTPS.  We'd need to refactor lots of the
 
-   low-level connection stuff so that "SSL" and "Cell-based" were no
 
-   longer synonymous.
 
-   4) It's still unclear how effective this proposal would be in
 
-   practice. You need to know that this feature exists, which means
 
-   somebody needs to tell you about a bridge (mirror) address and tell
 
-   you how to use it. And if they're doing that, they could (e.g.) tell
 
-   you about a Gmail autoresponder address just as easily, and then you'd
 
-   get better authentication of the Tor program to boot.
 
 
  |