| 
					
				 | 
			
			
				@@ -0,0 +1,134 @@ 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+Filename: 168-reduce-circwindow.txt 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+Title: Reduce default circuit window 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+Author: Roger Dingledine 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+Created: 12-Aug-2009 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+Status: Open 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+Target: 0.2.2 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+ 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+0. History 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+ 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+ 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+1. Overview 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+ 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  We should reduce the starting circuit "package window" from 1000 to 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  101. The lower package window will mean that clients will only be able 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  to receive 101 cells (~50KB) on a circuit before they need to send a 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  'sendme' acknowledgement cell to request 100 more. 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+ 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  Starting with a lower package window on exit relays should save on 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  buffer sizes (and thus memory requirements for the exit relay), and 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  should save on queue sizes (and thus latency for users). 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+ 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  Lowering the package window will induce an extra round-trip for every 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  additional 50298 bytes of the circuit. This extra step is clearly a 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  slow-down for large streams, but ultimately we hope that a) clients 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  fetching smaller streams will see better response, and b) slowing 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  down the large streams in this way will produce lower e2e latencies, 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  so the round-trips won't be so bad. 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+ 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+2. Motivation 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+ 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  Karsten's torperf graphs show that the median download time for a 50KB 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  file over Tor in mid 2009 is 7.7 seconds, whereas the median download 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  time for 1MB and 5MB are around 50s and 150s respectively. The 7.7 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  second figure is way too high, whereas the 50s and 150s figures are 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  surprisingly low. 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+ 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  The median round-trip latency appears to be around 2s, with 25% of 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  the data points taking more than 5s. That's a lot of variance. 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+ 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  We designed Tor originally with the original goal of maximizing 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  throughput. We figured that would also optimize other network properties 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  like round-trip latency. Looks like we were wrong. 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+ 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+3. Design 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+ 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  Wherever we initialize the circuit package window, initialize it to 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  101 rather than 1000. Reducing it should be safe even when interacting 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  with old Tors: the old Tors will receive the 101 cells and send back 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  a sendme ack cell. They'll still have much higher deliver windows, 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  but the rest of their deliver window will go unused. 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+ 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  You can find the patch at arma/circwindow. It seems to work. 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+ 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+3.1. Why not 100? 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+ 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  Tor 0.0.0 through 0.2.1.19 have a bug where they only send the sendme 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  ack cell after 101 cells rather than the intended 100 cells. 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+ 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  Once 0.2.1.19 is obsolete we can change it back to 100 if we like. But 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  hopefully we'll have moved to some datagram protocol long before 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  0.2.1.19 becomes obsolete. 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+ 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+3.2. What about stream packaging windows? 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+ 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  Right now the stream packaging windows start at 500. The goal was to 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  set the stream window to half the circuit window, to provide a crude 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  load balancing between streams on the same circuit. Once we lower 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  the circuit packaging window, the stream packaging window basically 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  becomes redundant. 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+ 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  We could leave it in -- it isn't hurting much in either case. Or we 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  could take it out -- people building other Tor clients would thank us 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  for that step. Alas, people building other Tor clients are going to 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  have to be compatible with current Tor clients, so in practice there's 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  no point taking out the stream packaging windows. 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+ 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+3.3. What about variable circuit windows? 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+ 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  Once upon a time we imagined adapting the circuit package window to 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  the network conditions. That is, we would start the window small, 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  and raise it based on the latency and throughput we see. 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+ 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  In theory that crude imitation of TCP's windowing system would allow 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  us to adapt to fill the network better. In practice, I think we want 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  to stick with the small window and never raise it. The low cap reduces 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  the total throughput you can get from Tor for a given circuit. But 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  that's a feature, not a bug. 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+ 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+4. Evaluation 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+ 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  How do we know this change is actually smart? It seems intuitive that 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  it's helpful, and some smart systems people have agreed that it's 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  a good idea (or said another way, they were shocked at how big the 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  default package window was before). 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+ 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  To get a more concrete sense of the benefit, though, Karsten has been 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  running torperf side-by-side on exit relays with the old package window 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  vs the new one. The results are mixed currently -- it is slightly faster 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  for fetching 40KB files, and slightly slower for fetching 50KB files. 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+ 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  I think it's going to be tough to get a clear conclusion that this is 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  a good design just by comparing one exit relay running the patch. The 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  trouble is that the other hops in the circuits are still getting bogged 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  down by other clients introducing too much traffic into the network. 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+ 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  Ultimately, we'll want to put the circwindow parameter into the 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  consensus so we can test a broader range of values once enough relays 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  have upgraded. 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+ 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+5. Transition and deployment 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+ 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  We should put the circwindow in the consensus (see proposal 167), 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  with an initial value of 101. Then as more exit relays upgrade, 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  clients should seamlessly get the better behavior. 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+ 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  Note that upgrading the exit relay will only affect the "download" 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  package window. An old client that's uploading lots of bytes will 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  continue to use the old package window at the client side, and we 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  can't throttle that window at the exit side without breaking protocol. 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+ 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  The real question then is what we should backport to 0.2.1. Assuming 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  this could be a big performance win, we can't afford to wait until 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  0.2.2.x comes out before starting to see the changes here. So we have 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  two options as I see them: 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  a) once clients in 0.2.2.x know how to read the value out of the 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  consensus, and it's been tested for a bit, backport that part to 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  0.2.1.x. 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  b) if it's too complex to backport, just pick a number, like 101, and 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  backport that number. 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+ 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  Clearly choice (a) is the better one if the consensus parsing part 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  isn't very complex. Let's shoot for that, and fall back to (b) if the 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+  patch turns out to be so big that we reconsider. 
			 | 
		
	
		
			
				 | 
				 | 
			
			
				+ 
			 |