18 years ago · be069d3cd1
--- a/doc/spec/proposals/108-mtbf-based-stability.txt
+++ b/doc/spec/proposals/108-mtbf-based-stability.txt
@@ -29,22 +29,16 @@ Spec changes:
     known to drop circuits stupidly. (0.1.1.10-alpha through 0.1.1.16-rc
     are stupid this way.)
 
-   Stability shall be defined as the mean length of the runs observed by a
-   given directory authority.  A run begins when an authority decides
-   that the server is Running, and ends when the authority decides that
-   the server is not Running.  In-progress runs are counted when
-   measuring Stability.
+   Stability shall be defined as the weighted mean length of the runs
+   observed by a given directory authority.  A run begins when an authority
+   decides that the server is Running, and ends when the authority decides
+   that the server is not Running.  In-progress runs are counted when
+   measuring Stability.  When calculating the mean, runs are weighted by
+   $\alpha ^ t$, where $t$ is time elapsed since the end of the run, and
+   $0 < \alpha < 1$.  Time when an authority is down does not count toward the
+   length of the run.
 
-Issues:
-
-   How do you define a clipped MTBF?  If the current month begins with one
-   day at the end of a one-year uptime, and then has 29 days of uptime, do we
-   average one day and 29 days?  Or do we average one year and 29 days?  Or
-   take 29 days on its own and discard the year?
-
-   Surely somebody has done this kinds of thing before.
-
-Alternative:
+Rejected Alternative:
 
    "A router's Stability shall be defined as the sum of $\alpha ^ d$ for every
    $d$ such that the router was not observed to be unavailable $d$ days ago."
@@ -82,3 +76,13 @@ Implementation:
    For now, the easiest way to store this information at authorities
    would probably be in some kind of periodically flushed flat file.
    Later, we could move to Berkeley db or something if we really had to.
+
+   For each router, an authority will need to store:
+     The router ID.
+     Whether the router is up.
+     The time when the current run started, if the router is up.
+     The weighted sum length of all previous runs.
+     The time at which the weighted sum length was last weighted down.
+
+   Servers should probe at random intervals to test whether servers are
+   running.
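
The per-router record and the weighted mean added by this change could be sketched roughly as follows. This is an illustration under stated assumptions, not the authorities' actual code. The class name RouterHistory, its field and method names, the value of ALPHA, and the choice of days as the unit for $t$ are all hypothetical. The sketch also keeps a running sum of the weights (total_weight), which is not in the list of stored fields above but is needed as the denominator of a weighted mean. An in-progress run is given weight 1, on the assumption that its $t$ is zero. Time during which the authority itself is down is not modelled.

```python
from dataclasses import dataclass
from typing import Optional

ALPHA = 0.95               # assumed decay per day; the proposal only requires 0 < alpha < 1
SECONDS_PER_DAY = 86400.0  # assumed unit for t


@dataclass
class RouterHistory:
    """Hypothetical per-router record kept by one directory authority."""
    router_id: str
    run_start: Optional[float] = None  # start of the current run; None means "down"
    weighted_run_length: float = 0.0   # sum of alpha^t * length over finished runs
    total_weight: float = 0.0          # sum of alpha^t over finished runs (assumed extra field)
    last_weighted: float = 0.0         # time at which the sums were last weighted down

    @property
    def is_up(self) -> bool:
        return self.run_start is not None

    def _decay(self, now: float) -> float:
        """Factor alpha^(elapsed days) since the sums were last weighted down."""
        if self.total_weight == 0.0:
            return 1.0  # nothing stored yet, so there is nothing to decay
        elapsed = max(0.0, now - self.last_weighted)
        return ALPHA ** (elapsed / SECONDS_PER_DAY)

    def _weigh_down(self, now: float) -> None:
        """Age every finished run at once by scaling both running sums."""
        factor = self._decay(now)
        self.weighted_run_length *= factor
        self.total_weight *= factor
        self.last_weighted = now

    def mark_up(self, now: float) -> None:
        """The authority has decided the router is Running."""
        if self.run_start is None:
            self.run_start = now

    def mark_down(self, now: float) -> None:
        """The authority has decided the router is not Running."""
        if self.run_start is None:
            return
        self._weigh_down(now)
        self.weighted_run_length += now - self.run_start  # run just ended: weight alpha^0 = 1
        self.total_weight += 1.0
        self.run_start = None

    def stability(self, now: float) -> float:
        """Weighted mean run length in seconds; 0.0 if no run has been observed."""
        factor = self._decay(now)
        length_sum = self.weighted_run_length * factor
        weight_sum = self.total_weight * factor
        if self.run_start is not None:
            # In-progress runs are counted; assumed weight 1 (t = 0).
            length_sum += now - self.run_start
            weight_sum += 1.0
        return length_sum / weight_sum if weight_sum > 0.0 else 0.0


if __name__ == "__main__":
    h = RouterHistory("example-router")
    h.mark_up(0.0)
    h.mark_down(2 * SECONDS_PER_DAY)   # a two-day run
    h.mark_up(3 * SECONDS_PER_DAY)     # back up after one day down
    print(h.stability(4 * SECONDS_PER_DAY))
```

Because every weight decays by the same factor over the same interval, the two sums can be scaled in one step whenever they are touched, instead of storing each run separately. That is why a single "last weighted down" timestamp per router is enough.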