@@ -122,16 +122,16 @@ Here is an example of what the end of the script output should look like:

The artifact produces Tables 2 and 3 (from Section 6) and Table 5 (from Appendix C), showing benchmarks for the current development branch of Lox (Table 2), our fork of this development branch that includes reporting protocols (Table 3), and the original Lox implementation (Table 5 from Appendix C).

-Tables 2 and 3 relate to Main Results 1 and 2 (listed below).
+Tables 2 and 3 relate to Main Results [1](#main-result-1-new-protocols-are-comparable-to-existing-protocols) and [2](#main-result-2-our-modifications-have-a-small-impact-on-existing-protocols) (listed below).

-The artifact also reproduces the results in Appendix A of our paper, which are described below as Main Results 3 and 4.
+The artifact also reproduces the results in Appendix A of our paper, which are described below as Main Results [3](#main-result-3-algorithms-13-are-poor-classifiers-of-censorship-in-the-belarus-case-study) and [4](#main-result-4-email-distributed-obfs4-bridges-were-not-popular-in-belarus).

#### Main Result 1: New Protocols are Comparable to Existing Protocols

Our paper claims that our new protocols for reporting are comparable to existing Lox protocols in terms of communication and computation.
This is found by looking at Table 3 (benchmarks for our modified code, which adds reporting protocols).
Our new protocols (Report Submit, Report Status, and Report Resolve) have times and sizes comparable to those of the other protocols listed in the table.
-This claim is reproduced by Experiment 1, which produces Tables 2 and 3.
+This claim is reproduced by [Experiment 1](#experiment-1-lox-benchmarking), which produces Tables 2 and 3.

#### Main Result 2: Our Modifications Have a Small Impact on Existing Protocols

@@ -142,7 +142,7 @@ The request and response sizes produced by running this artifact should be ident
Request sizes in Table 3 should be at most 192 bytes greater than the sizes in Table 2 for the same protocols.
(The specific differences are 0, 96, 128, and 192.)
Response sizes in Tables 2 and 3 should be identical.
-This claim is reproduced by Experiment 1, which produces Tables 2 and 3.
+This claim is reproduced by [Experiment 1](#experiment-1-lox-benchmarking), which produces Tables 2 and 3.
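
As an illustration of the size check for Main Result 2, the following Python sketch compares request and response sizes between Tables 2 and 3. It is not part of the artifact: the function name `check_sizes`, the dictionary layout, and any protocol names passed in are hypothetical, and the sizes would have to be copied by hand from table-2-results.pdf and table-3-results.pdf. Only the allowed request-size differences (0, 96, 128, and 192 bytes) are taken from the README text.

```python
# Allowed differences (Table 3 request size minus Table 2 request size), in bytes.
# These four values are the ones stated for Main Result 2.
ALLOWED_REQUEST_DELTAS = {0, 96, 128, 192}


def check_sizes(table2, table3):
    """Compare sizes for protocols that appear in Table 2.

    Both arguments map a protocol name to a (request_bytes, response_bytes)
    tuple. This layout is an assumption made for illustration only.
    Returns a list of problem descriptions; an empty list means the sizes
    are consistent with Main Result 2.
    """
    problems = []
    for protocol, (req2, resp2) in table2.items():
        if protocol not in table3:
            problems.append(f"{protocol}: missing from Table 3")
            continue
        req3, resp3 = table3[protocol]
        if req3 - req2 not in ALLOWED_REQUEST_DELTAS:
            problems.append(
                f"{protocol}: request size differs by {req3 - req2} bytes"
            )
        if resp3 != resp2:
            problems.append(
                f"{protocol}: response sizes differ ({resp2} vs {resp3})"
            )
    return problems
```

Protocols that exist only in Table 3 (the new reporting protocols) are ignored by this check, since Main Result 2 concerns only protocols present in both tables.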

#### Main Result 3: Algorithms 1–3 are Poor Classifiers of Censorship in the Belarus Case Study

@@ -151,7 +151,7 @@ These algorithms all failed spectacularly when trying to detect censorship of em
This is shown in Table 4, in which all but one configuration had 0 true positives.
The one instance with any true positives was Algorithm 3, with d=8.
In this case, we observed 5 true positives and 995 false positives.
-This claim is reproduced by Experiment 2, which produces Table 4 from Appendix A.
+This claim is reproduced by [Experiment 2](#experiment-2-belarus-case-study), which produces Table 4 from Appendix A.
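
To put the Algorithm 3 (d=8) numbers in perspective, the short Python sketch below computes precision, i.e. the fraction of flagged bridges that were true positives. The README does not name this metric; it is used here only as one possible way to read the 5 true positives and 995 false positives quoted for Main Result 3, and the function name `precision` is our own.

```python
def precision(true_positives, false_positives):
    """Fraction of positive classifications that were correct.

    Returns None when nothing was flagged, since precision is undefined
    in that case.
    """
    flagged = true_positives + false_positives
    if flagged == 0:
        return None
    return true_positives / flagged


# Counts quoted for Algorithm 3 with d=8.
print(precision(5, 995))  # 0.005
```

Under this reading, only 0.5% of the bridges flagged by the best-performing configuration were actually censored.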

#### Main Result 4: Email-Distributed obfs4 Bridges Were Not Popular in Belarus

@@ -167,6 +167,8 @@ Of the 93 email-distributed obfs4 bridges...
Only 5 of the 93 email-distributed obfs4 bridges had a mean connection count more than 1 standard deviation away from 0.
Of these 5 bridges, the greatest distance between 0 and the mean was only about 1.6 standard deviations.

+These details are reproduced by [Experiment 2](#experiment-2-belarus-case-study).
+
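
The "standard deviations away from 0" figure for Main Result 4 can be sketched as follows. This is a minimal illustration, not the artifact's code: the function name is hypothetical, the input is assumed to be a list of per-period connection counts for one bridge, and the use of the sample standard deviation (`statistics.stdev`) is an assumption, since the README does not say which estimator the artifact uses.

```python
import statistics


def mean_distance_from_zero(connection_counts):
    """Return the mean of the counts, expressed in standard deviations.

    A result above 1 means the mean is more than 1 standard deviation
    away from 0. Returns None when the standard deviation is 0, because
    the ratio is undefined in that case.
    """
    mean = statistics.mean(connection_counts)
    # Assumption: sample standard deviation; the artifact may differ.
    std = statistics.stdev(connection_counts)
    if std == 0:
        return None
    return mean / std
```

A bridge with sporadic connections, such as hypothetical counts of `[0, 0, 10]`, yields a value below 1, which matches the pattern reported for most of the 93 bridges.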
### Experiments

Our entire artifact can be run with the `./run.sh` script, which accepts the following (optional) arguments:
@@ -176,9 +178,9 @@ Our entire artifact can be run with the `./run.sh` script, which accepts the fol
-s (Experiment 2) Run ./scripts/belarus.sh sequentially, instead of in parallel.
--fast (Experiment 2) Start with pre-processed data.

-See Experiment 1 for examples of using the `-n` and `-N` options.
+See [Experiment 1](#experiment-1-lox-benchmarking) for examples of using the `-n` and `-N` options.
This process should take around 1–2 hours and requires 20 GB of free disk space.
-(`-s` or `--fast` will change the time and disk space requirements. See Experiment 2 for details.)
+(`-s` or `--fast` will change the time and disk space requirements. See [Experiment 2](#experiment-2-belarus-case-study) for details.)

#### Experiment 1: Lox Benchmarking

@@ -206,7 +208,7 @@ These scripts create .tex files for these tables and compile them to PDFs.
These .pdf files (as well as the corresponding .tex files) are copied to the root directory of the artifact.
The results can be found as table-2-results.pdf (Table 2), table-3-results.pdf (Table 3), and appendix-c-results.pdf (Table 5).

-table-2-results.pdf and table-3-results.pdf can be used to verify Main Results 1 and 2.
+table-2-results.pdf and table-3-results.pdf can be used to verify Main Results [1](#main-result-1-new-protocols-are-comparable-to-existing-protocols) and [2](#main-result-2-our-modifications-have-a-small-impact-on-existing-protocols).

#### Experiment 2: Belarus Case Study

@@ -238,10 +240,10 @@ The errors appear because there are some records that are malformed (missing bri

This experiment outputs appendix-a-results.pdf (and a corresponding .tex file), containing Table 4 (from Appendix A of our paper).
As this process begins with downloading existing data, and the algorithms used are deterministic, the contents of the output Table 4 should be identical to those in our paper.
-This table can be used to verify Main Result 3.
+This table can be used to verify [Main Result 3](#main-result-3-algorithms-13-are-poor-classifiers-of-censorship-in-the-belarus-case-study).

This experiment also outputs a text file called "appendix-a-results", which contains various statistics about connection counts to the bridges we evaluate, including those described in Appendix A.2.
-These can be used to verify Main Result 4.
+These can be used to verify [Main Result 4](#main-result-4-email-distributed-obfs4-bridges-were-not-popular-in-belarus).
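
After a full run, a quick way to confirm that every result file named in the two experiments was produced is sketched below in Python. The helper `missing_outputs` is hypothetical and not part of the artifact. The file names are the ones given in the text; that all of them end up in the artifact's root directory is an assumption, since the README states this explicitly only for the Experiment 1 outputs.

```python
from pathlib import Path

# Output files named in the Experiment 1 and Experiment 2 descriptions.
EXPECTED_OUTPUTS = [
    "table-2-results.pdf",     # Table 2
    "table-3-results.pdf",     # Table 3
    "appendix-c-results.pdf",  # Table 5
    "appendix-a-results.pdf",  # Table 4
    "appendix-a-results",      # text file with connection-count statistics
]


def missing_outputs(root="."):
    """Return the expected output files that are absent from `root`.

    Assumption: all outputs are written to the same directory.
    """
    root_path = Path(root)
    return [name for name in EXPECTED_OUTPUTS if not (root_path / name).is_file()]


if __name__ == "__main__":
    missing = missing_outputs()
    if missing:
        print("Missing output files:", ", ".join(missing))
    else:
        print("All expected output files are present.")
```

This only checks that the files exist; the contents still need to be compared against the paper as described under each main result.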

## Limitations