@@ -6,45 +6,85 @@ This repo contains scripts for reproducing the results in our paper.

Dependencies:
- bash
+- curl
- docker
- git
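
Before starting, it can help to confirm that the dependencies listed above are installed. The following is a minimal sketch, not part of the repository's scripts; the helper name `missing_commands` is hypothetical and only checks that each command is on `PATH`, not that its version is suitable.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check; not shipped with the repository.

# Print the name of every command in the argument list that is not on PATH.
missing_commands() {
    local cmd
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# The four dependencies named in the list above.
missing=$(missing_commands bash curl docker git)
if [ -n "$missing" ]; then
    echo "Missing dependencies:" $missing
else
    echo "All dependencies found."
fi
```

This only reports what is missing; installing the tools is left to your system's package manager.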

+### Quick Start
+
To reproduce our results:

-1. `git clone -b artifact https://git-crysp.uwaterloo.ca/vvecna/lox-troll-patrol-extension`
-2. `cd lox-troll-patrol-extension`
-3. `./run.sh [-n NUM_PERFORMANCE_CORES] [-N PERFORMANCE_CORE_RANGE]` or `./run-fast.sh [-n NUM_PERFORMANCE_CORES] [-N PERFORMANCE_CORE_RANGE]`
-
-**The `./run.sh` script takes a long time and requires a few GB of free
-space.** The reason is that it downloads and processes all extra-info
-records for all bridges from 2020-07 to 2021-04 from the Tor Project's
-CollecTor service. This is about 4 million files, and it takes a long
-time to process them all. (These files are downloaded in compressed
-archives; the total download size for them is under 750 MB.)
-
-The `./run-fast.sh` script instead starts with a 6.7 MB archive (87 MB
-uncompressed) containing only the information we need. This is *much*
-faster and requires much less disk space. It yields identical results;
-however, it requires trusting that the pre-processed data was extracted
-correctly.
-
-See the "Time to Run" section below to estimate how long either option
-will take. The choice between the two scripts only determines whether
-`./scripts/belarus.sh` or `./scripts/belarus.sh --fast` will be run. The
-other steps are pretty quick. If you specify `-n NUM_PERFORMANCE_CORES`,
-then the Lox benchmarking step (`./scripts/generate-lox-results.sh`)
-will only use the first `NUM_PERFORMANCE_CORES` threads. This may take a
-bit longer but should ensure that the results are computed fairly. You
-can also specify `-N PERFORMANCE_CORE_RANGE` to use only specific
-threads (e.g., `-N 1-2,4-5`), instead of 0 through
-`NUM_PERFORMANCE_CORES`-1. If you use the `-N` option, please also use
-the `-n` option to specify how many threads are available in the
-specified set.
+1. Install the dependencies listed above.
+2. `git clone -b artifact https://git-crysp.uwaterloo.ca/vvecna/lox-troll-patrol-extension`
+3. `cd lox-troll-patrol-extension`
+4. `./run.sh`
+
+See below for additional options. In particular, this script is expected
+to take about 1-2 hours and requires about 20 GB of free disk space. If
+either of these requirements is too high, see the `-s` and `--fast`
+options below. If your CPU has some cores that are more performant than
+others, see the `-n` and `-N` options below.
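
To decide between the default mode and the `-s` or `--fast` options, you can check the free disk space first. This is a rough sketch under stated assumptions: the helper `has_free_gb` is hypothetical (it is not in the repository), it treats 1 GB as 1024 × 1024 one-kilobyte blocks, and it checks the filesystem that holds the current directory, which is assumed to be where the data will be extracted.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the filesystem holding directory $1
# has at least $2 GB available (counting 1 GB as 1024*1024 KB).
has_free_gb() {
    local dir=$1 need_gb=$2 avail_kb
    # With -P (POSIX format) and -k (1024-byte blocks), the available
    # space is the fourth field of the second output line.
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge $((need_gb * 1024 * 1024)) ]
}

if has_free_gb . 20; then
    echo "Enough space for the default ./run.sh"
else
    echo "Less than 20 GB free; consider ./run.sh -s or ./run.sh --fast"
fi
```

The 20 GB threshold comes from the estimate above; the check does not account for space taken by the docker images.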
+
+### Options
+
+Full usage: `./run.sh [-s|--fast] [-n NUM_PERFORMANCE_CORES] [-N PERFORMANCE_CORE_RANGE]`
+
+`./run.sh` runs four scripts in order:
+
+1. `./scripts/setup.sh` sets up the docker images.
+2. `./scripts/belarus.sh` produces the results from the Belarus 2020-2021 case study (Section 3 and Appendix A of our paper).
+3. `./scripts/generate-lox-results.sh` runs benchmarks for the original version of Lox, a current development branch of Lox, and our fork of Lox containing additional report-related protocols.
+4. `./scripts/process-lox-results.sh` processes the results from the previous step and outputs the tables from Section 6 and Appendix B of our paper.
+
+If the `-s` or `--fast` option (these are mutually exclusive) is specified, it is passed to `./scripts/belarus.sh`.
+
+`./scripts/belarus.sh` downloads and processes all extra-info records
+for all bridges from 2020-07 through 2021-04 from the Tor Project's
+CollecTor service. This is about 4 million files (around 20 GB
+uncompressed). By default, data from all 10 months is processed in
+parallel, which takes under an hour but requires extracting and storing
+all 20 GB of data at once.
+
+If you cannot afford to fill up 20 GB of disk space at once, you can use
+the `-s` option, which processes the 10 months one at a time. This will
+take about 10 times as long but will only use a few GB of space at a
+time.
+
+If you cannot afford these time and space requirements, you can use the
+`--fast` option, which does not download the extra-info archives from
+the Tor Project. Instead, it starts with a 6.7 MB archive (87 MB
+uncompressed) containing only the information we need. It yields
+identical results and takes only 1-2 minutes; however, it requires
+trusting that the pre-processed data we provide was extracted correctly.
+
+If the `-n` and/or `-N` options are specified, they are passed to `./scripts/generate-lox-results.sh`.
+
+`./scripts/generate-lox-results.sh` runs benchmarks, with each process
+isolated to a single thread. Ideally, the threads used should all be
+equally performant, to ensure that all results computed can be
+reasonably compared to each other.
+
+By default, the script will use all available threads (up to the number
+of processes to be run, which is 16), but you can use
+`-n NUM_PERFORMANCE_CORES` to restrict it to
+`NUM_PERFORMANCE_CORES` threads.
+
+By default, if the `-n` option is used, the script will use the first
+`NUM_PERFORMANCE_CORES` threads (0 through `NUM_PERFORMANCE_CORES`-1).
+If these are not the ones you want to use, you can also specify
+`-N PERFORMANCE_CORE_RANGE` to indicate the specific threads to use.
+**If you use `-N` to indicate a range, please also specify `-n` to
+indicate the number of threads in that range.** The script does not
+compute this for you.
+
+Examples:
+
+- `./run.sh --fast -n 4`: Use pre-processed data for the `./scripts/belarus.sh` step, and use only the first 4 threads (i.e., 0-3) for `./scripts/generate-lox-results.sh`.
+- `./run.sh -n 5 -N 1-2,4-6`: Use only threads 1, 2, 4, 5, and 6 for `./scripts/generate-lox-results.sh`.
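
Because the `-n` value has to match the number of threads named by `-N`, a small helper can compute it for you. This is a sketch only: `count_threads` is a hypothetical function, not part of the repository, and it assumes the range is a comma-separated list of single thread IDs or inclusive `start-end` ranges, as in the examples above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count the thread IDs in a list such as "1-2,4-6".
# Assumes each comma-separated part is either "N" or an inclusive "A-B".
count_threads() {
    local spec=$1 total=0 part start end
    local IFS=,
    for part in $spec; do
        start=${part%-*}   # text before the "-" (the whole part if there is none)
        end=${part#*-}     # text after the "-" (the whole part if there is none)
        total=$((total + end - start + 1))
    done
    echo "$total"
}

range="1-2,4-6"
# Print the command rather than running it.
echo "./run.sh -n $(count_threads "$range") -N $range"
# prints: ./run.sh -n 5 -N 1-2,4-6
```

The helper does not validate its input or detect overlapping ranges, so check the printed command before running it.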

## Results

The results are all output to the top-level directory of the project.
-After running either `./run.sh` or `./run-fast.sh`:
+After running `./run.sh`:
- The results from Section 3 can be found in the **section-3-results** file.
- Table 4 (found in Appendix A of the paper) can be found in **appendix-a-results.pdf**
- Tables 2 and 3 (from Section 6) can be found in **table-2-results.pdf** and **table-3-results.pdf**.
@@ -60,25 +100,22 @@ re-run steps that have files. If a step worked, just leave its files
there, and you shouldn't have to redo that step.

If you want to run the commands for individual steps, you can do that.
-From the top-level directory, run:
-- `./scripts/setup.sh` to clone dependencies and build the docker images
-- `./scripts/belarus.sh` to produce the results from Section 3 and Appendix A starting with all 20 GB of data from the Tor Project (or `./scripts/belarus.sh --fast` to start with some pre-processed data)
-- `./scripts/generate-lox-results.sh [-n NUM_EFFICIENT_CPUS]` to run the Lox benchmarking code to get the results from Section 6 and Appendix B (see the next step for producing the actual tables); specify `NUM_PERFORMANCE_CORES` to only use the first `NUM_PERFORMANCE_CORES` threads
-- `./scripts/process-lox-results.sh` to process the Lox benchmarking results previously produced by `./scripts/generate-lox-results.sh` and generate the tables in Section 6 and Appendix B
+See the Options section above.

## Time to Run

On my laptop ([13th Gen Intel Core i7-1360P](https://www.intel.com/content/www/us/en/products/sku/232155/intel-core-i71360p-processor-18m-cache-up-to-5-00-ghz/specifications.html); 16 threads; 4 performance cores, up to 5 GHz):
- `./scripts/setup.sh`: 10m 35s
-- `./scripts/belarus.sh`: 9h 15m 16s
+- `./scripts/belarus.sh`: 51m 2s
+- `./scripts/belarus.sh -s`: 9h 15m 16s
- `./scripts/belarus.sh --fast`: 1m 4s
- `./scripts/generate-lox-results.sh -n 4`: 18m 39s
- `./scripts/process-lox-results.sh`: 18s

On the device used for the paper ([Intel Xeon Platinum 8380](https://www.intel.com/content/www/us/en/products/sku/212287/intel-xeon-platinum-8380-processor-60m-cache-2-30-ghz/specifications.html); 80 threads; 40 cores @ 2.30 GHz, up to 3.40 GHz):
- `./scripts/setup.sh`: 6m 47s
-- `./scripts/belarus.sh`: 7h 57m 10s
+- `./scripts/belarus.sh`: 42m 32s
+- `./scripts/belarus.sh -s`: 7h 57m 10s
- `./scripts/belarus.sh --fast`: 1m 0s
- `./scripts/generate-lox-results.sh`: 5m 56s
- `./scripts/process-lox-results.sh`: 17s
-