# Troll Patrol Artifact

This repo contains scripts for reproducing the results in our paper.

## Reproducing our results

Dependencies:
- bash
- curl
- docker
- git

### Quick Start

To reproduce our results:

1. Install the dependencies listed above.
2. `git clone -b artifact https://git-crysp.uwaterloo.ca/vvecna/lox-troll-patrol-extension`
3. `cd lox-troll-patrol-extension`
4. `./run.sh`

See below for additional options. With the default options, this script
is expected to take about 1-2 hours and requires about 20 GB of free
disk space. If either requirement is too high, see the `-s` and `--fast`
options below. If some of your CPU's cores are more performant than
others, see the `-n` and `-N` options below.
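
Before starting a long run, it can help to confirm that the dependencies listed above are on your `PATH`. The following is a minimal sketch, not part of the artifact; `missing_deps` is a hypothetical helper name, and the check only looks for the commands, not for a running docker daemon.

```bash
#!/usr/bin/env bash
# Print the name of each given command that is not found on PATH.
# (Hypothetical helper; not provided by this repo.)
missing_deps() {
    local cmd
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Check the four dependencies named in this README.
missing=$(missing_deps bash curl docker git)
if [ -n "$missing" ]; then
    echo "Missing dependencies:" $missing
else
    echo "All dependencies found."
fi
```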

### Options

Full usage: `./run.sh [-s|--fast] [-n NUM_PERFORMANCE_CORES] [-N PERFORMANCE_CORE_RANGE]`

`./run.sh` runs four scripts in order:

1. `./scripts/setup.sh` sets up the docker images.
2. `./scripts/belarus.sh` produces the results from the Belarus 2020-2021 case study (Section 3 and Appendix A of our paper).
3. `./scripts/generate-lox-results.sh` runs benchmarks for the original version of Lox, a current development branch of Lox, and our fork of Lox containing additional report-related protocols.
4. `./scripts/process-lox-results.sh` processes the results from the previous step and outputs the tables from Section 6 and Appendix B of our paper.

The `-s` and `--fast` options are mutually exclusive. If either is specified, it is passed to `./scripts/belarus.sh`.

`./scripts/belarus.sh` downloads and processes all extra-info records
for all bridges from 2020-07 through 2021-04 from the Tor Project's
CollecTor service. This is about 4 million files (around 20 GB
uncompressed). By default, data from all 10 months is processed in
parallel, which takes under an hour but requires extracting and storing
all 20 GB of data at once.

If you cannot afford to fill up 20 GB of disk space at once, you can use
the `-s` option, which processes the 10 months sequentially instead of
in parallel. This will take about 10 times as long but will only use a
few GB of space at a time.

If you cannot afford the time and space requirements, you can use the
`--fast` option, which does not download the extra-info archives from
the Tor Project. Instead, it starts with a 6.7 MB archive (87 MB
uncompressed) containing only the information we need. It yields
identical results and takes only 1-2 minutes; however, it requires
trusting that the pre-processed data we provide was extracted correctly.
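
The choice between these modes mostly comes down to free disk space and time. As a rough sketch, the snippet below reports the free space in the current directory and compares it against the 20 GB figure above. `free_gb` and `suggest_flags` are hypothetical helpers, not part of the artifact, and the cutoff is only the approximate number quoted in this README.

```bash
#!/usr/bin/env bash
# Free space, in whole GB, on the filesystem holding the given directory.
# Uses POSIX df output, where column 4 is the available space in KB.
free_gb() {
    df -Pk "${1:-.}" | awk 'NR==2 { print int($4 / 1024 / 1024) }'
}

# Suggest run.sh flags for a given amount of free space (in GB).
# Assumption: 20 GB is the approximate requirement stated in this README.
suggest_flags() {
    if [ "$1" -ge 20 ]; then
        echo "(none)"
    else
        echo "-s or --fast"
    fi
}

free=$(free_gb .)
echo "Free space: ${free} GB; suggested flags: $(suggest_flags "$free")"
```

Between `-s` and `--fast`, the trade-off is the one described above: `-s` is slow but processes the original data, while `--fast` relies on the pre-processed archive.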

If the `-n` and/or `-N` options are specified, they are passed to `./scripts/generate-lox-results.sh`.

`./scripts/generate-lox-results.sh` runs benchmarks, with each process
isolated to a single thread. Ideally, the threads used should all be
equally performant, to ensure that all results computed can be
reasonably compared to each other.

By default, the script will use all available threads (up to the number
of processes to be run, which is 16), but you can use
`-n NUM_PERFORMANCE_CORES` to restrict it to only use
`NUM_PERFORMANCE_CORES` threads.

By default, if the `-n` option is used, the script will use the first
`NUM_PERFORMANCE_CORES` threads. If these are not the ones you want to
use, you can also specify `-N PERFORMANCE_CORE_RANGE` to indicate the
specific threads to use. **If you use `-N` to indicate a range, please
also specify `-n` to indicate the number of values in that range.** The
script does not compute this for you.

Examples:

- `./run.sh --fast -n 4`: Use pre-processed data for the `./scripts/belarus.sh` step, and use only the first 4 threads (i.e., 0-3) for `./scripts/generate-lox-results.sh`.
- `./run.sh -n 5 -N 1-2,4-6`: Use only threads 1, 2, 4, 5, and 6 for `./scripts/generate-lox-results.sh`.
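
Since the script does not count the values in a `-N` range for you, a small helper can do it. This is a sketch that assumes the range uses the comma-and-hyphen format shown in the example above; `count_threads` is a hypothetical name, not something the artifact provides.

```bash
#!/usr/bin/env bash
# Count the thread IDs in a comma-separated list of IDs and ranges,
# e.g. "1-2,4-6" covers threads 1, 2, 4, 5, and 6.
count_threads() {
    local spec=$1 part start end total=0
    local IFS=,
    for part in $spec; do
        start=${part%-*}   # text before the hyphen (or the whole ID)
        end=${part#*-}     # text after the hyphen (or the whole ID)
        total=$(( total + end - start + 1 ))
    done
    echo "$total"
}

range="1-2,4-6"
n=$(count_threads "$range")
# Print the matching command line rather than running it.
echo "./run.sh -n $n -N $range"
```

For the range `1-2,4-6`, this prints `./run.sh -n 5 -N 1-2,4-6`, matching the second example above.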

## Results

The results are all output to the top-level directory of the project.
After running `./run.sh`:
- The results from Section 3 can be found in the **section-3-results** file.
- Table 4 (found in Appendix A of the paper) can be found in **appendix-a-results.pdf**.
- Tables 2 and 3 (from Section 6) can be found in **table-2-results.pdf** and **table-3-results.pdf**.
- Table 5 (in Appendix B) can be found in **appendix-b-results.pdf**.
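
To confirm that a run produced every output listed above, you could check for the files by name. The sketch below assumes it is run from the top-level directory of the project; `missing_results` is a hypothetical helper, not part of the artifact.

```bash
#!/usr/bin/env bash
# Print each expected result file that is missing from the given directory
# (defaults to the current directory).
missing_results() {
    local dir=${1:-.} f
    for f in section-3-results appendix-a-results.pdf \
             table-2-results.pdf table-3-results.pdf appendix-b-results.pdf; do
        [ -e "$dir/$f" ] || echo "$f"
    done
}

# No output means all five result files are present.
missing_results .
```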

## Recovering from Errors

The script should run everything without issue. If a step fails, try
deleting the files and directories related to that step, stopping and
removing any docker containers related to that step, and running
`./run.sh` again. The script does not re-run steps whose files already
exist, so leave the files from successful steps in place and you
shouldn't have to redo those steps.

You can also run the script for an individual step directly. The Options
section above lists each script and the options it accepts.

## Time to Run

On my laptop ([13th Gen Intel Core i7-1360P](https://www.intel.com/content/www/us/en/products/sku/232155/intel-core-i71360p-processor-18m-cache-up-to-5-00-ghz/specifications.html); 16 threads; 4 performance cores, up to 5 GHz):
- `./scripts/setup.sh`: 10m 35s
- `./scripts/belarus.sh`: 51m 2s
- `./scripts/belarus.sh -s`: 9h 15m 16s
- `./scripts/belarus.sh --fast`: 1m 4s
- `./scripts/generate-lox-results.sh -n 4`: 18m 39s
- `./scripts/process-lox-results.sh`: 18s

On the device used for the paper ([Intel Xeon Platinum 8380](https://www.intel.com/content/www/us/en/products/sku/212287/intel-xeon-platinum-8380-processor-60m-cache-2-30-ghz/specifications.html); 80 threads; 40 cores @ 2.30 GHz, up to 3.40 GHz):
- `./scripts/setup.sh`: 6m 47s
- `./scripts/belarus.sh`: 42m 32s
- `./scripts/belarus.sh -s`: 7h 57m 10s
- `./scripts/belarus.sh --fast`: 1m 0s
- `./scripts/generate-lox-results.sh`: 5m 56s
- `./scripts/process-lox-results.sh`: 17s
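
As a worked check on the overall estimate, the laptop timings for a default run (`setup.sh`, `belarus.sh`, `generate-lox-results.sh -n 4`, and `process-lox-results.sh`) can be summed. This is only an arithmetic sketch over the numbers listed above; `to_seconds` is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Convert a duration such as "9h 15m 16s" (any subset of units) to seconds.
to_seconds() {
    local total=0 tok
    for tok in $1; do
        case $tok in
            *h) total=$(( total + ${tok%h} * 3600 )) ;;
            *m) total=$(( total + ${tok%m} * 60 )) ;;
            *s) total=$(( total + ${tok%s} )) ;;
        esac
    done
    echo "$total"
}

# Sum the four laptop timings for the default (non -s, non --fast) run.
sum=0
for t in "10m 35s" "51m 2s" "18m 39s" "18s"; do
    sum=$(( sum + $(to_seconds "$t") ))
done
printf '%dh %dm %ds\n' $(( sum / 3600 )) $(( sum % 3600 / 60 )) $(( sum % 60 ))
```

This prints `1h 20m 34s`, which falls within the 1-2 hour estimate given in the Quick Start section.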