Troll Patrol extension to Lox, forked from https://gitlab.torproject.org/onyinyang/lox/-/tree/lox-extension

Troll Patrol Artifact

This repo contains scripts for reproducing the results in our paper.

Reproducing our results

Dependencies:

  • bash
  • curl
  • docker
  • git

Quick Start

To reproduce our results:

  1. Install the dependencies listed above.
  2. git clone -b artifact https://git-crysp.uwaterloo.ca/vvecna/lox-troll-patrol-extension
  3. cd lox-troll-patrol-extension
  4. ./run.sh

See below for additional options. In particular, this script is expected to take about 1-2 hours and requires about 20 GB of free disk space. If either of these requirements is too high, see the -s and --fast options below. If your CPU has some cores that are more performant than others, see the -n and -N options below.
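
Before picking a mode, it can help to check how much disk space is free. The following is a minimal sketch and not part of the artifact: the helper name `suggest_mode` is hypothetical, and the 20 GB threshold is simply the figure quoted above, with no safety margin added.

```shell
# Hypothetical helper (not shipped with the artifact): suggest a run.sh
# invocation based on the free space available in the current directory.

# Space needed by the default (parallel) mode, per the figure quoted above.
REQUIRED_GB=20

# suggest_mode FREE_GB: print the invocation that fits the given free space.
suggest_mode() {
    if [ "$1" -ge "$REQUIRED_GB" ]; then
        echo "./run.sh"
    else
        echo "./run.sh -s (or ./run.sh --fast)"
    fi
}

# df -Pk prints POSIX-format output in 1024-byte blocks;
# the fourth field of the second line is the available space.
free_kb=$(df -Pk . | awk 'NR == 2 { print $4 }')
free_gb=$((free_kb / 1024 / 1024))

echo "Free space: ${free_gb} GB"
echo "Suggested:  $(suggest_mode "$free_gb")"
```

Run this from the directory where you plan to clone the repository, since that is where the data will be extracted.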

Options

Full usage: ./run.sh [-s|--fast] [-n NUM_PERFORMANCE_CORES] [-N PERFORMANCE_CORE_RANGE]

./run.sh actually runs four different scripts:

  1. ./scripts/setup.sh sets up the docker images.
  2. ./scripts/belarus.sh produces the results from the Belarus 2020-2021 case study (Section 3 and Appendix A of our paper).
  3. ./scripts/generate-lox-results.sh runs benchmarks for the original version of Lox, a current development branch of Lox, and our fork of Lox containing additional report-related protocols.
  4. ./scripts/process-lox-results.sh processes the results from the previous step and outputs the tables from Section 6 and Appendix B of our paper.

If the -s or --fast option (these are mutually exclusive) is specified, it is passed to ./scripts/belarus.sh.

./scripts/belarus.sh downloads and processes all extra-info records for all bridges from 2020-07 through 2021-04 from the Tor Project's CollecTor service. This is about 4 million files (around 20 GB uncompressed). By default, data from all 10 months is processed in parallel, which takes under an hour but requires extracting and storing all 20 GB of data at once.

If you cannot afford to fill up 20 GB of disk space at once, you can use the -s option, which processes the 10 months sequentially instead of in parallel. This will take about 10 times as long but will only use a few GB of space at a time.

If you cannot afford the time and space requirements, you can use the --fast option, which does not download the extra-info archives from the Tor Project. Instead, it starts with a 6.7 MB archive (87 MB uncompressed) containing only the information we need. It yields identical results and takes only 1-2 minutes; however, it requires trusting that the pre-processed data we provide was extracted correctly.

If the -n and/or -N options are specified, they are passed to ./scripts/generate-lox-results.sh.

./scripts/generate-lox-results.sh runs benchmarks, with each process isolated to a single thread. Ideally, the threads used should all be equally performant, to ensure that all results computed can be reasonably compared to each other.

By default, the script will use all available threads (up to the number of processes to be run, which is 16), but you can use -n NUM_PERFORMANCE_CORES to restrict it to only use NUM_PERFORMANCE_CORES threads.

By default, if the -n option is used, the script will use the first NUM_PERFORMANCE_CORES threads. If these are not the ones you want to use, you can also specify -N PERFORMANCE_CORE_RANGE to indicate the specific threads to use. If you use -N to indicate a range, please also specify -n to indicate the number of values in that range. The script does not compute this for you.

Examples:

  • ./run.sh --fast -n 4: Use pre-processed data for the ./scripts/belarus.sh step, and use only the first 4 threads (i.e., 0-3) for ./scripts/generate-lox-results.sh.
  • ./run.sh -n 5 -N 1-2,4-6: Use only threads 1, 2, 4, 5, and 6 for ./scripts/generate-lox-results.sh.
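
Since the script does not derive -n from -N, the count has to be worked out by hand. The sketch below does that arithmetic. It is not part of the artifact: `count_threads` is a hypothetical helper, and it assumes the range is a comma-separated list of single thread IDs and LOW-HIGH pairs, as in the example above.

```shell
# count_threads RANGE: print how many thread IDs RANGE names.
# The function body is a subshell (note the parentheses), so the IFS and
# globbing changes below do not leak into the calling shell.
count_threads() (
    set -f       # disable globbing while the range is split
    IFS=','      # split the range on commas
    total=0
    for part in $1; do
        lo=${part%-*}    # text before the dash (whole part if no dash)
        hi=${part#*-}    # text after the dash (whole part if no dash)
        total=$((total + hi - lo + 1))
    done
    echo "$total"
)

range="1-2,4-6"
echo "./run.sh -n $(count_threads "$range") -N $range"
# prints: ./run.sh -n 5 -N 1-2,4-6
```

The helper does no validation, so a malformed range (for example, one with HIGH below LOW) yields a meaningless count.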

Results

The results are all output to the top-level directory of the project. After running ./run.sh:

  • The results from Section 3 can be found in the section-3-results file.
  • Table 4 (found in Appendix A of the paper) can be found in appendix-a-results.pdf.
  • Tables 2 and 3 (from Section 6) can be found in table-2-results.pdf and table-3-results.pdf.
  • Table 5 (in Appendix B) can be found in appendix-b-results.pdf.
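
To confirm that a run produced every output listed above, a check along these lines may be useful. This is only a sketch: `check_results` is a hypothetical helper, not something the artifact provides, and it only tests that each file exists and is non-empty, not that its contents are correct.

```shell
# check_results [DIR]: report which of the expected output files exist
# (and are non-empty) in DIR, defaulting to the current directory.
# Returns non-zero if any file is missing.
check_results() {
    local dir=${1:-.} missing=0 f
    for f in section-3-results appendix-a-results.pdf \
             table-2-results.pdf table-3-results.pdf appendix-b-results.pdf; do
        if [ -s "$dir/$f" ]; then
            echo "ok       $f"
        else
            echo "MISSING  $f"
            missing=1
        fi
    done
    return "$missing"
}

# Run from the top-level directory of the project.
check_results . || echo "Some results are missing; see Recovering from Errors."
```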

Recovering from Errors

Ideally, the script runs every step without issue. If a step fails, try deleting the files and directories related to that step, stopping and removing any docker containers related to that step, and running ./run.sh again. The script skips any step whose output files already exist, so leave the files from successful steps in place and you shouldn't have to redo them.

You can also run the script for an individual step directly. The Options section above lists the four scripts and the options each one accepts.
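
As a rough guide to which per-step command corresponds to a given ./run.sh invocation, the sketch below maps the options onto the four scripts as the Options section describes. It only prints commands and runs nothing. `plan_steps` is a hypothetical helper and is not how ./run.sh itself is implemented.

```shell
# plan_steps [run.sh options]: print the per-step commands that the given
# options correspond to, following the Options section:
#   -s / --fast  go to ./scripts/belarus.sh
#   -n / -N      go to ./scripts/generate-lox-results.sh
plan_steps() {
    local belarus_opt="" lox_opts=""
    while [ $# -gt 0 ]; do
        case $1 in
            -s|--fast)
                belarus_opt=" $1" ;;
            -n|-N)
                [ $# -ge 2 ] || { echo "missing value for $1" >&2; return 1; }
                lox_opts="$lox_opts $1 $2"
                shift ;;
            *)
                echo "unknown option: $1" >&2
                return 1 ;;
        esac
        shift
    done
    echo "./scripts/setup.sh"
    echo "./scripts/belarus.sh$belarus_opt"
    echo "./scripts/generate-lox-results.sh$lox_opts"
    echo "./scripts/process-lox-results.sh"
}

plan_steps --fast -n 4
```

For the invocation shown, the second and third lines printed are `./scripts/belarus.sh --fast` and `./scripts/generate-lox-results.sh -n 4`.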

Time to Run

On my laptop (13th Gen Intel Core i7-1360P; 16 threads; 4 performance cores, up to 5 GHz):

  • ./scripts/setup.sh: 10m 35s
  • ./scripts/belarus.sh: 51m 2s
  • ./scripts/belarus.sh -s: 9h 15m 16s
  • ./scripts/belarus.sh --fast: 1m 4s
  • ./scripts/generate-lox-results.sh -n 4: 18m 39s
  • ./scripts/process-lox-results.sh: 18s

On the device used for the paper (Intel Xeon Platinum 8380; 80 threads; 40 cores @ 2.30 GHz, up to 3.40 GHz):

  • ./scripts/setup.sh: 6m 47s
  • ./scripts/belarus.sh: 42m 32s
  • ./scripts/belarus.sh -s: 7h 57m 10s
  • ./scripts/belarus.sh --fast: 1m 0s
  • ./scripts/generate-lox-results.sh: 5m 56s
  • ./scripts/process-lox-results.sh: 17s
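
For an end-to-end estimate, the per-step timings above can be summed. The sketch below adds up the four steps of a default run (setup, parallel Belarus processing, benchmark generation, and result processing) for each machine. The helper names `to_seconds` and `total_time` are hypothetical, and the durations are copied from the lists above.

```shell
# to_seconds "9h 15m 16s": convert a duration written as in the lists
# above (any of Nh, Nm, Ns, separated by spaces) to seconds.
to_seconds() {
    local total=0 tok
    for tok in $1; do
        case $tok in
            *h) total=$((total + ${tok%h} * 3600)) ;;
            *m) total=$((total + ${tok%m} * 60)) ;;
            *s) total=$((total + ${tok%s})) ;;
        esac
    done
    echo "$total"
}

# total_time DURATION...: print the sum of the durations as "Xm Ys".
total_time() {
    local sum=0 d
    for d in "$@"; do
        sum=$((sum + $(to_seconds "$d")))
    done
    echo "$((sum / 60))m $((sum % 60))s"
}

# Laptop (benchmarks run with -n 4).
echo "laptop:       $(total_time "10m 35s" "51m 2s" "18m 39s" "18s")"   # 80m 34s
# Device used for the paper (benchmarks run with default threads).
echo "paper device: $(total_time "6m 47s" "42m 32s" "5m 56s" "17s")"    # 55m 32s
```

Both totals are consistent with the 1-2 hour estimate given in the Quick Start section.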