Lox Simulation

This is a simulation for evaluating Lox and Troll Patrol.

Troll Patrol Experiments

Vecna, vvecna@uwaterloo.ca
Ian Goldberg, iang@uwaterloo.ca

This repo contains scripts to run the code and reproduce the graphs and tables in our paper:

[TODO: citation]

Requirements

To reproduce our results, you will need a basic Linux installation, with git, docker, python3, and matplotlib. We have tested our code on Debian 12 and Ubuntu 22.04. On Debian 12, you can install these dependencies with apt install git docker.io python3 python3-matplotlib.
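
Before starting, it may help to confirm that the tools listed above are actually installed. The following is a minimal sketch, not part of this repo: the helper name missing_commands is hypothetical, and the check only tests that each command is on PATH and that matplotlib can be imported, not that the versions match the ones we tested.

```shell
# Hypothetical helper (not part of this repo): print every command
# from the argument list that cannot be found on PATH.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Command-line tools named in the Requirements section.
missing=$(missing_commands git docker python3)
if [ -n "$missing" ]; then
    # Left unquoted on purpose so the names print on one line.
    echo "Missing commands:" $missing
fi

# matplotlib is a Python package, so check it by trying to import it.
if command -v python3 >/dev/null 2>&1; then
    python3 -c 'import matplotlib' 2>/dev/null \
        || echo "Missing Python package: matplotlib"
fi
```

If the snippet prints nothing, all four dependencies were found.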

Quick Start

Clone this repo and check out the commit [TODO: hash/tag/branch]

git clone https://git-crysp.uwaterloo.ca/vvecna/lox-simulation.git
cd lox-simulation
git checkout [TODO: hash/tag/branch]

Decide values for the following parameters. Depending on your CPU model, each simulation run may take around 1.5 to 3.5 days of wall-clock time to complete. While multiple simulation runs can be performed in parallel, reproducing all of our results requires 205 total simulation runs (41 configurations * 5 runs in each configuration). If the full set of experiments cannot be completed in a reasonable amount of time, you can reproduce a subset of our results by selecting these parameters appropriately.

PARALLEL_RUNS, the number of simulation runs to perform in parallel. We recommend one run per available CPU core. A single simulation run could take up to 688 MB of RAM, so you should have at least 688 * PARALLEL_RUNS MB of RAM.
NUM_RUNS, the number of simulation runs to perform in each configuration. We used 5 for our paper.
NUM_CONFIGS_1, the number of configurations to run for the first experiment. To fully reproduce our first experiment, set this value to 33. The first two configurations are shared between the two experiments.
NUM_CONFIGS_2, the number of configurations to run for the second experiment. To fully reproduce our second experiment, set this value to 8 and set NUM_CONFIGS_1 to at least 2. The results of the shared configurations will be copied from the first experiment.
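
To get a feel for what a given choice of parameters costs, the arithmetic above can be sketched as follows. The function names and the example values are assumptions for illustration only. The batch count assumes runs proceed in lockstep groups of PARALLEL_RUNS, which is a simplification.

```shell
# Example values (assumptions for illustration; choose your own).
PARALLEL_RUNS=8
NUM_RUNS=5
NUM_CONFIGS_1=33
NUM_CONFIGS_2=8

# Total simulation runs: (configs in both experiments) * runs per config.
# Arguments: NUM_RUNS NUM_CONFIGS_1 NUM_CONFIGS_2
total_runs() {
    echo $(( ($2 + $3) * $1 ))
}

# RAM in MB, using the 688 MB per-run upper bound quoted above.
# Argument: PARALLEL_RUNS
ram_mb() {
    echo $(( 688 * $1 ))
}

# Number of sequential batches (ceiling of total runs / parallel runs).
# Arguments: TOTAL_RUNS PARALLEL_RUNS
batches() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

runs=$(total_runs "$NUM_RUNS" "$NUM_CONFIGS_1" "$NUM_CONFIGS_2")
echo "Total runs: $runs"
echo "RAM needed: $(ram_mb "$PARALLEL_RUNS") MB"
echo "Batches:    $(batches "$runs" "$PARALLEL_RUNS")"
```

With the example values this reports 205 total runs, 5504 MB of RAM, and 26 batches. At 1.5 to 3.5 days per run, that is very roughly 39 to 91 days under the lockstep assumption. Running ./run-experiments.sh without -y prints the script's own time estimate, which should be preferred over this sketch.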

Run the experiments:

./run-experiments.sh -p PARALLEL_RUNS -n NUM_RUNS -1 NUM_CONFIGS_1 -2 NUM_CONFIGS_2 -y

After this script completes, the results can be found in the results directory. See the "Output" section below for more details.

Advanced Use

Run ./run-experiments.sh with appropriate options. If required options are omitted, the script will ask you for them.

Full options for ./run-experiments.sh:

-p PARALLEL_RUNS -- Run PARALLEL_RUNS simulation runs in parallel.
-n NUM_RUNS -- Run NUM_RUNS simulation runs in each configuration. We used 5 for our paper.
-1 NUM_CONFIGS -- Run simulations for NUM_CONFIGS (max 33) configurations for the first experiment. The results from the first two configurations will be copied and also used as part of the second experiment.
-1 FIRST_CONFIG-LAST_CONFIG -- Run simulations for the first experiment for configurations numbered from FIRST_CONFIG to LAST_CONFIG, inclusive. Begin indexing at 1. To run all configurations, this would be -1 1-33.
-2 NUM_CONFIGS -- Run simulations for NUM_CONFIGS (max 8) configurations for the second experiment. There are actually 10 total configurations for the second experiment, but the first two are evaluated as part of the first experiment and copied.
-2 FIRST_CONFIG-LAST_CONFIG -- Run simulations for the second experiment for configurations numbered from FIRST_CONFIG to LAST_CONFIG, inclusive. Begin indexing at 1. To run all configurations, this would be -2 1-8.
-e -- Run the experiments only; do not process the results.
-r -- Process the results only; do not run any simulations.
-y -- Run non-interactively. If you do not use this flag, this script will give you an estimate of how long it will take to run and ask for confirmation before beginning the experiments.

To break up the workload between multiple devices, use the -e option and specify mutually exclusive ranges of configurations. For example, if you have a NUMA cluster with 8 nodes, each with an 18-core CPU, you might run these eight commands:

numactl -N 0 -m 0 ./run-experiments.sh -p 18 -n 5 -1 1-5 -2 0 -e -y
numactl -N 1 -m 1 ./run-experiments.sh -p 18 -n 5 -1 6-10 -2 0 -e -y
numactl -N 2 -m 2 ./run-experiments.sh -p 18 -n 5 -1 11-15 -2 0 -e -y
numactl -N 3 -m 3 ./run-experiments.sh -p 18 -n 5 -1 16-20 -2 0 -e -y
numactl -N 4 -m 4 ./run-experiments.sh -p 18 -n 5 -1 21-25 -2 0 -e -y
numactl -N 5 -m 5 ./run-experiments.sh -p 18 -n 5 -1 26-30 -2 0 -e -y
numactl -N 6 -m 6 ./run-experiments.sh -p 18 -n 5 -1 31-33 -2 1-2 -e -y
numactl -N 7 -m 7 ./run-experiments.sh -p 18 -n 5 -1 0 -2 3-8 -e -y
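
Ranges like the ones above can also be generated rather than typed by hand. The following sketch uses a hypothetical helper, split_ranges, to split the 33 configurations of the first experiment into non-overlapping chunks of 5 and print one command per chunk. It only prints the commands and does not run them. It does not cover the second experiment, and the node numbering simply counts up from 0.

```shell
# Hypothetical helper (not part of this repo): print inclusive,
# non-overlapping ranges that cover 1..TOTAL, CHUNK configurations each.
# Arguments: TOTAL CHUNK
split_ranges() {
    total=$1
    chunk=$2
    first=1
    while [ "$first" -le "$total" ]; do
        last=$(( first + chunk - 1 ))
        # The final range may be shorter than CHUNK.
        if [ "$last" -gt "$total" ]; then
            last=$total
        fi
        echo "$first-$last"
        first=$(( last + 1 ))
    done
}

# Print (but do not execute) one command per range, one NUMA node each.
node=0
for range in $(split_ranges 33 5); do
    echo "numactl -N $node -m $node ./run-experiments.sh -p 18 -n 5 -1 $range -2 0 -e -y"
    node=$(( node + 1 ))
done
```

For 33 configurations in chunks of 5, this yields the same seven first-experiment ranges used in the commands above (1-5 through 31-33).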

After all of these commands have completed, you can process the results using the following command:

./run-experiments.sh -n 5 -r

Output

./run-experiments.sh reproduces Figures 2 and 3 from Section 5, as well as Tables 2--5 from Appendix B. These files can be found in the results directory.

The figures are saved as PNG files, and the tables are saved as LaTeX files using the standalone package. These LaTeX files can be compiled on their own or included in another LaTeX file with \input. If the number of configurations for an experiment is reduced, the number of data points represented in each plot/table will be reduced. We have ordered the configurations such that using a reduced number of configurations in order should still yield a plot that demonstrates the underlying trend of the full results.

This script does not label the harshness values in Figure 3a.

Credit

Format and some text adapted from the README for the PRAC PoPETs 2024 artifact.
