This repository is a dockerization of the ucsc-anonymity/sparta-experiments repository. This dockerization can be used to reproduce the Sparta-SB datapoints in Figures 7 and 8 of the TEEMS paper:
Sajin Sasy, Aaron Johnson, and Ian Goldberg. TEEMS: A Trusted Execution Environment based Metadata-protected Messaging System. Proceedings on Privacy Enhancing Technologies, Vol. 2025, No. 4, July 2025.
This dockerization is by Ian Goldberg, iang@uwaterloo.ca.
You will need a server with an Intel Xeon CPU that supports SGX2, and SGX2 must be enabled in the BIOS. To fully reproduce the graphs in the paper, you will need 72 CPU cores (not hyperthreads), but if you have fewer, your experiments will just run more slowly (Figure 7) or only run partially (Figure 8). We used a machine with two 40-core Intel Xeon 8380 CPUs running at 2.3 GHz to generate the figures, in case you aim to compare your results to ours.
The server should run Linux with kernel at least 5.11. We used Ubuntu 22.04.
SGX2 must be enabled on your machine, and so you should see the device files `/dev/sgx/enclave` and `/dev/sgx/provision`.
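A quick sanity check (a sketch; these are just the kernel and device checks described above):

```
# The kernel must be at least 5.11
uname -r

# Both SGX2 device files should exist
ls -l /dev/sgx/enclave /dev/sgx/provision
```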
You will need docker. On Ubuntu, for example: `apt install docker.io`. Be sure to run all the experiments as a user with docker permissions (in the `docker` group).
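If your user is not yet in the `docker` group, the standard way to add it (log out and back in afterwards for the change to take effect):

```
# Add the current user to the docker group
sudo usermod -aG docker "$USER"
```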
You will need the `aesmd` service. If you have the file `/var/run/aesmd/aesm.socket`, then all is well; you already have the `aesmd` service running on your machine.
If not, run:

```
sudo mkdir -p -m 0755 /var/run/aesmd
docker run -d --rm --device /dev/sgx/enclave --device /dev/sgx/provision \
    -v /var/run/aesmd:/var/run/aesmd --name aesmd fortanix/aesmd
```
That will start the `aesmd` service in a docker container, and you should then see the `/var/run/aesmd/aesm.socket` file.
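One way to check (the `-S` test verifies that the path exists and is a socket):

```
# Confirm the aesmd socket is present
test -S /var/run/aesmd/aesm.socket && echo "aesmd socket is ready"
```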
If you started `aesmd` with this docker method, then when you're done with the experiments:

```
docker stop aesmd
```
You will need `python3` with `numpy` (on the host machine) to run the log parser scripts.
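One way to check for (and, on Ubuntu, install) these host-side prerequisites:

```
# Verify python3 can import numpy; if not, install both
python3 -c 'import numpy; print(numpy.__version__)' \
    || sudo apt install python3 python3-numpy
```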
Once you have SGX2 and `aesmd` set up, the following will build sparta and its dependencies, and run the experiments:

```
git clone https://git-crysp.uwaterloo.ca/iang/sparta-experiments-docker
cd sparta-experiments-docker/docker
./build-docker
./run-clientscale-experiments | tee /dev/stderr | ./parse-clientscale-logs
./run-corescale-experiments | tee /dev/stderr | ./parse-corescale-logs
```
Running the clientscale experiments (Figure 7) should take about 4 minutes; running the corescale experiments (Figure 8) should take about 13 minutes. (These runtimes are on a 2.3 GHz CPU, so your runtimes may vary.)
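If you want to keep the raw experiment logs around for later re-parsing, a variation on the pipelines above (the parse scripts read the logs on standard input):

```
# Save the raw logs, then parse them into CSVs
./run-clientscale-experiments > clientscale.log
./parse-clientscale-logs < clientscale.log > clientscale.csv
./run-corescale-experiments > corescale.log
./parse-corescale-logs < corescale.log > corescale.csv
```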
See below for optional arguments you can pass to the experiment scripts.
Other than the dockerization, this repository makes the following changes to the original ucsc-anonymity/sparta-experiments repository:
Added functions for sparta to return the current size of its message store (commit b8d48d7). As we note in Section A.3 of the TEEMS paper, Sparta's message store grows whenever any user sends a message, but does not shrink when messages are retrieved. These functions enable demonstration of this behaviour.
Increased the maximum number of threads and heap size for the SGX enclave (commit 0b32698 and commit afc33a7). When we experimented with multiple rounds of message sending, the message store grew larger than the configured maximum enclave heap size. Our machine also had more cores than the maximum number of configured threads.
The original code did a single batch send, followed by a configurable number of batch fetches, and timed only the fetches. We slightly rearrange the code to do a configurable number of (batch send + batch fetch) rounds (commit d11e4fc). For each round, we report the time for the batch send, the time for the batch fetch, and the total time. All times are in seconds.
The `./run-clientscale-experiments` script will generate the data for Figure 7 in our paper, which holds the number of cores fixed at 16, and varies the number of messages per batch from 32768 to 1048576. (If you have fewer than 16 cores, the script will just use however many you have, but will run more slowly, of course.)
This script can take two optional parameters (an example invocation follows the list):

- `niters` (defaults to 1): the number of times to run the experiment. The variance is quite small, so this doesn't need to be large. Even 1 is probably fine if you're just checking that your results are similar to ours. (Your CPU speed will of course be a factor if you compare your raw timings to ours.)
- `nrounds` (defaults to 1): the number of (send + fetch) rounds to do per batch size per experiment. To give maximal benefit to the Sparta-SB data points in our Figure 7, we only used 1 round (just like the original code). However, by setting this higher, you can see the effect described above, where larger numbers of sending rounds cause the message store to get larger, and the send and fetch operations to get slower.
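For example, a hypothetical invocation (this assumes the two parameters are passed positionally in the order listed above; check the script itself for its exact argument syntax):

```
# 3 experiment runs, 5 (send + fetch) rounds per batch size
./run-clientscale-experiments 3 5 | tee /dev/stderr | ./parse-clientscale-logs
```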
The output of the `./run-clientscale-experiments` script can be parsed by the `./parse-clientscale-logs` script, which will output a CSV with columns:

- `users`: the number of messages per batch
- `batches`: the sending round number
- `send_mean`, `send_stddev`: the mean and standard deviation of the send time (over the `niters` experiment runs)
- `fetch_mean`, `fetch_stddev`: the mean and standard deviation of the fetch time (over the `niters` experiment runs)
- `tot_mean`, `tot_stddev`: the mean and standard deviation of the total time (over the `niters` experiment runs)

These are the values (with sending round equal to 1) plotted in Figure 7 in the TEEMS paper. Our exact results (and what is plotted in the figure) were:
```
users,batches,send_mean,send_stddev,fetch_mean,fetch_stddev,tot_mean,tot_stddev
32768,1,0.676,0.020,1.182,0.016,1.858,0.028
65536,1,1.301,0.014,2.190,0.013,3.491,0.017
131072,1,2.488,0.033,4.049,0.052,6.537,0.055
262144,1,4.959,0.042,7.856,0.071,12.816,0.085
524288,1,9.742,0.102,15.284,0.144,25.027,0.167
1048576,1,18.448,0.340,29.074,0.367,47.522,0.595
```
The `./run-corescale-experiments` script will generate the data for Figure 8 in our paper, which holds the batch size fixed at 1048576, while varying the number of cores from 4 to 72. If you have fewer than 72 cores, the experiment will only gather data for the core counts you have available.
This script can take two optional parameters (an example invocation follows the list):

- `niters` (defaults to 1): the number of times to run the experiment. The variance is quite small, so this doesn't need to be large. Even 1 is probably fine if you're just checking that your results are similar to ours. (Your CPU speed will of course be a factor if you compare your raw timings to ours.)
- `sends` (defaults to 1048576): the number of messages in each batch
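For example, a hypothetical invocation (again assuming positional parameters in the order listed; check the script for its exact argument syntax):

```
# 2 experiment runs, 262144 messages per batch
./run-corescale-experiments 2 262144 | tee /dev/stderr | ./parse-corescale-logs
```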
The output of the `./run-corescale-experiments` script can be parsed by the `./parse-corescale-logs` script, which will output a CSV with the same columns as `./parse-clientscale-logs` above, except with an additional column just before the timings:

- `ncores`: the number of cores used
These are the values plotted in Figure 8 in the TEEMS paper. Our exact results (and what is plotted in the figure) were:
```
users,batches,ncores,send_mean,send_stddev,fetch_mean,fetch_stddev,tot_mean,tot_stddev
1048576,1,4,25.666,0.367,51.238,0.808,76.904,0.772
1048576,1,6,25.803,0.396,51.315,0.767,77.117,0.927
1048576,1,8,26.084,0.383,52.118,0.415,78.202,0.364
1048576,1,16,18.273,0.190,28.953,0.219,47.226,0.235
1048576,1,24,13.194,0.221,18.997,0.377,32.191,0.384
1048576,1,32,13.434,0.404,18.312,0.189,31.747,0.518
1048576,1,36,13.375,0.259,18.619,0.410,31.995,0.473
1048576,1,40,13.435,0.275,18.482,0.414,31.917,0.540
1048576,1,44,12.984,0.290,18.399,0.374,31.383,0.569
1048576,1,48,10.382,0.401,14.264,0.322,24.646,0.471
1048576,1,64,10.432,0.382,14.579,0.263,25.011,0.590
1048576,1,72,10.820,0.286,15.066,0.296,25.886,0.466
```
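If you would like to compare your corescale results to ours visually, here is one possible way to plot them (a sketch only; it assumes you saved the parsed output as `corescale.csv` as in the example above, and that you have `gnuplot` installed, which this repository does not otherwise require):

```
# Plot mean total time (with stddev error bars) against core count
gnuplot -persist <<'EOF'
set datafile separator ','
set xlabel 'number of cores'
set ylabel 'total time (s)'
# columns: 3 = ncores, 8 = tot_mean, 9 = tot_stddev; every ::1 skips the header
plot 'corescale.csv' every ::1 using 3:8:9 with yerrorlines title 'Sparta-SB'
EOF
```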