This repository is a dockerization of the ucsc-anonymity/sparta-experiments repository. This dockerization can be used to reproduce the Sparta-SB datapoints in Figures 7 and 8 of the TEEMS paper:
Sajin Sasy, Aaron Johnson, and Ian Goldberg. TEEMS: A Trusted Execution Environment based Metadata-protected Messaging System. Proceedings on Privacy Enhancing Technologies, Vol. 2025, No. 4, July 2025.
This dockerization is by Ian Goldberg, iang@uwaterloo.ca.
You will need a server with an Intel Xeon CPU that supports SGX2, and SGX2 must be enabled in the BIOS. To fully reproduce the graphs in the paper, you will need 72 CPU cores (not hyperthreads), but if you have fewer, your experiments will just run more slowly (Figure 7) or only run partially (Figure 8). We used a machine with two 40-core Intel Xeon 8380 CPUs running at 2.3 GHz to generate the figures, in case you aim to compare your results to ours.
The server should run Linux with kernel at least 5.11. We used Ubuntu 22.04.
SGX2 must be enabled on your machine, and so you should see the device files `/dev/sgx/enclave` and `/dev/sgx/provision`.
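A quick sanity check (a sketch; these are just the kernel and device checks described above):

```
# The kernel must be at least 5.11
uname -r

# Both SGX2 device files should exist
ls -l /dev/sgx/enclave /dev/sgx/provision
```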
You will need docker. On Ubuntu, for example: `apt install docker.io`. Be sure to run all the experiments as a user with docker permissions (in the `docker` group).
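If your user is not yet in the `docker` group, the standard way to add it (log out and back in afterwards for the change to take effect):

```
# Add the current user to the docker group
sudo usermod -aG docker "$USER"
```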
You will need the `aesmd` service. If you have the file `/var/run/aesmd/aesm.socket`, then all is well; you already have the `aesmd` service running on your machine.
If not, run:

```
sudo mkdir -p -m 0755 /var/run/aesmd
docker run -d --rm --device /dev/sgx/enclave --device /dev/sgx/provision \
    -v /var/run/aesmd:/var/run/aesmd --name aesmd fortanix/aesmd
```
That will start the `aesmd` service in a docker container, and you should then see the `/var/run/aesmd/aesm.socket` file.
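One way to check (the `-S` test verifies that the path exists and is a socket):

```
# Confirm the aesmd socket is present
test -S /var/run/aesmd/aesm.socket && echo "aesmd socket is ready"
```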
If you started `aesmd` with this docker method, then when you're done with the experiments:

```
docker stop aesmd
```
You will need `python3` with `numpy` (on the host machine) to run the log parser scripts.
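One way to check for (and, on Ubuntu, install) these host-side prerequisites:

```
# Verify python3 can import numpy; if not, install both
python3 -c 'import numpy; print(numpy.__version__)' \
    || sudo apt install python3 python3-numpy
```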
Once you have SGX2 and `aesmd` set up, the following will build sparta and its dependencies, and run the experiments:

```
git clone https://git-crysp.uwaterloo.ca/iang/sparta-experiments-docker
cd sparta-experiments-docker/docker
./build-docker
./run-clientscale-experiments | tee /dev/stderr | ./parse-clientscale-logs
./run-corescale-experiments | tee /dev/stderr | ./parse-corescale-logs
```
Running the clientscale experiments (Figure 7) should take about 4 minutes; running the corescale experiments (Figure 8) should take about 13 minutes. (These runtimes are on a 2.3 GHz CPU, so your runtimes may vary.)
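If you want to keep the raw experiment logs around for later re-parsing, a variation on the pipelines above (the parse scripts read the logs on standard input):

```
# Save the raw logs, then parse them into CSVs
./run-clientscale-experiments > clientscale.log
./parse-clientscale-logs < clientscale.log > clientscale.csv
./run-corescale-experiments > corescale.log
./parse-corescale-logs < corescale.log > corescale.csv
```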
See below for optional arguments you can pass to the experiment scripts.
Other than the dockerization, this repository makes the following changes to the original ucsc-anonymity/sparta-experiments repository:
Added functions for sparta to return the current size of its message store (commit b8d48d7). As we note in Section A.3 of the TEEMS paper, Sparta's message store grows whenever any user sends a message, but does not shrink when messages are retrieved. These functions enable demonstration of this behaviour.
Increased the maximum number of threads and heap size for the SGX enclave (commit 0b32698 and commit afc33a7). When we experimented with multiple rounds of message sending, the message store grew larger than the configured maximum enclave heap size. Our machine also had more cores than the maximum number of configured threads.
The original code did a single batch send, followed by a configurable number of batch fetches, and timed only the fetches. We slightly rearrange the code to do a configurable number of (batch send + batch fetch) rounds (commit d11e4fc). For each round, we report the time for the batch send, the time for the batch fetch, and the total time. All times are in seconds.
The `./run-clientscale-experiments` script will generate the data for Figure 7 in our paper, which holds the number of cores fixed at 16, and varies the number of messages per batch from 32768 to 1048576. (If you have fewer than 16 cores, the script will just use however many you have, but will run more slowly, of course.)
This script can take two optional parameters (an example invocation follows the list):

- `niters` (defaults to 1): the number of times to run the experiment. The variance is quite small, so this doesn't need to be large. Even 1 is probably fine if you're just checking that your results are similar to ours. (Your CPU speed will of course be a factor if you compare your raw timings to ours.)
- `nrounds` (defaults to 1): the number of (send + fetch) rounds to do per batch size per experiment. To give maximal benefit to the Sparta-SB data points in our Figure 7, we only used 1 round (just like the original code). However, by setting this higher, you can see the effect described above, where larger numbers of sending rounds cause the message store to get larger, and the send and fetch operations to get slower.
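For example, a hypothetical invocation (this assumes the two parameters are passed positionally in the order listed above; check the script itself for its exact argument syntax):

```
# 3 experiment runs, 5 (send + fetch) rounds per batch size
./run-clientscale-experiments 3 5 | tee /dev/stderr | ./parse-clientscale-logs
```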
The output of the `./run-clientscale-experiments` script can be parsed by the `./parse-clientscale-logs` script, which will output a CSV with columns:

- `users`: the number of messages per batch
- `batches`: the sending round number
- `send_mean`, `send_stddev`: the mean and standard deviation of the send time (over the `niters` experiment runs)
- `fetch_mean`, `fetch_stddev`: the mean and standard deviation of the fetch time (over the `niters` experiment runs)
- `tot_mean`, `tot_stddev`: the mean and standard deviation of the total time (over the `niters` experiment runs)

These are the values (with sending round equal to 1) plotted in Figure 7 in the TEEMS paper. Our exact results (and what is plotted in the figure) were:
```
users,batches,send_mean,send_stddev,fetch_mean,fetch_stddev,tot_mean,tot_stddev
32768,1,0.676,0.020,1.182,0.016,1.858,0.028
65536,1,1.301,0.014,2.190,0.013,3.491,0.017
131072,1,2.488,0.033,4.049,0.052,6.537,0.055
262144,1,4.959,0.042,7.856,0.071,12.816,0.085
524288,1,9.742,0.102,15.284,0.144,25.027,0.167
1048576,1,18.448,0.340,29.074,0.367,47.522,0.595
```
The `./run-corescale-experiments` script will generate the data for Figure 8 in our paper, which holds the batch size fixed at 1048576, while varying the number of cores from 4 to 72. If you have fewer than 72 cores, the experiment will only gather data for the core counts you have available.
This script can take two optional parameters (an example invocation follows the list):

- `niters` (defaults to 1): the number of times to run the experiment. The variance is quite small, so this doesn't need to be large. Even 1 is probably fine if you're just checking that your results are similar to ours. (Your CPU speed will of course be a factor if you compare your raw timings to ours.)
- `sends` (defaults to 1048576): the number of messages in each batch
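For example, a hypothetical invocation (again assuming positional parameters in the order listed; check the script for its exact argument syntax):

```
# 2 experiment runs, 262144 messages per batch
./run-corescale-experiments 2 262144 | tee /dev/stderr | ./parse-corescale-logs
```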
The output of the `./run-corescale-experiments` script can be parsed by the `./parse-corescale-logs` script, which will output a CSV with the same columns as `./parse-clientscale-logs` above, except with an additional column just before the timings:

- `ncores`: the number of cores used
These are the values plotted in Figure 8 in the TEEMS paper. Our exact results (and what is plotted in the figure) were:
```
users,batches,ncores,send_mean,send_stddev,fetch_mean,fetch_stddev,tot_mean,tot_stddev
1048576,1,4,25.666,0.367,51.238,0.808,76.904,0.772
1048576,1,6,25.803,0.396,51.315,0.767,77.117,0.927
1048576,1,8,26.084,0.383,52.118,0.415,78.202,0.364
1048576,1,16,18.273,0.190,28.953,0.219,47.226,0.235
1048576,1,24,13.194,0.221,18.997,0.377,32.191,0.384
1048576,1,32,13.434,0.404,18.312,0.189,31.747,0.518
1048576,1,36,13.375,0.259,18.619,0.410,31.995,0.473
1048576,1,40,13.435,0.275,18.482,0.414,31.917,0.540
1048576,1,44,12.984,0.290,18.399,0.374,31.383,0.569
1048576,1,48,10.382,0.401,14.264,0.322,24.646,0.471
1048576,1,64,10.432,0.382,14.579,0.263,25.011,0.590
1048576,1,72,10.820,0.286,15.066,0.296,25.886,0.466
```
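If you would like to compare your corescale results to ours visually, here is one possible way to plot them (a sketch only; it assumes you saved the parsed output as `corescale.csv` as in the example above, and that you have `gnuplot` installed, which this repository does not otherwise require):

```
# Plot mean total time (with stddev error bars) against core count
gnuplot -persist <<'EOF'
set datafile separator ','
set xlabel 'number of cores'
set ylabel 'total time (s)'
# columns: 3 = ncores, 8 = tot_mean, 9 = tot_stddev; every ::1 skips the header
plot 'corescale.csv' every ::1 using 3:8:9 with yerrorlines title 'Sparta-SB'
EOF
```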