|
@@ -9,16 +9,85 @@ These scripts are in support of our paper:
|
|
|
|
|
|
Adithya Vadapalli, Ryan Henry, Ian Goldberg. Duoram: A Bandwidth-Efficient Distributed ORAM for 2- and 3-Party Computation. USENIX Security Symposium 2023. [https://eprint.iacr.org/2022/1747](https://eprint.iacr.org/2022/1747)
|
|
|
|
|
|
-It is based on [Doerner and shelat's published code](https://gitlab.com/neucrypt/floram/-/archive/floram-release/floram-floram-release.zip), with two small changes:
|
|
|
+It is a dockerization of [Doerner and shelat's published code](https://gitlab.com/neucrypt/floram/-/archive/floram-release/floram-floram-release.zip), with two small changes:
|
|
|
|
|
|
- Their benchmarking code (`bench_oram_read` and `bench_oram_write`) sets up the ORAM, and then does a number of read operations or a number of write operations. The _time_ to set up the ORAM is included in the reported time, but the _bandwidth_ to set up the ORAM is not included in the reported bandwidth. We provide [a patch](bench_oram.patch) that also measures the bandwidth of the setup and reports it separately from the bandwidth of the operations.
|
|
|
- - We also add [a read/write benchmark](bench_oram_readwrite.oc) that does alternating reads and writes. If you ask for 128 iterations, for example, it will do 128 reads and 128 writes, interleaved.
|
|
|
-
|
|
|
-## Instructions:
|
|
|
+ - We also add [a read/write benchmark](bench_oram_readwrite.oc) that does alternating reads and writes. If you ask for 128 operations, for example, it will do 128 reads and 128 writes, interleaved.
|
|
|
+
|
|
|
+## Reproduction instructions
|
|
|
+
|
|
|
+Follow these instructions to reproduce the Floram data points (timings
|
|
|
+and bandwidth usage of Floram operations for various ORAM sizes and
|
|
|
+network settings) for the plots in our paper. See
|
|
|
+[below](#manual-instructions) if you want to run experiments of your
|
|
|
+choosing.
|
|
|
+
|
|
|
+ - Build the docker image with `./build-docker`
|
|
|
+ - Start the dockers with `./start-docker`
|
|
|
+ - This will start two dockers, each running one of the parties.
|
|
|
+ - Run the reproduction script `./repro` with one of the following
|
|
|
+ arguments:
|
|
|
+ - <code>./repro test</code>: Run a short (just a few seconds) "kick-the-tires" test.
|
|
|
+ You should see output like the following:
|
|
|
+
|
|
|
+ <code>Running test experiment...
|
|
|
+ Tue 21 Feb 2023 01:37:45 PM EST: Running read 16 1us 100gbit 2 ...
|
|
|
+ Floram read 16 1us 100gbit 2 0.554001 s
|
|
|
+ Floram read 16 1us 100gbit 2 3837.724609375 KiB</code>
|
|
|
+
|
|
|
+ The last two lines are the output data points, telling you that a
|
|
|
+ Floram read test on an ORAM of size 2<sup>16</sup>, with a network
|
|
|
+ configuration of 1us latency and 100gbit bandwidth, performing 2
|
|
|
+ read operations, took 0.554001 s of time and 3837.724609375 KiB of
|
|
|
+ bandwidth. If you've run the test before, you will see a
|
|
|
+ concatenation of all of the output data points. When you run it,
|
|
|
+ the time of course will depend on the particulars of your
|
|
|
+ hardware, but the bandwidth used should be exactly the value
|
|
|
+ quoted above.
|
|
|
+
|
|
|
+ - <code>./repro small _numops_</code>: Run the "small" tests. These
|
|
|
+ are the tests up to size 2<sup>26</sup>, and produce all the data
|
|
|
+ points for Figures 7 and 8, and most of Figure 9.
|
|
|
+ <code>_numops_</code> is the number of operations to run for each
|
|
|
+ test; we used the default of 128 for the figures in the paper, but
|
|
|
+ you can use a lower number to make the tests run faster. For the
|
|
|
+ default of 128, these tests should complete in about 4 to 5 hours,
|
|
|
+ and require 16 GB of available RAM.
|
|
|
+
|
|
|
+ - <code>./repro large _numops_</code>: Run the "large" tests. These
|
|
|
+ are the rightmost 3 data points in Figure 9. They are not
|
|
|
+ essential to our major claims, so they are optional to run,
|
|
|
+ and you will definitely require a larger machine to run them.
|
|
|
+ For the default <code>_numops_</code> of 128, these experiments
|
|
|
+ will require 9 to 10 hours to run and 540 GB of available RAM.
|
|
|
+ Reducing <code>_numops_</code> will only slightly reduce the
|
|
|
+ runtime (down to 8 to 9 hours) and will not change the RAM
|
|
|
+ requirements.
|
|
|
+
|
|
|
+ - <code>./repro all _numops_</code>: Run both the "small" and
|
|
|
+ "large" tests.
|
|
|
+
|
|
|
+ - <code>./repro none _numops_</code>: Run no tests. This command is
|
|
|
+ nonetheless useful in order to parse the output logs and display
|
|
|
+ the data points for the graphs (see below).
|
|
|
+
|
|
|
+ - <code>./repro single _mode_ _size_ _latency_ _bandwidth_
|
|
|
+ _numops_</code>: Run a single manually selected test with the
|
|
|
+ given parameters.
|
|
|
+
|
|
|
+ - After `small`, `large`, `all`, or `none`, the script will parse
|
|
|
+ all of the outputs that have been collected with the specified
|
|
|
+ <code>_numops_</code> (in this run or previous runs), and output
|
|
|
+ them as they would appear in each of the subfigures of Figures 7,
|
|
|
+ 8, and 9.
|
|
|
+
|
|
|
+ - When you're done, run `./stop-docker`
|
|
|
+
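If you want to post-process the data points yourself, the `Floram ...` lines shown in the `./repro test` output can be split into fields. The following is a minimal sketch, assuming each data-point line has exactly the eight whitespace-separated fields seen in that sample output; the `DataPoint` type and `parse_data_point` function are hypothetical names introduced here and are not part of the repository.

```python
from typing import NamedTuple, Optional


class DataPoint(NamedTuple):
    mode: str        # read, write, readwrite, or init
    size: int        # base-2 log of the number of ORAM entries
    latency: str     # network latency setting, e.g. "1us"
    bandwidth: str   # network bandwidth setting, e.g. "100gbit"
    numops: int      # number of operations performed
    value: float     # the measured quantity
    unit: str        # "s" for time, "KiB" for bandwidth


def parse_data_point(line: str) -> Optional[DataPoint]:
    """Parse one 'Floram ...' data-point line; return None for other lines."""
    fields = line.split()
    # Assumption: a data point is "Floram" followed by seven more fields.
    if len(fields) != 8 or fields[0] != "Floram":
        return None
    _, mode, size, latency, bandwidth, numops, value, unit = fields
    return DataPoint(mode, int(size), latency, bandwidth,
                     int(numops), float(value), unit)


if __name__ == "__main__":
    sample = "Floram read 16 1us 100gbit 2 3837.724609375 KiB"
    print(parse_data_point(sample))
```

Lines that are not data points, such as the timestamped `Running read ...` progress line, yield `None` and can simply be skipped.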
|
|
|
+## Manual instructions
|
|
|
|
|
|
- `./build-docker`
|
|
|
- `./start-docker`
|
|
|
- - This will start two dockers, each running one of the parties
|
|
|
+ - This will start two dockers, each running one of the parties.
|
|
|
|
|
|
Then to simulate network latency and capacity (optional):
|
|
|
|
|
@@ -41,11 +110,11 @@ instead of `-N 1` to pin to specific cores, even on a non-NUMA machine.
|
|
|
|
|
|
Run experiments:
|
|
|
|
|
|
- - <code>./run-experiment _mode_ _size_ _iters_ _port_ >> _outfile_</code>
|
|
|
+ - <code>./run-experiment _mode_ _size_ _numops_ _port_ >> _outfile_</code>
|
|
|
- <code>_mode_</code> is one of `read`, `write`, `readwrite`, or `init`
|
|
|
- `init` measures setting up the database with non-zero initial values; the other three modes include setting up the database initialized to 0. <code>_mode_</code> defaults to `read`.
|
|
|
- <code>_size_</code> is the base-2 log of the number of entries in the ORAM (so <code>_size_</code> = 20 is an ORAM with 1048576 entries, for example). Defaults to 20.
|
|
|
- - <code>_iters_</code> is the number of iterations to perform; one setup will be followed by <code>_iters_</code> operations, where each operation is a read, a write, or a read plus a write, depending on the <code>_mode_</code>. Defaults to 128.
|
|
|
+ - <code>_numops_</code> is the number of operations to perform; one setup will be followed by <code>_numops_</code> operations, where each operation is a read, a write, or a read plus a write, depending on the <code>_mode_</code>. Defaults to 128.
|
|
|
- <code>_port_</code> is the port number to use; if you're running multiple experiments at the same time, they must each be on a different port. Defaults to 3000.
|
|
|
|
|
|
- <code>./parse\_sizes _outfile_</code>
|
|
@@ -54,7 +123,7 @@ Run experiments:
|
|
|
- <code>./parse\_times _outfile_</code>
|
|
|
- Parses the file output by one or more executions of `./run-experiment` to extract the runtime of each experiment. The output will be, for each experiment, a line with the two numbers <code>_size_</code> and <code>_sec_</code>, which are the size of the experiment and the time in seconds, including both the ORAM setup and the operations.
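Because <code>_size_</code> is a base-2 logarithm, it can be handy to convert the `./parse_times` output into actual entry counts. The sketch below assumes the output is exactly as described above (one line per experiment containing the two numbers <code>_size_</code> and <code>_sec_</code>); the function name `parse_times_output` is hypothetical, and the numbers in the usage example are made up for illustration, not measured results.

```python
from typing import List, Tuple


def parse_times_output(text: str) -> List[Tuple[int, float]]:
    """Turn 'size sec' lines into (number_of_entries, seconds) pairs."""
    results = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or unexpected lines
        size, sec = int(fields[0]), float(fields[1])
        # size is the base-2 log of the number of ORAM entries,
        # so size = 20 corresponds to 2**20 = 1048576 entries.
        results.append((1 << size, sec))
    return results


if __name__ == "__main__":
    # Illustrative input only; these are not real measurements.
    for entries, sec in parse_times_output("16 0.5\n20 12.5\n"):
        print(f"{entries} entries: {sec} s (setup plus operations)")
```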
|
|
|
|
|
|
-To see an example of how to use `./run-experiment` while varying the experiment size and the network latency and bandwidth, the [`./run-readwrite-experiments`](run-readwrite-experiments) script wraps `./run-experiment`, and is the script we used to generate the interleaved Floram measurements in Figures 7 and 8 of [our paper](https://eprint.iacr.org/2022/1747).
|
|
|
+For an example of how to use `./run-experiment` while varying the experiment size and the network latency and bandwidth, and using the NUMA functionality, see the [`./run-readwrite-experiments`](run-readwrite-experiments) script, which wraps `./run-experiment`.
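As a rough illustration of what wrapping `./run-experiment` can look like, the sketch below generates one command line per mode and size, giving each a different port so that the experiments could run at the same time. It is not the `run-readwrite-experiments` script and does not configure network latency, bandwidth, or NUMA pinning; the function name `plan_experiments` and the output file name `results.out` are assumptions made for this example. The argument order and the defaults (128 operations, port 3000) follow the `./run-experiment` description above.

```python
import shlex
from typing import Iterable, List


def plan_experiments(modes: Iterable[str], sizes: Iterable[int],
                     numops: int = 128, base_port: int = 3000,
                     outfile: str = "results.out") -> List[str]:
    """Build shell command lines, one per (mode, size), on distinct ports."""
    sizes = list(sizes)  # allow reuse across modes even if given an iterator
    commands = []
    port = base_port
    for mode in modes:
        for size in sizes:
            argv = ["./run-experiment", mode, str(size), str(numops), str(port)]
            # Append to the output file so that several runs accumulate.
            commands.append(f"{shlex.join(argv)} >> {shlex.quote(outfile)}")
            port += 1  # concurrent experiments must not share a port
    return commands


if __name__ == "__main__":
    for cmd in plan_experiments(["read", "write"], [16, 20]):
        print(cmd)
```

The script only prints the commands; whether to run them one after another or in parallel is left to you, and distinct ports matter only in the parallel case.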
|
|
|
|
|
|
When you're all done:
|
|
|
|