Symmetric Private Information Retrieval (SPIR) built on Spiral

Ian Goldberg b49e8544ab Use upstream spiral-rs again instead of our fork		3 years ago
cxx	60314dbab9 Typo in C++ test program output	3 years ago
src	b49e8544ab Use upstream spiral-rs again instead of our fork	3 years ago
.gitignore	62601eb0e3 Initial commit	3 years ago
Cargo.toml	b49e8544ab Use upstream spiral-rs again instead of our fork	3 years ago
README.md	ceaebe7dc1 Minor touchups to README.md	3 years ago

Symmetric Private Information Retrieval (SPIR) built on Spiral

Ian Goldberg (iang@uwaterloo.ca), July 2022

Last update August 2022

This code implements Symmetric Private Information Retrieval, building on the Spiral PIR library (this code is not written by the Spiral authors).

Spiral is a recent single-server (computational) PIR system from 2022:

Samir Jordan Menon, David J. Wu, "Spiral: Fast, High-Rate Single-Server PIR via FHE Composition". IEEE Symposium on Security and Privacy, 2022. eprint: https://eprint.iacr.org/2022/368, code: https://github.com/menonsamir/spiral-rs

In ordinary PIR, the client learns the database record they were looking for, and the server does not learn which record that was. The client is not prevented, however, from learning additional database records. Downloading the whole database, for example, can be seen as a trivial form of PIR (though one that transfers way too much data, except for tiny databases).

In Symmetric PIR (SPIR), the client must learn only one database record, in addition to the server learning no information about which record that was. SPIR is similar to oblivious transfer (OT), except that SPIR aims to have sublinear communication (at no time is something the size of the whole database downloaded), while OT does not have that restriction.

This code builds SPIR from ordinary PIR (using Spiral), using a strategy based on that of Naor and Pinkas:

Moni Naor and Benny Pinkas, "Oblivious Transfer and Polynomial Evaluation". STOC 1999. paper: https://dl.acm.org/doi/10.1145/301250.301312

The general strategy is this (for a database with N=2^r records):

Server: Encrypt each record of the database with a unique key
Client: Use 1-of-N OT to retrieve exactly one key
Client: Privately download the encrypted record and decrypt it with the key retrieved above

In the Naor and Pinkas paper, step 3 was just trivial download of the whole encrypted database, but we use Sprial to privately retrieve the encrypted record of interest. Note, however, that the client will typically learn additional nearby records as well, which is why we need to encrypt each record separately.

The 1-of-N OT is built from r 1-of-2 OTs in the same manner as in the Naor and Pinkas paper: the server picks r pairs of random keys, and the client chooses one key from each pair, according to the bits of the desired query index value. Then each database entry is encrypted with the combination of keys associated with its index.

The Naor and Pinkas paper uses this for its encryption, where F is a pseudorandom function (PRF) family:

Enc[j] = Db[j] XOR F_{K_1,j1}(j) XOR F_{K_2,j2}(j) XOR ... XOR F_{K_r,jr}(j)

where j1,j2,...,jr are the bits of j, as j ranges from 0 to N-1.

We instead use:

Ktot = K_1,j1 XOR K_2,k2 XOR ... XOR K_r,kr

Enc[j] = Db[j] XOR F_{Ktot}(j)

That is, Naor and Pinkas XOR the outputs of the PRF, whereas we XOR the keys. This requires a slightly stronger assumption on the PRF family (for example, that the functions with different keys are independent, even if the keys have some known XOR relation), but we're going to use AES for F, so that should be fine.

For the 1-of-2 OT protocol, we base it on the simple one from the 1989 paper of Bellare and Micali:

Mihir Bellare and Silvio Micali, "Non-Interactive Oblivious Transfer and Applications". CRYPTO 1989. paper: https://cseweb.ucsd.edu/~mihir/papers/niot.pdf

We slightly optimize the protocol in that instead of the client sending both β0 and β1 to the server, and the server checking that their sum (in elliptic curve terms) is C, the client just sends β0 and the server computes β1 = C-β0. In addition, the server sends back the two keys ElGamal-encrypted to the keys β0 and β1, where the client can know the private key for at most one of them. Bellare and Micali's paper does this as (s0*G, H(s0*β0) XOR K0, s1*G, H(s1*β1) XOR K1), whereas we use the slightly more efficient (s*G, H(s*β0) XOR K0, H(s*β1) XOR K1).

Running the code

In the August 2022 version, this code is now built as a Rust library that can be called from Rust or from C++.

To build the Rust library and a Rust test program:

RUSTFLAGS="-C target-cpu=native" cargo build --release

To build the C++ library that wraps the Rust library and a C++ test program:

make -C cxx

To run the Rust test program:

./target/release/spiral-spir 20 4 100 2

To run the C++ test program:

./cxx/spir_test 20 4 100 2

Here:

20 is the value of r (that is, the database will have N=2^20 entries)
4 is the number of threads to use (defaults to 1)
100 is the number of SPIR queries to prepare in the preprocessing step; there is only one round trip from the client to the server regardless of this number, but the message and response are larger (defaults to 1)
2 is the number of SPIR queries to do, one at a time. Must be at most the number of preprocessed queries, and defaults to that number

Each entry is 8 bytes. There are three phases of execution: a one-time Spiral public key generation (this only has to be done once, regardless of how many SPIR queries you do), a preprocessing phase per SPIR query (this can be done before knowing the contents of the database on the server side, or the desired index on the client side, as as above, the preprocessing for all of the SPIR queries are all done in a single round trip), and the runtime phase per SPIR query (once the contents of the database and the desired index are known).

A sample output (for r=20, 4 threads, 100 preprocessed SPIR queries, 2 SPIR queries performed):

===== ONE-TIME SETUP =====

One-time setup: 201893 µs
pub_params len = 10878976

===== PREPROCESSING =====

num_preproc = 100
Preprocessing client: 76869 µs
preproc_msg len = 3342408
Preprocessing server: 53409 µs
preproc_resp len = 128808
Preprocessing client finish: 26688 µs

===== SPIR QUERY 1 =====

Query client: 107 µs
query_msg len = 8
expansion (took 99798 us).
Query server: 411422 µs
query_resp len = 14336
Query client finish: 832 µs
idx = 778275; entry = 7783750778395

===== SPIR QUERY 2 =====

Query client: 73 µs
query_msg len = 8
expansion (took 90281 us).
Query server: 406014 µs
query_resp len = 14336
Query client finish: 810 µs
idx = 675158; entry = 6752580675278

The various lines show the amount of wall-clock time taken for various parts of the computation and the amount of data transferred between the client and the server. The last line of each SPIR query shows the random index that was looked up, and the database value the client retrieved. The value for index i should be (10000001*(i+100)+20).

Using the C++ library yourself

Build the C++ library as above:

make -C cxx

Then cxx/spir.hpp and cxx/libspir_cxx.a are the files you'll need. #include the former in your code, and link your program with the latter as well as -lpthread -ldl.

README.md

Symmetric Private Information Retrieval (SPIR) built on Spiral

Running the code

Using the C++ library yourself