README.md 5.8 KB

Symmetric Private Information Retrieval (SPIR) built on Spiral

Ian Goldberg (iang@uwaterloo.ca), July 2022

This code implements Symmetric Private Information Retrieval, building on the Spiral PIR library (this code is not written by the Spiral authors).

Spiral is a recent single-server (computational) PIR system from 2022:

In ordinary PIR, the client learns the database record they were looking for, and the server does not learn which record that was. The client is not prevented, however, from learning additional database records. Downloading the whole database, for example, can be seen as a trivial form of PIR (though one that transfers way too much data, except for tiny databases).

In Symmetric PIR (SPIR), the client must learn only one database record, in addition to the server learning no information about which record that was. SPIR is similar to oblivious transfer (OT), except that SPIR aims to have sublinear communication (at no time is something the size of the whole database downloaded), while OT does not have that restriction.

This code builds SPIR from ordinary PIR (using Spiral), using a strategy based on that of Naor and Pinkas:

The general strategy is this (for a database with N=2^r records):

  1. Server: Encrypt each record of the database with a unique key
  2. Client: Use 1-of-N OT to retrieve exactly one key
  3. Client: Privately download the encrypted record and decrypt it with the key retrieved above

In the Naor and Pinkas paper, step 3 was just trivial download of the whole encrypted database, but we use Sprial to privately retrieve the encrypted record of interest. Note, however, that the client will typically learn additional nearby records as well, which is why we need to encrypt each record separately.

The 1-of-N OT is built from r 1-of-2 OTs in the same manner as in the Naor and Pinkas paper: the server picks r pairs of random keys, and the client chooses one key from each pair, according to the bits of the desired query index value. Then each database entry is encrypted with the combination of keys associated with its index.

The Naor and Pinkas paper uses this for its encryption, where F is a pseudorandom function (PRF) family:

Enc[j] = Db[j] XOR F_{K_1,j1}(j) XOR F_{K_2,j2}(j) XOR ... XOR F_{K_r,jr}(j)

where j1,j2,...,jr are the bits of j, as j ranges from 0 to N-1.

We instead use:

Ktot = K_1,j1 XOR K_2,k2 XOR ... XOR K_r,kr

Enc[j] = Db[j] XOR F_{Ktot}(j)

That is, Naor and Pinkas XOR the outputs of the PRF, whereas we XOR the keys. This requires a slightly stronger assumption on the PRF family (for example, that the functions with different keys are independent, even if the keys have some known XOR relation), but we're going to use AES for F, so that should be fine.

For the 1-of-2 OT protocol, we base it on the simple one from the 1989 paper of Bellare and Micali:

We slightly optimize the protocol in that instead of the client sending both β0 and β1 to the server, and the server checking that their sum (in elliptic curve terms) is C, the client just sends β0 and the server computes β1 = C-β0. In addition, the server sends back the two keys ElGamal-encrypted to the keys β0 and β1, where the client can know the private key for at most one of them. Bellare and Micali's paper does this as (r0*G, H(r0*β0) XOR K0, r1*G, H(r1*β1) XOR K1), whereas we use the slightly more efficient (r*G, H(r*β0) XOR K0, H(r*β1) XOR K1).

Running the code

To build the code:

RUSTFLAGS="-C target-cpu=native" cargo build --release

To run the code:

./target/release/spiral-spir 20 4

Where 20 is the value of r (that is, the database will have N=2^20 entries), and 4 is the number of threads to use (defaults to 1). Each entry is 8 bytes. There are three phases of execution: a one-time Spiral public key generation (this only has to be done once, regardless of how many SPIR queries you do), a preprocessing phase per SPIR query (this can be done before knowing the contents of the database on the server side, or the desired index on the client side), and the runtime phase per SPIR query (once those two things are known).

A sample output (for r=20, 4 threads):

===== ONE-TIME SETUP =====

Using a 2048 x 4096 byte database (8388608 bytes total)
OT one-time setup: 3637 µs
Spiral client one-time setup: 157144 µs, 10878976 bytes

===== PREPROCESSING =====

rand_idx = 146516 rand_pir_idx = 286
Spiral query: 457 µs, 32768 bytes
key OT query in 324 µs, 640 bytes
key OT serve in 1653 µs, 1280 bytes
key OT receive in 1029 µs

===== RUNTIME =====

Send to server 8 bytes
Server encrypt database 29738 µs
Server load database 248825 µs
expansion (took 101920 us).
Server compute response 181293 µs, 14336 bytes (*including* the above expansion time)
Client decode response 790 µs
index = 919657, Response = 9196570919677

The various lines show the amount of wall-clock time taken for various parts of the computation and the amount of data transferred between the client and the server. The last line shows the random index that was looked up, and the database value the client retrieved. The value for index i should be (10000001*i+20).