(Disclaimer: This explanation is partially outdated. It is intended only as an internal reference for developers of Graphene, not as general documentation for Graphene users.)
The fork() system call is intercepted by the LibOS function shim_do_fork(). This function performs
three tasks: (1) discovers the namespace leader, (2) creates a LibOS shim_thread structure for
the new thread, and (3) calls do_migrate_process(). The first two tasks are trivial, so we
concentrate on the third one.
do_migrate_process
The function do_migrate_process() creates a new process in the underlying OS, establishes a
channel between the current process and the newly created child process, collects checkpoint data
from the parent process, sends the checkpoint to the child process, and establishes an IPC port
to listen for events from the child process (such as death of the child).
To implement this logic, do_migrate_process() calls the following functions:
_DkProcessCreate() -- PAL-specific creation of the process in the underlying OS. For SGX PAL,
this function performs clone() + execve() system calls in the underlying Linux to create a new
process. The new (child) process can communicate with the parent process via Unix pipes. Using
these pipes, the parent and the child establish a shared secret via Diffie-Hellman (DH) key
exchange. A simple Key Derivation Function (KDF) is used to derive a shared 16B AES-CMAC key.
Ideally, messages between the parent and the child would be encrypted and authenticated with this shared key.
Currently, however, the key is used only for mutual authentication of the parent and the child.
As part of mutual authentication, the child sends its REPORT (generated by EREPORT) to the parent
who verifies it, and the parent sends its REPORT to the child, who in turn verifies it.
*migrate() -- function pointer to collect checkpoint data from the parent process (e.g., shim
structures for process metadata, thread metadata, loaded-library metadata, mount points with
FS metadata, VMAs, etc.). The checkpoint is created in trusted enclave memory and is never
persisted (so building it does not lead to leaks). For the fork case, the actual function is
migrate_fork(). This function collects all shim data on the current state of the parent process.
send_checkpoint_on_stream() -- sends the whole checkpoint to the child using DkStreamWrite().
The notable point about this function is that the data is sent in plaintext. For SGX, the data
should be encrypted in this function, which is currently not done.
After the child is created, Pal-Linux-SGX enters the enclave using EENTER. The enclave code calls
pal_linux_main(), which in turn calls init_child_process(). The init_child_process() function
serves as the child counterpart of the _DkProcessCreate() logic of the parent. After
init_child_process() successfully authenticates the parent, pal_linux_main() ends up calling
pal_main(), which calls start_execution(), which finally invokes the entry point of LibOS --
shim_start(). shim_start() simply calls shim_init(), which reads the checkpoint sent by the
parent, restores it in the child enclave's memory, and finally jumps to the checkpointed RIP.
In particular, these are the important child functions for fork:
init_child_process() -- a counterpart function to the parent's _DkProcessCreate(). It inherits
Unix pipes to communicate with the parent. Using these pipes, the child establishes a shared
secret via DH key exchange with the parent. The rest is similar to the logic in
_DkProcessCreate(). At the end of this function, the child has successfully authenticated the
parent process.
do_migration() -- reads the checkpoint sent by the parent using DkStreamRead(). You can think
of do_migration() as a counterpart to send_checkpoint_on_stream(). This function is called
from shim_init().
restore_checkpoint() -- restores the checkpoint in the child enclave's memory. You can think of
restore_checkpoint() as a counterpart to migrate_fork(). This function is called from
shim_init().
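The pairing of send_checkpoint_on_stream() and do_migration() can be sketched as a framed write and a matching read. This is only an illustration in Python, not the Graphene code: the 8-byte length header, the read_exact() helper, and the use of a socket pair in place of the PAL stream are all assumptions made for the example. Note that, as in the text above, the bytes cross the stream unencrypted.

```python
import socket
import struct

def send_checkpoint_on_stream(stream, cpstore: bytes) -> None:
    """Parent side: write a length header (assumed framing), then the checkpoint."""
    stream.sendall(struct.pack("<Q", len(cpstore)))
    stream.sendall(cpstore)  # plaintext, mirroring the current behavior

def read_exact(stream, size: int) -> bytes:
    """Hypothetical helper: keep reading until exactly `size` bytes have arrived."""
    buf = bytearray()
    while len(buf) < size:
        chunk = stream.recv(size - len(buf))
        if not chunk:
            raise ConnectionError("stream closed before the checkpoint was complete")
        buf.extend(chunk)
    return bytes(buf)

def do_migration(stream) -> bytes:
    """Child side: read the header, then the checkpoint body."""
    (size,) = struct.unpack("<Q", read_exact(stream, 8))
    return read_exact(stream, size)

# A socket pair stands in for the pipes between parent and child.
parent_end, child_end = socket.socketpair()
cpstore = b"threads|vmas|mounts"  # placeholder for the real checkpoint contents
send_checkpoint_on_stream(parent_end, cpstore)
received = do_migration(child_end)
parent_end.close()
child_end.close()
```

In the real flow, restore_checkpoint() would then consume the received buffer.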
The high-level flow between the parent and the child on fork looks as follows:
Parent Child
+---------------------------------------+ +----------------------------------------+
fork() system call
+
+ do_migrate_process()
+
| _DkProcessCreate()
| + clone() & execve()
| | <create new process> +----------------------------> OS starts Pal-Linux-SGX
| | +
| | + Newly created enclave does EENTER
| | +
| | + pal_linux_main()
| | +
| | | init_child_process()
| | | +
| + <mutual authentication flow> <-------------------------> | + <mutual authentication flow>
| |
| cpstore = migrate_fork() + shim_init()
| + +
| | <checkpoint shim state> | do_migration()
| | | +
| + <checkpoint memory contents> | |
| | |
+ send_checkpoint_on_stream(cpstore) | |
+ | |
+ _DkStreamWrite(cpstore) +---------------------------------> | + cpstore = _DkStreamRead()
|
DONE WITH FORK | restore_checkpoint(cpstore)
| +
| + <restore in enclave memory>
|
+ <jump to checkpointed RIP>
CONTINUE EXECUTION AFTER FORK
The mutual authentication flow between the parent and the child (mentioned in the previous diagram) looks as follows:
Parent Child
+---------------------------------------+ +----------------------------------------+
_DkProcessCreate() init_child_process()
+ +
| session_key = _DkStreamKeyExchange() | session_key = _DkStreamKeyExchange()
| + | +
| | lib_DhInit() | | lib_DhInit()
| | | |
| | parent_pub = lib_DhCreatePublic() | | child_pub = lib_DhCreatePublic()
| | DH | |
| | _DkStreamWrite(parent_pub) +---------- ------------> | | _DkStreamWrite(child_pub)
| | \/ | |
| | child_pub = _DkStreamRead() <---------/\------------+ | | parent_pub = _DkStreamRead()
| | | |
| | session_key = lib_DhCalcSecret() | | session_key = lib_DhCalcSecret()
| | | |
| + session_key = KDF(session_key) to 32B | + session_key = KDF(session_key) to 32B
| |
| mac_key = session_key_to_mac_key() | mac_key = session_key_to_mac_key()
| + | +
| + mac_key = KDF(session_key) to 16B | + mac_key = KDF(session_key) to 16B
| |
| parent_eid = MAC(enclave_id) with mac_key | child_eid = MAC(enclave_id) with mac_key
| |
+ _DkStreamAttestationRequest() + _DkStreamAttestationRespond()
+ +
+ <SGX attestation flow> <-------------------------------> + <SGX attestation flow>
AUTHENTICATION COMPLETED AUTHENTICATION COMPLETED
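The key-derivation part of the diagram above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the mbedTLS-based implementation: the group parameters P and G are stand-ins chosen so the example is short (the real code uses the RFC 3526 2048-bit MODP group), the function names only mirror the diagram, and xor_fold() is one plausible reading of the XOR-based KDF described in the notes.

```python
import secrets

# ASSUMPTION: toy DH group for illustration only. A Mersenne prime keeps the
# example short; it is NOT the RFC 3526 group used by the real code.
P = 2**1279 - 1
G = 3
DH_SIZE = 160  # bytes needed to hold a value modulo P

def dh_create_public(private: int) -> int:
    return pow(G, private, P)

def dh_calc_secret(private: int, peer_public: int) -> bytes:
    return pow(peer_public, private, P).to_bytes(DH_SIZE, "big")

def xor_fold(key: bytes, out_len: int) -> bytes:
    """XOR successive out_len-byte chunks of `key` into one out_len-byte key."""
    out = bytearray(out_len)
    for i, byte in enumerate(key):
        out[i % out_len] ^= byte
    return bytes(out)

# Each side picks a private value and publishes G^private mod P.
parent_priv = secrets.randbelow(P - 2) + 1
child_priv = secrets.randbelow(P - 2) + 1
parent_pub = dh_create_public(parent_priv)
child_pub = dh_create_public(child_priv)

# After exchanging the public values, both sides compute the same secret.
parent_secret = dh_calc_secret(parent_priv, child_pub)
child_secret = dh_calc_secret(child_priv, parent_pub)

# session_key = KDF(secret) to 32B, then mac_key = KDF(session_key) to 16B.
parent_session_key = xor_fold(parent_secret, 32)
child_session_key = xor_fold(child_secret, 32)
parent_mac_key = xor_fold(parent_session_key, 16)
child_mac_key = xor_fold(child_session_key, 16)

# In this sketch, folding twice equals folding the DH secret straight to 16B.
direct_mac_key = xor_fold(parent_secret, 16)
```

With this reading of the KDF, the two-step derivation adds nothing over a single fold to 16B, which is one way to see why the double use of the KDF looks odd.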
The SGX attestation flow between the parent and the child (mentioned in the previous diagram) looks as follows:
Parent Child
+---------------------------------------+ +----------------------------------------+
_DkStreamAttestationRequest() _DkStreamAttestationRespond()
+ +
| parent_targetinfo = |
| {parent_mrenclave, parent_encl_attrs} |
| |
| _DkStreamWrite(parent_targetinfo) +--------------------> | parent_targetinfo = _DkStreamRead()
| |
| | child_report = EREPORT(
| | targetinfo = parent_targetinfo,
| | reportdata = {
| | child_enclave_flags,
| | child_enclave_id,
| | mac = MAC(child_eid)
| | }
| | )
| |
| child_report = _DkStreamRead() <-----------------------+ | _DkStreamWrite(child_report)
| |
| sgx_verify_report(child_report) |
| + |
| + <SGX report verification> |
| |
| check_child_mrenclave(child_report, mac_key) |
| + |
| + <SGX trusted-child check> |
| |
| child_targetinfo = |
| {child_report.mrenclave, |
| child_report.encl_attrs} |
| |
| parent_report = EREPORT( |
| targetinfo = child_targetinfo, |
| reportdata = { |
| parent_enclave_flags, |
| parent_enclave_id, |
| mac = MAC(parent_eid) |
| } |
| ) |
| |
+ _DkStreamWrite(parent_report) +------------------------> | parent_report = _DkStreamRead()
|
ATTESTATION COMPLETED | sgx_verify_report(parent_report)
| +
| + <SGX report verification>
|
| check_parent_mrenclave(parent_report, mac_key)
| +
+ + <SGX trusted-parent check>
ATTESTATION COMPLETED
(The current source code contains att -- the attestation structure describing the child/parent
process. This att wrapper around the SGX report is redundant. In the diagram above, it is omitted.)
The SGX report verification logic (mentioned in the previous diagram) looks as follows:
Parent and child
+---------------------------------------------------------+
sgx_verify_report(report)
+
| report_key = EGETKEY(keyrequest = {
| keyname = REPORT_KEY,
| keyid = report.keyid})
|
| check_mac = MAC(all report except keyid) with report_key
|
| check_mac == report.mac ?
| +-------------------------------------------------+
| | Proof that received report was generated by |
| | the legitimate (and same) SGX machine. |
| | Report fields can be trusted (e.g., MRENCLAVE) |
+ +-------------------------------------------------+
REPORT VERIFIED
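The verification step above can be modeled in software to make the data flow concrete. The following Python sketch is a simulation under loud assumptions: CPU_SECRET stands in for the processor-internal secret, egetkey_report_key() and ereport() only imitate what the EGETKEY and EREPORT instructions do, the Report structure is cut down to three fields plus the MAC, and HMAC-SHA256 replaces the hardware CMAC because the Python standard library has no AES.

```python
import hashlib
import hmac
import secrets
from dataclasses import dataclass

CPU_SECRET = secrets.token_bytes(32)  # ASSUMPTION: stands in for the per-CPU secret

@dataclass
class Report:
    mrenclave: bytes   # identity of the enclave that produced the report
    reportdata: bytes  # caller-supplied data copied into the report
    keyid: bytes       # random nonce, not covered by the MAC
    mac: bytes

def egetkey_report_key(own_mrenclave: bytes, keyid: bytes) -> bytes:
    """Simulated EGETKEY(REPORT_KEY): bound to the calling enclave and keyid."""
    msg = b"REPORT_KEY" + own_mrenclave + keyid
    return hmac.new(CPU_SECRET, msg, hashlib.sha256).digest()[:16]

def report_mac(key: bytes, mrenclave: bytes, reportdata: bytes) -> bytes:
    """MAC over every report field except keyid (HMAC stands in for CMAC)."""
    return hmac.new(key, mrenclave + reportdata, hashlib.sha256).digest()[:16]

def ereport(own_mrenclave: bytes, target_mrenclave: bytes, reportdata: bytes) -> Report:
    """Simulated EREPORT: the MAC key is the *target* enclave's report key."""
    keyid = secrets.token_bytes(32)
    key = egetkey_report_key(target_mrenclave, keyid)
    return Report(own_mrenclave, reportdata, keyid,
                  report_mac(key, own_mrenclave, reportdata))

def sgx_verify_report(own_mrenclave: bytes, report: Report) -> bool:
    key = egetkey_report_key(own_mrenclave, report.keyid)
    check_mac = report_mac(key, report.mrenclave, report.reportdata)
    return hmac.compare_digest(check_mac, report.mac)

parent_mr, child_mr = b"P" * 32, b"C" * 32
child_report = ereport(child_mr, parent_mr, b"child-reportdata")
verified = sgx_verify_report(parent_mr, child_report)

# A report whose MRENCLAVE was altered in transit no longer verifies.
forged = Report(b"X" * 32, child_report.reportdata, child_report.keyid, child_report.mac)
forged_verified = sgx_verify_report(parent_mr, forged)
```

The point of the model is that only an enclave on the same machine can recompute the report key, so a matching MAC lets the verifier trust the report fields.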
The SGX trusted-child / trusted-parent check (mentioned in the previous diagram) looks as follows:
Parent Child
+---------------------------------------------------+ +---------------------------------------------------+
check_child_mrenclave(child_report, mac_key) check_parent_mrenclave(parent_report, mac_key)
+ +
| check_child_eid = MAC( | check_parent_eid = MAC(
| child_report.reportdata.child_enclave_id) | parent_report.reportdata.parent_enclave_id)
| with mac_key | with mac_key
| |
| check_child_eid == child_report.reportdata.mac ? | check_parent_eid == parent_report.reportdata.mac ?
| +---------------------------------------------+ | +---------------------------------------------+
| | Proof that the child possesses the same | | | Proof that the parent possesses the same |
| | shared key (derived from DH key exchange) | | | shared key (derived from DH key exchange) |
| +---------------------------------------------+ | +---------------------------------------------+
| |
| child_report.mrenclave == parent_mrenclave ? + <MISSING: need a check that the parent is trusted>
|
| OR TRUSTED-PARENT CHECK COMPLETED
|
| child_report.mrenclave IN trusted_children ?
| +----------------------------------------------+
| | Proof that child is either the same enclave |
| | or one of the trusted children enclaves |
| | specified in the manifest |
+ +----------------------------------------------+
TRUSTED-CHILD CHECK COMPLETED
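The parent-side check in the diagram above amounts to two tests: the MAC over the child's enclave ID must match, and the child's MRENCLAVE must be either the parent's own or listed as trusted. A minimal Python sketch follows. It is not the Graphene code: the dictionary layout of the report, the trusted_children set, and the use of truncated HMAC-SHA256 in place of AES-CMAC are assumptions made for the example.

```python
import hashlib
import hmac
import secrets

def mac(mac_key: bytes, data: bytes) -> bytes:
    # ASSUMPTION: truncated HMAC-SHA256 stands in for AES-CMAC.
    return hmac.new(mac_key, data, hashlib.sha256).digest()[:16]

def check_child_mrenclave(child_report: dict, mac_key: bytes,
                          parent_mrenclave: bytes, trusted_children: set) -> bool:
    # Step 1: the child must hold the same key derived from the DH exchange.
    check_child_eid = mac(mac_key, child_report["enclave_id"])
    if not hmac.compare_digest(check_child_eid, child_report["mac"]):
        return False
    # Step 2: the child is the same enclave or an explicitly trusted one.
    mrenclave = child_report["mrenclave"]
    return mrenclave == parent_mrenclave or mrenclave in trusted_children

mac_key = secrets.token_bytes(16)
parent_mrenclave = b"P" * 32
trusted_children = {b"T" * 32}  # would come from the manifest
child_eid = secrets.randbits(64).to_bytes(8, "little")

same_enclave = {"mrenclave": parent_mrenclave, "enclave_id": child_eid,
                "mac": mac(mac_key, child_eid)}
trusted_child = dict(same_enclave, mrenclave=b"T" * 32)
unknown_child = dict(same_enclave, mrenclave=b"U" * 32)
wrong_key_child = dict(same_enclave, mac=mac(b"\x00" * 16, child_eid))

results = [check_child_mrenclave(r, mac_key, parent_mrenclave, trusted_children)
           for r in (same_enclave, trusted_child, unknown_child, wrong_key_child)]
```

A check_parent_mrenclave() counterpart would perform only step 1 today; adding step 2 there is exactly the missing trusted-parent check.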
Notes for the above diagrams:
Diffie-Hellman is implemented using mbedTLS primitives. The configuration parameters are MBEDTLS_DHM_RFC3526_MODP_2048_P, MBEDTLS_DHM_RFC3526_MODP_2048_G, DH_SIZE=256.
The Key Derivation Function (KDF) used here is very simple: it XORs 32B/16B chunks of the input key to produce a 32B/16B output key. This KDF is weak.
The MAC function is actually AES-CMAC from mbedTLS. For the 16B mac-key, it uses the
MBEDTLS_CIPHER_AES_128_ECB cipher. The format in the diagram is MAC(data1 || data2) with mac-key.
enclave_id is the Enclave Identifier -- a 64-bit random number generated as part of the app
initialization inside the SGX enclave. Thus, enclave_id is dynamic and unique per enclave instance
(in contrast to MRENCLAVE). enclave_id is needed to distinguish between two instances of the same
enclave image. It is also needed to protect against replay attacks.
SGX report produced by EREPORT contains: enclave's MRENCLAVE, MRSIGNER, ISVPRODID, ISVSVN, CPUSVN, and attributes (all copy-pasted from enclave's SECS), as well as a nonce keyid (randomly-generated by EREPORT), and reportdata (copy-pasted from input reportdata). All fields in the report are MACed, except for keyid. The report also contains the MAC itself.
There is no encryption at any point in the fork protocol. For example, the complete checkpoint
is passed to the child in plaintext. Ideally, the pipe/stream between the parent and the child
should always be encrypted as soon as the shared DH key is established. This would require
changes in the SGX PAL's implementation of DkStreamWrite(), DkStreamRead(), etc.
The currently used KDF is weak. It is not clear whether this weakens the generated MACs. Can the
attacker reconstruct mac-key by observing the transmitted MACs (child_report.reportdata.mac and
parent_report.reportdata.mac)? Also, the double use of the KDF is strange: first the 256B DH key
(DH_SIZE=256) is KDFed to 32B, and then again to 16B.
The missing trusted-parent-enclave check in check_parent_mrenclave() opens an attack vector.
The attacker can start a malicious enclave which spawns a benign child (by tweaking the untrusted
PAL's logic in the child process). The child authenticates this malicious enclave without
complaint. At that point, there is a communication channel between the attacker-controlled
malicious enclave and the victim child.
It is not clear if the current check in check_child_mrenclave() and check_parent_mrenclave()
is sufficient to protect against replay attacks. (Can the attacker actually do anything malicious
with stale messages from the old child or parent?) Ideally, we want both the parent and the child
to challenge each other using a cryptographic nonce (the nonce can simply be the enclave ID?).
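The nonce-based challenge suggested in the note above could look roughly like the following Python sketch. It is a proposal-level illustration, not existing Graphene code: make_challenge(), respond(), and verify_response() are hypothetical names, a separate random nonce is used rather than the enclave ID, and HMAC-SHA256 again stands in for AES-CMAC.

```python
import hashlib
import hmac
import secrets

def make_challenge() -> bytes:
    """Verifier side: a fresh random nonce for every authentication attempt."""
    return secrets.token_bytes(16)

def respond(mac_key: bytes, nonce: bytes, enclave_id: bytes) -> bytes:
    """Prover side: bind the response to the verifier's nonce and own enclave ID."""
    return hmac.new(mac_key, nonce + enclave_id, hashlib.sha256).digest()

def verify_response(mac_key: bytes, nonce: bytes, enclave_id: bytes,
                    response: bytes) -> bool:
    return hmac.compare_digest(respond(mac_key, nonce, enclave_id), response)

mac_key = secrets.token_bytes(16)
child_eid = secrets.randbits(64).to_bytes(8, "little")

# First session: the parent challenges, the child answers, the check passes.
nonce_1 = make_challenge()
response_1 = respond(mac_key, nonce_1, child_eid)
fresh_ok = verify_response(mac_key, nonce_1, child_eid, response_1)

# Later session: an attacker replays the old response against a new nonce.
nonce_2 = make_challenge()
replay_ok = verify_response(mac_key, nonce_2, child_eid, response_1)
```

Because the verifier picks the nonce, a recorded response is only valid for the exchange in which it was produced. In the SGX flow, the response would naturally travel inside reportdata.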
Performance optimization: to reduce latency of fork, the authentication process can run in
parallel with migrate_fork(). Currently, the parent first waits to finish the authentication of
the child and then starts collecting the checkpoint.
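The ordering constraint behind this optimization can be sketched as follows: checkpoint collection and authentication overlap, but the checkpoint is released for sending only after authentication has succeeded. The Python below is a schematic model only. The two worker functions are placeholders with artificial delays, fork_parallel() is a hypothetical name, and threading inside an enclave would look different in practice.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def authenticate_child() -> bool:
    time.sleep(0.05)  # placeholder for the DH exchange and attestation round trips
    return True

def migrate_fork() -> bytes:
    time.sleep(0.05)  # placeholder for walking and serializing shim state
    return b"checkpoint"

def fork_parallel() -> bytes:
    with ThreadPoolExecutor(max_workers=2) as pool:
        auth_future = pool.submit(authenticate_child)
        checkpoint_future = pool.submit(migrate_fork)
        authenticated = auth_future.result()
        cpstore = checkpoint_future.result()
    # The checkpoint must never leave the enclave for an unauthenticated child.
    if not authenticated:
        raise PermissionError("child authentication failed; checkpoint not sent")
    return cpstore

cpstore = fork_parallel()
```

The caller would pass the returned buffer to send_checkpoint_on_stream() only after fork_parallel() returns without raising.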
Parent:
shim_do_fork(): LibOS/shim/src/sys/shim_fork.c
do_migrate_process(): LibOS/shim/src/shim_checkpoint.c
_DkProcessCreate(): Pal/src/host/Linux-SGX/db_process.c
migrate_fork(): LibOS/shim/src/sys/shim_fork.c
send_checkpoint_on_stream(): LibOS/shim/src/shim_checkpoint.c
Child:
init_child_process(): Pal/src/host/Linux-SGX/db_process.c
do_migration(): LibOS/shim/src/shim_checkpoint.c
restore_checkpoint(): LibOS/shim/src/shim_checkpoint.c