(Disclaimer: This explanation is partially outdated. It is intended only as an internal reference for developers of Graphene, not as general documentation for Graphene users.)
This analysis is written while Graphene's signal-handling mechanisms are in flux. In the future, all Graphene PALs should implement the same mechanism, and LibOS should adopt a better scheme to support nested signals and alternate signal stacks.
In the interest of space and mental sanity, we do not discuss the FreeBSD PAL implementation. Historically, Linux and FreeBSD shared the same mechanism (where signals were immediately delivered to LibOS even if they arrived during a PAL call). This old mechanism was adopted by the Linux-SGX PAL, though due to peculiarities of Intel SGX, it has its own sub-flows and is more complicated. Currently, the Linux PAL implements a new mechanism where a signal arriving during a PAL call is pended and delivered to LibOS only after the PAL call finishes.
So, there are two signal-handling mechanisms at the PAL layer:
Linux PAL: (1) If a signal arrives during a PAL call, pend it and return from the signal context, continuing the normal context of the PAL call. Immediately after the PAL call finishes, deliver all pending signals to LibOS. (2) If a signal arrives during LibOS/application code, deliver it to LibOS immediately. Note that this delivery and handling happen in signal context (in contrast to pending-signal delivery).
Linux-SGX PAL: (1) If a signal arrives during enclave-code execution, remember the interrupted enclave-code context and return from the signal context. When jumping back into the enclave (in normal context), deliver the signal to LibOS. After handling the signal, LibOS/PAL continues from the interrupted enclave-code context. (2) If a signal arrives during non-enclave-code (i.e., untrusted-PAL) execution, just return from the signal context. When jumping back into the enclave (in normal context), deliver the signal to LibOS. In contrast to the first case, after handling the signal, LibOS/PAL continues as if the outermost PAL function failed with PAL_ERROR_INTERRUPTED.
The advantage of the first mechanism is that nested PAL calls (which Graphene does not support) can never occur. However, it also disallows nested signals already at the PAL layer. The advantage of the second mechanism is that nested signals are possible, at least as far as the PAL layer is concerned.
There is a single unified signal-handling mechanism at the LibOS layer. This mechanism does not support nested signals: if a signal is delivered while another signal is being handled (or while a LibOS internal lock is held), it is pended. Pended signals are delivered after any system-call completion or after any LibOS internal unlock.
A new signal-handling mechanism at the LibOS layer was proposed by Isaku Yamahata (see https://github.com/oscarlab/graphene/pull/347). This proposal changes the points at which signals are delivered to the user app: (1) if a signal arrives during app execution, it is delivered after the host OS returns from the signal context, and (2) if a signal arrives during LibOS/PAL execution, it is delivered after system-call completion. This is in contrast to the current LibOS approach of (1) delivering the first signal even in the middle of an emulated syscall and (2) pending nested signals until system-call completion.
Normal context
+----------------+
... load_enclave() ...
+
| sgx_signal_setup()
| +
| | set_sighandler(SIGTERM | SIGINT | SIGCONT) +---------------+
| | + | async signals |
| | | action = struct sigaction( +---------------+
| | | sa_handler = _DkTerminateSighandler,
| | | sa_restorer = rt_sigreturn,
| | | sa_flags = {SA_SIGINFO | SA_RESTORER}
| | | sa_mask = {SIGCONT}
| | | )
| | |
| | | rt_sigaction(SIGTERM, action)
| | | rt_sigaction(SIGINT, action)
| | | rt_sigaction(SIGCONT, action)
| | |
| | + rt_sigprocmask(SIGUNBLOCK, SIGTERM | SIGINT | SIGCONT)
| |
| | set_sighandler(SIGSEGV | SIGILL | SIGFPE | SIGBUS) +--------------+
| | + | sync signals |
| | | action = struct sigaction( +--------------+
| | | sa_handler = _DkResumeSighandler,
| | | sa_restorer = rt_sigreturn,
| | | sa_flags = {SA_SIGINFO | SA_RESTORER}
| | | sa_mask = {SIGCONT}
| | | )
| | |
| | | rt_sigaction(SIGSEGV, action)
| | | rt_sigaction(SIGILL, action)
| | | rt_sigaction(SIGFPE, action)
| | | rt_sigaction(SIGBUS, action)
| | |
+ + + rt_sigprocmask(SIGUNBLOCK, SIGSEGV | SIGILL | SIGFPE | SIGBUS)
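For reference, the set_sighandler() steps above correspond to the standard Linux sigaction/sigprocmask pattern. Below is a minimal, hypothetical sketch using libc wrappers instead of the raw rt_sigaction/rt_sigprocmask syscalls (libc installs SA_RESTORER itself); the handler names only mirror the diagram, this is not the actual PAL code:

/* Minimal sketch of the set_sighandler() pattern from the diagram above,
 * using libc sigaction()/sigprocmask(). Handler bodies are placeholders. */
#define _GNU_SOURCE
#include <signal.h>
#include <string.h>
#include <unistd.h>

static void terminate_sighandler(int sig, siginfo_t* info, void* ucontext) {
    /* stands in for _DkTerminateSighandler() */
    (void)sig; (void)info; (void)ucontext;
    static const char msg[] = "async signal received\n";
    write(STDERR_FILENO, msg, sizeof(msg) - 1);
}

static void resume_sighandler(int sig, siginfo_t* info, void* ucontext) {
    /* stands in for _DkResumeSighandler() */
    (void)sig; (void)info; (void)ucontext;
}

static int set_sighandler(const int* sigs, int count,
                          void (*handler)(int, siginfo_t*, void*)) {
    struct sigaction action;
    memset(&action, 0, sizeof(action));
    action.sa_sigaction = handler;
    action.sa_flags     = SA_SIGINFO;        /* SA_RESTORER is added by libc itself */
    sigemptyset(&action.sa_mask);
    sigaddset(&action.sa_mask, SIGCONT);     /* block SIGCONT while the handler runs */

    sigset_t unblock;
    sigemptyset(&unblock);
    for (int i = 0; i < count; i++) {
        if (sigaction(sigs[i], &action, NULL) < 0)
            return -1;
        sigaddset(&unblock, sigs[i]);
    }
    /* corresponds to rt_sigprocmask(SIG_UNBLOCK, ...) in the diagram */
    return sigprocmask(SIG_UNBLOCK, &unblock, NULL);
}

int main(void) {
    const int async_sigs[] = { SIGTERM, SIGINT, SIGCONT };
    const int sync_sigs[]  = { SIGSEGV, SIGILL, SIGFPE, SIGBUS };
    set_sighandler(async_sigs, 3, terminate_sighandler);
    set_sighandler(sync_sigs, 4, resume_sighandler);
    raise(SIGINT);   /* exercises the async handler once */
    return 0;
}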
Using SIGINT as an example, until we arrive at _DkGenericSignalHandle().
Normal context Signal context
+----------------+ +----------------+
+ ... enclave code .....
|
| +-----+
| | AEX due to SIGINT
| <-----+
| signal handler called
| +----------------------------------------> _DkTerminateSighandler(SIGINT, siginfo, uc)
| +
| | sgx_raise(PAL_EVENT_SUSPEND)
| | +
| | | RDX = after_resume() addr
| | | RBX = current thread's TCS
| | | RCX = async_exit_pointer() addr
| | | RDI = PAL_EVENT_SUSPEND (from first func arg)
| | |
| | | EENTER(RBX, RCX) <---+
| | | + |
| | | | (SGX creates new SSA frame and |
| | | | sets RAX = current-SSA-frame = 1) |e
| | | | |n
| | | | enclave_entry() |c
| | | | + |l
| | | | | jump to handle_resume() |a
| | | | | + |v
| | | | | <double-check RDI contains signum > |e
| | | | | |
| | | | | jump to handle_exception() |m
| | | | | |o
| | | | | create new enclave-thread stack frame |d
| | | | | and push GPRSGX registers on this frame |e
| | | | | |
| | | | | update GPRSGX = (RSP = new frame, |
| | | | |                RDI = PAL_EVENT_SUSPEND,         |
| | | | | RSI = new frame, |
| | | | | RIP = _DkExceptionHandler) |
| | | | | |
| | | + + EEXIT(RDX = after_resume) <---+
| signal handler done | |
| <----------------------------------------+ + + after_resume(): return
|
| async_exit_pointer()
| +
| | ERESUME <---+
| | + |
| | | (SGX's current-SSA-frame = 0, thus |e
| | | enclave-thread's GPRSGX is loaded in regs) |n
| | | |c
| | | _DkExceptionHandler(exit_info = RDI = PAL_EVENT_SUSPEND, uc = RSI = new frame) |l
| | | + |a
| | | | PAL_CONTEXT ctx = copy(uc) <ctx contains interrupted-context frame> |v
| | | | |e
| | | | _DkExceptionRealHandler(PAL_EVENT_SUSPEND, ctx) |
| | | | |
+ + + + _DkGenericSignalHandle(PAL_EVENT_SUSPEND, ctx) v
< ... >
Non-enclave code execution can only happen if the Graphene process is currently executing untrusted-PAL code, e.g., is blocked on a futex(wait) system call.
Using SIGINT as an example, until we arrive at _DkGenericSignalHandle().
Normal context Signal context
+----------------+ +----------------+
+ ... non-enclave code ...
|
| +-----+
| | SIGINT
| <-----+
| signal handler called
| +----------------------------------------> _DkTerminateSighandler(SIGINT, siginfo, uc)
| +
| | update normal-context registers =
| | (RIP = sgx_entry_return,
| | RDI = -PAL_ERROR_INTERRUPTED,
| signal handler done | RSI = PAL_EVENT_SUSPEND)
| <----------------------------------------+ +
|
| sgx_entry_return()
| +
| | RDX = sgx_entry() addr
| | RBX = current thread's TCS
| | RCX = async_exit_pointer() addr
| |
| | EENTER(RBX, RCX) <--+
| | + |
| | | (SGX's current-SSA-frame = 0, ocall is done) |
| | | |e
| | | enclave_entry() |n
| | | + |c
| | | | jump to return_from_ocall() |l
| | | | |a
| | | | remember RDI = -PAL_ERROR_INTERRUPTED on enclave stack |v
| | | | |e
| | | | _DkHandleExternalEvent(event = RDI = PAL_EVENT_SUSPEND, |
| | | | + uc = RSI = enclave frame) |m
| | | | | |o
| | | | | frame = get_frame(uc) <finds outermost PAL function> |d
| | | | | |e
| | | | | _DkGenericSignalHandle(PAL_EVENT_SUSPEND, frame) |
< ... > v
This case is exactly the same as for an async signal; the only difference in the diagram is that _DkTerminateSighandler is replaced by _DkResumeSighandler. The logic is identical.
Non-enclave code execution can only happen if the Graphene process is currently executing untrusted-PAL code, e.g., is blocked on a futex(wait) system call. If a sync signal arrives in this case, it means that there was a memory fault, illegal instruction, or arithmetic exception in untrusted-PAL code. This should never happen in a correct implementation of Graphene. In this case, _DkResumeSighandler simply kills the faulting thread (not the whole process!) by issuing the exit(1) syscall.
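The "kill only this thread" behavior relies on the difference between the raw exit syscall (which terminates just the calling thread) and exit_group (which terminates the whole process and is what libc's exit() actually uses). A small standalone illustration of this distinction (not PAL code):

/* Demonstrates that the raw exit syscall terminates only the calling thread,
 * while the rest of the process keeps running. Compile with -pthread. */
#define _GNU_SOURCE
#include <pthread.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

static void* faulting_thread(void* arg) {
    (void)arg;
    printf("worker: terminating only myself via SYS_exit\n");
    syscall(SYS_exit, 1);   /* this thread dies; the process keeps running */
    return NULL;            /* never reached */
}

int main(void) {
    pthread_t t;
    pthread_create(&t, NULL, faulting_thread, NULL);
    sleep(1);
    printf("main: process is still alive after the worker called SYS_exit\n");
    return 0;
}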
Normal context (enclave mode)
+----------------------------------+
+ _DkGenericSignalHandle(PAL_EVENT_SUSPEND, frame/ctx)
| +
| | upcall = _DkGetExceptionHandler(PAL_EVENT_SUSPEND)
| | = suspend_upcall
| |
| | _DkGenericEventTrigger(PAL_EVENT_SUSPEND, suspend_upcall, frame/ctx)
| | +
| | | event = struct exception_event(event_num = PAL_EVENT_SUSPEND,
| | | context = ctx, +--------------+
| | | frame = frame) | only one of |
| | | | context/frame|
| | | suspend_upcall(event, ctx) | is not NULL |
| | | + +------------------------------+ +--------------+
| | | | | event is opaque ptr to LibOS |
| | | | +------------------------------+
| | | |
| | | | +------------------- PAL -> LibOS transition --------------------+
| | | |
| | | | <... LibOS signal handling ...>
| | | |
| | | | DkExceptionReturn(event)
| | | | +
| | | | | +----------------- LibOS -> PAL transition --------------------+
| | | | |
| | | | | _DkExceptionReturn(event)
| | | | | +
| | | | | | if event.frame is not NULL:
| | | | | | update regs with event.frame regs
| | | | | | return to LibOS (as if PAL function returned)
| | | | | |
| | | | | | if event.context is not NULL:
| | | | | | update regs with event.context regs (including RSP)
| | | | | | return to interrupted-context frame (somewhere in user app)
+ + + + + +
< context is reset, no unwinding here! >
Very similar to the flow for Linux-SGX. In addition to the 7 signals handled above, the Linux PAL also operates on other signals (e.g., those handled by _DkPipeSighandler). Describing the flows for these signals is future work.
Normal context Signal context
+----------------+ +----------------+
+ ... PAL code ...
|
| +-----+
| | SIGINT
| <-----+
| signal handler called
| +----------------------------------------> _DkTerminateSighandler(SIGINT, siginfo, uc)
| +
| | < SIGINT arrived during PAL call >
| |
| | add to thread's pending events:
| | tcb.pending_event = PAL_EVENT_SUSPEND
| |
| signal handler done | append PAL_EVENT_SUSPEND to tcb.pending_queue
| <----------------------------------------+ + if tcb.pending_event is already set
|
| ... PAL call finishes (LEAVE_PAL_CALL) ...
|
| __check_pending_event()
| +
| | _DkGenericSignalHandle(tcb.pending_event,
| | siginfo_t = NULL,
| | ucontext_t = NULL)
| |
| | foreach event in tcb.pending_queue:
+ + _DkGenericSignalHandle(event, NULL, NULL)
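A standalone model may clarify this pend-and-deliver scheme: a flag marks whether we are inside a (simulated) PAL call, the host signal handler either records the event for later or handles it immediately, and the pending event is processed right after the call returns. Names such as pending_event and check_pending_event only echo the diagram; this is not the actual PAL code:

/* Standalone model of the Linux-PAL pending mechanism: a signal arriving while
 * "inside a PAL call" is recorded and handled only after the call returns. */
#define _GNU_SOURCE
#include <signal.h>
#include <stdio.h>
#include <string.h>

static volatile sig_atomic_t in_pal_call;
static volatile sig_atomic_t pending_event;   /* models tcb.pending_event */

static void deliver_to_libos(int sig) {
    /* models _DkGenericSignalHandle() handing the event to LibOS
     * (printf is fine here because in this demo it only runs in normal context) */
    printf("delivering signal %d to LibOS\n", sig);
}

static void sighandler(int sig, siginfo_t* info, void* uc) {
    (void)info; (void)uc;
    if (in_pal_call)
        pending_event = sig;      /* pend it; the PAL call continues undisturbed */
    else
        deliver_to_libos(sig);    /* app/LibOS code interrupted: deliver immediately */
}

static void check_pending_event(void) {
    /* models __check_pending_event() right after LEAVE_PAL_CALL */
    if (pending_event) {
        int sig = pending_event;
        pending_event = 0;
        deliver_to_libos(sig);
    }
}

static void simulated_pal_call(void) {
    in_pal_call = 1;
    raise(SIGINT);                /* signal arrives in the middle of the PAL call */
    /* ... rest of the PAL call body runs normally ... */
    in_pal_call = 0;
    check_pending_event();        /* pending signals are delivered only now */
}

int main(void) {
    struct sigaction act;
    memset(&act, 0, sizeof(act));
    act.sa_sigaction = sighandler;
    act.sa_flags     = SA_SIGINFO;
    sigaction(SIGINT, &act, NULL);
    simulated_pal_call();
    return 0;
}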
Normal context Signal context
+----------------+ +----------------+
+ ... non-PAL code ...
|
| +-----+
| | SIGINT
| <-----+
| signal handler called
| +----------------------------------------> _DkTerminateSighandler(SIGINT, siginfo, uc)
| +
| | < SIGINT arrived during app/LibOS code >
| |
| | _DkGenericSignalHandle(PAL_EVENT_SUSPEND,
| | + siginfo_t = NULL,
| | | ucontext_t = uc)
| | |
| | | < ... >
| signal handler done | |
| <----------------------------------------+ + +
+
Normal context Signal context
+----------------+ +----------------+
+ ... PAL code ...
|
| +-----+
| | SIGILL
| <-----+
| signal handler called
| +----------------------------------------> __DkGenericSighandler(SIGILL, siginfo, uc)
| +
| | < SIGILL arrived during PAL call >
| |
| | print panic message
| |
| | _DkThreadExit() < kill this thread >
... thread is dead ...
Normal context Signal context
+----------------+ +----------------+
+ ... non-PAL code ...
|
| +-----+
| | SIGILL
| <-----+
| signal handler called
| +----------------------------------------> __DkGenericSighandler(SIGILL, siginfo, uc)
| +
| | < SIGILL arrived during app/LibOS code >
| |
| | _DkGenericSignalHandle(PAL_EVENT_ILLEGAL,
| | + siginfo_t = siginfo,
| | | ucontext_t = uc)
| | |
| | | < ... >
| signal handler done | |
| <----------------------------------------+ + +
+
Normal context (enclave mode)
+----------------------------------+
+ _DkGenericSignalHandle(PAL_EVENT_SUSPEND, uc)
| +
| | upcall = _DkGetExceptionHandler(PAL_EVENT_SUSPEND)
| | = suspend_upcall
| |
| | _DkGenericEventTrigger(PAL_EVENT_SUSPEND, suspend_upcall, uc)
| | +
| | | event = struct exception_event(event_num = PAL_EVENT_SUSPEND,
| | | context = copy-of-uc-regs,
| | | uc = uc)
| | |
| | | suspend_upcall(event, ctx)
| | | + +------------------------------+
| | | | | event is opaque ptr to LibOS |
| | | | +------------------------------+
| | | |
| | | | +------------------- PAL -> LibOS transition --------------------+
| | | |
| | | | <... LibOS signal handling ...>
| | | |
| | | | DkExceptionReturn(event)
| | | | +
| | | | | +----------------- LibOS -> PAL transition --------------------+
| | | | |
| | | | | _DkExceptionReturn(event)
| | | | | +
| | | | | + update event.uc.regs with event.context regs
| | | | + +-----------------------------------------------+
| | | + | unlike SGX PAL, don't jump to updated context |
| | + | but unwind call stack as usual |
| + +-----------------------------------------------+
|
| ... host OS switches Graphene to normal context if was in signal context
+ (or simply continue execution if already in normal context) ...
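The last step of this flow relies on standard Linux behavior: whatever register values are left in the ucontext passed to a signal handler are restored by rt_sigreturn once the handler (and the LibOS/PAL code that ran in signal context) returns. A standalone x86-64 illustration of that behavior, redirecting the interrupted context from inside a handler (illustrative only, not Graphene code):

/* x86-64 Linux demo: registers modified in the handler's ucontext take effect
 * when the handler returns, because rt_sigreturn restores the updated context. */
#define _GNU_SOURCE
#include <signal.h>
#include <string.h>
#include <ucontext.h>
#include <unistd.h>

static void recover(void) {
    static const char msg[] = "resumed at recover() instead of the faulting instruction\n";
    write(STDOUT_FILENO, msg, sizeof(msg) - 1);
    _exit(0);
}

static void segv_handler(int sig, siginfo_t* info, void* uc_raw) {
    (void)sig; (void)info;
    ucontext_t* uc = (ucontext_t*)uc_raw;
    /* Pretend a call just happened (RSP = 8 mod 16, as the x86-64 ABI expects at
     * function entry), then point RIP at recover(); the kernel resumes there. */
    uc->uc_mcontext.gregs[REG_RSP] = (uc->uc_mcontext.gregs[REG_RSP] & ~0xfL) - 8;
    uc->uc_mcontext.gregs[REG_RIP] = (greg_t)recover;
}

int main(void) {
    struct sigaction act;
    memset(&act, 0, sizeof(act));
    act.sa_sigaction = segv_handler;
    act.sa_flags     = SA_SIGINFO;
    sigaction(SIGSEGV, &act, NULL);

    *(volatile int*)0 = 42;   /* fault: handler rewrites the saved RIP/RSP */
    return 1;                 /* never reached */
}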
Note that LibOS flows are the same for all PALs.
Using suspend_upcall() as an example.
Normal context (enclave mode, non-nested signal)
+-----------------------------------------------------+
+ suspend_upcall(event, context)
|
| if internal Graphene thread (async or ipc helper):
| DkExceptionReturn(event)
|
| siginfo_t info = (SIGINT, SI_USER, .si_pid = 0)
|
| deliver_signal(info, context = NULL)
| +
| | tcb.context.preempt = 1 < __disable_preempt() >
| |
| | shim_signal signal = (siginfo_t info = info,
| | context_stored = false/true,
| | context = LibOS-syscall/context,
| | pal_context = context = NULL)
| | +-----------------------------------------------------+
| | | If signal is delivered while in LibOS syscall, |
| | | then signal.context = LibOS-syscall context; |
| | | otherwise context = NULL and context_stored = false |
| | +-----------------------------------------------------+
| |
| | if curr-thread's signal mask includes SIGINT (blocks it):
| | < allocate_signal_log(SIGINT) and append signal to it >
| |
| | else:
| | __handle_signal(SIGINT, signal.context) < deliver pending >
| | +
| | | for each pending SIGINT signal on this thread:
| | + __handle_one_signal(SIGINT, pending-signal)
| |
| | __handle_one_signal(SIGINT, signal) < deliver this signal >
| | +
| | | save LibOS-syscall context and reset it (to indicate that
| | | context is now not LibOS but user signal handler)
| | |
| | | user signal handler(SIGINT, signal.info, signal.context)
| | | < ... >
| | |
| | | copy signal.context.<regs> in signal.pal_context if not NULL
| | + (propagate user-updated regs to event.context in DkExceptionReturn)
| |
| + tcb.context.preempt = 0 < __enable_preempt() >
|
+ DkExceptionReturn(event)
Using suspend_upcall() as an example. Assumes tcb.context.preempt = 1 (i.e., we are already inside a signal handler).
Normal context (enclave mode, nested signal)
+-----------------------------------------------------+
+ suspend_upcall(event, context)
|
| if internal Graphene thread (async or ipc helper):
| DkExceptionReturn(event)
|
| siginfo_t info = (SIGINT, SI_USER, .si_pid = 0)
|
| deliver_signal(info, context = NULL)
| +
| | tcb.context.preempt = 2 < __disable_preempt() >
| |
| | shim_signal signal = (siginfo_t info = info,
| | context_stored = false/true,
| | context = LibOS-syscall/context,
| | pal_context = context = NULL)
| | +-----------------------------------------------------+
| | | If signal is delivered while in LibOS syscall, |
| | | then signal.context = LibOS-syscall context; |
| | | otherwise context = NULL and context_stored = false |
| | +-----------------------------------------------------+
| |
| | +-----------Now different from non-nested case--------------+
| |
| | < goto delay because tcb.context.preempt > 1 >
| |
| | allocate_signal_log(SIGINT):
| |
| | append signal to tcb.thread.signal_logs[SIGINT]
| |
| | tcb.thread.has_signal = 1 (increment from 0)
| |
| + tcb.context.preempt = 1 < __enable_preempt() >
|
+ DkExceptionReturn(event)
< ...after top-level signal handler is finished... >
< ...after any system call (END_SHIM) or any internal unlock... >
+ handle_signal(false)
| +
| | __handle_signal(signal-num = 0, context = NULL)
| | +
| | | for each pending (any) signal on this thread:
| | | __handle_one_signal(signo, pending-signal)
| | | +
+ + + + < handles pended SIGINT from tcb.thread.signal_logs[SIGINT] >
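A standalone model of the non-nested vs. nested decision above: a per-thread preempt counter determines whether a signal is handled right away or appended to a pending log that is flushed later by handle_signal(). The structure only loosely follows shim_signal/shim_thread; this is a simplification, not the LibOS code:

/* Standalone model of LibOS signal delivery: deliver immediately when not
 * nested (preempt goes 0 -> 1), otherwise append to a pending log that is
 * flushed later (e.g., at syscall completion or internal unlock). */
#include <signal.h>
#include <stdio.h>

#define MAX_PENDING 16

static struct {
    int preempt;                      /* models tcb.context.preempt         */
    int log[MAX_PENDING];             /* models tcb.thread.signal_logs[...] */
    int log_count;
} tcb;

static void handle_one_signal(int signum) {
    printf("invoking user handler for signal %d\n", signum);
}

static void deliver_signal(int signum) {
    tcb.preempt++;                    /* __disable_preempt() */
    if (tcb.preempt > 1) {
        /* nested signal: pend it */
        if (tcb.log_count < MAX_PENDING)
            tcb.log[tcb.log_count++] = signum;
    } else {
        handle_one_signal(signum);    /* non-nested: deliver right away */
    }
    tcb.preempt--;                    /* __enable_preempt() */
}

static void handle_pending_signals(void) {
    /* models handle_signal()/__handle_signal() at END_SHIM or unlock */
    for (int i = 0; i < tcb.log_count; i++)
        handle_one_signal(tcb.log[i]);
    tcb.log_count = 0;
}

int main(void) {
    deliver_signal(SIGINT);           /* not nested: handled immediately        */

    tcb.preempt = 1;                  /* pretend we are inside a signal handler */
    deliver_signal(SIGINT);           /* nested: gets pended                    */
    tcb.preempt = 0;

    handle_pending_signals();         /* pended SIGINT delivered here           */
    return 0;
}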
(Notation: -> PAL signal -> LibOS signal handler (purpose))
Sync signals:
Async signals:
We already described the flows of suspend_upcall. Here is how the other signal handlers differ from suspend_upcall:
Normal context (enclave mode)
+-----------------------------------------------------+
quit_upcall(event, context)
+
+ < exactly the same as suspend_upcall >
+-----------------------------+
resume_upcall(event, context) | handles all pending signals |
+ +-----------------------------+
| if internal Graphene thread (async or ipc helper):
| DkExceptionReturn(event)
|
| if tcb.context.preempt > 0: (nested signal)
| DkExceptionReturn(event)
|
| tcb.context.preempt = 1 < __disable_preempt() >
|
| __handle_signal(signal-code = 0, context = NULL)
| +
| | for each pending (any) signal on this thread:
| + __handle_one_signal(signo, pending-signal)
|
| tcb.context.preempt = 0 < __enable_preempt() >
|
+ DkExceptionReturn(event)
arithmetic_error_upcall(event, context)
+
| if internal Graphene thread or exception during LibOS/PAL:
| print panic message
| DkExceptionReturn(event)
|
| siginfo_t info = (SIGFPE, FPE_INTDIV,
| si_addr = <faulting addr from PAL>)
|
| deliver_signal(info, context) +--------------------------+
| < ... as in suspend_upcall ... > | note that context is set |
| +--------------------------+
+ DkExceptionReturn(event)
memfault_upcall(event, context)
+
| if exception during test_user_memory/string:
| update RIP to ret_fault
| DkExceptionReturn(event)
|
| if internal Graphene thread or exception during LibOS/PAL:
| print panic message
| DkExceptionReturn(event)
|
| < choose SIGBUS/SIGSEGV and signal code based on VMA info >
|
| siginfo_t info = (SIGBUS/SIGSEGV, signal code,
| si_addr = <faulting addr from PAL>)
|
| deliver_signal(info, context)
+ < ... as in suspend_upcall ... >
illegal_upcall(event, context)
+
| if internal Graphene thread or exception during LibOS/PAL:
| print panic message
| DkExceptionReturn(event)
|
| siginfo_t info = (SIGILL, ILL_ILLOPC,
| si_addr = <faulting addr from PAL>)
|
| deliver_signal(info, context)
| < ... as in suspend_upcall ... >
|
+ DkExceptionReturn(event)
The SIGALRM signal is blocked in Graphene. Therefore, on the alarm() syscall, SIGALRM is generated and raised purely by LibOS.
Application thread AsyncHelperThread
+---------------------+ +---------------------+
shim_do_alarm(seconds) ... no alive host thread ...
+ ... (created on-demand) ...
| install_async_event(seconds,
| + callback = signal_alarm)
| |
| | time = DkSystemTimeQuery()
| |
| | event = struct async_event(
| | callback = signal_alarm,
| | caller = app-thread,
| | install_time = time,
| | expire_time = time+seconds)
| |
| | append event to global async_list
| |
| | create_async_helper() < if not alive >
| | +
| | | thread_create(shim_async_helper)
| | | +
| | + + <creates new thread in host> +------> shim_async_helper()
| | +
| | set_event(async_helper_event) | while (true):
| | + | DkObjectsWaitAny(array =
+ + + DkStreamWrite(async_helper_event) +-+ | { global async_helper_event },
| | timeout = <some-constant>)
... app-thread code continues ... | | ...
| |
+--> | event = async_list.pop()
|
| DkObjectsWaitAny(...,
| timeout = event.expire_time)
|
| ... sleep until timeout ...
|
| timeout fired: call event.callback
|
| signal_alarm(event.caller)
| +
| | append_signal(app-thread, SIGALRM,
| | + wakeup = true)
| | |
| | | shim_signal signal = (siginfo_t info = NULL,
| | | context_stored = false,
| | | context = NULL,
| | | pal_context = NULL)
| | |
| | | < allocate_signal_log(SIGALRM) and append signal >
| | |
| | | DkThreadResume(app-thread)
| | | +
< SIGCONT delivered > <---------------------+ | + + + < send SIGCONT to app-thread via tgkill() >
|
< resume_upcall() with pending SIGALRM, | ...
see other diagrams > +
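A standalone model of the alarm() emulation above: the "async helper" is just a thread that waits until the expiration time and then invokes a callback which marks SIGALRM as pending for the application thread and nudges it awake. This uses a condition variable instead of DkObjectsWaitAny/DkThreadResume and only models the flow, it is not the LibOS implementation:

/* Standalone model of LibOS alarm() emulation: an async-helper thread sleeps
 * until the expiration time and then "raises" SIGALRM purely inside the
 * process by marking it pending and waking the app thread. Compile with -pthread. */
#include <pthread.h>
#include <signal.h>
#include <stdio.h>
#include <time.h>

static pthread_mutex_t lock   = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  wakeup = PTHREAD_COND_INITIALIZER;
static int pending_sigalrm;                 /* models the SIGALRM signal log */

struct async_event {
    struct timespec expire_time;            /* models event.expire_time */
};

static void signal_alarm(void) {
    /* models append_signal(app-thread, SIGALRM, wakeup = true) */
    pthread_mutex_lock(&lock);
    pending_sigalrm = 1;
    pthread_cond_signal(&wakeup);           /* models DkThreadResume(app-thread) */
    pthread_mutex_unlock(&lock);
}

static void* async_helper(void* arg) {
    /* models shim_async_helper(): sleep until the event expires, then call back */
    struct async_event* ev = arg;
    clock_nanosleep(CLOCK_REALTIME, TIMER_ABSTIME, &ev->expire_time, NULL);
    signal_alarm();
    return NULL;
}

static void install_async_event(struct async_event* ev, unsigned seconds) {
    /* models install_async_event(seconds, callback = signal_alarm) */
    clock_gettime(CLOCK_REALTIME, &ev->expire_time);
    ev->expire_time.tv_sec += seconds;
    pthread_t helper;
    pthread_create(&helper, NULL, async_helper, ev);
    pthread_detach(helper);
}

int main(void) {
    struct async_event ev;
    install_async_event(&ev, 1);            /* models shim_do_alarm(1) */

    /* the app thread continues and eventually sees the pending SIGALRM */
    pthread_mutex_lock(&lock);
    while (!pending_sigalrm)
        pthread_cond_wait(&wakeup, &lock);
    pthread_mutex_unlock(&lock);
    printf("app thread: pending SIGALRM delivered by the async helper\n");
    return 0;
}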
BUG? Graphene LibOS performs DkThreadYieldExecution() in __handle_signal() (i.e., yields thread execution after handling one pending signal). Looks useless.
TODO: clean up install_async_event(); it has redundant logic in async_list checking.
TODO: suspend_on_signal is useless.
BUG? return_from_ocall remembers RDI = -PAL_ERROR_INTERRUPTED, but _DkExceptionReturn never returns back to the point after _DkHandleExternalEvent in return_from_ocall. Thus, the PAL return code (interrupted error) is lost! Check it with printfs and a simple example.
BUG? Is the SIGNAL_DELAYED flag useless? It is set as one of the highest bits of an int64 (SIGNAL_DELAYED = 0x80000000UL). resume_upcall sets the SIGNAL_DELAYED flag in the current thread's context.preempt if a SIGCONT signal arrives during signal handling; handle_signal does the same.
TODO: Sigsuspend fix (https://github.com/oscarlab/graphene/issues/453). In shim_do_sigsuspend: (1) unlock before thread_setwait + thread_sleep; (2) lock and unlock around the last set_sig_mask; (3) add code similar to __handle_signal, but over all possible signal numbers and without DkThreadYieldExecution and without unsetting SIGNAL_DELAYED (?). Allow all pending signals to be delivered (see https://stackoverflow.com/questions/40592066/sigsuspend-vs-additional-signals-delivered-during-handler-execution). If at least one signal was delivered, do NOT go to thread_sleep but return immediately (and set the old mask beforehand).
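A rough, compilable sketch of the fixed shim_do_sigsuspend() control flow described above. Every helper here is a stub standing in for the real LibOS routine (the actual fix would use the real lock/unlock, thread_setwait/thread_sleep, set_sig_mask, and a variant of __handle_signal):

/* Rough sketch of the proposed shim_do_sigsuspend() flow from the TODO above.
 * All helpers are stubs standing in for the real LibOS routines. */
#include <errno.h>
#include <signal.h>
#include <stdbool.h>

struct shim_thread { sigset_t sig_mask; };   /* stand-in for the real struct */

static void lock_thread(struct shim_thread* t)   { (void)t; }
static void unlock_thread(struct shim_thread* t) { (void)t; }
static void set_sig_mask(struct shim_thread* t, const sigset_t* m) { t->sig_mask = *m; }
static void thread_setwait(struct shim_thread* t) { (void)t; }
static void thread_sleep(void)                    { }

/* Deliver every pending signal that the temporarily-installed mask now allows
 * (similar to __handle_signal, but over all signal numbers and without
 * DkThreadYieldExecution). Returns true if at least one signal was delivered. */
static bool handle_all_pending_signals(struct shim_thread* t) { (void)t; return false; }

static int shim_do_sigsuspend(struct shim_thread* cur, const sigset_t* mask) {
    sigset_t old_mask;

    lock_thread(cur);
    old_mask = cur->sig_mask;
    set_sig_mask(cur, mask);             /* temporarily install the caller's mask */

    /* (3) if the new mask unblocks any pending signal, deliver it and return
     *     immediately instead of sleeping, restoring the old mask first */
    if (handle_all_pending_signals(cur)) {
        set_sig_mask(cur, &old_mask);
        unlock_thread(cur);
        return -EINTR;
    }

    /* (1) drop the lock before thread_setwait + thread_sleep */
    unlock_thread(cur);
    thread_setwait(cur);
    thread_sleep();                      /* woken up by an incoming signal */

    /* (2) take the lock again around the last set_sig_mask */
    lock_thread(cur);
    set_sig_mask(cur, &old_mask);
    unlock_thread(cur);
    return -EINTR;
}

int main(void) {
    struct shim_thread cur;
    sigset_t mask;
    sigemptyset(&cur.sig_mask);
    sigemptyset(&mask);
    return shim_do_sigsuspend(&cur, &mask) == -EINTR ? 0 : 1;
}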