Fix execute_task() so unhandled task exceptions become task results#6871
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces a mechanism to return unhandled task exceptions as task results instead of propagating them directly, ensuring the scheduler protect frame remains intact. It executes tasks via mrb_protect_error and captures exceptions in the VM execution loop. However, the reviewer noted that the fallback mechanism for abnormal cases that bypasses exception_as_result and calls mrb_exc_raise could propagate exceptions to the scheduler's context and cause a crash. The reviewer suggested terminating the task cleanly and storing the exception as the task's result instead of re-raising it.
| if (error) { | ||
| mrb_exc_raise(mrb, t->result); | ||
| } |
There was a problem hiding this comment.
If an abnormal case (such as a CINFO_SKIP frame) bypasses exception_as_result and unwinds via MRB_THROW, calling mrb_exc_raise(mrb, t->result) here will propagate the exception into the scheduler's context. In pattern 1 (where the scheduler is called directly from main()), there is no outer jmpbuf to catch this, which will cause the entire scheduler/process to abort.
To make the scheduler more robust and prevent fatal crashes in these abnormal cases, we should terminate the task cleanly and store the exception as the task's result instead of re-raising it to the scheduler.
if (error) {
terminate_task_internal(mrb, t);
}This patch fixes a bug that c09196c introduced. ## Background `mrb_task_run()` has two usage patterns: 1. Called directly from `main()` as the top-level scheduler (PicoRuby and R2P2). There is no surrounding C exception handler, so mrb->jmp is NULL on entry 2. Called from Ruby code via Task.run, bootstrapped on top of mruby's regular call chain. mrb->jmp is non-NULL Historically, an unhandled exception raised inside a task body was turned into the task's result value by `mrb_vm_exec()`: the L_RAISE path walked callinfo down to cibase, ran `fiber_terminate()`, and - because c->vmexec was TRUE and prev_jmp was NULL in pattern 1 - took `return mrb_obj_value(mrb->exc)`. That value landed in t->result and could be read back through `mrb_task_value()` / `join()`. ## What c09196c broke It consider only pattern 2 and wrapped `mrb_task_run()` in a protect frame (MRB_TRY / mrb_protect_error) to guarantee that loop_running is cleared on exception. As a side effect, mrb->jmp is now always non-NULL while a task body is executing, so the L_RAISE path takes `MRB_THROW(prev_jmp)` instead of returning the exception value. In pattern 2 this merely changed the semantics (exceptions started propagating out of `Task.run` instead of being stored as task results). In pattern 1 it was FATAL: the throw unwound to mrb_task_run's catch handler, which called `mrb_exc_raise()` to re-propagate, and with no outer jmpbuf this aborted the process. PicoRuby/R2P2 could no longer retrieve task exceptions via `mrb_task_value()`. ## Fix Restore the "task exception becomes task result" contract uniformly for both patterns, independent of mrb->jmp: * Add `mrb_task_state.exception_as_result`. When set, `mrb_vm_exec()`'s non-root_c L_RAISE branch returns the exception as a value even if prev_jmp is non-NULL, instead of throwing * `execute_task_vm()` raises the flag around `mrb_vm_exec()`, captures the exception into `t->result`, and clears `mrb->exc` * Wrap `execute_task_vm()` in `mrb_protect_error()` as a safety net for rare paths that still unwind via MRB_THROW (e.g. CINFO_SKIP frames). exception_as_result is reset both at the end of the body and immediately after `mrb_protect_error()` returns, so a caught throw does not leave the llag set * Expose `Task#value` to retrieve t->result from Ruby, since Task#join cannot deliver the value through its return path under cooperative scheduling * Add a test asserting that `Task#join` on a task that raised returns the exception object, matching the pre-c09196c observable behavior ## Notes The "task exception becomes task result" semantics match the mruby/c's rrt0.c and the spirit of CRuby's Thread (an unhandled exception in a thread does not kill the scheduler / process; it surfaces when the thread is joined). The visible API shape still differs from CRuby - `Task#join` here returns the exception object rather than re-raising it - but the scheduler is no longer destabilized by task errors in either invocation pattern.
6c2d27b to
f123233
Compare
It is not likely happens but if happened, in pattern 1, the whole process abort when mrb->jmp is NULL. Instead, make the status MRB_TASK_STOPPED and delegate following logic of "Handle task termination"
This patch fixes a bug that c09196c introduced.
Background
mrb_task_run()has two usage patterns:main()as the top-level scheduler (PicoRuby and R2P2). There is no surrounding C exception handler, so mrb->jmp is NULL on entryHistorically, an unhandled exception raised inside a task body was turned into the task's result value by
mrb_vm_exec(): the L_RAISE path walked callinfo down to cibase, ranfiber_terminate(), and - because c->vmexec was TRUE and prev_jmp was NULL in pattern 1 - tookreturn mrb_obj_value(mrb->exc). That value landed in t->result and could be read back throughmrb_task_value()/join().What c09196c broke
It consider only pattern 2 and wrapped
mrb_task_run()in a protect frame (MRB_TRY / mrb_protect_error) to guarantee that loop_running is cleared on exception. As a side effect, mrb->jmp is now always non-NULL while a task body is executing, so the L_RAISE path takesMRB_THROW(prev_jmp)instead of returning the exception value.In pattern 2 this merely changed the semantics (exceptions started propagating out of
Task.runinstead of being stored as task results). In pattern 1 it was FATAL: the throw unwound to mrb_task_run's catch handler, which calledmrb_exc_raise()to re-propagate, and with no outer jmpbuf this aborted the process. PicoRuby/R2P2 could no longer retrieve task exceptions viamrb_task_value().Fix
Restore the "task exception becomes task result" contract uniformly for both patterns, independent of mrb->jmp:
Add
mrb_task_state.exception_as_result. When set,mrb_vm_exec()'s non-root_c L_RAISE branch returns the exception as a value even if prev_jmp is non-NULL, instead of throwingexecute_task_vm()raises the flag aroundmrb_vm_exec(), captures the exception intot->result, and clearsmrb->excWrap
execute_task_vm()inmrb_protect_error()as a safety net for rare paths that still unwind via MRB_THROW (e.g. CINFO_SKIP frames). exception_as_result is reset both at the end of the body and immediately aftermrb_protect_error()returns, so a caught throw does not leave the flag setExpose
Task#valueto retrieve t->result from Ruby, since Task#join cannot deliver the value through its return path under cooperative schedulingAdd a test asserting that
Task#joinon a task that raised returns the exception object, matching the pre-c09196c observable behaviorNotes
The "task exception becomes task result" semantics match the mruby/c's rrt0.c and the spirit of CRuby's Thread (an unhandled exception in a thread does not kill the scheduler / process; it surfaces when the thread is joined). The visible API shape still differs from CRuby -
Task#joinhere returns the exception object rather than re-raising it - but the scheduler is no longer destabilized by task errors in either invocation pattern.