The Reactor
One Reactor instance is a closed world: a thread, a ring, and
every structure that thread touches. This page walks the class itself.
What it owns
- The ring - a thin Ring wrapper over the mmapped SQ/CQ: GetSqe() claims a slot, SubmitAndWait(n) publishes the SQ tail and enters the kernel, CqReady/CqeAt/CqAdvance drain completions batched - one acquire-load of the tail and one release-store of the head per batch, not per CQE.
- The connection table - Connection?[] indexed by fd, grown by doubling. Lookup is an array index plus a generation compare, no hashing.
- The buffer ring - the kernel-shared io_uring_buf_ring control area plus one contiguous slab; a buffer id is both the ring slot and the slab offset. Returns stage entries and publish the tail once per drain batch.
- Hand-off queues - lock-free MPSC queues for buffer returns and flushes (flush entries pack (generation << 32) | fd), plus queues for recycles and off-reactor client ops, all woken by one eventfd registered as a multishot poll.
- The connection pool - a plain Stack<Connection>; accept and teardown both run on this thread, so nothing concurrent is needed. Capped by PoolMax, which bounds the reactor's reserved native memory.
- The client-op slot table - IRingCompletion?[] plus a free-index stack; each client submission parks its completion in a slot whose index rides the SQE. See Ring clients.
- Services - a small typed registry (AddService<T>/GetService<T>) where OnStart leaves clients for handlers.
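The generation-checked table lookup and the flush-entry packing from the list above can be sketched in a few lines. This is an illustration only, not the library's code: ConnectionTable, pack_flush, unpack_flush, and the Connection stand-in are hypothetical names, and the sketch assumes generations and fds each fit in 32 bits.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

MASK32 = 0xFFFF_FFFF


def pack_flush(generation: int, fd: int) -> int:
    """Pack a flush entry as (generation << 32) | fd."""
    return ((generation & MASK32) << 32) | (fd & MASK32)


def unpack_flush(token: int) -> Tuple[int, int]:
    """Split a packed flush entry back into (generation, fd)."""
    return (token >> 32) & MASK32, token & MASK32


@dataclass
class Connection:  # hypothetical stand-in for the real Connection class
    fd: int
    generation: int


class ConnectionTable:
    """fd-indexed array grown by doubling; lookup is an index plus a generation compare."""

    def __init__(self, capacity: int = 4) -> None:
        self._slots: List[Optional[Connection]] = [None] * max(1, capacity)

    @property
    def capacity(self) -> int:
        return len(self._slots)

    def register(self, conn: Connection) -> None:
        while conn.fd >= len(self._slots):
            self._slots.extend([None] * len(self._slots))  # double the array
        self._slots[conn.fd] = conn

    def lookup(self, fd: int, generation: int) -> Optional[Connection]:
        if not 0 <= fd < len(self._slots):
            return None
        conn = self._slots[fd]
        if conn is None or conn.generation != generation:
            return None  # stale token: the fd is empty or has a new tenant
        return conn
```

A queued flush entry that outlives its connection simply fails the generation compare when it is drained, so the fd's next tenant is never touched.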
Startup
Run() executes on the reactor's own thread, in a deliberate order: record the
thread id (every fast-path check compares against it); create the ring
(DEFER_TASKRUN requires that setup and enter happen on the same thread); open the
SO_REUSEPORT listeners (one per Port/ExtraPorts entry; accepts
route by listener fd); register the buffer ring (or initialize the per-connection ring
machinery in incremental mode); create the eventfd; run OnStart so the
application can open its ring-native clients; then arm the multishot accept and the wake
poll and fall into the loop. Client opens started in OnStart may be async: their
submissions sit staged in the SQ and complete once the loop begins.
Dispatch, kind by kind
Each CQE's user_data unpacks to a kind, a generation, and an fd (or slot):
- recv - extract the buffer id from cqe.flags >> 16, resolve the connection (generation-checked). EOF or error → teardown. Stale generation → return the buffer, drop the CQE. Otherwise hand the slice to Connection.Complete; if that reports recv-queue overflow, cancel the multishot and tear down rather than zombify. If the kernel ended the multishot (F_MORE clear), re-arm it with the current generation.
- send - generation-checked; stale CQEs are dropped without touching the fd's new tenant. On success advance WriteHead; a short send (rare with MSG_WAITALL) is resubmitted from the offset; a full ack runs CompleteFlush, which resumes the awaiting handler inline. On error, cancel the connection's recv and tear down.
- accept - pop a pooled connection or construct one, set TCP_NODELAY (it doesn't reliably inherit from the listener), register it in the table, init the refcount to 2, arm its multishot recv stamped with its generation, fire the application handler. In incremental mode this is also where the per-connection buffer ring registers.
- client - index the slot table and free the slot before invoking the completion: the inline continuation may immediately submit again and reuse it.
- wake - drain the eventfd counter; the queues drain at the top of the next loop iteration. Re-arm the poll if the kernel ended it.
- cancel - the result of an ASYNC_CANCEL the reactor issued; nothing to route.
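The decoding step that precedes every branch above might look like the following sketch. The bit layout of user_data is not stated on this page, so the one used here (8-bit kind, 24-bit generation, 32-bit fd or slot) is an assumption, as are the Kind numbering and the helper names. The CQE flag values are the ones defined in the Linux io_uring UAPI header.

```python
from enum import IntEnum
from typing import NamedTuple, Optional

# Values from the Linux io_uring UAPI header.
IORING_CQE_F_BUFFER = 1 << 0  # a provided buffer was used; its id is in the upper 16 bits
IORING_CQE_F_MORE = 1 << 1    # the multishot request is still armed
IORING_CQE_BUFFER_SHIFT = 16


class Kind(IntEnum):  # hypothetical numbering
    RECV = 1
    SEND = 2
    ACCEPT = 3
    CLIENT = 4
    WAKE = 5
    CANCEL = 6


class Tag(NamedTuple):
    kind: Kind
    generation: int
    fd_or_slot: int


def pack_user_data(kind: Kind, generation: int, fd_or_slot: int) -> int:
    # Assumed layout: [63..56] kind | [55..32] generation | [31..0] fd or slot
    return (
        ((int(kind) & 0xFF) << 56)
        | ((generation & 0xFFFFFF) << 32)
        | (fd_or_slot & 0xFFFFFFFF)
    )


def unpack_user_data(user_data: int) -> Tag:
    return Tag(
        Kind((user_data >> 56) & 0xFF),
        (user_data >> 32) & 0xFFFFFF,
        user_data & 0xFFFFFFFF,
    )


def recv_buffer_id(flags: int) -> Optional[int]:
    """Buffer id of a recv CQE, or None if no provided buffer was consumed."""
    if not flags & IORING_CQE_F_BUFFER:
        return None
    return flags >> IORING_CQE_BUFFER_SHIFT


def needs_rearm(flags: int) -> bool:
    """True when the kernel ended the multishot (F_MORE clear)."""
    return not flags & IORING_CQE_F_MORE
```

Carrying the generation inside user_data is what lets the recv and send branches reject a stale CQE before they look at the connection table entry for that fd.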
Producing SQEs
SQE production is reactor-thread-only - SINGLE_ISSUER is a promise to the
kernel. GetSqeOrFlush claims a slot and, if the SQ is full mid-batch, flushes it
with a no-wait enter and retries: submission never blocks on completions.
Off-reactor callers never touch this path; their work arrives through the queues and becomes
SQEs here.
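The claim-or-flush behaviour can be illustrated with an in-memory stand-in for the submission queue. FakeRing and its methods are hypothetical; a real ring would publish the SQ tail and call into the kernel where this sketch just moves entries between lists.

```python
class FakeRing:
    """Hypothetical stand-in for the SQ: fixed capacity, counts flushes."""

    def __init__(self, entries: int) -> None:
        self.entries = entries
        self.staged = []     # SQEs claimed but not yet submitted
        self.submitted = []  # SQEs already handed to the "kernel"
        self.flushes = 0

    def get_sqe(self):
        """Return a fresh SQE slot, or None if the SQ is full."""
        if len(self.staged) >= self.entries:
            return None
        sqe = {}
        self.staged.append(sqe)
        return sqe

    def submit_no_wait(self) -> int:
        """Publish everything staged without waiting for any completion."""
        count = len(self.staged)
        self.submitted.extend(self.staged)
        self.staged.clear()
        self.flushes += 1
        return count


def get_sqe_or_flush(ring: FakeRing):
    """Claim a slot; if the SQ is full mid-batch, flush with a no-wait enter and retry."""
    sqe = ring.get_sqe()
    if sqe is None:
        ring.submit_no_wait()
        sqe = ring.get_sqe()
        if sqe is None:
            raise RuntimeError("SQ still full after flush")
    return sqe
```

With a two-entry ring, claiming five SQEs in a row triggers two flushes and leaves the fifth entry staged for the next regular submit. The caller never waits on a completion in order to produce a submission.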
Recycling
Recycle always runs on the reactor (it touches the buf_ring and the pool):
wake any awaiters via MarkClosed, submit an ASYNC_CANCEL for the
connection's multishot recv - tagged with the pre-bump generation so it matches the
armed SQE - return leftover buffers (shared mode) or unregister the per-connection ring
(incremental), close the fd, Clear() the connection (which bumps the generation
first, invalidating every stale token and queued reference), then push to the pool or free if
the pool is full.
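The ordering matters most around the generation bump, which the following sketch tries to make visible. It is a toy model under stated assumptions: the Reactor and Connection classes here are hypothetical, kernel interactions are replaced by entries appended to a log, and only shared-buffer mode is modelled.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Connection:  # hypothetical stand-in
    fd: int
    generation: int = 0
    closed: bool = False

    def clear(self) -> None:
        # Bump first, so every token minted before this point becomes stale.
        self.generation += 1
        self.fd = -1
        self.closed = False


@dataclass
class Reactor:  # hypothetical stand-in; logs steps instead of touching a ring
    pool_max: int
    pool: List[Connection] = field(default_factory=list)
    log: List[Tuple] = field(default_factory=list)

    def recycle(self, conn: Connection) -> bool:
        """Run the teardown steps in order; return True if the connection was pooled."""
        conn.closed = True  # MarkClosed: wake any awaiters
        self.log.append(("mark_closed", conn.fd))
        # The cancel carries the pre-bump generation so it matches the armed recv.
        self.log.append(("async_cancel", conn.fd, conn.generation))
        self.log.append(("return_buffers", conn.fd))
        self.log.append(("close", conn.fd))
        conn.clear()
        if len(self.pool) < self.pool_max:
            self.pool.append(conn)
            return True
        return False  # pool is full: the caller frees the connection
```

If the bump happened before the cancel was built, the cancel would carry a generation that no armed SQE has, and the multishot recv would stay live against a closed fd.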