Architecture

One reactor per thread, typically one per core. Each owns an io_uring, a SO_REUSEPORT listener, a connection table, buffer rings, and a connection pool - and is the sole writer of all of them. Nothing is shared between reactors. This page is the system view; The Reactor and The Connection go inside the classes.

The loop

A reactor's life is a single loop: drain the cross-thread queues, enter the kernel once (io_uring_enter - submitting everything staged and waiting for at least one completion), then dispatch the whole completion batch.

while (true)
{
    // Work handed over by off-reactor handlers. Cheap when empty.
    DrainReturnQ();      // buffer returns
    DrainFlushQ();       // flushes
    DrainRecycleQ();     // connection teardowns
    DrainRemoteOps();    // client ops

    // One syscall per batch: submit everything staged, wait for >= 1 CQE.
    Ring.SubmitAndWait(1);

    // Read the CQ tail once, dispatch the whole batch, publish the head once.
    uint ready = Ring.CqReady();

    for (uint i = 0; i < ready; i++)
        Dispatch(in Ring.CqeAt(i));

    Ring.CqAdvance(ready);
}

Dispatching a completion frequently runs handler code - that's the inline-resume model below - so by the time the loop re-enters the kernel, the responses those completions triggered are already staged in the submission queue. One syscall carries the whole request/response batch.

Ring setup

Accept, recv, send

Two buffer-ring modes

Shared (default): one pool per reactor; every connection draws from it. One recv consumes one whole buffer regardless of size - elastic and simple, but small messages waste space.

Incremental (IOU_PBUF_RING_INC, kernel 6.12+): a small ring per connection, and the kernel appends successive recvs into the same buffer until it fills. Dense packing and per-connection isolation, paid for with refcounted recycling - a buffer returns only when the handler has returned every slice and the kernel is done appending (F_BUF_MORE cleared) - plus a ring registration per connection (MaxConnections caps the buffer-group ids).

The handler API is identical in both modes; ReturnBuffer(s) routes the right return path.
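
The incremental-mode return rule can be sketched as a small per-buffer counter. This is an illustration only, not the engine's code: IncrementalBuffer, OnRecv and ReturnSlice are hypothetical names, and the `more` argument stands for the F_BUF_MORE flag on the recv completion. It assumes all calls happen on the owning reactor thread, so no atomics are needed.

```csharp
// Hypothetical sketch of refcounted recycling for one incremental buffer.
// Assumes single-threaded access from the owning reactor.
public sealed class IncrementalBuffer
{
    private int _outstandingSlices;   // slices handed to the handler, not yet returned
    private bool _kernelDone;         // true once a CQE arrives with F_BUF_MORE cleared

    // The buffer may go back on the ring only when both conditions hold.
    public bool Recyclable => _kernelDone && _outstandingSlices == 0;

    // A recv completion delivered one more slice of this buffer.
    // more == true means the kernel will keep appending into it.
    public void OnRecv(bool more)
    {
        _outstandingSlices++;
        if (!more) _kernelDone = true;
    }

    // The handler returned one slice. True means: recycle the buffer now.
    public bool ReturnSlice()
    {
        _outstandingSlices--;
        return Recyclable;
    }
}
```

Returning every slice while the kernel is still appending does not recycle the buffer; neither does the kernel finishing while a slice is still out. Only the last of the two events does.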

Completion routing: tags and generations

Every SQE carries its routing in user_data:

[63:56] kind     accept · recv · send · wake · client · cancel
[47:32] gen      the connection's generation at submit time
[31:0]  fd       (or the client-op slot)

Dispatch is an array index (connections[fd] - fds are small dense integers, so an array beats hashing) plus a generation check. The generation is what makes fd reuse safe: when a connection dies, its fd number is immediately reusable, and a straggler CQE from the old life would otherwise reach the new tenant. Stale generation → the CQE is dropped and its buffer returned. The same guard rides the flush queue and incremental buffer returns.

Teardown also submits an ASYNC_CANCEL for the connection's multishot recv (matched by exact user_data), so a dead connection can't keep consuming buffers or race the fd's next tenant. If a connection's recv queue overflows - the handler isn't draining - the reactor cancels and tears it down rather than leaving it zombied.

Client ops (kind = client) skip the connection table entirely: the low 32 bits index a slot table holding the submission's completion object.
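
A minimal sketch of the tag layout and the generation check, for illustration. Only the bit positions come from the layout above; the type and method names (OpKind, Tag, Conn, Router.TryRoute) and the numeric values of the kinds are assumptions. Bits [55:48] are not described in the layout, so the sketch leaves them zero.

```csharp
using System;

// Hypothetical numeric values; the source lists the kinds but not their encoding.
public enum OpKind : byte { Accept = 1, Recv, Send, Wake, Client, Cancel }

public static class Tag
{
    // [63:56] kind, [47:32] generation, [31:0] fd (or client-op slot).
    public static ulong Pack(OpKind kind, ushort gen, uint fdOrSlot) =>
        ((ulong)kind << 56) | ((ulong)gen << 32) | fdOrSlot;

    public static OpKind Kind(ulong userData) => (OpKind)(userData >> 56);
    public static ushort Gen(ulong userData) => (ushort)(userData >> 32);   // truncates to bits [47:32]
    public static uint FdOrSlot(ulong userData) => (uint)userData;          // truncates to bits [31:0]
}

// Stand-in for the real connection object.
public sealed class Conn
{
    public ushort Generation;
}

public static class Router
{
    // Array index plus generation check. False means the slot is empty or the
    // CQE belongs to a previous life of this fd; the caller drops it.
    public static bool TryRoute(Conn[] connections, ulong userData, out Conn conn)
    {
        uint fd = Tag.FdOrSlot(userData);
        conn = fd < (uint)connections.Length ? connections[fd] : null;
        if (conn != null && conn.Generation == Tag.Gen(userData)) return true;
        conn = null;
        return false;
    }
}
```

A recv submitted at generation 7 for fd 42 routes only while connections[42] is still at generation 7; once the fd is recycled and the generation bumped, the same tag no longer matches.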

Inline resume

Every awaitable - ReadAsync, FlushAsync, every client op - is backed by a reusable IValueTaskSource core with RunContinuationsAsynchronously = false. When the reactor dispatches a CQE and calls SetResult, the awaiting handler continues right there, on the reactor thread, inside the dispatch loop. Zero allocation per await: connections are pooled and their cores are reused, with the connection generation as the token so an awaiter from a previous pool life resolves to a closed result instead of the new tenant's state.
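
The mechanism can be sketched with the BCL's ManualResetValueTaskSourceCore<T>. This is an assumption about the shape, not the engine's code: ReusableCompletion, Arm and Complete are hypothetical names, and the sketch uses the core's built-in Version as the token, where the engine is described as using the connection generation.

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Sources;

// Hypothetical stand-in for one pooled awaitable (for example the source behind ReadAsync).
public sealed class ReusableCompletion : IValueTaskSource<int>
{
    // Mutable struct: must not be readonly. False is the default; set explicitly for clarity.
    private ManualResetValueTaskSourceCore<int> _core =
        new ManualResetValueTaskSourceCore<int> { RunContinuationsAsynchronously = false };

    // Re-arm for the next await. Reset bumps the version, so an old ValueTask no longer matches.
    public ValueTask<int> Arm()
    {
        _core.Reset();
        return new ValueTask<int>(this, _core.Version);
    }

    // Called by the completing thread (the reactor). Runs the awaiter's continuation inline.
    public void Complete(int result) => _core.SetResult(result);

    public int GetResult(short token) => _core.GetResult(token);
    public ValueTaskSourceStatus GetStatus(short token) => _core.GetStatus(token);
    public void OnCompleted(Action<object> continuation, object state, short token,
                            ValueTaskSourceOnCompletedFlags flags) =>
        _core.OnCompleted(continuation, state, token, flags);
}

public static class InlineResumeDemo
{
    public static int BytesSeen;
    public static int ResumedOnThreadId;

    public static async Task HandlerAsync(ReusableCompletion read)
    {
        int n = await read.Arm().ConfigureAwait(false);
        // With inline resume, this runs on whichever thread called Complete.
        BytesSeen = n;
        ResumedOnThreadId = Environment.CurrentManagedThreadId;
    }
}
```

Starting HandlerAsync on one thread and calling Complete from another leaves ResumedOnThreadId equal to the completing thread's id: the handler continued inside the call to Complete, with no thread-pool hop.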

Leaving the reactor (and coming back)

Handlers may wander - await Task.Delay, any BCL async - and resume on the thread pool. Every reactor-touching operation checks the current thread: on the reactor it takes the direct path (write the SQE, touch the buf_ring); off it, the operation is queued - lock-free MPSC queues for buffer returns and flushes, a queue for client ops and recycles - and the reactor is woken through an eventfd registered as a multishot poll. The detour costs a queue hop and a syscall; the Playground's hop mode runs every request through it, end to end.
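
The thread check and the detour can be sketched as follows. This is a simplified model under stated assumptions: ReactorMailbox and its members are hypothetical, ConcurrentQueue stands in for the lock-free MPSC queue, and a SemaphoreSlim stands in for the eventfd wake. Only the shape matches the text: direct path on the reactor, queue plus wake off it.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical model of one reactor's buffer-return path.
public sealed class ReactorMailbox
{
    private readonly ConcurrentQueue<int> _returnQ = new ConcurrentQueue<int>(); // stand-in for the MPSC queue
    private readonly SemaphoreSlim _wake = new SemaphoreSlim(0);                 // stand-in for the eventfd
    private int _reactorThreadId;

    public int DirectReturns;   // written only by the reactor thread
    public int QueuedReturns;   // written only by the reactor thread

    // Called once by the reactor thread before it starts serving.
    public void BindToCurrentThread() => _reactorThreadId = Environment.CurrentManagedThreadId;

    public void ReturnBuffer(int bufferId)
    {
        if (Environment.CurrentManagedThreadId == _reactorThreadId)
        {
            Recycle(bufferId);          // direct path: already on the reactor
            DirectReturns++;
            return;
        }
        _returnQ.Enqueue(bufferId);     // off-reactor: hand the work over
        _wake.Release();                // and wake the reactor
    }

    // Blocks until some off-reactor caller has signalled work.
    public void WaitForWake() => _wake.Wait();

    // Runs on the reactor at the top of each loop iteration.
    public void DrainReturnQ()
    {
        while (_returnQ.TryDequeue(out int id))
        {
            Recycle(id);
            QueuedReturns++;
        }
    }

    private void Recycle(int bufferId)
    {
        // The real reactor would republish the buffer to its buf_ring here.
    }
}
```

Because only the reactor thread ever reaches Recycle, the buffer ring keeps a single writer even when handlers resume on the thread pool.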

Connection lifetime

A connection has exactly two owners: the reactor (recv side) and the handler. The refcount starts at 2 on accept; each owner releases once - the reactor on EOF/error, the handler via conn.DecRef() on exit (exactly once, in a finally). Whoever reaches zero hands the connection to the reactor for recycling: cancel the multishot recv, return leftover buffers, close the fd, bump the generation (invalidating stale awaiters and queued work), reset state, and push to the pool (capped by PoolMax; beyond it, native memory is freed). Pooled connections keep their slab and buffer-ring allocations across lives.
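
A minimal sketch of the two-owner refcount, for illustration. DecRef is the name used above; everything else (PooledConnection, the bool return, Recycle) is hypothetical, and the real teardown steps are reduced to comments.

```csharp
using System.Threading;

// Hypothetical model of one pooled connection's lifetime.
public sealed class PooledConnection
{
    private int _refs = 2;   // one for the reactor (recv side), one for the handler

    public ushort Generation { get; private set; }

    // Each owner calls this exactly once. Returns true for the caller that
    // reached zero, which must then hand the connection to the reactor.
    public bool DecRef() => Interlocked.Decrement(ref _refs) == 0;

    // Runs on the reactor thread once both owners have released.
    public void Recycle()
    {
        // Real teardown would first cancel the multishot recv, return
        // leftover buffers and close the fd.
        Generation = (ushort)(Generation + 1);   // invalidates stale awaiters and queued work
        _refs = 2;                               // ready for the next accept
    }
}
```

The atomic decrement is what makes the order irrelevant: whether the reactor sees EOF first or the handler exits first, exactly one of the two DecRef calls returns true.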

Wiring and services

Three seams connect an application: Handle (the per-connection loop), OnStart (runs on the reactor thread before serving - open ring-native clients here so they bind to this reactor's ring), and typed services (AddService<T> / GetService<T>) so one reactor can carry any number of clients. The engine never names a client type - see Ring clients.

Configuration

Option              Default  Meaning
Port                8080     SO_REUSEPORT listener port (every reactor binds it)
ExtraPorts          []       additional listener ports; conn.ListenerPort says which one a connection used
ReactorCount        12       reactors = threads; run one per core
RingEntries         8192     io_uring SQ/CQ depth
RecvBufferSize      32 KB    shared mode: bytes per recv buffer
BufferRingEntries   4096     shared mode: buffers per reactor (power of two)
WriteSlabSize       16 KB    per-connection write buffer
PoolMax             1024     pooled connection objects per reactor (bounds native memory)
RecvQueueEntries    64       per-connection slice queue depth; overflow closes the connection
Incremental         false    per-connection buffer rings (kernel 6.12+)
MaxConnections      4096     incremental: one buffer-group id per live connection
ConnBufRingEntries  16       incremental: buffers per connection ring
IncRecvBufferSize   4 KB     incremental: bytes per buffer (kernel appends into it)
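
To get a feel for what the defaults imply, the recv-buffer sizes can be multiplied out. This is back-of-the-envelope arithmetic, not a statement about how the engine allocates: it assumes KB means 1024 bytes and that every buffer is allocated at full size. BufferBudget and its methods are hypothetical helpers.

```csharp
// Hypothetical helper: upper bounds on recv-buffer memory implied by the options.
public static class BufferBudget
{
    public const long KiB = 1024;
    public const long MiB = 1024 * KiB;

    // Shared mode: one pool per reactor.
    public static long SharedBytesPerReactor(long recvBufferSize, long bufferRingEntries) =>
        recvBufferSize * bufferRingEntries;

    // Incremental mode: one small ring per live connection, at most maxConnections of them.
    public static long IncrementalBytesPerReactor(long incRecvBufferSize, long connBufRingEntries,
                                                  long maxConnections) =>
        incRecvBufferSize * connBufRingEntries * maxConnections;

    // BufferRingEntries is documented as a power of two.
    public static bool IsPowerOfTwo(long n) => n > 0 && (n & (n - 1)) == 0;
}
```

With the defaults, shared mode comes to 32 KB x 4096 = 128 MiB per reactor, and incremental mode to 4 KB x 16 x 4096 = 256 MiB per reactor when MaxConnections connections are live at once.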