The Connection
A Connection is the meeting point of two parties that usually share
one thread: the reactor (producing recv slices, consuming flush requests) and the handler
(consuming slices, producing writes). The handler normally runs inline on the reactor, but
may legally run on a thread-pool thread. Every shared field is designed for that duality.
Reading
The reactor pushes each recv slice into the connection's SPSC queue (sized by
RecvQueueEntries) and wakes the awaiter. The raw contract:
RecvSnapshot snapshot = await conn.ReadAsync(); // resumes when slices exist
while (conn.TryGetItem(snapshot, out SpscRecvRing.Item item)) // drain up to the snapshot
{
    if (item.HasBuffer)
    {
        // item.AsSpan() - the bytes, zero-copy
        conn.ReturnBuffer(in item); // hand the buffer back
    }
}
conn.ResetRead(); // re-arm before the next ReadAsync
The snapshot is a tail marker: you drain exactly the slices that existed when the read completed, so a burst arriving mid-drain doesn't extend your loop. For request reconstruction across slices there are three levels:
- Per-item - item.AsSpan(), when one slice is always one message.
- Per-snapshot - conn.GetSnapshotMemories(snapshot) returns the slices as one UnmanagedMemoryManager[]; ToReadOnlySequence() stitches them into a single zero-copy sequence; conn.ReturnBuffers(rings) hands everything back. Pair with a carry buffer for requests split across reads - the landing page example shows the full pattern.
- PipeReader - ConnectionPipeReader owns the carry for you: unconsumed bytes are held zero-copy across reads, buffers return automatically once fully consumed, and examined is honored - a fully-examined buffer parks the next read until new bytes arrive. Allocation-free at steady state (pooled segments, no async state machine); within ~1% of the raw API on a plaintext benchmark.
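The tail-marker behaviour of the snapshot can be illustrated with a small model. This is a sketch only: MiniRing and SnapshotDemo are hypothetical stand-ins, not the library's types, and the model is single-threaded, so the memory-ordering a real SPSC ring needs is deliberately left out.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the connection's recv queue; not the library's type.
public sealed class MiniRing
{
    private readonly string[] _slots;
    private long _head; // next slot to consume
    private long _tail; // next slot to fill

    public MiniRing(int capacity) => _slots = new string[capacity];

    // Producer side: returns false when the ring is full.
    public bool TryEnqueue(string slice)
    {
        if (_tail - _head == _slots.Length) return false;
        _slots[_tail % _slots.Length] = slice;
        _tail++;
        return true;
    }

    // A snapshot is just the tail position at the moment of the read.
    public long Snapshot() => _tail;

    // Consumer side: yields items only up to the snapshot, never past it.
    public bool TryGetItem(long snapshot, out string item)
    {
        if (_head >= snapshot) { item = string.Empty; return false; }
        item = _slots[_head % _slots.Length];
        _head++;
        return true;
    }
}

public static class SnapshotDemo
{
    // Drains one snapshot while a simulated burst arrives mid-drain.
    public static List<string> DrainWithBurst(MiniRing ring)
    {
        var drained = new List<string>();
        long snapshot = ring.Snapshot();
        while (ring.TryGetItem(snapshot, out string item))
        {
            drained.Add(item);
            ring.TryEnqueue("late"); // lands after the snapshot; served by the next read
        }
        return drained;
    }
}
```

With two slices queued before the call, DrainWithBurst returns exactly those two, even though two more were enqueued during the loop; the late arrivals show up in the following drain.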
Writing
Each connection owns a native write slab (WriteSlabSize).
conn.Write(span) copies into it; conn also implements
IBufferWriter<byte> (GetSpan/GetMemory/
Advance) for formatters that write in place - the Postgres driver and
Utf8JsonWriter use this. await conn.FlushAsync() submits one send
for everything staged and completes when the kernel acks it. One flush in flight per
connection; batch several responses into one flush when you can.
ConnectionPipeWriter adapts the same slab to PipeWriter,
allocation-free.
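As an illustration of the in-place path, the sketch below writes through the standard IBufferWriter<byte> handshake and through Utf8JsonWriter. It uses ArrayBufferWriter<byte> as a stand-in for the connection, since only the interface matters here; InPlaceWriteDemo and its method names are made up for the example.

```csharp
using System;
using System.Buffers;
using System.Text;
using System.Text.Json;

public static class InPlaceWriteDemo
{
    // Writes raw bytes using the GetSpan/Advance handshake.
    public static void WriteAscii(IBufferWriter<byte> output, string text)
    {
        Span<byte> span = output.GetSpan(text.Length); // at least text.Length bytes
        int written = Encoding.ASCII.GetBytes(text, span);
        output.Advance(written); // commit only what was actually written
    }

    // Serializes a JSON body directly into the writer's memory.
    public static void WriteJson(IBufferWriter<byte> output, string message)
    {
        using var json = new Utf8JsonWriter(output);
        json.WriteStartObject();
        json.WriteString("message", message);
        json.WriteEndObject();
        json.Flush(); // commits the JSON writer's pending bytes via Advance
    }

    public static string Run()
    {
        // Stand-in for the connection: any IBufferWriter<byte> behaves the same way.
        var buffer = new ArrayBufferWriter<byte>();
        WriteAscii(buffer, "Content-Type: application/json\r\n\r\n");
        WriteJson(buffer, "Hello, World!");
        return Encoding.ASCII.GetString(buffer.WrittenSpan);
    }
}
```

Because both helpers take IBufferWriter<byte>, the same code should work unchanged against any implementer of that interface; the flush that follows is a separate step and is not modelled here.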
Under the hood: the read side
Recv slices land in a bounded SPSC ring (RecvQueueEntries slots). Each
Item carries the slab pointer, length, buffer id, a has-buffer flag, and the
connection generation at enqueue time. ReadAsync() follows a lost-wakeup-proof
protocol: if items (or the sticky pending flag) exist, complete synchronously
with a snapshot - the queue tail at that instant; otherwise arm via
Interlocked.Exchange, re-check (a slice may have raced in between the check and
the arm), and only then hand back an IValueTaskSource-backed task. The reactor's
Complete does the mirror exchange: if an awaiter is armed,
SetResult - which runs the handler inline - else set pending so the
next ReadAsync completes immediately.
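The check / arm / re-check sequence above can be sketched as follows. This is a simplified model under stated assumptions, not the library's code: WakeupGate and its members are hypothetical, the queue is reduced to a counter, and a TaskCompletionSource replaces the reusable IValueTaskSource, so this version allocates where the real design does not.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical model of the arm / re-check / mirror-exchange protocol.
public sealed class WakeupGate
{
    private int _available; // items published by the producer
    private int _armed;     // 1 while a reader is parked
    private int _pending;   // sticky: a wake arrived with nobody parked
    private TaskCompletionSource<int> _tcs = new TaskCompletionSource<int>();

    // Consumer side. The result is a "snapshot": the item count at completion time.
    public ValueTask<int> ReadAsync()
    {
        // Fast path: a sticky wake or queued items complete synchronously.
        // A stale pending flag can cause a spurious completion; callers tolerate that.
        bool pending = Interlocked.Exchange(ref _pending, 0) == 1;
        if (pending || Volatile.Read(ref _available) > 0)
            return new ValueTask<int>(Volatile.Read(ref _available));

        _tcs = new TaskCompletionSource<int>();
        Interlocked.Exchange(ref _armed, 1); // arm

        // Re-check: an item may have raced in between the check and the arm.
        // If we win the disarm, complete synchronously; if the producer won it,
        // it is completing _tcs, so returning the task is still correct.
        if (Volatile.Read(ref _available) > 0 && Interlocked.Exchange(ref _armed, 0) == 1)
            return new ValueTask<int>(Volatile.Read(ref _available));

        return new ValueTask<int>(_tcs.Task);
    }

    // Producer side: the mirror exchange.
    public void Complete()
    {
        Interlocked.Increment(ref _available);
        if (Interlocked.Exchange(ref _armed, 0) == 1)
            _tcs.SetResult(Volatile.Read(ref _available)); // continuation runs inline
        else
            Volatile.Write(ref _pending, 1); // next ReadAsync completes immediately
    }

    // Consumer side: take one item (single consumer assumed).
    public bool TryTake()
    {
        if (Volatile.Read(ref _available) == 0) return false;
        Interlocked.Decrement(ref _available);
        return true;
    }
}
```

Whichever side wins the exchange on the armed flag owns the completion, which is why neither ordering of the race can leave the reader parked with an item in the queue.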
Under the hood: the write side
One native slab per connection and three counters: WriteTail (bytes staged by
Write/Advance), WriteInFlight (the target of the
current flush), WriteHead (bytes the kernel has acked).
FlushAsync() guards with two interlocked flags (flushInProgress,
flushArmed - one flush at a time, writes blocked while sending), snapshots
WriteTail, hands (fd, generation) to the reactor, and recovers the
close race by self-completing if the connection died between the guard and the hand-off. The
send completion path advances WriteHead, resubmits shortfalls, and
CompleteFlush zeroes the counters and resumes the awaiter.
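A minimal model of the three counters may make the short-send path easier to follow. WriteSlabModel is hypothetical: it uses a managed array instead of a native slab, a plain bool instead of the two interlocked flags, and it assumes that "writes blocked while sending" means a write during a flush is rejected.

```csharp
using System;

// Hypothetical single-threaded model of the write-side counters.
public sealed class WriteSlabModel
{
    private readonly byte[] _slab;

    public int WriteTail { get; private set; }     // bytes staged by Write
    public int WriteInFlight { get; private set; } // target of the current flush
    public int WriteHead { get; private set; }     // bytes the "kernel" has acked
    public bool FlushInProgress { get; private set; }

    public WriteSlabModel(int slabSize) => _slab = new byte[slabSize];

    public void Write(ReadOnlySpan<byte> data)
    {
        // Assumption: staging is rejected while a send is outstanding.
        if (FlushInProgress) throw new InvalidOperationException("flush in progress");
        data.CopyTo(_slab.AsSpan(WriteTail)); // throws if the slab is too small
        WriteTail += data.Length;
    }

    // Snapshots the tail as the flush target.
    public void BeginFlush()
    {
        if (FlushInProgress) throw new InvalidOperationException("one flush at a time");
        FlushInProgress = true;
        WriteInFlight = WriteTail;
    }

    // Send completion: returns the shortfall to resubmit, or 0 when the flush is done.
    public int OnSendCompleted(int bytesSent)
    {
        WriteHead += bytesSent;
        int remaining = WriteInFlight - WriteHead;
        if (remaining == 0) CompleteFlush();
        return remaining;
    }

    private void CompleteFlush()
    {
        // Zero the counters so the next response starts at the top of the slab.
        WriteHead = WriteInFlight = WriteTail = 0;
        FlushInProgress = false;
    }
}
```

Staging 10 bytes, flushing, and acking 4 leaves a shortfall of 6 to resubmit; acking the remaining 6 resets all three counters to zero and reopens the slab for writes.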
Lifetime and generations
The refcount starts at 2 on accept - reactor and handler each own one release. The handler
calls conn.DecRef() exactly once, in a finally; the reactor releases
on EOF or error. MarkClosed wakes both a parked reader (with a closed snapshot)
and a parked flusher. Clear bumps the generation before resetting
anything else, so every stale token, queued flush, and straggler CQE from the old life
resolves harmlessly. The IValueTaskSource plumbing uses the generation as its public token while
dispatching internally on the core's own version - the cross-life guard and the per-cycle
version are independent mechanisms.
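The two-owner refcount and the generation guard can be sketched like this. LifetimeModel is hypothetical, and it assumes that the final release is what triggers Clear and re-arms the count for the next accept; the text above does not spell out that trigger.

```csharp
using System;
using System.Threading;

// Hypothetical model of the two-owner refcount and the generation guard.
public sealed class LifetimeModel
{
    private int _refs = 2;   // reactor and handler each own one release
    private int _generation; // bumped once per connection life

    public int Generation => Volatile.Read(ref _generation);

    // Returns true for the caller that performed the final release.
    public bool DecRef()
    {
        if (Interlocked.Decrement(ref _refs) != 0) return false;
        Clear();
        return true;
    }

    // A token captured in an earlier life no longer matches after Clear.
    public bool IsCurrent(int token) => token == Generation;

    private void Clear()
    {
        // Bump the generation first so stale tokens fail before any state is reset.
        Interlocked.Increment(ref _generation);
        // ... other per-connection state would be reset here ...
        Volatile.Write(ref _refs, 2); // assumption: re-armed for the next accept
    }
}
```

A queued flush or completion that carries the old generation checks IsCurrent before touching the connection and simply drops out when the check fails.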
How ConnectionPipeReader integrates
The reader composes on top of the raw contract without touching it.
ReadAsync completes synchronously while unexamined bytes are held (or the
connection closed); otherwise it arms its own reusable
ManualResetValueTaskSourceCore<ReadResult> and chains onto
conn.ReadAsync() by storing the ValueTaskAwaiter and registering one
cached Action - no async state machine, no allocation. When the connection's core
fires (inline, on the reactor), the callback ingests the snapshot - draining
TryGetItem into pooled Slice segments, each owning a reusable
UnmanagedMemoryManager and remembering its original Item - links
them onto a persistent chain, calls conn.ResetRead(), and completes its own core,
which resumes the PipeReader caller inline. The exposed
ReadOnlySequence<byte> is just a struct over
(head, headConsumed, tail).
AdvanceTo converts the positions to offsets, trims fully-consumed slices off
the front - returning each buffer via conn.ReturnBuffer(in item), generation and
mode intact - and records the examined watermark that gates the next read: fully examined
means park until new bytes arrive, never re-serve the same data. Spurious wakes
re-arm in place.
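From the caller's side, the consumed/examined contract looks like the standard PipeReader loop. The sketch below uses a plain System.IO.Pipelines Pipe as a stand-in, because how a ConnectionPipeReader is constructed is not shown here; PipeReaderDemo and the newline-delimited framing are assumptions made for the example.

```csharp
using System;
using System.Buffers;
using System.Collections.Generic;
using System.IO.Pipelines;
using System.Text;
using System.Threading.Tasks;

public static class PipeReaderDemo
{
    // Reads newline-delimited requests, carrying partial lines across reads.
    public static async Task<List<string>> ReadLinesAsync(PipeReader reader)
    {
        var lines = new List<string>();
        while (true)
        {
            ReadResult result = await reader.ReadAsync();
            ReadOnlySequence<byte> buffer = result.Buffer;

            while (TryReadLine(ref buffer, out ReadOnlySequence<byte> line))
                lines.Add(Encoding.ASCII.GetString(line.ToArray()));

            // consumed = start of the unparsed remainder; examined = everything seen.
            // Marking it all examined parks the next read until new bytes arrive.
            reader.AdvanceTo(buffer.Start, buffer.End);

            if (result.IsCompleted) break;
        }
        await reader.CompleteAsync();
        return lines;
    }

    private static bool TryReadLine(ref ReadOnlySequence<byte> buffer, out ReadOnlySequence<byte> line)
    {
        SequencePosition? newline = buffer.PositionOf((byte)'\n');
        if (newline == null) { line = default; return false; }
        line = buffer.Slice(0, newline.Value);
        buffer = buffer.Slice(buffer.GetPosition(1, newline.Value)); // skip the '\n'
        return true;
    }

    public static async Task<List<string>> RunAsync()
    {
        var pipe = new Pipe(); // stand-in for the connection-backed reader
        Task<List<string>> reading = ReadLinesAsync(pipe.Reader);
        await pipe.Writer.WriteAsync(Encoding.ASCII.GetBytes("GET /a\nGET")); // second request is split
        await pipe.Writer.WriteAsync(Encoding.ASCII.GetBytes(" /b\n"));
        await pipe.Writer.CompleteAsync();
        return await reading;
    }
}
```

The split second request is reassembled by the reader's carry, not by the handler: the unconsumed "GET" stays in the buffer until the rest of the line arrives.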
How ConnectionPipeWriter integrates
The writer is a veneer: GetMemory/GetSpan/Advance
delegate straight to the connection's slab (the same memory a raw handler writes), tracking
UnflushedBytes. FlushAsync calls conn.FlushAsync(); if
it completed synchronously the result is wrapped without allocation, otherwise the same
stored-awaiter + cached-callback pattern bridges the connection's flush core to the writer's
own FlushResult core.
Both adapters inherit the connection's single-flight rules (one read, one flush at a time)
and its inline-resume property - continuations still run on the reactor - and both are
allocation-free at steady state, which is why the pipe Playground mode benchmarks
within about one percent of raw.