How Trail Works

Work That Fits in a Night: How Trail Stops Caring About Corpus Size

A compile-time knowledge system has to integrate new sources against everything it already knows. That integration cost grows with the corpus — until, one Tuesday, it stops finishing before the next night begins. Here is how Trail makes that cost bounded without going blind to the long tail.

A graph that does not stop growing

If you build a system that compiles knowledge at ingest time — as Trail does, and as Karpathy's LLM Wiki does, and as a brain does during sleep — you accept a particular kind of problem the moment the corpus crosses about eight thousand pages.

Every new source wants to be integrated against everything the system already knows. Every existing page wants to be re-checked against every new claim. Contradictions should not accumulate silently. Stale references should surface. Orphaned Neurons — pages no source supports anymore — should get caught.

All of this is correct, all of this is necessary, and all of this runs as a nightly background pass. The problem is that the nightly pass is not free, and at a certain corpus size it stops fitting inside the night.

This is a post about that wall, how we hit it, and how we walked through it without the usual answers — without giving up on full-corpus consistency, without silently skipping old pages, and without turning the whole engine into a distributed computing project.

[Figure: Where the scan hits the wall. Full-pass hours vs. Neurons in corpus: the naive O(N·K) sequential curve crosses the 24-hour scheduling window near 8,000 Neurons and passes stack up; the sampled cap=500 curve stays flat.]
A naive sequential scan crosses the 24-hour scheduling window around 8,000 Neurons and the passes start stacking. The sampled pass stays flat — by design.

The math of the wall

Trail's contradiction detector works like this. For every Neuron that has changed — or, in the nightly pass, every Neuron in the corpus — we find its top-K nearest peers via FTS5, then ask a small language model whether the Neuron contradicts each peer. Top-K is 5 by default. Each LLM call takes 1–3 seconds on Haiku, so every Neuron costs about 5 × 1.5 seconds ≈ 7.5 seconds of wall-clock time to scan.

At a thousand Neurons, that is two hours. At four thousand, eight hours. At eight thousand, a sequential full pass runs close to seventeen hours — and the scheduler re-fires every twenty-four. The passes start to overlap. Then they start to stack. Then the scan queue grows without bound, and the whole lint service is one bad night away from becoming useless.
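The arithmetic above can be checked in a few lines; the top-K of 5 and the 1.5 seconds per call are the post's own figures:

```python
def full_pass_hours(n_neurons: int, top_k: int = 5, secs_per_call: float = 1.5) -> float:
    """Wall-clock hours for a naive sequential contradiction pass:
    every Neuron is compared against its top-K FTS5 peers, with one
    small-model LLM call per (neuron, peer) pair."""
    return n_neurons * top_k * secs_per_call / 3600

for n in (1_000, 4_000, 8_000):
    print(f"{n:>5} Neurons -> {full_pass_hours(n):4.1f} h per full pass")
```

At 8,000 Neurons the pass already eats most of the scheduling window, and any retry or slow call pushes it past the next firing.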

This is the scan wall. It is not a bug in the detector; it is a direct consequence of an architecture where knowledge compiles. RAG doesn't have this problem because RAG never integrates anything. Trail has it because Trail does the integration that gives Neurons their value in the first place.

The wall has to be walked through, not avoided.

Why the obvious solutions are wrong

Parallelise it. The straightforward response is to spawn K worker processes and divide the Neurons among them. This works in the sense that it moves the wall further out, but it does not move it far enough — at 50,000 Neurons a 4-way parallel scan still needs about twenty-six hours by the cost model above, and the LLM token budget scales linearly with worker count. Worse, it replaces an architectural problem with an operations problem: Trail would become a service that required tuning a thread pool.

Skip pages that have not changed. The reactive path already does this. When a Neuron is approved, the contradiction-lint subscriber fires for that Neuron only. The nightly pass exists precisely because the reactive path misses cases: a page approved before contradiction-lint was enabled, a page affected by a source that was later retracted, a page whose truth-value depends on a peer that was just rewritten. You cannot drop the full pass without reintroducing silent drift.

Sample uniformly. Pick 500 random Neurons every night and scan those; over sixteen nights the budget nominally covers the whole corpus. This sounds reasonable until you look at where contradictions actually live. Contradictions are overwhelmingly introduced by recent edits, not discovered by re-checking quiet pages. A uniform random 500-of-8,000 gives a freshly edited page a 1-in-16 chance of being picked each night, delaying the discovery of a new contradiction by about sixteen nights on average, which is roughly forever in curator-time. The curator would notice the problem before the system did.

Each of these three solutions has a defect at the same place: they treat every Neuron as equally worth scanning, when the Neurons that actually need scanning are the ones near a recent edit — and, with lower frequency, a long-idle tail that nobody has looked at in months.

The split that works

The solution Trail ships is boring in the way good solutions usually are.

Cap the per-pass budget at 500 Neurons. Spend sixty percent of that budget — three hundred Neurons — on the most-recently-updated pages in the corpus. Spend the remaining forty percent — two hundred Neurons — on a uniform random sample of everything else. Do this every twenty-four hours.
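A minimal sketch of that sampler, assuming each Neuron carries an id and a last-edit timestamp (the names and shapes here are illustrative, not Trail's actual schema):

```python
import random

def nightly_sample(neurons, cap=500, recent_frac=0.6):
    """Pick one night's scan set: the most recently edited Neurons fill
    60% of the cap; the remaining 40% is a uniform random draw from
    everything else, so the long tail keeps getting revisited."""
    if cap == 0 or len(neurons) <= cap:
        return [nid for nid, _ in neurons]   # small corpus: scan everything
    by_recency = sorted(neurons, key=lambda n: n[1], reverse=True)
    n_recent = int(cap * recent_frac)        # 300 of 500 by default
    recent = by_recency[:n_recent]
    tail = random.sample(by_recency[n_recent:], cap - n_recent)
    return [nid for nid, _ in recent + tail]
```

On an 8,000-Neuron corpus this returns the 300 most recently edited pages plus 200 uniform picks from the remaining 7,700.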

[Figure: What the sample picks. Every Neuron in the KB, sorted by last edit; the nightly cap of 500 splits 60% to the most recent edits (highest contradiction yield) and 40% to a uniform random draw from the long-idle tail, so old Neurons don't silently rot.]
Each nightly sample is biased toward recent edits (where contradictions actually live) but reserves room for random long-tail picks so old Neurons never stop being checked.

That is the whole thing.

The choices embedded in those two numbers are not arbitrary. Sixty percent is what you get when you follow the biology honestly: slow-wave sleep devotes the bulk of its replay bandwidth to recent experience, but not all of it. The hippocampus replays the day's traces preferentially; it also replays older memories that are being consolidated or updated. If you look at sleep-spindle literature, the ratio is not symmetric and it is not winner-take-all — it is weighted toward new material with a meaningful reserve for old. Sixty-forty falls out of that shape.

Five hundred is the number that makes a full pass fit in roughly an hour of wall-clock time on a single Haiku lane, with comfortable headroom for retries and scheduler slack. It is not a constant of the universe. Tune it by environment variable (TRAIL_CONTRADICTION_SAMPLE_SIZE) based on your hardware budget. On a larger deployment with parallel CLI lanes, push it to two thousand. On a Raspberry Pi running a personal knowledge base, drop it to fifty. The sampling strategy does not change — only the cap.

And a cap of zero disables sampling entirely. On a corpus of three hundred Neurons, there is nothing to ration; just scan all of them. The code path that handles this is four lines long.
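The knob and its degenerate case fit in a few lines. The environment variable name comes from the post; the functions around it are a hypothetical sketch, not Trail's code:

```python
import os

def scan_cap(default=500):
    """Read the per-pass cap; TRAIL_CONTRADICTION_SAMPLE_SIZE overrides it."""
    return int(os.environ.get("TRAIL_CONTRADICTION_SAMPLE_SIZE", default))

def pass_size(n_neurons, cap=None):
    """How many Neurons one nightly pass will actually touch."""
    cap = scan_cap() if cap is None else cap
    # cap == 0 disables sampling entirely: small corpora are scanned whole.
    return n_neurons if cap == 0 else min(n_neurons, cap)
```

The sampling strategy never changes across deployments; only the number returned by `scan_cap` does.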

What this looks like in a running system

[Figure: The nightly dreaming pass. One ~1-hour scan pass per day, in four steps (sample 500, find top-K peers, emit findings, sleep), followed by 23 idle hours serving reads; mirrored against slow-wave sleep, where the hippocampus replays recent traces and the cortex integrates them into long-term memory. Trail dreams on a cron.]
A scheduled pass that sleeps for most of the day and does one bounded hour of integration work. The analogy to slow-wave consolidation is not metaphorical — it is the same pattern.

The scheduler ticks once a day. It picks the five hundred Neurons described above, hands each one to the contradiction checker in sequence, and emits any findings as curation-queue candidates. A curator sees those findings in the queue the next morning — or later that day, or whenever they open the admin — and resolves them through the standard flow. Retire the newer page, retire the older one, reconcile manually, or dismiss as a false positive.
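The pass itself is a short loop. The three callables below are hypothetical stand-ins for Trail's sampler, peer lookup, checker, and curation queue:

```python
def nightly_pass(sample_neurons, top_k_peers, contradicts, enqueue_finding):
    """One bounded dreaming pass: for each sampled Neuron, check its
    top-K peers and push any contradiction into the curation queue."""
    findings = 0
    for neuron in sample_neurons():            # the capped 60/40 sample
        for peer in top_k_peers(neuron):       # FTS5 nearest neighbours
            if contradicts(neuron, peer):      # one small-model LLM call
                enqueue_finding(neuron, peer)  # curator resolves it later
                findings += 1
    return findings
```

Everything downstream of `enqueue_finding` is the standard curation flow; the scheduler's only job is to keep this loop inside its budget.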

From the curator's perspective, nothing about the scheduling strategy is visible. They see a queue of findings. The queue is roughly the same size on night one at a thousand Neurons as it is on night one-hundred-and-fifty at twenty thousand. The work per human has been flattened, even though the corpus has grown by twenty times.

From the infrastructure side, the work per machine has also been flattened. LLM token spend per night is bounded by the cap, not by corpus size. Database read pressure is bounded. The scheduler either finishes its pass in an hour or it doesn't fire at all, depending on whether contradictions are enabled for that KB. There is no runaway.

The full-fidelity story is that every Neuron gets re-scanned on average every sixteen to twenty nights. Recent edits get re-scanned within twenty-four hours of the edit. A Neuron that has not been touched in two years sits in the tail and gets picked up a handful of times a year, which is the right frequency for a page that nobody is modifying. The system never stops checking anything; it just checks different things at different rates.

The same pattern, used everywhere

Contradiction-scanning is the most expensive thing the lint scheduler does, but it is not the only thing. The same nightly pass also runs orphan detection (every source to every Neuron, asking which Neurons no source supports anymore) and stale detection (every Neuron's updatedAt against a cutoff, flagging pages that haven't been touched in the configured staleness window).

Orphan and stale detection are cheap. They are pure SQL queries; they cost milliseconds even on large corpora. They run across the full KB every night with no sampling. There is nothing to ration because there is nothing expensive to do.
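Each check fits in one SQL statement. The schema below is a guess at the shape, not Trail's actual tables:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE neurons  (id INTEGER PRIMARY KEY, updated_at INTEGER);
    CREATE TABLE supports (source_id INTEGER, neuron_id INTEGER);
""")
conn.executemany("INSERT INTO neurons VALUES (?, ?)", [(1, 100), (2, 900)])
conn.execute("INSERT INTO supports VALUES (10, 1)")  # source 10 supports Neuron 1

# Orphan detection: Neurons no source supports anymore.
orphans = conn.execute("""
    SELECT n.id FROM neurons n
    LEFT JOIN supports s ON s.neuron_id = n.id
    WHERE s.neuron_id IS NULL
""").fetchall()

# Stale detection: Neurons untouched since the configured cutoff.
cutoff = 500
stale = conn.execute(
    "SELECT id FROM neurons WHERE updated_at < ?", (cutoff,)
).fetchall()
```

Both queries are index-friendly set operations; no per-row LLM call, so no sampling needed.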

The principle is that expensive work gets sampled; cheap work gets exhaustively scanned. Trail applies this uniformly across the background services. The queue-backfill that replays source re-ingests after schema migrations uses the same bounded-per-tick model. The reference-extractor runs reactively, never on a full pass. The backlink-extractor rebuilds incrementally on candidate_approved events.

The pattern falls out naturally once you commit to the compile-time architecture. Work has to be integrated. Integration has cost. Bounded integration has bounded cost. Unbounded integration, given a growing corpus, eventually exceeds any schedule you give it. You either sample thoughtfully or you accept that your system will eventually stop working on large corpora — and "eventually" arrives around the eight-thousand-Neuron mark, not at some abstract future scale.

What the user sees

Nothing. That is the point.

A curator with three hundred Neurons and a curator with thirty thousand Neurons have the same experience of the queue. The queue fills at roughly the same rate. The findings that land there are drawn from the same nightly-pass machinery. The delay between editing a page and the system re-checking it against its peers is under twenty-four hours in both cases.

The difference is hidden inside the scheduler. On the small KB, the cap is irrelevant — there are not enough Neurons for the cap to matter. On the large KB, the cap is the only reason the scheduler has not fallen apart. And because the sampling is biased toward recent edits, the curator's most common action — "I just merged a source, are any of these claims contradicting something I have already written?" — gets answered on the correct timescale regardless of how big the KB is.

Trail does not brag about this. It does not need to. The measure of a good scaling plan is that it does not show up in the product. Nobody notices the wall when you have already walked through it.

The broader commitment

Compile-time architecture makes a specific bet: that it is better to pay the integration cost at ingest than to avoid integration altogether. That bet is correct for a wide class of knowledge work — the class where knowledge should compound over months or years — and it is wrong for the class where corpus size is small and query volume is low.

But the bet only pays off if the integration cost stays bounded as the corpus grows. If every nightly pass scales with N and every reactive emission scales with K, the system eventually drowns in its own usefulness. The integration that made the Neurons valuable is also the integration that, unbounded, makes the system slow and eventually unusable.

Sampling with bias is how you escape that trap. Not uniform sampling — too much blindness. Not full scans — too much cost. A weighted cap that spends its budget where contradictions actually live, keeps a reserve for the long tail, and stays inside a human-meaningful time window.

That is the scaling plan. It is not an engineering tour de force. It is the result of taking seriously that compile-time knowledge, like biological memory, has to be consolidated continuously — and that consolidation, in both cases, is always bounded by sleep. Trail just has sleep on a cron.

