Brainsless is a research lab building its own models and advancing the science of cognitive integration with AI. We develop architectures where AI operates structurally inside human work — not as a layer on top of it, but as a constituent part of the cognitive process itself. Our first commercial application is Planless.
Constraint Routing Failure and Governed Multi-Stream Attention
We run an AI that sits inside people's work all day. In long sessions, we kept seeing the same thing: models would follow rules at the start of a conversation, then quietly stop enforcing them toward the end — while still being able to recall and quote those rules back perfectly. That's not forgetfulness. Something structural was breaking down.
We ran the experiments. Traced it to the attention mechanism: when you put behavioral constraints, user instructions, episodic context, and persona tokens all into the same softmax pool, the constraints lose out as the context window grows. The math is simple and the effect is large — a compliance cliff that hits reliably around turn 20.
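A minimal sketch of the dilution effect described above, not the analysis from the report. It assumes every token gets the same attention logit unless stated otherwise, and the token counts (50 constraint tokens, 200 new context tokens per turn) are invented for illustration. The function name `constraint_mass` is hypothetical.

```python
import math

def constraint_mass(n_constraint, n_context, constraint_logit=0.0, context_logit=0.0):
    """Fraction of a single softmax's attention that lands on constraint tokens.

    All tokens of a kind share one logit (an assumption), so the softmax
    reduces to a ratio of counts weighted by exp(logit).
    """
    c = n_constraint * math.exp(constraint_logit)
    o = n_context * math.exp(context_logit)
    return c / (c + o)

# Illustrative numbers only: 50 constraint tokens, 200 context tokens added per turn.
for turn in (1, 5, 20, 50):
    n_context = 200 * turn
    print(turn, round(constraint_mass(50, n_context), 3))
```

With equal logits the constraint share is simply 50 / (50 + 200 * turn), so it falls steadily as the session grows. This toy model shows the crowding-out trend only; it does not reproduce the sharp drop around turn 20 reported above.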
The fix we propose is Governed Multi-Stream Attention — separate softmax streams for different token roles, so behavioral tokens can't be crowded out by sheer context volume. We fine-tuned it, ran it against five benchmarks, and it holds up. Mean alignment jumped from 0.26 to 0.79 with no capability cost. This is the paper.
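One way to picture separate softmax streams is sketched below. This is an illustration under stated assumptions, not the formulation in the report: the role labels, the fixed per-stream budget `stream_budget`, and the function `multi_stream_weights` are all hypothetical. Each role gets its own softmax, and each stream's output is scaled by a fixed share of the total attention.

```python
import math
from collections import defaultdict

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def multi_stream_weights(scores, roles, stream_budget):
    """Per-token attention weights with one softmax per token role.

    scores        -- raw attention logits, one per token
    roles         -- role label per token (e.g. "constraint", "context")
    stream_budget -- hypothetical fixed share of attention per role
    """
    by_role = defaultdict(list)
    for i, role in enumerate(roles):
        by_role[role].append(i)
    # Renormalise budgets over the roles that are actually present.
    norm = sum(stream_budget[r] for r in by_role)
    weights = [0.0] * len(scores)
    for role, idxs in by_role.items():
        local = softmax([scores[i] for i in idxs])
        for i, w in zip(idxs, local):
            weights[i] = w * stream_budget[role] / norm
    return weights

budget = {"constraint": 0.3, "context": 0.7}
for n_context in (10, 1000):
    roles = ["constraint"] * 4 + ["context"] * n_context
    w = multi_stream_weights([0.0] * len(roles), roles, budget)
    print(n_context, round(sum(w[:4]), 3))
```

Because constraint tokens only compete with each other inside their stream, their combined weight stays at the budgeted share whether the context holds ten tokens or a thousand.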
Read It All: Download PDF · 61 pages · Technical Report BRL-2026-05
Neptyn 1.0 is live and running inside Planless. It's our model — not a wrapper around someone else's API, not a fine-tuned GPT. We built it on a sparse Mixture-of-Experts backbone at the trillion-parameter scale, which means it has the depth of a frontier model without the compute cost of running one dense.
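For readers unfamiliar with sparse Mixture-of-Experts, the sketch below shows generic top-k routing, the usual reason such a model costs less to run than a dense one of the same size. It is not Neptyn's router: the function names, the toy experts, and k=2 are assumptions for illustration.

```python
import math

def top_k_route(gate_logits, k=2):
    """Pick the k highest-scoring experts and softmax-normalise their gate weights."""
    top = sorted(range(len(gate_logits)), key=lambda i: gate_logits[i], reverse=True)[:k]
    m = max(gate_logits[i] for i in top)
    exps = {i: math.exp(gate_logits[i] - m) for i in top}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}

def moe_forward(x, experts, gate_logits, k=2):
    """Run only the routed experts and mix their outputs by gate weight."""
    routes = top_k_route(gate_logits, k)
    return sum(weight * experts[i](x) for i, weight in routes.items())

# Toy experts: each one just scales its input.
experts = [lambda x: x, lambda x: 2 * x, lambda x: 3 * x, lambda x: 4 * x]
gate_logits = [0.0, 5.0, 5.0, -1.0]
print(moe_forward(10, experts, gate_logits))
```

Only two of the four experts run for this input, so compute per token scales with k rather than with the total number of experts.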
What makes it different from the models you already use: most frontier models are trained to be helpful to everyone, which means they're optimized for nobody in particular. Neptyn starts from a corpus of how founders and operators actually think — strategy, product decisions, how people reason under uncertainty when something real is at stake. That's the base.
Then it adapts to you. Planless observes your patterns over time — how you structure work, what you prioritize, how you think through decisions — and Neptyn's behavior shifts to match. Not a settings menu. Actual adaptation. The longer you use it, the more it sounds like the way you'd want to think, not the way a generic assistant wants to respond.
We'll keep releasing updates to it. 1.0 is the foundation: the architecture is in place, the memory system is running, and Governed Multi-Stream Attention (GMSA) is being trained into the attention layers now. What comes next will be faster, more personalized, and better at holding behavioral constraints across very long sessions. We'll publish when there's something worth publishing.
Most models are trained to answer everything reasonably well. Neptyn is trained to think well about the specific problems founders and operators face — and then to adapt to the specific person using it. That's a fundamentally different optimization target.
Three layers — working memory (the current session), episodic memory (indexed retrieval from your history), and an executive system that decides what's relevant when. Not flat RAG over a document store. Structured recall, like the way you'd actually want a co-thinker to remember things.
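The three layers can be sketched roughly as follows. This is a toy under stated assumptions, not the Planless memory system: the class names `Episode` and `LayeredMemory`, the tag-based index, and the overlap-count ranking are all hypothetical stand-ins for whatever the real executive system does.

```python
from collections import Counter, deque
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Episode:
    text: str
    tags: frozenset

@dataclass
class LayeredMemory:
    # Working memory: a bounded window over the current session.
    working: deque = field(default_factory=lambda: deque(maxlen=5))
    # Episodic memory: an index from tag to the episodes carrying it.
    episodic: dict = field(default_factory=dict)

    def observe(self, text):
        """Add a turn from the current session to working memory."""
        self.working.append(text)

    def archive(self, text, tags):
        """Store a past episode under each of its tags."""
        episode = Episode(text, frozenset(tags))
        for tag in episode.tags:
            self.episodic.setdefault(tag, []).append(episode)

    def recall(self, query_tags, limit=3):
        """Executive step: rank episodes by tag overlap with the query,
        then return working memory alongside the top matches."""
        hits = Counter()
        for tag in query_tags:
            for episode in self.episodic.get(tag, []):
                hits[episode] += 1
        recalled = [episode.text for episode, _ in hits.most_common(limit)]
        return list(self.working), recalled

memory = LayeredMemory()
memory.archive("Chose usage-based pricing", {"pricing", "strategy"})
memory.archive("Hired first engineer", {"hiring"})
memory.observe("What about pricing tiers?")
print(memory.recall({"pricing"}))
```

The point of the split is that retrieval is selective: the hiring episode never reaches the model when the conversation is about pricing, unlike a flat search over every stored document.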
The problem we documented in BRL-2026-05 — compliance dropping off in long sessions — is being fixed at the architectural level in Neptyn. Behavioral rules get their own attention stream so they can't be crowded out. That's not a prompt trick. It's the model being built differently.
1.0 is the start. We'll improve latency, personalization depth, and constraint robustness in subsequent versions. Each release will have a technical note. We're not shipping and disappearing — this is the model we use ourselves every day.
Our work is organized around four questions that arise when AI is positioned inside cognitive work rather than alongside it.
We investigate what happens when AI systems become structurally integrated with human cognitive work. Following the Clark–Chalmers framework, we ask: under what conditions does a computational system become part of the cognitive process itself, rather than a tool the mind uses?
We study how AI memory can be organized to reflect the structure of human episodic memory (indexing, associative binding, and slow consolidation into behavioral priors) rather than treating recall as flat retrieval over an unstructured corpus.
We research architectures that let AI agents execute complex workspace tasks during live voice conversations under strict latency constraints. The central problem is maintaining coherent, context-sensitive judgment while parallel execution is in progress.
We examine the conditions under which autonomous background agents can be extended trust incrementally, and the failure modes that arise when they cannot. Specifically: how should trust be scoped, logged, and revoked when an agent acts on judgment rather than explicit instruction?
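To make the last question concrete, here is one possible shape for scoped, logged, revocable trust. It is a sketch of the question, not an answer to it or a description of any Planless component: the `TrustLedger` class, the scope strings, and the log format are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class TrustLedger:
    scopes: set = field(default_factory=set)   # scopes the agent currently holds
    log: list = field(default_factory=list)    # append-only audit trail

    def _record(self, event, scope, allowed=None):
        self.log.append({
            "at": datetime.now(timezone.utc),
            "event": event,
            "scope": scope,
            "allowed": allowed,
        })

    def grant(self, scope):
        """Extend trust for one narrowly named scope."""
        self.scopes.add(scope)
        self._record("grant", scope)

    def revoke(self, scope):
        """Withdraw a scope; later checks for it fail."""
        self.scopes.discard(scope)
        self._record("revoke", scope)

    def authorize(self, scope):
        """Check a scope and log the decision either way."""
        allowed = scope in self.scopes
        self._record("authorize", scope, allowed)
        return allowed

ledger = TrustLedger()
ledger.grant("calendar:write")
print(ledger.authorize("calendar:write"))
ledger.revoke("calendar:write")
print(ledger.authorize("calendar:write"))
```

Logging denied checks as well as granted ones matters here: an agent acting on judgment leaves a record of what it tried to do, not only of what it was allowed to do.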
Where it looks, how deep it goes, and why the search layer was built specifically for the way founders work — specialized indexes, four depth levels, and live website crawl.
A technical audit of the Planless vault: key derivation, storage topology, and attack surface. 75 tests written. Four critical issues found and resolved.
Connect your WHOOP. Your AI co-founder feels how you slept, what you've been carrying, what your body's actually doing — and quietly bends her day around it. Private access for now.
The second generation of Cortex adds four capabilities — her own persistent posture, theory of mind for the people in your orbit, predictive lessons that surface in the moment, and voice-tone reading that adds zero latency.
The proprietary memory system Brainsless built to give our AI co-founder continuous, bitemporal awareness across every channel. Three layers, running on Neptyn 1.0.
In 1998, Andy Clark and David Chalmers published The Extended Mind in the journal Analysis. Their argument was not metaphorical: the boundary of the cognitive system is not the body. A notebook that reliably stores memory is memory. A tool that offloads decisions is thinking. Cognition extends into whatever it reliably uses.
"If, as we confront some task, a part of the world functions as a process which, were it done in the head, we would have no hesitation in recognizing as part of the cognitive process — then that part of the world is part of the cognitive process."
Andy Clark & David Chalmers · The Extended Mind · Analysis, Vol. 58, No. 1, 1998
That paper was written about pocket diaries. We are building for what comes after: not AI you pick up and consult, but AI that already lives inside the thought because it lives inside the data that generated it. The goal is not to make AI more useful. It is to ask whether the distinction between cognition and computation is well-placed at all.
We are documenting our approach and methodology as we build. Our first technical report — Attention Is Not Enough (BRL-2026-05) — lays out the structural problem with standard transformer attention and proposes Governed Multi-Stream Attention as the fix. A full account of architecture, training corpus, and failure cases is in preparation.