Attention taught machines to represent. Intent teaches them to act. As the field crosses from models that predict to agents that pursue, the organizing principle of computation changes — from the token to the goal.
In 2017, five words reorganized machine learning: attention is all you need [1]. The Transformer replaced recurrence and convolution with a single mechanism for modeling relationships in a sequence. It gave us models that predict: extraordinary engines of representation.
But a representation is not an agent. A model that maximizes the likelihood of the next token has no notion of a state it wants to reach, no criterion for success beyond plausibility, no reason to try again when it is wrong. As AI crosses from generation into autonomous action (systems that perceive, plan, act, and adapt through iterative control loops [18]), the bottleneck is no longer how to attend to context. It is how to organize computation around a goal that persists across time, tools, memory, and self-correction.
Attention is the inner loop — perception and representation. Intent is the outer loop — goal-conditioned control. The claim of this paper is not that attention was wrong. It is that attention was the organizing principle of the model era, and that intent — the explicit, first-class representation of a goal — is the organizing principle of the agent era.
Attention (the model era). Objective: maximize the likelihood of the next token given context. Optimizes fidelity to a distribution. No goal, no state, no consequence.
Intent (the agent era). Objective: find a policy that reaches a specified goal, maximizing return. Optimizes achievement of an intent. The goal g is an input, not an afterthought.
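In symbols, the two objectives can be sketched as follows. Apart from the goal g, the notation is an assumption introduced here for illustration: θ are model parameters, x_t are tokens drawn from a corpus D, π is a goal-conditioned policy, r is a goal-dependent reward, γ is a discount factor, and p(g) is a distribution over goals.

```latex
\begin{align}
\text{Prediction:}\quad & \max_{\theta}\; \mathbb{E}_{x \sim \mathcal{D}} \sum_{t=1}^{T} \log p_{\theta}(x_t \mid x_{<t}) \\
\text{Intent:}\quad & \max_{\pi}\; \mathbb{E}_{g \sim p(g)}\, \mathbb{E}_{\tau \sim \pi(\cdot \mid \cdot,\, g)} \sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t, g)
\end{align}
```

The first expression never mentions a goal; in the second, g appears in both the policy and the reward.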
That single change, promoting the goal g to a first-class argument of every value and policy function, is what a decade of goal-conditioned reinforcement learning has been quietly building toward. The rest of this paper traces the four pillars that make "intent is all you need" a buildable architecture, and shows the three systems in which we have already built it.
Condition the value function on the goal itself and learning reorganizes. Universal Value Function Approximators generalize value across states and goals, V(s, g) [3]; Hindsight Experience Replay turns every failure into a lesson by relabeling the state actually reached as though it had been the intended goal [4]. Hierarchy makes it scale: FeUdal Networks split a Manager that sets abstract directional goals in latent space from a Worker that enacts primitive actions, decoupling goal-setting from execution and enabling credit assignment across very long horizons [2]. That split is a learned form of the temporal abstraction that the options framework first formalized [5].
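The hindsight relabeling step described above can be sketched in a few lines. This is a minimal illustration, not the published implementation: the grid-world states, the `Transition` record, and the 0 / -1 sparse reward are assumptions chosen for the example, and only the "final state" relabeling strategy is shown.

```python
from typing import List, NamedTuple, Tuple

State = Tuple[int, int]  # hypothetical grid-world coordinates


class Transition(NamedTuple):
    state: State
    action: str
    next_state: State
    goal: State
    reward: float


def sparse_reward(next_state: State, goal: State) -> float:
    """0 when the goal is reached, -1 otherwise (an assumed sparse-reward convention)."""
    return 0.0 if next_state == goal else -1.0


def relabel_with_final_state(episode: List[Transition]) -> List[Transition]:
    """Hindsight relabeling: pretend the last state reached was the goal all along."""
    if not episode:
        return []
    achieved = episode[-1].next_state
    return [
        t._replace(goal=achieved, reward=sparse_reward(t.next_state, achieved))
        for t in episode
    ]


# A failed episode: the agent wanted (3, 3) but only got as far as (1, 1).
intended_goal = (3, 3)
path = [(0, 0), (0, 1), (1, 1)]
actions = ["right", "down"]
episode = [
    Transition(s, a, s_next, intended_goal, sparse_reward(s_next, intended_goal))
    for s, a, s_next in zip(path, actions, path[1:])
]
relabeled = relabel_with_final_state(episode)
```

Under the intended goal every transition carries the failure reward; after relabeling, the final transition is a success for the goal (1, 1), so the same trajectory yields a learning signal for a goal-conditioned value function.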
If the goal is primary, the harness that pursues it need not be hand-built; it can be searched. Automated Design of Agentic Systems defines agents as code and lets a meta-agent program ever-better ones from a growing archive; because programming languages are Turing-complete, the search space in principle covers every possible agentic system, and the discovered agents are reported to outperform state-of-the-art hand-designed ones [6]. The Darwin Gödel Machine closes the loop on itself: a system that rewrites its own code, improving its ability to improve, and lifts its SWE-bench score from 20% to 50% through Darwinian evolution over an archive [7]. Promptbreeder evolves not just prompts but the mutation-prompts that evolve them [8]; Voyager accumulates an ever-growing library of self-written, reusable skills [9]. The harness becomes a thing the agent grows, not a thing we ship.
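The archive-based search pattern these systems share can be illustrated with a toy. This sketch is not ADAS or the Darwin Gödel Machine: here an "agent" is just a sequence of arithmetic operations, the mutation operator is random rather than LLM-written, and every name (`OPS`, `mutate`, `search`) is hypothetical. What it keeps is the shape of the loop: sample a parent from an archive, modify it, evaluate it against the goal, and add it back.

```python
import random
from typing import Callable, Dict, List, Tuple

# Hypothetical building blocks: a toy "agent" is a sequence of these operations.
OPS: Dict[str, Callable[[int], int]] = {
    "inc": lambda x: x + 1,
    "double": lambda x: x * 2,
    "square": lambda x: x * x,
}

Agent = List[str]


def run(agent: Agent, x: int) -> int:
    """Apply the agent's operations to x in order."""
    for name in agent:
        x = OPS[name](x)
    return x


def score(agent: Agent, target: int) -> float:
    """Higher is better; 0.0 means the goal is met exactly."""
    return float(-abs(run(agent, 1) - target))


def mutate(agent: Agent, rng: random.Random, max_len: int = 6) -> Agent:
    """Return a modified copy: add, drop, or replace one operation."""
    child = list(agent)
    choice = rng.choice(["add", "drop", "swap"])
    if choice == "add" and len(child) < max_len:
        child.insert(rng.randrange(len(child) + 1), rng.choice(list(OPS)))
    elif choice == "drop" and child:
        child.pop(rng.randrange(len(child)))
    elif child:
        child[rng.randrange(len(child))] = rng.choice(list(OPS))
    return child


def search(target: int, iterations: int = 200, seed: int = 0) -> Tuple[Agent, float, int]:
    """Grow an archive of candidates and return (best agent, best score, archive size)."""
    rng = random.Random(seed)
    archive: List[Tuple[Agent, float]] = [([], score([], target))]
    for _ in range(iterations):
        # Tournament selection over the whole archive: weaker candidates are
        # kept as possible stepping stones instead of being discarded.
        a, b = rng.choice(archive), rng.choice(archive)
        parent = a if a[1] >= b[1] else b
        child = mutate(parent[0], rng)
        archive.append((child, score(child, target)))
    best_agent, best_score = max(archive, key=lambda item: item[1])
    return best_agent, best_score, len(archive)
```

Calling `search(target=10)` returns the best operation sequence found, its score, and the archive size. Because the archive never drops the starting candidate, the best score can only match or improve on it.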
A goal supplies what a token never could: a criterion for being wrong, and therefore a signal to correct. Reflexion reinforces an agent through verbal feedback held in episodic memory, with no weight updates, reaching 91% pass@1 on HumanEval against an 80% baseline for the same underlying model [10]. RL makes the correction durable: SCoRe teaches self-correction with multi-turn online RL and a reward bonus [11]; Reflect-Retry-Reward uses GRPO to reward only the self-reflection tokens that turned a failure into a success [12]. Feedback scales past humans: RLAIF reports performance comparable to RLHF using AI-generated preference labels [13], and the emerging discipline of verifier engineering organizes the whole outer loop into search → verify → feedback [14]. Constant self-correction, applied to a goal, is the mechanism of autonomy.
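The outer loop of attempt, verify, reflect, and retry can be sketched without any model at all. The following is a toy under stated assumptions, not Reflexion itself: the `actor` is a deterministic stub standing in for an LLM, the `verifier` returns a two-word critique instead of test output, and the task (finding a hidden integer) is hypothetical. The point it illustrates is that behavior improves across trials purely through text held in an episodic memory.

```python
from typing import List, Optional, Tuple

Feedback = Tuple[int, str]  # (attempt, verbal reflection on that attempt)


def verifier(attempt: int, hidden_target: int) -> Optional[str]:
    """Return None on success, otherwise a short verbal critique."""
    if attempt == hidden_target:
        return None
    return "too low" if attempt < hidden_target else "too high"


def actor(memory: List[Feedback], low: int, high: int) -> int:
    """Stub policy: propose the midpoint of the range consistent with past reflections."""
    for attempt, reflection in memory:
        if reflection == "too low":
            low = max(low, attempt + 1)
        elif reflection == "too high":
            high = min(high, attempt - 1)
    return (low + high) // 2


def pursue(
    hidden_target: int, low: int = 0, high: int = 100, max_trials: int = 10
) -> Tuple[Optional[int], int, List[Feedback]]:
    """Try, verify, store the critique, and retry until the goal is met."""
    memory: List[Feedback] = []  # episodic memory; nothing else changes between trials
    for trial in range(1, max_trials + 1):
        attempt = actor(memory, low, high)
        critique = verifier(attempt, hidden_target)
        if critique is None:
            return attempt, trial, memory
        memory.append((attempt, critique))
    return None, max_trials, memory


answer, trials, memory = pursue(hidden_target=37)
# answer == 37 after 3 trials; memory holds the two failed attempts and their critiques
```

Replacing the stub actor with a model call and the verifier with unit tests or a learned judge would give the search, verify, feedback shape the paragraph describes.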
The deepest theories of agency already put the goal first. In active inference, an agent acts to minimize expected free energy, and a goal is simply a prior preference over the observations it expects to make; behavior is what closes the gap between the world and the intended world [16]. Empowerment and intrinsic-motivation accounts can be read as constrained entropy maximization: goal-seeking without any external reward. World models such as Dreamer show that once an agent can imagine futures, planning becomes goal-conditioned inference over that model [15]. Attention answers what is related to what. Intent answers what should be true, and how do I make it so.
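To make "a goal is a prior preference over observations" concrete, here is a small numerical sketch of policy selection by expected free energy in a discrete setting. It assumes the common risk-plus-ambiguity decomposition, G = KL[q(o | π) || p(o)] + E_q(s | π)[H[p(o | s)]]; the two-state world, the policy names, and all probabilities are invented for illustration.

```python
import math
from typing import Dict, List, Sequence


def kl_divergence(q: Sequence[float], p: Sequence[float]) -> float:
    """KL[q || p] for discrete distributions (assumes p > 0 wherever q > 0)."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)


def entropy(p: Sequence[float]) -> float:
    """Shannon entropy in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


def expected_free_energy(
    q_states: Sequence[float],
    likelihood: Sequence[Sequence[float]],
    preferences: Sequence[float],
) -> float:
    """Risk (divergence from preferred observations) plus ambiguity.

    q_states:    predicted state distribution under a policy, q(s | pi)
    likelihood:  likelihood[s][o] = p(o | s)
    preferences: prior preference over observations, p(o), i.e. the goal
    """
    n_obs = len(preferences)
    q_obs = [
        sum(qs * likelihood[s][o] for s, qs in enumerate(q_states))
        for o in range(n_obs)
    ]
    risk = kl_divergence(q_obs, preferences)
    ambiguity = sum(qs * entropy(likelihood[s]) for s, qs in enumerate(q_states))
    return risk + ambiguity


# Hypothetical two-state, two-observation world.
LIKELIHOOD: List[List[float]] = [[0.9, 0.1], [0.1, 0.9]]
PREFERENCES: List[float] = [0.9, 0.1]  # the agent "wants" to see observation 0
PREDICTED_STATES: Dict[str, List[float]] = {
    "stay": [0.1, 0.9],  # probably ends in state 1
    "move": [0.9, 0.1],  # probably ends in state 0
}

scores = {
    name: expected_free_energy(q, LIKELIHOOD, PREFERENCES)
    for name, q in PREDICTED_STATES.items()
}
chosen = min(scores, key=scores.get)  # policy with the lowest expected free energy
```

Both policies are equally ambiguous in this example, so the choice is driven entirely by risk: "move" is selected because its predicted observations sit closer to the preferred ones.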
A thesis is only as strong as what it builds. Three AAIRC systems instantiate "intent is all you need" across the full stack: CORTEX orchestrates toward goals, COREAI governs the autonomy, and MANTIS gives it memory.
CORTEX is Pillars I and III, in production. It treats agents as definitions, not implementations: you give it a goal, and an LLM-powered Nerve Center decomposes it, manufactures the agents needed to achieve it at runtime, binds them to sub-goals, and adapts as results arrive. Its own whitepaper states the thesis exactly: "the shift from task-driven to goal-driven operation is what enables true autonomy."
Autonomy without governance is a liability. COREAI is the foundational platform that makes goal-driven agents safe to deploy: a master planner spawns and supervises agents, but every action passes through a Tool Runner they cannot bypass.
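The idea of a mediated execution path can be sketched generically. The code below is not COREAI's Tool Runner or its API; the class, the allowlist policy, and the audit log format are hypothetical. It only illustrates the pattern of agents holding a reference to a runner rather than to the tools themselves. In-process Python cannot truly prevent a bypass, so a real deployment would presumably rely on process or network isolation.

```python
from typing import Any, Callable, Dict, Iterable, List, Set, Tuple


class PolicyViolation(Exception):
    """Raised when an agent requests a tool it is not permitted to use."""


class ToolRunner:
    """Hypothetical gatekeeper: every tool call is checked and logged here."""

    def __init__(
        self,
        tools: Dict[str, Callable[..., Any]],
        allowed: Dict[str, Iterable[str]],
    ) -> None:
        self._tools = dict(tools)
        self._allowed: Dict[str, Set[str]] = {
            agent: set(names) for agent, names in allowed.items()
        }
        self.audit_log: List[Tuple[str, str, bool]] = []

    def run(self, agent_id: str, tool_name: str, *args: Any) -> Any:
        permitted = (
            tool_name in self._tools
            and tool_name in self._allowed.get(agent_id, set())
        )
        # Log the decision before acting, so denied calls are recorded too.
        self.audit_log.append((agent_id, tool_name, permitted))
        if not permitted:
            raise PolicyViolation(f"{agent_id} may not call {tool_name}")
        return self._tools[tool_name](*args)


runner = ToolRunner(
    tools={
        "read": lambda path: f"contents of {path}",
        "delete": lambda path: f"deleted {path}",
    },
    allowed={"researcher": {"read"}},  # this agent may read but not delete
)

result = runner.run("researcher", "read", "notes.txt")
try:
    runner.run("researcher", "delete", "notes.txt")
    denied = False
except PolicyViolation:
    denied = True
```

The design choice illustrated is that policy and audit live in one place that sits between planning and action, so a planner can spawn new agents freely without widening what any of them can actually do.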
Intent persists only if the agent remembers. MANTIS is the graph-memory and world-model layer: a high-performance embedded graph database that stores the relationships between goals, agents, skills, tools, and outcomes — the substrate the self-correction loop reads from and writes to.
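As a rough picture of what storing relationships between goals, agents, skills, and outcomes might look like, here is a minimal in-memory graph. It is not MANTIS and does not reflect its schema or query interface; the class, the relation names (`attempted_by`, `used_skill`, `had_outcome`), and the example records are all hypothetical.

```python
from collections import defaultdict
from typing import DefaultDict, List, Set, Tuple


class GraphMemory:
    """Hypothetical typed-edge graph: (source, relation) -> set of targets."""

    def __init__(self) -> None:
        self._out: DefaultDict[Tuple[str, str], Set[str]] = defaultdict(set)

    def add(self, src: str, relation: str, dst: str) -> None:
        """Record a directed, labeled edge."""
        self._out[(src, relation)].add(dst)

    def neighbors(self, src: str, relation: str) -> Set[str]:
        """Return a copy of the targets reachable from src via relation."""
        return set(self._out.get((src, relation), set()))

    def skills_that_succeeded(self, goal: str) -> List[str]:
        """Skills used in attempts at `goal` whose recorded outcome was 'success'."""
        found: Set[str] = set()
        for attempt in self.neighbors(goal, "attempted_by"):
            if "success" in self.neighbors(attempt, "had_outcome"):
                found |= self.neighbors(attempt, "used_skill")
        return sorted(found)


memory = GraphMemory()
# The write side of the loop: two attempts at the same goal, one failed, one succeeded.
memory.add("goal:summarize-report", "attempted_by", "attempt:1")
memory.add("attempt:1", "used_skill", "skill:regex_scrape")
memory.add("attempt:1", "had_outcome", "failure")
memory.add("goal:summarize-report", "attempted_by", "attempt:2")
memory.add("attempt:2", "used_skill", "skill:parse_csv")
memory.add("attempt:2", "had_outcome", "success")

# The read side: what worked last time for this goal?
proven = memory.skills_that_succeeded("goal:summarize-report")
```

A self-correction loop would write an attempt and its outcome after each trial and read the proven skills before the next one.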
CORTEX orchestrates toward the goal. COREAI governs the autonomy. MANTIS remembers. Together they are the architecture of intent — a machine that is given a what, and discovers the how.
Not because attention is obsolete — it remains the substrate of perception. But because the question that defines this era is no longer "what is the most likely continuation?" It is "what outcome do we want, and what will a system do, autonomously and correctably, to achieve it?"
Every mechanism above — goal-conditioned value, self-designing harnesses, self-correction, active inference — is a different answer to the same reorganization: make the goal the primary object of computation, and intelligence follows as the search for how to reach it. That is a thesis, and a program of engineering. At AAIRC, it is also a running system.
In the post-agentic world, we do not program behavior. We specify intent — and the machine builds the rest.
Note: "Intent is all you need" is a programmatic thesis — a synthesis across goal-conditioned RL, self-referential self-improvement, and closed-loop correction — not a single empirically settled result. Named 2023–2026 findings are drawn from the primary sources above; benchmark figures are as reported by their authors.