With agent loops automating execution, the highest-value human skill becomes designing the environment and rules for the AI. This involves writing the strategy document (like 'program.md'), defining success metrics, and constructing the evaluation function. Your job is no longer to do the work, but to architect the system in which the work gets done.
Iterative AI agent loops, like Andre Karpathy's Auto Research, are not just another tool but a new foundational building block of work. Similar to how spreadsheets or email became ubiquitous across all roles and industries, these loops will be a core component of how knowledge work is performed, fundamentally changing process and productivity.
The next evolution of agentic work involves massive, collaborative swarms of AIs working together. Current tools like GitHub, designed for human workflows with a single master branch, are ill-suited for this paradigm. The future will require new, agent-native platforms, possibly resembling social networks, to manage thousands of parallel experiments and collaborative branches.
Agentic loops are not a universal solution. They are most effective in domains where success can be measured by a clear, objective score and where failed experiments are cheap and quick. This framework helps identify the best business processes to automate, starting with areas like code generation or ad testing, not subjective, slow-moving tasks like political negotiation.
To prevent autonomous agents from operating in silos with 'pure amnesia,' create a central markdown file that every agent must read before starting a task and append to upon completion. This 'learnings.md' file acts as a shared, persistent brain, allowing agents to form a network that accumulates and shares knowledge across the entire organization over time.
A key challenge for AI agents is their limited context window, which leads to performance degradation over long tasks. The 'Ralph Wiggum' technique solves this by externalizing memory. It deliberately terminates an agent and starts a new one, forcing it to read the current state from files (code, commit history, requirement docs), creating a self-healing and persistent system.
