Research in Recommendation Systems (RecSys) and Information Retrieval (IR) is uniquely unintuitive. Feedback from the modeling environment feels "rude" and disconnected from your actions, as if the fundamental principles of cause and effect that hold in other ML domains were simply absent.
Naming AI research teams with terms like "AGI" is more about signaling a long-term "north star" and creating "vibes" to attract ambitious talent, rather than reflecting a concrete, step-by-step plan to achieve artificial general intelligence.
Despite its age, the Transformer architecture is likely here to stay on the path to AGI. A massive ecosystem of optimizers, hardware, and techniques has been built around it, creating a powerful "local minimum" that makes it more practical to iterate on Transformers than to replace them entirely.
Applying the machine learning concept of a "learning rate" to human cognition suggests that when a single counterexample disproves a core assumption, one should radically increase one's learning rate and question all related beliefs, rather than making a small, incremental update.
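A toy sketch of the analogy (the update rule and all numbers are illustrative assumptions, not anything from the source): treat a belief as a confidence score and the learning rate as how far a single counterexample is allowed to move it.

```python
# Toy illustration of the "learning rate for beliefs" analogy.
# The update rule and numbers are hypothetical, chosen for illustration.

def update(belief: float, evidence: float, lr: float) -> float:
    """Move a belief (confidence in [0, 1]) toward the evidence by a fraction lr of the gap."""
    return belief + lr * (evidence - belief)

core_belief = 0.95    # a strongly held assumption
counterexample = 0.0  # a single observation that flatly contradicts it

# Incremental update: the usual small learning rate barely moves the belief.
print(update(core_belief, counterexample, lr=0.01))  # ~0.94

# Radical update: a falsified core assumption warrants a temporarily
# large learning rate, forcing a wholesale re-examination.
print(update(core_belief, counterexample, lr=0.9))   # ~0.10
```

The point of the large-lr case is not the exact value but the regime change: a disproved core assumption should trigger a wholesale re-examination, not a one-percent adjustment.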
Returning to a large tech company like Google after a period away is akin to resuming a saved video game. Your digital identity, username, and access to the vast internal infrastructure remain intact, allowing for a remarkably seamless re-integration despite significant organizational changes.
Deep expertise in one AI sub-field, like model architectures, isn't a prerequisite for innovating in another, such as Reinforcement Learning. Fundamental research skills are universal and transferable, allowing experienced researchers to quickly contribute to new domains even with minimal background knowledge.
Contrary to the "bitter lesson" narrative that scale is all that matters, novel ideas remain a critical driver of AI progress. The field is not yet experiencing diminishing returns on new concepts; game-changing ideas are still being invented and are essential for making scaling effective in the first place.
AI coding tools have moved beyond simple assistance. Expert ML researchers now delegate debugging entirely, feeding an error log to the model and trusting its proposed fix without inspection. This marks a shift towards AI as an autonomous problem-solver, not just a helper.
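A minimal sketch of that delegation loop, under stated assumptions: `call_model` is a hypothetical stand-in for whichever LLM API you use, and the prompt format and unreviewed `git apply` step are illustrative, not a documented workflow from the source.

```python
# Sketch of the "delegate debugging to the model" loop described above.
# call_model is a placeholder for any LLM completion API; the prompt and
# the unreviewed patch application are illustrative assumptions.
import subprocess

def call_model(prompt: str) -> str:
    """Stand-in for an LLM API call that returns a unified diff."""
    raise NotImplementedError("wire up your model provider here")

def delegate_debugging(test_cmd: list[str]) -> None:
    result = subprocess.run(test_cmd, capture_output=True, text=True)
    if result.returncode == 0:
        return  # nothing to fix
    # Feed the raw error log to the model and ask for a patch.
    patch = call_model(
        "Fix the bug causing this failure. Reply with a unified diff only.\n\n"
        + result.stdout + result.stderr
    )
    # Apply the proposed fix without inspecting it -- the trust shift
    # described in the paragraph above.
    subprocess.run(["git", "apply", "-"], input=patch, text=True, check=True)

# Usage (hypothetical): delegate_debugging(["pytest", "-x"])
```

The deliberately missing review step between the model's diff and `git apply` is the point: the error log goes in, and the patch goes straight to the working tree.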
A key decision behind Google DeepMind's IMO Gold medal was abandoning their successful specialized system (AlphaGeometry) for an end-to-end LLM. This reflects a core AGI philosophy: a truly general model must solve complex problems without needing separate, specialized tools.
A remarkable feature of the current LLM era is that AI researchers can contribute to solving grand challenges in highly specialized domains, such as winning an IMO Gold medal, without possessing deep personal knowledge of that field. The model acts as a universal tool that transcends the operator's expertise.
Training the model that achieved the IMO Gold medal was a fluid, hackathon-like effort among four "captains" in London, Mountain View, and Singapore. The team operated with a highly ad-hoc workflow, passing responsibilities across continents as individuals traveled, ensuring continuous progress.
Getting hired at a premier AI lab like Google DeepMind often bypasses traditional applications. Top researchers actively scout and directly contact individuals who produce work that demonstrates excellent "research taste." The key is to independently identify and pursue fruitful research directions, signaling an innate ability to innovate.
On-policy reinforcement learning, where a model learns from its own generated actions and their consequences, is analogous to how humans learn from direct experience and mistakes. This contrasts with off-policy methods like supervised fine-tuning (SFT), which resemble simply imitating others' successful paths.
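A compact sketch of the contrast in PyTorch (the toy policy, reward, and "expert action" are hypothetical stand-ins, not the systems under discussion): the SFT step pushes the model toward someone else's answer, while the on-policy step samples the model's own action and weights it by the consequence.

```python
# Minimal PyTorch sketch contrasting the two training signals. The toy
# policy, reward, and "expert action" are hypothetical stand-ins.
import torch
import torch.nn.functional as F

vocab, hidden = 100, 32
policy = torch.nn.Linear(hidden, vocab)  # toy policy: state -> action logits
opt = torch.optim.SGD(policy.parameters(), lr=1e-2)
state = torch.randn(1, hidden)

# Off-policy SFT: imitate an expert's successful action via cross-entropy,
# regardless of what the model itself would have done.
expert_action = torch.tensor([42])
sft_loss = F.cross_entropy(policy(state), expert_action)
opt.zero_grad(); sft_loss.backward(); opt.step()

# On-policy RL (REINFORCE): sample the model's *own* action, observe the
# consequence as a reward, and reinforce or suppress that action.
dist = torch.distributions.Categorical(logits=policy(state))
action = dist.sample()                         # the model's own move
reward = 1.0 if action.item() == 42 else -1.0  # stand-in environment feedback
rl_loss = -(dist.log_prob(action) * reward).mean()
opt.zero_grad(); rl_loss.backward(); opt.step()
```

The design difference is where the training signal comes from: the SFT gradient is computed on actions the model may never take itself, while the REINFORCE gradient is computed exactly on its own behavior, which is why it resembles learning from one's own mistakes.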
