Gamifying AI token consumption via internal leaderboards, as seen at Meta, creates perverse incentives. Employees may burn tokens to climb the ranks rather than to solve real business problems. This "tokenmaxxing" promotes conspicuous consumption of compute, a vanity metric that masks true productivity and ROI.

Related Insights

The proliferation of AI leaderboards incentivizes companies to optimize models for specific benchmarks. This creates a risk of "acing the SATs" where models excel on tests but don't necessarily make progress on solving real-world problems. This focus on gaming metrics could diverge from creating genuine user value.

By ranking engineers on AI token consumption, Meta is experiencing Goodhart's Law: "When a measure becomes a target, it ceases to be a good measure." Employees reportedly build bots to needlessly burn tokens for status, demonstrating how gamifying a proxy metric can backfire and disconnect from actual business impact.

To get teams experimenting with AI, leaders should provide an open budget for tokens initially. Being 'profligate' at the start is crucial, as imposing constraints too early leads to unimpressive results, stifles creativity, and hinders true adoption. Efficiency can be optimized later.

A trend called "tokenmaxxing" is emerging in Silicon Valley, where companies like Meta use leaderboards to track employee AI token usage. This reflects a corporate bet that higher token consumption correlates with increased productivity, turning AI usage into a new, albeit gameable, performance metric for engineers.

According to Goodhart's Law, when a measure becomes a target, it ceases to be a good measure. If you incentivize employees on AI-driven metrics like 'emails sent,' they will optimize for the number, not quality, corrupting the data and giving false signals of productivity.

Gamification backfires when it rewards unintended actions. For example, Visual Studio's badge system inadvertently incentivized developers to write curse words in code comments. This shows the need to understand the second-order effects of any incentive system before implementing it.

Labs are incentivized to climb leaderboards like LM Arena, which reward flashy, engaging, but often inaccurate responses. This focus on "dopamine instead of truth" creates models optimized for tabloids, not for advancing humanity by solving hard problems.

An employee using AI to do 8 hours of work in 4 hours benefits personally by gaining free time. The company (the principal) sees no productivity gain unless that employee produces more. This misalignment reveals the core challenge of translating individual AI efficiency into corporate-level growth.

To accelerate its internal AI transformation, Meta is now grading employees on their use of company-provided AI tools as part of their performance reviews. This tactic moves AI from an optional productivity enhancer to a mandatory part of the job, creating powerful incentives for adoption and cultural change across the organization.

At companies like Meta, a new practice called "tokenmaxxing" is being used to measure productivity, where engineers compete on leaderboards to consume the most AI tokens. Promoted by leaders from Nvidia and Meta, this metric is criticized for being easily gamed and not necessarily reflecting true productivity.