Pangram Labs estimates that 40% of internet pages are AI-generated. This is largely driven by the SEO industry, which has switched to AI to produce keyword-targeted articles for pennies, flooding search results and platforms like Medium with low-cost, low-value content.
AI-generated posts on platforms like Reddit are driven by a B2B service. Startups sell companies the promise of "organic mentions," using AI bots that hold normal-seeming conversations before strategically recommending or mentioning a client's product.
Instead of detecting AI fakes, a new approach focuses on proving authenticity at the source. Organizations like C2PA work with hardware makers to embed cryptographic signatures into photos and videos, creating a verifiable chain of "content provenance" that proves an asset was captured by a real device.
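The sign-at-capture, verify-later idea can be illustrated with a small Python sketch. This is not the C2PA format: C2PA relies on certificate-based public-key signatures, whereas this sketch substitutes a shared-secret HMAC from the standard library purely to show the flow. The key, the function names, and the sample bytes are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical stand-in for a key held inside the capture device.
# Real provenance systems use an asymmetric private key instead.
DEVICE_KEY = b"hypothetical-device-secret"


def sign_asset(asset_bytes, key):
    """Hash the asset, then sign the hash with the device key."""
    digest = hashlib.sha256(asset_bytes).digest()
    return hmac.new(key, digest, hashlib.sha256).hexdigest()


def verify_asset(asset_bytes, signature, key):
    """Recompute the signature and compare in constant time."""
    expected = sign_asset(asset_bytes, key)
    return hmac.compare_digest(expected, signature)


photo = b"raw pixel data from the sensor"
signature = sign_asset(photo, DEVICE_KEY)

untouched_ok = verify_asset(photo, signature, DEVICE_KEY)
tampered_ok = verify_asset(photo + b" edited", signature, DEVICE_KEY)
```

Any change to the bytes after signing makes verification fail, which is the property that lets a viewer trust an asset's origin rather than guess at it.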
Pangram Labs uses an "active learning" loop to improve its model. After initial training, the model scans a massive corpus to find its own errors (false positives and false negatives). These hard-to-classify examples are then fed back into the training set, making the next version more robust.
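The loop described above can be sketched in a few lines of Python. This is a toy illustration, not Pangram Labs' pipeline: the "model" is a single threshold on a made-up one-dimensional score, the scanned pool is assumed to carry known labels, and every name and number here is hypothetical.

```python
from statistics import mean


def fit_threshold(examples):
    """Fit a toy classifier: the midpoint between the two class means.

    examples is a list of (score, label) pairs, label 1 = AI, 0 = human.
    """
    ai = [x for x, y in examples if y == 1]
    human = [x for x, y in examples if y == 0]
    return (mean(ai) + mean(human)) / 2


def predict(threshold, score):
    return 1 if score >= threshold else 0


def mine_errors(threshold, pool):
    """Return the pool examples the current model gets wrong."""
    return [(x, y) for x, y in pool if predict(threshold, x) != y]


def retrain_with_hard_examples(train_set, pool):
    """One round: train, find errors in the pool, retrain with them added."""
    threshold = fit_threshold(train_set)
    hard = mine_errors(threshold, pool)
    return fit_threshold(train_set + hard), hard


train_set = [(0.1, 0), (0.2, 0), (0.9, 1), (1.0, 1)]
pool = [(0.5, 1), (0.45, 1), (0.3, 0), (0.8, 1)]

new_threshold, hard_examples = retrain_with_hard_examples(train_set, pool)
remaining_errors = mine_errors(new_threshold, pool)
```

On this toy data the first model misclassifies two AI examples near the boundary, and retraining with them shifts the threshold enough to classify the whole pool correctly. Real data gives no such guarantee after a single round.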
Pangram Labs' detector isn't hard-coded. It's a deep learning model trained on millions of examples. For each human text (e.g., a Yelp review), it sees an AI-generated equivalent, learning the subtle, often inarticulable, differences in word choice and structure that separate them.
To distinguish light AI assistance (like Grammarly) from heavy generation, advanced detectors analyze the "cosine difference" (more commonly called cosine distance): the distance in a multidimensional embedding space between the original human text and the AI-edited version. The larger the distance, the greater the degree of AI influence.
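A minimal Python sketch of this measurement follows. It assumes simple word-count vectors in place of the learned embeddings a real detector would use, so the numbers only illustrate the idea. The function names and sample sentences are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Crude stand-in for an embedding: lowercase word counts."""
    return Counter(text.lower().split())


def cosine_distance(a, b):
    """1 minus cosine similarity; 0 means same direction, 1 means no overlap."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cannot compare empty text")
    return 1.0 - dot / (norm_a * norm_b)


original = bag_of_words("the pasta was great but service was slow")
light_edit = bag_of_words("the pasta was great but the service was slow")
heavy_rewrite = bag_of_words(
    "a delightful culinary experience marred by sluggish staff"
)

light_distance = cosine_distance(original, light_edit)
heavy_distance = cosine_distance(original, heavy_rewrite)
```

The grammar-fix version stays close to the original, while the full rewrite shares no words with it and lands at the maximum distance for count vectors.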
Early AI detectors used "perplexity," a measure of how surprising text is to a language model. This method is flawed because while AI text is predictably low-perplexity, so is text from non-native English speakers who take fewer linguistic risks, leading to a high rate of false positives.
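Perplexity can be made concrete with a toy example. The Python sketch below assumes a unigram model with add-one smoothing trained on a one-sentence corpus, which is far simpler than the language models real detectors used. The corpus, sample texts, and function names are hypothetical.

```python
import math
from collections import Counter


def train_unigram(corpus_tokens):
    """Return a probability function using add-one smoothing."""
    counts = Counter(corpus_tokens)
    total = sum(counts.values())
    vocab_size = len(counts) + 1  # one extra slot for unseen words

    def prob(token):
        return (counts[token] + 1) / (total + vocab_size)

    return prob


def perplexity(tokens, prob):
    """exp of the average negative log-probability per token."""
    if not tokens:
        raise ValueError("need at least one token")
    log_sum = sum(math.log(prob(t)) for t in tokens)
    return math.exp(-log_sum / len(tokens))


corpus = "the food was good and the service was good".split()
prob = train_unigram(corpus)

safe_text = perplexity("the food was good".split(), prob)
risky_text = perplexity("gossamer lanterns unspooled sideways".split(), prob)
```

Text built from common, expected words scores low, and text full of unseen words scores high. A threshold on this score cannot tell whether low perplexity comes from a model or from a cautious human writer, which is the failure mode described above.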
Historically, well-structured, grammatically correct writing served as a reliable heuristic for an intelligent and serious author. AI completely breaks this connection by allowing anyone to generate perfectly polished prose for any idea, no matter how absurd, removing a key filter for evaluating content.
