AI models excel at coding because correctness is easy to evaluate. Design is harder because "good" is subjective and tied to human taste, making it difficult to create a training feedback loop. Furthermore, design values novelty and cultural context, whereas software engineering prefers established, reliable patterns.
AI tools are commoditizing the act of writing code (software development). The durable skill and key differentiator is now software engineering: architecting systems, creating great user experiences, and applying taste. Building something people want to use is the new challenge.
AI lowers the technical barrier to building products, making design taste and judgment the critical differentiators. An AI can execute tasks, but it requires a designer's discerning eye to guide it toward a high-quality, cohesive, and valuable user experience.
Developers fall into the "agentic trap" by building complex, fully-automated AI coding systems. These systems fail to create good products because they lack human taste and the iterative feedback loop where a creator's vision evolves through interaction with the software being built.
AI tools are dramatically lowering the cost of implementation and "rote building." Value therefore shifts to the design phase, which becomes the most expensive and critical part of product creation: deeply understanding the user's pain point, exercising good judgment, and having product taste.
Rather than optimizing solely for performance on standard industry benchmarks, Ideogram focuses on embedding a subjective quality of "taste" into its models. Because the company believes current AI is poor at judging aesthetic nuances, it relies on human designers for evaluation, an approach it credits for its creative edge.
Creating AI that can reliably judge aesthetics is a frontier problem. Unlike tasks with clear right or wrong answers, aesthetics is subjective. Without a clear, objective benchmark, standard model improvement techniques are difficult to apply, so the problem is a better fit for Reinforcement Learning from Human Feedback (RLHF).
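To make the human-feedback idea concrete, here is a minimal sketch of how pairwise "which design looks better?" judgments could be turned into numeric scores. It uses a Bradley-Terry preference model, which is commonly used for reward modelling in RLHF pipelines; it is not a full RLHF loop and not a method attributed to anyone quoted here. The function name fit_bradley_terry and the judgments data are hypothetical and exist only for illustration.

```python
import math


def fit_bradley_terry(n_items, comparisons, lr=0.1, epochs=500):
    """Fit one latent 'taste' score per item from (winner, loser) pairs.

    Uses plain gradient ascent on the Bradley-Terry log-likelihood.
    """
    scores = [0.0] * n_items
    for _ in range(epochs):
        grads = [0.0] * n_items
        for winner, loser in comparisons:
            # Modelled probability that a human prefers `winner` over `loser`.
            p_win = 1.0 / (1.0 + math.exp(scores[loser] - scores[winner]))
            # Gradient of log(p_win) with respect to each score.
            grads[winner] += 1.0 - p_win
            grads[loser] -= 1.0 - p_win
        for i in range(n_items):
            scores[i] += lr * grads[i]
    return scores


# Hypothetical judgments: each tuple is (preferred design, rejected design).
judgments = [(0, 1), (0, 1), (1, 0), (0, 2), (1, 2), (1, 2), (2, 1)]
scores = fit_bradley_terry(3, judgments)

# Rank designs from most to least preferred according to the fitted scores.
ranking = sorted(range(3), key=lambda i: scores[i], reverse=True)
print(ranking, [round(s, 2) for s in scores])
```

The point of the sketch is that no objective "correct" answer is ever supplied: the only training signal is which option people preferred, and disagreement between raters (design 1 sometimes beating design 0) is absorbed into the scores rather than treated as an error.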
Despite AI's ability to generate functional code, replicating the nuanced, subjective quality of a specific designer's "taste" remains extremely difficult. Felix Lee, after spending weeks attempting to codify his own taste into an AI model with little success, notes it's a significant unsolved challenge.
Current benchmarks focus on whether code passes tests. The future of AI evaluation must assess qualitative, human-centric aspects like "design taste," code maintainability, and alignment with a team's specific coding style. These are hard to measure automatically and signal a shift toward more complex, human-in-the-loop or LLM-judged evaluation frameworks.
AI models are poor at "last-mile" visual design. However, if a human designer invests heavily in creating a perfect set of primitives (e.g., buttons, cards), AI becomes incredibly effective at reusing and intelligently extrapolating from that foundation for new contexts. Human effort on the core system pays off many times over.
True taste isn't just recognizing good design; it's the judgment of when to innovate versus when to adhere to established patterns. This discernment, the ability to zoom in and out, is a uniquely human skill that current AI models cannot replicate.