Inspired by printer calibration sheets, designers create UI "sticker sheets" and ask the AI to describe what it sees. This reveals the model's perceptual biases, such as failing to see subtle borders or truncating complex images. The insights are used to refine prompting instructions and user training.
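The sticker-sheet probe can be scored mechanically. A minimal sketch, where the element names and the model's described set are invented for illustration:

```python
# Elements deliberately placed on the sticker sheet (names are illustrative).
EXPECTED = {"primary-button", "subtle-border-card", "toast", "avatar", "divider"}

def perception_gaps(expected: set[str], described: set[str]) -> dict[str, set[str]]:
    """Compare the elements on the sheet with what the vision model reported."""
    return {
        "missed": expected - described,        # likely perceptual blind spots
        "hallucinated": described - expected,  # elements the model invented
    }

# Suppose the model's description omitted the low-contrast items:
described = {"primary-button", "toast", "avatar"}
gaps = perception_gaps(EXPECTED, described)
```

Running the same probe across models or prompt variants turns "the model misses subtle borders" from an anecdote into a tracked metric.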
Anthropic strategically focuses on "vision in" (AI understanding visual information) over "vision out" (image generation). This mimics a real developer who needs to interpret a user interface to fix it, but can delegate image creation to other tools or people. The core bet is that the primary bottleneck is reasoning, not media generation.
Atlassian found users struggled with prompting, using vague language like "change logo to JIRA," which caused the AI to pull old assets. In response, they embedded pre-written, copyable commands into their prototyping templates. These guide users to interact with the underlying code correctly, reducing hallucinations and boosting confidence.
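One way to embed such commands is a small intent-to-prompt lookup shipped inside the template. The intents, prompt copy, and asset paths below are illustrative assumptions, not Atlassian's actual templates:

```python
# Pre-written, copyable prompts keyed by user intent (all text is illustrative).
STARTER_PROMPTS = {
    "swap-logo": (
        "Replace the image element with id 'app-logo' using the file at "
        "assets/logo-current.svg; do not fetch or generate any other logo asset."
    ),
    "darken-header": (
        "Change only the background token of the existing header component "
        "to color.neutral.900; leave all other styles untouched."
    ),
}

def starter_prompt(intent: str) -> str:
    """Return the vetted prompt for an intent, failing loudly on unknown ones."""
    if intent not in STARTER_PROMPTS:
        raise ValueError(f"No starter prompt for intent: {intent!r}")
    return STARTER_PROMPTS[intent]
```

The precise wording ("only", "do not fetch") is what steers the model away from stale assets that a vague request would invite.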
Atlassian improved AI accuracy by instructing it to first think in a familiar framework like Tailwind CSS, then providing a translation map to their proprietary design system components. This bridges the gap between the AI's training data and the company's unique UI language, reducing component hallucinations.
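A sketch of the "think in Tailwind first, then translate" instruction. The mapping entries and component names stand in for Atlassian's proprietary design system; they are illustrative assumptions:

```python
# Map from familiar Tailwind patterns to (hypothetical) design system components.
TRANSLATION_MAP = {
    "bg-blue-500 text-white rounded px-4 py-2": '<Button appearance="primary">',
    "border rounded-lg shadow p-4": "<Card>",
    "text-xs text-gray-500": '<Text size="small" color="subtlest">',
}

def system_prompt(mapping: dict[str, str]) -> str:
    """Build an instruction that anchors the model in Tailwind, then maps out."""
    lines = [
        "Step 1: reason about layout and styling in plain Tailwind CSS.",
        "Step 2: translate every Tailwind pattern into our design system components:",
    ]
    lines += [f"  {tailwind}  ->  {component}" for tailwind, component in mapping.items()]
    lines.append("Never emit raw Tailwind classes in the final code.")
    return "\n".join(lines)
```

The two-step framing lets the model lean on abundant Tailwind training data while the explicit map keeps the final output in the house component vocabulary.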
When a lab report screenshot included a dismissive note about "hemolysis," both human doctors and a vision-enabled AI made the same mistake of ignoring a critical data point. This highlights how AI can inherit human biases embedded in data presentation, underscoring the need to test models with varied information formats.
Many AI tools expose the model's reasoning before generating an answer. Reading this internal monologue is a powerful debugging technique. It reveals how the AI is interpreting your instructions, allowing you to quickly identify misunderstandings and improve the clarity of your prompts for better results.
To avoid generic, creatively lazy AI output ("slop"), Atlassian's Sharif Mansour injects three key ingredients: the team's unique "taste" (style/opinion), specific organizational "knowledge" (data and context), and structured "workflow" (deployment in a process). This moves beyond simple prompting to create differentiated results.
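The three ingredients can be made concrete as a prompt-composition helper. The function name, section labels, and example values (including the ticket number) are my own illustrative framing, not Mansour's:

```python
def differentiated_prompt(task: str, taste: str, knowledge: str, workflow: str) -> str:
    """Fold taste, knowledge, and workflow context into a single prompt."""
    return "\n\n".join([
        f"Task: {task}",
        f"Taste (our style and opinions): {taste}",
        f"Knowledge (our data and context): {knowledge}",
        f"Workflow (where this output lands): {workflow}",
    ])

prompt = differentiated_prompt(
    task="Draft release notes for the 2.4 update.",
    taste="Dry, confident tone; no exclamation marks; lead with user impact.",
    knowledge="2.4 ships offline mode and fixes the long-standing sync-conflict bug.",
    workflow="Output is pasted into the changelog page and reviewed before publish.",
)
```

Omitting any one section tends to regress the output toward generic "slop": taste without knowledge is stylish but wrong, knowledge without workflow is accurate but unusable.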
A practical AI workflow for product teams is to screenshot their current application and prompt an AI to clone it with modifications. This allows for rapid visualization of new features and UI changes, creating an efficient feedback loop for product development.
A significant real-world challenge is that users have different mental models for the same visual concept (e.g., does "hand" include the arm?). Fine-tuning is therefore not just for learning new objects, but for aligning the model's understanding with a specific user's or domain's unique definition.
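Because "hand" means different things to different users, the fine-tuning set must consistently encode the target definition. A minimal sketch; the record schema and file paths are assumptions, not any specific vendor's fine-tuning format:

```python
import json

# Tuning pairs that pin down one team's (radiology) definition of "hand":
# wrist and forearm excluded, consistently across every example.
radiology_examples = [
    {"image": "scans/hand_01.png",
     "prompt": "Outline the hand.",
     "completion": "Region covers carpals through fingertips; wrist and forearm excluded."},
    {"image": "scans/hand_02.png",
     "prompt": "Outline the hand.",
     "completion": "Region covers carpals through fingertips; wrist and forearm excluded."},
]

jsonl = "\n".join(json.dumps(example) for example in radiology_examples)
```

A gesture-recognition team would ship the same prompt with the opposite convention; the value of fine-tuning here is the consistency, not the vocabulary.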
To avoid generic, "purple AI slop" UIs, create a custom design system for your AI tool. Use "reverse prompting": feed an LLM like ChatGPT screenshots of a target app (e.g., Uber) and ask it to extrapolate the foundational design system (colors, typography). Use this output as a custom instruction.
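A sketch of packaging the reverse-prompted output into a reusable custom instruction. The token names and values below are placeholders for what the LLM would actually extrapolate from the screenshots, not any app's real tokens:

```python
# Hypothetical design tokens extrapolated by the LLM from app screenshots.
EXTRAPOLATED_TOKENS = {
    "color.background": "#000000",
    "color.accent": "#276EF1",
    "font.family": "'UberMove', sans-serif",
    "radius.button": "8px",
}

def custom_instruction(tokens: dict[str, str]) -> str:
    """Render extrapolated tokens as a standing instruction for UI generation."""
    rules = "\n".join(f"- {name}: {value}" for name, value in tokens.items())
    return ("When generating UI, always apply these design tokens instead of "
            "generic defaults:\n" + rules)
```

Saved as a custom instruction, this makes every subsequent generation start from the target aesthetic rather than the model's purple defaults.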
Instead of generating UIs from scratch, Atlassian provides AI tools with a pre-coded template containing complex elements like navigation. The AI is much better at modifying existing code than creating complex layouts from nothing, reducing the error rate for navigation elements from 50% to nearly zero.
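The template-first pattern amounts to framing the task as editing working code rather than generating a layout. A minimal sketch; the HTML snippet and prompt wording are illustrative, not Atlassian's actual template:

```python
# A pre-coded navigation fragment the AI receives as its starting point.
NAV_TEMPLATE = """<nav class="sidebar">
  <a href="/home">Home</a>
  <a href="/projects">Projects</a>
</nav>"""

def modify_prompt(template: str, change: str) -> str:
    """Frame the task as a targeted edit, not generation from scratch."""
    return (
        "Below is working navigation code. Apply only the requested change; "
        "do not regenerate the layout from scratch.\n\n"
        f"<existing-code>\n{template}\n</existing-code>\n\n"
        f"Requested change: {change}"
    )
```

Usage: `modify_prompt(NAV_TEMPLATE, "Add a Settings link after Projects")`. The model's job shrinks from inventing a complex structure to making one delta against code that already works, which is where the reported error-rate drop comes from.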