The key to reliable AI-powered user research is not novel prompting, but structuring AI tasks to mirror the methodical steps of a human researcher. This means sequential analysis, verification, and synthesis, which makes the AI less likely to jump to conclusions or hallucinate.
Despite the hype, AI-moderated user interviews are not yet a reliable tool. Even Anthropic, the creator of Claude, ran a study with its own AI moderation tool that produced unimpressive, low-quality questions, highlighting the immaturity of the technology.
Instead of a single massive prompt, first feed the AI a "context-only" prompt with background information and instruct it not to analyze. Then, provide a second prompt with the analysis task. This two-step process helps the LLM focus and yields more thorough results.
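
The two-step pattern above can be sketched as a small helper that builds the two prompts separately. This is a minimal illustration, not a prescribed wording: the function name `build_two_step_prompts`, the acknowledgement phrase, and the sample background are all assumptions. Sending the prompts to a model is left out on purpose, since that depends on the tool you use.

```python
from typing import Tuple


def build_two_step_prompts(background: str, task: str) -> Tuple[str, str]:
    """Return (context_prompt, analysis_prompt) to send as two separate turns."""
    # Turn 1: background only, with an explicit instruction to hold off on analysis.
    context_prompt = (
        "Below is background for a research analysis I will ask for next.\n"
        "Read it, but do not analyze anything yet. "
        "Reply only with 'Context received.'\n\n"
        f"{background.strip()}"
    )
    # Turn 2: the actual task, sent only after the model has acknowledged turn 1.
    analysis_prompt = (
        "Using the background from my previous message, "
        f"{task.strip()}"
    )
    return context_prompt, analysis_prompt


# Hypothetical sample inputs, for illustration only.
context_prompt, analysis_prompt = build_two_step_prompts(
    background="Product: a budgeting app. Audience: freelancers.",
    task="identify the main reasons participants gave for cancelling.",
)
print(context_prompt)
print("---")
print(analysis_prompt)
```

Keeping the task out of the first turn is the point of the design: the model has nothing to analyze until the second prompt arrives.
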
Different LLMs excel at different research tasks. Caitlin Sullivan prefers Claude because its analysis is thorough and nuanced by default. However, she notes that Gemini is better for quickly identifying the most frequent themes that are solidly evidenced in the data.
After an initial analysis, use a "stress-testing" prompt that forces the LLM to verify its own findings, check for contradictions, and correct its mistakes. This verification step is crucial for building confidence in the AI's output and producing insights that hold up under scrutiny.
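
One way to make the stress-testing step repeatable is to generate the follow-up prompt from the list of findings. The sketch below is an assumption about how such a prompt might be worded; the KEEP / REVISE / DROP verdicts and the name `build_stress_test_prompt` are illustrative and do not come from the source.

```python
from typing import List


def build_stress_test_prompt(findings: List[str]) -> str:
    """Build a follow-up prompt asking the model to audit its own findings."""
    # Number the findings so the model can refer to each one unambiguously.
    numbered = "\n".join(
        f"{i}. {finding.strip()}" for i, finding in enumerate(findings, start=1)
    )
    return (
        "Re-check each finding below against the source data.\n"
        "For every finding:\n"
        "- cite the supporting evidence, or say that none exists;\n"
        "- list any responses that contradict it;\n"
        "- mark it KEEP, REVISE, or DROP, and correct it if needed.\n\n"
        f"Findings:\n{numbered}"
    )


# Hypothetical findings from a first-pass analysis.
stress_prompt = build_stress_test_prompt(
    ["Users cancel mainly because of price.", "Onboarding is rarely mentioned."]
)
print(stress_prompt)
```
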
For churn surveys, generic sentiment analysis is unhelpful as most responses will be negative. Instead, instruct the AI to use a multi-level "intensity rating" (e.g., 'soft exit,' 'frustrated,' 'angry'). This provides a much clearer signal for product teams to prioritize fixes.
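
As a rough sketch of the intensity-rating idea, the code below builds a labelling prompt from the three example levels named above and tallies the labels a model returns. The function names, the prompt wording, and the "unrecognized" bucket are assumptions added for illustration.

```python
from collections import Counter
from typing import Dict, Iterable

# The three example levels from the text; extend or rename to fit your survey.
INTENSITY_LEVELS = ("soft exit", "frustrated", "angry")


def build_intensity_prompt(response: str) -> str:
    """Ask for exactly one intensity label for a single churn response."""
    levels = ", ".join(f"'{level}'" for level in INTENSITY_LEVELS)
    return (
        f"Rate the intensity of this churn response as one of: {levels}.\n"
        "Answer with the label only.\n\n"
        f"Response: {response.strip()}"
    )


def tally_intensity(labels: Iterable[str]) -> Dict[str, int]:
    """Count labels, routing anything outside the scale to 'unrecognized'."""
    counts: Counter = Counter()
    for raw in labels:
        label = raw.strip().lower()
        if label not in INTENSITY_LEVELS:
            label = "unrecognized"
        counts[label] += 1
    return dict(counts)


# Hypothetical model outputs, including one label that is off the scale.
intensity_counts = tally_intensity(["Angry", " frustrated", "angry", "meh"])
print(intensity_counts)
```

Counting off-scale labels separately, instead of dropping them, makes it visible when the model is not following the rating scheme.
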
Don't ask an AI to immediately find themes in open-ended survey responses. First, instruct it to perform "inductive coding"—creating and applying labels to each response based on the data itself. This structured first step ensures a more rigorous and accurate final analysis.
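
The coding-before-themes order can be sketched in two parts: a prompt that asks for labels on a single response, and a local step that counts how many responses carry each label before any themes are proposed. The names `build_coding_prompt` and `code_frequencies`, the prompt wording, and the sample codes are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple


def build_coding_prompt(response: str) -> str:
    """Ask for data-derived labels on one response, with no themes yet."""
    return (
        "Apply inductive coding to the survey response below.\n"
        "Create short labels based only on what the response says.\n"
        "Do not group labels into themes yet.\n"
        "Return the labels as a comma-separated list.\n\n"
        f"Response: {response.strip()}"
    )


def code_frequencies(coded: Dict[str, List[str]]) -> List[Tuple[str, int]]:
    """Count how many responses carry each code, most frequent first."""
    counts: Counter = Counter()
    for codes in coded.values():
        # A code counts once per response, even if the model repeated it.
        counts.update({code.strip().lower() for code in codes})
    # Sort by descending count, then alphabetically so ties are deterministic.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical codes returned by the model, keyed by response id.
coded = {
    "r1": ["Pricing", "onboarding"],
    "r2": ["pricing"],
    "r3": ["pricing", "pricing", "support"],
}
frequencies = code_frequencies(coded)
print(frequencies)
```

The frequency table can then be handed back to the model as the basis for the theme-finding step.
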
Large transcript files often hit LLM token limits. Converting them into structured markdown files not only helps work around this limit but can also improve the model's analytical accuracy, because the structure helps the AI handle the data more effectively than a raw text transcript.
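
A minimal sketch of such a conversion follows. It assumes the raw transcript uses one `Speaker: utterance` line per turn, which is an assumption about the input and not something the source specifies; real transcripts with timestamps or other layouts would need a different pattern. The heading layout is likewise only one possible choice.

```python
import re
from typing import List, Tuple

# Assumed input format: "Speaker name: what they said" at the start of a line.
TURN_PATTERN = re.compile(r"^([A-Za-z][\w .'-]{0,40}):\s*(.+)$")


def parse_turns(raw: str) -> List[Tuple[str, str]]:
    """Split a raw transcript into (speaker, text) turns."""
    turns: List[Tuple[str, str]] = []
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue
        match = TURN_PATTERN.match(line)
        if match:
            turns.append((match.group(1), match.group(2)))
        elif turns:
            # A line without a speaker prefix continues the previous turn.
            speaker, text = turns[-1]
            turns[-1] = (speaker, f"{text} {line}")
    return turns


def to_markdown(raw: str, title: str) -> str:
    """Render the transcript as markdown with one heading per turn."""
    parts = [f"# {title}", ""]
    for index, (speaker, text) in enumerate(parse_turns(raw), start=1):
        parts.extend([f"## Turn {index}: {speaker}", "", text, ""])
    return "\n".join(parts).rstrip() + "\n"


# Hypothetical transcript snippet.
raw_transcript = (
    "Interviewer: Why did you cancel?\n"
    "User: Price went up.\n"
    "Also the app got slower.\n"
)
markdown = to_markdown(raw_transcript, "Interview 1")
print(markdown)
```

One known limitation of this sketch: a continuation line that itself contains a short phrase followed by a colon would be misread as a new speaker.
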
Don't assume an LLM understands research terminology the way you do. Different models interpret concepts like "quote" differently. Provide clear definitions, rules, and examples for terms like "value anchors" or "fragile points" to ensure the AI's analysis aligns with your methodology.
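
One way to keep definitions, rules, and examples consistent across prompts is to store them as data and render them into a glossary block. The sketch below is illustrative: the `Term` structure and the sample definition of "quote" are assumptions, not the source's own definitions, and terms such as "value anchors" would need definitions from your own methodology.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Term:
    name: str
    definition: str
    rule: str
    example: str


def render_glossary(terms: List[Term]) -> str:
    """Render terms as a plain-text block to place at the top of a prompt."""
    sections = [
        f"Term: {term.name}\n"
        f"Definition: {term.definition}\n"
        f"Rule: {term.rule}\n"
        f"Example: {term.example}"
        for term in terms
    ]
    return "Use these definitions exactly:\n\n" + "\n\n".join(sections)


# Illustrative definition only; replace with your own methodology's wording.
glossary = render_glossary(
    [
        Term(
            name="quote",
            definition="The participant's exact words, copied verbatim.",
            rule="Never paraphrase, merge, or shorten a quote.",
            example='"I stopped opening the app after the price change."',
        )
    ]
)
print(glossary)
```
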
Instead of running analyses sequentially, set up AI agents (e.g., in Claude Code) with pre-programmed workflows for different data types. You can then trigger a survey analysis and an interview analysis at the same time, so the total wait is roughly the length of the longer run instead of the sum of both. When the two runs take similar time, this cuts total analysis time nearly in half.
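
The timing argument can be illustrated with a generic Python sketch that runs two independent jobs concurrently. This is not how Claude Code agents are configured; it is a plain standard-library analogue, and `analyze_survey`, `analyze_interviews`, and the file names are hypothetical stand-ins for whatever workflows you have set up.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial
from typing import Callable, Dict


def analyze_survey(path: str) -> str:
    # Hypothetical stand-in for a pre-programmed survey workflow.
    return f"survey analysis of {path}"


def analyze_interviews(path: str) -> str:
    # Hypothetical stand-in for a pre-programmed interview workflow.
    return f"interview analysis of {path}"


def run_in_parallel(jobs: Dict[str, Callable[[], str]]) -> Dict[str, str]:
    """Start every job at once and collect the results by name."""
    with ThreadPoolExecutor(max_workers=max(1, len(jobs))) as pool:
        futures = {name: pool.submit(job) for name, job in jobs.items()}
        # result() blocks until each job finishes, so the total wait is
        # about as long as the slowest job, not the sum of all jobs.
        return {name: future.result() for name, future in futures.items()}


results = run_in_parallel(
    {
        "survey": partial(analyze_survey, "churn_survey.csv"),
        "interviews": partial(analyze_interviews, "interviews.md"),
    }
)
print(results)
```
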
