Don't assume an LLM understands research terminology the way you do. Different models interpret concepts like "quote" differently. Provide clear definitions, rules, and examples for terms like "value anchors" or "fragile points" to ensure the AI's analysis aligns with your methodology.
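For instance, a term codebook can be written directly into the system prompt before any data is shared. The sketch below is illustrative only; the definitions of "quote", "value anchor", and "fragile point" are placeholders you would replace with your own methodology's wording.

```python
# A minimal sketch of a system prompt that defines project-specific research
# terms before any analysis is requested. The definitions are illustrative
# placeholders, not an actual research codebook.
TERM_DEFINITIONS = {
    "quote": "A verbatim passage copied word-for-word from the transcript, "
             "enclosed in quotation marks and attributed to a participant.",
    "value anchor": "A statement in which a participant explains, in their own "
                    "words, why an outcome matters to them.",
    "fragile point": "A moment where a participant expresses doubt, confusion, "
                     "or resorts to a workaround.",
}

def build_system_prompt(definitions: dict[str, str]) -> str:
    """Turn a term-definition codebook into an explicit instruction block."""
    lines = ["You are assisting with qualitative research analysis.",
             "Use the following definitions exactly as written:"]
    for term, definition in definitions.items():
        lines.append(f'- "{term}": {definition}')
    lines.append("If a term is ambiguous, ask for clarification instead of guessing.")
    return "\n".join(lines)

print(build_system_prompt(TERM_DEFINITIONS))
```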
The word "evals" has been stretched to mean many different things: expert-written error analysis, PM-defined test cases, performance benchmarks, and LLM-based judges. This "semantic diffusion" causes confusion. Teams need to be specific about what part of the feedback loop they're discussing instead of using the generic term.
AI expert Andrej Karpathy suggests treating LLMs as simulators, not entities. Instead of asking, "What do you think?", ask, "What would a group of [relevant experts] say?". This elicits a wider range of simulated perspectives and avoids the biases inherent in forcing the LLM to adopt a single, artificial persona.
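One lightweight way to apply this is a small helper that rewrites a direct-opinion question as a panel simulation. The expert roles and the question below are invented for illustration.

```python
# Illustrative only: reframing "What do you think?" as a simulated panel
# discussion, following the "simulator, not entity" framing described above.
def panel_prompt(question: str, experts: list[str]) -> str:
    roles = ", ".join(experts)
    return (
        f"Simulate a discussion among {roles}.\n"
        f"Question: {question}\n"
        "Present each expert's perspective separately, including points of "
        "disagreement, rather than a single consensus answer."
    )

print(panel_prompt(
    "Is this onboarding flow likely to confuse first-time users?",
    ["a UX researcher", "an accessibility specialist", "a support lead"],
))
```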
LLMs analyzing unstructured data such as interview transcripts often hallucinate compelling but non-existent quotes. To maintain integrity, always include a specific prompt instruction like "use quotes and cite your sources from the transcript for each quote." This forces the AI to ground its analysis in actual data.
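Below is a hedged sketch of what that grounding instruction can look like in practice; the transcript snippet and timestamp format are invented for the example.

```python
# Illustrative grounding instruction for transcript analysis. The transcript
# and its timestamp scheme are made up for this example.
transcript = """\
[00:03:12] P4: I gave up on the export button because it never told me if it worked.
[00:05:48] P4: Honestly, I just email the file to myself instead.
"""

analysis_prompt = (
    "Analyze the transcript below for usability problems.\n"
    "Rules:\n"
    "- Use quotes and cite your sources from the transcript for each quote.\n"
    "- Copy quotes verbatim, including the timestamp, e.g. [00:03:12].\n"
    "- If no supporting quote exists, say so explicitly instead of inventing one.\n\n"
    f"Transcript:\n{transcript}"
)
print(analysis_prompt)
```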
Anthropic suggests that because LLMs are trained on text about AI, they respond to field-specific terms from that literature. Phrases like "Think step by step" or "Critique your own response" act as a cheat code, activating more sophisticated, accurate, and self-correcting operational modes in the model.
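If you want to try this, the phrases can simply be appended to an otherwise ordinary task prompt, as in this minimal, illustrative sketch (the task itself is made up):

```python
# Minimal illustration: appending the field-specific phrases mentioned above
# to an ordinary prompt. The base task is an invented example.
BASE_TASK = "Summarize the three most common complaints in this feedback batch."

REASONING_CUES = [
    "Think step by step before giving your final answer.",
    "Critique your own response and revise it once before finishing.",
]

prompt = BASE_TASK + "\n\n" + "\n".join(REASONING_CUES)
print(prompt)
```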
To get consistent results from AI, use the "3 C's" framework: Clarity (the AI's role and your goal), Context (the bigger business picture), and Cues (supporting documents like brand guides). Most users fail by not providing enough cues.
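One way to make the framework habitual is a small template that forces you to fill in all three parts before sending anything. The field values below are placeholders, not a real brief.

```python
# A sketch of a reusable prompt template organized around the "3 C's".
# The field names mirror the framework; the example values are placeholders.
from dataclasses import dataclass

@dataclass
class PromptBrief:
    clarity: str   # the AI's role and your goal
    context: str   # the bigger business picture
    cues: str      # supporting documents, e.g. brand or style guides

    def render(self) -> str:
        return (
            f"Role and goal: {self.clarity}\n\n"
            f"Business context: {self.context}\n\n"
            f"Reference material (follow closely):\n{self.cues}"
        )

brief = PromptBrief(
    clarity="You are a UX writer. Draft three variants of an error message for a failed upload.",
    context="We are repositioning the product for non-technical small-business owners.",
    cues="Brand voice guide: plain language, no jargon, always suggest the next step.",
)
print(brief.render())
```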
Many AI tools expose the model's reasoning before generating an answer. Reading this internal monologue is a powerful debugging technique. It reveals how the AI is interpreting your instructions, allowing you to quickly identify misunderstandings and improve the clarity of your prompts for better results.
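Not every tool exposes this monologue directly. A provider-agnostic fallback, sketched below, is to ask the model to wrap its working in a tag you can inspect while debugging; the tag names and parsing are an assumption, not any vendor's reasoning API.

```python
# One provider-agnostic way to surface the model's working: ask it to wrap its
# reasoning in a tag, then inspect that block while debugging prompts.
import re

DEBUG_INSTRUCTION = (
    "Before your answer, write your reasoning inside <reasoning>...</reasoning>, "
    "then give the final answer inside <answer>...</answer>."
)

def split_reasoning(response_text: str) -> tuple[str, str]:
    """Separate the reasoning block from the final answer for inspection."""
    reasoning = re.search(r"<reasoning>(.*?)</reasoning>", response_text, re.S)
    answer = re.search(r"<answer>(.*?)</answer>", response_text, re.S)
    return (
        reasoning.group(1).strip() if reasoning else "",
        answer.group(1).strip() if answer else response_text.strip(),
    )
```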
The key to reliable AI-powered user research is not novel prompting, but structuring AI tasks to mirror the methodical steps of a human researcher. This involves sequential analysis, verification, and synthesis, which prevents the AI from jumping to conclusions and hallucinating.
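A rough sketch of that structure as three ordered calls: observe, verify against the source, then synthesize. Here `call_llm` is a placeholder for whichever chat-completion client you use, and the prompt wording is illustrative.

```python
# A sketch of mirroring a researcher's workflow as separate, ordered LLM calls.
# `call_llm` is a placeholder; swap in your provider's chat-completion call.
def call_llm(messages: list[dict]) -> str:
    raise NotImplementedError("Replace with your provider's chat-completion call.")

def analyze_transcript(transcript: str) -> str:
    # Step 1 (sequential analysis): extract observations only, no conclusions yet.
    observations = call_llm([{
        "role": "user",
        "content": "List observations from this transcript as bullet points. "
                   f"Do not draw conclusions yet.\n\n{transcript}",
    }])
    # Step 2 (verification): check each observation against the source text.
    verified = call_llm([{
        "role": "user",
        "content": "For each observation, confirm it is supported by a verbatim quote "
                   f"in the transcript; drop any that are not.\n\nTranscript:\n{transcript}\n\n"
                   f"Observations:\n{observations}",
    }])
    # Step 3 (synthesis): only now form themes from the verified observations.
    return call_llm([{
        "role": "user",
        "content": f"Synthesize the verified observations into themes:\n{verified}",
    }])
```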
AI lacks the implicit context humans share. Like a genie granting a wish for "taller" by making you 13 feet tall, AI will interpret vague prompts literally and produce dysfunctional results. Success requires extreme specificity and clarity in your requests because the AI doesn't know what you "mean."
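A before-and-after makes the point concrete; both prompts below are invented examples, not taken from the source.

```python
# Illustrative contrast between a "genie" prompt and a specific one.
vague_prompt = "Make this survey better."

specific_prompt = (
    "Rewrite this 8-question onboarding survey so that:\n"
    "- every question is answerable in under 15 seconds,\n"
    "- no question asks about more than one thing,\n"
    "- the reading level is roughly 8th grade,\n"
    "- the original intent of each question is preserved.\n"
    "Return the revised questions plus one sentence explaining each change."
)
```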
Hunt's team at Perscient found that AI "hallucinates" when given freedom. Success comes from "context engineering"—controlling all inputs, defining the analytical framework, and telling it how to think. You must treat AI like a constrained operating system, not a creative partner.
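In code terms, that means the framework, the permitted data, and the reasoning rules all travel inside the prompt, as in this illustrative sketch (the categories are placeholders, not Perscient's actual framework).

```python
# A hedged sketch of "context engineering": every input, the analytical
# framework, and the expected reasoning rules are supplied up front.
FRAMEWORK = """\
Analytical framework (use only these categories):
1. Task friction
2. Trust and confidence
3. Workarounds

For each category: cite verbatim evidence, or write "no evidence found".
Do not introduce categories, data, or quotes beyond what is provided below."""

def constrained_prompt(framework: str, data: str) -> str:
    return f"{framework}\n\n--- DATA (the only permitted source) ---\n{data}"
```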
Instead of a single massive prompt, first feed the AI a "context-only" prompt with background information and instruct it not to analyze. Then, provide a second prompt with the analysis task. This two-step process helps the LLM focus and yields more thorough results.
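Here is a minimal sketch of the two-step flow, assuming a chat-completion client that accepts the full message history; `call_llm` is a placeholder you would swap for your provider's API.

```python
# A minimal sketch of the two-step flow: a context-only turn that forbids
# analysis, followed by the actual analysis turn.
def call_llm(messages: list[dict]) -> str:
    raise NotImplementedError("Replace with your provider's chat-completion call.")

def two_step_analysis(background: str, transcript: str) -> str:
    # First turn: context only, with an explicit instruction not to analyze.
    messages = [{
        "role": "user",
        "content": "Here is background on the study (goals, participants, product). "
                   "Read it and reply only with 'Understood'. Do not analyze anything yet.\n\n"
                   + background,
    }]
    messages.append({"role": "assistant", "content": call_llm(messages)})

    # Second turn: the analysis task, with the context already in the history.
    messages.append({
        "role": "user",
        "content": "Now analyze this transcript for recurring pain points, "
                   "citing verbatim quotes.\n\n" + transcript,
    })
    return call_llm(messages)
```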