A one-size-fits-all AI voice fails. For a Japanese healthcare client, ElevenLabs' agent used quick, short responses for younger callers but a calmer, slower style for older callers. Personalizing the delivery, not just the content, to each caller's demographic context was critical to success.
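As a rough illustration of the idea above, delivery can be treated as a setting chosen per caller, separate from what the agent says. The sketch below is not ElevenLabs' implementation: the `DeliveryProfile` fields, the speed values, and the age cutoff are all hypothetical, and it assumes the caller's age is already known from the client's records.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DeliveryProfile:
    tone: str             # label handed to the voice layer (hypothetical)
    speaking_rate: float  # 1.0 = baseline speed (hypothetical scale)
    max_sentences: int    # cap on reply length per turn


# Hypothetical profiles: brisk and short vs. calm and slower.
BRISK = DeliveryProfile(tone="brisk", speaking_rate=1.15, max_sentences=2)
CALM = DeliveryProfile(tone="calm", speaking_rate=0.85, max_sentences=4)

OLDER_CALLER_AGE = 65  # hypothetical cutoff, not from the source


def choose_profile(age: Optional[int]) -> DeliveryProfile:
    """Pick a delivery style from the caller's age; unknown age gets BRISK."""
    if age is not None and age >= OLDER_CALLER_AGE:
        return CALM
    return BRISK
```

Calling `choose_profile(72)` returns the calm profile, while `choose_profile(25)` returns the brisk one. In practice the cutoff and the fallback for unknown ages would need to be tuned with the client.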

Related Insights

Don't unleash a generic AI agent on your entire database. To get high response rates, segment contacts into specific sub-personas based on role, behavior, or status (e.g., churn risk). Then train a dedicated sub-agent or campaign for each persona and run outreach in batches of around 1,000 contacts, which allows true personalization at scale.
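The segment-then-batch workflow described above can be sketched in a few lines. This is a minimal sketch under stated assumptions: contacts are plain dictionaries, the `role` and `churn_risk` fields and the rule in `persona_for` are hypothetical, and the batch size of 1,000 simply mirrors the figure in the text.

```python
from collections import defaultdict
from itertools import islice

BATCH_SIZE = 1000  # "around 1,000 contacts" per the text; adjust as needed


def persona_for(contact: dict) -> str:
    """Hypothetical rule: status wins over role, with a catch-all fallback."""
    if contact.get("churn_risk"):
        return "churn_risk"
    return contact.get("role") or "general"


def segment(contacts):
    """Group contacts by sub-persona."""
    groups = defaultdict(list)
    for contact in contacts:
        groups[persona_for(contact)].append(contact)
    return dict(groups)


def batches(items, size=BATCH_SIZE):
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while chunk := list(islice(it, size)):
        yield chunk
```

Each persona's list would then be handed, one batch at a time, to the sub-agent or campaign built for that persona.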

An AI tool that prompts call center agents on conversational dynamics—when to listen, show excitement, or pause—dramatically reduces customer conflict. This shows that managing the non-verbal pattern of interaction is often more effective for de-escalation than focusing solely on the words in a script.

Don't worry if customers know they're talking to an AI. As long as the agent is helpful, provides value, and creates a smooth experience, people don't mind. In many cases, a responsive, value-adding AI is preferable to a slow or mediocre human interaction. The focus should be on quality of service, not on hiding the AI.

While many pursue human-indistinguishable AI, ElevenLabs' CEO argues this misses the point for use cases like customer support. Users prioritize fast, accurate resolutions over a perfectly "human" interaction, making the uncanny valley a secondary concern to core functionality.

The magic of ChatGPT's voice mode in a car is that it feels like another person in the conversation. Conversely, Meta's AI glasses failed when translating a menu because they acted like a screen reader, ignoring the human context of how people actually read menus. Context is everything for voice.

The most significant near-term impact of voice AI will be in call centers. Rather than simply replacing agents, the technology will first elevate their effectiveness and productivity. Concurrently, voice bots will handle initial queries, solving the common pain point of long wait times and improving overall customer experience.

A common objection to voice AI is its robotic nature. However, current tools can clone voices, replicate human intonation and cadence, and even use slang. The speaker claims that 97% of people outside the AI industry cannot tell the difference, making it a viable front-line tool for customer interaction.

To avoid robotic content, use “humanization prompting.” This involves uploading transcripts of your natural speech (from interviews or voice notes) to a custom GPT’s knowledge base, training it to adopt your unique cadence, vocabulary, and style.

Enterprise customers often don't know how to choose a "good" voice, so ElevenLabs created the role of a "voice sommelier." This expert voice coach works with clients to find the right voice for their brand and use case, effectively productizing the subjective process of voice selection and turning it into a sales asset.

Despite the focus on text interfaces, voice is the most effective entry point for AI into the enterprise. Because every company already has voice-based workflows (phone calls), AI voice agents can be inserted seamlessly to automate tasks. This use case is scaling faster than passive "scribe" tools.

AI Voice Agents Must Adapt Tone and Pace to User Demographics to Be Effective | RiffOn