Observing that younger generations prefer consuming information via video (TikTok) and communicating via voice, Superhuman's CTO predicts a fundamental shift in user experience. Future interfaces, including email, will likely become more conversational and audio-based rather than relying on typing and reading.

Related Insights

While most attention goes to human-to-computer interaction, Crisp.ai's founder argues that significant unsolved challenges and opportunities exist in using AI to improve human-to-human communication. This includes real-time enhancements like making a speaker's audio sound studio-quality with a single click, which directly boosts conversation productivity.

Contrary to the focus on professional use cases, OpenAI's largest study shows that 46% of messages from adult consumer users come from the 18-25 age group. This points to the emergence of an "AI native" generation whose approach to work and education will be fundamentally different.

For professionals who find phone calls demanding and texting too superficial for relationship building, voice memos offer an effective middle ground. This asynchronous communication method allows for the nuance and personality of voice, fostering a deeper connection without the pressure of a real-time conversation.

The magic of ChatGPT's voice mode in a car is that it feels like another person in the conversation. Conversely, Meta's AI glasses failed when translating a menu because they acted like a screen reader, ignoring the human context of how people actually read menus. Context is everything for voice.

Despite the focus on text interfaces, voice is the most effective entry point for AI into the enterprise. Because every company already has voice-based workflows (phone calls), AI voice agents can be inserted seamlessly to automate tasks. This use case is scaling faster than passive "scribe" tools.