Modern communication (texting, social media) filters out crucial non-verbal information like tone, pacing, and emotional presence. As a result, word-based interaction has 'hypertrophied' while society loses the high-resolution data that prevents misunderstanding and fosters genuine connection.
The internet's evolution from social networking (connecting with friends) to social media (broadcasting to followers) destroyed a valuable product category. This shift replaced genuine intimacy with performance, contributing to a global rise in loneliness and isolation as people stare at screens instead of connecting.
Observing that younger generations prefer consuming information via video (TikTok) and communicating via voice, Superhuman's CTO predicts a fundamental shift in user experience. Future interfaces, including email, will likely become more conversational and audio-based rather than relying on typing and reading.
Face-to-face contact provides a rich stream of non-verbal cues (tone, expression, body language) that our brains use to build empathy. Digital platforms strip these away, impairing our ability to connect and to understand others' emotions, and potentially fostering undue hostility and aggression online.
One view holds that language is a lossy, discrete abstraction of reality, whereas pixels (visual input) are a more fundamental representation. We perceive language physically, as pixels on a page or as sound waves, and tokenizing it discards rich information like font, layout, and visual context.
Focusing solely on making communication faster or shorter is a mistake. Communication ultimately fails if the recipient doesn't interpret the message as the sender intended. The true goal is shared understanding, which means accounting for the recipient's personal context and perspective rather than just transmitting data efficiently.
Human communication is returning to its oral and visual roots. Text, a low-dimensional medium, was a temporary necessity for scalable knowledge storage—a 'parenthesis' in history. As AI makes creating rich media as easy as writing, society will default back to more natural, higher-bandwidth formats like audio and video.
For professionals who find phone calls demanding and texting too superficial for relationship building, voice memos offer an effective middle ground. This asynchronous communication method allows for the nuance and personality of voice, fostering a deeper connection without the pressure of a real-time conversation.
Sam Altman's verbal response to a question about OpenAI's finances was reasonable, but his negative body language and audible sigh—perceptible only on video—completely changed the message's reception. This highlights how non-verbal cues in video interviews can undermine a leader's intended message, a critical lesson in the age of multimedia communication.
In virtual settings, the lack of physical presence causes people to "over-index" on the few non-verbal cues available, like facial expressions. A leader's innocuous action, such as rubbing their face, can be misinterpreted as negativity. Leaders must be hyper-aware that their virtual body language is under a microscope.
A significant, yet invisible, cause of digital exhaustion is the constant mental work required to interpret communications lacking non-verbal cues. Our brains work overtime to decode the meaning behind a brief email or emoji, consuming vast cognitive resources and leading to depletion.