I recently participated in Vapi AI's #Build challenge, where the task was to develop voice-enabled solutions that were both creative and grounded in real-world applications. I built a crisis helpline called "Lifeline", powered by automated voice agents.
Crisis helplines currently struggle to operate 24/7 because of high call volumes, long wait times, and staff burnout driven by the emotional demands of the work. The potential real-world impact of a solution to this problem could be transformative for crisis intervention.
Lifeline worked like a typical hotline and followed all the necessary protocols, but it also offered risk assessment, de-escalation, and opt-in SMS check-backs with clinically vetted resources tailored to the situation (suicide, domestic violence, etc.). From a technical standpoint, it didn't take long to build, but it did require carefully crafted prompts given the sensitive use cases; a rough sketch of the setup is below. (You can try out the demo here – let me know what you think!)
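For the curious, most of the build is configuration plus prompt design. Here is a minimal sketch of how an assistant like Kylie could be defined against Vapi's assistant-creation API. The endpoint shape and field names are assumptions for illustration rather than my exact payload, and the system prompt is heavily abridged; a real one needs far more care and clinical review.

```python
import os
import requests

# Illustrative sketch only: create a voice assistant via Vapi's REST API.
# NOTE: the endpoint path and field names here are assumptions; check the
# current Vapi API reference before relying on them.
VAPI_API_KEY = os.environ["VAPI_API_KEY"]

system_prompt = """You are Kylie, a calm, empathetic crisis-helpline volunteer.
- Listen first; never rush the caller.
- Assess immediate risk with gentle, direct questions.
- De-escalate using grounding techniques; avoid judgement or false promises.
- If the caller consents, offer an SMS follow-up with vetted resources.
- If there is imminent danger, encourage contacting local emergency services.
"""  # Abridged for illustration.

assistant = {
    "name": "Lifeline - Kylie",
    "firstMessage": "Hi, you've reached Lifeline. I'm Kylie. What's going on for you tonight?",
    "model": {
        "provider": "openai",
        "model": "gpt-4o",
        "messages": [{"role": "system", "content": system_prompt}],
        "temperature": 0.4,  # keep responses steady and predictable
    },
    "voice": {
        "provider": "11labs",      # any supported TTS provider
        "voiceId": "YOUR_VOICE_ID",  # placeholder
    },
}

resp = requests.post(
    "https://api.vapi.ai/assistant",
    headers={"Authorization": f"Bearer {VAPI_API_KEY}"},
    json=assistant,
    timeout=30,
)
resp.raise_for_status()
print("Created assistant:", resp.json().get("id"))
```

The SMS check-backs sat outside this configuration, triggered after the call only if the caller opted in.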
"Kylie", one of the Lifeline agents, not only displayed various 'human' speech patterns, but almost perfectly assumed the role of a human helpline volunteer. I couldn't believe how authentic it seemed. I especially couldn't believe how much more effective this could be at scale compared to existing solutions.
However, LLMs don't really "understand" crisis situations the way trained counsellors do. At their core, they are sophisticated prediction engines built on patterns of input and output. Yet somehow Kylie managed to provide responses that felt genuinely supportive. Does it necessarily matter if someone in crisis finds comfort in what is essentially pattern matching presented as empathy? Then again, it is surely better than calling a helpline at 3am and having nobody answer, especially when every second counts in saving a life. If such a tool can provide consistent, appropriate, potentially life-saving responses around the clock, does it matter that those responses come from predicted text rather than genuine understanding?
Until now, I've been somewhat sceptical about AI agents. This isn't because I don't recognise their vast potential, especially in enterprise use cases, but because I'm not sure we can, or even should, aim to optimise everything, all the time. In this scenario, for example, we might be better off automating a text-based helpline than a voice-based one. Or, where feasible, using a voice agent as a triage tool before a swift human handoff and follow-up.
Ultimately, it's not a choice between two different systems; in reality, both could work well together to achieve positive outcomes. But I suppose it underscores the importance of irreplaceable human elements, such as lived experience, which may be precisely what makes traditional approaches effective, even if they are not as 'scalable'.