Google, Cohere launch new audio AI models

Google and Cohere have released new audio AI models, extending generative AI further into voice processing and generation. The release arrives as the broader AI landscape consolidates around multimodal systems: platforms that handle text, image, and now audio inputs together. For CX teams invested in text-based AI solutions, this is both an opportunity and a pressure point. Voice remains the dominant channel for customer support interactions, yet most enterprise platforms have lagged in native audio intelligence. The question is whether your current stack, be it Zendesk, Freshdesk, or Salesforce, will integrate these models natively or push you towards costly third-party integrations that fragment your data and analytics.

The timing matters. As AI optimisation frameworks make models more efficient, the cost per interaction for audio processing should fall, making real-time transcription, sentiment analysis, and quality assurance on voice calls economically viable at scale. That directly threatens the value proposition of legacy call centre software that charges per seat or per minute. Teams already running sophisticated omnichannel operations should check whether their vendor's roadmap includes audio AI capabilities, or whether they risk being locked into outdated infrastructure while competitors adopt these tools to reduce handle time and improve first-contact resolution. The advantage will go to organisations that can process voice data with the same analytical rigour they apply to chat and email, extracting intent, emotion, and compliance risks in real time rather than through post-call reviews.