update · May 7, 2026 · 1 min read

openai launches new voice intelligence features in api

openai's api now includes voice models for conversation, transcription, and translation. indie devs can use these tools to enhance user interaction in their apps.

openai has launched new voice intelligence features in its api: the gpt-realtime-2 model for realistic vocal simulation, gpt-realtime-translate for real-time translation, and gpt-realtime-whisper for live transcription. together, these tools let developers build applications that hold spoken conversations, transcribe speech, and translate between languages in real time.

for indie devs, these features open up richer user interaction in games and apps. a voice interface that listens, reasons, and acts on user input can improve both gameplay and accessibility. integrating these tools could also streamline customer service, educational content, or any other interactive experience.

the new models are included in openai’s realtime api. translation and transcription are billed by the minute, while gpt-realtime-2 is billed by token consumption. because the billing units differ, it is worth testing each feature to see how it fits your workflow and your projects.
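to make the two billing units concrete, here is a minimal sketch of how you might estimate costs for each. the rates are hypothetical placeholders, not openai's prices, and the helper names are made up for illustration. it also assumes per-minute usage is prorated by the second, which the announcement does not confirm.

```python
# Rough cost estimators for the two billing styles described above.
# All rates passed in are HYPOTHETICAL placeholders; check the official
# pricing page for real numbers before budgeting.


def minute_billed_cost(audio_seconds: float, usd_per_minute: float) -> float:
    """Estimate cost for a service billed by the minute
    (transcription or translation).

    Assumes usage is prorated by the second; real billing may round up.
    """
    if audio_seconds < 0 or usd_per_minute < 0:
        raise ValueError("inputs must be non-negative")
    return (audio_seconds / 60.0) * usd_per_minute


def token_billed_cost(tokens: int, usd_per_million_tokens: float) -> float:
    """Estimate cost for a model billed by token consumption.

    Uses a single blended rate; real pricing may distinguish input
    and output tokens, or audio and text tokens.
    """
    if tokens < 0 or usd_per_million_tokens < 0:
        raise ValueError("inputs must be non-negative")
    return (tokens / 1_000_000) * usd_per_million_tokens


if __name__ == "__main__":
    # Placeholder rates chosen only to show the arithmetic.
    transcription = minute_billed_cost(audio_seconds=90, usd_per_minute=0.10)
    conversation = token_billed_cost(tokens=500_000, usd_per_million_tokens=2.0)
    print(f"transcription estimate: ${transcription:.2f}")
    print(f"conversation estimate:  ${conversation:.2f}")
```

swap in the published rates and your own usage numbers to compare, say, an hour of live transcription against a token-heavy npc conversation.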

vibe check
useful if your game needs npc chatter, accessibility narration, or multilingual voice support without building your own audio pipeline