Gemini Live, Google’s answer to ChatGPT’s Advanced Voice Mode, launches

Gemini Live allows users to have detailed voice chats with Gemini, Google’s AI chatbot, on their smartphones. Google claims that an improved speech engine makes the conversations more consistent, emotionally expressive, and realistic.

Henry William

Posted 2 years ago 1,537 Views updated 2 years ago

Gemini Live, Google’s response to the new Advanced Voice Mode for OpenAI’s ChatGPT, is launching on Tuesday. The launch comes months after the feature was first announced at Google’s I/O 2024 developer conference and then presented again at the Made by Google 2024 event.

Gemini Live allows users to have detailed voice chats with Gemini, Google’s AI chatbot, on their smartphones. Google claims that an improved speech engine makes the conversations more consistent, emotionally expressive, and realistic. Users can interrupt Gemini while it is speaking to ask follow-up questions, and the chatbot will adjust to their speech patterns in real time.

Google explains it in a blog post: “With Gemini Live [through the Gemini app], you can talk to Gemini and choose from 10 new natural-sounding voices. You can speak at your own pace or interrupt mid-response with questions, just like in any normal conversation.”

Gemini Live can be hands-free if you want. You can keep talking with the Gemini app running in the background or even when your phone is locked. You can pause and resume conversations at any time.

So, how could this be helpful? Google gives the example of practicing for a job interview. Gemini Live can help you practice by giving speaking tips and suggesting skills to highlight when talking to a hiring manager (or AI, in some cases).

One benefit of Gemini Live over ChatGPT’s Advanced Voice Mode is its better memory. The AI models behind Live, Gemini 1.5 Pro and Gemini 1.5 Flash, have a longer “context window.” This means they can take in and work with a lot of information (potentially hours of conversation) before giving a response.

“Live uses our Gemini Advanced models that we have changed to be more conversational,” a Google spokesperson told TechCrunch via email. “The model’s large context window is used when users have long conversations with Live.”

CREDIT: TechCrunch