Yonkers Observer

OpenAI Unveils New ChatGPT That Listens, Looks and Talks

by Yonkers Observer Report
May 14, 2024
in Technology

As Apple and Google transform their voice assistants into chatbots, OpenAI is transforming its chatbot into a voice assistant.

On Monday, the San Francisco artificial intelligence start-up unveiled a new version of its ChatGPT chatbot that can receive and respond to voice commands, images and videos.

The company said the new app — based on an A.I. system called GPT-4o — juggles audio, images and video significantly faster than previous versions of the technology. The app will be available starting on Monday, free of charge, for both smartphones and desktop computers.

“We are looking at the future of the interaction between ourselves and machines,” said Mira Murati, the company’s chief technology officer.

The new app is part of a wider effort to combine conversational chatbots like ChatGPT with voice assistants like the Google Assistant and Apple’s Siri. As Google merges its Gemini chatbot with the Google Assistant, Apple is preparing a new version of Siri that is more conversational.

OpenAI said it would gradually share the technology with users “over the coming weeks.” This is the first time it has offered ChatGPT as a desktop application.

The company previously offered similar technologies from inside various free and paid products. Now, it has rolled them into a single system that is available across all its products.

During an event streamed on the internet, Ms. Murati and her colleagues showed off the new app as it responded to conversational voice commands, used a live video feed to analyze math problems written on a sheet of paper and read aloud playful stories that it had written on the fly.

The new app cannot generate video. But it can generate still images that represent frames of a video.

With the debut of ChatGPT in late 2022, OpenAI showed that machines can handle requests more like people. In response to conversational text prompts, it could answer questions, write term papers and even generate computer code.

ChatGPT was not driven by a set of rules. It learned its skills by analyzing enormous amounts of text culled from across the internet, including Wikipedia articles, books and chat logs. Experts hailed the technology as a possible alternative to search engines like Google and voice assistants like Siri.

Newer versions of the technology have also learned from sounds, images and video. Researchers call this “multimodal A.I.” Essentially, companies like OpenAI began to combine chatbots with A.I. image, audio and video generators.

(The New York Times sued OpenAI and its partner, Microsoft, in December, claiming copyright infringement of news content related to A.I. systems.)

As companies combine chatbots with voice assistants, many hurdles remain. Because chatbots learn their skills from internet data, they are prone to mistakes. Sometimes, they make up information entirely — a phenomenon that A.I. researchers call “hallucination.” Those flaws are migrating into voice assistants.

While chatbots can generate convincing language, they are less adept at taking actions like scheduling a meeting or booking a plane flight. But companies like OpenAI are working to transform them into “A.I. agents” that can reliably handle such tasks.

OpenAI previously offered a version of ChatGPT that could accept voice commands and respond with voice. But it was a patchwork of three different A.I. technologies: one that converted voice to text, one that generated a text response and one that converted this text into a synthetic voice.

The new app is based on a single A.I. technology — GPT-4o — that can accept and generate text, sounds and images. This means that the technology is more efficient, and the company can afford to offer it to users for free, Ms. Murati said.

“Before, you had all this latency that was the result of three models working together,” Ms. Murati said in an interview with The Times. “You want to have the experience we’re having — where we can have this very natural dialogue.”

© 2025 Yonkers Observer or its affiliated companies.
