With the new GPT-4o model, OpenAI takes ChatGPT to the next level
Pioneering AI firm OpenAI has launched GPT-4o, the latest version of its large language model. The flagship model is being made available to all ChatGPT users free of charge, although paying users will get faster access to it.
There is a lot to this update, but OpenAI highlights improvements to capabilities across text, voice and vision, as well as faster performance. Oh, and if you were curious, the "o" in GPT-4o stands for "omni".
The updates to the model open up a new world of possible uses, and OpenAI says that "GPT-4o is much better than any existing model at understanding and discussing the images you share". It is described as being a "step towards much more natural human-computer interaction -- it accepts as input any combination of text, audio, and image, and generates any combination of text, audio, and image outputs".
So far, so vague. But what does it mean in practice? The company offers up some potential usage scenarios:
You can now take a picture of a menu in a different language and talk to GPT-4o to translate it, learn about the food's history and significance, and get recommendations. In the future, improvements will allow for more natural, real-time voice conversation and the ability to converse with ChatGPT via real-time video. For example, you could show ChatGPT a live sports game and ask it to explain the rules to you. We plan to launch a new Voice Mode with these new capabilities in an alpha in the coming weeks, with early access for Plus users as we roll out more broadly.
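For developers, GPT-4o is also available through OpenAI's API, so a scenario like the menu translation above can be scripted directly rather than run through the ChatGPT app. The sketch below is a minimal, hypothetical example using the OpenAI Python SDK's Chat Completions endpoint; the image URL and prompt are placeholders, and audio input and output are not shown since those modalities are rolling out later.

```python
# Hypothetical sketch: sending a photographed menu to GPT-4o for translation
# via OpenAI's Chat Completions API (Python SDK v1.x). The image URL and
# prompt text below are placeholders, not part of OpenAI's announcement.
from openai import OpenAI

client = OpenAI()  # reads the OPENAI_API_KEY environment variable

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Translate this menu into English and recommend a dish."},
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/menu.jpg"}},
            ],
        }
    ],
)

# The model's reply arrives as ordinary text in the first choice.
print(response.choices[0].message.content)
```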
OpenAI says GPT-4o responds at speeds comparable to human conversation, and the company also draws attention to significant improvements in translation and in handling non-English languages.
There have been safety concerns about artificial intelligence from the very beginning, and these are only growing as the technology becomes more powerful. Acknowledging this, OpenAI says:
GPT-4o has also undergone extensive external red teaming with 70+ external experts in domains such as social psychology, bias and fairness, and misinformation to identify risks that are introduced or amplified by the newly added modalities. We used these learnings to build out our safety interventions in order to improve the safety of interacting with GPT-4o. We will continue to mitigate new risks as they're discovered.
We recognize that GPT-4o's audio modalities present a variety of novel risks. Today we are publicly releasing text and image inputs and text outputs. Over the upcoming weeks and months, we'll be working on the technical infrastructure, usability via post-training, and safety necessary to release the other modalities. For example, at launch, audio outputs will be limited to a selection of preset voices and will abide by our existing safety policies. We will share further details addressing the full range of GPT-4o's modalities in the forthcoming system card.
OpenAI has published a wealth of additional information about GPT-4o on its website.