By Mayank Chhaya
OpenAI’s just-unveiled flagship model, GPT-4o, moves the world of artificial intelligence (AI) closer to the eventual possibility of it becoming conscious and, in some new form, sentient.
Perhaps the most remarkable feature of GPT-4o, where ‘o’ stands for omni, is its response time to audio inputs. It can respond to humans and fellow AI systems in as little as 232 milliseconds, with an average of 320 milliseconds. That is almost disturbingly similar to human response time in a conversation. A millisecond is one-thousandth of a second.
Watching the videos showcasing the astonishing capabilities of GPT-4o, which can reason across audio, vision, and text in real time, one is reminded of the iconic interaction between the astronaut Dave Bowman, played by Keir Dullea, and the onboard AI computer HAL in the 1968 science fiction masterpiece ‘2001: A Space Odyssey’, directed by Stanley Kubrick and co-written with Arthur C. Clarke. Nearly six decades ago, Clarke’s book and Kubrick’s movie presaged the rise of an artificial general intelligence (AGI) that nearly slips out of human control on board a spaceship.
GPT-4o is nowhere near that, but the way it is able to interact with the real world through phone and computer camera lenses, conduct live conversations in two different languages, and even laugh charmingly in an almost humanlike fashion can be eerily reminiscent of the Dave-HAL interaction.
Its ability to reason across audio, vision and text in real-time is quite astonishing. As OpenAI’s promotional material points out, “GPT-4o (“o” for “omni”) is a step towards much more natural human-computer interaction—it accepts as input any combination of text, audio, and image and generates any combination of text, audio, and image outputs.”
“It matches GPT-4 Turbo performance on text in English and code, with significant improvement on text in non-English languages, while also being much faster and 50% cheaper in the API. GPT-4o is especially better at vision and audio understanding compared to existing models,” the company backgrounder said.
The demonstration videos on OpenAI’s website are entertaining at one level but quite ominous at another, hinting at the challenge that AI and AGI will eventually pose to humanity along the lines of what Clarke and Kubrick had envisioned.
Of course, the movie’s theme was much broader than just what its virtually conscious AI, HAL, is frighteningly capable of doing. HAL was also confined to a spaceship, unlike GPT-4o, which is already making its way into the real world quite rapidly after its unveiling.
Clearly, the rapid march of AI and AGI cannot be stopped. However, the fact that humans are creating it also gives them the ability to control and regulate it before it is too late. And going by the fast advances being made now, it is not too early to worry about being too late.
As a side note, OpenAI has called its flagship model 4o, the ‘o’ being omni. As many might know, omni means “all”. The model could well become omniscient, omnipresent and omnipotent, a prospect humanity will have to grapple with very soon.