Loading Now

OpenAI’s GPT-5 Met With Mixed Reviews, Confusion in First Day

OpenAI’s GPT-5 Met With Mixed Reviews, Confusion in First Day

OpenAI’s GPT-5 Met With Mixed Reviews, Confusion in First Day


(Bloomberg) — For months, OpenAI Chief Executive Officer Sam Altman has been hyping up the capabilities of GPT-5, setting up the launch as a seminal moment for the company. But in the first 24 hours after its release, the new model was met with mixed reviews. 

In its announcement Thursday, OpenAI said GPT-5 was better at coding and reasoning through complex problems, and touted it as advanced enough to turn chatbot ChatGPT into a Ph.D.-level expert. Some with early access praised the model, with caveats. “It’s my new favorite model,” developer Simon Willison wrote in a blog post, calling it “competent” and “occasionally impressive.” He added: “It’s not a dramatic departure from what we’ve had before.”

On various social media platforms, however, ChatGPT users expressed frustration that GPT-5 continued to make up information and trip over simple math and spelling questions. Noah Giansiracusa, an associate professor of mathematics at Bentley University, said he felt the launch was “underwhelming.” While there were “some improvements,” he said, “they were much more marginal than I would’ve hoped.”

At least some of the reaction may come down to confusion over what’s happening under the hood. Unlike OpenAI’s prior software, GPT-5 automatically switches between models of varying levels of sophistication depending on the query. This approach can help maximize the company’s computing resources, but it also means users may not always be engaging with the most powerful version of OpenAI’s technology. 

Asked to identify how many times the letter “b” shows up in “blueberry,” for example, GPT-5 initially said “three” in one test. When told to “think harder,” however, GPT-5 appeared to engage its more advanced reasoning model and came up with the correct answer. 

On Friday, Altman responded to some of the feedback and said there was an issue with the system. “GPT-5 will seem smarter starting today,” he said. “Yesterday, the autoswitcher broke and was out of commission for a chunk of the day, and the result was GPT-5 seemed way dumber.” 

The stakes are high for the rollout. OpenAI is vying to keep ahead of growing AI competition from rivals in the US and China. The company is also fighting to convince businesses and individual users to pay up for its premium services to help offset the enormous amount it’s spending on talent, chips and data centers to support AI development. 

The San Francisco-based company kicked off the generative AI boom nearly three years ago with the release of ChatGPT, which was originally powered by an earlier model called GPT-3.5. Since then, the company has released a series of increasingly sophisticated systems, including multiple options that mimic the process of human reasoning.

As AI systems advance, it’s become harder to say definitively how various services stack up. As of midday Friday, GPT-5 had risen to the top of various categories on LMArena, a popular leaderboard for AI models based on user rankings. But a different benchmark, ARC-AGI-2, puts GPT-5 behind the latest version of Grok from Elon Musk’s xAI. 

In the absence of more definitive assessments, the model wars sometimes come down to vibes. And with nearly 700 million people now using ChatGPT each week, some are bound to disagree over how the model feels. It also takes longer than a day to gauge the value of a new AI system in someone’s personal and professional life. 

Ethan Mollick, a professor at the Wharton School of the University of Pennsylvania who frequently experiments with AI models, marveled at GPT-5’s ability to do research, come up with clever written responses and make programming simple, even for a novice. 

“GPT-5 just does stuff, often extraordinary stuff, sometimes weird stuff, sometimes very AI stuff, on its own,” he wrote in a blog post. “And that is what makes it so interesting.”

On Reddit, however, the reactions were very different. During an “Ask Me Anything” session Friday on the platform, Altman fielded pushback from users who were frustrated not to have more say and visibility into which model responds to their queries. Altman said OpenAI would take some steps to address these complaints, including making it “more transparent.”

At one point, Altman responded to a Reddit user’s question by noting that OpenAI thinks the “writing quality” in one version of GPT-5 is better than GPT-4.5. Then he asked: “Do you find it to be worse?” One user after another were quick to respond: yes. 

More stories like this are available on bloomberg.com

Post Comment