Researchers from Stanford University and the University of California, Berkeley have published a paper that aims to show how GPT-4's performance has changed over time, Ars Technica reports.
The study, titled “How Is ChatGPT’s Behavior Changing over Time?”, questions whether OpenAI’s large language models (LLMs), including GPT-3.5 and GPT-4, perform consistently.
Using API access, they tested the March and June versions of these models on tasks such as solving math problems, answering sensitive questions, generating code, and visual reasoning.
GPT-4’s ability to identify prime numbers plummeted from 97.6% accuracy in March to just 2.4% in June. Notably, GPT-3.5 improved on the same task over the same period.
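As a rough illustration of how an accuracy figure like this can be computed, the sketch below scores yes/no answers against a ground-truth primality check. It is not the researchers' code: the function names, the answer format, and the sample replies are all assumptions made for illustration.

```python
from math import isqrt


def is_prime(n: int) -> bool:
    """Ground-truth primality check by trial division."""
    if n < 2:
        return False
    for d in range(2, isqrt(n) + 1):
        if n % d == 0:
            return False
    return True


def accuracy(numbers, answers):
    """Fraction of yes/no answers that match the ground truth."""
    correct = sum(
        (ans.strip().lower() == "yes") == is_prime(n)
        for n, ans in zip(numbers, answers)
    )
    return correct / len(numbers)


# Hypothetical model replies, invented for illustration only.
numbers = [7, 9, 11, 15]
answers = ["Yes", "Yes", "No", "No"]
score = accuracy(numbers, answers)
print(f"accuracy: {score:.1%}")
```

Running the same set of questions against two model snapshots and comparing the two scores would give a before-and-after comparison of the kind the paper reports.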
OpenAI, however, denies that GPT-4's capabilities have declined. The company says it makes each new version smarter than the previous one.
Some experts were not convinced by the study. Arvind Narayanan, a professor of computer science at Princeton University, believes its results are not conclusive evidence of a decline in GPT-4's performance and could instead reflect adjustments OpenAI has made to the models.
AI researcher Simon Willison also questions the paper’s conclusions. He believes that the perceived decline in GPT-4's capabilities comes from the novelty of LLMs wearing off: as the technology has become more commonplace, its shortcomings look more glaring.
“When GPT-4 came out, we were still all in a place where anything LLMs could do felt miraculous,” Willison said. “That’s worn off now and people are trying to do actual work with them, so their flaws become more obvious, which makes them seem less capable than they appeared at first.”
OpenAI is aware of the new research and says it is monitoring reports of reduced GPT-4 capabilities.
“The team is aware of the reported regressions and looking into it,” noted OpenAI’s head of developer relations, Logan Kilpatrick.
As a reminder, OpenAI is forming a new team headed by Ilya Sutskever, the company's chief scientist and one of its co-founders, to develop methods of steering and controlling “superintelligent” AI systems. According to Sutskever and Jan Leike, the head of the company’s alignment team, AI with intelligence exceeding that of humans could appear within a decade.