OpenAI has unveiled its latest artificial intelligence model, codenamed Strawberry and officially called OpenAI o1, TechCrunch reports. The model is part of a family that includes two versions: o1-preview and o1-mini, with the latter designed to be smaller and more efficient, intended primarily for code generation.
Currently, OpenAI o1 is available to ChatGPT Plus and Team subscribers, with early access for enterprise and education users opening next week. However, the o1 model is still relatively bare-bones. Unlike its predecessor, GPT-4o, it lacks web browsing and file analysis capabilities, and while it does have image analysis features, they are temporarily disabled pending further testing. The model is also heavily rate-limited: only 30 messages per week for o1-preview and 50 per week for o1-mini.
Another drawback of the new model is its cost. Via the API, o1-preview is priced at $15 per 1 million input tokens and $60 per 1 million output tokens, far more expensive than GPT-4o. Despite these drawbacks, OpenAI plans to make o1-mini available to all free ChatGPT users, although no specific release date has been announced yet.
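To put those rates in perspective, here is a minimal cost estimate based on the published o1-preview prices. The helper function and token counts are illustrative only and are not part of any OpenAI SDK:

```python
# Estimate o1-preview API cost from the article's published rates:
# $15 per 1M input tokens, $60 per 1M output tokens.
# Names and example token counts are hypothetical.

PRICE_PER_M_INPUT = 15.00   # USD per 1,000,000 input tokens
PRICE_PER_M_OUTPUT = 60.00  # USD per 1,000,000 output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a single o1-preview call."""
    return (input_tokens / 1_000_000) * PRICE_PER_M_INPUT \
         + (output_tokens / 1_000_000) * PRICE_PER_M_OUTPUT

# A request with 2,000 input tokens and 10,000 output tokens
# (long reasoning chains inflate the output side):
print(round(estimate_cost(2_000, 10_000), 4))  # 0.63
```

Because o1 bills its hidden reasoning tokens as output, even short answers can be disproportionately expensive compared to GPT-4o.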
What differentiates o1 from other generative AI models is its ability to “check itself” by taking extra time to consider different aspects of a query before responding. This makes it suitable for complex tasks that require a higher level of synthesis, such as analyzing emails for privileged information or brainstorming marketing strategies.
According to OpenAI, o1 uses reinforcement learning, encouraging the model to “think” before answering by simulating a chain of thought. This is reinforced by a system of rewards and punishments that helps the model plan ahead and perform multiple actions to get an answer. Thus, in theory, the longer OpenAI o1 processes a query, the better the answer will be.
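The reward-guided idea can be loosely illustrated as picking the best of several candidate reasoning chains. This is a toy sketch, not OpenAI's actual training setup: the candidate chains and the reward function below are made-up stand-ins for what in practice are model samples and a learned reward signal.

```python
# Toy illustration (NOT OpenAI's method): generate several candidate
# chains of thought, score each with a reward function, and keep the
# answer from the highest-scoring chain. The chains and the scorer
# here are hypothetical stand-ins.

def reward(chain: list[str]) -> int:
    # Toy reward: chains with more explicit steps score higher.
    return len(chain)

candidates = [
    ["answer: 9"],                                             # no reasoning
    ["17 - 9 = 8", "8 + 1 = 9", "answer: 9"],                  # short chain
    ["17 - 9 = 8", "check: 9 + 8 = 17", "8 + 1 = 9", "answer: 9"],
]

best = max(candidates, key=reward)
print(best[-1])  # "answer: 9"
```

The intuition matches the article's claim: spending more compute on longer, verified chains tends to surface better answers, which is why o1's quality can improve the longer it "thinks".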
A new OpenAI optimization algorithm and a specially selected training dataset that includes scientific literature and reasoning data further enhance o1’s capabilities. In testing, o1 outperformed GPT-4o on a variety of tasks, including solving 83% of problems on the International Mathematical Olympiad qualifying exam (compared to 13% for GPT-4o) and scoring in the 89th percentile on the Codeforces programming competition.
While o1 excels in areas such as data analysis, science, and coding, it does have some limitations. For example, it can be slower than other models, taking more than 10 seconds to respond to certain queries. In addition, early testers reported that o1 may “hallucinate” (generate incorrect but confident answers) more often than GPT-4o and is less likely to admit that it does not know the answer.
Despite these issues, OpenAI believes that o1 is a step forward in AI reasoning. Google DeepMind researchers have demonstrated similar improvements in model accuracy by giving their models more time to compute answers, highlighting the fierce competition in the AI industry. Interestingly, OpenAI has chosen not to show o1’s raw “thought chains” in ChatGPT, citing competition concerns, and instead has chosen to show “model-generated summaries” of these chains.
The final test for OpenAI will be to make o1 widely available at a more affordable price. The company plans to continue to refine the model by experimenting with versions that could reason for longer periods – from hours to days or even weeks – to further improve its reasoning abilities.