Alibaba says its updated Qwen3 AI model outperforms OpenAI and DeepSeek in math and programming

Yevgeny Demkivskyi. News writer at Mezha.Media and a geek. I write about technology, film, and games. Perhaps about games with a bit more passion.

23 July, 11:15 AM

Alibaba has unveiled an update to its AI model Qwen3-235B-A22B, which the company says outperformed models from OpenAI and DeepSeek in math and coding tests. The new version is now available on Hugging Face and ModelScope, the South China Morning Post reports.

On the American Invitational Mathematics Examination (AIME) 2025 benchmark, the model scored 70.3 points, while DeepSeek-V3 scored 46.6 and GPT-4o scored 26.7. On the MultiPL-E coding benchmark, Qwen scored 87.9 points, beating OpenAI (82.7) and DeepSeek (82.2) but falling behind Anthropic's Claude Opus 4 Non-thinking (88.5).

The update also brings a context window of up to 256,000 tokens and a "non-thinking" mode, in which the model answers directly without first laying out an explicit chain of reasoning. In addition, the new model will be integrated into HP's Xiaowei Hui "smart" assistant on PCs in China.

Qwen3, introduced in April, includes models ranging from 600 million to 235 billion parameters. According to LMArena, Qwen3-235B-A22B-no-thinking ranks third among open-source LLMs, behind only the Chinese Kimi K2 (Moonshot AI) and DeepSeek R1-0528. In the Hugging Face ranking, three of the top ten Chinese models are from the Qwen series.

Overall, LMArena's rankings suggest that Chinese open LLMs are now ahead of Western competitors from companies such as Meta and Google.

During a visit to China, NVIDIA CEO Jensen Huang called the models from Alibaba, DeepSeek, and Moonshot "some of the best in the world" and "very progressive."
