Google has released a new Gemini 2.5 model that can use a browser

Yevgeny Demkivskyi, news writer at Mezha.Media and a geek. I write about technology, film, and games. Perhaps about games with a little more passion.

8 October, 11:57 AM

Google has announced the Gemini 2.5 Computer Use model, which can interact with a browser interface the way a human user does: clicking, scrolling, and typing. This lets it carry out tasks on sites that offer no API and where automated access is otherwise limited.

The model uses visual recognition and logical reasoning to follow user instructions. For example, it can fill out and submit an online form, test an interface, or interact with websites as a human would. Google has previously tested similar technology in its AI Mode and Project Mariner projects.

According to the company, Gemini 2.5 Computer Use outperforms its competitors in a number of web and mobile benchmarks. The model supports 13 basic actions, including opening tabs, typing, dragging items, and navigating pages. It works only through the browser, without accessing the system level of the OS.
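To make the idea of a fixed action set more concrete, here is a minimal sketch of the kind of loop such an agent might run: look at the page, ask the model for the next action, execute it in the browser, and repeat. Everything in it is an assumption made for illustration. `FakeBrowser`, `scripted_model`, `run_agent`, and the three action names are hypothetical, the "model" is a hard-coded script, and none of this is Google's actual API.

```python
from dataclasses import dataclass, field


@dataclass
class FakeBrowser:
    """Toy stand-in for a real browser session (hypothetical)."""
    url: str = "about:blank"
    typed: list = field(default_factory=list)
    log: list = field(default_factory=list)

    def navigate(self, url):
        self.url = url
        self.log.append(f"navigate {url}")

    def click_at(self, x, y):
        self.log.append(f"click {x},{y}")

    def type_text(self, text):
        self.typed.append(text)
        self.log.append(f"type {text!r}")


# Hypothetical subset of the supported actions; the real model exposes 13.
ALLOWED_ACTIONS = {"navigate", "click_at", "type_text"}


def scripted_model(step, screenshot):
    """Stand-in for the model: returns the next action, or None when done."""
    plan = [
        {"name": "navigate", "args": {"url": "https://example.com/form"}},
        {"name": "click_at", "args": {"x": 120, "y": 240}},
        {"name": "type_text", "args": {"text": "hello"}},
    ]
    return plan[step] if step < len(plan) else None


def run_agent(browser, model, max_steps=10):
    """Observe, ask for an action, execute it; return the number of steps taken."""
    for step in range(max_steps):
        # A real agent would capture an actual screenshot here.
        screenshot = f"<screenshot of {browser.url}>"
        action = model(step, screenshot)
        if action is None:
            return step
        if action["name"] not in ALLOWED_ACTIONS:
            raise ValueError(f"unsupported action: {action['name']}")
        # Dispatch only to browser-level methods, never to the OS.
        getattr(browser, action["name"])(**action["args"])
    return max_steps


browser = FakeBrowser()
steps = run_agent(browser, scripted_model)
print(steps, browser.url, browser.typed)
```

Restricting dispatch to a whitelist of browser-level actions mirrors the article's point that the model works only through the browser and has no access to the operating system.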

The feature is available to developers through Google AI Studio and Vertex AI. The company also opened a public demo on Browserbase, where the model performs tasks like playing 2048 or searching for discussions on Hacker News.

The announcement comes a day after OpenAI showcased app integration into ChatGPT, a feature Anthropic already offered in its Claude model last year.
