On April 18, 1AI learned from the Doubao large model team that UI-TARS-1.5 was officially released and open-sourced yesterday. It is an open-source multimodal agent built on a vision-language model, and it can efficiently perform a wide range of tasks in virtual worlds.

The relevant links are below:
- GitHub: https://github.com/bytedance/UI-TARS
- Website: https://seed-tars.com/
- arXiv: https://arxiv.org/abs/2501.12326

UI-TARS-1.5 builds on UI-TARS, the native agent approach previously proposed by ByteDance. It further strengthens the model's higher-order reasoning through reinforcement learning, so that the model can "think before acting."
This release also reflects the team's new vision of using games as a vehicle for strengthening the reasoning capabilities of foundation models. Compared with domains such as math and programming, games rely more on intuitive, common-sense reasoning and less on specialized knowledge, which often makes them ideal test scenarios for assessing and improving the general capabilities of future models.
According to the announcement, UI-TARS is a native GUI agent that can operate real computer and mobile phone systems, and can also control a browser to complete complex interactive tasks. UI-TARS-1.5's accurate GUI operation rests on the team's technical exploration along four dimensions:
- Enhanced visual perception: Trained on large-scale interface screenshot data, the model understands the semantics and context of interface elements and produces accurate descriptions of them.
- System 2 reasoning mechanism: The model generates a "thought" before each action, which supports multi-step planning and decision-making for complex tasks.
- Unified action modeling: A standardized cross-platform action space, learned from real trajectories, improves the controllability and execution accuracy of actions.
- Self-evolving training paradigm: Through automated collection of interaction trajectories and reflective training, the model continually improves from its errors and adapts to complex environmental changes.
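
To make the "thought before action" and unified action space ideas more concrete, here is a minimal Python sketch. It is an illustration only, not the team's implementation: the `Thought: ... Action: name(args)` text format, the `ACTION_TYPES` vocabulary, and the `Step` and `parse_step` names are all assumptions made for this example.

```python
import re
from dataclasses import dataclass, field

# Hypothetical cross-platform action vocabulary (an assumption for this
# sketch, not the model's real action space).
ACTION_TYPES = {"click", "type", "scroll", "finished"}


@dataclass
class Step:
    thought: str                              # reasoning emitted before the action
    action: str                               # one of ACTION_TYPES
    args: dict = field(default_factory=dict)  # keyword arguments of the action


# Assumed output format: "Thought: <text>" followed by "Action: name(k=v, ...)".
_STEP_RE = re.compile(
    r"Thought:\s*(?P<thought>.*?)\s*Action:\s*(?P<name>\w+)\((?P<args>.*)\)\s*$",
    re.DOTALL,
)
# Arguments are either single-quoted strings or integers.
_ARG_RE = re.compile(r"(\w+)\s*=\s*(?:'([^']*)'|(-?\d+))")


def parse_step(text: str) -> Step:
    """Split one model output into its thought and a validated action."""
    match = _STEP_RE.match(text.strip())
    if match is None:
        raise ValueError("expected 'Thought: ... Action: name(args)'")
    name = match.group("name")
    if name not in ACTION_TYPES:
        raise ValueError(f"unknown action: {name}")
    args = {}
    for key, string_value, int_value in _ARG_RE.findall(match.group("args")):
        args[key] = int(int_value) if int_value else string_value
    return Step(match.group("thought"), name, args)


if __name__ == "__main__":
    output = "Thought: The search box is empty, so I should click it.\nAction: click(x=120, y=48)"
    print(parse_step(output))
```

Restricting every platform to one small, validated vocabulary means a desktop, mobile, or browser executor can consume the same `Step` objects, and the recorded thought stays attached to each action for later inspection.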