September 20 news: Amap (Gaode) has announced, through its official public account, an upgrade to TrafficVLM (note: Traffic Visual Language Model).

According to the announcement, drivers in modern traffic environments often face information blind spots. When passing through a complex intersection, they can see only the traffic directly in front of them and cannot predict which lane will be blocked 100 meters ahead. When travelling on a free-flowing highway, they can hardly foresee a "ghost jam" ahead that was triggered by a slight tap on the brakes. These limits of a local perspective make it difficult for drivers to make optimal decisions, and the TrafficVLM upgrade is intended to address these difficulties.
The newly upgraded TrafficVLM, built on a spatial intelligence architecture, gives users a bird's-eye "observer" perspective. It lets users see the overall traffic situation and so make better decisions in complex environments. Reportedly, it gives every driver the ability to "see the whole picture" whether at an intersection or on a highway. No longer limited to a local view, drivers can understand the road ahead more intuitively and respond to potential risks.
For example, suppose a sudden rear-end collision creates a new blockage in the left lane of a main road 3 km ahead of the user. TrafficVLM immediately understands the situation through a real-time traffic twin, and its reasoning identifies the accident location and how it will evolve: the congestion may spread rapidly into a congested section three kilometres long. With TrafficVLM, Amap can push driving advice before the user reaches the congestion point: "Accident 3 km ahead; many vehicles are merging to the right. You are advised to move right in advance to avoid a last-minute lane change."
Through the rapid response of the cloud-side control system, the system issues observation instructions as soon as congestion occurs, extracts first-hand visual data from the scene, performs intelligent analysis of the image information in depth, and accurately reconstructs the spatial structure and traffic situation at the congestion point.
According to the announcement, this means users not only learn that there is "congestion ahead" but can also understand why a lane change is needed, when to slow down, and the real cause and extent of the congestion. This shift from passive reception to active insight frees users from "feeling their way blindly" through complex road conditions and delivers a visible, perceptible, and predictable intelligent navigation experience.
The visual language model uses Tongyi Qwen-VL as its base, and has completed reinforcement learning and data training on Amap's massive, high-fidelity traffic visual data.