September 2, 2025 - Technology media outlet 9to5Mac published a blog post yesterday (September 1) reporting that Apple has released an in-browser trial version of its FastVLM visual language model on the Hugging Face platform.
Note: FastVLM is known for its "lightning-fast" video captioning, and the demo makes this cutting-edge technology easy to try for anyone with a Mac powered by an Apple Silicon chip.
The core strength of the FastVLM model is its exceptional speed and efficiency. The model is optimized using MLX, Apple's own open-source machine learning framework designed for Apple Silicon chips. Compared to similar models, FastVLM is about one-third the size yet delivers video captioning up to 85 times faster.
The lightweight FastVLM-0.5B version released by Apple can be loaded and run directly in the browser. According to the outlet's test on a 16GB M2 Pro MacBook Pro, the model took a few minutes to load the first time, but once running it accurately described the people, environments, facial expressions, and various objects on screen.

It is worth mentioning that the model supports local operation: all data is processed on the device without being uploaded to the cloud, safeguarding the user's data privacy.
FastVLM's ability to run natively, combined with its low latency, shows great potential for wearable devices and assistive technologies. For example, in virtual camera applications the tool can instantly describe multiple scenes in detail. FastVLM is thus expected to become a core technology for such devices, providing users with smarter and more convenient interaction experiences.