Zero One Everything's Yi series model family has welcomed a new member: the Yi Vision Language (Yi-VL) multimodal large language model has been open-sourced to the world. Yi-VL is built on the Yi language model and comes in two versions, Yi-VL-34B and Yi-VL-6B. According to the company, Yi-VL-34B achieved 41.6% accuracy on the new multimodal benchmark MMMU, outperforming a series of multimodal large models and ranking second only to GPT-4V (55.7%), demonstrating strong cross-disciplinary knowledge comprehension and application capabilities.
Hugging Face Address:
https://huggingface.co/01-ai
ModelScope Address:
https://www.modelscope.cn/organization/01ai
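For readers who want to try the released weights, the sketch below shows one way to pull them from the Hugging Face organization linked above using the huggingface_hub library. The exact repository name (assumed here to be 01-ai/Yi-VL-6B) should be confirmed on the organization page.

```python
# Minimal sketch: download the open-sourced Yi-VL weights from Hugging Face.
# The repo id "01-ai/Yi-VL-6B" is an assumption; check the 01-ai organization
# page for the actual repository names (e.g., the 34B variant).
from huggingface_hub import snapshot_download

# Fetch the full model repository to a local cache directory.
local_dir = snapshot_download(repo_id="01-ai/Yi-VL-6B")
print(f"Model files downloaded to: {local_dir}")
```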
