Meta Debuts "Chameleon" to Challenge GPT-4o; 34B-Parameter Model Leads the Multimodal Revolution

Meta has released "Chameleon," a 34B-parameter multimodal model positioned to challenge OpenAI's GPT-4o. The model processes text and images seamlessly within a unified Transformer architecture, mixing modal information through early-fusion technology. It has set records on visual question answering and image captioning benchmarks, with performance approaching GPT-4V, but it currently supports only image and text generation and lacks speech capabilities. (NIC)
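To make the early-fusion idea concrete, here is a minimal conceptual sketch: images are converted into discrete tokens (by a learned image tokenizer, simulated here with random codes), placed in the same vocabulary as text tokens, and fed as one interleaved sequence into a single decoder-only Transformer. All names, vocabulary sizes, and dimensions below are illustrative assumptions, not Meta's actual implementation; positional encodings and the real image tokenizer are omitted for brevity.

```python
# Conceptual sketch of early fusion: image and text share one token
# stream and one Transformer. All sizes/names are assumptions.
import torch
import torch.nn as nn

TEXT_VOCAB = 32_000                # assumed text vocabulary size
IMAGE_VOCAB = 8_192                # assumed image-tokenizer codebook size
VOCAB = TEXT_VOCAB + IMAGE_VOCAB   # single shared vocabulary


class TinyEarlyFusionLM(nn.Module):
    """Decoder-only Transformer over a mixed image/text token sequence."""

    def __init__(self, d_model=256, n_heads=4, n_layers=2):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True
        )
        self.blocks = nn.TransformerEncoder(layer, n_layers)
        self.lm_head = nn.Linear(d_model, VOCAB)

    def forward(self, tokens):
        x = self.embed(tokens)
        # Causal mask: each position attends only to earlier tokens,
        # regardless of whether those tokens came from text or an image.
        mask = nn.Transformer.generate_square_subsequent_mask(tokens.size(1))
        x = self.blocks(x, mask=mask)
        return self.lm_head(x)


# Stand-ins for real tokenizers: text ids in [0, TEXT_VOCAB),
# image codes shifted into [TEXT_VOCAB, VOCAB).
text_ids = torch.randint(0, TEXT_VOCAB, (1, 16))
image_ids = torch.randint(0, IMAGE_VOCAB, (1, 64)) + TEXT_VOCAB

# Early fusion: concatenate modalities into one sequence before any
# model layer sees them, rather than encoding each modality separately.
sequence = torch.cat([text_ids, image_ids], dim=1)

model = TinyEarlyFusionLM()
logits = model(sequence)   # shape: (1, 80, VOCAB)
print(logits.shape)
```

Because every position predicts the next token from the shared vocabulary, the same model can continue a sequence with either text tokens or image tokens, which is what allows mixed image-and-text generation from one network.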
