Tencent Hunyuan open-sources its OCR model: only 1B parameters, SOTA on multiple core capabilities

November 25 news: Tencent Hunyuan today announced the release of its new open-source model, HunyuanOCR. The model has only 1B parameters, is built on a multimodal architecture, and achieves SOTA (state of the art) results on multiple industry OCR application benchmarks.

According to the official introduction, thanks to the "end-to-end" design of its multimodal architecture, HunyuanOCR reaches its best result for each function with a single forward inference pass.

The OCR expert model is built on a multimodal architecture consisting of three main components: a native-resolution visual encoder, an adaptive visual adapter, and a lightweight Hunyuan language model.

Unlike other open-source OCR expert models or systems, HunyuanOCR is trained and run under a fully end-to-end paradigm. Its robust end-to-end inference comes from large-scale, application-oriented data combined with online reinforcement learning.

HunyuanOCR achieves SOTA results on several core capabilities. In complex document parsing, it scores 94.1 on the OmniDocBench evaluation, the highest result, ahead of models such as Google's Gemini3-pro. In text detection and recognition, on a self-built benchmark covering nine application scenarios (documents, artistic text, street scenes, handwriting, advertisements, receipts, screenshots, games, video), it holds a significant lead over comparable open-source models and commercial OCR models. On the OCRBench leaderboard it scores 860 points in total, which, with only 1B total parameters, is SOTA among general visual understanding models with fewer than 3B total parameters.

For minor-language translation, HunyuanOCR supports translating 14 high-frequency minor languages into Chinese or English, and it won first place in the small-model track of the ICDAR2025 end-to-end document translation competition.

In terms of applications, HunyuanOCR supports multilingual complex document parsing, combines text detection with recognition, and can be used in scenarios such as card and receipt field extraction, video subtitle recognition, and photo translation.

For text detection and recognition, the model performs well on documents, artistic text, street scenes, handwriting, advertisements, receipts, screenshots, games, and video.

Complex document parsing means digitizing multilingual documents that have been scanned or photographed. Specifically, the text content in the image is organized in reading order, formulas are output in LaTeX, and complex tables are expressed in HTML.
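One practical consequence of expressing tables in HTML is that downstream code can recover rows and cells with a standard parser. The following is a minimal sketch using only Python's standard library; the sample string is hand-written for illustration and is not actual HunyuanOCR output, and the helper name `html_table_to_rows` is hypothetical.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of each <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def html_table_to_rows(html):
    """Hypothetical helper: turn an HTML table string into a list of rows."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows


# Hand-written sample for this sketch, not real model output.
sample = (
    "<table><tr><th>Item</th><th>Price</th></tr>"
    "<tr><td>Tea</td><td>3.50</td></tr></table>"
)
rows = html_table_to_rows(sample)
```

This sketch ignores nested tables and `rowspan`/`colspan`; real parsed documents with merged cells would need extra handling.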

Beyond these, common applications include field extraction, video subtitle extraction, and photo translation:

1. Field extraction: fields of interest on common cards and receipts (e.g. name / address / organization) are parsed and output in standard JSON format.

2. Automatic extraction of video subtitles, including bilingual subtitles.

3. Photo translation, which supports 14 high-frequency minor languages (German, Spanish, Turkish, Italian, Russian, French, Portuguese, Arabic, Thai, Vietnamese, Indonesian, Malay, Japanese, and Korean), as well as translation between Chinese and English.
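Because the field extraction feature in the list above returns standard JSON, a caller can load the result and check it for completeness before using it. The sketch below assumes the response is a flat JSON object; the key names (`name`, `address`, `organization`), the sample string, and the helper `parse_fields` are all hypothetical and chosen only for illustration.

```python
import json


def parse_fields(raw, required):
    """Parse a JSON object string and list required keys that are absent or empty."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = [
        key for key in required
        if key not in data or data[key] is None or data[key] == ""
    ]
    return data, missing


# Hand-written sample response for this sketch, not real model output.
raw = '{"name": "Zhang San", "address": "1 Example Road", "organization": ""}'
fields, missing = parse_fields(raw, ["name", "address", "organization"])
```

Reporting missing keys instead of raising lets the caller decide whether to retry the extraction or route the image to manual review.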

1AI has attached the open-source addresses below:

https://github.com/Tencent-Hunyuan/HunyuanOCR
https://huggingface.co/tencent/HunyuanOCR
Online demo: https://huggingface.co/spaces/tencent/HunyuanOCR
