July 5 news: Apple has quietly released an open-source AI model called DiffuCoder-7B-cpGRPO on Hugging Face. The model's headline feature is its ability to generate code out of order, with performance comparable to top open-source coding models.

Note: Traditional large language models (LLMs) generate code in a left-to-right, top-to-bottom order, the same way most humans read text.
This is mainly because these LLMs are autoregressive: when a user asks a question, the model processes the entire question, predicts the first token of the answer, then reprocesses the question together with that token to predict the second token, and so on.
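The loop described above can be sketched in a few lines. This is a toy illustration only; the `model` function below is a hypothetical stand-in that scores a tiny four-token vocabulary, not a real LLM forward pass.

```python
def model(tokens):
    # Stand-in for a real LLM forward pass: deterministically scores
    # a 4-token vocabulary based on the current sequence length.
    return [1.0 if i == (len(tokens) % 4) else 0.0 for i in range(4)]

def generate(prompt_tokens, n_new):
    tokens = list(prompt_tokens)
    for _ in range(n_new):
        scores = model(tokens)                  # reprocess the whole sequence
        next_token = scores.index(max(scores))  # pick the best-scoring token
        tokens.append(next_token)               # append it and repeat
    return tokens

print(generate([0, 1], 3))  # [0, 1, 2, 3, 0]
```

Note that every new token requires another pass over the full sequence, which is why autoregressive generation is inherently sequential.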
LLMs also have a setting called temperature that controls the randomness of the output. When predicting the next token, the model assigns a probability to every possible option. A lower temperature makes the most likely token even more likely to be chosen, while a higher temperature gives the model more freedom to pick less likely tokens.
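Temperature works by scaling the model's raw scores (logits) before they are turned into probabilities. A minimal sketch, using the same 0.2 and 1.2 values discussed later in the article:

```python
import math

def softmax_with_temperature(logits, temperature):
    # Divide logits by the temperature before the softmax: a low
    # temperature sharpens the distribution toward the top token,
    # a high temperature flattens it.
    scaled = [l / temperature for l in logits]
    m = max(scaled)                          # subtract max for stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]
cold = softmax_with_temperature(logits, 0.2)
hot = softmax_with_temperature(logits, 1.2)
print(cold)  # top token dominates
print(hot)   # probability mass spreads across all tokens
```

At temperature 0.2 the first token takes nearly all the probability mass; at 1.2 the other tokens get a realistic chance of being sampled.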
The alternative is the diffusion model, most commonly used for image generation. In short, the model starts with a noisy, blurry image and iteratively removes the noise while taking the user's prompt into account, gradually steering the result toward an image that matches the request.
The model Apple released is called DiffuCoder-7B-cpGRPO, and it is based on a paper published last month, DiffuCoder: Understanding and Improving Masked Diffusion Models for Code Generation.
The paper describes a code generation model that takes a diffusion-first strategy, with one notable twist: when the sampling temperature is raised from the default 0.2 to 1.2, DiffuCoder becomes more flexible about the order in which it generates tokens, breaking away from the strict left-to-right constraint.
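Out-of-order generation in a masked diffusion model can be sketched as repeatedly filling whichever masked position the model is most confident about, not necessarily the leftmost one. The `confidence` scores and `fill` tokens below are hypothetical stand-ins; in the real model both come from a trained network.

```python
MASK = "_"

def decode_any_order(template, confidence, fill):
    # Repeatedly fill the masked position with the highest
    # confidence score, regardless of its position in the sequence.
    tokens = list(template)
    order = []
    while MASK in tokens:
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        best = max(masked, key=lambda i: confidence[i])
        tokens[best] = fill[best]
        order.append(best)
    return "".join(tokens), order

template = [MASK] * 5
confidence = [0.2, 0.9, 0.1, 0.7, 0.5]  # assumed per-position scores
fill = list("print")                     # assumed target tokens
text, order = decode_any_order(template, confidence, fill)
print(text, order)  # print [1, 3, 4, 0, 2]
```

With a low temperature the confidence ranking is effectively frozen and decoding tends back toward left-to-right order; a higher temperature lets lower-confidence positions be chosen earlier, which is the flexibility the paper reports at temperature 1.2.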
More interestingly, Apple built the model on top of Alibaba's open-source Qwen2.5-7B, converted it into a diffusion-based decoder as described in the DiffuCoder paper, and then fine-tuned it to better follow instructions. Once that was done, they trained a further version on more than 20,000 carefully selected coding examples.
On mainstream coding benchmarks, DiffuCoder-7B-cpGRPO scores 4.4% higher than the earlier diffusion-based coding model, while still not strictly depending on left-to-right code generation.