News on the morning of January 13: DeepSeek has open-sourced a completely new architecture module, "Engram", and simultaneously released a technical paper, with Liang Wenfeng once again appearing among the authors.

It is understood that the Engram module, by introducing a scalable lookup-based memory structure, gives large models a new sparsity dimension distinct from the traditional Transformer and MoE designs.
In the paper, DeepSeek notes that current mainstream large models are structurally inefficient at two kinds of tasks: lookup-style memorization that relies on fixed knowledge, and complex reasoning with compositional computation.
A traditional Transformer (whether dense or MoE) has to reconstruct these static patterns through many layers of attention and MLP, so a significant share of compute is spent on "repeatedly rebuilding known patterns".
Engram's core mechanism is an O(1) lookup memory built on modern hashed N-gram embeddings. The module slices the input token sequence into N-grams and maps them through multiple hash functions into a large static memory table, achieving constant-time retrieval.
The paper stresses that these lookups are decoupled from model size: retrieval cost stays stable even when the memory table is scaled to billions of parameters.
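For readers who want a concrete picture, below is a minimal PyTorch sketch of a hashed N-gram lookup memory of the kind the paper describes. It is not DeepSeek's implementation; the class name, the random-multiplier hash functions and all sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NGramMemory(nn.Module):
    """O(1) lookup memory: each position's trailing N-gram is hashed into a
    large static embedding table; cost per token does not grow with table size."""
    def __init__(self, table_size: int, dim: int, n: int = 3, num_hashes: int = 2):
        super().__init__()
        self.table = nn.Embedding(table_size, dim)  # the static memory table
        self.table_size = table_size
        self.n = n
        # Fixed random odd multipliers acting as cheap hash functions (an assumption).
        self.register_buffer(
            "hash_mults", torch.randint(1, 2**31 - 1, (num_hashes, n)) * 2 + 1
        )

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        # token_ids: (batch, seq_len); left-pad so every position has a full N-gram
        padded = F.pad(token_ids, (self.n - 1, 0))
        grams = padded.unfold(1, self.n, 1)                      # (B, T, n)
        # Multi-hash each N-gram into indices of the static table: (B, T, num_hashes)
        idx = (grams.unsqueeze(2) * self.hash_mults).sum(-1) % self.table_size
        # Retrieve and merge the hashed slots; constant work per token.
        return self.table(idx).sum(dim=2)                        # (B, T, dim)

# Example: mem = NGramMemory(table_size=1_000_000, dim=256)
#          vecs = mem(torch.randint(0, 32_000, (2, 16)))         # -> (2, 16, 256)
```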
Where MoE offers conditional computation, Engram offers "conditional memory": the module decides, based on the current context, whether to activate the retrieved results, and fuses them into the backbone network through a gating mechanism.
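How such a gate might look is sketched below, again as an assumption rather than the paper's exact design: a scalar gate computed from the backbone's hidden state decides, per position, how much of the retrieved memory is mixed back in.

```python
import torch
import torch.nn as nn

class GatedMemoryFusion(nn.Module):
    """Context-dependent gate that fuses retrieved memory into the backbone."""
    def __init__(self, dim: int):
        super().__init__()
        self.gate_proj = nn.Linear(dim, 1)    # hidden state -> gate logit
        self.mem_proj = nn.Linear(dim, dim)   # project retrieved memory into model space

    def forward(self, hidden: torch.Tensor, retrieved: torch.Tensor) -> torch.Tensor:
        # hidden, retrieved: (batch, seq_len, dim)
        gate = torch.sigmoid(self.gate_proj(hidden))   # (B, T, 1), in [0, 1]
        # The context decides whether the looked-up content is used at all.
        return hidden + gate * self.mem_proj(retrieved)
```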
The paper shows that Engram is typically placed in the early layers of the model, where it takes over the job of "pattern reconstruction", freeing the depth of the later layers for complex reasoning.
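Reusing the two sketches above, that placement could look roughly like the toy stack below; the layer counts and the choice to gate the memory only into the first few blocks are assumptions for illustration, not the paper's configuration.

```python
import torch.nn as nn

class ToyBackbone(nn.Module):
    """Toy stack showing early-layer memory injection (builds on the sketches above)."""
    def __init__(self, vocab: int, dim: int, depth: int = 12, memory_layers: int = 2):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.memory = NGramMemory(table_size=1_000_000, dim=dim)
        self.fuse = GatedMemoryFusion(dim)
        self.blocks = nn.ModuleList(
            nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True)
            for _ in range(depth)
        )
        self.memory_layers = memory_layers

    def forward(self, token_ids):
        h = self.embed(token_ids)
        for i, block in enumerate(self.blocks):
            if i < self.memory_layers:
                # Early layers: hand static pattern lookup to the memory module...
                h = self.fuse(h, self.memory(token_ids))
            h = block(h)  # ...so later layers spend their depth on reasoning.
        return h
```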
In an experiment at the 27B-parameter scale, DeepSeek reallocated part of the MoE expert parameters to the Engram memory table; with parameter count and compute held equal, the model improved markedly on knowledge, reasoning, code and mathematics tasks.
Technical discussion on the X platform concluded that the Engram mechanism effectively removes the need for early layers to rebuild static patterns, which in effect makes the model "deeper" where the reasoning happens.
Some developers point out that this structure frees large-scale static memory from the limits of GPU memory: because the addressing is deterministic, the table can be prefetched from host memory, keeping costs low at the inference stage.
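A rough sketch of that offloading idea follows, under the assumption that the row indices are known ahead of time because they depend only on the token ids; the table layout and the helper below are illustrative, not DeepSeek's code, and the device move assumes a CUDA GPU.

```python
import torch

table_size, dim = 1_000_000, 128
# The static memory table lives in pinned host RAM instead of GPU HBM.
host_table = torch.randn(table_size, dim).pin_memory()

def prefetch_rows(indices: torch.Tensor, device: str = "cuda") -> torch.Tensor:
    """Gather only the rows needed for the upcoming tokens and copy them to the GPU."""
    # Deterministic addressing: the indices are a pure function of the token ids,
    # so they can be computed one step ahead of the forward pass.
    rows = host_table.index_select(0, indices.cpu()).pin_memory()
    # Non-blocking host-to-device copy lets the transfer overlap with GPU compute.
    return rows.to(device, non_blocking=True)

# Example: prefetch_rows(torch.tensor([12, 7, 99_321]))  # -> (3, 128) tensor on the GPU
```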
Many observers speculate that Engram is likely to become a core technical foundation of DeepSeek's next-generation model, V4.