AI's "Finger Problem" Flops: Six Fingers Expose the Transformer's Fatal Flaws

Multiple AI models fail to count the fingers correctly in an image of a six-fingered hand. Even when the prompt explicitly states that there are six fingers, the models still insist on five. The root causes are the strong "human = five fingers" association in the training data and the lack of explicit structural constraints in the Transformer architecture, which passes state information forward in one direction only, with no way to backtrack. Diffusion models, for their part, are very good at capturing overall distribution and texture but struggle to control local discrete structure precisely. Together, these failures expose the current Achilles' heel of AI in visual reasoning and causal understanding.
