neuron-level mechanisms

  • Understanding H-Neurons in LLMs


    H-Neurons: On the Existence, Impact, and Origin of Hallucination-Associated Neurons in LLMs

    Large language models (LLMs) often produce hallucinations: outputs that seem plausible but are factually incorrect, undermining their reliability. A detailed investigation into hallucination-associated neurons (H-Neurons) finds that a very small fraction of neurons (fewer than 0.1%) reliably predicts these occurrences across varied scenarios. These neurons are causally linked to over-compliance behaviors and originate in the pre-trained base models, where they already carry predictive power for hallucination detection. Understanding such neuron-level mechanisms can help in developing more reliable LLMs by bridging the gap between observable behavior and underlying neural activity.
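    The core idea — that a tiny subset of neurons carries a hallucination signal — can be illustrated with a toy probing sketch. Everything below is a hypothetical illustration on synthetic data, not the paper's actual method: neuron scores, the 0.1% budget, and the threshold classifier are all assumptions for demonstration. A real study would use per-example hidden activations extracted from an LLM, labeled by whether the output was hallucinated.

    ```python
    import numpy as np

    # Synthetic stand-in for LLM hidden activations (hypothetical setup).
    rng = np.random.default_rng(0)
    n_samples, n_neurons = 2000, 4096
    labels = rng.integers(0, 2, n_samples)        # 1 = hallucinated output
    acts = rng.normal(size=(n_samples, n_neurons))

    # Plant a weak signal in a handful of neurons (our toy "H-Neurons").
    planted = [7, 123, 4000]
    pos = np.where(labels == 1)[0]
    acts[np.ix_(pos, planted)] += 1.5

    # Score each neuron by the gap in mean activation between classes,
    # then keep fewer than 0.1% of neurons, mirroring the paper's finding
    # that a very small fraction suffices for prediction.
    mask = labels == 1
    score = np.abs(acts[mask].mean(axis=0) - acts[~mask].mean(axis=0))
    k = max(1, int(0.001 * n_neurons))            # 0.1% budget -> 4 neurons
    h_neurons = np.argsort(score)[-k:]

    # Minimal classifier: threshold the mean activation of the selected
    # neurons to predict whether an output is hallucinated.
    feat = acts[:, h_neurons].mean(axis=1)
    preds = feat > feat.mean()
    accuracy = (preds == mask).mean()
    print(sorted(h_neurons.tolist()), round(accuracy, 2))
    ```

    On this synthetic data, the selection recovers the planted neurons and the tiny probe separates the two classes well above chance, which is the qualitative behavior the article describes for real H-Neurons.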

    Read Full Article: Understanding H-Neurons in LLMs