Auditory Hallucinations Research: Exploring the Brain's Inner Voice in Schizophrenia
Auditory verbal hallucinations (AVH) remain one of the more unsettling hallmarks of schizophrenia. These hallucinations, estimated to occur in around 70% of patients, can manifest in various ways: single or multiple voices, often derogatory, and experienced inside or outside the head.
New research published in the journal Schizophrenia Bulletin, drawing on work spanning both Australia and Hong Kong, lends credence to the idea that some hallucinated voices might be the brain’s own thoughts.
Theoretical approaches to AVH include so-called ‘cognitive’ models, which argue that they are a manifestation of non-perceptual processes, for example inner speech that fails to be labeled as internally generated, or memories whose vividness and/or intrusiveness leads them to be misinterpreted as perceptions. The other main approach, the ‘neurological’ or ‘perceptual’ model, proposes that AVH are in some sense genuinely perceptual in nature.
Complex perceptual experiences are known to occur in neurological disorders such as epilepsy and migraine, and electrical stimulation of the temporal lobe cortex in patients undergoing brain surgery can also result in auditory experiences up to and including speech.
The new data come from a rigorous test of a decades-old theory, and the study extends that theory to encompass inner speech.
By taking a closer look at the brain’s electrical signals as participants imagined saying simple syllables, the researchers wanted to find out whether the same suppression effect was taking place. In each trial, participants saw a visual countdown and, at a precise moment, either imagined saying “ba” or “bi” in their heads (the inner speech), heard one of those syllables through headphones (the audible sound), or did both simultaneously.
For most, corollary discharge helps the brain parse self-generated sensations from the outside world. When you speak aloud, for example, your brain predicts the sound of your own voice and suppresses part of the auditory response so you’re not startled by it. But what happens when that breaks down?
The Study: Brain Activity and Auditory Hallucinations
We employed a novel variant of the symptom capture paradigm and 3 T fMRI to examine 30 right-handed adult patients with a DSM-5 diagnosis of schizophrenia or schizoaffective disorder (see “Methods” for diagnostic and exclusion criteria). Fifteen of these patients (the AVH+ group) reported experiencing AVH nearly continually. The remaining 15 patients (the AVH- group) had been free of AVH for at least six months. The two groups were matched for age (t = 1.53, p = 0.14), sex (χ² = 0.75, p = 0.39), and premorbid IQ, as estimated using a word pronunciation test (t = 0.25, p = 0.81) (see “Methods” and Supplementary Table S1).
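As a quick consistency check on the sex-matching statistic reported above, the p-value of a chi-square test with one degree of freedom can be recovered from the statistic alone. The sketch below assumes the comparison was a 2 × 2 table (group by sex), which gives one degree of freedom; the function name is illustrative and not taken from the study.

```python
import math


def chi2_p_value_df1(chi2: float) -> float:
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom.

    With df = 1 the statistic is the square of a standard normal variable,
    so P(X > chi2) = erfc(sqrt(chi2 / 2)).
    """
    if chi2 < 0:
        raise ValueError("chi-square statistic must be non-negative")
    return math.erfc(math.sqrt(chi2 / 2.0))


# Reported statistic for the sex comparison (df = 1 is an assumption).
p_sex = chi2_p_value_df1(0.75)
print(round(p_sex, 2))
```

Under that assumption the result rounds to the reported p = 0.39, so the statistic and p-value are mutually consistent.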
During the functional run (lasting 10 min 10 s) the AVH+ patients were instructed to press a button with their right index finger each time they heard a voice (frequency of button press during scanning ranged from 5 to 174 times, mean 43.53, SD = 49.20, median = 23). During the same run 40 randomly timed examples of real speech were also delivered to both ears via MRI-compatible headphones, to which the patients had to respond with the left index finger (see Fig. 1). The real speech was individually tailored to be similar in form to each patient’s AVH. To achieve this, prior to scanning the patients were asked to repeat out loud what their voices said, as they heard them, over a 5-min period, and their verbatim responses were tape-recorded and transcribed (for more details, see “Methods”). Examples, which took the form of single words, short phrases or sentences such as ‘The good boy’ or ‘You will change the world’, were then recorded for presentation during scanning in a neutral voice by an individual of the same gender as the hallucinated voice, as reported by each patient. The real speech stimuli were separated by random intervals ranging from 3 to 30 s (mean = 14.94 s, SD = 7.06); stimulus duration ranged from 0.53 to 3.22 s (mean = 1.33 s, SD = 0.67).

[Figure: Representation of brain activity during auditory hallucinations]
Key Findings
Healthy participants showed the expected dampening of the N1 wave, an early EEG marker of auditory processing, when their imagined syllable matched the heard one. The patients with hallucinations showed the opposite effect. Instead of the normal suppression, their N1 response jumped up. It was as if the sound of their imagined voice had become more salient, not less.
Patients who reported no hallucinations displayed a different, but still altered, pattern. Overall, the degree of N1 disruption correlated with the severity of hallucinations on standardized rating scales.
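The suppression effect described here can be expressed as a simple difference score. The sketch below is only an illustration: the 80 to 120 ms N1 window, the function names, and the waveforms are all assumptions made for the example, not values or methods taken from the study. Because the N1 is a negative deflection, a positive score means the response was dampened when inner speech matched the heard syllable, and a negative score means it was enhanced.

```python
from statistics import mean


def n1_amplitude(erp_uv, times_ms, window=(80, 120)):
    """Mean ERP amplitude (microvolts) inside the assumed N1 window."""
    values = [v for v, t in zip(erp_uv, times_ms) if window[0] <= t <= window[1]]
    if not values:
        raise ValueError("no samples fall inside the N1 window")
    return mean(values)


def n1_suppression(listen_only, inner_match, times_ms):
    """Positive = N1 is smaller (less negative) with matching inner speech."""
    return n1_amplitude(inner_match, times_ms) - n1_amplitude(listen_only, times_ms)


# Made-up waveforms sampled every 20 ms; NOT data from the study.
times_ms = [0, 20, 40, 60, 80, 100, 120, 140, 160]
listen_only = [0.0, -0.5, -1.0, -2.5, -4.0, -5.0, -4.0, -2.0, -0.5]
inner_match = [0.0, -0.4, -0.8, -1.8, -3.0, -3.5, -3.0, -1.5, -0.4]

index = n1_suppression(listen_only, inner_match, times_ms)
print(f"suppression index: {index:.2f} microvolts")
```

With these invented numbers the index is positive, which corresponds to the pattern reported for healthy participants; swapping the two waveforms gives the negative index that would correspond to an enhanced N1.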
The study provides rare physiological evidence that inner-speech-induced suppression crumbles in schizophrenia, particularly among those hearing voices. Notably, even patients not hallucinating still showed abnormal patterns, hinting that this dysfunction might serve as a biomarker for schizophrenia spectrum disorders more broadly.
If future longitudinal studies replicate these results, the neural signatures could become powerful clinical tools. Ultimately, the study reinforces just how fragile our sense of self can be, and how much it relies on precise brain timing. For most of us, that voice inside the head is comfortably familiar.
Detailed Results and Activations
Activations in Response to AVH and Real Auditory Stimuli
Findings using whole brain, voxel-based analyses, with an initial threshold of z = 3.1 (p < 0.001) and cluster-corrected for multiple comparisons at p < 0.05, are shown in Fig. 2 (for full details of the data analysis see “Methods”; MNI coordinates for all clusters are given in Supplementary Table S2). As a group, the AVH+ patients showed no activation in most of the superior temporal cortex when they experienced AVH, including its posterior portion which contains the primary auditory cortex (Heschl’s gyrus). The only exception was a bilateral cluster in the extreme posterior superior temporal gyrus and the adjacent supramarginal gyrus, which on the left includes the regions usually identified as Wernicke’s area.
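The pairing of z = 3.1 with p < 0.001 follows from the standard normal distribution. The sketch below assumes a one-tailed test on a standard normal statistic; the function name is illustrative, and the source does not state which software performed the thresholding.

```python
from statistics import NormalDist


def one_tailed_p(z: float) -> float:
    """Probability that a standard normal variable exceeds z."""
    return 1.0 - NormalDist().cdf(z)


# Voxel-level threshold quoted in the text.
p_threshold = one_tailed_p(3.1)
print(f"{p_threshold:.5f}")
```

Under that assumption the tail probability falls just below 0.001, which matches the voxel-level threshold quoted in the text.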

[Figure: Brain lobes and activation areas]
Experience of AVH was, however, associated with activations in circumscribed regions outside the temporal lobe. In contrast, when hearing real speech, the AVH+ patients showed activation along the length of the superior temporal cortex bilaterally, as well as in areas outside this (see Fig. 2, panel B). The extra-temporal areas activated largely overlapped with, but were more extensive than, the regions activated by experience of hallucinations. As shown in Fig. 3, Heschl’s gyrus was robustly activated in response to real speech, but activation barely rose above baseline in response to AVH. In contrast, activation levels for both real speech and AVH were similar in the two regions generally accepted as comprising Broca’s area, the left inferior frontal gyrus, pars opercularis and pars triangularis, and in its homologue on the right. This was also the case for the anterior and posterior portions of the supramarginal gyrus, which on the left overlap with Wernicke’s area. Finally, activations were similar for AVH and real speech in the precentral gyrus and supplementary motor area; these activations may have reflected the effect of button-pressing, but the ventral premotor cortex has also been suggested to play a role in speech perception.

[Figure: Task-related activation in anatomically defined ROIs for auditory perception, language processing and motor regions]
Interfering Effects of Real Auditory Stimuli
Given that the design of the study meant that AVH occurred in the same blocks as the presentation of real auditory stimuli, it needs to be considered whether auditory cortex activations produced by real speech might have obscured activations to AVH occurring very soon afterwards. As a first test of this, we measured the correlation between the regressors for AVH and real speech in the individual GLMs in our first-level model. In all cases, this was close to zero or negative (mean = −0.14, range = +0.04 to −0.48). This finding does not suggest that temporal co-occurrence between AVH and the real speech stimuli was playing an important role.
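The regressor check described above can be sketched as a Pearson correlation between two event time courses. This is a simplified illustration, not the study's pipeline: the repetition time, number of scans, and event timings are hypothetical, the helper names are invented for the example, and the regressors are plain boxcars, whereas a real first-level model would normally convolve them with a haemodynamic response function.

```python
import math


def boxcar(onsets, durations, n_scans, tr):
    """Return a 0/1 regressor with one sample per scan (no HRF convolution)."""
    reg = [0.0] * n_scans
    for onset, dur in zip(onsets, durations):
        first = int(onset // tr)
        last = int((onset + dur) // tr)
        for i in range(first, min(last, n_scans - 1) + 1):
            reg[i] = 1.0
    return reg


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical timings in seconds; NOT data from the study.
tr = 2.0
n_scans = 30
avh = boxcar([4.0, 30.0, 50.0], [2.0, 2.0, 2.0], n_scans, tr)
speech = boxcar([14.0, 40.0], [1.0, 1.0], n_scans, tr)

r = pearson(avh, speech)
print(round(r, 2))
```

When the two event types never coincide, as in this toy example, the correlation comes out slightly negative, which is the kind of value the text describes as evidence against systematic temporal co-occurrence.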
The findings of a further analysis, based on a smaller number of captured events, are shown in Supplementary Figure S1 and Supplementary Table S4; the pattern of activation remained similar, although with smaller cluster sizes reflecting the smaller number of events captured, and the auditory cortex continued to be uninvolved. As in the original analysis, we found a small cluster of activation in the extreme posterior superior temporal gyrus, in this case only in the right hemisphere, roughly overlapping with the right homologue of Wernicke’s area.
Discussion
Contradicting several earlier studies using the symptom capture paradigm as well as meta-analyses of such studies, we found nothing to suggest that the experience of AVH is associated with auditory cortex activation. This failure, coupled with the fact that perception of formally similar real speech strongly activated a large expanse of the superior temporal cortex, would seem to exclude theoretical approaches to AVH in schizophrenia that invoke abnormal neuronal activity in the auditory cortex.
This then implies that some version of the cognitive model of AVH must be correct. It should be noted that the findings of our study make one theory of this type, that AVH are misinterpreted vivid or intrusive memories, unlikely. There was no hint of a default mode network pattern of activation in response to AVH in our study. Nor have meta-analyses of symptom capture studies suggested such an activation pattern.
We did find activations outside the temporal lobe during experience of AVH: these involved Wernicke’s area and its right homologue, Broca’s area and its right homologue, and the precentral gyrus and supplementary motor area bilaterally. Given the lack of accompanying auditory cortex activation, any speech processing reflected in these activations would presumably be at the level of decoding the linguistic properties of speech rather than its initial detection and the analysis of its auditory perceptual qualities.
Based on functional imaging studies, Broca’s area, the precentral cortex and the supplementary motor area are regarded as core regions subserving working memory, specifically its verbal non-executive component, the articulatory or phonological loop. Interestingly, a further region is implicated in verbal short-term memory, the left supramarginal gyrus, which has been argued to fulfil the temporary storage or ‘buffering’ function of the articulatory/phonological loop. Since verbal short-term memory equates to some extent with the concept of inner speech, our findings could therefore be interpreted as providing support for Frith’s mislabelled inner speech theory of AVH.
Our findings pertain to AVH as experienced by patients with schizophrenia. However, it is now well documented that around 6% of healthy adults also report having experienced AVH. To date, two symptom capture studies have examined the functional imaging correlates of AVH in such ‘healthy voice hearers’.