The Psychology of Sound Localization
The Psychology of Sound Localization
The ability to locate sounds in our environment is an important part of hearing, roughly analogous to the way we perceive depth in the visual field. Sound localization is the process by which the brain determines the origin of a sound in space, an ability that is crucial for navigating the environment and responding to auditory stimuli. The location of a sound is defined on three dimensions: distance, elevation, and azimuth. To pinpoint a source, the brain exploits several auditory cues, chiefly differences in the timing and intensity of the sound reaching each ear.

Monaural and Binaural Cues
Localizing sound involves both monaural and binaural cues. Each pinna interacts with incoming sound waves differently, depending on the location of the sound source relative to the body. This interaction provides a monaural cue that helps us locate sounds above or below us and in front of or behind us.
Binaural cues, on the other hand, provide information on the location of a sound along a horizontal axis by relying on differences in patterns of vibration of the eardrum between our two ears. If a sound comes from an off-center location, it creates two types of binaural cues: interaural level differences and interaural timing differences.
Interaural Level Difference (ILD)
Interaural level difference refers to the fact that a sound coming from the right side of your body is more intense at your right ear than at your left ear because of the attenuation of the sound wave as it passes through your head. This attenuation is strongest for high-frequency sounds, whose short wavelengths do not diffract around the head, so ILD is most informative at high frequencies.
Interaural Timing Difference (ITD)
Interaural timing difference refers to the small difference in the time at which a given sound wave arrives at each ear. For a typical adult head, this difference reaches at most roughly 0.6-0.7 ms for a sound arriving directly from one side.
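As a rough illustration, the ITD produced by a source at a given azimuth is often approximated with Woodworth's classic spherical-head formula. The head radius and speed of sound below are assumed typical values, not measurements:

```python
import math

def woodworth_itd(azimuth_deg, head_radius_m=0.0875, speed_of_sound=343.0):
    """Approximate the interaural timing difference (ITD) for a spherical
    head using Woodworth's formula: ITD = (r / c) * (theta + sin(theta)),
    where theta is the azimuth in radians measured from straight ahead."""
    theta = math.radians(azimuth_deg)
    return head_radius_m / speed_of_sound * (theta + math.sin(theta))

# A source directly to one side (90 degrees) yields the maximum ITD,
# about 0.66 ms for the assumed head radius; straight ahead it is zero.
print(woodworth_itd(90))
print(woodworth_itd(0))
```

Note how the formula captures both the extra path around the curved head (the `theta` term) and the straight-line offset between the ears (the `sin(theta)` term).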

Brain Processing and Neural Mechanisms
All the information the brain has about a sound comes from the auditory nerve. The firing rate of auditory neurons depends on the spectrum and the level of the sound. In addition to firing rate, auditory neurons have the remarkable ability to "phase lock" to a stimulus at frequencies up to 1-2 kHz.
The lateral superior olive (LSO) contains neurons responsible for extracting ILD cues; these neurons receive excitatory inputs from the ipsilateral ear and inhibitory inputs from the contralateral ear. The medial superior olive (MSO) contains neurons responsible for extracting ITD cues; these neurons receive excitatory inputs from both the ipsilateral and contralateral ears. There is specialized anatomy that preserves the timing of the neural firings, so that these excitatory-excitatory (EE) neurons can act as coincidence detectors and fire only when they receive ipsilateral and contralateral input at the same time. In some species, intricate axonal "delay lines" change the relative arrival times of the contralateral and ipsilateral inputs, so that each MSO neuron is tuned to a particular ITD.
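This delay-line, coincidence-detection scheme (the classic Jeffress model) is often modeled computationally as a cross-correlation between the two ear signals: each candidate internal delay plays the role of one MSO neuron, and the delay that best cancels the external ITD gives the strongest response. A minimal sketch, using a synthetic 500 Hz tone and an assumed 44.1 kHz sample rate:

```python
import numpy as np

def estimate_itd(left, right, fs):
    """Model MSO coincidence detection as a bank of internal delays:
    cross-correlate the ear signals and pick the lag with the strongest
    'coincidence'. Returns the ITD in seconds; positive means the sound
    reached the left ear first."""
    corr = np.correlate(right, left, mode="full")
    lags = np.arange(-len(left) + 1, len(right))
    return lags[np.argmax(corr)] / fs

fs = 44100
t = np.arange(0, 0.02, 1 / fs)
tone = np.sin(2 * np.pi * 500 * t)   # 500 Hz: well within phase-locking range
delay_samples = 20                   # external ITD of 20 samples (~0.45 ms)
left = tone
right = np.concatenate([np.zeros(delay_samples), tone[:-delay_samples]])
print(estimate_itd(left, right, fs))  # recovers ~0.45 ms
```

The 500 Hz carrier matters: above the 1-2 kHz phase-locking limit, the fine-structure timing needed for this computation is no longer available to real neurons.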
The ITD and ILD cues extracted in the brainstem are then integrated in the auditory cortex, where both are analyzed to determine sound direction.
Factors Influencing Sound Localization
Sound localization is influenced by the shape of the outer ear, or pinna, which affects how sound waves are captured and funneled into the ear canal. Different frequencies are enhanced or diminished depending on the contours of the pinna. This filtering effect gives the brain additional information about the direction of a sound source, enabling better localization.
Listeners can adapt to different listening environments, meaning they can improve their localization abilities based on experience or changes in their surroundings.
The Role of Head Motion
Head motion is probably not a critical part of the localization process, except when time permits a more detailed assessment of location, in which case a listener may turn the head toward the putative source. Even then, localization is only moderately more precise when the listener points directly at the source; the process is not analogous to fixating a visual target on the fovea of the retina. Head motion therefore provides only a modest increase in localization accuracy.
Changes in Distance, Elevation, and Azimuth
When the distance between a listener and a sound source changes, the overall level changes, as does the relative level of direct and reverberant sound energy.

When the elevation changes, the overall level and the direct-to-reverberant ratio stay roughly constant. Instead, the pinna, and to some extent the head and body, shape the spectrum of the sound in an elevation-dependent way. This so-called pinna filtering is typically described as introducing a notch in the spectrum around 8 kHz, whose frequency varies with elevation.

Changing the azimuth of a sound changes the relative levels at the two ears as well as the relative timing between the ears. The interaural level difference (ILD) arises from the head-shadow effect: the head blocks some of the sound from reaching the "far" ear.
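The distance-dependent change in overall level can be sketched with the inverse-square law, which applies to the direct sound in a free field (a simplifying assumption; in a real room the reverberant energy stays roughly constant, so the direct-to-reverberant ratio falls as the source recedes):

```python
import math

def level_change_db(ref_distance_m, new_distance_m):
    """Change in direct-sound level (dB) when a source moves from
    ref_distance_m to new_distance_m, under the inverse-square law:
    each doubling of distance drops the level by about 6 dB."""
    return 20 * math.log10(ref_distance_m / new_distance_m)

print(level_change_db(1.0, 2.0))  # doubling distance: about -6 dB
print(level_change_db(1.0, 4.0))  # quadrupling: about -12 dB
```

This is why a receding talker sounds both quieter and more "distant": the direct sound drops while the room reverberation does not.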
Implications of Impaired Sound Localization
Impaired sound localization due to hearing loss can significantly affect an individual's ability to navigate their environment safely and communicate effectively. When someone struggles to identify where sounds are coming from, it can lead to difficulties in social interactions and an increased risk of accidents. Difficulties in sound localization can occur with hearing impairments or damage to specific brain regions responsible for processing auditory information.
Summary of Spatial Cue Usage
In keeping with our promise earlier in this review, we summarize here the process by which we believe spatial cues are used for localizing a sound source in a free-field listening situation. We believe it entails two parallel processes:
- The azimuth of the source is determined using differences in interaural time or interaural intensity, whichever is present. Wightman and colleagues (1989) believe the low-frequency temporal information is dominant if both are present.
- The elevation of the source is determined from spectral shape cues. The received sound spectrum, as modified by the pinna, is in effect compared with a stored set of directional transfer functions. These are actually the spectra of a nearly flat source heard at various elevations. The elevation that corresponds to the best-matching transfer function is selected as the locus of the sound. Pinnae are similar enough between people that certain general rules (e.g. Blauert's boosted bands or Butler's covert peaks) can describe this process.
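The template-matching idea in the second step can be sketched as follows. The directional transfer functions below are hypothetical toy data (a spectral notch whose center frequency rises with elevation), not measured pinna responses:

```python
import numpy as np

def estimate_elevation(received_spectrum_db, transfer_functions_db):
    """Compare the received (pinna-filtered) spectrum against a stored set
    of directional transfer functions and return the elevation whose
    template matches best (minimum mean-squared error in dB)."""
    best_elev, best_err = None, np.inf
    for elev, template in transfer_functions_db.items():
        err = np.mean((received_spectrum_db - template) ** 2)
        if err < best_err:
            best_elev, best_err = elev, err
    return best_elev

# Toy templates: notch center frequency rises with elevation (assumed model).
freqs = np.linspace(4000, 12000, 64)
def notch_spectrum(center_hz):
    return -10 * np.exp(-((freqs - center_hz) / 800) ** 2)  # dB re flat

templates = {elev: notch_spectrum(6000 + 30 * elev) for elev in range(-40, 60, 20)}
rng = np.random.default_rng(0)
received = notch_spectrum(6600) + rng.normal(0, 0.5, freqs.size)  # noisy observation
print(estimate_elevation(received, templates))  # best match: elevation 20
```

The key assumption, as noted above, is that the source spectrum is nearly flat; a source with its own spectral notch could be mistaken for one at a different elevation.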
Humans can typically locate sounds within a few degrees of accuracy, demonstrating a highly refined auditory system.
| Cue Type | Description | Neural Processing | Influencing Factors |
|---|---|---|---|
| Monaural Cues | Interaction of sound waves with the pinna, providing information about elevation and front/back location. | Pinna filtering shapes the sound spectrum, introducing notches that vary with elevation. | Shape of the pinna, individual differences in ear structure. |
| Binaural Cues | Differences in sound patterns between the two ears, providing information about horizontal location. | Interaural Level Difference (ILD) processed in LSO, Interaural Timing Difference (ITD) processed in MSO. | Head size, frequency of sound, neural timing precision. |
| Adaptation | Improvement in localization abilities based on experience or changes in the listening environment. | Auditory cortex refines processing of ITD and ILD cues. | Experience in specific environments, feedback from head movements. |