The Science of Sound Localization: How We Pinpoint Sound
Sound localization is the process by which the human brain determines the origin of a sound in space. The ability to identify where a sound is coming from is crucial for navigating the environment and responding to auditory events. The brain uses auditory cues, chiefly the differences in timing and intensity of sounds reaching each ear, to pinpoint the source.

The Remarkable Precision of Human Hearing
Most people with normal hearing can locate sound sources with impressive accuracy, typically within 5 degrees of the actual location. To put this in perspective, that's roughly the width of three fingers held together at arm's length. This precision is remarkable considering that the brain accomplishes it using subtle differences in timing and volume between our two ears.
The Five-Degree Advantage
That remarkable 5-degree localization precision gives normal-hearing individuals a significant advantage in challenging listening environments. When two people speak simultaneously from positions separated by at least 5 degrees, those with healthy hearing can mentally separate these sound streams into distinct conversations.
This means that in a typical restaurant setting, someone with normal hearing localization abilities can:
- Focus on their dining companion's words
- Tune out neighboring tables' conversations
- Switch attention when necessary
- Participate in group discussions without missing key information
Key Factors in Sound Localization
The fundamental auditory cues to direction arise because the ears are on either side of the head. The sound from a source on the right side of the head will arrive at the left eardrum after it arrives at the right eardrum, because the left ear is further away, and it will also be lower in level at the left than at the right because the head is a solid object and so casts an acoustic shadow. These differences are termed, respectively, interaural time differences (ITDs) and interaural level differences (ILDs).
The auditory system likely uses some form of internal mapping to decode ITDs, whereas the analysis of ILDs may be as simple as a comparison of the sound levels at the two ears. It is important to remember that although ITDs and ILDs are usually described separately, and often manipulated separately in experiments, the sound from any real source always has both an ITD and an ILD, even if either one (or both) is zero.
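To make the two cues concrete, here is a minimal sketch in Python that estimates both from a two-channel recording: the ITD as the lag of the cross-correlation peak, and the ILD as the ratio of RMS levels in decibels. This is a signal-processing analogue of the cues, not a model of the neural mechanisms, and the function name and toy stimulus are illustrative assumptions:

```python
import numpy as np

def estimate_itd_ild(left, right, fs):
    """Estimate the ITD (lag of the cross-correlation peak, in seconds)
    and ILD (RMS level difference, in dB) of a two-channel recording.
    Both are positive when the sound leads, or is louder, at the left ear."""
    corr = np.correlate(right, left, mode="full")
    lag = np.argmax(corr) - (len(left) - 1)  # samples the right ear lags by
    itd = lag / fs
    rms = lambda x: np.sqrt(np.mean(x ** 2))
    ild = 20 * np.log10(rms(left) / rms(right))
    return itd, ild

# Toy check: a click arriving 0.5 ms earlier and 6 dB louder at the left ear.
fs = 48000
click = np.zeros(1024)
click[100] = 1.0
left = 2.0 * click          # twice the amplitude = +6 dB
right = np.roll(click, 24)  # delayed by 24 samples = 0.5 ms at 48 kHz
print(estimate_itd_ild(left, right, fs))  # approximately (0.0005 s, 6.0 dB)
```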
To a first approximation, the relationship between direction and ITD follows from a simple geometrical calculation: the additional distance to the far ear divided by the speed of sound. If the head is assumed to be spherical, and the source of sound is sufficiently far away for the wavefronts to be planes, then the additional distance is given by r(θ + sin θ), where r is the radius of the head (about 9 cm) and θ is the azimuth of the sound source, in radians.
For instance, a source located slightly to the side, say 10° to the right, gives an additional distance of about 3 cm and an ITD of about 90 μs, whereas a source located as far to the left as possible (−90°) gives an ITD of about −670 μs. Woodworth's formula r(θ + sin θ) accounts very well for measurements of the ITDs of clicks from loudspeakers and of high-frequency pure tones, but it fails for low-frequency pure tones, in that the measured ITDs are larger than the formula predicts by a few hundred microseconds.
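These numbers are easy to reproduce. Here is a minimal sketch of Woodworth's formula in Python; the function name and default values are ours, with 343 m/s as a standard value for the speed of sound in air:

```python
import math

def woodworth_itd(azimuth_deg, head_radius_m=0.09, speed_of_sound=343.0):
    """ITD (in seconds) for a distant source, from Woodworth's
    spherical-head path-length formula r * (theta + sin(theta))."""
    theta = math.radians(azimuth_deg)
    extra_path = head_radius_m * (theta + math.sin(theta))  # metres
    return extra_path / speed_of_sound

for az in (10, 45, 90):
    print(f"{az:3d} deg -> ITD = {woodworth_itd(az) * 1e6:4.0f} us")
# 10 deg -> ~91 us, 45 deg -> ~392 us, 90 deg -> ~675 us
```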
Empirical observations also show that the presence of the torso and any clothing on the subject can affect the values of ITD. The magnitudes of ILDs also vary with direction and frequency. They are generally larger at higher frequencies and are mostly larger at larger azimuths. However, unlike ITDs, there are sharp dips in ILD at some frequencies but strong peaks at others, and an ILD found for one direction may bear little resemblance to the ILD found for a neighboring direction.
The dips and peaks occur because of diffraction and reflection of the incoming sound by the torso, head, and pinnae, and they can be of the order of 20 dB. These spectral features are crucial for distinguishing up from down and front from back.
The Duplex Theory
The standard explanation of the midfrequency maximum in localization error is that people use ITDs to locate sound sources at lower frequencies but not at higher ones, and ILDs at higher frequencies but not at lower ones. The theory is known as the Duplex Theory; it was developed around a century ago by J. W. Strutt (Lord Rayleigh) and is described in many standard textbooks.
Its explanation at low frequencies is that the magnitudes of the ITDs can be relatively large, whereas the magnitudes of the ILDs are always relatively small, because the wavelengths are too long for diffraction to be substantial. At high frequencies, its explanation is that the magnitudes of the ILDs can be much larger, because the far shorter wavelengths make the head shadow a significant factor and give more scope for constructive or destructive interference. There are still substantial ITDs at high frequencies, but the neural mechanisms for decoding them fail to work (at least with high-frequency pure tones), most likely because phase locking to the waveform's fine structure breaks down.
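The wavelength argument behind the theory is easy to check with a few illustrative numbers (the 18-cm head diameter is a round-number assumption):

```python
SPEED_OF_SOUND = 343.0  # m/s, in air
HEAD_DIAMETER = 0.18    # m, a round-number assumption

for freq in (200, 500, 1500, 4000, 8000):
    wavelength = SPEED_OF_SOUND / freq
    print(f"{freq:5d} Hz: wavelength = {wavelength * 100:5.1f} cm "
          f"= {wavelength / HEAD_DIAMETER:4.1f} head diameters")
# Below about 1 kHz the wavelength dwarfs the head, so the head casts
# little shadow and ILDs stay small; well above 2 kHz the head is an
# effective obstacle and ILDs can reach the order of 20 dB.
```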
Over a century after its formulation, the Duplex Theory's dichotomy of low frequency = ITD versus high frequency = ILD remains valid, at least for pure-tone stimuli. But experiments on the just-noticeable differences (JNDs) for ITD and ILD demonstrate that the auditory system can discriminate ITDs and ILDs at all frequencies, as there is no frequency at which either JND is impossible to measure.
Presumably, the system can therefore use the information at any frequency to help determine the direction of a sound source: for instance, envelope ITDs at high frequencies may be useful for locating sources in reverberation. This would make sense as a strategy, as in most natural listening circumstances the target sound may not simply be a low-frequency or high-frequency pure tone presented in quiet. It is more likely to be mostly broadband, partially masked by various backgrounds, and continually changing in instantaneous level and spectrum, with ITD cues and ILD cues at all frequencies.
The cues will be changing moment to moment and frequency to frequency. The binaural cues themselves are first extracted in the brainstem, where the superior olivary complex compares the timing and level of the signals at the two ears; this information is then combined at higher stages, up to and including the auditory cortex, to determine sound direction.
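As one illustration of how high-frequency timing information can still be used, the sketch below computes an envelope ITD by cross-correlating the Hilbert envelopes of the two ear signals rather than their waveforms. This is a standard signal-processing recipe, not a claim about the auditory system's specific mechanism:

```python
import numpy as np
from scipy.signal import hilbert

def envelope_itd(left, right, fs):
    """Estimate an envelope ITD: timing is taken from the slow amplitude
    fluctuations (Hilbert envelopes) rather than the fine structure,
    which the binaural system cannot follow at high frequencies."""
    env_left = np.abs(hilbert(left))
    env_right = np.abs(hilbert(right))
    env_left = env_left - env_left.mean()     # remove DC so the peak
    env_right = env_right - env_right.mean()  # reflects timing, not offset
    corr = np.correlate(env_right, env_left, mode="full")
    lag = np.argmax(corr) - (len(left) - 1)
    return lag / fs  # positive when the envelope leads at the left ear
```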
Sound Localization in Daily Life
This ability affects many aspects of our daily existence:
Social Connection
In busy cafés, restaurants, or family gatherings, sound localization helps us stay connected. We can lean toward a friend's voice across the table while mentally "turning down the volume" on the conversations happening just a few feet away.
Safety and Awareness
When crossing busy streets or navigating crowded spaces, sound localization helps us identify potential hazards: an approaching car, someone calling a warning, or other environmental cues that keep us safe.
Professional Settings
Many work environments demand effective communication in noisy conditions. Whether you're in an open-plan office, a factory floor, or a busy hospital, your ability to focus on relevant speech while filtering out background noise directly impacts your effectiveness.
Entertainment Experiences
From enjoying surround sound at movies to appreciating the spatial placement of instruments in music, sound localization enhances our entertainment experiences and helps create immersion.
The Cocktail Party Effect
Perhaps nowhere is sound localization more valuable than at social gatherings, in what scientists call the "cocktail party effect." This term describes our ability to focus on a single conversation while filtering out competing noise. It's a complex feat our brains perform almost effortlessly (when our hearing system is working optimally).
Thanks to our precise sound localization abilities, we can:
- Focus on someone speaking directly to us while ignoring background conversations
- Switch attention between different speakers at will
- Follow multiple conversation threads happening around us
- Quickly turn toward new sounds that might be important
The Challenge of Aging Hearing
As we age, however, this precision often diminishes. Many older adults find they need speakers to be separated by significantly more than 5 degrees, sometimes 15, 30, or 45 degrees or more, to effectively distinguish between them. This degradation in sound localization ability explains why many older individuals struggle in environments that younger people navigate with ease.
That busy restaurant that seems merely "energetic" to a 30-year-old can become an incomprehensible wall of noise to someone in their 70s. What's perceived as a slight background hum by younger diners might completely overwhelm an older person's ability to focus on the conversation at their own table.
This difference isn't about paying attention or cognitive ability; it reflects actual changes in how the auditory system processes and localizes sound, reminding us that hearing challenges deserve our understanding and accommodation, not frustration or dismissal. As our population ages, designing environments and technologies that support better sound localization will become increasingly important for maintaining social connection and quality of life for everyone.
Hearing Impairment and Sound Localization
In general, hearing-impaired listeners perform worse in spatial-hearing experiments than those with normal hearing. For example, Hausler, Colburn, and Marr (1983) measured minimum audible angles (MAAs) for white-noise stimuli as part of a comprehensive set of experimental tests on spatial hearing. For presentation from the side, they found that the smallest MAA in a group of listeners with bilateral sensorineural loss was 7°, and about half of the listeners gave MAAs of 30° or more. In contrast, all the normal-hearing controls gave MAAs of 12° or less.
A second example comes from Neher, Laugesen, Jensen, and Kragelund (2011), who measured the highest frequency at which listeners could discriminate an interaural phase difference (IPD) of 0° from 180° for a pure-tone stimulus. They found a mean highest frequency of just 850 Hz in a group of older hearing-impaired listeners, whereas they found a mean highest frequency of 1230 Hz for a control group of younger normal-hearing listeners. Moreover, the across-listener range of highest frequencies was wider in the hearing-impaired listeners: around 300 to 1250 Hz, whereas it was 900 to 1500 Hz in the control group.
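For concreteness, the kind of stimulus used in such an experiment is easy to sketch: a pure tone that is either identical at the two ears (0° IPD) or inverted at one ear (180° IPD). The function name and parameters below are illustrative assumptions:

```python
import numpy as np

def ipd_tone(freq_hz, ipd_deg, fs=48000, dur_s=0.5):
    """Stereo pure tone with a fixed interaural phase difference (IPD)."""
    t = np.arange(int(fs * dur_s)) / fs
    left = np.sin(2 * np.pi * freq_hz * t)
    right = np.sin(2 * np.pi * freq_hz * t + np.radians(ipd_deg))
    return np.column_stack([left, right])  # one column per ear

diotic = ipd_tone(500, 0)        # identical at the two ears
antiphasic = ipd_tone(500, 180)  # waveform inverted at one ear
```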
Background sounds generally exacerbate the decrements in performance by hearing-impaired listeners, even beyond what would be expected simply from reduced audibility. One example is from Lorenzi, Gatehouse, and Lever (1999), who measured the accuracy of reporting the spatial direction of a 300-ms, broadband click train partially masked by a spatially diffuse white noise. At an SNR of −6 dB, the mean error in reported direction from their four hearing-impaired listeners was about 50°, but it was only about 25° for a control group of four normal-hearing listeners.
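To be concrete about what an SNR of −6 dB means here: the masking noise is scaled so that its power is about four times that of the target. A minimal sketch (the function name is ours):

```python
import numpy as np

def mix_at_snr(signal, noise, snr_db):
    """Scale the noise so that signal power / noise power equals
    10**(snr_db / 10), then return the mixture. At -6 dB SNR the
    noise power is roughly four times the signal power."""
    p_signal = np.mean(signal ** 2)
    p_noise = np.mean(noise ** 2)
    gain = np.sqrt(p_signal / (p_noise * 10 ** (snr_db / 10)))
    return signal + gain * noise
```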
However, though most experiments on spatial hearing by hearing-impaired listeners have reported some decrement in performance, it is not true that hearing impairment leads to substantial spatial impairment in all tasks. Another of Hausler et al.'s (1983) results demonstrates this. For measurements of the MAAs of sounds presented from the front, they recorded values of 6° or less for a group of listeners with bilateral sensorineural loss and 4° or less for the normal-hearing listeners. The mean difference is negligible: about the visual width of the thumb held at arm's length (O'Shea, 1991).
Impact of Hearing Aids
Hearing aids do not improve the localization of sound sources: indeed, in many cases, they interfere. A few examples follow; all found larger errors in horizontal localization for aided than unaided listening.
First, Drennan, Gatehouse, Howell, van Tasell, and Lund (2005) compared the accuracy in localization for aided versus unaided listening, using single words in a speech-shaped noise at an SNR of 0 dB. Despite 10 to 15 weeks of acclimatization to the hearing aids, the localization errors when the listeners (n = 7) were tested aided were generally equal to or larger than when tested unaided (both about 30° to 35°). Before acclimatization, aided accuracy was poorer still, with errors reaching as much as 45°.
Second, Van den Bogaert, Klasen, Moonen, van Deun, and Wouters (2006) measured localization accuracy for older listeners (n = 10, aged 44-79) with their hearing aids set to adaptive-directional mode, set to omnidirectional mode, or not used. They used various stimuli, including telephone rings, a stimulus with clear ecological validity for localization: when one hears a telephone ring, one often wants to know where it is. Aided performance was, on the whole, worse than unaided: the mean errors were 18° (adaptive-directional), 16° (omnidirectional), and 13° (unaided), whereas the mean error was just 4° for a control group of younger normal-hearing listeners (aged 20-25).
Third, Keidser, O'Brien, Hain, McLelland, and Yeend (2009) compared the accuracy for aided versus unaided versus normal listening, using a variety of different kinds of stimuli. The mean errors in localization were about 20°, 20°, and 10°, respectively. A follow-up experiment with different hearing-aid microphone modes showed that performance depended on the combination of stimulus and directionality: with a pink noise, a fully directional microphone gave remarkably high localization errors of around 35°, even after 3 weeks of acclimatization, whereas omnidirectional microphones (or two types of partially directional microphones) gave errors of around 20° to 25°.
Fourth, Best et al. (2010) compared the accuracy for localizing single-word stimuli by aided listeners with two types of hearing aid, unaided hearing-impaired listeners, and normal-hearing controls. They found mean localization errors of about 14°, 14°, and 8°, respectively.
The directional microphones often found on hearing aids can also interfere with the perception of direction. A recent example was reported by Brimijoin, Whitmer, McShefferty, and Akeroyd (2014), who measured how people orientated to new targets in multitalker, multiangle babble. Two groups of aided listeners participated, one group whose own hearing aids were fairly directional (n = 8) and one whose aids were not directional (n = 7). Brimijoin et al. found that the directional group generally took longer to orientate to the target (for some target angles, by as much as half a second) and made more initial misorientations.
A domain in which hearing aids can cause particular problems is in distinguishing sound sources in front from those behind. Front/back confusions occasionally happen even to normal-hearing listeners: for example, people often comment on how hard it is to locate the siren of an ambulance or fire engine. The reason is that the head and ears are fairly symmetric front to back, so the ITDs and ILDs from a source ahead closely match those from its mirror image behind, leaving listeners to rely on the subtle spectral cues imposed by the pinnae. Hearing aids whose microphones sit above or behind the pinna bypass those cues, making front/back confusions more likely.
Key Terms
- Sound Localization: The process by which the brain determines the location of a sound source.
- Interaural Time Difference (ITD): The difference in arrival time of a sound between the two ears.
- Interaural Level Difference (ILD): The difference in sound pressure level between the two ears.
- Minimum Audible Angle (MAA): The smallest change in angle between two sound sources that a listener can detect.
- Duplex Theory: The theory that ITDs are used for low-frequency sounds and ILDs are used for high-frequency sounds in sound localization.
Sound localization is also influenced by the shape of the outer ear (the pinna), which filters incoming sound in a direction-dependent way before it enters the ear canal. Listeners can adapt to different listening environments, improving their localization abilities with experience or with changes in their surroundings.
Difficulties in sound localization can occur with hearing impairments or damage to specific brain regions responsible for processing auditory information.
| Parameter | Normal Hearing | Hearing Impaired |
|---|---|---|
| Minimum audible angle (MAA), sources to the side (Hausler et al., 1983) | 12° or less | 7° to 30° or more |
| Highest frequency for IPD discrimination (Neher et al., 2011) | 1230 Hz (mean) | 850 Hz (mean) |
| Localization error in diffuse noise at −6 dB SNR (Lorenzi et al., 1999) | about 25° | about 50° |