The Importance of Sound Localization

Sound localization refers to the ability of the human auditory system to determine the location of a sound source in space. This is done by analyzing the differences in the arrival time, intensity, and spectral content of the sound waves that reach the two ears. The brain processes the incoming sound signals from both ears to calculate the interaural time difference (ITD) and interaural level difference (ILD), which are used to determine the location of the sound source.

Localizing a sound source requires the auditory system to determine its direction and its distance. In many simple listening circumstances, a normal-hearing person’s percept of the position of a sound source is reasonably veridical: the perceived direction corresponds closely to the actual direction, and the perceived distance is at least passably accurate.

This article summarizes the major experimental effects in direction (and its underlying cues of interaural time differences and interaural level differences) and distance for normal-hearing, hearing-impaired, and aided listeners. Front/back errors and the importance of self-motion are noted.

It is well known that a listener is more comfortable when the speaker can be located accurately. The world of sound assumes full three dimensionality only when the capacity for spatial localization can be used to its fullest. Normal ability to distinguish sounds of biological and social importance in unfavorable listening situations requires that two independent sequences of neural events (time of arrival and intensity), one from each inner ear, activate the central nervous system.

This ability affects many aspects of our daily existence:

  • Social Connection: In busy cafés, restaurants, or family gatherings, sound localization helps us stay connected.
  • Safety and Awareness: When crossing busy streets or navigating crowded spaces, sound localization helps us identify potential hazards.
  • Professional Settings: Many work environments demand effective communication in noisy conditions.
  • Entertainment Experiences: From enjoying surround sound at movies to appreciating the spatial placement of instruments in music, sound localization enhances our entertainment experiences and helps create immersion.
Sound Localization Explained

The Role of ITD and ILD

The fundamental auditory cues to direction arise because the ears are on either side of the head. The sound from a source on the right side of the head will arrive at the left eardrum after it arrives at the right eardrum, because the left ear is further away, and it will also be lower in level at the left than at the right because the head is a solid object and so casts an acoustic shadow. These differences are termed, respectively, interaural time differences (ITDs) and interaural level differences (ILDs).

The auditory system uses ITD and ILD as complementary cues that work together to allow accurate sound localization in the horizontal plane (what audio engineers call the stereo field). At low frequencies, ITD is the dominant cue for sound localization, while at high frequencies, ILD becomes more important.

The auditory system likely uses some form of mapping to decode ITDs, whereas the analysis of ILDs may be as simple as a comparison of the sound levels at the two ears. But it is important to remember that though ITDs and ILDs are usually described separately, and often manipulated separately in experiments, the sound from any real source must have both an ITD and an ILD, even if either one (or both) is zero.

To a first approximation, the relationship between direction and ITD can be found from a simple geometrical calculation: the additional distance to the far ear divided by the speed of sound. If the head is assumed to be spherical and the source of sound is far enough away for the wavefronts to be planes, then the additional distance is given by r(θ + sin θ), where r is the radius of the head (about 9 cm) and θ is the azimuth of the sound source in radians. For instance, a source located slightly to the side, say 10° to the right, gives an additional distance of about 3 cm and an ITD of about 90 μs, whereas a source located as far to the left as possible (−90°) gives an ITD of about −670 μs.

Woodworth’s formula r(θ + sin θ) accounts very well for measurements of the ITDs of clicks from loudspeakers and of high-frequency pure tones, but it fails for low-frequency pure tones, in that the measured ITDs are larger than the formula predicts by a few hundred microseconds. Empirical observations also show that the presence of the torso and any clothing on the subject can affect the values of ITD.
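
This geometrical calculation is easy to sketch in code. A minimal Python version, assuming the article's 9 cm head radius and a nominal speed of sound of 343 m/s (both approximations):

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate speed of sound in air at 20 °C
HEAD_RADIUS = 0.09      # m, spherical-head radius used in the article

def woodworth_itd(azimuth_deg, r=HEAD_RADIUS, c=SPEED_OF_SOUND):
    """ITD from Woodworth's formula r * (theta + sin(theta)).

    azimuth_deg: source azimuth in degrees (positive = right of midline).
    Returns the ITD in microseconds, signed to match the azimuth.
    """
    theta = math.radians(abs(azimuth_deg))
    extra_path = r * (theta + math.sin(theta))  # additional distance to far ear
    itd_us = extra_path / c * 1e6
    return math.copysign(itd_us, azimuth_deg)

print(round(woodworth_itd(10)))   # ~91 us for a source 10 degrees to the right
print(round(woodworth_itd(-90)))  # ~-675 us for a source fully to the left
```

As the text notes, this plane-wave, rigid-sphere model underestimates the ITDs actually measured for low-frequency pure tones.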

The magnitudes of ILDs also vary with direction and frequency. They are generally larger at higher frequencies and mostly larger at larger azimuths. However, unlike ITDs, there are sharp dips in ILD at some frequencies and strong peaks at others, and an ILD found for one direction may bear little resemblance to the ILD found for a neighboring direction. The dips and peaks occur because of diffraction and reflection of the incoming sound by the torso, head, and pinnae, and can be of the order of 20 dB. These are crucial for differentiating up from down and front from back.

The temporal characteristics of an audio event, such as its onset and duration, can have an impact on sound localization as well. Generally speaking, sounds with a more distinct onset, such as a drum hit, are easier to localize than sounds with a more sustained signal, such as white noise. In the case of a drum hit, the sharp onset creates a more pronounced difference in the arrival time and intensity of the sound at the two ears, which makes it easier for the auditory system to use ITD and ILD cues to locate the sound source.
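
The onset cue can be illustrated with a toy cross-correlation, the standard signal-processing analogue of ITD estimation. The sketch below uses NumPy; the 4-sample delay, sample rate, and signal names are illustrative assumptions, not values from the article:

```python
import numpy as np

def estimate_lag(x, y):
    """Return the lag (in samples) by which x trails y, taken from the
    peak of the full cross-correlation of the two signals."""
    corr = np.correlate(x, y, mode="full")
    lags = np.arange(-len(y) + 1, len(x))
    return int(lags[np.argmax(corr)])

# A click: sharp onset, like the drum hit in the text
click = np.zeros(256)
click[10] = 1.0

delay = 4                        # interaural delay in samples (~83 us at 48 kHz)
near_ear = click                 # signal at the ear nearer the source
far_ear = np.roll(click, delay)  # same click, arriving 4 samples later

print(estimate_lag(far_ear, near_ear))  # recovers the 4-sample delay

# For a sustained periodic sound, by contrast, the correlation peaks once
# per cycle, so the estimated lag is ambiguous; this is one reason sharp
# onsets are easier to localize than steady signals.
```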

Sound coming from directly in front (point A in Figure 2) will be the same in both ears (assuming symmetrical hearing sensitivity). However, if the sound comes from somewhat to the right (point B), it will be slightly louder in the ear closest to the source (the near ear), because the head blocks the direct path to the far ear (the head shadow); the situation is mirrored for sounds from the left. In the real world, listeners rely on the wavefront arriving first at the ears to determine the direction of a sound source, since the first arrival travels the direct path and carries the greatest intensity. The difference between the levels of the sound reaching the two ears is called the interaural level difference or interaural intensity difference (ILD or IID). The IID is directly related to the head shadow effect discussed in a previous post.

Figure 2. Interaural intensity difference (IID) effects as a result of the signal’s angle of incidence and the impact of the head shadow.

Just-Noticeable Differences (JNDs)

Many experiments have measured the magnitudes of the just-noticeable differences (JNDs) that listeners can detect in ITD, ILD, or actual direction, while varying the frequency, duration, overall intensity, waveform, onset, offset, masker type, signal-to-masker ratio, and so forth of the stimuli. For normal-hearing listeners, the JND for ILD is of the order of 1 dB. There is little frequency dependence to the JND, except a slight worsening around 1 kHz by no more than a quarter of a decibel.

In contrast, the data for the JND for ITD show an extreme frequency dependence. For pure-tone stimuli, the JND is about 60 μs for a frequency of 250 Hz, 10 μs for 1000 Hz, 20 μs for 1250 Hz, but then essentially becomes unmeasurably large for frequencies above about 1500 Hz. The rate of change of ITD JND with frequency around 1500 Hz is perhaps one of the steepest functions in all of auditory psychophysics.

The JND for actual direction (known as the minimum audible angle, MAA) is, at best, about 1°. This is found for pure-tone stimuli at around 750 Hz, using sound sources directly ahead and for changes in direction limited to the horizontal plane. It reaches a maximum (about 3°) at frequencies around 2000 Hz before reducing again for frequencies up to about 8 kHz.
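
These two JNDs are broadly consistent with each other. A back-of-the-envelope conversion, assuming the spherical-head formula from earlier (r = 9 cm, c = 343 m/s), shows what a 1° change in direction near the midline is worth in ITD terms:

```python
import math

r, c = 0.09, 343.0  # head radius (m) and speed of sound (m/s), as above

# Woodworth ITD in seconds: itd(theta) = r * (theta + sin(theta)) / c.
# Its derivative, r * (1 + cos(theta)) / c, is the ITD change per radian
# of azimuth; at theta = 0 (straight ahead) this is 2r/c.
us_per_degree = (2 * r / c) * (math.pi / 180) * 1e6
print(round(us_per_degree, 1))  # about 9.2 us of ITD per degree near the midline
```

So a 1° MAA straight ahead corresponds to roughly 9 μs of ITD, close to the ~10 μs ITD JND measured at 1000 Hz, which suggests the two measurements reflect the same underlying sensitivity.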

The Duplex Theory

The standard explanation of the midfrequency maximum is that people use ITDs to locate sound sources at lower frequencies but not at higher frequencies and ILDs at higher frequencies but not at lower. The theory is known as the Duplex Theory, which was developed around a century ago by JW Strutt, Lord Rayleigh and is described in many standard textbooks.

Its explanation at low frequencies is that the magnitudes of the ITDs can be relatively large, whereas the magnitudes of the ILDs are always relatively small as the wavelengths are too long for diffraction to be substantial. At high frequencies, its explanation is that the magnitudes of the ILDs can be much larger as the wavelengths are far shorter, meaning that the head shadow becomes a more significant factor and there is more scope for constructive or destructive interference, whereas though there are still substantial ITDs, the neural mechanisms for decoding them fail to work (at least with high-frequency pure tones).
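
The wavelength argument is easy to check numerically. A short sketch comparing wavelength to the head’s diameter (taking the 18 cm diameter implied by the 9 cm radius above, and c = 343 m/s):

```python
SPEED_OF_SOUND = 343.0   # m/s, approximate speed of sound in air
HEAD_DIAMETER = 0.18     # m, twice the 9 cm spherical-head radius

for f in (250, 500, 1000, 2000, 4000, 8000):
    wavelength = SPEED_OF_SOUND / f          # metres
    ratio = wavelength / HEAD_DIAMETER       # head diameters per wavelength
    print(f"{f:>5} Hz: wavelength {wavelength * 100:5.1f} cm, "
          f"{ratio:4.1f} head diameters")
```

At 250 Hz the wavelength is more than seven head diameters, so the head barely disturbs the sound field and ILDs stay small; by 4 kHz the wavelength is only about half a head diameter, and the head casts a substantial shadow.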

Over a century after its formulation, the low frequency = ITD versus high frequency = ILD dichotomization of the Duplex Theory remains valid, at least for pure-tone stimuli. But the experiments cited above on the JNDs for ITD and ILD demonstrate that the auditory system can discriminate ITDs and ILDs at all frequencies, as there is no frequency at which either JND is impossible to measure.

Presumably, the system can therefore use the information at any frequency to help determine the direction of a sound source: for instance, envelope ITDs at high frequencies may be useful for locating sources in reverberation. This would make sense as a strategy, as in most natural listening circumstances the target sound may not simply be a low-frequency or high-frequency pure tone presented in quiet. It is more likely to be mostly broadband, partially masked by various backgrounds, and continually changing in instantaneous level and spectrum, with ITD cues and ILD cues at all frequencies. The cues will be changing moment-to-moment, frequency-to-frequency.

Impact of Hearing Impairment

In general, hearing-impaired listeners perform worse in spatial-hearing experiments than those with normal hearing. For example, for presentation from the side, one study found that the smallest MAA for a group of listeners with bilateral sensorineural loss was 7°, and about half of the listeners gave MAAs of 30° or more. In contrast, all the normal-hearing controls gave MAAs of 12° or less. Moreover, the across-listener range of highest frequencies was wider in the hearing-impaired listeners: around 300 to 1250 Hz, whereas it was 900 to 1500 Hz in the control group.

Background sounds generally exacerbate the decrements in performance by hearing-impaired listeners, beyond what would be expected simply from reduced audibility. In one study at a signal-to-noise ratio of −6 dB, the mean error in reported direction for four hearing-impaired listeners was about 50°, but it was only about 25° for a control group of four normal-hearing listeners.

However, though most experiments on spatial hearing by hearing-impaired listeners have reported some decrement in performance, hearing impairment does not lead to substantial spatial impairment in all tasks. For measurements of the MAAs of sounds presented from the front, one study recorded values of 6° or less for a group of listeners with bilateral sensorineural loss and 4° or less for the normal-hearing listeners. The mean difference is negligible: it is about the visual width of the thumb when held at arm’s length.

Hearing Aids and Sound Localization

Hearing aids do not improve the localization of sound sources: indeed, in many cases, they interfere. A few examples follow; all found larger errors in horizontal localization for aided than unaided listening.

Despite 10 to 15 weeks of acclimatization to the hearing aids, localization errors when the listeners were tested aided were generally equal to or larger than when tested unaided (both about 30°-35°). Before acclimatization, aided accuracy was poorer still, with errors reaching as much as 45°.

In one study, aided performance was, on the whole, worse than unaided: the mean errors were 18° (adaptive directional), 16° (omnidirectional), and 13° (unaided), but just 4° for a control group of younger normal-hearing listeners (aged 20-25). In a similar experiment, the mean errors in localization were about 20°, 20°, and 10°, respectively.

The directional microphones often found on hearing aids can also interfere with the perception of direction. A domain in which hearing aids can cause particular problems is in distinguishing sound sources in front from those behind.

The Remarkable Precision of Human Hearing

Most people with normal hearing can locate sound sources with impressive accuracy, typically within 5 degrees of the actual location. To put this in perspective, that's about the width of your thumb when held at arm's length. This precision is remarkable considering how our brain accomplishes it using subtle differences in timing and volume between our two ears.

The Cocktail Party Effect

Perhaps nowhere is sound localization more valuable than at social gatherings, in what scientists call the "cocktail party effect." This term describes our ability to focus on a single conversation while filtering out competing noise. It's a complex feat our brains perform almost effortlessly (when our hearing system is working optimally).

Thanks to our precise sound localization abilities, we can:

  • Focus on someone speaking directly to us while ignoring background conversations
  • Switch attention between different speakers at will
  • Follow multiple conversation threads happening around us
  • Quickly turn toward new sounds that might be important

The Five-Degree Advantage

That remarkable 5-degree localization precision gives normal-hearing individuals a significant advantage in challenging listening environments. When two people speak simultaneously from positions separated by at least 5 degrees, those with healthy hearing can mentally separate these sound streams into distinct conversations.

This means that in a typical restaurant setting, someone with normal hearing localization abilities can:

  • Focus on their dining companion's words
  • Tune out neighboring tables' conversations
  • Switch attention when necessary
  • Participate in group discussions without missing key information

The Challenge of Aging Hearing

As we age, however, this precision often diminishes. Many older adults find they need speakers to be separated by significantly more than 5 degrees, sometimes 15, 30, or 45 degrees or more, to effectively distinguish between them. This degradation in sound localization ability explains why many older individuals struggle in environments younger people navigate with ease.

That busy restaurant that seems merely "energetic" to a 30-year-old can become an incomprehensible wall of noise to someone in their 70s. What's perceived as a slight background hum by younger diners might completely overwhelm an older person's ability to focus on the conversation at their own table.

This difference isn't about paying attention or cognitive ability; it reflects actual changes in how the auditory system processes and localizes sound, reminding us that hearing challenges deserve our understanding and accommodation, not frustration or dismissal.

As our population ages, designing environments and technologies that support better sound localization will become increasingly important for maintaining social connection and quality of life for everyone.

Group                        Stimuli               Condition                          Mean Error
Younger normal-hearing       Telephone rings       Unaided                            4°
Older listeners              Telephone rings       Unaided                            13°
Older listeners              Telephone rings       Adaptive directional               18°
Older listeners              Telephone rings       Omnidirectional                    16°
Hearing-impaired listeners   Single-word stimuli   Aided (two types of hearing aid)   14°
Hearing-impaired listeners   Single-word stimuli   Unaided                            14°
Normal-hearing controls      Single-word stimuli   Unaided                            (not reported)