Auditory Scene Analysis: Understanding Bregman's Framework (1990)

Albert Bregman's (1990) book, Auditory Scene Analysis: The Perceptual Organization of Sound, has significantly impacted research in auditory neuroscience.

Auditory Scene Analysis addresses the problem of hearing in complex auditory environments, using a series of creative analogies to describe the process required of the human auditory system as it analyzes mixtures of sounds to recover descriptions of individual sounds.

In a unified and comprehensive way, Bregman establishes a theoretical framework that integrates his findings with an unusually wide range of previous research in psychoacoustics, speech perception, music theory and composition, and computer modeling.

The core of the book explains how primitive auditory processes employ a set of principles analogous to those used in computer pattern recognition. These principles use correlations among acoustic inputs, whether received simultaneously or successively, to conclude that the inputs must have been caused by the same environmental event.

Bregman shows how the resulting organization affects perceptions and demonstrates that the same principles apply in laboratory studies of simple signals, in music composition, and in speech perception.

He examines schema-based scene analysis, relations between vision and audition, and problems with current theories concerning the duplex perception of speech. He concludes by summarizing what is known about auditory scene analysis and suggests directions for future study.

Auditory Scene Analysis describes how our perceptual system parses the incoming complex vibration (sound) in order to produce a meaningful representation of the environment. It involves grouping or separating sound events over time. The elements can be grouped together (integration), separated into concurrent layers (segregation), or divided into successive events (segmentation).

For example, in the noisy scene of a city street at any given time, some of the sound components reaching your ears may belong to a motorcycle driving by, others to ambient traffic noise, and still others to voices of people on the sidewalk next to you: your auditory system deciphers which is which.

Auditory Scene Analysis Diagram


Additionally, the auditory system must group incoming sound components into units that are delimited in time (segmentation), such as musical notes, and decide which of those units belong together in extended sequences such as melodies. This sequential grouping is called auditory streaming.

Complicated though this task may be, relatively few principles guide the auditory system through it. Among them are:

  • Harmonicity: Frequencies (or partials) related by simple integer ratios tend to group together. For example, if the auditory scene contains frequencies at 110 Hz, 220 Hz, and 330 Hz (n, 2n, 3n), the auditory system will tend to fuse them together into a single complex sound, whereas frequencies at 110 Hz, 201 Hz, and 350 Hz, which are not related by simple ratios, are less likely to fuse.
  • Amplitude comodulation: Sound components that get louder or softer in parallel tend to group together.
  • Source location: Sound components that originate from the same physical location in space tend to group together.
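The first two cues above can be illustrated with a small sketch. This is only a toy model, not a description of how the auditory system or Bregman's framework actually computes grouping. The function names, the choice of the lowest partial as the candidate fundamental, the 2% tolerance, and the sample envelope values are all assumptions made for illustration.

```python
import math


def harmonic_misfit(freqs_hz, f0_hz):
    """Largest relative deviation of any partial from the nearest
    integer multiple of the candidate fundamental f0_hz."""
    worst = 0.0
    for f in freqs_hz:
        n = max(1, round(f / f0_hz))  # nearest harmonic number
        worst = max(worst, abs(f - n * f0_hz) / f)
    return worst


def is_harmonic(freqs_hz, tolerance=0.02):
    """Assumption: the lowest partial is the fundamental, and a 2%
    mistuning tolerance is an arbitrary illustrative threshold."""
    f0 = min(freqs_hz)
    return harmonic_misfit(freqs_hz, f0) <= tolerance


def envelope_correlation(env_a, env_b):
    """Pearson correlation between two amplitude envelopes sampled at
    the same instants; values near 1 suggest comodulation."""
    n = len(env_a)
    mean_a = sum(env_a) / n
    mean_b = sum(env_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(env_a, env_b))
    var_a = sum((a - mean_a) ** 2 for a in env_a)
    var_b = sum((b - mean_b) ** 2 for b in env_b)
    return cov / math.sqrt(var_a * var_b)


# Harmonicity: the two frequency sets from the text.
fused = is_harmonic([110.0, 220.0, 330.0])
not_fused = is_harmonic([110.0, 201.0, 350.0])

# Amplitude comodulation: made-up envelopes for illustration.
rising_falling = [0.1, 0.4, 0.9, 0.4, 0.1]
same_shape_louder = [0.2, 0.8, 1.8, 0.8, 0.2]  # moves in parallel
opposite_shape = [0.9, 0.6, 0.1, 0.6, 0.9]     # moves in opposition

parallel_r = envelope_correlation(rising_falling, same_shape_louder)
opposed_r = envelope_correlation(rising_falling, opposite_shape)

print(fused, not_fused)
print(round(parallel_r, 3), round(opposed_r, 3))
```

In this sketch the 110/220/330 Hz set passes the harmonicity check because every partial sits exactly on a multiple of 110 Hz, while 201 Hz misses the nearest multiple (220 Hz) by roughly 9%. The envelope comparison plays the same role for comodulation: components whose loudness rises and falls together correlate strongly, and those that move in opposition do not.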

For the most part, we are unaware that this process is happening, and take it for granted. But before the auditory scene makes it into your conscious awareness, an amazing feat of pre-attentive analysis has already converted the dizzying complexity of air vibrations around you into a coherent picture of the world.

The steady increase in neuroscience research following the book’s pivotal publication has advanced knowledge about how the brain forms representations of auditory objects.

This research has far-reaching societal implications for health and quality of life. For instance, it has helped explain why some people have difficulty understanding speech in noise, which in turn has led to the development of therapeutic interventions.

Importantly, the book acts as a catalyst, providing scientists with a common conceptual framework for research in such diverse fields as speech perception, music perception, neurophysiology and computational neuroscience.

All of this raises an interesting question: how do we keep track of the many sounds around us? How does the auditory system make sense of the amazingly complex and ever-changing air vibrations that reach the ear?

Auditory Scene Analysis: Principles and Examples