Music Perception: How the Auditory System Encodes and Retains Acoustic Information
Music involves the manipulation of sound, and our perception of music is thus influenced by how the auditory system encodes and retains acoustic information. The perception of music depends on many culture-specific factors, but is also constrained by properties of the auditory system.
This topic is not a new one, but recent methods and findings have made important contributions, and we will review these along with some classic findings in this area. Understanding the auditory processes that occur during music listening can help to reveal why music is the way it is, and perhaps even provide some clues as to its origins.
We will focus primarily on the role of pitch in music. Pitch is one of the main dimensions along which sound varies in a musical piece. Other dimensions are important as well, of course, but the links between basic science and music are strongest for pitch, mainly because something is known about how pitch is analyzed by the auditory system.
The Role of Pitch in Music Perception
Pitch is the perceptual correlate of periodicity in sounds. Periodic sounds by definition have waveforms that repeat in time (Fig. 1a). They typically have harmonic spectra (Fig. 1b), the frequencies of which are all multiples of a common fundamental frequency (F0).

The F0 is the reciprocal of the period - the time it takes for the waveform to repeat once. The F0 need not be the most prominent frequency of the sound, however (Fig. 1b), or indeed even be physically present.

Although most voices and instruments produce a strong F0, many audio playback devices, such as handheld radios, do not have speakers capable of reproducing low frequencies, and so the fundamental is sometimes absent from the sound that listeners actually hear.

Most research on pitch concerns the mechanisms by which the pitch of an individual sound is determined [3]. The pitch mechanism has to determine whether a sound waveform is periodic, and with what period. In the frequency domain, periodicity is signaled by harmonic spectra. In the time domain, periodicity is revealed by the waveform autocorrelation function, which contains regular peaks at time lags equal to multiples of the period (Fig. 1c).
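The time-domain cue described above can be illustrated with a minimal sketch. It synthesizes a harmonic complex whose fundamental is physically absent and recovers the F0 from the largest autocorrelation peak. The function names, sample rate, and search range are assumptions chosen for illustration, and the code is not meant as a model of how the auditory system extracts pitch.

```python
import math

def harmonic_complex(f0, harmonics, sample_rate, n_samples):
    """Sum of equal-amplitude sine components at the given harmonic numbers of f0."""
    return [
        sum(math.sin(2 * math.pi * h * f0 * i / sample_rate) for h in harmonics)
        for i in range(n_samples)
    ]

def autocorrelation(signal, max_lag):
    """Unnormalized autocorrelation for lags 0..max_lag over a fixed-length window."""
    window = len(signal) - max_lag
    return [
        sum(signal[i] * signal[i + lag] for i in range(window))
        for lag in range(max_lag + 1)
    ]

def estimate_f0(signal, sample_rate, min_f0, max_f0):
    """Return the F0 whose period matches the largest autocorrelation peak in range."""
    min_lag = int(sample_rate / max_f0)
    max_lag = int(sample_rate / min_f0)
    acf = autocorrelation(signal, max_lag)
    best_lag = max(range(min_lag, max_lag + 1), key=lambda lag: acf[lag])
    return sample_rate / best_lag

SAMPLE_RATE = 8000  # Hz (assumed for this example)
# Harmonics 2 to 5 of 200 Hz: there is no energy at 200 Hz itself.
tone = harmonic_complex(200.0, [2, 3, 4, 5], SAMPLE_RATE, 384)
f0_estimate = estimate_f0(tone, SAMPLE_RATE, min_f0=125.0, max_f0=500.0)
```

Here the estimate comes out at 200 Hz, the reciprocal of the waveform period, even though the lowest component actually present is at 400 Hz. The search range is deliberately limited to lags shorter than two periods, because the autocorrelation peaks again at every multiple of the period.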
Both frequency and time domain information are present in the peripheral auditory system. The filtering that occurs in the cochlea provides a frequency-to-place, or “tonotopic,” mapping that breaks down sound according to its frequency content. This map of frequency, established in the cochlea, is maintained to some degree throughout the auditory system up to and including primary auditory cortex [4]. Periodicity information in the time domain is maintained through the phase-locking of neurons, although the precision of phase-locking deteriorates at each successive stage of the auditory pathway [5]. The ways in which these two sources of information are used remain controversial [6,7]. Although pitch mechanisms are still being studied and debated, there is recent evidence for cortical neurons beyond primary auditory cortex that are tuned to pitch [8].
Relative Pitch and Its Importance
When listening to a melody, we perceive much more than just the pitch of each successive note. In addition to these individual pitches, which we will term the absolute pitches of the notes, listeners also encode how the pitches of successive notes relate to each other - for instance, whether a note is higher or lower in pitch than the previous note, and perhaps by how much. Relative pitch is intrinsic to how we perceive music. We readily recognize a familiar melody when all the notes are shifted upwards or downwards in pitch by the same amount (Fig. 2a), even though the absolute pitch of each note changes [9].

This ability depends on a relative representation, as the absolute pitch values are altered by transposition. Relative pitch is presumably also important in intonation perception, in which meaning can be conveyed by a pitch pattern (e.g.

Figure 2. a) Transposition. The figure depicts two five-note melodies. The second melody is shifted upwards in pitch. In this case both the contour and the intervals are preserved. b) Contour and intervals of “Somewhere Over the Rainbow”.
The existence of relative pitch perception may seem unsurprising given the relational abilities that characterize much of perception. However, standard views of the auditory system might lead one to believe that absolute pitch would dominate perception, as the tonotopic representations that are observed from the cochlea [10] to the auditory cortex [4] make absolute, rather than relative, features of a sound’s spectrum explicit. Despite this, relative pitch abilities are present even in young infants, who seem to recognize transpositions of melodies just as do adults [11].
One of the most salient aspects of relative pitch is the direction of change (up or down) from one note to the next, known as the contour (Fig. 2b). Most people are good at encoding the contour of a novel sequence of notes, as evidenced by the ability to recognize this contour when replicated in a transposed melody (Fig. 2a) [12,13]. Recent evidence indicates that contours can also be perceived in dimensions other than pitch, such as loudness and brightness [14]. A pattern of loudness variation, for instance, can be replicated in a different loudness range, and can be reliably identified as having the same contour.

In contrast to their general competence with contours, people tend to be much less accurate at recognizing whether the precise pitch intervals (Fig. 2b) separating the notes of an unfamiliar melody are preserved across transposition. If listeners are played a novel random melody, followed by a second melody that is shifted to a different pitch range, they typically are unable to tell if the intervals between notes have been altered so long as the contour does not change [12], particularly if they do not have musical training [15]. This has led to a widely held distinction between contour and interval information in relative pitch.
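The distinction between contour and interval information can be made concrete with a small sketch. It assumes that notes are encoded as MIDI note numbers (60 = middle C), so that pitch differences are in semitones; the melodies and function names are invented for illustration and are not the stimuli of the studies cited above.

```python
def intervals(melody):
    """Signed pitch intervals, in semitones, between successive notes."""
    return [b - a for a, b in zip(melody, melody[1:])]

def contour(melody):
    """Direction of each pitch change: +1 for up, -1 for down, 0 for a repeat."""
    return [(step > 0) - (step < 0) for step in intervals(melody)]

def transpose(melody, semitones):
    """Shift every note by the same number of semitones."""
    return [note + semitones for note in melody]

# A five-note melody as MIDI note numbers (an assumed encoding).
original = [60, 64, 62, 67, 65]

# Exact transposition: absolute pitches change, intervals and contour do not.
transposed = transpose(original, 5)

# Transposition with the third note raised by one semitone:
# the contour is intact, but the two intervals around that note differ.
altered = [65, 69, 68, 72, 70]

same_contour = contour(altered) == contour(original)
same_intervals = intervals(altered) == intervals(original)
```

In this example `transposed` matches `original` in both intervals and contour, whereas `altered` matches it in contour only. The second case is the kind of change that listeners without musical training typically fail to detect in unfamiliar melodies.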
Discrimination thresholds for the pitch interval between two notes are measured by presenting listeners with two pairs of sequential tones, one after the other, with the pitch interval larger in one pair than the other. The listener’s task is to identify the larger interval. To distinguish this ability from mere frequency discrimination, the lower note of one interval is set to be higher than that of the other interval, such that the task can only be performed via the relative pitch interval between the notes. Thresholds obtained with this procedure are typically on the order of a semitone in listeners without musical training [16]. A semitone is the smallest amount by which musical intervals normally differ (Fig. 3a). This suggests that the perceptual difference between neighboring intervals (e.g. major and minor thirds; Fig.
Scales, Intervals, and Tonal Hierarchy
Intervals defined by simple integer ratios (Fig. 3b) have a prominent role in Western music. The idea that they enjoy a privileged perceptual status has long been popular, but supporting evidence has been elusive. Interval discrimination is no better for “natural” intervals (e.g. the major third and fourth, defined by 5:4 and 4:3 ratios) than for “unnatural” ones (e.g. 4.5 semitones, approximately 13:10) [16]. One exception to this is the octave. Listeners seem more sensitive to deviations from the octave than to deviations from adjacent intervals, both for simultaneously [17] and sequentially [18,19] presented tones. The same method yields weak and inconsistent effects for other intervals [20], suggesting that the octave is unique in this regard.
This special status dovetails with the prevalence of the octave in music from around the globe; apart from the fifth, no other interval is used as widely or as consistently. Despite the large thresholds characterizing interval discrimination, and despite the generally poor short-term memory for the pitch intervals of arbitrary melodies, interval differences on the order of a semitone are critically important in most musical contexts.
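Because intervals in this section are quoted both as frequency ratios and in semitones, it may help to make the conversion explicit. The sketch below assumes equal temperament, in which a semitone corresponds to a frequency ratio of 2^(1/12), so that a ratio r spans 12·log2(r) semitones. The function name is illustrative.

```python
import math

def ratio_to_semitones(numerator, denominator):
    """Size of the interval with the given frequency ratio, in equal-tempered semitones."""
    return 12 * math.log2(numerator / denominator)

# Intervals mentioned in the text, as (name, numerator, denominator).
examples = [
    ("octave", 2, 1),
    ("fifth", 3, 2),
    ("fourth", 4, 3),
    ("major third", 5, 4),
    ("'unnatural' interval", 13, 10),
]

for name, num, den in examples:
    print(f"{name:>22} {num}:{den} = {ratio_to_semitones(num, den):.2f} semitones")
```

Under this assumption the 5:4 and 4:3 ratios fall within about 0.15 semitones of the equal-tempered major third (4 semitones) and fourth (5 semitones), while 13:10 lies close to 4.5 semitones, roughly midway between them.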
For melodies obeying the rules of tonal music (see next paragraph), a pitch-shifted version containing a note that violates these rules (for instance, by being outside of the scale; Fig. 3a) is highly noticeable, even though such changes are often only a semitone in magnitude. A mere semitone change to two notes can turn a major scale into a minor scale (Fig. 3a), which in Western music can produce a salient change in mood (minor keys often being associated with sadness and tension). Intervals are also a key component of our memory for familiar melodies, which are much less recognizable if only the contour is correctly reproduced [12,14].
The ability to encode intervals in melodies seems intimately related to the perception of tonal structure. Musical systems typically use a subset of the musically available notes at any given time, and generally give special status to a particular note within that set [23]. In Western music this note is called the tonic (in the key of C major, depicted in Fig. 3a, C is the tonic). Different notes within the pitch set are used with different probabilities, with the tonic occurring most frequently and with longer durations. In Western music this probability distribution defines what is known as the key; a melody whose pitch distribution follows such tendencies is said to be tonal. Listeners are known to be sensitive to these probabilities, and use them to form expectations about which notes will occur (Fig.
In many musical systems the pitch sets that are commonly used are defined by particular patterns of intervals between notes. Thus, if presented with five notes of the major scale, a Western listener will have expectations for what other notes are likely to occur, even if they have not yet been played, because only some of the remaining available notes have the appropriate interval relations with the observed notes. Listeners thus internalize templates for particular pitch sets that are common in the music of their culture (see Fig. 3c for two examples from Western music). Most instances in which interval alterations are salient to listeners involve violations of tonal structure - the alteration introduces a note that is inappropriate given the pitch set that the listener expects. Interval changes that substitute another note within the same scale, for instance, are often not noticed [27]. Conversely, manipulations that make tonal structure more salient, such as lengthening the test melodies, make interval changes easier to detect [13]. It thus seems that pitch intervals are not generally retained with much accuracy, but can be readily incorporated into the tonal pitch structures that listeners learn via passive exposure [28].
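The way a few observed notes constrain which other notes are expected can be sketched as template matching against scale patterns. The sketch below is a deliberate simplification: it considers only major scales, treats notes as pitch classes (0 = C up to 11 = B, derived from MIDI note numbers), and ignores the note probabilities discussed above. All names are illustrative, and the code makes no claim about how listeners actually perform this inference.

```python
# Interval pattern of the major scale, in semitones above the tonic.
MAJOR_SCALE_STEPS = (0, 2, 4, 5, 7, 9, 11)

def major_scale(tonic):
    """Set of pitch classes (0-11) in the major scale built on the given tonic."""
    return {(tonic + step) % 12 for step in MAJOR_SCALE_STEPS}

def compatible_major_keys(observed_notes):
    """Tonics whose major scale contains every observed pitch class."""
    observed = {note % 12 for note in observed_notes}
    return [tonic for tonic in range(12) if observed <= major_scale(tonic)]

def expected_remaining_notes(observed_notes):
    """For each compatible key, the scale notes that have not yet been heard."""
    observed = {note % 12 for note in observed_notes}
    return {
        tonic: sorted(major_scale(tonic) - observed)
        for tonic in compatible_major_keys(observed_notes)
    }

# Five notes of a major scale: C, D, E, F, G (MIDI note numbers, an assumed encoding).
heard = [60, 62, 64, 65, 67]
candidates = expected_remaining_notes(heard)
```

For these five notes only two major scales fit, those on C (pitch class 0) and F (pitch class 5). The note A is expected under both, B or B-flat under one each, and the other four pitch classes fit neither template, which is the sense in which only some of the remaining notes have the appropriate interval relations with the observed ones.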
Neuropsychological and Functional Imaging Evidence
Evidence from neuropsychology has generally been taken to suggest that contour and intervals are mediated by distinct neural substrates [30,31], with multiple reports that brain damage occasionally impairs the perception of intervals without having much effect on contour perception. Such findings are, however, also consistent with the idea that the contour is simply more robust to degradation. Alternatively, anatomical segregation could be due to separate mechanisms for contour and tonality perception [32], the latter of which seems to be critical to many interval tasks.
Functional imaging studies in healthy human subjects indicate that pitch changes activate temporal regions, often in the right hemisphere [35-38], with one recent report that the right hemisphere is unique in responding to fine-grained pitch changes on the order of a semitone [39].
An impaired ability to discriminate pitch change direction is thought to at least partially characterize tone-deafness, officially known as congenital amusia. Tests of individuals who claim not to enjoy or understand music frequently reveal elevated thresholds for pitch direction discrimination [46-48], although the brain differences that underlie these deficits remain unclear [49]. At the other end of the spectrum, trained musicians are generally better than non-musicians at relative pitch tasks like interval and contour discrimination [15,16]. However, they also perform better on basic frequency discrimination [50] and other psychoacoustic tasks [51]. There is thus some evidence for a locus for relative pitch in the brain, which when damaged can impair music perception.
Table: Summary of Key Concepts in Music Perception
| Concept | Description | Relevance to Music Perception |
|---|---|---|
| Absolute Pitch | The ability to identify or produce a specific pitch without an external reference. | Provides a fixed point of reference, but less critical for general music perception. |
| Relative Pitch | The ability to perceive and understand musical intervals and relationships between pitches. | Fundamental for recognizing melodies, harmonies, and tonal structures. |
| Contour | The general shape of a melody, defined by the direction of pitch changes (up or down). | Easily recognized and helps in identifying transposed melodies. |
| Intervals | The distance between two pitches. | Critical for understanding harmonies and recognizing familiar melodies. |
| Tonal Structure | The organization of pitches around a central note (tonic) within a musical key. | Provides a framework for expectations and helps in detecting deviations from established musical patterns. |