HRTF Audio Implementation in Games: A Deep Dive

Imagine you're the last player standing in a critical game round. Approaching the objective, you hear footsteps to your right, but upon turning, you find nothing. Suddenly, you're eliminated. This scenario highlights a crucial aspect of gaming often overlooked: audio. While graphics have advanced significantly, audio has lagged behind, remaining stagnant or even regressing.

For gamers who use high-quality headphones, the audio in modern 3D games can often feel "flat" and unengaging. The audio APIs commonly used provide only basic 2D panning algorithms when mixing for headphones, adjusting sound intensity in each channel but missing the nuances of true sound localization. This is where Head-Related Transfer Functions (HRTF) come into play.

A game that implements HRTFs when mixing audio for headphones can provide an immersive, realistic, 3D sound stage in which the player can perceive the location of a sound source without any visual cues.

[Figure: HRTF plane diagram]

What is HRTF?

HRTF stands for Head-Related Transfer Function. Sound coming from around you is filtered as it bends around your head and into your ears. This filtering differs by direction and gives the brain cues (in addition to the level and time differences between the ears) for localizing sound sources. It also differs between individuals, because we all have differently shaped heads and ears.

By using fairly sophisticated digital signal processing algorithms, 3D audio attempts to duplicate how we hear in the real world. For example, if a sound is over to your left, it reaches your left ear before it reaches your right ear; this is the "interaural time difference" (ITD). It is also a bit louder in your left ear than in your right, because your head blocks the sound; this is the "interaural intensity difference" (ILD). A third cue is the "pinna transfer function": the filtering that occurs as sound from a particular location in space bounces off the folds of your outer ear (the pinna) before entering your ear canal.
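The time-difference cue can be approximated with a simple formula. As an illustrative sketch (not any particular engine's code), the classic Woodworth spherical-head model gives the ITD as a function of source azimuth:

```python
import math

def itd_seconds(azimuth_deg, head_radius_m=0.0875, speed_of_sound=343.0):
    """Interaural time difference via the Woodworth spherical-head
    approximation: ITD = (r/c) * (theta + sin(theta)).
    azimuth_deg: 0 = straight ahead, 90 = directly to one side."""
    theta = math.radians(azimuth_deg)
    return (head_radius_m / speed_of_sound) * (theta + math.sin(theta))

# A source at 90 degrees (directly to one side) yields the maximum ITD,
# roughly 0.66 ms for an average-sized head.
print(f"{itd_seconds(90) * 1000:.2f} ms")   # prints: 0.66 ms
```

The intensity difference is harder to capture in one line because head shadowing is strongly frequency-dependent, which is exactly why measured HRTFs are used instead of closed-form formulas.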

The Limitations of Current Audio Solutions

Game development studios should not content themselves with panning sounds across 5.1 and 7.1 speaker systems as a solution for 3D audio. Such speaker systems are not an ideal consumer-grade 3D audio solution: correct speaker placement is rare in practice, and room acoustics colour the sound. Headphones suffer from neither problem, and a high-quality pair delivers remarkably clear and accurate sound.

A surround sound system (e.g. 7.1 or 5.1) has speakers placed around the listener at head height, but none above or below. Dolby Headphone takes a 7.1 or 5.1 channel audio stream as its input and converts it to a two-channel stream for headphones. Consequently, with Dolby Headphone you won't be able to hear the difference between a sound coming from above and one coming from below.

Without Dolby Headphone, most games effectively provide one-dimensional audio for headphones: simple left-right panning gives directional cues along a single axis, but the psychoacoustic cues that create the sensation of front versus behind and above versus below are missing, and not because they would demand prohibitive computing overhead.
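The panning most engines fall back on for headphones can be sketched in a few lines. This is a generic constant-power pan, not any specific API's implementation; note how the only cue it produces is an intensity difference between the channels:

```python
import math

def constant_power_pan(sample, pan):
    """Basic stereo panning: pan in [-1, 1], -1 = hard left, +1 = hard right.
    The gains trace a quarter circle so total power stays constant.
    This conveys left-right position only: no front-back or up-down cue,
    no time difference, no pinna filtering."""
    angle = (pan + 1) * math.pi / 4   # map [-1, 1] -> [0, pi/2]
    return sample * math.cos(angle), sample * math.sin(angle)

left, right = constant_power_pan(1.0, -1.0)   # hard left
# left gain is 1.0, right gain is 0.0: intensity is the only cue
```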

HRTF Implementation and Its Benefits

To get satisfying results, HRTFs should be applied within the audio API itself - when all the original directional information is still available. For years, alternative OpenAL implementations have been the only APIs that provide HRTFs. Rapture3D is commercial software and has been licensed by Codemasters Racing for use in several of their games. With a bit of tweaking, Rapture3D can also be used in a variety of other games that use OpenAL.
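At its core, applying an HRTF to a point source means convolving the mono signal with a measured impulse-response pair, one per ear. A minimal sketch with toy, hand-made HRIRs (real renderers use measured responses, interpolate between directions, and convolve via FFT for speed):

```python
def convolve(signal, ir):
    """Direct-form FIR convolution: the core operation an HRTF renderer
    performs per ear, per source (usually FFT-accelerated in practice)."""
    out = [0.0] * (len(signal) + len(ir) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(ir):
            out[i + j] += s * h
    return out

def render_binaural(mono, hrir_left, hrir_right):
    """Render one point source by filtering the mono signal with the
    impulse response measured for each ear."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)

# Toy HRIRs for a source to the listener's left: the right ear's response
# is delayed (ITD) and attenuated (ILD). Real HRIRs come from measurements.
hrir_l = [1.0, 0.3]
hrir_r = [0.0, 0.0, 0.5, 0.15]   # two samples later, half the level
left, right = render_binaural([1.0, 0.0, 0.0], hrir_l, hrir_r)
```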

We are now also starting to see solutions emerging as plug-ins for the likes of Wwise, FMOD and Unity. “RealSpace3D” and “3Dception” are examples of these newer systems. Apart from the application of HRTFs to point sound sources, both of these products feature room modelling for sound reflections and these reflections are also passed through the HRTF filters.

Any reverberation added after this stage should simulate the environment the player is seeing: giving the player an awareness of the proportions and shape of the space is the purview of reverberation effects.

Also, the implementation of HRTFs is well within the boundaries of what can be achieved on current computer hardware. This has been proven by excellent implementations like Rapture3D and OpenAL Soft.

HRTF vs. Surround Sound

People sometimes ask what the difference between "3D Sound" and "surround sound" is. With surround sound, you have multiple physical speakers that surround the listener. 3D sound does something different. Rather than rely on multiple speakers, 3D sound (sometimes called "HRTF" audio) uses only two speakers or headphones.

Using one's HRTF, it is possible to generate binaural simulations of sound sources at any location and distance. That means you can play the result over headphones and it will sound as if it is coming from a certain direction at a certain distance. In particular, it is possible to binaurally simulate any number of loudspeakers in different positions around you, and this can be combined with head tracking so that the virtual loudspeakers stay in a fixed position when you move your head.

When done correctly, this can compete with a real multichannel loudspeaker system. Maybe not 100% equal, but it can come very close, and it brings some advantages of its own.
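The head-tracking step reduces to recomputing each virtual speaker's direction relative to the head every frame. A minimal sketch (azimuth only; a real system also handles elevation and interpolates between measured HRTF directions):

```python
def relative_azimuth(source_azimuth_deg, head_yaw_deg):
    """Azimuth of a world-fixed virtual speaker relative to the listener's
    head, wrapped to (-180, 180]. With head tracking, the renderer selects
    HRTFs from this angle every frame, so the virtual speaker stays put
    in the world as the listener turns."""
    return (source_azimuth_deg - head_yaw_deg + 180) % 360 - 180

# Virtual centre speaker at 0 degrees; the listener turns 30 degrees right,
# so the speaker should now be rendered 30 degrees to the left.
print(relative_azimuth(0, 30))   # prints: -30
```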

Personalization and Calibration

This works best with your own personal HRTF. Some systems use a generic HRTF instead, for example one measured with a dummy head. This gives varying results, depending (among other things) on how average your HRTF is and how sensitive your brain is to errors in it.

As explained above, the closer the measurements are to your own body, the better the subjective results. The typical, and still most reliable, method is to play sine sweeps through actual speakers and record them with binaural microphones placed in your ears, so the recording captures most of the filtering caused by your own head and outer ears. Those sweeps are then deconvolved into impulse responses that can be used to replay any audio with the signature, delay, and reverb of a particular channel or speaker from a particular direction.
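Once the sweeps have been deconvolved into impulse responses, a quick sanity check is to compare arrival times at each ear. A toy sketch (`estimate_itd_samples` is a hypothetical helper; real measurement pipelines use proper onset detection rather than a bare peak search):

```python
def estimate_itd_samples(ir_left, ir_right):
    """Crude interaural time difference estimate from one measured
    impulse-response pair: the offset between each ear's largest peak.
    A positive result means the sound reached the left ear first."""
    peak = lambda ir: max(range(len(ir)), key=lambda i: abs(ir[i]))
    return peak(ir_right) - peak(ir_left)

# Toy IRs: the right ear's response peaks 3 samples later, consistent
# with a measurement speaker placed to the listener's left.
print(estimate_itd_samples([1.0, 0.2, 0.05], [0.0, 0.0, 0.0, 0.5, 0.1]))   # prints: 3
```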

So you need an HRTF calibration profile (per headphone?) to achieve the best results. This explains the 60+ HRTF preset profiles for games I've seen shipped with the OpenAL implementation. I've never seen or heard of HRTF calibration, or profile creation and saving, being done in a game before; ideally, and most simply, it would be done externally.

To be clear: the profile is per person, not per headphone. (Head tracking matters too; without it, when you turn your head, all the sound turns with you.)

The Future of HRTF in Gaming

With consumer-level head-mounted displays coming in the near future, the desire for true 3D audio with headphones will be even greater than it already is. The immersive, 3D visuals of VR deserve to be supplemented by similarly sophisticated audio.

Edit (12 Jan 2017): Valve has acquired Impulsonic. Impulsonic are the creators of Phonon 3D, which is the 3D audio system that Valve used to provide HRTF in CS:GO. I think this is great news because it means that Valve are taking 3D audio seriously, and I think we will see HRTF introduced into other Valve games (especially any future VR titles).

The possibilities of future game audio are so exciting...

HRTF Implementation Examples

EarGames games make broad use of 3D sound where appropriate, using a technology known as HRTF, or Head-Related Transfer Function. The particular implementation used by EarGames comes from QSound Labs, creators of "Virtual Haircut", the most widely viewed 3D sound demo on the internet.

FMOD provides 3D headphone audio via third-party plugins. FMOD's solution isn't rudimentary: instead of brute-forcing a single setting onto all sounds, it lets us pick and choose which sounds are spatialized, and fine-tune them individually.

Microsoft Spatial Audio

When it comes to Snowdrop's audio tech stack, the product is greater than the sum of its parts, but if you were to point to three of the biggest highlights of TCTD2 for immersive audio, they would probably be the separate indoor reverberant-field simulations, the outdoor early reflections, and our headphone implementation via Microsoft Spatial Audio (Windows Sonic and Dolby Atmos).

Microsoft Spatial Audio gave us an additional height layer to mix with, and the possibility of spatializing it with either their Windows Sonic HRTF [1] headphone processing or Dolby’s. Since our outdoor early reflections usually had some height to them, it was an obvious thing to try spatializing this way, and I think it worked incredibly well.

Even before Sony published their developer documentation for the PS5, Mark Cerny demonstrated the console's ability to do HRTF processing in hardware, and to give the player 9 different HRTF data sets to try out. Putting it in hardware tells us that Sony have decided to invest big in it, because they expect almost every player to enable it.

Feature summary:
Indoor Reverberation: Offline raytracing captures late reflections in room spaces.
Outdoor Early Reflections: Spatialized with Microsoft Spatial Audio for height and realism.
HRTF Processing: Hardware-based processing for a personalized audio experience.