Have we perfected sound yet? Not by a long shot. We imagine that graphics will be “done” when we can create anything we can imagine without hitting a hardware limit, whether that’s a perfect recreation of reality or whatever our imagination can conjure. Given the intangibility of sound, it can be hard to imagine how it could get “better”. But graphics and sound compete for computing resources, and at the moment we expect a level of graphics from games that leaves no room for major advances in sound and music. The next generation of consoles will make those advances possible, as there will be additional resources to work with. There are still great strides to be made, and, thanks to audio demos, we can get a glimpse of what the future might hold. With the next generation of consoles a year or two away, this might be a preview of what you’ll get to hear from them.
Sound Technology and Tools
Better tools and more powerful hardware are critical if we want the average sound quality of games to reach the next level. Next-generation tools will primarily be designed to cut down the time and cost of developing games. The production values of the most expensive games today are the minimum audiences will expect tomorrow, and, in order to keep costs in check, tools will have to be as easy to use and efficient as possible. They must be good enough that musicians and sound designers unfamiliar with game development can start placing sound samples and music triggers into levels within five minutes. Low- and mid-budget games will have to reach a quality similar to that of today’s AAA games with one person doing the job instead of an entire team.
Given that sound is just one of many components developers have to worry about, and that whoever is responsible for it might not have the time or knowledge to maximize its potential, it’s important that these tools make the sound process easy and quick enough that just about anyone could do it well. Many things will work automatically to some extent, such as volume balancing of separate elements and assigning surface properties for acoustic effects like wave tracing. When things don’t work automatically we end up with games like Max Payne 3, where one level has very well done acoustic effects that make the world come alive and, on the next, you move through areas that are completely dead acoustically and, because of it, feel neglected and rushed. The optimal sound tools would make areas sound reasonably realistic automatically, without any extra effort from the sound team.
Large sound teams will still exist in the future, however. On big-budget games that push the envelope, they will be working on sound of a different level of quality from anything we hear in games today. What we will see more and more of in AAA games is wave tracing. Today only a couple of games use it, such as Metro 2033, which only does so on the PC because of the performance impact. Wave tracing is similar to ray tracing in lighting systems in that it’s a real-time simulation of what happens to sound as it travels through the air and bounces off the surfaces of a room. Games today use an approximation of what the echo of a room will sound like instead of a full simulation. What this means is that a developer will use the level editor to mark out areas of the map and give those zones echo effects that will hopefully make everything sound the way it looks like it should. To save time, developers will most likely not fine-tune the effect but will instead use a set of generic echoes for “street”, “cathedral” and so on, which they then assign manually to all the areas of the entire game.
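To make the zone-and-preset workflow concrete, here is a minimal sketch in Python. The preset names, parameter fields and zone shapes are all illustrative assumptions, not taken from any real engine; real middleware uses far richer reverb parameters, but the structure is the same: hand-placed boxes mapped to a small library of generic echoes.

```python
# Hypothetical reverb presets; "decay_s" and "wet_mix" are assumed, simplified
# parameters standing in for a full reverb parameter set.
REVERB_PRESETS = {
    "street":    {"decay_s": 0.4, "wet_mix": 0.15},
    "cathedral": {"decay_s": 4.5, "wet_mix": 0.55},
    "corridor":  {"decay_s": 0.9, "wet_mix": 0.30},
}

class ReverbZone:
    """An axis-aligned box in the level marked with one generic echo preset."""
    def __init__(self, name, bounds, preset):
        self.name = name
        self.bounds = bounds  # (min_xyz, max_xyz) corners of the box
        self.preset = REVERB_PRESETS[preset]

    def contains(self, pos):
        lo, hi = self.bounds
        return all(l <= p <= h for l, p, h in zip(lo, pos, hi))

def active_reverb(zones, listener_pos, default=REVERB_PRESETS["street"]):
    """Return the preset of the first zone containing the listener."""
    for zone in zones:
        if zone.contains(listener_pos):
            return zone.preset
    return default

# A designer hand-assigns generic presets to areas of the map.
zones = [
    ReverbZone("plaza", ((0, 0, 0), (50, 10, 50)), "street"),
    ReverbZone("nave",  ((60, 0, 0), (120, 30, 40)), "cathedral"),
]
print(active_reverb(zones, (70, 2, 10)))  # the "cathedral" preset
```

The limitation the article goes on to describe is visible even in this toy: the listener is always in exactly one preset, so standing in a doorway between the plaza and the nave cannot blend two echoes, and nothing in the zone knows about objects occluding the sound.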
This is the method games have used since EAX was first introduced in the late 90s, and it worked well when level design consisted mostly of square rooms and corridors. Where the approach falls apart is when level design becomes more complex. Many objects, like characters, vehicles or debris, occlude the sound, and the player often moves between interiors and exteriors, so there are multiple types of echoes that should be playing at the same time. The effects all of these should have on the sound are approximated with filters, but they don’t sound nearly as good as wave tracing does.
With wave tracing, the developer instead tells the sound engine what all the walls and objects are made of: wood, stone, metal and so on. The wave tracing engine then, just like real life, makes the sound bounce off all the surfaces before it reaches the player. The further away the sound is, the more surfaces it bounces off before it reaches your ears, and this is where wave tracing really shows its advantages. What Metro 2033 proved was that wave tracing makes everything sound “just right” at any distance. These might sound like subtleties, but if you know what to listen for you can tell instantly when the sound is wrong. What wave tracing did in Metro 2033 was remove almost all the subtle imperfections from the sound. It is often said that sound works best when you don’t notice it, and this is exactly what wave tracing enables. You might not think there is anything wrong with how games sound now, with the simpler methods, but your brain always knows when the sound you hear isn’t real. We listen to a perfect echo simulation engine, the real world, all our lives. Whether you actively listen for it or not, your brain notices every imperfection. What you see and what you hear should always be in perfect sync; otherwise, little by little, you will be drawn out of the experience, your brain telling you that it is “only a game”.
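The core idea can be sketched in a few lines: each sound path loses energy to the materials it bounces off and to distance, and arrives later the longer it travels. The absorption values below are assumptions made up for the sketch, not measured acoustic data, and a real wave-tracing engine traces thousands of such paths per source.

```python
import math

# Assumed per-bounce absorption (fraction of energy lost); illustrative only.
ABSORPTION = {"wood": 0.10, "stone": 0.03, "metal": 0.05, "carpet": 0.40}

SPEED_OF_SOUND = 343.0  # metres per second, in air at room temperature

def trace_path(bounces, path_length_m):
    """Energy and arrival delay of one sound path from source to listener.

    bounces: materials the path reflects off, in order.
    path_length_m: total distance travelled along the path.
    """
    energy = 1.0
    for material in bounces:
        energy *= 1.0 - ABSORPTION[material]  # each surface soaks up energy
    energy /= max(path_length_m, 1.0) ** 2    # inverse-square spreading loss
    delay_s = path_length_m / SPEED_OF_SOUND  # longer path, later arrival
    return energy, delay_s

# Direct path versus a single echo off a stone wall: the echo arrives later
# and quieter. Summing many such paths builds the room's impulse response,
# which is why distant battles naturally blur into a diffuse wash of echoes.
direct = trace_path([], 10.0)
echo = trace_path(["stone"], 24.0)
```

Because attenuation and delay fall out of the geometry automatically, the “sounds right at any distance” effect the article attributes to Metro 2033 comes for free, with no hand-assigned zones at all.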
It’s already possible to do wave tracing on a modern PC, since Metro 2033, which came out in 2010, uses it. Its advantages will make it standard in the games of the future. To get an idea of what wave tracing sounds like today, the clip from Metro 2033 demonstrates what happens to the sound of battle as you get further away. The way it morphs from a direct, in-your-face sound into a diffuse echo in the far distance is done better than the simpler methods are capable of, unless you record several different versions of each sound to simulate the effect of distance, as some games like Medal of Honor: Airborne have done. Metro 2033 doesn’t require several sound samples, as the wave tracing simulates the effects of distance so well that they would be redundant.
Exploiting the power of Dynamic Range
An element of sound we have started to see developers use more actively in recent years is dynamic range. We saw the potential of wide-dynamic-range audio in Mass Effect 3, but for sound like that to become common, it must become easier for developers to produce. Wide dynamic range makes the overall volume lower, so it might not be appropriate for all games, such as portable ones. But if you’re making a game meant to be played at home on a console or PC, it should be standard. Wide dynamic range has been standard in movies for decades, given the numerous advantages it comes with, and it’s time games assumed the same level of attention from players as movies do of their audience. Things that might sound simple, such as having explosions and heavy weapons be louder than all other sounds in the game, are not a given in games today.
Most games are unnecessarily loud because dialogue, music or even ambient sound maxes out the volume. Careless use of volume is what makes players turn the sound down to keep dialogue intelligible over the noise, and what makes weapons sound weak compared to those of even the earliest shooters like Doom or Duke 3D, which didn’t overwhelm the mix the way many modern games do. Mass Effect 3 was free of all these artifacts because it had a wide dynamic range designed with care. It is widely known among sound mixers that overly compressed music and sound causes listener fatigue: the ears and brain become tired, and longer play sessions become difficult or exhausting under the aural onslaught. A wider dynamic range is more stimulating and interesting for the brain. The player subconsciously becomes more engaged by the sound and might find it harder to shut the game off. Dynamic sound and music also add depth. If everything is equally loud, the sound field flattens and feels two-dimensional. With carefully balanced dynamics, the sound appears more three-dimensional, and the illusion of sound and visuals becomes seamless.
Increasing loudness and the loss of dynamic range are well-known problems in music. The trend is called the “loudness war”: music has gradually gone up in volume over the past four decades and is now as loud as it can possibly be. The loss of dynamic range is partially responsible for contemporary music not being as engaging to listen to as recordings from the early 90s, when the average volume was around 12 dB lower than today. Musicians are rediscovering the power of dynamic range, and game developers need to look at movies and see its potential for the drama and impact of the experiences they are trying to create. Thanks to Mass Effect 3 there is now also a game example of what dynamic range can do, and I expect many studios to try to follow BioWare’s lead in how their future titles sound. The video below is a good example of what dynamic range can do for music, and it is applicable to games as well.
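One common, rough way to put a number on this is the crest factor: the ratio between a signal’s peak and its average (RMS) level, expressed in decibels. This is a simplified proxy for dynamic range, not a full loudness measurement such as ITU-R BS.1770, but it illustrates the loudness-war effect: a hard-limited “brick wall” signal sits at a crest factor near 0 dB, while a dynamic mix sits much higher.

```python
import math

def peak_and_rms(samples):
    """Peak (largest absolute sample) and RMS (average power) of a signal."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return peak, rms

def crest_factor_db(samples):
    """Peak-to-RMS ratio in dB: a rough proxy for dynamic range.

    Heavily compressed audio has a low crest factor (everything near the
    peak); a wide-dynamic-range mix has a high one.
    """
    peak, rms = peak_and_rms(samples)
    return 20.0 * math.log10(peak / rms)

# Toy signals: a pure sine (crest factor about 3 dB) versus the same signal
# hard-limited into a square wave (crest factor 0 dB, every sample at peak,
# i.e. "loudness war" territory).
sine = [math.sin(2 * math.pi * t / 100) for t in range(1000)]
square = [1.0 if s >= 0 else -1.0 for s in sine]
```

In mastering terms, pushing a track’s average level up by the article’s 12 dB without raising the peak is exactly a 12 dB loss of crest factor, which is why early-90s recordings measure so differently from today’s.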
This is part one of a three-part series examining the future of sound, music and virtual surround. Come back on Wednesday, when we will take a look at the future of music.