The Evolution of a Feature: Diegetic Music in Infamous Second Son

March 23rd, 2014

While I’m proud of so much of the audio design in inFamous Second Son, one feature stands out as a testament to never letting go of a good idea. It was a concept, not new or necessarily innovative, that began incubating around 7 years ago. It wasn’t until 2013 that I was able to make the idea work in a title. I thought it’d be fun to trace that feature from its nascent stages through to its full-fledged life. To do so, we have to go all the way back in time to a year we called 2007. Ah, 2007! There was a palpable hum in the air. The iPhone was introduced by a little upstart company called Apple, Microsoft excitedly released its newest blockbuster (*cough*) operating system, Vista, and the Nintendo Wii had captured people’s hearts, minds, wrists, and pocketbooks.

I was working at Shaba Games, where we had just finished up the DLC/Gold Edition of Marvel Ultimate Alliance and were looking for a new project. Like many others, we were captivated by the Wii and began working on a concept for a downhill skateboarding game for the platform. Shaba’s other sound designer, Lorien Ferris, and I began brainstorming ways we could introduce interesting audio to what would ostensibly be a multiplayer racing game. Obviously the skateboard sounds would reign supreme, and we came up with an idea of emitters tied to occluder objects such as buildings, which would play a quick whoosh as you passed them (an idea I would later harvest for the mobile title SummitX Snowboarding). Another idea we had was to have music emanating from buildings as you skated by. You’d be going fast and could never go back uphill, so they could be short loops, and once we applied some doppler it would sound awesome!

Unfortunately, while the Wii as a piece of hardware was popular for a slew of years, the software didn’t seem to sell as well, so the project was scrapped before we got very far. BUT after multiple other false starts we were finally given something wholly different and rather exciting: Spider-Man, and what would eventually become Web of Shadows. The goal was straightforward: create a new, unique open world Spider-Man game using the engine from the recently released Spider-Man 3. Once again Lorien and I dove into brainstorming cool new features we could implement on the audio front to push the superhero qualities of Spider-Man and the real-life interactivity of the city. Early on, our storefront music concept was revived. I even added some various loops to embed into some stores simulating dance and jazz clubs and restaurants. Unfortunately we ran into some design problems early on: the storefronts we had in the game didn’t really match the music, they were destructible but we didn’t have a signal to turn off the music when the store was destroyed, and truthfully it just didn’t sound super-convincing to have the sound of filtered talking and clinking dishes and glasses of a restaurant while you’re right outside fighting. You’d think there’d be screams and hushed whispers. Basically, with a tight schedule and a skeleton crew, our storefront music plans would have to wait for another day…

…which came just a year and a half later. We were working on a new superhero title and, with so much of the infrastructure in place now, we spent some time focusing on how to make storefronts believable. We created a multi-stage approach. Idle, the default state, would play a basic ambient loop: for example, some cheesy Italian music emanating from a restaurant. If a fight broke out in the vicinity, we would enter a threatened state, which would trigger an appropriate one-shot sound effect of screams and maybe instruments falling, dishes breaking, etc., and the music would cease. During high-tension moments (using the same tension meter as our interactive music system) the stores would be silent. Once tension went back down to low, we would slowly ramp up the idle state again until another fight broke out. Perfect plan! Unfortunately, the studio ended up shifting gears and we moved from superhero games to music games. The storefront music would lie dormant again…
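The multi-stage scheme described above could be expressed as a tiny state machine. This is a hypothetical reconstruction in Python; the state names, the 0..1 tension scale, and the thresholds are illustrative, not actual game code.

```python
# Hypothetical sketch of the three-state storefront scheme (idle / threatened
# / silent) driven by the same tension value as the interactive music system.

IDLE, THREATENED, SILENT = "idle", "threatened", "silent"

class Storefront:
    def __init__(self):
        self.state = IDLE  # default: play the basic ambient loop

    def update(self, fight_nearby, tension):
        if self.state == IDLE and fight_nearby:
            # Trigger the one-shot (screams, breaking dishes) and stop music.
            self.state = THREATENED
        elif tension > 0.7:
            # High-tension moments: the store stays silent.
            self.state = SILENT
        elif self.state != IDLE and tension < 0.2:
            # Tension back down to low: slowly ramp the idle loop back up.
            self.state = IDLE
        return self.state
```

A storefront starts idle, flips to threatened when a fight breaks out nearby, goes silent while tension stays high, and returns to idle once tension settles.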

Fast forward to early 2012. I had just joined Sucker Punch and we were in pre-production on inFamous Second Son. Being back in an open world title, I pretty quickly started to think about my beloved storefront music concept again. Everyone I pitched it to, from our creative director to our music team down at Sony HQ, loved the concept. So now it was time to design it. The first step was just to get looping sounds emitting from a point in space and to figure out proper attenuation and processing for them. Next it was time to get into the real nitty-gritty. I had several challenges to tackle:

A world inside a world

inFamous Second Son takes place in present-day/slightly-future Seattle. It’s not real Seattle, it’s our take on the city, but we still wanted it to be a unique, diverse, funky place, just like real Seattle. We did not want it to be full of grunge music (and that is a story for another day!). I began talking with the environment team to get a sense of the variety of storefronts we would have, and some of what they created helped influence my ideas. Early on, we got an Irish pub in the game, at which point I thought, man, it’d be cool to have it play Irish music during the day and then become a punk club at night. Just like in real life! Then I started to take it further: what if we had traditional Irish music in the earlier times of day, changing to more upbeat, raucous Irish music in the evening, and THEN a punk club at night? I was on to something. As we fleshed out the stores, my list of music grew and grew. I wanted jazz and Chinese and J-Pop, club, top 40, and why not mariachi music in the Mexican restaurant, and Thai music, new age music for the yoga studio, and hell, even Russian music to put into apartments where the Akulan gangs live? Sure, they’re musical stereotypes, but they’re serving the purpose of a low ambient bed; they were never meant to be featured sounds. The result would be a city filled with greater perceived life. I also wanted to reach out to some local bands and get them featured in the game as well. I wanted a lot. So how the hell were we gonna get all this music?

APM to the rescue!

As anyone who’s worked with Sony can attest, they have some of the most amazingly talented, brilliant people working in their music department. We were very fortunate to have a few of them working closely with us throughout the project. Beyond the game score, we started discussing this source music idea, and they carved out some of their budget for a blanket license from APM for stock music. Matt Levine worked directly with APM, who would put together playlists for various genres of music we were interested in. He would then send me the lists, which I would review, make notes on, and approve or ask for more. In the end, we had over 100 tracks in the game spread out over 8 times of day. On the local band side, having been in a band and played with some acts up here, I reached out to some friends’ bands and also KEXP, the local college station, and got a list of potential candidates, several of which made it into the final game. We also started talking to Sir Mix-A-Lot, and he really wanted to get some tracks in, too. Now that we had music, we just had to get it playing in game.

Rock Against the Man

As I mentioned, I had rigged up source music playing in a test world pretty quickly to help figure out volume, attenuation, and processing. From there, it was on to the challenging part: figuring out how to make it gel in-game. In inFamous Second Son, you play as Delsin Rowe, a rebellious youth with superpowers battling against an evil authoritarian police force, the D.U.P. (Department of Unified Protection; think of the TSA with guns, armor, and superpowers). Delsin can clear the DUP out of each district of Seattle as part of the systemic, non-mission open world gameplay. The main theme here is freedom vs. security. The DUP keeps people secure, but Delsin gives them the freedom to do as they wish. To help reinforce this thematically, we decided that when the DUP controls a district we’d only hear DUP music. We started with stoic, patriotic-sounding cues, but steered the direction more towards syrupy, happy music that provides a wonderfully stark juxtaposition to the menace of the DUP. Once Delsin begins to drive the DUP out of a district, we stop the DUP music from playing in that area and instead let the storefronts come to life with their own individuality. We had a programmer working on the district status rigging, so I asked him to give me a callback signal for when the district status changed. I was then able to use this to determine what district the player was in and whether DUP music should be playing there (it emits from DUP speakers and closed-off DUP storefronts), or whether the other storefronts should be allowed to rock in the slowly-becoming-free world. I didn’t feel my initial idea from way back about multiple states would work in this instance. The music acted more as personality for the district than as a simulation of people inside, so I didn’t pursue any kind of multi-state reactive environment. Maybe next time!
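A callback like the one described above might look something like this sketch. The engine API (`post_event`), the event names, the status values, and the district identifiers are all invented for illustration; the real game used Sucker Punch’s internal engine.

```python
# Hypothetical sketch of the district-status callback that gates DUP music
# vs. storefront music. All names and the post_event API are invented.

DUP_CONTROLLED = "dup"

class DistrictMusicController:
    def __init__(self, audio_engine):
        self.audio = audio_engine
        self.district_status = {}

    def on_district_status_changed(self, district_id, new_status):
        """Callback fired by gameplay code when a district changes hands."""
        self.district_status[district_id] = new_status
        if new_status == DUP_CONTROLLED:
            # DUP-held district: only DUP speakers and storefronts play.
            self.audio.post_event("Play_DUP_Music", district_id)
            self.audio.post_event("Stop_Storefront_Music", district_id)
        else:
            # Liberated district: the storefronts come to life.
            self.audio.post_event("Stop_DUP_Music", district_id)
            self.audio.post_event("Play_Storefront_Music", district_id)
```

The key design point is that the audio side only needs one signal from gameplay (the status change) and derives everything else from it.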

At the same time, I wanted some semblance of reactivity and also wanted to ensure the source music didn’t clash with the game score. So I tied the volume of the source music to our tension RTPC (Real-Time Parameter Control in Wwise), which is also used for controlling the music intensity. When the player got caught up in combat, the music would fade out; when the combat abated, the source music would slowly ramp back up in volume. As if the owners of the shops were peeking through their windows, and once they saw the DUP dispatched, they cranked up the tunes again. So everything was working great, but now I had dozens and dozens of songs across ten or so genres. How was I going to make it all fit in a shippable state?
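As a rough sketch of that tension-driven ducking, here’s a hypothetical mapping from a 0-100 tension value to a source-music volume in dB. In Wwise this would be authored as an RTPC curve, with interpolation handling the slow ramp-up; the thresholds and range below are invented.

```python
# Hypothetical tension-to-volume mapping for the source music bus.
# Assumes tension runs 0..100 and volume runs 0 dB (full) to -96 dB (silent).

def tension_to_source_volume_db(tension, silence_above=60.0, full_below=20.0):
    """Map the tension RTPC to a source-music volume in dB."""
    if tension >= silence_above:
        return -96.0  # combat: the shops go quiet
    if tension <= full_below:
        return 0.0    # peace: tunes at full level
    # Linear crossfade between the two thresholds.
    t = (tension - full_below) / (silence_above - full_below)
    return -96.0 * t
```

In practice the curve shape and the asymmetric ramp speeds (fast fade out, slow fade in) would live in the RTPC curve and interpolation settings rather than in code.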

Making it fit

Beyond the goal of using source music to bring more life to our fictitious Seattle, I also wanted breadth and variation within the music so you wouldn’t hear the same cue EVERY time you passed a storefront. With a blanket license from APM plus around 20 local musician tracks, the content was near limitless. Our soundbank budget, unfortunately, was not. However, every time we change the time of day (TOD) in the game, we do a load to bring in our new skybox and other time-specific content. In fact, I was already loading all of my ambient sounds with these time of day loads. I devised a scheme to load certain music which could play at any time of day into our core ambient bank, which is always loaded. This ended up being the DUP music and our local acts. For the rest of the storefronts, I would load in 3-5 cues per TOD per genre. This way we have some variation during each time of day, as well as completely new tracks for most storefronts with each time of day change. For the local music, we had all 20 tracks in a random playlist emanating from Sonic Boom Records (a real Seattle record store), Sir Mix-A-Lot played from some of our neon-drainable low-rider hatchbacks (we HAD to have My Hooptie for that!), and the aforementioned Irish punk club featured 3 bands, each rotating through a set of 5 songs. You could theoretically stand by the Irish pub at night and enjoy a whole night of music (if it wasn’t so much fun to run around and use your powers instead!).
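The bank-partition scheme could be sketched like this: a small core set that is always loaded, plus a per-TOD selection of a few cues per genre. Track names, genre keys, and the seeding scheme are all hypothetical.

```python
import random

# Hypothetical sketch of the bank-partition scheme: a core set that is always
# loaded (DUP music, local acts), plus per-time-of-day banks holding a
# rotating subset of each storefront genre. All names here are invented.

CORE_TRACKS = ["dup_anthem_01", "local_band_01", "local_band_02"]  # always loaded

def build_tod_bank(catalog, tod, cues_per_genre=4, seed=None):
    """Pick up to cues_per_genre tracks from each genre for this time of day.

    catalog: {genre: [track_name, ...]} of everything licensed for the game.
    Seeding on the TOD means each time-of-day load brings in a (mostly)
    fresh selection per storefront genre.
    """
    rng = random.Random(f"{seed}-{tod}")
    return {genre: rng.sample(tracks, min(cues_per_genre, len(tracks)))
            for genre, tracks in catalog.items()}
```

Each time-of-day load would then stream in only that TOD’s bank, keeping the full licensed catalog on disk rather than in memory.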

My budget for the TOD banks was 7 MB, of which I used 2-3 MB for source music at VERY low bitrates. We processed the tracks heavily with severe low-pass filters and reverb, so we really didn’t need a lot of high end, and the low encoded bitrates (24 kbps OGG) aided in making the tracks sound like they were coming out of crappy speakers inside the storefronts. Most of the cues were edited down to around 60-90 seconds, since most people wouldn’t really be standing around listening to the music, and for the same reason we favored a greater quantity of tracks over longer songs.
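A quick back-of-the-envelope check on those numbers: at 24 kbps, a 90-second cue comes to roughly 264 KB, so a 2-3 MB slice of the bank holds on the order of 8-12 cues per time of day.

```python
# Rough size math for the low-bitrate source music cues (figures from the
# text: 24 kbps OGG, 60-90 second cues, 2-3 MB of the 7 MB TOD bank).

def cue_size_bytes(duration_s, bitrate_kbps=24):
    """Approximate encoded size: bitrate (bits/s) * duration / 8 bits per byte."""
    return duration_s * bitrate_kbps * 1000 // 8

# A 90-second cue is about 270,000 bytes (~264 KB)...
ninety_second_cue = cue_size_bytes(90)
# ...so ~2.5 MB of budget fits roughly eleven 75-second (average-length) cues.
cues_per_bank = int(2.5 * 1024 * 1024) // cue_size_bytes(75)
```

This ignores OGG container overhead, so treat the counts as an upper bound.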

Here’s a video showing off just a few of the myriad storefronts we added music to. If you have a copy of Second Son, I highly suggest pushing the DUP out of some districts and running around to see how the source music aids in filling in the world without stepping on the score or any critical gameplay. It’s a subtle effect most players would never consciously notice, but subtlety is often the key to effective sound design.

Adventures in Foley: The Tumbling Machine

November 26th, 2013

A few months ago we were recording some sounds for inFamous Second Son when I realized how challenging it is to get continuous debris recordings in a tiny recording booth. Inspired in part by the ArenaNet team’s field recording journal from Guild Wars 2, I started to think about a way to record long, continuous debris passes and, lo, the Tumbling Machine was born. I call it the Tumbling Machine because that sounds impressive, but really it’s ridiculously simple, yet pretty damn effective.

I started with a giant plastic garbage can. The issue there is that the molded handles on each side prevent an even roll, so I cut them off with a Dremel tool. Now it rolled nice and smooth, but the plastic surface would obviously color the sound of the debris. To counteract the resonance of the plastic, I bought a package of eggcrate foam (the kind you put on top of a mattress) and lined the bottom and sides of the trash can with it. I tried a few different methods to affix it, but found the most effective was gaffer tape (duct tape would work fine too). The foam did a great job of insulating the impacts, so you get the debris with very little coloration from the plastic. The drawback is that the foam can trap smaller particles of concrete, wood, glass, or other debris you may want to record, but worst case, you could always replace the foam each time you record a different surface. Here’s a short movie detailing the construction and use of the Tumbling Machine. In this instance, we were using it to record concrete rubble sounds.

It’s a cheap, effective way to make clean, continuous debris movement sounds. Here’s a capture from the concrete recording session cleaned up, so you can hear the results:


The one issue we’ve had is that the debris spills out as you roll the trash can. I’m planning on cutting a fairly wide hole in the lid of the can (wide enough that a microphone blimp can fit inside without hitting the edges during tumbling) and covering the inside of the lid with foam to prevent coloration and help keep the debris inside. Hope this inspires someone to make their own Tumbling Machine or maybe even something more outlandish/useful. Happy Tumbling!

Expectations of Perception

July 23rd, 2013

Recently I was working on a project in which a country road had a small drainage ditch to the side of it with flowing water in it. I looked at it once, and instantly thought, “I need to add a sound for that!”

Two weeks later, I was taking a hike through Cougar Mountain Regional Park (didn’t see any cougars– feline or otherwise), when I came across a very similar scenario in real life: a small stream of water flowing downhill. I stopped, looked, and listened, but to my surprise I heard no water trickling or babbling sounds emanating from this little stream.

If I went back and removed the sound from my project, someone could walk through the world, see that ditch, and wonder, “why the hell isn’t there a water flowing sound coming from that water?” The simple point here is that our perception of sound often differs from the reality of sound, and in games (or any form of media, for that matter) we need to carefully weigh this when crafting an aural landscape. If a user is expecting a sound and it’s not there, it makes a negative impression. Not necessarily because the overall sound design is bad, but rather because they notice a sound is missing. We have broken the wall of immersion. In the real world, slow-moving water needs enough speed, and an obstruction in its path, to cause enough movement to generate an audible sound. In the game world, however, it may just need to exist with the illusion of movement: perhaps it’s just an animated texture, or a shader trick. There doesn’t need to be a rock or an eddy causing a rapid; it’s just there, it’s expected, so let it have sound. Unless, of course, that goes against the aesthetic you’re trying to develop in the course of your project.

Sound design is all about managing perceptual expectation. We all know how weak gunfire sounds in real life compared to that which we create for games and film. So there is both the need to manage perception in the design of individual sounds as well as on the implementation side of sound design. But how do we choose what aspects in the world should and should not have sound and how those sounds behave? There are two things to consider here: technical and aesthetic.

On the technical side there are decisions to make based on what is available to you. What device(s) are you developing for? How much memory do you have available? Do you have DSP? Is there any sort of scripting or complex behavioral structure at your disposal? How many concurrent sounds can you play? What else may be going on concurrently in the world? Fortunately, as technology evolves, tools and technical specs are both improving so that even mobile games can use Wwise, FMOD, Unreal, etc. to provide the designer with more options, power, creativity, and flexibility to achieve their sonic goals for a project. Handhelds and mobile are losing their “stripped down,” “less powerful” monikers, so the only limitations we may have on our sound design are those we choose to put there. Of course, we’re not at the Mecca of no technical restrictions yet. Even on PlayStation 4, I don’t have limitless memory and resources, and that’s probably a good thing. Limitations often drive creativity and allow you to see things in a different light. We still need to fit our design into the technology we’re using; it’s just a matter of understanding the limitations of that tech and working through them.

The aesthetic side is more of a gray area. Technical specs are often set in stone, and while you may be able to negotiate for extra resources, you’re still working on an established playing field. When determining what should have sound and how it should sound, that’s where the creative and artistic part of sound design really kicks in. This is where you get to decide (either by yourself or sometimes with the assistance of a game/creative director or other audio personnel) where you want the audio to take the user and how it should make them feel. There’s no real science in determining what is right or wrong; it’s usually a mix of gut feeling, experience, and inspiration from others that can drive you to the right place creatively.

I do not mean to suggest that technical and aesthetic design decisions are mutually exclusive. On the contrary, in a well designed audio plan, they are intimately entwined, each one informing the other. We generally want to create a believable soundscape within the context of the game world. What that means specifically is part of the beauty and mystery that is our craft. And the key to meaningful sound design is often understanding the differences in perception and reality and ensuring your audio vision for a project matches the sonic landscape you wish to create.

Wwise HDR best practices

May 19th, 2013

Audiokinetic has released Wwise 2013.1 with many new features, among them PS4 support, ITU-R BS.1770-compliant loudness meters, and HDR audio. We worked with Audiokinetic to develop the HDR feature set over the past year, and now that it’s out, I’d like to share some of the best practices I’ve come up with (so far) in using it:

1). Keep it mellow: The first thing to be aware of is that the Wwise implementation of HDR audio is a relative volume scheme. We initially played with using SPL, similar to DICE’s Frostbite Engine, but abandoned that because a). we learned that even DICE didn’t use real-world SPL values, which sort of negates the whole reasoning behind using real-world values to set volume, and b). not everyone would use HDR, and introducing a second volume slider (Volume and SPL) in Wwise just confused and overcomplicated things. So anything you want to be affected by the HDR effect (which may generally include all game sounds except UI, mission-critical VO, and the like) will live in its own bus with a special HDR effect on it. But this bus should be kept at a reasonable level, generally around -12 to -18 dB. This will give you headroom in the final mix and give your loudest sounds the ability to play without clipping. Furthermore, when you have lots of very loud sounds playing, a more conservative bus level will allow things to sound cleaner. For individual sound structures, you can start with 0 dB as your baseline, bring down sounds that should be quieter in the mix, and bump up the louder ones above 0 dB so they’ll push the HDR window up when they play.

2). The voice monitor is your best friend – The new voice monitor (shortcut: Ctrl + Shift + H) is a fantastic asset for tuning individual sound levels within the HDR space. Being able to visualize the input and output of all sounds, as well as see what affects the HDR window and how, is immeasurably important when it comes to balancing sounds or preventing pumping of quiet sounds when a loud sound plays. The voice monitor is a fantastic tool whether or not you’re using HDR, but the ability to see the window behavior makes it very intuitive as to how the effect works.

3). It’s okay to cheat: Don’t be afraid of a little judicious use of make-up gain to make an important sound punch through without affecting the HDR mix. Make-up gain is applied post-HDR effect, so it won’t affect the window movement, but will boost a sound’s level. More importantly, play with the sensitivity slider in the HDR tab to dial in the best curve for your sounds. The HDR window can follow the volume of a sound, but often you only want the initial transient to affect the window while the tail decays naturally, letting quieter sounds come through. For even more granular control, you can edit the envelope of individual waveforms in the source editor window. As an additional control, you can also reduce the tail threshold for louder sounds. Most of my louder sounds are set to 3 or 6, which means after the first 3 or 6 dB of loudness, the sound is removed from the calculation of the HDR window.

4). Only generate envelopes on the important sounds in the game. This is a simple optimization tip. It takes CPU to constantly analyze the envelope of every sound. I only generate envelopes for the louder sounds in my game (making sure they’re not generated on ambience and incidental effects). It won’t affect the mix, but provides some performance savings.

5). The EBU/ITU-R BS.1770 standard is gold. Keep your game averaging around -23 LUFS/LKFS (based on a minimum of 30 minutes of gameplay). Every time you play your game, connect Wwise and keep an eye on the Integrated meter in the Loudness Meter. What matters here is the AVERAGE loudness; the longer you capture, the more accurate your measurement. As a rule of thumb, I always keep the loudness meter up and running in my project.


6). Inverse square attenuation makes sounds behave naturally – one of the initial “issues” I had once we got HDR working in our game was that our old attenuation curves (generally an exponential curve over a set distance based on the general loudness of the sound, ranging from 15m – 250m) just didn’t work as we needed them to. We wanted attenuation curves to sound natural in a real-world environment, so I created a set of inverse-square curves. The inverse square law states that sound pressure drops by 6 dB (roughly half) with each doubling of distance. For example, the most common curve we use spans 80 meters with a 4-meter reference distance: at 0m it’s 0 dB, at 4m it’s -6 dB, at 8m it’s -12 dB, at 16m it’s -18 dB, at 32m it’s -24 dB, etc. This has the added benefit of limiting the number of attenuation curves needed, which is a performance savings. Of course, inverse square curves are not a blanket solution; there will always be times when you want/need something custom, so we still maintain some custom curves.
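The curve above can be generated directly from the doubling rule. This is a sketch that matches the example values in the text; the 4 m reference distance and the extra -6 dB step at the reference are inferred from those numbers.

```python
import math

# Inverse-square-style attenuation: -6 dB per doubling of distance, matching
# the example curve above (4m: -6 dB, 8m: -12 dB, 16m: -18 dB, 32m: -24 dB).
# The 4 m reference distance is inferred from those values.

def inverse_square_db(distance_m, reference_m=4.0):
    """Attenuation in dB at a given distance from the emitter."""
    if distance_m < reference_m:
        # How the curve interpolates from 0 dB at the source down to -6 dB
        # at the reference distance is left to the middleware's curve editor.
        return 0.0
    return -6.0 * (math.log2(distance_m / reference_m) + 1.0)
```

A handful of curves generated this way (varying only the reference distance and max range) can then be shared across many sounds, which is where the performance savings comes from.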


I’m happy to share the settings I have on my HDR effect, but I feel this will vary based on project, so I’m not sure how useful that would really be for people. Another feature we’ve added is a speaker_type switch controlled by an RTPC, which affects the HDR threshold based on the speaker type the user is playing through. The end result is automatic dynamics switching based on speaker type, where the better your speaker system, the greater the dynamic range in the mix (similar to what games like Uncharted offer in their audio options menu). In short, there are a ton of ways to use this great feature, and I’m sure there are plenty of other tips and tricks people will figure out as they start to play around. Enjoy!
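The speaker_type switch could be sketched as a simple lookup that drives the RTPC. The dB values, the direction of the threshold adjustment, and the `set_rtpc` API are all assumptions for illustration; the real values would be tuned per project.

```python
# Hypothetical speaker-type-to-HDR-threshold mapping. Assumption: a higher
# threshold compresses the mix more (small TV speakers), a lower one preserves
# more dynamic range (headphones). All values and the API are invented.

SPEAKER_HDR_THRESHOLD_DB = {
    "tv_speakers": 18.0,   # most compressed mix
    "home_theater": 12.0,
    "headphones": 9.0,     # widest dynamics
}

def apply_speaker_type(audio_engine, speaker_type):
    """Push the per-speaker HDR threshold to the engine via an RTPC."""
    threshold = SPEAKER_HDR_THRESHOLD_DB.get(speaker_type, 18.0)
    audio_engine.set_rtpc("hdr_threshold", threshold)
```

Unknown speaker types fall back to the most conservative (most compressed) setting.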

A sound designer is born: my origin story, or how to get lucky and lie your way to success

April 7th, 2013

During GDC this year and the week after I ended up telling the story of how I got into the industry a few times, so I decided to commit it to the ether for posterity or some false sense of self-worth. I’ve also decided to embarrass myself publicly by digitizing the demo I made way back then in 1998 that got me into game audio.  It is horrible and borders on unlistenable. Well technically you can listen to it, but you wouldn’t want to, and it’s hard to fathom how someone could have heard this monstrosity and then offered me a job.

My story, while it may not have been exactly common 15+ years ago, doesn’t really happen anymore. The short story is that I lied my way into game audio. The longer story is that I was temping at Berkeley Systems, a video game company in Berkeley, CA after graduating college and they liked me enough to keep me on as their shipping guy. I liked it there, but really wanted to be doing something creative, so I started making a lot of noise as such. I was passed up for a production assistant job (thankfully) and ended up talking to their sound designer a couple times because I thought he had such a cool, crazy job. At this point in my life I’d never used a computer program related to sound ever. I knew how to play notes in BASIC and had a cassette 4-track and had done tons of music, tape loops, and other weird experimental stuff ever since I was a kid, but I didn’t know what MIDI was, how to create a sound effect or really much of anything in regards to sound and computers.

Anyway, one day the VP of Product Development called me into his office to tell me they fired their sound designer (apparently he didn’t come into work very often and they’d had to contract out all their sound work as a result). So he wondered what experience I had and if I’d be interested in the job. I couldn’t believe this was happening, so seeing an amazing opportunity, I lied through my teeth, telling him I had tons of experience and had scored some student films, blah blah blah. He asked me to bring in a demo the next day. I ran home that night and banged a couple things out on my sampler (half of which were a couple synthy pad soundscapes I claimed were from a student film I worked on. They weren’t.) and threw another horrible track called “Gall Stone Attack” onto a cassette and gave it to him. The next week he called me into his office and said “It’s nothing you’re ever gonna hear in any of our games, but it shows you know what you’re doing, so you got the job.” I was ecstatic. And because they’d already farmed out their sound work for the next 6 months or more, I locked myself in my office and started teaching myself everything I could about digital audio and sound design. I believe my first experiment in editing digital audio was removing all the guitar solos from Slayer’s Seasons in the Abyss, but that’s a story for another day. Nowadays, kids are coming out of school with degrees in sound design and blowing me away with their skillsets, so this whole thing known as my career could never happen today.

Everything on my demo was recorded with a Roland S-50 12-bit (!) sampler. It had a floppy drive, and I had tons of sample disks for everything from pads to horns and strings to sfx. “Gall Stone Attack” also had a Roland R-8mkII drum machine and a Casio SK-5 on it (and I think I used the SK-5 on “Silly Torture” as well). Since I had no sequencer, or even an audio editor or audio interface for my computer, each track was recorded live onto my Fostex 4-track and mixed down to the cassette below. (I opted not to de-noise these as part of the digitization process, so they could be “preserved” in the state in which they were originally heard.)

And so without further ado, I present a public shaming: two tracks from my demo reel in early 1998. I cringe when I listen, and laugh a little. My skills have definitely come a long way, but I still can’t believe they listened to this crap and took a gamble on me anyway. I’m eternally grateful and shocked. Be forewarned.  Be gentle.