Anatomy of a Cue

As a continuation of the discussion in the last post, I'm going to walk through the creation of a single cue in some detail. The cue in question is one of those lovely long, exposed, story-telling cues you rarely get to do; in this case, an audio slice of a drive-in movie. It is used in the production I just opened.

The voice-over session:

As with most cues involving dialog, the actual dialog to be used is specified in the original script. (The show in this case is Grease, and the lines in the B Movie set up the drive-in scene and a song.)

In this case I didn't have to go looking for vocal talent; members of the cast had already been picked and had been doing the lines during rehearsal. The latter was a mixed blessing; although they didn't need to hold scripts, having already memorized the lines, that also meant I lost the chance to mark up the lines to better shape the line readings.

When I do voice-over work, I like to print the lines in big type, double spaced, one "take" to a page. The professionals will take the opportunity to mark breath pauses, special or problem pronunciation, emphasis needed, etc.

In this case I had flat, rote performances to start from. Working closely with the director we were able to delve into the meaning of the lines, find the important beats, and get those beats into the vocal performance. ("Beats" in the acting terminology sense.)

I've said this before; physicality is key. If I had time, I would have actually blocked the scene to give the change in voice that movement would cause. I was able to rework the second movie excerpt by requesting the voice actor playing the "hero" stand behind, with his hands on the shoulders of, the actress playing the "girl." This is a very, very typical couples pose in movies of the period. Looking over her shoulder like that caused the actor to give a more warm, comforting performance than the flat, affect-less performance he had been giving before that direction.

In a long-ago session, I recorded an actor seated, and had him rise from his chair when he reached a more emotionally intense moment. It cannot be said enough; physicality shows up in the voice.

(The great, great story comes from the creation of the music for the Lord of the Rings computer games. The male chorus was giving a flat and lifeless rendition of the Dwarves song -- until the conductor had the men shift their weight from one foot to the other as they sang.)

Also unfortunately, we had only the theater lobby to work in, and it was raining. I knew I was going to get both room tone and extraneous noise on the track, but I felt I could probably work around it anyhow. Often in theater you have to accept what will work instead of what would be wonderful, because opening night is coming far too soon. And besides, if this cue is not as great as it could have been, you know there will be another show, and another opportunity, next month.

I set up an omni mic as back-up, but my primary mic was my home-made budget fish-pole boom.

I'm going to explain that, too. The fish-pole is the simplest kind of boom mic; nothing but a long stick with the mic at one end. The idea is to come down from above and stop just out of frame (that is, out of the frame of the camera for an actual movie). I've found it is also an excellent sound for voice-over work.

Putting microphones in front of talent causes many of them to deliver a performance to the mic; they get small, they talk into the mic. The voice you often want -- the voice I most certainly wanted for an imaginary scene from a B Movie -- is one that is large, space-filling, and directed out. So using the fishpole removes that obvious thing-to-talk-into and forces them to act to the room, to their partners, and to the imaginary audience.

A mic that is below the head, or even at mouth level, gives a less pleasing sound than one that is aimed towards the forehead. This is why the hairline is the superior position for wireless microphones. A boom coming down from above and forward gives a very natural, pleasing sound that mimics well how we perceive voices in ordinary surroundings.

Here's the budget fish-pole (I should write another Instructable -- it was an Instructable I got the idea from!) Get one of those extending poles they use to change lightbulbs in big buildings. I found one for under thirty bucks at Orchard Supply Hardware. The fittings on the top screwed on with the same screw as found on industrial brooms and mops. I used a grinder to make the screw just a little smaller; until I could force a microphone clip over it. Then I mixed up some epoxy and stuck a universal microphone clip (another ten bucks) on to the end.

I don't have a Sennheiser MKH-416, but I do have a Shure PG-81; a mere cardioid (instead of short shotgun) but at the boom distances I work with it works just fine.

I boomed this time through my Mackie mixer, mostly for the headphone amp; this way I can wear headphones and hear what I am recording during the session. I followed the actors somewhat, shifting the boom a little to be closer to whoever was speaking at that moment. For such a short "scene," it was easy enough to memorize the necessary moves. Had they had blocking, that, too, would have been easy enough to memorize.

Of course blocking would have meant I would have had to walk around while holding up the boom...this is why actual good boom operators are valued members of the production sound team in the film world. I'm a pretend operator, totally self-taught, but I do it for the results I've heard in my voice-over recordings. (Plus, it looks cool and gives the actors a kick!)

We recorded as many takes as we had time for, made sure the file had saved properly to hard disk, and it was on to the next step.

Oh, and I knew I had a "mojo" take in the can. Most sessions, there will be one take that will make you sit up straight. Something about it cuts through the boredom of familiarity and makes the material fresh and exciting again. Nine times out of ten, that's the take you will end up using.

Processing the vocal tracks:

This has been a dry entry so far: let's enliven it with some pictures.

Here's the raw recording session in Audacity. I recorded at a basic 44.1 kHz, 16-bit; the cue didn't call for anything more. In Audacity, I listened through the various takes and selected the take I would use -- yes, it was that "mojo" take -- copied that to a fresh file and normalized.
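For anyone curious what "normalized" boils down to, here is a minimal sketch in Python. It assumes plain peak normalization on floating-point samples where 1.0 is full scale; the function name and the target value are made up for illustration, and this is not a description of Audacity's internals.

```python
def normalize_peak(samples, target_peak=1.0):
    """Scale samples so the loudest one lands exactly on target_peak.

    Assumes float samples where 1.0 is digital full scale.
    """
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # pure silence: nothing to scale
    gain = target_peak / peak
    return [s * gain for s in samples]


# Hypothetical take whose loudest sample only reaches a quarter of full scale.
take = [0.0, 0.1, -0.25, 0.2]
print(normalize_peak(take))
```

Note that everything gets the same gain, so the rain comes up right along with the dialog. Normalizing makes a take easier to work with; it does not make it cleaner.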

As I had feared, the rain came through. In an annoying fashion. If I had been more pressed for time I might have worked with the rain instead, but since it was a relatively constant sound I was able to remove most of it from the track with SoundSoap SE (purchased at half price as a special from Musician's Friend).

The trick to digital noise removal is to have a nice chunk of sound file that doesn't have anything on it you want to keep. The breaks between sessions, for instance. This is also another good reason to record a few minutes of room tone, without any dialog in it. After that it is a matter of ears and judgment to take out as much noise as is plausible without causing audible artifacts.
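A rough way to see why that dialog-free chunk matters: it lets you put a number on the noise floor before you start removing anything. The sketch below is only the measuring half, under the assumption of float samples with 1.0 as full scale. It is not how SoundSoap works internally (I don't know that), and the function names are hypothetical.

```python
import math


def rms_dbfs(samples):
    """RMS level of a chunk of audio, in dB relative to full scale (1.0)."""
    if not samples:
        raise ValueError("need at least one sample")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")  # digital silence
    return 10.0 * math.log10(mean_square)


def signal_to_noise_db(dialog, room_tone):
    """How far the dialog sits above the room tone, in dB."""
    return rms_dbfs(dialog) - rms_dbfs(room_tone)


# Hypothetical stand-ins for a chunk of room tone and a chunk of dialog.
room_tone = [0.01, -0.01] * 100
dialog = [0.1, -0.1] * 100
print(round(rms_dbfs(room_tone), 1), round(signal_to_noise_db(dialog, room_tone), 1))
```

The smaller that gap, the harder the noise removal has to work, and the more likely you are to hear it working.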

I read an interview with a production sound person recently and he stated the best way to do noise removal is to use several different methods. Every method leaves artifacts. If you turn any single process up until all the noise is gone, you inevitably turn up some objectionable sound. So instead you apply a bunch of different processes and as each leaves different footprints on the sound, those footprints are smaller and more easily hidden.
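The arithmetic behind that advice is simple if you think in decibels. This sketch assumes, as an idealization, that the reductions from successive passes simply add; real processes interact in messier ways, and the numbers are invented for illustration.

```python
import math


def combined_reduction_db(passes_db):
    """Total noise reduction from several passes, assuming the dB figures add."""
    return sum(passes_db)


def db_to_gain(reduction_db):
    """Linear gain factor equivalent to a reduction of reduction_db decibels."""
    return 10 ** (-reduction_db / 20)


# Three gentle 6 dB passes (hypothetical) versus one aggressive 18 dB pass.
gentle = [6.0, 6.0, 6.0]
print(combined_reduction_db(gentle))
print(math.prod(db_to_gain(p) for p in gentle), db_to_gain(18.0))
```

Same total on paper, but no single process ever has to be pushed past the point where its particular artifacts become obvious.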

Anyhow -- SoundSoap was the first step, and that knocked the rain down until it wasn't objectionable. Now I could import the files into Cubase and continue to knock them into shape.

Within Cubase I cloned the track, once for each speaker, then chopped each track until only the lines of one speaker appeared on it. This meant I could apply custom equalization and compression to each individual speaker despite them having originally been on a single mono track.
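In code terms the clone-and-chop trick looks something like the sketch below. It assumes you have already marked where each speaker's lines start and end (in a sequencer you do this by ear and by eye); the region list, speaker names, and function name are all hypothetical.

```python
def split_by_speaker(samples, regions):
    """Split one mono track into one full-length track per speaker.

    regions is a list of (speaker, start, end) tuples in sample indices,
    end exclusive. Each returned track is silent outside its speaker's lines.
    """
    tracks = {}
    for speaker, start, end in regions:
        track = tracks.setdefault(speaker, [0.0] * len(samples))
        track[start:end] = samples[start:end]
    return tracks


# Tiny stand-in for the mono take, with hypothetical line boundaries.
take = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
regions = [("girl", 0, 2), ("hero", 2, 4), ("girl", 4, 6)]
print(split_by_speaker(take, regions))
```

Because every track keeps the original length, the lines stay in sync with each other, and each track can get its own EQ and compression.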

The gaps between their lines made this possible. But now that I was in an audio sequencer, I could also tighten that up a bit; I shifted several of the chunks of dialog in space in order to either close the gaps between speakers, or to allow insertion of an effect.

There was also a door that opened in the middle of the take. Of course this was the mojo take. Fortunately the door sound only occurred over one short chunk of dialog, so I pasted in those same lines from one of the other takes. The speaking tempo was different in that performance, however. But more luck; a time stretch operation, and not only did it fit the gap, it also gave the words more gravitas; it was a better line that way than what we had originally recorded.
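The time stretch itself is the sequencer's job, but the number you feed it is just a ratio of durations. A minimal sketch, with made-up durations since I did not note the real ones:

```python
def stretch_ratio(source_seconds, target_seconds):
    """Factor by which a clip must be lengthened (>1) or shortened (<1)."""
    if source_seconds <= 0 or target_seconds <= 0:
        raise ValueError("durations must be positive")
    return target_seconds / source_seconds


# Hypothetical: the replacement line runs 2.4 s but the gap is 3.0 s.
ratio = stretch_ratio(2.4, 3.0)
print(f"stretch to {ratio:.0%} of the original length")
```

A ratio above 1 slows the delivery down, which is where the extra gravitas came from. Push the ratio much further than a modest amount and the stretch artifacts start to show.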

I believe I may have applied a very slight pitch shift to one of the speakers as well, but for this project it was important to me to be honest to the voices of the original actors; to enhance them, not to hide or change them.

The girl's vocal levels shifted enough (my own clumsy boom operating was partly to blame!) that trying to fix it with compression would have resulted in too funny a sound. Thus hand-drawn volume changes, akin to what we call "riding the fader" in live music, to bring it to a consistent level where the processors could work on it.
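A hand-drawn volume curve is really just a list of breakpoints with straight lines between them. Here is a minimal sketch of that idea, assuming breakpoints given as (time in seconds, gain in dB) and sorted by time; the names are hypothetical and this is not what Cubase does under the hood.

```python
def gain_at(breakpoints, t):
    """Gain in dB at time t, linearly interpolated between sorted breakpoints."""
    if t <= breakpoints[0][0]:
        return breakpoints[0][1]
    for (t0, g0), (t1, g1) in zip(breakpoints, breakpoints[1:]):
        if t <= t1:
            return g0 + (g1 - g0) * (t - t0) / (t1 - t0)
    return breakpoints[-1][1]  # hold the last value after the final point


def apply_envelope(samples, sample_rate, breakpoints):
    """Apply the drawn gain curve to every sample."""
    out = []
    for i, s in enumerate(samples):
        gain_db = gain_at(breakpoints, i / sample_rate)
        out.append(s * 10 ** (gain_db / 20))
    return out


# Hypothetical curve: unity at the start, easing up 6 dB over the first second.
curve = [(0.0, 0.0), (1.0, 6.0)]
print(gain_at(curve, 0.5))
```

The point of doing this before compression is that the compressor then sees a steady level and only has to do the small, fast corrections it is good at.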

I worried at this point I might have to cut in room tone in every gap between dialog chunks, but I ended up going the other way: the lobby we recorded in was a little too "live" for what we were doing and I was getting some echo off the walls. Each vocal channel got, as a result, a hefty chunk of expander, set to an ultra-fast pick-up to close down the moment the last vowel sound left the actor's mouth.

Again this is a matter of listening carefully and balancing one unwanted artifact against another.
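For the curious, a downward expander can be sketched in a few lines. This version assumes an instant-attack, exponential-release level detector and a fixed ratio below the threshold; the threshold, ratio, and release values are placeholders rather than the settings I used, and a real plug-in is considerably more refined.

```python
import math


def expand(samples, sample_rate, threshold_db=-40.0, ratio=3.0, release_ms=5.0):
    """Downward expander: push anything below the threshold further down.

    With ratio 3, a level 20 dB under the threshold comes out 60 dB under it.
    A short release makes the expander close quickly once the voice stops.
    """
    release = math.exp(-1.0 / (sample_rate * release_ms / 1000.0))
    threshold = 10 ** (threshold_db / 20)
    envelope = 0.0
    out = []
    for s in samples:
        # Level detector: jump up instantly, decay exponentially.
        envelope = max(abs(s), envelope * release)
        if envelope >= threshold:
            gain = 1.0
        else:
            gain = (envelope / threshold) ** (ratio - 1.0)
        out.append(s * gain)
    return out


# Hypothetical room echo sitting at -60 dBFS gets pushed down to -100 dBFS.
print(expand([0.001] * 4, 44100))
```

Set the release too short and the ends of words get chopped; too long and the lobby echo sneaks through. That is the trade-off being balanced by ear.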

In period, dialog tended to be quite dry unless an unusual environment was being suggested (like an echoing cave). For that matter, there was a lot less production audio in the 50's; noisy cameras and so forth meant some films were entirely shot MOS and all the dialog picked up later in ADR.

Err...I'm showing off here with film terminology, and there aren't exact relationships to theater practice. "MOS" is filmed without sound. "ADR" is Automatic Dialog Replacement; the poor actor stands in front of a mic watching themselves on a screen and tries to lip-synch in the reverse direction (aka matching the words to the lips).

But this is also a philosophical question you hit every time you do a period show; how much do you want to be accurate to period, and how much do you bend to the expectations and perceptions of a modern audience? I have a byword I go to often; nothing is "old-fashioned" at the time. For someone living in the 50's, they were listening to top-notch, state-of-the-art studio sound. So we have a choice as designers; to point a finger in laughter at the quaint past we are presenting, or to bring the audience back with us to experience an earlier time as the people back then lived it.

Anyhow...the choice made this time was to do relatively modern dialog recording methods. Or, to put it another way, dialog the way most of the audience are used to hearing it.

When I'm working on a voice-over taken on a close mic (say, for a radio announcer), I often have to manually edit out plosives. Another manual edit is when your actor manages to swallow a key consonant -- you can actually paste one in from a different part of the performance. But this is long, painstaking work and you really hope you don't have to get that detailed on your tracks (I had to do this once with quarter-inch tape and a razor blade, way way back on a production of Diary of Anne Frank!)


So now the dialog was done. The client apparently expected this is where my work would stop. I knew it wouldn't; without something to look at, raw dialog can be very, very dry and boring. I played the edited dialog track in rehearsal and it was obvious it needed something more.

The first thing I tried was filling some of the space with Foley.

Well, not really. In the film world, even when there is production sound the intent by the production recordist is to get clean dialog. Not all the other sounds. Film is a world of artificial focus. Instead of hearing all the sounds of an environment, you hear a careful selection of sounds; those sounds that are most essential towards painting a picture. In film parlance, some of these are "hard effects" -- things seen on screen that have some sort of directly applicable sound effect, like motor noise on a passing car or a gun going off. Some are Foley; these are all the sounds of the characters of the film in motion; the footsteps, the clothing rustles, the fumbling hands, rattle of change in a pocket, etc.

In the film world, these sounds are produced by talented, rhythmic and athletic people known as Foley Artists (or, sometimes, Foley Dancers). They perform, like the actor in ADR, in front of a screen, but what they perform is footsteps and small parts and hand tools and bedsheets being pulled and all those other small, usually-human sounds.

So it is a misnomer to say you add Foley to a radio play. You can add similar effects, but the process is much different. Instead of matching to visual, you are trying to substitute for a visual. And there lies the problem. Foley sounds by their nature are fluid and indistinct. They mean something because we see the object that we expect to be making sound. Without seeing a man pull on a sweater, the soft slipping sounds you hear could be anything.

I've found that in general the more concrete sounds work best. Footsteps are great. And then of course what would be "hard" effects; doors, cars, gunshots, etc. You can do some fumbling and some cloth stuff, but it is more like an overall sweetener. Used nakedly, the subtler sounds tend to come across more as noise that snuck into the recording, than as sounds you designed in!

I had a cue for a previous show that was a scuffle taking place just off stage. The artists, taking their cue from the director, recorded the vocals while standing around a table. Dead, dead, dead! I was able to sell some of the scuffle with added sound effects I recorded on the spot, however -- including slapping myself so hard I got a headache!

There's the period problem again; the 50's was light on Foley (modern films are swimming in effects, and the effects are strongly present and heavily sweetened). In contrast a 50's film can be very dry. Even the effects tend to stand out isolated.

Anyhow...I cut a bunch of individual footsteps out of a recording of footsteps on leaves, did some pitch shifting and so forth, and arranged them to suggest some of the blocking that didn't actually take place. But it didn't quite fill the space properly. The effort didn't sound like a film yet. It sounded more like a noisy recording.
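One cheap way to keep a string of footsteps cut from the same recording from sounding like a loop is to nudge the pitch of each one. The sketch below does it the old tape-speed way, by resampling with linear interpolation, so the step also gets shorter or longer as the pitch moves. That is an assumption made to keep the example small; a sequencer's pitch shifter normally preserves the length. The function name is hypothetical.

```python
def resample_pitch(samples, semitones):
    """Shift pitch by resampling; positive semitones raise pitch and shorten."""
    if not samples:
        return []
    ratio = 2.0 ** (semitones / 12.0)  # equal-tempered frequency ratio
    out_len = max(1, int(len(samples) / ratio))
    out = []
    for i in range(out_len):
        pos = i * ratio            # fractional read position in the source
        j = int(pos)
        frac = pos - j
        nxt = samples[j + 1] if j + 1 < len(samples) else samples[j]
        out.append(samples[j] * (1.0 - frac) + nxt * frac)
    return out


# Hypothetical footstep, raised an octave: half as long, twice the pitch.
step = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
print(resample_pitch(step, 12))
```

For footsteps a shift of a semitone or two in either direction is usually plenty; much more and they stop sounding like the same pair of shoes.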


I am always leery about introducing music within a musical. In another cue for the same production, I conferred with the Music Director to find out what key the following song began in, and made sure my sound was within that key. This is even more critical when your sound has a defined pitch center and will be overlapping some of the music.
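Checking a pitched sound against a key is easy to sketch if you know its fundamental frequency. The example assumes A4 = 440 Hz equal temperament and a major key, neither of which comes from the show itself, and the function names are invented for illustration.

```python
import math

MAJOR_SCALE_STEPS = {0, 2, 4, 5, 7, 9, 11}  # semitones above the tonic
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def pitch_class(frequency_hz, a4_hz=440.0):
    """Nearest equal-tempered pitch class (0 = C ... 11 = B)."""
    midi_note = round(69 + 12 * math.log2(frequency_hz / a4_hz))
    return midi_note % 12


def in_major_key(frequency_hz, tonic):
    """True if the nearest note to frequency_hz belongs to tonic major."""
    tonic_pc = NOTE_NAMES.index(tonic)
    return (pitch_class(frequency_hz) - tonic_pc) % 12 in MAJOR_SCALE_STEPS


# A sound centered near 466 Hz (B flat) clashes in C major but fits F major.
print(in_major_key(466.16, "C"), in_major_key(466.16, "F"))
```

In practice your ears and the Music Director are the real test; this only tells you whether you are starting in the right neighborhood.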

For a full-length movie or more typical theatrical underscore the first composing step is to basically sit at a piano and noodle; to come up with some kinds of themes and motifs. For an excerpt this short, I knew I'd be basically comping; even if a motif showed up, it would be created just for that moment anyhow.

So I put the completed dialog track on loop, plugged in a VST instrument, and started noodling along to see what sort of musical development might occur and what the tempo might be.

Musically, the major moments were as follows; first the girl talks about her encounter with the werewolf. The hero briefly comforts her. Then the Doctor speaks up in one of those "for the benefit of the audience" speeches that in B Movies are often the big morality lecture at the end; "Perhaps Man was not meant to explore space." What I heard in my head for this moment was a french horn or somber brass doing a stately slow march with much gravitas; the "grand philosophical themes are being discussed here" effect.

Okay, and then the switch; the girl reveals the werewolf is her brother AND is a stock car racer (!!!) And to finish up this emotional turning point, the hero notices there is a full moon (apparently rising over the local dirt racing track).

And orchestral scoring didn't work. It probably would have worked if I had had time, but it would have required enough MIDI tracks to write by section and fill out a full studio orchestra; at least three violins, 'cello, bass, two winds, keyboard, percussion, etc. And I'd have to spend the time to work out harmonic development and voice leading for all these parts. A good week of work to do it right. Plus of course movie music of the 50's had a particular sound informed by aesthetics, circumstance, and technical limitations. So more work there in altering the sound of the instruments to feel appropriate and to blend into that distinctive sound.

So the alternative was to score on the cheap; to use as so many budget movies of the time had, the venerable Hammond B3 to comp and noodle through most of the score (with, one presumes, more instruments budgeted for the big title track).

And that also gave me an exciting and iconic way to treat the big turning point; an electric guitar.

Jump back a page. One of the requirements for this effect, stated directly in the script, is "werewolf howls." During the VO session, the director mentioned she did a great werewolf, and demonstrated. Which, since I am a canny and experienced recordist, I captured on one of the mics that was open at the time. With some processing and clean-up that became the werewolf effect for the show.

I liked it so much because of an unexpected quality. This was not a dirty, animalistic sound. There was no slaver in it. Nor was it a mournful, poor-me-I've-become-a-monster sound. Instead it was a full-throated "I'm a wolf and this is my night to howl!"

Which changed, in my mind, the entire character of the movie. Up until the emotional turning point it has been a sad, somber (remember those french horns?) depiction of the descent of an innocent young man into some horrible transformation. Then the wolf howls, accompanied by an upbeat electric guitar chord; this is a wolf that revels in his transformation and is not about to be steamrollered by fate. He's gonna howl, and he's gonna win that stock car race, too. If he can just figure out how to get his paw around the stick shift!

So the new version of the score was a mere pedal point under the girl's first speech, then a somber minor-key progression of chords under the Doctor's big speech ("The radiation has transformed him into some kind of a monster, half man, half beast.") And then a jangling electric guitar over the howl of the wolf.

I got lucky; the Doctor's speech worked out to six bars at 110 BPM; I was able to establish a tempo track and turn on the metronome while recording the organ part. The characteristic swell pedal effect of the B3 was roughed in with the volume slider on my keyboard and cleaned up manually in the piano roll view.
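The sum behind "six bars at 110 BPM" is worth spelling out, since it is how you check whether a speech will sit on a tempo grid at all. This sketch assumes 4/4 time, which is my assumption for the example rather than something stated above; the function name is hypothetical.

```python
def bars_to_seconds(bars, bpm, beats_per_bar=4):
    """Length of a passage in seconds, given tempo and meter."""
    if bpm <= 0:
        raise ValueError("bpm must be positive")
    return bars * beats_per_bar * 60.0 / bpm


# Six bars at 110 BPM in 4/4 comes to about 13.1 seconds.
print(round(bars_to_seconds(6, 110), 1))
```

Run it the other way around (measure the speech, try a few tempos and bar counts) and you can find a tempo where the dialog lands on whole bars without having to touch the edit.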

But then I went back once again, because the script specifically says "eerie music" and besides just opening with the girl's lightly-underscored dialog wasn't selling the moment -- nor was it making a clear transition from the previous scene change.

So I added a theremin at the top. This is sort of a-chronological; we are joining the movie in the middle of a scene. There is not really a theremin at that point of the score (can you say it isn't diegetic there?) Instead this is like an overlapping sound from the previous scene; at some point before we joined the scene there was a theremin, plus a brassy main title track, and who knows what else. But as we join the scene, that extra-scene element is just fading out.

Well, I think it comes across the way I intended it!

The theremin, by the way, is pre-recorded. I didn't have time to try to perform and/or draw a convincing and idiomatic theremin, and I don't own a real one at the moment. So instead I purchased a pre-existing bit off of my usual supplier.

Anyhow. Last step is to route all the VST instruments to a group bus and apply bus effects to it; a bit of reverb and EQ mostly.

And then to do some overall effects using the master effects section; a fairly strong mid-range EQ, mostly, to make the track pop and to give just a little sense of being a period film soundtrack (I didn't want to go too far in this direction -- the aesthetic concept again of hearing the sound as the people of the time would have heard it. But, also, the track was so nice I hated to grunge it up!)

