Digital Audio in the Digital Cinema
by Michael Karagosian
©2000 MKPE Consulting All rights reserved worldwide
Published in the September 2000 issue of Film Journal
We are all familiar with today's competitive
world of cinema digital audio. Three major "formats" exist side-by-side, each
with its own compression and storage method, each requiring its own electronic boxes in
the projection booth to decode and reproduce the digital audio signal. With few
exceptions, the end result for the listening audience is the same. Each format is capable of
reproducing 5.1 or so-called 6.1 tracks, with the only exception to the norm being a
relatively small number of 7.1 mixes, which can be reproduced by only one of the formats.
The point is that despite the competitive nature of digital audio today, the audience
experience is very much the same in all cases.
The competitive nature of cinema digital audio today stems from the widely different
technologies used to create this vanilla sound experience. Two of the formats store
digital audio on film in uniquely clever ways, occupying different real estate on the
film, while the third stores digital audio on CD-ROMs, which are synchronized to the film.
Due to the technical challenges presented by each of these storage methods, very
different, and in some cases, very radical compression schemes are used. Yet, in spite of
the very clever engineering that has gone into each of these digital audio formats, they
produce little difference in audience experience. The focus of competitive digital audio
today is not to provide a difference in quality and experience for the audience, as was
once the situation when SVA tracks were the norm, but to hoard market share and sell
This sad twist of focus among the major cinema audio companies has forced the exhibitor to
invest in an array of electronic equipment that, in toto, does little to enhance the sound of
the theatre. It's no wonder that exhibitors complain about the state of digital audio
today. To help overcome this burden, more and more features are now distributed with all
three digital sound tracks. Nevertheless, the damage has already been done. Wouldn't it
have been a lot smarter if, instead, exhibitors were encouraged to invest in additional
arrays of speakers and amplifiers? More channels would allow a producer to create unique
and interesting mixes for their features and enrich the audience experience.
Enter the world of digital cinema. Right from the start, digital cinema imposes a major
change in the playing field for digital audio. Instead of vying for real estate on film,
and meeting the unique technical challenges imposed by that real estate, digital audio is
handed a chunk of digital real estate in the form of a computer file. Gone are the
technical challenges imposed by the storage scheme. In fact, if we look at the prototype
digital cinema format in use today, audio is stored in its full 24-bit glory, rather than
being subjected to one of several clever compression schemes. Further, the image is also
stored in a digital file, and being the giant of the two, it forces certain standards that
limit potential variations on the stored audio signal.
However, today's prototype digital cinema is not scalable, and cannot become the digital
cinema of the future. It is safe to bet that "full rollout" digital cinema will
be based on a different storage and playback scheme. The big question for digital audio in
tomorrow's digital cinema will be how strictly the storage method will be defined and how it
will allow for product differentiation.
Unfortunately, there is still room in digital cinema for the kind of crafty engineering
that will lead to a monopoly of the sound system. The question is not whether this is
possible, but whether the production community will allow for it. This may seem
surprising. After all, how clever can the audio companies be? The production community has
24-bit audio available to them today. It seems like it would be a hard sell to convince
the producers that less is more.
But less could be more. You have to understand that when I say "less", I'm
referring to compression. Let's take a closer look at the prototype digital cinemas of
today. The average file size for a 90-minute feature is around 40 GB. Of that, nearly 5 GB
is audio, or stated differently, about 12% of the feature storage area is consumed by
audio. Not a bad figure if you're thinking in terms of bang for byte. However, this is
just vanilla 5.1 audio. If we want a richer and more interesting sound in the future,
we'll need more tracks. More tracks will drive that percentage figure higher, giving less
bang for the byte. Less bang for the byte paves the way for compression, and compression
could pave the way for product differentiation as we know it today, even in spite of the
standardization efforts now underway.
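The arithmetic behind these figures can be sketched in a few lines of Python. This is only a rough check, not a description of any actual digital cinema file format: the 48 kHz sampling rate, the decision to store every channel (including the subwoofer channel) at full rate, and the use of decimal gigabytes are all assumptions that do not come from the text above, and the 16-channel case is purely illustrative.

```python
# Uncompressed PCM audio size for a 90-minute feature.
# Assumed, not from the article: 48 kHz sampling, every channel (including the
# ".1" subwoofer channel) stored at full rate, and 1 GB = 10**9 bytes.
SAMPLE_RATE_HZ = 48_000
BYTES_PER_SAMPLE = 3          # 24-bit samples, as stated in the article
FEATURE_SECONDS = 90 * 60     # 90-minute feature, as stated in the article
FEATURE_FILE_GB = 40.0        # approximate total file size from the article


def audio_size_gb(channels: int) -> float:
    """Return the uncompressed audio size in GB for the given channel count."""
    total_bytes = channels * SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * FEATURE_SECONDS
    return total_bytes / 1e9


# Everything in the file that is not 5.1 audio (mostly image data).
NON_AUDIO_GB = FEATURE_FILE_GB - audio_size_gb(6)


def audio_share(channels: int) -> float:
    """Return audio's fraction of the file if only the channel count changes."""
    audio_gb = audio_size_gb(channels)
    return audio_gb / (NON_AUDIO_GB + audio_gb)


if __name__ == "__main__":
    for label, channels in [("5.1", 6), ("7.2", 9), ("16 channels", 16)]:
        print(f"{label}: {audio_size_gb(channels):.2f} GB, "
              f"{audio_share(channels):.1%} of the file")
```

Under these assumptions, six channels come to roughly 4.7 GB, or just under 12% of a 40 GB file, which is in line with the figures quoted above. Raising the channel count while the image data stays fixed pushes the audio share upward.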
These calculations, though, could be moot given the uncertainties presented by the image
data. Full rollout digital cinema may very well support projectors having higher
resolutions than we now experience, resulting in larger image files. Larger image files
become necessary to store the additional information required of higher resolutions, even
with compression. And larger image files will lower the percentage of digital storage real
estate that audio will occupy, making it all the less likely that audio will be
compressed, even after the eventual evolution of increasing the number of channels. Still,
light compression may be desirable, but light compression can easily be standardized,
removing the incentive for product differentiation at the compression level.
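A short Python sketch shows how larger image files change the picture. It starts from the 40 GB and 5 GB figures quoted earlier; the two scale factors are hypothetical values chosen only to illustrate the trend, not predictions about future formats.

```python
def audio_share(audio_gb: float, image_gb: float) -> float:
    """Return audio's fraction of the combined audio and image storage."""
    return audio_gb / (audio_gb + image_gb)


AUDIO_GB = 5.0                 # "nearly 5 GB" of audio, from the article
IMAGE_GB = 40.0 - AUDIO_GB     # remainder of the 40 GB feature file

IMAGE_SCALE = 4.0  # hypothetical: higher resolution quadruples the image data
AUDIO_SCALE = 2.0  # hypothetical: the number of audio channels doubles

if __name__ == "__main__":
    today = audio_share(AUDIO_GB, IMAGE_GB)
    future = audio_share(AUDIO_GB * AUDIO_SCALE, IMAGE_GB * IMAGE_SCALE)
    print(f"audio share today:  {today:.1%}")
    print(f"audio share future: {future:.1%}")
```

With these illustrative factors, the audio share falls from 12.5% to under 7% even though the audio itself has doubled in size.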
Still, even without compression there remains room for monopolizing the sound format.
Digital cinema will provide the means for sophisticated metadata, which could become the
battleground of the future for digital cinema turf. Metadata is defined as "data
about data", or in more practical words, data that describes the contents of a file.
Metadata is useful in that it is machine readable, providing a way to embed specific
instructions within the stored audio file that are read by the digital audio playback
system. Here's a scenario that could well occur. Audio company A sells equipment to
the postproduction house that mixes audio for the feature. During the mix, the mixing
engineer finds that desirable results are obtained by using Feature X of Company A's
equipment. Feature X works by embedding certain commands in the metadata space of the
digital audio stream, or simply embedding the commands in the audio stream itself. These
commands are uniquely recognized by Company A's playback equipment, forcing the exhibitor
to buy that particular brand and model of audio processor to correctly play the movie
soundtrack. Say a well-known cinema "standards" company (you know those three
letters?) gets behind it and requires theatres displaying their brand to buy the special
gear. All of the standards in the world cannot prevent this scenario from taking place.
One would hope that attempts to monopolize the audio format will be thwarted by the
production community. The choice will be entirely theirs to make.
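To make the scenario concrete, here is a minimal Python sketch of how a playback processor might handle metadata commands it does not recognize. Everything in it is hypothetical: the `AudioFrame` structure, the `gain` and `vendor_a.feature_x` keys, and the trivial level change standing in for "Feature X" are invented for illustration and do not describe any real product or standard.

```python
from dataclasses import dataclass, field


@dataclass
class AudioFrame:
    """A block of samples plus the metadata commands that travel with it."""
    samples: list
    metadata: dict = field(default_factory=dict)


# Hypothetical set of commands that every playback processor understands.
STANDARD_KEYS = {"gain"}


def play(frame: AudioFrame, understood_keys: set):
    """Apply the commands this processor understands and report the rest."""
    out = list(frame.samples)
    ignored = []
    for key, value in frame.metadata.items():
        if key not in understood_keys:
            ignored.append(key)   # unknown command: skipped silently
            continue
        # Both commands are modeled as a simple level change; the vendor
        # command is only a stand-in for some proprietary process.
        out = [s * value for s in out]
    return out, ignored


if __name__ == "__main__":
    frame = AudioFrame([1.0, -0.5], {"gain": 0.5, "vendor_a.feature_x": 2.0})
    print(play(frame, STANDARD_KEYS))
    print(play(frame, STANDARD_KEYS | {"vendor_a.feature_x"}))
```

The generic processor plays the track but skips the vendor command, so its output differs from what the mixing engineer heard. Only the processor that understands the proprietary key reproduces the intended mix, which is exactly the lock-in described above.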
In an ideal world, the radical changes to audio systems imposed by digital cinema will
shift the investment funnel from multiple cinema audio processors to additional amplifiers
and speakers. This should lead to an improvement in audience experience, which is exactly
what is needed to further differentiate cinema from home theatre. This shouldn't spell
doom for the cinema processor companies, though. Digital cinema systems impose unique
technical challenges, leaving room for products to go back to competing on a feature and
quality basis, rather than on a monopolized sound track basis. Instead of producing highly
engineered products that provide little difference in audience experience, there is the
potential for feature-rich digital cinema processors that provide sophisticated remote
maintenance capability, for instance.
Among the many features one can envision is modular processing. Let's say that in our
digital cinema of the future, our finest screen, Screen 6, has a 5.1 digital audio system.
But the latest blockbuster feature coming out requires the shiny new 7.2 format. That
means adding another subwoofer and two full-range speaker systems to the auditorium.
This shouldn't require discarding our previous investment in a digital audio processor.
Instead, we should be able to add on a new parallel processor that expands our I/O
capability. One can think of audio processing in the future in terms of modular
"bricks", instead of dedicated products that are limited in their flexibility.
Technologies such as IEEE 1394 and DSP can go a long way towards making this a reality.
This is an area where standards can greatly benefit both the production and exhibition
communities by providing a standard way for sound systems to grow. Not necessarily at the
product or "brick" level, but more importantly, at the level of the digital bit
stream that feeds the digital processors.
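The "brick" idea can be sketched as a simple capacity check in Python. The `Brick` type, the brick names, and the output counts are hypothetical; the channel counts assume that 5.1 means six discrete outputs and 7.2 means nine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Brick:
    """A hypothetical modular audio processor with a fixed number of outputs."""
    name: str
    outputs: int


# Discrete output channels needed for each layout (assumed: 5.1 = 6, 7.2 = 9).
LAYOUT_CHANNELS = {"5.1": 6, "7.2": 9}


def outputs_short(bricks, layout: str) -> int:
    """Return how many more outputs the installed bricks need for a layout."""
    available = sum(b.outputs for b in bricks)
    return max(0, LAYOUT_CHANNELS[layout] - available)


if __name__ == "__main__":
    screen_6 = [Brick("main processor", 6)]
    print(outputs_short(screen_6, "7.2"))    # outputs missing before expansion
    screen_6.append(Brick("expansion brick", 4))
    print(outputs_short(screen_6, "7.2"))    # outputs missing after expansion
```

In this sketch the existing processor stays in place and the shortfall of three outputs (one subwoofer and two full-range channels) is covered by adding a second brick alongside it.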
Digital cinema clearly imposes a new order on the world of digital audio. It has the
potential to give the audio production community the tools they have long desired. To the
exhibitor, it can provide the opportunity to invest in audio equipment that clearly adds
to the audience experience. To get there will take a concerted effort on the part of both
standards makers and the production community. Let's hope we do this one well.