Digital Audio in the Digital Cinema

by Michael Karagosian
©2000 MKPE Consulting  All rights reserved worldwide
Published in the September 2000 issue of Film Journal

We are all familiar with today's competitive world of cinema digital audio. Three major "formats" exist side by side, each with its own compression and storage method, each requiring unique electronic boxes in the projection booth to decode and reproduce the digital audio signal. With few exceptions, the end result for the listening audience is the same. Each format is capable of reproducing 5.1 or so-called 6.1 tracks, with the only exception to the norm being a relatively small number of 7.1 mixes, which can be reproduced by only one of the formats. The point is that despite the competitive nature of digital audio today, the audience experience is very much the same in all cases.

The competitive nature of cinema digital audio today stems from the widely different technologies used to create this vanilla sound experience. Two of the formats store digital audio on film in uniquely clever ways, occupying different real estate on the film, while the third stores digital audio on CD-ROMs, which are synchronized to the film. Due to the technical challenges presented by each of these storage methods, very different, and in some cases very radical, compression schemes are used. Yet, in spite of the very clever engineering that has gone into each of these digital audio formats, they produce little difference in audience experience. The focus of competitive digital audio today is not to provide a difference in quality and experience for the audience, as was once the situation when SVA tracks were the norm, but to hoard market share and sell equipment.

This sad twist of focus among the major cinema audio companies has forced the exhibitor to invest in an array of electronic equipment that, in toto, does little to enhance the sound of the theatre. It's no wonder that exhibitors complain about the state of digital audio today. To help overcome this burden, more and more features are now distributed with all three digital sound tracks. Nevertheless, the damage has already been done. Wouldn't it have been a lot smarter if, instead, exhibitors were encouraged to invest in additional arrays of speakers and amplifiers? More channels would allow a producer to create unique and interesting mixes for their features and enrich the audience experience.

Enter the world of digital cinema. Right from the start, digital cinema imposes a major change in the playing field for digital audio. Instead of vying for real estate on film, and meeting the unique technical challenges imposed by that real estate, digital audio is handed a chunk of digital real estate in the form of a computer file. Gone are the technical challenges imposed by the storage scheme. In fact, if we look at the prototype digital cinema format in use today, audio is stored in its full 24-bit glory, rather than being subjected to one of several clever compression schemes. Further, the image is also stored in a digital file, and being the giant of the two, it forces certain standards that limit potential variations on the stored audio signal.

However, today's prototype digital cinema is not scalable, and cannot become the digital cinema of the future. It is safe to bet that "full rollout" digital cinema will be based on a different storage and playback scheme. The big question for digital audio in tomorrow's digital cinema will be how strictly the storage method will be defined and how much room it will leave for product differentiation.

Unfortunately, there is still room in digital cinema for the kind of crafty engineering that will lead to a monopoly of the sound system. The question is not whether this is possible, but whether the production community will allow for it. This may seem surprising. After all, how clever can the audio companies be? The production community has 24-bit audio available to them today. It seems like it would be a hard sell to convince the producers that less is more.

But less could be more. You have to understand that when I say "less", I'm referring to compression. Let's take a closer look at the prototype digital cinemas of today. The average file size for a 90-minute feature is around 40GB. Of that, nearly 5GB are audio, or stated differently, about 12% of the feature storage area is consumed by audio. Not a bad figure if you're thinking in terms of bang for byte. However, this is just vanilla 5.1 audio. If we want a richer and more interesting sound in the future, we'll need more tracks. More tracks will drive that percentage figure higher, giving less bang for the byte. Less bang for the byte paves the way for compression, and compression could pave the way for product differentiation as we know it today, even in spite of the standardization efforts now underway.
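The figures above can be checked with a little arithmetic. The sketch below is only an illustration under stated assumptions: it assumes 48 kHz sampling, decimal gigabytes, and an LFE track stored as a full channel, none of which are given in the text. The 24-bit depth, the 90-minute running time, and the 40GB total come from the paragraph above, and the function names are made up for this example.

```python
# Assumptions (not from the article): 48 kHz sampling, decimal gigabytes,
# and the LFE (".1") track stored as a full-bandwidth channel.
SAMPLE_RATE_HZ = 48_000
BYTES_PER_SAMPLE = 3       # 24-bit audio, as stated in the article
FEATURE_MINUTES = 90       # article's running time
FEATURE_TOTAL_GB = 40.0    # article's average file size


def audio_gb(channels: int, minutes: int = FEATURE_MINUTES) -> float:
    """Size in GB of uncompressed PCM audio for the given channel count."""
    total_bytes = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * channels * minutes * 60
    return total_bytes / 1e9


def audio_share(channels: int, image_gb: float) -> float:
    """Fraction of the whole feature file that is taken up by audio."""
    audio = audio_gb(channels)
    return audio / (audio + image_gb)


# Image size implied by a 40 GB feature that carries a six-channel 5.1 mix.
IMAGE_GB = FEATURE_TOTAL_GB - audio_gb(6)

if __name__ == "__main__":
    for label, channels in [("5.1", 6), ("7.1", 8), ("7.2", 9)]:
        print(f"{label}: {audio_gb(channels):.2f} GB audio, "
              f"{audio_share(channels, IMAGE_GB):.1%} of the file")
```

Under these assumptions a 5.1 mix comes to roughly 4.7GB, just under 12% of the file, which agrees with the figures quoted above. Holding the image data fixed, a nine-channel 7.2 mix would climb to roughly 16.5%.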

These calculations, though, could be moot given the uncertainties presented by the image data. Full rollout digital cinema may very well support projectors having higher resolutions than we now experience, resulting in larger image files. Larger image files become necessary to store the additional information that higher resolutions require, even with compression. And larger image files will lower the percentage of digital storage real estate that audio will occupy, making it all the less likely that audio will be compressed, even after the number of channels eventually increases. Light compression may still be desirable, but light compression can easily be standardized, removing the incentive for product differentiation at the compression level.
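To see how quickly a larger image file dilutes the audio share, consider the rough sketch below. It assumes 48 kHz, 24-bit uncompressed audio on a 90-minute feature, about 35GB of image data today (the 40GB total less roughly 5GB of audio), and image growth factors of 2 and 4 that are purely hypothetical. The function names are invented for this example.

```python
# Assumptions: 48 kHz sampling (not stated in the article), 24-bit PCM,
# a 90-minute feature, and about 35 GB of image data today.
# The growth factors passed in below are hypothetical.
TODAY_IMAGE_GB = 35.0


def pcm_gb(channels, minutes=90, sample_rate=48_000, bytes_per_sample=3):
    """Size in decimal GB of uncompressed PCM audio."""
    return sample_rate * bytes_per_sample * channels * minutes * 60 / 1e9


def audio_fraction(channels, image_gb):
    """Fraction of the whole feature file that is audio."""
    audio = pcm_gb(channels)
    return audio / (audio + image_gb)


def shares_by_growth(channels, growth_factors):
    """Audio fraction for each hypothetical image growth factor."""
    return [audio_fraction(channels, TODAY_IMAGE_GB * g) for g in growth_factors]


if __name__ == "__main__":
    # Nine channels stands in for a future 7.2 mix.
    for growth, share in zip((1, 2, 4), shares_by_growth(9, (1, 2, 4))):
        print(f"image x{growth}: audio is {share:.1%} of the file")
```

With these numbers, a nine-channel mix that would occupy about 17% of today's file drops to about 9% if the image data doubles and to under 5% if it quadruples.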

Still, even without compression there remains room for monopolizing the sound format. Digital cinema will provide the means for sophisticated metadata, which could become the battleground of the future for digital cinema turf. Metadata is defined as "data about data", or in more practical words, data that describes the contents of a file. Metadata is useful in that it is machine readable, providing a way to embed specific instructions within the stored audio file that are read by the digital audio playback system. Here's a scenario that could well occur. Audio Company A sells equipment to the postproduction house that mixes audio for the feature. During the mix, the mixing engineer finds that desirable results are obtained by using Feature X of Company A's equipment. Feature X works by embedding certain commands in the metadata space of the digital audio stream, or simply embedding the commands in the audio stream itself. These commands are recognized only by Company A's playback equipment, forcing the exhibitor to buy that particular brand and model of audio processor to correctly play the movie soundtrack. Say a well-known cinema "standards" company (you know those three letters?) gets behind it and requires theatres displaying their brand to buy the special gear. All of the standards in the world cannot prevent this scenario from taking place. One would hope that attempts to monopolize the audio format will be thwarted by the production community. The choice will be entirely theirs to make.
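The lock-in mechanism in this scenario can be sketched in a few lines. Everything in the example is hypothetical: the metadata keys, the "vendor_a.feature_x" command, and the two processor definitions are invented for illustration and do not describe any real product or standard.

```python
# Hypothetical illustration only: the keys, values, and processors below
# are invented and do not correspond to any real format or product.
STANDARD_KEYS = {"channel_layout", "sample_rate_hz", "bit_depth"}


def unrecognized_keys(metadata, supported):
    """Metadata commands that a given playback processor cannot interpret."""
    return sorted(key for key in metadata if key not in supported)


def can_play_as_mixed(metadata, supported):
    """True only if the processor understands every embedded command."""
    return not unrecognized_keys(metadata, supported)


# A soundtrack mixed with Company A's Feature X carries one extra command.
soundtrack_metadata = {
    "channel_layout": "5.1",
    "sample_rate_hz": 48000,   # assumed value, not from the article
    "bit_depth": 24,
    "vendor_a.feature_x": {"mode": "on"},
}

generic_processor = STANDARD_KEYS
vendor_a_processor = STANDARD_KEYS | {"vendor_a.feature_x"}

if __name__ == "__main__":
    for name, supported in [("generic", generic_processor),
                            ("vendor A", vendor_a_processor)]:
        missing = unrecognized_keys(soundtrack_metadata, supported)
        print(f"{name}: plays as mixed = "
              f"{can_play_as_mixed(soundtrack_metadata, supported)}, "
              f"unrecognized = {missing}")
```

The standardized fields are identical for both processors. The single vendor-specific command is what separates a processor that plays the track as mixed from one that cannot.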

In an ideal world, the radical changes to audio systems imposed by digital cinema will shift the investment funnel from multiple cinema audio processors to additional amplifiers and speakers. This should lead to an improvement in audience experience, which is exactly what is needed to further differentiate cinema from home theatre. This shouldn't spell doom for the cinema processor companies, though. Digital cinema systems impose unique technical challenges, leaving room for products to go back to competing on a feature and quality basis, rather than on a monopolized sound track basis. Instead of producing highly engineered products that provide little difference in audience experience, there is the potential for feature-rich digital cinema processors that provide, for instance, sophisticated remote maintenance capability.

Among the many features one can envision is modular processing. Let's say that in our digital cinema of the future, our finest screen, Screen 6, has a 5.1 digital audio system. But the latest blockbuster feature coming out requires the shiny new 7.2 format. That means adding a subwoofer and two full-range speaker systems in the auditorium. This shouldn't require discarding our previous investment in a digital audio processor. Instead, we should be able to add on a new parallel processor that expands our I/O capability. One can think of audio processing in the future in terms of modular "bricks", instead of dedicated products that are limited in their flexibility. Technologies such as IEEE 1394 and DSP can go a long way towards making this a reality. This is an area where standards can greatly benefit both the production and exhibition communities by providing a standard way for sound systems to grow, not necessarily at the product or "brick" level, but more importantly, at the level of the digital bit stream that feeds the digital processors.

Digital cinema clearly imposes a new order on the world of digital audio. It has the potential to give the audio production community the tools they have long desired. To the exhibitor, it can provide the opportunity to invest in audio equipment that clearly adds to the audience experience. To get there will take a concerted effort on the part of both standards makers and the production community. Let's hope we do this one well.