Friday, April 9, 2010

Digital Audio

by James Meder

History - Audio’s Transformation from Analog to Digital

1976 - “The first 16-bit digital recording in the US was made at the Santa Fe Opera on a handmade Soundstream digital tape recorder developed by Dr. Thomas G. Stockham.” (Pohlmann)

1982 - “The first digital audio 5-inch CD discs marketed, merging the consumer music industry with the computer revolution.” (Pohlmann)

1988 - “For the first time, CD sales surpassed LP sales, leaving CD and cassettes as the two dominant consumer formats...” (Pohlmann)

1998 - “Jonell Polansky produced the first 24-bit 48-track digital recording session at Ocean Way on Nashville's Music Row.” (Pohlmann)

2001 - “Apple Computer introduced on Oct. 23 the iPod portable music player. (solid-state iPod released Jan. 11 2005).” (Pohlmann)

Digital Audio Basics

“At its most elementary level, it is simply a process by which numeric representations of analog signals (in the form of voltage levels) are encoded, processed, stored, and reproduced over time through the use of a binary number system.” (Huber)

Digital music systems use a binary (base-2) system, in which each digit is encoded in the storage medium as one of two states: 1 or 0, on or off, voltage or no voltage, magnetic flux or no flux, optical reflection off a surface or no reflection.

Sample Rates, Bit Depth and Resolution

In analog, continuous signals are passed, recorded, stored, and reproduced as changes in voltage level over time.

In digital, the signal is instead measured at regular intervals set by the sample rate, and each measurement becomes a binary word that represents the signal's level at that instant. These binary words are stored and later used to recreate the waveform during the D/A process.

A sample rate of 48 kHz represents a sample every 1/48,000th of a second.
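The arithmetic behind that statement can be sketched in a few lines of Python. This is only an illustration; the function names are made up for this example and do not come from any audio library.

```python
def sample_period_us(sample_rate_hz):
    """Time between two consecutive samples, in microseconds."""
    return 1_000_000 / sample_rate_hz


def samples_in(duration_s, sample_rate_hz):
    """Number of samples taken on one channel over a given duration."""
    return round(duration_s * sample_rate_hz)


# At 48 kHz a new sample is taken roughly every 20.8 microseconds.
period_48k = sample_period_us(48_000)

# One minute of a single channel at the CD rate of 44.1 kHz.
one_minute_cd = samples_in(60, 44_100)
```

One minute of one channel at 44.1 kHz works out to 2,646,000 samples, which gives a sense of how quickly the data adds up.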

The higher the sample rate, the wider the signal bandwidth that can be captured. This makes it possible to clearly represent the higher frequencies of a digitally recorded sound (higher resolution).

“During the sampling process, an incoming analog signal is sampled at discrete and precisely timed intervals (as determined by the sample rate). At each interval, this analog signal is momentarily ‘held,’ while the converter goes about the process of determining what the voltage level actually is, with a degree of accuracy that’s defined by the converter’s circuitry.” (Huber)

A binary-encoded number that represents the analog level is then passed to the computer (hard drive or other storage device).

From there, the audio is stored and can be retrieved later, when each binary word is reassembled at its timed interval during the D/A process.

The converter then continues the process by quickly moving on to the next sampling period.

“The Nyquist Theorem states that in order for the desired frequency bandwidth to be faithfully encoded in the digital domain, the selected sample rate must be at least twice as high as the highest frequency to be recorded. If you wish to record and capture a frequency of 20 kHz (upper level of human hearing), the sample rate must be at least 40 kHz.” (Huber)
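A short Python sketch can make the theorem concrete. It assumes an idealized case: a single pure tone sampled with no anti-alias filter in front of the converter. The function names are hypothetical.

```python
def min_sample_rate(highest_freq_hz):
    """Lowest sample rate that satisfies the Nyquist requirement."""
    return 2 * highest_freq_hz


def nyquist_frequency(sample_rate_hz):
    """Highest frequency a given sample rate can represent."""
    return sample_rate_hz / 2


def alias_frequency(freq_hz, sample_rate_hz):
    """Frequency at which a pure tone shows up after sampling.

    Tones below the Nyquist frequency come back unchanged; tones above it
    'fold' back down into the audible band as aliases.
    """
    folded = freq_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


# A 10 kHz tone sampled at 48 kHz is captured as 10 kHz.
in_band = alias_frequency(10_000, 48_000)

# A 30 kHz tone sampled at 48 kHz is misread as an 18 kHz tone.
aliased = alias_frequency(30_000, 48_000)
```

The second case is why frequencies above half the sample rate have to be filtered out before conversion: once sampled, the 30 kHz tone is indistinguishable from a real 18 kHz tone.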

‘Quantization’ is the amplitude side of the sampling process. At each sample, the voltage level of the signal is rounded to the nearest available step and encoded as binary digits so that it can be stored and later recreated.

Currently, the most common binary word length for consumer audio is 16-bit (CD quality).

Added internal headroom at the bit level helps reduce errors in level and performance at low-level resolutions.

Also, the higher the bit depth, the more headroom is available, which reduces the chance of clipping and digital distortion. In the near future, systems of 32- and 64-bit resolution will become the norm.
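To put numbers on this, here is a minimal Python sketch of quantization. It assumes a signed integer format and a signal normalized to the range -1.0 to +1.0; `quantize` and `dynamic_range_db` are names invented for this example. The dynamic range figure uses the common rule of thumb of about 6 dB per bit.

```python
import math


def quantize(voltage, bits, full_scale=1.0):
    """Round a voltage to the nearest signed integer code for a word length.

    Anything beyond full scale is clamped, which is what digital clipping is.
    """
    max_code = 2 ** (bits - 1) - 1
    min_code = -(2 ** (bits - 1))
    code = round(voltage / full_scale * max_code)
    return max(min_code, min(max_code, code))


def dynamic_range_db(bits):
    """Theoretical dynamic range: the ratio of 2**bits steps, in decibels."""
    return 20 * math.log10(2 ** bits)


full_scale_16 = quantize(1.0, 16)    # largest 16-bit code
clipped_16 = quantize(-2.0, 16)      # an over-range input is clamped
range_16 = dynamic_range_db(16)      # about 96 dB
range_24 = dynamic_range_db(24)      # about 144 dB
```

Going from 16 to 24 bits adds roughly 48 dB of theoretical range, which is where the extra headroom comes from.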

Analog-to-Digital and Digital-to-Analog Conversion

“In its most basic form, the digital recording chain includes a low-pass filter (or anti-alias filter), a sample-and-hold circuit, an analog-to-digital converter, the circuitry for signal coding (or multiplexor), and error correction.” (Huber)

The low-pass filter blocks frequencies greater than half the sample rate so that they never reach the converter.

A Sample-and-Hold (S/H) circuit holds and measures the analog voltage level for the duration of a single sample period. The length of that period depends on the sample rate.

For the next step, A/D conversion happens by encoding the signal into a binary word. This is the most important step because the converter has to create the binary word efficiently from a DC voltage level, quantize it correctly to the nearest step level, and do so quickly enough to move on to the next sample.

The digital data then needs to be sent to storage, but before that happens, more processing and conditioning take place, including data coding, data modulation, and error correction “(synchronization and address information)”. Rather than being kept raw, the data is coded into a form that can be easily and accurately stored and found.

An important process of digital audio data coding is pulse-code modulation (PCM).

“The density of stored information within a PCM recording and playback system is extremely high.” (Huber) To compensate for any problems in the audio caused by the large amounts of data, many forms of error correction are used.

A mathematical pattern is used with PCM and error correction, where the binary words are interleaved: they are stored in the bitstream out of their original order, following a known pattern rather than a truly random one. When the original order is restored, a burst of damaged data ends up spread across many non-adjacent words, so the corrupted audio can be recovered when the missing ‘puzzle piece’ of the bitstream is regenerated during the D/A conversion process.

Without this part of the coding process, digital audio would be close to useless, because it is ultimately what preserves the quality of the audio.
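The idea of scattering data so that damage does not land on neighboring samples can be shown with a toy example in Python. This is a simplified block interleaver written for illustration, not the actual scheme used by any particular format, and the function names are hypothetical.

```python
def interleave(words, depth):
    """Reorder words so that original neighbors are stored far apart."""
    if len(words) % depth != 0:
        raise ValueError("word count must be a multiple of depth")
    return [words[i]
            for start in range(depth)
            for i in range(start, len(words), depth)]


def deinterleave(stored, depth):
    """Put interleaved words back into their original order."""
    restored = [None] * len(stored)
    position = 0
    for start in range(depth):
        for i in range(start, len(stored), depth):
            restored[i] = stored[position]
            position += 1
    return restored


samples = list(range(12))                 # stand-ins for 12 binary words
stored = interleave(samples, depth=4)     # [0, 4, 8, 1, 5, 9, 2, 6, 10, 3, 7, 11]

# Simulate a scratch or dropout wiping out three consecutive stored words.
damaged = stored[:3] + [None, None, None] + stored[6:]

restored = deinterleave(damaged, depth=4)
missing = [i for i, word in enumerate(restored) if word is None]   # [1, 5, 9]
```

The three lost words were next to each other on the medium, but after de-interleaving they sit at positions 1, 5, and 9. Each gap is surrounded by good samples, which gives the error correction stage something to work from.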

The digital reproduction chain (D/A), which recreates the signal from the binary words so that it can be heard, is much like the A/D conversion happening in reverse.

The recorded data is restored to its modulated binary state and into pulse code form.

Next up is the digital-to-analog conversion, where the analog voltage levels are reinstated from the binary word.

A sample-and-hold happens to distinguish the most-significant to least-significant bit.

The final step is a low-pass filter so that the signal doesn’t distort due to high-frequency harmonics.

Within the digital ‘reproduction chain,’ each 1 or 0 in the binary word contributes a voltage weighted by its position in the word, and summing those contributions together determines the voltage output.
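The summing idea can be sketched in Python as an idealized binary-weighted converter. It assumes an unsigned word written most-significant bit first and a reference voltage of 1.0; real converters are more involved, and `dac_output` is a made-up name.

```python
def dac_output(bits, v_ref=1.0):
    """Sum the weighted contribution of each bit in a binary word.

    The most-significant bit contributes half the reference voltage, the
    next bit a quarter, and so on down to the least-significant bit.
    """
    return sum(v_ref * int(bit) / 2 ** (position + 1)
               for position, bit in enumerate(bits))


msb_only = dac_output("1000")    # 0.5
lsb_only = dac_output("0001")    # 0.0625
all_ones = dac_output("1111")    # 0.9375
```

With only four bits the smallest step is 1/16 of the reference voltage. A 16-bit word works the same way, just with far finer steps.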

Digital Audio Transmission

“When looking at the differences between the distribution of digital and analog audio, it should be kept in mind that, unlike its counterpart, the transmitted bandwidth of digital audio data occurs in the megahertz range...” (Huber)

Digital audio is very susceptible to errors and signal irregularities due to the large bandwidth that is required.
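A quick calculation in Python shows why the numbers land in the millions. It covers only the raw PCM payload; real interfaces add framing, status, and synchronization bits on top, so the rate on the cable is higher. The function name is hypothetical.

```python
def pcm_bit_rate(sample_rate_hz, bit_depth, channels):
    """Raw PCM payload in bits per second, before any interface overhead."""
    return sample_rate_hz * bit_depth * channels


cd_stereo = pcm_bit_rate(44_100, 16, 2)        # 1,411,200 bits per second
stereo_24_48 = pcm_bit_rate(48_000, 24, 2)     # 2,304,000 bits per second
eight_ch_24_48 = pcm_bit_rate(48_000, 24, 8)   # 9,216,000 bits per second
```

Even plain CD-quality stereo needs more than 1.4 million bits every second, compared with an analog audio signal that tops out around 20 kHz.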

Common formats for transmitting digital audio signals include: AES/EBU, S/PDIF, MADI, ADAT Lightpipe, TDIF, and mLAN.

AES/EBU (Audio Engineering Society and European Broadcasting Union) audio transmission is used between professional audio devices and is capable of transferring two channels of interleaved audio through a single XLR cable. This is possible because pin 1 of the XLR acts as the ground while pins 2 and 3 carry the balanced digital signal, on which the two channels alternate.

S/PDIF (Sony/Philips Digital Interface) is used to connect consumer-grade 2-channel audio devices to professional interfaces and the like. It can be used with a single-conductor unbalanced phono (RCA) cable, as well as with an optical (TOSLINK) connection. It can also work as a link between multichannel devices, such as those in a surround sound system.

ADAT Lightpipe uses the same type of optical cable as optical S/PDIF, but is capable of handling up to 8 channels in one direction. The 8 channels can link multiple audio devices, and with two cables it can handle 8 inputs and 8 outputs simultaneously.

TDIF (Tascam Digital Interface) uses a 25-pin D-Sub cable that is able to send and receive 8 digital audio signals bidirectionally. This is the cable system that is used to run audio from the 192s to the HD PCIe audio cards in the Mac.

A digital distribution device can be used to route audio devices together to prevent jitter and wordclock errors.

Word Clock and Jitter

“Jitter is a time base error...It is caused by varying time delays in the circuit paths from component to component in the signal path. The two most common causes of jitter are poorly designed Phase Locked Loops (PLLs) and waveform distortion due to mismatched impedances and/or reflections in the signal path.” (Katz)

Jitter occurs when long cables are used with incorrect impedances or when the source impedance is not correctly matched at the load. It can cause pulses that were initially square to become rounded, fast rise times to become slow, and the zero-crossing points of the waveform to become less accurate.

However, the binary word of a square wave that is encountering jitter would be the same as the original ‘unjittered’ binary number. The only way you would be able to hear the difference is through obvious distortion, which is usually clicks or ticks in the audio.

During D/A conversion, jitter does its damage during the sample-and-hold period. If this timing isn’t stable, the digital audio is not converted back to analog at precisely the right moments, which results in “loss of low-level resolution caused by added noise, spurious (phantom) tones, or distortion added to the signal.” (Huber) A converter with a jitter problem can noticeably shrink the dynamic range of a recording. When listening back to such a recording, it can sound grainy and can suffer from loss of definition, loss of stereo width, and obvious signal loss.
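A rough way to see why tiny timing errors matter is to estimate how far a sine wave moves during the timing error. The Python sketch below uses the standard approximation that a sine wave of frequency f and amplitude A changes fastest at its zero crossing, at a rate of 2πfA, so a timing error of Δt shifts the level by about 2πfAΔt. The 1 ns jitter figure is an arbitrary illustrative value, not a measurement of any device, and the function names are made up.

```python
import math


def worst_case_jitter_error(freq_hz, jitter_s, amplitude=1.0):
    """Approximate largest level error caused by a given timing error.

    Uses the steepest slope of a sine wave, which occurs at the zero crossing.
    """
    return 2 * math.pi * freq_hz * amplitude * jitter_s


def lsb_size(bits, amplitude=1.0):
    """Size of one quantization step for a signal spanning -amplitude to +amplitude."""
    return 2 * amplitude / 2 ** bits


# A full-scale 20 kHz tone converted with 1 nanosecond of timing error.
error = worst_case_jitter_error(20_000, 1e-9)
step_16 = lsb_size(16)
error_in_lsbs = error / step_16
```

Under these assumptions the error is several 16-bit steps, so a timing wobble of a billionth of a second is already enough to disturb the lowest bits of a high-frequency signal.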

The purpose of a wordclock, or Master Clock, is to reduce jitter and to lock multiple converter devices to the same sample timing, as well as the same sample-and-hold periods.

Only one Master Clock can be used at once so that each device’s internal clock can run “within a connected digital distribution network.” (Huber)

A wordclock is only necessary when using systems with multiple conversion devices.

Sound Quality

Tech Side...

There are ups and downs to both analog and digital audio. In analog recording, the tape is able to capture a continuous signal. If this same signal were to be captured digitally, a bit depth would have to be chosen and the signal would need to be quantized.

The introduction of bit depth and quantization brings about the signal-to-noise issue and how accurately a signal can be reproduced. In a similar manner, when recording in analog, a signal-to-noise (S/N) ratio is introduced and can cause the dynamic range of a recording to suffer.

Dither is another component that separates analog from digital. The digital process of sampling and quantizing can create a ‘squared off’ version of a waveform, and when it is recreated, low-level harmonic distortion can be heard due to the ‘square wave’ character of the sampled and quantized audio. This happens during the A/D process when the Least Significant Bit (LSB) is encoded into the binary word. It can be avoided by adding dither, which is a small amount of noise. The added noise randomizes the rounding, so the converter’s choice of a 1 or 0 for the LSB no longer follows the signal in a repeating pattern that would be heard as distortion.

Ultimately, the low level harmonic distortion is a lot more obvious when recording at a lower bit depth.

When converting from a higher bit depth (24) to a lower bit depth (16), it is a good idea to use dither. The converter will have an easier time deciding the LSB for the 16 bit signal, thanks to the dither.

If you are transferring an analog recording to digital, you don’t need to worry about dither because the tape noise will naturally decide the LSB.
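The 24-bit to 16-bit case can be sketched in Python. The example assumes integer samples and uses triangular (TPDF) dither, a common choice made by adding two uniform random values; the function name and the test value are made up for illustration. The input is a steady signal sitting at one quarter of a 16-bit step, which is too small for 16 bits to represent directly.

```python
import random


def reduce_bit_depth(sample_24, rng=None):
    """Reduce a 24-bit integer sample to 16 bits.

    If a random number generator is supplied, triangular dither of up to
    plus or minus one 16-bit step is added before rounding.
    """
    scaled = sample_24 / 256          # 256 steps of 24-bit per step of 16-bit
    if rng is not None:
        scaled += rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)
    return max(-32768, min(32767, round(scaled)))


quiet_sample = 64                     # one quarter of a 16-bit step

# Without dither the quiet signal always rounds to zero and is lost.
undithered = reduce_bit_depth(quiet_sample)

# With dither, individual results jump between neighboring values, but
# their average lands close to the true level of 0.25.
rng = random.Random(0)
dithered = [reduce_bit_depth(quiet_sample, rng) for _ in range(20_000)]
average = sum(dithered) / len(dithered)
```

The trade is a small amount of steady noise in exchange for keeping low-level detail that plain rounding would throw away.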

Opinion Side...

Some engineers like recording to tape because they believe that it will yield a warmer, more natural and vintage sound compared to the sterile sound of digital. Other engineers like recording digitally because the S/N ratio is important to them and they feel that it will give a more exact rendition of the performance. Digital can be a lot more dependable and a lot more portable. If you like the sound of analog, there are many tape emulators on the market, both hardware and plugin.

Works Cited

Huber, David Miles. 2005. Modern Recording Techniques. Massachusetts: Focal Press.

Katz, Bob. 2002. Everything You Always Wanted To Know About Jitter But Were Afraid To Ask. Mastering Audio, The Art and The Science. Massachusetts: Focal Press.

Pohlmann, Ken C. 1995. The Digital Revolution. New York: McGraw-Hill. Retrieved from
