DIGITAL AUDIO GLOSSARY
Originally, this glossary was limited to digital audio terminology.
However, since digital audio involves the use of a computer,
some computer terminology has snuck in.
Aliasing refers to an effect that causes different audio signals to become
indistinguishable when sampled (they become aliases
of one another). It also refers to the
distortion or artifacts that result when the signal that is reconstructed from
samples is different from the original analog signal.
In the world of digital audio, the highest frequency information often suffers from
aliasing in the form of poorly or incorrectly reconstructed waveforms. Piano,
cymbals and some stringed instruments can be noticeably affected.
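As a rough illustration of two signals becoming indistinguishable, the sketch below samples an 18 kHz tone and a 30 kHz tone. The 48 kHz sample rate, the tone frequencies and the helper name `sample_tone` are assumptions chosen for the example, not values from any particular converter.

```python
import math

SAMPLE_RATE = 48_000  # samples per second (assumed for this example)

def sample_tone(freq_hz, count, sample_rate=SAMPLE_RATE):
    """Return `count` samples of a cosine wave at freq_hz."""
    return [math.cos(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(count)]

low = sample_tone(18_000, 16)   # below half the sample rate
high = sample_tone(30_000, 16)  # above half the sample rate

# Largest difference between the two sets of samples.
max_difference = max(abs(a - b) for a, b in zip(low, high))
print(max_difference < 1e-9)
```

The two sets of samples agree to within rounding error, so once sampled, nothing distinguishes the 30 kHz tone from the 18 kHz one.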
A to D (Analog to Digital). An A/D converter is an electronic device used
to convert analog electrical signals to digital values whose numbers
(combinations of ones and zeros) represent the level (volume) and frequency
information contained in the original analog signal.
Bandwidth is defined simply as a given range of frequencies (audible or not). We humans, for
example, have an audible hearing bandwidth ranging from approximately 20 Hz
(cycles per second) to about 20 kHz (20 kilohertz, or 20,000 cycles per second).
BIOS is an acronym that means Basic Input Output System. In the computer world, the
BIOS is the first thing that is read by the CPU when power is applied and
subsequently determines the configuration of the computer.
The BIOS is a chip that contains ROM and is made using LSI technology.
The BIOS chip is generally specific to the motherboard.
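Within the realm of sampling an analog waveform (using Pulse Code Modulation),
bit depth divides the level of a given sample into a number of discrete steps set
by its value (16 bit, 24 bit, etc.). Bit depth is expressed in binary code, which
uses a mathematical language that is Base 2, so each added bit doubles the
number of available steps. Bit depth defines the parameters of the Dynamic Range
of an audio signal.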
By increasing the bit depth, smaller fluctuations of the audio signal can be
resolved. The noticeable result for the listener is usually increased clarity and
tonality. Listening tests have shown that an increase in bit depth is more readily
detected than an increase in sample rate.
The rule-of-thumb for bit depth resolution is:
For every 1-Bit increase in Bit Depth, the dynamic range will increase by 6dB.
A bit depth of 16 Bits yields a dynamic range of 96 dB (16 x 6). A bit depth of 24
Bits yields a (theoretical) dynamic range of 144 dB (24 x 6). Notice that an 8 Bit increase
in Bit Depth yields a difference of 48 dB in dynamic range!
I say ‘theoretical’ because the best analog equipment we can use to play back
that decoded 24 Bit audio has a realistic dynamic range of about 120 dB.
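The rule of thumb can be checked with a few lines of arithmetic. This is a minimal sketch: the function names are made up for illustration, and the "exact" figure is simply the ratio between full scale and one quantization step expressed in decibels.

```python
import math

def rule_of_thumb_db(bits):
    """Dynamic range using the 6 dB per bit rule of thumb."""
    return 6 * bits

def step_ratio_db(bits):
    """Ratio of full scale (2**bits steps) to a single step, in dB."""
    return 20 * math.log10(2 ** bits)

for bits in (8, 16, 24):
    print(bits, rule_of_thumb_db(bits), round(step_ratio_db(bits), 2))
```

Each bit is worth slightly more than 6 dB (about 6.02 dB), which is why 6 dB per bit is a rule of thumb rather than an exact figure.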
I have written an article on this subject:
Bit Depth Defined
The bit rate is typically the number of data bits per second that can be
transmitted over a specified interface. Interface in this context could be USB,
FireWire, HDMI, Thunderbolt or any number of specified digital protocols.
The transfer or bit rate for any given protocol is different. USB 2.0, for example,
has a maximum bit rate of 480 Megabits per second (Mbit/s), or 480 million data
bits per second. The terms bit rate and data rate
are used interchangeably.
Common bit rate (or data rate) terms:
Kilobits per second (kbit/s): one thousand bits per second.
Megabits per second (Mbit/s): one million bits per second.
Gigabits per second (Gbit/s): one billion bits per second.
So, now when you see or hear the term Gigabit Ethernet, you will know that it is
a network interface that can convey or transfer
one billion bits per second.
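Because bit rates count bits while file sizes are usually given in bytes, a transfer-time estimate needs a factor of eight. The sketch below is a best-case calculation only: the 700 MB file size and the function name are assumptions, and real interfaces lose some of their rated speed to protocol overhead.

```python
BITS_PER_BYTE = 8

def transfer_seconds(size_bytes, bit_rate):
    """Best-case time to move size_bytes over a link rated at bit_rate bits/s."""
    return size_bytes * BITS_PER_BYTE / bit_rate

FILE_SIZE = 700_000_000        # hypothetical 700 MB file
USB2_RATE = 480_000_000        # 480 Mbit/s
GIGABIT_RATE = 1_000_000_000   # 1 Gbit/s

print(round(transfer_seconds(FILE_SIZE, USB2_RATE), 2), "seconds over USB 2.0")
print(round(transfer_seconds(FILE_SIZE, GIGABIT_RATE), 2), "seconds over a gigabit link")
```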
CD quality is a term (an oxymoron, in my opinion) that refers to the 16 bit/44.1 kHz digital
audio file specification used for the Compact Disc.
In computers, CPU stands for Central Processing Unit. The CPU is the chip that
controls everything. It determines the speed at which the computer operates.
The smartphone that you carry has a CPU in it. In fact, the computer aboard
Apollo 11 (which landed on the Moon) had much less computational power than
your smartphone.
A D/A (Digital to Analog) converter performs the inverse of the A/D converter by
converting those digital ones and zeros to an analog waveform that is hopefully
pleasing to listen to.
Dither is an intentionally applied form of low-level noise that is used to
randomize quantization error in a digital audio recording. Dither can be applied
during recording or after the fact. It is often one of the last stages of audio
production for the Compact Disc. When digital audio with lower bit depths is at
the threshold of signal level, it can sound grainy, and dither noise is added to
mask this anomaly. One of the trade-offs of applying dither is that it will slightly
raise the noise floor of a given audio recording.
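The effect can be sketched with a tone so quiet that it never reaches half of one quantization step. Everything specific here is an assumption for illustration: the 0.4 step peak level, the triangular (TPDF) noise shape, which is only one common choice, and the helper names.

```python
import math
import random

def quantize(value):
    """Round to the nearest whole quantization step."""
    return math.floor(value + 0.5)

def tpdf_dither(rng):
    """Triangular noise between -1 and +1 step (sum of two uniform values)."""
    return rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)

rng = random.Random(0)  # fixed seed so the run is repeatable

# A very quiet tone whose peaks reach only 0.4 of one step.
signal = [0.4 * math.sin(2 * math.pi * n / 100) for n in range(1000)]

plain = [quantize(s) for s in signal]
dithered = [quantize(s + tpdf_dither(rng)) for s in signal]

# How strongly each quantized version still follows the original tone.
plain_match = sum(p * s for p, s in zip(plain, signal))
dithered_match = sum(d * s for d, s in zip(dithered, signal))

print(set(plain))  # {0}: without dither the tone vanishes entirely
print(dithered_match > plain_match)
```

Without dither every sample rounds to zero and the tone is lost. With dither the output is noisier, but on average it still follows the tone, which is the trade-off described above.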
In the world of audio, Dynamic Range is defined simply as the range of volume
from the loudest to the softest of audible sounds. Dynamic Range is expressed
in decibels (dB), a logarithmic (base 10) ratio. We Audio Engineers
often refer to Dynamic Range as the difference between the loudest
undistorted signal that can be recorded and the noise level (floor) of a given
medium. Analog tape has a dynamic range approaching 70dB when used at 30
Inches Per Second. An audio CD has a theoretical dynamic range of 96dB.
The average human can hear a dynamic range of approximately 140dB.
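To give a feel for what these decibel figures mean as plain ratios, here is a small sketch. It assumes the amplitude convention, where the decibel value is 20 times the base 10 logarithm of the ratio (power ratios use 10 instead), and the function names are illustrative only.

```python
import math

def ratio_to_db(amplitude_ratio):
    """Convert an amplitude ratio (loudest / quietest) to decibels."""
    return 20 * math.log10(amplitude_ratio)

def db_to_ratio(db):
    """Convert decibels back to an amplitude ratio."""
    return 10 ** (db / 20)

for db in (70, 96, 140):
    print(db, "dB is a ratio of about", round(db_to_ratio(db)), "to 1")
```

A 96 dB range, for example, means the loudest signal is roughly 63,000 times the amplitude of the quietest.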
A Hard Disk Drive (HDD) is an electro-mechanical data storage device that
stores and retrieves digital data using magnetic storage and one or more rigid
rapidly rotating platters coated with magnetic material. The platters are paired
with magnetic heads, usually arranged on a moving actuator arm, which read
and write data to the platter surfaces. HDDs are a type of non-volatile storage,
retaining stored data even when powered off. Modern HDDs are typically in the
form of a small rectangular box that is usually 3-1/2 inches or 2-1/2 inches wide.
Introduced by IBM in 1956, HDDs were the dominant storage device for
general-purpose computers beginning in the early 1960s.
There are also Solid State Drives (SSD’s) that use a different technology.
There is a separate description for an SSD in this glossary.
In the digital world, LSI refers to Large Scale Integration. LSI technology is used
in the manufacture of integrated circuits of all kinds. The first working
integrated circuits were built in the late 1950’s, and LSI chips followed around 1970.
A lot has changed since then, and the acronym is now ULSI, meaning
Ultra Large Scale Integration.
Also known as the ‘Mainboard’ and sometimes referred to as ‘Mobo.’
The Motherboard in your computer is the circuit board that holds everything.
Some things are plugged into it and others are soldered on.
MP3 is known officially as MPEG-1 Audio Layer III (and MPEG-2 Audio Layer III). It is a patented digital audio
encoding scheme using a form of lossy data compression. It’s a common audio
exchange format for consumer digital audio players and streaming.
The MP3 file format is designed to greatly reduce the amount of data required to
represent an audio recording and still sound like the original uncompressed
audio for (most) listeners. An MP3 file that is created using the setting of 128
Kilobits per second (kbits/s) will result in a file that is about 11 times smaller than
a Compact Disc file created from the original analog audio source. An MP3 file
can also be constructed at higher or lower bit rates with the higher bit rates
resulting in higher audio quality. It's worth noting that listening tests have shown
that the average listener can tell the difference in fidelity between an MP3 file
and a CD quality file.
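The "about 11 times smaller" figure follows directly from the bit rates involved. A quick sketch, assuming a stereo 16 bit/44.1 kHz CD stream and constant bit rate MP3 files (the function name is illustrative):

```python
# Bit rate of uncompressed CD audio: sample rate x bit depth x channels.
CD_BIT_RATE = 44_100 * 16 * 2  # bits per second

def compression_ratio(mp3_kbits_per_second):
    """How many times smaller an MP3 is than the same audio at CD bit rate."""
    return CD_BIT_RATE / (mp3_kbits_per_second * 1000)

for rate in (128, 192, 320):
    print(rate, "kbit/s:", round(compression_ratio(rate), 2), "times smaller")
```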
Named after the Swedish-American engineer Harry Nyquist, the Nyquist
frequency is half the sampling frequency (sample rate) of a discrete signal.
If the sample rate is 48 kHz, the Nyquist frequency would be 24 kHz. According
to the Nyquist theorem, at least 2 samples per cycle (one full peak-to-peak swing)
of an audio waveform are required for it to be reproduced accurately. Acquiring
fewer than 2 samples per cycle can produce a form of distortion known as Aliasing.
To prevent this, frequencies above the Nyquist frequency are attenuated by
filtering before sampling.
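The relationship can be sketched in a few lines. The folding formula below describes where an unfiltered tone would appear after sampling; the function names are made up for this example.

```python
def nyquist_frequency(sample_rate_hz):
    """Half the sample rate."""
    return sample_rate_hz / 2

def alias_frequency(freq_hz, sample_rate_hz):
    """Frequency at which a sampled tone appears if it is not filtered out."""
    folded = freq_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)

print(nyquist_frequency(48_000))        # 24000.0
print(alias_frequency(20_000, 48_000))  # 20000: below Nyquist, unchanged
print(alias_frequency(30_000, 48_000))  # 18000: folded back below Nyquist
```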
In the computer world, OS stands for Operating System and there are many.
The most popular for personal computers (PC’s) are Microsoft Windows, Apple
macOS and Linux. In that order, I think.
In digital signal processing, oversampling is the process of sampling a signal
at a rate significantly higher than twice the bandwidth (or the
highest frequency) of the signal being sampled.
Pulse Code Modulation (PCM) is the most widely used method of converting
analog audio to a digital format. Basically, it’s defined as the uniform sampling
of an audio waveform at regular intervals. The uniformity applies both to time
(Sample Rate) and to volume, or the magnitude of the signal (Bit Depth).
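A minimal sketch of the idea, assuming a signal that is already scaled between -1.0 and +1.0: the waveform is measured at regular intervals and each measurement is rounded to a whole-number code. The function name and the 1 kHz test tone are assumptions for illustration, not a description of any particular converter.

```python
import math

def pcm_encode(signal, sample_count, sample_rate_hz, bit_depth):
    """Sample signal(t) at regular intervals and round each value to an integer code."""
    max_code = 2 ** (bit_depth - 1) - 1  # 32767 for 16 bit
    codes = []
    for n in range(sample_count):
        t = n / sample_rate_hz                  # regular interval in time
        value = max(-1.0, min(1.0, signal(t)))  # keep within full scale
        codes.append(round(value * max_code))   # uniform steps in level
    return codes

def tone(t):
    """A 1 kHz sine wave at full scale."""
    return math.sin(2 * math.pi * 1_000 * t)

codes = pcm_encode(tone, 48, 48_000, 16)  # one full cycle of the tone
print(codes[:13])
```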
RAM is an acronym that means Random Access Memory. RAM is what your OS and all
of the software that is run on your personal computer use. Think of RAM as a
scratch pad. The RAM gets wiped clean every time you boot or re-start your
computer.
Red Book: The CD standard
The original set of books containing the specifications for all forms of optical
Compact Disc were individually color bound. The book containing the
specifications for Audio CD’s was bound in red, hence the name.
(I’m not kidding!)
The Compact Disc standard was developed jointly by Philips & Sony, and the
technical specifications were released in 1980.
ROM is another acronym; it means Read Only Memory. ROM is often used in LSI
chips. Your personal computer uses it in the BIOS, for example. ROM is
non-volatile, so ROM chips retain their contents even when no power is applied.
(The battery on a motherboard maintains the BIOS settings and the clock, not
the ROM itself.)
SACD is an acronym for Super Audio Compact Disc. SACD’s are high fidelity,
read-only optical discs (often, a DVD) for audio playback. Developed jointly by
Sony and Philips Electronics, SACD recordings have a wider frequency
response and dynamic range than conventional audio CD’s. The SACD format
uses a digital audio scheme called Direct Stream Digital (DSD), which has a
very high sampling frequency of 2.8224 MHz (millions of samples per second).
An SACD can stream data at four times the rate of an audio CD.
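The "four times" figure can be checked with simple arithmetic. The one detail not stated above is that DSD uses 1 bit samples, which is why such a high sampling frequency yields only four times the data of 16 bit CD audio.

```python
CD_SAMPLE_RATE = 44_100       # samples per second
CD_BIT_DEPTH = 16             # bits per sample
DSD_SAMPLE_RATE = 2_822_400   # 2.8224 MHz
DSD_BIT_DEPTH = 1             # DSD uses 1 bit samples

cd_bits_per_channel = CD_SAMPLE_RATE * CD_BIT_DEPTH
dsd_bits_per_channel = DSD_SAMPLE_RATE * DSD_BIT_DEPTH

print(DSD_SAMPLE_RATE // CD_SAMPLE_RATE)           # 64 times the CD sample rate
print(dsd_bits_per_channel / cd_bits_per_channel)  # 4.0 times the CD data rate
```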
Sample Rate defines the number of samples per second taken of an analog
waveform within a given bandwidth. The terms Sample Rate and
Sample Frequency are used interchangeably to describe the same thing.
I go more into this subject elsewhere.
Sample Rate Conversion
This is where digital audio that was originally digitized at a given sample rate is
re-sampled to another sample rate and often to another bit depth.
For example, a digital file created at 24 bit/48 kHz has to be sample-rate
converted to 16 bit/44.1 kHz so it can be used for Audio CD’s. This process is
often accomplished by software using a mathematical procedure called an
‘algorithm’ or in real time by a hardware converter. The real time process is most
often used in a sound mixing environment where several sources have been delivered
that are not all at the same sample rate or bit depth. This often happens in the
making of Radio and TV Commercials, Feature Films and Music Production.
Either process can result in the creation of unwanted audible artifacts, which is
why everyone tries to maintain the same specifications for a given project.
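To show what re-sampling involves, here is a deliberately naive sketch that uses straight-line interpolation between neighboring samples. It is an illustration only: the function names are made up, and real converters use band-limited filtering precisely because simple interpolation like this introduces the kind of artifacts mentioned above.

```python
from fractions import Fraction

def conversion_ratio(src_rate, dst_rate):
    """Output samples produced per input sample, as an exact fraction."""
    return Fraction(dst_rate, src_rate)

def resample_linear(samples, src_rate, dst_rate):
    """Naive resampler: straight-line interpolation between input samples."""
    step = Fraction(src_rate, dst_rate)  # input samples per output sample
    out_count = len(samples) * dst_rate // src_rate
    out = []
    for i in range(out_count):
        pos = i * step                   # exact position in the input
        left = int(pos)                  # input sample just before pos
        frac = float(pos - left)         # how far pos is toward the next one
        right = min(left + 1, len(samples) - 1)
        out.append(samples[left] * (1 - frac) + samples[right] * frac)
    return out

print(conversion_ratio(48_000, 44_100))  # 147/160
ramp = list(range(480))                  # 10 ms of test data at 48 kHz
converted = resample_linear(ramp, 48_000, 44_100)
print(len(converted))                    # 441
```

The ratio 147/160 shows why the conversion is not trivial: almost every output sample falls between two input samples and has to be calculated.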
SSD is an acronym that stands for Solid State Drive.
The technology comes in several forms: USB drives, SD cards, etc.
When compared to a traditional hard drive (HDD), an SSD is faster,
will withstand shock better (no moving parts) and will run quieter.
However, because of the technology involved, some forms
(USB drives, SD cards, etc.) can lose or corrupt data when left without power for long periods.
Word Clock, also known as Sample Clock
Digital audio is created by taking a sample of an analog signal on a periodic
basis, say 48,000 times per second (the ‘Sample Rate’). A dedicated clock, the
‘sample clock’ ticks at that rate, and, every time it does, a new sample is
measured. Sample clocks are built into most devices that handle digital audio
and video. Your CD player and your DVD player have sample clocks built into
them in order to stream the data accurately enough to convert the signal to
analog audio and/or video.
Whenever you connect two digital audio or video devices together in order to
move data from one to the other, you must ensure they share the same sample
clock. Why is this necessary? The oscillating crystals used for sample clocks are
generally very stable, but there are always minute differences in the frequency
of any two or more sample clocks (a man with two watches never knows the
exact time). When used individually, this is not a problem, but connect two digital
devices together and those minute differences will accumulate over time.
Eventually, one of the devices will be trying to read a sample in the middle of the
other device's tick, and the result is a small click or pop in the audio stream or
noticeable jump in the picture. In the consumer realm, when you connect your
CD/DVD player to your home theater processor, via a digital interconnect cable,
the theater processor will adjust its clock to the incoming data stream, and all
works well. In the professional world, many digital devices are often connected
together or to a single source (a mixer, for example), and in order to avoid
clocking errors over time, a central clock source is used and fed to all of the
equipment. This central clock is known as a Word Clock generator.
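How quickly those minute differences add up can be estimated with a short calculation. The 10 parts per million offset used here is a hypothetical figure chosen for illustration, not a specification for any particular device, and the function name is made up.

```python
def seconds_until_slip(sample_rate_hz, offset_ppm):
    """Time for two free-running clocks to drift apart by one full sample."""
    drift_samples_per_second = sample_rate_hz * offset_ppm / 1_000_000
    return 1 / drift_samples_per_second

# Two 48 kHz clocks that differ by a hypothetical 10 parts per million.
print(round(seconds_until_slip(48_000, 10), 2), "seconds")
```

Under that assumption the two devices are a whole sample apart after roughly two seconds, which is why a shared clock is needed rather than two independent ones.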
Rudolph F. Graf, “Dictionary of Electronics” Howard W. Sams, 1974
Glenn D. White, “The Audio Dictionary” University of Washington Press, 1987
© Corey Bailey Audio Engineering