DIGITAL AUDIO GLOSSARY
Originally, this glossary was limited to digital audio terminology.
However, since digital audio involves the use of a computer,
some computer terminology has snuck in.
Aliasing refers to an effect that causes different audio signals to become
indistinguishable when sampled (that is, they become aliases
of one another). It also refers to the
distortion or artifacts that result when the signal reconstructed from
samples is different from the original analog signal.
In the world of digital audio, the highest frequency information often suffers
from aliasing in the form of poorly or incorrectly reconstructed waveforms.
Piano, cymbals and some stringed instruments can be noticeably affected.
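To make "indistinguishable when sampled" concrete, here is a small Python sketch. The 48 kHz sample rate and the two test frequencies (18 kHz and 30 kHz) are my own assumptions chosen for illustration, and the function name is hypothetical. It samples two different cosine waves and shows that the sample values come out the same.

```python
import math

SAMPLE_RATE = 48_000  # samples per second (assumed for this illustration)

def sample_cosine(freq_hz, num_samples, sample_rate=SAMPLE_RATE):
    """Return num_samples samples of a cosine wave at freq_hz."""
    return [
        math.cos(2 * math.pi * freq_hz * n / sample_rate)
        for n in range(num_samples)
    ]

# 18 kHz is below half the sample rate; 30 kHz is above it.
low = sample_cosine(18_000, 16)
high = sample_cosine(30_000, 16)

# Largest difference between the two sets of samples (only rounding error).
max_difference = max(abs(a - b) for a, b in zip(low, high))
print(max_difference)
```

Once sampled, nothing in the data tells the two tones apart, which is why the higher one has to be filtered out before it reaches the converter.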
A to D stands for analog to digital. An audio A/D converter is an electronic
device used to convert analog electrical signals to digital values whose
numbers (combinations of ones and zeros) represent the level (volume) and
frequency information contained in the original analog signal.
Bandwidth is defined simply as a given range of frequencies (audible or not).
We humans, for example, have an audible hearing bandwidth ranging from
approximately 20 Hz (20 cycles per second) to about 20 kHz (20 kilohertz, or
20,000 cycles per second).
BIOS is an acronym that means Basic Input Output System. In the computer world,
the BIOS is the first thing that is read by the CPU when power is applied, and it
subsequently determines the configuration of the computer.
The BIOS is stored on a chip that is generally specific to the motherboard.
Within the realm of sampling an analog waveform (using PCM), bit depth is the
number of bits used to represent each sample (16 bit, 24 bit, etc.). Bit depth is
expressed in binary code, a mathematical language that is Base 2. It
defines the dynamic range of an audio signal.
By increasing the bit depth, smaller fluctuations of the audio signal can be
resolved. The noticeable result for the listener is usually increased clarity and
tonality. Listening tests have shown that an increase in bit depth is more
readily detected than an increase in sample rate.
The rule-of-thumb for bit depth resolution is:
For every 1-bit increase in bit depth, the dynamic range will increase by about
6 dB.
A bit depth of 16 bits yields a dynamic range of 96 dB (16 × 6). A bit depth of 24
bits yields a (theoretical) dynamic range of 144 dB. Notice that an 8 bit
increase in bit depth yields a difference of 48 dB in dynamic range!
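The rule of thumb is easy to check with a few lines of Python. This is only a sketch: the function names are hypothetical, and the "exact" figure assumes the usual definition of dynamic range as 20 times the base-10 log of the number of quantization levels.

```python
import math

def dynamic_range_rule_of_thumb(bits):
    """Rule of thumb from the text: about 6 dB per bit."""
    return 6 * bits

def dynamic_range_exact(bits):
    """20 * log10 of the number of quantization levels (2 ** bits)."""
    return 20 * math.log10(2 ** bits)

for bits in (16, 24):
    print(bits, dynamic_range_rule_of_thumb(bits), round(dynamic_range_exact(bits), 2))
```

The exact figure works out to a little over 6 dB per bit, so the rule of thumb slightly understates the theoretical range.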
I say ‘theoretical’ because the best analog equipment we can use to play back
that decoded 24 bit audio has a realistic dynamic range of about 120 dB.
I have written an article on this subject:
Bit Depth Defined
The bit rate is typically the number of data bits per second that can be
transmitted over a specified interface. Interface in this context could be USB,
Firewire, HDMI, Thunderbolt or any number of specified digital protocols. The
transfer or bit rate for any given protocol is different. USB 2.0, for example, has
a maximum bit rate of 480 Megabits per second (Mbit/s) or 480 million data
bits per second. The terms bit rate and data rate
are used interchangeably.
Common bit rate (or data rate) terms:
Kilobits per second (kbit/s): one thousand bits per second.
Megabits per second (Mbit/s): one million bits per second.
Gigabits per second (Gbit/s): one billion bits per second.
So, now when you see or hear the term Gigabit Ethernet, you will know that it
is a network interface that can convey or transfer
one billion bits per second.
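As a rough illustration of what these units mean in practice, the Python sketch below estimates how long a file would take to move across an interface at its maximum bit rate. The 600 megabyte file size and the function name are assumptions of mine, and real-world transfers are slower than the theoretical maximum because of protocol overhead.

```python
KBIT = 1_000            # one thousand bits per second
MBIT = 1_000_000        # one million bits per second
GBIT = 1_000_000_000    # one billion bits per second

def transfer_seconds(size_bytes, bit_rate):
    """Best-case transfer time: 8 bits per byte, no protocol overhead."""
    return size_bytes * 8 / bit_rate

file_size = 600_000_000                               # hypothetical 600 MB file
usb2_seconds = transfer_seconds(file_size, 480 * MBIT)   # USB 2.0 maximum
gigabit_seconds = transfer_seconds(file_size, 1 * GBIT)  # one gigabit per second

print(usb2_seconds, gigabit_seconds)
```

Note that bit rates count bits while file sizes are usually quoted in bytes, hence the factor of eight.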
A term (an oxymoron, in my opinion) that refers to the 16 bit/44.1 kHz
audio file specification used for the Compact Disk.
In computers, CPU stands for Central Processing Unit. The CPU is the chip that
controls everything. It determines the speed at which the computer operates.
A D/A (digital to analog) converter performs the inverse of the A/D converter by
converting those digital ones and zeros to an analog waveform that is
hopefully pleasing to listen to.
Dither is an intentionally applied form of noise. Dither is used to randomize
quantization error, which would otherwise appear as distortion at discrete
frequencies in a digital audio recording.
In layman’s terms, when digital audio with a lower bit depth is at the threshold
of signal level, it can sound grainy. Noise is often added to mask this anomaly.
One of the trade-offs of applying dither is that it will slightly raise the noise
floor of a given audio recording. Dither can be applied during recording or after
the fact. It is often one of the last stages of audio production used for Compact
Disks.
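Here is a minimal Python sketch of the idea, not a production dither. I am assuming triangular (TPDF) noise spanning plus or minus one quantization step, which is one common choice; the function names, the step size and the 0.3-step test signal are all hypothetical. A signal smaller than one step simply vanishes when quantized, but with dither its level survives as an average.

```python
import random

def quantize(sample, step):
    """Round a sample to the nearest multiple of the quantization step."""
    return step * round(sample / step)

def quantize_with_dither(sample, step, rng):
    """Add triangular (TPDF) noise of +/- one step before quantizing."""
    noise = (rng.random() - rng.random()) * step
    return quantize(sample + noise, step)

rng = random.Random(0)   # fixed seed so the run is repeatable
step = 1.0               # one quantization step
quiet = 0.3              # a signal level below one step (assumed)

plain = [quantize(quiet, step) for _ in range(10_000)]
dithered = [quantize_with_dither(quiet, step, rng) for _ in range(10_000)]

plain_mean = sum(plain) / len(plain)            # signal is lost entirely
dithered_mean = sum(dithered) / len(dithered)   # close to the original level
print(plain_mean, dithered_mean)
```

The price is exactly the trade-off described above: the dithered samples jump around from one to the next, which is heard as a slightly higher noise floor.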
In the world of audio, Dynamic Range is defined simply as the range of
volume from the loudest to the softest of audible sounds. Dynamic Range is
measured in decibels (dB), a logarithmic (base 10) ratio. We Audio
Engineers often refer to Dynamic Range in terms of the difference between
the loudest undistorted signal that can be recorded and the noise level (floor)
of a given medium. Analog tape has a dynamic range approaching 70 dB
when used at 30 Inches Per Second. Your audio CD has a theoretical dynamic
range of 96dB. The average human can hear a dynamic range of
A Hard Disk Drive (HDD) is an electro-mechanical data storage device that
stores and retrieves digital data using magnetic storage and one or more rigid
rapidly rotating platters coated with magnetic material. The platters are paired
with magnetic heads, usually arranged on a moving actuator arm, which read
and write data to the platter surfaces. HDDs are a type of non-volatile storage,
retaining stored data even when powered off. Modern HDDs are typically in
the form of a small rectangular box that is usually 3-1/2 inches or 2-1/2 inches
wide. Introduced by IBM in 1956, HDDs were the dominant storage device for
general-purpose computers beginning in the early 1960s.
There are also Solid State Drives (SSDs) that use a different technology.
There is a separate description for an SSD
in this glossary.
In the digital world, LSI refers to Large Scale Integration. LSI technology is
used in the manufacture of integrated circuits of all kinds. The technology has
been around since the 1920’s, but wasn’t practical until the late 1950’s.
A lot has changed since then, and the acronym is now ULSI, meaning Ultra
Large Scale Integration.
Also known as the ‘Mainboard’ and sometimes referred to as ‘Mobo.’
The Motherboard in your computer is the circuit board that holds everything.
Some things are plugged into it and others are soldered on.
MP3 is known officially as MPEG-1 Audio Layer III (or MPEG-2 Audio Layer III).
It is a patented digital audio
encoding scheme using a form of lossy data compression. It is a common
audio exchange format for consumer digital audio players. The MP3 file format
is designed to greatly reduce the amount of data required to represent an
audio recording and still sound like the original uncompressed audio for (most)
listeners. An MP3 file that is created using the setting of 128 kilobits per
second (kbit/s) will result in a file that is about 11 times smaller than a
Compact Disk file created from the original analog audio source. An MP3 file
can also be constructed at higher or lower bit rates,
with the higher bit rates
resulting in higher audio quality. It's worth noting that listening tests have
shown that the average listener can tell the difference in fidelity between an
MP3 file and a
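The "about 11 times smaller" figure for a 128 kbit/s MP3 can be checked with simple arithmetic. The Python sketch below assumes the usual Compact Disk parameters of 44.1 kHz, 16 bits and two channels; the variable names are mine.

```python
CD_SAMPLE_RATE = 44_100   # samples per second, per channel
CD_BIT_DEPTH = 16         # bits per sample
CD_CHANNELS = 2           # stereo

# Uncompressed CD audio bit rate in bits per second.
cd_bit_rate = CD_SAMPLE_RATE * CD_BIT_DEPTH * CD_CHANNELS

mp3_bit_rate = 128_000    # 128 kbit/s

compression_ratio = cd_bit_rate / mp3_bit_rate
print(cd_bit_rate, compression_ratio)
```

Under those assumptions, CD audio runs at 1,411,200 bits per second, which is roughly 11 times the MP3 setting.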
Named after the Swedish-American engineer Harry Nyquist, the Nyquist
frequency is half the sampling frequency (sample rate) of a discrete signal.
For example: If the sample rate is 48 kHz, the Nyquist frequency would be 24
kHz. According to the Nyquist theorem, at least 2 samples per cycle of an
audio waveform are required for it to be accurately reproduced. Acquiring fewer
than 2 samples per cycle can produce a form of audible
distortion known as aliasing. To prevent this, all of the frequencies above the
Nyquist frequency are attenuated by filtering before sampling.
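A short Python sketch of the arithmetic follows. The function names are hypothetical, and the fold-back formula is the standard one for an ideal sampler with no filtering, which is an assumption on my part rather than something spelled out above.

```python
def nyquist_frequency(sample_rate):
    """Half the sample rate."""
    return sample_rate / 2

def alias_frequency(freq_hz, sample_rate):
    """Frequency that freq_hz appears as after ideal, unfiltered sampling."""
    nearest_multiple = round(freq_hz / sample_rate) * sample_rate
    return abs(freq_hz - nearest_multiple)

print(nyquist_frequency(48_000))        # Nyquist frequency for 48 kHz sampling
print(alias_frequency(10_000, 48_000))  # below Nyquist: unchanged
print(alias_frequency(30_000, 48_000))  # above Nyquist: folds back down
```

A 30 kHz tone sampled at 48 kHz folds back to 18 kHz, squarely in the audible range, which is why the filter has to do its work before the signal is sampled.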
In the computer world, OS stands for Operating System and there are many.
The most popular for personal computers (PCs) are Microsoft Windows,
Apple macOS and Linux. In that order, I think.
In digital signal processing, oversampling is the process of sampling a signal
at a rate significantly higher than twice the bandwidth (or the
highest frequency) of the signal being sampled.
Pulse Code Modulation (PCM) is the most widely used method of converting
analog audio to a digital format. Basically, it is defined as the uniform sampling
of an audio waveform at regular intervals. This applies to both frequency
(sample rate) and volume or the magnitude of the signal (bit depth).
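To show both halves of that definition at work, here is a bare-bones Python sketch that turns a sine wave into PCM codes. It stands in for what a converter does and does not model real hardware; the function name, the 1 kHz test tone and the 48 kHz/16 bit settings are my assumptions.

```python
import math

def pcm_encode(freq_hz, sample_rate, bit_depth, num_samples):
    """Sample a full-scale sine wave and quantize it to signed integer codes."""
    max_code = 2 ** (bit_depth - 1) - 1   # e.g. 32767 for 16 bit
    codes = []
    for n in range(num_samples):
        t = n / sample_rate               # regular intervals set by the sample rate
        amplitude = math.sin(2 * math.pi * freq_hz * t)
        codes.append(round(amplitude * max_code))  # resolution set by the bit depth
    return codes

# One cycle of a 1 kHz tone at 48 kHz / 16 bit is 48 samples long.
codes = pcm_encode(1_000, 48_000, 16, 48)
print(codes[:13])
```

The sample rate decides how often a value is taken, and the bit depth decides how finely each value is measured.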
RAM is an acronym that means Random Access Memory. RAM is what your OS and
all of the software that is run on your personal computer use. Think of RAM as
a scratch pad. The RAM gets wiped clean every time you boot or re-start your
computer.
Red Book is the CD standard.
The original books containing the specifications for all forms of optical
Compact Disk were individually color bound. The book containing the
specifications for Audio CDs was bound in red, hence the name.
(I’m not kidding!)
The Compact Disk standard was developed jointly by Philips and Sony, and the
technical specifications were released in 1980.
ROM is another acronym; it means Read Only Memory. ROM chips are used in
anything that uses LSI chips. Your personal computer uses them in the BIOS,
for example. ROM chips are non-volatile: they retain their contents even when
no power is applied.
SACD is an acronym for Super Audio Compact Disk. SACDs are high-resolution,
high fidelity, read-only optical disks (typically a DVD disk) for audio
playback. Developed jointly by Sony and Philips Electronics, SACD recordings
have a wider frequency response and dynamic range than conventional CDs.
The SACD format uses a digital audio scheme called
Direct Stream Digital
(DSD), which has a very high sampling frequency of 2.8224 MHz (millions of
cycles per second), which is 64 times the sample rate of a CD.
A stereo SACD recording can stream data at four times the rate of CD.
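Both of those multiples can be verified with a few lines of Python. The sketch assumes that DSD uses 1 bit per sample and that CD audio is 16 bit/44.1 kHz stereo; the variable names are mine.

```python
CD_SAMPLE_RATE = 44_100    # samples per second
CD_BIT_DEPTH = 16          # bits per sample
DSD_BIT_DEPTH = 1          # DSD is a 1-bit stream (assumption stated above)
CHANNELS = 2               # stereo

dsd_sample_rate = 64 * CD_SAMPLE_RATE                       # 64 times the CD rate

cd_bit_rate = CD_SAMPLE_RATE * CD_BIT_DEPTH * CHANNELS      # bits per second
dsd_bit_rate = dsd_sample_rate * DSD_BIT_DEPTH * CHANNELS   # bits per second

data_rate_ratio = dsd_bit_rate / cd_bit_rate
print(dsd_sample_rate, dsd_bit_rate, data_rate_ratio)
```

Sixty-four times 44.1 kHz gives the 2.8224 MHz figure, and because each DSD sample is a single bit, the data rate comes out at four times that of CD rather than sixty-four.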
Sample Rate defines the number of samples per second taken of an analog
waveform. The terms Sample Rate and
Sample Frequency are used interchangeably to describe the same thing.
I go into this subject in more detail elsewhere.
Sample Rate Conversion
This is where digital audio that was originally digitized at a given sample rate
(and bit depth) is re-sampled to another sample rate and often to another bit
depth. For example, a digital file created at 24 bit/48 kHz has to be
sample-rate converted to 16 bit/44.1 kHz so it can be used for Audio CDs. This
process is often accomplished by software using a mathematical procedure
called an ‘algorithm’ or in real time by a hardware converter. The real time
process is most often used in a sound mixing environment where several
sources are delivered that do not all share the same sample rate or bit depth.
This often happens in the making of Radio and TV Commercials, Feature Films
and Music Production. Either process can result in the creation of unwanted
audible artifacts, which is why everyone tries to maintain the same
specifications for a given project.
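To give a feel for what such an algorithm does, here is a deliberately naive Python sketch that converts 48 kHz samples to 44.1 kHz by linear interpolation. This is my own simplified stand-in, not how professional converters work: real ones use carefully designed filters precisely to avoid the artifacts mentioned above. The function name is hypothetical.

```python
def resample_linear(samples, src_rate, dst_rate):
    """Naive sample rate conversion by linear interpolation (illustration only)."""
    if not samples:
        return []
    out_len = int(len(samples) * dst_rate / src_rate)
    out = []
    for i in range(out_len):
        pos = i * src_rate / dst_rate          # position in the source, in samples
        left = int(pos)                        # source sample just before pos
        right = min(left + 1, len(samples) - 1)
        frac = pos - left                      # how far pos is between the two
        out.append(samples[left] * (1 - frac) + samples[right] * frac)
    return out

# 10 milliseconds of audio: 480 samples at 48 kHz become 441 at 44.1 kHz.
source = [float(n) for n in range(480)]        # a simple ramp as test data
converted = resample_linear(source, 48_000, 44_100)
print(len(source), len(converted))
```

Every output sample lands between two input samples and has to be estimated, and that estimation is where the unwanted artifacts can creep in.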
SSD is an acronym that stands for Solid State Drive.
The technology comes in several forms: USB drives, SD cards, etc.
When compared to a traditional hard drive (HDD), an SSD is faster,
will withstand shock better (no moving parts) and will run quieter.
However, because of the technology involved, some forms of SSD can lose or
corrupt data when not connected to power.
also known as
Digital audio is created by taking a sample of an analog signal on a periodic
basis, say 48000 times per second (the ‘Sample Rate’). A dedicated clock, the
‘sample clock,’ ticks at that rate, and, every time it does, a new sample is
measured. Sample clocks are built into most devices that handle digital audio
and video. Your CD player and your DVD player have sample clocks built into
them in order to stream the data accurately enough to convert the signal to
analog audio or video.
Whenever you connect two digital audio or video devices together in order to
move data from one to the other, you must ensure they share the same
sample clock. Why is this necessary? The oscillating crystals used for sample
clocks are generally very stable, but there are always minute differences in the
frequency of any two or more clocks (a man with two watches never knows
the exact time). When used individually, this is not a problem, but connect two
digital devices together and those minute differences will accumulate over
time. Eventually, one of the devices will be trying to read a sample in the
middle of the other device's tick, and the result is a small click or pop in the
audio stream or noticeable jump in the picture. In the consumer realm, when
you connect your CD/DVD player to your home theater processor, via a digital
interconnect cable, the theater processor will adjust its clock to the incoming
data stream, and all works well. In the professional world, many digital devices
are often connected together or to a single source (a mixer, for example), and
in order to avoid clocking errors over time, a central clock source is used and
fed to all of the equipment. This central clock is known as a master clock, or
word clock.
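To put a number on how quickly those minute differences add up, here is a back-of-the-envelope Python sketch. The 10 parts-per-million mismatch between the two clocks is purely an assumed figure for illustration, and the function name is hypothetical.

```python
def seconds_until_slip(sample_rate, ppm_difference):
    """Time for two free-running clocks to drift apart by one whole sample."""
    drift_per_second = sample_rate * ppm_difference / 1_000_000  # samples per second
    return 1 / drift_per_second

# Two 48 kHz devices whose clocks differ by an assumed 10 parts per million.
slip_time = seconds_until_slip(48_000, 10)
print(round(slip_time, 2))
```

With these assumed numbers the devices are a full sample apart after only a couple of seconds, which is why unsynchronized digital connections click and pop.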
Rudolph F. Graf, “Dictionary of Electronics” Howard W. Sams, 1974
Glenn D. White, “The Audio Dictionary” University of Washington Press, 1987
© Corey Bailey Audio Engineering