Digital audio is pretty awesome.  The ability to convert an analog signal into the digital world makes recording, mixing, and all that good stuff a lot easier and a lot cheaper.  No longer do we suffer the tyranny of moving, cutting, and storing those cursed tape reels!  But when an analog signal is converted into a digital signal, some quality is lost. The question is whether or not it’s noticeable.

When sampling an analog signal, the two parameters that are used to translate it into digital are Sample Rate and Bit Depth, which I’m sure all of us have heard of.  Generally, the higher these two numbers are, the more accurate the translation from analog to digital will be.  CDs are mastered at a 44.1 kHz sample rate and 16-bit depth.  DVDs are mastered at 48 kHz and 16- or 24-bit depth, but usually at 16.  DVD-Audio is almost always done at 24-bit (SACD uses a different 1-bit format called DSD).  Blu-rays go up to 96 kHz or 192 kHz at 24-bit depth.  So what do all these mean, and how do they sound different?

Sample Rate

Basically, sample rate is how often the ADC (Analog-to-Digital Converter) takes a snapshot of the audio. Ideally, the sample rate would be infinite, so that the ADC was continuously translating the analog signal to digital, but that would require infinite disk space. So we settle for a less than perfect translation, and hopefully we can’t notice the difference.  The picture below shows a signal being sampled at each of the red lines, and you can see that the signal is translated not as a smooth curve, but as a series of straight lines that approximate the original curve.


So, it’s easy to see where the signal gets chopped a little, and it looks like quite a bit of damage. Now, what happens if the sample rate is doubled?


That turns out to be a much more accurate translation, but it will use twice as much disk space. The important thing, though, is whether or not there is an audible difference between the two.  If you can’t tell the difference, you probably want to go with the one that takes up less space, right?
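To put rough numbers on the disk space trade-off, here is a minimal sketch. It assumes uncompressed stereo PCM with no file header, and the `pcm_bytes` helper is a made-up name for illustration, not part of any library.

```python
def pcm_bytes(sample_rate_hz, bit_depth, channels=2, seconds=60):
    """Size in bytes of uncompressed PCM audio (assumes no file header)."""
    bytes_per_sample = bit_depth // 8
    return sample_rate_hz * bytes_per_sample * channels * seconds

# One minute of stereo audio at a few sample rate / bit depth combinations.
for rate, bits in [(44_100, 16), (88_200, 16), (192_000, 24)]:
    megabytes = pcm_bytes(rate, bits) / 1_000_000
    print(f"{rate} Hz, {bits}-bit stereo: {megabytes:.1f} MB per minute")
```

Doubling the sample rate doubles the size, and going from 16-bit to 24-bit adds another 50% on top of that.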

Audibly, sample rate translates to frequency response; the lower the sample rate, the lower the highest frequency that can be captured.  Let’s say the picture below is a 10 kHz sine wave.  If the sample rate is also 10 kHz, the ADC will sample the wave once per cycle (one complete up and down of the wave).  Every sample lands on the same point of the wave, so the ADC records the same value every time. A signal at the same frequency as the sample rate gets completely nullified.


And from there, it’s a nice smooth curve of obliteration until the frequency of the signal is half that of the sample rate.   Now, I will raise the sample rate to 20 kHz.
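The two cases can be checked with a few lines of Python. This is only a sketch: `sample_sine` is a hypothetical helper, and the starting phase of a quarter cycle is an assumption chosen so the samples land on the peaks of the wave (with a starting phase of zero, the 20 kHz samples would all land on the zero crossings instead).

```python
import math

def sample_sine(signal_hz, sample_rate_hz, count=6, phase=math.pi / 2):
    """Return `count` samples of a unit sine wave taken at `sample_rate_hz`."""
    return [
        math.sin(2 * math.pi * signal_hz * n / sample_rate_hz + phase)
        for n in range(count)
    ]

# A 10 kHz sine sampled at 10 kHz: one sample per cycle.
same_rate = sample_sine(10_000, 10_000)
# The same sine sampled at 20 kHz: two samples per cycle.
double_rate = sample_sine(10_000, 20_000)

print([round(s, 3) for s in same_rate])
print([round(s, 3) for s in double_rate])
```

At 10 kHz every sample comes out the same, so the wave disappears into a flat line. At 20 kHz the samples alternate between the top and the bottom of the wave, so the frequency survives.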

So, to correctly translate, or preserve the frequency of a signal, from analog to digital, a sample rate of at least twice the highest frequency in the signal must be used.  Human hearing range is 20 Hz to 20 kHz, so to correctly translate the audible portion of a signal, we need a sample rate of at least 40 kHz, which is why CDs are done at 44.1 kHz.  If you ask an engineer, a sample rate never need be any higher than 40 kHz because we can’t hear anything above 20 kHz. “You’re just getting signal you can’t hear anyways!”

But the picture above clearly shows a triangle wave instead of a sine wave, and triangle waves sound different than sine waves.  So, even though the frequency is preserved, the timbre takes one for the team and is left by the wayside. What we end up with is incorrect sounds in the really high range of our hearing; things like cymbals take the biggest hit. Not only that, but frequencies above our range of hearing still enter the ear canal, and they affect the way that audible frequencies sound, or at least the way they feel.  Higher frequencies can be perceived even if they can’t be heard.

This is one reason why Blu-ray audio is so much more crisp: the high sample rate (192 kHz) preserves all those really high and really little sounds we normally don’t hear reproduced. Some of the world’s best soundboards for recording sample at 384 kHz to capture and correctly reproduce everything. They also cost mucho monies, which is why most bands and studios don’t use them. That’s about it for sample rate, so on to bit depth!

Bit Depth

While sample rate is resolution along the x axis (time), bit depth is the resolution along the y axis (voltage).  Bit depth is what makes the straight lines from the sampling process look like a bunch of little steps instead of a perfectly straight line. A bit can either be 0 or 1, so a signal sampled at 1-bit depth could either be 0 volts or 1 volt, which results in a square wave. Each additional bit doubles the number of voltage levels available. The number of levels available at a given bit depth is simply 2^bits.  So 16-bit depth is 2^16, or 65,536 different voltage levels. When a signal is sampled, the actual voltage level is rounded to the nearest voltage level that the bit depth allows, resulting in the “staired” look, known as quantization error. Below, a sine wave has been sampled with a bit depth of four, which gives us 16 steps (2^4).  The vertical lines on the signal are the samples, while the horizontal lines are the available voltages.
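The rounding step can be sketched in Python. This is an illustration only: `quantize` is a hypothetical helper, and the evenly spaced levels between 0 and 1 volt are an assumption made to match the 1-bit example.

```python
def quantize(voltage, bit_depth, v_min=0.0, v_max=1.0):
    """Round `voltage` to the nearest of 2**bit_depth evenly spaced levels."""
    levels = 2 ** bit_depth
    step = (v_max - v_min) / (levels - 1)
    # Anything outside the allowed range is clipped to the nearest end.
    clamped = min(max(voltage, v_min), v_max)
    index = round((clamped - v_min) / step)
    return v_min + index * step

print(2 ** 4, 2 ** 16)    # number of levels at 4-bit and 16-bit
print(quantize(0.62, 1))  # 1-bit: only 0 V or 1 V is available
print(quantize(0.62, 4))  # 4-bit: 16 levels, so the result lands much closer
```

With 1 bit, 0.62 volts gets rounded all the way up to 1 volt. With 4 bits it lands on the level at about 0.6 volts, and every extra bit halves the size of the steps again.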

Audibly, bit depth affects the dynamic range of a signal.  The higher the bit depth, the softer the signal is able to be, and the rule of thumb is that for every bit, you add 6 dB of dynamic range. 16-bit has a dynamic range of 96 dB, and 24-bit has a dynamic range of 144 dB. The bottom of the dynamic range is the “noise floor,” the point below which quieter sounds cannot be heard.  When listening, this is heard as the quiet hiss in the background, though it’s difficult to tell if that hiss is the result of bit depth or a noisy amplifier (or headphones).  Lower bit depths mean louder hiss. This is devastating to music that has a large dynamic range, such as classical, orchestral, choral, or some progressive music, but is hardly noticeable in popular music, due to dynamic compression and other things.
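The 6 dB per bit rule of thumb can be compared against the computed value with a short sketch. It assumes the usual definition of dynamic range as the ratio between the full range and the smallest step, 20·log10(2^bits), and ignores dither and noise shaping; `dynamic_range_db` is a made-up name for illustration.

```python
import math

def dynamic_range_db(bit_depth):
    """Theoretical dynamic range in dB for a given bit depth."""
    return 20 * math.log10(2 ** bit_depth)

for bits in (8, 16, 24):
    print(
        f"{bits}-bit: rule of thumb {6 * bits} dB, "
        f"computed {dynamic_range_db(bits):.1f} dB"
    )
```

Each bit is worth a hair over 6 dB (about 6.02), so the rule of thumb comes out slightly low but plenty close enough.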

24-bit has a theoretical range of 144 dB, but we are limited to a range of about 124 dB due to random noise generated in circuits by temperature and other factors, so really after about 21 bits, we lose any advantage.  This is about the range of human hearing, however, so it works out well.  24-bit audio is extremely quiet in the soft parts.  If you ever get a chance, get a DVD-Audio version of a band you like, find a friend that has a DVD-Audio player (most normal DVD players do not play DVD-Audio), and listen on a nice stereo.  You’ll be surprised at the difference in quality and dynamic range, especially if listening to classical music, or something else with a huge dynamic range.

All this pontificating about theory is dandy, but I want to hear what actual difference changing around these numbers makes.  So, here is a video I made showing different sample rates (44.1 kHz, 22 kHz, and 11 kHz), then a lower bit depth (8-bit), on different types of music.  I hope you like it.  I thought it was very interesting to see how differently these affect classical vs. loud popular stuff.  Very interesting indeed. Play it in the highest quality you can to better hear the difference.