Timbre definition from Wikipedia, “In music, timbre, also known as tone color, is the quality of a musical note or sound or tone that distinguishes different types of sound production, such as voices and musical instruments, string instruments, wind instruments, and percussion instruments. The physical characteristics of sound that determine the perception of timbre include spectrum and envelope. In psychoacoustics, timbre is also called tone quality and tone color.” I suggest reading the Wikipedia article as timbre contains both subjective and objective attributes, both of which are discussed in detail in this post.
From a sound reproduction perspective, if one's goal is to reproduce music as faithfully as possible, then timbre (and all of its subjective and objective attributes) is a significant factor. I consider room acoustics the worst offender for destroying timbre (i.e. tone quality). If you are into the scientific research, there are a number of references in the Audio Engineering Society's E-Library. Here are a couple: “Natural Timbre in Room Correction Systems (Part II)” and “The Influence of the Room and of Loudspeaker Position on the Timbre of Reproduced Sound in Domestic Rooms”.
I made two major acoustical improvements in my listening environment recently and thought I would share not only the acoustical measurements, but the actual sound too. Literally, you will be able to hear what I hear when listening to my stereo in my listening room. You will be able to listen to the difference between an acoustically untreated room, a treated room, and using digital room correction (DRC) software.
How? With these in-ear binaural microphones. Click on the details tab, give a quick listen to the acoustic guitar demo. With these, you will be able to hear my stereo/room combo in 24/96 resolution with timbre that accurately represents what I hear. I use these to record live music and thought I could use them to record my stereo in the listening position.
I was going to call this article, “Modern Room Tuning Techniques”, as it is a continuation of my, Speaker to Room Calibration Walkthrough, but ultimately, no matter how many words I write, or graphs I post, it will not fully communicate what my speaker/room combination sounds like. You need to hear it too. Actually listening to the sound will put into perspective what the words and measurement graphs mean.
Professional acousticians and audio engineers routinely take acoustic measurements as part of their everyday job. If you have been doing it full-time for a career, then you can read an acoustic measurement graph and hear the sound in your head. Same as how a musician can read notes off a music sheet and hum the tune (some with perfect pitch) in their head.
While every acoustic space is unique, there are a couple of basic tenets that hold true for small room acoustics, the classification that the majority of our listening rooms fall under. These tenets are controlling room resonances and overall room decay times (i.e. RT60). This is based on a large body of knowledge specifically on small room acoustics. Here is a quick overview with a few reference links.
Just like electronic engineers use circuit diagrams and parts lists (BOM) to communicate the designs and sonic signatures of audio amplifiers, acousticians use time, energy, and frequency information to communicate the sonic signature of an acoustic environment (i.e. both speakers and room). In this article, you will be able to correlate what you see with what you hear (literally) and vice versa.
The Design Process
I am going to analyze my listening room and come up with two designs to improve the timbre of the room acoustics, one passive, and the other active. But first, I will measure my room “as is” and make a reference recording with the binaural microphones so you can hear my stereo and room as is. Here are a couple of pics of my very live, untreated room.
I try and make my analysis balanced between 50% what I hear and 50% what I measure. Based on that analysis, I will design and implement passive acoustic treatments. Then take another set of measurements and binaural recording of the speaker/room combo with the acoustic treatments in place.
Next, Digital Room Correction (DRC). I will fine tune the frequency response using a “target” or “designed” frequency response to reproduce the best effort tonal balance and fine tune the impulse response, (i.e. timing) for the best possible timbre from the speaker/room combo.
“Best possible” because the achievable tone quality (i.e. timbre) is limited by the physical dimensions of the listening room. Given that we can digitally manipulate all three dimensions of sound (amplitude, frequency, and time), we can create any sonic signature we want, with the limitation being the physical dimensions of the room itself. Technically, this sonic signature is described by a transfer function. A transfer function at this level encompasses everything that makes up the sonic signature of the speaker/room combo.
Because of digital audio, we can design and implement our own transfer functions (i.e. sonic signatures) in software with distortion and noise levels far below what we can perceive and correction at a level of resolution far greater than our ears (read: brain) can discriminate.
Historically it was thought that we could only discriminate to 1/3 of an octave (hence the 1/3 octave analog equalizer). Later research has determined that we can discriminate somewhat closer to 1/6 of an octave. So when viewing acoustical frequency response graphs, 1/6 octave smoothing is the preferred resolution to view the graphs as that is the most accurate representation of how our ear hears (or more technically correct, how the brain interprets the electrical signals).
In the digital domain, we have digital filters that can have 65,535 “bands” (or more). Compared to a 31 band 1/3 octave analog equalizer... That's a revolution.
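To put the 31-band figure in context, here is a small Python sketch that counts the 1/3 octave bands covering the audio range and compares that with the bin spacing of a long digital filter. The 1 kHz reference frequency, the half-band tolerance at the range edges, the 96 kHz sample rate, and the 65,536-tap length are all assumptions chosen for illustration, not values taken from any particular equalizer or DRC product.

```python
def third_octave_centers(low_hz=20.0, high_hz=20000.0, ref_hz=1000.0):
    """Base-2 1/3 octave centre frequencies covering low_hz..high_hz.

    A band is kept when its centre lies within half a band of the range,
    so the nominal 20 Hz and 20 kHz bands are both included.
    """
    half_band = 2 ** (1 / 6)
    centers = []
    for k in range(-40, 41):
        f = ref_hz * 2 ** (k / 3)
        if low_hz / half_band <= f <= high_hz * half_band:
            centers.append(f)
    return centers


def fir_bin_spacing_hz(sample_rate_hz, num_taps):
    """Frequency spacing between adjacent bins of a num_taps-long FIR filter."""
    return sample_rate_hz / num_taps


bands = third_octave_centers()
print(len(bands), "third-octave bands")
print(round(fir_bin_spacing_hz(96000, 65536), 3), "Hz between FIR bins (assumed 96 kHz, 65,536 taps)")
```

A graphic equalizer's bands get wider in hertz as frequency rises, while the FIR bins are evenly spaced, so the digital filter's advantage is largest in the bass, where room problems are worst.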
I chose a linear phase filter (as opposed to minimum phase) as this produces the best phase coherence and time alignment. Not only is the sound “time aligned”, but some early reflections are reduced so that the phase coherence holds together long enough to hear the depth on the recording before the 3D image is destroyed by comb filtering effects of the room. Comb filtering is the root of all evil for an audiophile.
Reducing early sound reflections (and diffusing later reflections) is critical to the realistic reproduction of any stereo recording and to achieving the best possible timbre (i.e. tone quality). You want to hear the recording long enough for its phase coherence, or sound stage, to register before the room takes over and comb filtering interferes with the “location” cues, blurring the (depth of) image and coloring the sound with the tone of the room. You will (easily) be able to hear this in the binaural recordings when I compare “as is” with “passive acoustic treatments” and finally with “DRC”.
But first, this is what we need to listen for. It is a bit of science, hopefully presented in a fun and easy to hear manner as it is important to understand what is happening and especially what it sounds like. We all listen to it, but can we hear it?
Pretty cool, the Haas effect. “The Haas effect is a psychoacoustic effect, described in 1949 by Helmut Haas in his Ph.D. thesis. It is often equated with the underlying precedence effect (or law of the first wavefront).”
“Haas found that humans localize sound sources in the direction of the first arriving sound despite the presence of a single reflection from a different direction. A single auditory event is perceived. A reflection arriving later than 1 millisecond after the direct sound increases the perceived level and spaciousness (more precisely the perceived width of the sound source). A single reflection arriving within 5 to 30 milliseconds can be up to 10 dB louder (My note: that’s twice as loud!) than the direct sound without being perceived as a secondary auditory event (echo). This time span varies with the reflection level. If the direct sound is coming from the same direction the listener is facing, the reflection's direction has no significant effect on the results. A reflection with attenuated higher frequencies expands the time span that echo suppression is active. Increased room reverberation time also expands the time span of echo suppression.”
Key concept. It is amazing how a 5 millisecond delay can have that much width. The majority of rock and pop (and most mono multi-track) recordings use the Haas effect extensively, along with more digital delays, reverbs, stereo expanders, etc. If you listen to rock and pop, or any other mono recorded, multi-track recording, it is fake stereo. It's all an illusion and fools our brain every time (speaking as someone that spent over 10,000 hours in the recording/mixing chair doing exactly that). Personally, I don't care. When I crank up SRV's Tin Pan Alley (DR 15) on my rock and roll audiophile system and it feels like I am at Buddy Guy's Legends night club in Chicago, the illusion is complete for me.
A bit more physics, as this is directly related to speaker location and listening position. Sound travels roughly 1 foot per 1 millisecond. The wavelength of a 20 kHz frequency is 0.68 of an inch. If my stereo's equilateral triangle is out even by an inch, I will already have destroyed some of the high frequency image (especially depth of field), because the equilateral triangle is misaligned and I am creating comb filtering at high frequencies.
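The arithmetic behind those numbers can be checked with a few lines of Python. This is a minimal sketch that assumes a speed of sound of 343 m/s (dry air at roughly 20 °C) and treats the misalignment as two equal-level copies of the same signal, which is a simplification of what happens in a real room.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed: dry air at about 20 degrees C
METERS_PER_INCH = 0.0254
METERS_PER_FOOT = 0.3048


def wavelength_inches(freq_hz):
    """Wavelength of a tone, in inches."""
    return SPEED_OF_SOUND_M_S / freq_hz / METERS_PER_INCH


def feet_per_millisecond():
    """Distance sound travels in one millisecond, in feet."""
    return SPEED_OF_SOUND_M_S * 1e-3 / METERS_PER_FOOT


def first_comb_null_hz(path_difference_inches):
    """Lowest cancelled frequency when two equal copies of a signal
    arrive with the given path-length difference."""
    path_m = path_difference_inches * METERS_PER_INCH
    return SPEED_OF_SOUND_M_S / (2.0 * path_m)


print(round(wavelength_inches(20000), 3), "inches at 20 kHz")
print(round(feet_per_millisecond(), 3), "feet per millisecond")
print(round(first_comb_null_hz(1.0)), "Hz first null for a 1 inch path difference")
```

Under these assumptions, a one inch error puts the first cancellation in the upper treble, with further nulls at odd multiples of that frequency.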
The learning from this is that time alignment of everything is critical, due to the Haas effect, and its role in reproducing proper timbre. The better aligned the equilateral triangle, the more phase coherent image can be reproduced, which is one of the key attributes of reproducing the most realistic timbre. Additionally, this is why early reflections need to be tamed, typically 15 dB below the direct signal, so we don’t get the Haas effect blurring the time alignment of the stereo image (especially depth).
My design approach to modern room tuning techniques includes using passive acoustic treatment to minimize room resonances, early reflections, and overall room decay time (RT60). I also use state of the art DRC software to trim the frequency response for best effort tonal balance, time align the signal so that the waveform (all frequencies) arrives at the same time in the listening area, and minimize early reflections to enhance the depth and overall phase coherence of the stereo image before comb filtering destroys the recorded illusion. This is captured on the binaural recordings.
Acoustic Analysis and Design
Fellow CA readers, I am the recipient of the 2nd worst possible sounding room award, only beaten by a room shaped like a cube. This is because the length of the room is almost twice the width. Additionally, my stereo is set up off center in the room. So how do I know it is the 2nd worst possible sounding room? I am using Bob Gold’s room mode calculator that will produce a nice graphic display of the room modes given the dimensions of my room.
According to the calculator, my room’s Schroeder cutoff frequency is 92 Hz. This is my room’s fundamental transition frequency: below this frequency the room behaves as a resonator; above it, as a diffuser/reflector. This transition point is far from smooth; the room resonates below the cutoff and rings (like a filter) above it. Just like blowing air across the mouth of a near empty coke bottle, every room resonates a tone that rides on top of all low frequency notes. When it is bad enough, as with my room ratio, the result is what is sometimes called “one note” bass, meaning the room’s resonant frequencies are so dominant (i.e. too much amplitude) that all the bass notes (and sometimes drums) sound like just one note is playing. Also called “room boom”.
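To see why a length close to twice the width is such a poor ratio, here is a rough Python sketch of the axial mode formula f = n·c/(2L). The 8.0 m by 4.0 m dimensions are hypothetical (they are not my room's measurements), the speed of sound is assumed to be 343 m/s, and only axial modes are considered, so this is far simpler than what a full room mode calculator does.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed: dry air at about 20 degrees C


def axial_modes(dimension_m, max_hz=300.0):
    """Axial mode frequencies n * c / (2 * L) up to max_hz for one room dimension."""
    modes = []
    n = 1
    while True:
        f = n * SPEED_OF_SOUND_M_S / (2.0 * dimension_m)
        if f > max_hz:
            break
        modes.append(f)
        n += 1
    return modes


def shared_modes(modes_a, modes_b, tolerance_hz=1.0):
    """Pairs of modes from two dimensions that land within tolerance_hz of each other."""
    return [(fa, fb) for fa in modes_a for fb in modes_b if abs(fa - fb) <= tolerance_hz]


# Hypothetical room: length exactly twice the width.
length_modes = axial_modes(8.0)
width_modes = axial_modes(4.0)
overlaps = shared_modes(length_modes, width_modes)

print("length modes:", [round(f, 1) for f in length_modes])
print("width modes: ", [round(f, 1) for f in width_modes])
print("coincident:  ", [round(fa, 1) for fa, _ in overlaps])
```

With a 2:1 ratio, every width mode sits on top of a length mode, so those frequencies get reinforced twice instead of being spread out. That pile-up is what produces the dominant tone.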
You will hear the room boom in my listening room as it is captured in the binaural recording. You too can work out your room’s resonant frequencies using this calculator. Here is a frequency response measurement of my room to see if it correlates with the model. Many thanks to JohnM for his most excellent REW measurement software.
This measurement correlates well with the model. Major peaks and valleys between 92 Hz and 300 Hz. That’s the ultimate challenge isn’t it, 2nd worst possible sounding room from an acoustic perspective. If I can make this room sound good… Note the blue horizontal line is mine to help delineate the problem areas. The circled mid-range area also represents a problem area. Initially it looks like too much amplitude, but the real culprit for the raised amplitude is midrange room reverb build up. We need to look at another view to see it.
This brings up a story I feel is worth sharing so you can understand where I am coming from on this. As mentioned elsewhere on my blog, I had the good fortune to have been a live sound, recording/mixing engineer for 10 years. SQ was of major importance to me and I worked extra hours to ensure the artist/group got the best possible sound I could come up with. I worked in several state of the art acoustic spaces, with this one below sounding so good that I gave up on my home system.
The studio control room facilities I worked in were designed from the ground up acoustically to be state of the art. The rooms sounded incredible. Perfect neutral timbre. If you ever get a chance to visit a properly designed studio control room and listen to some music... I got so used to state of the art sound, that no matter what I did in my home stereo it paled in comparison to the sound of the state of the art control rooms. And I am not talking about the gear.
The biggest difference between working in the control room and listening at home was the timbre (i.e. tone quality) of the rooms. The studio control room is designed so that the engineer sitting behind the console would hear the sound of the music picked up by the mic and room of the studio before the sound of the control room could be heard. Also known as a reflection free zone (RFZ). RFZ is a control room design based on knowledge of the Haas effect.
That meant obtaining a reflection free zone at the mix position and ensuring that any room timbre (i.e. tone quality and all of its subjective and objective attributes) was as neutral sounding as possible. I.e. no coke bottle resonance effects, no boxiness, etc. If you saw the blueprints for one of these control rooms, you would see that no two surfaces are parallel and that they are designed to ensure early reflections do not enter the RFZ and later reflections are thoroughly diffused, so any room sound is perceived as a neutral sounding extension that makes the room sound a bit bigger than it really is. A very neat psychoacoustic trick.
As mentioned, the point was to hear the direct sound from the mic in the studio, plus the early reflections (i.e. tonal colorations) before you could hear the sound (i.e. timbre) of the control room. That way, when you were placing mics and eq'ing, you were not making decisions based on hearing the tonal colorations of the control room, mixed in with the sound from the studio.
When I compared the acoustics of my home listening space versus the state of the art control room I was working in 8+ hours a day, the timbre gap was so great, I gave up on a traditional speaker setup at home. Mostly I listened to headphones. Sometimes, I invited the boys over to the studio when it wasn't busy and we would listen to tunes there.
While looking at some programming sites, I came across a few Digital Signal Processing (DSP) articles. One of them showed how you can use a well-known DSP technique called convolution to digitally mix (i.e. real-time convolve) the “bit-perfect” music signal with a digital filter (both in the frequency and time domain) that is the inverse (well, they really are algorithms) of the measured room response. Convolving the signal with a filter applies that filter’s transfer function.
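For readers who have not met convolution before, here is a toy Python sketch using NumPy. It is not how any particular convolution engine is implemented. It only shows that convolving a signal with a filter's impulse response in the time domain gives the same result as multiplying their spectra in the frequency domain. The four-sample signal and the two tiny filters are made up for illustration.

```python
import numpy as np


def apply_fir(signal, fir):
    """Time-domain convolution of a signal with a FIR filter's impulse response."""
    return np.convolve(signal, fir)


def apply_fir_fft(signal, fir):
    """The same operation done by multiplying spectra (the transfer function view)."""
    n = len(signal) + len(fir) - 1
    spectrum = np.fft.rfft(signal, n) * np.fft.rfft(fir, n)
    return np.fft.irfft(spectrum, n)


signal = np.array([1.0, 0.5, -0.25, 0.0])
identity = np.array([1.0])             # passes the signal through unchanged
delay_two = np.array([0.0, 0.0, 1.0])  # delays the signal by two samples

print(apply_fir(signal, identity))
print(apply_fir(signal, delay_two))
print(np.allclose(apply_fir(signal, delay_two), apply_fir_fft(signal, delay_two)))
```

A room correction filter is the same idea with a much longer impulse response, designed so that the filter followed by the room comes out closer to the target response.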
JRiver MC has a state of the art convolution engine to host these designed digital filters. What can be done in software far exceeds what can be done in hardware and analog domain. Every modern consumer and pro A/D D/A is performing DSP on the audio signal with digital filters (in conjunction with analog filters) already. “The precision offered by Media Center's 64bit audio engine is billions of times greater than the best hardware can utilize. In other words, it is bit-perfect on all known hardware”
A bit more searching and I found a few DRC software products that used this filter design for audio. One is called Audiolense. I downloaded the demo and ran it on my crappy Logitech G51 computer speakers. If it can make those sound good… As soon as I heard it, I knew that someone (Bernt!) had figured this out in the digital domain, which is a revolution compared to what we can do in hardware/analog audio. This is what I was waiting for.
For me, it is a new ball game and gave me the opportunity to get back into listening to music the way I heard it in those acoustically (near) perfect rooms, or at least come a lot closer than ever before. We will see if the proof is in the binaural recordings, which you can listen to and draw your own conclusions.
Back to the passive acoustic filter design. The first thing I need are bass traps that have good absorption capabilities from 92 to 300 Hz. When I was in the pro audio industry, I used ASC Tube Traps (and RPG products) extensively with good success. Unfortunately, I don’t have budgets like that anymore, but I think I have found a reasonably priced bass trap that should do the job.
It is a corner trap, and should go directly behind the speakers in the corners. Because of my room’s offset, the best I can do is directly behind the speakers in a sorta corner. The idea here is twofold; one is to dampen the low end sound coming off the back of the speaker cabinet so the reflection off the wall and back to the listening position is minimized. This would correspond to about 4 or 5 milliseconds delay. Remember the Haas effect video on what 5 milliseconds delay sounded like? That’s roughly 5 feet of distance, and in this case, after the main sound wave arrives, a secondary wave arrives off the wall from behind the speakers and confuses my brain on location. In this case, it destroys the image from front to back. Depth of field, due to early reflections (and comb filtering) is the first thing to go. It is the green circled portion in the graph below.
With the bass traps in place, it should help dampen those resonances/ringing from 92 to 300 Hz, plus dampen the impact off the back of the speaker. This should result in a tighter (i.e. more transient) bass sound with minimal 5 millisecond later reflection so it does not blur the (depth of the) image. This is captured on the binaural recording. We can also measure this with an Energy Time Curve (ETC).
Technically, we can measure the room’s early reflections with an ETC, typically from 0 to 40 or 50 milliseconds. That’s 40 to 50 feet of travel after the direct sound arrives at the listening position. That way we can inspect anywhere along the time curve and with the wavelength calculator, turn that into distance. This allows us to figure out where the early reflections are coming from and to either dampen or diffuse accordingly.
The spikes on the graph and their corresponding millisecond time readings can be translated into feet using the wavelength calculator. Measuring that distance from the mic position to the point of reflection then identifies where passive acoustic treatments should go.
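The time to distance translation is simple enough to sketch in Python. The speed of sound of 1125 feet per second is an assumed round figure, and the reflection delays and levels in the list are hypothetical placeholders, not readings from my ETC. The delay on an ETC is relative to the direct sound, so the result is the extra distance the reflection travelled compared with the direct path.

```python
SPEED_OF_SOUND_FT_S = 1125.0  # assumed round figure for room temperature air


def extra_path_feet(delay_ms):
    """Extra distance a reflection travelled, given its delay after the direct sound."""
    return SPEED_OF_SOUND_FT_S * delay_ms / 1000.0


def meets_rule_of_thumb(level_db_rel_direct, threshold_db=-15.0):
    """True when a reflection is at least 15 dB below the direct sound."""
    return level_db_rel_direct <= threshold_db


# Hypothetical ETC spikes: (delay in ms, level in dB relative to the direct sound).
reflections = [(4.5, -12.0), (9.0, -16.0), (21.0, -20.0)]

for delay_ms, level_db in reflections:
    verdict = "ok" if meets_rule_of_thumb(level_db) else "needs treatment"
    print(f"{delay_ms:5.1f} ms -> {extra_path_feet(delay_ms):5.1f} ft extra path, {level_db} dB: {verdict}")
```

Any spike flagged here can then be traced back to a surface with the tape measure.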
And it is mostly the same type of acoustical treatments, one to tame the room’s resonant/ringing frequencies with bass trapping in corners. Next is diffusion or absorption of the early reflections off the floor, ceiling, and side walls. Of course, the back wall and front wall (with the windows). The windows may benefit from heavy velour curtains. Ideally, the speakers would be mounted in soffits, like in recording studio control rooms, but it’s just my living room, so it’s a design tradeoff (ha ha).
Pretty easy to correlate as one can take a tape measure, or string, or a laser distance measurer, measuring from the mic, with a mirror to find the reflection points and correlate to the ETC by using the wavelength calculator.
This is an ETC measurement of my untreated room. I can label the reflections based on translating to a physical measure in the room. As it stands it is not too bad as the rule of thumb is that all early reflections should be 15 dB or more down from the main signal amplitude. I am almost there. This is simply by virtue that my listening position is as far away from any reflecting surfaces as possible, given the constraints of my room.
Check out this waterfall graph showing which ranges of frequencies are producing the long decay times. This means my room is very lively as the carpet is indeed the only real absorbent material in a room that is otherwise all drywall, glass, tile, and hardwood (on top of being the 2nd worst room ratio).
What you are seeing here is sound measured in 3 dimensions: the vertical scale is level or amplitude in decibels, the horizontal scale is frequency in hertz, and the z scale is time in milliseconds. In my case, the time scale is from 0 to 300 milliseconds, meaning the sound has travelled roughly 300 feet (10x the length of the room and 20x the width of the room) in the room when the microphone measured 300 milliseconds after the direct sound, so that we get the sound of the room and its decay, displayed in a visual 3D graph.
I have circled the two problem areas. The one on the left is showing the room resonances with peaks and valleys that I identified earlier. The one in the lower middle is showing the long midrange decay times, which build up more than other frequencies and caused me to incorrectly compensate by lowering the DRC "target" frequency response by 3 dB at 2 kHz. More on that later.
Let’s look at the shape of the decay over time. ITU, IEC, ISO, BBC, and other standards bodies have specifications for the reverb time (spec’d as RT60) or more properly, early decay time, for critical listening environments of a minimum volume of 2500 cubic feet. The specification or preferred range is from .4 to .6 seconds decay across the frequency band, with some rise in the bottom end allowed. That’s 400 to 600 milliseconds max.
I am definitely over the .6 second mark in the midrange as circled in the graph (turns out to be .7 seconds). In this case, some broadband absorbers with good absorption in the midrange will be called for in this design. These should be mounted at the first reflection point on the ceiling and the rear wall to not only reduce early reflections, but further dampen the “brightness” and “boxiness” sound of the untreated room. If my room happened to be the opposite, i.e. dead sounding, then I would put diffuser panels on the ceiling and rear wall instead, with that .4 to .6 second decay as the target RT60.
That’s the analysis of my room acoustics and some basic acoustic design, not only based on measurements, but extended hours listening for early reflections, room modes, and midrange comb filtering. My design is to dampen the back of that pounding 15” woofer and the room modes at the cutoff frequency and harmonic ringing. In addition, absorb broadband midrange due to bare walls, plus take care of the early reflections (floor has carpet, the ceiling gets the absorber) to get rid of that “boxy” sound. We will see if it is enough or not. As a last resort, I can hang heavy (velour) curtains over the front windows plus a good portion of the wall.
Listening to the Untreated Room
Let’s take a listen to a binaural recording of my untreated room so you can hear for yourself, the “boxiness” sound and “one note” bass sound. I chose the tune "Arbantana" from Hossam Ramzy's album, "Rock the Tabla" (iTunes) for a number of reasons. The bass notes really activate the "one note" bass tone of the room, lots of transient percussion for timing and imaging cues, plus good artificial reverb that further exaggerates the boxy tone of my room. Besides, I like the tune and Hossam is an awesome percussionist.
Sept 15 2012 Update
Unfortunately, there are two issues with my binaural recordings. One is that I did not calibrate the binaural microphones’ frequency response, and the other is I used my head instead of a “fixed” dummy head.
This is the frequency response of the binaural microphones as best as I could record with my head not moving. The monitors producing the swept sine have already been calibrated flat from 20 Hz to 22 kHz, ±4 dB. What is being shown is the microphones’ frequency response deviation from flat. There should have been an inverse digital (FIR) filter applied to the mics’ frequency response to make the recording flat.
[Image: frequency response of the binaural microphones]
The other flaw was a sample rate problem with the right channel (in blue) dropping out at 16 kHz. These issues reduce the true representation of what I was listening to versus what the mics picked up.
There is value in the comparisons as the relative differences are audible. However, the overall tonal balance (i.e. timbre) does not accurately represent the tone quality of the system due to the un-calibrated mics.
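As a rough illustration of the inverse (FIR) filter mentioned above, here is one way such a correction could be sketched in Python with NumPy, using a simple frequency-sampling design. This is not what I used (nothing was applied to these recordings). The function name, the tap count, the FFT size, the 12 dB gain limit, and the measurement points are all assumptions for the example.

```python
import numpy as np


def inverse_fir(freqs_hz, deviation_db, sample_rate=96000, numtaps=1025,
                n_fft=8192, max_gain_db=12.0):
    """Linear-phase FIR whose magnitude is the inverse of a measured deviation.

    freqs_hz must be increasing; deviation_db is the mic's deviation from flat.
    The correction is clipped to +/- max_gain_db to avoid extreme boosts.
    """
    if numtaps % 2 == 0 or numtaps > n_fft:
        raise ValueError("numtaps must be odd and no larger than n_fft")
    grid = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    deviation = np.interp(grid, freqs_hz, deviation_db)
    gain_db = np.clip(-deviation, -max_gain_db, max_gain_db)
    magnitude = 10.0 ** (gain_db / 20.0)
    zero_phase = np.fft.irfft(magnitude, n_fft)       # symmetric around sample 0
    centered = np.roll(zero_phase, numtaps // 2)[:numtaps]
    return centered * np.hanning(numtaps)              # taper the truncated ends


# Hypothetical measurement: a mic that reads 6 dB hot at every frequency.
taps = inverse_fir([20.0, 20000.0], [6.0, 6.0])
print(len(taps), "taps, centre tap =", round(float(taps[len(taps) // 2]), 4))
```

The resulting taps would be convolved with each channel of the recording, ideally with a separate filter measured for each microphone.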
[Image: the new binaural microphones]
Here are the new binaural mics (with silicone ears) that can be mounted on a mic stand. Note these are not the ones I used for the recordings in this post, but I will use them in an upcoming article. Now that I have a repeatable way to make binaural recordings, and have corrected the sample rate issues, I can calibrate these new binaural mics. I have also updated my A/D D/A converter to a Lynx Hilo. I may redo the tests/recordings and update this post, but will more likely create a new post.
In the meantime, I hope you find value in the walkthrough and comparing the binaural recordings.
Use headphones to fully realize the binaural effect.
Download MP3 320kbps 4 meg
Download hires WAV 40 meg
I recorded this while sitting in the listening position and using my Lynx L22 pro sound card’s ADC direct to Audacity at 24/96. You may want to listen to it a few times through to acclimatize your ears to binaural sound and the sound of my speaker/room combo. To me, my speakers/room combo sounds bright and boxy through the mids. And the one note bass tone is evident as well. As an aside, this is more a function of the room than the speakers as will be seen in a future post.
Here is what to listen for, keeping in mind the sound of the Haas effect from the video. Specifically, what bass there is, sounds muddy and has a dominant overtone. It will be hard to distinguish the kick drum and bass guitar as the "one note" tone makes everything kind of blend together, sounds drone like or room boom.
With respect to the midrange, notice a set of tight drums cracking away, interplaying with the main drum kit. Notice the width and depth of field of these drums, almost slap echo off the untreated walls - heavily comb filtered/ringing and its "depth" position in the mix is wrong. It sounds too far back, but too up in level because of the reverb build up (comb filtering/ringing) of the midrange frequency range. Listen as many times as required to tune into these timbre issues.
Adding Passive Acoustical Room Treatments
Every listening room has a fundamental resonant frequency (plus harmonics) that will need some taming. It is simply a function of the physical dimensions of the room. How “live” or “dead” the room sounds will determine the number of diffusers and/or absorbers needed for any particular sound environment to achieve the recommended RT60 decay time. The ideal design is to have all sound at all frequencies decay at the same rate and meet halfway between the RT60 specification of .4 to .6 seconds.
Every critical listening environment could benefit from this basic passive acoustic filter design pattern. A more encompassing design pattern may look like this:
I have used this design pattern (and portions thereof) extensively and successfully when I was in the pro audio business.
Here is what I ended up installing in my room. 6 panels, 4 clipped onto the back wall and 2 on the ceiling to take care of the early reflections. 2 corner bass traps behind the speakers:
Here are a few measurements to see how the passive acoustic treatments helped out the acoustics, even though I can hear the difference just standing in the room. These overlays are to compare before and after acoustic treatments. I have zoomed in the vertical scale to 2 dB per division to show detail, which exaggerates the "un-smoothness" of the frequency response.
The acoustic treatments are able to significantly dampen the circled areas almost by 5 dB at 200 Hz and 3dB through the midrange, which is reducing the room power by half. Said another way, the passive acoustic treatments reduce the room gain by half in the identified problem areas. That's significant.
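The "half the power" statement follows directly from the decibel definition, which this small Python sketch works through. Nothing here is specific to my room. It is just the standard 10·log10 (power) and 20·log10 (amplitude) relationships.

```python
def db_to_power_ratio(db):
    """Power ratio corresponding to a level change in decibels."""
    return 10.0 ** (db / 10.0)


def db_to_amplitude_ratio(db):
    """Amplitude (sound pressure) ratio corresponding to a level change in decibels."""
    return 10.0 ** (db / 20.0)


for change_db in (-3.0, -5.0):
    print(f"{change_db} dB -> power x{db_to_power_ratio(change_db):.3f}, "
          f"amplitude x{db_to_amplitude_ratio(change_db):.3f}")
```

So a 3 dB cut is very close to half the power, and the 5 dB cut at 200 Hz leaves a bit under a third of the original power.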
The early reflection in the 4 to 5 millisecond range has been reduced considerably as a result of the bass traps placed behind the speakers and reducing the reflection off the wall behind the speakers. This is key to the kick drum having definition and hearing all the bass notes from the bass guitar at equal loudness, both in the frequency and time domain.
Compare the two 3D waterfall graphs above, the first one before treatments and the latter after. The mid-range decay times (the boxiness sound) have been reduced as circled. Also note, the 200 Hz peak and decay has also been reduced 5 dB. I was going to screen cast switching between the two graphs so you could get a real good sense of the passive acoustic filters at work as it is much more than just the circled points, the overall sound is further diffused.
Because of the passive acoustic treatments, my room's RT60 is now within the .4 to .6 second specification across the frequency range. If I were to add any more absorbent material, I might add a couple more ceiling absorbers right over the listening position to reduce the comb filtering effects of the couch, or add heavy velour drapes to the windows in the front of the room.
Listen to the Difference - Untreated versus Treated Room
Here is another binaural recording of the same tune, but now in the treated acoustic space. So you can hear the difference between the untreated and treated room, I level matched the before and after binaural recordings (within .1 dB). Starting with the untreated room, I switch from untreated to treated to untreated, etc., every 15 seconds. Like so in Audacity:
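I did the level matching and switching by hand in Audacity, but the same procedure can be sketched in Python with NumPy for anyone who wants to script it. The function names are made up for this example, the inputs are assumed to be time-aligned arrays with samples along the first axis, and the hard cuts here have no crossfade.

```python
import numpy as np


def rms_db(samples):
    """RMS level in dB relative to full scale (1.0)."""
    return 20.0 * np.log10(np.sqrt(np.mean(np.square(samples))))


def match_level(reference, other):
    """Scale `other` so its RMS level equals that of `reference`."""
    gain = np.sqrt(np.mean(np.square(reference)) / np.mean(np.square(other)))
    return other * gain


def alternate(first, second, sample_rate, segment_s=15.0):
    """Switch between two recordings every segment_s seconds, starting with `first`."""
    n = min(len(first), len(second))
    seg = int(round(segment_s * sample_rate))
    out = np.array(first[:n], dtype=float)
    for start in range(seg, n, 2 * seg):      # every other segment comes from `second`
        stop = min(start + seg, n)
        out[start:stop] = second[start:stop]
    return out


# Synthetic stand-ins for the untreated and treated recordings.
rng = np.random.default_rng(0)
untreated = 0.2 * rng.standard_normal(96000 * 2)
treated = match_level(untreated, 0.1 * rng.standard_normal(96000 * 2))
comparison = alternate(untreated, treated, sample_rate=96000, segment_s=0.5)
print(round(rms_db(untreated) - rms_db(treated), 3), "dB level difference after matching")
```

Matching on overall RMS is a simplification. Since the treatment changes the frequency balance, one might prefer to match on a midrange band instead.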
What to listen for? When the binaural recording switches from untreated room to treated room, you should hear a timbre change as the midrange damping lessens the boxy sound. You should also hear a tightening of the bass and less of the "one note" bass tone. As the recording progresses and more instruments are played, it becomes easier to hear the switch from a boxy, reverberant sound to a tighter sound with a much more defined soundstage.
Especially note the sound of the kick drum and bass guitar in the low end. Remember how reverberant the tight cracking drums sounded in the untreated room? They should now sound less "slappy", with tighter definition. In fact, you should hear tighter definition and more focus on everything.
As noted above, as more instruments are playing, you will notice a timbre change towards the end with the soloing guitar. The slap echo or ringing effect that adds tone color is gone when the recording switches to the treated room towards the end. You may have to play it a few times to really key in on what is going on. You can also use a timer while listening and looking at the graph to key right in on the transitions.
Use headphones to fully realize the binaural effect.
Download MP3 320kbps 4 meg
Download hires WAV 40 meg
Based on my listening tests, the bottom end and midrange are much more tightly defined, as is the overall stereo 3D image. There is an overall improvement in frequency response smoothness, with tighter definition, imaging, and timing. It sounds more focused. It is easier to hear the tone quality change towards the end. It seems I am right in there for the proper decay time.
The sonic improvements that I hear line right up with what I measure and vice versa. So from a timbre perspective, I am pretty happy with the end result.
Analysis and Design Part 2
After living with the acoustic treatments for a week and listening every day, I can say they have made a major improvement in tone quality, damping both the “one note” bass room mode and the “boxy” comb filtering in the mids. The decay time is within specification, as evidenced by both the measurements and the binaural recordings, in which you can hear the timbre (or tone quality) improvement yourself.
What further improvements can I make to the speaker/room interface? How can I further improve the timbre? There seems to be more room to improve, especially given that the frequency response still deviates by quite a few (14) dB, when I should be in the ±3 dB range across the frequency band. Even then, 1 dB either way is audible. How do I further smooth the frequency response?
Also, what about phase coherence and timing at the listening position? Can I improve that? I remember owning Thiel CS 3.6 time-aligned speakers in the consumer world, and when I was recording/mixing, I was using the Urei 813C time aligns. I can hear time alignment, and I can measure it. So how do I improve the time alignment (my speakers don’t have that feature built in; many don’t, as it is hard to do, meaning expensive), and how do I further minimize early reflections to get the best image possible at the listening position?
Basically, I need both frequency and time alignment capabilities. Every piece of audio gear has a sonic signature, and the revolution that is digital audio provides a facility to correct the sound in the digital domain at high resolution (64-bit data path) and low distortion. Given the computing power and sophisticated DSP software we have today, there is a class of software called Digital Room Correction (DRC) software.
Therefore, I can easily create any sonic signature I want, since software like this gives me more control over the frequency and time domains than my ears can discriminate. With the software, you can use the default digital filters or create your own using a Designer. This is designing the transfer function for the speaker/room combo; in this case, a linear phase FIR filter.
Digital Room Correction
How do we do this? We design the digital filter using a “target” frequency response, one that we design in software. If time domain correction is enabled, which it is in my case, then the impulse response values change with the target frequency response. The best impulse response is achieved by matching the target's high frequency roll-off to the natural roll-off of the tweeter's filtered frequency response.
For me, this tunes the filter to yield the best possible timbre for the speaker/room combo. When this is tuned properly, the timbre tunes in like a guitar string being brought into tune. I have guitars, mics, A/D converter, so I can compare live and recorded timbre of the guitars, plus shakers, tambourines, triangles, etc.
Here is an example of a "designed target" frequency response using Audiolense. I draw or enter in the data points of the frequency response curve I want (red dots).
I have tried dozens of targets before I added my acoustical treatments. Here are a few examples:
Every one of these "targets" sounds different, from both a (frequency) tone and an (impulse) timing perspective. I circled the 2 kHz region. If you look back at the untreated room frequency response near the top of the article, with the 3 dB peak circled in the 1 to 2 kHz range, I used DRC to compensate for this by dropping the target 3 dB at 2 kHz. However, this is the wrong thing to do, as DRC cannot compensate for excessive midrange RT60 decay time; only adding acoustical treatments can.
Once I treated my room acoustically, I no longer needed to drop the target frequency response by 3 dB at 2 kHz. That was a lesson for me. Actually, a re-learning, as I remember reading this in Don Davis's excellent book Sound System Engineering: "You can't effectively (digital) eq a reverberant field".
Here is another view of the target plus the uncorrected frequency response of my speakers in the main form view of Audiolense. Note how the target's frequency extremes match the speakers' natural roll-offs.
I snuck in a little bottom end lift on the target, but given the Klipsch QB3 alignment of the ported bass bin, I can tuck in a little more low end and still have the bass sound tight without overtaxing the amplifiers.
Now I can have Audiolense generate the digital FIR filter (which is almost an inverse of the uncorrected response, I say almost because there are other algorithms at play here):
Here is the resultant corrected frequency response:
The uncorrected frequency response is on top and the corrected frequency response is on the bottom (along with the target). In addition to the acoustical treatments, and short of building a state of the art critical listening room from the ground up, I know of no other way to achieve this level of timbre correction, given my awful room ratio.
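To make the "almost an inverse" remark concrete, here is a minimal sketch of one generic reason a correction is not a pure inverse: the boost applied to deep dips is usually limited, so a room null is only partly filled in. This is my illustration of the idea in the magnitude domain only, not Audiolense's actual algorithm, which also works in the time domain. The measured values, the 6 dB boost limit, and the function name are all hypothetical.

```python
def correction_db(measured_db, target_db, max_boost_db=6.0, max_cut_db=20.0):
    """Per-band correction: target minus measured, with limited boost and cut."""
    out = []
    for measured, target in zip(measured_db, target_db):
        wanted = target - measured  # the pure inverse of the deviation
        out.append(max(-max_cut_db, min(max_boost_db, wanted)))
    return out

# Hypothetical measured response (dB) at a handful of frequencies.
freqs_hz = [50, 100, 200, 400, 800, 1600]
measured_db = [4.0, -10.0, 5.0, 0.5, -1.0, 2.0]
target_db = [0.0] * len(freqs_hz)

correction = correction_db(measured_db, target_db)
corrected = [m + c for m, c in zip(measured_db, correction)]

for f, m, c, r in zip(freqs_hz, measured_db, correction, corrected):
    print(f"{f:5d} Hz  measured {m:+6.1f}  correction {c:+6.1f}  result {r:+6.1f}")
```

In this made-up data the peaks are cut fully, while the 10 dB dip at 100 Hz only gets the 6 dB of boost allowed, which keeps the amplifier and drivers from being asked to fill a hole the room will not let them fill.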
Before we listen, let’s look at a few measurements.
Frequency response. I have zoomed way in on the amplitude scale again to show detail. The DRC is able to correct for a 14 dB swing and reduce it to a ±2 dB deviation. The spectral response is similar to the preferred spectral responses described in B&K's paper (Figure 5) and Dr. Sean Olive's paper (slide 24).
The ETC looks to be in spec, as almost all early reflections are at least 15 dB below the main signal arrival. The early reflection at around 2 milliseconds is the first reflection off the floor to the listening position. Other than mounting the speakers in soffits, not much can be done there. The good news is how diffuse the later reflections are, which means the room adds little tone color to the reproduced music through the speaker/room combo. This is captured in the binaural recordings.
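As a rough illustration of how an early reflection like the 2 millisecond floor bounce can be checked, the sketch below converts a reflection delay into extra path length and reads the reflection's level relative to the direct sound from an impulse response. A real ETC uses the envelope of the impulse response; here the raw sample peak stands in for it to keep the example short. The speed of sound value, the function names, and the toy impulse response are assumptions.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed (roughly room temperature)

def extra_path_m(delay_ms):
    """Extra distance travelled by a reflection arriving delay_ms late."""
    return SPEED_OF_SOUND * delay_ms / 1000.0

def reflection_level_db(impulse, fs, delay_ms, window_ms=0.5):
    """Peak level near delay_ms after the direct sound, relative to it."""
    direct_idx = max(range(len(impulse)), key=lambda i: abs(impulse[i]))
    direct = abs(impulse[direct_idx])
    centre = direct_idx + round(delay_ms * fs / 1000.0)
    half = max(1, round(window_ms * fs / 1000.0))
    lo, hi = max(0, centre - half), min(len(impulse), centre + half + 1)
    peak = max(abs(s) for s in impulse[lo:hi])
    return 20.0 * math.log10(max(peak, 1e-30) / direct)

# Toy impulse response: direct sound plus one reflection 2 ms later.
fs = 48000
impulse = [0.0] * 1000
impulse[100] = 1.0    # direct arrival
impulse[196] = 0.15   # reflection, 96 samples (2 ms) later
level_db = reflection_level_db(impulse, fs, delay_ms=2.0)
path_m = extra_path_m(2.0)
print(round(level_db, 1), "dB relative to direct,", round(path_m, 2), "m extra path")
```

With these made-up numbers the reflection sits a little more than 16 dB down, and 2 ms corresponds to roughly 0.7 m of extra travel, which is plausible for a floor bounce.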
The blue waterfall graph is as good as it is going to get given my room ratio. I can play with the decay time of the 50 to 60 Hz wave by adjusting a parameter in the time domain window in Audiolense’s Correction Procedure Designer as a next step to tune this back a bit, but I don’t notice it too much in the sound.
Listen to the difference Part 2
Just like the previous A/B comparison of untreated room to treated room, I used the in-ear binaural mics to record both the treated room and the treated room with DRC. The recording starts with the treated room, switches to the treated room with DRC after 15 seconds, and then switches back and forth every 15 seconds.
Use headphones to fully realize the binaural effect.
Download MP3 320kbps 4 meg
Download hires WAV 40 meg
There is an even greater timbre change with DRC enabled. So much so, it can be rather startling at first, but as you peruse the graphs and listen, you can make the correlation. It is quite flat sounding, both figuratively and literally. If that ain't your type of sound, then you can change the target to be anything you want.
Take a moment and listen to the full 1:30 of the recording with room treatments and DRC (with time alignment) enabled.
Download MP3 320kbps 4 meg
Download hires WAV 40 meg
Listen for the kick drum and bass guitar. The kick drum now has a full, yet tightly delineated sound to it. It is nice and clear; note its position in the mix. With the bass guitar, you can hear every note at equal loudness, with no hint of "one note" bass tone. Near the end of the tune, you can hear the bass note slide very low and rise. If you were physically in the room, you would feel the bass and drums.
Also gone is the "boxy" sound. Focus in on those tight cracking drums; no longer do they sound outside the mix. They now fit in the mix at the right level and are nowhere near as reverberant.
Note the overall stereo image and depth of field. Listen to where the kick drum is slotted in the mix and that it sounds crystal clear and not buried in a muddy sound. The image now has a depth of field as deep as the width is wide.
The more you listen, the more you will notice. Our ears acclimatize after a couple of minutes and will have forgotten the original horror tone of my untreated room.
I had a lot of fun doing this. I think the binaural recordings are a bit novel. The timbre changes between the untreated room, the treated room, and the treated room with DRC are definitely audible in the binaural recordings. As mentioned earlier, it is too bad that the mics are not calibrated and that I was not able to position myself repeatably in the same spot. Now that I have a fix for that, along with a mic calibration procedure, my next set of binaural recordings will match much closer to the true timbre of my system.
If achieving the best possible timbre from your audiophile system is of interest to you, then this article and my previous article, Speaker to Room Calibration Walkthrough, may be of some use.
Update October 15th new frequency and impulse response measurements
I have learned a lot about digital room correction and FIR filter design over the last year. These two resources, Sound correction in the frequency and time domain and DRC: Digital Room Correction have really helped me understand what is happening and what targets to shoot for.
Also, Bob Katz assisted with the target response: flat to 1 kHz and then, using 1 kHz as a hinge point, a straight line to -6 dB at 20 kHz. Here is the measured response at the listening position:
<p><a href="<fileStore.core_Attachment>/monthly_2013_02/58cd9bc3432a5_stereofr.jpg.3ff30fe9f510f2ddcb2d03a38a36135f.jpg" class="ipsAttachLink ipsAttachLink_image"><img data-fileid="28144" src="<fileStore.core_Attachment>/monthly_2013_02/58cd9bc3432a5_stereofr.jpg.3ff30fe9f510f2ddcb2d03a38a36135f.jpg" class="ipsImage ipsImage_thumbnailed" alt=""></a></p>
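The hinged target described above is easy to write down as a handful of points and interpolate. The sketch below assumes the "straight line" is straight on a logarithmic frequency axis, which is how frequency response is normally plotted; on a linear axis the numbers in between would differ. The 20 Hz starting point and the function name are also my assumptions for illustration.

```python
import math

def target_db(freq_hz, points):
    """Interpolate a target curve linearly in dB over log10(frequency)."""
    pts = sorted(points)
    if freq_hz <= pts[0][0]:
        return pts[0][1]
    if freq_hz >= pts[-1][0]:
        return pts[-1][1]
    for (f0, d0), (f1, d1) in zip(pts, pts[1:]):
        if f0 <= freq_hz <= f1:
            span = math.log10(f1) - math.log10(f0)
            frac = (math.log10(freq_hz) - math.log10(f0)) / span
            return d0 + frac * (d1 - d0)

# Flat to 1 kHz, then a straight line down to -6 dB at 20 kHz.
hinged_target = [(20.0, 0.0), (1000.0, 0.0), (20000.0, -6.0)]

for f in (100, 1000, 2000, 5000, 10000, 20000):
    print(f"{f:6d} Hz  {target_db(f, hinged_target):+.2f} dB")
```

Under that log-axis assumption the curve passes through -3 dB at the geometric midpoint of 1 kHz and 20 kHz, which is about 4.47 kHz.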
Then I measured the uncorrected and corrected impulse responses using REW. They are remarkably similar to the before and after impulse responses at Denis Sbragion's site. Here is the impulse response of Denis's system, before and after DRC is applied:
<p><a href="<fileStore.core_Attachment>/monthly_2012_10/CorrectionBig.png.e393dd0290c21959aea8944ea1dd0a05.png" class="ipsAttachLink ipsAttachLink_image"><img data-fileid="28133" src="<fileStore.core_Attachment>/monthly_2012_10/CorrectionBig.png.e393dd0290c21959aea8944ea1dd0a05.png" class="ipsImage ipsImage_thumbnailed" alt=""></a></p>
Here is the uncorrected and corrected impulse response of my system:
<p><a href="<fileStore.core_Attachment>/monthly_2012_10/58cd9bc2ee410_Impulseresponse.JPG.4b40759209f015221bfd340b27e03ecc.JPG" class="ipsAttachLink ipsAttachLink_image"><img data-fileid="28134" src="<fileStore.core_Attachment>/monthly_2012_10/58cd9bc2ee410_Impulseresponse.JPG.4b40759209f015221bfd340b27e03ecc.JPG" class="ipsImage ipsImage_thumbnailed" alt=""></a></p>
Remarkably similar to Denis’s impulse response, but maybe with a bit less ringing. It is getting close to a textbook perfect impulse response.
The Audiolense-designed 64-bit FIR filter is hosted in JRiver’s version 18 64-bit convolution engine. “The precision offered by Media Center's 64-bit audio engine is billions of times greater than the best hardware can utilize. In other words, it is bit-perfect on all known hardware.”
Using JRiver’s loopback function, I am able to send the swept sine wave output of Audiolense into the input of JRiver, whose output goes out through the DAC and line output and into the amp/speakers. In JRiver, I can toggle the FIR filter on/off in the Convolution engine and take measurements: the mic at the listening position picks up the signal, which passes through a mic preamp into the analog line inputs of my Lynx Hilo, through the ADC, and is routed to the digital input channel of Audiolense so it can measure the responses. Like so:
<p><a href="<fileStore.core_Attachment>/monthly_2013_02/ASIO.jpg.5d69b8eaf08e2b24ee6c52237893a9a3.jpg" class="ipsAttachLink ipsAttachLink_image"><img data-fileid="28145" src="<fileStore.core_Attachment>/monthly_2013_02/ASIO.jpg.5d69b8eaf08e2b24ee6c52237893a9a3.jpg" class="ipsImage ipsImage_thumbnailed" alt=""></a></p>
In JRiver, another feature of the 64-bit Convolution engine eliminates the FIR filter's delay. This is an awesome feature for HTPCs and/or streaming media players, as the audio will not be delayed and the picture will stay in sync. In my case, I am not using a lot of filter taps, so my filter’s delay is 171 milliseconds, plus the low latency delay of routing the Lynx Hilo through digital loopback into JRiver’s audio/convolution engine.
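For a sense of where a figure like 171 milliseconds comes from: a linear phase FIR filter delays the signal by half its length. The sketch below shows the arithmetic. The tap counts and the 48 kHz sample rate are hypothetical, since I have not stated mine here; 16,384 taps at 48 kHz is simply one combination that lands at about 171 ms.

```python
def linear_phase_delay_ms(num_taps, sample_rate_hz):
    """Delay of a symmetric (linear phase) FIR filter, in milliseconds."""
    return (num_taps - 1) / 2.0 / sample_rate_hz * 1000.0

# Hypothetical tap counts and sample rate, chosen only for illustration.
for taps in (16384, 65536):
    delay = linear_phase_delay_ms(taps, 48000)
    print(taps, "taps ->", round(delay, 1), "ms")
```

Doubling the tap count at the same sample rate doubles the delay, which is why long correction filters can put audio noticeably out of sync with video unless the player compensates.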
This also means there is no delay when performing swept sine wave measurements, where the playback and recording are synchronized and a tracking filter ensures a high signal to noise ratio, typically 100 dB of dynamic range.
If there is too much delay introduced by the JRiver loopback feature, by latency through the A/D D/A device, or by overly long buffering times, both Audiolense and REW will try to “hunt” for the signal. But because of the excessive delay, neither program may be able to lock onto the signal, and the measurements are no good. This is what happened to me in JRiver 17, as the Convolution engine in that version does not have this feature. It is important to set all timing settings in the system for as low a latency as possible without the audio stuttering, popping, dropping out, etc.
It takes experimenting to arrive at the optimum system settings, whether it is WASAPI, ASIO, or Kernel Streaming, plus fiddling with both the A/D D/A device buffers and JRiver buffers… But it can be done and is worth it. Why? JRiver is my predominant music player, so a valid test is to run the test signal through the same audio engine I am listening to, either with 64-bit Convolution off (audio routed through Hilo's excellent headphone amp to my Senns) or Convolution on, routed through the amp/speaker/room playback chain.
Now that I have a repeatable way to take audio measurements through the entire signal path that I listen to music through, I think I can fine-tune the system even more. Stay tuned for more updates.
Cheers, Mitch<p><a href="<fileStore.core_Attachment>/monthly_2013_02/Stereo.JPG.3a2b35f52815d3fbff7d3f475e1cca63.JPG" class="ipsAttachLink ipsAttachLink_image"><img data-fileid="28143" src="<fileStore.core_Attachment>/monthly_2013_02/Stereo.JPG.3a2b35f52815d3fbff7d3f475e1cca63.JPG" class="ipsImage ipsImage_thumbnailed" alt=""></a></p>
I believe every piece of audio equipment has its own sonic signature. E.g. CD transports, cartridges, tone arms, turntables, preamps, amps, crossovers, speakers, interconnects: basically every component, part, and wire in (and around, e.g. power supplies) the audio signal path will have its own sonic signature, whether designed or not. Technically, a sonic signature is called a transfer function, but we will get to that shortly. I also believe there is a direct correlation between what we hear and what we measure. Again, this relates to the transfer function.
Let’s go back in history and see what the state of the art in audio design was in, say, 1953, almost 60 years ago. The audio designer's “bible” at that time was The Radiotron Designer’s Handbook: Radiotron Designer's Handbook on CD-ROM - Circuit Cellar, Inc.
The Radiotron Designer's Handbook was written as a comprehensive, self-explanatory reference handbook for the benefit of all who have an interest in the design and application of audio amplifiers. The book was designed to be as self-contained as possible.
What is interesting is that, even in 1953, there was a collection of state of the art valve amplification designs, including, amongst other notable classics, a single ended triode Class A design, regarded by many even today as the most musical sounding amplifier design. At 1400 pages, the book contains everything an aspiring amplifier designer needs to design and build a high fidelity amplifier.
Back in 1953, the definition of high fidelity was, “True fidelity is perfect reproduction of the original. In reality, true fidelity can only be regarded as an ideal to aim for.” I believe that definition still holds true today. As do many other principles, designs, mathematical theories, measurement standards, etc., all encompassed in this classic book on audio amplifier design.
For example, most of the types of audio distortion were already well known in 1953. Not only were they classified by type, but the book also described how to measure each one against a standard and how to interpret the results by characterising the sound quality in subjective terms.
Put aside for a moment the revolution that is digital audio, and we will see that analog audio design has not changed much from the state of the art in 1953. Of course, there have been many “material” changes that have dramatically increased the fidelity, but there have been only a few new “design” innovations.
For example, back in 1953, the fundamental classes or types of amplifier design were mostly figured out and well documented (see Electronic amplifier - Wikipedia, the free encyclopedia). Of course, new innovations produce new amplifier classes. However, each amplifier class has its own sonic signature. Technically, this sonic signature is called a transfer function.
An excellent definition of a transfer function, as related to audio design, can be found in Bob McCarthy’s most awesome book, Amazon.com: Sound Systems: Design and Optimization, Second Edition: Modern Techniques and Tools for Sound System Design and Alignment (9780240521565): Bob McCarthy: Books
“The response of a device from input to output is its transfer function. The process of comparison measurement between input and output is termed transfer function measurement. This could be a passive device, such as a cable, attenuator or filter, or an active analog or digital circuit. The transfer function of a device is obtained by comparing its input signal to its output signal.”
“The transfer function of a hypothetical perfect transmission system would be zero. Zero change in level, zero delay, and zero noise at all frequencies. Any device that passes signal will have some deviation in transfer level and transfer time and will add some noise in the process. Transfer function measurement can detect these changes and has a variety of ways to display them for our analysis.”
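To show what "comparing its input signal to its output signal" can look like in practice, here is a minimal sketch that divides the spectrum of the output by the spectrum of the input. The "device" is hypothetical (it halves the level and delays by 3 samples), the stimulus is a unit impulse, and a plain textbook DFT is used so the example needs nothing beyond the standard library. A perfect device in the sense of the quote would read 0 dB and zero delay.

```python
import cmath
import math

def dft(x):
    """Plain discrete Fourier transform (fine for short signals)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def transfer_function(input_sig, output_sig):
    """Complex ratio of output to input spectrum, bin by bin."""
    X, Y = dft(input_sig), dft(output_sig)
    return [y / x if abs(x) > 1e-12 else 0j for x, y in zip(X, Y)]

# Hypothetical device under test: halves the level, delays by 3 samples.
n = 64
stimulus = [1.0] + [0.0] * (n - 1)   # unit impulse: energy in every bin
response = [0.0] * n
response[3] = 0.5

H = transfer_function(stimulus, response)
gain_db = 20.0 * math.log10(abs(H[5]))                # level change at bin 5
delay_samples = -cmath.phase(H[1]) * n / (2.0 * math.pi)  # from bin 1 phase
print(round(gain_db, 2), "dB,", round(delay_samples, 3), "samples")
```

The magnitude of the ratio gives the change in level and the phase gives the change in time, which are the two deviations the quote describes.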
Wikipedia’s definition of transfer function is more formal, “A transfer function is a mathematical representation, in terms of spatial or temporal frequency, of the relation between the input and output of a linear time-invariant system.” Whew, have a look at the math on that page. What is interesting to note is that this has all been figured out long ago and part of the science of audio engineering.
What’s my point? If we go back to the class or type of amplifier design in my example, these are the corresponding transfer functions of Class A, AB, and B amplifiers:
<p><a href="<fileStore.core_Attachment>/monthly_2012_05/ClassATransferCurves.jpg.eb6175970d5125e79895927a86b2e582.jpg" class="ipsAttachLink ipsAttachLink_image"><img data-fileid="28090" src="<fileStore.core_Attachment>/monthly_2012_05/ClassATransferCurves.jpg.eb6175970d5125e79895927a86b2e582.jpg" class="ipsImage ipsImage_thumbnailed" alt=""></a></p>
Note that, because of these transfer functions and other inherent characteristics (or properties) of different classes or types of amplifiers, each class will have its own unique sonic signature. Personally, I like the sonic signature of Class A amplifiers. Subjectively speaking, they have excellent wide-band damping factor (i.e. control), especially into highly reactive loads:
<p><a href="<fileStore.core_Attachment>/monthly_2012_05/58cd9bc182d8f_PassA40DampingFactor.JPG.5acbb1ceaad2c767b756381d231c373b.JPG" class="ipsAttachLink ipsAttachLink_image"><img data-fileid="28091" src="<fileStore.core_Attachment>/monthly_2012_05/58cd9bc182d8f_PassA40DampingFactor.JPG.5acbb1ceaad2c767b756381d231c373b.JPG" class="ipsImage ipsImage_thumbnailed" alt=""></a></p>
Wide bandwidth response (100 kHz) with low distortion:
<p><a href="<fileStore.core_Attachment>/monthly_2012_05/58cd9bc18a2d0_PassA40Distortion.JPG.4b33d8308b5208ac46fde87c3e883e9a.JPG" class="ipsAttachLink ipsAttachLink_image"><img data-fileid="28092" src="<fileStore.core_Attachment>/monthly_2012_05/58cd9bc18a2d0_PassA40Distortion.JPG.4b33d8308b5208ac46fde87c3e883e9a.JPG" class="ipsImage ipsImage_thumbnailed" alt=""></a></p>
There is a direct correlation between what we hear and measure based on these transfer functions. Not only can the transfer functions be fully measured, but they can also be represented mathematically (in a software modeling tool), so the designer can model the sonic signature in software and run tests to verify the design. The designer can then implement the design, transforming it into a physical device (or a software runtime), then rerun the measurements and verify that the implementation meets the design. This has been a reality for quite a while.
The art is in how a particular designer uses the science to create sonic signatures (i.e. transfer functions).
Let’s take an interesting example. Remember the Stereophile Bob Carver challenge back in 1985? You can read about it here: The Carver Challenge | Stereophile.com, and I attached the PDF. It is a fascinating read.
The point is that Bob was able to modify and match the transfer function of his amplifier to a well-known audiophile amplifier. How did he do this? He used a null test technique to adjust the transfer function in his amp to match the transfer function of the reference amp. The deeper the null, the more his amp will sound like the reference amp. In fact, it got to the point where the Stereophile folks admitted defeat.
Therefore a person can design, model, and replicate anyone’s transfer function (i.e. sonic signature). This actually has been going on a long time in the world of pro audio.
I invite you to watch this 5 minute video from one hardware-turned-software manufacturer. I used their real physical hardware (e.g. Studer, 1176LN); now these are all emulated in software, including the most sought after characteristic, their sonic signatures:
UAD-2 Powered Plug-Ins Platform | Digital Audio Products and Plug-Ins | Universal Audio
Did you happen to notice the digital representation of the Studer A-800 analog tape recorder? I used to work with the physical machines and I know its sonic signature only too well. To “hear” that characteristic sound emulated in software is mind blowing.
Consider the Universal Audio 1176 limiter, the hardware version here: 2-1176 Twin Vintage Limiting Amplifier | Universal Audio, and the software version here: 1176 Classic Limiter Plug-In Collection. If you read the blurbs, you will understand exactly what I mean by sonic signature. They have both a real physical version and a virtual emulated software version which, I can assure you, have been null tested to death to ensure the units have the exact same sonic signatures.
The other reason this is relevant is that almost every piece of rock, pop, jazz, blues, and other music, whether multi-tracked or even live to two track, mastered, pressed, etc., has passed through this device. Everyone listens to the sound, but can you hear it? I have spent so much time with this device, as others have in the recording industry, that I hear it on almost every piece of music I listen to. In fact, for every tune I listen to, I mentally turn the dials on the ol 1176 to the settings I think I am hearing. That’s how much I hear it. I made a comment about this, that I can track the limiter on a Norah Jones album, and if you read the blurb on the 1176 hardware version, you will see they reference Norah Jones using the device. I try not to let this interfere with my enjoyment of the music.
So can measurements measure everything about audio? Of course. The above proves it. How else can you emulate the sonic signature of a piece of audio hardware in software? Transfer function testing is used all the time (including null testing) to make this happen. Btw, if you paid close attention to the video, you will have seen the amount of test gear.
For the software, The Audio Programming Book (The Audio Programming Book - The MIT Press) is a great place to start. This has already been figured out and commercialized in the pro audio industry. In addition to the Universal Audio software plugins, there are literally thousands of already developed and commercially available plugins in the largest plugin database in the world, KVR Audio: KVR: All Plug-ins / Hosts / Apps On One Page
Null testing is one proven technique to measure sonic signatures (i.e. transfer functions). It is backed up by science and math and has been utilized by designers for decades. The 1953 Radiotron book, the Universal Audio example, the Bob Carver example, and even examples on this very site prove it’s real.
If you are interested in how this really works, Bob McCarthy's book (Amazon.com: Sound Systems: Design and Optimization, Second Edition: Modern Techniques and Tools for Sound System Design and Alignment (9780240521565): Bob McCarthy: Books) is an excellent one. Of course, there are several other references, textbooks, DSP sites, etc., that walk you through the science and practice.
You can try the null test yourself, like esldude did in his post here. I have performed many null tests in my electronics and software programming career. Nowadays it is really easy as we have access to free tools like Audacity and Audio DiffMaker (or any Digital Audio Workstation for that matter) that have incredible dynamic range and measurement sensitivity.
Here is a null test that anyone can perform with this free software; you can easily repeat the same test and get the same results I did. In fact, you can listen and hear the differences yourself, as I attached the difference files to this post. Take any WAV file, convert it to an MP3, and, using either Audacity or Audio DiffMaker, compare the difference between the lossless and lossy file formats.
I performed these tests using both Audacity and Audio DiffMaker and got similar results. Note I used the LAME MP3 encoder at 320 kbps. Listen to the files; that is the absolute signal difference between WAV and MP3. There is nothing else. That's the entire transfer function.
The fact that there is a difference is no surprise to me, and that I don’t get a deep null is no surprise either. In fact, you should not get a deep null; you should get a null roughly in the -40 to -50 dB range, as that is the entire signal difference (or transfer function) between lossless and lossy file formats.
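For anyone who wants to put a number on a null, the depth is just the level of the difference signal relative to the reference, in dB. The sketch below shows the arithmetic on synthetic data: a sine wave and a copy with a 1% amplitude error stand in for the two decoded files. It does not read WAV or MP3 files, and the function names are mine.

```python
import math

def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def null_depth_db(reference, test):
    """Level of (reference - test) relative to the reference, in dB."""
    residual = [a - b for a, b in zip(reference, test)]
    return 20.0 * math.log10(max(rms(residual), 1e-30) / rms(reference))

def db_to_percent(db):
    """Convert a level in dB to a percentage of the reference amplitude."""
    return 100.0 * 10 ** (db / 20.0)

# Synthetic stand-ins for the two decoded files (not real WAV/MP3 data).
fs = 48000
reference = [math.sin(2 * math.pi * 440 * n / fs) for n in range(4800)]
test = [1.01 * s for s in reference]   # 1% amplitude error

depth = null_depth_db(reference, test)
print(round(depth, 1), "dB")
for level in (-40, -50, -70):
    print(level, "dB is", round(db_to_percent(level), 3), "% of the reference")
```

Seen this way, a -40 dB null means the difference signal is about 1% of the reference amplitude, and every further 20 dB divides that by ten.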
As a side note, remember that a transfer function, by definition, includes everything in the amplitude, frequency, and time domains. Not only can we measure it, but designers routinely model these transfer functions (in software) to produce certain sonic signatures, as evidenced by the video above. Null testing captures the absolute signal difference between the device under test and the reference. The science (read: math) proves this just like Ohm's law can be proven.
I am not saying that performing null testing is easy. A one sample difference can offset the results, so precision is required. Further, there are so many interacting components within even the simplest audio system that it is non-intuitive to figure out. Setting up null test experiments takes a lot of understanding of how things work. For example, sometimes people hear huge differences in speaker cables. I have heard it myself. However, upon closer analysis, with the particular amplifier I was using at the time, adding a little extra cable capacitance (measured!) was just enough to destabilize the amplifier into ultrasonic oscillation, caused by poor amp design (and a sagging power supply). So was it really the cable that sounded so different? No; in this case, the cable caused the amplifier to oscillate at ultrasonic frequencies, which produced a very harsh sound. Of course, since the cable was swapped, the immediate conclusion is, oh, it must be the cable. Well, no, not really.
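The one-sample point is easy to demonstrate. In the sketch below, the "test" signal is an exact copy of the reference, only one sample late, and the null is measured before and after lining the two up with a small brute-force cross-correlation search. The signals are synthetic, the helper names are mine, and the demo only handles the case where the test signal is late (a positive lag).

```python
import math

def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def best_lag(reference, test, max_lag=8):
    """Lag of `test` relative to `reference` that maximises correlation."""
    def correlation(lag):
        return sum(reference[i] * test[i + lag]
                   for i in range(max_lag, len(reference) - max_lag))
    return max(range(-max_lag, max_lag + 1), key=correlation)

def residual_db(reference, test):
    """Level of the difference relative to the reference, in dB."""
    diff = [a - b for a, b in zip(reference, test)]
    return 20.0 * math.log10(max(rms(diff), 1e-30) / rms(reference))

fs = 48000
reference = [math.sin(2 * math.pi * 1000 * n / fs)
             + 0.5 * math.sin(2 * math.pi * 3170 * n / fs)
             for n in range(2000)]
late = [0.0] + reference[:-1]   # identical signal, one sample late

lag = best_lag(reference, late)
aligned = late[lag:]            # drop the leading samples (positive lag only)
before_db = residual_db(reference, late)
after_db = residual_db(reference, aligned)
print("lag:", lag, "before:", round(before_db, 1), "dB")
```

With this made-up signal, a single sample of misalignment leaves a residual only about 13 dB below the reference, even though the two signals are otherwise identical; after alignment the residual vanishes. Tools like Audio DiffMaker take care of this alignment step for you.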
The much more interesting question is this: is the WAV vs MP3 difference audible in relation to the program level? In fact, that is the question, isn’t it? At what level of signal difference can our ears (read: brain) perceive the difference? If you took the time to read the Stereophile Bob Carver challenge, you will see that at a -70 dB null, the difference was inaudible to the listeners. Based on my own tests, I would concur. It seems that somewhere near the -50 dB range is the limit where, for me anyway, I stop hearing any difference (using the ABX plugin in Foobar2000).
For me, the reason I perform transfer function tests is because I am trying to get as close to reproducing the correct timbre in my listening room as possible. As I have mentioned in other blog posts, I believe the weakest link in the audio chain is the speaker to room interface. In fact, you can measure its transfer function, both in the frequency and time domains, using excellent (free) software like REW. And just like the emulation of the studio gear above, I can emulate the sound of a really nice room by using the power of digital audio, (DSP, DRC, and Convolution), but that is the subject for another post.
Does every piece of gear have a sonic signature? You bet. As barrows says, everything matters. The real question seems to be at what level it is audible. Do we have the measurement capabilities to measure all aspects of audio gear? Based on the Universal Audio example above, the answer must be yes. How else can a hardware sonic signature be modeled and emulated 100% in software if you can’t measure 100% of the hardware's sonic signature (a.k.a. transfer function)?
Audio is both an art and a science. It has both subjective and objective components. So really, there is no subjective versus objective; each is simply a different view of the same thing. Remember, even back in 1953 the Radiotron book had both objective measures and subjective descriptions for those measures. We still do that, but the art and science, along with the digital audio revolution, have advanced to the stage of perhaps diminishing returns.
For example, Bob Katz, in his book Mastering Audio: The Art and the Science (another awesome book that correlates what we hear with what we measure), states in the section on converters that, “All the converters mentioned here are at least A grade. The difference between an A and an A+ is extremely small, perceptible by only the most discriminating listeners and opinions vary on which is better.” You can see some of the transfer function measurements, they are literally textbook perfect – can’t get any better.
Soon we will see DAC emulation in software, where you can get your favorite DAC sonic signature :-)
Enjoy the music!
As an ex recording/mixing engineer/producer, http://www.thepikes.com/bio here are a few thoughts with respect to evaluating high resolution masters for sound quality.
Unfortunately, for most recordings, especially multi-track, there are many, many steps/paths from the mic to the final master we listen to. Most folks I think would be surprised to see the workflow. But that is another post.
The criteria I use to evaluate music sound quality are:
1) Musical performance 1st. Ultimately, if the performance is not the artist's best, then the rest really does not matter.
2) Who recorded/engineered/produced the tracks. If you follow closely in the music genre you are most familiar with, (for me it is rock/pop), you start seeing a trend as to who has the touch and who doesn’t.
3) Who mastered the mix(es). Again, you will see a trend if you spend the time and effort. However, this is the most variable part of the equation as there are many, many pressings and re-masters being released. The biggest problem here is finding out which pressing/master was the source of the re-mastering. Sometimes re-mastering means a re-mix from the original multi-track sources, sometimes not. Sometimes the master is a tape generation copy as the studio/artist does not want to touch the original (or the original has gone missing or...). Sometimes it takes quite a bit of research to find out all the relevant info.
4) Finally, if the material has been re-mastered for “hi res”, then it becomes even more critical to find out where the source master came from and what processing, if any, was done to it, and to check that the new re-master has not become a victim of the loudness war.
The Loudness War in Under 2 Minutes:
No question, if you are serious about the SQ of your music, then it does take effort to figure this out.
Some places to look:
Right here of course at CA!
Steve Hoffman's forums: http://www.stevehoffman.tv/forums/forumdisplay.php?f=2 are a primary source to look at, as the folks there go to great lengths and detail about performances and best pressings/masters (down to catalog numbers).
Another source is the Dynamic Range Database: http://dr.loudness-war.info/ At least you can do a quick check to see if the music/master you are looking for has a chance.
Here is an example walkthrough using the Police’s Synchronicity album that was recorded in 1983. http://en.wikipedia.org/wiki/Synchronicity_(The_Police_album)
But first, an aside. I happen to know the studio Synchronicity was recorded in: http://en.wikipedia.org/wiki/Le_Studio And you know what, the actual gear does not matter so much, as most professional studios have similar if not the same gear. If it was analog, Studer machines were king, but Sony (MCI) and Ampex were no slouches either. Everyone has the same complement of mics, outboard processing gear, etc.
Most important were the folks actually doing the recording, mixing, and producing to get the performance out of the artist and capture the sound. A trick I used to play on artists was to get them to run through the song a couple of times before we “rolled tape”. What the artist/band did not know was that I was rolling tape, and more often than not, the first take was the artist's best performance as they were excited, but relaxed, as they thought the tape was not rolling ;-) In one instance, the bed tracks I recorded for a demo ended up being used on the album and both the engineer and producer could not recreate the feeling or the sound of the band at another time and place – neither could the artists. Remember, a recording is a snapshot of history that may never repeat itself.
Back to the Police and Synchronicity. Have a look at: http://dr.loudness-war.info/index.php?search_artist=police&search_album=synchronicity There are at least 4 masters of the same “album”. If you click on the info button on each one, you can see the details. One is the original pressing/master on A&M from 1983, another is a reissue on A&M from 2003, and there are MFSL and SACD “audiophile” versions as well.
What is very interesting to note is that the original 1983 pressing has the most dynamic range, beating out the MFSL and SACD “audiophile” versions. That suggests that when the MFSL and SACD versions were re-mastered, some compression was applied, and who knows what else. Also note that DR is but one evaluation criterion, but a significant one.
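To give a feel for what "less dynamic range" means in numbers, here is a small sketch that computes a plain crest factor (peak level relative to RMS level). To be clear, this is not the algorithm behind the Dynamic Range Database figures, which uses its own procedure that is not reproduced here; the function name and test signals are my own assumptions. It only illustrates the general idea that limiting or clipping pulls the peaks down towards the average level.

```python
import math

def crest_factor_db(samples):
    """Peak-to-RMS ratio in dB (a simple stand-in, not the DR database value)."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(peak / rms)

# Assumed test signal: ten full cycles of a sine wave.
sine = [math.sin(2 * math.pi * n / 100) for n in range(1000)]

# The same signal hard-limited at half amplitude, as a crude "loudness war" master.
squashed = [max(-0.5, min(0.5, s)) for s in sine]

print(round(crest_factor_db(sine), 2))      # about 3.01 dB for a pure sine
print(round(crest_factor_db(squashed), 2))  # noticeably lower after limiting
```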
Which brings me to my point. Just because it says “hi-res” or MFSL or SACD does not automatically mean it sounds better than the original (or other) pressing/master. This was one example to illustrate my point. Btw, you can read all about Synchronicity on the Steve Hoffman forums (and others) to assist in making up your own mind without (somehow) purchasing all 4 versions yourself: http://www.bing.com/search?q=steve+hoffman+forum+police+synchronicity&qs=n&form=QBLH&pq=steve+hoffman+forum+police+synchronicity&sc=0-20&sp=-1&sk= Oh yeah, lots of discussion. Another point of evaluation is the recording/mixing engineer – Hugh Padgham – one of my favourites: http://en.wikipedia.org/wiki/Hugh_Padgham
So before you plunk down dollars on a high resolution format of whatever music you happen to enjoy, and if sound quality is really important to you, take a bit of time to do some research to ensure you are getting what you are expecting. Otherwise, you may be disappointed. Or in some cases, appalled to find out that your favourite artist/band's album, re-mastered in a high resolution format, has not only had the snot compressed out of it, but is clipping as well.
Best of luck!
While visiting a friend in The Big Smoke recently, I had an opportunity to assist in tuning a set of DIY speakers to his critical listening environment. This is a walkthrough of what we did, how it sounded, and lessons learned. I tried to present this in a step by step format so that, if desired, you could obtain similar results by following the same steps. Here is a pic of my friend’s rig:
Let’s inventory the gear. The DIY satellite speakers are sealed boxes designed using Thiele/Small parameters, each with a Seas 6 1/2” woofer that has a characteristically smooth frequency response. The tweeter is an interesting Philips ribbon design.
The passive sub-woofer is a sensitivity matched 10” Pioneer polypropylene driver with a dual voice coil that replaced the original Paradigm driver, whose surround had rotted away.
In the above picture, the bass cabinet is upside down showing the two vented ports on the bottom. The speaker is mounted such that if the ports are facing the rear wall, the speaker is facing the listening position (i.e. firing forward).
In my friend’s listening tests, plus mine, we found the bass response was smoother with the driver firing towards the rear wall, with the ports facing the listening position. This held both from a listening and, as it turns out, a measurement perspective.
The satellites and passive sub are powered by a 100 watt per channel vintage NAD power amplifier, using the preamp section of the NAD integrated amplifier to drive the amp.
Source music is iTunes on an iMac G5.
Here is a check list I assembled for this calibration task. As mentioned at the top of the article, I tried to write it in easy to perform steps.
Step 1 Check to see if the speakers and listening position can be set up in an equilateral triangle.
It is not the end of the world if it can’t be done for whatever reason. In my friend’s case, given the limited space, this is how it turned out. But once the speakers were tuned to the room, it was impressive to hear the rock solid 3D soundstage that these speakers reproduce. There are a couple of reasons for this that will be brought forward later.
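As a quick sanity check on the geometry, here is a small sketch of the equilateral triangle arithmetic. The function names and example distances are my own, chosen for illustration: one function gives how far the listening position sits from the line joining the speakers, and the other reports how far three measured sides are from being equal.

```python
import math

def listening_distance(speaker_spacing):
    """Distance from the line joining the two speakers to the seat,
    for an equilateral triangle with the given speaker spacing."""
    return speaker_spacing * math.sqrt(3) / 2

def triangle_error(left_to_right, left_to_seat, right_to_seat):
    """Largest mismatch between the three measured sides (same units as input)."""
    sides = (left_to_right, left_to_seat, right_to_seat)
    return max(sides) - min(sides)

# Hypothetical example: speakers 96 inches apart (centre to centre).
print(round(listening_distance(96.0), 1))             # seat is about 83.1 inches back
print(round(triangle_error(96.0, 96.25, 95.9), 2))    # 0.35 inch mismatch between sides
```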
Here is an example "blueprint" of the type of speaker to room interface setup required for playback if the goal is to hear what the mixing engineer intended to be heard as accurately as possible. This means trying to achieve symmetry as close to a tight tolerance as possible. Measuring the symmetry (i.e. physical distance) between speakers, walls, and listening position is the key to accurately reproducing (i.e. decoding) the soundstage.
Here is the specification.
Both Dolby Labs, “5.1 Channel Music Production Guidelines”, http://www.dolby.com/uploadedFiles/zz-_Shared_Assets/English_PDFs/Professional/4_Multichannel_Music_Mixing.pdf and “ITU-R BS.775-1* MULTICHANNEL STEREOPHONIC SOUND SYSTEM”, http://www.gearslutz.com/board/attachments/bass-traps-acoustic-panels-foam-etc/234800d1305311642-sticky-links-itu-r-bs775-1.pdf (and ANNEX 1 for stereo) have similar “specifications” when it comes to interfacing the speakers to the room.
Full disclosure, in a previous career, I worked as a recording/mixing engineer for 8 years. I was fortunate enough to have worked in very nice control rooms, with state of the art acoustics and monitoring. I was lucky to have been involved in the ground up build of this facility with Chips Davis: http://chips-davis.com/ I learned a lot. By far the best sounding critical listening environment I have ever heard.
Those Urei/JBL 813B "time aligns" were being driven by 2.4 kilowatts of Crown power. The bottom end was super tight and still went to 20 Hz. Using the mid and high frequency trim pots, we would adjust the tonal balance to typically that of the B&K target frequency response (discussed later). This was the only eq required as the room itself was designed and built to be as acoustically transparent as physically possible (LEDE certified). Almost all pro facilities since then are based off of this design. It was a revolution in control room design.
In order to certify a mixing, mastering, broadcast, video post-production, critical listening, etc., facility, it means implementing this specification. So if the goal is to accurately reproduce what the mixing engineer hears in their chair, it makes sense to use the same specification in our own critical listening environments.
Note this is how stereo and 5.1 music is mixed using this "speaker to room" specification. If we want to properly “decode” what was mixed, then our speakers to room interface should conform to this specification. The Dolby document, “5.1 Channel Music Production Guidelines” is a great read. Even though a bit dated gear wise, almost every piece of music we listen to has been recorded, mixed, and mastered using this speaker to room interface specification.
Step 2 “Voice” the speaker to room interface by using the following two methods. One is by ear the other using REW acoustic measurement software. Let’s go by ear first.
By Ear. I voice (i.e. play music on) one speaker at a time. I play music that has a solid bass line, preferably with a wide range of notes. Ideally a bass line that produces significant output at 20 Hertz and moves up and down the musical scale.
What we want to do is listen to the music while moving the speaker either closer to or further away from the rear wall. Here is the method I use. Looking at my friend’s setup, I would stand facing the side of one of the speaker cabinets. One ear is focused towards the listening position, with the other ear focused on the wall behind the speakers. So I am standing at 90 degrees (i.e. sideways) to how I would normally sit and listen to the stereo.
While the music is playing, on one speaker at a time, I am slowly moving the speaker either forward out into the room or backward towards the wall. What am I listening for? I am listening for the evenness of the bass notes. As the bass notes are playing up and down the musical scale, each note should sound equal in level (i.e. loudness) to the ear. Technically, I am trying to “hear” the room modes.
If one or more bass notes drop quite a bit in level, that means I am in a position in the room where those bass notes (i.e. range of frequencies) cancel each other out, a “null” spot. Conversely, if some bass notes sound loud or boomy relative to other notes, then I am standing in a position in the room where those bass notes add together. However, there are likely one or more spots in the room where the speaker’s bass notes sound the most even. That’s where the speakers should be placed, if the geometry works out. In other words, start with a rough equilateral triangle and fine tune from there.
Here is an illustration that animates this concept of waves adding or canceling depending on room position.
Further reading: http://www.acs.psu.edu/drussell/Demos/superposition/superposition.html
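For a rough idea of why moving the speaker towards or away from the wall behind it changes which bass notes drop out, here is a simplified sketch. It assumes a single reflection off that one wall, a listener on axis and well away from the speaker, and a speed of sound of 343 m/s; real rooms have many more reflections, so treat the numbers only as a guide. The reflected wave travels an extra 2 x distance, and the first cancellation happens when that extra path equals half a wavelength.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed (roughly room temperature)

def first_cancellation_hz(distance_to_wall_m):
    """Lowest frequency at which the wall reflection arrives half a
    wavelength late and cancels the direct sound (simplified model)."""
    extra_path = 2.0 * distance_to_wall_m
    return SPEED_OF_SOUND / (2.0 * extra_path)

# Hypothetical speaker-to-wall distances in metres.
for d in (0.25, 0.5, 1.0):
    print(d, "m ->", round(first_cancellation_hz(d), 1), "Hz")
```

In this simplified model, pulling the speaker further out from the wall moves the cancellation lower in frequency, which is one reason small position changes are so audible in the bass.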
Note this is relative to the listening position as well, which further complicates the mixing of waveforms. Ideally, the listening position is as far away from all reflecting surfaces as possible. This means towards the middle of the room. This is called a reflection free zone. Here is a picture of my critical listening environment, when I remove the table that is. The listening position is away from the rear and side walls as much as possible. The floor and ceiling and wall/glass behind the speakers are the early reflectors, but are dampened by the thick carpet I have between the speakers and listening position.
Back to my friend’s room. Another way to voice the speakers to room interface is by using a computer software measurement system. Fortunately, high resolution software measurement systems, too expensive ten years ago, are now state of the art for a commodity price or even free.
One such fine piece of measurement software is REW http://www.hometheatershack.com/roomeq/ , which unbelievably is free. In conjunction with a calibrated measurement microphone http://www.content.ibf-acoustic.com/catalog/product_info.php?cPath=30&products_id=35 this state of the art measurement system can assist in optimizing any speaker to room interface. In fact, given the computing power, sophisticated measurement software, and calibrated mic, the resolution of the measurements made by this system exceed what our ears are capable of hearing.
First we need to calibrate the sound card. What this means is that the REW software is going to measure the frequency response of the sound card and if there are any deviations from “flat” from 20 Hz to 20 KHz, the “calibration file” will compensate. This takes any of the sound card’s frequency response variations out of the measurement equation. That way we are measuring the true frequency response of the device under test, which in our case are the speakers in the room.
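Conceptually, applying a calibration file is just subtracting the known deviation of the measuring gear from the raw measurement, frequency by frequency. Here is a minimal sketch of that idea. It does not reproduce REW's file format or its interpolation between points; the function name and the numbers are made up for illustration, and it assumes both curves are given at the same frequencies.

```python
def apply_calibration(measured_db, calibration_db):
    """Subtract the sound card's own response (in dB) from a raw measurement.

    Both arguments map frequency in Hz to level in dB. Frequencies missing
    from the calibration data are treated as 0 dB deviation (assumption).
    """
    return {f: level - calibration_db.get(f, 0.0) for f, level in measured_db.items()}

# Hypothetical sound card that rolls off slightly at the extremes.
soundcard = {20: -1.5, 1000: 0.0, 20000: -0.3}

# Hypothetical raw measurement taken through that sound card.
raw = {20: -1.5, 1000: 0.0, 20000: -0.8}

corrected = apply_calibration(raw, soundcard)
for freq in sorted(corrected):
    print(freq, "Hz:", round(corrected[freq], 2), "dB")
```

After correction, only the part of the deviation that belongs to the device under test remains.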
This is the measurement of my Lynx L22 pro sound card, frequency response is flat out past 50 KHz. Since this measurement is taken in external loopback mode, the frequency response is the combined response of both the input/output analog amplifiers and ADC/DAC components. You achieve external loopback by connecting the output of your sound card back to the input of same sound card. You can find good instructions here: http://www.hometheatershack.com/roomeq/wizardhelpv5/help_en-GB/html/calsoundcard.html#top
As an aside, it is amazing to me that the onboard chip in my Dell laptop has an ADC and DAC that supports up to 24/192. This is the onboard chip! Soon we will see this level of DAC in smart phones.
The same calibration approach applies for the measurement microphone as well. We want to load its calibration file, supplied by the microphone manufacturer, into the REW software. One of the reasons I like this measurement kit http://www.content.ibf-acoustic.com/catalog/product_info.php?cPath=30&products_id=35 is because the mic preamp is calibrated to the microphone’s sensitivity. This means I can calibrate SPLs as well if I want to meet a proposed monitoring level specification like Bob Katz’s K System: “An Integrated Approach to Metering, Monitoring, and Leveling Practices” http://www.digido.com/level-practices-part-2-includes-the-k-system.html Excellent read and proposal in my opinion.
My mic comes with two calibration files: one for the mic’s on axis frequency response, for when the mic is pointing down the center line towards the speakers, and one for the 90 degree off axis response, for when the mic is pointing up. I have used both and for room measurements, I prefer the 90 degree diffuse field correction. This is also relevant if you are tuning a 5.1 system.
Here is my mic’s calibration file. This is the frequency response graph and partially what the 2 calibration files contain from 20 Hz to 20 KHz.
This is a pic of the microphone position used for the following measurements in my friend’s listening room. Note that the listening position is at the back wall, as opposed to the middle of the room, as space is limited. There is a sneak peek of the frequency response graph on REW’s software.
Now we are in a position to measure the frequency response, but first we need to set some levels and calibrate the SPL. That is covered in REW’s guide: http://www.hometheatershack.com/roomeq/wizardhelpv5/help_en-GB/html/measurementlevel.html#top and http://www.hometheatershack.com/roomeq/wizardhelpv5/help_en-GB/html/inputcal.html#top
But before we take any full range frequency measurements, we want to “voice” the speaker’s low frequency response to the room, like we did with our ears. In some respects, this is a validation that we have found the right spot based on our ear’s voicing of the room.
Aside from swept sine waves, we can use REW’s Real Time Analysis (RTA) function to pressurize the room (like blowing on a coke bottle to excite its resonances). We output pink noise for low frequencies and, while watching the real-time display, moved the speaker/sub combo to and from the rear wall to achieve the most even response possible. As it turns out, the spot where I voiced the speakers by ear is the same spot that measures the flattest frequency response in the low end. Here is what that looks like in REW:
Step 3 Now measure up and ensure we have the best possible symmetry to within a 1/4” tolerance.
Based on voicing the speaker to room interface for low frequency response, we need to ensure that each speaker is exactly the same distance from the rear wall. While I have used tape measures in the past, it is truly worth the investment in a digital laser measurer like http://www.amazon.com/Bosch-DLR130K-Digital-Distance-Measurer/dp/B001U89QBU
Why? Here is a bit of math to better understand why this is important: http://www.sengpielaudio.com/calculator-wavelength.htm This is a wavelength calculator. If I type in 20,000 Hz, the wavelength is 0.675 of an inch. That’s a little more than 1/2". So if my equilateral triangle is out, or off center, or the distances from the speakers to the rear wall are different, then “comb filtering” occurs. All it takes is for the tolerance to be out by slightly more than 1/2". It not only affects the frequency response, but the soundstage as well. Ethan Winer does an awesome job explaining this: http://ethanwiner.com/believe.html
Bottom line. Try and ensure that the equilateral triangle between the speakers and listening position is within a 1/4” tolerance. Same goes for the distance between the speakers and the rear and side walls. Try and make it as symmetrical as possible. I shoot for a 1/16” – 1/8” to a 1/4" max. tolerance. I learned this from Chips Davis. When that studio was built in 1986, he used a laser survey standard to lay out the room from the blueprints. He viewed it as the most critical part of the build and spent a great deal of time getting it right on the button for every measure.
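Here is the same wavelength arithmetic as a small sketch, assuming a speed of sound of 343 m/s (the online calculator may use a slightly different value). The second function, which is my own addition, estimates where the first comb filter notch lands when two otherwise identical sounds arrive with a given path length difference: the first cancellation is where that difference equals half a wavelength.

```python
SPEED_OF_SOUND_IN_PER_S = 343.0 / 0.0254  # 343 m/s converted to inches per second (assumed)

def wavelength_inches(freq_hz):
    """Wavelength of a tone in air, in inches."""
    return SPEED_OF_SOUND_IN_PER_S / freq_hz

def first_comb_null_hz(path_difference_in):
    """Frequency of the first comb filter notch for a given path
    length difference between two arrivals of the same sound."""
    return SPEED_OF_SOUND_IN_PER_S / (2.0 * path_difference_in)

print(round(wavelength_inches(20000), 3))   # 0.675 inch at 20 kHz

# Hypothetical path differences in inches.
for diff in (0.25, 0.5, 1.0):
    print(diff, "in ->", round(first_comb_null_hz(diff)), "Hz")
```

Under these assumptions, a 1/4" path difference puts the first notch above 20 KHz, while 1/2" or more brings it down into the audible range.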
It may take a couple hours of work, measuring distances, readjusting, re-measuring, rinse and repeat, down to below 1/4" tolerance. The reward? Most people have not heard this level of precision towards neutral tonal balance and decoding of the soundstage.
Here is a pic using the laser distance measurer to ensure both speakers are exactly the same distance (I got it down to a 1/16” tolerance) from the rear wall.
Step 4 Measure the frequency response of each speaker (and sub). Now that everything is configured to a tight tolerance, with the sound card and mic calibrations loaded into REW, we can take a full range measurement.
A couple of points to keep in mind while looking at this frequency response plot. One is that the REW software has considerably more measurement resolution than our ears do. “The ear tends to combine the sound within critical bandwidths, which are about 1/6 octave wide (historically thought to be 1/3 octave). This has led to the practice of averaging frequency response over 1/3 octave bands to produce beautiful-looking frequency response curves. In my opinion this is misleading.” From “Music and The Human Ear” Another excellent read in my opinion: http://www.silcom.com/~aludwig/EARS.htm
I agree. I use 1/6 octave smoothing on the measured frequency response, which more closely represents how our ears hear. Here is the same frequency response as above, except with 1/6 octave smoothing applied.
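For anyone curious what fractional octave smoothing does under the hood, here is a minimal sketch. It is not REW's implementation, which may weight or window the data differently; this version simply averages the power of every measured point within half of 1/6 octave either side of each frequency. The function name and test data are assumptions for illustration.

```python
import math

def smooth_fractional_octave(freqs, levels_db, fraction=6):
    """Average the power of all points within a 1/fraction octave wide
    window centred on each frequency (simple rectangular window)."""
    half_width = 2 ** (1.0 / (2 * fraction))
    smoothed = []
    for f in freqs:
        lo, hi = f / half_width, f * half_width
        powers = [10 ** (db / 10) for g, db in zip(freqs, levels_db) if lo <= g <= hi]
        smoothed.append(10 * math.log10(sum(powers) / len(powers)))
    return smoothed

# Hypothetical measurement: 48 points per octave starting at 20 Hz,
# flat except for one very narrow +10 dB spike.
freqs = [20 * 2 ** (i / 48) for i in range(200)]
levels = [0.0] * 200
levels[100] = 10.0

smoothed = smooth_fractional_octave(freqs, levels)
print(round(levels[100], 1), "dB raw ->", round(smoothed[100], 1), "dB smoothed")
```

A very narrow spike is pulled down a lot by the averaging, while broad features survive, which is the behaviour you see when switching between unsmoothed and 1/6 octave views.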
Major difference, and it provides a clue as to how much (displayed) resolution the software and components are capable of measuring.
I know it seems far away from a flat frequency response, as compared to the frequency response of the Lynx L22 sound card above. To quote Ethan Winer, "The room you listen in has far more influence on what you hear than any device in the signal path, including even the loudspeakers in most cases." I agree.
For example, here is the frequency response of one of my speakers in my listening room:
Similar to, but different from, my friend’s room. However, it is unfortunately typical of small room acoustics. All of our listening environments suffer from this. Some more than others. To look into this more, I highly recommend Bob Gold’s Room Mode Calculator: http://bobgolds.com/Mode/RoomModes.htm to plot the room modes, cut off frequency, and other important acoustical parameters to take into consideration.
To better understand the output of the calculator, Ethan Winer, again, does an excellent job of explaining this and what can be done in his article, “Acoustic Treatment and Design for Recording Studios and Listening Rooms”: http://www.ethanwiner.com/acoustics.html
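As a taste of what a room mode calculator works out, here is a sketch of the simplest case: the axial modes between one pair of parallel surfaces, which fall at multiples of c / (2 x L). This is only a small part of what the full calculator does (it also covers tangential and oblique modes). The room dimensions below are hypothetical, and the speed of sound is assumed to be 343 m/s.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed

def axial_modes_hz(dimension_m, count=4):
    """First `count` axial mode frequencies for one room dimension."""
    return [n * SPEED_OF_SOUND / (2.0 * dimension_m) for n in range(1, count + 1)]

# Hypothetical room: 5 m long, 4 m wide, 2.5 m high.
for name, size in (("length", 5.0), ("width", 4.0), ("height", 2.5)):
    modes = ", ".join(str(round(f, 1)) for f in axial_modes_hz(size))
    print(name, size, "m:", modes, "Hz")
```

Where modes from different dimensions land close together, or where there are wide gaps between them, you can expect the uneven bass that shows up in the measurements above.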
As an aside, when we were running REW full range sine wave sweeps, there was an audible “chirp” detected in the right speaker. It was for just the briefest moment of time, but still audible during the sweep. We ran the test a number of times while feeling around the cone of the Seas driver. As it turns out, while we were soldering a connection earlier, we forgot to tighten up one of the bolts that mounts the driver. So we were hearing the vibration of the speaker frame against wood at a certain resonant frequency due to one bolt not being tightened up. After tightening the bolt, and re running the sweeps, the chirp was gone.
Btw, other folks are obtaining similar results. CA’s own Nyal Mellor and Dallas Justice worked together to use acoustic measurements for speaker placement: http://www.whatsbestforum.com/showthread.php?6050-A-good-example-of-how-to-use-acoustic-measurements-to-place-speakers
CA’s wgb113 also wrote a blog post: http://www.computeraudiophile.com/blogs/Room-EQ-Wizard Nice frequency response!
Another interesting speaker room calibration discussion is Bruce Brown, The Pro Audiophile: http://www.whatsbestforum.com/showthread.php?5893-Speaker-Room-calibration
Step 5 Calibrate the speakers to a known target frequency response reference.
So now that I have measured the frequency response, now what? Considerable research and listening tests show that we do not want or prefer a “flat” 20 Hz to 20 KHz frequency response at the listening position. In fact, most of the research shows that we prefer a “target” frequency response at the listening position that slopes downward from 20 Hz to 20 KHz, typically down 6 dB at 20 KHz.
Reading this excellent paper from B&K, http://www.bksv.com/doc/17-197.pdf and based on the listening/measurement tests, the “optimum” target frequency response at the listening position is:
Translating the target frequency response into numbers: 0 dB at 20Hz, -0.5 dB at 200 Hz, -3.0 dB at 2 KHz, and -6 dB at 20 KHz. In addition, looking at the work Dr. Sean Olive http://seanolive.blogspot.ca/2009/11/subjective-and-objective-evaluation-of.html has done, both in measuring and listening tests, a very similar frequency response "slope" emerges as the preferred frequency response curve at the top of this diagram: https://docs.google.com/file/d/0B97zTRsdcJTfY2U4ODhiZmUtNDEyNC00ZDcyLWEzZTAtMGJiODQ1ZTUxMGQ4/edit?hl=en_US&pli=1
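If you want the target level at frequencies in between those four points, one simple approach is to interpolate in a straight line on a logarithmic frequency axis. That interpolation is my own assumption for illustration; the actual B&K curve may follow a somewhat different shape between the listed points, and the function name is made up.

```python
import math

# The four target points quoted above: (frequency in Hz, level in dB).
TARGET_POINTS = [(20.0, 0.0), (200.0, -0.5), (2000.0, -3.0), (20000.0, -6.0)]

def target_db(freq_hz):
    """Target level at freq_hz, interpolated linearly in log-frequency
    between the listed points (assumed interpolation)."""
    if freq_hz <= TARGET_POINTS[0][0]:
        return TARGET_POINTS[0][1]
    if freq_hz >= TARGET_POINTS[-1][0]:
        return TARGET_POINTS[-1][1]
    for (f0, d0), (f1, d1) in zip(TARGET_POINTS, TARGET_POINTS[1:]):
        if f0 <= freq_hz <= f1:
            t = (math.log10(freq_hz) - math.log10(f0)) / (math.log10(f1) - math.log10(f0))
            return d0 + t * (d1 - d0)

for f in (20, 100, 1000, 5000, 20000):
    print(f, "Hz:", round(target_db(f), 2), "dB")
```

A table like this is handy when comparing a measured response against the target by eye, or when typing target points into DRC software.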
It is also stated that listeners do not prefer a “flat” frequency response, but rather the downward sloped response. “Flat in-room response is not the preferred target” from Sean’s slides.
I concur. Here is the frequency response of one of my speakers at the listening position, corrected for perfectly flat. I am using Digital Room Correction (DRC) to achieve this flat response of 20 Hz to 20 KHz ±3 dB at the listening position. It is definitely too bright and has another consequence of moving the soundstage too far forward. Note these graphs use 1/12 octave smoothing.
Because I use DRC software (i.e in the digital domain), I can easily change the “target” frequency response in less than a few minutes and be listening to a different “calibration”.
Here I am following the “optimum” B&K and similar Harman target frequency response at the listening position.
To my ears, and obviously to the folks that took the tests in both articles, this frequency response “slope” at the listening position sounds the most natural. It’s not too bright, not too dull, but just right. It also has the best sound stage in my opinion, not too far back, not too far forward, but just right. To my ears, best resembles the tonal balance that I took for granted while working in those state of the art acoustic and monitoring facilities.
Because of the ease with which I can change the target frequency response, I have listened to dozens of different target frequency responses. By doing this, I have learned that a deviation of a few dB up or down from the B&K and Harman targets makes a big difference in tone quality (i.e. timbre) and sound stage. That is why it is critical to calibrate the sound card and have a calibrated microphone, as being a few dB off in calibration makes a big difference and can produce incorrect results.
So how does this help my friend’s room tuning? Well, he has tweeter level controls so I can “calibrate” the frequency response slope to match my target frequency response that I know is already calibrated to a known target. Here I have overlaid 3 frequency responses. The one in blue is my friend’s, the one in green is mine, and the one in red is mine also, but with DRC (Audiolense) enabled.
It is not a fair comparison as I am using DRC (which makes a significant difference). But the point is that I can adjust the high frequency response “slope” of my friend’s speakers to match the target for the best tonal balance and soundstage. Our listening tests confirmed that this is the right tonal shape at the listening position, even if the bottom end is a bit more variable in frequency response.
Step 6. Repeat as required. It may take a few iterations to fine tune the system as there are a lot of variables involved. For example, while the mic was pointing straight up, I used the wrong calibration file which accounts for a 4 dB difference at 20 KHz. Noticeable. You may want to space apart the iterations over days so that you can settle in and form an opinion as to the sound quality.
It has been my experience that there is a direct correlation to what we measure and see with graphs with what we hear. It just takes practice and patience to tune into both and correlate the two. If you take some DRC software for a spin, make sure you can quickly switch between targets. It is very interesting to see that small measured changes in frequency response slope, makes for large audible tonal and soundstage changes.
“The room you listen in has far more influence on what you hear than any device in the signal path, including even the loudspeakers in most cases”. As quoted from Ethan Winer. I would say by orders of magnitude difference based on my experience. This is where significant improvement to any existing sound system can be had for a relatively small investment.
Where to go from here?
If you are interested in Digital Room Correction (DRC), and advanced topics like True Time Domain correction (i.e. time alignment), I wrote a series of articles called, “Hear music the way it was intended to be reproduced” starting with: http://www.computeraudiophile.com/entries/6-Hear-music-the-way-it-was-intended-to-be-reproduced-part-1
Note with Audiolense DRC, your head is not “locked” into one mic position. Audiolense has the capability for multi-seat DRC so that you can tune the sweet spot to cover a couch area, for example:
If DRC is not for you, then try some simple passive acoustic treatments. This is a great starting point: http://www.gearslutz.com/board/studio-building-acoustics/610173-acoustics-treatment-reference-guide-look-here.html
Digital audio has been around since the early ’80s, some 30 years ago. According to Moore’s law, http://en.wikipedia.org/wiki/Moore%27s_law we are using computing hardware that contains roughly 100,000 times more transistors than in 1980. Today we have very advanced software working with 64-bit data paths, http://wiki.jriver.com/index.php/Audiophile_Info where noise and distortion are below our ability to hear them. That means Digital Room Correction, using digital FIR filters, adds no audible noise or distortion of its own when inserted into the playback chain.
REW is powerful acoustic measurement software, as it can also measure Energy Time Curves (ETCs), which will help you achieve a reflection free zone in your listening area. REW’s waterfall measurement capability will also show how sound decays in your room, typically over a 300 millisecond window. This is useful to see if bass traps are required, as this is the most common problem, along with early reflections...
My goal in writing this was to walk through a typical speaker-to-room calibration and demonstrate the benefits of such an effort in hopefully easy to perform steps.
After a month of listening, my friend is very happy with the end result (both audible and visual).
Updated with more info on Audio DiffMaker, plus ABX listening tests.
Lots of discussion around this article: 24/192 Music Downloads...and why they make no sense http://people.xiph.org/~xiphmont/demo/neil-young.html
I decided to run a science experiment using Audio DiffMaker to compare 16/44 to 24/192 format of the same master from Soundkeeper Recordings: http://soundkeeperrecordings.com/format.htm
I have used Audio DiffMaker before to compare FLAC vs WAV and comparing two bit-perfect music players on my computer audio playback system.
Here is the result of my 16/44 vs 24/192 experiment.
First a refresher on how Audio DiffMaker works:
There are also two papers, http://www.libinst.com/AES%20Audio%20Differencing%20Paper.pdf and http://www.libinst.com/Detecting%20Differences%20(slides).pdf The help file that comes with the program is very well written and goes into much more detail.
Updated - I wanted to provide more detail on how Audio DiffMaker works and why it is an important state of the art measurement tool in any audiophile’s arsenal.
Audio DiffMaker’s Differencing Process
Excerpt from the DiffMaker Help file on how the differencing process works:
While it may not be possible to show whether alteration is having effects directly on the listener, it is possible to determine whether an audio signal has been changed.
The existence of any changes to a digital recording of an audio signal can be detected by the simple process of subtraction, performed on a sample-by-sample basis. If each audio sample is the same, then subtracting one from the other leaves nothing (zero signal).
A recorded copy of the original signal (called the "Reference") can be mathematically subtracted from a recorded copy of the possibly changed signal (called the "Compared" signal). This results in a "Difference" signal recording that can be evaluated by ear or other analysis.
If the resulting Difference signal, when played as audio, is effectively silence or at least is not perceivable to a listener when played at levels in which it would occur when it was part of the "Compared" signal, then the investigator can with good confidence conclude that the change has made no audible difference.
The problems and operational, perceptual, or psychological complications about listening for whether sound is being changed are greatly reduced by transforming the task into the much simpler issue of listening for anything significant at all. The evaluation of the result is done by ear, and the user doesn't need to question hearing ability to use the tool. Audio DiffMaker test, encourages you to still "trust your ears".
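The subtraction described in the excerpt can be sketched in a few lines. This is only an illustration of the idea, not DiffMaker's actual code: the function names, the 1 kHz test tone, and the small 5 kHz "alteration" are all made up for the example.

```python
import math

def difference_signal(reference, compared):
    """Subtract the Reference from the Compared signal, sample by sample."""
    if len(reference) != len(compared):
        raise ValueError("signals must have the same number of samples")
    return [c - r for r, c in zip(reference, compared)]

def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def level_db(signal, reference):
    """Level of `signal` relative to `reference` in dB (-inf for a perfect null)."""
    sig = rms(signal)
    if sig == 0.0:
        return float("-inf")
    return 20.0 * math.log10(sig / rms(reference))

# Hypothetical test data: one second of a 1 kHz tone sampled at 48 kHz.
fs = 48000
reference = [math.sin(2 * math.pi * 1000 * n / fs) for n in range(fs)]
identical = list(reference)
# A copy with a tiny 5 kHz component added at 1/10000 of the amplitude.
altered = [r + 1e-4 * math.sin(2 * math.pi * 5000 * n / fs)
           for n, r in enumerate(reference)]

null_db = level_db(difference_signal(reference, identical), reference)
diff_db = level_db(difference_signal(reference, altered), reference)
print(null_db)             # identical files null completely
print(round(diff_db, 1))   # the added component shows up about 80 dB down
```

Identical samples leave nothing behind, while any change, however small, survives the subtraction and can be measured or listened to on its own.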
Audio DiffMaker is a state of the art differencing tool that automates this workflow from 5 years ago: http://forum.audacityteam.org/viewtopic.php?f=28&t=3873#p15071 One of the reasons it is state of the art is that the software can resolve timing differences down to fractions of a part per million (ppm): “The sample rates or speeds of player decks and soundcards are constantly drifting, if only by very small amounts. But even as little a change in sample rate as 0.01ppm (one hundredth of a part per million) can cause two otherwise identical files to leave difference sound after subtracting.”
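To get a feel for what a ppm-level clock error means, here is a rough back-of-the-envelope sketch. The 40 second clip length and the 192 kHz rate match the files used in this test; the ppm values other than 0.01 are arbitrary examples, and the function name is made up.

```python
def drift_in_samples(sample_rate_hz, duration_s, error_ppm):
    """Samples of slip accumulated over `duration_s` by a clock off by `error_ppm`."""
    return sample_rate_hz * duration_s * error_ppm * 1e-6

for ppm in (0.01, 1.0, 10.0):
    slip = drift_in_samples(192000, 40, ppm)
    print(f"{ppm:>5} ppm over 40 s at 192 kHz -> {slip:.4f} samples of slip")
```

Even a fraction of a sample of slip is enough to stop two otherwise identical recordings from cancelling perfectly, which is why the tool has to track rate errors this small.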
In order to compare the two formats, I had to upsample the 16/44 to 24/192. I used http://www.voxengo.com/product/r8brainpro/ to perform the sample rate conversion:
I used the default settings. Then I used Audacity to edit the waveforms so I am just looking at the first 40 seconds of each waveform.
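As a side calculation, the conversion from 44.1 kHz to 192 kHz is not a whole-number multiple, which is part of why a good resampler matters here. The sketch below only works out the ratio and the sample counts for a 40 second clip; it says nothing about how r8brain does the conversion internally, and the function name is my own.

```python
from fractions import Fraction

def resample_ratio(source_hz, target_hz):
    """Reduced up/down factors for converting source_hz to target_hz."""
    ratio = Fraction(target_hz, source_hz)
    return ratio.numerator, ratio.denominator

up, down = resample_ratio(44100, 192000)
print(up, down)                       # 640 147
print(40 * 44100, "->", 40 * 192000)  # samples per channel in a 40 second clip
```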
Then it is a matter of loading the two waveforms into Audio DiffMaker and extracting the difference.
According to DiffMaker, the difference file is -94 dB. I opened up the difference file in Audacity and here is what is left over:
Something definitely there. Here is the frequency analysis:
I have also included the difference file as an attachment to this post. Given that the majority of the content is at 20 kHz and above, I can’t hear anything on the difference file.
Note that this is one data point. I have used Audio DiffMaker for a while now and here is one tip that will help you get consistent results if you decide to try it out.
This is the output status window from the DiffMaker program as it is running. Note the arrow. It says that the sample rate error is low enough not to require adjustment. If the sample rate error is too high, there will be a notification to that effect on this line, and the program then tries to automatically align the tracks. However, there seems to be a bug in the program, as noted in one of my other posts, so the track alignment does not seem to work, or at least not very well. Therefore, when that happens, I am unable to get consistent results.
If you look in the status window and see that your comparison requires sample rate adjustment, then here is what you can do. Open up the waveforms in your favorite digital audio editor and ensure that both waveforms “start” at exactly the same time. That’s the trick. This is why I sample the first 40 seconds of the waveform, because in most cases, you do not need to line the waveforms up. Such is the case with the Soundkeeper files, as they both start at exactly the same time.
If you do need to line the waveforms up because you are recording the samples, then you can trim them later in your favorite digital audio editor. It is tedious as it may take a couple of passes before you get it lined up exactly.
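If lining things up by eye in an editor gets tedious, the start offset can also be estimated numerically. The following is a minimal sketch using brute-force cross-correlation. It assumes the second recording simply starts late by a whole number of samples, it uses random noise as stand-in "music", and all names are hypothetical. It is not how DiffMaker or any particular editor does it.

```python
import random

def best_lag(reference, compared, max_lag):
    """Return how many samples `compared` trails `reference` (0..max_lag).

    Brute-force cross-correlation: fine for short excerpts, slow for long ones.
    """
    best, best_score = 0, float("-inf")
    n = min(len(reference), len(compared)) - max_lag
    for lag in range(max_lag + 1):
        score = sum(reference[i] * compared[i + lag] for i in range(n))
        if score > best_score:
            best, best_score = lag, score
    return best

def align(reference, compared, max_lag):
    """Trim the late start off `compared` and cut both to the same length."""
    lag = best_lag(reference, compared, max_lag)
    trimmed = compared[lag:]
    length = min(len(reference), len(trimmed))
    return reference[:length], trimmed[:length], lag

# Hypothetical data: noise-like "music" and a copy that starts 37 samples late.
random.seed(1)
music = [random.uniform(-1.0, 1.0) for _ in range(2000)]
late_copy = [0.0] * 37 + music

ref, cmp_, lag = align(music, late_copy, max_lag=100)
print(lag)           # 37
print(ref == cmp_)   # True once the late start is trimmed off
```

This only handles a fixed start offset. It does nothing about the two clocks running at slightly different rates, which is a separate problem.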
Edited to add this section.
I ran another DiffMaker test, this time on Kote Moun Yo? samples from Equinox. I really enjoyed this recording as it definitely has ultrasonic information recorded (i.e. percussion instruments) and has crystal clear sound with a very low noise floor. I would say it is a state of the art recording. Great job Barry! http://soundkeeperrecordings.com/format.htm
I followed the same process as above. Again, the point of this is to either confirm or refute Monty’s claim that 16/44 is already better than our ears can hear and our sound system can reproduce. 24/192 should contain much more audio information than 16/44, so comparing 16/44 to 24/192 using DiffMaker will show exactly how much difference there is between the two. In order for me to digitally compare the 16/44 to 24/192, I up-sampled the 16/44 to 24/192. If the R8 Brain resampler I used is doing its job properly, there should be no waveform changes as there is no information being added (or lost!), simply a (lossless) file format change.
Here is what Audio DiffMaker reports as being the difference.
-100dB difference file. It is very similar to my first test above, showing I can repeat the results, even on a completely different song/master.
Here is what the Difference waveform looks like.
And frequency analysis.
As you can see, the frequency plot shows ultrasonic energy, even though it is very low in overall level. Again, I have attached the difference file so you can listen to it. I cannot hear the ultrasonic information.
Part 2 Listening Tests
Given that the difference between 16/44 versus 24/192 is ultrasonic energy, it is important to verify that the gear used can actually reproduce ultrasonic energy. I used my Lynx L22 pro sound card that has a ruler flat frequency response out to at least 50 kHz: http://i1217.photobucket.com/albums/dd381/mitchatola/lynxl22-1.jpg
I used my Sennheiser headphones with a custom Class A headphone amp that I built from the Audio Amateur from years gone by:
On the right is a toroid transformer feeding a regulated power supply and then my perf boards of the amp itself on the far left. I have measured the frequency response out to beyond 200 kHz. The headphone amp has enough clean power that you can place the headphones on the floor opened and crank it up like it was a boom box.
Next step is to verify that my gear can play ultrasonic information properly. These intermodulation test files provided by Monty’s article should be played first on your system to ensure you hear nothing at all. If you do hear tones, pops or clicks, that means the system under test is producing intermodulation distortion. http://people.xiph.org/~xiphmont/demo/neil-young.html#toc_intermod
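The reason ultrasonic tones can become audible on gear with intermodulation distortion comes down to simple arithmetic on the tone frequencies. The sketch below assumes two test tones at 30 kHz and 33 kHz purely for illustration (check the actual test files for the frequencies they use) and lists the usual second- and third-order products.

```python
def intermod_products(f1_hz, f2_hz):
    """Second- and third-order intermodulation product frequencies for two tones."""
    return {
        "f2 - f1": abs(f2_hz - f1_hz),
        "f1 + f2": f1_hz + f2_hz,
        "2*f1 - f2": abs(2 * f1_hz - f2_hz),
        "2*f2 - f1": abs(2 * f2_hz - f1_hz),
    }

def audible(products, limit_hz=20000):
    """Keep only the products that land at or below `limit_hz`."""
    return {name: f for name, f in products.items() if f <= limit_hz}

# Assumed example tones: 30 kHz and 33 kHz, both above the audible band.
products = intermod_products(30000, 33000)
print(products)
print(audible(products))   # {'f2 - f1': 3000}
```

Neither tone can be heard on its own, but a nonlinear playback chain can create a difference tone that sits squarely in the audible band.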
With my particular computer system, Lynx L22 and Class A headphone amp, I did not hear any tones, clicks or pops. Ok onto step 2.
ABX testing. For listening tests that provide any level of statistical probability, double blind is the only way to go. I used Foobar2000 http://www.foobar2000.org/ and the ABX plugin http://www.foobar2000.org/components/view/foo_abx I made sure that I clicked on the Hide Results checkbox before I started the tests.
First up, 16/44 vs 24/192.
Here was the problem with this test. I could just tell by a very small delay when my DAC was switching from 16/44 to 24/192. So I was able to “game” the test:
So I resampled the 16/44 to 24/192 so I could not hear the DAC switch sample rates.
Here are the results:
Obviously I cannot hear the difference. This correlates with the DiffMaker results as well. The difference is so small that I was guessing, even though I was trying not to.
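For anyone trying this at home, it helps to know how likely a given ABX score is by pure guessing. Here is a minimal sketch of that calculation (a one-sided binomial test). The scores in the loop are hypothetical examples, not my results, and the function name is my own.

```python
from math import comb

def p_value_guessing(correct, trials):
    """Probability of getting at least `correct` of `trials` right by coin-flip guessing."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

# Hypothetical scores, not the results from the tests in this post.
for correct, trials in ((5, 10), (8, 10), (9, 10), (14, 16)):
    print(f"{correct}/{trials}: p = {p_value_guessing(correct, trials):.3f}")
```

A commonly used threshold is p below 0.05. By that yardstick 8 out of 10 falls just short, while 9 out of 10 passes.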
Since I cannot (significantly) measure or hear the difference between 16/44 and 24/192, I tried one more experiment where there is a known difference – MP3.
I took the 16/44 and converted it using the best MP3 codec (LAME) and encoded at a 192 kbps bit rate. I used this bit rate as I listen to a lot of music on Zune and this is the default bit rate when I download the music onto my disk for playing. As you may imagine, there is a reason that Microsoft chose this bit rate and I will show why shortly.
Now comparing the 16/44 to the MP3 version produces the following Difference file in Audio DiffMaker:
And if I open up the waveform in Audacity:
I have included the Difference file again so you can hear the results. And it correlates very well with the two other MP3 difference tests I performed here: http://www.computeraudiophile.com/blogs/FLAC-vs-WAV-Part-2-Final-Results#comment-131768
So the $64 million question is, can I hear the difference in an ABX test between 16/44 and MP3?
While I did better than the 16/44 vs 24/192, it is in the territory of guessing :-) Listening closely, I thought I could hear a loss of transients on the percussion, but just barely perceptible to my ears.
Another way I can listen is to use Audio DiffMaker, where I can reconstruct the comparison track by adding the difference back to the reference. By incrementally increasing the difference track level, I can easily hear the difference when the difference track is boosted by about +6 dB.
I would hazard a guess that this is the reason why Microsoft (and others) choose 192 kbps with MP3, as it gives the best trade-off of fidelity versus file size. And likely the reason why most people don’t complain about it, as most people (including me) cannot hear a quality difference, even under ABX testing conditions.
Well, for me, my ears, on my equipment, my test and listening results confirm Monty’s article that 16/44 is enough for my ears. This is also supported by the science and engineering in the Digital Audio field: http://www.computeraudiophile.com/blogs/1644-vs-24192-Experiment#comment-135987
In fact, it may be that even high bitrate MP3s are enough resolution, but that’s another debate.
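The add-back idea is simple arithmetic: compared = reference + gain x difference, where a boost in dB converts to a linear gain of 10^(dB/20). Below is a small sketch with made-up three-sample signals. The function names are my own and this is not DiffMaker's code.

```python
def db_to_gain(db):
    """Convert a level change in dB to a linear amplitude multiplier."""
    return 10.0 ** (db / 20.0)

def rebuild(reference, difference, boost_db=0.0):
    """Reference plus the difference track, with the difference boosted by `boost_db`."""
    gain = db_to_gain(boost_db)
    return [r + gain * d for r, d in zip(reference, difference)]

# Hypothetical three-sample signals, just to show the arithmetic.
reference = [0.50, -0.25, 0.10]
difference = [0.01, -0.02, 0.00]

print(rebuild(reference, difference))       # 0 dB boost: the original compared track
print(round(db_to_gain(6.0), 3))            # about 1.995, i.e. roughly double
print(rebuild(reference, difference, 6.0))  # difference added back at about twice its level
```

So a +6 dB boost means the difference has to be roughly twice as large as it really is before it becomes easy to hear.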
Full disclosure, I am 53 years old and given the hearing loss versus age http://www.roger-russell.com/hearing/hearing.htm in the chart below, I may not be the best candidate for trying to hear ultrasonic audio information :-)
A quick hearing test from: http://www.phys.unsw.edu.au/jw/hearing.html confirms that I can hear to at least 12 kHz, but my hearing is down at 16 kHz. It is no surprise to me that I don’t hear ultrasonic audio information:
My perspective is this. If I was going to pick one cause to get behind in the world of music, it would not be over high resolution file formats. It would be the Loudness War.
Almost 30 years ago, the pop band The Police created a very popular album called Synchronicity: http://en.wikipedia.org/wiki/Synchronicity_(The_Police_album) With an overall Dynamic Range (DR) of 15 http://dr.loudness-war.info/details.php?id=12040 and the final cut on the album, Murder By Numbers, at a DR of 18, it is an excellent example of taking full advantage of the Red Book standard. The disc sounds fantastic. What happened since then?
The Loudness War in less than 2 minutes:
Given that this is CA, I would think everyone could correlate what they see in the visual representation of the waveform and what they hear. As I have discussed before, there is a direct correlation between what is measured and what is heard. It’s fundamental to the principles of audio. You can see and hear the difference, even over YouTube!
Final thoughts: All of the software used to perform both measurements and listening tests is free. Therefore, if you are curious and want to verify or refute Monty’s (and, as it turns out, my) claim, you can perform the same tests yourself.
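One rough way to put a number on what the waveform shows is the crest factor, the ratio of peak level to RMS level. Note that this is a simpler, related measure and not the algorithm behind the DR values quoted above. The swelling test tone, the boost of about 12 dB, and the hard clipping are all invented for illustration.

```python
import math

def crest_factor_db(samples):
    """Peak-to-RMS ratio in dB. Heavy limiting and compression push this down."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(peak / rms)

def hard_limit(samples, ceiling):
    """Clip every sample to the range -ceiling..+ceiling."""
    return [max(-ceiling, min(ceiling, s)) for s in samples]

# Hypothetical signal: a 100 Hz tone whose loudness swells and fades over one second.
fs = 8000
dynamic = [math.sin(math.pi * n / fs) * math.sin(2 * math.pi * 100 * n / fs)
           for n in range(fs)]
# "Loudness war" version: boost by about 12 dB (x4), then clip at full scale.
loud = hard_limit([s * 4.0 for s in dynamic], 1.0)

print(round(crest_factor_db(dynamic), 1))
print(round(crest_factor_db(loud), 1))   # noticeably lower than the line above
```

The squashed version measures louder on average but has far less distance between its peaks and its average level, which is exactly what the flat-topped waveforms in the video show.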
Happy listening!
Attachments: 16441vs24192difference.zip, SR002KoteMounYo16441vs24192Difference.zip, SR002KoteMounYo16441vsMP3Difference.zip
Recommended reading first. The reason is that I am not going to repeat the baseline components and measurements of my test gear already covered in that post.
Here is a high level block diagram of my test setup:
On the left side is my HTPC with both JRiver MC 17 and JPLAY mini installed. The test FLAC file is the same Tom Petty and The Heartbreakers, Refugee at 24/96 that I have been using for my FLAC vs WAV tests.
JRiver is set up for bit perfect playback with no DSP, resampling, or anything else in the chain, as per my previous tests:
JPLAY mini is set up in Hibernate mode and the following parameters:
On the right hand side of the diagram, I am using Audio DiffMaker for recording the analog waveforms off the Lynx L22 analog outputs of my playback HTPC. All sample rates for the tests are at 24/96.
Here is the differencing process used by Audio DiffMaker:
Audio DiffMaker comes with an excellent help file that is worth the time reading in order to get repeatable results. One tip is to ensure both recordings are within a second of each other.
As an aside, this software can be used to objectively evaluate anything in your audio playback that you have changed, whether that be an SSD, power supply, DAC, interconnects, and of course music players.
My assertion is that if you are audibly hearing a difference when you change something in your audio system (ABX testing), the audio waveform must have changed, and if it has changed, it can be objectively measured. I find there is a direct correlation between what I hear and what I measure and vice versa. I want a balanced view between subjective and objective test results.
First, I used JRiver as the reference and I recorded about 40 seconds of TP’s Refugee onto my laptop using DiffMaker. Then I used JPLAY mini, in hibernate mode, and recorded 40 seconds again onto the laptop. I did this without touching anything on either the playback machine or the recording laptop aside from launching each music player separately.
Just to be clear about what is going on, the music players are loading the FLAC file from my hard drive, performing a Digital to Analog conversion, and then passing the signal through the analog line output stage. I am going from balanced outs on the Lynx L22 to unbalanced ins on my Dell, through the ADC, being recorded by Audio DiffMaker.
Clicking on Extract in Audio DiffMaker to get the Difference produces this result:
As you can see, it is similar to when I compared FLAC vs WAV. What the result is saying is that the Difference signal between the two music players is at -90 dB. I repeated this process several times and obtained the same results.
You can listen to the Difference file yourself as it is attached to this post. PLEASE BE CAREFUL as you will need to turn up the volume (likely to max) to hear anything. I suggest first playing at a low level to ensure there are no loud artifacts while playing back and then increasing the volume.
As you can hear for yourself, there is a faint track of the music that nulls itself out completely halfway through the track and slowly drifts back into being barely audible at the end.
According to the DiffMaker documentation, this is called sample rate drift and there is a checkbox in the settings to compensate for this drift.
“Any test in which the signal rate (such as clock speed for a digital source, or tape speed or turntable speed for an analog source) is not constant can result in a large and audible residual level in the Difference track. This is usually heard as a weak version of the Reference track that is present over only a portion of the Difference track, normally dropping into silence midway through the track, then becoming perceptible again toward the end. When severe, it can sound like a "flanging" effect in the high frequencies over the length of the track. For this reason, it is best to allow DiffMaker to compensate for sample rate drift. The default setting is to allow this compensation, with an accuracy level of "4".”
Of course this makes sense as I used a different computer to record on versus the playback computer and I did not have the two sample rate clocks locked together. The DiffMaker software recommends this approach, but I have no way of synching the sample rate clock on the Dell with my Lynx card.
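The "nulls in the middle, comes back at the ends" behaviour can be reproduced with a toy example. The sketch below assumes a pure 1 kHz tone, a constant clock mismatch of 20 ppm, and two recordings lined up at their midpoint. All of those values are invented for illustration, and none of this reflects DiffMaker's internals.

```python
import math

def rms_db(samples):
    """RMS level in dB relative to full scale (1.0)."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(rms) if rms > 0 else float("-inf")

fs = 8000            # hypothetical sample rate, kept small so this runs quickly
seconds = 10
n_total = fs * seconds
tone_hz = 1000.0
error_ppm = 20.0     # hypothetical clock mismatch between player and recorder

mid = n_total / 2
reference = [math.sin(2 * math.pi * tone_hz * n / fs) for n in range(n_total)]
# Same tone captured on a clock running slightly fast, lined up at the midpoint.
stretch = 1.0 + error_ppm * 1e-6
compared = [math.sin(2 * math.pi * tone_hz * (mid + (n - mid) * stretch) / fs)
            for n in range(n_total)]
difference = [c - r for r, c in zip(reference, compared)]

def window_db(center_s, width_s=0.5):
    """Level of the difference track in a short window centred on `center_s`."""
    start = int((center_s - width_s / 2) * fs)
    return rms_db(difference[start:start + int(width_s * fs)])

start_db, middle_db, end_db = window_db(0.25), window_db(5.0), window_db(9.75)
print(f"start {start_db:.1f} dB, middle {middle_db:.1f} dB, end {end_db:.1f} dB")
```

The residual is lowest where the two recordings are best aligned and grows toward both ends, the same pattern the help file describes.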
Given that the Difference signal is -90 dB from the reference and that the noise level of my Dell sound card is -86 dB, we are at the limits of my test gear. A -90 dB signal is inaudible compared to the reference signal level.
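To put those numbers in perspective, here is a quick conversion from dB to amplitude ratios. It is just a sketch of the arithmetic using the two figures quoted above; the function name is made up.

```python
def db_to_ratio(db):
    """Amplitude ratio corresponding to a level in dB."""
    return 10.0 ** (db / 20.0)

residual_db = -90.0   # Difference signal level reported by DiffMaker
noise_db = -86.0      # measured noise level of the recording sound card

print(f"-90 dB is 1/{1 / db_to_ratio(residual_db):,.0f} of the reference amplitude")
print(f"-86 dB is 1/{1 / db_to_ratio(noise_db):,.0f} of the reference amplitude")
print(f"residual sits {noise_db - residual_db:.0f} dB below the recorder's own noise")
```

In other words, the measured difference is smaller than the noise of the device doing the measuring, so it cannot be resolved any further with this gear.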
I am not going to reiterate my subjective listening test approach as I covered it off in my FLAC vs WAV post.
In conclusion, using my ears and measurement software, on my system, I cannot hear or (significantly) measure any difference between JRiver and JPLAY mini (in hibernate mode).
April 2, 2013 Updated testing of JRiver vs JPLAY, including JPLAY ASIO drivers for JRiver and Foobar plus comparing Beach and River JPLAY engines. Results = bit-perfect.
June 13, 2013 Archimago's Musings: MEASUREMENTS: Part II: Bit-Perfect Audiophile Music Players - JPLAY (Windows). "Bottom line: With a reasonably standard set-up as described, using a current-generation (2013) asynchronous USB DAC, there appears to be no benefit with the use of JPLAY over any of the standard bit-perfect Windows players tested previously in terms of measured sonic output. Nor could I say that subjectively I heard a difference through the headphones." Good job Archimago!
Interested in what is audible relative to bit-perfect? Try Fun With Digital Audio - Bit Perfect Audibility Testing. For jitter, try Cranesong's jitter test.
Happy listening!
Attachments: jrivervsjplayanalogdifference.zip, jrivervsjplaydigitaldifference.zip
In part 1, I used a null test technique to show that both FLAC and WAV (lossless) file formats are identical. In this post, I have expanded the null test to cover off playing the same FLAC and WAV files dynamically from JRiver and capturing the audio waveform after the Digital to Analog conversion and analog line output stage. Here is a high level block diagram of my test setup:
For playback, I am using the exact same original FLAC and converted (by JRiver) WAV file I used in Part 1. It is Tom Petty and the Heartbreakers’ Refugee at 24/96. JRiver is set up for bit perfect playback with no DSP, resampling, or anything else in the signal chain. I used the native Lynx ASIO driver to communicate between the sound card and JRiver. All sample rates for the tests are at 24/96.
My Win7 64 Bit HTPC build is nothing special. No special power supply or SSD or interconnects.
Side note, for Windows users, it is always worthwhile to check your PC for latency with http://www.thesycon.de/deu/latency_check.shtml
I have tested the frequency response of my Lynx L22 sound card using REW http://www.hometheatershack.com/roomeq/ and noise levels, distortion, etc., using RightMark Audio Analyzer http://audio.rightmark.org/index_new.shtml
For capturing (i.e. recording) the audio waveforms, I used a Dell M4600 laptop and the onboard HD audio chip and driver. Here is the noise measurement of the onboard sound chip. Not as good as my Lynx card above, but a check to see that everything is in working order.
I used Audio DiffMaker http://www.libinst.com/Audio%20DiffMaker.htm for recording the waveforms that were coming off the analog outputs of my playback PC. Here is the process used by Audio DiffMaker:
As an aside, I should point out that you can use this software to objectively measure anything in your audio playback chain that you have changed, whether that be power supply, DAC, interconnects, music players, SSD, VST plugins, or whatever.
Remember, if you are audibly hearing a difference when you change something in your audio system (ABX testing), the audio waveform must have changed, and if it has changed, it can be objectively measured. I find there is a direct correlation between what I hear and what I measure. For me, to form any valid opinion about audio reproduction, I want to correlate my subjective results with my objective results and vice versa. I want a balanced view.
According to the Audio DiffMaker help file, the program is able to line up the waveforms if the two recordings are within 1 second of each other (protip).
Here I am capturing the first 40 seconds of TP’s Refugee in Audio DiffMaker:
I did this twice, once playing the FLAC and then the WAV, without making any changes on either computer.
To test the DiffMaker software (and everything else) is working correctly, I took the FLAC recording and compared it to itself. Theoretically, it should null itself out completely.
And it does. Ok so now let’s compare the two recordings, one FLAC and the other WAV:
What the result is saying is that the difference signal is almost -90 dB. I repeated the test ten times and obtained the same results.
You can listen to the difference track for yourself as it is attached to this post. PLEASE BE CAREFUL as you will need to turn up the volume (likely to max) to hear anything. I suggest doing this in volume level stages so you can verify there are no other artifacts while listening.
As you can hear for yourself, there is a faint ghost track of the music that nulls itself out completely halfway through the track and slowly drifts back into being barely audible at the end.
According to the DiffMaker documentation, this is sample rate drift and there is a checkbox in the settings to compensate for this drift:
“Any test in which the signal rate (such as clock speed for a digital source, or tape speed or turntable speed for an analog source) is not constant can result in a large and audible residual level in the Difference track. This is usually heard as a weak version of the Reference track that is present over only a portion of the Difference track, normally dropping into silence midway through the track, then becoming perceptible again toward the end. When severe, it can sound like a "flanging" effect in the high frequencies over the length of the track. For this reason, it is best to allow DiffMaker to compensate for sample rate drift. The default setting is to allow this compensation, with an accuracy level of "4".”
Of course this makes sense given that I used a different computer to record on versus the playback computer and I did not have the two sample rate clocks synched together. The DiffMaker software recommends this approach, but I have no way of synching the sample rate clock on the Dell to my Lynx card.
So when this is not possible, the DiffMaker documentation indicates to use the sample rate compensation.
However, when I tried the sample rate compensation, the DiffMaker program threw the following error:
I sent an email to the software manufacturer and will follow up once I hear back.
Given that the signal is almost -90 dB from the reference and that the noise level of my Dell sound card is -86 dB, we are definitely nearing the limits of my gear. Also, given that the dynamic range of most music material we listen to is less than 20 dB http://en.wikipedia.org/wiki/Dynamic_range#Audio it seems unlikely that I could hear the difference track relative to the reference level. That’s a 90 dB difference.
Subjective Listening Tests
In JRiver, I played the FLAC and WAV (and vice versa) several times through headphones and speakers. I did this sighted and blind. I also played back the recorded reference and compare files in Audio DiffMaker using headphones. Finally, I played back the Reference + Difference track.
In my subjective listening tests, I could not hear any differences between the FLAC and WAV files in any combination of the above. Not only from the playback machine but also the recorded tracks. They all sounded identical to me. There seems to be good correlation between objective and subjective results.
As a side note, I have been into audio and music for over 40 years. For 8 of those years I was a recording/mixing engineer where I was trained and relied upon to note very small audible changes. http://www.thepikes.com/bio The reason I am saying this is that, because of psychoacoustic effects http://en.wikipedia.org/wiki/Psychoacoustics our ears can be easily fooled http://en.wikipedia.org/wiki/Auditory_illusion or, put in a positive way, our ears adapt to changes very quickly.
In fact, most recording, mixing, and mastering engineers use these psychoacoustic effects on purpose. For example, the Haas effect http://en.wikipedia.org/wiki/Haas_effect#Experiments_and_findings is used to make the sound fuller and wider, add a sense of air, etc. All tricks played on our ears: http://www.algorithmix.com/en/kstereo.htm including on some remastered material we download from HDTracks.
So do we not trust our ears? I am not saying that. What I am doing is bringing a balance of both subjective and objective thoughts together so we can correlate what we hear with what we measure and vice versa. Again, when performing ABX listening tests, if you are hearing an audible difference, then the waveform must have changed. If the waveform has changed then we can measure the difference.
Btw, all of the software used in these tests is free. I would encourage you to download the software and try this out for yourself as it does not require any special equipment. Further, you can objectively quantify any differences throughout the audio chain in your playback system.
In conclusion, using my ears and measurement software, on my system, I cannot hear or (significantly) measure any difference between FLAC and WAV. That holds not only for the file formats, but for the rest of the audio playback chain as well.
Happy Listening!
Lots of discussion on the SQ of software music players on CA. I am a fan of correlating what I hear with what I measure and vice versa. In this post, I am proposing a way of measuring the difference between music players by expanding the “null test” I performed here http://www.computeraudiophile.com/blogs/FLAC-vs-WAV-vs-MP3-vs-M4A-Experiment
Rather than performing a null test on audio file formats, the single unit under test will be the music player, so when one music player is switched for another, that will be the only variable in the test setup. How to do this? If you are interested read on.
First a note. My plan is to provide a balanced view between what is measured and what is heard. I have no preconceived notions of what the outcome is going to be and I am not affiliated with any software or hardware audio manufacturer.
Second, I make the following assertion. In an ABX listening test, if you hear a difference, then the audio waveform must have changed in some way from the original. If it has changed then we should be able to measure the difference. So how do we capture the difference?
I intend to take a holistic systems approach and capture the measurement of the audio signal chain from music player to analog output. I intend to use my existing gear, http://www.computeraudiophile.com/blogs/Rock-n-Roller-s-Guide-Designing-Audiophile-Sound-System in which I have made many measurements including frequency response, http://i1217.photobucket.com/albums/dd381/mitchatola/lynxl22-1.jpg noise, distortion, http://i1217.photobucket.com/albums/dd381/mitchatola/L22noiselevel.jpg etc.
The scientific experiment will be to play a tune on the music player, through the DAC, loop it back through the ADC of the Lynx card, and then record the tune using Audacity or some similar recorder. I will do this twice (or more times) with the same music player to establish a null test baseline. Then I will repeat the same test with a different music player while all other variables stay the same. The only difference will be the music players involved.
Once I have the audio files captured, then it is following the same null test procedure as outlined here: http://www.computeraudiophile.com/blogs/FLAC-vs-WAV-vs-MP3-vs-M4A-Experiment with the possible exception of using this software http://www.ohl.to/about-audio/audio-softwares/hear-the-difference to perform the null test as I may need to line up the waveforms by using the delay feature in this software.
I don’t expect the files to completely null out as there will be analog noise (which will have some random component to it), which means that “theoretically” what is left over from the null test should be just noise. However, if there is a difference in waveforms, we will see it and hear it.
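The line-up-then-null idea can be sketched in a few lines. This is only an illustration under my own assumptions: the captures are plain lists of samples, the delay is a whole number of samples, and the synthetic "captures" below stand in for real loopback recordings. It is not how any particular null test program works internally.

```python
import math
import random

def best_lag(ref, other, max_lag):
    """Return the lag (in samples) by which `other` trails `ref`, found by
    maximising the cross-correlation over -max_lag..max_lag."""
    def corr(lag):
        total = 0.0
        for i, r in enumerate(ref):
            j = i + lag
            if 0 <= j < len(other):
                total += r * other[j]
        return total
    return max(range(-max_lag, max_lag + 1), key=corr)

def null_residual_db(ref, other, lag):
    """Subtract the aligned captures and return the residual RMS in dB
    relative to the RMS of the reference."""
    pairs = [(ref[i], other[i + lag]) for i in range(len(ref))
             if 0 <= i + lag < len(other)]
    resid = math.sqrt(sum((a - b) ** 2 for a, b in pairs) / len(pairs))
    level = math.sqrt(sum(a * a for a, _ in pairs) / len(pairs))
    return 20 * math.log10(max(resid, 1e-12) / level)

# Synthetic stand-ins: the second "capture" starts 7 samples late and carries
# a little random noise, as an analog loopback would.
random.seed(1)
capture_a = [random.uniform(-1.0, 1.0) for _ in range(2000)]
capture_b = [0.0] * 7 + [s + random.uniform(-1e-4, 1e-4) for s in capture_a]

lag = best_lag(capture_a, capture_b, max_lag=20)
residual_db = null_residual_db(capture_a, capture_b, lag)
print(f"delay: {lag} samples, residual: {residual_db:.1f} dB relative to the reference")
```

If the two captures differ only by noise, the residual should sit near the noise level. Anything well above that would point to a real difference in the waveforms.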
But then again, this is an experiment and anything can happen.
I would be interested to hear CA’s thoughts to this experiment.
Using the chart below, what subjective terms would you use to describe the tone quality (a.k.a. tone color or tone balance or timbre http://en.wikipedia.org/wiki/Timbre) of your sound system at the listening position?
This chart, used by permission from Bob Katz, Mastering Engineer extraordinaire, http://www.digido.com/ shows the subjective terms we use to describe excess or deficiency of the various frequency ranges.
As an aside, there is an important underlying concept here that needs some explanation. That concept is that audio is both an art and a science and that there is correlation between the two. In the case above, the art is the subjective descriptors and the science is the frequency range that the subjective descriptors map to.
It is my belief and experience (more on that later) that there is a direct correlation between the art and the science of audio. Put another way, how it sounds to our ears can be mapped to what is being measured and vice versa. I would suggest that this is a balanced view of our hobby where art and science come together and it's not just one or the other extreme.
Looking at the chart also provides us with common terminology in describing our sound systems or the sound of a particular song. So when we say the song sounds "bright" we know that the frequency range for this is likely 3 kHz to 10 kHz and has been lifted 1 or more decibels relative to the other frequencies on the scale. This can be measured using a real time spectrum analyzer VST plugin http://en.wikipedia.org/wiki/Virtual_Studio_Technology on either the Mac or PC (if your music player software supports VST plugins) using Blue Cat's free FreqAnalyst for example: http://www.bluecataudio.com/Products/Product_FreqAnalyst/
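For those who like to see the numbers, here is a rough sketch of what "lifted relative to the other frequencies" means. It measures the energy in the 3 kHz to 10 kHz band against a 300 Hz to 3 kHz band using a plain DFT. The two-tone test signals, the choice of reference band, and the function names are all assumptions of mine for illustration. A real spectrum analyzer plugin does considerably more (windowing, averaging, and so on).

```python
import cmath
import math

def band_power(samples, fs, f_lo, f_hi):
    """Sum of squared DFT magnitudes for the bins between f_lo and f_hi (Hz)."""
    n = len(samples)
    k_lo = math.ceil(f_lo * n / fs)
    k_hi = math.floor(f_hi * n / fs)
    power = 0.0
    for k in range(k_lo, k_hi + 1):
        w = -2j * math.pi * k / n
        bin_value = sum(x * cmath.exp(w * i) for i, x in enumerate(samples))
        power += abs(bin_value) ** 2
    return power

def band_level_db(samples, fs, band, reference_band):
    """Level of `band` relative to `reference_band`, in dB."""
    return 10 * math.log10(band_power(samples, fs, *band) /
                           band_power(samples, fs, *reference_band))

FS = 48_000
N = 480  # gives 100 Hz bins, so both test tones land exactly on a bin

def two_tone(treble_amplitude):
    """Hypothetical test signal: a 1 kHz tone plus a 5 kHz tone."""
    return [math.sin(2 * math.pi * 1000 * i / FS)
            + treble_amplitude * math.sin(2 * math.pi * 5000 * i / FS)
            for i in range(N)]

BRIGHT_BAND = (3000, 10000)  # the "bright" range named in the text
MID_BAND = (300, 3000)       # reference band chosen for this sketch

neutral_level = band_level_db(two_tone(0.5), FS, BRIGHT_BAND, MID_BAND)
bright_level = band_level_db(two_tone(0.5 * 10 ** (3 / 20)), FS, BRIGHT_BAND, MID_BAND)
print(f"lift in the bright band: {bright_level - neutral_level:.2f} dB")
```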
Listening to various songs and watching the spectrum analyzer at the same time, you will soon start correlating what you see on the analyzer with songs that sound bright or songs that sound warm, for example.
Let's get back to tonal quality. How do I train my ears to understand the tonal quality of my sound system? There is nothing like experimenting to help understand what is going on. As mentioned above, if your playback software supports VST plugins, there are literally thousands of plugins available for both the Mac and PC platforms: http://www.kvraudio.com/allpluginsononepage.php Note that most modern recordings and masters have been processed through a Digital Audio Workstation (DAW) that uses the same VST plugin technology. This has been going on since the mid 90's http://en.wikipedia.org/wiki/Digital_audio_workstation and it is very likely that most of the music you listen to has been processed through a DAW, with several VST digital plugins and an analog processing chain.
With the advent of unprecedented computer processing power for cheap, software designers of music players http://www.jriver.com/audiophile.html can take advantage of 64-bit processing, which is way beyond our hardware output capability of 24 bits, so the math itself has effectively zero impact on sound quality. You should have no fear in using any of these VST plugins, as the processing precision will not degrade the audio signal passing through them; the only change you hear is the one you dial in.
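Here is a small sketch of the precision argument. I am assuming that "64-bit processing" means double precision floating point and that samples are scaled to the range -1.0 to 1.0. The +3 dB gain and the test signal are arbitrary choices of mine.

```python
import math

LSB_24 = 2.0 ** -23  # one 24-bit step when samples span -1.0..1.0

gain = 10 ** (3 / 20)  # +3 dB, an arbitrary illustrative gain
samples = [math.sin(0.001 * n) for n in range(10_000)]

# Boost and then cut by the same amount, all in 64-bit floats.
round_trip = [(s * gain) / gain for s in samples]
worst_error = max(abs(a - b) for a, b in zip(samples, round_trip))

print(f"worst round-trip error: {worst_error:.3e}")
print(f"one 24-bit step:        {LSB_24:.3e}")
```

The rounding error of the processing lands many orders of magnitude below the smallest step a 24-bit converter can output, which is the point being made above.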
Here is a free parametric equalizer that is available on the Mac and PC that you can download and install. http://sonimus.com/site/page/downloads/ The beauty is that you can play with the controls while at the same time listening to the sound and hearing the effect in real time. You can then correlate the sound you hear at the frequency range and the corresponding subjective terms as described in Bob's chart above.
With this eq, you can roll off the extreme frequency ranges and listen to the effect on your speakers or headphones. Note the center slider. What I like to do is turn up the boost and sweep the frequency range while listening to the tonal differences in real time. It is a real ear opening experience. Then you can start correlating the sound of your audio system not only with subjective terms but with the exact frequencies involved. You can also flip back and forth between the eq and the frequency analyzer and correlate what you are adjusting with what you are seeing and what you are hearing, all at the same time.
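If you are curious what "turn up the boost and sweep" does to the signal, here is a sketch of one band of a parametric eq. It uses the widely published peaking filter formulas from Robert Bristow-Johnson's Audio EQ Cookbook. That is my assumption for illustration only; I do not know which filter design the plugin above actually uses.

```python
import cmath
import math

def peaking_eq_coefficients(fs, f0, gain_db, q):
    """Biquad coefficients (b, a) for a peaking eq band (Audio EQ Cookbook form)."""
    amp = 10 ** (gain_db / 40)
    w0 = 2 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2 * q)
    b = (1 + alpha * amp, -2 * math.cos(w0), 1 - alpha * amp)
    a = (1 + alpha / amp, -2 * math.cos(w0), 1 - alpha / amp)
    return b, a

def gain_db_at(freq, fs, b, a):
    """Magnitude response of the biquad at `freq`, in dB."""
    z1 = cmath.exp(-2j * math.pi * freq / fs)  # z^-1 on the unit circle
    num = b[0] + b[1] * z1 + b[2] * z1 * z1
    den = a[0] + a[1] * z1 + a[2] * z1 * z1
    return 20 * math.log10(abs(num / den))

FS = 48_000
PROBES = (100, 1000, 5000)

# Sweep a +6 dB boost across three center frequencies and print what it does
# at each probe frequency.
for center in PROBES:
    b, a = peaking_eq_coefficients(FS, center, gain_db=6.0, q=2.0)
    row = ", ".join(f"{p} Hz: {gain_db_at(p, FS, b, a):+.1f} dB" for p in PROBES)
    print(f"boost centered at {center} Hz -> {row}")
```

Each row shows the boost following the center frequency as it is swept, which is what you hear as the tone color changing in real time.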
Where is this leading to? Well, the frequency response of your sound system at the listening position not only describes tonal quality (or timbre) but also directly correlates to the sound stage presented at the listening position. This is an important concept to understand. For example if there is too much high frequency arriving at the listening position, not only is it a bit bright sounding or at the frequency extreme more "air", it correlates to the soundstage being "upfront" or too forward or lacking depth. Conversely, if the high frequency roll off (or slope or shelf) is too much, then not only does it sound "dull" but the soundstage is too far back or "distant".
Is there an optimum tonal balance or timbre or frequency response at the listening position that also has the right soundstage depth? Yes there is. Here is an excellent paper, complete with subjective descriptions and scientific measurements describing such a frequency response we should strive for at the listening position: http://www.bksv.com/doc/17-197.pdf
It has been my experience that this target frequency response curve measured at the listening position, not only provides the best tone quality from my sound system (not too bright or dark, but just right) but also the perfect soundstage (not too forward or too far back, but just right).
Another double blind test comes from Sean Olive. (Thanks hulkss for the link!) Interestingly enough, virtually the same target curve as the B&K curve above is the preferred spectral response at the listening position. It is the top curve in this comparison:
And here is the measured frequency response of my system at the listening position:
Using the B&K target curve as reference, my system is easily within ±3 dB from 20 Hz to 20 kHz at the listening position. I have experimented with dozens of target curves, including flat, which is etch a sketch bright with the soundstage in your face. I always end up coming back to the B&K curve.
Note all three graphs are virtually identical. This is no coincidence. If you read Bob Katz's excellent book, Mastering Audio - The Art and The Science, you will note that the frequency response of the monitors in his mastering studio also exhibits the same target curve. Finally, when I was working as a recording/mixing engineer at several studios, there was always at least one set of monitors "calibrated" to the B&K curve above in each control room.
Point is, if you want the best tone quality and soundstage from your audiophile system, calibrating your speaker to room interface to a reference target, like those described above, will give you the best possible result.
Most people that hear my rock and roll sound system http://www.computeraudiophile.com/blogs/Rock-n-Roller-s-Guide-Designing-Audiophile-Sound-System are surprised at the soundstage. Casual listeners comment on how they can easily hear the different layers of sound (i.e. the mix) whereas they cannot hear that on their own sound system. That is the first comment I get before, wow, does it sound clean, or punchy or whatever other subjective terms are used.
I wrote a series of six articles detailing how I went about achieving perfect timbre in my system, starting with: http://www.computeraudiophile.com/blogs/Hear-music-way-it-was-intended-be-reproduced-part-1 Given that the speaker to room interface has the biggest impact on the tonal quality of your sound system, the best investment you can make in optimizing your existing system is to measure its frequency response at the listening position and compare it to the optimum frequency response as described in the B&K article.
How to do this? For software, REW is fantastic: http://www.hometheatershack.com/roomeq/ and well supported. All you need is a calibrated microphone. It must be calibrated. One such mic is http://www.parts-express.com/pe/showdetl.cfm?Partnumber=390-801 and another is a kit, so you don't have to fuss with phantom power: http://www.content.ibf-acoustic.com/catalog/product_info.php?cPath=30&products_id=35 I use the latter and have had excellent results.
What's my point in all of this? Using free tools (save the measurement mic) you can experiment with tone quality to see what frequencies you may have too much or too little of, correlate that with what you are seeing in the way of spectrum analysis and frequency response graphs, and compare to a known standard. That way you can achieve the best tonal quality from your existing sound investment. Given that the speaker to room interface affects timbre and soundstage the most, a little effort as described in this article can produce huge returns on your existing sound investment.
I wanted to try an experiment of measuring any differences between various media file formats as described in the title. Consider this: if you are hearing a difference when you change media file formats (e.g. from FLAC to WAV), then the audio waveform must have changed, and if it has changed, then that change can be measured. While the waveform pictures in this article are technical, it really is a case of which picture does not belong with the others.
If you are comparing media file formats, the theory is that FLAC and WAV are lossless file formats and therefore should be identical. Meaning the waveforms are identical. My plan is to use Audacity and a well known procedure to measure the waveforms and identify any differences.
But first, how can we check our gear to see if everything is working as it should? I happen to have a function (i.e. waveform) generator and dual channel oscilloscope for viewing analog waveforms on an old school CRT.
Notice on the oscilloscope, the top trace is a sine wave. The top trace is monitoring a 20 kHz sine wave coming out of the function generator and going to the analog input of the Lynx L22 sound card in my PC. The Lynx card is flat from 15 Hz to beyond 50 kHz with no phase shift. The signal is then converted into digital format, routed through the mixing console, converted from digital back to analog, and monitored on the 2nd scope channel (i.e. bottom trace).
Ok, let's line up the two waveforms a bit and see how close they are after going through analog input amp --> ADC --> DAC --> analog output stage --> scope.
I blame any discrepancies on the +20 year old scope and the much older carbon unit operating it. Would be great to have a digital storage scope.
Ok, now let's record 60 seconds of that 20 kHz sine wave in Audacity at 24/96 and save it as FLAC.
This is what it looks like in digital format. Mind you I have zoomed waaaayyyy in, look at the time line. Amazing that it can be reconstructed back into a perfect sine wave again. A technological marvel.
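The reason those few dots per cycle are enough is the sampling theorem. As a rough sketch, the value between two stored samples can be recovered with Whittaker-Shannon (sinc) interpolation. The 96 kHz rate and 20 kHz tone are from the text; the window of 4001 samples and the probe position are arbitrary choices of mine. A real DAC uses a reconstruction filter rather than a literal sum like this.

```python
import math

FS = 96_000.0  # sample rate from the text (24/96)
F0 = 20_000.0  # test tone from the text

def sinc(x):
    """Normalised sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def reconstruct(samples, position):
    """Estimate the band-limited signal at a fractional sample position
    using only the stored samples."""
    return sum(s * sinc(position - n) for n, s in enumerate(samples))

# Only about 4.8 samples per cycle are stored.
samples = [math.sin(2 * math.pi * F0 * n / FS) for n in range(4001)]

position = 2000.37  # a point that falls between two stored samples
estimate = reconstruct(samples, position)
exact = math.sin(2 * math.pi * F0 * position / FS)
print(f"reconstructed: {estimate:+.5f}   true sine: {exact:+.5f}")
```

The small remaining error comes from using a finite number of samples in the sum, not from the sampling itself.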
I played back the 60 second 20 kHz sine wave and lined up the waves on the scope again, and they matched. I followed the same procedure for a WAV conversion of the FLAC and for an MP3 version. I did not want to make this repetitive, so I did not add the pics; they all look the same. My purpose was to see if everything is running as it should be.
To really see if the waveforms have been altered we are going to use Audacity and a procedure to take 2 waveforms, normalize their amplitude, massively zoom in and align the two waveforms, invert one, and mix them together. If they are identical, it will "null" out the two waveforms and there will be no signal left.
Here is the step by step procedure:
1) Import copies of both the MP3 and WAV files into the same Audacity project.
2) Amplify both to the peak volume, using the Amplify effect's default value on each signal separately.
3) Zoom way in on a distinct part of the waveform. Zooming in on something percussive makes this part much easier, as it is easier to see.
4) Use the Time-Shift Tool to line up the waveforms exactly. Keep zooming in and adjusting until you're lined up as accurately as possible. Make sure you can see each sample by the time you're done.
5) Now invert one of the signals. Either will work.
6) Highlight both signals and select Tracks -> Mix and Render or Project -> Quick Mix.
7) The signal you have left will be the difference between the two original files. Everything here is what you lost when you went to the MP3 format. It's mostly high frequencies and quick changes in dynamics (such as percussion).
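For anyone who prefers to see the arithmetic behind steps 2, 5, 6 and 7, here is a minimal sketch on lists of samples. It assumes the two tracks are already time aligned (steps 3 and 4). The synthetic "original", the half-volume copy, and the coarse rounding used as a stand-in for a lossy encode are my own illustrations, not real FLAC or MP3 data.

```python
import math

def peak_normalize(samples):
    """Step 2: scale so the largest absolute sample is 1.0 (0 dBFS)."""
    peak = max(abs(s) for s in samples)
    return [s / peak for s in samples]

def null_mix(a, b):
    """Steps 5 and 6: invert b and mix it with a, sample by sample."""
    return [x + (-y) for x, y in zip(a, b)]

def peak_dbfs(samples):
    """Step 7: peak level of what is left, in dB relative to full scale."""
    peak = max(abs(s) for s in samples)
    return -math.inf if peak == 0 else 20 * math.log10(peak)

FS = 48_000
original = [0.6 * math.sin(2 * math.pi * 440 * i / FS)
            + 0.4 * math.sin(2 * math.pi * 5000 * i / FS) for i in range(4800)]

# A "lossless" copy that differs only in playback level.
quieter_copy = [0.5 * s for s in original]
# A crude stand-in for a lossy encode: round every sample to a coarse grid.
degraded_copy = [round(s * 32) / 32 for s in original]

lossless_null = peak_dbfs(null_mix(peak_normalize(original), peak_normalize(quieter_copy)))
lossy_null = peak_dbfs(null_mix(peak_normalize(original), peak_normalize(degraded_copy)))
print(f"null of identical content:  {lossless_null} dBFS")
print(f"null of degraded content:   {lossy_null:.1f} dBFS")
```

Identical content nulls to silence, while any alteration of the waveform leaves a residual you can see, measure, and listen to.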
I chose Tom Petty's song Refugee in 24/96 FLAC format downloaded from the Tom Petty site as my "master". The reason for my choice is the attached Producer/Engineer note that comes with the download that states:
"FLAC is a “lossless” format, which sounds the same as the source files it was created from. We made the FLAC files from the same high-resolution uncompressed 24-bit 96K master stereo files we used for the vinyl and Blu ray versions of Damn The Torpedoes: Deluxe Edition. When we compared those files to the FLAC’s, the waveforms tested out to be virtually identical. With the right system, you’ll be as close to being there as we can get you."
So I am getting a copy whose waveforms are "virtually identical" to the original analog 2 track master. Great, I like that. I wish all remasters were like that. Give me an uncompressed and virtually identical waveform copy of the best mix/master that you can get your hands on. As we will see, one of the files is badly compressed and is the picture that does not belong with the others.
I used JRiver MC16 to convert the FLAC to WAV and MP3 (LAME 320 kbps). I also have the M4A iTunes version that I will compare.
The top waveform is FLAC, with the left channel on top and the right channel on bottom. And the bottom two waveforms are the left and right WAV channels. These are the original files unaltered.
Here I have zoomed way in on the waveforms to see the individual samples. Now I selected one of the waveforms and inverted the signal, and then applied the Mix and Render to produce this:
As you can see there is nothing there. It is totally nulled out. Let’s see if there is a frequency spectrum:
Ok, what about a frequency analysis:
Nothing. As expected.
Let's move to our comparison of FLAC versus MP3.
Here I have imported, amplified, zoomed, and lined the two waveforms up.
Ok, let's invert one of the waveforms and mix the 2 together:
Aha! Look, there is signal. That is as expected, since MP3 is a lossy format.
Let's look at the spectrum:
And the frequency analysis:
As you can see, mostly high frequency content. I have attached a 30 second snippet of the file so that you can download and hear the difference with your own ears.
Well, so far the experiment is going as anticipated.
So what about the M4A iTunes format? Well, let's look and see how far we get, given that the iTunes version is 2 seconds longer than the original.
Again, FLAC on top and M4A on the bottom. WOW! Look at the level of compression! And there is a time difference as well. The FLAC is 3:20 and the M4A is 3:22. So if the FLAC is from the original analog master, what is this M4A version? It looks like it is also on the CD as it says 3:22. But since I don't have a copy, I can't validate that claim. The other variable is that the M4A was downloaded from iTunes and I have no way of knowing if it has been processed in some way.
What could explain the 2 second difference? Total speculation, but I wonder if it was operator error or if the mastering lab had issues with their gear (out of calibration) or maybe the 2 different analog 2 track tape machines were not speed calibrated. Regardless, the iTunes version is horribly compressed. Time for an aside...
Why I want unaltered waveforms.
As someone who used to mix bands' sound live and on tape, I can tell you there is a mixing technique called riding the fader:
It starts off with an up volume snare roll and a nice intro guitar solo, then drops down in volume for the 1st verse, cranks back up during the chorus, then back down for the 2nd verse, then up for the 2nd chorus, then up for the bridge, then peaks the Hammond B3 and guitar solos a bit for a taste of what's to come, then back down for the 3rd verse, and then pulls out all the stops for the final chorus and the rock out at the end, riding the faders to max with the peak on the piercing high note of the guitar solo and then a quick ramp down fade out.
But see, that's the sound mixer doing his/her job in getting every last bit of emotional content out of the band's performance, whether live or recorded.
Have another look at the FLAC and M4A waveforms. It takes no special skill to see that the two waveforms do not match or even really resemble each other. But it is the same song! Somehow on the M4A, the low amplitude has been expanded, yet the peaks are lower... Welcome to the world of (badly done) compression.
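One simple way to put a number on "low amplitude expanded, peaks lower" is the crest factor, the ratio of peak level to RMS level. The sketch below is an assumption-laden illustration: the quiet-then-loud test signal and the crude boost-and-clip "compressor" are mine, and crest factor is related to, but not the same as, the DR values reported by dynamic range meters.

```python
import math

def crest_factor_db(samples):
    """Peak-to-RMS ratio in dB. Heavier compression gives a smaller number."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(peak / rms)

# A "performance" with a quiet verse followed by a loud chorus.
verse = [0.2 * math.sin(2 * math.pi * i / 100) for i in range(1000)]
chorus = [1.0 * math.sin(2 * math.pi * i / 100) for i in range(1000)]
dynamic_mix = verse + chorus

# Crude stand-in for heavy compression: boost everything, then clip the peaks.
squashed_mix = [max(-0.5, min(0.5, 3.0 * s)) for s in dynamic_mix]

print(f"dynamic mix:  {crest_factor_db(dynamic_mix):.1f} dB crest factor")
print(f"squashed mix: {crest_factor_db(squashed_mix):.1f} dB crest factor")
```

In the squashed version the quiet verse has been pulled up and the loud chorus pushed down, so the build from verse to chorus is largely gone.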
I would speculate that TP finished the recording and mixed the final two track analog master to death, over and over again, until it was the way they wanted it. I would think that they thought this is how it was going to get mastered on CD, i.e. as an unaltered waveform copy. I have not been able to confirm it yet, but if the copy on the CD is the same as the iTunes version, that's a world of hurt. I see on Wikipedia that there are at least 3 masters available, so who the heck knows what the iTunes one is, other than that it is atrocious.
I would further speculate that TP has had to live with a rubbish master job since 1979 or apparently it was remastered in the early 80's and again in 2001. Whatever, it was not till 2010 that TP finally got a copy of the original 2 track analog master tape at 24/96 and as it states the waveforms from the master and the 24/96 are virtually identical. I would even go out on a limb to say that the whole reason that Damn the Torpedoes was remastered was because of the incredibly compressed mastering(s). All speculation of course on my behalf.
The point is that the waveform on the iTunes version has been altered so much as to reduce the enjoyment factor to the point where I can't listen to it. Can you imagine what the band must have felt to have heard the iTunes version. Ruined. Again, total speculation.
You do not hear the fader riding in the iTunes compressed version, so there is no build. The verse and chorus sound the same level, as do the bridge, solos, etc. It sounds flattened, and some good emotional performance has been knocked out of the song. To add insult to injury, the waveform is so altered that it has also altered the soundstage. All of the psychoacoustic cues that the mixing engineer put in place have been destroyed. In the iTunes version, TP's voice sounds flat and two dimensional. In the 24/96 unaltered copy, you can really hear his voice sitting back in the mix, very 3 dimensional sounding, as are the rest of the depth cues in the remaster.
It is somewhat ironic that this is the reverse of the situation that plagues remasters today. In other words, TP got burnt on his master, but his remaster is a virtually identical copy of the original analog master. Whereas we are now getting remasters that have their waveforms altered from the original. Can you imagine a fan playing the heck out of their favorite band's records, then getting the CD versions, and now having an opportunity, 20 years later or whatever, to get a special hi-res remaster, only to find that it sounds nothing like what the fan has listened to for the last 20 years? Talk about disappointment and giving hi-res a bad name.
My example is the new CCR release. I am hesitant to buy it for exactly that reason because there is no information that I can see that states what master is being used for the remaster and what was the process used to remaster this version.
To me as an audiophile, I do not think that it is too much to ask for an unaltered waveform copy of the best master tape available for any given artist/song. Tom Petty did not think it was too much to ask. Just like the record companies did not get on-line music distribution, they are not getting that, with the advent of hi-res Blu ray video, people will pay a premium for hi resolution audio. Assuming it is the real thing and unaltered.
Was it the time or era that caused bad masters? I don't think so. Consider this: The Police's Synchronicity album was released in 1983. That's 28 years ago. The song "Murder by Numbers" has a whopping DR of 18. Stewart Copeland's drums sound incredible and in the last minute of the tune, Hugh Padgham, one of my favorite rock producer/engineers, pushes the faders up on Stewart's drum kit so high it's like he is playing right in your room.
At concert volume, the kick drum punches your stomach and the snare has such a crack, your eyes involuntarily blink every time Stewart hits it - awesome! Even more so for a 16/44.1 CD mastered a long time ago. Man, I would pay a pretty penny to get an unaltered waveform copy of the original master from Le Studio.
Back to work. Let’s compare FLAC vs M4A. When I amplified the waveforms, the FLAC took 2.1 dB of gain to reach a peak of 0 dB, whereas the M4A version required only 0.2 dB to reach peak. In other words, the M4A peaks already sit about 1.9 dB closer to full scale. Here it is zoomed in:
So when I inverted and mixed the 2 together, I got the following result:
Well, first of all, the red is showing clipping. Note in Audacity Beta 1.3.13, under the view menu is a menu item called Show Clipping. As a side note for the folks doing the music analysis thread http://www.computeraudiophile.com/Forums/Music/Music-Analysis-Objective-Subjective , always a good idea to have this Show Clipping checked on so that you can see if any hi-res download is clipped.
I might follow up later and see if I can pitch change the M4A to match the FLAC and re-run the test, but the bottom line is that the iTunes version is massively compressed and waveform altered. Attached is a 30 second snippet of this. Clearly you can hear the echo as one track is 1% out of sync with the other.
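As a back-of-the-envelope check on that 1% figure, here is a small sketch using the two running times quoted earlier (3:20 and 3:22). It assumes, purely for illustration, that the entire 2 second difference comes from a playback speed difference rather than from extra silence or an edit, which I have no way of confirming.

```python
import math

def to_seconds(mm_ss: str) -> int:
    """Convert a 'm:ss' running time to seconds."""
    minutes, seconds = mm_ss.split(":")
    return int(minutes) * 60 + int(seconds)

flac_len = to_seconds("3:20")  # running time from the text
m4a_len = to_seconds("3:22")   # running time from the text

stretch = m4a_len / flac_len
# If the M4A really plays slower, its pitch is lower by this many cents
# (100 cents = 1 semitone).
pitch_cents = 1200 * math.log2(flac_len / m4a_len)

print(f"M4A is {100 * (stretch - 1):.1f}% longer than the FLAC")
print(f"equivalent pitch offset: {pitch_cents:.1f} cents")
```

Under that assumption, matching the two would need a speed (or pitch) correction of about 1% before re-running the null test.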
I also have the Blu ray version of Damn the Torpedoes. I was going to compare that too, but following Chris's Blu ray ripping video, http://www.computeraudiophile.com/content/Guide-Ripping-DVD-and-Blu-ray-Audio-Using-Dark-Side-Moon-Immersion-Box-Set I could not get the software to extract the streams.
The big standout to me is how much the waveform has been altered on the iTunes M4A version, and how bad it sounds. I would pay a premium price to get the best "historical" remaster of any music without altering the waveform, which means no compression or noise reduction or eq filtering of any kind. I want the virtually identical copy. I want to hear the music the way it was intended to be heard, right from the mixing chair to my listening room.
For those that are hearing a difference between lossless file formats of any kind, it is not the file format. I humbly submit that the differences you are hearing are elsewhere. Who knows where, as it could be anywhere in or around the signal path. That's the issue. It is not that the files are different, it is how your system is interpreting those files that is different. The problem is diagnosing where the differences are. Or living with the difference, picking the format that sounds best to you, and moving on.
What is most important to me is the erosion of original performances where the original master is gone and we don't have an unaltered waveform of a "new" master. All those analog tapes are eventually going to fall apart. I hope we have unaltered waveform copies. I understand that famous paintings undergo restorations. But even then, the restorers are trying to restore the painting to its original condition.
That does not seem to be happening in the music remastering world. Those works of art are being altered, some so badly that they do not even resemble what the artist had in mind. The band's dynamics, nuances, soundstage, and tone have all been altered to the point of ruin.
Then again, if the restoration is an enhancement of the original because of modern technology (used professionally), who's to say... Little dip in eq there, little bit of noise reduction here, wee bit of expansion there - the ol nip and tuck. And if done really well, it should sound like an enhanced version of the original, restoring more dynamic range, reducing tape hiss, etc.
The idea would be to restore the tape as best as the technology allows without massive compression. Instead, if it is massively compressed, try dialling in the reverse settings in an expander to try and restore the original waveform (only approximately, since compression is a nonlinear process and cannot simply be undone). It takes a trained ear in compression to be able to apply expansion. Maybe we can get some of the original performance back.
And unless you were in the mastering room and A/B ing the changes of the ol nip and tuck, you may never know what processing occurred. If there are other remasters, you could compare those. But if it sounds good, does it matter?
This was an experiment. Others have had similar results. Bruce Brown from Puget Sound Studios: "If I put a wav file on one track and a FLAC file on the other track in Pyramix, I can't tell them apart, sighted or blind". And so has the Well Tempered Computer.
If you are hearing a difference when you change X, then the waveform must be altered in order to produce that difference. Therefore it can be measured. Don't take my word for it, try it yourself.
Happy listening!
Years ago, I was lucky enough to spend a decade as a live sound mixer and recording studio engineer in Western Canada. While most of my sound engineering experience was with rock bands, I had the opportunity to work with many talented musicians, recording and mixing many different types of music, including folk, country, jazz, choirs, and classical. I spent quite a bit of time working with compressors and limiters, so I thought I would share some of my experiences with them, along with mixing and mastering. This summary may help folks decide what to look for when selecting high resolution downloads.
Here is a story to illustrate why compression and limiting are inevitable in rock (actually most multi-track recorded) music. I remember recording a bass player in the studio using a direct box. I wish it were this one: http://www.musicvalve.com/directbox.html This would be your audiophile direct box - a work of art and beautiful sound. I figure Computer Audiophiles could appreciate that even in the pro sound category, some manufacturers are obsessed with sound quality. Back to the story: depending on how hard or soft the bass player played on the strings, his part produced a wide dynamic range.
I was using a Sony 24 track analog tape recorder with 2" tape, Quantegy (Ampex) 456 brand: http://www.quantegy.com/specsheets/PDF/456.pdf . The general idea is to record as "hot" as possible on the tape so you get the best signal to noise ratio. The tape had quite a bit of headroom (+10 dB), so it was popular to "saturate" the tape just a bit (ok, sometimes it got slammed), which also gave the music a "hot" sound. A VU meter is used to measure the level of the audio signal going to tape. (Side note: that's the short version. The long version is that each strip in the mixing console had a VU meter that was calibrated along with a corresponding VU meter on the tape recorder, x 24 tracks or more - the calibration took a long time.) The idea is to record, on average, around 0 dB on the meter, with +3 dB being hot, knowing the tape had over 10 dB of headroom.
The issue is that when the bass player played softly, the level on the VU meter was too low. But if I increased the gain on the console's input preamp to compensate, I lost control of the headroom. Sometimes, when the bass player played harder on the strings, the signal would "pin" the meter and the result would be preamp and/or tape distortion. Further, given the wide dynamic range, the bass really did not "sit" well in the overall mix.
You may be surprised to learn how sensitive the VU meters are. It does not take a wide variation in playing to swing from -20 dB to +3 dB on the meter. While I used the bass player to illustrate the point, music of any type, whether recorded with a direct box or with mics, naturally has a wide dynamic range.
Even the most consistent musicians I have worked with, in the sense of even playing, still produced a wide dynamic range. So what choices are there to get the best signal to noise ratio, without distorting the tape (or direct box, input preamp, digital recorder, etc.)? I can't very well tell the bass player to change his playing style or sound; you can imagine how nervous people get when they are being recorded in the first place. I have a lot of respect for musicians who go on tape.
Enter the compressor. I think Wikipedia does an excellent job of describing how a compressor works: http://en.wikipedia.org/wiki/Dynamic_range_compression I am not going to rehash the operation or details here. From a critical listening perspective, depending on the compression threshold, ratio, attack, and release, the compressor will “shape” the waveform passing through it. Folks with an ear for the compressor can set it just enough to maintain good dynamic range without distorting the tape or overloading a digital recorder. Judiciously used, the idea is that the (critical) listener would not hear the "compression" working.
The point being, in order to get a good (and even) level on tape (analog or digital), with good signal to noise ratio, a compressor is used more often than not on every single channel of a 24 track tape recording. This is especially true for drums, voices, or instruments that have wide dynamic range and are closely mic’d, as is typical of any studio/multi-track recording. High end mixing consoles like Neve and Solid State Logic (SSL) have a compressor/limiter section on every channel "strip" on the console. There is usually a line/mic preamp on the strip, along with EQ (usually parametric, with a full range of adjustment over multiple bands), headphone mix level, effects level, and then the channel fader or volume control that feeds the master buss or subgroup. That is a lot of electronics in the chain even before the signal hits the tape electronics and the tape itself.
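To make threshold and ratio a little more concrete, here is a small Python sketch of the static (steady state) curve of a simple hard-knee compressor. The function names and the default settings of -10 dB threshold and 4:1 ratio are illustrative assumptions, not settings from any particular unit, and attack and release are left out entirely.

```python
def compressed_level_db(input_db, threshold_db=-10.0, ratio=4.0):
    """Static hard-knee compressor curve.

    Below the threshold the level passes unchanged. Above it, every
    `ratio` dB of input produces only 1 dB more output.
    """
    if ratio < 1.0:
        raise ValueError("ratio must be >= 1")
    if input_db <= threshold_db:
        return input_db
    return threshold_db + (input_db - threshold_db) / ratio

def gain_reduction_db(input_db, threshold_db=-10.0, ratio=4.0):
    """How many dB the compressor is pulling the signal down."""
    return input_db - compressed_level_db(input_db, threshold_db, ratio)

# A part swinging between -20 dB and +6 dB (a 26 dB range) comes out
# between -20 dB and -6 dB (a 14 dB range) with these settings.
for level in (-20.0, -10.0, -2.0, 6.0):
    print(level, compressed_level_db(level), gain_reduction_db(level))
```

A real compressor also smooths the gain change over time using the attack and release settings, which is a large part of what gives each unit its audible character.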
After spending many years recording and mixing sound, one of the side effects is that I can easily hear compression/limiting on virtually every single song I listen to. I am so intimate with it that I can probably estimate the compressor settings used and in some cases I can even tell the brand of compressor being used. Here is an example of someone that intimately knows the sound of UREI's line of compressor/limiters: http://www.gearslutz.com/board/568476-post2.html
How does this affect mastering?
So the compressor is not only used, most of the time, on a track by track basis for recording, but sometimes even on certain tracks during mix down. This happens when the individual track still has too wide a dynamic range to "sit" in the mix when it was put to tape. Additionally, there is usually a 2 channel compressor/limiter used on the mix down buss as well, that then is also going to a 2 track (i.e. stereo) recorder, whether analog or digital.
So 24 track mix down --> 2 track master
Even after the final mix has been created, sometimes folks want to fiddle with the mix again. Rather than mix down from the 24 track, they take the 2 track mix, feed it through the console again, apply whatever "processing" is required, and generate another 2 track mix down (whether analog or digital). I am guilty of that myself.
2 track master --> processed 2 track master
That then goes to the mastering lab (or a copy does, for fear of the original being lost, if the engineer did not mix down from 24 track to 2 masters) to be mastered onto whatever media is used for distribution. The mastering process itself may repeat the processing described above. The Wikipedia article on mastering does a decent job of describing all of the steps: http://en.wikipedia.org/wiki/Audio_mastering As you can see, by the time the music gets to you, the actual copy may be several generations away from the original master. This is especially true when discussing analog generations.
So what about remastering?
Again, I feel Wikipedia does a good job here: http://en.wikipedia.org/wiki/Remaster and I am not going to rehash. But this is the point of Mastering Inception: it is not clear what the source of the material is. Where did it come from? How many generations removed is it? What was done to remaster it? Is it a different re-mix direct from the original 24 track (48 track, whatever) tape? Or is it a direct transfer from an analog 2 track master with no processing? Or maybe direct from 24 tracks to digital. Or is it all digital? Or did it get killed by the loudness war? These are excellent write ups on “The Death of Dynamic Range”: http://www.cdmasteringservices.com/dynamicdeath.htm and “What Happened to Dynamic Range”: http://www.cdmasteringservices.com/dynamicrange.htm
What is really disappointing to me is that awesome dynamic range was being delivered as far back as 1982 by digital recording pioneers like Peter Gabriel, whose all-digital album Security was recorded in his home: http://en.wikipedia.org/wiki/Peter_Gabriel_(1982_album)
Mastering inception? Remaster? Master? Premaster? You may wake up one morning and come to Computer Audiophile to find a thread saying HDtracks has come clean and the downloads were all CD copies that had been upsampled ;-) Just kidding, of course. However, I searched all over HDtracks' site, and if you read the wording in their Mission Statement: "It is our purpose to allow our customers access to the largest online library of DRM-free CD and DVD-Audio quality downloads complete with liner notes in a PDF format." And read from their About page, "Finally, audiophiles take note. HDtracks offers select titles in ultra-high resolution 96khz/24bit files. This is true DVD-audio sound quality for music lovers that demand the very best! " It almost sounds as if their sources are CDs and DVD-As and, who knows, have they been upsampled?
As a side note, it would be nice to see Fleetwood Mac's Rumours remixed from the 24 track master. http://www.soundonsound.com/sos/aug07/articles/classictracks_0807.htm But if you read closely, given what was going on and the tricks used, such as assembling bits and pieces from multiple 24 track tapes, that mix is likely never to be reproduced. Also consider that producers/engineers use track sheets to keep track of all the EQ, effects, compressors, etc., being used. SSL consoles and others used computers to automatically keep track of fader positions, mutes, and the like so that the mixing engineer had more hands. Finally, consider that most pre-90’s recordings were recorded and mixed on gear that either no longer exists or no longer exists in working condition. So most material from the past is likely to have come from an analog 2 track source. On the other hand, it is a snapshot of history, never to be repeated, that can be played over and over. Pretty cool if you ask me, provided of course that I am getting the best possible transfer.
Recently, I downloaded Tom Petty and The Heartbreakers - Damn the Torpedoes. Along with the download was a note from the Engineer and Producer. I have attached the note to this post, but here are the magic words I like to hear:
“We’re committed to finding the highest quality way to get music from Tom Petty and the Heartbreakers to you. We want you to hear it at home the same way we hear it in the studio.
We think the best audio option for your computer or media server is FLAC. FLAC is a high quality file that is the best sounding format for downloads. Unlike mp3, which discards elements of the audio to make the file size smaller, FLAC is a “lossless” format, which sounds the same as the source files it was created from. We made the FLAC files from the same high-resolution uncompressed 24-bit 96K master stereo files we used for the vinyl and Blu-Ray versions of Damn The Torpedoes: Deluxe Edition. When we compared those files to the FLAC’s, the waveforms tested out to be virtually identical.
FLAC captures the full dynamic range of the music from the quietest to the loudest sounds. Because of this (and because we are adding no digital compression) it will not sound as “loud” as a standard CD or mp3. To compensate for this, turn up the volume!”
So not all is lost. If we, the consumers, keep putting pressure on publishers of hi-resolution music to state where the source material came from and what the remastering process was, the quality should improve. And that really is the trick to discerning the sound quality of a new hi-res download: where did the source for the hi-res version come from, and how was it remastered? Sounds simple enough (no pun intended), but trying to get that info is like pulling teeth and entering Mastering Inception.
A couple of caveats. I glossed over a LOT here. Hopefully, it is just enough info to give you a flavor of compressors and limiters (the latter I really did not touch on, but virtually every LP that was mastered had a limiter in the chain to prevent over modulating the disc while cutting), and of mastering, premastering, and remastering, which in a lot of respects are all the same thing, just at different stages between the 24 track analog/digital tape and you.
Another caveat is that Classical or other 2 channel recordings are typically recorded/mastered with no compressors or limiters, or with as little as possible in the signal chain, and with mics and components that are as high-end as possible. I used to own a Sony PCM F1 Digital 2 track recorder when it first came out, along with a pair of Crown PZM microphones (plugged directly into the unit), that I got to use recording a number of choirs, orchestras, and chamber music, including a cappella. It was lots of fun, and most of the time was spent getting the mics positioned in a stereo array and nervously watching the level meter in hopes of getting a good signal to noise ratio without clipping or having to ask the artists to perform yet another take.
Finally, Digital Audio Workstations (DAWs) have revolutionized the recording industry. http://en.wikipedia.org/wiki/Digital_audio_workstation They have dramatically reduced the cost of having a “recording studio”. I have even heard marvelous uncompressed sound out of Garage Band http://www.apple.com/ilife/garageband/ with some good DI boxes and one or two decent mics. To a large degree, DAWs, being digital, don’t suffer nearly as badly when it comes to copying, as it all stays in the digital domain. Theoretically, each copy is the same. However, I have heard of mastering engineers who will put in an analog loop so they can use their favorite tube compressor to get “that sound”.
My point in all of this is that it really makes a difference to find out where the source of the hi-res copy came from and how it was remastered. Those two decision criteria should assist in your hi-res download choices. Easier said than done :-)
Mitch
Attachment: why_flac_dtt.pdf
Remember that feeling when you were at a concert or club and saw a band perform an awesome live show? The lights, sound, music, and crowd all moving as one. You could feel the music as much as you could hear it. It was loud, but not too loud, sounded clean, with good dynamics and punch. Good times. I wanted to design a high resolution (i.e. audiophile) sound system that reproduced that live sound experience in my listening room.
I want club and concert sound in my listening room via streaming audio (e.g. MOG, Spotify, Zune, etc.), downloadable content (e.g. HDtracks), my LP, CD, and DVD collection, or any media type for that matter. I love live music. If I can't see the band live, maybe I can get a live recording and reproduce that experience in my listening room.
In order to reproduce this wide range of content, we need computer audio. Additionally, I want to maximize available software to perform sound recording and mixing capabilities. I am not biased towards a Mac or PC, I have heard great sound reproduction from both. In my case, I chose a PC.
I am not going to get into the design specifics of the PC build just yet. It's a mid to high end PC. You would do well with the CAPS V2. The purpose of this post is to introduce a few key concepts and criteria in designing a rock and roll sound system, with enough resolution to be called audiophile quality.
When designing the sound system, the sound quality is as important as several other design parameters. Perhaps in future posts I can pick apart each component's design and explain the decision criteria used for that design.
Media Players - there are several to choose from. In my case, I wanted one that could manage and play my digital content with some streaming capabilities. I chose JRiver Media Center mostly for its audiophile specifications. Again, it does not have to be JRiver specifically, but a media player with similar technical specifications and some level of peer approval in the community.
Use external DAC or internal sound card? In my case, I wanted to support sound recording and mixing capabilities. Having been in the recording industry, I am familiar with professional sound cards like Lynx and RME. Bottom line, don't skimp on your digital to analog conversion and the ever important analog line output stage. Every DAC will have pros and cons. Whether external DAC or one on an internal sound card, a good place to start would be the CASH list, plus member reviews to find one that suits your specific requirements.
I chose the Lynx L22 professional sound card. My requirements were to have a stereo audiophile sound card that had a 16 channel mixer, with loopback capabilities, rock solid performance, low noise, and pro level digital and analog technical specifications. Again, I am not saying it has to be the L22; it is a good choice, but there are other brands and external DACs that are equally good choices. As a side note, I prefer ASIO drivers in exclusive mode as the driver bypasses the normal audio path in the operating system (in my case Windows 7 Ultimate 64bit) and communicates directly with the sound card.
To preamp or not to preamp? Given that the Lynx L22 could plug directly into my power amplifiers, I could have gone that way. However, for safety purposes, and because Lynx does not recommend using digital volume controls, I chose to DIY a passive preamp by purchasing an ALPS RK27 potentiometer (volume control) and mounting it in a simple enclosure with interconnects. Cost under $80.
Power amps. There are two simple design considerations for our rock and roll scenario. The first is that the sensitivity of the speakers you intend to drive determines how much output power you require. A typical audiophile speaker’s sensitivity is around 86 dB SPL measured at one watt at one meter distance. High efficiency speakers start at 96 dB. To our ears, that 10 dB difference sounds about twice as loud, and it takes 10x as much power to make up. This is significant, especially if you want to reach concert level peaks (maybe 110 dB SPL) cleanly, so that the amp or speaker is not under duress and still has headroom after that.
For example, in order for the 86 dB sensitivity rated speaker to play as loudly as the 96 dB speaker does at 1 watt at one meter, you will need 10 watts of power. To get the 86 dB speaker to play at 106 dB, you will need 100 watts, and at 116 dB, you will need 1000 watts of power. But by then the speakers will have likely blown up, along with our ears at that sound pressure level. The point is that the sensitivity of the speakers you intend to drive, along with their power rating, will ultimately determine how much power you require, plus headroom.
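The arithmetic above follows from the rule that every 10 dB of extra level takes ten times the power. Here is a quick Python sketch of that relationship, with a hypothetical function name, assuming a single speaker measured at one meter and ignoring room effects, listening distance, and power compression in the drivers.

```python
def required_power_watts(target_spl_db, sensitivity_db):
    """Amplifier power needed to reach target_spl_db at one meter.

    sensitivity_db is the speaker's rated output in dB SPL at
    one watt, one meter. Every 10 dB above that costs 10x the power.
    """
    return 10.0 ** ((target_spl_db - sensitivity_db) / 10.0)

# Compare an 86 dB speaker with a 96 dB speaker at three target levels.
for target in (96, 106, 116):
    low = required_power_watts(target, 86)
    high = required_power_watts(target, 96)
    print(f"{target} dB SPL: {low:.0f} W (86 dB speaker), {high:.0f} W (96 dB speaker)")
```

The 96 dB speaker needs one tenth of the power at every level, which is the whole argument for high sensitivity speakers in a concert level system.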
The other design point to consider when deciding on a power amp is the class of design and operation.
Having heard many different "classes" of amplifiers, I prefer the sound of Class A. Class A amplifiers are very inefficient and dissipate a lot of heat and therefore require large heat sinks and still run hot. Class A amps take an hour to heat up and stabilize before they sound their best. Oh, but I love that type of sound.
This is a DIY Nelson Pass A40 Class A amplifier. Nelson's article on this design, parts list, construction, and measurements makes for an interesting read. The transient response and damping factor performance of this amplifier was a major decision point for me. Said another way, the amp is very fast, yet very tight sounding. Nelson designed for this, and both the measurements and listening bear out his design. http://www.passdiy.com/pdf/a40.pdf
I owned a pair of these Luxman MQ3600's and they sounded great too, but different than Nelson's solid state Class A design above. It would likely be my second choice or something similar in the tube category.
Aside from sound quality and class of operation, ensure you design for enough power to drive your speakers to concert level with some headroom left over. As a rule of thumb, try to find out the power rating of your speaker. For example, if it is rated for 100 watts, and you want 3 dB of headroom in your amp before clipping, then you need a 200 watt amplifier. Then, based on the sensitivity, you can figure out how loud it will go with 100 watts input, plus a 3 dB buffer (i.e. 200 watts) for headroom.
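The headroom rule of thumb can be sketched the same way. This Python snippet uses hypothetical function names and assumes the same one watt, one meter sensitivity rating. It converts a headroom figure in dB into an amplifier size and estimates the level reached at the speaker's rated power.

```python
import math

def amp_power_for_headroom(speaker_rating_watts, headroom_db):
    """Amplifier power that leaves headroom_db above the speaker's rating."""
    return speaker_rating_watts * 10.0 ** (headroom_db / 10.0)

def max_spl_db(sensitivity_db, power_watts):
    """Estimated level at one meter for a given input power."""
    return sensitivity_db + 10.0 * math.log10(power_watts)

# 3 dB of headroom over a 100 W rating: just under 200 W,
# which is why the rule of thumb rounds 3 dB to "double the power".
print(amp_power_for_headroom(100.0, 3.0))

# An 86 dB speaker driven with its full 100 W rating.
print(max_spl_db(86.0, 100.0))
```

These are single speaker, one meter figures, so they are best treated as a starting point rather than a prediction of what you will measure at the listening position.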
Speakers. My design goal is to achieve club/concert sound in my listening room effortlessly, with headroom, and with enough resolution to be considered audiophile. This means selecting speakers sensitive enough to be driven to concert levels with low distortion, high accuracy, plus headroom.
Before I get into speaker selection and design, a public safety message on sound system listening levels. I speak from experience. I spent over 10,000 hours recording and mixing records and CDs years ago. I also spent thousands of hours on the road as a sound mixer for club and concert sound. Please, protect your ears. A good volume to listen at is 85 dBA SPL at the listening position. This is reasonably loud and happens to be the level at which our ears hear the most balanced sound from a frequency response perspective: The SouthSIDE Of The Tracks - Glen Stephan - Independent Recording Network
Additionally, we can listen to 8 hours of music per day at 85 dBA SPL. The chart below details the permitted exposure time at various sound pressure levels:
From: Noise - Occupational Exposure Limits in Canada : OSH Answers. Note that the charts are for continuous noise levels.
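Exposure charts of this kind are usually built from a criterion level and an exchange rate. As a rough sketch in Python, assuming the 85 dBA for 8 hours criterion mentioned above and a 3 dB exchange rate (the exchange rate is my assumption; regulations differ, and some use 5 dB), the permitted time halves for every 3 dB increase. The function name is hypothetical.

```python
def permissible_hours(level_dba, criterion_dba=85.0,
                      criterion_hours=8.0, exchange_rate_db=3.0):
    """Permitted daily exposure time for a continuous noise level.

    The allowed time halves for each exchange_rate_db above the
    criterion level (and doubles for each step below it).
    """
    steps = (level_dba - criterion_dba) / exchange_rate_db
    return criterion_hours / (2.0 ** steps)

for level in (85, 88, 91, 94, 100):
    hours = permissible_hours(level)
    print(f"{level} dBA: {hours:g} hours ({hours * 60:g} minutes)")
```

Under these assumptions, 100 dBA is down to about 15 minutes per day, which is why the meter matters at concert levels. Check the limits that apply where you live rather than relying on this sketch.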
Anyone serious about maintaining their ear safety while listening to concert level sound should purchase a sound pressure level meter. I would recommend a meter that is calibrated and meets industry standard specifications like this one: Amazon.com: Sound Pressure Level Meter: Electronics
When you turn your sound system up to concert volume, turn on the SPL meter and correlate your measurements to the table above to understand your exposure level. This doubly applies to headphones, as it is much easier to play headphones too loud because they sound so clean. Play your headphones at your typical volume, take them off, turn on the SPL meter and place the measurement mic as close to the headphone/earbud as you think it is to your ear. This is only an approximation, as the headphone or earbud is no longer sealed against your ear. Just as a serious photographer uses a light meter, so should we use a sound level meter.
Back to speakers. I have owned all sorts over the years: Maggies, Thiels, Acoustats, Quads, KEFs, Paradigms, Celestions, JBLs, Bose, custom builds, and several bookshelf and sub combos. Almost none of these speakers have a high enough sensitivity and/or power rating, and they run out of gas as they approach concert levels. What to do?
Horn loudspeakers are very sensitive, typically 100 dB (or more) at one watt at one meter. Horns? Aren't horns and audiophiles an oxymoron? Well, years ago, horns received a deserved bad rep for poor designs and for horns made out of materials that ring or color the sound. However, today's compression drivers and horns are computer designed and made out of inert materials that don't ring or color the sound. That old characteristic horn sound is gone. In my case, if you did not see the speakers with the grills removed, after hearing them, you would never believe they were horns. They sound smooth, never harsh or bright, regardless of how high I crank the amp. I run out of watts before the speakers run out. But with the watts I do have, I can reach a (theoretical) peak of 118 dB SPL.
If you want club and concert rock and roll sound, the bigger the speakers, the better. The engine analogy applies here: there is no substitute for cubic inches. You can't beat the laws of physics. I personally like large floor standing 3-way full range speakers with no subwoofers. Not that I am against subwoofers, but it takes a lot of effort to properly set up a sub and have it seamlessly integrate with the mains.
Just to comment a little more about bookshelf speakers and sub combos. This has been my experience. Here is a test you can perform on your own system. Put on your favorite rock and roll, blues, grunge, metal, rip, whatever - something that has dynamic and punchy drums and bass. Turn it up. Can you feel the punch of the snare drum in your chest? That snare drum thwack is centered between 100 Hz and 300 Hz depending on the type of snare drum and how it was tuned. http://home.earthlink.net/~prof.sound/id12.html
The problem is that the 6" or 8" mid/woofer in most 2 way bookshelf speakers does not do so well on snare drums, or drums in general. The physical speaker and cabinet sizes are too small to really come through with the proper timbre of a bottom end snare sound. The subwoofer has little impact in this frequency range. If you have played drums or stood in front of a drum kit as it is being played, you can "feel" the transient impact of the bottom end snare and tom tom sounds in your chest. That sensation is mostly missing in a bookshelf/sub combo set up, even though they are the most popular set ups around.
I prefer a big tight cone (accordion surround, not foam or butyl rubber surround) driver to cover the bass and lower midrange like a 15" driver from 500Hz on down. In a properly designed "big" cabinet, I can get low frequency rumble of the bass plus really good transient impact and punch from the drum kit. Here is an example of a large 3 way full range loudspeaker:
Cornwall III Floorstanding Speaker | Klipsch
If you read the specs you will see a full range frequency response with a very high sensitivity of 102 dB. Another audiophile has written an excellent test review of the Cornwall at: http://sites.google.com/site/mitjaborko/Klipsch_Cornwall_Test_Report.pdf?attredirects=0 Also attached.
Sure, it is a big box, and people immediately think boomy. However, if you look technically at Paul Klipsch's bass reflex box design, he tuned it for maximum transient response as opposed to low frequency extension (i.e. Thiele/Small QB3 alignment). This cabinet design, plus the 15" woofer at 4 ohms (an efficient enclosure design matched with an efficiently designed driver), all contribute to keeping up with the mid and high frequency horns in order to achieve that overall high efficiency and fast transient response. I have a great deal of respect for the "Dope from Hope". See attached PDF for some good reading from someone I feel was way ahead of his time. Note the dates of the newsletters.
A similar design, but with updated crossovers, compression drivers, and horns, is Bob Crites' custom designed "Cornscala". The Cornscala takes the best parts of the Klipsch Cornwall and La Scala and puts them together. A good read about these is here: Cornscala? | Critesspeakers.com As my wonderful wife said, as we were heaving 125 lbs of speaker into the listening room, "it's bigger than the dishwasher... it's bigger than the stove!"
I own these and could not be happier. Note the midrange horn is as big as the 15" woofer. The dynamic impact and solid punch of these speakers are nothing short of feeling like being at a live concert or club. They can effortlessly produce 110 dB peaks with no sign of strain or distortion. Using state of the art Digital Room Correction, they reproduce sound from 20 Hz to beyond my hearing range within ±3 dB at the listening position.
Big floor standing speakers used to be popular, and then bookshelf speakers and subwoofers more or less took over the market. That is too bad. As mentioned earlier, the range of 100 Hz to 300 Hz, where drums (and the upper range of bass guitar) have their maximum punch, is hard to get with this combo. With subs, I can feel the low end bass and bottom of the kick drum, but the chest impact of the drums is too high in frequency for a sub to participate in, and a 110 dB peak is (usually) too much for a 6" bookshelf speaker to handle. I am not saying it isn't possible, but it is problematic.
In my case, the design choice of a power amp that has fast transient response, yet total control over any ringing (i.e. damping factor), combined with really high sensitivity speakers, with low distortion, and fast transient response, makes for very dynamic sound output. Especially relevant when the majority of our sound sources are overly compressed or enhanced for iTunes. But when I get a good dynamic range recording, the reproduced sound impact and punch can be both felt and heard. Feels like being at the club or concert.
A couple of caveats to this article. This is high level design guidance and I have glossed over many details. The critical design factors in reproducing club or concert sound that you can hear and feel are full range speakers that are big enough and sensitive enough, and enough amplifier power to drive them to concert levels, effortlessly, with headroom.
Modern horns and compression drivers that have been computer designed and made out of modern materials bear little resemblance to the "horn" sound of the past. Today you can get high sensitivity, low distortion, fast transient response, and high resolution sound quality in this type of speaker.
Additionally, almost all modern cinema, live club, and concert sound is reproduced by arrays of horn speakers. In some cases, using the exact same pro sound components as used in my own speakers.
Remember, when you are enjoying concert sound pressure levels, keep your sound level meter on and check the chart for exposure times. Your ears are irreplaceable.
Enjoy your concert!
Attachments: DOPEfromHOPE.pdf, Klipsch_Cornwall_Test_Report.pdf
If you have followed this series on a quest for proper timbre, I have reached a conclusion. From Wikipedia: in psychoacoustics, timbre is also called tone quality and tone color. No question, music source, electronics, interconnects, power, etc., all have an impact on timbre. However, the biggest factor in reproducing proper timbre, by orders of magnitude, is the speaker to room interface, which is limited by small room acoustics: http://www.gcmstudio.com/acoustics/acoustics.html
I was lucky to have worked in several recording studio control rooms. The LEDE control rooms are custom designed to produce the absolute best possible sound quality (i.e. timbre). Certified LEDE studio control rooms cost hundreds of thousands (and into the millions) to design, build, and certify. After logging over ten thousand hours of recording/mixing time in these rooms, I find that every other room sounds (very) poor in comparison, including my current room.
As much as I would like to, building a properly designed LEDE room, outfitted with state of the art diffusers, absorbers and bass traps, is not feasible. So when I designed my computer audiophile system, I looked into Digital Room Correction (DRC) software. This series of posts has been about calibrating the speaker to room interface using well known industry standards and measures (i.e. B&K house curve, equilateral triangle, room mode calculator, RT60, etc.).
Before we look at some of the final measurements and analysis, a small aside on the measurement setup. Audiolense measures the speakers and room. Based on those measurements, it designs DRC filters, which are then hosted in a program external to Audiolense called a Convolver. I have a Convolver in JRiver, my default music player, and a standalone Convolver for when I am listening to MOG or Netflix.
Normally, I perform the measurements using Audiolense on my audio computer with a Lynx L22 sound card. However, I wanted to measure sound in 3D (i.e. time, energy, and frequency, like my old TEF computer), sometimes called waterfall plots. Ideally you want a calibrated frequency response (i.e. B&K house curve) in which, once the sound stops, it decays away at the same rate and ends at the same time across the frequency range. The easiest way to display this visually is with a waterfall plot.
Because Audiolense does not have a waterfall plot function, I used another fine piece of measurement software called REW: http://www.hometheatershack.com/roomeq/ I also used a different computer (laptop) and its internal sound module to perform the measurements. I was concerned that using different software and computer/sound module would introduce too much variability to the measurements.
I took a measurement using Audiolense and then the exact same measurement using REW. I cut and pasted the output of both and superimposed the two as best I could. While I was able to get the frequency scales to line up, my inability to line up the vertical scales shows a little variability, within 2 dB. If I could match the vertical scales, the variability would be even less. The important point is that, as is, our ears would hear the two as identical.
The measurement is a high resolution picture of the frequency response at the listening position. I have zoomed the vertical scale and used 1/12th octave smoothing to show a detailed view. We can see a variation of 22 dB between the lowest dip and highest peak. This article on Small Room Acoustics explains the theory behind why that is: http://www.gcmstudio.com/acoustics/acoustics.html What impresses me is the tight tolerance between the two measurements despite using different software and computer/sound card combinations. It is good to see this level of consistency.
That variance of 22 dB is typical in most small listening rooms. Let's compare that variance to something in the electronic domain like my Lynx L22 sound card. It is ruler flat from 15 Hz to 50 kHz.
Here is the onboard sound module in my Dell Precision M6400 laptop. Its frequency response measures 20 Hz to 20 kHz, +0.3 to -0.3 dB.
This is exactly what I mean by the speaker to room interface having orders of magnitude more influence on timbre than anything else in the audio signal chain. As small room acoustics theory and measurement show, most of our critical listening rooms have peaks, dips, honks, booms, and resonances. Generally speaking, everything matters, but electronics and cables have orders of magnitude less variance in frequency response, and also less distortion, than speakers.
Here is a waterfall plot of my left speaker with the measurement mic in the listening position. The vertical scale is energy, measured in decibels. The horizontal scale is the frequency range from 20 Hz to 20 kHz. The Z scale is time in milliseconds, starting at time 0 (i.e. at the measurement mic) and following the decay out to 300 milliseconds, or about 1/3 of a second. Given that sound travels roughly a foot per millisecond, the sound has traveled roughly 300 feet in the room by then. That's a lot of reflections in my 30'L x 16'W x 8'H listening room.
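As a rough check on the travel distance figure above, the arithmetic can be sketched in a few lines of Python. The speed of sound constant (1130 ft/s, roughly room temperature air) and the function names are assumptions made for illustration; they do not come from REW or Audiolense.

```python
# Assumed speed of sound in room-temperature air, in feet per second.
SPEED_OF_SOUND_FT_PER_S = 1130.0

def distance_travelled_ft(decay_ms):
    """Distance sound covers during decay_ms milliseconds, in feet."""
    return SPEED_OF_SOUND_FT_PER_S * decay_ms / 1000.0

def room_crossings(decay_ms, dimension_ft):
    """How many times sound could cross one room dimension in that time."""
    return distance_travelled_ft(decay_ms) / dimension_ft

# 300 ms waterfall window in a room that is 30 ft long (values from the post).
print(distance_travelled_ft(300))
print(room_crossings(300, 30))
```

With the more precise figure of about 1.13 feet per millisecond, the distance comes out somewhat above the 300 foot rule of thumb, which means the sound could cross the 30 foot length of the room more than ten times within the window.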
As an aside, all measurements below were taken at the same sound pressure level within +- 0.5 db tolerance.
Classic dips and peaks of small room acoustics below 500Hz. I get 26db difference between the maximum dip at 45Hz and the honking spike at 200Hz. That is a large variance.
Here is the same measurement, but with Audiolense frequency DRC applied. Ignoring the B&K house curve slope, we are about +-3.5db across the listening range. That is a major improvement from +- 13db in the raw frequency response above. Also note the more even decay time across the frequency range.
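The "+-" figures quoted here are simply half of the peak-to-dip spread. A minimal sketch of that arithmetic follows, assuming the readings are a plain list of dB values; the function name and the sample readings are hypothetical and are not exported from any measurement.

```python
def deviation_summary(levels_db):
    """Return (peak_to_peak_db, plus_minus_db) for a list of dB readings."""
    peak_to_peak = max(levels_db) - min(levels_db)
    return peak_to_peak, peak_to_peak / 2.0

# Hypothetical readings chosen to span 26 dB, like the raw response above.
raw = [-14.0, -6.5, 0.0, 4.0, 12.0]
print(deviation_summary(raw))
```

A 26 dB spread between the deepest dip and the highest peak therefore reads as +-13 dB, and a 7 dB spread reads as +-3.5 dB.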
Here is the waterfall plot with Audiolense True Time Domain (TTD) DRC applied.
Very interesting as the TTD DRC has really smoothed out the bottom end and extended it to 20Hz. Even more so than just with frequency DRC.
Let’s get serious and zoom on in below 500Hz as that's where the listening room has its maximum influence on the sound - i.e. room modes as per small room acoustics.
This room has a few problems for sure, even though it is typical of small room acoustics. That means if you take a frequency response measurement of your system at the listening position, you are likely to see something similar, unless you have a nice large room (56+ feet in length), a golden room ratio, or a LEDE room.
Here is the waterfall plot with frequency DRC applied. It is within +-3.5 db from 25Hz to 500Hz with a nice decay.
With TTD DRC applied:
We are seeing +-3db from 20Hz to 500Hz, with a real nice even decay. It can't get any better than that. If you look close, you will see it is labeled as TTD-11. I designed a dozen or so TTD filters, each time adjusting my filters in the Audiolense Correction Procedure Designer. The 11th version of tweaking the filters resulted in the best quality sound (i.e. timbre), both from a measurement and listening perspective.
The sound difference is as dramatic as the graphs depict, and it sounds exactly as the graphs look. With TTD applied, the sound is full, smooth, and crystal clear. A major improvement, bordering on as good as it gets.
I am impressed with Audiolense DRC. I can't listen to my system without it. With DRC in circuit, it sounds similar to the LEDE rooms I used to work in. It does not sound like my room anymore. The peaks, dips, honks, booms, and resonances are gone. The only acoustic treatment is a 10' x 7' throw carpet between the listening position and speakers.
If I looked at what it cost me for Audiolense, the measurement mic and preamp, it is almost criminal as to how good the results are for the investment. In my previous post, I have seen small fortunes spent on diffusers, absorbers, and bass traps. Look at the pictures.
In order to get the level of small room acoustics under control, comparable to Audiolense, I would have to purchase a lot of sound treatment. If I look at the value I am getting comparing the two, the cost of Audiolense and mic/preamp would only buy 2 RPG diffusers. It is likely I would require 6 to 10 or more diffusers for the back wall, depending on diffuser size. Something would need to go on the ceiling, even at a minimum, that would be 2 to 4 diffusers. Then likely 2 or 4 bass traps.
And that is why I have reached my conclusion. With DRC, the frequency response is +-3db throughout the listening range. The honks, booms, and resonances are gone in the time domain. The sound arrives at my ears at the same time, sounds full, super tight, and crystal clear - no more small room acoustics curse!
More acoustic treatment may help, but given the cost and how good the system sounds/measures with Audiolense DRC, I feel I am at the point of diminishing returns. I would need to spend several thousand dollars in acoustical room treatments to match what I can get with Audiolense DRC. Ideally, I would love to have both as then a full-on LEDE room may be possible. But for now, I am really enjoying my un-room. For once hearing proper timbre outside the studio control room.
Now that my system has been calibrated to reproduce the best possible timbre, I can really start hearing music the way it was intended to be reproduced.
I hope you found this series of posts useful in determining the best possible way to achieve proper timbre in the speaker to room interface.
Now that we have a calibrated frequency response at the listening position, let’s look at the other part of the timbre (i.e. tone quality) equation which is the time domain. Sound in your listening room has 3 measurable dimensions: time, energy, and frequency. We have looked at frequency response, targeting the B&K house curve, and how it affects timbre. So how does the time domain in your listening room affect timbre?
As mentioned earlier, I had the privilege to work in many recording studios, audio dealer listening rooms, and critical listening environments of many types over 30 years. Up until now, the best tone quality (i.e. timbre) I have heard was working in a certified “Live End Dead End” (LEDE) designed control room: http://www.acousticalsolutions.com/live-end-dead-end-control-rooms&usg=AFQjCNFQi2Lyk8WnfyTpRoT-x5voXUoYQQ with Urei 813 "time align" studio monitors: http://mixonline.com/TECnology-Hall-of-Fame/UREI-813-monitors-090106/ That's one expensive spec'd room (in the millions). There is an AES paper here ($20 or $5 for members): http://www.aes.org/e-lib/browse.cfm?elib=11805
It was Richard C. Heyser's invention of Time Delay Spectrometry (TDS) that allowed manufacturers to implement a computer that measures Time, Energy, and Frequency (Techron TEF). http://tecfoundation.com/hof/06techof.html#8 Using such a computer and software allows you to measure sound in 3 dimensions. This enabled Don and Chips Davis to invent the LEDE critical listening room. It enabled Dr. Peter D'Antonio to commercialize diffuser panels. It enabled speaker designers to "time align" multiple speakers. I feel this comment gives insight into the importance of these inventions in modern day acoustics, especially, "90% plus of serious professional control rooms in operation reflect this design type." http://www.hometheatershack.com/forums/home-audio-acoustics/16772-live-end-dead-end-yore.html#post164161
Let's look at a real application in using a TEF computer to analyze the room modes of a typical living room, like most of our critical listening environments. Similar to blowing air across the mouth of a Coke bottle to cause a resonance, the same thing occurs in small room acoustics, i.e. in our critical listening rooms. Notice the resonating room mode at 125 Hz. It is mostly a function of room dimensions. A Helmholtz resonator is constructed at the target frequency of 125 Hz and when the right amount is placed in the right spot in the room... voila, room mode tamed:
Unless you are lucky and get a golden ratio room, or have purposely built such a room, or even a LEDE type room, we are all stuck with room modes simply based on the physical dimensions of our listening rooms. Control rooms, on the other hand, are specifically designed and constructed to precise specifications, including the ability to reproduce a 20 Hz sound, which has a wavelength of 56.5 feet. These specifications, like LEDE, are designed to produce a psychoacoustic environment that is neutral in tone and sounds larger than its physical dimensions.
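To make the link between room dimensions and room modes concrete, here is a minimal sketch that computes the wavelength of a tone and the axial modes of one room dimension using the standard relation f = n * c / (2 * L). The speed of sound (1130 ft/s) and the function names are assumptions for illustration. It covers axial modes only, not the tangential or oblique modes that a full room mode calculator would include.

```python
# Assumed speed of sound in room-temperature air, in feet per second.
SPEED_OF_SOUND_FT_PER_S = 1130.0

def wavelength_ft(freq_hz):
    """Wavelength in feet of a tone at freq_hz."""
    return SPEED_OF_SOUND_FT_PER_S / freq_hz

def axial_modes_hz(dimension_ft, max_hz=200.0):
    """Axial mode frequencies (rounded to 0.1 Hz) for one room dimension."""
    modes = []
    n = 1
    while True:
        freq = n * SPEED_OF_SOUND_FT_PER_S / (2.0 * dimension_ft)
        if freq > max_hz:
            break
        modes.append(round(freq, 1))
        n += 1
    return modes

print(wavelength_ft(20))  # the 20 Hz wavelength quoted above
# Dimensions of the 30' x 16' x 8' listening room described in this series.
for name, size_ft in (("length", 30), ("width", 16), ("height", 8)):
    print(name, axial_modes_hz(size_ft, max_hz=120.0))
```

Frequencies that appear in more than one list, or that cluster close together, are the ones most likely to show up as peaks and long decays in a waterfall plot.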
This is the Chips Davis designed and built LEDE control room I worked in. I was very fortunate to have been one of the house engineers and got to observe Chips designing and building this facility from scratch to its finished state.
Note the Urei 813C time align monitors. Nowadays the “dead end” is referred to as Reflection Free Zone (RFZ) and some like it more live (like me) than dead. I.e. more diffuser panels are used versus absorbent panels in the front of the control room.
Another LEDE type control room. We are just lining up another set of Urei time aligns to ensure we have an equilateral triangle and perform some room measurements.
This is the rear of the studio from the picture above. We were lucky to have 16 foot ceilings and got a nice diffuse sound field, but required extra tube traps to soak up a long RT60 in the bottom.
This was another LEDE type studio that I worked in.
Here is the rear of the control room from above. Normally there would not be any tube traps and a diffuser panel would fill the gap between those three tube traps on the room centerline. A local company had developed a bass tube trap and we were jamming them into the control room to get some absorption measurements using my TEF computer and listening on the monitors.
Here is an audio dealer critical listening room modeled after the LEDE design.
The psychoacoustic effect of the LEDE design technique is to give the mixer's ears the acoustic cues of the larger space (i.e. recording studio), thus allowing the perception of hearing the studio rather than the control room. In our case, we want the perception of hearing (i.e. reproducing) the music rather than our listening room.
LEDE control rooms are specially designed enclosures using computer modeling to achieve the best distribution of room modes, reflection free zone, and diffuse live end. A LEDE room has seven criteria (i.e. a specification) that must be adhered to and measured in order for the control room to be issued a certificate of compliance. In a nutshell, the idea is to suppress early reflections and diffuse the longer reflections to minimize the impact of the listening room on the reproduced music. You try for a ~20 millisecond initial time delay gap between hearing the direct sound at the listening position and the diffuse sound from the rear of the room. Improved clarity and ambience are the result. The effect on listeners of giving the critical listening room a precise time delay gap is that of a much bigger room.
F. Alton Everest's "Master Handbook of Acoustics" is probably the best single book (for under $20) on acoustics that covers all of these concepts in more detail. Alton's book covers everything about acoustics in a readable format that does not require a PhD to comprehend: http://www.amazon.com/Master-Handbook-Acoustics-Alton-Everest/dp/0071360972
The other important factor is “time aligned” speakers. As the Mixonline link above indicates, the 813's were the most successful studio monitors for some time and very likely some of the music you listen to was mixed on this type of monitor. The idea behind the time alignment was to ensure that the direct sound from the monitor arrives at your ears at the same time across its frequency range. Having the audible frequency range arrive at your ears at the same time is another key to hearing music the way it was intended to be reproduced. The imaging is precise, even across the frequency range, and the sound is point source.
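The time delay gap described above translates directly into extra path length for the reflected sound. A minimal sketch of that conversion follows, assuming a speed of sound of 1130 ft/s; the function names and the example distances are hypothetical and are not part of the LEDE specification.

```python
# Assumed speed of sound in room-temperature air, in feet per second.
SPEED_OF_SOUND_FT_PER_S = 1130.0

def extra_path_for_gap_ft(gap_ms):
    """Extra distance the reflected sound must travel to arrive gap_ms late."""
    return SPEED_OF_SOUND_FT_PER_S * gap_ms / 1000.0

def time_delay_gap_ms(direct_path_ft, reflected_path_ft):
    """Delay between the direct sound and a reflection, in milliseconds."""
    return (reflected_path_ft - direct_path_ft) / SPEED_OF_SOUND_FT_PER_S * 1000.0

print(extra_path_for_gap_ft(20))      # path difference needed for a 20 ms gap
print(time_delay_gap_ms(10.0, 32.6))  # hypothetical 10 ft direct, 32.6 ft via rear wall
```

Under these assumptions, a 20 millisecond gap needs the rear wall return to travel about 22.6 feet farther than the direct sound, which is one reason the gap is hard to achieve in a small untreated room.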
The Urei time aligns in the LEDE control room I worked in were the 813C versions. They were Monster cabled inside and out and driven with a couple Crown 1 kilowatt amps. I remember listening to http://www.sa-cd.net/showtitle/856, and the drums sounding incredibly tight and concussive. You could hear and feel the drums like you were standing in front of a real drum kit.
This LEDE room with the time align speakers was the best critical listening room quality I have heard, up until now. With the advent of sophisticated Digital Room Correction (DRC) software and powerful computer hardware, compared to 25 years ago, we can easily take frequency and time response measurements. Further, we can design and build filters (i.e. 65,536 filter taps) that can correct both the frequency and time domain of your system in your specific listening environment.
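For a feel of what a 65,536 tap filter implies, the sketch below computes how long such a filter is in time and roughly how fine its frequency resolution is, then shows the basic convolution a Convolver performs. The 96 kHz sample rate is an assumption, and the direct-form loop is for illustration only; it is not how Audiolense or any particular Convolver is implemented, and practical convolvers typically use faster FFT-based methods.

```python
def filter_length_seconds(taps, sample_rate_hz):
    """Duration of an FIR filter with the given number of taps."""
    return taps / sample_rate_hz

def frequency_resolution_hz(taps, sample_rate_hz):
    """Approximate spacing between frequencies the filter can shape."""
    return sample_rate_hz / taps

def convolve(signal, taps):
    """Direct-form FIR convolution of a signal with filter taps."""
    out = [0.0] * (len(signal) + len(taps) - 1)
    for i, sample in enumerate(signal):
        for j, tap in enumerate(taps):
            out[i + j] += sample * tap
    return out

print(filter_length_seconds(65536, 96000))    # seconds of filter at an assumed 96 kHz
print(frequency_resolution_hz(65536, 96000))  # Hz between adjustable points
print(convolve([1.0, 2.0, 3.0], [1.0, 1.0]))  # tiny two-tap example
```

A filter this long spans a good fraction of a second, which is what gives it enough resolution to work on individual peaks and dips in the bass.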
Using Audiolense, I can almost get the same sense of a physical LEDE designed room without breaking the bank by having to really build one. I am impressed with the sense of space where my room sounds bigger than its physical size (30' x 16' x 8'). At the same time, the sound is incredibly clear and tight, the tone quality is right on, and the stereo imaging is like wearing headphones. I remember feeling the same way in the LEDE control room with the Urei time aligns.
Over at the Audiolense User Forum (http://groups.google.com/group/audiolense), I asked Bernt Ronningsbakk, creator of Audiolense (http://www.juicehifi.com/index.html), if his True Time Domain (TTD) correction ensures that the initial sound arrives at the same time in the listening position, (like the Urei time aligns). I also asked him about early reflections. Here is his response:
“Yes, TTD ensures that initial sound arrives at the same time. And it also syncs up with the reflections that are early enough to fall inside the time domain window. It tightens up the direct sound from the speakers, cleans up some of the early reflections and often cleans up somewhat on all the reflections towards the deepest bass. 5 cycles @ 20 Hz equals 172 meter of sound travel. That is usually plenty to cover a few generations of reflections”.
I can concur that this is what I am hearing with my ears and what is being measured. Sitting in front of the speakers almost feels like I am wearing headphones, with a sense of clarity I have never heard before with just frequency correction. For example, this tune http://www.amazon.com/Chaiyya/dp/B000QOOD8K has a vibrato synth sound that cycles from left to right speaker through most of the tune at a quieter level. I never heard that before without TTD. I can hear it on the headphones, but up until now, that sound was obscured by my listening room. The song also contains some very powerful low frequency pulses midway into the tune. This will really test the low frequency response of the speaker to room interface. What I really notice is how tight the sound is, and how devoid of long low frequency decays, using TTD.
HDTracks sampler https://www.hdtracks.com/index.php?file=login&redirectto=samplealbumdownload&ialbum_id=6446 artist, "Dave's True Story" and song "Misery" is a favorite of mine because of the awesome hall it was recorded in. I can hear the dimensions of the room coming through the stereo, just as clear as if I was wearing headphones.
It is incredible to think what is going on here as prior to this, short of spending a small fortune on a properly sized, designed, and built critical listening environment that meets a specification like LEDE, it was simply not possible to achieve. The reality is that we are bound by the physical dimensions of our listening room, which means small room acoustics apply. Unfortunately, this means most of our listening environments have those "Coke bottle" resonances predetermined by our room dimensions. Adding absorbers, diffusers and bass traps will likely improve most situations, but will not give the perception of being in a larger space.
As an aside, one audio dealership went to the trouble to compute the room modes and used one of the golden ratios to build a critical listening room. Listening and measurements proved out that golden room ratios (at least that specific one) made a considerable difference to the tone quality of the room. It is all about room mode distribution and Alton's book goes into this in detail. If you ever see the spec for a LEDE control room, you will see there are no parallel surfaces, in addition to being built as a room suspended in another room, plus many more specs (see Alton's book).
I believe that any critical listening environment should have a best effort towards acoustical treatment. Absorb or diffuse early reflections, diffuse later reflections, and control room modes with tube traps or Helmholtz resonators. I would caution against too much absorption versus diffusion. Too much damping makes the sound lifeless.
In my case, my first acoustical treatment was to put down a thick 10' x 7' area rug between my listening position and speakers. It really ties the room together.
The purpose of this was twofold. One was to add some overall damping to my otherwise totally live listening room; the other was to suppress any early reflections off the floor to the listening position. Even though my speakers are on Vibrapods, they should be raised off the floor more. Also looking at Tube Traps or Helmholtz Resonators behind the speakers tuned to about 200 Hz according to my measurements. I still have the ceiling to look at. It is easy to use a mirror to determine where the reflections are on the ceiling. The back wall needs diffusers of some sort. Plenty of commercial and DIY's to choose from.
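The mirror trick for the ceiling can also be estimated on paper with simple geometry: the reflection point is where a straight line from the speaker to the listener's mirror image above the ceiling crosses the ceiling. A minimal sketch follows, assuming a flat ceiling; the function name and the example heights and distance are hypothetical values, not measurements of my room.

```python
def ceiling_reflection_point_ft(ceiling_ft, speaker_height_ft, ear_height_ft, distance_ft):
    """Horizontal distance from the speaker to the ceiling reflection point."""
    up = ceiling_ft - speaker_height_ft   # speaker driver up to the ceiling
    down = ceiling_ft - ear_height_ft     # ceiling down to the listener's ear
    return distance_ft * up / (up + down)

# Hypothetical setup: 8 ft ceiling, driver at 3 ft, ears at 3.5 ft, 10 ft apart.
print(ceiling_reflection_point_ft(8.0, 3.0, 3.5, 10.0))
```

When the driver and the ears are at the same height, the reflection point lands exactly halfway between speaker and listener; a mirror held on the ceiling at the computed spot should show the speaker from the listening position.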
Here is what my back wall looks like. It's just waiting for diffusers.
The way I see it, Audiolense is the icing on the cake. It completes the speaker to room interface. I can perform a best effort in speaker to room set up and acoustical treatment. Use a tape measure, or better yet this digital laser measure (http://www.computeraudiophile.com/content/Get-Better-Sound-Without-Spending-Fortune), and read back a few articles to get the proper configuration measured up.
In the end, Audiolense will dial in the correct timbre by ensuring that the frequency response at the listening position is calibrated to a target and each speaker's output is identical across the target frequency range at the listening position. Additionally, all sound will arrive at the listening position at the same time, plus it cleans up early reflections and tightens up the bottom end. The result is better clarity and a sense of spaciousness without sounding like small room acoustics. To me, it sounds almost as good as being in one of those million $ LEDE control rooms. It is amazing to me.
However, not all is roses. I was not able to use TTD until recently. Up until now, I was just using frequency correction. Btw, the frequency correction works perfectly and sounds great. I am completely satisfied. TTD became possible when I introduced the carpet a week ago. Then I was able to get TTD working. It seemed that my room was just too live (it really has no absorption whatsoever) and maybe confused the software, as I could not get as good a tone quality as with the frequency correction. Now with the carpet in the room to provide some overall damping, I have been able to get TTD to work very well. I am encouraged by how quickly I can get excellent tone quality that is better than what I got with frequency correction. Plus I get time alignment, clearer sound, and tighter bass.
Audiolense can produce filters that have a bit of pre-ringing to them. I asked Bernt about this and his response was, "Since the improvement from a TTD over a pure frequency correction sometimes is perceived to be so massive, those who can't get rid of the audible pre-ringing gets really frustrated. I can understand that. Going back to a pure frequency correction isn't the same after you've heard the potential in TTD."
"The easy solution for us would be to abandon the time domain correction, or just reduce the scope of it to a level where it does just a little right and nothing wrong - and simply lower the expectations. But the massive improvement that TTD sometimes brings is a convincing argument, so I've been researching the pre-ringing issue and developing a solution for more than 1.5 years or so. I didn't know what the root causes were when I started. I only had a hunch. And I didn't know whether it could be fixed. A lot of hard work and outside the box thinking has gone into this but it is starting to pay off. I now have simulations at my desktop where the pre-ringing is suppressed so far down that the noise floor in the measurement is becoming a disturbing factor. Only time & experience will show whether the problem is about to be solved once & for all, or the bottleneck will move somewhere else."
I appreciate Bernt's work. Like I say, I have never been able to escape small room acoustics, until now. It really hits home when you work 8 hours a day in front of the control room monitors and come home and your nice gear sounds nowhere near as good as the LEDE room. I have been spoiled rotten working in top end control rooms specifically designed for sound quality.
It is (mostly) the curse of small room acoustics. Using Bernt's software is the first time I have heard my listening room sound not like my room, but like when I was working in LEDE rooms. Very impressive and still maintaining audiophile quality sound as I hear no side effects of the digital FIR filters in the signal chain. Audio software playback engines http://wiki.jriver.com/index.php/Audiophile_Info and DSP software http://convolver.sourceforge.net/ these days are incredibly advanced and running on the most powerful computing equipment we have ever owned. I do not expect to hear any distortion being introduced by its use.
I am really impressed with the sound. It appears I have an optimum configuration already. I am going to continue experimenting, but mostly listening and figuring out DIY acoustic treatment I may build next. The better the room acoustics I can make, the better filters I can design, the better the overall sound. I think I can meet the LEDE spec.
This is Part 4 of a series on a quest to hear music the way it was intended to be reproduced. In the last 3 posts, we have “voiced” and calibrated our speakers to an equilateral triangle, taken some frequency response measurements, analyzed the results, and introduced digital room correction (DRC). Let’s look at the frequency response measurement results and DRC in more detail. Is DRC ready for prime time? I think this post will show conclusively that, yes, DRC is ready for audiophiles to take full advantage of their sound system investments. Once you hear correct timbre, you won’t go back ;-)
When I say DRC, I mean the subject area: http://en.wikipedia.org/wiki/Digital_room_correction and not the software with the same name: http://drc-fir.sourceforge.net/ I have not used this DRC software and can only provide you with my experiences using Audiolense.
Here is the frequency response of my sound system measured at the listening position:
Here is the frequency response with DRC enabled, using the B&K house curve as the target, again at the listening position:
How was this DRC accomplished? Within Audiolense, you click on generate correction filters, which produces the inverse of the measured frequency response like this:
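Conceptually, the "inverse of the measured frequency response" is the target curve minus the measurement at each frequency. The sketch below illustrates that idea only: it is not Audiolense's algorithm, and the function name, the boost and cut limits, and the sample values are all assumptions made for illustration.

```python
def correction_gains_db(measured_db, target_db, max_boost_db=6.0, max_cut_db=20.0):
    """Per-frequency correction in dB: target minus measured, with limits."""
    gains = []
    for measured, target in zip(measured_db, target_db):
        gain = target - measured
        # Limit boost so deep dips are not filled with excessive amplifier power.
        gain = max(-max_cut_db, min(max_boost_db, gain))
        gains.append(gain)
    return gains

# Hypothetical measurement at four frequencies against a flat 0 dB target.
measured = [-12.0, 0.0, 10.0, 3.0]
target = [0.0, 0.0, 0.0, 0.0]
print(correction_gains_db(measured, target))
```

Peaks are cut in full, while the 12 dB dip is only partly filled because of the assumed 6 dB boost limit. Capping boost is a common precaution in room correction generally, since deep dips caused by cancellation tend not to respond to added power.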
You save the filter to a file location and then load it in a Convolver like the one hosted in JRiver Media Center:
And within a few minutes, you are listening to music the way it was intended to be reproduced. More on this later.
Now let’s zoom in on these graphs so you can see more detail. I will use the same relative vertical and horizontal scales so that the two graphs can be compared.
Without DRC at the listening position:
With DRC (and the B&K house curve applied) at the listening position:
Let’s analyze this. Given the graphs presented, I would conclude that DRC works very well. It's quite the controlled difference between the before and after with DRC, especially the tight variation tolerance. Above 200 Hz, the variance is +- 2.5 dB. That is a tighter tolerance than the manufacturer's anechoic chamber specs.
Here is what it means. Most speaker manufacturers will produce a frequency response curve (plus speaker sensitivity) @ 1 watt @ 1 meter in an anechoic chamber. The tech spec usually includes a variance limit on the frequency response, like +- 3 dB across the measured frequency range. Note that the anechoic chamber is designed to eliminate the room effects on the measurement. Having been in one, it is an interesting experience, completely void of reflected sound.
Let’s take the popular B&W CM8 and look at its specified frequency response. From their online manual, http://www.bowers-wilkins.com/Downloads/Product/InfoSheet/ENG_FP300741_CM8_info_sheet.pdf, the frequency response measures 69 Hz to 22 kHz with +- 3 dB variance. My speakers are a tech modernization of the Klipsch Cornwall, custom designed and built by Bob Crites, called the Cornscala Type C: http://www.critesspeakers.com/cornscala-style-c.html They will have a similar frequency response to the Cornwall III’s of 34 Hz to 20 kHz +- 3 dB.
My point is that while my speakers measure, 34Hz to 20Khz with a +- 3db variation in an anechoic chamber, the moment that I put them in a real listening room, of any sort, all bets are off. Look at the frequency response variations of my listening room again. Note the maximum amplitude deviation.
I get +- 12 dB from about 34 Hz to almost 20 kHz. Having measured several studio control rooms and critical listening environments, this looks typical. In fact, regardless of your audio electronics and speakers, you are likely to get similar measurements in your own listening room – it’s all a function of room modes and there is no escape. I could certainly improve mine by moving the speakers/listening position around a bit more and adding acoustical treatments to the room. Ultimately, I could build a separate critical listening room with more favorable "golden room ratios", but that isn't in the cards for me at this time. Side note: if you ever have the opportunity to build a room, get the golden room ratios, as they make a fundamental difference in the low end.
If I look at the DRC “calibrated” frequency response above and not taking into account the slope of the B&K house curve, I get +-3 db from about 34Hz to almost 20Khz. Now that is about the same frequency response specification I get from the manufacturer, when they measure in an anechoic chamber.
Effectively, what this means is the DRC is not only doing its job by eliminating or minimizing the room acoustics, it is also applying (i.e. calibrated by) the B&K house curve which renders the right tonal quality (i.e. timbre) at the listening position. This is even without the most basic of acoustic room treatments. I could use a carpet on the hardwood floor in front of the speakers. Note how closely the left and right curves match each other. This means that we get a solid dead center phantom image produced by the speakers as any amplitude imbalance of the electronics/speakers/room interface has been calibrated by the DRC.
I have been using Audiolense (and the DRC filter it produces) in my audiophile system for about 6 months. I have not had any issues with the digital filters or any other critical listening artifacts arising from their use.
I look at DRC as a must have to fully realize your audio system investment. Every attention to detail and calibration throughout the audio chain will pay off in the end. However, given that modern day electronics can have frequency responses from 10 Hz to 100 kHz with +- 0.25 dB variations, the speaker to room interface has a bigger influence, by far, on the quality of sound (read: timbre) than any other component in the audio chain.
You are not locked into the listening position using frequency DRC. You can walk anywhere around your listening room and notice the major improvement in sound quality. As mentioned earlier in my first post, my wife commented on how the recorded piano I was listening to still sounded real (i.e. correct tone quality) when she was out in the attached garage. Correct timbre, once you hear it, you won’t go back.
Or at least that has been my case. I am still blown away listening to my studio mixes from many years ago that I produced in a multi-million dollar recording facility and hearing it sound “identical” in my home. I would suggest that with DRC, and the B&K house curve, you hear music reproduced as close as possible to exactly what the mastering/mixing engineer and producer wanted you to hear, with the proper timbre. With the advent of 96/24 resolution recordings, you are hearing as close to the master tape as possible. It is great to be a computer audiophile!
We can get even more sonic improvements. In the case of Audiolense, we can achieve time domain correction in addition to frequency correction. What does this mean? We will look at this in my next post.
Now that my speakers are set up in an equilateral triangle, let’s take some frequency response measurements. First we need a calibrated microphone, software to perform the measurements, and a sound card. While there are several choices of measurement mics, acoustic measurement software, and sound cards, I will be using an MP-1r-KIT acoustical measurement kit: http://www.content.ibf-acoustic.com/catalog/product_info.php?cPath=30&products_id=35, Audiolense software: http://www.juicehifi.com/index.html, and a Lynx L22 sound card: http://www.lynxstudio.com/product_detail.asp?i=11 in my Windows 7 PC.
Here is a view of my sound system setup. My listening room is approximately 30’L x 16’W x 8’H with no room damping, mostly hardwood floors and drywall. A very “live” room, with a fairly long RT60. I have the speakers set up on the long side of the room to minimize side wall reflections. I took this pic from the back wall overlooking the couch (i.e. listening position). The couch is about 4 feet from the back wall.
Here is a side view to show how far the speakers are away from the back wall and where the listening position is.
The measurement mic is placed on a mic stand behind the couch with the mic stand feet on foam so any transmission through the floor (the speakers are on Vibrapods – highly recommended) won’t be picked up by the mic. Use a tape measure so that the tip of the mic is at ear level and forms the equilateral triangle with the speakers.
One easy way to measure up is to cut 2 pieces of string to the side length of the equilateral triangle, tape them in the exact same spot on the top of each speaker, and hold them taut while moving the mic into position. Note that the mic should be exactly centered between the two speakers at the listening position.
A side note on measurement mics. Regardless of which measurement mic you opt for, ensure that a) it is calibrated and b) you have the calibration file. Have a look at my mic calibration file:
Side note. Notice the mic is calibrated for a “pass band” of 20Hz to 20kHz. Most acoustic measurements will be pass band limited to this range, not only by the microphone, but also by the swept sine wave the measurement software produces and subsequent results through the sound system back to the microphone. This is not an anechoic chamber measurement test to determine the frequency limits of the loudspeaker at 1 watt @ 1 meter. We are measuring the audio chain at the listening position in a real room in order to produce as natural timbre as possible in the pass band. That is our goal.
Side note 2. Timbre is (mostly) the tonal quality of the music reproduced. Ideally speaking, if I were to record the sound of an acoustic guitar in the room and then play it back over the sound system, I would expect the tonal quality of the reproduced guitar to be as similar as possible to the live guitar. That's what I expect in the studio/control room, and I expect it from my home stereo as well.
Of course the source material and electronics chain play a part in timbre for sure. But it is the speaker to room interface that is the weakest link, by far. If you are using measurements as a way to gauge the difference, the speaker to room interface frequency response, at the listening position, deviates by orders of magnitude more than an amplifier's frequency response, for example.
Now that you have the mic setup, let’s turn to the software. Note that Audiolense also creates digital room correction files, but at this time, we are using the free portion of the software to take frequency response measurements at the listening position.
In this setup screen for a stereo configuration, we are applying a swept sine wave from 20Hz to 24kHz over 10 seconds for each speaker. Be careful when doing this as you don’t want to blast the speakers into oblivion. You would like about a 90dB or so measurement at the mic.
Now let’s run the frequency response measurement. Please refer to your sound card manual and Audiolense help file for the finer points of setup and operation. Here is the frequency response measured at the listening position:
There are several things to infer here. One is the incredible detail of the frequency response measured. The first thing to do is look at a smoothed version (1/3 octave smoothing) that corresponds more closely to how our ears discriminate between frequencies:
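For those curious what such a test signal looks like, here is a minimal Python sketch of an exponential (logarithmic) sine sweep. This is an assumption on my part: the post does not say which sweep shape Audiolense generates, and the function names, the 96 kHz sample rate, and the low default amplitude are all hypothetical choices made for illustration.

```python
import math


def exponential_sweep(f_start_hz, f_end_hz, duration_s, sample_rate_hz, amplitude=0.1):
    """Return the samples of an exponential sine sweep as floats in [-amplitude, amplitude]."""
    ratio = math.log(f_end_hz / f_start_hz)
    scale = 2.0 * math.pi * f_start_hz * duration_s / ratio
    n_samples = int(duration_s * sample_rate_hz)
    samples = []
    for i in range(n_samples):
        t = i / sample_rate_hz
        # Phase whose time derivative is 2*pi*f(t), with f(t) rising exponentially.
        phase = scale * (math.exp(ratio * t / duration_s) - 1.0)
        samples.append(amplitude * math.sin(phase))
    return samples


def sweep_frequency_hz(f_start_hz, f_end_hz, duration_s, t):
    """Instantaneous frequency of the sweep at time t (seconds)."""
    return f_start_hz * (f_end_hz / f_start_hz) ** (t / duration_s)


if __name__ == "__main__":
    # 20 Hz to 24 kHz over 10 s, as in the setup screen described above.
    for t in (0.0, 2.5, 5.0, 7.5, 10.0):
        freq = sweep_frequency_hz(20.0, 24000.0, 10.0, t)
        print(f"t = {t:4.1f} s -> {freq:8.1f} Hz")
```

The default amplitude is kept well below full scale on purpose, in the same spirit as the warning about not blasting the speakers. Note that the sample rate has to be more than twice the top frequency, so a 24kHz sweep needs more than 48 kHz.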
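To give a feel for what 1/3 octave smoothing does, here is a rough Python sketch. It is not how Audiolense implements it (I don't know the internals); it simply averages the dB values of all measured points that fall within one third of an octave centered on each frequency. The function name and the sample data are hypothetical.

```python
def smooth_fractional_octave(freqs_hz, levels_db, fraction=3):
    """Average each point with its neighbours inside a 1/fraction-octave window.

    Simplification: dB values are averaged directly rather than converting
    to power first, which is good enough to illustrate the idea.
    """
    # Half of a 1/fraction-octave band, expressed as a frequency ratio.
    half_window = 2.0 ** (1.0 / (2.0 * fraction))
    smoothed = []
    for f_center in freqs_hz:
        lo = f_center / half_window
        hi = f_center * half_window
        window = [lvl for f, lvl in zip(freqs_hz, levels_db) if lo <= f <= hi]
        smoothed.append(sum(window) / len(window))
    return smoothed


if __name__ == "__main__":
    # Made-up data: a narrow 10 dB spike at 110 Hz on an otherwise flat response.
    freqs = [100.0, 105.0, 110.0, 115.0, 120.0]
    levels = [80.0, 80.0, 90.0, 80.0, 80.0]
    for f, raw, smooth in zip(freqs, levels, smooth_fractional_octave(freqs, levels)):
        print(f"{f:6.1f} Hz  raw {raw:5.1f} dB  smoothed {smooth:5.1f} dB")
```

Narrow spikes and dips get averaged down, while broad trends such as an overall bright tilt survive the smoothing, which is why the smoothed curve is the one to compare against a target.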
Now how does this frequency response correlate to the B&K house curve mentioned in earlier posts? Side note, the purpose of this exercise is to implement the “target” B&K house curve as closely as possible as it is proven, from a speaker to room interface, to provide the most natural timbre. And as time will tell, it will also produce that elusive depth quality to the sound stage. Audiolense lets you “draw” your own frequency response targets, so I took the image from Figure 5 in http://www.bksv.com/doc/17-197.pdf and drew that exact target curve in Audiolense. Here is what it looks like relative to my measurement, it is the flat curve on top with the high frequency rolloff:
Listening to the sound system correlates to what is being measured on the screen. It is a bit bright sounding, which should come as no surprise as the room my speakers are in has little sound absorption. I seem to have “voiced” (see Part 2 for voicing) the speakers reasonably well as there are no huge dips or peaks in the low end, save for that 200Hz peak. Also note the different amplitudes from each speaker at the listening position. What we really want is to have each speaker’s output identical over the frequency range. That will give us a dead center phantom image.
Note the overall shape of the filtered response relative to the B&K house curve target. Again a bit on the bright side and as a general rule, unless you have a really damped room, most speakers, relative to the B&K house curve, will be a bit bright. Unfortunately this messes with both the timbre and the perceived depth of the sound stage as we will see.
So now what? Ideally I would add damping material to the room to bring down both the mid and high frequency response to be more in line with the B&K house curve and with what my ears perceive. How much damping? If you crack open any of the acoustics books from F. Alton Everest, they will tell you how to measure RT60 and calculate the number of sabins required to damp the room at various frequencies. As someone that has been there and done that, it is a lot of work, but is quite rewarding once it is done and matches what you want.
However, with the advent of digital room correction software, in mere minutes, you can generate filters that are the inverse of the measured frequency response, apply them to the target, install the filters in a DSP like ConvolverVST, hosted in a media player like JRiver MC16 and voila:
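As a rough illustration of that calculation, here is a small Python sketch based on Sabine's formula in imperial units, RT60 = 0.049 × V / A, where V is the room volume in cubic feet and A is the total absorption in sabins. The room dimensions are the ones given for my room; the measured and target RT60 values are hypothetical placeholders, not measurements from this post.

```python
SABINE_CONSTANT_IMPERIAL = 0.049  # for volume in cubic feet and absorption in sabins


def room_volume_ft3(length_ft, width_ft, height_ft):
    """Volume of a rectangular room in cubic feet."""
    return length_ft * width_ft * height_ft


def absorption_for_rt60(volume_ft3, rt60_s):
    """Total absorption (sabins) that yields the given RT60 by Sabine's formula."""
    return SABINE_CONSTANT_IMPERIAL * volume_ft3 / rt60_s


def extra_sabins_needed(volume_ft3, measured_rt60_s, target_rt60_s):
    """Additional sabins required to go from the measured RT60 to the target."""
    current = absorption_for_rt60(volume_ft3, measured_rt60_s)
    target = absorption_for_rt60(volume_ft3, target_rt60_s)
    return max(0.0, target - current)


if __name__ == "__main__":
    volume = room_volume_ft3(30.0, 16.0, 8.0)  # approximate room size from this post
    # Hypothetical RT60 figures for one frequency band, for illustration only.
    needed = extra_sabins_needed(volume, measured_rt60_s=0.9, target_rt60_s=0.5)
    print(f"Volume: {volume:.0f} cubic feet, extra absorption: {needed:.1f} sabins")
```

In practice you would repeat this per frequency band, since both the measured RT60 and the absorption of any given material change with frequency.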
And the listening result? Near perfect timbre and full depth sound stage.
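Conceptually, the frequency part of the correction is just the difference between the target and the measurement at each frequency. Here is a minimal Python sketch of that idea. It is only an illustration: real DRC software such as Audiolense produces actual filters rather than a list of gains, and the boost and cut limits below are hypothetical values I picked for the example.

```python
def correction_db(measured_db, target_db, max_boost_db=6.0, max_cut_db=12.0):
    """Per-frequency gain (dB) that moves the measured response onto the target.

    Both lists must be sampled on the same frequency grid. The gain is clamped
    so that deep dips are not filled with large amounts of boost.
    """
    corrections = []
    for measured, target in zip(measured_db, target_db):
        gain = target - measured  # the inverse of the measured deviation
        gain = max(-max_cut_db, min(max_boost_db, gain))
        corrections.append(gain)
    return corrections


if __name__ == "__main__":
    # Made-up smoothed levels at three frequencies against a flat 80 dB target.
    measured = [85.0, 70.0, 95.0]
    target = [80.0, 80.0, 80.0]
    print(correction_db(measured, target))
```

Limiting the boost is a common precaution, since trying to fill a deep dip mostly costs amplifier headroom.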
We will look at this frequency correction result in detail in the next post, including a section on the next step, time domain correction.
Part 1 is here. Thanks for your comments. Before we can measure the frequency response of your sound system at the listening position, we need to configure the speakers to the listening room. These set up steps are required in the quest to hear music the way it was intended to be reproduced – i.e. best effort timbre. This is the first part of a three part process. The three parts are setup, measure, and adjust. Then we iterate, sometimes a few times, sometimes more. It will cost you nothing but a few hours or more of your time moving your speakers and perhaps listening position around your listening room. A tape measure is required.
I was going to dive into acoustics, like in this article: http://www.nonoise.org/quietnet/tcaa/smallrooms.pdf But I thought it would be better to explain a few quick wins that you can easily achieve in your own listening room using a bit of muscle and a tape measure. Of course, these are setup calibration steps in order to establish a baseline for the series of frequency response measurements we are going to perform.
Have a look at this listening room. Actually it is a control room in a recording studio. In fact, the vast majority of control rooms in the world will be a variation of this set up:
[Image: example control room layout (ExampleCRroom.jpg)]
You can find many examples of these setups on the internet, including specifications, plans, golden room ratios, etc. We will come back to that when we look at acoustics. What’s important at this point is:
1. The speakers and the listening position form an equilateral triangle.
2. A best effort attempt at passive room treatments to calibrate the damping, reflections, and RT60 of the room. http://en.wikipedia.org/wiki/Reverberation
3. Best effort attempt at using the tape measure for all measurements, symmetrical or otherwise.
Let’s walk through this in some detail, but we are going to leave step 2 out for a while.
But first a comment. I take no credit for any of this. These are all public specifications, tried and true, and are generally accepted as industry standard in the pro audio biz. If this is “old hat” to you, just think of it as a quick review of how important it is in the quest to hear music the way it was intended to be reproduced.
An equilateral triangle of the speakers to the listening position is required for proper decoding of the stereo mix. Look at the diagrams above for both the control room and critical listening room. Both are equilateral triangles. In my listening room, my speakers are 9ft apart (center to center) and each speaker is 9ft away from where my ears are located at the listening position. The speakers should be toed in so that they are on axis to your ears in the listening position.
Get out your tape measure. Whatever equilateral triangle you end up with, make sure that the distance between the speakers and the distance from each speaker to your ears are as exact as possible, down to a ¼ inch tolerance or less if you can do it. This is absolutely critical to ensure you are getting the exact sound stage that was mixed in the control room in the recording studio.
Here is an analogy with respect to sound waves. Ever throw a rock in water and watch the waves it produces? Now throw two rocks in the water, spaced apart (like 3 to 6ft for example) and try to do it so they land in the water at exactly the same time. Really hard to do, but look at the waves produced and when they meet – beautiful symmetry. If the rocks land at different times, then observe the waves produced. The rock that landed first will produce a wave sooner than the rock that landed second, and when the waves mix, the pattern will look distorted (i.e. loss of symmetry), because it is. Quick rule of thumb, sound waves travel roughly 1 foot per millisecond.
Short story. When I was working in an about-to-be-built LEDE studio on the West Coast, Chips Davis used a professional laser distance meter, levels, and transits to lay out the design of the studio, including measuring the equilateral triangle down to 1/16” tolerance. While I am old school with the tape measure, I see the prices of some of these laser distance meters are down in the $100 to $200 range. So if you spend time measuring, moving the speaker ever so slightly and re-measuring, over and over again – it’s perfectly normal to increment and iterate.
I can’t stress enough how important it is that everything is measured and as symmetrical as possible in your listening room. That includes measuring the toe-in of the speakers from the back wall, for example, so that they are as near a perfect mirror of each other as possible. This is critical to attaining proper timbre, especially related to the perceived depth of the sound stage.
The end result is that your speaker system is calibrated to properly reproduce (i.e. decode) stereo sound. From a listening perspective, each speaker’s sound will arrive at your ears at the same time. This will result in a perfect, non-distorted (from a time perspective) representation of the stereo signal. You will hear pinpoint imaging and a dead center phantom image, and you are now in a position to move to the next step of the calibration process.
Before we continue, I know some will ask, how far do I move the speakers into the room from the back wall? Great question and one we will measure, but for a starting point, and avoiding an acoustics conversation, here is a quick way to “voice” the speaker position in your room and train your ears at the same time.
Play music that has good bass content. If you have a sound level meter, like the well-known Radio Shack http://www.maxim-ic.com/images/appnotes/988/DI127Fig04.jpg meter, then select C weighting and slow response and crank up the music to average 85 to 90dB (note we will come back to this sound level and why it is important in another post). If you don’t have a sound level meter, no worries, just crank up the sound a bit, but not really loud, we just want to load the room with sound.
Turn your balance control to either left or right so only one speaker is playing. Now go stand beside the speaker and listen to the bass sound. Listen to how even the bass sounds as the notes go from high to low and vice versa. Does the bass sound louder on some notes and less on others? If so, start moving the speaker slowly forward while listening. For really trained ears, this is like blowing air into a Coke bottle and hearing the resonance. That’s what we are doing. We are trying to find the sweet spot where all of the bass notes sound even up and down the scale.
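If you like to check your tape measure work with a bit of arithmetic, here is a small Python sketch. It computes how far the listening position sits from the line between the speakers for a given triangle size, and checks three measured sides against the ¼ inch tolerance. The function names are hypothetical; the geometry is just that of an equilateral triangle.

```python
import math


def listener_depth_ft(side_ft):
    """Distance from the line joining the speakers to the listening position."""
    return side_ft * math.sqrt(3.0) / 2.0


def max_side_deviation_in(spacing_ft, left_ft, right_ft):
    """Largest difference between any two sides of the triangle, in inches."""
    sides = [spacing_ft, left_ft, right_ft]
    return (max(sides) - min(sides)) * 12.0


def within_tolerance(spacing_ft, left_ft, right_ft, tolerance_in=0.25):
    """True when all three sides agree to within the given tolerance."""
    return max_side_deviation_in(spacing_ft, left_ft, right_ft) <= tolerance_in


if __name__ == "__main__":
    side = 9.0  # speakers 9 ft apart and 9 ft from the ears, as in my room
    print(f"Listener sits {listener_depth_ft(side):.2f} ft from the speaker line")
    # Each speaker is toed in 30 degrees from straight ahead to aim at the apex.
    print("Toe-in per speaker: 30 degrees")
    print("Within 1/4 inch:", within_tolerance(9.0, 9.0, 9.0 + 0.1 / 12.0))
```

With 9 ft sides, the listening position ends up a little under 8 ft from the line between the speakers, which is a handy cross-check when you are moving the couch.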
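To put numbers on that rule of thumb, here is a short Python sketch that converts a difference in speaker-to-ear distance into a difference in arrival time. It assumes a speed of sound of about 1125 feet per second (roughly room temperature); the function name is hypothetical.

```python
SPEED_OF_SOUND_FT_PER_S = 1125.0  # approximate value for air near room temperature


def arrival_delay_ms(path_difference_ft):
    """Extra time (ms) the sound from the farther speaker needs to reach the ears."""
    return path_difference_ft / SPEED_OF_SOUND_FT_PER_S * 1000.0


if __name__ == "__main__":
    for label, diff_ft in (("1 foot", 1.0), ("1 inch", 1.0 / 12.0), ("1/4 inch", 0.25 / 12.0)):
        print(f"{label:>8} mismatch -> {arrival_delay_ms(diff_ft):.4f} ms")
```

A full foot of mismatch costs a little under a millisecond, while a ¼ inch mismatch is only a few hundredths of a millisecond.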
If you can’t hear the difference, no worries. Try moving the speaker against or as close to the back wall as possible. It is likely to sound boxy, or too much bass. Now move the speaker several feet from the back wall and listen again – the bass response should be considerably different. It may be that most of the bass seems to have disappeared. Somewhere between the two positions is the best position for the speaker based on your specific room ratios. Patience and practice will assist in finding the sweet spot.
Now turn the balance control to the other speaker and move that speaker the same distance from the rear wall as the first one. It should sound the same in the bass region. Use a tape measure to get it exact. Now turn the balance control to the center and listen again. Bass notes sound even through the scale? Does the balance of bass to mids to highs sound ok? Use your ears, they are wonderful measuring devices. Congrats, you just voiced your speaker to room interface without having a PhD in acoustics or breaking out the measuring equipment. Remember this is a starting point or baseline in order to continue the calibration process.
Now that you have located the sweet spot, make the measurements exact using the tape measure to form that equilateral triangle. Take the time to get it within a ¼” tolerance.
I am a bit reluctant to get into room treatments and acoustics until we take some measurements. The reality is that you have the listening room you have. You could work out the room modes with a room mode calculator like http://www.mcsquared.com/metricmodes.htm and you would do well to read the reference links at the bottom of: http://en.wikipedia.org/wiki/Resonant_room_modes You could also check to see if your room falls into the golden room ratios that are in the slides I referenced earlier. You will notice in the article that there are different types of rooms from the LEDE to RFZ to ESS. My own room falls into the latter ESS category. You can also look at this speaker set up guide: http://www.cardas.com/pdf/roomsetup.pdf
We are at a point where we have a best effort speaker to room setup, calibrated to a well-known standard (i.e. equilateral triangle). Now we have a baseline from which we can start taking frequency response measurements. In my next post, I will start taking measurements of this setup and we will see how closely I voiced my speaker setup in the bass frequencies – remember they should be as evenly distributed as possible. I will get into a bit of room acoustics and basic room treatments if the measurements warrant it.
Mitch
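If you would rather see what a room mode calculator does under the hood, here is a minimal Python sketch for the axial modes only, using f = n × c / (2 × L) for each room dimension L. The speed of sound value and the function name are my assumptions, and the dimensions are the approximate ones of my own room. Tangential and oblique modes are left out to keep it short.

```python
SPEED_OF_SOUND_FT_PER_S = 1125.0  # approximate value for air near room temperature


def axial_modes_hz(dimension_ft, count=4):
    """First few axial mode frequencies (Hz) between two parallel surfaces."""
    return [n * SPEED_OF_SOUND_FT_PER_S / (2.0 * dimension_ft) for n in range(1, count + 1)]


if __name__ == "__main__":
    room = {"length": 30.0, "width": 16.0, "height": 8.0}  # feet, approximate
    for name, dimension_ft in room.items():
        modes = ", ".join(f"{f:.1f}" for f in axial_modes_hz(dimension_ft))
        print(f"{name:>6} ({dimension_ft:.0f} ft): {modes} Hz")
```

Modes from different dimensions that land on or near the same frequency tend to reinforce each other, which is one reason room ratios matter.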
I love music, any kind of music really. As a former recording/mixing engineer/producer for 8 years, and lifetime audio freak, I had the privilege to record, mix, and master a wide variety of music. In this introductory post, we will look at the most important quality in reproducing music: "timbre". Over a series of posts, the goal is to calibrate your sound system to be the most accurate reproducer of music for your ultimate listening pleasure :-)
In Wikipedia’s definition of timbre, you will see, aside from the technical definition, “In psychoacoustics, timbre is also called tone quality and tone color.” Tone quality is critically important in the reproduction of recorded music.
If you have ever heard live music, (e.g. piano, acoustic guitar, horns, strings, drums, etc.) then you may remember how it sounded. You may also remember when you went home and listened to something similar on your stereo that it did not have the same “tone quality”. Why?
Well, it so happens that another group of folks were also wondering this and produced this outstanding short article on, “Relevant loudspeaker tests in studios in Hi-Fi dealers' demo rooms in the home etc.” Of very particular importance is the frequency response curve in Figure 5. We will come back to that a bit later.
From the article abstract, “The "sound" of a Hi-Fi set is to a great extent room dependent. Very often, the final result is determined by the room rather than by the actual equipment. Fortunately, these influences may readily be measured.”
What the article is describing is musical timbre or tone quality. Unfortunately, the reality is that the tone quality reproduced by your sound system is highly dependent on your listening room. Before becoming a recording engineer, I was in the electronics engineering world and, as a hobby, built a great many speakers, amplifiers and preamps (still do). I also got into room acoustics and managed to get my hands on this wonderful device that revolutionized audio measurement techniques.
TEF stands for time, energy and frequency. Very quickly you could analyze a room in 3D and determine the room’s “tonal quality” for sound reproduction. Based on that, you could treat the room with “Tube Traps” for bass frequency tuning, absorption materials for damping overly live rooms, and “diffuser panels” to prevent slap echoes without over-damping the room. I bought every possible book on recording and control room design and room tuning. I will provide a resource list later for those interested.
I had the privilege to observe Chips Davis design and build two multi-million dollar recording studios and control rooms from scratch using his famous Live End Dead End (LEDE) room design. I then went on to “treat” several recording studios, control rooms, critical listening rooms at audio dealers, and several private critical listening rooms using the TEF computer and lessons I learned from Chips plus the reference books.
My point in saying all of this is to pass on what I have learned, to benefit you in your quest for the most tonally accurate sound reproduction system you can achieve with your existing equipment. No, I am not going to suggest you rip up your room or spend thousands or tens of thousands of dollars on acoustical measurement equipment and room treatments. What I am suggesting is that with a few key considerations, and a few bucks, you can make dramatic improvements to the tonal quality of your existing sound system.
Let’s get back to timbre and that B&K article, specifically Figure 5, “Optimum curve for hi fi equipment measured in the actual listening room.” Figure 5 is the key to tonal quality. That curve is the frequency response measured at the listening position. If your sound system measures close to this curve, especially the roll-off, then congratulations, you have achieved tonal perfection! Once you have heard a sound system that is calibrated to this curve, then you will understand exactly what I mean. Everything sounds “right” and all of a sudden the depth of the soundstage magically appears.
There is good reason for this curve, affectionately called the B&K house curve. In the recording studio world, in the control room, there will almost always be a set of speakers that are tuned or calibrated to the B&K house curve. Why? Because it most accurately reproduces instruments so that they sound tonally correct, i.e. it has the best timbre. Additionally, when mixing engineers move from one studio to the next and listen to their mix downs, with this curve, the mixes will have the same tone quality they had in the previous studio. Consistency is the key.
My wife, who is not an audiophile and puts up with my tape measures and swept sine waves, once commented, “I was in the garage and even there it sounded like someone is playing the piano in our living room.” That is near perfect timbre.
So the first step in understanding whether your sound system is tonally correct, or at least as good as it can be, is to measure the frequency response at the listening position in your listening room and compare it to the B&K house curve. In my next post, I will show you how to do that without breaking the bank.