Moses

another shot at double-blind testing

22 posts in this topic

I’m a long-time reader of CA, and I’ve been thinking about double-blind testing since I first read some of the truly bewildering long threads here on the subject. As my first real post here I would like to have another crack at it; it keeps coming up. Most of what’s said in favor of DBT seems to me quite naïve, though I may be missing something. (I’m a philosophy professor with an interest in experimental methodology, especially in psychology and psychiatry; I say that not to try to establish credibility (it doesn’t) but just to indicate where I’m coming from.)

 

(1) DBT is often said to be the mark of a scientific approach. In fact, there are only very special cases in science where it’s applicable at all, even in medical science. Consider the question whether it’s better to treat a psychiatric illness in a hospital or at home. Is there any way you’re going to be able to design a DBT here? The drug case, where you can have a placebo that seems just like the test drug, is quite special. The very mechanics of setting up a DBT in audio create an environment that is quite different to the usual listening context. So there aren’t going to be any obvious implications of the DBT for what’s discernible in the usual listening context.

 

(2) Outcome measures. The usual outcome measure chosen for DBT in audio is verbal report: ‘Can you say which component you’re listening to? Can you reliably say which component sounds better?’. Someone said: “I had two amplifiers and I had my friend come round so I could compare them blind. After he switched them around for a while I said, I can’t tell the difference. He said: ‘Well, every time you were listening to amp A you listened to the song all the way through. Every time you listened to amp B you signaled to change amp after a minute or so.’” Audio people are usually interested in some outcome like ‘emotional engagement with the music’. It’s entirely possible that at the level of verbal report, there’s no immediate difference in your reaction to two components, even though at some non-verbal level, such as emotional engagement, there is a big difference in your perceptual reaction. Perceptual psychology in the last thirty years has been all about the big differences in perception that can be invisible at the level of verbal report (you can't say what's different about two scenes you see, you can't even say that there is a difference) yet unmistakably show up on implicit measures (your emotional reactions to the two scenes are nonetheless quite different, even if you don't explicitly acknowledge it). The double-blind tests that are done on audio components usually treat verbal report as the outcome measure. To do double-blind testing that was relevant to the audiophile’s concerns you would have to look at outcomes that are much more difficult to measure: emotional engagement, ability to distinguish (non-verbally) different elements in a complex piece of music, and so on. It’s not impossible to measure those things, but it’s not easy either, and I’ve never seen it done at all.

 

(3) Perceptual learning. Suppose you’re getting interested in fine art, and you’re offered a choice between an original Rembrandt for your office wall, or an excellent copy. (You can’t sell either.) You can’t tell the difference between them. It’s still not irrational for you to prefer to have the original, because the differences you can’t see now may become available to you in time. That might be optimistic, but it’s not crazy. Similarly, you may in DBT not be able to tell the difference between a high-specification component and a low-specification component now. It’s still not irrational to prefer the high-specification component, because the differences may become perceptible by you in time. That may be optimistic, but it’s not crazy.

 

(4) Fear of placebo effects. The idea is that if you rely, in your purchasing, on the results of double-blind tests, that will provide you with some protection against placebo effects. But of course they provide you with no protection at all. Suppose that your hearing is duller than most people’s. But you read that most people, in DBT, prefer A to B. So you try out A and B, and lo, you prefer A to B. How do you know that isn’t a placebo effect? Or, going the other way, suppose your hearing is actually more acute than most people’s. And you read that in DBT, most people can’t tell the difference between A and B. So you try out A and B, and sure enough, you can’t tell any difference. DBT in this area isn’t a guarantee of freedom from placebo effects, it’s just a fresh source of placebo effects. Of course, in principle there are ways in which you could start to sort this out, but it would require extensive cross-testing that goes way beyond anything that anyone in their senses would bother with for an audio component. Of course, you could do the DBT with yourself as the only subject, but by the time you have done it often enough for the test to have statistical significance you are going to be so far away from the usual listening context that you are going to have a hard time figuring out whether there are any implications from your trials for what goes on in the usual context.
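To put a rough number on "often enough", here is a minimal sketch of how many trials a single listener would need. It assumes a forced-choice test where guessing is right half the time, a one-sided 5% significance threshold, and an 80% chance of passing; those thresholds and the function names are illustrative assumptions, not anything established in this thread.

```python
from math import comb

def binom_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def min_correct(n, alpha=0.05):
    """Smallest number of correct answers out of n that pure guessing
    (p = 0.5) reaches with probability at most alpha.
    Returns None if even a perfect score is not significant."""
    for k in range(n + 1):
        if binom_tail(n, k, 0.5) <= alpha:
            return k
    return None

def trials_needed(true_rate, alpha=0.05, power=0.8, max_n=500):
    """Smallest n at which a listener whose real hit rate is true_rate
    passes the test with probability at least `power` (assumed targets)."""
    for n in range(1, max_n + 1):
        k = min_correct(n, alpha)
        if k is not None and binom_tail(n, k, true_rate) >= power:
            return n
    return None

if __name__ == "__main__":
    for rate in (0.9, 0.8, 0.7):
        print(rate, trials_needed(rate))
```

The weaker the real effect (a hit rate closer to 0.5), the faster the required trial count grows, which is the practical side of the worry about drifting away from the usual listening context.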

 

My own conclusion is that there’s a lot to be said for an approach that involves listening to a component in a stress-free environment for a week or so and seeing if you get on with it. Or, if that isn’t possible, reading the views of other people who have managed to do that, and who seem to be unusually skilled at articulating their responses. Of course, neither of these is foolproof either. YMMV.


Thanks for the post Moses.

"As my first real post here I would like to have another crack at it; it keeps coming up ...."

It will be interesting to see if anybody follows it - and who.

 

Some points you make are worth extra emphasis...

"DBT is often said to be the mark of a scientific approach. In fact, there are only very special cases in science where it’s applicable at all ..."

Well said!

 

"... but it would require extensive cross-testing that goes way beyond anything that anyone in their senses would bother with for an audio component."

Which is why you hear a never-ending stream of calls for X, Y, and Z to be verified by DBT, yet almost never anybody who comes out and says they've actually done it!

 

"It’s still not irrational ... [it] may be optimistic, but it’s not crazy."

It is just too easy to pontificate with "show me the (scientific) proof or else I'm calling it crazy". I hope this thread develops into a resource to which such posters can be directed.


Yo Moses.

 

Excellent, thoughtful post.

 

Implicitly, the difference you give in your examples (drug trial vs psychiatric care environment) could also be seen as one of complexity.

 

The drug trial is a simple yes/no binary decision, whereas the home treatment vs. institutional treatment has thousands of individual (not necessarily even binary) components, and you are looking at a composite.

 

The psychiatric analogy however is strangely appropriate for present purposes, for a variety of reasons.

 

An A/B/X double-blind test is best-suited for determining if there exist audible differences within one elementary component of an otherwise identical system. Let's take computer power supplies, for example. It is best applied in situations where the null hypothesis is that no difference exists, so the burden of proof is on the person asserting the difference.

 

Get two identical laptops. Plug both into your system. One is battery-powered and one is mains-powered. Toggle blindly (or double-blindly, i.e., having someone else setting up and doing it might help). Then pretty much all of your objections have been removed.
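A minimal sketch of how such a toggle test could be scheduled and scored, assuming an ABX-style protocol in which the helper picks the hidden source at random on each trial and the listener names it. The function names are hypothetical, and the choice of a one-sided binomial test against guessing is an assumption rather than part of the setup described above.

```python
import random
from math import comb

def abx_schedule(n_trials, seed=None):
    """Randomly pick the hidden source ('A' or 'B') for each trial.
    The helper keeps this list; the listener never sees it."""
    rng = random.Random(seed)
    return [rng.choice("AB") for _ in range(n_trials)]

def p_value(n_correct, n_trials):
    """One-sided probability of at least n_correct hits by pure guessing."""
    hits = sum(comb(n_trials, i) for i in range(n_correct, n_trials + 1))
    return hits / 2**n_trials

def score(schedule, answers):
    """Compare the listener's answers with the hidden schedule."""
    if len(schedule) != len(answers):
        raise ValueError("one answer is needed per trial")
    correct = sum(s == a for s, a in zip(schedule, answers))
    return correct, p_value(correct, len(schedule))

if __name__ == "__main__":
    schedule = abx_schedule(16, seed=1)
    answers = list(schedule)  # stand-in for a listener who is always right
    print(score(schedule, answers))
```

A small p-value says only that guessing is an unlikely explanation for the score; it does not settle which outcome measure matters.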

 

It may be that subtle differences emerge only after additional tweaking, but nothing prevents you from testing again once these differences appear to have manifested themselves.

 

Do you happen to know Steve Laurence?

Edited by wgscott

 

(1) DBT is often said to be the mark of a scientific approach. In fact, there are only very special cases in science where it’s applicable at all, even in medical science. Consider the question whether it’s better to treat a psychiatric illness in a hospital or at home. Is there any way you’re going to be able to design a DBT here? The drug case, where you can have a placebo that seems just like the test drug, is quite special. The very mechanics of setting up a DBT in audio create an environment that is quite different to the usual listening context. So there aren’t going to be any obvious implications of the DBT for what’s discernible in the usual listening context.

 

While I don't know of it being done, one could always set up listening sessions that are totally conventional, and record the reactions of listeners. One could switch between components or source files remotely without telling the listener it is happening, then observe for differences. I'm not sure how one could detect those differences until you can remotely monitor emotions, maybe by monitoring blood pressure, heart rate, etc. That would still require a somewhat different context from normal listening, but might be closer. Say, switching between high- and low-res files while monitoring listeners, and not telling them a switch has happened. Again, I don't know that this is done much, but in principle it could be, to counter your issues with how it is conventionally done. Then one could compare the discriminating ability of that versus the usual method.

 

(2) Outcome measures. The usual outcome measure chosen for DBT in audio is verbal report: ‘Can you say which component you’re listening to? Can you reliably say which component sounds better?’. Someone said: “I had two amplifiers and I had my friend come round so I could compare them blind. After he switched them around for a while I said, I can’t tell the difference. He said: ‘Well, every time you were listening to amp A you listened to the song all the way through. Every time you listened to amp B you signaled to change amp after a minute or so.’” Audio people are usually interested in some outcome like ‘emotional engagement with the music’. It’s entirely possible that at the level of verbal report, there’s no immediate difference in your reaction to two components, even though at some non-verbal level, such as emotional engagement, there is a big difference in your perceptual reaction. Perceptual psychology in the last thirty years has been all about the big differences in perception that can be invisible at the level of verbal report (you can't say what's different about two scenes you see, you can't even say that there is a difference) yet unmistakably show up on implicit measures (your emotional reactions to the two scenes are nonetheless quite different, even if you don't explicitly acknowledge it). The double-blind tests that are done on audio components usually treat verbal report as the outcome measure. To do double-blind testing that was relevant to the audiophile’s concerns you would have to look at outcomes that are much more difficult to measure: emotional engagement, ability to distinguish (non-verbally) different elements in a complex piece of music, and so on. It’s not impossible to measure those things, but it’s not easy either, and I’ve never seen it done at all.

 

Same idea as above for handling these questions.

 

(3) Perceptual learning. Suppose you’re getting interested in fine art, and you’re offered a choice between an original Rembrandt for your office wall, or an excellent copy. (You can’t sell either.) You can’t tell the difference between them. It’s still not irrational for you to prefer to have the original, because the differences you can’t see now may become available to you in time. That might be optimistic, but it’s not crazy. Similarly, you may in DBT not be able to tell the difference between a high-specification component and a low-specification component now. It’s still not irrational to prefer the high-specification component, because the differences may become perceptible by you in time. That may be optimistic, but it’s not crazy.

 

Well, no one is giving you a Rembrandt for free. Let's say you are offered a very good copy for $10,000, which you can afford. Or you are offered the real deal for $3,000,000, which you can just barely afford (and you must be pretty rich for that to be the case). Now does it make sense to spend the extra dough just in case, over time, you become attuned to the differences, while currently you cannot tell any difference? And for all anyone knows the copy may be good enough that no one, not even Rembrandt himself, could distinguish between them. That does start to sound a bit less than wise, if not crazy. Though if you can afford it, such is your choice at any time, of course.

 

(4) Fear of placebo effects. The idea is that if you rely, in your purchasing, on the results of double-blind tests, that will provide you with some protection against placebo effects. But of course they provide you with no protection at all. Suppose that your hearing is duller than most people’s. But you read that most people, in DBT, prefer A to B. So you try out A and B, and lo, you prefer A to B. How do you know that isn’t a placebo effect? Or, going the other way, suppose your hearing is actually more acute than most people’s. And you read that in DBT, most people can’t tell the difference between A and B. So you try out A and B, and sure enough, you can’t tell any difference. DBT in this area isn’t a guarantee of freedom from placebo effects, it’s just a fresh source of placebo effects. Of course, in principle there are ways in which you could start to sort this out, but it would require extensive cross-testing that goes way beyond anything that anyone in their senses would bother with for an audio component. Of course, you could do the DBT with yourself as the only subject, but by the time you have done it often enough for the test to have statistical significance you are going to be so far away from the usual listening context that you are going to have a hard time figuring out whether there are any implications from your trials for what goes on in the usual context.

 

This seems to me to simply come down to the need to figure out the level of accuracy and usefulness of DBTs, or to alter the methodology of the blind testing that is done. There are variations beyond ABX, which I assume you are familiar with. Otherwise one can only say any testing or measuring is simply a new source of placebo. This is something I would have to reject in principle, because you end up saying you don't know anything about anything. Could seeing that two amps have identical frequency responses etc., and that they null out to infinity, mean that hearing them as the same is placebo? I don't see that it does. I see that one is saying your knowledge of that might affect your tendency to hear them as the same and believe it, but since they are in fact the same, that isn't placebo. Now the reverse, seeing them measure differently and hearing them sound different, might or might not be placebo. If the difference is below audibility, yet you let that knowledge color your perception and hear them as different, it is placebo. If instead you perceive them as the same, then it would be an accurate perception.
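For anyone unfamiliar with what "nulling out" means in practice, here is a minimal sketch of a null test: subtract one capture from the other and report how far the residual sits below the original. It assumes the two captures are already time-aligned, level-matched, and of equal length, and that samples are plain lists of floats; `null_depth_db` is a hypothetical helper name.

```python
import math

def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))

def null_depth_db(a, b):
    """Level of the difference signal (a - b) relative to a, in dB.
    More negative means a deeper null; -inf means sample-identical."""
    if len(a) != len(b):
        raise ValueError("captures must have the same length")
    reference = rms(a)
    if reference == 0:
        raise ValueError("reference capture is silent")
    residual = rms([x - y for x, y in zip(a, b)])
    if residual == 0:
        return float("-inf")
    return 20 * math.log10(residual / reference)

if __name__ == "__main__":
    # Synthetic 1 kHz tone at 48 kHz standing in for a real capture
    tone = [math.sin(2 * math.pi * 1000 * n / 48000) for n in range(4800)]
    louder = [1.01 * x for x in tone]  # same tone with a 1% gain error
    print(null_depth_db(tone, tone))
    print(null_depth_db(tone, louder))
```

Real captures never null to infinity because of noise and clock drift, so the interesting question is how deep the null is and whether the residual is plausibly audible.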

 

My own conclusion is that there’s a lot to be said for an approach that involves listening to a component in a stress-free environment for a week or so and seeing if you get on with it. Or, if that isn’t possible, reading the views of other people who have managed to do that, and who seem to be unusually skilled at articulating their responses. Of course, neither of these is foolproof either. YMMV.

 

Except that rapid switching between reference and signal under test seems to let one circumscribe the very limits of aural perception more finely than extended listening. Unless you can monitor emotional long term response and know it is from sound quality differences and not other extraneous factors I don't see how you can know long term is better.


Thanks for those reassuring comments, much relieved!

 

@wgscott:

 

Get two identical laptops. Plug both into your system. One is battery-powered and one is mains-powered. Toggle blindly (or double-blindly, i.e., having someone else setting up and doing it might help). Then pretty much all of your objections have been removed.

 

This doesn't specify an outcome measure. (In the case of DBT for a drug, for example, what patients say isn't relevant. The outcome measure is, e.g., whether they recover from the illness or not. What's the right outcome measure in the case of an audio test?)

 

Again, thanks!

 

P.S. I know of Steve and have literally met him once, years ago, we are maybe one degree of separation apart.


Except that rapid switching between reference and signal under test seems to let one circumscribe the very limits of aural perception more finely than extended listening. Unless you can monitor emotional long term response and know it is from sound quality differences and not other extraneous factors I don't see how you can know long term is better.

 

But aren't our brains trained to minimise the effects of changing environments when we are trying to concentrate on a specific aspect of what is going on? For instance, if I am in a cave chatting with a friend about the latest sabre tooth tiger attacks and we walk out of the cave still chatting.. And then when a sabre tooth tiger hurtles towards us at 40 kph, it is more important to not be distracted by the changes in acoustics and instead heed the warning from my friend about the tiger even though the acoustics of the sound of his speech have changed quite a lot. So I think our brains put a lot of effort into minimising perceptible changes when sounds, colours or levels of darkness change because that is a good idea from a survivability point of view. Double blind tests assume that we can detect short term changes really well, when I see no reason why we would have evolved to do that well. And that doesn't mean there is a contradiction in how we might have also evolved to be very good at learning how to make fine discriminations in the long term.

For instance, if I am in a cave chatting with a friend about the latest sabre tooth tiger attacks and we walk out of the cave still chatting.. And then when a sabre tooth tiger hurtles towards us at 40 kph, it is more important to not be distracted by the changes in acoustics and instead heed the warning from my friend about the tiger even though the acoustics of the sound of his speech have changed quite a lot. So I think our brains put a lot of effort into minimising perceptible changes when sounds, colours or levels of darkness change because that is a good idea from a survivability point of view.

 

I think most research points to exactly the opposite. Our brains are very good at making the "steady state" fade away from our focus, and instead concentrating on short-term changes/deviations. There was a huge survival value in tuning out the "ordinary" savannah/jungle/forest sounds, and being very sensitive to changes / new sounds appearing suddenly.


Well, look here : http://www.computeraudiophile.com/threads/11243-Can-you-participate-to-a-listening-test?p=152336&viewfull=1#post152336

 

And as I said just today in some other thread : listening should be without consciousness (in my view).

 

On a regular basis I have people over to audition "A" and "B", and what I always do is

 

a. try to notice what is happening to myself;

b. see whether the other person exhibits similar.

 

Ad b.

Could be some frowning, looking around, a small foot-tap, itching, a smile, eyes closed, anything - and everything with the opposite of course.

 

Then later, I could ask the person "didn't you notice at track X that you got annoyed by something ?" or anything applicable. Next I will be able to tell about the correlation with similarities in other tracks, but more explicitly present there. I will explain what happened in track X.

 

This is not exactly DBT'ing, but in my view similar and better;

This does not compare equal songs/tracks, but it does compare the underlying elements. The means to dig them up are not necessarily about those elements themselves, but are more at the global level. Example :

 

When cymbals sound too "rubber" for your liking, it may be a subjective idea from yourself. But what if track X, Y, Z all sound too rubber ? Then it must be the equipment (whatever is under test). So, in my view all is about finding similarities in your equipment under test, and that doesn't work much when track X is compared to track X. It does work when long term listening is performed with many albums and the most various types of music. You said "a week"; I always take 5 days, and (very) occasionally that proves to be not enough.

 

Peter

 

PS: One of the best posts I read about the subjects. Thanks !

So, in my view all is about finding similarities in your equipment under test, and that doesn't work much when track X is compared to track X. It does work when long term listening is performed with many albums and the most various types of music. You said "a week"; I always take 5 days, and (very) occasionally that proves to be not enough.

 

esldude provided a link to a report of DBT where some distortion was added to one of the signals. There, comparisons over long periods seemed less effective than rapid switching at detecting a difference.

 

However, I have just recently had the chance to listen to the Jordi Savall-conducted Brandenburg Concertos on DSD vs. Redbook - or rather: in the case of DSD, Audirvana+ did on-the-fly conversion of DSD to PCM at 24/88.2 (my DAC does not accept 176.4 USB input), while for Redbook it did on-the-fly upsampling to 24/192. There were, as you might imagine, multiple differences in sound between the two versions. One important difference was that loudness levels for the two formats were not equal, so there was some fiddling around to get roughly equal levels. It took a fair amount of listening to determine which one I felt was closer overall to "right," to the sound of the instruments and ensemble in a live performance. (Turned out to be the DSD.)
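As a rough illustration of the level-matching step, the sketch below computes the gain needed to bring one excerpt to the same RMS level as another. It assumes both excerpts are available as lists of float samples covering the same passage; the helper names are hypothetical, and RMS is only a crude stand-in for perceived loudness.

```python
import math

def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))

def matching_gain_db(reference, other):
    """Gain in dB to apply to `other` so its RMS level equals `reference`."""
    ref_level, other_level = rms(reference), rms(other)
    if ref_level == 0 or other_level == 0:
        raise ValueError("cannot match levels against silence")
    return 20 * math.log10(ref_level / other_level)

def apply_gain(samples, gain_db):
    """Scale samples by a gain given in dB."""
    factor = 10 ** (gain_db / 20)
    return [x * factor for x in samples]

if __name__ == "__main__":
    # Synthetic 440 Hz tone standing in for one version of the recording
    version_a = [math.sin(2 * math.pi * 440 * n / 44100) for n in range(4410)]
    version_b = [0.5 * x for x in version_a]  # the same excerpt, quieter
    gain = matching_gain_db(version_a, version_b)
    matched = apply_gain(version_b, gain)
    print(round(gain, 2), rms(version_a), rms(matched))
```

Even a small leftover level difference can bias a comparison, so it is worth measuring the offset rather than relying on setting it by ear.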

 

So where files are identical in sound except for one specific thing - a bit of added distortion, for example - I think the difference may be more amenable to identification by rapid switching than if there are many differences and the question is which one overall is giving a clearer, more natural window onto the performers.


You are talking about the source material now. This looks like a quite different subject to me. But still ... If you'd take Hires vs Redbook as the (your) example, it still won't come out that Hires is generally flawed when only one album is compared.

Oh, Now I'm changing the subject *again*.

 

Anyway, of course. We could try to apply that distortion to 100s of tracks and see whether it's audible, but the reason is already beyond me. Unless "distortion" is meant to be a possible improvement of course. But now I'm fairly sure it needs those 100s of tracks again, because whatever it is may not work out the same for all music.

Right ?

 

Oh well.

:)

Thanks for those reassuring comments, much relieved!

 

@wgscott:

 

 

 

This doesn't specify an outcome measure. (In the case of DBT for a drug, for example, what patients say isn't relevant. The outcome measure is, e.g., whether they recover from the illness or not. What's the right outcome measure in the case of an audio test?)

 

Sorry I dropped the ball. Kid's b-day party interfered.

 

In the audio test, I am assuming it is just audible vs. inaudible.

 

I grant you that is weaker than some objectively quantifiable phenomenon, like 5 year survival rate.

 

You raise the same point I tried to make once when I first posted at hydrogen audio, so I am sympathetic (they were not). As a scientist, I have almost never had occasion to make use of double-blind tests or anything like that.

 

In a way, it is a bandaid for situations where you cannot make a better measurement, or are simply trying to decide whether a phenomenon really exists or not, where there is no clear hypothesis to test (except for the null hypothesis). DBT is useful when it is the only way forward, but in my experience it almost never is the only way forward.

 

To put it into Popper-like terms (which used to irritate the shit out of Steve), not only is the burden of testing placed on someone asserting the existence of an a priori unlikely phenomenon, but there is also a burden to state explicitly under what conditions that person would be willing to accept that their hypothesis has been refuted.

 

I think the main problem is that there almost never is that required prior agreement on what constitutes experimental refutation. It's not like medicine, where you have the luxury of being able to conduct a body count in the county morgue.


Nice post! I encourage you to read some of the research white papers co-written by Floyd Toole and Sean Olive on the subject. A couple of the key factors they've found for successful DBT of audio components are that you have a fool-proof system set up to quickly compare given components in a consistent environment, and that you have trained listeners (with good hearing) who CAN discern the slight differences that most of us might not detect.

 

Their work on the topic while both were at Harman is among the best out there presently, IMO. Floyd's moved on from Harman, but Sean continues to carry the torch in his research.

 

Bill


@wgscott

I don't think we disagree on much, in fact I think, though I couldn't swear to it, that the example I gave about listening to songs all the way through on amp A vs. only for a minute on amp B, may have come from someone in your Hydrogen Audio thread.

 

In the audio test, I am assuming it is just audible vs. inaudible.

 

How do you measure 'audible vs. inaudible'? Being able to say, 'Now it's there, now it isn't', is one way. Another would be to ask the subject to tap a finger, or blink an eye, if they hear it. Another way would be to look at galvanic skin responses, or other measures of physiological arousal, as esldude suggested above. The trouble is that it's turned out all these measures will give you different answers as to what's audible, there just isn't a single answer to the question 'what's audible?'. So which one matters for the music listener? Is it any of these, or something else?

 

When people talk about DBT, they talk as if they know perfectly well what 'the test' is, only they want to be sure it's blind. But what's the test?

 

I am finding this a very illuminating discussion, many thanks.

John

 

I don't think I disagree with you much either John. Blind testing is very much a case of not having better ways to move forward. It is useful and has provided plenty of good insight into what is heard and what isn't.

 

You may have some knowledge that gives you a good opinion. I have thought that in some ways two-alternative forced choice is a better method than typical ABX. You do need to be picking between something. Just which is better would work, I suppose. My guess is that picking which of two tracks has better bass or better space or some other factor might be better still. It would seem to be closer to how audiophiles listen when they are comparing equipment at home, for instance. It would also be less stressful in an unfamiliar way, because you know for certain that the two tracks differ in some way. You are just trying to see if that difference is audible and which you prefer.

 

What bugs me is subjectively thinking you hear something or have people claim it (often with bizarre theories as to why, though sometimes not so far fetched) while measuring the signal electrically at pretty high precision and finding no difference, then maybe someone tests it blind and finds no difference. You cannot prove a negative, but neither do negative results prove the reverse. This also doesn't preclude something unknown going on, but it seems unwise and inefficient to always look for unknowns when several methods can't come up with something. I would think in your field you know more about such things than I do, though.

What bugs me is subjectively thinking you hear something or have people claim it...while measuring the signal electrically at pretty high precision and finding no difference, then maybe someone tests it blind and finds no difference.

 

I would conceive of bounds regarding valid DBT consisting of your example above on one end (where a negative response on DBT would reinforce other data, or a positive response on a DBT in the face of no measurables might send us on a quest for new measurables), and the other end being the limit of sensitivity when testing for audibility of something measurable, such as Moses' example of galvanic skin response being positive while conscious verbal response indicates nothing.

What bugs me is subjectively thinking you hear something or have people claim it (often with bizarre theories as to why, though sometimes not so far fetched) while measuring the signal electrically at pretty high precision and finding no difference

 

This is because we don't measure "right".

 

Please remember, I can measure 100% everything which *is* different, and as far as I can tell I am the only one at this moment.

If that measurement doesn't show a thing, there just is nothing (different).

If a double knot in your interlinks makes you perceive an audible difference, my measurement *will* show that difference.

 

There is no, NO single way I can show what I can see with my own measurements using any of the common means.

 

And no, my measurements do not show you an absolute quality figure. Too bad. ;)

When people talk about DBT, they talk as if they know perfectly well what 'the test' is, only they want to be sure it's blind. But what's the test?

 

In the audio test, I am assuming it is just audible vs. inaudible.

 

Then better think about this one :

 

Nothing is more easy to "create" more detail. I'm talking about the detail that makes you perceive spitting in microphones, clicking tongues and that sort of thing. Never (never !) I found that this kind of detail worked out for the better, net in the end.

 

Now you guys go out on a DBT mission. What will you choose as the better one?

 

It doesn't work like this.

What does work is having the reference at hand. And next just listen to whatever is under test.

That you need this auditory (long term) memory is another thing.

 

I know, this isn't exactly a proposal for a better DBT setup; only an indication that results may be useless anyway, when indeed it is not known what to look for (for the better). So only 2c here.

You are talking about the source material now. This looks like a quite different subject to me. But still ... If you'd take Hires vs Redbook as the (your) example, it still won't come out that Hires is generally flawed when only one album is compared.

Oh, now I'm changing the subject *again*.

 

Anyway, of course. We could try to apply that distortion to 100s of tracks and see whether it's audible, but the reason is already beyond me. Unless "distortion" is meant to be a possible improvement of course. But now I'm fairly sure it needs those 100s of tracks again, because whatever it is may not work out the same for all music.

Right ?

 

Oh well.

:)

 

(Momentarily stopping back at esldude's example to clarify - the study added in a small amount of some sort of distortion that's commonly part of specifications for audio equipment, if I remember correctly, and asked people which of two tracks, identical except for the added distortion, sounded better to them. A significantly greater percentage of people who did rapid switching selected the non-distorted track as better vs. those who did long term listening. esldude, if I've screwed this up or left out something important, correct me.)

 

This is very interesting stuff - not just whether you can hear something, but what to listen for. I don't have the expertise to pick out certain things as indicative of particular kinds of defects, so I listen for what sounds more like live performance to me.

 

In the case of the Savall, I had a later, possibly better recorded set with Savall's orchestra performing music from the era of Louis XIII. That had a fair amount of brass and drums. From that, I gained the impression I already mentioned, that (at first subconsciously) I was reluctant to make the Redbook track loud, while I had no such problem with the converted DSD.

 

Then I listened to the Brandenburgs. Strangely, because I don't usually use them as a reference, the strings in the second (slow) movement of the sixth Concerto were what settled it for me. First, I noticed that I heard more "body" from the strings. With the smaller/higher strings especially, I heard not just the notes being played on the strings, but the resonance from the body of the instrument more as it sounds in live performance. I heard a less uniform sound, more of the intonation from the bowing, communicating more emotion. And what finally made up my mind was a tour-de-force cadenza by one of the violins in the middle of the movement that was a little bit lost amid the background with the Redbook, but seemed more highlighted (I felt appropriately) in the DSD conversion. (I listened many times, and with the DSD, despite the highlighting of the violin, the interplay of the background instruments was not lost; if anything, it was clearer than with the Redbook.)

 

Now it is quite possible I could be wrong, and in the performance that was recorded the violin was meant to be just part of the ensemble at that point; and perhaps the live sound had more "strings" (though with less audible information about the bowing?) and less "body" to all the stringed instruments. But this is not my memory of sound of the string sections of live orchestras, nor of the relative volume of a player in the midst of a scintillating cadenza vs. the rest of the ensemble. So that is what I base my thinking on for the moment. It is of course possible that if I listen to enough DSD and/or hi-res of orchestral or other music vs. Redbook, I will hear what you are describing, Peter, about how hi-res is flawed. So far, I doubt I have a large enough sample of hi-res to make that judgment. And no doubt the differences in player software and DAC have something to do with it as well.


Thanks for the interesting post Moses.

 

I don’t believe I have ever written any comments or opinions on DBT, but it is a subject I have thought about on a number of occasions. So, here are some thoughts and information I have absorbed over the past 4 years.

 

Recent research in the field of neuroscience and how music affects the brain has inspired a few psychologists to challenge old theories and develop new theories on how humans listen to music. One would think that with our marvelously precise and complex abilities to hear, make, and enjoy music a DBT would be an easy task. However, in reality with this testing method there are just too many variables which affect the listener.

 

Back in the beginning, someone got curious and started to experiment with audible differences that listeners claimed to hear between audio sources, and it was proven that “sometimes” (not always) what was heard was the product of imagination. Additional research suggested the illusions of our imagination were strong, an experience shared by many listeners, and consistently associated with specific knowledge of the audio source. Therefore the solution was to adapt the old Double Blind ABX Test used in the medical field for listening. The purpose was to confirm that an audible difference is indeed caused by the audio sources, and not just by the listener's personal impressions. However, many times the double blind test failed due to one factor or another.
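For anyone wondering what "confirming an audible difference" amounts to in an ABX run, the usual yardstick is how likely the listener's score would be from pure guessing. Below is a minimal sketch of that arithmetic, not a description of any particular published test. The function name `abx_p_value` and the 16-trial example are my own assumptions for illustration.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """One-sided binomial p-value for an ABX run.

    Returns the probability of getting at least `correct` answers right
    out of `trials` when the listener is purely guessing (p = 0.5 per trial).
    """
    if trials <= 0 or not 0 <= correct <= trials:
        raise ValueError("need trials > 0 and 0 <= correct <= trials")
    # Count all outcomes with `correct` or more hits, divide by all outcomes.
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


if __name__ == "__main__":
    # Hypothetical 16-trial session, chosen only as an example length.
    for hits in (8, 12, 14, 16):
        print(f"{hits}/16 correct -> p = {abx_p_value(hits, 16):.4f}")
```

A small p-value only says that guessing is an unlikely explanation for the score. A large one does not prove the difference is inaudible, which is exactly where the arguments in this thread about test conditions, music selection and memory come in.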

 

One major factor rarely considered, and one of the most difficult elements with DBT is selecting the music. For example: if one selects a composition for which there is a strong emotional connection, the details one is listening for can be easily missed because the brain is relating to the music in a different manner. The brain actually produces autobiographical images and emotions. Another interesting factor is upon listening to some recorded music the brain will produce false memories.

 

At times it is almost impossible to separate emotions from the music because composers study the psychology of music as part of musical theory so they can apply known scales and key changes to evoke an emotion within the brain of the listener. Then there is the recording factor. Has the musical event been recorded live, or has the event been uniquely created in the studio, because there are studies that show the difference can affect the listener.

 

Our memory is a mental system that receives, stores, organizes, alters and recovers information from sensory input. Sensory memory, short-term memory and long-term memory are the three basic types. The music we hear first enters sensory memory, which holds an exact copy of the data for a few seconds. Short-term memory is the next step, and it holds small quantities of information for a brief period longer than sensory memory. Selective attention is utilized at this time to regulate what information is transferred to short-term memory. Unimportant sound information is removed permanently. Listening to the same music time and again or a musician practicing a piece for days on end can lead to long-term memory. This mental process of how we listen is referred to as the psychology of music which is mainly cognitive psychology – the study of mental processes including how people think, perceive, remember and learn, and part of the larger field of cognitive science which is related to other disciplines including neuroscience, philosophy and linguistics. A general definition would be the study of all processes by which the sensory input is transformed, reduced, elaborated, stored, recovered, and used.

 

Knowing how to listen to music can also be a major factor. If one does not know in advance exactly what to listen for, what good is any listening test? If one focuses on the sound quality of treble (using selective attention), one will not even notice anything occurring in the lower bass, because it will be quickly forgotten. If one is focused on the soundstage and placement of the instruments, the quality of sound will be ignored by our short term memory.

 

Time is also a factor. It takes time for sustained memory to develop. It is unknown how many musical details can be captured by memory in one brief listening session. Most of the research with listening has been focused on the voice and language. Research in the psychology and neuroscience of sound is just now beginning to explore how music affects the brain, and new theories are being developed that are unrelated to voice and language. There are so many factors involved within cognitive psychology and cognitive neuroscience like: acoustics, microsound, the ability to separate noise from music, and the list goes on. All influence how and why sound can affect the brain.

 

This brings me to the term “placebo effect” often used in any discussion of DBT. Please consider for one moment that a placebo is an inert dummy drug. The effect is a psychological state manifested over a period of time, like weeks or months, as real physical and emotional symptoms, and it originates as a direct result of the subject’s beliefs and expectations. Placebo refers to a beneficial, pleasant, or desirable consequence of taking the drug without knowing it is nothing more than an inert chemical. The opposite is the nocebo effect, where the subject is expecting negative consequences. Both effects may be physiological, behavioral, and/or emotional, but nonetheless cause real symptoms. So, somewhere along the line, somebody decided to connect cognitive memory and the sensory input of sound to the placebo/nocebo effect. However, the placebo effect does not originate in sensory memory; it occurs when the subject is “expecting” to hear a pleasant or unpleasant sound during the act of listening to music, before the music is actually received by the sensory receptors.

 

I cannot speak for anyone else, but I personally have never looked at an electronic component or a piece of wire with expectations it will sonically perform in a pleasant or unpleasant manner before actually listening to it and/or performing the act of comparing it to another known source. This may be because beliefs and expectations take place in a different area of the brain than the sensory processing of sound.

 

Nevertheless, many audiophiles claim the placebo effect occurs and can manifest in a matter of a few seconds. At times I would wonder how that is possible. When considering how our brain processes sound, I tend to believe it is simply a convenient excuse for a lack of knowledge, in that whatever one hears which cannot be immediately explained must naturally be a placebo effect. What is actually occurring is the listener expecting to hear certain sounds prior to the sound waves being generated, then receiving a euphoric feeling or self satisfaction when hearing the sound which confirmed their expectations or suspicions, even though the quality of expected sounds never existed.

 

Then I was thinking, perhaps this placebo effect is directly connected to expectations through ritual. Has anyone observed an audiophile in love with their vinyl record collection? There is a good amount of ritual involved. Vinyl is a delicate medium that requires care: the audiophile will arrange the albums in a personal manner, delicately remove the album from the slip case avoiding any fingerprints on the grooves, examine the album in the light before placing it on the turntable platter, brush out the grooves, and with ever so much care lower the tone arm. Just setting up the perfect turntable has its own rituals, technology, tweaks, and associated techno-babble. Audiophiles see their turntables as a delicate precision instrument and are willing to invest outrageous amounts in the purchase of one. The whole ritual generates a strong expectation of wonderful sound beyond any digital source.

 

The nocebo effect can also occur with expectation through ritual. Vinyl aficionados go through the ritual of recording hours on the tone arm cartridge, lubrication schedules, tube life hours, etc., all with the expectation of sound quality deterioration. Then after maintenance the return of the placebo effect.

 

With a CD there is little ritual except for removing the damn cellophane from the jewel case and slipping the CD into the drive bay. MP3 players have even less ritual.

 

With computer audio there is a great deal of ritual that perhaps we all don’t take notice of. The ritual of converting one’s computer to a dedicated music player includes the selection and installation of software, deleting and disabling applications no longer needed, deciding on a sound card or using fire wire or USB, choosing the cable to interface with a DAC, tweaking everything for weeks, always downloading music to the computer, keeping a daily watch for new HR downloads available, backing up hard drives, then the desire to upgrade for better sound quality. Just a brief list of rituals which can generate great expectations from computer playback of recorded music.

 

Again, the nocebo effect can easily occur with computer audio. Mostly I have noticed the nocebo occur with expectations through fear and anxiety. I have witnessed the effect right on this forum. One member would post doubts of bit perfect playback, only to stir fear in others that their system could be suffering from non bit perfect syndrome or the anxiety over the wrong USB cable. The anxiety that something is taking place on a circuit board or power supply that the subject cannot adequately measure or even see.

 

To illustrate how powerful a placebo effect can become, it is known to psychologically induce a slight sense of euphoria. This is a mental and emotional condition in which a person experiences intense feelings of well-being, elation, happiness, ecstasy, excitement and joy. Who does not like a boost to the euphoric feeling of listening to music?

 

Perhaps the placebo effect is far more pronounced through the ritual the individual performs due to their choice of recorded media, as opposed to walking into a room for a Double Blind Listening Test where other factors affect the listener.

The next question that comes to mind is can we avoid or limit the placebo effect by improving our listening skills? Is it possible to selectively concentrate on the sound quality of an audio system without being distracted by the factors that influence how we listen to music?

 

I believe it is possible to selectively limit psychological factors that influence our listening and focus on separate individual qualities of recorded sound. For some people this capability comes naturally, for recording engineers it becomes a routine task, for me it required doing a little research and a degree of practice. I started by reading recent books on the psychology of music and reviewing literature I had on how to listen to music. Then working on selective attention listening and while doing so attempting to limit any emotional response (which is quite difficult at first attempts).

 

I discovered that the distinctive and unique sound produced by a specific audio system can be stored in our long-term memory. However, there is one problem, the term and the amount of detail stored is currently unknown. There is data that fuels the theory musicians can retain very exacting memory of several instruments, but there are no long-term studies which can confirm the term of memory as the musician changes instruments over time. Also, I have not come across any research dedicated to the memory of sound quality of reproduced music except for some vague opinions offered by a few audio and recording engineers.

 

For myself, while learning the process there were a few listening surprises. First was hiring an acoustical engineer to evaluate my listening room and help me work out solutions. I was not prepared for the improvements immediately noticeable. (I wonder, has anyone ever considered that comb filtering can be the root of all evil.) Supplying clean AC power and vibration isolation also produced improvements to the sound quality, but far, far less noticeable than the results of acoustic room correction.

Discerning similarities, differences, and preferences between speakers is easy, however, electronic components, especially DACs and software were far more difficult. Less significant tweaks like cables can be even more difficult, and on many occasions, just a waste of time and effort in my opinion.

 

I have developed my own listening system which may or may not be relevant because the brain does not cooperate like we imagine. Even when focusing on selective attention, and making notes, short term memory is limited. In addition, performing listening tests on your system is a tedious task making it a necessary evil by stealing time away from just enjoying the music.

When it comes to simple A/B listening tests I must agree with Peter (once again).

 

“It does work when long term listening is performed with many albums and the most various types of music. You said "a week"; I always take 5 days, and (very) occasionally that proves to be not enough.”

I also have found longer term A/B listening sessions produce adequate to good results. The solution is partly due to time and memory. One to two hour sessions with a series of short tracks conducted over several days, then repeating the process at least once. I use different types of music in different formats (Redbook downloads and high res downloads). I have also found some test CDs quite useful on occasion.

 

In conclusion I offer a few worthless opinions: Most of the audiophiles I have met are not the least bit interested in how music affects the brain and our lives. They enjoy debating DBT because it is basically seen as a means to an end. Lazy ass couch potatoes want a sure scientific test based on listening results by a qualified source to rate audio equipment. Why risk purchasing a model of amplifier or a cable when another model of the same brand is proven by DBT to sound better? It is all about saving time and money while expending as little effort as possible, so the debate of a valid listening test rages on.

 

In order to increase one’s knowledge of what is actually taking place with how and why we listen to music, it would require a little research into the psychology of music along with music and neuroscience. Of course that would mean picking up a few books to read, and for some people I could not have made a more dreadful suggestion. It would mean time or even budget infringements; paying for knowledge just like one pays for music. (Please accept my apology for being a little cynical.)


I guess the short answer is why would hearing differ from all other senses. Yes, there are particulars to each sense. Yet there is a commonality to them as well. The placebo and nocebo effect are clear in these others, yet we are to think hearing is somehow 'special'. Without a doubt our vision gets the most information, the most informational bandwidth, and the most processing in the brain. We have no trouble with compression algorithms in video and still photography, yet hearing is 'special'. (said with the sarcasm of the church lady on Saturday Night Live).

 

Even if some of this stuff is for real, we have plenty of big fish to fry with less controversial measures. Get those right and the rest will be just that much clearer as either something to contend with or something of no consequence. One being, humans don't hear over about 20 kHz, get over it. A digital system done to modern standards at 60 kHz sampling is enough that there will be no real artifacts. So okay, due to other conventions we need either 88.2 or 96 kHz. We don't need more and are wasting our time. Let's get on with it. Probably almost never need more than 48/24. Sorry, but can someone at least agree there is some point at which we get total transparency? I know it goes against the grain, and we can argue with where it happens, but can we at least agree that at some point we can get total transparency with what is possible to hear? Otherwise, it is ridiculous.
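To put rough numbers on the sample-rate point: a sampled system can represent frequencies up to half its sampling rate (the Nyquist frequency), and whatever lies between the limit of hearing and that frequency is the room available for the anti-alias filter to roll off. The sketch below just does that subtraction for the rates mentioned above plus Redbook's 44.1 kHz. The 20 kHz default comes from the post, and the function names are made up for illustration.

```python
def nyquist_hz(sample_rate_hz: float) -> float:
    """Highest frequency a system sampling at `sample_rate_hz` can represent."""
    return sample_rate_hz / 2.0


def guard_band_hz(sample_rate_hz: float, audible_limit_hz: float = 20_000.0) -> float:
    """Room between the assumed audible limit and the Nyquist frequency.

    This is the span in which the anti-alias filter can roll off.
    A negative result would mean the rate cannot cover the audible band.
    """
    return nyquist_hz(sample_rate_hz) - audible_limit_hz


if __name__ == "__main__":
    for rate in (44_100, 48_000, 60_000, 88_200, 96_000):
        print(
            f"{rate:>6} Hz sampling: Nyquist {nyquist_hz(rate):>7.0f} Hz, "
            f"guard band {guard_band_hz(rate):>6.0f} Hz"
        )
```

This says nothing about where transparency is actually reached; it only shows how much filter headroom each rate leaves above an assumed 20 kHz hearing limit.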
