Computer Audiophile
Bob Stern

A/B testing favors B over A


jabbr   

This is an example of study bias. It is very much like transistor bias, in that a static effect pushes the voltage/impression in one direction.

 

The study designer needs to correct for it. One way is to randomize -- another is to do multiple tests of each amp combo with the order mixed, etc.

 

Yep that's why it would be hard work to do a real study ;) and why results are questioned -- the real problem is when investigators with an agenda use bias to "prove" something 

 

The ultimate answer is that "repeatability" under different circumstances must not be ignored.

Daudio   
8 hours ago, jabbr said:

This is an example of study bias. Very much like transistor bias

 

You seem like a real smart guy, but every now and then (from a limited sample) I wonder if your brain snapped over to an alternate dimension  :o

 

I find comparing human perception and reporting behavior, to a simple solid state device, not in the ballpark at all !  Perhaps you could find another example with a little more nuance ?

 

The rest of your post is fine, and please don't take offense :)

 

 

37 minutes ago, Ralf11 said:

 

True, but it's not clear that they had an agenda.  They could just be clueless.

 

I think that's unnecessarily insulting to Jason and his local audio society.  The primary purpose of the gathering was to compare two amps, not to evaluate A/B testing strategies.  Normally in such situations the A/B order is randomized.  Jason decided to do an interesting experiment to see the extent to which a fixed (non-random) order would bias the results.  By reversing the A and B ordering between the two groups of listeners, he achieved an overall result that was not biased, yet still was able to demonstrate the favoring of B over A.

 

Also, please bear in mind that this is a group of audiophiles gathering to have fun, not to conduct scientific research, and that each session was limited to about 3 hours.
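The counterbalanced design described above (reversing the A/B order between the two listener groups) can be sketched with a little arithmetic. This is only an illustration: the numbers `TRUE_PREF_A` and `ORDER_BONUS` are made-up assumptions, not results from the actual session, and the two groups are assumed to be the same size.

```python
# Hypothetical numbers, chosen only to illustrate the idea.
TRUE_PREF_A = 0.50   # assumed share who would prefer amp A with no order effect
ORDER_BONUS = 0.15   # assumed extra share won by whichever amp is heard second


def share_preferring_a(a_is_second: bool) -> float:
    """Expected share of a group preferring amp A, given A's position."""
    shift = ORDER_BONUS if a_is_second else -ORDER_BONUS
    return TRUE_PREF_A + shift


group1 = share_preferring_a(a_is_second=False)  # group 1 hears A first, then B
group2 = share_preferring_a(a_is_second=True)   # group 2 hears B first, then A

# Pooling two equal-sized groups cancels the order effect ...
pooled_a = (group1 + group2) / 2
# ... while the share preferring "whichever amp came second" exposes it.
second_preferred = ((1 - group1) + group2) / 2

print(f"Group 1 prefers A: {group1:.2f}")
print(f"Group 2 prefers A: {group2:.2f}")
print(f"Pooled preference for A: {pooled_a:.2f}")
print(f"Share preferring the second amp heard: {second_preferred:.2f}")
```

Under these assumptions the pooled preference for A returns to the assumed underlying value, while the preference for the second position stays visible.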


Don't forget that one pair of amps was not on the Grand Prix Monaco amp stands, which according to Mr. Serinus limited their bass, clarity, transparency and sound staging. I'm surprised they did as well as they did with a handicap like that.

jabbr   
12 minutes ago, Daudio said:

I find comparing human perception and reporting behavior, to a simple solid state device, not in the ballpark at all !  Perhaps you could find another example with a little more nuance ?

 

"bias" in study bias and transistor bias are polysemes.

 

Study bias is a systematic weight placed toward one outcome of the study (an error)

Transistor bias is a constant voltage applied to the collector and added to the base such that the transistor has improved amplification.

 

In both polysemes, there is a "push" in one direction.

Daudio   
6 minutes ago, jabbr said:

"bias" in study bias and transistor bias are polysemes

 

Ok, I see that, but it seems just a word game to me, as opposed to the vast gulf in complexity between a human being and a transistor. That's where my reaction came from.

 

And I learned a new word today !

 

mansr   
2 minutes ago, jabbr said:

"bias" in study bias and transistor bias are polysemes.

 

Study bias is a systematic weight placed toward one outcome of the study (an error)

Transistor bias is a constant voltage applied to the collector and added to the base such that the transistor has improved amplification.

 

In both polysemes, there is a "push" in one direction.

A bias is a constant offset applied to a variable signal. Doesn't matter if the signal is a voltage or a statistical preference.

 

Your description of bias in a BJT circuit isn't quite right, however. Here bias is a constant current added to the base/emitter junction to bring the transistor's gain into the linear region. That is, of course, beside the point, which I agree with.

jabbr   
3 minutes ago, Daudio said:

 

Ok, I see that, but it seems just a word game to me, as opposed to the vast gulf in complexity between a human being and a transistor. That's where my reaction came from.

 

And I learned a new word today !

 

 

Assume that I'm writing tongue-in-cheek, and leapt on the opportunity to keep this on-topic -- we are talking about amplifiers, right? ;)

 

Daudio   
Just now, jabbr said:

 

Assume that I'm writing tongue-in-cheek, and leapt on the opportunity to keep this on-topic -- we are talking about amplifiers, right? ;)

 

 

Ok, I'll let you off... this time  :D

 


That is the reason one should always start with the more expensive component and then the cheaper one; that way one is convinced that spending more money is not necessary. Of course, salespeople always try the opposite order.


OK,  so let's say the order is randomized - so what - we still get the second listening being preferred & therefore subtle audible differences being masked (or more correctly, the one paid more attention to during listening). As attention is not a fixed element in listening, how are such listening tests going to deal with this variable?

jabbr   
1 hour ago, mmerrill99 said:

OK,  so let's say the order is randomized - so what - we still get the second listening being preferred & therefore subtle audible differences being masked (or more correctly, the one paid more attention to during listening). As attention is not a fixed element in listening, how are such listening tests going to deal with this variable?

The purpose of randomization is to reduce/eliminate systematic errors, assuming sufficient sample size. (The preference for first vs second would cancel, as Amp A and Amp B would each be listened to first and second in roughly equal numbers.) ... but you need to have enough different people listening
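A rough simulation of this cancellation, for anyone who wants to play with it. It is a sketch only: the function name `simulate`, the 0.55 underlying preference and the 0.15 order bonus are all assumptions made up for illustration, and each listener is modeled as a single independent vote.

```python
import random


def simulate(n_listeners: int, true_pref_a: float, order_bonus: float,
             seed: int = 0) -> float:
    """Observed share preferring amp A when order is randomized per listener."""
    rng = random.Random(seed)
    votes_for_a = 0
    for _ in range(n_listeners):
        a_is_second = rng.random() < 0.5  # coin flip decides the playback order
        # Assumed model: the amp heard second gains `order_bonus` in probability.
        p_a = true_pref_a + (order_bonus if a_is_second else -order_bonus)
        if rng.random() < p_a:
            votes_for_a += 1
    return votes_for_a / n_listeners


# Small panels scatter widely; large ones settle near the assumed true value.
for n in (10, 100, 10_000):
    print(n, simulate(n, true_pref_a=0.55, order_bonus=0.15))
```

With few listeners the observed share can land well away from the assumed 0.55; with many, the order effect averages out, which is the "sufficient sample size" condition.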

mansr   
1 minute ago, jabbr said:

The purpose of randomization is to reduce/eliminate systematic errors, assuming sufficient sample size. (The preference for first vs second would cancel, as Amp A and Amp B would each be listened to first and second in roughly equal numbers.) ... but you need to have enough different people listening

Or enough rounds of comparisons.
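How many is "enough" can be estimated with the usual binomial standard error. A minimal sketch, assuming every comparison is an independent trial (repeated rounds by the same listener may not be); the function name `standard_error` is just a label for this example.

```python
import math


def standard_error(p: float, n_trials: int) -> float:
    """Standard error of an observed preference share over n independent trials."""
    return math.sqrt(p * (1 - p) / n_trials)


# Quadrupling the number of comparisons halves the uncertainty.
for n in (10, 40, 160, 640):
    print(f"{n:4d} trials: +/- {standard_error(0.5, n):.3f}")
```

The uncertainty shrinks only with the square root of the number of comparisons, so small preferences need a lot of listeners or rounds before they stand out from chance.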

fas42   

I have no interest in the madness of trying to decide whether A is better than B, or B is better than A. If at least one of A or B appears not to inject significant flaws in the sound then that unit will get the nod from me ...

jabbr   
2 hours ago, mmerrill99 said:

I understand this but my point is that if there is always a bias towards preferring the sample which we pay more attention to, then this will make discriminating of subtle differences between A & B impossible

 

Let's explain a different way - if the second sample was always played at 1dB higher, it doesn't matter if we randomize the order of the samples - this very factor will probably mask real differences

And I'm not saying that randomizing order is the only way to remove measurement bias. You are discussing something else: IIRC, "internal validity", or the ability of a study to measure "signal" in the presence of confounding variables ("error").

esldude   
45 minutes ago, Lebouwsky said:

Short comparisons have a pitfall in this hobby. And I did a lot of them, because there was a time when my system contained 19 tubes divided over 5 different types. I was a tube roller, a very expensive hobby by the way.

 

Even though the tube rolling is more or less in the past (only 1 pair in the preamp), I learned a valuable lesson. The only way to judge a component is to settle down, listen to it for a couple of weeks and trust your feeling. If it feels right (which takes time) it is right. No 3-hour A/B session can do that.

I think the same effect is occurring with break in.  It takes time to trust your feeling.  For it to feel right.  However, in neither case do I think a change in sound is really behind reaching this point of feeling comfortable.

semente   
12 hours ago, mmerrill99 said:

There are probably many factors at play but one significant one is attention - when we focus attention we hear details that can have escaped our notice - essentially we hear differently as hearing is not a passive experience that happens to us, we are actively engaged in it, creating what we hear.

 

This makes sense to me... I realised that I unconsciously tend not to focus my attention on the same aspects of sound/performance when I do A/B comparisons, which is why I don't find them very effective.

My attention doesn't wander as much when I am comparing measurements. B|

