A/B testing favors B over A

mmerrill99 · June 16, 2017

10 minutes ago, jabbr said:

See your thought experiment through and propose an actual study design.

Thought experiments don't need to be converted into actual experiments - they exist as pure logic.

If you can point out the flaw in logic of my thought experiment then that is a valid objection.

Stating that it needs to be an actual physical experiment misses the whole purpose & concept of thought experiments in the scientific discipline.

jabbr · June 16, 2017

27 minutes ago, mmerrill99 said:

The argument was & still is being made that listening order bias is eliminated by randomization & I'm using a thought experiment to show that it couldn't be. A perfectly valid & scientific approach even though some jump on a bandwagon claiming I don't know what I'm talking about - it would seem they belie their own lack of ability to think logically

You don't understand. I am not asking you for a thought experiment "proving" that listening order bias can't be eliminated by randomization.

I am asking you for a thought experiment which a) eliminates listening order bias and b) doesn't employ randomization?

Do you understand the difference?

A disproves B is not equivalent to

C proves D

mmerrill99 · June 16, 2017

3 minutes ago, jabbr said:

You don't understand. I am not asking you for a thought experiment "proving" that listening order bias can't be eliminated by randomization.

I am asking you for a thought experiment which a) eliminates listening order bias and b) doesn't employ randomization?

Do you understand the difference?

Well you can't have another thought experiment - I only do them on Thursdays but I'll be here every week

Instead of looking for another logic experiment, let me ask you about this thought experiment - do you agree - randomization does not eliminate listening order bias?

mmerrill99 · June 16, 2017

18 minutes ago, Ralf11 said:

Sad. There are science courses in every high school, much less college.

Pity you never have anything of worth to contribute - why do you participate here?

mmerrill99 · June 16, 2017

7 minutes ago, jabbr said:

You don't understand. I am not asking you for a thought experiment "proving" that listening order bias can't be eliminated by randomization.

Your posts are confused & confusing

What I believe you mean to say is "I am not asking you for an ACTUAL experiment", as I already gave you the thought experiment "proving" that listening order bias can't be eliminated by randomization.

mmerrill99 · June 17, 2017

2 minutes ago, Ralf11 said:

it's like arguing with the brain damaged...

try reading this:

https://en.wikipedia.org/wiki/Randomized_controlled_trial

Try thinking, for a change - even a modicum of an attempt to address anything I post would demonstrate this ability but you regularly fail - preferring ad hom attacks at every post!

jabbr · June 17, 2017

11 minutes ago, mmerrill99 said:

Instead of looking for another logic experiment, let me ask you about this thought experiment - do you agree - randomization does not eliminate listening order bias?

a) that's not a thought experiment

b) "randomization does not eliminate listening order bias" is a non-sequitur. Randomization is a technique which is employed in studies in order to reduce bias (it's not the only thing that defines a study). A study designed to eliminate listening order bias would employ randomization and other techniques.

Ralf11 · June 17, 2017

I am trying to help you educate yourself. Read the article.

mmerrill99 · June 17, 2017

2 minutes ago, jabbr said:

a) that's not a thought experiment

b) "randomization does not eliminate listening order bias" is a non-sequitur. Randomization is a technique which is employed in studies in order to reduce bias (it's not the only thing that defines a study). A study designed to eliminate listening order bias would employ randomization and other techniques.

And another failure to address the logic I gave which showed your statement was false - that randomization eliminates listening order bias.

I'm done with these incessant OT counterpoints.

Daudio · June 17, 2017

17 minutes ago, mmerrill99 said:

35 minutes ago, Ralf11 said:

Sad. There are science courses in every high school, much less college.

Pity you never have anything of worth to contribute - why do you participate here?

I think the gadfly only has cut-and-paste skills to/from the first hit of a Google search, and a sad need for attention, or a love of littering. Lurking and learning is a much better strategy for him.

Daudio · June 17, 2017

8 minutes ago, jabbr said:

Randomization is a technique which is employed in studies in order to reduce bias (it's not the only thing that defines a study). A study designed to eliminate listening order bias would employ randomization and other techniques.

Doesn't it kind of depend on what you randomize ?

mmerrill99 · June 17, 2017

Much more interested in Jud's experiment & results along with the concept of pattern matching as an underlying mechanism of auditory processing

jabbr · June 17, 2017

Just now, Daudio said:

Doesn't it kind of depend on what you randomize ?

Absolutely. Randomization is a technique which entirely depends on what is being randomized.

Ralf11 · June 17, 2017

which is why introducing an unrelated non-random effect says nothing about removing the original effect which is easily and commonly nullified by randomization

Teresa · June 17, 2017

11 hours ago, Daudio said:

And if one isn't in this hobby to chase better sound, then what the hell are they doing it for ? Waste money, exercise their oscilloscopes, or Online Armored Combat ?

I agree, it's all about enjoying the music, test equipment doesn't do it for me!

10 hours ago, mansr said:

I'd rather spend $5k on an oscilloscope than a power cord. Others would, apparently, get the power cord. I have yet to meet anyone who'd get a power cord for the oscilloscope.

I vote neither!

$5k is twice the cost of my entire audio/video system including my computer, all my upgraded cabling and my HDTV. I prefer to spend my money on music.

Test equipment cannot play music, nor make my music sound better! Thus I see no need for expensive test equipment since I don't design my own equipment, I buy good equipment already assembled at the best price I can find, on sale, clearance, demo or used. And I usually wait for it to die or be too expensive to repair before I replace it.

If I were an audio designer I could justify the cost of expensive test equipment to verify my designs meet specifications, but I am not an audio designer.

Daudio · June 17, 2017

24 minutes ago, jabbr said:

Randomization is a technique which entirely depends on what is being randomized.

Ok, so if we randomize the aforementioned bias, so that sometimes the test is A->B, and other times B->A, we haven't eliminated the bias, it is still there, but its effect is masked by the randomization, and the bias itself masks differences smaller than the bias effect.

Seems to me a comedy of errors, best left in the dust bin

jabbr · June 17, 2017

15 minutes ago, Daudio said:

Seems to me a comedy of errors, best left in the dust bin

No, really not. Bias isn't just "error"; it is a systemic error, as if the baseline were adjusted by +x. Randomization eliminates this by statistically averaging +x and -x such that the net effect approaches zero. Not for each sample: systemically.
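A minimal sketch of the averaging argument above, in Python. It assumes something the thread does not establish: that listeners give numeric ratings and that the order bias is a fixed bonus added to whichever amp is played second. The names `estimate_difference`, `true_diff`, `order_bias` and `noise` are hypothetical, chosen only for illustration.

```python
import random

def estimate_difference(n_trials, randomize=True, true_diff=0.5,
                        order_bias=2.0, noise=1.0, seed=0):
    """Estimate mean(A) - mean(B) from simulated ratings.

    Assumed model: the amp played second gets +order_bias added to its rating.
    """
    rng = random.Random(seed)
    total_a = 0.0
    total_b = 0.0
    for _ in range(n_trials):
        # With randomization, A is played first in about half of the trials;
        # without it, A is always played first.
        a_first = rng.random() < 0.5 if randomize else True
        rating_a = true_diff + rng.gauss(0, noise) + (0.0 if a_first else order_bias)
        rating_b = 0.0 + rng.gauss(0, noise) + (order_bias if a_first else 0.0)
        total_a += rating_a
        total_b += rating_b
    return (total_a - total_b) / n_trials

randomized = estimate_difference(20000, randomize=True)
fixed_order = estimate_difference(20000, randomize=False)
print(f"randomized order: {randomized:+.2f}")
print(f"always A then B:  {fixed_order:+.2f}")
```

Under these assumptions the randomized estimate lands near `true_diff`, while a fixed A-then-B order lands near `true_diff - order_bias`. The bias still adds spread to each individual trial, so a small difference needs more trials to show up.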

The Computer Audiophile · June 17, 2017

Hi Guys - Remember, you can always use the "ignore user" feature of the site if someone gets on your nerves or makes your time here unenjoyable.

Daudio · June 17, 2017

1 minute ago, jabbr said:

Bias isn't just "error", it is a systemic error as if the baseline is adjusted +x. Randomization eliminates this by statistically averaging +x and -x such that it approaches zero

Sorry, but I'm an empiricist, not a theorist, and the above just doesn't resonate with me, while I think what I described was pretty clear. Guess we are at an 'agree to disagree' point.

Have a fine weekend

jabbr · June 17, 2017

19 minutes ago, Daudio said:

we haven't eliminated the bias, it is still there, but its effect is masked by the randomization, and the bias itself masks differences smaller than the bias effect.

Actually you've eliminated the bias. If and only if the individual sample effect of the bias masks other differences then the "internal validity" of the experiment is not sufficient to detect these small differences -- you then need to use other techniques to improve the resolution of your experimental design. But masking isn't something which is automatic. It means that the variables aren't independent. Or you might need to use more individual subjects.

Daudio · June 17, 2017

3 minutes ago, The Computer Audiophile said:

you can always use the "ignore user" feature of the site if someone gets on your nerves or makes your time here unenjoyable.

I make extensive use of it, to make the best of... this place.

But occasionally one will see their words in quotes, then ??

Teresa · June 17, 2017

Even someone as brain challenged as me can understand mmerrill99's logic. Maybe the following quotes and my responses can help those who have a problem with these concepts.

On 6/15/2017 at 5:11 PM, jabbr said:

The purpose of randomization is to reduce/eliminate systemic errors assuming sufficient sample size. (The preference for first vs second would cancel as roughly equal numbers of Amp A and Amp B would be listened first vs second.) ... but you need to have enough different people listening

Let's assume that Amp A sounds better than Amp B when actually playing music in a relaxed manner. So, there is an equal number of A’s and B’s randomly selected and the last sample played is always selected as best despite them sounding different, then one would only get a null result no matter how large the sample size is. When Amp A is played last people chose it, when Amp B is played last people chose it. Thus it doesn't matter how random the switching occurs if there is an equal number of A's and B's being last, one would get a null (50%) result.

Now let's suppose that Amp B happened to be played last two more times than Amp A, say 101 versus 99 times. In that case the worse-sounding Amp (B) would be chosen incorrectly as the best if all the test subjects selected the last played, which is the bias this thread is about: "A/B testing favors B over A".
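The two scenarios above can be sketched as a small simulation. This is an illustration under stated assumptions, not a model of real listeners: `p_last_wins` is the hypothetical probability that a listener simply picks whatever was played last, and `p_a_unbiased` is the hypothetical probability of picking Amp A when order plays no role.

```python
import random

def share_choosing_a(n_trials, p_last_wins=1.0, p_a_unbiased=0.8, seed=0):
    """Fraction of forced-choice trials in which Amp A is picked as best."""
    rng = random.Random(seed)
    a_picks = 0
    for _ in range(n_trials):
        last_played = rng.choice("AB")          # randomized playing order
        if rng.random() < p_last_wins:
            pick = last_played                  # order bias decides the trial
        else:
            # Otherwise the listener follows the (assumed) real preference.
            pick = "A" if rng.random() < p_a_unbiased else "B"
        if pick == "A":
            a_picks += 1
    return a_picks / n_trials

for p in (1.0, 0.5, 0.0):
    print(f"p_last_wins={p}: share choosing A = {share_choosing_a(20000, p):.3f}")
```

With `p_last_wins=1.0` the share of A picks sits at about 50% however many trials are run, which is the null result described above. With a partial bias, such as `p_last_wins=0.5`, the share moves toward `p_a_unbiased` and the preference becomes visible again, only diluted. In the 101 versus 99 case B's share is 101/200 = 50.5%, which is well inside the chance variation expected from 200 coin flips (a standard deviation of about 7 picks).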

On 6/15/2017 at 5:14 PM, mmerrill99 said:

I understand this but my point is that if there is always a bias towards preferring the sample which we pay more attention to, then this will make discriminating of subtle differences between A & B impossible...

Correct.

14 hours ago, mmerrill99 said:

Why not face up to the flaws in blind testing rather than look for excuses to denigrate all other results - it's one of the reasons why you lose believability as it's seen that something other than the truth is being sought

Agreed! I've just put "A/B testing favors B over A" in my long list as to why audio AB testing usually produces null results with human subjects. OTOH I find nothing wrong with blind long-term listening to one's favorite music.

13 hours ago, mmerrill99 said:

Yes & I was hopefully careful to just point out that randomization was not the answer to the listening order bias - it seemed to be suggested by you & others that this was the case until I pointed out the flaw in your logic...

Agreed.

2 hours ago, mmerrill99 said:

Thought experiments don't need to be converted into actual experiments - they exist as pure logic.

If you can point out the flaw in logic of my thought experiment then that is a valid objection.

Stating that it needs to be an actual physical experiment misses the whole purpose & concept of thought experiments in the scientific discipline.

1 hour ago, mmerrill99 said:

...I already gave you the thought experiment " "proving" that listening order bias can't be eliminated by randomization."

Indeed, your example is very easy to understand IMHO.

If the last played sample is always chosen as better, and there is close to the same numbers of both A and B being randomly selected as the last sample, then it hides any real audible differences.

1 hour ago, Daudio said:

Ok, so if we randomize the aforementioned bias, so that sometimes the test is A->B, and other times B->A, we haven't eliminated the bias, it is still there...

Correct.

mmerrill99 · June 17, 2017

Thank you, Teresa - it's good to see my thoughts/posts actually make sense & people 'get it' - I don't find it a difficult concept to understand & I'm glad to see others also can understand the point.

Sometimes there are so many red herrings thrown into a thread that the smell of fish dominates rather than the real, valid points being posted.

A lot of wasted time is spent trying to clear out these red herrings

jabbr · June 17, 2017

15 minutes ago, Teresa said:

Let's assume that Amp A sounds better than Amp B when actually playing music in a relaxed manner. So, there is an equal number of A’s and B’s randomly selected and the last sample played is always selected as best despite them sounding different, then one would only get a null result no matter how large the sample size is. When Amp A is played last people chose it, when Amp B is played last people chose it. Thus it doesn't matter how random the switching occurs if there is an equal number of A's and B's being last, one would get a null (50%) result.

Right Teresa. So this wouldn't be the best way to do the study. This is what we call a "Type II" error.

Suppose each track is played 3 times, and each time randomized to A or B. The options would be:

A-A-A

A-A-B

A-B-A

A-B-B

...

B-B-B

and instead of picking "best" between A and B, perhaps just rate the listening experience from 1 (bad) to 10 (best possible)

and then average the scores for A vs B for each time and for all together -- perhaps just use the middle repeat.

If this isn't good enough you could do, say 6 repeats and multiple listening sessions and with enough samples the statistics would show the difference.

Now this isn't so simple and I hope my example isn't too hard to understand but that is an example of how to reduce listening order bias.
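A rough sketch of the design just described, assuming a simple additive model that is not part of the original post: each rating is the amp's underlying quality, plus a bonus that depends on playing position, plus random noise. `QUALITY`, `POSITION_BIAS` and the function names are hypothetical.

```python
import itertools
import random
from statistics import mean

# Hypothetical model parameters, for illustration only.
QUALITY = {"A": 6.0, "B": 5.5}       # assumed underlying rating of each amp
POSITION_BIAS = (0.0, 0.5, 1.0)      # assumed bonus for the 1st, 2nd, 3rd play

# All eight play orders for three repeats: A-A-A, A-A-B, ..., B-B-B.
SEQUENCES = ["-".join(s) for s in itertools.product("AB", repeat=3)]

def simulate_ratings(n_sessions, noise=1.0, seed=0):
    """Return (position, amp, score) rows for randomly chosen play orders."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_sessions):
        sequence = rng.choice(SEQUENCES).split("-")
        for position, amp in enumerate(sequence):
            score = QUALITY[amp] + POSITION_BIAS[position] + rng.gauss(0, noise)
            # Keep scores on the 1 (bad) to 10 (best possible) scale.
            rows.append((position, amp, min(10.0, max(1.0, score))))
    return rows

def mean_by_amp(rows, position=None):
    """Average score per amp, over all plays or over one position only."""
    return {
        amp: mean(s for p, a, s in rows
                  if a == amp and (position is None or p == position))
        for amp in "AB"
    }

rows = simulate_ratings(4000)
print("all plays:    ", mean_by_amp(rows))
print("middle repeat:", mean_by_amp(rows, position=1))
```

Averaging per amp over all positions, or over the middle repeat alone, recovers a gap close to the assumed 0.5 points, because each amp is equally likely to land in each position.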

mmerrill99 · June 17, 2017

8 minutes ago, jabbr said:

Right Teresa. So this wouldn't be the best way to do the study. This is what we call a "Type II" error.

Suppose each track is played 3 times, and each time randomized to A or B. The options would be:

A-A-A

A-A-B

A-B-A

A-B-B

...

B-B-B

and instead of picking "best" between A and B, perhaps just rate the listening experience from 1 (bad) to 10 (best possible)

and then average the scores for A vs B for each time and for all together -- perhaps just use the middle repeat.

If this isn't good enough you could do, say 6 repeats and multiple listening sessions and with enough samples the statistics would show the difference.

The thread is about A/B blind testing in which two tracks are played A & then B or B & then A - a preference is made between the two tracks heard - what is your list about?
