
Julf

Listening test results

Here are the results of the listening test. Unfortunately I only got 6 submissions before I decided to stop collecting entries, so it is not really a statistically relevant sample, and I would not draw any profound conclusions from the results.

As has been pointed out, anyone handy with a program such as Audacity could have cheated by looking at the spectrum plots, and just looking at file sizes would already have provided a lot of guidance. Fortunately the results seem to indicate that the submissions were based on honest listening.

What is very interesting is that the full set of files has been downloaded 103 times, and the first track 147 times, but only 6 people actually submitted results. What can we conclude from that? That people are shy and are afraid to make a fool of themselves? Or that they really didn't hear a difference? No way to tell.

I would also like to state that when I put together the test, I had no strong position with regard to hi-res. I originally found my way to CA because of my interest in hi-res, and having gotten burned by HDtracks, I found the "Audiophile Downloads" and "Music Analysis - Objective & Subjective" forums extremely useful. I have spent a fair bit of money supporting HDtracks, eclassical, B&W Society of Sound, 2L, iTrax, Linn and others.

I can definitely hear a difference between some of the hi-res tracks and red book. Or at least I think I can. But I can't tell if that is from different mastering. Or it could even be all in my head.

A number of people have stated that the a cappella material doesn't really allow for the benefit of hi-res to be heard, as it doesn't have any significant high frequency content, and that is definitely a valid point. On the other hand, at least one person stated that unaccompanied, natural human voice is the best test material.

I also want to make clear that this is a test of listening material as it is mostly presented to the buying public - it starts out being recorded at 96/24, and it then gets downconverted to whatever distribution format is needed. It is not a comparison of recording formats. As the physical listening format is 96/24 on all of the tracks, it also doesn't allow a DAC optimized for lower resolutions to shine - so it is also not a comparison of how a certain DAC handles hi-res vs lower resolution.

Anyway, let's move to an explanation of the differences between the tracks. One of the files was of course the original track in 96/24. In this case that was track C. As a control, I then took a copy of the original, converted it from FLAC to WAV and back 300 times, and then copied it over the network between my desktop computer and a server in a very electrically noisy machine room 100 times back and forth. This copy was track G.
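A control like track G rests on the copies being bit-identical to the original. Here is a minimal Python sketch of how that could be checked with a checksum. It is an illustration only: the post does not say whether or how the files were verified, the helper names `sha256_of` and `survives_copies` are made up for this example, and it copies locally rather than over a network. A FLAC file re-encoded from WAV can also differ byte-for-byte while holding identical audio, so for that part of the control the decoded samples would have to be compared instead.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def survives_copies(source: Path, rounds: int) -> bool:
    """Copy a file `rounds` times in a chain and compare the final checksum."""
    original_digest = sha256_of(source)
    with tempfile.TemporaryDirectory() as tmp:
        current = source
        for i in range(rounds):
            # Alternate between two scratch files so each copy reads the previous one.
            target = Path(tmp) / f"copy_{i % 2}{source.suffix}"
            shutil.copyfile(current, target)
            current = target
        return sha256_of(current) == original_digest
```

If the digests match, the copy is the same sequence of bytes as the original, whatever the number of round trips.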

My next step was converting the file to 16 bit and then upconverting it back to 24 bit (effectively leaving the bottom 8 bits as zeroes). This resulted in track E.
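As a rough illustration of what "leaving the bottom 8 bits as zeroes" means, here is a small Python sketch. It assumes plain truncation without dither, which is an assumption: the post does not say which tool or settings were used for the conversion, and the function name is made up for this example.

```python
def requantize_24_to_16_to_24(sample: int) -> int:
    """Drop the bottom 8 bits of a signed 24-bit sample, then pad back to 24 bits."""
    if not -(1 << 23) <= sample < (1 << 23):
        raise ValueError("sample is outside the signed 24-bit range")
    sixteen_bit = sample >> 8   # arithmetic shift: keeps the top 16 bits
    return sixteen_bit << 8     # back on the 24-bit scale, low 8 bits are zero


samples = [0x123456, 255, -1]
converted = [requantize_24_to_16_to_24(s) for s in samples]
# Every converted sample is a multiple of 256, i.e. its low byte is zero.
```

The result still occupies a 24-bit container, but it only ever takes values that a 16-bit file could have represented.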

If I were to do the test again, I would add a small amount of noise at the 23-24 bit level, as the FLAC encoding is very clever about not storing those redundant zero bits - thus the file size was much smaller than the original. This shows that a simple way to detect a very crude 16-to-24-bit upconversion is to compare the size of the FLAC file with the corresponding WAV file. If the WAV file is roughly 2 times as large as the FLAC, the material has real 24-bit content (but perhaps just noise), but if the WAV file is 4-6 times the size of the FLAC file, the material is clearly 16 bit.
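The size comparison can be sketched in a few lines of Python. The thresholds below are assumptions chosen to match the rough figures in the text ("roughly 2 times" versus "4-6 times"); the exact cut-offs of 2.5 and 4.0 and the function names are not from the post, and real files will vary with the music.

```python
import os


def wav_to_flac_ratio(wav_bytes: int, flac_bytes: int) -> float:
    """Return how many times larger the WAV is than the FLAC."""
    if flac_bytes <= 0:
        raise ValueError("FLAC size must be positive")
    return wav_bytes / flac_bytes


def guess_bit_depth(ratio: float) -> str:
    """Very crude guess based on the WAV/FLAC size ratio (assumed thresholds)."""
    if ratio >= 4.0:
        return "likely 16-bit content padded to 24 bits"
    if ratio <= 2.5:
        return "likely genuine 24-bit content (possibly just noise)"
    return "inconclusive"


def guess_from_files(wav_path: str, flac_path: str) -> str:
    """Apply the same guess to two files on disk."""
    ratio = wav_to_flac_ratio(os.path.getsize(wav_path), os.path.getsize(flac_path))
    return guess_bit_depth(ratio)
```

This only catches the crude case of zero-padded samples; any noise added in the low bits would push the ratio back towards 2.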

Next I took the original file and downsampled it (using sox) to 48/24, and then added a small amount of filtered white noise (at the -120 dB level, so definitely inaudible) to make the spectrogram of the file show at least some high frequency content. The 48/24 file was track D.
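To put the -120 dB figure in perspective, here is a short Python sketch that converts a level in dB to a linear amplitude and to quantization steps. It assumes the level is relative to digital full scale, which the post does not state explicitly; the function names are made up for this example.

```python
def db_to_amplitude(level_db: float) -> float:
    """Convert a level in dB (relative to full scale) to a linear amplitude ratio."""
    return 10.0 ** (level_db / 20.0)


def amplitude_in_lsb(level_db: float, bits: int) -> float:
    """Express that amplitude in quantization steps of a signed `bits`-bit format."""
    full_scale = 2 ** (bits - 1)
    return db_to_amplitude(level_db) * full_scale


noise_lsb_24 = amplitude_in_lsb(-120.0, 24)  # roughly 8.4 steps at 24 bit
noise_lsb_16 = amplitude_in_lsb(-120.0, 16)  # roughly 0.03 steps at 16 bit
```

Under that assumption the noise is one millionth of full scale: a handful of steps in a 24-bit file, and far less than one step at 16 bit.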

Then I threw in a classic calibration test - I made a copy of the 48/24 track and amplified it by +1 dB. This was track H.
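For reference, +1 dB corresponds to multiplying every sample by about 1.122. The Python sketch below shows the arithmetic on integer samples, with a clipping check. It is an assumption-laden illustration: the post does not say which tool applied the gain or how much headroom the track had, and `apply_gain_db` is a made-up name.

```python
def apply_gain_db(samples, gain_db, bits=24):
    """Scale integer samples by a gain in dB, clipping to the signed `bits`-bit range.

    Returns the scaled samples and the number of samples that had to be clipped.
    """
    factor = 10.0 ** (gain_db / 20.0)
    low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    scaled = []
    clipped = 0
    for sample in samples:
        value = round(sample * factor)
        if value < low or value > high:
            clipped += 1
            value = max(low, min(high, value))
        scaled.append(value)
    return scaled, clipped


louder, clipped_count = apply_gain_db([1000, -1000, 8_000_000], 1.0)
```

A sample already close to full scale, like the last one here, cannot take the extra gain without clipping, which is why a gain change is only a clean calibration test if the track has at least 1 dB of headroom.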

Next I did the same 24-to-16-to-24-bit downgrade as with the 96 kHz file, resulting in the 48/16 track B.

Then another test - just to verify that adding the tiny bit of noise to mask the lack of HF content didn't distort the test, I included a 48/16 version of the track *without* the noise. This was also a bit of a check against cheating - if the version without noise did significantly worse than the one with the fake HF content added, it would indicate use of analysis. So track I is 48/16 without any artificial noise.

Track A is a 44.1/16 file, produced using the same methods.

Track F is another control point - it is an mp3 version, produced using lame with the "insane" quality preset, decoded with mpg123, and converted into a FLAC. So basically a FLAC recording of an mp3 file.

Out of the 6 people responding, one provided no numerical assessments, and one did not give them for all tracks, but based on the numbers I got, here are the averages per track:




I have arranged the tracks in rough quality order, starting with the mp3 on the left and ending with the original file to the right. Ideally we should see the points line up as a rising line from the bottom left corner up to the upper right corner.

Two points stand out - track D (48/24), which for some reason got the lowest average total, and track H (the same 48/24 as track D, but 1 dB louder). This illustrates how important it is to adjust loudness/volume to *exactly* the same level when comparing two components or recordings - 1 dB of difference in loudness was the only difference between the one that got the best overall rating, and the one that got the worst.

We also see the somewhat surprising fact that the mp3 version was the one that got the second highest score.
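The per-track averages can be reproduced from the individual scores listed further down in this post. Here is a short Python sketch; the scores are copied from the per-track listings ("Whiskey" gave no numbers, and "Victor" only rated A, B and H), while the variable names are just for this example.

```python
from statistics import mean

# Numerical scores per track, in the order Uniform, Victor, X-ray, Yankee, Zulu,
# leaving out anyone who gave no number for that track.
scores = {
    "A": [4, 6, 3, 7.5, 4],
    "B": [8, 1, 6, 6.5, 5],
    "C": [4, 9, 7, 6],
    "D": [5, 2, 6, 2],
    "E": [4, 5, 8, 7],
    "F": [7, 7, 4, 9],
    "G": [5, 8, 5, 8],
    "H": [6, 10, 10, 6.5, 10],
    "I": [7, 4, 7, 3],
}

averages = {track: mean(values) for track, values in scores.items()}

# Tracks ordered from highest to lowest average score.
ranking = sorted(averages, key=averages.get, reverse=True)

for track in ranking:
    print(f"Track {track}: {averages[track]:.2f} (n={len(scores[track])})")
```

With only four or five numbers per track, a single extreme score (such as the 1 given to track B) moves the average by a point or more, which is another reason not to read much into the ordering.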

Here are all the numerical assessments in a scatter plot:





Again, we would expect to see the points line up as a rising line from the bottom left corner up to the upper right corner.


Now we get to the comments and assessments for each track. I have replaced the names of the submitters with the names, in the ICAO/ITU phonetic alphabet, of the last letters of the alphabet. Only "Uniform", "Victor" and "Whiskey" provided verbal comments. "Whiskey" provided two separate responses; I have included both, separated by a slash ("/"). "Uniform" also provided two separate sets of comments, one based on listening through the MacBook Pro headphone jack in 24/96, the other through a 16/48 DAC.

Track A - 44.1 kHz / 16 bit, average: 4.9

"Uniform": 4
"Like the piece, not sure about the recording" (through 16/48 DAC)
"Hear some collisions here – sound in general very gritty and glassy (both)" (through MacBook Pro 24/96 headphone jack)

"Victor": 6
There’s decent harmonic richness and clarity. Voices are distinct but slightly flat and cold.
Not as good as track H. Seems like red book CD. The deeper I listen into this track the better it sounds.
"Whiskey":
Emphasized small S-es. Notice that this could happen in anything, but I noticed it as unnatural / Strange

"X-ray": 3
"Yankee": 7.5
"Zulu": 4



Track B - 48 kHz / 16 bit, average: 5.3

"Uniform": 8
"Somehow sounds softer than the first" (through 16/48 DAC)
"Rounds off some of the grit – more palatable (not sure what that means)" (through MacBook Pro 24/96 headphone jack)

"Victor": 1
Similar to track A but seems a bit worse; kind of mp3 like. Differences from A are not significant
and may be more about which sins of omission are less offensive than which is better. There is a
light metallic edge to the harmonics. Not as engaging as track H.

"Whiskey":
Seems to sound more comfortable (relative to A) / Normal

"X-ray": 6
"Yankee": 6.5
"Zulu": 5


Track C - 96 kHz / 24 bit, average: 6.5

"Uniform": 4
"Good bit of high “glassiness” - not sure what that means" (through 16/48 DAC)
"Not really drawn in by this one – want to stop listening" (through MacBook Pro 24/96 headphone jack)

"Victor":
Sounds rather generic. Nothing is grabbing my attention. Voices difficult to distinguish.

"Whiskey":
More spacious. More natural (after the fact ... this could well be "the one") / Flanging

"X-ray": 9
"Yankee": 7
"Zulu": 6


Track D - 48 kHz / 24 bit, average: 3.75

"Uniform": 5
"Sounds very similar to A" (through 16/48 DAC)
"Both gritty and glassy – similar to A" (through MacBook Pro 24/96 headphone jack)

"Victor":
Reminiscent of track A. Voices are more distinct. Harmonics are richer.
Voices have a more instrumental tone; more engaging than track A.

"Whiskey":
Sounds strange / Flanging

"X-ray": 2
"Yankee": 6
"Zulu": 2


Track E - 96 kHz / 16 bit

"Uniform": 4
"Didn't like it very much – not sure why" (through 16/48 DAC)
"Didn't like it very much – not sure why" (through MacBook Pro 24/96 headphone jack)

"Victor":
Seems somewhat limited and flat. Not bad sounding but not quite free and engaging.

"Whiskey":
No S-es ? I listened to this one after the happening once again.
Then I noted : OK, more normal S-es here / Too metallish

"X-ray": 5
"Yankee": 8
"Zulu": 7
Average: 6


Track F - mp3 (VBR, lame --preset insane)

"Uniform": 7
"Pretty good" (through 16/48 DAC)
"Like it quite a bit, even if the treble is a bit “glassy” sounding – actually
sounds “good” on this one?" (through MacBook Pro 24/96 headphone jack)

"Victor":
Harmonics seem a bit muddy. Some of the voices seem a bit artificial.
There’s a metallic edge to upper register voices. Voices seem one-dimensional.

"Whiskey":
S-es / Strange

"X-ray": 7
"Yankee": 4
"Zulu": 9
Average: 6.75


Track G - 96 kHz / 24 bit, converted flac-wav-flac 300 times, copied between computers 100 times

"Uniform": 5
"Very much like C, I hear “glassiness”" (through 16/48 DAC)
"Both chunky and glassy :/" (through MacBook Pro 24/96 headphone jack)

"Victor":
After listening to track H it’s hard for anything else to compare more favorably. In comparison to H
this sounds a bit more dynamically restricted. Less sense of pace and presence. Sounds a bit compressed and forced.

"Whiskey":
Wrong S-es / Bad

"X-ray": 8
"Yankee": 5
"Zulu": 8
Average: 6.5


Track H - 48 kHz / 24 bit + 1 dB extra gain

"Uniform": 6
"Kind of sounds “gritty” - again, not sure I like it" (through 16/48 DAC)
"Hearing a bit of dissonance (collisions, as I said for A)" (through MacBook Pro 24/96 headphone jack)

"Victor": 10
Captivating. Voices have more ease and together the voice sound more instrumental. There is better presence, pace and clarity.
Hall dynamics come across better. Much more natural decay. This track is clearly better.

"Whiskey":
Sounds strange. S-es buzz. (I noticed the "buzz" earlier on, but didn't write that down
so I don't know anymore where it was) / Wrong

"X-ray": 10
"Yankee": 6.5
"Zulu": 10
Average: 8.5


Track I - 48 kHz / 16 bit, "raw" resample, no masking noise added

"Uniform": 7
"Treble a bit rolled-off (e.g., no glassiness), but I like it in this recording" (through 16/48 DAC)
"Now hearing a bit of glassiness in this one" (through MacBook Pro 24/96 headphone jack)

"Victor":
Flat in comparison to H. Harshness around S sounds. Overall slight nasal quality. The music seems a bit forced. Lacking ease.

"Whiskey":
Too high pitched S-es; furthermore quite normal / Rather normal, but edgy
"X-ray": 4
"Yankee": 7
"Zulu": 3
Average: 5.25

Comments

  1. PeterSt
    First off, thanks for all the effort;

    I hope to make it somewhat more interesting by means of a next post. But I can't finish it, because something is not clear ...



    Next I did the same 24-to-16-to-24-bit downgrade as with the 96 kHz file, resulting in the 48/16 track B.



    You may have made a few mistakes here, or otherwise pasted this sentence in the middle of some context which doesn't allow me to understand this.

    Can you briefly explain -now in absolute sense- what happened to track B ?



    Thanks,

    Peter
  2. Julf
    "I hope to make it somewhat more interesting by means of a next post."



    Looking forward to that :)



    "Can you briefly explain -now in absolute sense- what happened to track B ?"



    Sure. What I wrote was "Next I did the same 24-to-16-to-24-bit downgrade as with the 96 kHz file, resulting in the 48/16 track B."



    So what I did was take the 48/24 downsampled copy (with the masking noise added), track D, and then did a similar "convert to 16 bits and convert back to 24 again" operation (same as what I did with the 96 kHz one) resulting in track B.
  3. PeterSt
    (same as what I did with the 96 kHz one)



    Call me thick this morning, but ... in which track did *that* end up (knowing that will make it clear to me for sure).



    Thanks.
  4. Julf
    "in which track did *that* end up"



    "My next step was converting the [96/24] file to 16 bit and then upconverting it back to 24 bit (effectively leaving the bottom 8 bits as zeroes). This resulted in track E."



    So E.



    Just to be sure, here's the whole list in condensed form:



    A 44.1 kHz / 16 bit

    B 48 kHz / 16 bit

    C 96 kHz / 24 bit

    D 48 kHz / 24 bit

    E 96 kHz / 16 bit

    F mp3 (VBR, lame --preset insane)

    G 96 kHz / 24 bit, converted and copied back and forth

    H 48 kHz / 24 bit + 1 dB extra gain

    I 48 kHz / 16 bit, "raw" resample, no masking noise added
  5. manisandher
    Hey Julf, thanks for taking the effort to set this up. And the results are interesting, no?



    For my part, I found it incredibly difficult to hear differences between all 8 tracks (playing one straight after another over a period of an hour or so). But it seems that "Whiskey" really did identify 'C' as being the original track, no?



    Also, just for my understanding, track B was derived from track D with an extra 24-16-24 conversion, right?



    Yes, we all know the test was ultimately 'flawed'. But I'm glad I took part anyway and look forward to another more rigorous test. Next time I think I'll allocate one day for each track - I suspect I'd get 'better' results this way.



    Cheers, Mani.
  6. Julf
    "it seems that "Whiskey" really did identify 'C' as being the original track, no?"



    Yes, kind of. It was a "could well be" on one of his two attempts, but I definitely give "Whiskey" credit for that one.



    The important thing is that with so few responses, I don't think we can draw any conclusions this way or that - maybe apart from the fact that if you want to improve your sound quality, just turn up the volume :)



    "Also, just for my understanding, track B was derived from track D with an extra 24-16-24 conversion, right?"



    That is correct.
  7. PeterSt
    No, not the Uriah Heep album, but just me.



    All right. Before typing this post, it looked interesting to me to see what I have done, and mainly : why. Not sure what I can make of it, but let's see.

    Btw, this can be approached from more angles, like comparing with the others (and why the results may differ). Maybe that can be done in a later stage.



    The following is to be noticed from my listening to this :



    a. I can't A-B. Not only because I can't but because I don't think it can work (hear something here, and you can't avoid it there).



    b. I listened to everything one time only. Just let it all play (from A to H).



    c. Being bored, somewhere into track D or something I started cooking. In this case this implied the first stage of beef needing a couple of hours to boil, that first stage being on a high fire with a lot of noise. Not that I wasn't serious, but I really can't spend the time for so long otherwise.



    d. My second remarks behind the "/" are from another run, which was 2 weeks or so after the fact. However, I listened to the first 10 seconds of each track only, implying a hopefully better concentration.



    e. It is to be noted that generally I like 16/44.1 better than any Hires. Reasons are numerous, but with my general idea that there's also a real technical merit in it, when the Hires was done right.



    Track A - 44.1 kHz / 16 bit, average: 4.9



    Emphasized small S-es. Notice that this could happen in anything, but I noticed it as unnatural / Strange

    ---

    This is relative to nothing of course, because it was the first track I listened to, and never listened again. So, no reference at all.

    Referring to my remark above that I tend to like 16/44.1 better, it is to be noted that this will be (only !) about my own upsampling/filtering mechanisms, and through the Phasure NOS1 DAC which does nothing to the sound (were it about filtering - which is not in there). Thus, *this* 16/44.1 is not about this "better liking" because it is played natively as 24/96 because it was (made) just that.



    Track B - 48 kHz / 16 bit, average: 5.3



    Seems to sound more comfortable (relative to A) / Normal

    ---

    Can be because 96 to 48 was "better" (hence more easy to convert to) than 96 to 44.1 ?



    Track C - 96 kHz / 24 bit, average: 6.5



    More spacious. More natural (after the fact ... this could well be "the one") / Flanging

    ---

    Mind you, this "after the fact" is after listening to all of the 9 tracks, with some "absolute memory" that this one possibly sounded better. So, this round (before the "/") I went for E, and while denoting this in the remark for C (which *is* the right one) I should have compared E to at least this one. But I really felt no need to spend more time on it.



    Track D - 48 kHz / 24 bit (white noise at -120dB), average: 3.75



    Sounds strange / Flanging

    ---

    The importance here is my "Flanging" remark. It will not be a coincidence that with both of the most normal 24 bit tracks (C being the real one, this one being 48 kHz), the flanging occurs to me. It should indicate that the bit depth is doing a few things to a real existing flanging, which btw is a slower variation in level.

    What is important to others is that this flanging can easily be incurred for by BETTER playback means. So, or it gets lost in the further anomalies (like noise), or it is just there.

    I did NOT at all merit the "flanging" I noticed here in both 24 bit occasions as a good thing. I just noticed it.



    Track E - 96 kHz / 16 bit



    No S-es ? I listened to this one after the happening once again.

    Then I noted : OK, more normal S-es here / Too metallish

    ---

    Apparently the 24 bit emphasizes something I don't like. This can be the recording. Obviously it looks like 24 bits should emphasize the betterness of the S-ses, but instead it emphasized the wrongness ?

    The "Too metallish" from that second 10 seconds round will not immediately say that 16 bit is wrong, but merely that the decimation wasn't right.

    In addition I must say that I found the sound on all tracks "strange" in the voices, and like one of the others said "too much instrumentationed" or something like that.

    I think in the main thread I may have talked about microphones or something.





    Track F - mp3 (VBR, lame --preset insane) Average: 6.75



    S-es / Strange

    ---

    No further comments.



    Track G - 96 kHz / 24 bit, converted flac-wav-flac 300 times, copied between computers 100 times



    Wrong S-es / Bad

    ---

    Dangerous; I am NOT in that league of "being able" to say that anything like copying a 100 times etc. can ever change the sound. I am accused though, to perceive differences from something like this.

    It is remarkable that I denote this one as "bad" for the 10 seconds round, where I denoted nothing "as bad" as this.

    How ?





    Track H - 48 kHz / 24 bit + 1 dB extra gain



    Sounds strange. S-es buzz. (I noticed the "buzz" earlier on, but didn't write that down

    so I don't know anymore where it was) / Wrong

    ---

    One of the others noticed the same, but described the "buzz" differently. So, a poor means of digital gain must be in order here.

    Also notice the explicit "Wrong" I judged for the 10 seconds round. I didn't do that anywhere either.

    And let's keep in mind : any digital attenuation is still underestimated; attenuation or gain doesn't differ much, but it has to be good. This one clearly is not.

    And no, I did not notice the extra gain. The fan above the stove was louder anyway ...



    Track I - 48 kHz / 16 bit, "raw" resample, no masking noise added Average: 5.25



    Too high pitched S-es; furthermore quite normal / Rather normal, but edgy



    This is remarkable just the same. How in the world can I judge with two weeks in between - never looking back at my earlier comments - the second round listening for 10 seconds only ... judge exactly the same ??

    To keep in mind : No S-es are there in those first 10 seconds, but the "edgy" of it seem to resemble the "high pitched".





    Okay, I merely started writing down the above to see if I could see patterns related to my general thinking / observations from music formats. I feel it is in there, but I think my means of writing it down doesn't allow for conclusions easy. So I'll leave it to that for now.



    But now let's see whether we can make some more out of my own judgements ...



    So, in the first round I chose for track E, so that was my "submission". Now, the only thing wrong with that, were some chopped off least significant bits. So, they are not the most important ? (no judgement, just posing something).



    Along with this "E" submission, after the fact, and thinking back how things sounded (so, over 50 minutes later that was), I actually said that track C should be the one.

    Well, it was the one. Still failed, because I officially said "E". But ...



    Julf forgot something to tell (he *really* forgot, I'm sure) because it was at the end of my list of judgements, and this was that although I had chosen track E, it could not be the one because upsampling it to 768 made it sound worse. And, this should not happen (as I told in that submission).

    Thus, I submitted E as the winner, but only because I was fed up with it.



    From the 10 seconds round, I didn't really pick one, but said that I played B for a next time (sort of having chosen that one), and that "I had no problems with it". I played it again because I had denoted "normal" to it, as the only one. Btw, here too I said that I played it upsampled, and that it still didn't work out, implying that this one could not be the one either.

    As you can see in the questions from me and the answers from Julf by now, this is the actual way Track B emerged :



    "My next step was converting the [96/24] file to 16 bit and then upconverting it back to 24 bit (effectively leaving the bottom 8 bits as zeroes). This resulted in track E."



    ... with the difference that track B is 48 kHz.



    Amazing ...

    So, in two subsequent sessions I chose the two tracks of the same kind.



    For myself it is as amazing that unconsciously I seem to be able to detect 24 bits over 16 bits (the flanging thing).



    [yes, I am getting crazy myself by now, about these comparisons / seeking for the common denominators]







    Trying to summarize :



    In two sessions -and listening to very different things obviously- I "formally" preferred the 16 bits versions; the both I chose were the best genuine ones of it (E better than B though).



    In both of these cases I could prove that neither could be the right one, because upsampling them to 768 would not work out for the better, which it otherwise would (as far as my experience goes).



    After the session I "formally" preferred E, I said I better had chosen C. But I did not (formally).



    I do not like Hires, generally. So, I shouldn't have chosen C.

    And I didn't.

    Instead I chose a 16 bit version, although still 96 kHz for the one occasion (listening throughout) and another 16 bit version of 48 kHz in the other - that being the 10 seconds session.

    I regard the 10 seconds session to have been more serious, already because I wasn't cooking and making a lot of noise at the same time. The fact that a. the a cappella seems the worst to me to do a proper job ever and b. the first 10 seconds of it should be totally undoable (for myself, in my view), didn't prevent me from choosing exactly what I always say I like best : "low-res".



    In the mean time I seem to have been able to pick the proper Hires ... (but this wasn't my formal choice).



    More low-res would have been 16/44.1 of which I said this :



    Emphasized small S-es. Notice that this could happen in anything, but I noticed it as unnatural. / Strange



    and which I dedicate to the downconversion not working as decent as it does with going from 96 to half of that.

    (btw, I added a period after unnatural, which is not there in Julf's quotes. So, the Strange is from the second session.)







    I think I did well. Fairly well.







    Now I may wonder, why didn't I like the both 24 bit versions ?



    The second, 10 seconds session, told me "flanging". Of course I didn't merit this as good. Maybe not as bad either, but I noticed it, and it can't be a coincidence that I noticed this with the two 24 bits versions only. Yes, the 24 bits will have dug that up. But what if it wasn't normal, and it was created in the process. I mean, that too would require 24 bits to get it in, right ?

    The flanging in this case is NOT normal at all. It is no Leslie you know. It is a slow frequency flanger "over" the voices, and it can't be, unless from wrong processing. This, while the whole lot clearly showed as processed to me.



    The first "throughout" session, made me make of C say it sounds spacious and "more natural" (not "natural !!") and from D (the other 24 bit) I said "sounds strange".

    Of course I don't know anymore, but the "strange" from the first session can just as well have come from the "flanging" I heard in the second, and where this just will be about which part your attention goes.

    But, this should prove (to me) that



    1. The flanger is just in there (because I heard it in C);

    2. It is wrong (because I told so from D).

    3. And of course that only 24 bits can unveil it.



    Nice.

    Add to this that the only two other contenders had the same exact remark *AND* it is my own remark ever about Hires :



    It doesn't grab your attention. No way to get into the music.



    HOW ?



    My idea about it : the recording is not good enough.

    Or :

    Our systems are not on par.

    Or :

    Something else is wrong we don't quite know about yet.



    Ok, I hope you all can see that I didn't just say this out of the blind - at least not for this post (and listening efforts).

    But I say it all the time ...



    Peter
  8. Jud
    Don't know if I'm included in your "official" results, but you did allow me to take the test twice, once with the AQ Carbon cable in my system, then, after scrambling the choices, again with the AQ Coffee. I spent longer on the test the first time through; due to pressing errands, I wasn't able to take as much time to compare the second time through. However, as I told you when I sent in my results, the second time I thought I heard a clearer difference between my preferred selection and all the others.



    For both sets of results I assigned scores of 10 through 2, from most preferred to least.



    If I am reading correctly your post here and your e-mail to me about how the order was switched between tests, it turns out that in both instances I did what it appears many people did, preferred the +1 dB track. The first time through I selected the original as my next favorite track, and the 96/24 converted/copied version as the next after that. The second time through I gave the original a 7 (that is, ranked it fourth), and again ranked the 96/24 converted/recopied version next.



    So what can be said about these results? Well, no surprise that we humans are apparently quite sensitive to level changes. Since I thought I could hear a difference more clearly in the second test, perhaps the Coffee actually was better in bringing out the level difference. And I also found it interesting that I ranked the original and its full resolution converted/recopied version together in both tests.



    By the way, speaking of converting/recopying - I converted all files to AIFF with XLD before listening to them, because that is how I listen to nearly all my music, and I wanted listening conditions to be as close to normal as possible. I also used some software upsampling to 24/192 in the following way: In the first test, I went through the selections first without using any upsampling, then again with it. My order of preference did not change. Since it did not change my preferences the first time around, and I had less time, in the second test I listened using upsampling exclusively, because that is the way I normally listen to most of my music these days.



    So in the second test, given that I selected the +1 dB version as the best, and the original and its converted/recopied version fourth and fifth best, respectively, which tracks "sneaked in" for second and third? If I'm reading your post and e-mail correctly, they would be E (96/16) as second preference, and A (good old Redbook, 44.1/16) as third.
  9. PeterSt
    Track G - 96 kHz / 24 bit, converted flac-wav-flac 300 times, copied between computers 100 times



    "Uniform": 5

    Very much like C



    There is much much more in this all.



    Maybe *because* so few attended, it is more easy to make a few important conclusions later.
  10. PeterSt
    I think I can look at this all forever and have new "amazing" conclusions ...



    So, despite of me thinking that C was the original, but not submitting that one as the best sounding (and of course I was seeking for the best sounding - what to do else) ... WHY was it so necessary that I said this :



    Track G - 96 kHz / 24 bit, converted flac-wav-flac 300 times, copied between computers 100 times



    Wrong S-es / Bad

    ---

    Dangerous; I am NOT in that league of "being able" to say that anything like copying a 100 times etc. can ever change the sound. I am accused though, to perceive differences from something like this.

    It is remarkable that I denote this one as "bad" for the 10 seconds round, where I denoted nothing "as bad" as this.

    How ?



    So, why should I not see this in the realm of my general thinking ? eh ... Hires sounds bad ?



    And thus, I highly prefer to state this after all, over suggesting that 100 (okay, 300) copies deteriorate the sound, right ?
  11. Jud
    A number of people have stated that the a cappella material doesn't really allow for the benefit of hi-res to be heard, as it doesn't have any significant high frequency content, and that is definitely a valid point. On the other hand, at least one person stated that unaccompanied, natural human voice is the best test material.



    I do think that unaccompanied voice, or voice with instrumentation that doesn't overwhelm it, is the material with which I can best discern differences. However, the choral style in this case intentionally de-emphasizes much of what I personally listen for as natural in a singing voice: small volume changes, phrasing, breathing, all helping to create drama and project emotion. So I'm fine with vocals, but believe I personally at least would be better off with more modern vocal styles. (Some examples of artists I often listen to when comparing gear, to give a feeling for what I'm talking about: Gillian Welch, Rosanne Cash, Alison Krauss, Jakob Dylan, Ryan Bingham (his ballads).)
  12. PeterSt
    Track I - 48 kHz / 16 bit, "raw" resample, no masking noise added



    My comment :



    Too high pitched S-es; furthermore quite normal / Rather normal, but edgy



    I only now realize that some things are not fair, maybe ...



    If this track is to be seen as 16/48 indeed, my NOS1 won't do a thing with it by guarantee. This means I would be listening to material which still needs filtering. No wonder that I "am able to" perceive this as distortion (see judgement).



    On the other hand, someone with a normal(ly) filtering DAC - what would happen there ?

    What officially comes in there, is 24/96. So, it shouldn't do a thing either.

    Shouldn't.

    But this is no guarantee at all. Thus :



    When such a DAC smears a nice filter over it again, it will have solved the Nyquist "problem" and judgement like I could do it, would not be possible anymore. Other things may happen, but not this.





    I came to this because I saw Jud writing about his "standard upsampling". Well, hey, that would be illegal also of course. Or at least not convenient for yourself, because it could 100% equalize two different formats (depending on the format).



    But what I merely start to see is that the test itself is not much valid, when the *objective* is taken in mind : can you perceive Hires or whatever it was. So :



    No no, this now suddenly is (also) about whether you can perceive Nyquist violating tracks, which would be all of them under 96KHz and the 16 bits ones, just because your DAC can't see that; only with luck it overrules a few things.



    Now what ?



    PS: Of course now I should wonder why a 24 bit file but violating Nyquist for the sample rate, didn't come across to me as distortion (as bad as the 16/48 from track I).

    Oh boy, now I must look again at the results ? maybe I better stop ...
  13. manisandher's Avatar
    Well, I've looked long and hard at my results (with a short break in between to hear our Queen's speech today - when I was young, I was totally against our Monarchy, but nowadays I can see the immense value in having had a non-elected head of state for the last sixty years!)... and I can't really see a pattern.



    But for what it's worth, I thought A, C, E, G and I all sounded similar - and B, D, F and H all sounded similar. The first group seemed a little more forward sounding than the latter group.



    I really can't see any pattern here. But just in case it's important, I was upsampling all tracks to 768KHz in XXHighEnd.



    Mani.
  14. Jud's Avatar
    I came to this because I saw Jud writing about his "standard upsampling". Well, hey, that would be illegal also of course. Or at least not convenient for yourself, because it could 100% equalize two different formats (depending on the format).



    I was concerned about upsampling masking differences myself, so I initially listened without any software upsampling. When I listened again with upsampling, it did not alter my preferences, so I felt OK to listen exclusively with software upsampling in the second trial.



    It is at least conceivable that some software upsampling might mask differences in the original material. (For those who want measurements, I hasten to add that different sample rate converters have differing measured performance.)
  15. Julf's Avatar
    Thanks for taking the time to write up your observations - some of them definitely make sense.



    "[i]I submitted E as the winner, but only because I was fed up with it.[/i]"



    That's as good a reason as any :)
  16. Julf's Avatar
    "[i]Don't know if I'm included in your "official" results, but you did allow me to take the test twice[/i]"



    Indeed - unfortunately I didn't have time to redo the plots etc. to include your second run, so only the first round is included.



    "[i]no surprise that we humans are apparently quite sensitive to level changes.[/i]"



    I guess Spinal Tap was onto something with the "they go to 11" amps...



    "[i]I also found it interesting that I ranked the original and its full resolution converted/recopied version together in both tests.[/i]"



    I agree - but I still maintain that we don't have enough data to draw any real conclusions.
  17. Julf's Avatar
    "[i]I personally at least would be better off with more modern vocal styles. (Some examples of artists I often listen to when comparing gear, to give a feeling for what I'm talking about: Gillian Welch, Rosanne Cash, Alison Krauss, Jakob Dylan, Ryan Bingham (his ballads).)[/i]"



    So would I - but it can be a bit tricky getting permission to use (preferably unpublished) material from any of those... :-/
  18. Julf's Avatar
    "[i]If this track is to be seen as 16/48 indeed, my NOS1 won't do a thing with it by guarantee. [/i]"



    So are you saying that the NOS1 ignores the physical format of the track (96/24) and somehow "guesses" that it really is 48/16?



    "[i]No no, this now suddenly is (also) about whether you can perceive Nyquist violating tracks, which would be all of them under 96KHz and the 16 bits ones, just because your DAC can't see that; only with luck it overrules a few things.[/i]"



    How would they violate Nyquist? Upsampling is Nyquist-safe.
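
    To make the "Nyquist-safe" point concrete, here is a minimal sketch of a 2x upsample from 48 kHz to 96 kHz. It is only an illustration under stated assumptions, not the resampler used to prepare the test tracks: the 10 kHz test tone, the 63-tap Hamming-windowed sinc filter, and all function names are made up for this example. Zero-stuffing alone leaves an image of the tone at 38 kHz (48 kHz minus 10 kHz); the low-pass filter at the old Nyquist frequency (24 kHz) removes it.

```python
import math

RATE_IN = 48_000   # original sample rate (Hz)
RATE_OUT = 96_000  # sample rate after 2x upsampling (Hz)

def lowpass_taps(num_taps, cutoff):
    """Hamming-windowed sinc low-pass; cutoff is a fraction of the sample rate."""
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        if x == 0:
            ideal = 2 * cutoff
        else:
            ideal = math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    return taps

def zero_stuff(samples):
    """Insert one zero after every sample (doubles the sample rate)."""
    out = []
    for s in samples:
        out.extend((s, 0.0))
    return out

def fir_filter(samples, taps):
    """Plain causal FIR convolution, same length as the input."""
    out = []
    for i in range(len(samples)):
        acc = 0.0
        for k, t in enumerate(taps):
            if i - k < 0:
                break
            acc += t * samples[i - k]
        out.append(acc)
    return out

def tone_amplitude(samples, freq, rate):
    """Amplitude of one frequency component, by correlation with sine/cosine."""
    w = 2 * math.pi * freq / rate
    re = sum(s * math.cos(w * n) for n, s in enumerate(samples))
    im = sum(s * math.sin(w * n) for n, s in enumerate(samples))
    return 2 * math.hypot(re, im) / len(samples)

tone_hz = 10_000               # hypothetical test tone
image_hz = RATE_IN - tone_hz   # where the unfiltered image lands (38 kHz)

source = [math.sin(2 * math.pi * tone_hz * n / RATE_IN) for n in range(480)]
stuffed = zero_stuff(source)
# Gain of 2 compensates for the level lost by inserting zeros.
taps = [2.0 * t for t in lowpass_taps(63, 0.25)]  # 0.25 * 96 kHz = 24 kHz
upsampled = fir_filter(stuffed, taps)

# Skip the filter start-up; 768 samples hold whole periods of both tones.
segment = slice(96, 96 + 768)
image_unfiltered = tone_amplitude(stuffed[segment], image_hz, RATE_OUT)
image_filtered = tone_amplitude(upsampled[segment], image_hz, RATE_OUT)
tone_filtered = tone_amplitude(upsampled[segment], tone_hz, RATE_OUT)

print(f"image at 38 kHz without filter: {image_unfiltered:.4f}")
print(f"image at 38 kHz with filter:    {image_filtered:.4f}")
print(f"tone at 10 kHz with filter:     {tone_filtered:.4f}")
```

    In this sketch the filtered output keeps the 10 kHz tone at roughly its original level while the 38 kHz image drops to a small fraction of it, which is the sense in which a properly filtered upsample adds nothing above the original Nyquist frequency.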
  19. Julf's Avatar
    "[i]when I was young, I was totally against our Monarchy, but nowadays I can see the immense value in having had a non-elected head of state for the last sixty years![/i]"



    Then there is the Belgian experiment - doing without government for a year :)
  20. Jud's Avatar
    I still maintain that we don't have enough data to draw any real conclusions.



    Yes, with you completely there.



    So would I - but it can be a bit tricky getting permission to use (preferably unpublished) material from any of those... :-/



    Right, well aware of the problem. I was mentioning these folks not as a request to have them be the ones to provide music samples (though wouldn't it be nice?), but simply to give a general idea of the type of vocal music I often find useful for testing.
  21. PeterSt's Avatar
    So are you saying that the NOS1 ignores the physical format of the track (96/24) and somehow "guesses" that it really is 48/16?



    No. But I think :-) I made a thinking mistake ...



    Earlier I found myself in some sort of spiral where I couldn't make up my mind much about upsampling from 16 bits, that resulting in 16 bits ...

    This *will* be violating Nyquist. Eh, formally.



    And next I got into some wrong thinking, because you could just as well have presented a 48KHz file as an 96KHz one, without real upsampling (which XXHighEnd really can do, so it's easy for me to think about this by accident).

    But you didn't of course, and I just thought the wrong way.

    Apologies.



    But now we're at it anyway ... think about the difference in decimating to 16 bits (after normal upsampling if you want) and upsampling from 16/48 to 24/96. There really will be a difference ...



    But it is not important !
  22. Julf's Avatar
    "[i]I just thought the wrong way[/i]"



    I have days like that. Just ask my wife! :)
  23. Julf's Avatar
    "[i]I was mentioning these folks not as a request to have them be the ones to provide music samples (though wouldn't it be nice?), but simply to give a general idea of the type of vocal music I often find useful for testing.[/i]"



    Yes - and I agree with you. So now we just need to find the next Gillian Welch, Rosanne Cash, Alison Krauss, Jakob Dylan or Ryan Bingham (before they have become famous) and ask for a demo tape (but in hi-res!) :)
  24. Jud's Avatar
    So now we just need to find the next Gillian Welch, Rosanne Cash, Alison Krauss, Jakob Dylan or Ryan Bingham (before they have become famous) and ask for a demo tape (but in hi-res!) :)



    I am guessing it would actually not be insuperably difficult to find an aspiring singer-songwriter of acceptable (or perhaps better) quality willing to provide sample material in exchange for a good recording studio providing him or her with 24/96 or 24/192 demos. Or (this is the shoot-for-the-stars alternative) ask someone like Neil Young or T-Bone Burnett (Burnett is a performer as well as producer), who have come out in favor of high-res and against the "loudness wars," if they might be willing to provide test/demo material.
  25. Julf's Avatar
    "[i]I am guessing it would actually not be insuperably difficult to find an aspiring singer-songwriter of acceptable (or perhaps better) quality willing to provide sample material in exchange for a good recording studio providing him or her with 24/96 or 24/192 demos.[/i]"



    It would certainly be worth a try - but let's first wait for everybody to tell what was wrong with my test and why it was flawed :), so that the next round will be a better one.
  26. Brian A's Avatar
    First off, thanks for putting up with us all. You've shown a lot of patience through this process.



    Second, the results are quite interesting. I rated H best, but also wondered in my email to you if the volume had been boosted. Ha: called it! All the rest I am surprised at and I am shocked (horrified!) at how highly I rated the mp3. Of course, I don't blame myself but rather my equipment which is pretty decent on the analog side but only entry-level on the digital side. Nothing to do with my ears, of course.



    Thanks again.
  27. Julf's Avatar
    "[i]I rated H best, but also wondered in my email to you if the volume had been boosted. Ha: called it![/i]"



    Indeed!



    "[i]I am shocked (horrified!) at how highly I rated the mp3[/i]"



    I wouldn't be so horrified - the "insane" lame preset produces surprisingly good results, and MP3 was designed to fool your ears after all...
  28. Dirgen's Avatar
    I didn't submit my score due to the earlier than expected cut-off date, but I was also really surprised by the results. My top 3 choices were H, D and F! My bottom 3 were B, E, I.



    It seems that I am more sensitive to bit depth than to sampling rate, but it is really surprising that I rated the mp3 version so highly.
  29. esldude's Avatar
    Often blind testing gets a bad rap among audiophiles for not being sensitive enough. Yet things like loudness differences in tracks get clearly picked out at very small difference levels. Other testing with larger groups of people shows even 0.2 dB will get singled out. Many people will not consciously notice that level difference unless cued into it and asked to compare and make sure. Some don't even then.



    Some friends think I am overly picky about level matching etc. if you are attempting to compare equipment or cables. But turning it down or off, making the swap, running the volume back up somewhere close to where it was, and then adjusting to suit a little bit will totally invalidate any such comparison. Differences would have to be large indeed not to be swamped by a minor volume difference.
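
    As a rough illustration of how small a 0.2 dB mismatch is in signal terms, here is a minimal sketch that measures the RMS level difference between two versions of a signal. The 1 kHz test tone and the helper names are assumptions made up for this example; they are not taken from the test material.

```python
import math

def rms(samples):
    """Root-mean-square level of a sequence of float samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def level_difference_db(track_a, track_b):
    """Level of track_b relative to track_a in dB (positive means b is louder)."""
    return 20.0 * math.log10(rms(track_b) / rms(track_a))

def gain_for_db(db):
    """Linear amplitude factor corresponding to a level change in dB."""
    return 10.0 ** (db / 20.0)

# Hypothetical test signal: one second of a 1 kHz sine at 48 kHz.
rate = 48_000
reference = [0.5 * math.sin(2 * math.pi * 1000 * n / rate) for n in range(rate)]

# The same signal, boosted by 0.2 dB.
boost = gain_for_db(0.2)
boosted = [s * boost for s in reference]

diff = level_difference_db(reference, boosted)
print(f"gain factor for 0.2 dB: {boost:.4f}")
print(f"measured difference:    {diff:.3f} dB")
```

    A 0.2 dB boost corresponds to an amplitude change of only about 2.3 percent, so matching levels by ear after a swap is unlikely to get that close; measuring the two levels before comparing is the safer route.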



    Another oft-heard complaint is that you need to listen over longer periods to know what you think of the sound. But external noise levels can change enough from day to day, and almost certainly will change enough from day to night, that it will invalidate comparisons like that. The classic example is a TV turned down low for late night watching, though plenty intelligible. Turn it on the next afternoon and the sound is hardly heard because the ambient noise level has changed in the daytime by ten or more decibels.



    Anyway, good attempt at this test Julf. I am a little surprised you didn't get more than 6 responses though I always expected you would get no more than 1 of 4 from those downloaded.
  30. Jud's Avatar
    "I am shocked (horrified!) at how highly I rated the mp3"



    I wouldn't be so horrified - the "insane" lame preset produces surprisingly good results, and MP3 was designed to fool your ears after all...



    I gave the mp3 a "7" (4th highest preference) in the original run with the Carbon USB cable, then ranked it worst of all in the second run with the Coffee USB. Obviously the change in USB cables made a huge difference in my system! ;-)
  31. Julf's Avatar
    I agree with you about blind testing - it is not a perfect tool, but it is the best we have, and it is actually surprisingly good.



    "[i]I am a little surprised you didn't get more than 6 responses though I always expected you would get no more than 1 of 4 from those downloaded.[/i]"



    Yes, something around 25 submissions was what I expected too - but then cutting it short left out people who thought they had ample time to do the test.
  32. Jud's Avatar
    Another oft heard complaint is you need to listen over longer periods to know what you think of the sound. But external noise levels can change enough from day to day, and almost certainly will change enough from day to night it will invalidate comparisons like that.



    May depend on what one is listening for. For example, if you are trying to determine if a new component has a sound of its own, so by definition it is not accurate, you may want to listen to a variety of music over a period of time to see whether the tell-tale signs arise: boredom, "listening fatigue," etc. Comparing two musical selections or components at various times may be helpful precisely because if you only compare once it may be at a time when the environment is not the best.



    Agreed that listening to a single component/selection at various times over a long period is much more susceptible to extraneous variations.



    I have read that "aural memory" may not be much longer than 30 seconds, so I try to include at least some comparisons where components/selections are switched quickly. For what may be similar reasons, the manufacturer of my favorite cables recommends listening to no more than 1 minute of a musical selection before switching to the component/selection to be compared. (Nevertheless, I think there is also something to be said for listening to a selection all the way through, to try to get an idea of the emotions the artist is trying to convey at various points, and how well the equipment/recording renders these.)



    Also, with respect to PeterSt's very worthwhile caution that once you hear something notable in selection A, you will surely hear it in selection B, whether or not you would have noticed it in selection B if you'd played it first: I always go back and forth between components or musical selections to compare them, if I have the time, for precisely the reason that Peter mentions.
  33. Edgaronline's Avatar
    Being the one that rated the MP3 as the lowest ...



    Okay, it's me...



    I have listened for like 1.5 hrs and would not be surprised if I would make totally unexpected observations, but my list resembles what you may expect from the specs, with F being the lowest and E A and C the best ratings. I did hear differences, but I would also completely agree that there is a matter of taste in it. And since you can't shut down your brain (at least I can't but many on this forum can ;-), different people may have different experiences while listening to these tracks.



    -Edgar
  34. bdiament's Avatar
    Hi Jud,



    **"[i]...I have read that "aural memory" may not be much longer than 30 seconds...[/i]"**



    I've read things along these lines too (some worse) and they always leave me puzzled.



    I wonder how it is that I know it is my mother at the other end of a phone call, without having to ask who it is. And I (as I'm sure a great many of us can) hear subtle changes in the voice, as compared with memory, to the point where I can identify those occasions when she might have a cold. If aural memory was so short as some would claim, I'd think I wouldn't be able to tell who it was, much less recognize little changes in the voice compared with my memory of it.



    That example, for me, cuts a Grand Canyon sized hole in the argument against aural memory. But there are other, smaller examples.



    Most experienced guitarists I know can listen to a few seconds of a guitar and tell you right off whether it is a Les Paul or a Stratocaster.



    And some audio gear has such a distinct signature, an experienced listener (who has heard a lot of gear) can name the brand, if not the model, by listening alone.



    At a meeting of the local audio society not too long ago, I arrived a few minutes late and was immediately put in the "hot seat" and asked to describe what I heard. I had no knowledge of what was being demonstrated and in fact, it turned out to be something other than what I commented on. After listening for about 30 seconds, I said I didn't know what the "test" was about but felt I was listening to a CD-R recorded at a high speed. (I find there is a distinct, or rather indistinct quality to the sound, a peculiar out-of-focus sense to the ambience and any traces of reverberation in a recording.) It turned out to be a CD-R burned, if I recall correctly, at 30x.

    Perhaps a coincidence or "lucky guess" but that's what I heard.



    Different microphones, too, have a specific "sound". As do, to my ears, certain highly touted DACs. And cables, and everything else I've spent any time listening to.



    I firmly believe aural memory, like any other sort of memory, can, with "exercise", last a long, long time.



    Sorry to all for the slightly off-topic post but I believe there is some relevance to the discussion at hand, i.e. to comparing formats.



    Best regards,

    Barry

    www.soundkeeperrecordings.com

    www.barrydiamentaudio.com
  35. Jud's Avatar
    And I (as I'm sure a great many of us can) hear subtle changes in the voice, as compared with memory, to the point where I can identify those occasions when she might have a cold. If aural memory was so short as some would claim, I'd think I wouldn't be able to tell who it was, much less recognize little changes in the voice compared with my memory of it.



    That example, for me, cuts a Grand Canyon sized hole in the argument against aural memory. But there are other, smaller examples.



    And some audio gear has such a distinct signature, an experienced listener (who has heard a lot of gear) can name the brand, if not the model, by listening alone.



    Or the player. If it's a Hendrix song I've never heard before, I can tell you within a few notes (though it's possible Stevie Ray Vaughan or, less likely, Robin Trower, might fool me for a little while).



    So what you've said is certainly something to consider.



    But...



    Don't know whether you saw the 60 Minutes segment on "face blindness" this past Sunday. There are people who absolutely cannot recognize the faces of friends and family without cues like hairstyle (or voice - and if the voice comes from a friend or family member who's changed hairstyles, there's cognitive dissonance). To demonstrate how this feels to the reporter, a researcher showed her pictures of faces turned upside down, with hair covered. She could not recognize any of them, including her own daughter(!), simply because the orientation was changed. Eye color, the shape and size of the face and all the constituent parts, was still there, it just didn't add up to Famous Actor, or Daughter.



    So the case may well be similar with sounds - though there are sounds we'd recognize anywhere, this does not necessarily mean that we have the sort of instant, effortless recall and assessment of changes for all sounds that we do for sounds of family (Mom's voice) or familiar "friends" (Jimi's guitar).
  36. bdiament's Avatar
    Hi Jud,



    I suppose if the sounds were played backwards, it would be somewhat more difficult. But I'd bet you would recognize "Axis: Bold As Love" within seconds, even backwards.



    Didn't see the TV show. I would question a researcher who used upside-down faces to "test" facial recognition on anyone other than a professional hand-stander (or head-stander). ;-} That said, I understand that in this case, they wanted to demonstrate a sense of un-familiarity.



    The idea of "face blindness" reminds me of all those TV shows where someone wears sunglasses and their family members don't recognize them.



    Best regards,

    Barry

    www.soundkeeperrecordings.com

    www.barrydiamentaudio.com
  37. Jud's Avatar
    That said, I understand that in this case, they wanted to demonstrate a sense of un-familiarity.



    There is in fact a specific part of the brain devoted to facial recognition. Of course that part of the brain is accustomed to seeing faces in their familiar right-side-up orientation. So what was demonstrated, quite specifically, is what the brain *other than* the facial recognition area (the area which "face-blind" people lack or which has some deficit) does with the features of faces. The answer is, very surprisingly to people who don't understand how this specific brain function works, not much.



    That's why I'd be cautious (not necessarily doubtful, but cautious) about applying our experience with memory of familiar sounds and music to memory of possibly unfamiliar musical sounds, just as we've now learned one cannot apply the experience of identifying familiar faces to the identification of those faces' constituent parts in a slightly unfamiliar orientation.
  38. bdiament's Avatar
    Hi Jud,



    With unfamiliar sounds, I believe the experience of the listener comes into play as a significant factor.



    Best regards,

    Barry

    www.soundkeeperrecordings.com

    www.barrydiamentaudio.com
  39. Jud's Avatar
    Ah, Grasshopper: For the truly experienced listener, *all* sounds are familiar.



    OK, OK - I know, I've gone Too Far again. ;-)
  40. esldude's Avatar
    http://readinginthebrain.pagesperso-orange.fr/intro.htm



    http://www.amazon.com/Reading-Brain-Science-Read-ebook/dp/B002SR2Q2I



    "Reading in the Brain" by Stanislas Dehaene



    Describes current knowledge about how reading takes place in the human brain. May not be everyone's cup of tea, but I find it well done and fascinating. Explains the processing done by the brain to read and the parts of the brain's normal functions that are recruited for something synthetic like reading as communication.



    Now the relevance here is the difference in testing perception thresholds and perception (Hendrix's guitar style, or Les Paul vs Strat or recognizing your daughter's voice on the phone). The aural memory said to be very short in audio testing is a different animal than the memory that lets one recognize types of sound. I don't know how to easily summarize it here. But there really is no conflict in those ideas.



    Recognition of sound also has to be going through a similar multi-layered processing like reading. On the other hand you cannot recognize something below the threshold of perceptibility. It never makes it further along the brain's processing path to be processed in a useful manner. Even the generic brain processing that goes along with the physical filtering of the ear/ear drum etc. almost surely precedes processing further along a chain of neurons that will be involved in recognition in other more complex ways.
  41. Brian A's Avatar
    Now that I'm back home and have taken a few minutes to analyze my ratings, I see that, other than that $#%&@ louder track and the mp3 track, I selected all the higher frequency versions as the best sounding regardless of bit depth. Weird. Popular wisdom says that the bit depth is more important. Hummm. Double hummm.