
Miska

Some analysis and comparison of MQA encoded FLAC vs normal optimized hires FLAC

Now that MQA-encoded FLAC test files are available, I wanted to see how such a file fares without an MQA decoder, played as a normal FLAC.

If we first look at the original 352.8/24 (DXD) file, which has a FLAC size of 69 MB:
blogs/miska/attachments/23258-some-analysis-and-comparison-mqa-encoded-flac-vs-normal-optimized-hires-flac-mqa-352_8-orig.jpg

We can see that there is content up to around 57 kHz.


If we then convert it to 44.1/24 we get a FLAC with size of 14 MB:
blogs/miska/attachments/23259-some-analysis-and-comparison-mqa-encoded-flac-vs-normal-optimized-hires-flac-mqa-441_24-conv.jpg

From the filter roll-off we can see that there is quite a bit less than 24 bits' worth of information: slightly less than 18 bits, which means about 108 dB SNR. Quite a lot of low-level information is visible.
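The 18-bit / 108 dB figure follows from the usual rule of thumb of roughly 6.02 dB per bit. A minimal sketch of that conversion (the function names are made up for illustration, and the extra +1.76 dB term of the full-scale-sine formula is deliberately left out to match the figure quoted above):

```python
import math

DB_PER_BIT = 20 * math.log10(2)  # about 6.02 dB per bit of word length


def bits_to_snr_db(bits: float) -> float:
    """Approximate SNR in dB for a given word length (rule of thumb)."""
    return DB_PER_BIT * bits


def snr_db_to_bits(snr_db: float) -> float:
    """Inverse: equivalent word length for a given SNR in dB."""
    return snr_db / DB_PER_BIT


for bits in (16, 18, 24):
    print(f"{bits}-bit is roughly {bits_to_snr_db(bits):.1f} dB")
```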

OK, so how would this look as a normal 44.1/16 RedBook FLAC, with a size of 6.2 MB?
blogs/miska/attachments/23260-some-analysis-and-comparison-mqa-encoded-flac-vs-normal-optimized-hires-flac-mqa-441_16-conv.jpg

Now some of the low-level information has been lost.

But what does the 44.1/24 MQA-encoded FLAC, with a size of 16 MB, look like when decoded as a normal file without MQA capability?
blogs/miska/attachments/23261-some-analysis-and-comparison-mqa-encoded-flac-vs-normal-optimized-hires-flac-mqa-dec.jpg

Whoops, there's significantly less than 16 bits' worth of resolution left, with the MQA data appearing as high-frequency noise. That is worse than a RedBook-format FLAC.


Next, I'm curious how big the file would be if we encode this content optimally as FLAC. We need 60 kHz of bandwidth and 18 bits' worth of resolution, so let's encode it as 120/18 FLAC, which comes to 13 MB:
blogs/miska/attachments/23262-some-analysis-and-comparison-mqa-encoded-flac-vs-normal-optimized-hires-flac-mqa-120_18-conv.jpg

Perfect! Everything is preserved, and the file is 3 MB smaller than the MQA-encoded FLAC! The downside is that you need a player capable of upsampling this to a rate suitable for your DAC, for example HQPlayer.

Next, let's try encoding it to a format that can be played on ordinary equipment/software, that is, 176.4/18 FLAC, which ends up as a 17 MB file:
blogs/miska/attachments/23263-some-analysis-and-comparison-mqa-encoded-flac-vs-normal-optimized-hires-flac-mqa-176_18-conv.jpg

This has about 30 kHz of unused bandwidth, but the file is only 6% larger than the MQA-encoded one, can be played on pretty much anything that plays hires FLACs, and loses nothing compared to the original.

We can also check how the result looks as 96/18 FLAC, with a size of 12 MB:
blogs/miska/attachments/23264-some-analysis-and-comparison-mqa-encoded-flac-vs-normal-optimized-hires-flac-mqa-96_18-conv.jpg

Most of the content is preserved; only the highest harmonics of a couple of the strongest tones are cut off.

I did some listening comparisons between the MQA-encoded file and the 120/18 FLAC, both played at a 24.576 MHz DSD rate through an iFi iDSD Micro DAC, a Fostex HP-A8C headphone amp and Sennheiser HD-800 headphones. I preferred the 120/18 FLAC...

Comments

  1. The Computer Audiophile's Avatar
    Maybe I'm a little slow, but I don't really understand what you're trying to do here.
  2. palbratelund's Avatar
    +1
  3. Miska's Avatar
    I'm trying to see two things:
    1) Does MQA degrade quality vs RedBook FLAC when there is no MQA decoder - answer seems to be yes
    2) Is it possible to deliver full source quality in same size or smaller FLAC as MQA encoded one, without MQA - answer seems to be yes

    So:
    - If Tidal moves to MQA FLAC, quality degrades at least for those who don't have MQA capable hardware
    - They could stream standard hires FLAC at same bandwidth as MQA takes, and it would be fully decodable with the original open FLAC implementation
  4. esldude's Avatar
    I don't get how the information you are showing in the spectrogram view is allowing you to determine bit levels of information.

    What settings for the spectrogram are you using?

    When you go from the 352 DXD, did you adjust the spectrogram FFT bin size when you drop to 44? Otherwise the bins are wider in the 352 view than in the 44 view, which would corrupt the comparison. And I still don't see how you are able to call a bit level with any precision given the information you show here. I do see the high-frequency noise you are referring to in the MQA version of this file.
    Updated 01-12-2016 at 05:00 AM by esldude
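The bin-size concern raised here can be made concrete: for a plain FFT, the bin width is the sampling rate divided by the FFT length, so the same FFT length gives 8 times wider bins at 352.8 kHz than at 44.1 kHz. A minimal sketch, assuming a 4096-point FFT at 44.1 kHz purely for illustration (the settings actually used for the plots are not stated at this point):

```python
def bin_width_hz(sample_rate_hz: float, fft_size: int) -> float:
    """Width of one FFT bin in Hz."""
    return sample_rate_hz / fft_size


def matched_fft_size(ref_rate_hz: float, ref_fft_size: int, new_rate_hz: float) -> int:
    """FFT size at new_rate_hz that keeps the same bin width as the reference."""
    return round(ref_fft_size * new_rate_hz / ref_rate_hz)


ref_rate, ref_size = 44_100, 4096      # hypothetical reference settings
dxd_rate = 352_800

same_size = bin_width_hz(dxd_rate, ref_size)
matched = matched_fft_size(ref_rate, ref_size, dxd_rate)

print(f"44.1k, N={ref_size}: {bin_width_hz(ref_rate, ref_size):.2f} Hz per bin")
print(f"352.8k, N={ref_size}: {same_size:.2f} Hz per bin")
print(f"352.8k, N={matched}: {bin_width_hz(dxd_rate, matched):.2f} Hz per bin")
```

Wider bins collect more noise power each, so the displayed noise floor shifts with the bin width even when the underlying noise density is unchanged.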
  5. Miska's Avatar
    The background noise level comparison is done between two files of the same format, 44.1/24 in this case: the MQA one vs one converted from the DXD to 44.1/24 and 44.1/16 using TPDF dither. Compare these first; they are completely comparable, since the sampling rate, source word length and FFT settings are all the same. If the MQA-encoded 44.1/24 FLAC has visibly significantly more noise than the 44.1/16 FLAC, it has more noise than 16 bits would allow.

    Then I used some of my own tooling to determine that 18-bit with TPDF dither is enough to preserve all the information of the DXD source; IOW, the word length reduction doesn't affect the dynamic content at all. Only the pure noise bits are removed, the ones that carry no useful information. (IOW, as long as the diff is white noise, it is not very useful; you can also normalize it and run it through a long FFT to make sure.)
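The tooling itself is not shown, so the following is only a generic sketch of the idea: reduce 24-bit samples to 18 bits with TPDF dither, take the difference from the source, and check that it behaves like white noise. The test signal, the function names and the lag-1 autocorrelation check (used here as a cheap stand-in for the long FFT mentioned above) are all assumptions made for illustration.

```python
import math
import random


def requantize_tpdf(samples, in_bits=24, out_bits=18, rng=None):
    """Reduce integer samples from in_bits to out_bits using TPDF dither."""
    rng = rng or random.Random()
    step = 1 << (in_bits - out_bits)        # one output LSB, in input LSBs
    lo = -(1 << (out_bits - 1))
    hi = (1 << (out_bits - 1)) - 1
    out = []
    for s in samples:
        # Two independent uniform values sum to a triangular PDF of +/- 1 output LSB.
        dither = (rng.random() - 0.5 + rng.random() - 0.5) * step
        q = math.floor((s + dither) / step + 0.5)   # round to nearest output step
        out.append(max(lo, min(hi, q)))             # clip to the output range
    return out


def lag1_autocorrelation(x):
    """Normalized lag-1 autocorrelation; close to 0 for white noise."""
    mean = sum(x) / len(x)
    d = [v - mean for v in x]
    return sum(a * b for a, b in zip(d, d[1:])) / sum(a * a for a in d)


# Hypothetical test signal: 1 kHz sine at half scale, 24-bit, 44.1 kHz.
rng = random.Random(0)
n = 20_000
source = [round(0.5 * (2**23 - 1) * math.sin(2 * math.pi * 1000 * i / 44_100))
          for i in range(n)]

reduced = requantize_tpdf(source, 24, 18, rng)
step = 1 << (24 - 18)
diff = [q * step - s for q, s in zip(reduced, source)]   # error in 24-bit LSBs

print(f"lag-1 autocorrelation of diff: {lag1_autocorrelation(diff):+.4f}")
```

A difference signal that tracked the music would show clear structure; one that is only dither and rounding error stays flat and uncorrelated from sample to sample.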

    Looking at the spectrograms is enough to determine the highest harmonic that pops up from the noise floor. Making the FFT longer could reveal a couple more, though, so I left some margin when choosing a sampling rate that can accommodate the spectral content; the 176.4k version, with 30 kHz of visibly empty space, is certainly enough. The highest visible harmonic was at 57 kHz, so I chose 60 kHz of bandwidth and thus ended up with a 120 kHz sampling rate.
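The rate choice above is just the Nyquist relation plus a margin: the sampling rate has to be at least twice the bandwidth you want to keep. A small sketch of that arithmetic, with the 3 kHz margin inferred from the 57 kHz and 60 kHz figures (the helper names are hypothetical):

```python
def min_sample_rate_hz(highest_component_hz: float, margin_hz: float = 0.0) -> float:
    """Lowest sampling rate whose Nyquist frequency covers the content plus margin."""
    return 2 * (highest_component_hz + margin_hz)


def unused_bandwidth_hz(sample_rate_hz: float, needed_bandwidth_hz: float) -> float:
    """Bandwidth left over below Nyquist; negative means content gets cut off."""
    return sample_rate_hz / 2 - needed_bandwidth_hz


print(min_sample_rate_hz(57_000, margin_hz=3_000))   # rate for 60 kHz bandwidth
for rate in (96_000, 120_000, 176_400):
    print(rate, unused_bandwidth_hz(rate, 60_000))
```

With these numbers, 96 kHz falls short of the 60 kHz target, 120 kHz fits it exactly, and 176.4 kHz leaves roughly 30 kHz unused.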

    FLAC is clever enough that even if you always feed it "24-bit" samples that are zero-padded and actually contain only 16 or 18 bits of information, it knows how to shave off those padding bits. You can also feed it true 18-bit samples without zero-padding, but then the decoding application needs to understand this (HQPlayer does, but many others don't).
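The padding-bit trick relies on FLAC's "wasted bits" mechanism: if the low-order bits of every sample in a block are zero, the encoder can shift them out and record the shift count. A simplified sketch of the detection step only, not the actual encoder code (the function name is an assumption):

```python
def wasted_bits(samples) -> int:
    """Number of low-order bits that are zero in every sample of a block."""
    combined = 0
    for s in samples:
        combined |= s            # any set bit in any sample survives the OR
    if combined == 0:
        return 0                 # all-zero block: nothing meaningful to shift
    # Isolate the lowest set bit and convert it to a bit position.
    return (combined & -combined).bit_length() - 1


# 18-bit values zero-padded into 24-bit words (shifted left by 6 bits).
padded_18 = [v << 6 for v in (1, -3, 1000, -131072)]
# 16-bit values zero-padded into 24-bit words (shifted left by 8 bits).
padded_16 = [v << 8 for v in (5, -7, 12)]

print(wasted_bits(padded_18), wasted_bits(padded_16), wasted_bits([1, 2, 3]))
```

Dithered 18-bit audio padded to 24 bits will practically always show 6 wasted bits per block, which is why the padded file costs no more than a true 18-bit one.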

    I also double-checked the results with xivero MusicScope.

    These are the settings I generally use on Audacity:

    The GUI has some scaling issues with 4k display, so the maximum frequency value doesn't fit in completely, but it is 100 kHz.
  6. Miska's Avatar
    I already wrote an MQA-stripper that strips out the noise LSBs. The content doesn't get changed, but the extra noise bits get cut out.
  7. esldude's Avatar
    I think all you're seeing here is noise-shaped dither. I took some of the 2L files that are non-MQA and changed them from the original 24-bit to 16-bit. You get nearly the same band of low-level noise in the upper frequencies, and that is due to the noise-shaped dithering.

    I seem to recall one of the patents on MQA talking about hiding those lower bits of HF info in pseudo-random shaped noise. Apparently it functions as dither, yet with MQA decoding they can recover the low-level HF bits too. You can see in the block diagram that they are subtracting the lower bits of the HF signal from the LF signal using bits 14, 15 and 16. That would serve the function of dither. 3 bits is more than is often used, but since it is only at the high frequencies, it wouldn't really mean the files have been reduced to 12 or 13 bit resolution.
  8. Miska's Avatar
    Yes, but the only advantage of noise shaping is to move noise to less audible frequencies from the critical frequencies. In noise shaping, the amount of noise doesn't increase or decrease, it is just distributed differently.

    But if you check the MQA files, there's no drop in noise levels at mid range which stays at 16-bit equivalent level. So the total amount of noise is increased compared to properly dithered 16-bit conversion of the same content.

    IOW, if you would do normal SNR or THD+N calculation, summing all the noise bins together, the MQA version would give significantly worse number than a normal TPDF dithered 16-bit file.
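The "sum all the noise bins together" step can be sketched as follows. Levels in dB have to be converted to linear power before adding. The bin values below are invented purely to show the effect and are not measurements from the files discussed here:

```python
import math


def total_power_db(bin_levels_db):
    """Total power of a set of FFT bins, each given as a level in dB."""
    return 10 * math.log10(sum(10 ** (level / 10) for level in bin_levels_db))


# Hypothetical 8-bin noise floors (dB relative to full scale).
flat_floor = [-120.0] * 8                      # evenly spread noise
hf_raised = [-120.0] * 4 + [-100.0] * 4        # same low/mid bins, extra HF noise

print(f"flat floor total: {total_power_db(flat_floor):.2f} dB")
print(f"HF-raised total:  {total_power_db(hf_raised):.2f} dB")
```

If the mid-range bins stay where they were and only the high-frequency bins rise, the summed figure can only get worse, which is the point being made about the MQA file.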

    So at least I would rather choose a normal properly made RedBook file rather than the MQA one, which is 2.7 times larger but technically worse.
  9. esldude's Avatar
    I am still not getting how you are determining that there is no drop in mid-range noise levels. It isn't apparent in the spectrograms. So how are you determining this?
  10. Miska's Avatar
    Yes it is apparent, you need to open the pictures at full 1:1 pixel-to-pixel scale and inspect. You can also run it through various other tools to compare such as MusicScope (note with MusicScope you need to use zero-padded 24-bit words for 16-bit samples to get same dynamic range scaling on the plots).

    I get less noise by using my tools to produce a 44.1/16 file, and the size is smaller: 6.2 MB vs 16.7 MB. For a tiny size increase, to 17.2 MB, I can have a full normal 176.4/18 FLAC! The optimized 120/18 normal FLAC was 13 MB, 3 MB less than the MQA FLAC. And undoubtedly much better quality, with no fancy playback hardware needed!
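For reference, the size comparison works out as below, using only the file sizes quoted in this comment. The 13 MB figure is already rounded, so the last line is approximate:

```python
# FLAC sizes in MB, as quoted in the comment above.
sizes_mb = {
    "44.1/16 FLAC": 6.2,
    "44.1/24 MQA FLAC": 16.7,
    "176.4/18 FLAC": 17.2,
    "120/18 FLAC": 13.0,
}

mqa = sizes_mb["44.1/24 MQA FLAC"]

# How many times larger the MQA file is than the RedBook FLAC.
mqa_vs_redbook = mqa / sizes_mb["44.1/16 FLAC"]
# Relative size increase of the 176.4/18 FLAC over the MQA file, in percent.
increase_176_pct = (sizes_mb["176.4/18 FLAC"] - mqa) / mqa * 100
# Absolute saving of the 120/18 FLAC compared to the MQA file, in MB.
saving_120_mb = mqa - sizes_mb["120/18 FLAC"]

print(f"MQA vs 44.1/16: {mqa_vs_redbook:.1f}x")
print(f"176.4/18 vs MQA: +{increase_176_pct:.0f}%")
print(f"120/18 vs MQA: -{saving_120_mb:.1f} MB")
```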

    So what the heck is the point of MQA!? It doesn't save any bandwidth; it just adds a proprietary per-unit licensing royalty cost to the picture.

    In addition, for content providers:
    Every MQA encoder will need access to an HSM (Hyper-Security Module) that issues the encrypted signatures contained within each file. Costs of owning and implementing HSM within your environment will generally range between £5,000 - £20,000 but it’s important you discuss this with your technical team and MQA.
    Or alternatively you can pay 7digital to encode your files.

    Producers cannot encode to MQA files on their own, only "The actual MQA encoding takes place at the encoding house". The files they produce with plugins just add metadata to the file for encoding process.
  11. Miska's Avatar
    Argh, crap, the CA site has converted the PNG files I uploaded to JPG screwing up all the detail in the pictures. :(

    Of course the pictures were big (13 MB per screenshot), but still...
  12. mansr's Avatar
    Now someone just needs to make a fake-MQA filter that adds properly shaped noise so it looks "right" in a spectrum analysis. I'm sure someone would "hear" the improvement.
  13. esldude's Avatar
    Yes, I have had the same problem with CA picture uploads. I looked at the largest version of the pictures and couldn't see evidence of what you are talking about. Not trying to be argumentative or hardheaded, just haven't seen it. The conversion to jpg does wipe out fine detail so maybe that is the reason.
  14. Miska's Avatar
    Anyway, I'm pretty sure one can reproduce my results close enough by using SoX and Audacity...
  15. mansr's Avatar
    My quick analysis of a couple of files: http://www.computeraudiophile.com/f8...tml#post501224
  16. Miska's Avatar
    @mansr also check out the same 2L-111 track I was using, it is clean from modulator noise (probably the decimation filter is optimized to match the modulator slope -> flat noise floor). And contains some ultrasonic content too (not the most ultrasonic active of the test tracks, but has some good dynamic swings and clear harmonics along with nice discrete low level detail tones at places).
  17. mansr's Avatar
    Removing modulator noise is probably one of the things MQA does.
  18. Miska's Avatar
    There is no modulator noise visible in the bandwidth they use (44.1 / 48 kHz) with any of the DXD recordings. Even in one of the oldest, the 2L-038, the modulator noise begins to rise from about 60 kHz onwards. That's also about where the harmonics disappear into the background noise, which is one of the reasons I chose a 120 kHz sampling rate for the "optimized" version.
  19. mansr's Avatar
    Some of the DXD recordings have quite a bit of modulator noise following the typical profile starting around 60kHz. The outlier here is of course the Nielsen DAT recording that has what looks like modulator noise starting at 16kHz. I conjecture that one part of MQA is a low-pass filter that is more or less an inverse of the noise-shaping filter used in specific DACs.
  20. Miska's Avatar
    But MQA doesn't even reach 60 kHz, it reaches only 2x the FLAC rate which is 44.1 kHz here... So there's nothing in terms of modulator noise to remove in the band they use.

    But their decimation filter rolls off throughout the top octave, IOW the band 22.05 - 44.1 kHz reaching maximum attenuation at Nyquist of the 2x.

    Even if they'd encode something above 2x rate, they would need to get rid of the rising modulator noise there because it would have too much information for the bitrate they have available to encode HF.
  21. mansr's Avatar
    With the Nielsen sample it seems like they removed the modulator noise and replaced it with their own.
  22. mmerrill99's Avatar
    Is the modulator noise you are talking about the rise in noise floor outside the audio band due to noise shaping or something else in the audio band?
    I've heard that noise modulation was one of the areas that MQA was targeting along with time smear.