
Foobar Sox guidance please



Am trying to arrive at what is considered "the best" settings when upsampling to 96 in Foobar using the Sox plugin.

 

I've read the Sox FAQ section

I've read the forum post on Hydrogen audio

I've read Ayre's whitepaper

I've looked at the Infinite Wave SRC comparison graphs

 

Some of the terminology used to describe SoX's settings is not consistent with the UI of the plugin (I don't use the command line).

 

Can anyone here guide me on how best to set up the SoX plugin in Foobar? I am trying to achieve, I think, what Ayre concluded was the best compromise in their white paper:

"'Slow Roll-Off' Digital Filter, -6 dB at 22,050 Hz, No Pre-Ringing, ~2 Cycle of Post-Ringing"

 

Here is the UI I have to work with

[Image: 05.03.2015-22.54.png]

 

Thanks for any guidance

Link to comment

There can't be an unambiguous answer here. You can experiment with the settings; if you can use measurement tools, you will get results faster.

The properties of resampling filters trade off against each other. For the best result you need to find the point of balance.


Link to comment

Interpreting this GUI involves figuring out which options of the SoX rate effect the various controls correspond to.

 

To get something close to "'Slow Roll-Off' Digital Filter, -6 dB at 22,050 Hz, No Pre-Ringing, ~2 Cycle of Post-Ringing", you would check "allow aliasing", reduce Passband to maybe 90-92%, and move the Phase response slider to 25% (corresponding to SoX's Intermediate phase setting). This is not, however, what I recommend. I would leave the Phase at 50% and uncheck "allow aliasing."
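
For anyone who wants to cross-check the GUI against the command line, here is a minimal Python sketch that builds the argument list for SoX's `rate` effect. The helper name `sox_rate_args` is made up for illustration, and the mapping of the plugin controls to `-v` (very high quality), `-a` (allow aliasing/imaging), `-b` (passband %) and `-p` (phase, 0 = minimum, 25 = intermediate, 50 = linear) is an assumption based on the SoX manual, not something the plugin documents.

```python
def sox_rate_args(target_hz, passband_pct=95.0, phase_pct=50.0, allow_aliasing=False):
    """Return the argument list for SoX's `rate` effect (hypothetical helper)."""
    # The SoX manual gives 74 to 99.7 as the range for -b and 0 to 100 for -p.
    if not 74 <= passband_pct <= 99.7:
        raise ValueError("passband must be between 74 and 99.7 percent")
    if not 0 <= phase_pct <= 100:
        raise ValueError("phase must be between 0 and 100")
    args = ["rate", "-v"]  # -v is assumed to match "Best" in the plugin GUI
    if allow_aliasing:
        args.append("-a")
    args += ["-b", f"{passband_pct:g}", "-p", f"{phase_pct:g}", str(target_hz)]
    return args


# The Ayre-like settings, picking 91% from the suggested 90-92% range.
ayre_like = sox_rate_args(96000, passband_pct=91, phase_pct=25, allow_aliasing=True)
print(" ".join(ayre_like))  # rate -v -a -b 91 -p 25 96000
```

The printed string is what would follow the input and output file names on a `sox` command line.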

Link to comment
Jay-dub, can you expand/explain your recommendation here? I understand this is a very subjective topic. Very curious to understand your POV.

Link to comment
  • 3 weeks later...

Since this is going to be long, here's the meaty part first. I use fairly generic SoX settings.

- Best quality (also called VHQ in SoX documentation)

- Passband 95.0%

- Phase response: minimum (0%) when downsampling; linear (50%) when upsampling

- Aliasing/imaging: only allowed when upsampling

 

Note that these slightly differ from the default recommended settings for downsampling only.

 

Now here's the why and how in my personal case.

 

Objective: everything to 24/96, except redbook (16/44 -> 24/44)

My DAC only handles 44, 48 and 96 natively (and for some reason its 48 conversion seems subpar).

Whereas in iZotope I only used integer resampling (2x 4x 8x), in SoX I can't hear the difference between integer and non-integer ratios, so I resample everything to 96 (I figure, why lose information when it's just as easy not to). Thus 48, 88, 176, 192, 352 and 384 are resampled to 96. The bulk of my files, 96 & 44, are fed bit-perfect to the DAC.
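
To make the integer versus non-integer distinction concrete, here is a small Python sketch that reduces each conversion to 96000 Hz to its simplest ratio. The exact source rates (88200, 176400, 352800 and so on) are taken from the resampler settings in this post; the function name `resample_ratio` is just for illustration.

```python
from fractions import Fraction

TARGET_HZ = 96000
SOURCE_RATES_HZ = [48000, 88200, 176400, 192000, 352800, 384000]


def resample_ratio(source_hz, target_hz=TARGET_HZ):
    """Output/input rate as a reduced fraction."""
    return Fraction(target_hz, source_hz)


for rate in SOURCE_RATES_HZ:
    ratio = resample_ratio(rate)
    # A ratio of N/1 or 1/N is an integer up- or downsampling factor.
    is_integer = ratio.numerator == 1 or ratio.denominator == 1
    kind = "integer" if is_integer else "non-integer"
    print(f"{rate} -> {TARGET_HZ}: {ratio} ({kind})")
```

Only the 48 kHz family (48, 192, 384) gives integer factors; everything from the 44.1 kHz family ends up with 147 in the denominator.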

 

Foobar's DSP Manager

So I put two SoX mod2 components (let's call them A and B) in Foobar's DSP manager. The order doesn't matter. Resampler A handles 88 & 48 files (upsampling to 96), Resampler B handles 176 & 192 files (downsampling to 96). Like so:

 

(A) Resampler (SoX) mod2

Target samplerate: 96000 Hz

Resample ONLY frequencies: 88200;48000 (these days I'm experimenting on upsampling 44100 as well)

Quality: Best

Passband 95%

Aliasing/imaging: YES (checked)

Phase 50% (linear)

 

(B) Resampler (SoX) mod2

Target samplerate: 96000 Hz

Resample ONLY frequencies: 384000;192000;352800;176400

Quality: Best

Passband 95%

Aliasing/imaging: NO

Phase 0% (minimum)
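
Why the order of A and B doesn't matter can be sketched in a few lines of Python. This is only a model of the "Resample ONLY frequencies" filter as described above, not the plugin's actual code; `RESAMPLERS` and `run_chain` are hypothetical names.

```python
TARGET_HZ = 96000

# Settings copied from resamplers A and B above.
RESAMPLERS = {
    "A": {"only": {88200, 48000}, "aliasing": True, "phase_pct": 50},
    "B": {"only": {384000, 192000, 352800, 176400}, "aliasing": False, "phase_pct": 0},
}


def run_chain(source_hz, order):
    """Pass a stream through the resamplers in the given order.

    Returns the final sample rate and the names of the resamplers that acted.
    """
    rate = source_hz
    applied = []
    for name in order:
        if rate in RESAMPLERS[name]["only"]:
            rate = TARGET_HZ
            applied.append(name)
    return rate, applied


for source in (44100, 48000, 88200, 96000, 192000):
    print(source, run_chain(source, ("A", "B")), run_chain(source, ("B", "A")))
```

Because the two frequency lists are disjoint and neither contains 96000, at most one resampler ever touches a given file, whichever comes first.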

 

About the order of the other DSPs: I use "Skip Silence" first in the list (why process in SoX parts of a file that won't be played? So let's exclude silences first of all). The fourth and last one, after the two SoX mod2 instances, is "Convert mono to stereo" (I suppose it simply doubles the existing track over two channels instead of one, so why process the same track twice in SoX if that can be avoided?)

 

Remarks about these settings

I honestly suspect that my B-downsampling is not "transparent" or "the best" from a lab/mathematical perspective, but I'll tell you what: besides the fact that downsampling like so usually sounds better to my ears, I seriously think that, combined with the WASAPI passthrough, minimum phase is "how my DAC likes its food". I just don't know how to say that. Maybe it yields less jitter, maybe it's just my tastes, maybe it suits my speakers better (old JBL full-range column-shaped loudspeakers; the aging tweeters have a tendency to hiss unforgivingly with many masters)... I don't know. Comparatively, downsampling in linear phase seems a bit muffled, a bit too pretty yet dusty, somewhat unrealistically perfect and dull. It's like those mid-range CD players from the early 90's. Or a photograph from that time. I'm pretty convinced linear phase is actually more transparent mathematically, but probably in ways outside the range of human hearing. I don't know. All I know is, the lower the phase setting I can get away with, the better my particular audio chain seems to perform. Regardless of the OS/player/resampler.

 

Aliasing is a different beast.

 

When upsampling, I generally find that aliasing makes for a much better "presentation", clearer soundstage. Aliasing shines when it's all about creating stuff out of thin air (literally).

 

However, generally when downsampling on my system (remember, with minimum phase), I often see aliasing as a necessary evil. Basically, in my subjective experience, and with many, many exceptions to these general impressions:

- with the greatest masters, aliasing + minimum phase = loss of detail, like a picture losing sharpness (not blurry, simply less striking)

- with a bad source, aliasing is often necessary because otherwise the sound breaks your ears with artefacts and sometimes an obviously flawed signal, and generally it just sounds better with aliasing on those flawed tracks (think: stupid DR5 so-called "remaster" of Hendrix, broken mp3 because it's been compressed several times, a file dithered more than once, a bad vinyl rip, a hissing master, etc)

- with most lossless files I find aliasing to be somewhat of an unnecessary layer between the listener and the source (so I leave it off by default and only take the time to enable it for downsampling when I'm listening to one of these albums that sound bad without, but fortunately these are rare at 176 or 192...)

 

Playing DSD to PCM: SoX may help

I also use the SACD plugin together with SoX, set up like so:

Output Mode: PCM (wish I could do native DSD)

PCM Volume: +0dB, NO DeClicker

PCM Samplerate: 352800 => then it's up to SoX (second mod2) from there to 96000

DSD2PCM Mode: Multistage (Double Precision) => basically, 64-bit processing.

 

PCM Samplerate should be set as high as the DAC can natively process. In my case the only multiple of 44100 is 44100 itself (and it performs very well; I honestly have no complaints about setting PCM Samplerate to 44100 and being done with it), so if I want to output DSD to 96, I have to do two stages of resampling. It's an experiment, one that sounds better than I thought. Which is why it's ongoing. I think SoX sounds better than the SACD plugin's internal filters, so I minimize that plugin's job so that SoX kicks in as early as possible in the chain (hence 352800, and I would even use twice that if it were available).
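
The two stages can be put into numbers with a short Python sketch. It assumes a DSD64 source (64 x 44100 Hz), which the post doesn't actually state, so treat the first figure as an illustration only.

```python
from fractions import Fraction

DSD64_HZ = 64 * 44100    # 2822400 Hz; assumes the source is DSD64
PLUGIN_PCM_HZ = 352800   # "PCM Samplerate" set in the SACD plugin
DAC_HZ = 96000           # final rate produced by the second SoX mod2

# Stage 1 (SACD plugin): DSD -> PCM, expressed as a decimation factor.
stage1 = Fraction(DSD64_HZ, PLUGIN_PCM_HZ)
# Stage 2 (SoX): 352800 -> 96000, expressed as output/input ratio.
stage2 = Fraction(DAC_HZ, PLUGIN_PCM_HZ)
# Alternative: let the plugin go straight to 44100.
direct = Fraction(DSD64_HZ, 44100)

print(f"plugin decimates by {stage1}, SoX resamples by {stage2}")
print(f"straight to 44100 would be a decimation by {direct}")
```

With 352800 the plugin only has to decimate by 8, and SoX handles the awkward 40/147 step down to 96000.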

 

Note: +0dB instead of the default +6dB because, contrary to popular belief, although DSD has a 6dB overhead that usually isn't used, it can be! I've seen some DSD tracks (the latest Japanese SHM remasters of Queen's and Stevie Wonder's discographies, notably) use as much as 4dB out of 6, so to be on the safe side, it's better to keep DSD2PCM processing at +0dB and let ReplayGain do its job if volume is a concern (provided these tracks are .dsf files so that they can be tagged with RG values).
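
The headroom argument as a tiny Python sketch. The model is an assumption drawn from the reasoning above: with PCM Volume at +0dB the top of DSD's 6dB overhead lands at 0 dBFS, and every dB of PCM Volume shifts the whole signal up by the same amount. The names `pcm_peak_dbfs` and `clips` are made up for illustration.

```python
DSD_OVERHEAD_DB = 6.0  # overhead above the nominal DSD level, as stated above


def pcm_peak_dbfs(overhead_used_db, pcm_volume_db):
    """Estimated PCM peak level for a track using part of the DSD overhead."""
    return pcm_volume_db - DSD_OVERHEAD_DB + overhead_used_db


def clips(overhead_used_db, pcm_volume_db):
    """True if the estimated peak exceeds digital full scale (0 dBFS)."""
    return pcm_peak_dbfs(overhead_used_db, pcm_volume_db) > 0.0


for volume in (0.0, 6.0):
    peak = pcm_peak_dbfs(4.0, volume)
    print(f"PCM Volume +{volume:g}dB: peak {peak:+g} dBFS, clips: {clips(4.0, volume)}")
```

Under that model a track using 4dB of the overhead peaks at -2 dBFS with +0dB, but would need +4 dBFS with +6dB, which means clipping.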

 

Ok, on to advanced settings.

As with most of us, I suppose, most of my library is redbook material (16/44).

So in Foobar's preferences, "Advanced", I set decoding Tone/sweep sample rate to whichever frequency I'm going to be processing redbook at. That's 44100 by default, and I set it up at 96000 if I'm upsampling 44100 to 96000 in the first SoX mod2 (which is my default case these days, but it's an ongoing experiment, one that goes well I might add).

Note: I don't have a clue what the prior setting does, I'm just guessing. Any input is welcome.

 

Playback > Full file buffering up to 6291456 kB (6 GB, probably enough for a full ISO).

Playback > WASAPI at default values, High worker process priority checked.

The most important setting is probably the following.

Thread priority 7 (max), Use MMCSS YES (checked), MMCSS mode: Pro Audio

This asks Windows to schedule Foobar's WASAPI audio thread ahead of almost everything else on your system.

 

Finally, I don't enable "Prevent hard disk sleep while playing", because a sleepy disk means less energy footprint, less EM in the computer case, and less of whatever else we don't need. That's considering my 16GB of RAM and the fact that my audio files are on a network share, so there's just no need for HDD access if Foobar does its job correctly. Windows might do stuff because that's what it does, but most of the time it doesn't (also a good reason to choose a Server version imho: less work on the user's part to achieve an optimally low OS footprint from a clean install).

 

Speaking of which, in Windows, it never hurts to make sure everything's fine in your Playback Device config (right-click the sound icon in the taskbar > Playback Devices, select your DAC, hit "properties"). Sometimes a driver update or other application may change settings, though it shouldn't happen (blame these guys, not Microsoft).

- Levels > Speakers: 100

- Enhancements > Disable all enhancements YES (checked)

- Advanced > Default Format: set it to your DAC's maximum capability (last item in the list)

> Exclusive Mode: check both options

Click "OK", then "Configure"

Audio channels: select your DAC's maximum*

Full-range speakers: check those which apply, at least Front left and right. Test each speaker, then hit "Finish".

 

*Some people suggest using <max bit depth available> / 44100 Hz, because the default format shouldn't matter when using Foobar/SoX with WASAPI (that's the whole point of WASAPI: bypassing Windows' internal sound processing, and especially the forced resampling of everything), whereas other sources (YouTube, Spotify, most videos, most games, etc.) are more likely to use 44.1 kHz, so you may want to avoid crappy DirectX resampling if you intend to use the DAC output with sources other than Foobar.

 

As for Windows versions, in terms of SQ end result, I think the NT6.3 core (Windows 8.1 / Server 2012R2) is a significant improvement over NT6.1 (7/2008R2). I hope Windows 10 maintains that SQ; that will be one less reason not to update. Ultimately I think that 2012 sounds even better than 8.1, but that's probably due to optimization which can theoretically be achieved on both systems, though in a somewhat more convoluted way in 8.1. (If you go server, make sure you install the necessary media/QoS features in Server Manager, otherwise sound processing, and video for that matter, will simply be atrociously bad.)

 

The conclusion of this thread is that I absolutely need to try a micro iDSD and only care about upsampling. And that Apple need to get a grip on their desktop OS.

 

 

 

 

Link to comment

You lost me there. What did anything in your post have to do with shortcomings in Apple's desktop OS?

Link to comment
The "conclusion" wasn't really a conclusion, it was just me saying something stupid at the end of a long, serious post.

;-)

 

It would have made more sense though, if I hadn't forgotten to say something about that (again, long post, some ideas get lost in the process). So here goes (a bit off topic, though): on both my Mini and iMac, no matter the player (among the best contenders these days, A2+, HQ, Amarra, etc.), I consistently prefer the sound on Windows. In comparing SRC on OSX (in A2+) and SoX (in Foobar) on Windows for instance, hell even bit-perfect non-resampled playback (so, just straight sending the bits from the file to the DAC, no nothing done whatsoever) everything just sounds better on the Windows side. Since I know that neither A2+ nor SRC are worse than Foobar/SoX (certainly not by such a magnitude), I tend to blame the OS. I think WASAPI is great, whatever you throw at it (HD video audio resampling, games, this), it just does the job awesomely well, no matter the physical output (S/PDIF, WASAPI, you name it). Also I have this tendency to blame OS X because it wasn't like that with previous iterations of OS X, and certainly not with my old Windows 7x64. But now, 8.1 sounds insanely much better to me than Yosemite (again, it's just my personal setup, YMMV).

 

 

 

 

Link to comment
  • 2 months later...

Is it possible to equalize the file a little bit while upsampling with SoX? As an example, I have some old recordings (e.g. Deep Purple, Yes, Santana) which all lack some punch at the low end. My idea is to go in with, say, -6 dB and push the low frequencies below 50 Hz up by +12 dB (similar to what the bass control on an amplifier used to do).

Link to comment
  • 4 weeks later...
You can use another DSP plugin for equalization, for example foobar2000: Components Repository - Graphic Equalizer

 

I have to be more precise: as I want to access my music over DLNA, I have to transcode the files on the DLNA server.

The best solution would be a tag in the music files (e.g. flac, mp3 or wma) which controls what kind of transcoding takes place. I do not want to upsample or downsample, nor change anything else, except equalizing to individual taste (more or less bass/treble and, mostly, expanding the dynamics a little with SoX's compand command).

 

The tags in the files could work like a set of profiles, e.g. 'sox01' would boost the bass below 50 Hz by +6 dB and 'sox02' would expand the dynamics by 20%. So all music files with the tag 'sox01' would get the desired treatment...
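
A minimal Python sketch of that profile idea, assuming the transcoder can hand the tag value to a script. It only builds a `sox` command line; how a given DLNA server would call it is left open. The profile table and the `sox_command` helper are hypothetical. `gain` and `bass` are standard SoX effects, but `bass` is a shelving filter, so "below 50 Hz" is only approximated. 'sox02' is deliberately left out because `compand` needs tuned parameters that can't be guessed here.

```python
# Hypothetical mapping from a profile tag to a SoX effects chain.
PROFILES = {
    "sox01": ["bass", "6", "50"],                      # +6 dB low shelf around 50 Hz
    "bass_punch": ["gain", "-6", "bass", "12", "50"],  # -6 dB overall, then +12 dB bass
}


def sox_command(infile, outfile, profile_tag):
    """Build the sox argument list for a file carrying the given profile tag.

    Unknown or missing tags get no effects, so the file is converted unchanged.
    """
    effects = PROFILES.get(profile_tag, [])
    return ["sox", infile, outfile] + effects


print(" ".join(sox_command("in.flac", "out.flac", "sox01")))
print(" ".join(sox_command("in.flac", "out.flac", "bass_punch")))
```

The list form can be passed straight to `subprocess.run` once the transcoding hook is known.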

Link to comment
  • 2 years later...
On 5/5/2015 at 10:09 PM, Jay-dub said:

This is not, however, what I recommend. I would leave the Phase at 50% and uncheck "allow aliasing."

On 5/25/2015 at 10:40 PM, ikkei said:

When upsampling, I generally find that aliasing makes for a much better "presentation", clearer soundstage.

I know this is an old post, but after several years, I'd like to ask whether you guys still leave aliasing checked. I'm mostly interested in the subjective argument: that it perhaps sounds "better" while still appearing quite natural. Would you say it creates a heightened soundstage that is worth keeping for permanent listening, or is it a cool effect that wears off after a while, so that you end up unchecking it and going back? Thanks.

Link to comment
