  1. #1
    Just a poor ole country boy sjoc2000's Avatar
    Join Date
    Mar 2012
    Location
    N. of Sumner, Iowa
    Posts
    483
    Blog Entries
    1

    Trying to figure out the audio signal path from JRiver via USB to a DAC?

    I have been trying to figure out precisely what the audio signal path is from JRiver via USB to a DAC.

    First of all, what form does the data from the media player (MP) take: serial bits, bytes, or packets? How does the MP embed the driver (WASAPI, ASIO?) information, or how is the driver implemented?

    Then, with the transfer medium being USB, where does the data stream go after the MP; the I/O hub controller, the CPU...? If someone can explain the places, stages, and processes that the signal goes through in the computer, it would be helpful. Where do framing, buffering, and reclocking occur?

    Finally, is the data output via USB to the DAC converted to PCM, or are raw packets output in serial form to the DAC?

    I have been trying to find explanations on the net for these questions, with little success.

    Any responses from members here who have an understanding of these processes would be very much appreciated.

    Sincerely,

    Jim
    PC (J River-Jplay) > USB > Mytek 192 - DSD > XLR > Adcom GFP-750 Pre > XLR > Emotiva XPA-5 > Snell C/V's (bi-amped) / Klipsch Sub <100 Hz

  2. #2
    Banned
    Join Date
    Oct 2011
    Location
    Amsterdam
    Posts
    3,515
    Blog Entries
    4
    (I don't normally read (or post in) the non-music parts of CA, but in this case the OP specifically asked me to comment)

    Quote Originally Posted by sjoc2000 View Post
    I have been trying to figure out precisely what the audio signal path is from JRiver via USB to a DAC.
    I am not really familiar with Windows (I'm more familiar with UNIX and Linux kernels), but let me take a first cut, and others can add more information.

    To start with, the player program (JRiver, in this case) loads the FLAC, WAV, AIFF, MP3 or whatever data from the hard disk or network using the normal operating system services (typically the data ends up in the memory space of the program via DMA). The CPU, under control of the player program, reformats (and possibly recalculates/resamples/processes) the data into raw PCM bytes and writes them into a memory buffer. From there, the USB controller picks up the data using DMA (Direct Memory Access): basically the USB controller briefly takes over the memory bus and transfers the data (using a DMA controller) into its own buffer space. From there, the bytes get clocked out on the USB bus as high-speed serial data.

    Thus there are multiple layers of buffering and reclocking (but the clock only matters once the data hits the USB bus in serial form) - from hard disk to hard disk controller, from there to main RAM, then to USB controller...
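    That chain can be sketched as a toy model. All stage names and sizes below are invented for illustration; on real hardware the copies happen via DMA controllers, not Python slicing:

    ```python
    # Toy model of the chain: file bytes -> player memory -> decoded PCM
    # -> USB controller buffer -> serial wire. Stage names are invented.

    def read_from_disk(file_bytes):
        # OS read(): data lands in the player program's memory space
        return bytearray(file_bytes)

    def decode_to_pcm(data):
        # The player (JRiver here) would decode FLAC/MP3/etc. to raw PCM;
        # we pretend the input was already PCM (WAV-like), so this is a copy
        return bytes(data)

    def usb_controller_fetch(pcm_buffer, chunk=512):
        # The USB controller DMAs fixed-size chunks from main RAM into its
        # own small buffer, then clocks them out serially
        for i in range(0, len(pcm_buffer), chunk):
            yield bytes(pcm_buffer[i:i + chunk])

    file_bytes = bytes(range(256)) * 8                 # 2 KiB fake audio file
    pcm = decode_to_pcm(read_from_disk(file_bytes))
    wire = b"".join(usb_controller_fetch(pcm))         # what leaves on the bus
    assert wire == pcm                                 # bit-identical end to end
    ```

    The point of the sketch is that every hop is a plain copy of the same bytes; only the last hop (serialization onto the bus) has a clock that matters.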

    That's a quick first approximation of the process...

  3. #3
    Ph.D. Level Member Paul R's Avatar
    Join Date
    Mar 2010
    Location
    Sgr A*
    Posts
    11,968
    Blog Entries
    6
    Winders is quite complex, but in many ways, it is an elegant design. Basically, the software part of the system looks like this:

    [attached diagram: the Windows audio software stack]

    I am using WASAPI as an example here, by the way. There are other methods, but I prefer this one.

    In your case, the USB audio device is an Audio Endpoint Device, and Windows writes data to it through the endpoint buffer. All that means is that Windows manages the transfer of data to the device pretty much transparently for the Windows programmer. The programmer pretty much just loads a buffer, and off it goes. The sound stream will get mixed if necessary, and all that stuff.

    Now, the next key part of that puzzle is that there are two ways to open the stream that writes to the buffer - shared mode and exclusive mode. This is important to the format of the data you can send - in shared mode, you pretty much can only send PCM data. (Not strictly true, but true enough for this discussion.)

    In exclusive mode, you can send data in any format that the endpoint device accepts. That is how you can, for example, send DSD data to a DAC that can accept DSD data over a USB connection. (Whew!)

    Now it gets a bit trickier to understand, since most DACs only accept PCM data. That means the application - say JRMC - has to take whatever format the music exists in on the disk and basically convert it to PCM. That can happen "automagically" by using Windows core audio services and libraries, or it can happen in the application. Either way, the DAC doesn't care.

    Macs, by the way, work essentially the same way, as indeed do Linux applications. The final details are all different, of course, but the logic and the end result are always the same.

    Hope that helps a bit. It is definitely at the 10,000-foot level. If you need more details, just post away. I am sure the dozens of great Windows programmers we have here will chip in to answer a lot of the specifics - probably much better than I can.

    Yours,
    -Paul



    Anyone who considers protocol unimportant has never dealt with a cat DAC.
    Robert A. Heinlein

  4. #4
    sjoc2000
    Julf and Paul;

    Thanks big time for the explanations. They are helpful to me, and I'm sure to some others who want to begin or further their understanding of this subject. I'm going to try to ask some questions, but in parts, so that there is not too much to deal with all at once.

    So after I select a track, JRiver sends that track in its compressed form by DMA to RAM, where JRiver instructs WASAPI through the CPU to decompress the file and place it in a buffer? Upon being instructed, WASAPI will then convert the data to PCM and move it to a memory position in the USB section of the I/O controller hub (southbridge)?

    Jim

  5. #5
    Paul R
    Quote Originally Posted by sjoc2000 View Post
    Julf and Paul;

    Thanks big time for the explanations. This subject and explanations are helpful to me, and I'm sure for some others that want to begin or further their understanding of this subject. I'm going to try to ask some questions, but in parts, so that there is not too much to deal with all at once.

    So after I select a track, JRiver sends that track in its compressed form by DMA to RAM, where JRiver instructs WASAPI through the CPU to decompress the file and place it in a buffer? Upon being instructed, WASAPI will then convert the data to PCM and move it to a memory position in the USB section of the I/O controller hub (southbridge)?

    Jim
    Pretty close. I think it is a little more like:

    JRiver reads tracks from disk into working memory (RAM)
    JRiver converts tracks to LPCM
    JRiver calls appropriate DirectMusic/Audio APIs to send samples to the endpoint buffer

    That's simplified, and there is more than one way to do it of course, but that's pretty close. JRiver does the conversion from the disk format to LPCM itself, which is one reason it sounds better than most Windows apps.
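    For concreteness, the conversion-to-LPCM step can be sketched like this - a minimal illustration with made-up sample values, not JRiver's actual code:

    ```python
    import struct

    # Hypothetical decoded audio: floating-point samples in [-1.0, 1.0]
    samples = [0.0, 0.5, -0.5, 1.0, -1.0]

    def to_lpcm16(samples):
        # Scale to signed 16-bit integers and pack little-endian: the raw
        # LPCM byte stream a WAV file (or an endpoint buffer) would hold
        ints = [max(-32768, min(32767, round(s * 32767))) for s in samples]
        return struct.pack("<%dh" % len(ints), *ints)

    pcm = to_lpcm16(samples)
    assert len(pcm) == 2 * len(samples)   # 2 bytes per 16-bit sample
    ```

    Whatever the on-disk format was (FLAC, MP3, AIFF...), what reaches the endpoint buffer is just this kind of flat run of sample words.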

    -Paul

  6. #6
    sjoc2000
    Cool stuff. :0)

    Does JRiver decompress the track as it sends it to RAM, or is the track decompressed after it gets to RAM?

    JRiver calling audio APIs, as in WASAPI?

    What is the function of the sampling? I thought the end product of the previous functions was just placed in a buffer in RAM to await instructions for transfer to the USB section of the I/O hub controller's memory?

    Jim

  7. #7
    Paul R
    Quote Originally Posted by sjoc2000 View Post
    Cool stuff. :0)

    Does JRiver decompress the track as it sends it to RAM, or is the track decompressed after it gets to RAM?
    (grin) The information has to be in RAM before it can be worked upon. The disk "read" puts information from the disk into a RAM buffer area, and from there the program manipulates it as desired.

    JRiver calling audio APIs, as in WASAPI?
    Yep.

    What is the function of the sampling? I thought the end product of previous functions was just placed in a buffer in Ram to await instructions for transfer to USB section of I/O hub controller memory?

    Jim
    I was unclear - the LPCM data consists of audio samples. What JRiver transfers (via a buffer, of course) is those samples. Not sure how much detail you want on that, but you could think of it as the data that represents each 16-bit, 24-bit, or 32-bit sample value.

    -Paul

  8. #8
    sjoc2000
    That's enough questions for one day, don't want to wear you out. :0)

    Thanks for all the help. I'll be thinking about remaining questions, and if you're in the mood to answer them we can continue tomorrow.

    Sincerely,

    Jim

  9. #9
    sjoc2000
    Ok Paul, I'm back at it. :0)

    We have what was a compressed file from disk residing in a RAM buffer, having been converted back to PCM (LPCM, as in a .wav file). This has been accomplished by way of WASAPI, as instructed by JRiver. I am a little unclear about when WASAPI is operating autonomously and when it is being instructed by JRiver. Samples of this file are also being sent to the endpoint buffer to establish how it is to be dealt with per its file characteristics.

    Now, WASAPI (or JRiver instructing WASAPI?) instructs the endpoint buffer, located in the southbridge (I/O controller hub), to take control of the RAM buffer and move its data to that endpoint buffer?

    Is the PCM data from RAM being transferred to the endpoint buffer as a serial stream, or in what form?

    Do I have that right so far?

    Jim

  10. #10
    I think it would be beneficial to read a bit about DMA (see the Wikipedia article on Direct Memory Access).

    This is how it works in Linux (technically it will be the same in Windows; it is the same hardware). The audio player asks the audio API layer (WASAPI in Windows, ALSA in Linux) to allocate some memory space in RAM for the PCM samples. The USB audio driver takes the data, splits it into chunks suitable for USB transfer (URBs), and submits these data structures to the USB core driver. That driver "mixes" the URBs with URBs from other USB drivers for devices hooked to the same USB controller, prepares the final USB frames, and stores them in another part of memory for DMA transfer to the USB controller (which is told this memory address in advance).
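    The splitting step can be sketched roughly like this. The sizes are illustrative only (44.1 kHz stereo 16-bit works out to about 176 bytes per 1 ms frame); real URBs carry isochronous packet descriptors and headers, not bare PCM bytes:

    ```python
    # Sketch: split a PCM buffer into URB-sized chunks for USB submission.
    # One URB here covers 10 frames (10 ms); numbers are illustrative.

    BYTES_PER_MS = 176                    # ~44.1 kHz stereo 16-bit

    def split_into_urbs(pcm, frames_per_urb=10):
        urb_size = frames_per_urb * BYTES_PER_MS
        return [pcm[i:i + urb_size] for i in range(0, len(pcm), urb_size)]

    pcm = bytes(BYTES_PER_MS * 50)        # 50 ms of silent fake PCM
    urbs = split_into_urbs(pcm)
    assert len(urbs) == 5                 # five 10 ms URBs
    assert b"".join(urbs) == pcm          # nothing lost or reordered
    ```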

    The controller reads the USB frames via DMA. If it hits a frame with the interrupt bit set, it throws an IRQ to inform the USB stack about its current reading position. This call propagates further to the USB audio driver and through the audio layer all the way up to the playback application: time to fill the already-processed part of the memory buffer with new data.
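    That refill cycle can be modeled as a tiny ring-buffer simulation. This is a deliberately simplified model; the buffer size and the refill threshold are invented:

    ```python
    from collections import deque

    # Simplified model of the IRQ-driven refill cycle: the "hardware" drains
    # a small playback buffer at its own pace and, when the buffer is half
    # empty, raises an "interrupt" asking the application for fresh samples.

    buffer = deque(maxlen=8)                  # tiny playback buffer
    song = list(range(20))                    # 20 fake samples to play
    played = []

    def app_refill(n):
        # Application fills the freed ("dirty") region with new samples
        for _ in range(n):
            if song:
                buffer.append(song.pop(0))

    app_refill(8)                             # prime the buffer before playback
    while buffer:
        played.append(buffer.popleft())       # hardware consumes one sample
        if len(buffer) <= 4 and song:         # half empty: throw the "IRQ"
            app_refill(4)

    assert played == list(range(20))          # everything played, in order
    ```

    The hardware never waits for the application; the application merely keeps the buffer topped up between interrupts.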

    The controller itself has only a very short buffer for serialization of the data onto the USB line.

  11. #11
    Just a note - PCI cards do not need the mixing of data from various devices. Therefore the application can be handed the actual buffer used for DMA transfer to the soundcard (ASIO in Windows - anything newer? - or hw:X in Linux/ALSA), and only one buffer is needed. USB soundcards require two buffers - the second one for construction of the final USB frames (so-called double buffering in the Linux usb-audio driver source code).

  12. #12
    sjoc2000
    Quote Originally Posted by phofman View Post
    The controller itself has only very short buffer for serialization of the data to the USB line.
    Good read at Wikipedia. At least I'm beginning to get a handle on the audio signal/data dynamics involved.

    At what point in this process is the data "clocked" by the PC? I was thinking it occurred in the endpoint buffer, but you are saying this is a small buffer where the data is serialized and sent to the endpoint device.

    Could you please elaborate a bit on where and how this clocking occurs?

    Jim

  13. #13
    Paul R
    That was a pretty good reference.

    The "clocking" actually takes place way down in the bottom layers - where there is software that is specifically driving the hardware. In Windows, there are layers and layers of abstraction between the application (like JRiver) and the low level hardware drivers. Which is a good thing in almost every way.

    Those same kinds of layers of abstraction exist in Linux and MacOS as well, to a greater or lesser degree. Specifically in Windows, however, the application pretty much just talks to the top layers (DirectMusic, WAV, etc.) and not to the very low layers of the hardware. That's one of the reasons it is difficult to get bit-perfect sound under Windows, and one of the better accomplishments of products like JRiver.

    -Paul



  14. #14
    Quote Originally Posted by sjoc2000 View Post
    At what point in this process is the data "clocked" by the PC?
    The USB controller has its own clock (generated by a PLL in the southbridge, though it can also be a precise crystal-based clock, as is the case with e.g. an external USB controller card such as the SOtM tX-USB). It reads USB frames from RAM via DMA at its own pace.

    The notion of clocking reaches the application through the interrupts - at each interrupt the application is asked to fill the "dirty" parts of the buffer. Of course, it provides the data at full speed (in a burst). From the outside it looks as if the timing occurs somewhere in software, but the requests for filling come from the actual hardware (be that a PCI soundcard or a USB controller). The whole process is timed by the hardware.

    USB adaptive audio devices simply recover the USB clock from the incoming USB stream.

    USB asynchronous devices deploy their own stable, precise clock and send feedback control information back to the USB host to keep their short internal buffer optimally filled. The feedback message says, in effect, "from now on, send X more/fewer samples in each frame". This information is propagated to the USB audio/core drivers, which act upon the request and subsequently copy a larger/smaller number of samples from the first buffer into each USB frame (in effect feeding more/fewer samples to the USB audio device, just as requested).
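    A rough numerical model of that feedback loop. All numbers here are invented for illustration; real asynchronous feedback values are fractional rates encoded per the USB audio class specification, not a simple +1/-1:

    ```python
    # Model of asynchronous-mode feedback: the device consumes samples at
    # its own clock (44.1 per ms) and asks the host for one sample more or
    # fewer per frame to hold its internal buffer near a target fill level.

    TARGET = 220              # desired buffer fill, in samples (invented)
    NOMINAL = 44              # nominal samples per 1 ms frame

    def feedback(fill):
        # Device's feedback decision: nudge the per-frame sample count
        if fill < TARGET:
            return NOMINAL + 1
        if fill > TARGET:
            return NOMINAL - 1
        return NOMINAL

    fill = float(TARGET)
    history = []
    for _ in range(1000):                 # simulate 1000 frames (1 second)
        fill += feedback(fill)            # host sends what the device asked
        fill -= 44.1                      # device clock drains 44.1 samples/ms
        history.append(fill)

    # The buffer stays bounded near the target instead of drifting away
    assert all(abs(f - TARGET) < 5 for f in history)
    ```

    Without the feedback the 0.1-sample-per-frame mismatch would make the buffer under- or overflow within seconds; with it, the device's crystal remains the only clock that matters.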

  15. #15
    sjoc2000
    I had assumed that clocking in the PC relative to this application was related to the sampling rate of the particular audio PCM being worked with. So clocking with a 16/44 audio file establishes an interval of 1/44,100 sec., that interval being measured from the down slope of the last bit's voltage pulse to the up slope of the next (to keep it simple).

    But you are speaking of USB bursts and other functions of the southbridge that seem to have no relationship to sampling.

    Am I way off on what clocking means in this sense?

    Jim

  16. #16
    Quote Originally Posted by sjoc2000 View Post
    I had assumed that clocking in the PC relative to this application was related to the sampling rate of the particular audio PCM being worked with.
    Sure it is

    Quote Originally Posted by sjoc2000 View Post
    So clocking with a 16/44 audio file is establishing an interval of 1/44,100 sec.
    Not really. The application tells the hardware (via the driver) that it should switch to a 44100 Hz samplerate. For PCI, this only means the PCI audio controller selects the appropriate clock signal. At this pace (actually a fraction of it - usually several samples at a time) it reads data via DMA. The faster the samplerate, the more samples get consumed, the more often the hardware throws an IRQ, and the more samples get prepared by the playback chain.

    In USB, the device is notified about the new samplerate too. In addition (actually most importantly), the USB audio driver must pick the appropriate number of samples for each USB frame, which occurs every 1 ms (or every 125 µs for USB 2.0 high speed). For 48 kHz, it always selects 48 samples when preparing its part of the USB frame (in adaptive mode). For 44.1 kHz, it takes 44 samples for nine rounds and 45 samples on the tenth round. This regulates the rate at which the data is consumed. The consumption at the driver level happens in bursts - the application/driver chain does not prepare new frames every 1 ms, but every 10 ms (10 frames at once), or even less often. In fact, the Linux USB driver can be configured to prepare half a second of USB audio frames in one "burst".
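    That 44/45 pattern falls straight out of integer arithmetic. A small sketch, assuming full-speed 1 ms frames (1000 frames per second):

    ```python
    # Integer arithmetic behind the 44/45 pattern: by frame N the driver
    # should have sent floor(rate * N / 1000) samples in total, so each
    # 1 ms frame carries the difference from the previous total.

    def samples_per_frame(rate_hz, n_frames, frames_per_sec=1000):
        sent, sizes = 0, []
        for frame in range(1, n_frames + 1):
            due = (rate_hz * frame) // frames_per_sec   # no float drift
            sizes.append(due - sent)
            sent = due
        return sizes

    sizes = samples_per_frame(44100, 10)
    assert sizes == [44] * 9 + [45]          # nine 44s, then one 45
    assert sum(sizes) == 441                 # 44.1 samples/frame on average
    assert samples_per_frame(48000, 10) == [48] * 10
    ```

    So the long-term sample rate is exact even though no single frame carries exactly 44.1 samples.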

    Quote Originally Posted by sjoc2000 View Post
    But you are speaking of USB burst
    No, I am talking about processing bursts. The audio samples flow into the USB device steadily - every single USB frame carries the appropriate amount for the next period. That is why this USB transfer mode is called isochronous. In fact, the steadiness is crucial for least-jitter clock recovery in the USB device (we are talking adaptive mode here). The continuous flow is provided by the USB controller, which continuously reads data from RAM via DMA - unlike the writing of fresh samples, which occurs in processing bursts by the playback software chain (application - sound layer - driver). The bursts are triggered by the IRQ thrown by the device (a slightly simplified model).

  17. #17
    sjoc2000
    Tremendous information.

    Lots of answers here. We all like answers, but I get real concerned about asking the appropriate question. At this point I need to work on understanding all these great answers so I can figure out what the next question ought to be. :0)

    Allowing for the fact that the world doesn't end by tomorrow, I'll see you then...

    Thanks, Paul and phofman, for all the info today.

    Jim

  18. #18
    sjoc2000
    After going through all that has been provided here, I don't really have any further questions. The information is extensive and requires some study on my part. With these explanations I have a grounding with which to continue.

    This thread is a good example of the kind of resources that are available at CA, in its members.

    Others who have followed this thread may have questions, and if those giving explanations continue to be generous, there could be further answers following.

    Thanks again,

    Jim