audiventory Posted March 23, 2017
15 hours ago, Jud said: I suppose cross-platform stuff can't be the most efficient because it requires a compatibility layer. So if you need speed, you use something else.
Absolutely, Jud. However, there is a way for C/C++: the Qt library, for example. It is even compatible with QNX (the RTOS that Ralf (@Ralf11) mentioned above). When I began C/C++ programming in the 1990s, we spent a lot of effort on displaying and plotting graphics. Now that is not an issue for DSP applications.
13 hours ago, Ralf11 said: back in ye olden days, people would test the code to see where the bottlenecks were, then redo those modules in assembler...
Yes. I have not written whole assembler modules, but I have inserted fragments of assembler code into C code. In my youth I tried to write an assembler for a single-chip computer without elementary knowledge of compilers, but I never finished that work.
AuI ConverteR 48x44 - HD audio converter/optimizer for DAC of high resolution files ISO, DSF, DFF (1-bit/D64/128/256/512/1024), wav, flac, aiff, alac, safe CD ripper to PCM/DSF, Seamless Album Conversion, AIFF, WAV, FLAC, DSF metadata editor, Mac & Windows. Offline conversion saves energy and nature.
audiventory Posted March 23, 2017
13 minutes ago, mansr said: For tight processing loops, yes. What we're talking about here is essentially a file copy, and there the disk I/O dominates unless it's a RAM filesystem.
To copy arrays I used a "for ..." construct with conditions. I have never studied how memcpy (the standard C library function) is implemented. Maybe in assembler it works in a more optimal way than a generic "for ..." loop.
The resources consumed by disk input/output depend on how you work with the disk: the size of the transferred data, buffering, reading in a parallel thread, etc. I don't know between which points @yamamoto2002 measured the time: with or without the disk operations. In any case, time performance needs to be checked for each OS and library.
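For readers who want to see the two copy styles mentioned above side by side, here is a minimal sketch in C. The names copy_loop, copy_memcpy and time_copy are hypothetical and chosen only for this example. Which variant is faster depends on the compiler, the C library and the CPU, and an optimizing compiler may turn the loop into a memcpy call on its own, so the numbers need to be measured on each system.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Element-by-element copy with an explicit "for" loop. */
static void copy_loop(unsigned char *dst, const unsigned char *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; i++) {
        dst[i] = src[i];
    }
}

/* The same copy delegated to the standard C library. */
static void copy_memcpy(unsigned char *dst, const unsigned char *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, n);
}

typedef void (*copy_fn)(unsigned char *, const unsigned char *, size_t);

/* CPU time in seconds used by `repeats` calls of one copy function.
   Only the in-memory copy is timed; no disk I/O is involved. */
static double time_copy(copy_fn fn, unsigned char *dst,
                        const unsigned char *src, size_t n, int repeats)
{
    clock_t start = clock();
    for (int r = 0; r < repeats; r++) {
        fn(dst, src, n);
    }
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling time_copy(copy_loop, ...) and time_copy(copy_memcpy, ...) on the same large buffers gives a rough comparison for one particular OS and library.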
Jud Posted March 23, 2017
14 minutes ago, mansr said: For tight processing loops, yes. What we're talking about here is essentially a file copy, and there the disk I/O dominates unless it's a RAM filesystem.
Can U.2 or M.2 approach RAM filesystem speeds?
One never knows, do one? - Fats Waller
The fairest thing we can experience is the mysterious. It is the fundamental emotion which stands at the cradle of true art and true science. - Einstein
Computer, Audirvana -> optical Ethernet to Fitlet3 -> Fibbr Alpha Optical USB -> iFi NEO iDSD DAC -> Apollon Audio 1ET400A Mini (Purifi based) -> Vandersteen 3A Signature.
mansr Posted March 23, 2017
10 minutes ago, Jud said: Can U.2 or M.2 approach RAM filesystem speeds?
Even if the flash itself is infinitely fast, those still entail an additional DMA operation at PCIe rate. In reality, the fastest SSDs achieve transfer rates up to a few GB/s.
yamamoto2002 Posted March 23, 2017
41 minutes ago, audiventory said: To copy arrays I used a "for ..." construct with conditions. I have never studied how memcpy is implemented. The resources consumed by disk input/output depend on how you work with the disk: the size of the transferred data, buffering, reading in a parallel thread, etc. I don't know between which points @yamamoto2002 measured the time: with or without the disk operations. In any case, time performance needs to be checked for each OS and library.
My implementation of the music player 6 years ago was slightly different from the current one: when the play button was pressed, the program read all files of the playlist into main memory, then converted the endianness of the PCM data (if necessary), and then playback started. So the time measurement excludes file I/O time and playback time.
When this endianness conversion was first implemented, it was not optimized at all and was much slower. Windows caches previously read files, so reading the same music file a second time is really fast (this happens very often when developing a music player program), and the conversion time was noticeable. So I benchmarked the code and improved its performance a bit, and through this effort the "not optimized at all" program became a "not-so-optimized" program.
BTW, this is my memcpy implementation written in x64 assembler. This is fully optimized code. I don't remember exactly, but there may be a limitation on the size of the data copied.
https://sourceforge.net/p/playpcmwin/code/HEAD/tree/PlayPcmWin/00experiments/SseCopyTest64/MyMemcpy64a.asm
Sunday programmer since 1985 Developer of PlayPcmWin
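As a rough illustration of the conversion step described above (this is not PlayPcmWin's actual code), a plain, unoptimized C sketch that swaps the byte order of PCM samples already held in memory might look like this. The 16-bit sample width and the names swap16 and swap16_buffer are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Swap the two bytes of one 16-bit sample. */
static uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/* Convert a whole in-memory buffer of 16-bit samples in place.
   The data is assumed to have been read from disk beforehand,
   so timing this call measures the conversion only. */
static void swap16_buffer(uint16_t *samples, size_t count)
{
    assert(samples != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        samples[i] = swap16(samples[i]);
    }
}
```

Applying swap16_buffer twice restores the original data, which is a handy sanity check when benchmarking.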
audiventory Posted March 23, 2017
4 hours ago, mansr said: Even if the flash itself is infinitely fast, those still entail an additional DMA operation at PCIe rate. In reality, the fastest SSDs achieve transfer rates up to a few GB/s.
When you make music projects in a DAW on an old computer with an old "mechanical" HDD, the disk load is about zero, even for high-resolution 96/192 kHz floating-point audio with more than 10 audio tracks plus MIDI tracks. Each audio track is a separate audio file; the MIDI tracks are stored in a single file. But the CPU load may be about 100% for DSP. With DSP turned off, the CPU load drops to 10%. This is a significantly harder mode than reading one file from disk.
These are approximate values. Much depends on the particular disk, system settings, and read buffer length. In theory it needs to be checked for every computer.
I suppose the resources consumed by disk input/output are not significant in the total calculation time.
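A back-of-the-envelope check of the disk data rate in such a project, written as a small C sketch. It assumes mono tracks stored as 32-bit floating-point samples, which is not stated above, and the function name stream_bytes_per_second is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes per second needed to stream `tracks` uncompressed audio files,
   each with the given sample rate, channel count and sample size. */
static uint64_t stream_bytes_per_second(uint32_t tracks, uint32_t sample_rate_hz,
                                        uint32_t channels, uint32_t bytes_per_sample)
{
    assert(tracks > 0 && channels > 0);
    return (uint64_t)tracks * sample_rate_hz * channels * bytes_per_sample;
}
```

Under these assumptions, stream_bytes_per_second(10, 192000, 1, 4) gives 7,680,000 bytes per second, roughly 7.7 MB/s for ten tracks. Seeking between many separate files adds overhead that this figure does not capture.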
mansr Posted March 23, 2017
50 minutes ago, audiventory said: When you make music projects in a DAW on an old computer with an old "mechanical" HDD, the disk load is about zero, even for high-resolution 96/192 kHz floating-point audio with more than 10 audio tracks plus MIDI tracks. Each audio track is a separate audio file; the MIDI tracks are stored in a single file. But the CPU load may be about 100% for DSP. With DSP turned off, the CPU load drops to 10%. This is a significantly harder mode than reading one file from disk. These are approximate values. Much depends on the particular disk, system settings, and read buffer length. In theory it needs to be checked for every computer. I suppose the resources consumed by disk input/output are not significant in the total calculation time.
For heavy DSP work, sure. For byte-swapping uncompressed PCM, I/O is the bottleneck. A 3 GHz CPU with SSE2 instructions can byte-swap 48 GB/s assuming one 128-bit operation per cycle. That's much more than any storage device can handle.
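To make the figure above concrete: 3 GHz times 16 bytes (128 bits) per cycle is 48 GB/s. The sketch below shows one possible way to byte-swap 16-bit samples with SSE2 intrinsics, plus the same arithmetic as a function. It is an illustration under assumptions (16-bit samples, hypothetical function names, the `__SSE2__` macro as defined by GCC and Clang), not code from any of the programs discussed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>  /* SSE2 intrinsics */
#endif

/* Byte-swap 16-bit samples in place, 8 samples per step where SSE2 is available. */
static void swap16_buffer_sse2(uint16_t *samples, size_t count)
{
    size_t i = 0;
    assert(samples != NULL || count == 0);
#if defined(__SSE2__)
    for (; i + 8 <= count; i += 8) {
        __m128i v = _mm_loadu_si128((const __m128i *)(samples + i));
        /* (v << 8) | (v >> 8) in every 16-bit lane */
        v = _mm_or_si128(_mm_slli_epi16(v, 8), _mm_srli_epi16(v, 8));
        _mm_storeu_si128((__m128i *)(samples + i), v);
    }
#endif
    /* Scalar tail, and the complete fallback when SSE2 is not enabled. */
    for (; i < count; i++) {
        samples[i] = (uint16_t)((samples[i] << 8) | (samples[i] >> 8));
    }
}

/* Idealised peak rate: one operation of `bytes_per_op` bytes per clock cycle. */
static double peak_bytes_per_second(double clock_hz, double bytes_per_op)
{
    return clock_hz * bytes_per_op;
}
```

peak_bytes_per_second(3e9, 16) returns 4.8e10, i.e. 48 GB/s. The real throughput of such a loop depends on the CPU and on memory bandwidth, so it should be treated as an upper-bound estimate.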
yamamoto2002 Posted March 23, 2017
5 hours ago, mansr said: For heavy DSP work, sure. For byte-swapping uncompressed PCM, I/O is the bottleneck. A 3 GHz CPU with SSE2 instructions can byte-swap 48 GB/s assuming one 128-bit operation per cycle. That's much more than any storage device can handle.
I think so too.
My estimate that "it will be 3x or 4x faster when optimized" is based on the Xeon W3680's memory bandwidth, which is 32 GB/s.
https://ark.intel.com/products/47917/Intel-Xeon-Processor-W3680-12M-Cache-3_33-GHz-6_40-GTs-Intel-QPI