I want to implement an ASIO memory copy process for 2048 samples in Bug head.
My sound player is designed so that the longer the ASIO latency, the better the sound quality. It is currently used in high-end audio systems costing $200,000 and in professional studios that are strict about sound accuracy, and it has a solid track record.
ASIO transfer code:
...
!WasapiPorc_Process_SnowFall59:
!MOVNTQ [R8], mm5 ; Set ; 5
!MOVNTQ [R8], mm2 ; 0
!MOVNTQ [R8], mm1 ; 1
!MOVNTQ [R8], mm3 ; 1
!MOVNTQ [R8], mm5 ; Set
!MOVNTQ [R8], mm3 ; 1
!MOVNTQ [R8], mm5 ; Set
!MOVNTQ [R8], mm2 ; 0
!MOVNTQ [R8], mm1 ; 1
!MOVNTQ [R8], mm3 ; 1
!MOVNTQ [R8], mm5 ; Set
!MOVNTQ [R8], mm3 ; 1
!WasapiPorc_Process_SnowFall68:
!MOVNTQ [R8], mm5 ; Set ; 6
!NOP [Rip]
!NOP [R8]
!MOVQ mm5, [R8] ; [12.12 - 2.99]-Start
;dump ; !MOVQ mm1, [R8] ; [12.27 - 3.14]
;dump ; !MOVQ mm3, [R8] ; [12.27 - 3.14]
!MOVQ mm5, mm5 ; [12.34 - 3.21]
!MOVQ mm5, mm5 ; [12.34 - 3.21]
!MOVQ mm5, mm5 ; [12.34 - 3.21]
!MOVNTQ [R8], mm2 ; 0
!MOVNTQ [R8], mm1 ; 1
!MOVNTQ [R8], mm3 ; 1
!MOVNTQ [R8], mm5 ; Set
!MOVNTQ [R8], mm3 ; 1
!MOVNTQ [R8], mm5 ; Set
!MOVNTQ [R8], mm2 ; 0
!MOVNTQ [R8], mm1 ; 1
!MOVNTQ [R8], mm3 ; 1
!MOVNTQ [R8], mm5 ; Set
!MOVNTQ [R8], mm3 ; 1
!MOVNTQ [R8], mm5 ; Set ; 6 [12.12 - 2.99]-End
!NOP [Rip]
!NOP [R8]
!AddFNOP_S_WasapiProc_2:
!INC Rdx
!INC Rdx
!INC Rdx
!INC Rdx ;4
!INC Rdx
!INC Rdx
!INC Rdx
!INC Rdx ;8
!INC R8
!INC R8
!INC R8
!INC R8 ;4
!INC R8
!INC R8
!INC R8
!INC R8 ;8
!DEC Rcx
!DEC Rcx
!DEC Rcx
!DEC Rcx ;4
!DEC Rcx
!DEC Rcx
!DEC Rcx
!DEC Rcx ;8
!FNOP ; for wide
!FNOP
!FNOP
!FNOP ;4
!NOP [Rip] ; [12.10 - 2.97]
!NOP [Rip] ; [12.10 - 2.97]
!JNZ WASAPI_Proc_LOOP_222
; CopyMemory(*bufferDecode+WasapiPos, *buffer, length) ; or very light memory copy process
; result = length : WasapiPos + result
!WAIT ;1 [11.24 - 144]
!WAIT ; no fwait
!WAIT ; wait with FNOP
!WAIT ;4
!XCHG ch, cl
!XCHG cl, ch
The JNZ instruction is the one that causes the worst sound quality in this process. I was thinking of using AVX-512F instructions to avoid the loop. But supporting AVX-512 can easily become a heavy mental burden for compiler developers.
I am using memory access with MMX instructions. With instructions that use the general-purpose registers (RAX, R8), the left/right volume balance collapses in the lowest 8 bits on a fully digital amplifier. Memory access through the SSE XMM registers and the AVX YMM registers seems to change the CPU clock during the transfer, and the sound quality gets worse. AVX-512F instructions might improve the sound quality. Only the transfer process needs this, so it would be enough to write just that part as a DLL in FASM.
How do I write an x64 DLL that uses AVX-512 in FASM? Does anyone know anything about this?
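To make the question concrete, a minimal sketch of such a DLL might look like the following. It is untested and rests on several assumptions: a FASM release recent enough to assemble AVX-512 instructions, the INCLUDE environment variable pointing at FASM's include directory (for the export macro), and a CPU and OS that support AVX-512F, which the caller would have to check before calling. The names COPY512.DLL and Copy64 are hypothetical, chosen only for illustration. It copies a single 64-byte block; it is not the actual 2048-sample transfer.

```asm
; copy512.asm (hypothetical file name)
; Assemble with: fasm copy512.asm copy512.dll
format PE64 GUI 5.0 DLL
entry DllEntryPoint

include 'win64a.inc'            ; provides the export macro used below

section '.text' code readable executable

; Called by the Windows loader; returning TRUE (1) lets the DLL load.
DllEntryPoint:
        mov     eax, 1
        ret

; Copy64(dst, src) using the Win64 calling convention: RCX = dst, RDX = src.
; Copies exactly 64 bytes. dst must be 64-byte aligned, because the
; non-temporal store VMOVNTDQ faults on an unaligned address.
Copy64:
        vmovdqu64 zmm0, [rdx]   ; load 64 bytes (alignment not required)
        vmovntdq  [rcx], zmm0   ; non-temporal 64-byte store
        sfence                  ; order the streaming store before later stores
        vzeroupper              ; clear upper register state before returning
        ret

section '.edata' export data readable

  export 'COPY512.DLL',\
         Copy64, 'Copy64'

section '.reloc' fixups data readable discardable

  if $=$$
    dd 0, 8                     ; dummy entry when there are no fixups
  end if
```

From PureBasic, such a function could presumably be loaded with OpenLibrary() and called through GetFunction(), with the buffer addresses passed as the two arguments. Whether a 512-bit transfer changes anything audible would still have to be measured.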