All times are UTC + 1 hour

 Post subject: AVX-512 Instruction
Posted: Sat Feb 13, 2021 10:47 am
Enthusiast

Joined: Mon Jan 12, 2004 11:40 pm
Posts: 779
Location: Okazaki, JAPAN
Hello

PureBasic 5.73 LTS
flat assembler version 1.71.39

Is an update to flat assembler version 1.71.40 on the planned roadmap?
That version added support for Intel AVX-512.

AVX-512 provides 32 registers of 512 bits each (zmm0 to zmm31).
I hope to use them for the ASIO 2048-sample memory copy process in Bug head.
Code:
VMOVDQA64 [R8], zmm0
Add R8, 64
VMOVDQA64 [R8], zmm1
Add R8, 64
VMOVDQA64 [R8], zmm2
Add R8, 64
...

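The same stores could also be written without the intermediate ADD instructions by putting the offset into the address. This is only a minimal sketch under assumptions that are not confirmed in this thread: the assembler supports AVX-512, the CPU supports AVX-512F, the lines are passed to FASM with the '!' prefix, and R8 holds a 64-byte-aligned destination address.

```asm
; Sketch only: VMOVDQA64 raises a fault if the address is not 64-byte aligned.
!vmovdqa64 [r8], zmm0        ; destination bytes 0..63
!vmovdqa64 [r8+64], zmm1     ; destination bytes 64..127
!vmovdqa64 [r8+128], zmm2    ; destination bytes 128..191
!add r8, 192                 ; advance the pointer once, after the group of stores
```
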

Thanks.


 Post subject: Re: AVX-512 Instruction
Posted: Sat Feb 13, 2021 12:18 pm
User

Joined: Wed Feb 26, 2014 3:16 pm
Posts: 95
Quote:
I hope to use them for the ASIO 2048-sample memory copy process in Bug head.

ASIO is all about latency reduction. The smaller the buffers, the better.

But if you are talking about the render/mixing stage with anticipative pre-rendered buffering, then yes, AVX-512 is the way to go nowadays.


 Post subject: Re: AVX-512 Instruction
Posted: Sat Feb 13, 2021 6:15 pm
Enthusiast

Joined: Wed May 27, 2020 12:26 pm
Posts: 286
I do not understand how you manage to write assembly directly, like this:
Code:
add rax,1

I have always been annoyed by the confusion between native BASIC instructions and assembly instructions. I also never understood why some assembly instructions are rejected when written directly, yet are executed once they are prefixed with the '!' character...
Code:
! add rax, 1
Anyway, if the assembler is missing some instructions, it is possible (though laborious) to write their machine code bytes directly, unless recent operating systems have some security mechanism against this that I am not aware of.
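
As a minimal sketch of that idea, the `add rax, 1` from above could be emitted as raw bytes with FASM's `db` directive. It assumes that PureBasic passes '!' lines to FASM unchanged; the bytes are the standard x86-64 encoding of ADD r/m64, imm8.

```asm
; Two alternative ways to add 1 to RAX. Use one or the other, not both.
!add rax, 1                  ; mnemonic form, encoded by the assembler
!db 0x48, 0x83, 0xC0, 0x01   ; raw bytes: REX.W prefix, opcode 83 /0, ModRM C0 (rax), immediate 1
```

The same approach would apply to an instruction the assembler does not know, but the EVEX encoding used by AVX-512 is much longer to work out by hand.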

_________________
Thank you Google Search


 Post subject: Re: AVX-512 Instruction
Posted: Mon Feb 15, 2021 3:23 am
Enthusiast

Joined: Mon Jan 12, 2004 11:40 pm
Posts: 779
Location: Okazaki, JAPAN
Quote:
I hope to use them for the ASIO 2048-sample memory copy process in Bug head.

My sound player is designed so that the longer the ASIO latency, the better the sound quality. It is currently used in $200,000 high-end audio systems and in professional studios that are strict about sound accuracy, and it has a solid track record.

ASIO transfer code:
Code:
...
!WasapiPorc_Process_SnowFall59:
!MOVNTQ [R8], mm5 ; Set ; 5
!MOVNTQ [R8], mm2 ; 0
!MOVNTQ [R8], mm1 ; 1
!MOVNTQ [R8], mm3 ; 1
!MOVNTQ [R8], mm5 ; Set
!MOVNTQ [R8], mm3 ; 1
!MOVNTQ [R8], mm5 ; Set
!MOVNTQ [R8], mm2 ; 0
!MOVNTQ [R8], mm1 ; 1
!MOVNTQ [R8], mm3 ; 1
!MOVNTQ [R8], mm5 ; Set
!MOVNTQ [R8], mm3 ; 1
!WasapiPorc_Process_SnowFall68:
!MOVNTQ [R8], mm5 ; Set ; 6
!NOP [Rip]
!NOP [R8]
!MOVQ mm5, [R8] ; [12.12 - 2.99]-Start
;dump ;   !MOVQ mm1, [R8] ; [12.27 - 3.14]
;dump ;   !MOVQ mm3, [R8] ; [12.27 - 3.14]
!MOVQ mm5, mm5 ; [12.34 - 3.21]
!MOVQ mm5, mm5 ; [12.34 - 3.21]
!MOVQ mm5, mm5 ; [12.34 - 3.21]
!MOVNTQ [R8], mm2 ; 0
!MOVNTQ [R8], mm1 ; 1
!MOVNTQ [R8], mm3 ; 1
!MOVNTQ [R8], mm5 ; Set
!MOVNTQ [R8], mm3 ; 1
!MOVNTQ [R8], mm5 ; Set
!MOVNTQ [R8], mm2 ; 0
!MOVNTQ [R8], mm1 ; 1
!MOVNTQ [R8], mm3 ; 1
!MOVNTQ [R8], mm5 ; Set
!MOVNTQ [R8], mm3 ; 1
!MOVNTQ [R8], mm5 ; Set ; 6 [12.12 - 2.99]-End
!NOP [Rip]
!NOP [R8]
!AddFNOP_S_WasapiProc_2:
!INC Rdx
!INC Rdx
!INC Rdx
!INC Rdx ;4
!INC Rdx
!INC Rdx
!INC Rdx
!INC Rdx ;8
!INC R8
!INC R8
!INC R8
!INC R8 ;4
!INC R8
!INC R8
!INC R8
!INC R8 ;8
!DEC Rcx
!DEC Rcx
!DEC Rcx
!DEC Rcx ;4
!DEC Rcx
!DEC Rcx
!DEC Rcx
!DEC Rcx ;8
!FNOP ; for wide
!FNOP
!FNOP
!FNOP ;4
!NOP [Rip] ; [12.10 - 2.97]
!NOP [Rip] ; [12.10 - 2.97]
!JNZ WASAPI_Proc_LOOP_222
;     CopyMemory(*bufferDecode+WasapiPos, *buffer, length) ; or very light memory copy process
;     result = length : WasapiPos + result   
!WAIT ;1 [11.24 - 144]
!WAIT ; no fwait
!WAIT ; wait with FNOP
!WAIT ;4
!XCHG ch, cl
!XCHG cl, ch


The JNZ instruction is the one that degrades the sound quality the most in this process. I was thinking of using AVX-512F instructions to avoid the loop. But AVX-512 support can easily become a heavy burden for compiler developers.

I am using MMX instructions for memory access. The problem with instructions that use the general-purpose registers (RAX, R8) is that the left/right volume balance collapses at the lowest 8 bits with a fully digital amplifier. Memory access with the SSE XMM and AVX YMM register instructions seems to change the CPU clock during the transfer, and the sound quality gets worse. The AVX-512F instructions might improve the sound quality. Since only the transfer process is affected, it would be enough to write that part as a DLL in FASM.

How do I write an AVX-512 x64 DLL in FASM? Does anyone know anything about this?

