CompareMemory() optimizations

technopol
User
User
Posts: 27
Joined: Thu Mar 24, 2011 11:00 pm

CompareMemory() optimizations

Post by technopol »

I imagine CompareMemory() stops at the first data difference, but does it?
Also, is it quadword-optimized (using padding to handle arbitrary byte lengths)?
Which is faster?

Code: Select all

success = CompareMemory(*passBuf,*netBuf,passLen)
OR

Code: Select all

;( passLen already a multiple of 8 )
success = #True
For i = 0 To passLen - 8 Step 8
  If PeekQ(*passBuf+i) <> PeekQ(*netBuf+i)
    success = #False : Break
  EndIf
Next i
jacdelad
Addict
Addict
Posts: 2003
Joined: Wed Feb 03, 2021 12:46 pm
Location: Riesa

Re: CompareMemory() optimizations

Post by jacdelad »

Please use code tags and provide runnable code!

CompareMemory stops as soon as further comparison is unnecessary (i.e. at the first difference).

Code: Select all

#Cycles=1000
; Two 1 MiB buffers filled with identical random bytes, so every
; comparison below has to scan the whole buffer.
*Mem1=AllocateMemory(1048576)
*Mem2=AllocateMemory(1048576)
For i=0 To 1048575
  RandomByte=Random(255,0)
  PokeA(*Mem1+i,RandomByte)
  PokeA(*Mem2+i,RandomByte)
Next

; Built-in CompareMemory()
Timer=ElapsedMilliseconds()
For counter=1 To #Cycles
  result=CompareMemory(*Mem1,*Mem2,1048576)
Next
Timer1=ElapsedMilliseconds()-Timer

; Hand-written loop comparing one quad (8 bytes) per iteration
Timer=ElapsedMilliseconds()
For counter=1 To #Cycles
  success=#True
  For i=0 To 1048575 Step 8
    If PeekQ(*Mem1+i)<>PeekQ(*Mem2+i)
      success=#False
      Break
    EndIf
  Next
Next
Timer2=ElapsedMilliseconds()-Timer

MessageRequester("Result","Result:"+#CRLF$+"CompareMemory: "+Str(Timer1)+"ms"+#CRLF$+"Own function: "+Str(Timer2)+"ms",#PB_MessageRequester_Info)
The results speak for themselves.
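
The benchmark above compares two identical buffers, so it measures a full scan rather than the early exit. A minimal sketch of how the early exit could be observed is to time one case where the buffers differ in the first byte and one where they differ only in the last byte. The names `*bufA`, `*bufB`, `timeFirst` and `timeLast` are made up for this illustration, and the numbers will depend on the machine, the compiler backend and the PureBasic version.

```purebasic
EnableExplicit

#Size   = 1048576   ; same buffer size as the benchmark above
#Cycles = 1000

Define *bufA = AllocateMemory(#Size)
Define *bufB = AllocateMemory(#Size)
Define.q t0, timeFirst, timeLast
Define counter, result

If *bufA = 0 Or *bufB = 0
  End   ; allocation failed
EndIf

RandomData(*bufA, #Size)
CopyMemory(*bufA, *bufB, #Size)   ; *bufB is now an exact copy of *bufA

; Case 1: the buffers differ in the very first byte
PokeA(*bufB, PeekA(*bufA) ! 1)    ; flip the lowest bit (! is XOR)
t0 = ElapsedMilliseconds()
For counter = 1 To #Cycles
  result = CompareMemory(*bufA, *bufB, #Size)
Next
timeFirst = ElapsedMilliseconds() - t0
PokeA(*bufB, PeekA(*bufA))        ; restore the first byte

; Case 2: the buffers differ only in the very last byte
PokeA(*bufB + #Size - 1, PeekA(*bufA + #Size - 1) ! 1)
t0 = ElapsedMilliseconds()
For counter = 1 To #Cycles
  result = CompareMemory(*bufA, *bufB, #Size)
Next
timeLast = ElapsedMilliseconds() - t0

MessageRequester("Early exit", "Difference in first byte: " + Str(timeFirst) + " ms" + #CRLF$ + "Difference in last byte: " + Str(timeLast) + " ms")
```

If the comparison really stops at the first difference, the first timing should come out far smaller than the second.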
Good morning, that's a nice tnetennba!

PureBasic 6.21/Windows 11 x64/Ryzen 7900X/32GB RAM/3TB SSD
Synology DS1821+/DX517, 130.9TB+50.8TB+2TB SSD
idle
Always Here
Always Here
Posts: 5886
Joined: Fri Sep 21, 2007 5:52 am
Location: New Zealand

Re: CompareMemory() optimizations

Post by idle »

jacdelad wrote: Thu Apr 27, 2023 1:32 am Please use code tags and provide runnable code!

CompareMemory stops as soon as further comparison isn't necessary (=the first difference).

I didn't know it did that! :shock:

Post by technopol »

Thanks a million for the quick and very detailed reply!

I'm sorry for the unrunnable code; it was just part of the question, not meant to be run. Even though I've been an avid PureBasic user since 1998, this is the first time I've read about code tags (I'm not a big user of the forum; I know I should be). Where can I find documentation about the forum's tags? phpBB?

Post by jacdelad »

The editor itself offers them, right next to Bold/Italic/Underline etc.

An avid user since 1998 and only 15 posts? What have you been doing all this time in your cave?

Post by technopol »

Because of industrial secrets and patent applications not yet filed, I couldn't share anything from my work, as I am always working on my inventions. Even the truncated code I just shared had to be extracted, simplified, and have all its variable names changed. I started programming some 53 years ago on this monster, https://www.hpmuseum.org/hp9100.htm, then moved to Basic and Assembler 10 years later on an army of ZX80/81 and TS1000 machines, most of which I destroyed with my own expansion boards, so I very rarely (or never) need advice, only more information. And because of my Asperger syndrome, I'm not much into giving advice, as I always work (and live) alone.

But I can name a few things I have specifically programmed with PureBasic (if Fred knew all I've done with his software, he would ask me for a few million):
On Amiga (most of them with added routines I wrote with HiSoft's Devpac Assembler)
- (1998) HoloEmulator™ holographic simulator with 3D shutter glasses, 60fps virtual reality screen update, 1.3 sec full 30 views 3D image loading, DCTV output
- (1999) VirtualScanLab™ automated 3D object/hologram scanner/cataloger, 30" robotic rotating plate with my custom stepper driver, VLab Motion Amiga board, linked to the HoloEmulator
- (1999) Comfy3D Compositing (unfinished) automated animation gel scanner and 3D animator/compositor for ALL old Disney animation for Disney Channel US
- (1999) an unnamed high-speed 3D and/or animated lenticular image interlacer (for lithographic printing directly on PETE lenticular plastics)
- (2000) DRIP™ (Dense Raster Image Processor) for FULL COLOR LOW NOISE ZERO MOIRE ultra-high definition lithographic printing with pure 2D 600x1200 pixels per inch resolution, or 3D 150x4800 ppi for encoding up to 63 images on 75.5 lpi lenticular plastic lenses, patent accepted under my name in US and Europe (brought me more Disney business and new customers: Sony, Microsoft, Universal Studios, Lucasfilm, TV Guide, The Simpsons, Amex, US Army, Warner Bros., Paramount, Toyota, BASF, Marlboro, Audi, Kellogg's, PepsiCo, The Matrix, The Mummy, The Mummy Returns,...; won 1st prize at PIA contest with it)
- (2002, on an Amiga emulator running on big PCs) DRIP™ Composite, like QuarkXPress but with 0.000104" or 9600dpi image placement resolution (100x more)
- fully automated robotic 360° camera mount for Canon EOS 5D Mk III taking 36 images per Google Street View spot, with stitchless ultra-HDR software (image merge done in the linear domain, before the logarithmic curve); the program generates a stereo soundtrack that controls the stepper motor and the camera shutter through a radically simple battery-powered electronic circuit and a little MP3 player.
- ...
On DEC Alpha 21164 Windows NT (Dual 21164 CPU with 128-bit memory path monsters!)
- ...
And on PCs!!... ...it's very late and I need to sleep; sorry, the rest of the story another day (and I'm not talking about stuff done without PureBasic, on other platforms, and in fields other than computer science, like analog and digital electronics, electroacoustics, optics, music, synthesizers, sound studios, photography, NLE video editing, audio FX, agrotechnology, AI systems, light sequencers/controllers/drivers/installations, and my eyes are closed)
IceSoft
Addict
Addict
Posts: 1694
Joined: Thu Jun 24, 2004 8:51 am
Location: Germany

Re: CompareMemory() optimizations

Post by IceSoft »

Original:
CompareMemory: 97ms
Own function: 423ms

Slightly optimized (still limited by a PB bottleneck):
CompareMemory: 98ms
Own function: 262ms

The big bottleneck is this part:
*Mem11.quads = *Mem1+i
*Mem22.quads = *Mem2+i

If *Mem11\quad<>*Mem22\quad

@Fred
Maybe direct support for this kind of syntax would give a performance boost:
If *Mem11+i\quad<>*Mem22+i\quad

Here is the faster version, which still has the bottleneck:

Code: Select all

#Cycles=1000


Structure quads
  quad.q
EndStructure



*Mem1=AllocateMemory(1048576)
*Mem2=AllocateMemory(1048576)
For i=0 To 1048575
  RandomByte=Random(255,0)
  PokeA(*Mem1+i,RandomByte)
  PokeA(*Mem2+i,RandomByte)
Next

Timer=ElapsedMilliseconds()
For counter=1 To #Cycles
  result=CompareMemory(*Mem1,*Mem2,1048576)
Next
Timer1=ElapsedMilliseconds()-Timer
Debug result ; printed after the timer stops; non-zero means the buffers are equal

Timer=ElapsedMilliseconds()
For counter=1 To #Cycles
  success=#True
  For i=0 To 1048575 Step 8
    *Mem11.quads = *Mem1+i
    *Mem22.quads = *Mem2+i

    If *Mem11\quad<>*Mem22\quad
      success=#False
      Break
    EndIf
  Next
Next
Timer2=ElapsedMilliseconds()-Timer

MessageRequester("Result","Result:"+#CRLF$+"CompareMemory: "+Str(Timer1)+"ms"+#CRLF$+"Own function: "+Str(Timer2)+"ms",#PB_MessageRequester_Info)
Belive! C++ version of Puzzle of Mystralia
Bug Planet
<Wrapper>4PB, PB<game>, =QONK=, PetriDish, Movie2Image, PictureManager,...
Fred
Administrator
Administrator
Posts: 18179
Joined: Fri May 17, 2002 4:39 pm
Location: France
Contact:

Re: CompareMemory() optimizations

Post by Fred »

I checked the current CompareMemory() code: we didn't use memcmp(), which is twice as fast as our custom code. I've changed that for the next beta.
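
For anyone who wants to call memcmp() directly in the meantime, note that the two functions report equality differently: CompareMemory() returns non-zero when the areas are equal, while C's memcmp() returns 0 when they are equal. A small sketch of a wrapper with CompareMemory()-style results follows. `EqualMemory` is a hypothetical name, the `ImportC ""` form is the one used elsewhere in this thread, and it may need adjusting depending on platform and linker (this is an assumption, not something confirmed for every target).

```purebasic
EnableExplicit

ImportC ""
  ; C prototype: int memcmp(const void *a, const void *b, size_t n)
  ; The result is declared as .l because a C int is 32 bits wide.
  memcmp.l(*a, *b, size.i)
EndImport

; Hypothetical helper: #True if the areas are equal, #False otherwise,
; i.e. the same convention as CompareMemory().
Procedure.i EqualMemory(*a, *b, size.i)
  ProcedureReturn Bool(memcmp(*a, *b, size) = 0)
EndProcedure

Define *x = AllocateMemory(64)   ; AllocateMemory() zero-fills by default
Define *y = AllocateMemory(64)
If *x And *y
  Debug EqualMemory(*x, *y, 64)  ; both areas are still all zeros
  PokeA(*y + 10, 1)
  Debug EqualMemory(*x, *y, 64)  ; the areas now differ at offset 10
EndIf
```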
Re: CompareMemory() optimizations

Post by idle »

If memcmp() is twice as fast as what you currently use, then you can probably get it 4 times faster using SSE.
The mcmp() function here runs in the same time as memcmp().
Compile this with the C backend with optimization enabled. The mcmp function isn't complete; it would need a jump table to account for remainders.
Note that the prototype is only needed to stop the C optimizer from replacing the function call with its result.
CompareMemory 94 mcmp 45 *a=*b:1048576 *a<>*c: 0

Code: Select all


EnableExplicit 

Global *a,*b,*c,size,lp,a 

size = 1024*1024 
lp = 1000 

*a=AllocateMemory(size)  
*b=AllocateMemory(size) 
*c=AllocateMemory(size) 

RandomSeed(1)
RandomData(*a,size)    ;*a = *b 
RandomSeed(1)
RandomData(*b,size)
RandomData(*c,size)    ;*a <> *c     

ImportC "" 
  memcmp(*a,*b,size) 
EndImport   

Procedure mcmp(*a,*b,size) 
    Protected pt,*pa.quad,*pb.quad  
    *pa = *a 
    *pb = *b
    While (*pa\q ! *pb\q) = 0 
      *pa+8
      *pb+8 
      pt+8 
      If pt = size 
        ProcedureReturn size  
      EndIf   
    Wend 
EndProcedure   

Prototype pmcmp(*a,*b,size)
Global pmcmp.pmcmp = @mcmp() 

Global st,st1,et,et1 

st = ElapsedMilliseconds() 
For a = 0 To lp 
  CompareMemory(*a,*b,size)
  ;memcmp(*a,*b,size)
Next 
et = ElapsedMilliseconds() 

st1  = ElapsedMilliseconds() 
For a = 0 To lp 
  pmcmp(*a,*b,size) 
Next 
et1 = ElapsedMilliseconds() 

Global out.s = " CompareMemory " + Str(et-st) + " mcmp " + Str(et1-st1) + " *a=*b:" + Str(mcmp(*a,*b,size)) + " *a<>*c: " + Str(mcmp(*a,*c,size))

SetClipboardText(out)
MessageRequester("times",out) 
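
As noted, mcmp() only handles sizes that are a multiple of 8. One simple way to cover the remainder, sketched here with a plain byte loop instead of the jump table mentioned above, is to compare whole quads first and then the 0 to 7 leftover bytes. `EqualBytes` is a hypothetical name, the sketch returns #True/#False rather than the size, treats a size of 0 or less as "equal" (my own choice), and relies on unaligned quad reads being acceptable, as on x86/x64. No speed claim is made for it.

```purebasic
EnableExplicit

Procedure.i EqualBytes(*a, *b, size.i)
  Protected *qa.Quad = *a          ; walk the bulk 8 bytes at a time
  Protected *qb.Quad = *b
  Protected *ba.Ascii, *bb.Ascii   ; walk the tail 1 byte at a time
  Protected quads.i, tail.i

  If size <= 0
    ProcedureReturn #True          ; nothing to compare
  EndIf

  quads = size >> 3                ; number of whole 8-byte words
  tail  = size & 7                 ; 0..7 leftover bytes

  While quads > 0
    If *qa\q <> *qb\q
      ProcedureReturn #False       ; early exit on the first differing quad
    EndIf
    *qa + 8
    *qb + 8
    quads - 1
  Wend

  *ba = *qa                        ; continue where the quad loop stopped
  *bb = *qb
  While tail > 0
    If *ba\a <> *bb\a
      ProcedureReturn #False
    EndIf
    *ba + 1
    *bb + 1
    tail - 1
  Wend

  ProcedureReturn #True
EndProcedure

; Usage: 13 bytes = one quad plus a 5-byte tail
Define *x = AllocateMemory(13)     ; zero-filled by default
Define *y = AllocateMemory(13)
If *x And *y
  PokeA(*y + 12, 1)                ; make only the last byte differ
  Debug EqualBytes(*x, *y, 13)     ; 0: the difference is inside the tail
  Debug EqualBytes(*x, *y, 12)     ; 1: the first 12 bytes are identical
EndIf
```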

Re: CompareMemory() optimizations

Post by technopol »

Great news! Thanks, Fred.