Is there a faster way for DSA? (Test EXE)

va!n · Post by **va!n** » Sun Feb 12, 2006 12:23 am

@Hatonastick:
Sorry, I just read the first posts. The box should only appear when the framerate drops 50%.

@djes:
Thanks for the link. I downloaded the zip and tried the two posted examples. It's all in ASM, which sadly I don't understand, and when I compiled both examples I got a black screen with some trashy moving red and blue graphic blocks. It looks like a mempeek on a ProTracker module with Action Replay on the good old Amiga.

If you have any tips for me, that would be great! Thanks in advance.

djes · Post by **djes** » Sun Feb 12, 2006 12:46 pm

va!n wrote: @djes:
Thanks for the link. I downloaded the zip and tried the two posted examples. It's all in ASM, which sadly I don't understand, and when I compiled both examples I got a black screen with some trashy moving red and blue graphic blocks. It looks like a mempeek on a ProTracker module with Action Replay on the good old Amiga. If you have any tips for me, that would be great! Thanks in advance.

Well, that's the problem with direct access: compatibility! It works with some cards and not with others. That's why, on a PC with Windows, it's safer to use API functions from DirectX and OpenGL. When you come from the Amiga you have to change your habits, and some things are really annoying and seem to come from deranged spirits!

Anyway, did you try this archive? http://do.nico.free.fr/docs/objet5.zip
For direct access to the screen buffer you have to use DrawingBuffer() and DrawingBufferPitch(). Maybe there's some code in the forum. If so, we could complete this post with working and efficient code.
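
To illustrate why the pitch matters, here is a minimal sketch in C rather than PureBasic. The function name `fill_gradient`, the 4 bytes per pixel, and the test pattern are assumptions made for illustration only; `base` and `pitch` stand in for the values you would get from DrawingBuffer() and DrawingBufferPitch(). The point is that each scanline starts `pitch` bytes after the previous one, and `pitch` can be larger than width * 4.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: fills a 32-bit framebuffer row by row.
   base  = address of the first pixel (assumed, like DrawingBuffer())
   pitch = bytes per scanline (assumed, like DrawingBufferPitch()) */
void fill_gradient(uint8_t *base, size_t pitch, int width, int height)
{
    assert(pitch >= (size_t)width * 4);      /* a row must fit into its pitch */

    for (int y = 0; y < height; y++) {
        /* Advance by pitch, not by width * 4: rows may be padded. */
        uint8_t *row = base + (size_t)y * pitch;

        for (int x = 0; x < width; x++) {
            uint32_t pixel = (uint32_t)(x & 0xFF);   /* simple test pattern */
            memcpy(row + (size_t)x * 4, &pixel, sizeof pixel);
        }
    }
}
```

Code that assumes pitch = width * 4 may happen to work on one card and produce skewed or trashed output on another, which could be one reason for results that differ between machines.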

traumatic · Post by **traumatic** » Sun Feb 12, 2006 1:01 pm

djes wrote: For direct access to the screen buffer you have to use DrawingBuffer() and DrawingBufferPitch(). Maybe there's some code in the forum.

http://www.purebasic.com/Plasma_DSA.pb

djes · Post by **djes** » Sun Feb 12, 2006 3:07 pm

Traumatic, happy to see you on this topic

va!n · Post by **va!n** » Sun Feb 12, 2006 9:29 pm

@djes:
Yes, I tried that archive. I downloaded it again, and this time an exe is included as an example. I only get trash on my screen, like an Amiga ECS 1.2 game on an OCS 1.3 Amiga!

OK, it's better to use the DX API. Do you know any way to speed up such stuff other than the PureBasic DSA commands or PureBasic Plot()? Btw, the effect runs smoothly now; there was a float problem.

@traumatic:
Thanks for the link! It's 99% the same as what I do in my example, but it's still too slow with two tables (screen width*height each) on a 1280x1024 screen!

djes · Post by **djes** » Sun Feb 12, 2006 9:48 pm

Strange that it crashes! But tonton told me he'll use DrawingBufferPitch() in his next examples.

As for traumatic's code, it's pretty fast here! Don't forget to turn off the debugger!

va!n · Post by **va!n** » Sun Feb 12, 2006 11:55 pm

@djes:
Yes, the code linked by traumatic is fast! But in my 1280x1024 example I am using two tables, each 1280x1024. In the main loop there are some small mathematical operations between these two tables, and the result is drawn to the screen buffer. This slows things down a lot, even though the inner loop is really very small. I am a bit busy at the moment, but I will try to work on it and bring a new version very soon.

THCM · Post by **THCM** » Mon Feb 13, 2006 10:05 am

Here's another ugly hack I did after asking Fred for DSA

Code:

;----------------------------------------------
;Mandelzoom
;by THCM / Masters' Design Group
;----------------------------------------------
; Needs inline ASM support enabled in the compiler options.

#PB_PixelFormat_8Bits      = 1
#PB_PixelFormat_15Bits     = 2
#PB_PixelFormat_16Bits     = 3
#PB_PixelFormat_24Bits_RGB = 4
#PB_PixelFormat_24Bits_BGR = 5
#PB_PixelFormat_32Bits_RGB = 6
#PB_PixelFormat_32Bits_BGR = 7

#screenx=800
#screeny=600
#screend=32

#tiefemax=32     ; maximum iteration depth per pixel

frame.f=128      ; zoom factor, bounces between 128 and 1
subvalue.f=0.5   ; zoom step per frame

xminstart.f=-0.7/2
xmaxstart.f=2.1/2
yminstart.f=-1/2
ymaxstart.f=1/2

InitSprite()
OpenScreen(#screenx,#screeny,#screend,"")

Repeat

  SetFrameRate (60)

  ; Visible region for this frame
  xmin.f=xminstart.f*frame
  xmax.f=xmaxstart.f*frame
  ymin.f=yminstart.f*frame
  ymax.f=ymaxstart.f*frame

  ; Step per pixel
  dx.f=(xmax-xmin)/#screenx
  dy.f=(ymax-ymin)/#screeny

  cx.f=xmin
  cy.f=ymax

  StartDrawing(ScreenOutput())
  DrawingMode(1)

  FrameBuffer.l = DrawingBuffer()       ; address of the first pixel
  FramePitch    = DrawingBufferPitch()  ; bytes per scanline

  For y=0 To #screeny-1

    DrawPos=FrameBuffer                 ; start of the current scanline
    FrameBuffer=FrameBuffer+FramePitch  ; start of the next scanline

    For x=0 To #screenx-1

      tiefe.l=0
      xwert.f=0
      ywert.f=0
      xquad.f=0
      yquad.f=0

      ; Iterate until the point escapes or the depth limit is reached
      Repeat

        ywert=2*xwert*ywert-cy
        xwert=xquad-yquad-cx
        xquad=xwert*xwert
        yquad=ywert*ywert
        tiefe+1

      Until (xquad+yquad) > 8 Or tiefe = #tiefemax

      ; Build a grey value from the iteration count (tiefe = 32 wraps to black)
      MOV eax, tiefe
      AND eax, 31
      SAL eax, 3
      MOV ah, al
      SAL eax, 8
      MOV al, ah

      ; Write the pixel and advance 4 bytes.
      ; edx is used instead of ebx, which PureBasic expects to be preserved.
      MOV edx, DrawPos
      MOV [edx], eax
      ADD edx, 4
      MOV DrawPos, edx

      cx=cx+dx

    Next x

    cx=xmin
    cy=cy-dy

  Next y

  StopDrawing()
  FlipBuffers(1)

  ; Zoom, and reverse direction at both ends
  frame.f=frame.f-subvalue.f

  If frame.f <=1
    frame.f=1
    subvalue.f=subvalue.f*-1
  EndIf

  If frame.f >=128
    frame.f=128
    subvalue.f=subvalue.f*-1
  EndIf

Until GetAsyncKeyState_(#VK_ESCAPE)

End

va!n · Post by **va!n** » Mon Feb 13, 2006 1:00 pm

@THCM:
Thanks for helping to find another "fast" way. I tried your example and it works, but it runs in 800x600x32, and when the fractal zooms in (nearly fullscreen), it's not smooth.

I don't know, but I think there must be some other way to improve the speed. I will try to write it in BlitzBasic for speed comparisons (this may take some days, because I have to relearn Blitz and RTFM).

Edited:
To optimize your example, you can remove DrawingMode(1) and call SetFrameRate(60) before the main loop. AFAIK you don't have to call SetFrameRate(60) on every loop!?

THCM · Post by **THCM** » Mon Feb 13, 2006 1:25 pm

It gets slower the more black pixels you see on the screen. Every black pixel takes 32 (#tiefemax) iterations, which costs a lot of computation time. A simple effect like a plasma has a fixed amount of computation per pixel. I don't think that BlitzBasic is any faster. Always try to do some calculations between writes to video memory. Only use a buffer in main memory if you need to reread written data. The best way is to use a one-scanline buffer in main memory: calculate one scanline at a time and copy each finished scanline from your buffer to video memory.
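
The one-scanline approach could be sketched like this, in C rather than PureBasic. This is only an illustration under assumptions: `shade` is a hypothetical stand-in for the real effect, `MAX_WIDTH` is an arbitrary bound, and `video` and `pitch` stand in for the drawing buffer address and its bytes per line.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_WIDTH 2048   /* assumed upper bound for this sketch */

/* Hypothetical per-pixel effect; stands in for the real calculation. */
static uint32_t shade(int x, int y)
{
    return (uint32_t)((x ^ y) & 0xFF);
}

void render_by_scanline(uint8_t *video, size_t pitch, int width, int height)
{
    static uint32_t line[MAX_WIDTH];   /* one scanline in main memory */

    assert(width > 0 && width <= MAX_WIDTH);
    assert(pitch >= (size_t)width * sizeof line[0]);

    for (int y = 0; y < height; y++) {
        /* All reads and writes of the effect stay in main memory. */
        for (int x = 0; x < width; x++)
            line[x] = shade(x, y);

        /* Video memory is only written, once per scanline, sequentially. */
        memcpy(video + (size_t)y * pitch, line, (size_t)width * sizeof line[0]);
    }
}
```

The buffer is small enough to stay in the CPU cache, and the effect never reads back from video memory, which is typically the slow direction.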

djes · Post by **djes** » Mon Feb 13, 2006 1:34 pm

va!n wrote:@djes:
Yes, the code linked by traumatic is fast! But in my 1280x1024 example I am using two tables, each 1280x1024. In the main loop there are some small mathematical operations between these two tables, and the result is drawn to the screen buffer. This slows things down a lot, even though the inner loop is really very small. I am a bit busy at the moment, but I will try to work on it and bring a new version very soon.

You have to avoid cache hits. Maybe one (or two) big tables are not the best solution for getting the fastest effect.

THCM · Post by **THCM** » Mon Feb 13, 2006 1:43 pm

I think you mean avoid cache misses. Modern CPUs are much faster at doing a couple of calculations than at reading a value from a large table.

remi_meier · Post by **remi_meier** » Mon Feb 13, 2006 4:34 pm

@THCM: Right!

@va!n:
After some more testing with my Analyzer (advertisement) I found out that the calculation block (inside the two For loops) takes 74% of the time after all. The slowest operations at the moment seem to be the modulo operator (%) and the array access. I don't know how to avoid them, but I think the screen access at 4% isn't your problem (any more).

We should check whether C++ can optimize some things.
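
For the modulo cost mentioned above, two common workarounds can be sketched in C. va!n's actual inner loop is not shown in this thread, so every name and the table size below are hypothetical. The first assumes the table size can be chosen as a power of two, in which case `i % size` equals `i & (size - 1)` for unsigned `i`. The second works for any size when the index only ever advances by one.

```c
#include <assert.h>
#include <stdint.h>

#define TABLE_SIZE 256u                /* assumed: must be a power of two */
#define TABLE_MASK (TABLE_SIZE - 1u)

/* Same result as i % TABLE_SIZE for unsigned i, without a division. */
static inline uint32_t wrap_mask(uint32_t i)
{
    return i & TABLE_MASK;
}

/* For any table size: advance by one and wrap with a compare
   instead of computing (pos + 1) % size. */
static inline uint32_t wrap_step(uint32_t pos, uint32_t size)
{
    assert(pos < size);
    pos++;
    return (pos == size) ? 0u : pos;
}
```

Whether this helps in practice depends on the compiler and the real loop; an optimizing C compiler will usually do the masking itself when the divisor is a constant power of two, so measuring is the only way to be sure.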

djes · Post by **djes** » Mon Feb 13, 2006 5:33 pm

THCM wrote: I think you mean avoid cache misses. Modern CPUs are much faster at doing a couple of calculations than at reading a value from a large table.

Yes, that's what I meant (sorry for my bad English).
I have known for some time that float calculations are faster than integer ones on new CPUs. For table access, it would be interesting to know what size limit is still acceptable.

va!n · Post by **va!n** » Mon Feb 13, 2006 6:04 pm

remi_meier wrote: @va!n:
After some more testing with my Analyzer (advertisement) I found out that the calculation block (inside the two For loops) takes 74% of the time after all. The slowest operations at the moment seem to be the modulo operator (%) and the array access. I don't know how to avoid them, but I think the screen access at 4% isn't your problem (any more).

We should check whether C++ can optimize some things.

Advertisement? I don't see any...
Try the great Analyzer tool by remi_meier!

@remi_meier:
Thanks a lot for the great feedback and test results, showing us the real problem! How is it possible that the modulo operator (%) and possibly the array access are so slow? Btw, I would be interested to see the same in C++.

@Fred:
Any ideas for speed improvements / optimisations?