Parallel Programming

hellhound66
Enthusiast
Posts: 119
Joined: Tue Feb 21, 2006 12:37 pm

Post by hellhound66 »

Closed.
Last edited by hellhound66 on Sun Mar 09, 2008 10:32 pm, edited 1 time in total.
dell_jockey
Enthusiast
Posts: 767
Joined: Sat Jan 24, 2004 6:56 pm

Post by dell_jockey »

Moin HellHound,

Thanks a lot for this code. Apparently this method doesn't scale efficiently beyond two cores; see the results I got on my quad-core machine:

1 thread: 1566 iterations
2 threads: 3249 iterations
3 threads: 3295 iterations
4 threads: 3357 iterations

I've repeated this multiple times (compiled as a console app on WinXP-Pro, no debugger, no other app running), each time getting very similar results.
cheers,
dell_jockey
________
http://blog.forex-trading-ideas.com

Post by hellhound66 »

Thank you for testing.


Strange indeed. I'll check the code again. I will assign each thread to one core exclusively.

Are you sure you have a quadcore? :wink:


/edit:
Seems to be the worker tasks I used. Please check it out in 5 minutes again ^__^.
blueznl
PureBasic Expert
Posts: 6166
Joined: Sat May 17, 2003 11:31 am

Post by blueznl »

On my quad core WITH DEBUGGER OFF:

1 thread 849
2 threads 1444
3 threads 1666
4 threads 3066

Funny behaviour... The speed increase from two to three threads was minimal; however, the jump to four threads was outright impressive.

Hellhound, what kind of hardware do you use?

Post by hellhound66 »

I use a dual core (Core2Duo 2.0GHz), so it's no wonder that I get results like these:

1 thread 701
2 threads 1307
3 threads 1378
4 threads 1358


The behaviour with three and four threads is easy to explain.
I push all tasks into a queue. Then I start all threads. The first thread takes the first entry in the queue, the second thread takes the second entry, and so on. When a thread finishes its task, it takes the next task in the queue, and so on.
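
The queue mechanism described above can be sketched roughly as follows. This is a minimal illustration in Python rather than PureBasic, because the helper procedures from the original code are not shown in this thread. The names `run_batch` and `dummy` are hypothetical. Note also that in standard CPython the global interpreter lock keeps CPU-bound tasks from running truly in parallel, so the sketch shows only the scheduling pattern, not a speed-up.

```python
import queue
import threading


def run_batch(tasks, thread_count):
    """Run every callable in `tasks` on `thread_count` worker threads.

    All tasks are queued first; each worker then keeps taking the next
    entry until the queue is empty.
    """
    pending = queue.Queue()
    for task in tasks:
        pending.put(task)

    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            try:
                task = pending.get_nowait()
            except queue.Empty:
                return  # queue drained, this worker is done
            value = task()
            with results_lock:
                results.append(value)

    threads = [threading.Thread(target=worker) for _ in range(thread_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait until every task has been processed
    return results


def dummy():
    """Stand-in for one identical unit of work (hypothetical)."""
    return sum(range(1000))


if __name__ == "__main__":
    print(len(run_batch([dummy] * 4, 3)))  # 4 tasks shared by 3 threads
```

Because every worker pulls from the same queue, no thread is assigned tasks in advance; a thread that finishes early simply picks up whatever is left.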

In the example I put 4 tasks in the queue. The tasks are identical, so they all take (nearly) the same time. With 3 threads, three tasks start at once. When one thread finishes, it takes the last task, and the other two threads are idle during that time. So it is logical that on a quad core there is no big difference between 2 and 3 threads: both need two passes to get through the work. Only 4 threads can do all the work in one pass.
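
The arithmetic behind this can be written down as a small model. It is only a sketch under the assumptions stated in the post: all tasks are identical, and each thread has a core to itself. The function names `rounds` and `relative_speed` are made up for illustration.

```python
import math


def rounds(tasks, threads):
    """Number of passes needed when `threads` workers share `tasks` equal tasks."""
    return math.ceil(tasks / threads)


def relative_speed(tasks, threads):
    """Predicted throughput relative to a single thread (assumes equal tasks)."""
    return rounds(tasks, 1) / rounds(tasks, threads)


if __name__ == "__main__":
    for threads in (1, 2, 3, 4):
        print(threads, rounds(4, threads), relative_speed(4, threads))
    # 1 4 1.0
    # 2 2 2.0
    # 3 2 2.0
    # 4 1 4.0
```

With 4 tasks, the third thread buys nothing because the batch still needs two passes. The quad-core measurements posted in this thread roughly follow that shape: a near doubling at 2 threads, little gain at 3, and a large jump at 4.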

The key to this kind of programming is to split the work into many subtasks that can be processed in parallel. The more tasks you have in the queue, the better a multicore CPU can be used.

Post by dell_jockey »

hellhound66 wrote: Are you sure you have a quadcore? :wink:
Yes, a Dell Precision T3400 quad core...


With the current code I get the following:
1 thread 911 iterations
2 threads 1630 iterations
3 threads 1940 iterations
4 threads 3322 iterations

Compare that to my prior test. Odd, isn't it? Especially the fact that a single thread is now so much slower makes me wonder...

Post by hellhound66 »

Seems to be the worker tasks I used. Please check it out in 5 minutes again ^__^.
I changed the worker tasks. That's why it is "slower" now (the tasks are different).

/edit:
Feel free to change the task queue yourself and try it out. If there are bugs, this is the best way to find them.
You can use different tasks as well; then you can see that the order of the tasks is important, too.
DoubleDutch
Addict
Posts: 3220
Joined: Thu Aug 07, 2003 7:01 pm
Location: United Kingdom

Post by DoubleDutch »

I think there may be some kind of bug when you have more than 4 cores (in a dual quad-core system = 8 cores)...

Starting process with 1 thread(s)
Iterations : 966
Starting process with 2 thread(s)
Iterations : 1873
Starting process with 3 thread(s)
Iterations : 1941
Starting process with 4 thread(s)
Iterations : 3497
Autodetecting cores..
Starting process with 8 thread(s)
Iterations : 3478


?

Looking at the resource usage overview, only 50% of the CPU is used during the 8-thread run, so it isn't using the cores on the second CPU for some reason?
https://deluxepixel.com <- My Business website
https://reportcomplete.com <- School end of term reports system

Post by DoubleDutch »

If I change the "Game" procedure to this:

Procedure Game(Cores)
    CreateThreads(Cores)
    PrintN("Starting process with "+Str(ThreadCount)+" thread(s)")

    time.l = ElapsedMilliseconds()
    Repeat

        ; First batch: eight tasks in the queue, one per core
        AddWorkerQueueElement(@Dummy())
        AddWorkerQueueElement(@Dummy())
        AddWorkerQueueElement(@Dummy())
        AddWorkerQueueElement(@Dummy())
        AddWorkerQueueElement(@Dummy())
        AddWorkerQueueElement(@Dummy())
        AddWorkerQueueElement(@Dummy())
        AddWorkerQueueElement(@Dummy())
        ; Process all
        ProcessAll()

        ; Second batch: the next eight tasks in the queue
        AddWorkerQueueElement(@Dummy())
        AddWorkerQueueElement(@Dummy())
        AddWorkerQueueElement(@Dummy())
        AddWorkerQueueElement(@Dummy())
        AddWorkerQueueElement(@Dummy())
        AddWorkerQueueElement(@Dummy())
        AddWorkerQueueElement(@Dummy())
        AddWorkerQueueElement(@Dummy())
        ; Process all
        ProcessAll()

        Counter+1
    Until ElapsedMilliseconds()-time>4000 ; run for 4 seconds

    PrintN("Iterations : "+Str(Counter))
    Counter = 0

    EndThreads()
EndProcedure

I get:

Starting process with 1 thread(s)
Iterations : 489
Starting process with 2 thread(s)
Iterations : 960
Starting process with 3 thread(s)
Iterations : 1295
Starting process with 4 thread(s)
Iterations : 1858
Autodetecting cores..
Starting process with 8 thread(s)
Iterations : 3277


That looks right. A quick check with the resource monitor shows that with 4 threads the CPU is 48% used, and with 8 threads it's 90% used. :)
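
These usage figures can be cross-checked with a rough model. The sketch below is in Python and assumes identical tasks, one thread per core, and no more threads than cores; `busy_fraction` is a hypothetical helper, not part of the posted code.

```python
import math


def busy_fraction(tasks, threads, cores):
    """Average share of `cores` kept busy while one batch of equal tasks runs.

    Assumes every thread has its own core and threads <= cores.
    """
    passes = math.ceil(tasks / threads)  # passes needed to drain the batch
    return tasks / (passes * cores)


if __name__ == "__main__":
    print(busy_fraction(4, 8, 8))  # 4 tasks per batch, 8 threads -> 0.5
    print(busy_fraction(8, 4, 8))  # 8 tasks per batch, 4 threads -> 0.5
    print(busy_fraction(8, 8, 8))  # 8 tasks per batch, 8 threads -> 1.0
```

Under these assumptions, a batch of 4 tasks can never occupy more than 4 of the 8 cores, which would account for the 50% reading in the earlier 8-thread run. A batch of 8 tasks predicts about 50% with 4 threads and close to 100% with 8 threads, in line with the 48% and 90% readings.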