Multithreaded and Parallelizing

Everything else that doesn't fall into one of the other PB categories.

Post by Dreglor »

So, in some of my recent work there have been occasions where certain segments of code could easily be multithreaded, parallelized, or both. I have a few questions on both topics to make sure I can actually do it, and do it efficiently.
First, can multiple threads access the same buffer, just not at the same location (Thread A is accessing a long at $45 and Thread B is accessing one at $C7)? I ask because I could have one thread step through the odd elements and another through the even elements of a buffer of data that needs to be transformed.

The second is: how do the MMX/SSE registers interact with threads?
If I am using the registers and another thread comes in and also uses MMX/SSE, are my registers saved somewhere (on the stack?)

And my last question: what kinds of "lock-less" methods or lock-reducing practices are there to improve parallelized and multithreaded code?
~Dreglor

Post by Hades »

Hi Dreglor,

Raytracing again? :D


1. Buffer access is no problem. But it will be more efficient if every thread has its own buffer (splitting the original buffer into parts).

2. That's no problem. Every thread has its own set of registers (the OS saves and restores them when it switches threads). Just be careful with globals.

3. My method: I split everything that has to be done per frame into small jobs and create a list of them. Each thread reads the index of the next job in that list and increments it. Reading and incrementing that index needs a lock, but that hardly affects the overall speed as long as your jobs aren't too short.
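
A minimal sketch of that job list, for illustration only. The thread is about PureBasic; C++ is used here just to have something compact, and `run_jobs` is a made-up name, not part of any library. The lock is held only while a thread claims an index, never while it does the work:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper: runs every job in `jobs` on `thread_count` threads.
// Workers claim jobs by reading and incrementing a shared index under a lock.
void run_jobs(const std::vector<std::function<void()>>& jobs,
              unsigned thread_count) {
    assert(thread_count > 0);
    std::size_t next_job = 0;  // index of the next unclaimed job
    std::mutex index_lock;     // protects next_job

    auto worker = [&]() {
        for (;;) {
            std::size_t mine;
            {
                // Hold the lock only long enough to claim one index.
                std::lock_guard<std::mutex> guard(index_lock);
                if (next_job >= jobs.size()) return;  // nothing left to do
                mine = next_job++;
            }
            jobs[mine]();  // the actual work runs outside the lock
        }
    };

    std::vector<std::thread> workers;
    for (unsigned i = 0; i < thread_count; ++i) workers.emplace_back(worker);
    for (auto& t : workers) t.join();
}
```

Because a thread only asks for the next job when it has finished its current one, jobs of uneven length balance themselves across the threads.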

Post by Trond »

Dreglor wrote:First, can multiple threads access the same buffer, just not at the same location (Thread A is accessing a long at $45 and Thread B is accessing one at $C7)? I ask because I could have one thread step through the odd elements and another through the even elements of a buffer of data that needs to be transformed.
They can even access the same position if you want that, although writes to a shared position then need to be synchronized. But it's pretty useless because you will have two threads doing the same work.
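
As a rough illustration of several threads working on different parts of one buffer, here is a minimal C++ sketch (C++ only for illustration; the thread is about PureBasic). `transform_in_chunks` and `transform_value` are hypothetical names, and the transform itself is a placeholder. It gives each thread one contiguous slice rather than odd/even steps: interleaving is just as safe, but neighbouring elements written by different threads tend to share a cache line, which can slow things down.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Placeholder per-element transform, standing in for the real work.
inline std::int32_t transform_value(std::int32_t v) { return v * 2 + 1; }

// Transforms `buffer` in place, giving each thread one contiguous slice.
// No locking is needed because no two threads touch the same element.
void transform_in_chunks(std::vector<std::int32_t>& buffer,
                         unsigned thread_count) {
    assert(thread_count > 0);
    // Round up so that the slices cover the whole buffer.
    const std::size_t chunk =
        (buffer.size() + thread_count - 1) / thread_count;

    std::vector<std::thread> workers;
    for (unsigned t = 0; t < thread_count; ++t) {
        const std::size_t begin = std::min(buffer.size(), t * chunk);
        const std::size_t end = std::min(buffer.size(), begin + chunk);
        if (begin == end) break;  // buffer exhausted, no more slices
        workers.emplace_back([&buffer, begin, end]() {
            for (std::size_t i = begin; i < end; ++i) {
                buffer[i] = transform_value(buffer[i]);
            }
        });
    }
    for (auto& w : workers) w.join();
}
```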

Post by Dreglor »

Hades wrote:Hi Dreglor,

Raytracing again? :D


1. Buffer access is no problem. But it will be more efficient if every thread has its own buffer (splitting the original buffer into parts).

2. That's no problem. Every thread has its own set. Just be careful with globals.

3. My method: I split everything that has to be done per frame into small jobs and create a list of them. Each thread reads the index of the next job in that list and increments it. Reading and incrementing that index needs a lock, but that hardly affects the overall speed as long as your jobs aren't too short.
Heh, you never know what I have up my sleeve.
The method you describe sounds a lot like a thread pool. I have a class that handles such a system, but it uses a lock per "job". I haven't used it much, but it scales to the system core count (which is nice), and I'm unsure how well it handles heavy work, be it lots of little jobs or a few large ones.
~Dreglor

Post by Hades »

Yes, scaling to any number of cores you throw at it was my main concern. With quad cores getting cheaper and cheaper, hexa- and octo-cores just ahead, and multi-socket systems entering the desktop market, it's the obvious priority if you need maximum performance.
Also, I don't know up front how long a job will take, so splitting the work into symmetrical jobs isn't an option. But I don't think it would be worth the effort anyway.
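
Scaling to the machine mostly comes down to asking how many hardware threads there are and starting that many workers. A small C++ sketch (C++ only for illustration; `worker_count` is a hypothetical helper, and capping the count at the number of jobs is an extra assumption, not something described above):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>

// Hypothetical helper: how many worker threads to start for `job_count` jobs.
unsigned worker_count(std::size_t job_count) {
    assert(job_count > 0);  // the caller is expected to have work to hand out
    // hardware_concurrency() is only a hint and may return 0 if unknown.
    unsigned cores = std::thread::hardware_concurrency();
    if (cores == 0) cores = 1;
    // Assumption: never start more workers than there are jobs.
    return static_cast<unsigned>(std::min<std::size_t>(cores, job_count));
}
```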

With only a few large jobs it might be overkill, and creating jobs of a few lines of code without loops probably isn't a good idea anyway.
For anything else, I'm pretty sure you won't find a solution that makes a measurable difference. (If you find one, please tell me :) )
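
On the "lock-less" question: one common way to drop the lock around the job index is an atomic fetch-and-add. The sketch below is a variation on the locked job list, not something measured here, and whether it makes a measurable difference depends on how long the jobs are. C++ is used only for illustration, `run_jobs_lock_free` is a hypothetical name, and the counter is assumed to be lock-free on the target platform (typical on mainstream hardware, but not guaranteed by the language).

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical helper: same job list idea, but the shared index is an
// atomic counter instead of a variable guarded by a lock.
void run_jobs_lock_free(const std::vector<std::function<void()>>& jobs,
                        unsigned thread_count) {
    assert(thread_count > 0);
    std::atomic<std::size_t> next_job{0};  // index of the next unclaimed job

    auto worker = [&]() {
        for (;;) {
            // fetch_add returns the old value and increments in one atomic
            // step, so no two threads can ever claim the same index.
            const std::size_t mine = next_job.fetch_add(1);
            if (mine >= jobs.size()) return;  // all jobs have been claimed
            jobs[mine]();
        }
    };

    std::vector<std::thread> workers;
    for (unsigned i = 0; i < thread_count; ++i) workers.emplace_back(worker);
    for (auto& t : workers) t.join();
}
```

Each thread overshoots the end of the list once before it exits, which is harmless here because the index is only compared against the list size.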