So, in some of my recent work there have been occasions where certain segments of code could easily be made multithreaded, parallelized, or both. I have a few questions on both topics to make sure I can actually do it, and do it efficiently.
First: can multiple threads access the same buffer, as long as they don't touch the same location in it (Thread A is accessing a long at $45 and Thread B is accessing one at $C7)? I ask because I could set one thread to step through the odd elements and another through the even elements of a buffer of data that needs to be transformed.
Second: how does MMX/SSE interact with threads?
If I'm using the registers and another thread comes in and also uses MMX/SSE, are my registers saved somewhere (on the stack?)
And my last question: what kind of "lock-less" methods or lock-reducing practices are there to improve parallelized and multithreaded code?
Multithreaded and Parallelizing
~Dreglor
Hi Dreglor,
Raytracing again?
1. Buffer access is no problem. But it will be more efficient if every thread has its own buffer (splitting the original buffer into parts).
2. That's no problem. Every thread has its own set of registers (the OS saves and restores them on a context switch). Just be careful with globals.
3. My method: I split everything that has to be done per frame into small jobs and create a list of them. Each thread reads the index of the next job in that list and increments it. I need a lock for reading and incrementing that index, but that hardly affects the overall speed as long as your jobs aren't too short.
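To make point 3 concrete, here is a minimal C++ sketch of a job list with a shared "next job" index. It is not Hades's actual code: it swaps the lock around the index for a single atomic `fetch_add`, which is one common way to cut that lock out entirely. The name `run_jobs` and its parameters are hypothetical and chosen only for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical helper (not from the thread): runs every job in `jobs`
// exactly once, spread over `thread_count` worker threads.
void run_jobs(const std::vector<std::function<void()>>& jobs,
              unsigned thread_count) {
    assert(thread_count > 0);
    std::atomic<std::size_t> next_job{0};  // shared index into the job list

    auto worker = [&]() {
        for (;;) {
            // Read the index and advance it in one atomic step. This stands
            // in for "lock, read index, increase index, unlock".
            const std::size_t i =
                next_job.fetch_add(1, std::memory_order_relaxed);
            if (i >= jobs.size()) break;  // job list exhausted
            jobs[i]();
        }
    };

    std::vector<std::thread> workers;
    for (unsigned t = 0; t < thread_count; ++t) workers.emplace_back(worker);
    for (auto& w : workers) w.join();  // results are visible after join
}
```

Calling `run_jobs(jobs, std::thread::hardware_concurrency())` would be one way to size the worker count to the machine, keeping in mind that `hardware_concurrency()` may return 0 when the value cannot be determined.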
Dreglor wrote:First, can multiple threads access the same buffer, just not the same location on the buffer (Thread A is accessing a long at $45 and Thread B is accessing $C7)? I ask this because I can set one thread to step oddly and another thread evenly inside a buffer of data that needs to be transformed.
They can even access the same position if you want that. But it's pretty useless, because you will have two threads doing the same work.
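As a rough illustration of the "split the buffer into parts" advice, here is a C++ sketch that gives each thread one contiguous slice rather than odd/even elements. Interleaved stepping is legal, but neighbouring elements share cache lines, so the threads tend to slow each other down (false sharing). `transform` and `transform_buffer` are made-up names, and the arithmetic in `transform` is only a stand-in for the real work.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical per-element transform; a placeholder for the real work.
inline std::int32_t transform(std::int32_t v) { return v * 2 + 1; }

// Splits `buffer` into up to `thread_count` contiguous slices and transforms
// each slice on its own thread. No element is touched by two threads, so no
// lock is needed.
void transform_buffer(std::vector<std::int32_t>& buffer,
                      unsigned thread_count) {
    assert(thread_count > 0);
    const std::size_t n = buffer.size();
    const std::size_t chunk = (n + thread_count - 1) / thread_count;  // ceil

    std::vector<std::thread> workers;
    for (unsigned t = 0; t < thread_count; ++t) {
        const std::size_t begin = std::min(n, t * chunk);
        const std::size_t end = std::min(n, begin + chunk);
        if (begin == end) break;  // nothing left to hand out
        workers.emplace_back([&buffer, begin, end] {
            for (std::size_t i = begin; i < end; ++i) {
                buffer[i] = transform(buffer[i]);
            }
        });
    }
    for (auto& w : workers) w.join();
}
```

Equal-sized slices only balance well when every element costs about the same; if the cost varies a lot, handing out smaller pieces on demand is usually the better fit.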
Hades wrote:Hi Dreglor,
Raytracing again?
Heh, you never know what I have up my sleeve.
The method you explained sounds a lot like a thread pool. I have a class that handles such a system, but it uses a lock per "job". I haven't used it much, but it scales to the system's core count (which makes it nice). I'm unsure how well it handles heavy work, be it lots of little jobs or a few large jobs.
~Dreglor
Yes, scaling to any number of cores you throw at it was my main concern. With quad cores getting cheaper and cheaper, hexa- and octo-cores just ahead, and multi-socket systems entering the desktop market, it's the obvious priority if you need maximum performance.
Also, I don't know up front how long a job will take, so splitting the work symmetrically isn't an option. But I don't think it would be worth the effort anyway.
With only a few large jobs it might be overkill, and creating jobs of a few lines of code without loops probably isn't a good idea anyway.
For anything else, I'm pretty sure you won't find a solution that makes a measurable difference. (If you find one, please tell me.)
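On the point about jobs that are too short: one possible way to keep the shared index cheap is to let each thread claim a whole batch of items per access, so the bookkeeping cost is spread over several items. The C++ sketch below assumes that approach and is not taken from anyone's code in this thread; `parallel_for_batched` and the `batch` parameter are hypothetical names.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical helper: calls body(i) once for every i in [0, count).
// Threads claim `batch` consecutive indices at a time, so very short work
// items do not each pay for a trip to the shared counter.
void parallel_for_batched(std::size_t count, std::size_t batch,
                          unsigned thread_count,
                          const std::function<void(std::size_t)>& body) {
    assert(batch > 0 && thread_count > 0);
    std::atomic<std::size_t> next{0};  // first index not yet claimed

    auto worker = [&]() {
        for (;;) {
            // Claim the range [begin, begin + batch) in one atomic step.
            const std::size_t begin =
                next.fetch_add(batch, std::memory_order_relaxed);
            if (begin >= count) break;  // all work has been handed out
            const std::size_t end = std::min(count, begin + batch);
            for (std::size_t i = begin; i < end; ++i) body(i);
        }
    };

    std::vector<std::thread> workers;
    for (unsigned t = 0; t < thread_count; ++t) workers.emplace_back(worker);
    for (auto& w : workers) w.join();
}
```

A larger `batch` means less contention on the counter but coarser load balancing, so the right value depends on how long a single item takes and is best found by measuring.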