
Hyperthreading

Posted: Tue Sep 02, 2003 5:49 am
by Tiuri
Hi all,

I decided to submit this new post to discuss some of the subjects raised in regard to hyperthreading in the responses to my earlier question about CPU utilization in the Beginners section, since the topic is not very close to my original question.

In regard to Ryan's (RJP Computing) question whether hyperthreading really makes a difference: After some experimentation I am convinced the answer is YES! I was able to achieve performance increases up to 45% by splitting a PB application into two threads running simultaneously. More on this below for those interested.

(In response to Aszid) WinXP Home does not support multiple physical processors, but it does support two logical processors (from the Microsoft website and my own experience). WinXP Pro supports 2 physical and 4 logical processors.
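As a quick way to see how many logical processors the operating system exposes, a check like the following can help. It is only a sketch, written in Python rather than PB, and the helper name `logical_processors` is invented for the example. `os.cpu_count()` reports logical CPUs, so a single HT processor would be expected to show up as 2.

```python
import os


def logical_processors():
    # os.cpu_count() counts logical CPUs; it can return None if the
    # count cannot be determined, so fall back to 1 in that case.
    return os.cpu_count() or 1


print("logical processors:", logical_processors())
```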

(In response to fsw) It is not so easy to disable HT. You can easily turn it off in the BIOS, but unfortunately WinXP detects the HT processor during installation and installs itself so as to use multiple processors, so you would have to reinstall WinXP after turning HT off in the BIOS (from other discussion groups).

As to my own experiences:
I experimented with a CPU intensive simulation application that models a simple ecology. You can find the PB source code at
http://home.netspeed.com.au/dekool/RabbitWorld.pb
I split the code into two threads, one taking care of the actual calculations doing the evolution, and one for displaying the results on the screen.
If you adjust the frequency of displaying the results so that the two tasks take about an equal amount of time, and the threads are therefore running simultaneously most of the time (about 30 timesteps per display), the code becomes 45% faster in threaded mode compared to non-threaded mode.
You can experiment with it yourself if you want: there is a switch on the interface to enable/disable running in threaded mode. It is instructive to open Task Manager while the program is running to monitor the use of the two logical processors. In non-threaded mode it shows 50% usage distributed over the two logical CPUs; in threaded mode, with optimum parameters, both logical processors show up to 100%.
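For readers who do not want to dig through the full source, the split described above can be sketched roughly as follows. This is not the RabbitWorld code. It is a minimal illustration in Python, and the names `step`, `render` and `run`, as well as the toy update rule, are all made up for the example. One thread advances the simulation and hands a snapshot to a display thread every `steps_per_display` timesteps, and the `threaded` flag plays the role of the switch on the interface. Note that CPython's interpreter lock keeps two pure-Python threads from executing at the same time, so this shows the structure only and should not be expected to reproduce the timings reported in this thread.

```python
import queue
import threading


def step(state):
    # Hypothetical stand-in for one timestep of the simulation.
    # Returns a new list, so earlier snapshots are never modified.
    return [(x * 31 + 7) % 1009 for x in state]


def render(state):
    # Hypothetical stand-in for drawing: reduce the state to one number.
    return sum(state)


def run(timesteps, steps_per_display, threaded):
    """Advance the toy model and return the list of displayed frames."""
    state = list(range(100))
    frames = []

    if not threaded:
        # Non-threaded mode: calculate and display in the same loop.
        for t in range(1, timesteps + 1):
            state = step(state)
            if t % steps_per_display == 0:
                frames.append(render(state))
        return frames

    # Threaded mode: a display thread consumes snapshots from a queue
    # while the main thread keeps calculating.
    snapshots = queue.Queue()

    def display_worker():
        while True:
            snap = snapshots.get()
            if snap is None:  # sentinel: the simulation has finished
                break
            frames.append(render(snap))

    worker = threading.Thread(target=display_worker)
    worker.start()
    for t in range(1, timesteps + 1):
        state = step(state)
        if t % steps_per_display == 0:
            snapshots.put(state)
    snapshots.put(None)
    worker.join()
    return frames
```

Calling `run(300, 30, True)` and `run(300, 30, False)` gives the same frames in both modes, which mirrors the observation that the threaded and non-threaded results are identical.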

:!: One problem though: the threaded version of the program seems to confuse the PB Debugger completely. It halts at not entirely predictable places, complaining about arrays being out of bounds, NULL values in all kinds of circumstances, etc. This baffled me for some time until I tried running without the Debugger, and everything worked fine, with the results of the threaded and non-threaded versions being identical. I don't know if this counts as a bug; I can imagine it would be very difficult to do something about it...

Posted: Tue Sep 02, 2003 7:09 pm
by Num3
Ok... You're right about splitting the cpu utilisation...

But what happens if you run 2 instances of the program???

What happens if you run the program and use other apps???

Is there really a performance increase or is everything divided by 2 ???
(yup... Intel and Microsoft could just easily cook up a simple: 2 processors = divide by 2)

Get yourself Lightwave, create a 2-minute DVD anim with all effects on, and tell me how long it takes to render.... That's a real test... :twisted:

Posted: Tue Sep 02, 2003 7:18 pm
by Karbon
To do this test correctly we'd need two identical machines - one with hyperthreading and one without to see what benefit *just* hyperthreading gives... Regardless, I think it's safe to assume hyperthreading does *something* even if it's just a little something :-)

Posted: Tue Sep 02, 2003 11:32 pm
by Tiuri
Num3: Maybe I did not explain my test correctly: my claim of a 45% performance increase was not based on looking at CPU utilization, but simply on measuring the time it takes to run the program. The results were (at 30 timesteps per display):
threaded mode: 27.16 seconds
normal mode: 39.22 seconds
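To make the arithmetic behind the 45% figure explicit, here is a small sketch (in Python, with function names invented for the example). The two timings give a speedup ratio of about 1.44, meaning the threaded run gets through roughly 44% more work per second, while the run time itself drops by about 31%.

```python
def speedup(baseline_s, improved_s):
    # Ratio of run times: a value above 1 means the improved run is faster.
    return baseline_s / improved_s


def time_saved_fraction(baseline_s, improved_s):
    # Fraction of the baseline run time that was saved.
    return (baseline_s - improved_s) / baseline_s


normal, threaded = 39.22, 27.16  # seconds, from the measurements above
print(f"speedup: {speedup(normal, threaded):.2f}x")                 # speedup: 1.44x
print(f"time saved: {time_saved_fraction(normal, threaded):.1%}")   # time saved: 30.7%
```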

I think that is a good test, comparable to the rendering one you propose. I agree that you will not get this result with most applications, but it seems that at least for some problems you can get a significant improvement.

It does appear that you can only get such results if the application is optimized to make use of the feature, so in the test you propose you may only see an effect if you use a very recent version that is set up to make use of HT.

And I agree with you (from earlier experiments) that if you run two instances of the program in non-threaded mode you don't get any improvement.

As Karbon said, a "real" test would require two identical machines, one with and one without HT, and a test application in which you can turn HT optimization on and off. As it is now, it is still possible that my program would run faster in non-threaded mode on a non-HT machine.