Optmizations questions

Just starting out? Need help? Post your questions and find answers here.
Iron_Meiden
User
User
Posts: 37
Joined: Sun Sep 09, 2007 1:16 pm
Location: Asdenia

Post by Iron_Meiden »

Yeah, sure! And pigs can fly..
Yes, they can Image:

Image
User avatar
pdwyer
Addict
Addict
Posts: 2813
Joined: Tue May 08, 2007 1:27 pm
Location: Chiba, Japan

Post by pdwyer »

Cool! 8) That I can understand.

Does that include variations on quicksort variations or just the ways that PB can code them for speed? I suspect the later
Paul Dwyer

“In nature, it’s not the strongest nor the most intelligent who survives. It’s the most adaptable to change” - Charles Darwin
“If you can't explain it to a six-year old you really don't understand it yourself.” - Albert Einstein
User avatar
djes
Addict
Addict
Posts: 1806
Joined: Sat Feb 19, 2005 2:46 pm
Location: Pas-de-Calais, France

Post by djes »

Personally, I'm also interested by the compiler comparisons, based upon the same code (or almost).

Why?

To be able to see purebasic compiler's evolution. Let say we have now a point 0 : if we always use the same code, in C and in Purebasic, whatever version we have, we will be able to compare the two languages by themselves. Not the processor. Not the algorithm. Only the languages.

And I hope sometimes we can say : PB is as fast as C :D

Faster PB implementation is not usefull for comparison, as we will not do the fastest in C. You see my point?
User avatar
tinman
PureBasic Expert
PureBasic Expert
Posts: 1102
Joined: Sat Apr 26, 2003 4:56 pm
Location: Level 5 of Robot Hell
Contact:

Post by tinman »

pdwyer wrote:Does that include variations on quicksort variations or just the ways that PB can code them for speed? I suspect the later
Probably the latter for comparison purposes, but if there is a faster variation of the same algorithm then that would be good in general. It could always be ported back to C if needed.
If you paint your butt blue and glue the hole shut you just themed your ass but lost the functionality.
(WinXPhSP3 PB5.20b14)
hellhound66
Enthusiast
Enthusiast
Posts: 119
Joined: Tue Feb 21, 2006 12:37 pm

Post by hellhound66 »

Removed.
Last edited by hellhound66 on Wed Mar 19, 2008 11:38 pm, edited 1 time in total.
User avatar
pdwyer
Addict
Addict
Posts: 2813
Joined: Tue May 08, 2007 1:27 pm
Location: Chiba, Japan

Post by pdwyer »

djes wrote:Personally, I'm also interested by the compiler comparisons, based upon the same code (or almost).

Why?

To be able to see purebasic compiler's evolution. Let say we have now a point 0 : if we always use the same code, in C and in Purebasic, whatever version we have, we will be able to compare the two languages by themselves. Not the processor. Not the algorithm. Only the languages.

And I hope sometimes we can say : PB is as fast as C :D

Faster PB implementation is not usefull for comparison, as we will not do the fastest in C. You see my point?
Yes I do,

I think though that the assumption that both compilers would produce the same performance if both use inline ASM for the algorithm and that it would just be a processor speed test is premature. Running it on the same processor should then produce the same time. While that may well turn out to be the case I would be unsurprised to find out that it wasn't

If it is the case then from a more scientific test perspective it means that you can implement all the parts you don't want to test in ASM and know that they are equal and just put the parts you do want to test in normal code so you can compare just that part.

Until you can show though that you can filter out the noise and other factors you can't really compare them individually.

(I can't program in ASM though :oops: )
Paul Dwyer

“In nature, it’s not the strongest nor the most intelligent who survives. It’s the most adaptable to change” - Charles Darwin
“If you can't explain it to a six-year old you really don't understand it yourself.” - Albert Einstein
User avatar
djes
Addict
Addict
Posts: 1806
Joined: Sat Feb 19, 2005 2:46 pm
Location: Pas-de-Calais, France

Post by djes »

hellhound66 wrote:
And I hope sometimes we can say : PB is as fast as C
Then PB developer(s) should start with asm/pseudocode optimization. The asm output of PB works, it is stable, but it's far away from "fast".
It is not slow too! I'd like to see the comparison with others languages (especially C#) :D
User avatar
djes
Addict
Addict
Posts: 1806
Joined: Sat Feb 19, 2005 2:46 pm
Location: Pas-de-Calais, France

Post by djes »

pdwyer wrote:
djes wrote:Personally, I'm also interested by the compiler comparisons, based upon the same code (or almost).

Why?

To be able to see purebasic compiler's evolution. Let say we have now a point 0 : if we always use the same code, in C and in Purebasic, whatever version we have, we will be able to compare the two languages by themselves. Not the processor. Not the algorithm. Only the languages.

And I hope sometimes we can say : PB is as fast as C :D

Faster PB implementation is not usefull for comparison, as we will not do the fastest in C. You see my point?
Yes I do,

I think though that the assumption that both compilers would produce the same performance if both use inline ASM for the algorithm and that it would just be a processor speed test is premature. Running it on the same processor should then produce the same time. While that may well turn out to be the case I would be unsurprised to find out that it wasn't

If it is the case then from a more scientific test perspective it means that you can implement all the parts you don't want to test in ASM and know that they are equal and just put the parts you do want to test in normal code so you can compare just that part.

Until you can show though that you can filter out the noise and other factors you can't really compare them individually.

(I can't program in ASM though :oops: )
What difference could it be? Except few modifications because PB is using FASM, and stack allocations and code alignement may be different, there's no reason to have exactly the same result between inline ASM in C and inline ASM in pb.
User avatar
pdwyer
Addict
Addict
Posts: 2813
Joined: Tue May 08, 2007 1:27 pm
Location: Chiba, Japan

Post by pdwyer »

Dunno,

It's just a baseline test to make sure that you are on a level playing field before you get down to such a fine level that slight interference can upset the results.
Paul Dwyer

“In nature, it’s not the strongest nor the most intelligent who survives. It’s the most adaptable to change” - Charles Darwin
“If you can't explain it to a six-year old you really don't understand it yourself.” - Albert Einstein
User avatar
tinman
PureBasic Expert
PureBasic Expert
Posts: 1102
Joined: Sat Apr 26, 2003 4:56 pm
Location: Level 5 of Robot Hell
Contact:

Post by tinman »

djes wrote:(especially C#) :D
I'm using a different computer from the ones I've reported the times on previously so I won't check until I get home. Suffice to say, it's not slow.

Code: Select all

using System;
using System.Collections.Generic;
using System.Text;
using System.Threading;
using System.Runtime.InteropServices;

namespace QuickSort
{
    class Program
    {
        [DllImport("kernel32.dll")]
        extern static int QueryPerformanceCounter(out Int64 x);
        [DllImport("kernel32.dll")]
        extern static int QueryPerformanceFrequency(out Int64 x);

        const int gSz = 500000;

        struct vector
        {
            public int len;
            public int[] elem;
        }

        static void Main(string[] args)
        {
            int i; 
            int lDiv; 
            vector lV = new vector();
            Int64 liStart, liEnd, liFreq;
            System.Random rand = new Random();

            lV.len = gSz;
            lV.elem = new int[gSz];
            for ( i = 0; i < lV.len; ++i )
            {
                lDiv = rand.Next(32768);
                lV.elem[i] = (int)(rand.Next(32768) / (lDiv == 0 ? 1 : lDiv)); 
            }
   
            Console.Write("Vector populated; Hit any key to quicksort...\n"); 
            Console.ReadKey();

            System.Diagnostics.Process.GetCurrentProcess().PriorityClass = System.Diagnostics.ProcessPriorityClass.RealTime;
            QueryPerformanceCounter(out liStart);
            quicksort(ref lV);
            QueryPerformanceCounter(out liEnd);
            System.Diagnostics.Process.GetCurrentProcess().PriorityClass = System.Diagnostics.ProcessPriorityClass.Normal;

            QueryPerformanceFrequency(out liFreq);

            liEnd = (liEnd - liStart) * 1000 / liFreq;
            Console.Write("Elapsed milliseconds: " + liEnd.ToString() + "\n");
            Console.Write("Press any key to exit\n");
            Console.ReadKey();
        }

        static void quicksort(ref vector v)
        {
            //printf("Quicksort...\n");
            quicksort_real(ref v, 0, v.len - 1);
        }

        static void quicksort_real(ref vector v, int p, int r)
        {
            if (p < r)
            {
                int q = partition(ref v, p, r);
                quicksort_real(ref v, p, q);
                quicksort_real(ref v, q + 1, r);
            }
        }

        static int partition(ref vector v, int p, int r)
        {
            int x;
            int i;
            int j;

            if (p + 3 < r)
            {
                x = Math.Min(v.elem[p], Math.Max(v.elem[p + 1], v.elem[p + 2]));
            }
            else
            {
                x = v.elem[p];
            }

            i = p - 1;
            j = r + 1;
            while (true)
            {
                do
                {
                    --j;
                } while (v.elem[j] > x);
                do
                {
                    ++i;
                } while (v.elem[i] < x);

                if (i < j)
                {
                    int aux = v.elem[j];
                    v.elem[j] = v.elem[i];
                    v.elem[i] = aux;
                }
                else
                {
                    return j;
                }
            }
        }
    }
}
If you paint your butt blue and glue the hole shut you just themed your ass but lost the functionality.
(WinXPhSP3 PB5.20b14)
User avatar
djes
Addict
Addict
Posts: 1806
Joined: Sat Feb 19, 2005 2:46 pm
Location: Pas-de-Calais, France

Post by djes »

Cool! I can't wait! (can't test here by myself)
User avatar
tinman
PureBasic Expert
PureBasic Expert
Posts: 1102
Joined: Sat Apr 26, 2003 4:56 pm
Location: Level 5 of Robot Hell
Contact:

Post by tinman »

djes wrote:Cool! I can't wait! (can't test here by myself)
It takes around 135 ms on my PC, which is slower than the PB implementation.
If you paint your butt blue and glue the hole shut you just themed your ass but lost the functionality.
(WinXPhSP3 PB5.20b14)
hellhound66
Enthusiast
Enthusiast
Posts: 119
Joined: Tue Feb 21, 2006 12:37 pm

Post by hellhound66 »

Removed.
Last edited by hellhound66 on Wed Mar 19, 2008 11:38 pm, edited 1 time in total.
User avatar
djes
Addict
Addict
Posts: 1806
Joined: Sat Feb 19, 2005 2:46 pm
Location: Pas-de-Calais, France

Post by djes »

tinman wrote:
djes wrote:Cool! I can't wait! (can't test here by myself)
It takes around 135 ms on my PC, which is slower than the PB implementation.
:D :D :D (not really a surprise anyway, but faster than I thought...)
User avatar
tinman
PureBasic Expert
PureBasic Expert
Posts: 1102
Joined: Sat Apr 26, 2003 4:56 pm
Location: Level 5 of Robot Hell
Contact:

Post by tinman »

Some more observations:
  • * Protected is faster than global, shared or static (which will all use similar addressing anyway).
    * PB's stack frame creation is faster than simply allocating some space on the stack with uninitialised variables (could be different with larger local variables).
    * PB produces quite structured code, nothing really gets bound to a register. See here for the original PB code and here for my modified code which uses registers for one of the Repeat...Until loops. I get a 10% speedup from this (it's obviously one of the most used code sections). It would be great if PB could make more use of registers, although that's easier said than done :)
hellhound66 wrote:Perhaps you should sort different lists of huge size to achieve times, that are more representative.
I should. But I'll leave that fun for someone else as I don't have any more time to spend on this.
djes wrote:not really a surprise anyway, but faster than I thought...
Slower than I thought, I get better times on my slower computer at work. But I was using MSVC#2005 Express at home and professional at work. They don't have the same options, so it's possible that they produce different output.
If you paint your butt blue and glue the hole shut you just themed your ass but lost the functionality.
(WinXPhSP3 PB5.20b14)
Post Reply