Page 3 of 4

Posted: Wed Feb 13, 2008 12:33 pm
by Iron_Meiden
Yeah, sure! And pigs can fly..
Yes, they can Image:

Image

Posted: Wed Feb 13, 2008 12:48 pm
by pdwyer
Cool! 8) That I can understand.

Does that include variations on quicksort variations or just the ways that PB can code them for speed? I suspect the later

Posted: Wed Feb 13, 2008 1:20 pm
by djes
Personally, I'm also interested by the compiler comparisons, based upon the same code (or almost).

Why?

To be able to see purebasic compiler's evolution. Let say we have now a point 0 : if we always use the same code, in C and in Purebasic, whatever version we have, we will be able to compare the two languages by themselves. Not the processor. Not the algorithm. Only the languages.

And I hope sometimes we can say : PB is as fast as C :D

Faster PB implementation is not usefull for comparison, as we will not do the fastest in C. You see my point?

Posted: Wed Feb 13, 2008 1:37 pm
by tinman
pdwyer wrote:Does that include variations on quicksort variations or just the ways that PB can code them for speed? I suspect the later
Probably the latter for comparison purposes, but if there is a faster variation of the same algorithm then that would be good in general. It could always be ported back to C if needed.

Posted: Wed Feb 13, 2008 1:39 pm
by hellhound66
Removed.

Posted: Wed Feb 13, 2008 1:48 pm
by pdwyer
djes wrote:Personally, I'm also interested by the compiler comparisons, based upon the same code (or almost).

Why?

To be able to see purebasic compiler's evolution. Let say we have now a point 0 : if we always use the same code, in C and in Purebasic, whatever version we have, we will be able to compare the two languages by themselves. Not the processor. Not the algorithm. Only the languages.

And I hope sometimes we can say : PB is as fast as C :D

Faster PB implementation is not usefull for comparison, as we will not do the fastest in C. You see my point?
Yes I do,

I think though that the assumption that both compilers would produce the same performance if both use inline ASM for the algorithm and that it would just be a processor speed test is premature. Running it on the same processor should then produce the same time. While that may well turn out to be the case I would be unsurprised to find out that it wasn't

If it is the case then from a more scientific test perspective it means that you can implement all the parts you don't want to test in ASM and know that they are equal and just put the parts you do want to test in normal code so you can compare just that part.

Until you can show though that you can filter out the noise and other factors you can't really compare them individually.

(I can't program in ASM though :oops: )

Posted: Wed Feb 13, 2008 2:26 pm
by djes
hellhound66 wrote:
And I hope sometimes we can say : PB is as fast as C
Then PB developer(s) should start with asm/pseudocode optimization. The asm output of PB works, it is stable, but it's far away from "fast".
It is not slow too! I'd like to see the comparison with others languages (especially C#) :D

Posted: Wed Feb 13, 2008 2:36 pm
by djes
pdwyer wrote:
djes wrote:Personally, I'm also interested by the compiler comparisons, based upon the same code (or almost).

Why?

To be able to see purebasic compiler's evolution. Let say we have now a point 0 : if we always use the same code, in C and in Purebasic, whatever version we have, we will be able to compare the two languages by themselves. Not the processor. Not the algorithm. Only the languages.

And I hope sometimes we can say : PB is as fast as C :D

Faster PB implementation is not usefull for comparison, as we will not do the fastest in C. You see my point?
Yes I do,

I think though that the assumption that both compilers would produce the same performance if both use inline ASM for the algorithm and that it would just be a processor speed test is premature. Running it on the same processor should then produce the same time. While that may well turn out to be the case I would be unsurprised to find out that it wasn't

If it is the case then from a more scientific test perspective it means that you can implement all the parts you don't want to test in ASM and know that they are equal and just put the parts you do want to test in normal code so you can compare just that part.

Until you can show though that you can filter out the noise and other factors you can't really compare them individually.

(I can't program in ASM though :oops: )
What difference could it be? Except few modifications because PB is using FASM, and stack allocations and code alignement may be different, there's no reason to have exactly the same result between inline ASM in C and inline ASM in pb.

Posted: Wed Feb 13, 2008 2:55 pm
by pdwyer
Dunno,

It's just a baseline test to make sure that you are on a level playing field before you get down to such a fine level that slight interference can upset the results.

Posted: Wed Feb 13, 2008 3:11 pm
by tinman
djes wrote:(especially C#) :D
I'm using a different computer from the ones I've reported the times on previously so I won't check until I get home. Suffice to say, it's not slow.

Code: Select all

using System;
using System.Collections.Generic;
using System.Text;
using System.Threading;
using System.Runtime.InteropServices;

namespace QuickSort
{
    class Program
    {
        [DllImport("kernel32.dll")]
        extern static int QueryPerformanceCounter(out Int64 x);
        [DllImport("kernel32.dll")]
        extern static int QueryPerformanceFrequency(out Int64 x);

        const int gSz = 500000;

        struct vector
        {
            public int len;
            public int[] elem;
        }

        static void Main(string[] args)
        {
            int i; 
            int lDiv; 
            vector lV = new vector();
            Int64 liStart, liEnd, liFreq;
            System.Random rand = new Random();

            lV.len = gSz;
            lV.elem = new int[gSz];
            for ( i = 0; i < lV.len; ++i )
            {
                lDiv = rand.Next(32768);
                lV.elem[i] = (int)(rand.Next(32768) / (lDiv == 0 ? 1 : lDiv)); 
            }
   
            Console.Write("Vector populated; Hit any key to quicksort...\n"); 
            Console.ReadKey();

            System.Diagnostics.Process.GetCurrentProcess().PriorityClass = System.Diagnostics.ProcessPriorityClass.RealTime;
            QueryPerformanceCounter(out liStart);
            quicksort(ref lV);
            QueryPerformanceCounter(out liEnd);
            System.Diagnostics.Process.GetCurrentProcess().PriorityClass = System.Diagnostics.ProcessPriorityClass.Normal;

            QueryPerformanceFrequency(out liFreq);

            liEnd = (liEnd - liStart) * 1000 / liFreq;
            Console.Write("Elapsed milliseconds: " + liEnd.ToString() + "\n");
            Console.Write("Press any key to exit\n");
            Console.ReadKey();
        }

        static void quicksort(ref vector v)
        {
            //printf("Quicksort...\n");
            quicksort_real(ref v, 0, v.len - 1);
        }

        static void quicksort_real(ref vector v, int p, int r)
        {
            if (p < r)
            {
                int q = partition(ref v, p, r);
                quicksort_real(ref v, p, q);
                quicksort_real(ref v, q + 1, r);
            }
        }

        static int partition(ref vector v, int p, int r)
        {
            int x;
            int i;
            int j;

            if (p + 3 < r)
            {
                x = Math.Min(v.elem[p], Math.Max(v.elem[p + 1], v.elem[p + 2]));
            }
            else
            {
                x = v.elem[p];
            }

            i = p - 1;
            j = r + 1;
            while (true)
            {
                do
                {
                    --j;
                } while (v.elem[j] > x);
                do
                {
                    ++i;
                } while (v.elem[i] < x);

                if (i < j)
                {
                    int aux = v.elem[j];
                    v.elem[j] = v.elem[i];
                    v.elem[i] = aux;
                }
                else
                {
                    return j;
                }
            }
        }
    }
}

Posted: Wed Feb 13, 2008 3:58 pm
by djes
Cool! I can't wait! (can't test here by myself)

Posted: Sat Feb 16, 2008 4:59 pm
by tinman
djes wrote:Cool! I can't wait! (can't test here by myself)
It takes around 135 ms on my PC, which is slower than the PB implementation.

Posted: Sat Feb 16, 2008 6:33 pm
by hellhound66
Removed.

Posted: Sat Feb 16, 2008 7:30 pm
by djes
tinman wrote:
djes wrote:Cool! I can't wait! (can't test here by myself)
It takes around 135 ms on my PC, which is slower than the PB implementation.
:D :D :D (not really a surprise anyway, but faster than I thought...)

Posted: Sun Feb 17, 2008 6:13 pm
by tinman
Some more observations:
  • * Protected is faster than global, shared or static (which will all use similar addressing anyway).
    * PB's stack frame creation is faster than simply allocating some space on the stack with uninitialised variables (could be different with larger local variables).
    * PB produces quite structured code, nothing really gets bound to a register. See here for the original PB code and here for my modified code which uses registers for one of the Repeat...Until loops. I get a 10% speedup from this (it's obviously one of the most used code sections). It would be great if PB could make more use of registers, although that's easier said than done :)
hellhound66 wrote:Perhaps you should sort different lists of huge size to achieve times, that are more representative.
I should. But I'll leave that fun for someone else as I don't have any more time to spend on this.
djes wrote:not really a surprise anyway, but faster than I thought...
Slower than I thought, I get better times on my slower computer at work. But I was using MSVC#2005 Express at home and professional at work. They don't have the same options, so it's possible that they produce different output.