Optmizations questions

Just starting out? Need help? Post your questions and find answers here.
kinglestat
Enthusiast
Enthusiast
Posts: 746
Joined: Fri Jul 14, 2006 8:53 pm
Location: Malta
Contact:

Optmizations questions

Post by kinglestat »

This week I had a "contest" with a friend of mine about which programming language to use to sort 500,000 integers
Unfortunately, his program won....in borland C++, and not even using an internal sort function.
So I've been thinking how can I make my programs faster?

a) is it possible to use registers for loop variables ?
b) can more registers be used for parameter passing in procedures ?
c) void return for parameters not returning a value
d) memory allocation for strings handled by user?

Any suggestions to improve speed

I already use macros exhaustively and pay attention to optimize my code, but I am quite new to PB
I may not help with your coding
Just ask about mental issues!

http://www.lulu.com/spotlight/kingwolf
http://www.sen3.net
User avatar
tinman
PureBasic Expert
PureBasic Expert
Posts: 1102
Joined: Sat Apr 26, 2003 4:56 pm
Location: Level 5 of Robot Hell
Contact:

Re: Optmizations questions

Post by tinman »

kinglestat wrote:This week I had a "contest" with a friend of mine about which programming language to use to sort 500,000 integers
Unfortunately, his program won....in borland C++, and not even using an internal sort function.
What version of Borland C++ does your friend use? Modern C++ compilers are usually very good at optimising the code, to a level that PB's compiler does not do.

For example, loop unrolling, inline function calls are two optimisations that would make a big difference in certain situations. PB does not automatically put frequently used variables in registers, preferring to read them from memory each time they are used.
kinglestat wrote:a) is it possible to use registers for loop variables ?
You cannot use registers in PB. You could if you were willing to code in ASM, but that would no longer be PB code :)
kinglestat wrote:b) can more registers be used for parameter passing in procedures ?
PB passes parameters on the stack, so you cannot use registers.
kinglestat wrote:c) void return for parameters not returning a value
PB does not support the notion of a void return for procedures. If you do not return a value then you do not return a value. I do not know how much difference that would make anyway.
kinglestat wrote:d) memory allocation for strings handled by user?
Perhaps. Trond posted some ASM code and pdwyer was going to have a look at it for improving the speed of strings.

What sort algorithm does your friend use? It could be that the algorithms used have a large difference (I do not know what PB uses for SortArray).
If you paint your butt blue and glue the hole shut you just themed your ass but lost the functionality.
(WinXPhSP3 PB5.20b14)
rsts
Addict
Addict
Posts: 2736
Joined: Wed Aug 24, 2005 8:39 am
Location: Southwest OH - USA

Post by rsts »

Unless you both used the same basic 'code' - algorithm, it wasn't as much about language as it was programmer.

cheers

edit - not 'good' or 'bad' programming just 'choices' and 'methods'
Seymour Clufley
Addict
Addict
Posts: 1264
Joined: Wed Feb 28, 2007 9:13 am
Location: London

Re: Optmizations questions

Post by Seymour Clufley »

tinman wrote:It could be that the algorithms used have a large difference (I do not know what PB uses for SortArray).
Whatever the algorithm is, it adds 5k to executable size. Or at least it did when I used the function a few months ago. Just a sidenote.
kinglestat
Enthusiast
Enthusiast
Posts: 746
Joined: Fri Jul 14, 2006 8:53 pm
Location: Malta
Contact:

Post by kinglestat »

he uses codebuilder 2007
I think its the latest

It is optimizing well. We tried his code with watcom, and with some adjustments also microsoft. With watcom, PB was faster. with microsoft, sometimes faster others slower. Borland, always faster.

Its true, optimizations are mostly developer specific. But there are times, with very large iterations, where we depend on the language itself. And like tinman correcly points out, if we coded in assembly, it would not be PB anymore - also there is the risk that with upgrades in PB, these hardcoded optimizations might not work.

Further, optimizing strings is the hardest, as to optimize any string related stuff, would need to build all the associated routines.

Lets take a simple example.

for i = 1 to 1000
s1.s = "some text"
s2.s = "some other text"
s3.s = Peeks( mem )
;>
s1 = s2 + s3
s1 + s2
; do something with s1
s1 = "something else"
next

now from my understanding, PB would do the following
starting from comment

allocate mem for s1
copy string s1
append string s2
realloc mem for s1
copy string s2

and later
free mem for s1
s1 =something else


now, had s1's memory been enough at the beginning, lots of operations would be avoides, thus gaining speed

correct me if I am wrong

This kind of work is ideal in C, but its very hard using C code with PB, as to date there is no simple way to link PB libs with C libs (not using DLLs)
I may not help with your coding
Just ask about mental issues!

http://www.lulu.com/spotlight/kingwolf
http://www.sen3.net
kinglestat
Enthusiast
Enthusiast
Posts: 746
Joined: Fri Jul 14, 2006 8:53 pm
Location: Malta
Contact:

Post by kinglestat »

incidentally this is the code

Code: Select all

; English forum: http://www.purebasic.fr/english/viewtopic.php?t=7503&highlight=
; Author: Pupil (updated for PB4.00 by blbltheworm + Andre)
; Date: 16. September 2003
; OS: Windows
; Demo: Yes


; With this small assembly of previous post you can now easily compare the different
; sorting algorithms. The code below contains some algorithms that has already been
; posted, i.e. the basic BubbleSort and oldefoxx's "fastest" bubblesort variation.
; New sorting algoritms are the CombSort and the QuickSort algorithm (quicksort as
; found in the PB examples folder). 

; The CombSort algorithm is a modification of bubblesort that addresses the weakness
; of propagation from bottom to top in the bubblesort algorithm. CombSort was published
; by Box And Lacey in 1991. The algorithm is so much more efficient than the original
; bubblesort that it is in fact comparable to some of the best sorting algorithms
; there is, like QuickSort and HeapSort to name a few. In the CombSort algorithm the
; optimal value of the shrinkFactor vary from one array to another, but doesn't stray
; far away from the value 1.3. 


;- Code: 

; Compare sort algorithms: 
; BubbleSort, modified BubbleSort, CombSort, 
; QuickSort and PB internal SortArray() command 
; 


ImportC "msvcrt.lib" 
  ftime( a ) As "__ftime@4"
EndImport


Structure _timeb
  time.l
  milli.w
  tz.w
  flag.w
EndStructure


#NMAX = 500000
Global Dim dummy.l(0) 
Global Dim array.l(#NMAX) 

Global t1._timeb
Global t2._timeb

DisableDebugger

; QuickSort - From PureBasic Advanced examples 
; 
Declare QuickSortRecursive(g.l, d.l) 

Procedure QuickSort(*ptr, n.l) 
  ; small dummy() wrapper. 
  tmpAddress.l = @dummy() 
  dummy() = *ptr 
  QuickSortRecursive(0, n) 
  dummy() = tmpAddress 
EndProcedure 

Procedure QuickSortRecursive(g.l, d.l) 
  If g < d 
    v = dummy(d) 
    i = g-1 
    j = d 
    Repeat 
      Repeat 
        i=i+1 
      Until dummy(i) >= v 
      ok = 0 
      Repeat 
        If j>0 
          j=j-1 
        Else 
          ok=1 
        EndIf 
        If dummy(j) <= v 
          ok=1 

        EndIf 
      Until ok<>0 
      tmp.l = dummy(i) 
      dummy(i) = dummy(j) 
      dummy(j) = tmp 
    Until j <= i 
    
    t = dummy(j) 
    dummy(j) = dummy(i) 
    dummy(i) = dummy(d) 
    dummy(d) = t 
    
    QuickSortRecursive(g, i-1) 
    QuickSortRecursive(i+1, d) 
  EndIf 
EndProcedure 

; Fill Array with random numbers 
For i = 0 To #NMAX 
  array(i) = Random($FFFF)
   ;array(i) = Random(32767)
Next 

; Need the exact conditions for both tests 
; so copy array. 
Mem = AllocateMemory(#NMAX*4+4)
If Mem = 0
  End 
EndIf 

CopyMemory(@array(0), Mem, #NMAX*4+4) 

; Time the QuickSort algorithm 
ftime(@t1) 
QuickSort(Mem, #NMAX) 
ftime(@t2) 

message.s = "QuickSort : "+Str(t2\milli-t1\milli)+"ms"+Chr(13)+Chr(10) 

; Time the internal SortArray() command 
ftime(@t1)
SortArray(array(), 0) 
ftime(@t2)


message + "SortArray : "+Str(t2\milli-t1\milli)+"ms"+Chr(13)+Chr(10) 
MessageRequester("Result", message, 0) 

FreeMemory(Mem) 

Code: Select all

//---------------------------------------------------------------------------
#include	<stdafx.h>
#include <stdlib.h>
#include <conio.h>
#include <stdio.h>
#include <sys\timeb.h>
#pragma hdrstop

#define gSz 500000

typedef struct {
	int len;
	int elem[gSz];
} vector;

void printv(char *str, vector* v)
{
	int i = 0;
	printf("%s", str);
	for (; i < v->len; ++i)
		printf("%d ", v->elem[i]);
	printf("\n");
}

#define MAX(a, b) ((a > b) ? (a) : (b))
#define MIN(a, b) ((a < b) ? (a) : (b))

int partition(vector *v, int p, int r)
{
	int x;
	int i;
	int j;

	if (p + 3 < r)
		x = MIN(v->elem[p], MAX(v->elem[p+1], v->elem[p+2]));
	else
		x = v->elem[p];

	i = p - 1;
	j = r + 1;
	while (1) {
		do {
			--j;
		} while (v->elem[j] > x);
		do {
			++i;
		} while (v->elem[i] < x);

		if (i < j) {
			int aux = v->elem[j];
			v->elem[j] = v->elem[i];
			v->elem[i] = aux;
		} else
			return j;
	}
}

void quicksort_real(vector *v, int p, int r) {
	if (p < r) {
		int q = partition(v, p, r);
		quicksort_real(v, p, q);
		quicksort_real(v, q + 1, r);
	}
}

void quicksort(vector *v) {
	printf("Quicksort...\n");
	quicksort_real(v, 0, v->len - 1);
}



int main(int argc, char* argv[])
{
	int i;
	int lDiv;
	vector lV;
	unsigned long lS;
	struct timeb t;


	lV.len = gSz;
	for ( i = 0; i < lV.len; ++i ) {
		lDiv = rand();
		lV.elem[i] = (int)(rand()/(lDiv==0?1:lDiv));
	}
	printf("Vector populated; Hit any key to quicksort...");
	getch();
	ftime(&t);
	lS = t.millitm;
	quicksort(&lV);
	ftime(&t);
	lS = t.millitm - lS;
	printf("Elapsed milliseconds: %ld", lS);


	return 0;
}
//---------------------------------------------------------------------------

I changed the ellapsedmilliseconds() to ftime as I suspected ftime isnt so very hi res with timing, infact with ellapsedmilliseconds() gives "slower" results on our computers; which shows that there is some logic there.
I may not help with your coding
Just ask about mental issues!

http://www.lulu.com/spotlight/kingwolf
http://www.sen3.net
superadnim
Enthusiast
Enthusiast
Posts: 480
Joined: Thu Jul 27, 2006 4:06 am

Post by superadnim »

I didn't happen to read both sources but since it's a benchmark and it relates sorting algorithms.... Are you actually giving both methods the SAME data or is the second test performed with the sorted array? because that would explain for one the inconclusive results you're getting.

:lol: should I bash the keyboard and give up?
:?
kinglestat
Enthusiast
Enthusiast
Posts: 746
Joined: Fri Jul 14, 2006 8:53 pm
Location: Malta
Contact:

Post by kinglestat »

you can see in the code
data is random every time
I may not help with your coding
Just ask about mental issues!

http://www.lulu.com/spotlight/kingwolf
http://www.sen3.net
User avatar
pdwyer
Addict
Addict
Posts: 2813
Joined: Tue May 08, 2007 1:27 pm
Location: Chiba, Japan

Post by pdwyer »

what were the times? (differences I mean)

My qsort proc is the same speed as yours (edit, actually mine is a little slower :P). I got rid of all the time calls you made to msvcrt.lib though because I was getting answers with negative times. I wonder how powerbasic would fare, it has register variable options.

My version, which isn't as fast as PB's internal one BTW looks like below. I wonder if it was done with pointers instead of array calls if it would be faster...?

Maybe you'll just have to accept that his is bigger than yours :twisted:

Code: Select all

Procedure QuickSortLong(List.l(1), First.l, last.l)

    MedVal.l
    hi.l
    lo.l

    If First >= Last 
        ProcedureReturn
    EndIf
    
    MedVal = list(first)
    lo = first
    hi = last

    While 1
        While list(hi) >= medval
            hi = hi - 1
            If hi <= lo
                Break
            EndIf
        Wend

        If hi <= lo 
            list(lo) = medval
            Break
        EndIf

        list(lo) = list(hi) ;swap Vals

        lo = lo + 1

        While list(lo) < medval
            lo = lo + 1
            If lo >= hi
                Break 
            EndIf
        Wend

        If lo >= hi 
            lo = hi
            list (hi) = medVal
            Break
        EndIf

        list(hi) = list(lo) ;Swap Vals
    Wend

    QuickSortLong(list(), first, lo -1)
    QuickSortLong(list(), lo +1, last)

EndProcedure

;====================================

Dim Larray.l(20)

For i = 0 To 20 
  Larray(i) = Random($FFFF)
  Debug Larray(i) 
Next 

QuickSortLong(Larray(), 0,20) 

Debug "----------"

For i = 0 To 20 
  Debug Larray(i) 
Next
Paul Dwyer

“In nature, it’s not the strongest nor the most intelligent who survives. It’s the most adaptable to change” - Charles Darwin
“If you can't explain it to a six-year old you really don't understand it yourself.” - Albert Einstein
User avatar
pdwyer
Addict
Addict
Posts: 2813
Joined: Tue May 08, 2007 1:27 pm
Location: Chiba, Japan

Post by pdwyer »

I tried removing all of the arrays and swaping them for pointer arithmatic and it was not really any different, possibly slightly slower :shock: .

It looks more sophisticated though with those square brackets 8)

(also, making some vars global slows it down too)

Code: Select all



Structure MemArray 
    Lng.l[0] 
EndStructure 

Procedure QuickSortLong(pList.l, First.l, last.l)

    If Last > first 
        
        MedVal.l
        hi.l
        lo.l
        *LstArray.MemArray = pList;@List()
    
        MedVal = *LstArray\lng[first]
        lo = first
        hi = last
    
        While 1
            While *LstArray\lng[hi] >= medval
                hi = hi - 1
                If hi <= lo
                    Break
                EndIf
            Wend
    
            If hi <= lo 
                *LstArray\lng[lo] = medval
                Break
            EndIf
    
            *LstArray\lng[lo] = *LstArray\lng[hi] ;swap Vals
    
            lo = lo + 1
    
            While *LstArray\lng[lo] < medval
                lo = lo + 1
                If lo >= hi
                    Break 
                EndIf
            Wend
    
            If lo >= hi 
                lo = hi
                *LstArray\lng[hi] = medVal
                Break
            EndIf
    
            *LstArray\lng[hi] = *LstArray\lng[lo] ;Swap Vals
        Wend
    
        QuickSortLong(pList, first, lo -1)
        QuickSortLong(pList, lo +1, last)
    
    EndIf
EndProcedure

#ArraySize = 5
Dim Larray.l(#ArraySize)
 *pLArray.MemArray = @Larray()

For i = 0 To #ArraySize 
  Larray(i) = Random($FF)
  Debug Larray(i) 
Next 

QuickSortLong(@Larray(), 0,#ArraySize) 

Debug "----Sorted------"

For i = 0 To #ArraySize 
  Debug Larray(i) 
 ;Debug "-" + Str(*pLArray\lng[i])
Next

Paul Dwyer

“In nature, it’s not the strongest nor the most intelligent who survives. It’s the most adaptable to change” - Charles Darwin
“If you can't explain it to a six-year old you really don't understand it yourself.” - Albert Einstein
hellhound66
Enthusiast
Enthusiast
Posts: 119
Joined: Tue Feb 21, 2006 12:37 pm

Post by hellhound66 »

Removed.ry.
Last edited by hellhound66 on Wed Mar 19, 2008 11:37 pm, edited 1 time in total.
kinglestat
Enthusiast
Enthusiast
Posts: 746
Joined: Fri Jul 14, 2006 8:53 pm
Location: Malta
Contact:

Post by kinglestat »

well, that sort is from the codearchive
any merits to speed is to blbltheworm + Andre
I only adapted it
but there are 2 sorts in the PB source
a custom sort and a sort using SortArray (PB inbuilt sort function) which is the faster of the 2...probably due to pointer swaps rather than assiging 1st to temp

Speeds are as follows

PB
QuickSort: 125
SortArray: 94

C/C++
Borland 84 Avg
Microsoft 100 Avg
Watcom 140 Avg


re negative timing I think ftime is not accurate for high resultion timing, so on fast computers it might throw the timings off
I may not help with your coding
Just ask about mental issues!

http://www.lulu.com/spotlight/kingwolf
http://www.sen3.net
User avatar
Rook Zimbabwe
Addict
Addict
Posts: 4322
Joined: Tue Jan 02, 2007 8:16 pm
Location: Cypress TX
Contact:

Post by Rook Zimbabwe »

See PB SortArray() is faster than MS and Watcom!

What version numbers of each... I would figure the latest versions of each would be more optimized! :D
Binarily speaking... it takes 10 to Tango!!!

Image
http://www.bluemesapc.com/
hellhound66
Enthusiast
Enthusiast
Posts: 119
Joined: Tue Feb 21, 2006 12:37 pm

Post by hellhound66 »

Removed.
Last edited by hellhound66 on Wed Mar 19, 2008 11:37 pm, edited 1 time in total.
User avatar
Demivec
Addict
Addict
Posts: 4260
Joined: Mon Jul 25, 2005 3:51 pm
Location: Utah, USA

Post by Demivec »

hellhound66 wrote:Offtopic:
See PB is faster than MS and Watcom!
Yeah, sure! And pigs can fly..
More Offtopic, maybe they're PBpigs. :wink:
Post Reply