Page 1 of 1

Simple sorting of arrays with your own sort function

Posted: Sat Mar 17, 2012 7:30 am
by wilbert
Tested on Windows (XP, 7), OS X (Lion), Linux.

Code: Select all

ImportC ""
  qsort(*base, num, size, *comparator)
EndImport 

Dim values.i(5)
values(0) = 40
values(1) = 10
values(2) = 100
values(3) = 90
values(4) = 20
values(5) = 25

ProcedureC.i Compare(*a.Integer, *b.Integer)
  ProcedureReturn *a\i - *b\i
EndProcedure

qsort(@values(), ArraySize(values()) + 1, SizeOf(Integer), @Compare())

For n = 0 To 5
  Debug values(n)
Next

Re: Simple sorting of arrays with your own sort function

Posted: Sat Mar 17, 2012 7:45 am
by Demivec
Tested on Windows XP.

Re: Simple sorting of arrays with your own sort function

Posted: Sat Mar 17, 2012 7:47 am
by wilbert
Great that it works on XP also.

Here's a multi field example, first sorted on age, second on name when age is equal.

Code: Select all

EnableExplicit

ImportC ""
  qsort(*base, num, size, *comparator)
EndImport 

Structure Person
  name.s
  age.i
EndStructure

Define n

Dim people.Person(5)
people(0)\name = "John"
people(0)\age = 40
people(1)\name = "Charles"
people(1)\age = 10
people(2)\name = "Bart"
people(2)\age = 10
people(3)\name = "Richard"
people(3)\age = 90
people(4)\name = "Bob"
people(4)\age = 20
people(5)\name = "John Jr."
people(5)\age = 25

ProcedureC.i Compare(*a.Person, *b.Person)
  
  Protected Result.i
  
  Result = *a\age - *b\age
  
  If Result = 0; same age ? => compare name
    Result = CompareMemoryString(@*a\name, @*b\name)
  EndIf
  
  ProcedureReturn Result
EndProcedure

qsort(@people(), ArraySize(people()) + 1, SizeOf(Person), @Compare())

For n = 0 To 5
  Debug people(n)\name
  Debug people(n)\age
  Debug ""
Next
To prevent undefined results when compare results are equal, you probably could do a second comparison between the two pointers of the structures like
If Result = 0
Result = *a - *b
EndIf

That probably should keep items in the original order.
Unfortunately comparing the two pointers doesn't work

Re: Simple sorting of arrays with your own sort function

Posted: Sat Mar 17, 2012 9:07 am
by Little John
Very cool. 8)
Thanks a lot!

Re: Simple sorting of arrays with your own sort function

Posted: Sat Mar 17, 2012 5:55 pm
by rsts
Nice. And potentially very useful. :)

Thanks for sharing.

Re: Simple sorting of arrays with your own sort function

Posted: Sun Mar 18, 2012 2:28 am
by skywalk
[Edited for v56]
Thanks wilbert for pointing out the internal qsort() function. 8)
wilbert wrote: To prevent undefined results when compare results are equal, you probably could do a second comparison between the two pointers of the structures like
If Result = 0
Result = *a - *b
EndIf

That probably should keep items in the original order.
Unfortunately, the qsort() function rearranges the indexes prior to calling the compare function so this won't work. :(
Your idea of comparing additional fields is good and valid, but it still forces specific custom compare_myStruc()'s to be hardcoded. :(
And, if there are no additional fields, we must make some up to indicate the duplicates...or go with our own stable sort. We are cursed with this unstable QuickSort algorithm and I want to know who adopted it so...ummm...QUICKLY? :evil: ahem...optimized MergeSort please.
Anyway, here is the custom compare approach...

Code: Select all

EnableExplicit
ImportC ""
  qsort(*arBaseIndex, nPts.i, SizeOfStruc, *CompareFn)
EndImport

Structure myABC   ; Example Structure
  a.i
  b.d
  c$
EndStructure

ProcedureC.i Compare_myABC(*a.myABC, *b.myABC)
  ; Custom Sort order: \a.i+, \b.d+, \c$+
  Protected.i ri = *a\a - *b\a  ; compare element values
  If ri = 0                     ; Make stable if identical
    ; Comparing element addresses fails because qsort() moves elements internally
    ; Use next desired field's values instead
    ri = *a\b - *b\b
    If ri = 0                   ; Check next available field for desired order
      ri = CompareMemoryString(@*a\c$, @*b\c$, #PB_String_NoCase)
    EndIf
  EndIf
  ProcedureReturn ri
EndProcedure

ProcedureC.i Compare_myABC_BAD(*a.myABC, *b.myABC)
  ; Compare only \a.i+, but to maintain order of duplicates, check addresses
  ; Essentially, this eliminates 0 as a Return value.
  Protected.i ri = *a\a - *b\a  ; Compare Ascending Order, Descending -> *b\a - *a\a
  If ri = 0                     ; Make stable if identical
    ; Comparing element addresses fails because qsort() moves elements internally
    ; Use next desired field's values instead
    ri = *a - *b
  EndIf
  ProcedureReturn ri
EndProcedure

;-{ TEST
Macro Debug_myABC(arStruc, nPts, hdr="", tw=4)
  Debug hdr
  Debug LSet("--", tw) + LSet("--", tw) + LSet("--", tw)
  For i = 0 To nPts-1
    Debug LSet(Str(arstruc(i)\a), tw) + LSet(Str(arstruc(i)\b), tw) + LSet(arstruc(i)\c$, tw)
  Next
EndMacro
Define.i i, nPts, tw = 4
Define.s r$
Restore SortThis
Read.i nPts
Dim myL.myABC(nPts-1)
Dim myL1.myABC(nPts-1)
For i = 0 To nPts-1
  Read.s r$: myL1(i)\a = Val(r$)
  Read.s r$: myL1(i)\b = ValD(r$)
  Read.s myL1(i)\c$
Next
CopyArray(myL1(),myL())
Debug "-- Before Sort --"
Debug_myABC(myL, nPts, LSet("a", tw) + LSet("b", tw) + LSet("c$", tw))
; Attempt to use the built-in PB SortStructuredArray() for \a, \b, \c$
SortStructuredArray(myL(), #PB_Sort_Ascending|#PB_Sort_NoCase, OffsetOf(myABC\c$), #PB_String)
SortStructuredArray(myL(), #PB_Sort_Ascending,                 OffsetOf(myABC\b),  #PB_Double)
SortStructuredArray(myL(), #PB_Sort_Ascending,                 OffsetOf(myABC\a),  #PB_Integer)
Debug "-- FAIL = PB SortStructuredArray()         a+,b+,c$+ --"
Debug_myABC(myL, nPts, LSet("a", tw) + LSet("b", tw) + LSet("c$", tw))
CopyArray(myL1(),myL())
qsort(@myL(), nPts, SizeOf(myABC), @Compare_myABC_BAD())
Debug "-- FAIL = qsort() w/custom compare_myABC_BAD() a+ only --"
Debug_myABC(myL, nPts, LSet("a", tw) + LSet("b", tw) + LSet("c$", tw))
CopyArray(myL1(),myL())
qsort(@myL(), nPts, SizeOf(myABC), @Compare_myABC())
Debug "-- OK   = qsort() w/custom compare_myABC() a+,b+,c$+ --"
Debug_myABC(myL, nPts, LSet("a", tw) + LSet("b", tw) + LSet("c$", tw))
;-}

DataSection
  SortThis:
  Data.i 12
  ;       a,   b,   c$
  Data.s "1", "9", "1"
  Data.s "2", "2", "5"
  Data.s "3", "1", "b"
  Data.s "4", "1", "a"
  Data.s "5", "3", "z"
  Data.s "7", "3", "y"
  Data.s "6", "3", "3"
  Data.s "8", "3", "2"
  Data.s "9", "3", "x"
  Data.s "7", "5", "a"  ;<- The problem of a 3rd sort appears when both Field1 and Field2 are duplicates
  Data.s "7", "5", "W"  ;<- Field1 = primary, Field2 = secondary, etc.
  Data.s "7", "4", "x"
  IWantThis:
  Data.i 12
  ;       a,   b,   c$
  Data.s "1", "9", "1"
  Data.s "2", "2", "5"
  Data.s "3", "1", "b"
  Data.s "4", "1", "a"
  Data.s "5", "3", "z"
  Data.s "6", "3", "3"
  Data.s "7", "3", "y"   ;<- b = 3 then 4
  Data.s "7", "4", "x"
  Data.s "7", "5", "a"   ;<- c$ = a then W
  Data.s "7", "5", "W"
  Data.s "8", "3", "2"
  Data.s "9", "3", "x"
  IGetThis:
  ; -- FAIL = PB SortStructuredArray()         a+,b+,c$+ --
  ; a   b   c$
  ; --  --  --
  ; 1   9   1
  ; 2   2   5
  ; 3   1   b
  ; 4   1   a
  ; 5   3   z
  ; 6   3   3
  ; 7   4   x   ;<- FAIL b = 4 then 3
  ; 7   3   y
  ; 7   5   W   ;<- FAIL c$ = W then a
  ; 7   5   a
  ; 8   3   2
  ; 9   3   x
EndDataSection
;-}

Re: Simple sorting of arrays with your own sort function

Posted: Sun Mar 18, 2012 7:11 am
by wilbert
skywalk wrote:Unfortunately, the qsort() function rearranges the indexes prior to calling the compare function so this won't work. :(
After some more checking, you are right :(
On OS X the behavior seems to be a bit different.

Re: Simple sorting of arrays with your own sort function

Posted: Sun Mar 18, 2012 8:12 am
by skywalk
Hi wilbert,
I edited my previous post to show the error.
Look at the order of myABC\a before and after the different sorts.
Thanks for posting this solution!
This beats my mergesort approach by 100%! I retrieved each structure element as an array to mergesort() and then stuck them back in according to a sorted pointer array.
Demivec's Multi-sort lib also beat my mergesort by ~70%, but requires more code and globals.

This only requires:
qsort(@myArr(), nPts, SizeOf(myABC), @cmp_myABC())
And the code in cmp_myABC()
Very nice and clean.

Of course, we still have to code for specific structures...and that's a drag.

Re: Simple sorting of arrays with your own sort function

Posted: Sun Mar 18, 2012 8:21 am
by wilbert
I also just updated my previous post :)
You are right. My first tests were on OS X and the behavior is a bit different there when compare results are identical.
I would love to hear if it works on Linux also. I think it should but I'm not sure.

It is a clean solution but you are right about the structures.
If an item could be used as an index for a map or something like that so you can add a field virtually, that might also work but I guess that's also not possible.

Re: Simple sorting of arrays with your own sort function

Posted: Sun Mar 18, 2012 8:35 am
by skywalk
From what I read, qsort() is standard deployment in c libs.

Side note.
To do the speed test, I had to back out the CopyArray(myL1(),myL()) calls.
The qsort() is notoriously bad with presorted arrays.
So I didn't want to loop using the same sorted structured array.
What shocked me was the qsort() completed 5 times faster than the copyarray() :?:
Not sure how can that be?

Re: Simple sorting of arrays with your own sort function

Posted: Sun Mar 18, 2012 7:16 pm
by idle
wilbert wrote: I would love to hear if it works on Linux also. I think it should but I'm not sure.
Yes it works on Linux!

Re: Simple sorting of arrays with your own sort function

Posted: Sun Mar 18, 2012 7:24 pm
by wilbert
Thanks for checking Idle; nice to know it is fully cross platform.
Since I upgraded Ubuntu, PB doesn't compile anymore for some reason so I couldn't check for myself.