Re: Custom sort for Linked Lists
Posted: Tue Jan 29, 2019 7:10 am
				
				Thank you very much   You saved my day. Really useful functions.
 You saved my day. Really useful functions.
			 You saved my day. Really useful functions.
 You saved my day. Really useful functions.http://www.purebasic.com
https://www.purebasic.fr/english/
 You saved my day. Really useful functions.
 You saved my day. Really useful functions.
Thanks for your suggestion!wilbert wrote:Did you consider using a (natural) bottom-up merge sort instead of top-down ?
I got some strange results when I ran your speed test.Little John wrote:Speed test
The following code demonstrates that the hybrid sorting algorithm is considerably faster than using self-written MergeSort alone. On my system, 35 is a good threshold value.
Code: Select all
;-============  Structures  ============
Structure Person
  Idx.i
  Name.s
  Age.i
EndStructure
Procedure CreateRandomStructures (List x.Person(), n.i)
  Protected m$, a, i, r=Int(n/2)
  
  For i = 1 To r
    m$ = RandomString(Random(1000, 200))
    a = Random(80, 20)
    
    AddElement(x())
    x()\Idx = i*2 - 1
    x()\Name = m$
    x()\Age = a
    
    AddElement(x())
    x()\Idx = i*2
    x()\Name = m$
    x()\Age = a + 10
  Next
EndProcedure
Procedure.i ComparePersons (*pa.Integer, *pb.Integer, mode.i)
  ; -- custom comparison function of type 'ProtoCompare'
  ; in : *pa, *pb: pointers to pointers to structures to be compared
  ;      mode    : mode of comparison:
  ;                #PB_Sort_Ascending/#PB_Sort_Descending
  ; out: return value: #True/#False
  Protected.Person *a = *pa\i, *b = *pb\i     ; dereference the pointers passed as parameters
  
  If (mode & #PB_Sort_Descending)
    ProcedureReturn Bool(*a\Name < *b\Name)
  Else
    ProcedureReturn Bool(*a\Name > *b\Name)
  EndIf
EndProcedure
Define u0, u1, u2, u3, u4
NewList p0.Person()
CreateRandomStructures(p0(), n)
; --------------------------------------------------------------------------
threshold = 35
SortStructuredList(p0(), #PB_Sort_Ascending, OffsetOf(Person\Idx), TypeOf(Person\Idx))
u0 = ElapsedMilliseconds()
SortStructuredList(p0(), #PB_Sort_Ascending, OffsetOf(Person\Name), TypeOf(Person\Name))
u0 = ElapsedMilliseconds() - u0
SortStructuredList(p0(), #PB_Sort_Ascending, OffsetOf(Person\Idx), TypeOf(Person\Idx))
CS::SetInsertionSortMaxSize(2 * threshold)
u1 = ElapsedMilliseconds()
CS::SortListAny(p0(), @ ComparePersons())
u1 = ElapsedMilliseconds() - u1
SortStructuredList(p0(), #PB_Sort_Ascending, OffsetOf(Person\Idx), TypeOf(Person\Idx))
CS::SetInsertionSortMaxSize(threshold)
u2 = ElapsedMilliseconds()
CS::SortListAny(p0(), @ ComparePersons())
u2 = ElapsedMilliseconds() - u2
SortStructuredList(p0(), #PB_Sort_Ascending, OffsetOf(Person\Idx), TypeOf(Person\Idx))
CS::SetInsertionSortMaxSize(0.5 * threshold)
u3 = ElapsedMilliseconds()
CS::SortListAny(p0(), @ ComparePersons())
u3 = ElapsedMilliseconds() - u3
SortStructuredList(p0(), #PB_Sort_Ascending, OffsetOf(Person\Idx), TypeOf(Person\Idx))
CS::SetInsertionSortMaxSize(1)
u4 = ElapsedMilliseconds()
CS::SortListAny(p0(), @ ComparePersons())
u4 = ElapsedMilliseconds() - u4I wasn't aware of that problem. Thank you for your investigations!wilbert wrote: I got some strange results when I ran your speed test.
I found out it had to do with the copies of lists that you are using. Since they are located in different parts of memory, one list may be in a better place compared to another one.
Code: Select all
EnableExplicit
XIncludeFile "SortListCustom.pbi"
CompilerIf #PB_Compiler_Debugger
   MessageRequester("Error",
                    "Switch the Debugger off!",
                    #PB_MessageRequester_Error)
   End
CompilerEndIf
Procedure.s RandomString (length.i)
   Protected *char.Character, ret$=Space(length)
   
   *char = @ ret$
   While *char\c <> 0
      If Random(1) = 0
         *char\c = Random('Z', 'A')
      Else
         *char\c = Random('z', 'a')
      EndIf
      *char + SizeOf(Character)
   Wend
   
   ProcedureReturn ret$
EndProcedure
;-============  Structures  ============
Structure Person
  Idx.i
  Name.s
  Age.i
EndStructure
Procedure CreateRandomStructures (List x.Person(), n.i)
  Protected m$, a, i, r=Int(n/2)
 
  For i = 1 To r
    m$ = RandomString(Random(1000, 200))
    a = Random(80, 20)
   
    AddElement(x())
    x()\Idx = i*2 - 1
    x()\Name = m$
    x()\Age = a
   
    AddElement(x())
    x()\Idx = i*2
    x()\Name = m$
    x()\Age = a + 10
  Next
EndProcedure
Procedure.i ComparePersons (*pa.Integer, *pb.Integer, mode.i)
  ; -- custom comparison function of type 'ProtoCompare'
  ; in : *pa, *pb: pointers to pointers to structures to be compared
  ;      mode    : mode of comparison:
  ;                #PB_Sort_Ascending/#PB_Sort_Descending
  ; out: return value: #True/#False
  Protected.Person *a = *pa\i, *b = *pb\i     ; dereference the pointers passed as parameters
 
  If (mode & #PB_Sort_Descending)
    ProcedureReturn Bool(*a\Name < *b\Name)
  Else
    ProcedureReturn Bool(*a\Name > *b\Name)
  EndIf
EndProcedure
Define u0, u1, u2, u3, u4
Define msg$, threshold, n = 500000
NewList p0.Person()
CreateRandomStructures(p0(), n)
; --------------------------------------------------------------------------
threshold = 35
SortStructuredList(p0(), #PB_Sort_Ascending, OffsetOf(Person\Idx), TypeOf(Person\Idx))
u0 = ElapsedMilliseconds()
SortStructuredList(p0(), #PB_Sort_Ascending, OffsetOf(Person\Name), TypeOf(Person\Name))
u0 = ElapsedMilliseconds() - u0
SortStructuredList(p0(), #PB_Sort_Ascending, OffsetOf(Person\Idx), TypeOf(Person\Idx))
CS::SetInsertionSortMaxSize(2 * threshold)
u1 = ElapsedMilliseconds()
CS::SortListAny(p0(), @ ComparePersons())
u1 = ElapsedMilliseconds() - u1
SortStructuredList(p0(), #PB_Sort_Ascending, OffsetOf(Person\Idx), TypeOf(Person\Idx))
CS::SetInsertionSortMaxSize(threshold)
u2 = ElapsedMilliseconds()
CS::SortListAny(p0(), @ ComparePersons())
u2 = ElapsedMilliseconds() - u2
SortStructuredList(p0(), #PB_Sort_Ascending, OffsetOf(Person\Idx), TypeOf(Person\Idx))
CS::SetInsertionSortMaxSize(0.5 * threshold)
u3 = ElapsedMilliseconds()
CS::SortListAny(p0(), @ ComparePersons())
u3 = ElapsedMilliseconds() - u3
SortStructuredList(p0(), #PB_Sort_Ascending, OffsetOf(Person\Idx), TypeOf(Person\Idx))
CS::SetInsertionSortMaxSize(1)
u4 = ElapsedMilliseconds()
CS::SortListAny(p0(), @ ComparePersons())
u4 = ElapsedMilliseconds() - u4
; --------------------------------------------------------------------------
msg$ + "-- Sorting structures" + #LF$ +
       "u0 = " + StrD(u0/1000,3) + " Sec. (built-in MergeSort)" + #LF$ +
       "u1 = " + StrD(u1/1000,3) + " Sec."   + #LF$ +
       "u2 = " + StrD(u2/1000,3) + " Sec. <" + #LF$ +
       "u3 = " + StrD(u3/1000,3) + " Sec."   + #LF$ +
       "u4 = " + StrD(u4/1000,3) + " Sec. (self-written MergeSort alone)"
MessageRequester("Duration", msg$)Here are the results on my system with both methods.wilbert wrote: When I changed your code so every test is done with the same list, the results were very different on my computer.
Results of my original test wrote:-- Sorting structures
u0 = 0.396 Sec. (built-in MergeSort)
u1 = 1.214 Sec.
u2 = 1.177 Sec. <
u3 = 1.197 Sec.
u4 = 2.679 Sec. (self-written MergeSort alone)
The results are pretty similar. Especially using the proposed threshold (u2) is with both methods faster than using a bigger threshold (u1) or a smaller one (u3).Results of your test wrote:-- Sorting structures
u0 = 0.424 Sec. (built-in MergeSort)
u1 = 1.221 Sec.
u2 = 1.198 Sec. <
u3 = 1.234 Sec.
u4 = 2.742 Sec. (self-written MergeSort alone)
The changed version-- Sorting strings
t0 = 0.667 Sec. (built-in MergeSort)
t1 = 2.530 Sec.
t2 = 2.249 Sec. <
t3 = 1.612 Sec.
t4 = 1.829 Sec. (self-written MergeSort alone)
-- Sorting structures
u0 = 3.444 Sec. (built-in MergeSort)
u1 = 1.913 Sec.
u2 = 1.844 Sec. <
u3 = 1.807 Sec.
u4 = 2.197 Sec. (self-written MergeSort alone)
A PB linked list is a chain of small parts of allocated memory.-- Sorting structures
u0 = 0.383 Sec. (built-in MergeSort)
u1 = 1.105 Sec.
u2 = 1.050 Sec. <
u3 = 1.034 Sec.
u4 = 1.255 Sec. (self-written MergeSort alone)

This is really strange. It looks as if the built-in routine were the slowest one.wilbert wrote:Your original code.[...]
-- Sorting structures
u0 = 3.444 Sec. (built-in MergeSort)
u1 = 1.913 Sec.
u2 = 1.844 Sec. <
u3 = 1.807 Sec.
u4 = 2.197 Sec. (self-written MergeSort alone)

Code: Select all
Structure struct
  x.l
EndStructure
Procedure.i CompareStruc(*pa.struct, *pb.struct, mode.i)
  Res = #False
  
  If *pa\x > *pb\x
    Res = #True
  EndIf
  
  ProcedureReturn Res
EndProcedure
NewList l.struct()
For i=1 To 100
  AddElement(l())
  l()\x = Random(100, 10)
Next
CS::SortListAny(l(), @CompareStruc())
ForEach l()
  Debug l()\x
NextBecause you are not using a proper custom comparison function.User_Russian wrote:Why doesn't it sort?
Code: Select all
Procedure.i CompareStruc (*pa.Integer, *pb.Integer, mode.i)
   Protected.struct *a = *pa\i, *b = *pb\i    ; dereference the pointers
   Protected.i Res
   
   Res = #False
   If *a\x > *b\x
      Res = #True
   EndIf
   
   ProcedureReturn Res
EndProcedure