MergeSort for linked lists of complex elements

Share your advanced PureBasic knowledge/code with the community.
Little John
Addict
Addict
Posts: 4791
Joined: Thu Jun 07, 2007 3:25 pm
Location: Berlin, Germany

MergeSort for linked lists of complex elements

Post by Little John »

Works with PB 5.20+

Hi all,

for linked lists of standard data types, PB has the built-in commands SortList() and SortStructuredList().

However, since PB 4.50 there can be dynamic arrays, lists, and maps inside of structures, which opens up unlimited possibilities for creating complex data types. In order to sort linked lists (or arrays) that consist of complex elements, we'll have to write our own sorting routines.

Here is code for sorting linked lists of string arrays, according to one or more chosen array element(s). The code uses MergeSort, which is a fast and stable sorting-algorithm. Also, MergeSort is simple and straightforward, so that it's easy to adapt this code for sorting lists of any other data type. For details please see the comments in the code.

// edit 2014-05-03: Version 2.0

Code: Select all

; -- A module for MergeSorting lists of string arrays
; <http://www.purebasic.fr/english/viewtopic.php?f=12&t=46106>

; Version 2.00, 2014-05-03
; Public Domain, written by Juergen Luethje <http://luethje.eu/>.
; For Purebasic 5.20 LTS or newer.
; Cross-platform, Unicode compliant.
; Standard disclaimer: USE AT YOUR OWN RISK!


DeclareModule ArrayList
   EnableExplicit
   
   Structure StringArray
      Array s.s(0)
   EndStructure
   
   Structure CmpStruc
      index.i          ; index of array element which is to be used for comparison
      options.i        ; combination of #PB_Sort_Ascending/#PB_Sort_Descending and #PB_Sort_NoCase
   EndStructure
   
   Declare.i Sort (List x.StringArray(), cmpIndex.i, options.i)
   Declare.i IsSorted (List x.StringArray(), cmpIndex.i, options.i)
   Declare.i SortEx (List x.StringArray(), Array mode.CmpStruc(1))
   Declare.i IsSortedEx (List x.StringArray(), Array mode.CmpStruc(1))
EndDeclareModule


Module ArrayList
   Procedure.i _IsInTheRightOrder (x$, y$, options.i)
      ; -- Auxiliary function for _Merge() and IsSorted()
      ; in : x$, y$ : strings to be compared
      ;      options: #PB_Sort_Ascending/#PB_Sort_Descending optionally combined with #PB_Sort_NoCase;
      ;               determines the mode of comparison
      ; out: return value: #True/#False, or -1 on error
      
      Select options
         Case 0                                           ; #PB_Sort_Ascending
            ProcedureReturn Bool(x$ <= y$)
         Case 1                                           ; #PB_Sort_Descending
            ProcedureReturn Bool(x$ >= y$)
         Case 2                                           ; #PB_Sort_Ascending|#PB_Sort_NoCase
            ProcedureReturn Bool(LCase(x$) <= LCase(y$))
         Case 3                                           ; #PB_Sort_Descending|#PB_Sort_NoCase
            ProcedureReturn Bool(LCase(x$) >= LCase(y$))
         Default                                          ; illegal function call
            ProcedureReturn -1
      EndSelect
   EndProcedure
   
   
   Procedure.i _Merge (List a.StringArray(), List b.StringArray(), cmpIndex.i, options.i, List result.StringArray())
      ; -- Auxiliary function for Sort()
      ; in : a(), b(): partial lists which are already sorted
      ;      cmpIndex: index of the array element, according to which the list will get sorted
      ;      options : #PB_Sort_Ascending/#PB_Sort_Descending optionally combined with #PB_Sort_NoCase;
      ;                determines the mode of comparison
      ;      result(): empty list
      ; out: result()    : sorted list that contains all elements of a() and b()
      ;      return value: 1 on success, -1 on error
      Protected.i flag
      
      LastElement(a())
      FirstElement(b())
      
      flag = _IsInTheRightOrder(a()\s(cmpIndex), b()\s(cmpIndex), options)
      If flag = #True                 ; In this case we can save time:
         MergeLists(a(), result())    ; Just move a() to result(), and then
         MergeLists(b(), result())    ; append b() to the end of result().
         ProcedureReturn 1            ; success
      ElseIf flag = -1
         ProcedureReturn -1           ; illegal function call
      EndIf
      
      FirstElement(a())
      
      Repeat
         If ListSize(b()) = 0
            MergeLists(a(), result())  ; move the remaining elements of a() to result()
            Break
         ElseIf ListSize(a()) = 0
            MergeLists(b(), result())  ; move the remaining elements of b() to result()
            Break
         Else
            ; compare values in a() and b(), and move the appropriate one to result()
            AddElement(result())
            If _IsInTheRightOrder(a()\s(cmpIndex), b()\s(cmpIndex), options) = #True
               result() = a()
               DeleteElement(a(), 1)
            Else
               result() = b()
               DeleteElement(b(), 1)
            EndIf
         EndIf
      ForEver
      
      ProcedureReturn 1               ; success
   EndProcedure
   
   
   Procedure.i Sort (List x.StringArray(), cmpIndex.i, options.i)
      ; -- Sort list x() in ascending or descending order (stable)
      ; in : x()     : List of string arrays to be sorted
      ;      cmpIndex: index of the array element, according to which the list will get sorted
      ;      options : #PB_Sort_Ascending/#PB_Sort_Descending optionally combined with #PB_Sort_NoCase;
      ;                determines the mode of comparison
      ; out: x()         : sorted list
      ;      return value: 1 on success, -1 on error
      
      If ListSize(x()) <= 1
         ProcedureReturn
      EndIf
      
      Protected NewList a.StringArray()
      Protected NewList b.StringArray()
      
      SelectElement(x(), Int(ListSize(x())/2))
      SplitList(x(), b())            ; Move the second half of x() to b().
      MergeLists(x(), a())           ; Move the remaining first half of x() to a().
      
      Sort(a(), cmpIndex, options)
      Sort(b(), cmpIndex, options)
      ProcedureReturn _Merge(a(), b(), cmpIndex, options, x())
   EndProcedure
   
   
   Procedure.i IsSorted (List x.StringArray(), cmpIndex.i, options.i)
      ; -- Check whether list x() is sorted according to array element 'cmpIndex'
      ; in : x()     : List of string arrays to check
      ;      cmpIndex: Index of the array element, according to which the list should be sorted
      ;      options : #PB_Sort_Ascending/#PB_Sort_Descending optionally combined with #PB_Sort_NoCase;
      ;                determines the mode of comparison
      ; out: return value: #True/#False, or -1 on error
      Protected.i flag
      Protected.StringArray *cur, *nxt
      
      *cur = FirstElement(x())
      *nxt = NextElement(x())
      While *nxt
         flag = _IsInTheRightOrder(*cur\s(cmpIndex), *nxt\s(cmpIndex), options)
         If flag <> #True
            ProcedureReturn flag    ; #False or -1
         EndIf
         *cur = *nxt
         *nxt = NextElement(x())
      Wend
      
      ProcedureReturn #True
   EndProcedure
   
   ;--------------------------------------------------------------------
   
   Procedure.i _IsInTheRightOrderEx (Array x$(1), Array y$(1), Array mode.CmpStruc(1))
      ; -- Auxiliary function for _MergeEx() and IsSortedEx()
      ; in : x$(), y$(): string arrays to be compared
      ;      mode()    : List of one or more array indexes to sort on (in order of preference),
      ;                  together with the corresponding sorting options
      ;                  (combination of #PB_Sort_Ascending/#PB_Sort_Descending and #PB_Sort_NoCase)
      ; out: return value: #True/#False, or -1 on error
      Protected x_cmp$, y_cmp$
      Protected.i k, idx, equal, lastMode=ArraySize(mode()), lastField=ArraySize(x$())
      
      For k = 0 To lastMode
         idx = mode(k)\index
         If idx < 0 Or idx > lastField
            ProcedureReturn -1                             ; illegal function call
         EndIf
         
         ; Using CompareMemoryString() would be probably faster here, but that
         ; function has a bug when the mode #PB_Sort_NoCase is chosen (PB 5.22),
         ; see <http://www.purebasic.fr/english/viewtopic.php?f=4&t=59096>
         If mode(k)\options & 2 = 0
            equal = Bool(x$(idx) = y$(idx))
         Else                                              ; #PB_Sort_NoCase
            x_cmp$ = LCase(x$(idx))
            y_cmp$ = LCase(y$(idx))
            equal = Bool(x_cmp$ = y_cmp$)
         EndIf
         
         If Not equal
            Select mode(k)\options
               Case 0                                      ; #PB_Sort_Ascending
                  ProcedureReturn Bool(x$(idx) < y$(idx))
               Case 1                                      ; #PB_Sort_Descending
                  ProcedureReturn Bool(x$(idx) > y$(idx))
               Case 2                                      ; #PB_Sort_Ascending|#PB_Sort_NoCase
                  ProcedureReturn Bool(x_cmp$ < y_cmp$)
               Case 3                                      ; #PB_Sort_Descending|#PB_Sort_NoCase
                  ProcedureReturn Bool(x_cmp$ > y_cmp$)
               Default                                     ; illegal function call
                  ProcedureReturn -1
            EndSelect
         EndIf
      Next
      
      ProcedureReturn #True
   EndProcedure
   
   
   Procedure.i _MergeEx (List a.StringArray(), List b.StringArray(), Array mode.CmpStruc(1), List result.StringArray())
      ; -- Auxiliary function for SortEx()
      ; in : a(), b(): partial lists which are already sorted
      ;      mode()  : List of one or more array indexes to sort on (in order of preference),
      ;                together with the corresponding sorting options
      ;                (combination of #PB_Sort_Ascending/#PB_Sort_Descending and #PB_Sort_NoCase)
      ;      result(): empty list
      ; out: result()    : sorted list that contains all elements of a() and b()
      ;      return value: 1 on success, -1 on error
      Protected.i flag
      
      LastElement(a())
      FirstElement(b())
      
      flag = _IsInTheRightOrderEx(a()\s(), b()\s(), mode())
      If flag = #True                  ; In this case we can save time:
         MergeLists(a(), result())     ; Just move a() to result(), and then
         MergeLists(b(), result())     ; append b() to the end of result().
         ProcedureReturn 1             ; success
      ElseIf flag = -1
         ProcedureReturn -1            ; illegal function call
      EndIf
      
      FirstElement(a())
      
      Repeat
         If ListSize(b()) = 0
            MergeLists(a(), result())  ; move the remaining elements of a() to result()
            Break
         ElseIf ListSize(a()) = 0
            MergeLists(b(), result())  ; move the remaining elements of b() to result()
            Break
         Else
            ; compare values in a() and b(), and move the appropriate one to result()
            AddElement(result())
            flag = _IsInTheRightOrderEx(a()\s(), b()\s(), mode())
            If flag = #True
               result() = a()
               DeleteElement(a(), 1)
            ElseIf flag = #False
               result() = b()
               DeleteElement(b(), 1)
            Else
               ProcedureReturn -1      ; illegal function call
            EndIf
         EndIf
      ForEver
      
      ProcedureReturn 1                ; success
   EndProcedure
   
   
   Procedure.i SortEx (List x.StringArray(), Array mode.CmpStruc(1))
      ; -- Sort list x() according to array indexes and options in list 'mode' (stable)
      ; in : x()   : List of string arrays to be sorted
      ;      mode(): List of one or more array indexes to sort on (in order of preference),
      ;              together with the corresponding sorting options
      ;              (combination of #PB_Sort_Ascending/#PB_Sort_Descending and #PB_Sort_NoCase)
      ; out: x()         : sorted list
      ;      return value: 1 on success, -1 on error
      
      If ListSize(x()) <= 1
         ProcedureReturn 1
      EndIf
      
      Protected NewList a.StringArray()
      Protected NewList b.StringArray()
      
      SelectElement(x(), Int(ListSize(x())/2))
      SplitList(x(), b())            ; Move the second half of x() to b().
      MergeLists(x(), a())           ; Move the remaining first half of x() to a().
      
      SortEx(a(), mode())
      SortEx(b(), mode())
      ProcedureReturn _MergeEx(a(), b(), mode(), x())
   EndProcedure
   
   
   Procedure.i IsSortedEx (List x.StringArray(), Array mode.CmpStruc(1))
      ; -- Check whether list x() is sorted according to array indexes and options in list 'mode'
      ; in : x()   : List of string arrays to check
      ;      mode(): List of one or more array indexes to sort on (in order of preference),
      ;              together with the corresponding sorting options
      ;              (combination of #PB_Sort_Ascending/#PB_Sort_Descending and #PB_Sort_NoCase)
      ; out: return value: #True/#False, or -1 on error
      Protected.i flag
      Protected.StringArray *cur, *nxt
      
      *cur = FirstElement(x())
      *nxt = NextElement(x())
      While *nxt
         flag = _IsInTheRightOrderEx(*cur\s(), *nxt\s(), mode())
         If flag <> #True
            ProcedureReturn flag    ; #False or -1
         EndIf
         *cur = *nxt
         *nxt = NextElement(x())
      Wend
      
      ProcedureReturn #True
   EndProcedure
EndModule


CompilerIf #PB_Compiler_IsMainFile
   ; -- Module demo
   EnableExplicit
   
   DataSection
      RecordSize:
      Data.i 3
      
      StringData:
      Data.s "Schmidt", "Michelle", "Paris"
      Data.s "Johnson", "Jamie", "London"
      Data.s "Schmidt", "Jürgen", "Berlin"
      Data.s "Miller", "Mary", "Washington"
      Data.s "Parker", "Peter", "Canberra"
      Data.s "Johnson", "Karl", "London"
      Data.s "Parker", "Michelle", "Berlin"
      Data.s ""
   EndDataSection
   
   
   Procedure ReadList (List x.ArrayList::StringArray())
      Protected r.i, lastField.i, i.i, temp$
      
      Restore RecordSize
      Read.i r
      lastField = r - 1
      
      Restore StringData
      Read.s temp$
      While temp$
         AddElement(x())
         ReDim x()\s(lastField)
         For i = 0 To lastField
            x()\s(i) = temp$
            Read.s temp$
         Next
      Wend
   EndProcedure
   
   
   Procedure DisplayList (List x.ArrayList::StringArray())
      Protected i.i, record$
      
      Debug ""
      ForEach x()
         record$ = x()\s(0)
         For i = 1 To ArraySize(x()\s())
            record$ + ", " + x()\s(i)
         Next
         Debug record$
      Next
   EndProcedure
   
   
   NewList x.ArrayList::StringArray()
   Dim mode.ArrayList::CmpStruc(2)
   Define CmpIndex, Options, Result
   
   ReadList(x())
   
   Debug "Shuffled:"
   RandomizeList(x())
   DisplayList(x())
   
   Debug "-----------------------------------"
   
   CmpIndex = 1
   Options = #PB_Sort_Descending
   Debug "Sorted descending according to given name (stable): "
   ArrayList::Sort(x(), CmpIndex, Options)
   If ArrayList::IsSorted(x(), CmpIndex, Options) = #True
      Debug "OK"
   Else
      Debug "List x() is not sorted correctly."
   EndIf
   DisplayList(x())
   
   Debug "-----------------------------------"
   
   CmpIndex = 0
   Options = #PB_Sort_Ascending
   Debug "Sorted ascending according to family name (stable): "
   ArrayList::Sort(x(), CmpIndex, Options)
   If ArrayList::IsSorted(x(), CmpIndex, Options) = #True
      Debug "OK"
   Else
      Debug "List x() is not sorted correctly."
   EndIf
   DisplayList(x())
   
   Debug "==================================="
   
   Debug "Shuffled:"
   RandomizeList(x())
   DisplayList(x())
   
   Debug "-----------------------------------"
   
   mode(0)\index = 0
   mode(0)\options = #PB_Sort_Descending|#PB_Sort_NoCase
   
   mode(1)\index = 2
   mode(1)\options = #PB_Sort_Ascending|#PB_Sort_NoCase
   
   mode(2)\index = 1
   mode(2)\options = #PB_Sort_Ascending
   
   Debug "Sorted according to multiple indexes and options: "
   If ArrayList::SortEx(x(), mode()) = 1
      Debug "OK"
   Else
      Debug "Illegal function call"
   EndIf
   Result = ArrayList::IsSortedEx(x(), mode())
   If Result = #True
      Debug "OK"
   ElseIf Result = #False
      Debug "List x() is not sorted correctly."
   Else
      Debug "Illegal function call"
   EndIf
   DisplayList(x())
CompilerEndIf
Last edited by Little John on Sat May 03, 2014 4:36 pm, edited 3 times in total.
User avatar
idle
Always Here
Always Here
Posts: 5915
Joined: Fri Sep 21, 2007 5:52 am
Location: New Zealand

Re: MergeSort for linked lists of complex elements

Post by idle »

thanks
Windows 11, Manjaro, Raspberry Pi OS
Image
Little John
Addict
Addict
Posts: 4791
Joined: Thu Jun 07, 2007 3:25 pm
Location: Berlin, Germany

Re: MergeSort for linked lists of complex elements

Post by Little John »

New version 2.0

Changes
  • The code now has the form of a module.
  • Changed several internal details.
Removed
  • Removed code for compatibility with PB versions < 4.60
    (since PB 5.20 or newer is required now anyway because of the module).
New
  • New parameter for passing sorting options
    (#PB_Sort_Ascending/#PB_Sort_Descending, and #PB_Sort_NoCase)
  • New public procedure IsSorted(),
    in order to check whether the given list is sorted as expected.
  • New public procedures SortEx() and IsSortedEx(),
    for advanced sorting according to multiple array elements.
davido
Addict
Addict
Posts: 1890
Joined: Fri Nov 09, 2012 11:04 pm
Location: Uttoxeter, UK

Re: MergeSort for linked lists of complex elements

Post by davido »

Very nice.
Thank you. :D
DE AA EB
Post Reply