Multi-field sorting for a structured array

Share your advanced PureBasic knowledge/code with the community.
User avatar
Demivec
Addict
Addict
Posts: 4260
Joined: Mon Jul 25, 2005 3:51 pm
Location: Utah, USA

Multi-field sorting for a structured array

Post by Demivec »

Code updated for 5.20+

Here is some code I created for sorting a structured array on multiple fields. This code is needed because sorting an array in PB is not performed with a stable sort. The procedures I created below will allow you to sort on multiple fields at once (so to speak). It uses PureBasic's SortStructuredArray() function to do the main work but selectively calls it only over certain ranges and with selected parameters to accomplish the desired multi-field sort.

Because PB's SortStructuredArray() function requires a specific structured array parameter and not a pointer to an array I have written a macro that creates the needed procedures for an array of a specific structure. For instance to create the procedures to be used for arrays with the structure 'point' you would call the macro mcr_multiSort(point). This would create the procedure multiSort_point() that would be called with parameters as multiSort_point(a.point(), sortSpecs()).

The sortSpecs() array contains the details of the sorting for each field being sorted. Its dimensions must be one less than the number of fields being sorted. It is structured with the structure sortFieldSpecs which contains entries for the offset of the field, it's type, and the option for sorting. Each of those fields correspond to one of the parameters in PB's SortStructuredArray() function.

I've arranged the code as an include:

Code: Select all

;Program: Multi-Field Sort for Structured Arrays.
;Name:  Array_MultiFieldSort.pbi
;Author: Demivec
;Date: 3/12/2012
;Written for PureBasic v4.61

EnableExplicit

;-sort routines
Enumeration   
  ;The standard equality functions return one of the following comparison
  ;results.
  #isNotEqualTo = 0
  #isEqualTo
EndEnumeration

;=================================================================================
; Description: Each of the equality functions compares two elements that are a
;              PureBasic standard data type for equality and returns a boolean
;              result.  The elements are at addresses [*element_1] and [*element_2].
;
; Returns: Returns '#isEqualTo' if the first element is equal to
;          element two, otherwise it returns '#isNotEqualTo'.
;
; Note: A given equality function can only operate on compatible data and so care
;       must be taken in using the correct function.  These functions are only
;       used by the subSort() function.
;=================================================================================
Macro mcr_equality_standardTypes(type, subStructure)
  Procedure _isEqual_#type(*element_1.type, *element_2.type)
    If *element_1\subStructure = *element_2\subStructure
      ProcedureReturn #isEqualTo
    EndIf 
    ProcedureReturn #isNotEqualTo
  EndProcedure
EndMacro

mcr_equality_standardTypes(byte, b)
mcr_equality_standardTypes(word, w)
mcr_equality_standardTypes(long, l)
mcr_equality_standardTypes(integer, i)
mcr_equality_standardTypes(quad, q)
mcr_equality_standardTypes(float, f)
mcr_equality_standardTypes(double, d)
mcr_equality_standardTypes(ascii, a)
mcr_equality_standardTypes(character, c)
mcr_equality_standardTypes(unicode, u)
mcr_equality_standardTypes(string, s)

DataSection
  _standardEqualityFunctions:
  ;Equality functions for each of the standard element types.
  ;Data must be present for each of PB's standard sort compare types.
  ;A line for each parameter-constant followed by the address of its equality function.
  Data.i 11 ;1 based count of equality functions
  Data.i #PB_Byte     , @_isEqual_byte()     
  Data.i #PB_Word     , @_isEqual_word()     
  Data.i #PB_Long     , @_isEqual_long()     
  Data.i #PB_Integer  , @_isEqual_Integer()
  Data.i #PB_Quad     , @_isEqual_quad()     
  Data.i #PB_Float    , @_isEqual_float()    
  Data.i #PB_Double   , @_isEqual_double()   
  Data.i #PB_Ascii    , @_isEqual_ascii()    
  Data.i #PB_Character, @_isEqual_character()
  Data.i #PB_Unicode  , @_isEqual_unicode()  
  Data.i #PB_String   , @_isEqual_string()   
EndDataSection

Prototype.i equalityFunction(*element_1, *element_2)

Global Dim *equalityFunctionAdr(0)

Procedure setupEqualityFunctions()
  Shared *equalityFunctionAdr()
  Protected i, equalityFunctionCount, dataType

  Restore _standardEqualityFunctions
  Read.i equalityFunctionCount
  For i = 1 To equalityFunctionCount
    Read.i dataType
    If dataType > ArraySize(*equalityFunctionAdr())
      Redim *equalityFunctionAdr(dataType)
    EndIf
    Read.i *equalityFunctionAdr(dataType)
  Next 
EndProcedure

setupEqualityFunctions()

Structure sortFieldSpecs
  offset.i
  type.i
  option.i
EndStructure

;The multiSort procedure created ends with the structure name i.e. multiSort_struc()
;The multiSort procedure has parameters {arr(), sortSpecs() [, rangeStart, rangeEnd]}.
;rangeStart and rangeEnd are optional and if either of them is specified they both
;must be specified.
Macro mcr_multiSort(struc)
  ;multiSort() and subSort() only are set up for use with the given structure
  Procedure subSort_#struc(Array x.struc(1), Array sortFieldSpecs.sortFieldSpecs(1), sortLevel, rangeStart, rangeEnd)
    Protected i, firstElement = rangeStart, prevSortLevel = sortLevel - 1
    Protected equalityFunc.equalityFunction = *equalityFunctionAdr(sortFieldSpecs(prevSortLevel)\type)
    
    For i = rangeStart + 1 To rangeEnd
      If equalityFunc(@x(firstElement) + sortFieldSpecs(prevSortLevel)\offset, @x(i) + sortFieldSpecs(prevSortLevel)\offset) <> #isEqualTo
        If i - firstElement > 1
          SortStructuredArray(x(), sortFieldSpecs(sortLevel)\option, sortFieldSpecs(sortLevel)\offset, sortFieldSpecs(sortLevel)\type, firstElement, i - 1)
          If sortLevel < ArraySize(sortFieldSpecs())
            subSort_#struc(x(), sortFieldSpecs(), sortLevel + 1, firstElement, i - 1)
          EndIf 
        EndIf
        firstElement = i
      EndIf
    Next
    
    If rangeEnd - firstElement > 0
      SortStructuredArray(x(), sortFieldSpecs(sortLevel)\option, sortFieldSpecs(sortLevel)\offset, sortFieldSpecs(sortLevel)\type, firstElement, rangeEnd)
      If sortLevel < ArraySize(sortFieldSpecs())
        subSort_#struc(x(), sortFieldSpecs(), sortLevel + 1, firstElement, rangeEnd)
      EndIf
    EndIf 
  EndProcedure
  
  Procedure multiSort_#struc(Array x.struc(1), Array sortFieldSpecs.sortFieldSpecs(1), rangeStart = -1, rangeEnd = -1)
    If rangeStart = -1 Or rangeEnd = -1
      rangeStart = 0: rangeEnd = ArraySize(x())
    EndIf 
    SortStructuredArray(x(), sortFieldSpecs(0)\option, sortFieldSpecs(0)\offset, sortFieldSpecs(0)\type, rangeStart, rangeEnd)
    If ArraySize(sortFieldSpecs()) > 0 And ArraySize(x()) > 1
      subSort_#struc(x(), sortFieldSpecs(), 1, rangeStart, rangeEnd)
    EndIf
  EndProcedure
EndMacro

;This helper macro makes it possible to specify on one line the sort options for an array of 
;sortFieldSpecs at a specified index.
Macro mcr_setSortSpec(arrName, index, _offset, _type, _option = #PB_Sort_Ascending)
  arrName(index)\offset = _offset
  arrName(index)\type = _type
  arrName(index)\option = _option
EndMacro
The macro mcr_setSortSpec() is provided so that the sortSpecs for an index can be specified on a single line.

Here's some demonstration code that uses the include file:

Code: Select all

;Demonstration code for Multi-Field Sort of a Structured Array.
;Author: Demivec
;Written for PureBasic v4.61 
;Date: 3/7/2012

EnableExplicit

IncludeFile "Array_MultiFieldSort.pbi"

Structure myOwn
  a.w
  b$
  c.i
  d$
EndStructure

Procedure display_myOwn(Array x.myOwn(1), header.s = "", footer.s = "")
  Protected i, tw = 8
  Debug header
  Debug LSet("a", tw) + LSet("b$", tw) + LSet("c", tw) + LSet("d$", tw)
  Debug LSet("---", tw) + LSet("---", tw) + LSet("---", tw) + LSet("---", tw)
  For i = 0 To ArraySize(x())
    Debug LSet(Str(x(i)\a), tw) + LSet(x(i)\b$, tw) + LSet(Str(x(i)\c), tw) + LSet(x(i)\d$, tw)
  Next
  Debug footer
EndProcedure

;-sort routines
mcr_multiSort(myOwn) ;create procedure multiSort_myOwn()

;-setup random data
Define i, entryCount = 10000
Dim myOwn.myOwn(entryCount)
For i = 0 To entryCount
  myOwn(i)\a = Random(10)
  myOwn(i)\b$ = Chr(Random(10) + 65)
  myOwn(i)\c = (Random(10) + 1) * 20000
  myOwn(i)\d$ = Chr(Random(10) + 65)
Next 

display_myOwn(myOwn(), "-- Before Sort --")

;now for some heavy duty sorting
Dim myOwn_sortSpecs.sortFieldSpecs(3) ;Array should be sized to be one less than the number of fields to be sorted

mcr_setSortSpec(myOwn_sortSpecs, 0, OffsetOf(myOwn\d$), #PB_String, #PB_Sort_Ascending)
mcr_setSortSpec(myOwn_sortSpecs, 1, OffsetOf(myOwn\b$), #PB_String, #PB_Sort_Descending)
mcr_setSortSpec(myOwn_sortSpecs, 2, OffsetOf(myOwn\a), #PB_Word, #PB_Sort_Descending)
mcr_setSortSpec(myOwn_sortSpecs, 3, OffsetOf(myOwn\c), #PB_Integer, #PB_Sort_Ascending)

multiSort_myOwn(myOwn(), myOwn_sortSpecs())

display_myOwn(myOwn(), "-- Sort by: d$ A, b$ D, a.w D, c.i A --") ;'A' and 'D' represent ascending and descending order
To resort with different fields, options, or the number of fields just redimension the sortSpecs() array and fill it in with the new options.


@Edit: separated the previous code into an include file and demonstration code.
Created a procedure for the initialization code so that the variables used for setup could be kept local to the procedure, leaving only one global variable defined.

Modified the multiSort() procedure to allow an optional range to be specified.


Here is the code arranged as a module with the demonstration code included:

Code: Select all

;Program: Multi-Field Sort for Structured Arrays.
;Programme: tri d'un tableau structuré sur plusieurs champs.
;Name:  Array_MultiFieldSort.pbi
;Author: Demivec
;Date: 3/12/2012
;Written for PureBasic v4.61, updated for v5.40b3
;Translations of comments into French made with the help of GoogleTranslate.

EnableExplicit

DeclareModule amf_sort
  EnableExplicit
  
  Enumeration   
    ;The standard equality functions return one of the following comparison results.
    ;Les fonctions d'égalité standard retournent l'un des résultats de comparaison suivants.
    #isNotEqualTo = 0
    #isEqualTo
  EndEnumeration
  
  Structure sortFieldSpecs
    offset.i
    type.i
    option.i
  EndStructure
  
  Prototype.i equalityFunction(*element_1, *element_2)
  
  Global Dim *equalityFunctionAdr(0)
  
  ;A multiSort procedure will be created that includes the structure name, i.e. multiSort_struc().
  ;The procedure multiSort will have parameters {arr.struc(), sortSpecs.sortFieldSpecs() [, rangeStart, rangeEnd]}.
  ; arr.struc() is an array of the structure specified when using the create macro.
  ; sortSpecs.sortFieldSpecs() is an array that specifies each field to sort, its offset, type, and
  ; method (methods uses the same values as PB's sort functions).
  ; rangeStart and rangeEnd are optional and if either of them is specified they both
  ; must be specified.
  ;Une procédure 'multiSort_struc' sera créé qui comprend le nom de la structure, i.e. multiSort_struc().
  ;La procédure, 'multiSort_struc', aura paramètres {arr.struc(), sortSpecs() [, rangeStart, rangeEnd]}.
  ; arr.struc() est un tableau de la structure spécifiée lors de l'utilisation de la créer macro.
  ; sortSpecs.sortFieldSpecs() est un tableau qui spécifie chaque champ pour trier, son décalage, le type et
  ; la méthode (méthodes utilise les mêmes valeurs que les fonctions de tri PB) 
  ; rangeStart et rangeEnd sont facultatives et si l'un d'entre eux est spécifié qu'ils fois
  ; doit être spécifié.

  Macro mcr_createMultiSort(struc)
    ;The procedure multiSort_struc() is created for use with a specific structure.
    ;The macro has to be called for each specific structure that needs to be sorted.
    ;La procédure multiSort_struc() est créé pour être utilisé avec une structure spécifique.
    ;La macro doit être appelée pour chaque structure spécifique qui doit être trié.
    Procedure subSort_#struc(Array x.struc(1), Array sortFieldSpecs.amf_sort::sortFieldSpecs(1), sortLevel, rangeStart, rangeEnd)
      Protected i, firstElement = rangeStart, prevSortLevel = sortLevel - 1
      Protected equalityFunc.amf_sort::equalityFunction = amf_sort::*equalityFunctionAdr(sortFieldSpecs(prevSortLevel)\type)
      
      For i = rangeStart + 1 To rangeEnd
        If equalityFunc(@x(firstElement) + sortFieldSpecs(prevSortLevel)\offset, @x(i) + sortFieldSpecs(prevSortLevel)\offset) <> amf_sort::#isEqualTo
          If i - firstElement > 1
            SortStructuredArray(x(), sortFieldSpecs(sortLevel)\option, sortFieldSpecs(sortLevel)\offset, sortFieldSpecs(sortLevel)\type, firstElement, i - 1)
            If sortLevel < ArraySize(sortFieldSpecs())
              subSort_#struc(x(), sortFieldSpecs(), sortLevel + 1, firstElement, i - 1)
            EndIf 
          EndIf
          firstElement = i
        EndIf
      Next
      
      If rangeEnd - firstElement > 0
        SortStructuredArray(x(), sortFieldSpecs(sortLevel)\option, sortFieldSpecs(sortLevel)\offset, sortFieldSpecs(sortLevel)\type, firstElement, rangeEnd)
        If sortLevel < ArraySize(sortFieldSpecs())
          subSort_#struc(x(), sortFieldSpecs(), sortLevel + 1, firstElement, rangeEnd)
        EndIf
      EndIf 
    EndProcedure
    
    Procedure multiSort_#struc(Array x.struc(1), Array sortFieldSpecs.amf_sort::sortFieldSpecs(1), rangeStart = -1, rangeEnd = -1)
      If rangeStart = -1 Or rangeEnd = -1
        rangeStart = 0: rangeEnd = ArraySize(x())
      EndIf 
      SortStructuredArray(x(), sortFieldSpecs(0)\option, sortFieldSpecs(0)\offset, sortFieldSpecs(0)\type, rangeStart, rangeEnd)
      If ArraySize(sortFieldSpecs()) > 0 And ArraySize(x()) > 1
        subSort_#struc(x(), sortFieldSpecs(), 1, rangeStart, rangeEnd)
      EndIf
    EndProcedure
  EndMacro
  
  ;This helper macro makes it possible to specify on one line the sort options for an array of 
  ;sortFieldSpecs at a specified index.
  ;Cette macro auxiliaire permet de spécifier sur une seule ligne les options de tri pour un tableau de
  ;sortFieldSpecs à un index spécifié.
  Macro mcr_setSortSpec(arrName, index, _offset, _type, _option = #PB_Sort_Ascending)
    arrName(index)\offset = _offset
    arrName(index)\type = _type
    arrName(index)\option = _option
  EndMacro  
EndDeclareModule

Module amf_sort
  ;=================================================================================
  ; Description: Each of the equality functions compares two elements that are a
  ;              PureBasic standard data type for equality and returns a boolean
  ;              result.  The elements are at addresses [*element_1] and [*element_2].
  ;
  ; Returns: Returns '#isEqualTo' if the first element is equal to the second element,
  ;          otherwise it returns '#isNotEqualTo'.
  ;
  ; Note: Each equality function can only operate on one data type.  Care should be
  ;       taken so that the correct function is used.  These functions are used by
  ;       the subSort() function.
  ;=================================================================================
  ; Description: Chacune des fonctions d'égalité compare deux éléments qui sont un
  ;              type de données standard PureBasic pour l'égalité et retourne un
  ;              résultat booléen. Les éléments sont à des adresses [*element_1] et [*element_2] 
  ;
  ; Retours: Retours '#isEqualTo' si le premier élément est égale à la deuxième élément,
  ;          sinon il retourne '#isNotEqualTo'. 
  ;
  ;Remarque: Chaque fonction de l'égalité ne peut fonctionner que sur un seul type. Il faut prendre soin
  ;          de sorte que la functin correct est utilisé. Ces fonctions sont utilisées
  ;          par la fonction subSort().
    ;=================================================================================
  Macro mcr_equality_standardTypes(type, subStructure)
    Procedure _isEqual_#type(*element_1.type, *element_2.type)
      If *element_1\subStructure = *element_2\subStructure
        ProcedureReturn #isEqualTo
      EndIf 
      ProcedureReturn #isNotEqualTo
    EndProcedure
  EndMacro
  
  mcr_equality_standardTypes(byte, b)
  mcr_equality_standardTypes(word, w)
  mcr_equality_standardTypes(long, l)
  mcr_equality_standardTypes(integer, i)
  mcr_equality_standardTypes(quad, q)
  mcr_equality_standardTypes(float, f)
  mcr_equality_standardTypes(double, d)
  mcr_equality_standardTypes(ascii, a)
  mcr_equality_standardTypes(character, c)
  mcr_equality_standardTypes(unicode, u)
  mcr_equality_standardTypes(string, s)
  
  DataSection
    _standardEqualityFunctions:
    ;Equality functions for each of the standard element types.
    ;Data must be present for each of PB's standard data types.
    ;A line for each parameter-constant followed by the address of its equality function.
    ;Fonctions de l'égalité pour chacun des types d'éléments standard.
    ;Données doit être présent pour chacun des types de données standard de PB.
    ;Une ligne pour chaque paramètre-constant suivi de l'adresse de sa fonction d'égalité.
    Data.i 11 ;1 based count of equality functions
    Data.i #PB_Byte     , @_isEqual_byte()     
    Data.i #PB_Word     , @_isEqual_word()     
    Data.i #PB_Long     , @_isEqual_long()     
    Data.i #PB_Integer  , @_isEqual_Integer()
    Data.i #PB_Quad     , @_isEqual_quad()     
    Data.i #PB_Float    , @_isEqual_float()    
    Data.i #PB_Double   , @_isEqual_double()   
    Data.i #PB_Ascii    , @_isEqual_ascii()    
    Data.i #PB_Character, @_isEqual_character()
    Data.i #PB_Unicode  , @_isEqual_unicode()  
    Data.i #PB_String   , @_isEqual_string()   
  EndDataSection
  
  Global Dim *equalityFunctionAdr(0)
  
  Procedure setupEqualityFunctions()
    Shared *equalityFunctionAdr()
    Protected i, equalityFunctionCount, dataType
    
    Restore _standardEqualityFunctions
    Read.i equalityFunctionCount
    For i = 1 To equalityFunctionCount
      Read.i dataType
      If dataType > ArraySize(*equalityFunctionAdr())
        ReDim *equalityFunctionAdr(dataType)
      EndIf
      Read.i *equalityFunctionAdr(dataType)
    Next 
  EndProcedure
  
  setupEqualityFunctions()
  
EndModule

;Demonstration code.
CompilerIf #PB_Compiler_IsMainFile = 1
  ;Demonstration code for sorting a Structured Array on multiple fields.
  ;Code de démonstration pour le tri d'un tableau structuré sur plusieurs champs.
  ;Author: Demivec
  ;Written for PureBasic v5.40b3
  ;Date: 9/8/2015
  
  EnableExplicit
  
  
  Structure myOwn
    a.w
    b$
    c.i
    d$
  EndStructure
  
  Procedure display_myOwn(Array x.myOwn(1), header.s = "", footer.s = "")
    Protected i, tw = 16
    Debug header
    Debug LSet("a", tw) + LSet("b$", tw) + LSet("c", tw) + LSet("d$", tw)
    Debug LSet("---", tw) + LSet("---", tw) + LSet("-----------", tw) + LSet("---", tw)
    For i = 0 To ArraySize(x())
      Debug LSet(Str(x(i)\a), tw) + LSet(x(i)\b$, tw) + LSet(Str(x(i)\c), tw) + LSet(x(i)\d$, tw)
    Next
    Debug footer
  EndProcedure
  
  ;create procedure multiSort_myOwn()
  ;créer procédure multiSort_myOwn()
  amf_sort::mcr_createMultiSort(myOwn) 
  
  ;setup random data
  ;configuration des données aléatoires
  Define i, entryCount = 10000
  Dim myOwn.myOwn(entryCount)
  For i = 0 To entryCount
    myOwn(i)\a = Random(10)
    myOwn(i)\b$ = Chr(Random(10) + 65)
    myOwn(i)\c = (Random(10) + 1) * 20000
    myOwn(i)\d$ = Chr(Random(10) + 65)
  Next 
  
  display_myOwn(myOwn(), "-- Before Sort --")
  
  ;Now for some heavy duty sorting.
  ;Maintenant pour certains poids lourds tri.
  
  ;Array size should be equal to (number of fields to be sorted) - 1
  ;Taille de tableau doit être égal au (nombre de champs à trier) - 1
  Dim myOwn_sortSpecs.amf_sort::sortFieldSpecs(3)
  
  ;index 0 is primary key, index 1 is secondary key, and so forth
  ;index 0 est la clé primaire, l'index 1 est la clé secondaire, etc.
  amf_sort::mcr_setSortSpec(myOwn_sortSpecs, 0, OffsetOf(myOwn\d$), TypeOf(myOwn\d$), #PB_Sort_Ascending)
  amf_sort::mcr_setSortSpec(myOwn_sortSpecs, 1, OffsetOf(myOwn\b$), TypeOf(myOwn\b$), #PB_Sort_Descending)
  amf_sort::mcr_setSortSpec(myOwn_sortSpecs, 2, OffsetOf(myOwn\a), TypeOf(myOwn\a), #PB_Sort_Descending)
  amf_sort::mcr_setSortSpec(myOwn_sortSpecs, 3, OffsetOf(myOwn\c), TypeOf(myOwn\c), #PB_Sort_Ascending)
  
  multiSort_myOwn(myOwn(), myOwn_sortSpecs())
  
  Debug "================================================================="
  ;'A' and 'D' represent ascending and descending order
  ;'A' and 'D' représenter ordre croissant ou décroissant
  display_myOwn(myOwn(), "-- Sort by: d$ A, b$ D, a.w D, c.i A --")
  
CompilerEndIf
@Edit: corrected a small bug that skipped sorts of only 2 elements if they occurred at the end of a sub-sort range. Made a very minor change to cleanup code.

Added a default option of #PB_Sort_Ascending for the mcr_setSortSpec() macro. Thanks for skywalk for reminding me of this possibility.

@Edit2: Added the include in the form of a module. Proper use still requires using a macro (defined in the module) that will define a procedure for a particular multi-field structure, then calling the procedure produced by that macro for the actual sorting of arrays with that structure.
Last edited by Demivec on Sat Oct 03, 2015 2:11 am, edited 4 times in total.
User avatar
skywalk
Addict
Addict
Posts: 4211
Joined: Wed Dec 23, 2009 10:14 pm
Location: Boston, MA

Re: Multi-field sorting for a structured array

Post by skywalk »

Very cool Demivec!
I wish this was natively supported as in SortStructuredList().
The nice thing about standards is there are so many to choose from. ~ Andrew Tanenbaum
User avatar
einander
Enthusiast
Enthusiast
Posts: 744
Joined: Thu Jun 26, 2003 2:09 am
Location: Spain (Galicia)

Re: Multi-field sorting for a structured array

Post by einander »

Thanks Demivec. Great code and very useful!
User avatar
Demivec
Addict
Addict
Posts: 4260
Joined: Mon Jul 25, 2005 3:51 pm
Location: Utah, USA

Re: Multi-field sorting for a structured array

Post by Demivec »

@skywalk, @einander: Thanks.

I've made some slight modifications to the first post that includes some useful structuring of the code with a slight improvement as well. It should be ready for primetime now. :)
User avatar
skywalk
Addict
Addict
Posts: 4211
Joined: Wed Dec 23, 2009 10:14 pm
Location: Boston, MA

Re: Multi-field sorting for a structured array

Post by skywalk »

Hi Demivec.
Trying to use this I found a weird case where #PB_Sort_NoCase doesn't affect the outcome?
Am I doing something wrong or is there a bug in the SortStructuredArray() when Or'ing options?
So many macros it's tough to debug... :(
Where are you Macro Expander :evil:

Code: Select all

;Program: Multi-Field Sort for Structured Arrays.
;Name:  Array_MultiFieldSort.pbi
;Author: Demivec
;Date: 3/7/2012
;Written for PureBasic v4.61

EnableExplicit

;-sort routines
Enumeration   
  ;The standard equality functions return one of the following comparison
  ;results.
  #isNotEqualTo = 0
  #isEqualTo
EndEnumeration

;=================================================================================
; Description: Each of the equality functions compares two elements that are a
;              PureBasic standard data type for equality and returns a boolean
;              result.  The elements are at addresses [*element_1] and [*element_2].
;
; Returns: Returns '#isEqualTo' if the first element is equal to
;          element two, otherwise it returns '#isNotEqualTo'.
;
; Note: A given equality function can only operate on compatible data and so care
;       must be taken in using the correct function.  These functions are only
;       used by the subSort() function.
;=================================================================================
Macro mcr_equality_standardTypes(type, subStructure)
  Procedure _isEqual_#type(*element_1.type, *element_2.type)
    If *element_1\subStructure = *element_2\subStructure
      ProcedureReturn #isEqualTo
    EndIf 
    ProcedureReturn #isNotEqualTo
  EndProcedure
EndMacro

mcr_equality_standardTypes(byte, b)
mcr_equality_standardTypes(word, w)
mcr_equality_standardTypes(long, l)
mcr_equality_standardTypes(integer, i)
mcr_equality_standardTypes(quad, q)
mcr_equality_standardTypes(float, f)
mcr_equality_standardTypes(double, d)
mcr_equality_standardTypes(ascii, a)
mcr_equality_standardTypes(character, c)
mcr_equality_standardTypes(unicode, u)
mcr_equality_standardTypes(string, s)

DataSection
  _standardEqualityFunctions:
  ;Equality functions for each of the standard element types.
  ;Data must be present for each of PB's standard sort compare types.
  ;A line for each parameter-constant followed by the address of its equality function.
  Data.i 11 ;1 based count of equality functions
  Data.i #PB_Sort_Byte     , @_isEqual_byte()     
  Data.i #PB_Sort_Word     , @_isEqual_word()     
  Data.i #PB_Sort_Long     , @_isEqual_long()     
  Data.i #PB_Sort_Integer  , @_isEqual_Integer()  ;same as either long or quad, technically this entry is an unnecessary duplicate
  Data.i #PB_Sort_Quad     , @_isEqual_quad()     
  Data.i #PB_Sort_Float    , @_isEqual_float()    
  Data.i #PB_Sort_Double   , @_isEqual_double()   
  Data.i #PB_Sort_Ascii    , @_isEqual_ascii()    
  Data.i #PB_Sort_Character, @_isEqual_character()
  Data.i #PB_Sort_Unicode  , @_isEqual_unicode()  
  Data.i #PB_Sort_String   , @_isEqual_string()   
EndDataSection

Prototype.i equalityFunction(*element_1, *element_2)

Global Dim *equalityFunctionAdr(0)

Procedure setupEqualityFunctions()
  Shared *equalityFunctionAdr()
  Protected i, equalityFunctionCount, dataType

  Restore _standardEqualityFunctions
  Read.i equalityFunctionCount
  For i = 1 To equalityFunctionCount
    Read.i dataType
    If dataType > ArraySize(*equalityFunctionAdr())
      ReDim *equalityFunctionAdr(dataType)
    EndIf
    Read.i *equalityFunctionAdr(dataType)
  Next 
EndProcedure

setupEqualityFunctions()

Structure sortFieldSpecs
  offset.i
  type.i
  option.i
EndStructure

;The multiSort procedure created ends with the structure name i.e. multiSort_struc()
;The multiSort procedure has parameters {arr(), sortSpecs() [, rangeStart, rangeEnd]}.
;rangeStart and rangeEnd are optional and if either of them is specified they both
;must be specified.
Macro mcr_multiSort(struc)
  ; multiSort() and subSort() only are set up for use with the given structure
  Procedure subSort_#struc(Array x.struc(1), Array sortFieldSpecs.sortFieldSpecs(1), sortLevel, rangeStart, rangeEnd)
    Protected i = rangeStart, firstElement, prevSortLevel = sortLevel - 1
    Protected equalityFunc.equalityFunction = *equalityFunctionAdr(sortFieldSpecs(prevSortLevel)\type)
    firstElement = rangeStart
    For i = rangeStart + 1 To rangeEnd
      If equalityFunc(@x(firstElement) + sortFieldSpecs(prevSortLevel)\offset, @x(i) + sortFieldSpecs(prevSortLevel)\offset) <> #isEqualTo
        If i - firstElement > 1
          SortStructuredArray(x(), sortFieldSpecs(sortLevel)\option, sortFieldSpecs(sortLevel)\offset, sortFieldSpecs(sortLevel)\type, firstElement, i - 1)
          If sortLevel < ArraySize(sortFieldSpecs())
            subSort_#struc(x(), sortFieldSpecs(), sortLevel + 1, firstElement, i - 1)
          EndIf 
        EndIf
        firstElement = i
      EndIf
    Next
    If rangeEnd - firstElement > 1
      SortStructuredArray(x(), sortFieldSpecs(sortLevel)\option, sortFieldSpecs(sortLevel)\offset, sortFieldSpecs(sortLevel)\type, firstElement, rangeEnd)
      If sortLevel < ArraySize(sortFieldSpecs())
        subSort_#struc(x(), sortFieldSpecs(), sortLevel + 1, firstElement, rangeEnd)
      EndIf
    EndIf 
  EndProcedure
  
  Procedure multiSort_#struc(Array x.struc(1), Array sortFieldSpecs.sortFieldSpecs(1), rangeStart = -1, rangeEnd = -1)
    If rangeStart = -1 Or rangeEnd = -1
      rangeStart = 0: rangeEnd = ArraySize(x())
    EndIf 
    SortStructuredArray(x(), sortFieldSpecs(0)\option, sortFieldSpecs(0)\offset, sortFieldSpecs(0)\type, rangeStart, rangeEnd)
    If ArraySize(sortFieldSpecs()) > 0 And ArraySize(x()) > 1
      subSort_#struc(x(), sortFieldSpecs(), 1, rangeStart, rangeEnd)
    EndIf
  EndProcedure
EndMacro

; This helper macro makes it possible to specify on one line the sort options for an array of 
; sortFieldSpecs at a specified index.
Macro mcr_setSortSpec(arrName, index, _offset, _type, _option=#PB_Sort_Ascending)
  arrName(index)\offset = _offset
  arrName(index)\type = _type
  arrName(index)\option = _option
EndMacro

;-{ TEST
EnableExplicit
;IncludeFile "Array_MultiFieldSort.pbi"
Structure myOwn
  a.i
  b.d
  c$
EndStructure

Procedure display_myOwn(Array x.myOwn(1), header.s = "", footer.s = "")
  Protected i, tw = 8
  Debug header
  Debug LSet("a", tw) + LSet("b", tw) + LSet("c$", tw)
  Debug LSet("---", tw) + LSet("---", tw) + LSet("---", tw)
  For i = 0 To ArraySize(x())
    Debug LSet(Str(x(i)\a), tw) + LSet(Str(x(i)\b), tw) + LSet(x(i)\c$, tw)
  Next
  Debug footer
EndProcedure

mcr_multiSort(myOwn) ; creates procedure multiSort_myOwn()

Define.i i, entryCount
Define.s r$
Restore SortThis
Read.i entryCount
Dim myOwn.myOwn(entryCount-1)
For i = 0 To entryCount-1
  Read.s r$: myOwn(i)\a = Val(r$)
  Read.s r$: myOwn(i)\b = ValD(r$)
  Read.s myOwn(i)\c$
Next 
display_myOwn(myOwn(), "-- Before Sort --")

Dim myOwn_sortSpecs.sortFieldSpecs(2) ; Array should be sized to be one less than the number of fields to be sorted
mcr_setSortSpec(myOwn_sortSpecs, 0, OffsetOf(myOwn\a), #PB_Sort_Integer)
mcr_setSortSpec(myOwn_sortSpecs, 1, OffsetOf(myOwn\b), #PB_Sort_Double)
;mcr_setSortSpec(myOwn_sortSpecs, 2, OffsetOf(myOwn\c$), #PB_Sort_String)
mcr_setSortSpec(myOwn_sortSpecs, 2, OffsetOf(myOwn\c$), #PB_Sort_String,#PB_Sort_Ascending|#PB_Sort_NoCase)

multiSort_myOwn(myOwn(), myOwn_sortSpecs())
display_myOwn(myOwn(), "-- Sort by: a.i+, b.d+, c$+ --")
DataSection
  SortThis:
  Data.i 12
  ;       a,   b,   c$
  Data.s "1", "9", "1"
  Data.s "2", "2", "5"
  Data.s "3", "1", "b"
  Data.s "4", "1", "a"
  Data.s "5", "3", "z"
  Data.s "7", "3", "y"
  Data.s "6", "3", "3"
  Data.s "8", "3", "2"
  Data.s "9", "3", "x"
  Data.s "7", "5", "a"  ;<- The problem of a 3rd sort appears when both Field1 and Field2 are duplicates
  Data.s "7", "5", "W"  ;<- Field1 = primary, Field2 = secondary, etc.
  Data.s "7", "4", "x"
EndDataSection
; mcr_setSortSpec(myOwn_sortSpecs, 2, OffsetOf(myOwn\c$), #PB_Sort_String)
; -- Sort by: a.i+, b.d+, c$+ --
; a       b       c$      
; ---     ---     ---     
; 1       9       1       
; 2       2       5       
; 3       1       b       
; 4       1       a       
; 5       3       z       
; 6       3       3       
; 7       3       y       
; 7       4       x       
; 7       5       W       ;<-- 3rd Column Sort OK
; 7       5       a       ;<-- because 'W' < 'a'
; 8       3       2       
; 9       3       x       

; mcr_setSortSpec(myOwn_sortSpecs, 2, OffsetOf(myOwn\c$), #PB_Sort_String,#PB_Sort_Ascending|#PB_Sort_NoCase)
; -- Sort by: a.i+, b.d+, c$+ --
; a       b       c$      
; ---     ---     ---     
; 1       9       1       
; 2       2       5       
; 3       1       b       
; 4       1       a       
; 5       3       z       
; 6       3       3       
; 7       3       y       
; 7       4       x       
; 7       5       W       ;<-- 3rd Column Sort FAIL using '#PB_Sort_NoCase'
; 7       5       a       ;<-- c$ should be 'a' then 'W'
; 8       3       2       
; 9       3       x       
The nice thing about standards is there are so many to choose from. ~ Andrew Tanenbaum
User avatar
Demivec
Addict
Addict
Posts: 4260
Joined: Mon Jul 25, 2005 3:51 pm
Location: Utah, USA

Re: Multi-field sorting for a structured array

Post by Demivec »

skywalk wrote:Trying to use this I found a weird case where #PB_Sort_NoCase doesn't affect the outcome?
Am I doing something wrong or is there a bug in the SortStructuredArray() when Or'ing options?
Thanks for reporting your results.

I have corrected a bug that skipped sorts of only 2 elements if they occurred at the end of a sub-sort range.

I also added a default option of #PB_Sort_Ascending for the mcr_setSortSpec(). Thanks for reminding me of this possibility.

Everything should be okeedokee now. :)
User avatar
skywalk
Addict
Addict
Posts: 4211
Joined: Wed Dec 23, 2009 10:14 pm
Location: Boston, MA

Re: Multi-field sorting for a structured array

Post by skywalk »

Great, that fixed it. :!:
Ha!, Sorry I couldn't find this: If rangeEnd - firstElement > 0 ; Change 1 to 0
The nice thing about standards is there are so many to choose from. ~ Andrew Tanenbaum
SeregaZ
Enthusiast
Enthusiast
Posts: 628
Joined: Fri Feb 20, 2009 9:24 am
Location: Almaty (Kazakhstan. not Borat, but Triple G)
Contact:

Re: Multi-field sorting for a structured array

Post by SeregaZ »

how i can use SortStructuredArray with a non standart sorting? i mean #PB_Sort_String is sorting by 123abc, how i can sort with this type - abc123?
User avatar
Michael Vogel
Addict
Addict
Posts: 2798
Joined: Thu Feb 09, 2006 11:27 pm
Contact:

Re: Multi-field sorting for a structured array

Post by Michael Vogel »

Is there an alternative for doing such cool things when using different languages with special chars (like german umlauts)?

The following example shows the problem ('ü' should be sorted near 'u')...

Code: Select all

;-setup random data
Define i, entryCount = 1000
Dim myOwn.myOwn(entryCount)
For i = 0 To entryCount
	myOwn(i)\a = Random(10)
	myOwn(i)\b$ = Chr(Random(10) + 65)
	myOwn(i)\c = (Random(10) + 1) * 20000
	myOwn(i)\d$ = Mid("abcdefghijklmnopöäüßABCDEFGHIJKLMNOPQÖÄÜ",Random(39)+1,1)
Next

display_myOwn(myOwn(), "-- Before Sort --")

;now for some heavy duty sorting
Dim myOwn_sortSpecs.sortFieldSpecs(3) ;Array should be sized to be one less than the number of fields to be sorted

mcr_setSortSpec(myOwn_sortSpecs, 0, OffsetOf(myOwn\d$), #PB_String, #PB_Sort_Ascending)
mcr_setSortSpec(myOwn_sortSpecs, 1, OffsetOf(myOwn\b$), #PB_String, #PB_Sort_Descending)
mcr_setSortSpec(myOwn_sortSpecs, 2, OffsetOf(myOwn\a), #PB_Word, #PB_Sort_Descending)
mcr_setSortSpec(myOwn_sortSpecs, 3, OffsetOf(myOwn\c), #PB_Integer, #PB_Sort_Ascending)

multiSort_myOwn(myOwn(), myOwn_sortSpecs())

display_myOwn(myOwn(), "-- Sort by: d$ A, b$ D, a.w D, c.i A --") ;'A' and 'D' represent ascending and descending order

wilbert
PureBasic Expert
PureBasic Expert
Posts: 3942
Joined: Sun Aug 08, 2004 5:21 am
Location: Netherlands

Re: Multi-field sorting for a structured array

Post by wilbert »

Michael Vogel wrote:Is there an alternative for doing such cool things when using different languages with special chars (like german umlauts)?
With unicode, the easiest solution is to create a custom compare procedure which uses a lookup table for the characters and compare those.
If each character can be replaced by one, it's very easy. If you also want for example ß to be treated as ss, it's a bit more complicated but still possible.
Windows (x64)
Raspberry Pi OS (Arm64)
Post Reply