Multi-field sorting for a structured array
Posted: Thu Mar 08, 2012 4:05 am
Code updated for 5.20+
Here is some code I created for sorting a structured array on multiple fields. This code is needed because sorting an array in PB is not performed with a stable sort. The procedures I created below will allow you to sort on multiple fields at once (so to speak). It uses PureBasic's SortStructuredArray() function to do the main work but selectively calls it only over certain ranges and with selected parameters to accomplish the desired multi-field sort.
Because PB's SortStructuredArray() function requires a specific structured array parameter and not a pointer to an array I have written a macro that creates the needed procedures for an array of a specific structure. For instance to create the procedures to be used for arrays with the structure 'point' you would call the macro mcr_multiSort(point). This would create the procedure multiSort_point() that would be called with parameters as multiSort_point(a.point(), sortSpecs()).
The sortSpecs() array contains the details of the sorting for each field being sorted. Its dimensions must be one less than the number of fields being sorted. It is structured with the structure sortFieldSpecs which contains entries for the offset of the field, it's type, and the option for sorting. Each of those fields correspond to one of the parameters in PB's SortStructuredArray() function.
I've arranged the code as an include:
The macro mcr_setSortSpec() is provided so that the sortSpecs for an index can be specified on a single line.
Here's some demonstration code that uses the include file:
To resort with different fields, options, or the number of fields just redimension the sortSpecs() array and fill it in with the new options.
@Edit: separated the previous code into an include file and demonstration code.
Created a procedure for the initialization code so that the variables used for setup could be kept local to the procedure, leaving only one global variable defined.
Modified the multiSort() procedure to allow an optional range to be specified.
Here is the code arranged as a module with the demonstration code included:
@Edit: corrected a small bug that skipped sorts of only 2 elements if they occurred at the end of a sub-sort range. Made a very minor change to cleanup code.
Added a default option of #PB_Sort_Ascending for the mcr_setSortSpec() macro. Thanks for skywalk for reminding me of this possibility.
@Edit2: Added the include in the form of a module. Proper use still requires using a macro (defined in the module) that will define a procedure for a particular multi-field structure, then calling the procedure produced by that macro for the actual sorting of arrays with that structure.
Here is some code I created for sorting a structured array on multiple fields. This code is needed because sorting an array in PB is not performed with a stable sort. The procedures I created below will allow you to sort on multiple fields at once (so to speak). It uses PureBasic's SortStructuredArray() function to do the main work but selectively calls it only over certain ranges and with selected parameters to accomplish the desired multi-field sort.
Because PB's SortStructuredArray() function requires a specific structured array parameter and not a pointer to an array I have written a macro that creates the needed procedures for an array of a specific structure. For instance to create the procedures to be used for arrays with the structure 'point' you would call the macro mcr_multiSort(point). This would create the procedure multiSort_point() that would be called with parameters as multiSort_point(a.point(), sortSpecs()).
The sortSpecs() array contains the details of the sorting for each field being sorted. Its dimensions must be one less than the number of fields being sorted. It is structured with the structure sortFieldSpecs which contains entries for the offset of the field, it's type, and the option for sorting. Each of those fields correspond to one of the parameters in PB's SortStructuredArray() function.
I've arranged the code as an include:
Code: Select all
;Program: Multi-Field Sort for Structured Arrays.
;Name: Array_MultiFieldSort.pbi
;Author: Demivec
;Date: 3/12/2012
;Written for PureBasic v4.61
EnableExplicit
;-sort routines
Enumeration
;The standard equality functions return one of the following comparison
;results.
#isNotEqualTo = 0
#isEqualTo
EndEnumeration
;=================================================================================
; Description: Each of the equality functions compares two elements that are a
; PureBasic standard data type for equality and returns a boolean
; result. The elements are at addresses [*element_1] and [*element_2].
;
; Returns: Returns '#isEqualTo' if the first element is equal to
; element two, otherwise it returns '#isNotEqualTo'.
;
; Note: A given equality function can only operate on compatible data and so care
; must be taken in using the correct function. These functions are only
; used by the subSort() function.
;=================================================================================
Macro mcr_equality_standardTypes(type, subStructure)
Procedure _isEqual_#type(*element_1.type, *element_2.type)
If *element_1\subStructure = *element_2\subStructure
ProcedureReturn #isEqualTo
EndIf
ProcedureReturn #isNotEqualTo
EndProcedure
EndMacro
mcr_equality_standardTypes(byte, b)
mcr_equality_standardTypes(word, w)
mcr_equality_standardTypes(long, l)
mcr_equality_standardTypes(integer, i)
mcr_equality_standardTypes(quad, q)
mcr_equality_standardTypes(float, f)
mcr_equality_standardTypes(double, d)
mcr_equality_standardTypes(ascii, a)
mcr_equality_standardTypes(character, c)
mcr_equality_standardTypes(unicode, u)
mcr_equality_standardTypes(string, s)
DataSection
_standardEqualityFunctions:
;Equality functions for each of the standard element types.
;Data must be present for each of PB's standard sort compare types.
;A line for each parameter-constant followed by the address of its equality function.
Data.i 11 ;1 based count of equality functions
Data.i #PB_Byte , @_isEqual_byte()
Data.i #PB_Word , @_isEqual_word()
Data.i #PB_Long , @_isEqual_long()
Data.i #PB_Integer , @_isEqual_Integer()
Data.i #PB_Quad , @_isEqual_quad()
Data.i #PB_Float , @_isEqual_float()
Data.i #PB_Double , @_isEqual_double()
Data.i #PB_Ascii , @_isEqual_ascii()
Data.i #PB_Character, @_isEqual_character()
Data.i #PB_Unicode , @_isEqual_unicode()
Data.i #PB_String , @_isEqual_string()
EndDataSection
Prototype.i equalityFunction(*element_1, *element_2)
Global Dim *equalityFunctionAdr(0)
Procedure setupEqualityFunctions()
Shared *equalityFunctionAdr()
Protected i, equalityFunctionCount, dataType
Restore _standardEqualityFunctions
Read.i equalityFunctionCount
For i = 1 To equalityFunctionCount
Read.i dataType
If dataType > ArraySize(*equalityFunctionAdr())
Redim *equalityFunctionAdr(dataType)
EndIf
Read.i *equalityFunctionAdr(dataType)
Next
EndProcedure
setupEqualityFunctions()
Structure sortFieldSpecs
offset.i
type.i
option.i
EndStructure
;The multiSort procedure created ends with the structure name i.e. multiSort_struc()
;The multiSort procedure has parameters {arr(), sortSpecs() [, rangeStart, rangeEnd]}.
;rangeStart and rangeEnd are optional and if either of them is specified they both
;must be specified.
Macro mcr_multiSort(struc)
;multiSort() and subSort() only are set up for use with the given structure
Procedure subSort_#struc(Array x.struc(1), Array sortFieldSpecs.sortFieldSpecs(1), sortLevel, rangeStart, rangeEnd)
Protected i, firstElement = rangeStart, prevSortLevel = sortLevel - 1
Protected equalityFunc.equalityFunction = *equalityFunctionAdr(sortFieldSpecs(prevSortLevel)\type)
For i = rangeStart + 1 To rangeEnd
If equalityFunc(@x(firstElement) + sortFieldSpecs(prevSortLevel)\offset, @x(i) + sortFieldSpecs(prevSortLevel)\offset) <> #isEqualTo
If i - firstElement > 1
SortStructuredArray(x(), sortFieldSpecs(sortLevel)\option, sortFieldSpecs(sortLevel)\offset, sortFieldSpecs(sortLevel)\type, firstElement, i - 1)
If sortLevel < ArraySize(sortFieldSpecs())
subSort_#struc(x(), sortFieldSpecs(), sortLevel + 1, firstElement, i - 1)
EndIf
EndIf
firstElement = i
EndIf
Next
If rangeEnd - firstElement > 0
SortStructuredArray(x(), sortFieldSpecs(sortLevel)\option, sortFieldSpecs(sortLevel)\offset, sortFieldSpecs(sortLevel)\type, firstElement, rangeEnd)
If sortLevel < ArraySize(sortFieldSpecs())
subSort_#struc(x(), sortFieldSpecs(), sortLevel + 1, firstElement, rangeEnd)
EndIf
EndIf
EndProcedure
Procedure multiSort_#struc(Array x.struc(1), Array sortFieldSpecs.sortFieldSpecs(1), rangeStart = -1, rangeEnd = -1)
If rangeStart = -1 Or rangeEnd = -1
rangeStart = 0: rangeEnd = ArraySize(x())
EndIf
SortStructuredArray(x(), sortFieldSpecs(0)\option, sortFieldSpecs(0)\offset, sortFieldSpecs(0)\type, rangeStart, rangeEnd)
If ArraySize(sortFieldSpecs()) > 0 And ArraySize(x()) > 1
subSort_#struc(x(), sortFieldSpecs(), 1, rangeStart, rangeEnd)
EndIf
EndProcedure
EndMacro
;This helper macro makes it possible to specify on one line the sort options for an array of
;sortFieldSpecs at a specified index.
Macro mcr_setSortSpec(arrName, index, _offset, _type, _option = #PB_Sort_Ascending)
arrName(index)\offset = _offset
arrName(index)\type = _type
arrName(index)\option = _option
EndMacro
Here's some demonstration code that uses the include file:
Code: Select all
;Demonstration code for Multi-Field Sort of a Structured Array.
;Author: Demivec
;Written for PureBasic v4.61
;Date: 3/7/2012
EnableExplicit
IncludeFile "Array_MultiFieldSort.pbi"
Structure myOwn
a.w
b$
c.i
d$
EndStructure
Procedure display_myOwn(Array x.myOwn(1), header.s = "", footer.s = "")
Protected i, tw = 8
Debug header
Debug LSet("a", tw) + LSet("b$", tw) + LSet("c", tw) + LSet("d$", tw)
Debug LSet("---", tw) + LSet("---", tw) + LSet("---", tw) + LSet("---", tw)
For i = 0 To ArraySize(x())
Debug LSet(Str(x(i)\a), tw) + LSet(x(i)\b$, tw) + LSet(Str(x(i)\c), tw) + LSet(x(i)\d$, tw)
Next
Debug footer
EndProcedure
;-sort routines
mcr_multiSort(myOwn) ;create procedure multiSort_myOwn()
;-setup random data
Define i, entryCount = 10000
Dim myOwn.myOwn(entryCount)
For i = 0 To entryCount
myOwn(i)\a = Random(10)
myOwn(i)\b$ = Chr(Random(10) + 65)
myOwn(i)\c = (Random(10) + 1) * 20000
myOwn(i)\d$ = Chr(Random(10) + 65)
Next
display_myOwn(myOwn(), "-- Before Sort --")
;now for some heavy duty sorting
Dim myOwn_sortSpecs.sortFieldSpecs(3) ;Array should be sized to be one less than the number of fields to be sorted
mcr_setSortSpec(myOwn_sortSpecs, 0, OffsetOf(myOwn\d$), #PB_String, #PB_Sort_Ascending)
mcr_setSortSpec(myOwn_sortSpecs, 1, OffsetOf(myOwn\b$), #PB_String, #PB_Sort_Descending)
mcr_setSortSpec(myOwn_sortSpecs, 2, OffsetOf(myOwn\a), #PB_Word, #PB_Sort_Descending)
mcr_setSortSpec(myOwn_sortSpecs, 3, OffsetOf(myOwn\c), #PB_Integer, #PB_Sort_Ascending)
multiSort_myOwn(myOwn(), myOwn_sortSpecs())
display_myOwn(myOwn(), "-- Sort by: d$ A, b$ D, a.w D, c.i A --") ;'A' and 'D' represent ascending and descending order
@Edit: separated the previous code into an include file and demonstration code.
Created a procedure for the initialization code so that the variables used for setup could be kept local to the procedure, leaving only one global variable defined.
Modified the multiSort() procedure to allow an optional range to be specified.
Here is the code arranged as a module with the demonstration code included:
Code: Select all
;Program: Multi-Field Sort for Structured Arrays.
;Programme: tri d'un tableau structuré sur plusieurs champs.
;Name: Array_MultiFieldSort.pbi
;Author: Demivec
;Date: 3/12/2012
;Written for PureBasic v4.61, updated for v5.40b3
;Translations of comments into French made with the help of GoogleTranslate.
EnableExplicit
DeclareModule amf_sort
EnableExplicit
Enumeration
;The standard equality functions return one of the following comparison results.
;Les fonctions d'égalité standard retournent l'un des résultats de comparaison suivants.
#isNotEqualTo = 0
#isEqualTo
EndEnumeration
Structure sortFieldSpecs
offset.i
type.i
option.i
EndStructure
Prototype.i equalityFunction(*element_1, *element_2)
Global Dim *equalityFunctionAdr(0)
;A multiSort procedure will be created that includes the structure name, i.e. multiSort_struc().
;The procedure multiSort will have parameters {arr.struc(), sortSpecs.sortFieldSpecs() [, rangeStart, rangeEnd]}.
; arr.struc() is an array of the structure specified when using the create macro.
; sortSpecs.sortFieldSpecs() is an array that specifies each field to sort, its offset, type, and
; method (methods uses the same values as PB's sort functions).
; rangeStart and rangeEnd are optional and if either of them is specified they both
; must be specified.
;Une procédure 'multiSort_struc' sera créé qui comprend le nom de la structure, i.e. multiSort_struc().
;La procédure, 'multiSort_struc', aura paramètres {arr.struc(), sortSpecs() [, rangeStart, rangeEnd]}.
; arr.struc() est un tableau de la structure spécifiée lors de l'utilisation de la créer macro.
; sortSpecs.sortFieldSpecs() est un tableau qui spécifie chaque champ pour trier, son décalage, le type et
; la méthode (méthodes utilise les mêmes valeurs que les fonctions de tri PB)
; rangeStart et rangeEnd sont facultatives et si l'un d'entre eux est spécifié qu'ils fois
; doit être spécifié.
Macro mcr_createMultiSort(struc)
;The procedure multiSort_struc() is created for use with a specific structure.
;The macro has to be called for each specific structure that needs to be sorted.
;La procédure multiSort_struc() est créé pour être utilisé avec une structure spécifique.
;La macro doit être appelée pour chaque structure spécifique qui doit être trié.
Procedure subSort_#struc(Array x.struc(1), Array sortFieldSpecs.amf_sort::sortFieldSpecs(1), sortLevel, rangeStart, rangeEnd)
Protected i, firstElement = rangeStart, prevSortLevel = sortLevel - 1
Protected equalityFunc.amf_sort::equalityFunction = amf_sort::*equalityFunctionAdr(sortFieldSpecs(prevSortLevel)\type)
For i = rangeStart + 1 To rangeEnd
If equalityFunc(@x(firstElement) + sortFieldSpecs(prevSortLevel)\offset, @x(i) + sortFieldSpecs(prevSortLevel)\offset) <> amf_sort::#isEqualTo
If i - firstElement > 1
SortStructuredArray(x(), sortFieldSpecs(sortLevel)\option, sortFieldSpecs(sortLevel)\offset, sortFieldSpecs(sortLevel)\type, firstElement, i - 1)
If sortLevel < ArraySize(sortFieldSpecs())
subSort_#struc(x(), sortFieldSpecs(), sortLevel + 1, firstElement, i - 1)
EndIf
EndIf
firstElement = i
EndIf
Next
If rangeEnd - firstElement > 0
SortStructuredArray(x(), sortFieldSpecs(sortLevel)\option, sortFieldSpecs(sortLevel)\offset, sortFieldSpecs(sortLevel)\type, firstElement, rangeEnd)
If sortLevel < ArraySize(sortFieldSpecs())
subSort_#struc(x(), sortFieldSpecs(), sortLevel + 1, firstElement, rangeEnd)
EndIf
EndIf
EndProcedure
Procedure multiSort_#struc(Array x.struc(1), Array sortFieldSpecs.amf_sort::sortFieldSpecs(1), rangeStart = -1, rangeEnd = -1)
If rangeStart = -1 Or rangeEnd = -1
rangeStart = 0: rangeEnd = ArraySize(x())
EndIf
SortStructuredArray(x(), sortFieldSpecs(0)\option, sortFieldSpecs(0)\offset, sortFieldSpecs(0)\type, rangeStart, rangeEnd)
If ArraySize(sortFieldSpecs()) > 0 And ArraySize(x()) > 1
subSort_#struc(x(), sortFieldSpecs(), 1, rangeStart, rangeEnd)
EndIf
EndProcedure
EndMacro
;This helper macro makes it possible to specify on one line the sort options for an array of
;sortFieldSpecs at a specified index.
;Cette macro auxiliaire permet de spécifier sur une seule ligne les options de tri pour un tableau de
;sortFieldSpecs à un index spécifié.
Macro mcr_setSortSpec(arrName, index, _offset, _type, _option = #PB_Sort_Ascending)
arrName(index)\offset = _offset
arrName(index)\type = _type
arrName(index)\option = _option
EndMacro
EndDeclareModule
Module amf_sort
;=================================================================================
; Description: Each of the equality functions compares two elements that are a
; PureBasic standard data type for equality and returns a boolean
; result. The elements are at addresses [*element_1] and [*element_2].
;
; Returns: Returns '#isEqualTo' if the first element is equal to the second element,
; otherwise it returns '#isNotEqualTo'.
;
; Note: Each equality function can only operate on one data type. Care should be
; taken so that the correct function is used. These functions are used by
; the subSort() function.
;=================================================================================
; Description: Chacune des fonctions d'égalité compare deux éléments qui sont un
; type de données standard PureBasic pour l'égalité et retourne un
; résultat booléen. Les éléments sont à des adresses [*element_1] et [*element_2]
;
; Retours: Retours '#isEqualTo' si le premier élément est égale à la deuxième élément,
; sinon il retourne '#isNotEqualTo'.
;
;Remarque: Chaque fonction de l'égalité ne peut fonctionner que sur un seul type. Il faut prendre soin
; de sorte que la functin correct est utilisé. Ces fonctions sont utilisées
; par la fonction subSort().
;=================================================================================
Macro mcr_equality_standardTypes(type, subStructure)
Procedure _isEqual_#type(*element_1.type, *element_2.type)
If *element_1\subStructure = *element_2\subStructure
ProcedureReturn #isEqualTo
EndIf
ProcedureReturn #isNotEqualTo
EndProcedure
EndMacro
mcr_equality_standardTypes(byte, b)
mcr_equality_standardTypes(word, w)
mcr_equality_standardTypes(long, l)
mcr_equality_standardTypes(integer, i)
mcr_equality_standardTypes(quad, q)
mcr_equality_standardTypes(float, f)
mcr_equality_standardTypes(double, d)
mcr_equality_standardTypes(ascii, a)
mcr_equality_standardTypes(character, c)
mcr_equality_standardTypes(unicode, u)
mcr_equality_standardTypes(string, s)
DataSection
_standardEqualityFunctions:
;Equality functions for each of the standard element types.
;Data must be present for each of PB's standard data types.
;A line for each parameter-constant followed by the address of its equality function.
;Fonctions de l'égalité pour chacun des types d'éléments standard.
;Données doit être présent pour chacun des types de données standard de PB.
;Une ligne pour chaque paramètre-constant suivi de l'adresse de sa fonction d'égalité.
Data.i 11 ;1 based count of equality functions
Data.i #PB_Byte , @_isEqual_byte()
Data.i #PB_Word , @_isEqual_word()
Data.i #PB_Long , @_isEqual_long()
Data.i #PB_Integer , @_isEqual_Integer()
Data.i #PB_Quad , @_isEqual_quad()
Data.i #PB_Float , @_isEqual_float()
Data.i #PB_Double , @_isEqual_double()
Data.i #PB_Ascii , @_isEqual_ascii()
Data.i #PB_Character, @_isEqual_character()
Data.i #PB_Unicode , @_isEqual_unicode()
Data.i #PB_String , @_isEqual_string()
EndDataSection
Global Dim *equalityFunctionAdr(0)
Procedure setupEqualityFunctions()
Shared *equalityFunctionAdr()
Protected i, equalityFunctionCount, dataType
Restore _standardEqualityFunctions
Read.i equalityFunctionCount
For i = 1 To equalityFunctionCount
Read.i dataType
If dataType > ArraySize(*equalityFunctionAdr())
ReDim *equalityFunctionAdr(dataType)
EndIf
Read.i *equalityFunctionAdr(dataType)
Next
EndProcedure
setupEqualityFunctions()
EndModule
;Demonstration code.
CompilerIf #PB_Compiler_IsMainFile = 1
;Demonstration code for sorting a Structured Array on multiple fields.
;Code de démonstration pour le tri d'un tableau structuré sur plusieurs champs.
;Author: Demivec
;Written for PureBasic v5.40b3
;Date: 9/8/2015
EnableExplicit
Structure myOwn
a.w
b$
c.i
d$
EndStructure
Procedure display_myOwn(Array x.myOwn(1), header.s = "", footer.s = "")
Protected i, tw = 16
Debug header
Debug LSet("a", tw) + LSet("b$", tw) + LSet("c", tw) + LSet("d$", tw)
Debug LSet("---", tw) + LSet("---", tw) + LSet("-----------", tw) + LSet("---", tw)
For i = 0 To ArraySize(x())
Debug LSet(Str(x(i)\a), tw) + LSet(x(i)\b$, tw) + LSet(Str(x(i)\c), tw) + LSet(x(i)\d$, tw)
Next
Debug footer
EndProcedure
;create procedure multiSort_myOwn()
;créer procédure multiSort_myOwn()
amf_sort::mcr_createMultiSort(myOwn)
;setup random data
;configuration des données aléatoires
Define i, entryCount = 10000
Dim myOwn.myOwn(entryCount)
For i = 0 To entryCount
myOwn(i)\a = Random(10)
myOwn(i)\b$ = Chr(Random(10) + 65)
myOwn(i)\c = (Random(10) + 1) * 20000
myOwn(i)\d$ = Chr(Random(10) + 65)
Next
display_myOwn(myOwn(), "-- Before Sort --")
;Now for some heavy duty sorting.
;Maintenant pour certains poids lourds tri.
;Array size should be equal to (number of fields to be sorted) - 1
;Taille de tableau doit être égal au (nombre de champs à trier) - 1
Dim myOwn_sortSpecs.amf_sort::sortFieldSpecs(3)
;index 0 is primary key, index 1 is secondary key, and so forth
;index 0 est la clé primaire, l'index 1 est la clé secondaire, etc.
amf_sort::mcr_setSortSpec(myOwn_sortSpecs, 0, OffsetOf(myOwn\d$), TypeOf(myOwn\d$), #PB_Sort_Ascending)
amf_sort::mcr_setSortSpec(myOwn_sortSpecs, 1, OffsetOf(myOwn\b$), TypeOf(myOwn\b$), #PB_Sort_Descending)
amf_sort::mcr_setSortSpec(myOwn_sortSpecs, 2, OffsetOf(myOwn\a), TypeOf(myOwn\a), #PB_Sort_Descending)
amf_sort::mcr_setSortSpec(myOwn_sortSpecs, 3, OffsetOf(myOwn\c), TypeOf(myOwn\c), #PB_Sort_Ascending)
multiSort_myOwn(myOwn(), myOwn_sortSpecs())
Debug "================================================================="
;'A' and 'D' represent ascending and descending order
;'A' and 'D' représenter ordre croissant ou décroissant
display_myOwn(myOwn(), "-- Sort by: d$ A, b$ D, a.w D, c.i A --")
CompilerEndIf
Added a default option of #PB_Sort_Ascending for the mcr_setSortSpec() macro. Thanks for skywalk for reminding me of this possibility.
@Edit2: Added the include in the form of a module. Proper use still requires using a macro (defined in the module) that will define a procedure for a particular multi-field structure, then calling the procedure produced by that macro for the actual sorting of arrays with that structure.