Man findet viel Diskussion und wenig funktionalen Code, verständlich schon gar nicht!
Ich brauch das für die Sortierung von SPS-Adressen usw. Da hat man Ausdrücke wie
DB1010.DBW.5; DB500.DBX.4.1 usw.
Wenn es nur um die fixe Form der Adressensortierung geht, würde ich das hinbekommen.
Ein allgemeiner 'natural Sort' ist schon schwieriger.
https://rosettacode.org/wiki/Natural_sorting
ist auch nicht so richtig brauchbar!
Damit das evtl. doch nicht so schwierig und etwas durchsichtiger bleibt hab ich mir folgenden Ansatz ausgedacht.
1. Alle Strings in eine List() oder Array() oder Map()
2. Dann 1:1 Kopie davon erstellen
3. Die Kopie parsen und so manipulieren, dass sie mit Standardmethoden asphabetisch sortiert werden kann.
z.B. 31 wird dann mit führenden Nullen ergänzt, 031 oder 0031 usw.
4. Die manipulierte Liste alphabetisch sortieren
5. Aus der manipulierten sortierten Liste die Reihenfolge extrahieren, und die orignale Liste nach dieser Reihenfolge sortieren!
mehr als etwas Brainstorming Ansatz hab ich leider noch nicht!
Wer hat dazu noch Ideen? Bzw. Code?
hier mein Brainstorming Code
Code: Alles auswählen
DeclareModule NatSort
;- ----------------------------------------------------------------------
;- STRUCTURES and CONSTANTS
; ----------------------------------------------------------------------
EnumerationBinary
EndEnumeration
EndDeclareModule
Module NatSort
EnableExplicit
PbFw::ListModule(#PB_Compiler_Module) ; Lists the Module in the ModuleList (for statistics)
Structure pChar ; virtual CHAR-ARRAY, used as Pointer to overlay on strings
a.a[0] ; fixed ARRAY Of CHAR Length 0
c.c[0]
EndStructure
Structure TNatural
*ListElement ; Pointer to the original ListElement
Txt.String ; The modified Sort-String maybe 'ä'=>'ae"
EndStructure
Structure TReplace
sOrg.s ; original String
sRep.s ; replace with this
Flags.i ; some Flags
EndStructure
Define NewList MyReplaces.TReplace()
;- ----------------------------------------------------------------------
;- Module Private Functions
;- ----------------------------------------------------------------------
Macro mac_StringStartWithWord(MyString, MyWord)
Bool(Left(MyString, Len(MyWord)) = MyWord)
EndMacro
Procedure.i _CreateReplaceList(List lstRepl.TReplace())
Protected cnt.i
ClearList(lstRepl())
ProcedureReturn cnt
EndProcedure
Procedure.s _RemoveTabsAndDoubleSpace(str.s)
; ============================================================================
; NAME: _RemoveTabsAndDoubleSpace
; DESC: Removes TABS and double SPACE from a String
; DESC: Left and Right Space are removed to. We do not need a separate Trim()
; VAR(str.s) : The String
; RET.s: The trimmed String
; ============================================================================
Protected I, *pC.pChar
Protected cnt, lastSpace
#cst_repChar = 1
#cst_Tab = 9
#cst_Space = 32
*pC = @str ; Set the CharPointer = StartOfString
While *pC\c[I] = #cst_Space
*pC\c[I] = #cst_repChar ; mark leading spaces to remove
I+1
Wend
While *pC\c[I] ; While Not EndOfString
Select *pC\c[I] ; Switch for the CharType
Case #cst_Tab ; TAB
If cnt > 0 ; If there was a Space befor
*pC\c[I] = #cst_repChar ; mark it to remove
Else ; No Space befor
*pC\c[I] = #cst_Space ; TAB => SPACE
lastSpace = I ; save position of last Space
cnt + 1 ; Cnt the Space
EndIf
Case #cst_Space ; SPACE
If cnt > 0
*pC\c[I] = #cst_repChar
Else
lastSpace = I ; save Position of last Space
cnt + 1
EndIf
Default ; Any other Character
cnt = 0 ; SpaceCount = 0
EndSelect
I + 1
Wend
If cnt ; if there is a open Space at the End, mark it to remove
*pC\c[lastSpace] = #cst_repChar
EndIf
ProcedureReturn ReplaceString(str, Chr(#cst_repChar), "") ; now remove the marked Characters
EndProcedure
Procedure.i _Natural_RemoveFirstWords(List lstNatStr.TNatural(), List lstWords.s())
; ============================================================================
; NAME: _Natural_RemoveFirstWords
; DESC: Remove a List of first Words from the StringList
; DESC: We use this for sorting if we want to ignore common Words or Articeles
; DESC: like 'the' in englisch or 'der', 'die', 'das' in german
; VAR(lstNatStr.TNatural()) : The List of Strings to sort
; VAR(lstWords.s(): List of 1st Words to remove
; RET.i: Count of removes
; ============================================================================
Protected cnt, L
Protected word.s
If ListSize(lstNatStr()) And ListSize(lstWords()) ; if both Lists are not empty
ResetList(lstNatStr()) ; Set List to the beginning
ForEach lstNatStr() ; Step trough all entries of lstNatStr()
ResetList(lstWords()) ; Set List to the beginning
ForEach lstWords() ; Step trough all entries of lstWords()
word = lstWords() + " " ; add a Space to the word otherwisw we remove 'The' from 'Thermo'
L=Len(word) ; WordLength
With lstNatStr()
If (Left(\Txt\s, L) = word) ; If the Strings starts with the searched word
\Txt\s = Mid(\Txt\s, L+1) ; Remove the word from the String
cnt + 1
EndIf
EndWith
Next
Next
EndIf
ProcedureReturn cnt ; Return number of removed words
EndProcedure
Procedure _Natural_ReplaceAll(*Ret.String, In.s, List lstRepl.TReplace(), PbSort=#PB_Sort_Ascending|#PB_Sort_NoCase)
; ============================================================================
; NAME: _Natural_ReplaceAll
; DESC: Replace all parts of the String which we want to change for sorting
; VAR(*Ret.String) : The Return-String ByRefernce
; VAR(In.s): The Input String
; VAR(lstRepl.TReplace()): List with Replaces ('ä'->'ae' ...)
; VAR(PbSort): Sorting Flags
; RET: -
; ============================================================================
Protected MyReplMode.i ; #PB_String_CaseSensitive, #PB_String_NoCase,
If PbSort & #PB_Sort_NoCase
*Ret\s = _RemoveTabsAndDoubleSpace(LCase(In))
MyReplMode = #PB_String_NoCase
Else
*Ret\s = _RemoveTabsAndDoubleSpace(In)
MyReplMode = #PB_String_CaseSensitive
EndIf
If ListSize(lstRepl())
ResetList(lstRepl())
ForEach lstRepl()
*Ret\s = ReplaceString(*Ret\s, lstRepl()\sOrg, lstRepl()\sRep, MyReplMode)
Next
EndIf
EndProcedure
;- ----------------------------------------------------------------------
;- Module Public Functions
;- ----------------------------------------------------------------------
Procedure NaturalSort(List lstSTR.s(), List lstReplaces.TReplace(), PbSort=#PB_Sort_Ascending|#PB_Sort_NoCase, Flags.i=0)
Protected I
Protected *pA, *pB
Protected NewList lNat.TNatural() ; This is our copy of the StringList with all modifications for the natural sorting
If ListSize(lstSTR()) > 1
; First copy the String List and do Replacements
ForEach lstSTR()
AddElement(lNat())
lNat()\ListElement = @lstSTR()
_Natural_Replace(lNat()\Txt, lstSTR(), lstReplaces(), PbSort)
Next
; Sorting a List or a Structured List is a buildin Function of Purebasic
SortStructuredList(lNat(), PbSort & #PB_Sort_Descending, OffsetOf(TNatural\Txt), #PB_String)
; *pA = FirstElement(lstSTR())
;
; ForEach lstSTR()
; *pB = @lstSTR()
;
; If *pA\c[I] > *pB\c[I]
; SwapElements(lstSTR(), *pA, *pB)
; EndIf
; Next
EndIf
EndProcedure
EndModule
Procedure CreateTestList(List lStr.s())
EndProcedure
DataSection
Strings:
Data.s "Ignoring leading spaces."
Data.s "ignore leading spaces: 2-2"
Data.s " ignore leading spaces: 2-1"
Data.s " ignore leading spaces: 2+0"
Data.s " ignore leading spaces: 2+1"
Data.s "Ignoring multiple adjacent spaces (MAS)."
Data.s "ignore MAS spaces: 2-2"
Data.s "ignore MAS spaces: 2-1"
Data.s "ignore MAS spaces: 2+0"
Data.s "ignore MAS spaces: 2+1"
Data.s "Equivalent whitespace characters."
Data.s "Equiv. spaces: 3-3"
Data.s "Equiv. \rspaces: 3-2"
Data.s "Equiv. \x0cspaces: 3-1"
Data.s "Equiv. \x0bspaces: 3+0"
Data.s "Equiv. \nspaces: 3+1"
Data.s "Equiv. \tspaces: 3+2"
Data.s "Case Independent sort."
Data.s "cASE INDEPENDENT: 3-2"
Data.s "caSE INDEPENDENT: 3-1"
Data.s "casE INDEPENDENT: 3+0"
Data.s "case INDEPENDENT: 3+1"
Data.s "Numeric fields as numerics."
Data.s "foo100bar99baz0.txt"
Data.s "foo100bar10baz0.txt"
Data.s "foo1000bar99baz10.txt"
Data.s "foo1000bar99baz9.txt"
Data.s "Title sorts."
Data.s "The Wind in the Willows"
Data.s "The 40th step more"
Data.s "The 39 steps"
Data.s "Wanda"
Replace_DE:
Data.s "ä", "ae", ""
Data.s "ö", "oe", ""
Data.s "ü", "ue", ""
Data.s "A", "Ae", "#PB_String_CaseSensitive"
Data.s "Ö", "Oe", "#PB_String_CaseSensitive"
Data.s "Ü", "Ue", "#PB_String_CaseSensitive"
Data.s "ß", "ss", ""