Sort using natural instead of ASCII

marcoagpinto
Addict
Posts: 1045
Joined: Sun Mar 10, 2013 3:01 pm
Location: Portugal

Post by marcoagpinto »

Hello!

Andre hasn't replied so far to my topic:
http://www.purebasic.fr/english/viewtop ... =3&t=66401

Could someone with the knowledge provide a code snippet?

Thanks!
Thunder93
Addict
Posts: 1788
Joined: Tue Mar 21, 2006 12:31 am
Location: Canada

Re: Sort using natural instead of ASCII

Post by Thunder93 »

Natural Order String Comparison - http://sourcefrog.net/projects/natsort/


Is this what you are looking for? Something to do the same with PB, right? :wink:
ʽʽSuccess is almost totally dependent upon drive and persistence. The extra energy required to make another effort or try another approach is the secret of winning.ʾʾ --Dennis Waitley

Re: Sort using natural instead of ASCII

Post by marcoagpinto »

Thunder93 wrote: Natural Order String Comparison - http://sourcefrog.net/projects/natsort/

Is this what you are looking for? Something to do the same with PB, right? :wink:

Not really... It is a Hunspell project: I want to be able to sort words with accents (àáéèú, etc.) in their natural position rather than at the bottom of the array.

Re: Sort using natural instead of ASCII

Post by Thunder93 »

I should have left my original post... It only looks at the first character of each entry, but it might give you some ideas.

Code: Select all

0: !
1: 1 Hoo
2: 2 Hoo
3: [
4: ]
5: `
6: Cat
7: Dog
8: Micky
9: z1
10: z11
11: Zoo
12: È

Mark1: 0 ( ! )
Mark2: 6 ( Cat )
Mark3: 12 ( È )

----
!
1 Hoo
2 Hoo
[
]
`
È
èè
ó
Cat
Dog
Micky
z1
z11
Zoo

Code: Select all

Dim MyArray.s(14) : Dim MyArray2.s(14)

MyArray(0) = "Micky"
MyArray(1) = "Zoo"
MyArray(2) = "Dog"
MyArray(3) = "Cat"
MyArray(4) = "1 Hoo"
MyArray(5) = "2 Hoo"
MyArray(6) = "z1"
MyArray(7) = "z11"
MyArray(8) = "`"
MyArray(9) = "ó"
MyArray(10) = "È"
MyArray(11) = "èè"
MyArray(12) = "["
MyArray(13) = "!"
MyArray(14) = "]"


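; Sort case-insensitively, then find where each group of entries starts
SortArray(MyArray(), #PB_Sort_Ascending|#PB_Sort_NoCase)
Num.l : Mark1.l = -1 : Mark2.l = -1 : Mark3.l = -1

For k=0 To ArraySize(MyArray())
  Debug Str(k)+": "+MyArray(k)
  Num = Asc(Mid(MyArray(k), 1, 1))   ; character code of the first character
  
  If (Mark1 = -1) And (Num > 32 And Num < 65)
    Mark1 = k   ; first entry starting with code 33-64 (punctuation, digits)
  ElseIf (Mark2 = -1) And Not (Num > 90 And Num < 97) And (Num > 64 And Num < 123)
    Mark2 = k   ; first entry starting with A-Z or a-z
  ElseIf Num > 191 And Num < 256
    Mark3 = k   ; first entry starting with an accented letter (codes 192-255)
    Break
  EndIf
Next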

Debug ""
Debug "Mark1: "+Str(Mark1)+" ( "+MyArray(Mark1)+" )"
Debug "Mark2: "+Str(Mark2)+" ( "+MyArray(Mark2)+" )"
Debug "Mark3: "+Str(Mark3)+" ( "+MyArray(Mark3)+" )"
Debug ""

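; Rebuild the order in MyArray2: first the entries that sort before the letters
For k=Mark1 To Mark2-1
  MyArray2(kk) = MyArray(k)
  kk+1
Next

; ... then the entries starting with an accented letter
For k=Mark3 To ArraySize(MyArray())
  MyArray2(kk) = MyArray(k)
  
  If MyArray2(kk) = "" : Break : EndIf
  kk+1
Next

; ... and finally the entries starting with A-Z or a-z
For k=Mark2 To Mark3-1
  MyArray2(kk) = MyArray(k)
  kk+1
Next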

CopyArray(MyArray2(), MyArray())
FreeArray(MyArray2())


Debug "----"
For k=0 To ArraySize(MyArray())
  Debug MyArray(k)
Next

Re: Sort using natural instead of ASCII

Post by marcoagpinto »

I had a crazy idea:

I have a main array with the words:
main_array$(0)="até"
main_array$(1)="atão"
main_array$(2)="Luís"
etc.

and a blank secondary array:
secondary_array$(0)=""
secondary_array$(1)=""
secondary_array$(2)=""

What if I do a loop:
For f=0 To 2
t$=main_array$(f)
; Here I replace all accented letters with unaccented ones (using the replace command, one line per existing accent) and make t$ lowercase,
; then I append Chr(9) and the current position to t$:
t$+Chr(9)+Str(f)
secondary_array$(f)=t$
Next f


Then I sort the secondary array.

Then I loop over the secondary array, replacing each entry with the text from main_array at the stored position:

For f=0 To 2
secondary_array$(f)=main_array$(Val(StringField(secondary_array$(f),2,Chr(9))))
Next f

Then I just copy from the secondary array to the main.

Does this sound viable?

It was the best I could think of.
wilbert
PureBasic Expert
Posts: 3942
Joined: Sun Aug 08, 2004 5:21 am
Location: Netherlands

Re: Sort using natural instead of ASCII

Post by wilbert »

You could also use the qsort procedure from the C runtime, which takes a callback, and write your own compare procedure.
Windows (x64)
Raspberry Pi OS (Arm64)

Re: Sort using natural instead of ASCII

Post by wilbert »

Here's an example using qsort.
You might need to extend the lookup table to cover Unicode values between 256 and 591.

Code: Select all

; import qsort procedure
ImportC ""
  qsort(*base, num, size, *comparator)
EndImport 

; lookup table
DataSection
  LUT_Compare:
  Data.u 64,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,
         112,113,114,115,116,117,118,119,120,121,122,91,92,93,94,95,
         96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,
         112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,
         128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143,
         144,145,146,147,148,149,150,151,152,153,154,155,156,157,158,159,
         160,161,162,163,164,165,166,167,168,169,170,171,172,173,174,175,
         176,177,178,179,180,181,182,183,184,185,186,187,188,189,190,191,
         97,97,97,97,97,97,230,99,101,101,101,101,105,105,105,105,
         240,110,111,111,111,111,111,215,248,117,117,117,117,121,254,223,
         97,97,97,97,97,97,230,99,101,101,101,101,105,105,105,105,
         240,110,111,111,111,111,111,247,248,117,117,117,117,121,254,121
EndDataSection

; create global lookup array
Global Dim LUT_Compare.u(191)
CopyMemory(?LUT_Compare, @LUT_Compare(), 384)

; compare procedure
ProcedureC.i Compare(*s1.String, *s2.String)
  
  Protected c1.c, c2.c, result.i
  Protected *c1.Character = @*s1\s
  Protected *c2.Character = @*s2\s
  
  If *c1 = 0
    If *c2 = 0
      ProcedureReturn 0   ; Both pointers 0
    Else
      ProcedureReturn -1  ; First pointer 0, second not
    EndIf
  ElseIf *c2 = 0
    ProcedureReturn 1     ; Second pointer 0, first not
  Else
    
    ; Both valid strings so compare
    Repeat
      c1 = *c1\c : *c1 + SizeOf(Character)
      c2 = *c2\c : *c2 + SizeOf(Character)
      ; The lookup table has 192 entries, covering character codes 64 to 255
      If c1 >= 64 And c1 < 256 : c1 = LUT_Compare(c1 - 64) : EndIf
      If c2 >= 64 And c2 < 256 : c2 = LUT_Compare(c2 - 64) : EndIf
      result = c1 - c2
    Until result Or c1 = 0
    ProcedureReturn result
    
  EndIf
  
EndProcedure


; array to sort
Dim values.s(2)
values(0)="até"
values(1)="atão"
values(2)="Luís"

; perform sort
qsort(@values(), ArraySize(values()) + 1, SizeOf(String), @Compare())

; present result
For i = 0 To 2
  Debug values(i)
Next
normeus
Enthusiast
Posts: 470
Joined: Fri Apr 20, 2012 8:09 pm

Re: Sort using natural instead of ASCII

Post by normeus »

I would use wilbert's solution (even though I haven't looked at it; it is wilbert's code, so it should be good),
but here is a simplified method that can help to understand what's happening:

Code: Select all

Structure toSort
  Name$
  noaccent$
EndStructure

Dim names.tosort (2)
names(0)\Name$="Alemen"
names(1)\Name$="alemon"
names(2)\Name$="alemán"


Procedure createfield(Array names.tosort(1))
  ; create a string with all names replace accent then create second array
  Protected k,temp$="",pipe.s="|"
  If ArraySize(names())
    temp$=names(0)\Name$
    For k = 1 To ArraySize(names())
      temp$ = temp$+ pipe +names(k)\Name$ ;pipe to separate
    Next
    ;Long list of accents to replace so you can sort the way you like 
    temp$= ReplaceString(temp$, "á", "a", #PB_String_NoCase)
    temp$= ReplaceString(temp$, "å", "a", #PB_String_NoCase)
    ;//etc...
    ;now populate 2nd array without accents
    For k = 0 To ArraySize(names())
      names(k)\noaccent$=StringField(temp$,k+1,pipe)
    Next
  EndIf
  temp$=""
EndProcedure

createfield(names())
SortStructuredArray(names(), #PB_Sort_NoCase, OffsetOf(tosort\noaccent$) ,TypeOf(tosort\noaccent$))


For k = 0 To ArraySize(names())
  Debug names(k)\Name$
Next

Norm.
google Translate;Makes my jokes fall flat- Fait mes blagues tombent à plat- Machte meine Witze verpuffen- Eh cumpari ci vo sunari

Re: Sort using natural instead of ASCII

Post by marcoagpinto »

I have managed to make it work!

Code: Select all

Dim main_array$(2)
main_array$(0)="Alemen"
main_array$(1)="alemon"
main_array$(2)="alemán"


Dim secondary_array$(2)
secondary_array$(0)=""
secondary_array$(1)=""
secondary_array$(2)=""


For f=0 To 2
  t$=LCase(main_array$(f))
  ReplaceString(t$,"á","a",#PB_String_InPlace)
  t$+Chr(9)+Str(f)
  secondary_array$(f)=t$
Next f


SortArray(secondary_array$(),#PB_Sort_Ascending|#PB_Sort_NoCase)


For f=0 To 2
  secondary_array$(f)=main_array$(Val(StringField(secondary_array$(f),2,Chr(9))))
Next f


CopyArray(secondary_array$(),main_array$())


For f=0 To 2
  Debug main_array$(f)
Next f


Re: Sort using natural instead of ASCII

Post by wilbert »

This seems to work on Windows ...

Code: Select all

; import qsort procedure
ImportC ""
  qsort(*base, num, size, *comparator)
EndImport 

; compare procedure
ProcedureC.i Compare(*s1.String, *s2.String)
  ; Compare using LOCALE_USER_DEFAULT
  ProcedureReturn CompareString_($400, #NORM_IGNORECASE, @*s1\s, -1, @*s2\s, -1) - 2
EndProcedure

; array to sort
Dim values.s(2)
values(0)="Alemen"
values(1)="alemon"
values(2)="alemán"

; perform sort
qsort(@values(), ArraySize(values()) + 1, SizeOf(String), @Compare())

; present result
For i = 0 To 2
  Debug values(i)
Next

Re: Sort using natural instead of ASCII

Post by marcoagpinto »

Thanks for all the comments :mrgreen:

But I think my approach will work on all three platforms.

I still think the best solution would be for Freddy to add a #LATIN flag to the sort command, as that would avoid the need for complex code.
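
Until something like that exists, the index-tagging approach from the earlier post could be wrapped in a reusable procedure. The following is only a sketch: FoldAccents() and SortIgnoringAccents() are made-up names (not PB commands), the accent table is deliberately incomplete, and it assumes the source is saved as UTF-8, compiled as a Unicode executable, and that LCase() also lowercases accented letters.

```purebasic
; Sketch only: FoldAccents() and SortIgnoringAccents() are hypothetical helpers.

; Return a lowercase copy of text with the listed accented letters
; replaced by their unaccented counterparts.
Procedure.s FoldAccents(text.s)
  Protected accented.s, plain.s, result.s, i
  accented = "àáâãäåçèéêëìíîïñòóôõöùúûüýÿ"
  plain    = "aaaaaaceeeeiiiinooooouuuuyy"   ; same length, same order
  result   = LCase(text)
  For i = 1 To Len(accented)
    result = ReplaceString(result, Mid(accented, i, 1), Mid(plain, i, 1))
  Next
  ProcedureReturn result
EndProcedure

; Sort words() so that accented letters sort like their base letters.
Procedure SortIgnoringAccents(Array words.s(1))
  Protected last, f
  last = ArraySize(words())
  Protected Dim keys.s(last)
  Protected Dim sorted.s(last)

  ; Build the sort keys: folded word + Tab + original position
  For f = 0 To last
    keys(f) = FoldAccents(words(f)) + Chr(9) + Str(f)
  Next

  SortArray(keys(), #PB_Sort_Ascending)

  ; Fetch the original words in the order given by the sorted keys
  For f = 0 To last
    sorted(f) = words(Val(StringField(keys(f), 2, Chr(9))))
  Next
  For f = 0 To last
    words(f) = sorted(f)
  Next
EndProcedure

; Usage
Dim demo.s(2)
demo(0) = "Alemen"
demo(1) = "alemon"
demo(2) = "alemán"
SortIgnoringAccents(demo())
For i = 0 To ArraySize(demo())
  Debug demo(i)
Next
```

Under those assumptions the demo should list alemán, Alemen, alemon. Words whose folded forms are identical end up ordered by their position number compared as text, so a real version would probably want a better tie-breaker.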

Re: Sort using natural instead of ASCII

Post by Thunder93 »

Result with Wilbert's example.

Sorted output:
résu1
résu11
résu2
résu22
resume
résumé
Résumé
resumes
Resumes
résumés

Input:
resumes
resume
résumés
Resumes
Résumé
résumé
résu1
résu11
résu2
résu22


I am not sure where, but this morning I read something about résu2 coming before résu11. That must be another type of sorting. :?
Little John
Addict
Posts: 4779
Joined: Thu Jun 07, 2007 3:25 pm
Location: Berlin, Germany

Re: Sort using natural instead of ASCII

Post by Little John »

Thunder93 wrote: Not sure where, but I was reading this morning something about résu2 comes before résu11. Must be another type of sorting. :?

Yes, it is another type, and it is called natural sort order.
What marcoagpinto is asking for is not called natural sorting or anything similar, so the title of this thread is very misleading.
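
To make the difference concrete, here is a minimal sketch of a natural-order comparison in PureBasic, where runs of digits are compared by their numeric value so that résu2 comes before résu11. IsDigitCode(), NaturalCompare() and NaturalSort() are made-up names, not PB commands. The sketch does nothing about accents or case (other characters are compared by character code), and very long digit runs would overflow Val().

```purebasic
; Sketch only: these helpers are hypothetical, not part of PureBasic.

Procedure.i IsDigitCode(c.i)
  ProcedureReturn Bool(c >= '0' And c <= '9')
EndProcedure

; Returns <0, 0 or >0. Digit runs are compared by numeric value,
; all other characters by character code.
Procedure.i NaturalCompare(a.s, b.s)
  Protected i, j, lenA, lenB, startA, startB, ca, cb
  Protected na.q, nb.q
  i = 1 : j = 1
  lenA = Len(a) : lenB = Len(b)

  While i <= lenA And j <= lenB
    ca = Asc(Mid(a, i, 1))
    cb = Asc(Mid(b, j, 1))
    If IsDigitCode(ca) And IsDigitCode(cb)
      ; Read both digit runs completely and compare their values
      startA = i : startB = j
      While IsDigitCode(Asc(Mid(a, i, 1))) : i + 1 : Wend
      While IsDigitCode(Asc(Mid(b, j, 1))) : j + 1 : Wend
      na = Val(Mid(a, startA, i - startA))
      nb = Val(Mid(b, startB, j - startB))
      If na < nb : ProcedureReturn -1 : EndIf
      If na > nb : ProcedureReturn 1 : EndIf
    Else
      If ca <> cb : ProcedureReturn ca - cb : EndIf
      i + 1 : j + 1
    EndIf
  Wend

  ; Whichever string still has characters left sorts last
  ProcedureReturn (lenA - i) - (lenB - j)
EndProcedure

; Simple insertion sort driven by NaturalCompare()
Procedure NaturalSort(Array items.s(1))
  Protected i, j, key.s
  For i = 1 To ArraySize(items())
    key = items(i)
    j = i - 1
    While j >= 0
      If NaturalCompare(items(j), key) <= 0 : Break : EndIf
      items(j + 1) = items(j)
      j - 1
    Wend
    items(j + 1) = key
  Next
EndProcedure

; Usage
Dim demo.s(3)
demo(0) = "résu11"
demo(1) = "résu2"
demo(2) = "résu22"
demo(3) = "résu1"
NaturalSort(demo())
For i = 0 To ArraySize(demo())
  Debug demo(i)
Next
```

The demo should list résu1, résu2, résu11, résu22. An insertion sort keeps the sketch short; for large arrays the same compare procedure could instead be wrapped in a callback for qsort, as in the earlier examples.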

Re: Sort using natural instead of ASCII

Post by Thunder93 »

Thanks for your insight Little John. :wink: