Sort_NoCase became a case

Windows specific forum
pTb
User
Posts: 16
Joined: Sat Apr 16, 2011 3:17 pm

Post by pTb »

Hi,
after reviving some old code I wrote a couple of years ago to find out why the Sort library doesn't seem to work, I rediscovered the reason.
It doesn't handle Swedish letters as expected.
The Swedish alphabet has ÅÄÖ as letters after Z, in this order: ABC...XYZÅÄÖ

Two things don't work in the Sort library:
1. Sorting Swedish characters in the correct order: I get ÄÅÖ instead of ÅÄÖ (this could have to do with the old ASCII representation). It's always a hassle to sort Swedish... :P
2. The #PB_Sort_NoCase flag doesn't ignore case for åäöÅÄÖ. This problem presumably also shows up with special French characters? And how about the German ÄÖÜ and the Eszett (which I can't type on my keyboard right now... :lol: )?

I don't know whether the other OSes behave the same way, but they probably do.
My current version of PB is 5.70 LTS (x86).

All righty, then. Some code to show what I'm talking about:

Code:

NewList fa.s()

AddElement(fa())
fa() = "ö"
AddElement(fa())
fa() = "Å"
AddElement(fa())
fa() = "v"
AddElement(fa())
fa() = "W"
AddElement(fa())
fa() = "ä"
AddElement(fa())
fa() = "Ö"
AddElement(fa())
fa() = "w"
AddElement(fa())
fa() = "V"
AddElement(fa())
fa() = "Ä"
AddElement(fa())
fa() = "å"
AddElement(fa())
fa() = "z"
AddElement(fa())
fa() = "U"
AddElement(fa())
fa() = "Z"
AddElement(fa())
fa() = "u"

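; Sort case-insensitively, then check that the upper-case form of each
; element is not smaller than that of its predecessor.
SortList(fa(), #PB_Sort_NoCase)
testString.s = ""
ForEach fa()
  If testString > UCase(fa())
    Debug "Wrong: " + fa() + " < " + testString
  Else
    Debug fa()
  EndIf
  testString = UCase(fa())  ; remember this element (upper case) for the next comparison
Next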

This should produce the following output (it does here, at least):
U
u
v
V
W
w
z
Z
Ä
Å
Ö
Wrong: ä < Ö
å
ö
The correct order for this program would be:
U
u
v
V
W
w
z
Z
Å
å
ä
Ä
ö
Ö
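
The output above is consistent with a comparison by plain character code, where #PB_Sort_NoCase folds only A-Z/a-z. That is an assumption about the Sort library, not something confirmed in this thread. A minimal sketch to inspect the codes, assuming a Unicode build of PureBasic and a UTF-8 source file:

```purebasic
; Sketch only: print the character code of each letter in question.
Define letters.s = "ZzÄÅÖäåö"
Define i

For i = 1 To Len(letters)
  ; Asc() returns the code of the first character of the string
  Debug Mid(letters, i, 1) + " = " + Str(Asc(Mid(letters, i, 1)))
Next
```

In Unicode these letters have the codes Z = 90, z = 122, Ä = 196, Å = 197, Ö = 214, ä = 228, å = 229, ö = 246, which would explain both the Ä-before-Å order and the upper-case block landing before the lower-case block.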
Little John
Addict
Posts: 4519
Joined: Thu Jun 07, 2007 3:25 pm
Location: Berlin, Germany

Post by Little John »

Problems with sorting non-ASCII characters have been discussed before, see e.g.
viewtopic.php?f=4&t=59890
viewtopic.php?p=511499#p511499

You might want to consider using a custom sorting routine. I recommend

Post by pTb »

Hello Little John,
your search skills are quite a bit better than mine. Thanks! I had a feeling this must have been discussed before, so I didn't "dare" to post under bugs...

Vielen danke für die linken! (oh, sorry. That wasn't in PureBasic :wink: ) (to be honest, my PureBasic is waaayyy better than my German :lol: )

Post by Little John »

Hello pTb, you are welcome!
pTb wrote: Vielen danke für die linken! (oh, sorry. That wasn't in PureBasic :wink: ) (to be honest, my PureBasic is waaayyy better than my German :lol: )
:lol:
Andre
PureBasic Team
Posts: 2056
Joined: Fri Apr 25, 2003 6:14 pm
Location: Germany (Saxony, Deutscheinsiedel)

Post by Andre »

Besides the threads already suggested, I can tell you how I handle such "special chars" (umlauts, but also Northern/Eastern languages) when sorting:

For structured arrays or linked lists I always use an additional structure element 'ABC', which is filled with a converted (lower-case) version of the names (the strings to sort). Here I convert every umlaut such as Ä/Ö/ü into ae/oe/ue, etc.
Other special chars like á/â/à etc. are converted into a regular a.

When I then sort my arrays/lists by this 'ABC' structure element, I get the order of strings I want... :)
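
The sort-key approach described above can be sketched roughly as follows. This is not Andre's actual code: the names Entry, MakeSortKey() and AddName(), as well as the sample strings, are hypothetical, and only the 'ABC' element and the ae/oe/ue and á/â/à conversions come from the description.

```purebasic
; Sketch only: Entry, MakeSortKey() and AddName() are hypothetical names.
Structure Entry
  Name.s   ; original string, as shown to the user
  ABC.s    ; converted lower-case key, used only for sorting
EndStructure

Procedure.s MakeSortKey(text.s)
  Protected key.s = text
  ; Replace both cases explicitly, so the result does not depend on
  ; how LCase() treats non-ASCII characters.
  key = ReplaceString(key, "Ä", "ae") : key = ReplaceString(key, "ä", "ae")
  key = ReplaceString(key, "Ö", "oe") : key = ReplaceString(key, "ö", "oe")
  key = ReplaceString(key, "Ü", "ue") : key = ReplaceString(key, "ü", "ue")
  ; Accented variants of "a" collapse to a plain "a".
  key = ReplaceString(key, "á", "a")
  key = ReplaceString(key, "â", "a")
  key = ReplaceString(key, "à", "a")
  ProcedureReturn LCase(key)
EndProcedure

Procedure AddName(List names.Entry(), text.s)
  AddElement(names())
  names()\Name = text
  names()\ABC  = MakeSortKey(text)   ; fill the key once, when the element is added
EndProcedure

NewList names.Entry()
AddName(names(), "Österreich")
AddName(names(), "Zebra")
AddName(names(), "Ärger")
AddName(names(), "apfel")

; Sort by the key field, but display the original names.
SortStructuredList(names(), #PB_Sort_Ascending, OffsetOf(Entry\ABC), TypeOf(Entry\ABC))

ForEach names()
  Debug names()\Name
Next
```

Because the key is plain lower-case ASCII, the sort no longer depends on how the library compares non-ASCII characters. With these sample strings, "Ärger" (key "aerger") would be expected to sort before "apfel".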
Bye,
...André
(PureBasicTeam::Docs & Support - PureArea.net | Order:: PureBasic | PureVisionXP)

Post by pTb »

@Andre:

I actually solved it with the same trick, but my extra structure element was in upper case. (Much better, don't you think? 8) )
It was only when I recently returned to this code that I remembered this little snag.
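
For the Swedish order from the first post (Å, Ä, Ö after Z), an upper-case key could look something like the sketch below. This is not pTb's actual code: SortItem and SwedishKey() are hypothetical names, and mapping Å/Ä/Ö to the three ASCII characters that directly follow "Z" is just one possible choice. It also assumes the sort compares strings by character code and that the strings to sort don't already contain "[", "\" or "]".

```purebasic
; Sketch only: SortItem and SwedishKey() are hypothetical names.
Structure SortItem
  Text.s   ; original string
  Key.s    ; upper-case sort key
EndStructure

Procedure.s SwedishKey(text.s)
  Protected key.s = text
  ; Map Å, Ä, Ö (either case) to the ASCII characters right after
  ; "Z" (code 90): "[" (91), "\" (92) and "]" (93).
  key = ReplaceString(key, "Å", Chr(91)) : key = ReplaceString(key, "å", Chr(91))
  key = ReplaceString(key, "Ä", Chr(92)) : key = ReplaceString(key, "ä", Chr(92))
  key = ReplaceString(key, "Ö", Chr(93)) : key = ReplaceString(key, "ö", Chr(93))
  ProcedureReturn UCase(key)
EndProcedure

NewList items.SortItem()
Define letters.s = "öÅvWäÖwVÄåzUZu"   ; the letters from the first post
Define i

For i = 1 To Len(letters)
  AddElement(items())
  items()\Text = Mid(letters, i, 1)
  items()\Key  = SwedishKey(items()\Text)
Next

SortStructuredList(items(), #PB_Sort_Ascending, OffsetOf(SortItem\Key), TypeOf(SortItem\Key))

ForEach items()
  Debug items()\Text
Next
```

Upper and lower case of the same letter get identical keys, so the letters should come out grouped as U, V, W, Z, Å, Ä, Ö; the order within each pair is left to the sort.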

Thanks for your input!