PureBasic Forum
https://www.purebasic.fr/english/

Sort_NoCase became a case
https://www.purebasic.fr/english/viewtopic.php?f=5&t=72982

Author:  pTb [ Sun Jun 09, 2019 5:25 am ]
Post subject:  Sort_NoCase became a case

Hi,
after reviving some old code I wrote a couple of years ago to find out why the Sort library doesn't seem to work, I rediscovered the reason.
It doesn't handle Swedish letters as expected.
The Swedish alphabet has ÅÄÖ as letters after Z. In this order: ABC...XYZÅÄÖ

Two things don't work in the Sort library:
1. Sorting Swedish characters in the correct order: I get ÄÅÖ instead of ÅÄÖ (this may have to do with the old ASCII representation). It's always a hassle to sort Swedish... :P
2. The #PB_Sort_NoCase flag doesn't ignore case for åäöÅÄÖ. This problem probably also affects special French characters. What about the German ÄÖÜ and Eszett (which I can't reproduce on my keyboard right now... :lol: )?

I don't know whether this behaves the same on the other OSes, but it probably does.
My current version of PB is 5.70 LTS (x86)
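
The first symptom is consistent with a comparison by character code rather than by position in the Swedish alphabet: in Unicode (as in Latin-1), Ä has a lower code than Å. A quick check, assuming a Unicode build of PB and a source file saved as UTF-8:

```purebasic
; Print the character codes of the three Swedish capital letters.
; Assumption: the source file is saved as UTF-8 and compiled in Unicode mode.
Debug Asc("Å")  ; 197 (U+00C5)
Debug Asc("Ä")  ; 196 (U+00C4)
Debug Asc("Ö")  ; 214 (U+00D6)
```

Sorted by these codes, the capitals come out as Ä, Å, Ö, and the lower-case letters (228, 229, 246) land after all of them unless case is folded first.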

All righty, then. Some code to show what I'm talking about:

Code:
NewList fa.s()

AddElement(fa())
fa() = "ö"
AddElement(fa())
fa() = "Å"
AddElement(fa())
fa() = "v"
AddElement(fa())
fa() = "W"
AddElement(fa())
fa() = "ä"
AddElement(fa())
fa() = "Ö"
AddElement(fa())
fa() = "w"
AddElement(fa())
fa() = "V"
AddElement(fa())
fa() = "Ä"
AddElement(fa())
fa() = "å"
AddElement(fa())
fa() = "z"
AddElement(fa())
fa() = "U"
AddElement(fa())
fa() = "Z"
AddElement(fa())
fa() = "u"

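; Sort case-insensitively, then check that the upper-cased values never decrease.
SortList(fa(), #PB_Sort_NoCase)
testString.s = ""
ForEach fa()
  If testString > UCase(fa())
    ; The current element compares lower than its predecessor: report it.
    Debug "Wrong: " + fa() + " < " + testString
  Else
    Debug fa()
  EndIf
  testString = UCase(fa())
Next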


This produces the following output (here, at least):
Quote:
U
u
v
V
W
w
z
Z
Ä
Å
Ö
Wrong: ä < Ö
å
ö


The correct order for this program would be:
Quote:
U
u
v
V
W
w
z
Z
Å
å
ä
Ä
ö
Ö

Author:  Little John [ Sun Jun 09, 2019 4:31 pm ]
Post subject:  Re: Sort_NoCase became a case

Problems with sorting non-ASCII characters have been discussed before, see e.g.
viewtopic.php?f=4&t=59890
viewtopic.php?p=511499#p511499

You might want to consider using a custom sorting routine. I recommend

Author:  pTb [ Sun Jun 16, 2019 9:01 am ]
Post subject:  Re: Sort_NoCase became a case

Hello Little John,
your search skills are quite a bit better than mine. Thanks! I had a feeling this must have been discussed before, so I didn't "dare" to post under bugs...

Vielen danke für die linken! (oh, sorry. That wasn't in PureBasic :wink: ) (to be honest, my PureBasic is waaayyy better than my German :lol: )

Author:  Little John [ Sun Jun 16, 2019 9:26 am ]
Post subject:  Re: Sort_NoCase became a case

Hello pTb, you are welcome!

pTb wrote:
Vielen danke für die linken! (oh, sorry. That wasn't in PureBasic :wink: ) (to be honest, my PureBasic is waaayyy better than my German :lol: )

:lol:

Author:  Andre [ Sun Jun 16, 2019 10:28 pm ]
Post subject:  Re: Sort_NoCase became a case

Besides the threads already suggested, here is how I handle such "special chars" (umlauts, but also those of Northern/Eastern languages) when sorting:

For structured arrays or linked lists I always use an additional structure element 'ABC', which gets filled with a converted (lower case) version of the names (the strings to sort). In it I convert every umlaut such as Ä/Ö/ü into ae/oe/ue.
Other special chars like á/â/à are converted into the regular a.

When I now sort my arrays/lists by this 'ABC' structure element, I get the order of strings I want... :)
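
A minimal sketch of this extra-field approach, adapted to the Swedish order from the first post instead of the ae/oe/ue conversion. The names `Entry`, `SortKey` and `SwedishKey()` are made up for illustration, and the sketch assumes a Unicode build, a source file saved as UTF-8, and that UCase() upper-cases å/ä/ö (the test program in the first post relies on that too):

```purebasic
; Sketch: sort by a precomputed key instead of the displayed string.
; Entry, SortKey and SwedishKey() are hypothetical names, not part of PB.

Structure Entry
  Name.s     ; text as shown to the user
  SortKey.s  ; derived key, used only for sorting
EndStructure

Procedure.s SwedishKey(Text.s)
  Protected Key.s = UCase(Text)
  ; "[", "\" and "]" are the three character codes right after "Z",
  ; so Å, Ä and Ö end up after Z, in alphabet order.
  Key = ReplaceString(Key, "Å", "[")
  Key = ReplaceString(Key, "Ä", "\")
  Key = ReplaceString(Key, "Ö", "]")
  ProcedureReturn Key
EndProcedure

NewList Names.Entry()

; Same test letters as in the first post.
Define Letters.s = "ö,Å,v,W,ä,Ö,w,V,Ä,å,z,U,Z,u"
Define i
For i = 1 To CountString(Letters, ",") + 1
  AddElement(Names())
  Names()\Name = StringField(Letters, i, ",")
  Names()\SortKey = SwedishKey(Names()\Name)
Next

; Sort on the key field; the Name field just travels along.
SortStructuredList(Names(), #PB_Sort_Ascending, OffsetOf(Entry\SortKey), TypeOf(Entry\SortKey))

ForEach Names()
  Debug Names()\Name
Next
```

Upper and lower case variants of a letter share the same key, so the key alone does not decide which of the two comes first. The mapping also collides with names that already contain "[", "\" or "]", so real data may need a different set of placeholder characters.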

Author:  pTb [ Tue Jun 18, 2019 4:53 pm ]
Post subject:  Re: Sort_NoCase became a case

@Andre:

I actually solved it with the same trick, but my extra structure element was in upper case. (much better, don't you think? 8) )
It was only when I recently returned to this code that I remembered this little snag.

Thanks for your input!

All times are UTC + 1 hour