It is currently Wed May 22, 2013 12:48 pm

All times are UTC + 1 hour




Post new topic Reply to topic  [ 6 posts ] 
Author Message
 Post subject: Some small tricks for sorting the letters of a string.
PostPosted: Tue Jun 09, 2009 10:38 am 
Offline
Addict
Addict
User avatar

Joined: Mon Jul 25, 2005 3:51 pm
Posts: 2399
Location: Utah, USA
I have two variations for sorting the characters of a string. They turn a string like "amazing" into "aagimnz". This kind of operation is frequently made to determine if two strings have the same elements irrespective of order (i.e. anagrams) and is usually performed after converting the string to all upper or lower case.

I am posting these small snippets because I think the methods used in them are interesting and unique. The methods might find some use in other areas.

One variation returns the sorted string, the other sorts the string in place. Both versions work with Ascii or Unicode and sort all characters, not just the letters. Both are faster than individually looping through the string's characters to sort them.

The subroutines take a pointer to a string and copy the string's content into the memory for a newly declared array of characters. The array is then sorted using PureBasic's SortArray(). The string is then read as a whole from the array's memory location and either returned or copied over the original strings contents. One important note is that the array's last element (an extra element) is set aside for the strings Null and is not sorted.

Code:
;Author:Demivec  (Jared Johnson)
;Using: PureBasic 4.31

Procedure.s sortLetters(*word.Character, wordLength) ;returns a string with the letters of a word sorted
  Protected Dim letters.c(wordLength)
  Protected *letAdr = @letters()
 
  CopyMemoryString(*word, @*letAdr)
  SortArray(letters(), #PB_Sort_Ascending, 0, wordLength - 1)
  ProcedureReturn PeekS(@letters(), wordLength)
EndProcedure

;example
Define Before$,After$

Before$ = "demonstration"
After$ = sortLetters(@Before$,Len(Before$))
Debug Before$ + "  ==>  " + After$



Code:
;Author:Demivec  (Jared Johnson)
;Using: PureBasic v4.31

Procedure sortLettersInPlace(*word.Character, wordLength) ;sorts the letters of a word without creating a new string
  Protected Dim letters.c(wordLength)
  Protected *letAdr = @letters()
 
  CopyMemoryString(*word, @*letAdr)
  SortArray(letters(), #PB_Sort_Ascending, 0, wordLength - 1)
  CopyMemoryString(@letters(), @*word)
EndProcedure

;example
Define Text$

Text$ = "demonstration"
Debug Text$
sortLettersInPlace(@Text$,Len(Text$))
Debug Text$

_________________
Image


Last edited by Demivec on Tue Jun 09, 2009 9:29 pm, edited 1 time in total.

Top
 Profile  
 
 Post subject:
PostPosted: Tue Jun 09, 2009 1:31 pm 
Offline
Always Here
Always Here
User avatar

Joined: Mon Sep 22, 2003 6:45 pm
Posts: 7304
Location: Norway
I want to add mine here, if it's ok. It too sorts the string in place. I find it's fast at short strings, but loses to yours on longer strings.
(This is a much improved version of the sort used for the state problem.)
Code:
Structure Charray
  C.c[0]
EndStructure

Procedure QuickSortStringAux(*String.Charray, *Start.i, *Nd.Character)
  QuickSortStringAuxStart:
  If *Start < *Nd
    Protected Pivot.c = *Nd\c
    Protected *I.Character = *Start
    Protected *J.Character = *Nd
    While *I <> *J
      If *I\C < Pivot
        *I + SizeOf(Character)
      Else
        *J\c = *I\c
        *J - SizeOf(Character)
        *I\c = *J\c
      EndIf
    Wend
    *J\c = Pivot
    QuickSortStringAux(*String, *Start, *J-SizeOf(Character))
    ;QuickSortStringAux(*String, J+1, Nd)
    *Start = *J + SizeOf(Character)
    Goto QuickSortStringAuxStart
  EndIf
EndProcedure

Procedure QuickSortString(*String.Character)
  QuickSortStringAux(*String, *String, *String+(MemoryStringLength(*String)-1)*SizeOf(Character))
EndProcedure

_________________
Woa, I set up a web server.


Top
 Profile  
 
 Post subject:
PostPosted: Tue Jun 09, 2009 2:15 pm 
Offline
Addict
Addict
User avatar

Joined: Mon Jul 25, 2005 3:51 pm
Posts: 2399
Location: Utah, USA
@Trond: No problem. I think it is good thing to collect a variety of working methods together in one place, as long as it's in keeping with the thread topic. :wink:

_________________
Image


Top
 Profile  
 
 Post subject:
PostPosted: Tue Jun 09, 2009 2:30 pm 
Offline
Addict
Addict
User avatar

Joined: Sun Oct 15, 2006 8:56 pm
Posts: 1040
Location: Germany
Demivec, wow, how elegant your solution is!
Thank your very much for this! :shock:
I never would have had the idea to lay a string over an array... Respect!

_________________
PB 4.30
Code:
onErrorGoto(?Fred)


Top
 Profile  
 
 Post subject:
PostPosted: Wed Jun 10, 2009 8:11 pm 
Offline
Addict
Addict
User avatar

Joined: Thu Jul 01, 2004 2:51 am
Posts: 905
Location: Tacoma, WA
I was kind of curious about this so I tried a different approach and put together a speed test with Demivec's code and mine. Mine is just called SortInPlace() and seems to be faster on average although I think I might lose out in shorter words.

EDIT: Made a change and now I think it's faster on both long and short words.

EDIT2: Even shorter version of my function.

Code:
;
Global HoldClip.s
Global Dim gTest1.s(50000)
Global Dim gTest2.s(50000)
For i.i = 0 To 50000
   r.i = Random(200) + 1
   a.s = Space(r)
   *b.Character = @a
   For j.i = 1 To r
      If Random(1)
         *b\c = 97 + Random(25)
      Else
         *b\c = 65 + Random(25)
      EndIf
      *b + SizeOf(Character)
   Next j
   gTest1(i) = a
   gTest2(i) = a
Next i
HoldClip = gTest1(0)
HoldClip = HoldClip + Chr(13) + Chr(10) + gTest1(1)
HoldClip = HoldClip + Chr(13) + Chr(10) + gTest1(2)
;-
Procedure sortLettersInPlace(*word.Character, wordLength) ;sorts the letters of a word without creating a new string
  Protected Dim letters.c(wordLength)
  Protected *letAdr = @letters()
 
  CopyMemoryString(*word, @*letAdr)
  SortArray(letters(), #PB_Sort_Ascending, 0, wordLength - 1)
  CopyMemoryString(@letters(), @*word)
EndProcedure
;/
Delay(500)
q1.i = GetTickCount_()
For i.i = 0 To 50000
   sortLettersInPlace(@gTest1(i), Len(gTest1(i)))
Next i
r1.i = GetTickCount_()
;/
;-
Procedure SortInPlace(*word.Character)
   ;
   Protected iLoop.i
   ;
   Protected jLoop.i
   ;
   Protected Dim HoldCharacter.i(126)
   ;
   Protected *Position.Character = *word
   ;
   While *Position\c
      ;
      HoldCharacter(*Position\c) + 1
      ;
      *Position + SizeOf(Character)
      ;
   Wend
   ;
   *Position = *word
   ;
   For iLoop = 0 To 126
      ;
      If HoldCharacter(iLoop)
         ;
         For jLoop = 1 To HoldCharacter(iLoop)
            ;
            *Position\c = iLoop
            ;
            *Position + SizeOf(Character)
            ;
         Next jLoop
         ;
      EndIf
      ;
   Next iLoop
   ;
EndProcedure
;/
Delay(500)
q2.i = GetTickCount_()
For i.i = 0 To 50000
   SortInPlace(@gTest2(i))
Next i
r2.i = GetTickCount_()
;/
HoldClip = HoldClip + Chr(13) + Chr(10) + " .... Test 1 ...."
HoldClip = HoldClip + Chr(13) + Chr(10) + gTest1(0)
HoldClip = HoldClip + Chr(13) + Chr(10) + gTest1(1)
HoldClip = HoldClip + Chr(13) + Chr(10) + gTest1(2)
HoldClip = HoldClip + Chr(13) + Chr(10) + " .... Test 2 ...."
HoldClip = HoldClip + Chr(13) + Chr(10) + gTest2(0)
HoldClip = HoldClip + Chr(13) + Chr(10) + gTest2(1)
HoldClip = HoldClip + Chr(13) + Chr(10) + gTest2(2)
;
MessageRequester("Test Results", Str(r1-q1)+" : "+Str(r2-q2))
;
SetClipboardText(HoldClip)
;


Top
 Profile  
 
 Post subject:
PostPosted: Thu Jun 11, 2009 4:16 am 
Offline
Addict
Addict
User avatar

Joined: Mon Jul 25, 2005 3:51 pm
Posts: 2399
Location: Utah, USA
Xombie wrote:
I was kind of curious about this so I tried a different approach and put together a speed test with Demivec's code and mine. Mine is just called SortInPlace() and seems to be faster on average although I think I might lose out in shorter words.


@Xombie: I like your "counting" sort. It actually avoids the sort altogether.

It has one major advantage and one minor limitation. The advantage is it works wonderfully on Ascii (2x + faster :wink:). The limitation is it doesn't work well on Unicode. Your test purposely limited the content of the string to only Ascii upper and lower case letters. It would not be able to handle even characters in the extended code page(s). If Unicode character values are between 0 -> 1000 yours becomes even with mine, or worse.

I have only limited knowledge of the pratical use of Unicode, though I am trying to make my code frequently able to process it. I think Latin, Greek and Cyrillic characters are found as high as the range I mentioned. In the end I am not sure how serious those limitations (or practical) those limitations would be.

Thanks for the posting.

_________________
Image


Top
 Profile  
 
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 6 posts ] 

All times are UTC + 1 hour


Who is online

Users browsing this forum: No registered users and 2 guests


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum

Search for:
Jump to:  

 


Powered by phpBB © 2008 phpBB Group
subSilver+ theme by Canver Software, sponsor Sanal Modifiye