Tip: Creating ProperCase strings

Share your advanced PureBasic knowledge/code with the community.
blueznl
PureBasic Expert
Posts: 6166
Joined: Sat May 17, 2003 11:31 am

Post by blueznl »

what does the vb code do to the following string:

"*t*h*is is *a t*est *oh* yeah"

oh, i managed to save a little more on the old code, there was one redundant line in there:

Code: Select all

Procedure.s x_propercase(s.s) 
  Protected *p.l, f.l, b.l
  ;
  ; *** make all lowercase except for the first chars of each word
  ;
  *p = @s 
  f = 1 
  b = PeekB(*p) 
  While b <> 0 
    If b = 32 
      f = 1 
    ElseIf f = 1 And b >= 97 And b<=122 
      PokeB(*p,b & $DF) 
      f = 0 
    ElseIf f = 0 And b >= 65 And b <= 90 
      PokeB(*p,b | $20) 
    Else 
      f = 0 
    EndIf 
    *p = *p+1
    b = PeekB(*p) 
  Wend 
  ProcedureReturn s 
EndProcedure 
( PB6.00 LTS Win11 x64 Asrock AB350 Pro4 Ryzen 5 3600 32GB GTX1060 6GB)
( The path to enlightenment and the PureBasic Survival Guide right here... )
ebs
Enthusiast
Posts: 557
Joined: Fri Apr 25, 2003 11:08 pm

Post by ebs »

blueznl wrote: what does the vb code do to the following string:

"*t*h*is is *a t*est *oh* yeah"

StrConv("*t*h*is is *a t*est *oh* yeah", vbProperCase) produces

*t*h*is Is *a T*est *oh* Yeah

Eric
blueznl
PureBasic Expert
Posts: 6166
Joined: Sat May 17, 2003 11:31 am

Post by blueznl »

ebs, pb, one of you two is wrong as your answers are conflicting... what is now the proper output?!?
oldefoxx
Enthusiast
Posts: 532
Joined: Fri Jul 25, 2003 11:24 pm

Non-Alpha Leader Approach

Post by oldefoxx »

I went ahead and adapted the code so that any non-alpha leader forces
the following alpha character to UPPER case and all other alpha characters to lower case, and I wrote it in ASM for speed.

Code: Select all


;Note that you have to enable Inline ASM Support in Compiler/Compiler Options
;for this code to compile correctly.

;Note that statements in ASM (Assembler) that refer to line labels have a "l_"
;(small L with underscore) added in front when referenced by an ASM instruction,
;and the characters must be in lower case.  The actual line labels can be in
;mixed case, as illustrated in this code.


Procedure.s x_propercase(s.s)  ;convert each group of letters to Proper case (first letter upper, rest lower)
    MOV eax, [esp]              ;string pointer on stack (pointed to by esp) into EAX reg      
    XOR dl, dl                  ;clear garbage for DL
  Cycle:                        ;return point to repeat for each character in string   
    MOV dh,dl                   ;save the last processed char in DH 
    MOV dl,[eax]                ;get the next character to process in DL
    TEST dl,dl                  ;set flags against same register to check value           
    JZ l_endstring              ;character is zero (Null), so exit process
    AND dx,$DFDF                ;get rid of lower case flag in DH and DL registers
    CMP dl, 'Z'                 ;compare the value in DL with 90 (ascii code for 'Z')  
    JA l_not_alpha              ;if above 'Z', it is not an Alpha character
    CMP dl, 'A'                 ;compare the value in DL with 65 (ascii code for 'A')
    JB l_not_alpha              ;if below 'A', it is not an Alpha character
    CMP dh, 'A'                 ;compare last character with 65 (ascii code for 'A')
    JB l_high                   ;if below 'A', we keep the current letter in UPPER case  
    CMP dh, 'Z'                 ;compare last character with 90 (ascii code for 'Z')
    JA l_high                   ;if above 'Z', we keep the current letter in UPPER case  
  Low:                          ;otherwise, we have two or more alpha characters in a row
    OR dl, $20                  ;and we force the current letter to lower case  
  High:                         ;and this is where the UPPER case letters merge in again  
    MOV [eax], dl               ;so that we can store the correct case letter back
  Not_Alpha:                    ;or we skipped if current character not an Alpha character
    INC eax                     ;we increment the string pointer for s.s to next character 
    JMP l_cycle                 ;and jump back to repeat for the next character  
  EndString: 
  ProcedureReturn s           ;we make sure the changes are returned. 
EndProcedure 

OpenConsole()
ConsoleColor(15,1)
PrintN(x_propercase("tHIS iS a tESt of pROper *C*a(se)s."))
While Inkey()=""
Wend

I don't think that there are many more treatments left for this issue.
We now have examples in PB and in ASM for doing pretty much the
same thing, but some routines only capitalize immediately after a
space, and others (such as this example), capitalize immediately after
any non-alpha character.
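
The difference between the two rules can be sketched in plain PB as one routine with a switch. This is only an illustration under stated assumptions: ProperCaseRule and its afterAnyNonAlpha parameter are made-up names, only the ASCII letters A-Z/a-z are handled, and the pointer steps by SizeOf(Character) so it should not depend on the character width of the build.

```purebasic
; Hypothetical helper, not taken from any post in this thread.
; afterAnyNonAlpha = #False : capitalize only after a space (space rule)
; afterAnyNonAlpha = #True  : capitalize after any non-letter (non-alpha rule)
; Assumption: ASCII letters only.
Procedure.s ProperCaseRule(s.s, afterAnyNonAlpha.i)
  Protected *p.Character = @s
  Protected newWord.i = #True

  While *p\c
    If (*p\c >= 'a' And *p\c <= 'z') Or (*p\c >= 'A' And *p\c <= 'Z')
      If newWord
        *p\c = *p\c & $DF     ; clear bit 5 -> upper case
      Else
        *p\c = *p\c | $20     ; set bit 5 -> lower case
      EndIf
      newWord = #False
    ElseIf afterAnyNonAlpha Or *p\c = ' '
      newWord = #True         ; next letter starts a word
    Else
      newWord = #False        ; space rule: '*' and friends do not start a word
    EndIf
    *p + SizeOf(Character)
  Wend

  ProcedureReturn s
EndProcedure

Debug ProperCaseRule("*t*h*is is *a t*est *oh* yeah", #False)  ; *t*h*is Is *a T*est *oh* Yeah
Debug ProperCaseRule("*t*h*is is *a t*est *oh* yeah", #True)   ; *T*H*Is Is *A T*Est *Oh* Yeah
```

With the switch off, the result matches the StrConv output quoted earlier in the thread; with it on, every letter that follows a non-letter is capitalized.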

I want to thank the contributors that gave some ASM examples, as these were great for helping me gain some insights as to how I can interface PB with ASM code.
has-been wanna-be (You may not agree with what I say, but it will make you think).
WolfgangS
Enthusiast
Posts: 174
Joined: Fri Apr 25, 2003 3:30 pm

Post by WolfgangS »

blueznl wrote: (wrote a fairly good centipede clone called multibug for the vic20... without an assembler! if anybody has a copy of that game i would feel *very* obliged, lost my own copy though i have a vic20 standing here since a few months)
Uh, I didn't find it... where and when did you publish the game?

Regards,
WolfgangS
WolfgangS' projects http://www.schliess.net
PB
PureBasic Expert
Posts: 7581
Joined: Fri Apr 25, 2003 5:24 pm

Post by PB »

> hey pb, you're changing the rules!

Hehehe, no I'm not, I'm just emulating VB as I originally stated. :)
PB
PureBasic Expert
Posts: 7581
Joined: Fri Apr 25, 2003 5:24 pm

Re: Sorry about the Typos

Post by PB »

> there is no absolute rule for capitalization - and it often depends upon
> the context you are trying to apply it to.

I was applying it to emulating VB's command, and nothing more. ;)

> The best way to make this determination in code is that if a letter
> follows another letter, it does not get capitalized. In all other cases
> it is capitalized.

That's exactly what my procedure does. :)
PB
PureBasic Expert
Posts: 7581
Joined: Fri Apr 25, 2003 5:24 pm

Post by PB »

> ebs, pb, one of you two is wrong as your answers are conflicting...
> what is now the proper output?!?

Hmm, well *t*h*is isn't really a word, is it? :) It's just some letters
separated by asterisks. As for VB's proper output, I was only going
by its official description in its docs, which states that vbProperCase
will "capitalize the first letter of every word".
blueznl
PureBasic Expert
Posts: 6166
Joined: Sat May 17, 2003 11:31 am

Post by blueznl »

wolfgang, that was a *long* time ago, it was distributed via a dutch chain, i think it was 'foto kral', at the speed stuff was copied at that time it might have ended up anywhere, i just feel somewhat sad i lost it :-)
blueznl
PureBasic Expert
Posts: 6166
Joined: Sat May 17, 2003 11:31 am

Post by blueznl »

pb, ebs, anybody: yeah, we've pretty much chewed up this one :-)

let's do another one, any suggestions?
Frarth
Enthusiast
Posts: 241
Joined: Tue Jul 21, 2009 11:11 am
Location: On the planet

Re: Tip: Creating ProperCase strings

Post by Frarth »

While searching for ProperCase examples, this thread turns up as one of the first results. But it is OOOOLD. Today, when the input is non-ASCII, the byte-based examples don't work.

Here is a rewrite of blueznl's example that does work. Any improvements are welcome.

Code: Select all

structure TCharacter
  c.c[0]
endstructure

procedure.s ProperCase(text.s)
  protected *ptr.TCharacter
  protected.l c, f, i, length
  
  *ptr = @text
  length = MemoryStringLength(*ptr) - 1
  
  f = 1
  for i = 0 to length
    if *ptr\c[i] = 32
      f = 1
    elseif f = 1 and *ptr\c[i] >= 97 and *ptr\c[i] <= 122
      *ptr\c[i] = (*ptr\c[i] & $DF)
      f = 0
    elseif f = 0 and *ptr\c[i] >= 65 and *ptr\c[i] <= 90
      *ptr\c[i] = (*ptr\c[i] | $20)
    else
      f = 0
    endif
  next
  
  procedurereturn text
endprocedure
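
Why the byte-based versions break can be checked with a few lines. This is a sketch that assumes a Unicode executable, where each character takes two bytes (as far as I know, the only mode since PureBasic 5.50); in an ASCII build the third Debug line would show 98 instead of 0.

```purebasic
; Assumption: Unicode build on a little-endian CPU (x86/x64),
; so 'a' is stored in memory as the bytes $61 $00.
Define s.s = "ab"

Debug SizeOf(Character)              ; bytes per character in this build
Debug PeekB(@s)                      ; 97, the low byte of 'a'
Debug PeekB(@s + 1)                  ; 0 in a Unicode build: a byte loop stops here
Debug PeekC(@s + SizeOf(Character))  ; 98, 'b' read at full character width
```

A loop of the form While b <> 0 that advances one byte at a time therefore ends after the first character, which is why the rewrite above indexes c.c instead of peeking bytes.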
PureBasic 5.41 LTS | Xubuntu 16.04 (x32) | Windows 7 (x64)
acreis
Enthusiast
Posts: 204
Joined: Fri Jun 01, 2012 12:20 am

Re: Tip: Creating ProperCase strings

Post by acreis »

European languages (windows):

Code: Select all

Structure TCharacter
  c.c[0]
EndStructure

Procedure.s ProperCase(text.s)
  Protected *ptr.TCharacter
  Protected.l c, f, i, length
  
  *ptr = @text
  length = MemoryStringLength(*ptr) - 1
  
  f = 1
  For i = 0 To length
    If *ptr\c[i] = 32
      f = 1
    ElseIf f = 1 
      If *ptr\c[i] >= 97 And *ptr\c[i] <= 122
        *ptr\c[i] = (*ptr\c[i] & $DF)
      ElseIf *ptr\c[i] >= 127
        *ptr\c[i] = CharUpper_(*ptr\c[i])
      EndIf  
      f = 0
    Else                                  ; f = 0: inside a word, force lower case
      If *ptr\c[i] >= 65 And *ptr\c[i] <= 90
        *ptr\c[i] = (*ptr\c[i] | $20)
      ElseIf *ptr\c[i] >= 127
        *ptr\c[i] = CharLower_(*ptr\c[i])
      EndIf  
    EndIf
  Next
  
  ProcedureReturn text
EndProcedure

Debug ProperCase("maria ágÁta")
wilbert
PureBasic Expert
Posts: 3942
Joined: Sun Aug 08, 2004 5:21 am
Location: Netherlands

Re: Tip: Creating ProperCase strings

Post by wilbert »

Frarth wrote: While searching for ProperCase examples this thread turns up as one of the first.
Some forum threads use 'capitalize string' instead of 'ProperCase' so if you search for 'capitalize string' you might find different examples.
Windows (x64)
Raspberry Pi OS (Arm64)
skywalk
Addict
Posts: 4211
Joined: Wed Dec 23, 2009 10:14 pm
Location: Boston, MA

Re: Tip: Creating ProperCase strings

Post by skywalk »

Code: Select all

Procedure$ SF_TitleCase(s$)
  ; skywalk modified from luis, Little John
  ;   http://www.forums.purebasic.com/english/viewtopic.php?p=370491&sid=c211be8dff9e7412095071dd1feee541#p370491
  ; Capitalize 1st letter of each word found.
  ; Proper case capitalizes 1st letter of 1st word.
  ; Words defined with the following delimiters:
  ;   space , ! " # $ % & ' ( ) * + - . /
  ;   tab
  ;   : ; < = > ? @
  ;   [ \ ] ^ _ `
  Protected *p.Character = @s$
  Protected.i newWord = 1
  While *p\c
    If newWord
      *p\c = Asc(UCase(Chr(*p\c)))
      newWord = 0
    Else
      *p\c = Asc(LCase(Chr(*p\c)))
    EndIf
    If *p\c > 31 And *p\c < 48      ; space , ! " # $ % & ' ( ) * + - . /
      newWord = 1
    ElseIf *p\c = #TAB
      newWord = 2
    ElseIf *p\c > 57 And *p\c < 65  ; : ; < = > ? @
      newWord = 3
    ElseIf *p\c > 90 And *p\c < 97  ; [ \ ] ^ _ `
      newWord = 4
    EndIf
    *p + SizeOf(Character)
  Wend
  ProcedureReturn s$
EndProcedure
Debug SF_TitleCase("  hmm, w::w w[we are one]   we,  went to neW,"+Chr(9)+"yorK toDay. to buy       sausages  ")
The nice thing about standards is there are so many to choose from. ~ Andrew Tanenbaum
Frarth
Enthusiast
Posts: 241
Joined: Tue Jul 21, 2009 11:11 am
Location: On the planet

Re: Tip: Creating ProperCase strings

Post by Frarth »

acreis wrote:European languages (windows):

I'm not doing much programming in Windows, but are CharUpper_ and CharLower_ not the same as UCase and LCase, or do the latter only support character codes <= 127?
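
One way to sidestep the question on non-Windows systems is to let UCase()/LCase() do the per-character work, as skywalk's routine above does. A minimal sketch, assuming UCase()/LCase() handle accented letters on the target platform (worth checking in the PureBasic help for your version) and that the source file is saved as UTF-8; ProperCasePortable is a hypothetical name:

```purebasic
; Hypothetical portable variant: no Windows API calls, space is the only delimiter.
; Assumption: UCase()/LCase() convert accented letters on this platform.
Procedure.s ProperCasePortable(text.s)
  Protected *p.Character = @text
  Protected newWord.i = #True

  While *p\c
    If *p\c = ' '
      newWord = #True                     ; next character starts a word
    ElseIf newWord
      *p\c = Asc(UCase(Chr(*p\c)))        ; first character of a word
      newWord = #False
    Else
      *p\c = Asc(LCase(Chr(*p\c)))        ; rest of the word
    EndIf
    *p + SizeOf(Character)
  Wend

  ProcedureReturn text
EndProcedure

Debug ProperCasePortable("maria ágÁta")
```

Converting through Chr()/Asc() for every character is slower than the bit masks, so this trades speed for not having to special-case codes above 127.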