Posted: Thu Apr 15, 2004 8:55 am
by blueznl
what does the vb code do to the following string:

"*t*h*is is *a t*est *oh* yeah"

oh, i managed to save a little more on the old code, there was one redundant line in there:

Code: Select all

Procedure.s x_propercase(s.s) 
  Protected *p.l, f.l, b.l
  ;
  ; *** make all lowercase except for the first chars of each word
  ;
  *p = @s 
  f = 1 
  b = PeekB(*p) 
  While b <> 0 
    If b = 32 
      f = 1 
    ElseIf f = 1 And b >= 97 And b<=122 
      PokeB(*p,b & $DF) 
      f = 0 
    ElseIf f = 0 And b >= 65 And b <= 90 
      PokeB(*p,b | $20) 
    Else 
      f = 0 
    EndIf 
    *p = *p+1
    b = PeekB(*p) 
  Wend 
  ProcedureReturn s 
EndProcedure 

Posted: Thu Apr 15, 2004 1:45 pm
by ebs
blueznl wrote: what does the vb code do to the following string:

"*t*h*is is *a t*est *oh* yeah"

StrConv("*t*h*is is *a t*est *oh* yeah", vbProperCase) produces

*t*h*is Is *a T*est *oh* Yeah

Eric

Posted: Thu Apr 15, 2004 3:03 pm
by blueznl
ebs, pb, one of you two is wrong as your answers are conflicting... what is now the proper output?!?

Non-Alpha Leader Approach

Posted: Thu Apr 15, 2004 9:52 pm
by oldefoxx
I went ahead and adapted the code so that any non-alpha leader forces the
following alpha character to UPPER case and all other alpha characters to lower case, and I wrote it in ASM for speed.

Code: Select all


;Note that you have to enable Inline ASM Support under Compiler/Compiler Options
;for this code to compile correctly.

;Note that ASM (Assembler) instructions that refer to line labels need "l_"
;(small L with underscore) added in front of the label name, and the name
;must be written in lower case.  The actual line labels can be in mixed case,
;as illustrated in this code.


Procedure.s x_propercase(s.s)  ;convert each group of letters to Proper Case form
    MOV eax, [esp]              ;string pointer on stack (pointed to by esp) into EAX reg      
    XOR dl, dl                  ;clear garbage for DL
  Cycle:                        ;return point to repeat for each character in string   
    MOV dh,dl                   ;save the last processed char in DH 
    MOV dl,[eax]                ;get the next character to process in DL
    TEST dl,dl                  ;set flags against same register to check value           
    JZ l_endstring              ;character is zero (Null), so exit process
    AND dx,$DFDF                ;get rid of lower case flag in DH and DL registers
    CMP dl, 'Z'                 ;compare the value in DL with 90 (ascii code for 'Z')  
    JA l_not_alpha              ;if above 'Z', it is not an Alpha character
    CMP dl, 'A'                 ;compare the value in DL with 65 (ascii code for 'A')
    JB l_not_alpha              ;if below 'A', it is not an Alpha character
    CMP dh, 'A'                 ;compare last character with 65 (ascii code for 'A')
    JB l_high                   ;if below 'A', we keep the current letter in UPPER case  
    CMP dh, 'Z'                 ;compare last character with 90 (ascii code for 'Z')
    JA l_high                   ;if above 'Z', we keep the current letter in UPPER case  
  Low:                          ;otherwise, we have two or more alpha characters in a row
    OR dl, $20                  ;and we force the current letter to lower case  
  High:                         ;and this is where the UPPER case letters merge in again  
    MOV [eax], dl               ;so that we can store the correct case letter back
  Not_Alpha:                    ;or we skipped if current character not an Alpha character
    INC eax                     ;we increment the string pointer for s.s to next character 
    JMP l_cycle                 ;and jump back to repeat for the next character  
  EndString: 
  ProcedureReturn s           ;we make sure the changes are returned. 
EndProcedure 

OpenConsole()
ConsoleColor(15,1)
PrintN(x_propercase("tHIS iS a tESt of pROper *C*a(se)s."))
While Inkey()=""
Wend

I don't think that there are many more treatments left for this issue.
We now have examples in PB and in ASM for doing pretty much the
same thing, but some routines only capitalize immediately after a
space, and others (such as this example), capitalize immediately after
any non-alpha character.
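
The two rules can be put side by side in plain PB. This is only a sketch: ProperCaseBy() and its afterAnyNonAlpha flag are made-up names for this illustration, and like the byte-based routines in this thread it only treats the ASCII letters A-Z and a-z as alpha.

```purebasic
; Sketch only: ProperCaseBy() and afterAnyNonAlpha are hypothetical names.
; afterAnyNonAlpha = #False : a new word starts only after a space (code 32)
; afterAnyNonAlpha = #True  : a new word starts after any non-alpha character
Procedure.s ProperCaseBy(text.s, afterAnyNonAlpha.i)
  Protected result.s, ch.s
  Protected.i i, code, startWord

  startWord = #True                      ; the first letter counts as a word start
  For i = 1 To Len(text)
    ch = Mid(text, i, 1)
    code = Asc(ch)

    If (code >= 65 And code <= 90) Or (code >= 97 And code <= 122)
      ; ASCII letter: upper case at a word start, lower case otherwise
      If startWord
        result + UCase(ch)
      Else
        result + LCase(ch)
      EndIf
      startWord = #False
    Else
      ; not a letter: copy it unchanged and decide whether a new word starts
      result + ch
      If afterAnyNonAlpha Or code = 32
        startWord = #True
      Else
        startWord = #False
      EndIf
    EndIf
  Next

  ProcedureReturn result
EndProcedure

Debug ProperCaseBy("*t*h*is is *a t*est *oh* yeah", #False) ; *t*h*is Is *a T*est *oh* Yeah
Debug ProperCaseBy("*t*h*is is *a t*est *oh* yeah", #True)  ; *T*H*Is Is *A T*Est *Oh* Yeah
```

With the flag off, the result matches what ebs reported for VB's StrConv. With the flag on, it follows the "letter after a non-letter" rule of the ASM version.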

I want to thank the contributors that gave some ASM examples, as these were great for helping me gain some insights as to how I can interface PB with ASM code.

Posted: Thu Apr 15, 2004 11:57 pm
by WolfgangS
blueznl wrote: (wrote a fairly good centipede clone called multibug for the vic20... without an assembler! if anybody has a copy of that game i would feel *very* obliged, lost my own copy though i have a vic20 standing here since a few months)
Uh, I didn't find it... where and when did you publish the game?

MFG
WolfgangS

Posted: Fri Apr 16, 2004 3:07 am
by PB
> hey pb, you're changing the rules!

Hehehe, no I'm not, I'm just emulating VB as I originally stated. :)

Re: Sorry about the Typos

Posted: Fri Apr 16, 2004 3:11 am
by PB
> there is no absolute rule for capitalization - and it often depends upon
> the context you are trying to apply it to.

I was applying it to emulating VB's command, and nothing more. ;)

> The best way to make this determination in code is that if a letter
> follows another letter, it does not get capitalized. In all other cases
> it is capitalized.

That's exactly what my procedure does. :)

Posted: Fri Apr 16, 2004 3:17 am
by PB
> ebs, pb, one of you two is wrong as your answers are conflicting...
> what is now the proper output?!?

Hmm, well *t*h*is isn't really a word, is it? :) It's just some letters
separated by asterisks. As for VB's proper output, I was only going
by its official description in its docs, which states that vbProperCase
will "capitalize the first letter of every word".

Posted: Fri Apr 16, 2004 10:38 am
by blueznl
wolfgang, that was a *long* time ago, it was distributed via a dutch chain, i think it was 'foto kral', at the speed stuff was copied at that time it might have ended up anywhere, i just feel somewhat sad i lost it :-)

Posted: Fri Apr 16, 2004 10:41 am
by blueznl
pb, ebs, anybody: yeah, we've pretty much chewed up this one :-)

let's do another one, any suggestions?

Re: Tip: Creating ProperCase strings

Posted: Mon Jun 13, 2016 12:40 pm
by Frarth
While searching for ProperCase examples, this thread turns up as one of the first results. But it is OOOOLD. Today, when the input is non-ASCII, the byte-based examples don't work.

Here is a rewrite of blueznl's example that does work. Any improvements are welcome.

Code: Select all

structure TCharacter
  c.c[0]
endstructure

procedure.s ProperCase(text.s)
  protected *ptr.TCharacter
  protected.l c, f, i, length
  
  *ptr = @text
  length = MemoryStringLength(*ptr) - 1
  
  f = 1
  for i = 0 to length
    if *ptr\c[i] = 32
      f = 1
    elseif f = 1 and *ptr\c[i] >= 97 and *ptr\c[i] <= 122
      *ptr\c[i] = (*ptr\c[i] & $DF)
      f = 0
    elseif f = 0 and *ptr\c[i] >= 65 and *ptr\c[i] <= 90
      *ptr\c[i] = (*ptr\c[i] | $20)
    else
      f = 0
    endif
  next
  
  procedurereturn text
endprocedure
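
A likely reason the byte-based versions stop working, assuming the program is compiled as Unicode so that each character takes two bytes: stepping through the string one byte at a time hits a zero byte right after the first ASCII character. A quick check, as a sketch only:

```purebasic
; Sketch: look at the raw bytes behind a short ASCII string.
; Assumption: a Unicode build on a little-endian CPU (x86/x64).
Define s.s = "ab"

Debug SizeOf(Character)  ; 2 in a Unicode build, 1 in an ASCII build
Debug PeekB(@s)          ; 97, the low byte of 'a'
Debug PeekB(@s + 1)      ; 0 in a Unicode build (98 in an ASCII build)
```

In that case a loop like While b <> 0 ends after the first character, which is why the rewrite above reads whole characters through the .c type instead.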

Re: Tip: Creating ProperCase strings

Posted: Mon Jun 13, 2016 3:00 pm
by acreis
European languages (windows):

Code: Select all

Structure TCharacter
  c.c[0]
EndStructure

Procedure.s ProperCase(text.s)
  Protected *ptr.TCharacter
  Protected.l c, f, i, length
  
  *ptr = @text
  length = MemoryStringLength(*ptr) - 1
  
  f = 1
  For i = 0 To length
    If *ptr\c[i] = 32
      f = 1
    ElseIf f = 1 
      If *ptr\c[i] >= 97 And *ptr\c[i] <= 122
        *ptr\c[i] = (*ptr\c[i] & $DF)
      ElseIf *ptr\c[i] >= 127
        *ptr\c[i] = CharUpper_(*ptr\c[i])
      EndIf  
      f = 0
    ElseIf f = 0 
      If *ptr\c[i] >= 65 And *ptr\c[i] <= 90
        *ptr\c[i] = (*ptr\c[i] | $20)
      ElseIf *ptr\c[i] >= 127
        *ptr\c[i] = CharLower_(*ptr\c[i])
      EndIf  
    Else
      f = 0
    EndIf
  Next
  
  ProcedureReturn text
EndProcedure

Debug ProperCase("maria ágÁta")

Re: Tip: Creating ProperCase strings

Posted: Mon Jun 13, 2016 3:16 pm
by wilbert
Frarth wrote: While searching for ProperCase examples this thread turns up as one of the first.
Some forum threads use 'capitalize string' instead of 'ProperCase', so if you search for 'capitalize string' you might find different examples.

Re: Tip: Creating ProperCase strings

Posted: Mon Jun 13, 2016 3:42 pm
by skywalk

Code: Select all

Procedure$ SF_TitleCase(s$)
  ; skywalk modified from luis, Little John
  ;   http://www.forums.purebasic.com/english/viewtopic.php?p=370491&sid=c211be8dff9e7412095071dd1feee541#p370491
  ; Capitalize 1st letter of each word found.
  ; Proper case capitalizes 1st letter of 1st word.
  ; Words defined with the following delimiters:
  ;   space , ! " # $ % & ' ( ) * + - . /
  ;   tab
  ;   : ; < = > ? @
  ;   [ \ ] ^ _ `
  Protected *p.Character = @s$
  Protected.i newWord = 1
  While *p\c
    If newWord
      *p\c = Asc(UCase(Chr(*p\c)))
      newWord = 0
    Else
      *p\c = Asc(LCase(Chr(*p\c)))
    EndIf
    If *p\c > 31 And *p\c < 48      ; space , ! " # $ % & ' ( ) * + - . /
      newWord = 1
    ElseIf *p\c = #TAB
      newWord = 2
    ElseIf *p\c > 57 And *p\c < 65  ; : ; < = > ? @
      newWord = 3
    ElseIf *p\c > 90 And *p\c < 97  ; [ \ ] ^ _ `
      newWord = 4
    EndIf
    *p + SizeOf(Character)
  Wend
  ProcedureReturn s$
EndProcedure
Debug SF_TitleCase("  hmm, w::w w[we are one]   we,  went to neW,"+Chr(9)+"yorK toDay. to buy       sausages  ")

Re: Tip: Creating ProperCase strings

Posted: Mon Jun 13, 2016 8:37 pm
by Frarth
acreis wrote:European languages (windows):

I'm not doing much programming in Windows, but are CharUpper_ and CharLower_ not the same as UCase and LCase, or do the latter only support characters <= 127?
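
One way to try this without the Windows API is to run the same space-delimited rule through UCase() and LCase() one character at a time, the way skywalk's routine does. A sketch only: ProperCasePortable is a made-up name, and whether accented letters are converted depends on what UCase()/LCase() support in your PB version, so check the output rather than taking it for granted.

```purebasic
; Sketch only: ProperCasePortable is a hypothetical name.
; Same rule as the space-delimited versions above, but the case conversion
; is left to the built-in UCase()/LCase() instead of CharUpper_/CharLower_.
Procedure.s ProperCasePortable(text.s)
  Protected *p.Character = @text
  Protected.i startWord

  startWord = #True                      ; the first character starts a word
  While *p\c
    If *p\c = 32                         ; a space starts a new word
      startWord = #True
    ElseIf startWord
      *p\c = Asc(UCase(Chr(*p\c)))       ; first character of a word
      startWord = #False
    Else
      *p\c = Asc(LCase(Chr(*p\c)))       ; rest of the word
    EndIf
    *p + SizeOf(Character)               ; step one character, not one byte
  Wend

  ProcedureReturn text
EndProcedure

Debug ProperCasePortable("maria ágÁta") ; expect Maria Ágáta if UCase()/LCase() handle accented letters
```

If the accented letters come back unchanged, the built-in functions are limited on that build and the API calls are still needed on Windows.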