Posted: Mon Apr 12, 2004 6:04 pm
by El_Choni
EDIT: after looking at Pupil's modified asm code, faster than my first attempt, I've modified it a bit (hope you don't mind):
Code: Select all
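Procedure.s x_propercase(s.s)
MOV eax, [esp] ; eax = address of the string (first parameter)
MOV dl, 32 ; dl = previous character, start as if a space came first
Check1:
MOV cl, [eax] ; cl = current character
TEST cl, cl
JZ l_done ; stop at the terminating zero
CMP dl, 32
JNE l_check2 ; previous character was not a space
MOV dl, cl
CMP dl, 'a'
JL l_checkend
CMP dl, 'z'
JG l_checkend
SUB dl, 32 ; lowercase letter after a space: make it uppercase
MOV [eax], dl
Check2:
MOV dl, cl
CMP dl, 'A'
JL l_checkend
CMP dl, 'Z'
JG l_checkend
ADD dl, 32 ; uppercase letter inside a word: make it lowercase
MOV [eax], dl
CheckEnd:
INC eax ; advance to the next character
JMP l_check1
Done:
ProcedureReturn s
EndProcedure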
Posted: Mon Apr 12, 2004 6:47 pm
by blueznl
who's gonna benchmark all of this?
hey elchoni, i tried it this way, but found that if i changed the (sequence of the) logic and refrained from using the struct it was a few ticks faster (but not as readable), although i might have messed around a little too much and thus actually made it slower...
looks like a competition... who's next to invent the wheel?

Posted: Mon Apr 12, 2004 6:52 pm
by El_Choni
LOL, this is fun

But my code is based on Pupil's, I'm cheating XD
Posted: Mon Apr 12, 2004 7:12 pm
by Kris_a
Code: Select all
Procedure.s x_propercase1(s.s)
MOV eax, [esp]
MOV dl, 32
Check1:
MOV cl, [eax]
TEST cl, cl
JZ l_done
CMP dl, 32
JNE l_check2
MOV dl, cl
CMP dl, 'a'
JL l_checkend
CMP dl, 'z'
JG l_checkend
SUB dl, 32
MOV [eax], dl
Check2:
MOV dl, cl
CMP dl, 'A'
JL l_checkend
CMP dl, 'Z'
JG l_checkend
ADD dl, 32
MOV [eax], dl
CheckEnd:
INC eax
JMP l_check1
Done:
ProcedureReturn s
EndProcedure
Procedure.s x_propercase2(s.s)
*p = @s
f = 1
b = PeekB(*p)
While b <> 0
If b = 32
f = 1
ElseIf f = 1 And b >= 97 And b<=122
PokeB(*p,b & $DF)
f = 0
ElseIf f = 0 And b >= 65 And b <= 90
PokeB(*p,b | $20)
f = 0
Else
f = 0
EndIf
*p = *p+1
b = PeekB(*p)
Wend
ProcedureReturn s
EndProcedure
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
#numtests = 5
#numloops= 100000
output.s = ""
string.s = "this is the string to be tested (52 characters long)"
time = GetTickCount_()
For a = 1 To #numtests
For b = 1 To #numloops
x_propercase1(string)
Next
Next
avg.f = (GetTickCount_()-time)/#numtests
Output + "El Choni's code: "+StrF(avg)+"ms"+Chr(13)+Chr(10)
time = GetTickCount_()
For a = 1 To #numtests
For b = 1 To #numloops
x_propercase2(string)
Next
Next
avg.f = (GetTickCount_()-time)/#numtests
Output + "Blueznl's code: "+StrF(avg)+"ms"+Chr(13)+Chr(10)
MessageRequester("Results",output)
El Choni's code: 100.0000000ms
Blueznl's code: 156.1999997ms
It'd be great if someone were willing to benchmark with 1000+ test repetitions; then we'd have a really accurate result.
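A side note on why more repetitions help: GetTickCount_() typically only advances in steps of roughly 10 to 16 ms, so a handful of short runs is dominated by timer granularity. The idea of averaging over many runs with a finer clock can be sketched as follows. This is Python, not PureBasic, `average_ms` is a made-up helper name, and the numbers it prints say nothing about the routines above; it only illustrates the shape of the harness.

```python
import time


def average_ms(func, arg, tests=1000, loops=1000):
    """Mean wall-clock time in milliseconds of one test (= `loops` calls),
    averaged over `tests` tests."""
    start = time.perf_counter()  # high-resolution monotonic clock
    for _ in range(tests):
        for _ in range(loops):
            func(arg)
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / tests


sample = "this is the string to be tested (52 characters long)"
# str.title is only a stand-in for the routine under test.
print(f"stand-in routine: {average_ms(str.title, sample, tests=10, loops=1000):.3f} ms per test")
```

Timing the whole batch once and dividing afterwards keeps the clock reads out of the inner loop.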
Another Variation
Posted: Mon Apr 12, 2004 7:30 pm
by oldefoxx
Another 6502 coder, eh? I used to hate the fact that there was no instruction for directly exchanging the X and Y index registers. You had to push a, move x to a, push a, move y to a, move a to x, pop a, move a to y, and pop a for that simple operation (you did not have the same addressing modes available from both index registers). I did write my own assembler for the PET computer with 8K of RAM, which left you about 500 bytes for assembler code.
I made some small changes to the best example above. I considered doing it in Assembler, but the fact is, FASM is still a bit of a mystery to me -- you have to use simple variables to interface to it from PB, and that means extra work up front. Besides, I'm not comfortable with the addressing modes of the old architecture (DS, ES, SI, and DI) as opposed to the extended addressing modes currently offered for the expanded memory in modern computers. Most examples found in Assembler coding books only deal with the old 8086 architecture.
I think it would be great if PB started a new forum for language extensions, to which we could submit examples and vie to come up with the best version. These could then be bundled into libraries for use in various projects. There might be less demand for major compiler updates - after all, C and C++ are mostly comprised of library extensions.
Code: Select all
Procedure.s y_propercase(s.s)
*p=@s
b.b=32
Repeat
a.b=b
b=PeekB(*p)
If b=0
Goto exitrepeat
EndIf
c.b=b & 223
If c>= 65 And c<=90
If a=32
PokeB(*p,c)
Else
PokeB(*p,c | 32)
EndIf
EndIf
*p+1
ForEver
exitrepeat:
ProcedureReturn s
EndProcedure
Posted: Mon Apr 12, 2004 7:35 pm
by Num3
Athlon XP 2000+ (1666 MHz)
Number of tests: 1000
El Choni: 94.093
Blueznl: 140.391
Posted: Mon Apr 12, 2004 8:31 pm
by blueznl
oldefoxx: oh yeah, indexed indirect addressing and indirect indexed addressing, iirc... those were the days
LDA ($10,X)
LDA ($10),Y
or something along those lines
the 65102 / 6510 (vague on the numbers) had a few extra statements that helped here, my good old atari 600 xl still has the upgrade on board

(and 256 kb paged in multiple 16 kb banks, amazing)
an additional forum on assembly might be interesting though, it's not tips and tricks, and not beginners either... well, for beginners in asm it is
hey num3, i don't feel half bad about those results, to be honest

Posted: Mon Apr 12, 2004 9:51 pm
by dell_jockey
references to 65xx ASM here without even mentioning zero-page addressing?

I just had to add my bit... (pun intended)
Posted: Tue Apr 13, 2004 3:36 am
by PB
Hehehe, nice to see all the responses to my original code, but you can't leave some words with lowercase first letters. In other words, you can't just check for a space and letter -- you need to ensure every word starts with a capital letter. Try this with your variations -- they should return "This *Is* A (Test)" if done correctly:
Code: Select all
Debug ProperCase("thIs *IS* a (test)")
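To see how the variations posted so far fare on this test string, here is a rough sketch of the rule they share: a letter is capitalized only when the previous character is a space, and lowercased otherwise. It is written in Python rather than PureBasic so it can be run standalone, and `propercase_space_only` is a made-up name for illustration.

```python
def propercase_space_only(text: str) -> str:
    """Uppercase an ASCII letter only if the previous character is a space."""
    out = []
    prev = " "  # treat the start of the string like a space
    for ch in text:
        if ch.isascii() and ch.isalpha():
            ch = ch.upper() if prev == " " else ch.lower()
        out.append(ch)
        prev = ch
    return "".join(out)


print(propercase_space_only("thIs *IS* a (test)"))  # prints: This *is* A (test)
```

The words that start after `*` or `(` stay lowercase, which is exactly the case the test string is meant to catch.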
Not Necessarily So
Posted: Tue Apr 13, 2004 5:44 am
by oldefoxx
Proper Casing is not as intelligent as you make it out to be. True, you may have to consider such things as tabs in addition to spaces, and the Question of hyphens, but your He He comes too soon, since there are many circumstances where additional rules need to apply,
"This is a Test". would be considered corredt. But "thiis Is A Test". would not (leading quote, and we do not normally capitalize incidental words such as the, is, and, and so on).
McQuire and MacNally are generally spelled this way, not Mcquire and Macnally. o'Conner would not normally become O'conner. dBase III is another voilation.
But the fact is, the so-call "proper case" function works on very simple syntatical rules, and none of the implementations will meet all needs or requirements. COBOL is usually in all capitals, since it is an acronum, whereas Fortran is usually in this manner.
My own name, Darden, has various spellings, as it is an old name, and one of these is d'Arden -- are you planning on including a rule for it, in case you ever encounter it again?
My feeling is that the best recourse is to follow the generally accepted rules, such as they are, since with experience you will know what to expect when you apply this function in one language or another.
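The examples in this post can be checked with a small sketch. It is in Python rather than PureBasic, and it assumes a naive rule (capitalize each space-separated word with `str.capitalize`). The `EXCEPTIONS` table is purely hypothetical: it only holds the words mentioned above and is one possible shape for "additional rules", not something any of the posted routines does.

```python
# Hypothetical exception table, filled only with examples from the post.
EXCEPTIONS = {
    "mcquire": "McQuire",
    "macnally": "MacNally",
    "dbase": "dBase",
    "iii": "III",
    "cobol": "COBOL",
}


def naive_propercase(text: str) -> str:
    """First letter of every space-separated word up, everything else down."""
    return " ".join(word.capitalize() for word in text.split(" "))


def propercase_with_exceptions(text: str) -> str:
    """Same rule, but a word found in EXCEPTIONS keeps its special spelling."""
    return " ".join(
        EXCEPTIONS.get(word.lower(), word.capitalize())
        for word in text.split(" ")
    )


for sample in ("McQuire", "o'Conner", "dBase III", "COBOL"):
    print(f"{sample!r} -> {naive_propercase(sample)!r}")
print(propercase_with_exceptions("dbase iii"))
```

Every such table is incomplete by nature, which is the trade-off the post is pointing at.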
Re: Not Necessarily So
Posted: Tue Apr 13, 2004 6:05 am
by PB
> your He He comes too soon, since there are many circumstances where
> additional rules need to apply
Sorry, but I beg to differ -- and here's why: you have to take into account why I created the tip. When I wrote this tip, I stated that its task was, and I quote, "emulate Visual Basic's vbProperCase flag to create strings with capital letters on each word". Therefore, any "additional rules" such as hyphens, nouns, names, etc, don't actually apply because Visual Basic doesn't apply them either. I simply emulated Visual Basic exactly, and nothing more.

Re: Not Necessarily So
Posted: Tue Apr 13, 2004 6:25 am
by Fangbeast
oldefoxx wrote:
corredt.
"thiis".
voilation.
syntatical
acronum
Brother!! Where have you been all my life??? You spell just like I do:):):)
(Okay, okay, so I have waaay too much time on my hands)
Posted: Tue Apr 13, 2004 11:35 am
by blueznl
hey pb, you're changing the rules!
what would it make of "dit*is*een*test"?
Posted: Tue Apr 13, 2004 12:21 pm
by dell_jockey
blueznl wrote:
what would it make of "dit*is*een*test"?
"Dit*Was*Een*Test"

Sorry about the Typos
Posted: Tue Apr 13, 2004 5:12 pm
by oldefoxx
I was really bushed last night, and though I noted that the fingers of one hand sometimes got ahead of the other, I was too tired to care.
The point I was trying to make is that there is no absolute rule for capitalization - it often depends upon the context you are applying it to.
Yes, following Visual Basic's capitalization rules makes sense. That way you get the same results. But those results would not be pleasing to everyone.
The best way to make this determination in code is that if a letter follows another letter, it does not get capitalized. In all other cases it is capitalized.
This code will follow that general rule. Writing too specific a set of rules will cause the overall process to be a lot slower.
Code: Select all
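Procedure.s x_propercase(s.s)
*p=@s
c.b=32 ; pretend a space precedes the string, so the first letter is capitalized
Repeat
a.b=c ; a = previous character with bit 5 cleared
b.b=PeekB(*p)
If b=0
Goto exitrepeat
EndIf
c.b=b & 223 ; clear bit 5: maps a-z onto A-Z
If c>= 65 And c<=90 ; current character is a letter
If a>=65 And a<=90 ; previous character was a letter too
PokeB(*p,c | 32) ; inside a word: lowercase
Else
PokeB(*p,c) ; start of a word: uppercase
EndIf
EndIf
*p+1
ForEver
exitrepeat:
ProcedureReturn s
EndProcedure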
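The rule described above (a letter that follows another letter is lowercased, any other letter is capitalized) can also be sketched outside PureBasic. The version below is Python, works on ASCII bytes, and uses the same `& 223` and `| 32` bit tricks. `propercase_bytes` is a name made up for this illustration.

```python
def propercase_bytes(data: bytes) -> bytes:
    """Capitalize every ASCII letter that does not follow another letter."""
    buf = bytearray(data)
    prev_is_letter = False
    for i, b in enumerate(buf):
        upper = b & 223                 # clear bit 5: 'a'..'z' -> 'A'..'Z'
        is_letter = 65 <= upper <= 90   # true only for ASCII letters
        if is_letter:
            # inside a word: set bit 5 (lowercase); at a word start: keep uppercase
            buf[i] = (upper | 32) if prev_is_letter else upper
        prev_is_letter = is_letter
    return bytes(buf)


print(propercase_bytes(b"thIs *IS* a (test)").decode("ascii"))  # prints: This *Is* A (Test)
print(propercase_bytes(b"dit*is*een*test").decode("ascii"))     # prints: Dit*Is*Een*Test
```

Because the check is "previous byte was a letter" rather than "previous byte was a space", words starting after `*`, `(` or a tab are capitalized as well.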