Re: BSTR* fast dynamic string datatype
Posted: Sun Mar 19, 2017 5:17 am
by Keya
nco2k, btw here's the problem I'm trying to address with the structure. The main issue is that a .String isn't the address of a string, it's the address of a pointer to the string:
Code: Select all
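Define *mystr.String
*mystr = AllocateMemory(SizeOf(String)) ; room for the string pointer (4 bytes on x86, 8 on x64)
*mystr\s = "abc"
Debug Hex(@*mystr)   ; 4350F4 - address of the pointer variable
Debug Hex(*mystr)    ; 3A1E90 - address of the String structure; all three are dispersed, not consecutive
Debug Hex(@*mystr\s) ; 391EA8 - address of the string data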
So I can't just have "*bstr = [Size][StringData]" and return *bstr+4, because what sits at *bstr+4 is the string data itself, not a pointer to a string.
So for example this fails:
Code: Select all
Procedure.i BSTR()
  *bstr = AllocateMemory(512)
  Debug "Alloc @ " + Hex(*bstr)
  PokeL(*bstr, 4)                       ; size prefix
  PokeS(*bstr+4, "abcd", -1, #PB_Ascii) ; the format flag is the 4th parameter, after Length
  ProcedureReturn *bstr+4               ; address of the string data
EndProcedure
Define *mystr.String
*mystr = BSTR()
Debug "*mystr = " + Hex(*mystr)
Debug *mystr\s ; invalid: tries to read the string data as if it were the address
This works, but still not quite there:
Code: Select all
Structure BSTR
  bufaddr.i ; always points to @buf[0]
  size.l    ; string size
  buf.a[0]  ; string data
EndStructure
Procedure.i BSTR() ; create a 4-byte Bstr "abcd"
  *bstr.BSTR = AllocateMemory(512)
  *bstr\bufaddr = @*bstr\buf[0]
  *bstr\size = 4
  PokeS(*bstr\bufaddr, "abcd", -1, #PB_Ascii) ; write at buf[0] (a fixed +8 is only right on x86); the flag is the 4th parameter
  ProcedureReturn *bstr
EndProcedure
Define *mystr.String
*mystr = BSTR()
Debug "*mystr = " + Hex(*mystr)
Debug *mystr\s ; readable as-is in an ASCII build only, since the data was stored as ASCII
Trying to make the size and string consecutive now
Re: BSTR* fast dynamic string datatype
Posted: Sun Mar 19, 2017 5:26 am
by Keya
Think I've got it, everything is consecutive in memory now. It seems the only difference from a true BSTR is the addition of the address pointer before the size. Because I'm returning the address of that pointer instead of the address of the string (required for PB strings), the size is also retrieved differently, via PeekL(*bstr+SizeOf(Integer)); that is *bstr+4 on x86.
Code: Select all
Structure BSTR
  bufaddr.i ; always points to @buf[0] (as PB Strings need a pointer to the string, not the string address directly)
  size.l    ; string size
  buf.a[0]  ; string data
EndStructure
Procedure.i BSTR() ; create a 5-byte Bstr "abcde"
  *bstr.BSTR = AllocateMemory(SizeOf(BSTR) + 5 + 1) ; header + 5 data bytes + terminating null
  *bstr\bufaddr = @*bstr\buf[0]
  *bstr\size = 5
  PokeS(*bstr\bufaddr, "abcde", -1, #PB_Ascii) ; the format flag is the 4th parameter, after Length
  ProcedureReturn *bstr
EndProcedure
Define *mystr.String = BSTR()
Debug "Text="+*mystr\s ; reads correctly in an ASCII build, since the data is stored as ASCII
Debug "Size="+Str(PeekL(*mystr+SizeOf(Integer))) ; v1 - size sits right after the pointer (+4 on x86, +8 on x64)
Debug "Size="+Str(PeekL(PeekI(*mystr)-4)) ; v2 - to access it via "-4" it's 2x Peeks due to ptr-to-ptr
Because *mystr.String is a pointer-to-string-pointer and not a pointer-to-data, I can't quite envisage how a true BSTR could be constructed, but at the end of the day:
1) it's still accessible as a normal string via \s
2) the address returned is the address of the string (just like normal PB strings, and similar to a true BStr returning the address of the string)
3) PeekL() wouldn't normally be used to get the string size anyway - that's what bLen() is for
4) it's still a perfectly valid BStr structure if you give it the address @size, skipping the address variable
5) it's all in one memory allocation now (my first demo uses two, as the String was stored separately)
So I don't think this difference (having the pointer at the start) is particularly important, and as the pointer is the only difference, BSTR* has turned out to be a good name hehe
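A minimal sketch of how the [pointer][size][data] layout described above could be generalised to any text in a single allocation. The names FastStr and NewFastStr() are made up for illustration and do not appear in the posts above. It assumes the text is stored in the build's native string format (rather than forced ASCII) so that \s reads back correctly, and it is meant for read access only.

```purebasic
; Sketch only: FastStr and NewFastStr() are hypothetical names.
Structure FastStr
  bufaddr.i ; always points to @buf[0], so the block can be read through a .String pointer
  size.l    ; payload size in bytes, excluding the terminating null
  buf.a[0]  ; string data starts here
EndStructure

Procedure.i NewFastStr(Text.s)
  Protected bytes = StringByteLength(Text) ; byte length in the build's native format
  Protected *fs.FastStr = AllocateMemory(SizeOf(FastStr) + bytes + SizeOf(Character))
  If *fs
    *fs\bufaddr = @*fs\buf[0]
    *fs\size = bytes
    PokeS(*fs\bufaddr, Text) ; native format, also writes the terminating null
  EndIf
  ProcedureReturn *fs
EndProcedure

Define *s.String = NewFastStr("abcde")
If *s
  Debug *s\s                               ; read it like a normal PB string
  Debug PeekL(*s + OffsetOf(FastStr\size)) ; stored size in bytes
  FreeMemory(*s)
EndIf
```

Writing to *s\s should be avoided with a layout like this, since PB would then try to manage bufaddr as one of its own strings.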
Re: BSTR* fast dynamic string datatype
Posted: Sun Mar 19, 2017 6:58 am
by Keya
PB can convert strings to bstr with its Pseudotype support:
Code: Select all
Prototype.i protMakeBstr(bstr.p-bstr)
Procedure MakeBstr_(*bstr.String)
Debug "String=" + *bstr\s ;doesn't show, as it's a ptr to the data, not address of string ptr
ShowMemoryViewer(*bstr-4,30) ;but it is correct in memory - converted to unicode and stored with length @ -4
EndProcedure
MakeBstr.protMakeBstr = @MakeBstr_()
MakeBstr("test")
But this also demonstrates the BStr<>PB String incompatibility: a bstr is a pointer to the start of the data, not the address of a pointer to the start of the data.
So I think something like the solution in my previous post, which includes the additional pointer field, is unavoidable. It doesn't break the structure anyway: apart from the pointer at the start, the rest of the structure is a true BStr, and can therefore be accessed as such @ *bstr+Sizeof(Integer)+4
infratec, yes it seems bstr is always stored as unicode, regardless of ascii/unicode compile, so that would solve any Mid() issue I guess. I like the flexibility of offering Ascii as well though, for when it's known a priori that there'll only be Ascii chars and no utf8
Re: BSTR* fast dynamic string datatype
Posted: Sun Mar 19, 2017 1:37 pm
by mk-soft
@Keya
Very nice for fast string management, but it would be better to rename your BSTR to FastStr, then there is no discussion about the structures used.
BSTR is not the same as FastStr, and that's that!
I once wrote some code to create a BSTR manually.
BSTR is only needed under Windows and there are ready APIs.
Update for Keya
Code: Select all
; BSTR Functions; Created by mk-soft; Date 19.03.2017
; *****************************************************************************
Structure udtArrayChar
c.c[0]
EndStructure
Structure udtBStr;Align 4
len.l
str.udtArrayChar
EndStructure
; -----------------------------------------------------------------------------
Procedure CreateBStr(Text.s)
Protected *bstr.udtBStr, len
len = StringByteLength(Text)
*bstr = AllocateMemory(len + SizeOf(Long) + SizeOf(character))
*bstr\len = len
CopyMemory(@Text, @*bstr\str, len)
ProcedureReturn @*bstr\str
EndProcedure
; -----------------------------------------------------------------------------
Procedure FreeBStr(*Bstr)
Protected *mem = *Bstr - SizeOf(Long)
FreeMemory(*mem)
EndProcedure
; -----------------------------------------------------------------------------
Procedure ConcatBStr(T1, T2)
Protected *t1.udtBStr, *t2.udtBStr, *r1.udtBStr, len
*t1 = t1 - SizeOf(Long)
*t2 = t2 - SizeOf(Long)
If *t1\len And *t2\len
len = *t1\len + *t2\len
*r1 = AllocateMemory(len + SizeOf(Long) + SizeOf(character))
*r1\len = len
CopyMemory(@*t1\str, @*r1\str, *t1\len)
CopyMemory(@*t2\str, @*r1\str + *t1\len, *t2\len)
ElseIf *t1\len
len = MemorySize(*t1)
*r1 = AllocateMemory(len)
CopyMemory(*t1, *r1, len)
ElseIf *t2\len
len = MemorySize(*t2)
*r1 = AllocateMemory(len)
CopyMemory(*t2, *r1, len)
Else
*r1 = AllocateMemory(SizeOf(long) + SizeOf(character))
EndIf
ProcedureReturn @*r1\str
EndProcedure
; -----------------------------------------------------------------------------
Procedure _AddBStr(BStr, Text.s)
Protected *bstr.udtBStr, len, len2
*bstr = Bstr - SizeOf(Long)
len = StringByteLength(Text)
len2 = *bstr\len + len
*bstr = ReAllocateMemory(*bstr, len2 + SizeOf(Long) + SizeOf(character))
CopyMemory(@Text, @*bstr\str + *bstr\len, len)
*bstr\len = len2
ProcedureReturn @*bstr\str
EndProcedure
Macro AddBstr(BStr, Text)
BStr = _AddBStr(BStr, Text)
EndMacro
; -----------------------------------------------------------------------------
Procedure LenBStr(*Bstr)
Protected *mem.udtBStr = *Bstr - SizeOf(Long)
ProcedureReturn (*mem\len / SizeOf(character))
EndProcedure
; -----------------------------------------------------------------------------
Procedure LeftBStr(BStr, Lenght)
Protected *r1.udtBStr, *BStr.udtBStr, len
*BStr.udtBStr = Bstr - SizeOf(Long)
len = Lenght * SizeOf(character)
If len > *BStr\len
len = *BStr\len
EndIf
*r1 = AllocateMemory(len + SizeOf(Long) + SizeOf(character))
*r1\len = len
CopyMemory(@*BStr\str, @*r1\str, len)
ProcedureReturn @*r1\str
EndProcedure
; -----------------------------------------------------------------------------
Procedure RightBStr(BStr, Lenght)
Protected *r1.udtBStr, *BStr.udtBStr, len, pos
*BStr.udtBStr = Bstr - SizeOf(Long)
len = Lenght * SizeOf(character)
If len > *BStr\len
len = *BStr\len
EndIf
*r1 = AllocateMemory(len + SizeOf(Long) + SizeOf(character))
*r1\len = len
pos = *BStr\len - len
CopyMemory(@*BStr\str + Pos, @*r1\str, len)
ProcedureReturn @*r1\str
EndProcedure
; -----------------------------------------------------------------------------
Procedure MidBStr(BStr, Position, Lenght = 0)
Protected *r1.udtBStr, *BStr.udtBStr, len, ofs
*BStr.udtBStr = Bstr - SizeOf(Long)
ofs = (position - 1) * SizeOf(character)
len = Lenght * SizeOf(character)
Repeat
If ofs >= *BStr\len Or ofs < 0
*r1 = AllocateMemory(SizeOf(Long) + SizeOf(character))
Break
EndIf
If Not len
len = *BStr\len
EndIf
If ofs + len > *BStr\len
len = *BStr\len - ofs
EndIf
*r1 = AllocateMemory(len + SizeOf(Long) + SizeOf(character))
*r1\len = len
CopyMemory(@*BStr\str + ofs, @*r1\str, len)
Until #True
ProcedureReturn @*r1\str
EndProcedure
; -----------------------------------------------------------------------------
Procedure.s BStrString(*BStr)
Protected *value.String = @*Bstr
ProcedureReturn *value\s
EndProcedure
; -----------------------------------------------------------------------------
Procedure BStrVal(*BStr)
Protected *value.String = @*Bstr
ProcedureReturn Val(*value\s)
EndProcedure
; -----------------------------------------------------------------------------
Procedure.f BStrValF(*BStr)
Protected *value.String = @*Bstr
ProcedureReturn ValF(*value\s)
EndProcedure
; -----------------------------------------------------------------------------
Procedure.d BStrValD(*BStr)
Protected *value.String = @*Bstr
ProcedureReturn ValD(*value\s)
EndProcedure
; -----------------------------------------------------------------------------
; -----------------------------------------------------------------------------
; -----------------------------------------------------------------------------
;- Test
CompilerIf #PB_Compiler_Debugger
Define t1, t2, t3, r1
t1 = CreateBStr("Hello World")
t2 = CreateBStr(", Purebasic Power")
t3 = ConcatBStr(t1,t2)
AddBStr(t3, " !")
Debug LenBStr(t3)
Debug BStrString(t3)
r1 = LeftBStr(t3, 5)
Debug BStrString(r1)
FreeBStr(r1)
r1 = RightBStr(t3, 7)
Debug BStrString(r1)
FreeBStr(r1)
r1 = MidBStr(t3, 14, 9)
Debug BStrString(r1)
Debug LenBStr(r1)
FreeBStr(r1)
r1 =CreateBStr("12345.12345")
Debug BStrVal(r1)
Debug BStrValF(r1)
Debug BStrValD(r1)
FreeBStr(r1)
FreeBStr(t1)
FreeBStr(t2)
FreeBStr(t3)
CompilerElse
Define t1, append$ ;a BString and a normal String$
append$ = ""
Time1=ElapsedMilliseconds()
For i = 1 To 20000
append$ + "Append This"
Next i
Time2=ElapsedMilliseconds()
A$ = "Str$ Time=" + Str(Time2 - Time1) + ~"ms\n"
t1 = CreateBStr("")
Time1=ElapsedMilliseconds()
For i = 1 To 20000
AddBStr(t1, "Append This")
Next i
Time2=ElapsedMilliseconds()
A$ + "BSTR Time=" + Str(Time2 - Time1) + "ms"
MessageRequester("BSTR timings",A$)
;Str$ Time=7716ms
;BSTR Time=26ms
CompilerEndIf
P.S. Unicode only
Re: BSTR* fast dynamic string datatype
Posted: Sun Mar 19, 2017 6:31 pm
by Sicro
To speed up string expansion even more, you can reserve generous storage space up front so that the memory does not have to be extended on every append.
I also recently wrote a module for quickly expanding strings:
https://github.com/SicroAtGIT/PureBasic ... trings.pbi
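As a rough illustration of the "reserve generously" idea, here is a minimal sketch of an append buffer that doubles its capacity whenever it runs out of room, so most appends need no reallocation. StrBuf, StrBufNew() and StrBufAppend() are hypothetical names and are not taken from the linked module; the doubling policy is an assumption as well.

```purebasic
; Sketch only: StrBuf, StrBufNew() and StrBufAppend() are hypothetical names.
Structure StrBuf
  *mem       ; buffer holding the characters (native format, null-terminated)
  used.i     ; bytes used, excluding the terminating null
  capacity.i ; bytes allocated
EndStructure

Procedure StrBufNew(*sb.StrBuf, Reserve = 1024)
  *sb\mem = AllocateMemory(Reserve) ; zero-filled, so it starts as an empty string
  *sb\used = 0
  *sb\capacity = Reserve
  ProcedureReturn *sb\mem ; 0 if the allocation failed
EndProcedure

Procedure StrBufAppend(*sb.StrBuf, Text.s)
  Protected bytes = StringByteLength(Text)
  Protected needed = *sb\used + bytes + SizeOf(Character)
  Protected newCap, *new
  If needed > *sb\capacity
    newCap = *sb\capacity * 2 ; grow geometrically instead of per append
    If newCap < needed : newCap = needed : EndIf
    *new = ReAllocateMemory(*sb\mem, newCap)
    If *new = 0 : ProcedureReturn #False : EndIf
    *sb\mem = *new
    *sb\capacity = newCap
  EndIf
  PokeS(*sb\mem + *sb\used, Text) ; overwrites the old terminator, writes a new one
  *sb\used = *sb\used + bytes
  ProcedureReturn #True
EndProcedure

Define sb.StrBuf, i
If StrBufNew(@sb)
  For i = 1 To 20000
    StrBufAppend(@sb, "Append This")
  Next
  Debug Len(PeekS(sb\mem)) ; total length in characters
  FreeMemory(sb\mem)
EndIf
```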
Re: BSTR* fast dynamic string datatype
Posted: Sun Mar 19, 2017 8:58 pm
by Keya
In your demo, is there no way to access the strings except by using PeekS? I've done my best to keep mine compatible with \s so it can be treated directly as a PB string for read ops, i.e. mine are *mybstr.String's, not *mybstr.bstrstruct
mk-soft wrote:but it would be better to rename your BSTR to FastStr, then there is no discussion about the structures used.
But it IS a BStr. The only difference is the address pointer at the start; every byte from then on is identical to BStr and can be referenced as such very easily as TrueBStr = *bstr+Sizeof(Integer), while at the same time it's directly accessible as a PB string as *bstr\s
mk-soft wrote:BSTR is only needed under Windows and there are ready APIs.
wellll, fast string handling is also needed on Linux and Mac, where the same issues of C-style null-terminated strings apply equally. It doesn't necessarily need to be in the form of BStr though, it just seemed a good model to try

Re: BSTR* fast dynamic string datatype
Posted: Sun Mar 19, 2017 9:25 pm
by mk-soft
Unfortunately, this is not entirely true.
In a 'BSTR' the pointer points directly to the text, and the 4 bytes before that address hold the length.
In your version it is not a pointer to the text, but a pointer to the pointer to the text. Thus, it is not compatible with BSTR.
Code: Select all
t1 = bstr(0, "Hello World")
ShowMemoryViewer(t1 - 4, 32)
CallDebugger
t2 = SysAllocString_("Hello World")
ShowMemoryViewer(t2 - 4, 32)
But it is very fast

Re: BSTR* fast dynamic string datatype
Posted: Sun Mar 19, 2017 9:29 pm
by nco2k
ha, i was so tired yesterday that i accidentally replied to you in private instead of in public.
like explained in the link i posted earlier, a real bstr is [size.l][unicode string][null.w], and the pointer returned by a bstr function points to [unicode string] and not to [size.l]. i think a lot of the confusion here is due to the fact that you use two buffers, one for the info and one for the actual string, while a real bstr uses only one buffer, like in the example from mk-soft. only that a real bstr is always unicode and actually binary safe, hence the name binary-string.
when you write ABC[0]DEF[0], purebasic will stop after hitting the first null and return only ABC[0], while a real bstr could return ABC[0]DEF[0]. a bstr doesn't search for null, it reads the [size.l] value and copies everything of that size, whether it contains null or not. that's why they are so fast. purebasic strings are not binary safe, so if you want proper bstr handling, you would have to use CompareMemory() etc. but the point of this thread is not really to re-create bstr handling, but to make pb string handling faster.
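To make the binary-safe point concrete, here is a small sketch that compares two length-counted payloads with CompareMemory() and contrasts that with a null-terminated read via PeekS(). BlobEqual() is a hypothetical helper name, and the payloads use the build's native character size rather than a real BSTR allocation.

```purebasic
; Sketch only: BlobEqual() is a hypothetical helper.
Procedure.i BlobEqual(*a, lenA, *b, lenB)
  If lenA <> lenB : ProcedureReturn #False : EndIf
  If lenA = 0 : ProcedureReturn #True : EndIf
  ProcedureReturn CompareMemory(*a, *b, lenA) ; compares every byte, embedded nulls included
EndProcedure

; Two 7-character payloads with an embedded null: ABC[0]DEF and ABC[0]XYZ
Define bytes = 7 * SizeOf(Character)
Define *x = AllocateMemory(bytes + SizeOf(Character))
Define *y = AllocateMemory(bytes + SizeOf(Character))
PokeS(*x, "ABC") : PokeS(*x + 4 * SizeOf(Character), "DEF")
PokeS(*y, "ABC") : PokeS(*y + 4 * SizeOf(Character), "XYZ")

Debug Bool(PeekS(*x) = PeekS(*y))      ; null-terminated view: both stop at the first null
Debug BlobEqual(*x, bytes, *y, bytes)  ; length-counted view: sees the bytes after the null

FreeMemory(*x)
FreeMemory(*y)
```

The null-terminated comparison treats the two payloads as equal, while the length-counted comparison reports them as different.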
and yes, .String needs a pointer to a pointer. that's why you have to either carry an additional variable in your structure, or simply use .String in the functions that require it:
Code: Select all
String$ = "ABCDEF"
*Memory = AllocateMemory(StringByteLength(String$) + SizeOf(Character))
PokeS(*Memory, String$)
Procedure$ MyLeft(*MyStr, Length)
Protected *String.String = @*MyStr
ProcedureReturn Left(*String\s, Length)
EndProcedure
Debug MyLeft(*Memory, 3)
c ya,
nco2k
Re: BSTR* fast dynamic string datatype
Posted: Sun Mar 19, 2017 9:39 pm
by Keya
mk-soft wrote:Unfortunately, this is not entirely true. In a 'BSTR' the pointer points directly to the text, and the 4 bytes before that address hold the length. In your version it is not a pointer to the text, but a pointer to the pointer to the text. Thus, it is not compatible with BSTR.
Please read again what I said: every byte from then on (after the pointer) is identical to BStr, and can be referenced as such very easily as TrueBStr = *bstr+Sizeof(Integer), while at the same time it's directly accessible as a PB string as *bstr\s. The only difference is the pointer at the start, which can be ignored when using it as a BStr, but its presence makes it fully compatible as a PB string while also being fully BStr compatible. Its structure is literally [pointer][True BStr]
Re: BSTR* fast dynamic string datatype
Posted: Sun Mar 19, 2017 10:18 pm
by mk-soft
What nco2k writes is correct.
Perhaps take another look at my code to see how to get by without a double AllocateMemory.
I have updated my version in the previous post for this.
I think your idea is very good.
On Windows, the API should be used for BSTR when you want to work with external functions or DLLs that need a BSTR.

Re: BSTR* fast dynamic string datatype
Posted: Mon Mar 20, 2017 2:01 am
by Keya
mk-soft wrote:Perhaps take another look at my code to see how to get by without a double AllocateMemory.
Sorry if I'm misunderstanding you, but while I did 2 allocations in my first post, in my last example I'm only doing 1 allocation - everything is exactly the same as a true BStr apart from a pointer at the start of it

And again, to use it as a true Bstr the address is simply *bstr+Sizeof(Integer), as opposed to referencing it as a string with *bstr\s - hopefully the best of both worlds. But is the BSTR format the best approach for this? I don't know, but if anything it has the advantage of Windows compatibility, and that doesn't really seem to come at any particular cost
Re: BSTR* fast dynamic string datatype
Posted: Sun Jun 25, 2017 8:54 am
by Fig
I think we should go with the "fast string" idea and drop the Bstr.
Because, as was pointed out, the bstr APIs to create them are not ios/linux compatible (and accessing them with regular PB functions will ignore anything after a chr(0)).
That said, we no longer need to care about compatibility with real Bstr.
On Windows, we can write a conversion procedure to turn them into real Bstr if needed.
The structure looks very good and should be adopted.
FastString\string
FastString\size
If everybody agrees, why not start from that and write efficient functions in asm?
It will become the new standard for PB fast strings.
Re: BSTR* fast dynamic string datatype
Posted: Sun Jun 25, 2017 9:31 am
by Mijikai
Code: Select all
Structure POWERBASIC_Str
Ptr.l
Size.l
;StringBuffer
EndStructure
Recently I looked at some PowerBASIC code which seems to do exactly that ->
http://www.purebasic.fr/english/viewtop ... 13&t=68613
I think it's a fast & smart way to deal with strings.
(there's also some code that shows how I deal with the memory allocation...)