dynamic memory allocation concerns

Posted: Sun Dec 26, 2004 12:19 pm
by SurreaL
Hello!

I'm in the design phase for a project which will require support for any number of dynamically created objects (structures). Inside these structures will be strings. I'm trying to think of the best way to do this in PB. As far as I can see, my options are a) PB's linked list commands, and b) manual memory allocation (using either PB's AllocateMemory or the WinAPI's heap functions).

I sort of like the linked list idea, but with my recent discovery of the bug described here, I am hesitant to use them (are there any other known bugs with linked lists?). I may also need nested linked lists (i.e. lists inside of lists), and AFAIK this is currently unsupported by the native PB commands.

So this brings about some questions about rolling my own code to dynamically allocate an ever-changing number of structures. I figure it can be done by doing something like:
*newStruct = AllocateMemory (SizeOf(myStruct))
However this brings some questions to mind. Namely, what will this do to structures that have strings in them? The size of a struct that has a string is constant (since regardless of the string's contents, the struct itself will only contain a pointer to the string buffer), however the amount of memory actually used by the string is prone to change (i.e. if the string is modified).

What I am wondering is whether FreeMemory (*newStruct) will take this into account and de-allocate the string's buffer as well? Or will it simply de-allocate the amount of memory equal to the size of the structure? (Leaving the string's contents somewhere in memory.. i.e. a memory leak)

And finally, is there any reason why PB's memory functions would not suffice for this? Would the WinAPI's heap functions offer any difference/advantage?

Thanks in advance for any help you PB gurus can offer :)

Re: dynamic memory allocation concerns

Posted: Sun Dec 26, 2004 1:22 pm
by tinman
SurreaL wrote: Or will it simply de-allocate the amount of memory equal to the size of the structure? (Leaving the string's contents somewhere in memory.. ie a memory leak)
That is what will happen.
SurreaL wrote: And finally, is there any reason why PB's memory functions would not suffice for this? Would the WinAPI's heap functions offer any difference/advantage?
Because you are only allocating memory. PureBasic doesn't know what you are going to store in it, and it can't infer anything from the size you pass in, which says nothing about the structure's layout.

WinAPI will not help here.
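
To make the point concrete, here is a minimal sketch. The structure name demo and its fields are hypothetical, made up for this illustration. It assumes the block returned by AllocateMemory starts out zero-filled, which the test code later in this thread also relies on. It only shows that the structure's size never changes, so the size passed to AllocateMemory tells PB nothing about any string buffers hanging off the structure:

```purebasic
; Hypothetical structure used only for this illustration.
Structure demo
  id.l
  name.s   ; stored in the structure as a pointer to a separate string buffer
EndStructure

*p.demo = AllocateMemory(SizeOf(demo))
If *p
  Debug "SizeOf before assignment: " + Str(SizeOf(demo))
  *p\name = Space(60000)   ; the string buffer is allocated elsewhere
  Debug "SizeOf after assignment:  " + Str(SizeOf(demo))   ; same value as before
  Debug "Length of the string:     " + Str(Len(*p\name))
  FreeMemory(*p)           ; releases only the SizeOf(demo) bytes, not the string buffer
EndIf
```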

Posted: Sun Dec 26, 2004 1:31 pm
by freedimension
I wonder if emptying a string's content before deleting the structure will help, i.e. whether the string buffer gets trimmed or only the \0 character gets moved.

Code:

a()\str = ""
DeleteElement(a())
One could also try to deallocate the string's buffer manually using its pointer?

Posted: Sun Dec 26, 2004 7:35 pm
by SurreaL
Ok well I decided to write up a little test program to see exactly what would happen.. and it turns out my original fears were correct. (In other words, you were bang on, tinman!)

I also did this to try freedimension's suggestion, and see if there was any way I could free the string myself (by resetting it to an empty string before de-allocating the string's owner structure). It appears however that this was also in vain, as there was still memory which seemed to be tied up somewhere after everything was said and done.

The code which I used was this:

Code:

Structure mystruct
  st1.s
  st2.s
  st3.s
  st4.s
  st5.s
  st6.s
  st7.s
  st8.s
  st9.s
  st10.s
EndStructure

Procedure debugmemory ()
  Static lastval
  memstat.MemoryStatus
  GlobalMemoryStatus_(memstat)
  newval = memstat\dwAvailPhys
  diff = newval - lastval
  lastval = newval
  ProcedureReturn diff
EndProcedure

Debug "size of structure: " + Str(SizeOf (mystruct))

Debug "Initial Memory free: " + Str(debugmemory ())

*eh.mystruct
*eh = AllocateMemory (SizeOf (mystruct))

*eh\st1 = Space (60000)
*eh\st2 = Space (60000)
*eh\st3 = Space (60000)
*eh\st4 = Space (60000)
*eh\st5 = Space (60000)
*eh\st6 = Space (60000)
*eh\st7 = Space (60000)
*eh\st8 = Space (60000)
*eh\st9 = Space (60000)
*eh\st10 = Space (60000)

Debug "Memory difference after allocating 10 strings of 60,000 bytes: " + Str(debugmemory ())

;*eh\st1 = ""
;*eh\st2 = ""
;*eh\st3 = ""
;*eh\st4 = ""
;*eh\st5 = ""
;*eh\st6 = ""
;*eh\st7 = ""
;*eh\st8 = ""
;*eh\st9 = ""
;*eh\st10 = ""
FreeMemory(*eh)

Debug "Memory reclaimed by freeing strings and de-allocating structure: " + Str(debugmemory ())
Basically it contains a simple function that uses the WinAPI's GlobalMemoryStatus to check available physical memory and keep a running tally of the changes. The program allocates a struct with 10 strings, makes the strings fairly large (60,000 bytes each), and then tries to free the structure. As you can see by running the code, the strings seem to take up roughly 606,208 bytes after being created, and if not reset to "" they still remain in memory after the struct is freed.

If you uncomment the lines which reset the strings, they seem to free up around 573,440 bytes, which is close, but obviously not all :/ (I had to run the code several times to see which were the lowest numbers being reported, since memory use is something that fluctuates quite often in Windows!)

I guess this means that to do what I want, I will have to do something similar to C++ and manually allocate arrays of characters, and basically end up rolling my own string functions. Or try linked lists again, and see if they exhibit similar behaviour or are a bit more well-behaved. I remember seeing something in the version history about strings in linked lists being fixed, so maybe this is the exact issue they were addressing. I will probably write a similar test prog later to find out :)

Posted: Mon Dec 27, 2004 1:34 pm
by tinman
I posted code on the forums somewhere which shows you how to free a string using inline ASM. Use the search or check out Andre's code archive at http://www.purearea.net

Posted: Mon Dec 27, 2004 2:20 pm
by SurreaL
hm.. Well, Tinman, I believe I've found your original post, however it seems that this 'nasty hack' as you put it may not work anymore :/ At least.. when I tried it with this example code, it shows that somewhere around 0 memory is being freed:

Code:

;manual structure memory allocation/memory leak test #3
Structure STR
  StructureUnion
    String.s
    Pointer.l
  EndStructureUnion
EndStructure 

Structure mystruct
  st1.s
  st2.s
  st3.s
  st4.s
  st5.s
  st6.s
  st7.s
  st8.s
  st9.s
  st10.s
EndStructure

Procedure debugmemory ()
  Static lastval
  memstat.MemoryStatus
  GlobalMemoryStatus_(memstat)
  newval = memstat\dwAvailPhys
  diff = newval - lastval
  lastval = newval
  ProcedureReturn diff
EndProcedure

Procedure FreeString(*free_me.STR) 
  !   MOV     edx, dword [esp+0] 
  !   MOV     edx, dword [edx] 
  !   CALL    SYS_FreeString 
  *free_me\Pointer = 0 
EndProcedure 

Debug "Initial Memory free: " + Str(debugmemory ())

*eh.mystruct
*eh = AllocateMemory (SizeOf (mystruct))

*eh\st1 = Space (60000)
*eh\st2 = Space (60000)
*eh\st3 = Space (60000)
*eh\st4 = Space (60000)
*eh\st5 = Space (60000)
*eh\st6 = Space (60000)
*eh\st7 = Space (60000)
*eh\st8 = Space (60000)
*eh\st9 = Space (60000)
*eh\st10 = Space (60000)

Debug "Memory difference after allocating 10 strings of 60,000 bytes: " + Str(debugmemory ())

FreeString(@*eh\st1)
FreeString(@*eh\st2)
FreeString(@*eh\st3)
FreeString(@*eh\st4)
FreeString(@*eh\st5)
FreeString(@*eh\st6)
FreeString(@*eh\st7)
FreeString(@*eh\st8)
FreeString(@*eh\st9)
FreeString(@*eh\st10)
FreeMemory(*eh)

Debug "Memory reclaimed by freeing strings and de-allocating structure: " + Str(debugmemory ())
I had to rename the "STRING" structure from your example to "STR", as it seems "STRING" is already defined. Anyhow, fortunately it seems that the latest PB update's bug fix for strings in structures in linked lists is perfectly relevant, as this code seems to prove:

Code:

;manual structure memory allocation/memory leak test #2
;allocate structure using PB's linked list instead of AllocateMemory
Structure mystruct
  st1.s
  st2.s
  st3.s
  st4.s
  st5.s
  st6.s
  st7.s
  st8.s
  st9.s
  st10.s
EndStructure

Procedure debugmemory ()
  Static lastval
  memstat.MemoryStatus
  GlobalMemoryStatus_(memstat)
  newval = memstat\dwAvailPhys
  diff = newval - lastval
  lastval = newval
  ProcedureReturn diff
EndProcedure

NewList list.mystruct()
AddElement (list())
Debug "Initial Memory free: " + Str(debugmemory ())

*eh.mystruct
*eh = list()

*eh\st1 = Space (60000)
*eh\st2 = Space (60000)
*eh\st3 = Space (60000)
*eh\st4 = Space (60000)
*eh\st5 = Space (60000)
*eh\st6 = Space (60000)
*eh\st7 = Space (60000)
*eh\st8 = Space (60000)
*eh\st9 = Space (60000)
*eh\st10 = Space (60000)

Debug "Memory difference after allocating 10 strings of 60,000 bytes: " + Str(debugmemory ())

DeleteElement(list())

Debug "Memory reclaimed by freeing strings and de-allocating structure: " + Str(debugmemory ())
It seems my best bet will be to just use PB's linked lists, and design around the inherent flaws they present.

Thank you very much for your help though tinman :) I really have to learn ASM sometime so I can take advantage of it in languages such as PB which allow you to inline it.. and, well, naturally your code brings a few more questions to mind..! It would indeed be useful for other purposes to be able to manually free a string, so I would like to see if it's possible to get this hack working.

Code:

Procedure FreeString(*free_me.STR)
  !   MOV     edx, dword [esp+0]
  !   MOV     edx, dword [edx]
  !   CALL    SYS_FreeString
  *free_me\Pointer = 0
EndProcedure 
I can sort of see what the code is trying to accomplish, except I'm a little confused at how it seems you are writing a variable to the edx register, only to overwrite it in the next instruction with its own contents? Or is EDX used to store the parameter for the SYS_FreeString call? (which is perhaps what is failing) Also.. why would you write [esp+0].. isn't that the same as just [esp]?

Would you mind taking the time to explain what this code is doing? :)

Posted: Tue Dec 28, 2004 1:17 pm
by tinman
SurreaL wrote: hm.. Well, Tinman, I believe I've found your original post, however it seems that this 'nasty hack' as you put it may not work anymore :/ At least.. when I tried it with this example code, it shows that somewhere around 0 memory is being freed:

Code:

FreeString(@*eh\st1)
Part of the problem is that you are using it wrongly (it probably would have helped if I had posted a proper explanation for the procedure :). You need to pass the address of the string pointer. When you use @*eh\st1, what PB actually gets is the address of the first character of the string. You either need to use the "STR" structure for your strings, or compute the address of the string pointer in some other way. For example:

Code:

FreeString(*eh + OffsetOf(mystruct, st1))
However, doing it this way also doesn't work correctly, as some memory is not freed. It allocates 602112 bytes of memory but only frees 598016.

It does at least fix the problem of almost no memory being reported as freed.

Code:

Procedure FreeString(*free_me.STR)
  !   MOV     edx, dword [esp+0]
  !   MOV     edx, dword [edx]
  !   CALL    SYS_FreeString
  *free_me\Pointer = 0
EndProcedure 
SurreaL wrote: I can sort of see what the code is trying to accomplish, except I'm a little confused at how it seems you are writing a variable to the edx register, only to overwrite it in the next instruction with its own contents? Or is EDX used to store the parameter for the SYS_FreeString call? (which is perhaps what is failing) Also.. why would you write [esp+0].. isn't that the same as just [esp]?
All parameters in PB are passed to procedures on the stack (referred to through esp). At the start of the procedure, esp will point to the last variable declared (or defined, or used) in the procedure. Above those are the parameters in order from first to last (from lowest memory address to highest, IIRC).

So that means that the parameter "*free_me.STR" will be at the address pointed to by esp. You can get this value by using the square bracket notation, so [esp] or [esp+0] - both are equivalent.

(Aside - there may be some stuff I've not taken into account such as whether the addition is always performed, whether it actually consumes any additional processor cycles, or whether the nasm/fasm optimises it out.)

The first line of ASM effectively reads the value of the *free_me parameter from the stack and puts it into edx.

Since this is actually the address of the string pointer (not the address of the string - first character), we need to get the string pointer from the address. Again, we use the square bracket notation to read the value at the address, and store it in edx again.

The PB internal function SYS_FreeString requires one parameter, the string pointer, in the edx register. So we can call it immediately.

After that the string pointer is set to 0 to indicate that there is no string there.
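
Putting the steps above together, the procedure can be annotated line by line as below. This is only a restatement of the explanation as comments, not a tested implementation. It relies on the undocumented internal SYS_FreeString, on 32-bit x86, and on the parameter sitting at [esp+0], all of which are assumptions tied to the PureBasic version discussed in this thread and may not hold elsewhere:

```purebasic
; Overlay a string field with a long so the raw string pointer can be reset.
Structure STR
  StructureUnion
    String.s
    Pointer.l
  EndStructureUnion
EndStructure

Procedure FreeString(*free_me.STR)
  ; Step 1: read the *free_me parameter from the stack.
  ; edx now holds the address OF the string pointer.
  !   MOV     edx, dword [esp+0]
  ; Step 2: de-reference that address.
  ; edx now holds the string pointer itself (address of the first character).
  !   MOV     edx, dword [edx]
  ; Step 3: SYS_FreeString (PB internal) takes the string pointer in edx.
  !   CALL    SYS_FreeString
  ; Step 4: mark the field as holding no string any more.
  *free_me\Pointer = 0
EndProcedure
```

The caller has to pass the address of the field inside the structure, as in the OffsetOf example earlier in this post, rather than @*eh\st1, which yields the address of the first character.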

Hope that helps.

Posted: Wed Dec 29, 2004 8:08 am
by SurreaL
hey hey.. what do you know, it works :D

I suppose the square brackets [] in ASM merely de-reference a memory address to its contents.. well now at least I understand what your code is doing a lot better :) (Although I'm curious how you found out about the name of an internal PB function!)

Also, in the post where this procedure originally appeared, you mention that it is "a nasty hack and no one should ever use it" (or something like that), and I'm wondering why you would say it shouldn't be used? It seems reliable enough, so long as the PB function 'SYS_FreeString' isn't renamed of course.. or is there something else I'm missing?
tinman wrote: However, doing it this way also doesn't work correctly as some memory is not freed. It allocates 602112 bytes of memory but only frees 598016.
Although I hate to argue with someone who has proven quite knowledgeable, I would venture that perhaps this code is working better than you thought.. I tried running it after your suggested fix, and it seems it does free all the memory which was originally allocated. It's just hard to tell because in between this happening, Windows occasionally allocates its own memory to other programs, which of course throws off the calculations this app is trying to do.. I haven't found any method to determine how much memory one specific program is using however, so I guess the global memory functions will have to suffice. Anyhow.. after several test runs I did occasionally see the same values being reported as being allocated and then de-allocated.. and so I'd like to think this code is indeed 'perfect' for the cause :)

Thanks again for all your help by the way :) This is indeed interesting.. I think I'm going to have to poke around with the assembly debugging options of PureBasic to see what other things PB does 'under the hood' if possible :) This has proven quite the learning experience!

(p.s. I love the perl script you're using for a signature!)

Posted: Wed Dec 29, 2004 1:58 pm
by tinman
SurreaL wrote: I suppose the square brackets [] in ASM merely de-reference a memory address to its contents.. well now at least I understand what your code is doing a lot better :) (Although I'm curious how you found out about the name of an internal PB function!)
Yep, that's what the square brackets do. The dword in front of them tells the assembler what size of data to read from the address.

I found the name of the function by writing a simple program which just allocated a string, compiled it with commented assembly. Handily enough, the code generated calls that function for all the strings it clears up at the end (IIRC).
SurreaL wrote: Also, in the post where this procedure originally appeared, you mention that it is "a nasty hack and no one should ever use it" (or something like that), and I'm wondering why you would say it shouldn't be used? It seems reliable enough, so long as the PB function 'SYS_FreeString' isn't renamed of course.. or is there something else I'm missing?
Nope, just for that reason. If something changes internally then it might or might not be possible to use this procedure. It might or might not be possible to find out how things work if they change. It's not cross platform. It would be far nicer if there was a way of freeing strings as a built-in command.
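
PureBasic releases much newer than the one discussed here do provide built-in commands for this: AllocateStructure() and FreeStructure(), plus ClearStructure() for structures living in memory obtained some other way. A minimal sketch, assuming a compiler version that has them (they are not available in the version this thread is about):

```purebasic
Structure mystruct
  st1.s
  st2.s
EndStructure

; AllocateStructure() is assumed to be available (newer PureBasic only).
*eh.mystruct = AllocateStructure(mystruct)
If *eh
  *eh\st1 = Space(60000)
  *eh\st2 = Space(60000)
  ; FreeStructure() releases the structure together with its string fields,
  ; so no SYS_FreeString hack is needed here.
  FreeStructure(*eh)
EndIf
```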
SurreaL wrote: memory functions will have to suffice. Anyhow.. after several test runs I did occasionally see the same values being reported as being allocated and then de-allocated.. and so I'd like to think this code is indeed 'perfect' for the cause :)
I had thought of that, and was testing with as little as possible running. I even changed the debug statements so they were only output at the end of the program (in case the listview in the debug window was causing something). I also tried to find some way of getting Windows to flush any memory, but I didn't find anything.

Edit: Turning the debugger off and displaying the output at the end in a messagerequester does show that the two values match up each time. So I guess it does work :)
SurreaL wrote: (p.s. I love the perl script you're using for a signature!)
The script itself sucks big time, but the effect is pretty good :)