Debugging tips – Part 1: Heap corruption

If you are wondering why this blog has been silent for so long, it's because we are in the beta phase of 4.30, and writing about bugfixing and documentation work is just boring.

I want to talk a bit about the most common problems I have seen on the forums where the PureBasic debugger is not of much help, and which therefore leave most users quite confused. The usual reaction is to post it as a bug report. So in the future, if somebody posts a bug report about AllocateMemory(), you can point them to this article and it should explain everything 🙂

Symptoms:
  • A crash at AllocateMemory() or FreeMemory() even though the given input is valid.
  • A crash on a simple string assignment.
  • Modifying the code in seemingly unrelated places makes the problem go away.

Reason:

First of all: AllocateMemory() is never the problem. It's a direct wrapper around the HeapAlloc() API function and also a heavily used function. If it had a bug, we would know by now. What you have gotten yourself into here is what is called heap corruption: you destroyed part of the data Windows uses to manage allocated memory.

When Windows allocates memory, it keeps a data structure to manage the allocation (usually 12 bytes on 32bit Windows). This data structure lives in normal, writable memory, which means that you will not get any access error when accidentally writing over it. It is not specified anywhere where this data is kept, but it is a fact that it sometimes ends up right after your allocated memory buffer. Now, if you happen to write past the end of this buffer by just a few bytes, you destroy the heap structure without getting any error. The crash only happens later, when another attempt to allocate or free memory causes Windows to examine this piece of heap data and crash due to an invalid pointer. The fact that the cause of the problem and the actual crash are in different places is what makes this kind of bug so hard to debug.

Why does this not always happen when you write past the end of a buffer?

Getting the heap data right at the end of the allocated buffer is a very rare condition. Windows often rounds the allocated buffer size up to page boundaries or for alignment purposes, so you often get more memory than you asked for, which makes a small overwrite have no effect. Another scenario is that the memory after your allocated buffer is simply marked as invalid, in which case you get an error when trying to write to it. Which of these scenarios actually happens depends on the sequence of memory allocations done by your program. This means that if you comment out another, totally unrelated part of the program, you may change the allocation sequence and the problem seemingly disappears. (Note that in this case only the symptom disappears, not the problem of writing past the end of a buffer.)
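
To make the kind of bug described here concrete, here is a minimal and deliberately broken sketch. The buffer size, offset and value are made up for illustration, and whether this particular overwrite hits heap data at all depends on the allocation sequence as explained above.

```purebasic
; Deliberately broken: the buffer is 16 bytes, so valid offsets are 0 to 15.
*Buffer = AllocateMemory(16)
If *Buffer
  ; Writes a 4-byte long at offsets 16 to 19, just past the end of the buffer.
  ; This usually raises no error at this point.
  PokeL(*Buffer + 16, $12345678)

  ; If heap management data was hit, a crash can show up here or at some
  ; later, unrelated allocation instead of at the PokeL() line.
  FreeMemory(*Buffer)
EndIf
```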

Solution:

As we saw above, the problem is not where the crash is. Fortunately, Windows provides a function to check whether the heap structures are still valid: HeapValidate(). Below is a piece of code that uses it to check the most used memory heaps in PureBasic (this works in PureBasic for Windows, both 32bit and 64bit):

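; Validates the heaps PureBasic uses for strings and for AllocateMemory().
; File$ and Line identify the call site and are shown if a heap is corrupted.
Procedure _TestHeaps(File$, Line)
  Protected StringHeap, MemoryBase, MemoryHeap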

  CompilerIf #PB_Compiler_Processor = #PB_Processor_x86
    !extrn _PB_StringHeap
    !extrn _PB_Memory_Heap         

    !mov eax, dword [_PB_StringHeap]
    !mov [p.v_StringHeap], eax
    !mov eax, dword [_PB_MemoryBase]
    !mov [p.v_MemoryBase], eax
    !mov eax, dword [_PB_Memory_Heap]
    !mov [p.v_MemoryHeap], eax
  CompilerElse
    !extrn PB_StringHeap
    !extrn PB_Memory_Heap

    !mov rax, qword [PB_StringHeap]
    !mov [p.v_StringHeap], rax
    !mov rax, qword [_PB_MemoryBase]
    !mov [p.v_MemoryBase], rax
    !mov rax, qword [PB_Memory_Heap]
    !mov [p.v_MemoryHeap], rax
  CompilerEndIf

  If HeapValidate_(StringHeap, 0, 0) = 0
    MessageRequester("StringHeap corrupted !", File$+" : "+Str(Line))
  EndIf

  If HeapValidate_(MemoryBase, 0, 0) = 0
    MessageRequester("MemoryBase heap corrupted !", File$+" : "+Str(Line))
  EndIf

  If HeapValidate_(MemoryHeap, 0, 0) = 0
    MessageRequester("AllocateMemory heap corrupted !", File$+" : "+Str(Line))
  EndIf
EndProcedure

Macro TestHeaps
  _TestHeaps(#PB_Compiler_File, #PB_Compiler_Line)
EndMacro 

Steps to find the bug:

  • Place a TestHeaps call right before the line that crashes. If you get one of the message requesters, you have a heap corruption. If not, then the problem is something else and the above code will not help.
  • Start placing TestHeaps calls in places that are executed before the crashing line. Start with just one call every so many lines and narrow it down later.
  • You need to find the line of code where TestHeaps reports nothing before it and reports a heap corruption after it. This is the line that causes all the problems.
  • Make sure you remove all this test code after fixing the bug, as it can have a big performance hit on the program (see below).
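
The narrowing-down step could look roughly like the sketch below. It assumes the _TestHeaps() procedure and the TestHeaps macro from above are included in the same source. The buffer and the overlong string are hypothetical stand-ins for the real culprit, and whether HeapValidate() actually catches this particular overwrite depends on the heap layout.

```purebasic
; Assumes the _TestHeaps() procedure and TestHeaps macro from above are included.
*Buffer = AllocateMemory(8)

TestHeaps   ; heap still intact, so no requester is expected here

; Hypothetical culprit: the string needs far more than 8 bytes.
PokeS(*Buffer, "Far too long for an 8 byte buffer")

TestHeaps   ; a requester at this point would single out the PokeS() line above
```

Once the first failing TestHeaps call is found, the offending line lies between it and the last call that reported nothing.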

So why doesn’t the PureBasic debugger make this check automatically?

The reason is that HeapValidate() has an effect on the future run of the program. The documentation says that it degrades performance and that this effect can last until the end of the program. My guess is that the check for a valid heap somehow reorganizes it into a less efficient state, which means that future allocations will be slower. This is why this check is not done by the debugger. Maybe there will be an option for this at some point in the future. Who knows?

Porting your programs to PureBasic 64bit

The first public beta of PureBasic 64bit is out today, so the quick ones among you will soon start porting your projects to it. Here are a few tips from our own experience of porting the PureBasic tools to x64:

1) Port your project first to 4.30 x86 and fully test it there before moving on

The 4.30 update introduces a number of incompatible changes (many because of the x64 introduction) which are not all visible as compiler errors when used the old way. So instead of jumping directly to the x64 version and thinking about wrong variable sizes etc, better first make sure the crashes you get are not caused by one of these changes and are totally unrelated to the fact that your program is a 64bit one now. If you plan to release both a 32bit and 64bit version of your program, this is a logical first step anyway.

The most important change in this respect is probably the Read command. To avoid problems with different variable sizes on x86 and x64 with Read (which could lead to hard to find bugs), we decided to change the command so that it no longer gets its type from the variable but from a type postfix, like Procedure and others. This way you can match the Read type with the type in the DataSection, independent of the type of the variable you read the data into (you just cannot read a float into an integer type directly, but reading a byte into a quad works, for example). Unfortunately, this change means that a 4.20 program which does “Read a.b” will now read an integer (4/8 bytes), as that is the default type if no postfix is specified. So the choice was between having this problem once, now, or having it for all eternity when developing a program in both 32bit and 64bit. We decided to go with the former. The best and most foolproof way to deal with this is to do a “Find in Files” on all your code for the Read command and add a postfix to every one of them.
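
As a small illustration of the new Read syntax, the sketch below reads byte data into a default integer variable. The label and variable names are made up for the example.

```purebasic
DataSection
  Values:
    Data.b 1, 2, 3
EndDataSection

Define i, Value   ; no postfix, so both are default integers

Restore Values
For i = 1 To 3
  ; The .b postfix matches Data.b. Without it, Read would fetch a whole
  ; integer (4 or 8 bytes) from the byte data.
  Read.b Value
  Debug Value
Next
```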

Another source of possibly hard to locate errors is the change in the LinkedList commands. They now return the element pointer, and no longer the list header pointer. So if you used the return value of AddElement() and the like, you should check them all and make the needed changes.
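
A cautious way to use the return value is sketched below: it is stored in a pointer variable and only tested against zero, while the element itself is accessed through the list. The list name is made up for the example.

```purebasic
NewList Items.i()

; As of 4.30 the result refers to the new element, not to the list header.
*Element = AddElement(Items())
If *Element
  Items() = 42   ; the new element is the current one, so access it via the list
EndIf
```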

Less problematic (because easily visible) changes are the ComboBoxGadget() height parameter, and the removal of the ButtonImageGadget() backward compatibility (you now need SetGadgetAttribute() to change the image). All other incompatible changes should raise compiler errors or warnings, so these are easy to spot.

Once all this is dealt with and your program runs fine with 4.30 x86, it's time to dive into x64 programming…

2) .l is evil… very, very, very evil!

Many people have the habit of appending .l to every variable, even though this has been the default type all along. Well, now is the time to stop that practice. Not only will the speed be worse when using a too small type on x64, you also need to constantly worry about what you store in the variable, as many things need 8 bytes of storage on x64 (pointers, handles, #PB_Any values, etc). Do not let yourself be guided by the urge to “save memory” here. 64bit systems have a ton of memory, and for normal variables this is really no argument.

From now on, you should treat the long type as you treated words and bytes so far. Use them only when really needed inside structures, or for Arrays/Lists where memory concerns really become an issue. For every normal variable just leave out the type (which will default to .i), or if you really cannot shake the habit, use .i instead. Just do this consistently, even for counter variables and the like. This will save you a ton of trouble, trust me. Truncated pointers are the worst kind of bug to try to track down.

Another good habit is to simply use a real *Pointer instead of just an integer variable whenever a pointer is stored. This is no requirement, as the .i type will work just as well. But in the long term this increases readability of the code in my opinion.
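
The difference between the types can be sketched as follows. The variable names are only for illustration, and whether the truncated value actually differs from the real pointer depends on where the buffer happens to be placed in memory.

```purebasic
*Buffer = AllocateMemory(64)

Truncated.l = *Buffer   ; long: always 4 bytes, the upper half of an x64 pointer is cut off
Full.i      = *Buffer   ; integer: 4 bytes on x86, 8 bytes on x64
*Same       = *Buffer   ; pointer variable: always large enough for an address

Debug SizeOf(Long)      ; 4 on both x86 and x64
Debug SizeOf(Integer)   ; 4 on x86, 8 on x64

FreeMemory(*Buffer)     ; always free through a full-size value, never through Truncated
```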

3) Working directly with memory – keep the pointer size in mind

When working with raw memory, we all probably used “*Pointer + 4” at one point to skip a pointer or long value. You now need to keep in mind to use 8 here in 64bit mode when skipping a pointer. Using any fixed numbers is discouraged here. Either use the SizeOf() compiler function, or define yourself a nice #PointerSize constant to make this very transparent.
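
One way to set this up is sketched below, with #PointerSize defined through SizeOf(). The table layout is just an example.

```purebasic
#PointerSize = SizeOf(Integer)   ; 4 on x86, 8 on x64

; A raw table of four pointer-sized slots.
*Table = AllocateMemory(4 * #PointerSize)
If *Table
  For i = 0 To 3
    PokeI(*Table + i * #PointerSize, i)   ; no hardcoded 4 or 8 anywhere
  Next
  Debug PeekI(*Table + 2 * #PointerSize)  ; reads the third slot
  FreeMemory(*Table)
EndIf
```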

4) API stuff

The API is quite transparent between the 32bit and 64bit versions. We updated all API structure definitions to have the correct types, so most API code should work on x64 out of the box.

Some things need to be considered though:

  • A number of API types also switch sizes, such as HANDLE, LONG_PTR etc. However, a number of types do not change size, such as DWORD, LONG, etc. For “in” parameters that are passed by value to a function, this is not a real problem, but if you need to pass a pointer to a variable that receives some data, you need to check which type is needed.
  • Do NOT use Get/SetWindowLong_() for subclassing anymore. These remain 32bit even on x64. Use Get/SetWindowLongPtr_() instead, which works correctly everywhere. There is really no reason to prefer SetWindowLong_() over SetWindowLongPtr_(), so it is best to do a search and replace on all of them right away.
  • If you happen to use some API structure that is not predefined by PB, be careful about the padding. There is a lot more padding involved in the x64 structures, and the padding behavior is a bit complex to be explained quickly. If possible, just check the PB structure size against the output of a C compiler like VC8 (the 64bit compiler is provided with the Windows SDK) or ask on the Forum if you are unsure about the makeup of the structure.
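
The first two points can be sketched together like this. The procedure and variable names are made up, and the callback simply forwards every message, so treat it as a starting point rather than a complete subclassing example.

```purebasic
Global OldProc   ; default integer type, large enough for a pointer on x86 and x64

Procedure SubclassProc(hWnd, Msg, wParam, lParam)
  ; Hand every message on to the original window procedure.
  ProcedureReturn CallWindowProc_(OldProc, hWnd, Msg, wParam, lParam)
EndProcedure

If OpenWindow(0, 0, 0, 300, 100, "Subclassing", #PB_Window_SystemMenu)
  ; SetWindowLongPtr_() instead of SetWindowLong_(): the result is pointer-sized.
  OldProc = SetWindowLongPtr_(WindowID(0), #GWL_WNDPROC, @SubclassProc())

  ; The process id is returned as a DWORD, which stays 4 bytes on x64,
  ; so the variable that receives it through a pointer must be a long.
  ProcessID.l = 0
  GetWindowThreadProcessId_(WindowID(0), @ProcessID)

  Repeat
  Until WaitWindowEvent() = #PB_Event_CloseWindow
EndIf
```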

5) Other

There is not much more to it. Calling the PB functions is straightforward on x64, as long as you do not use a too small type to store the return values (#PB_Any etc).

If you use ASM code, the x64 fasm commands can be used on the x64 version without trouble. The #PB_Compiler_Processor constant is a good help here to have separate CompilerIf sections of code for this.
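
Separate ASM sections per processor could look like the sketch below. The procedure is a made-up example that adds one to its argument, and it passes the value back through a local variable in the same way the heap test code above does.

```purebasic
Procedure.i AddOne(Value.i)
  Protected Result.i

  CompilerIf #PB_Compiler_Processor = #PB_Processor_x86
    ; 32bit registers and dword accesses on x86
    !mov eax, dword [p.v_Value]
    !inc eax
    !mov dword [p.v_Result], eax
  CompilerElse
    ; 64bit registers and qword accesses on x64
    !mov rax, qword [p.v_Value]
    !inc rax
    !mov qword [p.v_Result], rax
  CompilerEndIf

  ProcedureReturn Result
EndProcedure

Debug AddOne(41)
```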

All in all, the x64 port of the PB programs like the IDE was much easier than we expected. If you get the variable types right, it's pretty much done. Now good luck with your x64 ports.