Debugging nightmare - advice?

ProphetOfDoom
User
Posts: 84
Joined: Mon Jun 30, 2008 4:36 pm
Location: UK

Post by ProphetOfDoom »

Hi there,
Okay so, I have approx 3000 lines of C code that I wrote. When I compile it as a standalone application (macOS and Linux) it works perfectly. When I compile it as a shared object, and call it from another C program, again it works perfectly.
HOWEVER... when I try to use the SAME shared object from a PureBasic app (with OpenLibrary()) it behaves really strangely and usually crashes.
The bizarre thing is, nothing about the C code has changed. Literally the only change when I build it as a library is that I rename the main() function (because shared objects aren't supposed to have one).
On Linux, valgrind reports no errors. No invalid reads, writes, or conditional movs/jmps.
I cannot seem to debug the PB calling code with valgrind, because it complains about a missing implementation for a syscall (or something like that). I'm guessing this is because PB's debugger uses "interesting" techniques. :|
To describe the error in more detail, it seems that certain C structs that should be all zeroes end up with "random" numbers in them. The values are different on each run of the program. Almost any libc call can corrupt my structs in weird unpredictable ways.
I've tried sprinkling dozens of printfs around the code, then running the working/nonworking versions of my app, and doing a "git diff" on the output. This enabled me to see the memory corruption happening in situ, but I could only find the effects, not the cause.
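One way to make those printf checkpoints point closer to the cause is to test the struct itself at each checkpoint, so the first dirty report brackets the call that did the damage. A minimal sketch, assuming the struct really is meant to stay all zeroes at that point; `first_nonzero_byte` and `CHECK_ZERO` are hypothetical names, not part of the original code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: returns the offset of the first non-zero byte in a
   block that is expected to be all zeroes, or -1 if it is still clean. */
static long first_nonzero_byte(const void *block, size_t size)
{
    const unsigned char *bytes = block;

    assert(block != NULL);
    for (size_t i = 0; i < size; i++) {
        if (bytes[i] != 0)
            return (long)i;
    }
    return -1;
}

/* Hypothetical checkpoint macro: prints the file and line of every
   checkpoint at which the block is no longer all zeroes. */
#define CHECK_ZERO(ptr, size)                                          \
    do {                                                               \
        long off_ = first_nonzero_byte((ptr), (size));                 \
        if (off_ >= 0)                                                 \
            fprintf(stderr, "%s:%d: byte %ld is no longer zero\n",     \
                    __FILE__, __LINE__, off_);                         \
    } while (0)
```

Placing `CHECK_ZERO(&my_struct, sizeof my_struct);` before and after each suspicious call narrows the corruption down to the first call between a clean and a dirty checkpoint.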
I'm at my wits' end as I've been debugging this for four days. I don't even know what to try next.
Does anyone have any advice? Any tools that might help?
Thanks in advance.
NicTheQuick
Addict
Posts: 1515
Joined: Sun Jun 22, 2003 7:43 pm
Location: Germany, Saarbrücken

Re: Debugging nightmare - advice?

Post by NicTheQuick »

Usually malloc does not zero out the memory. In contrast AllocateMemory() does it by default, but it can be disabled for speed with #PB_Memory_NoClear. I don't know if this helps you.
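For reference, the difference on the C side can be sketched like this. `struct node` is a hypothetical stand-in for one of the library's records, and both helper names are made up for the example:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical struct standing in for one of the parser's records. */
struct node {
    struct node *next;
    int value;
};

/* calloc returns zero-filled memory. */
static struct node *node_new_calloc(void)
{
    return calloc(1, sizeof(struct node));
}

/* malloc leaves the contents indeterminate, so clear before first use. */
static struct node *node_new_memset(void)
{
    struct node *n = malloc(sizeof *n);

    if (n != NULL)
        memset(n, 0, sizeof *n);
    return n;
}
```

Both helpers return NULL on allocation failure, so the caller still has to check the result.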
On the other hand, there are different calling conventions, which is why there are both "Import" and "ImportC". Maybe that could help you? Or, if you are working with "Prototype", you may have to switch to "PrototypeC".

My answers are a bit vague but maybe someone else has a better idea.
The english grammar is freeware, you can use it freely - But it's not Open Source, i.e. you can not change it or publish it in altered way.
ProphetOfDoom
User
Posts: 84
Joined: Mon Jun 30, 2008 4:36 pm
Location: UK

Re: Debugging nightmare - advice?

Post by ProphetOfDoom »

Hullo Nic thanks for your input. I have indeed zeroed all my structures with memset (and probably checked this three times!). And yes I’m using PrototypeC.
I’m hoping there’s some magic app or feature of PB that can help... I’m really stuck!
Olli
Addict
Posts: 1238
Joined: Wed May 27, 2020 12:26 pm

Re: Debugging nightmare - advice?

Post by Olli »

Hello, what about the alignment of the structured buffers in C? And in PureBasic?
ProphetOfDoom
User
Posts: 84
Joined: Mon Jun 30, 2008 4:36 pm
Location: UK

Re: Debugging nightmare - advice?

Post by ProphetOfDoom »

Hullo Olli.
I’m not using any structures on the PB side at all. The only structures are in the C code so they can’t be disagreeing about alignments. I should add, all of the code in both languages is single-threaded, so that’s not the issue here. I know I’m not giving you guys a lot to go on but I think it would be just as unreasonable to throw all that code at the forum and ask you to debug it for me...
I need ideas... an approach...
sq4
User
Posts: 98
Joined: Wed Feb 26, 2014 3:16 pm

Re: Debugging nightmare - advice?

Post by sq4 »

A very wild guess : does the C code use MMX registers?
ProphetOfDoom
User
Posts: 84
Joined: Mon Jun 30, 2008 4:36 pm
Location: UK

Re: Debugging nightmare - advice?

Post by ProphetOfDoom »

Hi there sq4, well I’m compiling it with gcc on Linux, and clang on macOS, using just the -g flag and whatever the default optimisations might be... I’m guessing none so probably no MMX?
sq4
User
Posts: 98
Joined: Wed Feb 26, 2014 3:16 pm

Re: Debugging nightmare - advice?

Post by sq4 »

ProphetOfDoom wrote: Hi there sq4, well I’m compiling it with gcc on Linux, and clang on macOS, using just the -g flag and whatever the default optimisations might be... I’m guessing none so probably no MMX?
I have had some crashes when calling extern functions that return floats.
But that was a long time ago, before Prototypes existed, so CallCFunction() was needed.
Anyway, if that were the case, you would need inline asm with an EMMS instruction to reset the FPU state.

Like I said, it's just a wild guess.

And the other possible cause I had in mind, namely C alignment, was already proposed.

I think you need to provide some more data...
ProphetOfDoom
User
Posts: 84
Joined: Mon Jun 30, 2008 4:36 pm
Location: UK

Re: Debugging nightmare - advice?

Post by ProphetOfDoom »

Well thanks for all your suggestions. I’m going to struggle on with it for a few days - if I have no luck I will upload it to my github page and link to it so you helpful people can have a look. I didn’t really want to do this because it means removing all the curse words. And also this code is something I consider a work of art which I really wanted to share in its finished state.
It’s a lexer/parser generator using a domain-specific language embedded in C code. It basically lets you describe an arbitrary language in something akin to English. Well unless you call it from PB then it becomes demon-possessed.
sq4
User
Posts: 98
Joined: Wed Feb 26, 2014 3:16 pm

Re: Debugging nightmare - advice?

Post by sq4 »

A lexer/parser -> it looks like you must be using strings a lot?
Maybe it's as trivial as Unicode<->Ascii or something...
Bitblazer
Enthusiast
Posts: 762
Joined: Mon Apr 10, 2017 6:17 pm
Location: Germany

Re: Debugging nightmare - advice?

Post by Bitblazer »

Create a sample library with a single function that shows this problem, upload it, and post the link. Somebody will find out what happens and why :)

Does it happen with other compilers too? Can you compile it with the command-line compiler, and which flags did you use for that?
Command-line compiling usually means you see all the options, while an IDE might use options you aren't really aware of. Knowing all the options and what they do, and changing them one at a time, might already give you the solution.

Posting a binary sample DLL which fails means we can peek inside the binary and see what kind of library it is, what the actual assembly code does, and why it fails. Does a sample library call fail if the library function doesn't use the stack and just returns a single value, or doesn't do anything at all and just returns? Could the calling convention be the problem? Is it an x86 or x64 DLL, or even an ARM one ;)?
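The "function that just returns a value" test could be sketched as a tiny probe library like the one below. All three function names are hypothetical, and the idea is only that each probe exercises one more part of the calling convention than the previous one:

```c
#include <assert.h>

/* Hypothetical probe 1: no arguments, integer return value only. */
int probe_answer(void)
{
    return 42;
}

/* Hypothetical probe 2: integer arguments, checks argument passing. */
int probe_add(int a, int b)
{
    return a + b;
}

/* Hypothetical probe 3: floating-point argument and return value. */
double probe_half(double x)
{
    return x / 2.0;
}
```

On Linux this can be built with something like `gcc -shared -fPIC -o libprobe.so probe.c` (the file names are placeholders) and then opened from the calling program. If even `probe_answer` misbehaves, the problem is in how the library is loaded or called rather than in the 3000 lines of real code.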
skywalk
Addict
Posts: 4215
Joined: Wed Dec 23, 2009 10:14 pm
Location: Boston, MA

Re: Debugging nightmare - advice?

Post by skywalk »

Lots to try.
Start very small.
Use EnableExplicit
Use Prototype|C, not CallFunction.

Print the sizeof(yourstruct) in both dll and pb.
I know you said the PB caller has no structs, but that means you are passing fixed values.
Just create a struct in C, populate it with known values and pass it to PB.
Then use the Memory Viewer to inspect it, or a bunch of Debug statements.
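The C half of that size check might look like the sketch below. `struct token` and both function names are hypothetical, so substitute whichever struct actually crosses the library boundary:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical struct; replace with the one shared with the caller. */
struct token {
    char   kind;   /* 1 byte, usually followed by padding */
    int    line;
    double value;
};

/* Hypothetical exported helper: returns the size the C compiler chose. */
size_t token_size(void)
{
    return sizeof(struct token);
}

/* Hypothetical exported helper: prints the full layout so it can be
   compared with SizeOf() and OffsetOf() on the PureBasic side. */
void token_dump_layout(void)
{
    printf("sizeof(struct token) = %zu\n", sizeof(struct token));
    printf("offsetof(kind)       = %zu\n", offsetof(struct token, kind));
    printf("offsetof(line)       = %zu\n", offsetof(struct token, line));
    printf("offsetof(value)      = %zu\n", offsetof(struct token, value));
}
```

If the numbers printed here differ from what the PureBasic structure reports, padding or alignment is the likely culprit.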

Structure alignment and padding was my issue in a Windows only dll.
Treat strings as bytes before attempting straight convert to PB strings.
The nice thing about standards is there are so many to choose from. ~ Andrew Tanenbaum
ProphetOfDoom
User
Posts: 84
Joined: Mon Jun 30, 2008 4:36 pm
Location: UK

Re: Debugging nightmare - advice?

Post by ProphetOfDoom »

Hi again and thanks skywalk and bitblazer for your suggestions.
Something weird has happened. I was up all night tweaking things here and there and now it works. I have a book called The Pragmatic Programmer where it warns against "programming by accident" - i.e. copying and pasting and commenting and uncommenting until you get the right output. This is of course very bad programming practice.
But I'm just really glad it's fixed!
I suspect it was me using a linked list wrongly (in C) because a middle element in a list was getting corrupted and I was tweaking that.
If I was really being scientific here I'd do more diffing to find out exactly when it started working but I think I'll have a coffee instead!
I really appreciate everyone's input. I should have a cool project to share with you soon.
ProphetOfDoom
User
Posts: 84
Joined: Mon Jun 30, 2008 4:36 pm
Location: UK

Re: Debugging nightmare - advice?

Post by ProphetOfDoom »

Hullo there,
Just wanted to update this topic because I think I've found the exact cause of my error and it's very strange.
I had a function in my C code called "times". PureBasic, at least when debugging with the IDE on Linux, appears to include a symbol called "times" in the generated executable. When I tried to call this function, PB's function was called instead, and corrupted the stack. This is why the code worked fine without PB.
I was really frustrated and about to give up when I suddenly thought to change my C function's name to "timez" and everything worked.
I can't say 100% that my diagnosis is correct, as PB appears to strip executables of their symbols, but I think this is it!
Thanks again for your suggestions and I'm not surprised no-one guessed what was wrong; it took me weeks!
ProphetOfDoom
User
Posts: 84
Joined: Mon Jun 30, 2008 4:36 pm
Location: UK

Re: Debugging nightmare - advice?

Post by ProphetOfDoom »

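Okay oops... "times" appears to be a C library function (POSIX declares times() in <sys/times.h>), so the clash was presumably with libc's times() rather than with a symbol PureBasic added.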
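If that diagnosis is right, the likely mechanism is symbol interposition. On Linux, a call to a non-static function inside a shared object is normally resolved by the dynamic linker, which searches the executable and the libraries it already loaded (including libc) before a library opened at run time. A sketch of two common ways to avoid such a clash, with `times_internal` and the `mylib_` prefix as hypothetical names:

```c
#include <assert.h>

/* Internal helper: 'static' gives it internal linkage, so it never
   appears in the dynamic symbol table and cannot be confused with
   libc's times(). */
static int times_internal(int a, int b)
{
    return a * b;
}

/* Exported entry point: a project prefix ("mylib_" is hypothetical)
   keeps it clear of short libc names such as times or index. */
#if defined(__GNUC__)
__attribute__((visibility("default")))
#endif
int mylib_times(int a, int b)
{
    return times_internal(a, b);
}
```

With gcc or clang, compiling the library with `-fvisibility=hidden` and marking only the intended entry points as shown is a further safeguard, and `nm -D` on the resulting .so lists which names are actually exported.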