Category Archives: Compiler

Compiler related articles

PureBasic Non-Regression Suite

When writing massive software like PureBasic, you need to have automated tests to validate it and ensures there is no regression when publishing a new version. We will try to explain how we achieve this for PureBasic and why it is essential for the product robustness.

First, let’s recap the main PureBasic components and their size:

  • The compiler, which is about 250k lines of C, generating x86 asm, x64 asm and C
  • The IDE, written in PureBasic which is about 130k lines of code
  • The libraries, written for 3 platforms (Linux, Windows, OS X) in ASM, C, C++ or Objective C which are about 1.200k lines of code

Each time a compiler bug is squashed, we add a new test in the test suite to ensures it won’t happen again. For now, there is more than 2500 unit tests covering all public features of the PureBasic language, validating the syntax and ensuring the results are corrects. Theses tests are run in different context (Threaded mode ON/OFF, debugger ON/OFF, Optimizer ON/OFF) and if one of these tests fail, the build script is aborted. These tests are just written in standard PureBasic files, compiled into a single executable and run all at once. Each platform runs these tests. Here is a small example of how it looks from the AritmeticFloat.pb file:

; Float promotion
;
Check(Round(Val("2") * 1.2, #PB_Round_Nearest) = 2)

Integer.i = Round(Val("3") * 1.2, #PB_Round_Nearest)
Check(Integer = 4)

Integer = 1
Integer = Integer + Round(Val("2") * 1.5, #PB_Round_Nearest)
Check(Integer = 4)

Integer = 1
Integer = Integer - Round(Val("2") * 1.5, #PB_Round_Nearest)
Check(Integer = -2)

Integer = 2
Integer = Integer * Round(Val("2") * 1.5, #PB_Round_Nearest)
Check(Integer = 6)

Integer = 9
Integer = Integer / Round(Val("2") * 1.5, #PB_Round_Nearest)
Check(Integer = 3)

; Constant rounding
;
Integer.i = 0.6 * 2 ; rounded to 1
Check(Integer = 1)

Integer = 0.9 * 2 ; rounded to 2
Check(Integer = 2)

Integer = 0.6 ; rounded to 1
Check(Integer * 2 = 2) 

When we work on the libraries, it’s a bit different, as we use the PureUnit tool (which can be found freely in the SDK) which is similar to NUnit or JUnit if you know those tools. It is very good at writing single commands tests, and provide a solid way to run the tests randomly, in different modes (Threaded, with different Subsystems, etc.). When a bug is fixed in a library we add a test when it’s possible (testing UI, Sprites, 3D engine is complicated as you can’t really check the graphical output). The Syntax is very easy to use as you can see from this example taken from the String library unit test file:

ProcedureUnit InsertStringTest()
  Hello$ = "Hello"
  World$ = "World"
  AssertString(InsertString("Hello !", World$, 7), "Hello World!")
  AssertString(InsertString("Hello !", World$, 8), "Hello !World")
  AssertString(InsertString("Hello !", World$, 1), "WorldHello !")
  AssertString(InsertString("Hello !", World$, -1), "WorldHello !") ; Test with out of bounds value
  AssertString(InsertString(Null$, Null$, 1), "") ; Null values should be accepted as well
  AssertString(InsertString("", "Hello", 1), "Hello") ; Inserting empty string shouldn't touch the string
  AssertString(InsertString(Null$, "Hello", 1), "Hello") ; Inserting null string shouldn't touch the string
  
  AssertString(InsertString(Hello$+Hello$, World$+World$, 6), "HelloWorldWorldHello") ; Test with internal buffer use
EndProcedureUnit


ProcedureUnit LeftTest()
  Test$ = "Test"
  AssertString(Left(Test$, 1)      , "T")
  AssertString(Left(Test$, 0)     , "")
  AssertString(Left(Test$, -1)     , "")
  AssertString(Left(Test$, 10)     , "Test")
  AssertString(Left(Test$, 3)      , "Tes")
  AssertString(Left(Test$, 4)      , "Test")
  AssertString(Left(Null$, 0)      , "")
  Result$ = Left(Test$, 1)
EndProcedureUnit


ProcedureUnit LenTest()
  Test$ = "Test"
  Assert(Len(Test$) = 4)
  Assert(Len(Null$) = 0)
EndProcedureUnit

The test procedure are declared with ProcedureUnit and EndProcedureUnit, there is some Assert commands available to control the checks. You can declare some common setup or clear procedure to run before and after each test with ProcedureUnitBefore and ProcedureUnitAfter. There is much more, feel free to check the PureUnit documentation to get the most of it. I strongly recommand to give it a try if you need to write some unit tests for your own PureBasic application.

The next tests we have are ‘linker’ tests. We put all commands of a single library in a file, and compile it to ensures it links properly. Then we put all the commands of PureBasic in a single file and we link it as well.

Then we test if all the examples found the in PureBasic/Examples directory compiles properly with the new PureBasic version.

Once all these tests are run and successful, the final package can created and published.

PureBasic on Apple M1 processors

Last week, I bought a new Mac Mini to be able to port PureBasic on the new Apple M1 chip. First boot looks very familiar, you can’t really tell there is a new processor here. A quick look at the Task Monitor and we can see all the programs running on an ‘Apple’ CPU. Great. All seems smooth so far, a very quiet computer in a small form factor.

After a few minutes toying with the prefs to plugin my PC keyboard and Monitor on a KVM switch, I was ready to start to dev, and downloaded XCode. It was the longest installation I never experienced, at a point I though the Mini was broken. It tooks over 4 (four !) hours to install (after downloading). Seems to be known issue if I believe all the posts found on Reddit. Needed to install homebrew for subversion (I know, I know) and I finally started hacking the compiler.

This promised to be exciting, as the C toolchain isn’t based on gcc but on clang/LLVM. I expected some adjustements to do here and here, but there were actually zero. Everything compiled out of the box and the first running executable was created after half day. Two more days and the whole compiler test suits was working which was amazing ! There is indeed some more work to do to build the full package, as some libraries needs to be tuned (missing headers due to new Cocoa SDK mainly), build scripts adjusted and so on, but it does looks very bright.

This will be the first version of PureBasic shipped without an assembly back-end. It does feel a bit weird for me, but I know it will do the job just fine to build your cool apps on new Macs !

Sneak peak into new CPU support for C back-end

I just finished the x86 (32-bit) support for the C Back-end on Windows and it barely took a week to do. Most of the time was spent on upgrading the build chain (linker, windows static libs, packaging the 32-bit version of gcc etc.), it was very very fast. Next step is to work on a ‘real’ new architecture, probably ARM M1 for OS X, looks interesting !

The IDE is finally working with the C back-end !

This is the milestone we waited to validate the C back-end. It tooks quite a lot of iteration with the alpha testers (and thanks to all of you all btw) to iron major issues, but here we are: the IDE is mostly working when compiled with the C back-end. This is a big deal: the IDE is a very large software of about 130 000 lines of PureBasic code and using a lot of different features of the language. So what’s the functional changes ? Well, nothing. And that’s on purpose. The plan is to be able to switch seemlessly from ASM to C back-end and be able to compile the IDE on new platform, like Raspberry or OS X M1.

The only noticeable change is the compilation time, which jumps from 5 secs on the ASM back-end to a whooping 24 secs on the C back-end (based on GCC) with the debugger on and optimisation disabled. Keep in mind I got a very old CPU (Core i7 860 from 2009) and as discussed previously, just swapping the C compiler could lead to dramatic improvements.

All in all, it’s very good news, and we are looking forward to fix the last remaining showstoppers and creating a beta version for all the OS, including may be some new like Raspberry !

Quick look to PureBasic C back-end performance

PureBasic is currently using raw assembly code generation, which means the code is translated line by line to its corresponding assembly counterpart. This way to generate code has pros and cons: it’s easy to do but we loose the code context, so doing advanced optimizations (like variables to registers, loop unrolling, function inlining, code vectorization, variable removing etc.) are off the table. The only optimization PureBasic can perform is once the full assembly code is generated: a pass to detect patterns which can be optimized and replaced by more efficient one. For example:

          MOV ebx,dword [v_Gadget_ImagePool]
          INC ebx
          MOV dword [v_Gadget_ImagePool],ebx

is transformed to:

          INC    dword [v_Gadget_ImagePool]

PureBasic x86 can detect 16 different patterns and optimize them. It’s not much, but it still improves the final code quality. This kind of optimization is called peephole optimizer. It works on final code generation and can’t do much regarding high level optimization. The good point is the speed of it, the additional pass barely add any time to the compilation time. So you may ask why PureBasic can’t generated more complex optimizations ? Well, because we just don’t have the time (and let be honest, the skills) to do it our-self. It takes a lot of efforts which requires large programmer teams and academic researches.

Entering the C language, the world most used low-level language and probably the one which best optimizing compilers. PureBasic is now able to leverage the power of its optimizations, so let’s take a real world look at it. We are using the 3D example ‘MeshManualParametrics’ found in the PureBasic package. This is a mix of 3D rendering (through OGRE) and 3D calculations done in PureBasic. Here are the results:

PureBasic x64 – assembly back-end : 192 FPS

PureBasic x64 – C Back-end with optimizer enabled (-02) : 298 FPS

That’s basically a 50% increase in frame-rate just by switching the compiler. It’s really a lot. Another interesting point is the speed of the executable of C back-end without optimization (-O0) was 192 FPS as well, like PureBasic assembly back-end. Not that bad for small team compiler !

Sneak peek to C generated code

Next major PureBasic version will feature a brand new C backend, which will allow to support virtually any CPU, current and future. The generated file is a big flat file, with no include directive to allow the fastest compilation speed as possible. In the meantime, we decided to create a C output as clean as possible to allow easy modification or even direct reuse into a C project. Let’s take a look to it by comparing a PureBasic snippet and its generated C code counterpart:

Procedure Test()
  Counter = Counter + 5
  
  If Counter = 5
    MessageRequester("Hello world", "Hello")
  EndIf
  
  ProcedureReturn 6
EndProcedure

Test()

And the generated C code (omitting all boilerplate) :

static unsigned short _S1[]={72,101,108,108,111,32,119,111,114,108,100,0};
static unsigned short _S2[]={72,101,108,108,111,0};

static integer f_test() 
{
  integer r=0;
  integer v_counter=0;
  
  v_counter=(v_counter+5);
  
  if (v_counter==5) 
  {
    integer rr0=PB_MessageRequester(_S1,_S2);
  }

  r=6;
  goto end;
  end:
  return r;
}

int __stdcall WinMain(void *instance, void *prevInstance, void *cmdLine, int cmdShow) 
{
  PB_Instance = GetModuleHandleW(0);
  PB_MemoryBase = HeapCreate(0,4096,0);
  SYS_InitString();
  PB_InitDesktop();
  PB_InitRequester();

  integer rr1=f_test();
  
  SYS_Quit();
}

As we can see, the code is pretty close to PureBasic one, and easy to read. Inline C will be supported, and all the objects like variables, arrays, lists etc. could be accessed by following the generated token name pattern (example: ‘v_’ prefix for variables followed with the PureBasic name in lowercase).

What about compilation speed ? We are using GCC 8.1.0 with no optimization (-O0) on Windows 10 x64 on a first gen Core i7 to perform the tests. We compile a 13.000 lines program, DocMaker, with debugger ON:

  • GCC backend: about 3 seconds to create the executable
  • FASM backend: about 1 second to create the executable

So that’s about 3 times slower but still an OK time to develop. Next test is a big program, the PureBasic IDE, with about 125.000 lines of codes with debugger ON. We also includes Microsoft VisualC++ 2015 in the tests:

  • GCC backend: about 24 seconds to create the executable
  • VC++ backend: about 9 seconds to create the executable
  • FASM backend: about 4 seconds to create the executable

GCC is a lot slower, about 6 times, than the current FASM backend. The VC++ backend is about 2 times slower, which is much better but not that great. All in all, the FASM backend will still be better to use for quick development cycle. Keep in mind than it is very early tests and as we just seen, just switching the C compiler could dramatically increase compile time. That’s it for now, next time we will focus about runtime performance between C and FASM backends !

PureBasic and increased varieties of CPU

Early versions of PureBasic started on Amiga, with the support of the Motorola 680×0 CPU series. PureBasic has been designed from start to generate raw assembly code, and the first working CPU assembly back-end was the one for 680×0. As the audience on Amiga was quickly falling, we decided to move on PC. In 1998, there was only one PC CPU architecture, the x86, and it was decided to add a new assembly back-end for it. It is a very time consuming task and requires a lot of learning to be able to handle a new assembly. 680×0 and x86 architectures are quite opposite but it fitted in the PureBasic architecture and PureBasic for x86 was a reality.

Shortly after, we decided to go really cross-platform and support OS X, which was running on PowerPC processors. A new assembly back-end was needed and we did it again, during almost a full year to get something working. Shortly after, Apple announced they will drop support for PowerPC and choose x86 instead ! That was a huge blow for us, and a lot of time wasted. In the meantime, x86-64 was released and slowly gaining traction and we knew we should support it as well. Unlike the name suggests it, x86 and x86-64 are two different beasts, and a lot of work was needed to make it happen. Once done, we though we could finally focus on functionalities for a while, but computer science is evolving at quick rate and a new CPU architecture appeared and took the world by storm, first on mobiles, then on tiny computers and now on desktop: ARM and ARM-64.

So here we are, to support these new platforms, we face the same issue again: we need new assembly back-ends. As a small company, we can’t dedicate a whole year every time a new assembly emerge, so we need to find a future proof solution. Meanwhile, a fork of the PureBasic compiler has been created to generate high-level language instead of raw assembly. This allowed to create SpiderBasic, a PureBasic-like language which can run on the browser by generating JavaScript code.

When Apple announced their new computers will be running on ARM, we knew we needed to change our way to handle new processors architecture and we decided to leverage the work already done of SpiderBasic and port it back for PureBasic. We experimented a lot with LLVM for quite a time now, but it is an headache to find it on all architectures. Also it’s a kind of mix between assembly code and high-level language so it didn’t fit either PureBasic or SpiderBasic back-end architecture. For the past 8 months, we decided to focus on a new high-level back-end for PureBasic: the C language.

The good news is the whole PureBasic compiler test suit is successfully running on the C back-end, so we are very confident we could have a compiler ready to test quickly. Stay tuned, more details to come soon !

New PureArea.net Interview with Fred and freak released

Hi folks, Happy New Year 2016!

I’m proudly presenting a new interview I made with the main PureBasic coders Frederic ‘alphaSND’ Laboureur and Timo ‘freak’ Harter.

You can read it on my website www.PureArea.net, just go to the ‘News’ section of 4th January 2016.

I hope you enjoy the reading of this (like I find) interesting interview!

(Direct link to the interview)

Support for Ascii compilation ends after the next LTS cycle

As Fred has explained here, supporting both the creation of ascii and unicode executables in the compiler is becoming a burden and we would like to end support for ascii compilation in order to streamline the library code and make it easier to maintain the PureBasic package in the future. However, the above thread has shown that people wish for a longer transition period, and we would like to honor this wish.

So we decided to remove the ability to compile in ascii mode in the PB version that follows after the next LTS version (that is, the LTS version coming after the 5.2x LTS cycle ends).

There are no exact dates for the releases, but the timeline looks like this:

  • The current 5.2x LTS version will be supported until at least 09/17/2015
  • After that date, a new LTS version will be released with support for 2 years. This version will still have full ascii compilation support.
  • The first non-LTS version released after the next LTS cycle starts will have no ascii compilation support

This means that the first non-LTS version without ascii support will be released in about a year. By staying with LTS releases, you have the ability to use an ascii mode compiler with a fully supported PB version for at least another 3 years starting from today. This should give enough time for a smooth transition.

No changes will be made to the language or data types (The .c, .a and .u as well as the pseudo-types and library commands will remain as they are). The only difference is that the “compile in unicode mode” switch in the compiler will be permanently set to “on”.

Please understand that we do listen to the concerns voiced in the discussion thread and that we do not make this decision lightly. I think we have a quite good track record of supporting older technology as is evidenced by the very long support for Windows 9x which just recently ended and by the fact that we still support Windows XP even after MS dropped all support for it. However, in order for us to be able to introduce new technologies (like x64 compilation in the past, or now support for the web with Spiderbasic), we simply cannot support old stuff indefinitely. In order to move forward into the future, we have to leave some stuff behind from time to time. We hope you can understand that.

Windows 95, the comeback !

I know, i know. But there is some stuffs i really hate, and one of them is when you upgrade your compiler version, and suddenly a software stop working on a specific plateform. That happened to the purebasic compiler executable, when we upgraded to VC8. Basically, VC8 adds a dependency to every executable created with it, which is not supported in Windows 95: the famous IsDebuggerPresent() API function (found in Kernel32.dll). So when you launch it on Win95, it just fails at loading time, telling there is a missing symbol. Well, who care about Win95 anymore, you think. I do. For fun. And it’s not a missing symbol (which is useless BTW, and added under the hood by linking libcmt.lib – which is requiered) which will stop PureBasic to work on lovely oldies plateforms.

So the fun began. I asked to my best virtual friend – google – to gently find me a workaround. And guess what ? There was one. But you had to link statically the c runtime and hack a lot with sources. I didn’t want to do that. So i tried to mess with the kernel32 import library (kernel32.lib) to remove the IsDebuggerPresent object. Total failure. Microsoft ‘lib’ tool just see tons on ‘Kernel32.dll’ object in the kernel32.lib when you launch:

prompt> lib /LIST kernel32.lib
KERNEL32.dll
KERNEL32.dll
KERNEL32.dll
KERNEL32.dll
KERNEL32.dll
KERNEL32.dll
....
prompt>

Ok. Not very useful to remove a specific object. Sounds like a bug btw…

So i tried with ‘polib’, which comes from PellesC package, without more success. It successfully list the objects name, but unfortunately can’t remove it. Well, i was out of luck. So i created a .def with kernel32 exports and tried to use ‘lib’ to create an import lib. But it doesn’t work as well, for some reason (seems like the lib created isn’t a real dll import lib). Finally i ended up to create a fake kernel32.dll, and reuse the import lib created by the linker. And voilĂ , it worked.. At least VC8 linked and complained about the missing IsDebuggerPresent import, so the hard part was done. Now we just need a small stub in our executable and it’s over. Here is the stub:

// The cool Win95 hack, so the compiler can launch on it, even compiled with VS2005
// Basically, this function is needed by the libcmt.lib/gs_report.obj
//
BOOL __stdcall _imp__IsDebuggerPresent()
{
return FALSE;
}

Timo kindly tested the new build and PureBasic launched and worked correctly on an old Win95. Victory !