Merge PB sources

srod
PureBasic Expert
Posts: 10589
Joined: Wed Oct 29, 2003 4:35 pm
Location: Beyond the pale...

Merge PB sources

Post by srod »

Hi,

just a little utility which I required.

This utility will take a PureBasic source code file, parse it whilst looking for 'include files' and produce a single UTF-8 source file containing all of the source code thus encountered.

It is quite flexible in that the following commands in your source :

Code: Select all

IncludePath
IncludeFile
XIncludeFile
can all use string concatenation and can use the #PB_Compiler_Home constant as provided by the PB compiler.
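As a rough illustration, these are the kinds of statements the parser is meant to cope with. The names "Common" and "Utils.pbi" are made up; the first two lines mirror the external includes in the code further down. The comments sit on lines of their own because, as far as I can tell from the tokeniser, a comment tagged on to the end of an include statement would be rejected.

```purebasic
;Hypothetical include statements. "Common" and "Utils.pbi" are made-up names.

;The compiler constant concatenated with a literal.
IncludePath #PB_Compiler_Home + "Includes"

;A plain string literal.
XIncludeFile "stackClass.pbi"

;Two literals concatenated.
IncludeFile "Common\" + "Utils.pbi"
```

Each statement is on a line by itself, with no colon separated statements alongside it.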

The return value is quite detailed and will provide info on all kinds of errors and will, for example, list the full name of any include file which cannot be located etc.

There are also options for removing whitespace from the resulting source file etc.
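A minimal usage sketch, assuming the code below has been saved as "MergePBSources.pbi" (the name used in its own test line). "main.pb" and "merged.pb" are made-up file names; everything else is taken from the listing.

```purebasic
;A usage sketch. "main.pb" and "merged.pb" are hypothetical file names.
XIncludeFile "MergePBSources.pbi"

Define result.l

;Flags are combined with a bitwise OR.
result = MergePBSources("main.pb", "merged.pb", #MPBS_RemoveEmptyLines | #MPBS_RemoveCommentedLines)

Select result
  Case #MPBS_OKAY
    Debug "Merged okay."
  Case #MPBS_INCLUDEFILENOTFOUND
    ;The global holds the full name of the file which could not be located.
    Debug "Cannot locate include file: " + gMPBS_Include$
  Case #MPBS_NESTEDINCLUDESOVERFLOW
    Debug "Includes nested too deeply (possibly a recursive include)."
  Default
    Debug "Merge failed with error code " + Str(result)
EndSelect
```

On any error the utility deletes the output file, so there is nothing to clean up afterwards.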

NOTES.
  • This utility requires my OOP stack class in order to run : http://www.purebasic.fr/english/viewtop ... ight=stack
  • This code is fully Unicode compliant, BUT is not threadsafe because of the use of a global variable. To combat this either use a mutex or... alter the code! :wink: It would be a simple matter to alter the code, but the fact is that I do not require this to be threadsafe.
  • The GetPBFolder() function has been lifted from the Tailbite sources. Thanks to El-Choni, ABBKlaus and Gnozal for this.
    It is only this function which renders this utility 'Windows only'. Simply replace it with a 'dummy' function to make this utility cross-platform. Using #PB_Compiler_Home instead is fine, provided this utility is only ever transferred between machines in source code form.
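Two small sketches for the notes above, neither of which is part of the utility itself. The first is one possible 'dummy' GetPBFolder(), to be used in place of the registry version (only one of the two can exist in a program). The second serialises calls with a mutex; MergePBSources_Locked() and gMPBS_Mutex are hypothetical names, and this part has to come after the utility's code so that MergePBSources() is already declared.

```purebasic
;Sketch 1: a 'dummy' GetPBFolder() replacing the registry based one.
;#PB_Compiler_Home is resolved at compile time, so this returns the PB folder
;of the machine doing the compiling.
Procedure.s GetPBFolder()
  ProcedureReturn #PB_Compiler_Home
EndProcedure

;Sketch 2: guarding the global with a mutex (hypothetical names).
;Compile with the threadsafe option so that strings are safe across threads.
Global gMPBS_Mutex
gMPBS_Mutex = CreateMutex()

Procedure.l MergePBSources_Locked(inFile$, outFile$, flags, *failed.String)
  Protected result
  LockMutex(gMPBS_Mutex)
  result = MergePBSources(inFile$, outFile$, flags)
  If *failed
    ;Copy the global whilst we still hold the lock.
    *failed\s = gMPBS_Include$
  EndIf
  UnlockMutex(gMPBS_Mutex)
  ProcedureReturn result
EndProcedure
```

The caller passes the address of a String structure (or 0) and reads the name of any missing include file from there rather than from the global.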

Code: Select all

;/////////////////////////////////////////////////////////////////////////////////
;A utility for parsing a Purebasic source file, creating a single source file
;from that specified plus all included files.

;By Stephen Rodriguez.
;February 2008.

;Developed with Purebasic 4.2 beta 2.
;Windows only.  (Easily made cross-platform!)
;
;Unicode compliant.
;NOT threadsafe because of the use of a single global variable as seen below. This is
;easily rectified.
;/////////////////////////////////////////////////////////////////////////////////

;/////////////////////////////////////////////////////////////////////////////////
;-NOTES.
; i)    All XIncludeFile, IncludeFile and IncludePath statements must not be part
;       of multi-statement (colon separated) lines and must only involve constant
;       literal strings. Concatenation can be used.
; ii)   All XIncludeFile, IncludeFile and IncludePath statements must not depend on any
;       preprocessor actions such as macros. The only exception is the permissible use of
;       #PB_Compiler_Home.
; iii)  All source files must be in either ASCII or UTF-8 format.
;/////////////////////////////////////////////////////////////////////////////////

;/////////////////////////////////////////////////////////////////////////////////
;USAGE:
;======
;   MergePBSources(inFile$, outFile$, flags=0)
;
;flags can be a combination of #MPBS_RemoveCommentedLines and #MPBS_RemoveWhiteSpace and #MPBS_RemoveEmptyLines.
;
;Return value is one of the constants listed in the enumeration below. Also, in the case
;of a #MPBS_INCLUDEFILENOTFOUND error return, the global variable gMPBS_Include$ will contain
;the full name of the file which could not be located.
;/////////////////////////////////////////////////////////////////////////////////


;/////////////////////////////////////////////////////////////////////////////////
;-EXTERNAL INCLUDES
  IncludePath #PB_Compiler_Home+"Includes"
    XIncludeFile "stackClass.pbi"
;/////////////////////////////////////////////////////////////////////////////////

;/////////////////////////////////////////////////////////////////////////////////
;-CONSTANTS and STRUCTURES.
  #_MERGESOURCES_SIZEOFSTACK = 1024  ;The max depth of nested include files.
  #_MERGESOURCES_SEPARATOR = "+ "+Chr(34)

  Enumeration
    #MPBS_RemoveCommentedLines    = 1
    #MPBS_RemoveWhiteSpace  = 2
    #MPBS_RemoveEmptyLines  = 4
  EndEnumeration

  Enumeration  ;Return values.
    #MPBS_OKAY                    = 0
    #MPBS_INVALIDSOURCEFILE       = -1
    #MPBS_INSUFFICIENTMEMORY      = -2
    #MPBS_FILEOPENERROR           = -3
    #MPBS_INVALIDPATH             = -4
    #MPBS_INVALIDINCLUDEFILE      = -5
    #MPBS_INCLUDEFILENOTFOUND     = -6  ;In this case, gMPBS_Include$ will contain the path\filename
                                        ;of the file which cannot be located.
    #MPBS_NESTEDINCLUDESOVERFLOW  = -7  ;Probably a recursive use of a certain include file.
  EndEnumeration
;/////////////////////////////////////////////////////////////////////////////////

;The following global is used in the case of a #MPBS_INCLUDEFILENOTFOUND error and contains
;the full name of the file which could not be located. (It is also used internally!)
  Global gMPBS_Include$


EnableExplicit


;/////////////////////////////////////////////////////////////////////////////////
;The following function has been 'lifted' from the Tailbite sources.
;My thanks to El-Choni, ABBKlaus and Gnozal for this.
Procedure.s GetPBFolder()
  Protected hKey1.l, Type.l, Res.l, Folder$, lpbData.l, cbData.l, PBRegKey.s
  cbData = 1024  ;The PB entry is really quite long!
  lpbData = AllocateMemory(cbData)
  Folder$=""
  hKey1=0
  Type=0
  Res=-1
  If lpbData
    Select OSVersion()
      Case #PB_OS_Windows_95,#PB_OS_Windows_98,#PB_OS_Windows_ME
        PBRegKey="Software\Classes\PureBasic.exe\shell\open\command"
        Res=RegOpenKeyEx_(#HKEY_LOCAL_MACHINE, PBRegKey, 0, #KEY_ALL_ACCESS, @hKey1)
      Case #PB_OS_Windows_NT3_51,#PB_OS_Windows_NT_4,#PB_OS_Windows_2000,#PB_OS_Windows_XP,#PB_OS_Windows_Server_2003
        PBRegKey="Applications\PureBasic.exe\shell\open\command"
        Res=RegOpenKeyEx_(#HKEY_CLASSES_ROOT, PBRegKey, 0, #KEY_ALL_ACCESS, @hKey1)
      Case #PB_OS_Windows_Vista,#PB_OS_Windows_Server_2008,#PB_OS_Windows_Future
        PBRegKey="Software\Classes\PureBasic.exe\shell\open\command"
        Res=RegOpenKeyEx_(#HKEY_CURRENT_USER, PBRegKey, 0, #KEY_ALL_ACCESS , @hKey1)
    EndSelect
    If Res = #ERROR_SUCCESS And hKey1
      If RegQueryValueEx_(hKey1, "", 0, @Type, lpbData, @cbData)=#ERROR_SUCCESS
        Folder$ = PeekS(lpbData)
        Folder$ = GetPathPart(StringField(Folder$,2,Chr(34)))
      EndIf
      RegCloseKey_(hKey1)
    EndIf
    FreeMemory(lpbData)
  EndIf
  ProcedureReturn Folder$
EndProcedure
;/////////////////////////////////////////////////////////////////////////////////


;/////////////////////////////////////////////////////////////////////////////////
;The following function processes a path / filename string and returns the corresponding
;string literal. The result of this is not necessarily a perfectly validated path / filename,
;but that will be picked up later.
;The string could contain concatenations or the #PB_Compiler_Home constant.
;Returns 0 if an error is detected, otherwise non-zero with the resulting string
;placed in gMPBS_Include$.
Procedure.l MergePBSources_ProcessIncludeString(line$)
  Protected result=1, char$="", token$=""
  Protected numberoftokens, length, left, right, i
  gMPBS_Include$ = ""
  length=Len(line$)
  If length
    ;First we 'tokenise' the line.
      left=1 : right=1
      Repeat 
        char$=Mid(line$,right,1)
        If FindString(#_MERGESOURCES_SEPARATOR, char$,1)
          If left<right
            token$ + Mid(line$,left,right-left) + Chr(10)
            numberoftokens+1
            left=right
          ElseIf char$=Chr(34) ;Open quote. left=right
            right = FindString(line$, char$,left+1)
            If right = 0 ;No end quote.
              numberoftokens = 0
              token$=""
              Break
            ElseIf right-left>1
              token$ + Mid(line$,left,right-left+1) + Chr(10)
              numberoftokens+1
            EndIf
            right+1          
            left = right
          ElseIf char$<>" " ;left=right
            token$ + Mid(line$,left,1) + Chr(10)
            numberoftokens+1
            left+1 : right+1
          Else          
            left+1 : right+1
          EndIf
        ElseIf right=length
          right+1
          token$ + Mid(line$,left,right-left) + Chr(10)
          numberoftokens+1
        Else          
          right+1
        EndIf
      Until right>length
    ;Now process the tokens.
      For i = 1 To numberoftokens
        char$ = StringField(token$, i, Chr(10))
        If i = i/2*2 ;Even-numbered tokens must be the '+' operator.
          If char$<>"+"
            result=0
            gMPBS_Include$=""
            Break
          EndIf
        ElseIf LCase(char$) = "#pb_compiler_home"
          gMPBS_Include$+GetPBFolder()
        ElseIf Left(char$,1) = Chr(34)
          gMPBS_Include$ + Mid(char$,2, Len(char$)-2)
        Else
          result=0
          gMPBS_Include$=""
          Break
        EndIf
      Next 
  EndIf
  ProcedureReturn result
EndProcedure
;/////////////////////////////////////////////////////////////////////////////////


;/////////////////////////////////////////////////////////////////////////////////
;The main function.
;Returns one of the constants listed above.
Procedure.l MergePBSources(inFile$, outFile$, flags=0)
  Protected result
  Protected fileStack.StackObject, inFileID
  Protected path$, line$, work$, field$, low, high
  Protected outFileID
  Protected XInclude$
  If FileSize(inFile$) <=0
    ProcedureReturn #MPBS_INVALIDSOURCEFILE
  EndIf
  ;Attempt to create the file stack.
    fileStack = NewStack(#_MERGESOURCES_SIZEOFSTACK)
    If fileStack = 0
      ProcedureReturn #MPBS_INSUFFICIENTMEMORY
    EndIf
  ;Attempt to open the main file.
    path$ = GetPathPart(inFile$)
    inFileID = ReadFile(#PB_Any, inFile$)
    If inFileID = 0
      fileStack\Destroy()
      ProcedureReturn #MPBS_FILEOPENERROR
    EndIf
  ;Attempt to open the output file.
    outFileID = CreateFile(#PB_Any, outFile$)
    If outFileID = 0
      fileStack\Destroy()
      CloseFile(inFileID)      
      ProcedureReturn #MPBS_FILEOPENERROR
    EndIf
  ;Good to go!
  ReadStringFormat(inFileID)
  WriteStringFormat(outFileID, #PB_UTF8) ;This is a PB source file after all.
  result = #MPBS_OKAY
  Repeat      
    If fileStack\NumberOfElementsPushed()
      inFileID = fileStack\pop()
    EndIf
    While Eof(inFileID) = 0
      line$ = ReadString(inFileID, #PB_UTF8)
      work$ = Trim(RemoveString(line$, Chr(9)))
      If work$ Or flags&#MPBS_RemoveEmptyLines = 0
        If flags&#MPBS_RemoveWhiteSpace
          line$ = work$
        EndIf
        ;Check the various 'include' options.
          If FindString(LCase(work$), "includepath",1) = 1
            work$ = Trim(Right(work$,Len(work$)-11)) 
            If work$ = Chr(34)+Chr(34)
              path$=""
            Else
              If MergePBSources_ProcessIncludeString(work$) = 0
                CloseFile(inFileID)
                result = #MPBS_INVALIDPATH ;Error.
                Break 2
              EndIf
              path$=gMPBS_Include$
              If path$
                If Right(path$,1)<>"\"
                  path$+"\"
                EndIf
              EndIf
            EndIf
          ElseIf FindString(LCase(work$), "xincludefile",1) = 1
            work$ = Trim(Right(work$,Len(work$)-12)) 
            If MergePBSources_ProcessIncludeString(work$) = 0
              CloseFile(inFileID)
              result = #MPBS_INVALIDINCLUDEFILE ;Error.
              Break 2
            EndIf
            work$ = gMPBS_Include$
            If FindString(work$, ":",1) = 0
              work$ = path$+work$
            EndIf
            If FileSize(work$)<0
              CloseFile(inFileID)
              gMPBS_Include$ = work$
              result = #MPBS_INCLUDEFILENOTFOUND ;Error.
              Break 2
            EndIf
            ;Here we have an XIncludeFile.
            ;First check if the file has already been included.
              If FindString(XInclude$, "<"+work$+">",1) = 0
                ;Attempt to open the new file.
                  low = ReadFile(#PB_Any, work$)
                  If low = 0
                    CloseFile(inFileID)
                    result = #MPBS_FILEOPENERROR ;Error.
                    Break 2
                  EndIf
                  ReadStringFormat(low)
                ;Push the current file# on the stack.
                  If fileStack\Push(inFileID) = 0
                    CloseFile(low)                    
                    CloseFile(inFileID)
                    result = #MPBS_NESTEDINCLUDESOVERFLOW ;Error.
                    Break 2
                  EndIf
                  inFileID = low
                ;Add this file to the list of already included files.
                XInclude$ + "<"+work$+">"
              EndIf          
          ElseIf FindString(LCase(work$), "includefile",1) = 1
            work$ = Trim(Right(work$,Len(work$)-11)) 
            If MergePBSources_ProcessIncludeString(work$) = 0
              CloseFile(inFileID)
              result = #MPBS_INVALIDINCLUDEFILE ;Error.
              Break 2
            EndIf
            work$ = gMPBS_Include$
            If FindString(work$, ":",1) = 0
              work$ = path$+work$
            EndIf
            If FileSize(work$)<0
              CloseFile(inFileID)
              gMPBS_Include$ = work$
              result = #MPBS_INCLUDEFILENOTFOUND ;Error.
              Break 2
            EndIf
            ;Here we have an IncludeFile.
            ;Attempt to open the new file.
              low = ReadFile(#PB_Any, work$)
              If low = 0
                CloseFile(inFileID)
                result = #MPBS_FILEOPENERROR ;Error.
                Break 2
              EndIf
              ReadStringFormat(low)
            ;Push the current file# on the stack.
              If fileStack\Push(inFileID) = 0
                CloseFile(low)                    
                CloseFile(inFileID)
                result = #MPBS_NESTEDINCLUDESOVERFLOW ;Error.
                Break 2
              EndIf
              inFileID = low
            ;Add this file to the list of already included files.
              XInclude$ + "<"+work$+">"
          Else  ;Write line.
            If Left(work$,1)<>";" Or flags&#MPBS_RemoveCommentedLines = 0
                WriteStringN(outFileID, line$)
            EndIf
          EndIf
      EndIf
    Wend
    CloseFile(inFileID)
  Until fileStack\NumberOfElementsPushed() = 0
  ;Clear the stack if there was an error.
    high = fileStack\NumberOfElementsPushed()
    For low = 1 To high
      inFileID = fileStack\Pop()
      CloseFile(inFileID)
    Next
  fileStack\Destroy()
  CloseFile(outFileID)
  If result <> #MPBS_OKAY ;Error.
    DeleteFile(outFile$)
  Else
    gMPBS_Include$ = ""
  EndIf
  ProcedureReturn result
EndProcedure
;/////////////////////////////////////////////////////////////////////////////////

DisableExplicit


;/////////////////////////////////////////////////////////////////////////////////
;-TEST.
;Uncomment to test.
;MergePBSources("MergePBSources.pbi", "output.pb", #MPBS_RemoveEmptyLines)
;/////////////////////////////////////////////////////////////////////////////////
Last edited by srod on Wed Feb 06, 2008 4:53 pm, edited 4 times in total.
I may look like a mule, but I'm not a complete ass.
#NULL
Addict
Posts: 1499
Joined: Thu Aug 30, 2007 11:54 pm
Location: right here

Post by #NULL »

what is that 'remove whitespace' for? it's killing any indentation.
and i had a problem with removing comments:
i often use folding like that:

Code: Select all

  Case 1 ;{
    abc=123
  ;}
  EndSelect
the comment in the 'case'-line will stay, but the other one will be removed. so the folding is broken (many openings and less closings will be in the out-file)

btw: i do not know right now if ;{ and ;} are folding keywords per default in PB or if i added them by myself. if they are no default actually, forget about it :P

Post by srod »

#NULL wrote:what is that 'remove whitespace' for? it's killing any indentation.
That's what it is supposed to do!

The whole reason I wrote this utility is because I need to parse a PB source file and I'd rather do it with a single file rather than multiple includes. I chose to remove all 'whitespace' from the resulting source file to make parsing that much simpler. It's the same reason why I gave the utility the ability to remove all empty lines if you use the flag #MPBS_RemoveEmptyLines etc.

If you want to keep the indentation then don't use the #MPBS_RemoveWhiteSpace flag!
#NULL wrote:and i had a problem with removing comments...
Yes, the utility will only remove fully commented lines. I've now adjusted the code to use the flag #MPBS_RemoveCommentedLines in place of the old one.
The fact is that the application I am working on only needs to remove fully commented lines etc. Comments tagged on to the end of lines are neither here nor there.

As for the folding; well for my purposes, the source file generated by this utility is not supposed to be fed back into the PB IDE and recompiled etc. If you are doing this and the comment removal is a problem, then opt not to remove the comments - simple! :)
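For anyone who does want comment removal and intact folding, one possible tweak is to treat lines starting with ;{ or ;} as lines to keep. This is only a sketch and not part of the code in the first post; MPBS_IsRemovableComment() is a hypothetical helper name, and it assumes ;{ and ;} are the folding markers in use.

```purebasic
;Hypothetical helper: returns 1 if a trimmed line is a full-line comment which
;can be dropped, and 0 if it should be kept (code, or a ;{ / ;} folding marker).
Procedure.l MPBS_IsRemovableComment(work$)
  Protected result = 0
  If Left(work$, 1) = ";"
    If Left(work$, 2) <> ";{" And Left(work$, 2) <> ";}"
      result = 1
    EndIf
  EndIf
  ProcedureReturn result
EndProcedure

Debug MPBS_IsRemovableComment(";just a comment") ;Shows 1
Debug MPBS_IsRemovableComment(";}")              ;Shows 0
Debug MPBS_IsRemovableComment("Case 1 ;{")       ;Shows 0
```

The 'Write line' test in the main loop would then become something like `If MPBS_IsRemovableComment(work$) = 0 Or flags&#MPBS_RemoveCommentedLines = 0`, so that a closing ;} on its own line survives alongside the opening ;{ on the 'Case' line.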
Last edited by srod on Wed Feb 06, 2008 4:26 pm, edited 1 time in total.

Post by srod »

Bug fixed.

The code in the first post has been adjusted accordingly.
gnozal
PureBasic Expert
Posts: 4229
Joined: Sat Apr 26, 2003 8:27 am
Location: Strasbourg / France

Post by gnozal »

Hi Srod,

is this tool like GPI's jaPBe cutter plugin ?
For free libraries and tools, visit my web site (also home of jaPBe V3 and PureFORM).

Post by srod »

gnozal wrote:Hi Srod,

is this tool like GPI's jaPBe cutter plugin ?
I have no idea; what does the cutter tool do?
inc.
Enthusiast
Posts: 406
Joined: Thu May 06, 2004 4:28 pm
Location: Cologne/GER

Post by inc. »

IIRC "Cutter" does remove unused Procedure definitions.

It seems this one follows the same approach as Remi Meier's "merger" tool.
http://www.purebasic.fr/english/viewtop ... 682#106682


PS: Nice work srod! As all of yours. :)
Check out OOP support for PB here!

Post by gnozal »

inc. wrote:IIRC "Cutter" does remove unused Procedure definitions.
Yes, it merges all project files to one and removes unused procedures.

Post by srod »

I cannot locate Remi Meier's merger tool - the download link seems to be broken.

Post by srod »

Mine won't remove unused procedures (I'll leave that to the PB compiler) but does merge all files. It is of course independent from JaPBe or the PB IDE etc.

Post by #NULL »

yes for further parsing this makes sense.
but i can do my own version thanks to your code, srod :D

remi meier has problems with his webspace. but if you contact him, i guess he can help you.
remi_meier
Enthusiast
Posts: 468
Joined: Sat Dec 20, 2003 6:19 pm
Location: Switzerland

Post by remi_meier »

Yep, webspace gone. And the files are on another PC, as usual :roll:

But anyway, I don't think it is better than yours, it was just an interim solution.

But surely jaPBe's cutter plugin is better, though I don't know if it supports
Unicode, as it was written a long time ago :? . As opposed to PB's ability to
cut unused procedures, the cutter will really do what it should, that means
the cutter will give much smaller executables than PB on its own (search the
forums if you want to know more about it, this is of course not the right place
to discuss this :wink: ).


You talked about a parsing job? I just wanted to point you to my lexer, I still
see so many tools here that reinvent the wheel and create unnecessary
limitations. So perhaps it is also suited for your project:
http://www.purebasic.fr/english/viewtop ... ight=lexer
I've used it for various tools and so far it does its work perfectly. But as I said,
it's not suited for every application.


greetz
remi
Athlon64 3700+, 1024MB Ram, Radeon X1600
luis
Addict
Posts: 3895
Joined: Wed Aug 31, 2005 11:09 pm
Location: Italy

Re: Merge PB sources

Post by luis »

I was about to create something similar for a little utility I had in mind, but I gave up because without macro expansion and the correct evaluation of CompilerIf directives there are really too many holes left open.

And doing it yourself is really too much work, at least for me. Just look at all the possibilities for CompilerIf + Defined, you would need to rewrite the PB parser and keep track of vars, constants, functions, arrays, maps, linked lists, structures, etc.
And all of that would have to be kept up to date whenever something changes in the compiler.
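To make the point concrete, here is a made-up example (both file names are hypothetical). A merger which does not evaluate CompilerIf has to treat both branches as live. With the code in the first post, as far as I can tell, a missing "gui_linux.pbi" on a Windows machine would stop the merge with #MPBS_INCLUDEFILENOTFOUND, even though the compiler itself would never look at that line.

```purebasic
;Hypothetical file names, purely to illustrate the problem.
CompilerIf #PB_Compiler_OS = #PB_OS_Windows
  XIncludeFile "gui_windows.pbi"
CompilerElse
  XIncludeFile "gui_linux.pbi"
CompilerEndIf
```

The same goes for an include which sits behind a CompilerIf Defined(...) test, where the right answer depends on everything declared earlier in the program.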

I hope one day we will have an option to create a single include directly provided by the compiler itself.
http://www.purebasic.fr/english/viewtop ... =3&t=40658

In the meantime I believe nothing that *really* works has been published on the forum, am I right? It would be awesome, but I suppose that is really asking too much.
"Have you tried turning it off and on again ?"