Code optimization

Got an idea for enhancing PureBasic? New command(s) you'd like to see?
User avatar
Tenaja
Addict
Addict
Posts: 1959
Joined: Tue Nov 09, 2010 10:15 pm

Re: Code optimization: The solution

Post by Tenaja »

Here is a possible solution that works with manually entered data. If Fred or Freak were to test implementing this instead of their simple flag setting, it should work, or at least get them on the way. They would need to replace their flag setting with AddProcCall(Proc, CalledProc), and change the Debug statements to their macro calling code generation. I even implemented tracking to avoid duplication.

I did not do anything to handle the new Modules; I'd presume this would be handled the same way as any other Procedure.

Code: Select all


Procedure a()
	; *NOT NEEDED* and not included, OK
	; never referenced 
EndProcedure

Procedure b()
	; *NOT NEEDED* but included
	; referenced only by itself (recursive)
	b()
EndProcedure

Procedure c()
	; *NOT NEEDED* but included
	; referenced by a function never referenced: d()
EndProcedure

Procedure d()
	; *NOT NEEDED* and not included, OK
	; never referenced
	c()
EndProcedure

Procedure f()				;<<<<<<<<<<<<<<<<<<<<
	; NEEDED
	; referenced and called
EndProcedure

Procedure g()				;<<<<<<<<<<<<<<<<<<<<
	; NEEDED
	; referenced and called
EndProcedure

Procedure e()				;<<<<<<<<<<<<<<<<<<<<
	; NEEDED
	; referenced and called
	f()
	g()
EndProcedure



Structure ProcList
	List ProcList.s()
EndStructure

Global NewMap CallTree.ProcList()
Global NewMap InsertedProcs()		; keeps track to avoid duplication



Procedure ListAllProcs()	; This only lists all proc calls:
	Debug "Full Call Tree List:"
	ForEach CallTree()
		ForEach CallTree()\ProcList()
			Debug MapKey(CallTree()) + ": " + CallTree()\ProcList()
		Next
	Next
	Debug " ------------------"
EndProcedure


Procedure InsertUsedProcs(s.s);p())
	ForEach CallTree(s)\ProcList()
		; 		Debug "; " + MapKey(CallTree()) + " calls: " + CallTree()\ProcList()
		If InsertedProcs(CallTree()\ProcList()) <> #True
			Debug " Insert MacroCall: " + CallTree()\ProcList()
		EndIf
		InsertedProcs(CallTree()\ProcList()) = #True
		InsertUsedProcs(CallTree()\ProcList())
	Next
	
EndProcedure

Macro dq
	"
EndMacro

Macro AddProcCall(Proc, CalledProc)
	AddElement(CallTree(dq#Proc#dq)\proclist())
	CallTree(dq#Proc#dq)\proclist() = dq#CalledProc#dq
EndMacro

; Generate a Call Tree:
AddProcCall(b, b)
AddProcCall(d, c)
AddProcCall(e, f)
AddProcCall(e, g)
AddProcCall(e, g)				; to test duplication
AddProcCall(PB_NoProc, e)
AddProcCall(PB_NoProc, ListAllProcs)
AddProcCall(PB_NoProc, InsertUsedProcs)


e()			; force the call to require the inclusion of e()
ListAllProcs()	
InsertUsedProcs("PB_NoProc")

User avatar
luis
Addict
Addict
Posts: 3893
Joined: Wed Aug 31, 2005 11:09 pm
Location: Italy

Re: Code optimization

Post by luis »

@tenaja

You are right. For some reason I overlooked that now. My apologies.

And yes, a map of lists was the approach I had tried at the time, and then I put that away waiting for the "all-in-one" file to be generated by PB.

I had made a test source like this:

Code: Select all

Procedure a()
  ; NOT NEEDED
  ; never called 
EndProcedure

Procedure b()
  ; NOT NEEDED
  ; referenced only by itself (recursive)
  b()
EndProcedure

Procedure c()
  ; NOT NEEDED
  ; referenced by a function never called: d()
EndProcedure

Procedure d()
  ; NOT NEEDED
  ; never called
  c()
EndProcedure

Declare f()
Declare e()

Procedure z()
  ; NEEDED
  ; called by e()
f()
e()
EndProcedure

Procedure e()
  ; NEEDED
  ; called by f(), z()
  f()
  z()
EndProcedure

Procedure f()  
  ; NEEDED
  ; called by toplevel, e(), z()
  e()
EndProcedure

Procedure g()
  ; NEEDED
  ; referenced by itself and called by toplevel
  g()
EndProcedure

f()
g()

and then I tried to see if something like the following would have worked:

Code: Select all

Structure T_CALL_LIST
 flgCalled.i
 List CallList$()
EndStructure

Global NewList TopLevel$()
Global NewMap Procedures.T_CALL_LIST()

; add proc a() from its body declaration
AddMapElement(Procedures(), "a") 

; add proc b() from its body declaration
AddMapElement(Procedures(), "b") 
; we found it just call itself, so we don't add that info

; add proc c() from its body declaration
AddMapElement(Procedures(), "c")

; add proc d() from its body declaration
AddMapElement(Procedures(), "d")
; add the referenced procedures in its body
AddElement(Procedures()\CallList$())
Procedures()\CallList$() = "c" ; 

; add proc f() from its declare statement
AddMapElement(Procedures(), "f") ; 

; add proc e() from its declare statement
AddMapElement(Procedures(), "e") ; 

; add proc z() from its body declaration
AddMapElement(Procedures(), "z")
; add the referenced procedures in its body
AddElement(Procedures()\CallList$())
Procedures()\CallList$() = "f" ; 
AddElement(Procedures()\CallList$())
Procedures()\CallList$() = "e" ; 

; we found proc e() body, it's already in the map so we retrieve it
FindMapElement(Procedures(),"e")
; add the referenced procedures in its body
AddElement(Procedures()\CallList$())
Procedures()\CallList$() = "f" 
AddElement(Procedures()\CallList$())
Procedures()\CallList$() = "z" 

; we found proc f() body, it's already in the map so we retrieve it
FindMapElement(Procedures(),"f")
; add the referenced procedures in its body
AddElement(Procedures()\CallList$())
Procedures()\CallList$() = "e"

; add proc g() from its body declaration
AddMapElement(Procedures(), "g")
; we found it just call itself, so we don't add that info

; top level call to the proc f()
AddElement(TopLevel$())
TopLevel$() = "f" 

; top level call to the proc g()
AddElement(TopLevel$())
TopLevel$() = "g" 


Procedure FollowChain (proc$)
 Protected called$
 
 ForEach Procedures(proc$)\CallList$()
    called$ = Procedures()\CallList$()
    If Procedures(called$)\flgCalled = 0
        Procedures(called$)\flgCalled = 1
        FollowChain(called$)
    EndIf
 Next 
EndProcedure

; mark them
ForEach TopLevel$()    
    Procedures(TopLevel$())\flgCalled = 1    
    FollowChain(TopLevel$())       
Next

; show them
ForEach Procedures()
    If Procedures()\flgCalled        
        Debug MapKey(Procedures())
    EndIf
Next
It seemed ok with the test-code above, I didn't give it any further thought though. Just stored away for future reference or better yet to delete it when PB will do all this by itself. The compiler is the right and obvious place where this stuff must be done, once and for all.
"Have you tried turning it off and on again ?"
A little PureBasic review
User avatar
Tenaja
Addict
Addict
Posts: 1959
Joined: Tue Nov 09, 2010 10:15 pm

Re: Code optimization

Post by Tenaja »

Coincidentally, our solutions are quite similar, right down to the recursive tracking.
User avatar
Danilo
Addict
Addict
Posts: 3036
Joined: Sat Apr 26, 2003 8:26 am
Location: Planet Earth

Re: Code optimization

Post by Danilo »

skywalk wrote:@Danilo, can you elaborate on your approach to parse the asm to build a table?
I would start with collecting informations from the ASM output:
- Find procedure macros
- Inside procedure macros, find: PB labels, Calls to other PB procedures, References to PB procedures
- Outside procedures, find: References to PB labels, Calls to PB procedures, References to PB procedures

Code: Select all

EnableExplicit

Procedure ProcessASM(inputFileName.s)
    Protected inputFile
    Protected currentLine$, trimmedLine$, currentLineNum
    Protected inProcedure, currentProcedureName$, regExResultCount, i
    
    Protected regExLabels, regExPBLabels, regExPBLabelsRef, regExProcedureLabels, regExProcedureCall, regExProcedureRef, regExProcedureMacroUsed
    Protected Dim regExResult.s(0)
    
    regExLabels             = CreateRegularExpression(#PB_Any,"^( )*([a-z]|[A-Z]|[0-9]|_)+:.*",#PB_RegularExpression_NoCase)    ; GENERAL LABELS
    regExPBLabels           = CreateRegularExpression(#PB_Any,"^( )*l_([a-z]|[A-Z]|[0-9]|_)+:.*",#PB_RegularExpression_NoCase)  ; PB labels, starting with 'l_'
    regExPBLabelsRef        = CreateRegularExpression(#PB_Any,"^( )*l_([a-z]|[A-Z]|[0-9]|_)+.*",#PB_RegularExpression_NoCase)   ; PB label reference, using 'l_'
    regExProcedureLabels    = CreateRegularExpression(#PB_Any,"^( )*_Procedure([0-9])+:.*",#PB_RegularExpression_NoCase)        ; PB procedure labels, starting with '_Procedure'
    regExProcedureCall      = CreateRegularExpression(#PB_Any,"^.*CALL.*_Procedure([0-9])+.*",#PB_RegularExpression_NoCase)     ; PB procedure call, using 'CALL _ProcedureXXX' labels
    regExProcedureRef       = CreateRegularExpression(#PB_Any,"^.*_Procedure([0-9])+.*",#PB_RegularExpression_NoCase)           ; PB procedure reference, using '_ProcedureXXX' labels
    regExProcedureMacroUsed = CreateRegularExpression(#PB_Any,"^( )*MP([0-9])+.*",#PB_RegularExpression_NoCase)                 ; PB procedure macro used, starting with 'MP'
    
    inputFile = ReadFile(#PB_Any,inputFileName)
    If inputFile = 0
        Debug "File '"+inputFileName+"' not found."
        ProcedureReturn
    EndIf
    
    currentProcedureName$ = "GLOBAL"
    
    Protected stringFormat = ReadStringFormat(inputFile)
    
    While Not Eof(inputFile)
        currentLineNum + 1
        currentLine$ = ReadString(inputFile,stringFormat)
        trimmedLine$ = Trim(currentLine$)
        
        If Left(trimmedLine$,8) = "macro MP"
            Debug "(line: "+Str(currentLineNum)+") Found procedure macro: "+Right(trimmedLine$,Len(trimmedLine$)-6)
            inProcedure = #True
        ElseIf inProcedure = #True And Left(trimmedLine$,1) = "}"
            Debug "(line: "+Str(currentLineNum)+") End procedure macro"
            Debug ""
            currentProcedureName$ = "GLOBAL"
            inProcedure = #False
        ElseIf inProcedure = #True
            If MatchRegularExpression(regExProcedureLabels,trimmedLine$)
                regExResultCount = ExtractRegularExpression(regExProcedureLabels,trimmedLine$,regExResult())
                If regExResultCount
                    regExResultCount - 1
                    For i = 0 To regExResultCount
                        Debug "   (line: "+Str(currentLineNum)+") Found PB procedure label inside procedure: "+regExResult(i)
                        currentProcedureName$ = Trim(Trim(regExResult(i),":"))
                        Debug "   current Procedure is: "+currentProcedureName$
                    Next
                EndIf
            ElseIf MatchRegularExpression(regExPBLabels,trimmedLine$)
                regExResultCount = ExtractRegularExpression(regExPBLabels,trimmedLine$,regExResult())
                If regExResultCount
                    regExResultCount - 1
                    For i = 0 To regExResultCount
                        Debug "   (line: "+Str(currentLineNum)+") Found PB label inside procedure: "+regExResult(i)
                    Next
                EndIf
            ElseIf MatchRegularExpression(regExLabels,trimmedLine$)
                regExResultCount = ExtractRegularExpression(regExLabels,trimmedLine$,regExResult())
                If regExResultCount
                    regExResultCount - 1
                    For i = 0 To regExResultCount
                        ;Debug "   (line: "+Str(currentLineNum)+") Found label: "+regExResult(i)
                    Next
                EndIf
            ElseIf MatchRegularExpression(regExProcedureCall,trimmedLine$)
                regExResultCount = ExtractRegularExpression(regExProcedureCall,trimmedLine$,regExResult())
                If regExResultCount
                    regExResultCount - 1
                    For i = 0 To regExResultCount
                        Debug "   (line: "+Str(currentLineNum)+") Found procedure call: "+regExResult(i)
                    Next
                EndIf
            ElseIf MatchRegularExpression(regExProcedureRef,trimmedLine$)
                regExResultCount = ExtractRegularExpression(regExProcedureRef,trimmedLine$,regExResult())
                If regExResultCount
                    regExResultCount - 1
                    For i = 0 To regExResultCount
                        Debug "   (line: "+Str(currentLineNum)+") Found reference to procedure: "+regExResult(i)
                    Next
                EndIf
            ElseIf MatchRegularExpression(regExPBLabelsRef,trimmedLine$)
                regExResultCount = ExtractRegularExpression(regExPBLabelsRef,trimmedLine$,regExResult())
                If regExResultCount
                    regExResultCount - 1
                    For i = 0 To regExResultCount
                        Debug "   (line: "+Str(currentLineNum)+") Found reference to PB label inside procedure: "+regExResult(i)
                    Next
                EndIf
            EndIf
        Else
            If MatchRegularExpression(regExProcedureCall,trimmedLine$)
                regExResultCount = ExtractRegularExpression(regExProcedureCall,trimmedLine$,regExResult())
                If regExResultCount
                    regExResultCount - 1
                    For i = 0 To regExResultCount
                        Debug "(line: "+Str(currentLineNum)+") Found procedure call: "+regExResult(i)
                    Next
                EndIf
            ElseIf MatchRegularExpression(regExProcedureRef,trimmedLine$)
                regExResultCount = ExtractRegularExpression(regExProcedureRef,trimmedLine$,regExResult())
                If regExResultCount
                    regExResultCount - 1
                    For i = 0 To regExResultCount
                        Debug "(line: "+Str(currentLineNum)+") Found reference to procedure: "+regExResult(i)
                    Next
                EndIf
            ElseIf MatchRegularExpression(regExPBLabelsRef,trimmedLine$)
                regExResultCount = ExtractRegularExpression(regExPBLabelsRef,trimmedLine$,regExResult())
                If regExResultCount
                    regExResultCount - 1
                    For i = 0 To regExResultCount
                        Debug "(line: "+Str(currentLineNum)+") Found reference To PB label outside Procedure: "+regExResult(i)
                    Next
                EndIf
            ElseIf MatchRegularExpression(regExProcedureMacroUsed,trimmedLine$)
                regExResultCount = ExtractRegularExpression(regExProcedureMacroUsed,trimmedLine$,regExResult())
                If regExResultCount
                    regExResultCount - 1
                    For i = 0 To regExResultCount
                        Debug "(line: "+Str(currentLineNum)+") Including procedure macro: "+regExResult(i)
                    Next
                EndIf
            EndIf
        EndIf
    Wend
    
    CloseFile(inputFile)
EndProcedure

ProcessASM("PureBasic.asm")
User avatar
skywalk
Addict
Addict
Posts: 4210
Joined: Wed Dec 23, 2009 10:14 pm
Location: Boston, MA

Re: Code optimization

Post by skywalk »

Thanks Danilo, I wrote a parser and have all the CALLs and _Procedures, but I haven't figured out the logic to filter the CALL's?
Is it as simple as this?
CALLs_macro()
- Inside procedure macros = CALLs found between "MPxxx{" AND "}"
Store these in a MAP list()
Each MPxx can have n CALLs to other Procedures.
CALLs_main()
- Outside procedure macros = All other CALLs

For each CALLs_main()
Add each CALLs_macro() including its MAP contents to another MAP called CALLs_unique
This serves the purpose of eliminating redundant CALLs.

For each CALLs_unique()
Print list.
The nice thing about standards is there are so many to choose from. ~ Andrew Tanenbaum
User avatar
Danilo
Addict
Addict
Posts: 3036
Joined: Sat Apr 26, 2003 8:26 am
Location: Planet Earth

Re: Code optimization

Post by Danilo »

skywalk wrote:Is it as simple as this?
Yes. You need to include also references to PB procedures, not just calls.

References (address of PB procedure) can be used in global space, inside procedures, and also in data sections at the very end of the ASM output.

Code: Select all

Procedure p1() : EndProcedure
Procedure p2() : EndProcedure
Procedure p3() : EndProcedure

Procedure p4()
    p1()      ; procedure call
    a = @p2() ; reference to procedure
    p4_label:
EndProcedure

p1()      ; procedure call
a = @p2() ; reference to procedure

DataSection
    procs:
    Data.i @p1(), @p2() ; reference to procedures
EndDataSection


DeclareModule module01
    Declare modProc1()
    
    DataSection
        procs:
        Data.i @modProc1() ; reference to procedures
    EndDataSection
    
EndDeclareModule

Module module01
    Procedure modProc1()
        modProcLabel:
    EndProcedure
    
    modProc1()      ; procedure call
    a = @modProc1() ; reference to procedure

EndModule


module01::modProc1()      ; procedure call
a = module01::@modProc1() ; reference to procedure

x = module01::?procs
Last edited by Danilo on Sun Dec 29, 2013 7:18 pm, edited 1 time in total.
User avatar
skywalk
Addict
Addict
Posts: 4210
Joined: Wed Dec 23, 2009 10:14 pm
Location: Boston, MA

Re: Code optimization

Post by skywalk »

Ah, I forgot about the Procedure pointers in DataSections and elsewhere!
The nice thing about standards is there are so many to choose from. ~ Andrew Tanenbaum
User avatar
skywalk
Addict
Addict
Posts: 4210
Joined: Wed Dec 23, 2009 10:14 pm
Location: Boston, MA

Re: Code optimization

Post by skywalk »

Another commented ASM question?

Code: Select all

;FROM example xxx.asm file.
_PB_EndFunctions:
  CALL  _PB_EndThread@0
  CALL  _PB_FreeMails@0
  CALL  _PB_FreeLibraries@0
  CALL  _PB_FreeNetworks@0
  CALL  _PB_FreeGadgets@0
  CALL  _PB_FreeWindows@0
  CALL  _PB_FreeFonts@0
  CALL  _PB_FreeFileSystem@0
  CALL  _PB_FreeFiles@0
  CALL  _PB_Event_Free@0
  CALL  _PB_FreeDesktops@0
  CALL  _PB_FreeImages@0
  CALL  _PB_EndAlphaImage@0
  CALL  _PB_FreeMemorys@0
  RET
; 
MP600
MP344
MP392
MP244
MP238
MP228
MP562
MP548
; and so on...
What are the trailing MPxxx Procedures at the end of the Main code?
Are they executed in some order or are they placeholders?
Some of them are never called in my final app so it is confusing still how to drop Procedures. :idea:
The nice thing about standards is there are so many to choose from. ~ Andrew Tanenbaum
User avatar
Danilo
Addict
Addict
Posts: 3036
Joined: Sat Apr 26, 2003 8:26 am
Location: Planet Earth

Re: Code optimization

Post by Danilo »

skywalk wrote:Another commented ASM question?

Code: Select all

; 
MP600
MP344
MP392
MP244
MP238
MP228
MP562
MP548
; and so on...
What are the trailing MPxxx Procedures at the end of the Main code?
Are they executed in some order or are they placeholders?
Some of them are never called in my final app so it is confusing still how to drop Procedures. :idea:
This are the inserted procedure macros.

"macro MP600{ .. }" is the macro definition.
"MP600" is the actual macro use. The macro is inserted at this point.
Same as PB macros.

To remove unused procedures, you need to remove (comment out) this macro uses.
You found macro procedures that are not called within your program, so just remove them.
For example, if "_Procedure600" is never called nor referenced, you can remove the "MP600" line.

Happy new year! ;)
User avatar
skywalk
Addict
Addict
Posts: 4210
Joined: Wed Dec 23, 2009 10:14 pm
Location: Boston, MA

Re: Code optimization

Post by skywalk »

Ok, I understand a little better.
For a small app(5 XIncludeFiles), I am able to reduce my Procedure count from 293 to 123 with a single pass.
But I still have the problem of unused Procedures calling other unused Procedures.
These then get added to the bottom of the MPXXX list and I don't have clear logic to eliminate them since they are not in the MAIN code as calls or references?
Have to think on it some more.
The nice thing about standards is there are so many to choose from. ~ Andrew Tanenbaum
User avatar
Danilo
Addict
Addict
Posts: 3036
Joined: Sat Apr 26, 2003 8:26 am
Location: Planet Earth

Re: Code optimization

Post by Danilo »

Next year, OK? ;)
User avatar
skywalk
Addict
Addict
Posts: 4210
Joined: Wed Dec 23, 2009 10:14 pm
Location: Boston, MA

Re: Code optimization

Post by skywalk »

haha, Happy New Year! Still some time left here :lol:
The nice thing about standards is there are so many to choose from. ~ Andrew Tanenbaum
User avatar
Danilo
Addict
Addict
Posts: 3036
Joined: Sat Apr 26, 2003 8:26 am
Location: Planet Earth

Re: Code optimization

Post by Danilo »

skywalk wrote:haha, Happy New Year! Still some time left here :lol:
Hope you had a good night.

Now that we have a new year, could you try the following with your project, please: DeadProcedureRemover.zip ?

Code: Select all

Found 1047 procedure macros.
Removed 396 out of 1010 used procedure macros.
Source is included.
User avatar
skywalk
Addict
Addict
Posts: 4210
Joined: Wed Dec 23, 2009 10:14 pm
Location: Boston, MA

Re: Code optimization

Post by skywalk »

Thanks for the New Year's gift Danilo 8)
You answered what to do after determining the minimum list of Procedures.
1. Recompile after rearranging the original source code?
or
2. Comment out unused Procedures in the ASM output and recompile using /REASM switch.
Option 2 is way faster :)
Just need to verify all version info/icon is retained(from IDE settings), otherwise I have to start using resource files. :?:
I remember this was not the case when running the pbcompiler.exe outside the IDE.
The nice thing about standards is there are so many to choose from. ~ Andrew Tanenbaum
User avatar
Danilo
Addict
Addict
Posts: 3036
Joined: Sat Apr 26, 2003 8:26 am
Location: Planet Earth

Re: Code optimization

Post by Danilo »

Maybe Fred can rework the algorithm to include procedures a little bit.
I can easily save 20% EXE size with example projects, just by removing unused procedures (dead code) with my tool. 116k -> 92k.
Removing unused PB libraries could save even more, when removing unnecessary PB library initializations and EndFunction calls.

20% - 30% is a big thing when fighting against BLOATED .EXEs from competitors... :)

Together with the correctness of compilation and runtime speed of the generated code, it is an indication of the quality of a compiler.
Locked