Page 2 of 6
Re: Code optimization: The solution
Posted: Sun Dec 29, 2013 3:39 am
by Tenaja
Here is a possible solution that works with manually entered data. If Fred or Freak were to test implementing this instead of their simple flag setting, it should work, or at least get them on the way. They would need to replace their flag setting with AddProcCall(Proc, CalledProc), and change the Debug statements to their macro calling code generation. I even implemented tracking to avoid duplication.
I did not do anything to handle the new Modules; I'd presume this would be handled the same way as any other Procedure.
Code: Select all
Procedure a()
; *NOT NEEDED* and not included, OK
; never referenced
EndProcedure
Procedure b()
; *NOT NEEDED* but included
; referenced only by itself (recursive)
b()
EndProcedure
Procedure c()
; *NOT NEEDED* but included
; referenced by a function never referenced: d()
EndProcedure
Procedure d()
; *NOT NEEDED* and not included, OK
; never referenced
c()
EndProcedure
Procedure f() ;<<<<<<<<<<<<<<<<<<<<
; NEEDED
; referenced and called
EndProcedure
Procedure g() ;<<<<<<<<<<<<<<<<<<<<
; NEEDED
; referenced and called
EndProcedure
Procedure e() ;<<<<<<<<<<<<<<<<<<<<
; NEEDED
; referenced and called
f()
g()
EndProcedure
Structure ProcList
List ProcList.s()
EndStructure
Global NewMap CallTree.ProcList()
Global NewMap InsertedProcs() ; keeps track to avoid duplication
Procedure ListAllProcs() ; This only lists all proc calls:
Debug "Full Call Tree List:"
ForEach CallTree()
ForEach CallTree()\ProcList()
Debug MapKey(CallTree()) + ": " + CallTree()\ProcList()
Next
Next
Debug " ------------------"
EndProcedure
Procedure InsertUsedProcs(s.s);p())
ForEach CallTree(s)\ProcList()
; Debug "; " + MapKey(CallTree()) + " calls: " + CallTree()\ProcList()
If InsertedProcs(CallTree()\ProcList()) <> #True
Debug " Insert MacroCall: " + CallTree()\ProcList()
EndIf
InsertedProcs(CallTree()\ProcList()) = #True
InsertUsedProcs(CallTree()\ProcList())
Next
EndProcedure
Macro dq
"
EndMacro
Macro AddProcCall(Proc, CalledProc)
AddElement(CallTree(dq#Proc#dq)\proclist())
CallTree(dq#Proc#dq)\proclist() = dq#CalledProc#dq
EndMacro
; Generate a Call Tree:
AddProcCall(b, b)
AddProcCall(d, c)
AddProcCall(e, f)
AddProcCall(e, g)
AddProcCall(e, g) ; to test duplication
AddProcCall(PB_NoProc, e)
AddProcCall(PB_NoProc, ListAllProcs)
AddProcCall(PB_NoProc, InsertUsedProcs)
e() ; force the call to require the inclusion of e()
ListAllProcs()
InsertUsedProcs("PB_NoProc")
Re: Code optimization
Posted: Sun Dec 29, 2013 12:51 pm
by luis
@tenaja
You are right. For some reason I overlooked that now. My apologies.
And yes, a map of lists was the approach I had tried at the time, and then I put that away waiting for the "all-in-one" file to be generated by PB.
I had made a test source like this:
Code: Select all
Procedure a()
; NOT NEEDED
; never called
EndProcedure
Procedure b()
; NOT NEEDED
; referenced only by itself (recursive)
b()
EndProcedure
Procedure c()
; NOT NEEDED
; referenced by a function never called: d()
EndProcedure
Procedure d()
; NOT NEEDED
; never called
c()
EndProcedure
Declare f()
Declare e()
Procedure z()
; NEEDED
; called by e()
f()
e()
EndProcedure
Procedure e()
; NEEDED
; called by f(), z()
f()
z()
EndProcedure
Procedure f()
; NEEDED
; called by toplevel, e(), z()
e()
EndProcedure
Procedure g()
; NEEDED
; referenced by itself and called by toplevel
g()
EndProcedure
f()
g()
and then I tried to see if something like the following would have worked:
Code: Select all
Structure T_CALL_LIST
flgCalled.i
List CallList$()
EndStructure
Global NewList TopLevel$()
Global NewMap Procedures.T_CALL_LIST()
; add proc a() from its body declaration
AddMapElement(Procedures(), "a")
; add proc b() from its body declaration
AddMapElement(Procedures(), "b")
; we found it just call itself, so we don't add that info
; add proc c() from its body declaration
AddMapElement(Procedures(), "c")
; add proc d() from its body declaration
AddMapElement(Procedures(), "d")
; add the referenced procedures in its body
AddElement(Procedures()\CallList$())
Procedures()\CallList$() = "c" ;
; add proc f() from its declare statement
AddMapElement(Procedures(), "f") ;
; add proc e() from its declare statement
AddMapElement(Procedures(), "e") ;
; add proc z() from its body declaration
AddMapElement(Procedures(), "z")
; add the referenced procedures in its body
AddElement(Procedures()\CallList$())
Procedures()\CallList$() = "f" ;
AddElement(Procedures()\CallList$())
Procedures()\CallList$() = "e" ;
; we found proc e() body, it's already in the map so we retrieve it
FindMapElement(Procedures(),"e")
; add the referenced procedures in its body
AddElement(Procedures()\CallList$())
Procedures()\CallList$() = "f"
AddElement(Procedures()\CallList$())
Procedures()\CallList$() = "z"
; we found proc f() body, it's already in the map so we retrieve it
FindMapElement(Procedures(),"f")
; add the referenced procedures in its body
AddElement(Procedures()\CallList$())
Procedures()\CallList$() = "e"
; add proc g() from its body declaration
AddMapElement(Procedures(), "g")
; we found it just call itself, so we don't add that info
; top level call to the proc f()
AddElement(TopLevel$())
TopLevel$() = "f"
; top level call to the proc g()
AddElement(TopLevel$())
TopLevel$() = "g"
Procedure FollowChain (proc$)
Protected called$
ForEach Procedures(proc$)\CallList$()
called$ = Procedures()\CallList$()
If Procedures(called$)\flgCalled = 0
Procedures(called$)\flgCalled = 1
FollowChain(called$)
EndIf
Next
EndProcedure
; mark them
ForEach TopLevel$()
Procedures(TopLevel$())\flgCalled = 1
FollowChain(TopLevel$())
Next
; show them
ForEach Procedures()
If Procedures()\flgCalled
Debug MapKey(Procedures())
EndIf
Next
It seemed ok with the test-code above, I didn't give it any further thought though. Just stored away for future reference or better yet to delete it when PB will do all this by itself. The compiler is the right and obvious place where this stuff must be done, once and for all.
Re: Code optimization
Posted: Sun Dec 29, 2013 4:35 pm
by Tenaja
Coincidentally, our solutions are quite similar, right down to the recursive tracking.
Re: Code optimization
Posted: Sun Dec 29, 2013 5:46 pm
by Danilo
skywalk wrote:@Danilo, can you elaborate on your approach to parse the asm to build a table?
I would start with collecting informations from the ASM output:
- Find procedure macros
- Inside procedure macros, find: PB labels, Calls to other PB procedures, References to PB procedures
- Outside procedures, find: References to PB labels, Calls to PB procedures, References to PB procedures
Code: Select all
EnableExplicit
Procedure ProcessASM(inputFileName.s)
Protected inputFile
Protected currentLine$, trimmedLine$, currentLineNum
Protected inProcedure, currentProcedureName$, regExResultCount, i
Protected regExLabels, regExPBLabels, regExPBLabelsRef, regExProcedureLabels, regExProcedureCall, regExProcedureRef, regExProcedureMacroUsed
Protected Dim regExResult.s(0)
regExLabels = CreateRegularExpression(#PB_Any,"^( )*([a-z]|[A-Z]|[0-9]|_)+:.*",#PB_RegularExpression_NoCase) ; GENERAL LABELS
regExPBLabels = CreateRegularExpression(#PB_Any,"^( )*l_([a-z]|[A-Z]|[0-9]|_)+:.*",#PB_RegularExpression_NoCase) ; PB labels, starting with 'l_'
regExPBLabelsRef = CreateRegularExpression(#PB_Any,"^( )*l_([a-z]|[A-Z]|[0-9]|_)+.*",#PB_RegularExpression_NoCase) ; PB label reference, using 'l_'
regExProcedureLabels = CreateRegularExpression(#PB_Any,"^( )*_Procedure([0-9])+:.*",#PB_RegularExpression_NoCase) ; PB procedure labels, starting with '_Procedure'
regExProcedureCall = CreateRegularExpression(#PB_Any,"^.*CALL.*_Procedure([0-9])+.*",#PB_RegularExpression_NoCase) ; PB procedure call, using 'CALL _ProcedureXXX' labels
regExProcedureRef = CreateRegularExpression(#PB_Any,"^.*_Procedure([0-9])+.*",#PB_RegularExpression_NoCase) ; PB procedure reference, using '_ProcedureXXX' labels
regExProcedureMacroUsed = CreateRegularExpression(#PB_Any,"^( )*MP([0-9])+.*",#PB_RegularExpression_NoCase) ; PB procedure macro used, starting with 'MP'
inputFile = ReadFile(#PB_Any,inputFileName)
If inputFile = 0
Debug "File '"+inputFileName+"' not found."
ProcedureReturn
EndIf
currentProcedureName$ = "GLOBAL"
Protected stringFormat = ReadStringFormat(inputFile)
While Not Eof(inputFile)
currentLineNum + 1
currentLine$ = ReadString(inputFile,stringFormat)
trimmedLine$ = Trim(currentLine$)
If Left(trimmedLine$,8) = "macro MP"
Debug "(line: "+Str(currentLineNum)+") Found procedure macro: "+Right(trimmedLine$,Len(trimmedLine$)-6)
inProcedure = #True
ElseIf inProcedure = #True And Left(trimmedLine$,1) = "}"
Debug "(line: "+Str(currentLineNum)+") End procedure macro"
Debug ""
currentProcedureName$ = "GLOBAL"
inProcedure = #False
ElseIf inProcedure = #True
If MatchRegularExpression(regExProcedureLabels,trimmedLine$)
regExResultCount = ExtractRegularExpression(regExProcedureLabels,trimmedLine$,regExResult())
If regExResultCount
regExResultCount - 1
For i = 0 To regExResultCount
Debug " (line: "+Str(currentLineNum)+") Found PB procedure label inside procedure: "+regExResult(i)
currentProcedureName$ = Trim(Trim(regExResult(i),":"))
Debug " current Procedure is: "+currentProcedureName$
Next
EndIf
ElseIf MatchRegularExpression(regExPBLabels,trimmedLine$)
regExResultCount = ExtractRegularExpression(regExPBLabels,trimmedLine$,regExResult())
If regExResultCount
regExResultCount - 1
For i = 0 To regExResultCount
Debug " (line: "+Str(currentLineNum)+") Found PB label inside procedure: "+regExResult(i)
Next
EndIf
ElseIf MatchRegularExpression(regExLabels,trimmedLine$)
regExResultCount = ExtractRegularExpression(regExLabels,trimmedLine$,regExResult())
If regExResultCount
regExResultCount - 1
For i = 0 To regExResultCount
;Debug " (line: "+Str(currentLineNum)+") Found label: "+regExResult(i)
Next
EndIf
ElseIf MatchRegularExpression(regExProcedureCall,trimmedLine$)
regExResultCount = ExtractRegularExpression(regExProcedureCall,trimmedLine$,regExResult())
If regExResultCount
regExResultCount - 1
For i = 0 To regExResultCount
Debug " (line: "+Str(currentLineNum)+") Found procedure call: "+regExResult(i)
Next
EndIf
ElseIf MatchRegularExpression(regExProcedureRef,trimmedLine$)
regExResultCount = ExtractRegularExpression(regExProcedureRef,trimmedLine$,regExResult())
If regExResultCount
regExResultCount - 1
For i = 0 To regExResultCount
Debug " (line: "+Str(currentLineNum)+") Found reference to procedure: "+regExResult(i)
Next
EndIf
ElseIf MatchRegularExpression(regExPBLabelsRef,trimmedLine$)
regExResultCount = ExtractRegularExpression(regExPBLabelsRef,trimmedLine$,regExResult())
If regExResultCount
regExResultCount - 1
For i = 0 To regExResultCount
Debug " (line: "+Str(currentLineNum)+") Found reference to PB label inside procedure: "+regExResult(i)
Next
EndIf
EndIf
Else
If MatchRegularExpression(regExProcedureCall,trimmedLine$)
regExResultCount = ExtractRegularExpression(regExProcedureCall,trimmedLine$,regExResult())
If regExResultCount
regExResultCount - 1
For i = 0 To regExResultCount
Debug "(line: "+Str(currentLineNum)+") Found procedure call: "+regExResult(i)
Next
EndIf
ElseIf MatchRegularExpression(regExProcedureRef,trimmedLine$)
regExResultCount = ExtractRegularExpression(regExProcedureRef,trimmedLine$,regExResult())
If regExResultCount
regExResultCount - 1
For i = 0 To regExResultCount
Debug "(line: "+Str(currentLineNum)+") Found reference to procedure: "+regExResult(i)
Next
EndIf
ElseIf MatchRegularExpression(regExPBLabelsRef,trimmedLine$)
regExResultCount = ExtractRegularExpression(regExPBLabelsRef,trimmedLine$,regExResult())
If regExResultCount
regExResultCount - 1
For i = 0 To regExResultCount
Debug "(line: "+Str(currentLineNum)+") Found reference To PB label outside Procedure: "+regExResult(i)
Next
EndIf
ElseIf MatchRegularExpression(regExProcedureMacroUsed,trimmedLine$)
regExResultCount = ExtractRegularExpression(regExProcedureMacroUsed,trimmedLine$,regExResult())
If regExResultCount
regExResultCount - 1
For i = 0 To regExResultCount
Debug "(line: "+Str(currentLineNum)+") Including procedure macro: "+regExResult(i)
Next
EndIf
EndIf
EndIf
Wend
CloseFile(inputFile)
EndProcedure
ProcessASM("PureBasic.asm")
Re: Code optimization
Posted: Sun Dec 29, 2013 6:40 pm
by skywalk
Thanks Danilo, I wrote a parser and have all the CALLs and _Procedures, but I haven't figured out the logic to filter the CALL's?
Is it as simple as this?
CALLs_macro()
- Inside procedure macros = CALLs found between "MPxxx{" AND "}"
Store these in a MAP list()
Each MPxx can have n CALLs to other Procedures.
CALLs_main()
- Outside procedure macros = All other CALLs
For each CALLs_main()
Add each CALLs_macro() including its MAP contents to another MAP called CALLs_unique
This serves the purpose of eliminating redundant CALLs.
For each CALLs_unique()
Print list.
Re: Code optimization
Posted: Sun Dec 29, 2013 7:08 pm
by Danilo
skywalk wrote:Is it as simple as this?
Yes. You need to include also references to PB procedures, not just calls.
References (address of PB procedure) can be used in global space, inside procedures, and also in data sections at the very end of the ASM output.
Code: Select all
Procedure p1() : EndProcedure
Procedure p2() : EndProcedure
Procedure p3() : EndProcedure
Procedure p4()
p1() ; procedure call
a = @p2() ; reference to procedure
p4_label:
EndProcedure
p1() ; procedure call
a = @p2() ; reference to procedure
DataSection
procs:
Data.i @p1(), @p2() ; reference to procedures
EndDataSection
DeclareModule module01
Declare modProc1()
DataSection
procs:
Data.i @modProc1() ; reference to procedures
EndDataSection
EndDeclareModule
Module module01
Procedure modProc1()
modProcLabel:
EndProcedure
modProc1() ; procedure call
a = @modProc1() ; reference to procedure
EndModule
module01::modProc1() ; procedure call
a = module01::@modProc1() ; reference to procedure
x = module01::?procs
Re: Code optimization
Posted: Sun Dec 29, 2013 7:15 pm
by skywalk
Ah, I forgot about the Procedure pointers in DataSections and elsewhere!
Re: Code optimization
Posted: Tue Dec 31, 2013 8:37 pm
by skywalk
Another commented ASM question?
Code: Select all
;FROM example xxx.asm file.
_PB_EndFunctions:
CALL _PB_EndThread@0
CALL _PB_FreeMails@0
CALL _PB_FreeLibraries@0
CALL _PB_FreeNetworks@0
CALL _PB_FreeGadgets@0
CALL _PB_FreeWindows@0
CALL _PB_FreeFonts@0
CALL _PB_FreeFileSystem@0
CALL _PB_FreeFiles@0
CALL _PB_Event_Free@0
CALL _PB_FreeDesktops@0
CALL _PB_FreeImages@0
CALL _PB_EndAlphaImage@0
CALL _PB_FreeMemorys@0
RET
;
MP600
MP344
MP392
MP244
MP238
MP228
MP562
MP548
; and so on...
What are the trailing MPxxx Procedures at the end of the Main code?
Are they executed in some order or are they placeholders?
Some of them are never called in my final app so it is confusing still how to drop Procedures.

Re: Code optimization
Posted: Tue Dec 31, 2013 9:10 pm
by Danilo
skywalk wrote:Another commented ASM question?
Code: Select all
;
MP600
MP344
MP392
MP244
MP238
MP228
MP562
MP548
; and so on...
What are the trailing MPxxx Procedures at the end of the Main code?
Are they executed in some order or are they placeholders?
Some of them are never called in my final app so it is confusing still how to drop Procedures.

This are the inserted procedure macros.
"macro MP600{ .. }" is the macro definition.
"MP600" is the actual macro use. The macro is inserted at this point.
Same as PB macros.
To remove unused procedures, you need to remove (comment out) this macro uses.
You found macro procedures that are not called within your program, so just remove them.
For example, if "_Procedure600" is never called nor referenced, you can remove the "MP600" line.
Happy new year!

Re: Code optimization
Posted: Tue Dec 31, 2013 9:27 pm
by skywalk
Ok, I understand a little better.
For a small app(5 XIncludeFiles), I am able to reduce my Procedure count from 293 to 123 with a single pass.
But I still have the problem of unused Procedures calling other unused Procedures.
These then get added to the bottom of the MPXXX list and I don't have clear logic to eliminate them since they are not in the MAIN code as calls or references?
Have to think on it some more.
Re: Code optimization
Posted: Tue Dec 31, 2013 10:30 pm
by Danilo
Next year, OK?

Re: Code optimization
Posted: Tue Dec 31, 2013 10:37 pm
by skywalk
haha, Happy New Year! Still some time left here

Re: Code optimization
Posted: Wed Jan 01, 2014 6:18 pm
by Danilo
skywalk wrote:haha, Happy New Year! Still some time left here

Hope you had a good night.
Now that we have a new year, could you try the following with your project, please:
DeadProcedureRemover.zip ?
Code: Select all
Found 1047 procedure macros.
Removed 396 out of 1010 used procedure macros.
Source is included.
Re: Code optimization
Posted: Wed Jan 01, 2014 7:27 pm
by skywalk
Thanks for the New Year's gift Danilo
You answered what to do after determining the minimum list of Procedures.
1. Recompile after rearranging the original source code?
or
2. Comment out unused Procedures in the ASM output and recompile using /REASM switch.
Option 2 is way faster

Just need to verify all version info/icon is retained(from IDE settings), otherwise I have to start using resource files.
I remember this was not the case when running the pbcompiler.exe outside the IDE.
Re: Code optimization
Posted: Thu Jan 02, 2014 8:12 am
by Danilo
Maybe Fred can rework the algorithm to include procedures a little bit.
I can easily save 20% EXE size with example projects, just by removing unused procedures (dead code) with my tool. 116k -> 92k.
Removing unused PB libraries could save even more, when removing unnecessary PB library initializations and EndFunction calls.
20% - 30% is a big thing when fighting against
BLOATED .EXEs from competitors...
Together with the correctness of compilation and runtime speed of the generated code, it is an indication of the quality of a compiler.