MSCOFF code symbol dumper!

Posted: Sat Mar 03, 2007 7:28 pm
by srod
Not sure how much use this will be to others, but there's no harm in posting. :)

In need of a quick utility for listing all code symbols in an MSCOFF format object file, I threw the following together.

It lists all public (i.e. non-static) symbols residing in code sections, which will generally be the names of functions. It can easily be adjusted to list other symbols (e.g. those in .bss sections).

I haven't tested it with non-MS COFF files, but I don't think it would be too difficult to adjust the code accordingly.

Code:

;'CoffExportDump'.
;-----------------
;   Stephen Rodriguez.
;   Created with Purebasic 4.02 for Windows.
;
;   Date:  March 2007.
;
;   Platforms:  ALL

;   Licence: DAYLike
;     (Do As You Like with it! - No Warranties!)
;     A credit to myself, whilst nice, is not absolutely necessary.
;******************************************************************************************
;
;The function GetCoffCodeSymbols() will retrieve all symbols residing within all 
;code sections of a specified MSCOFF object file.  The symbol names are placed within a string
;array passed as a parameter to the procedure.
;The function returns a count of the number of symbols retrieved.
;It will also return zero in the case of an error.
;(Note that the symbol names are held in UTF-8 format within the coff file!)
;
;The array will be redimensioned as appropriate.
;******************************************************************************************

#IMAGE_SCN_CNT_CODE = $20
#IMAGE_SYM_CLASS_STATIC = $3


;*****************************Required MS COFF Structures.
;Structure IMAGE_FILE_HEADER
;  Machine.w
;  NumberOfSections.w
;  TimeDateStamp.l
;  PointerToSymbolTable.l
;  NumberOfSymbols.l
;  SizeOfOptionalHeader.w
;  Characteristics.w
;EndStructure
Structure _IMAGE_SECTION_HEADER
  Name.b[8]    ;8 bytes for a null-padded section name. UTF 8 format.
  VirtualSize.l ;= 0
  VirtualAddress.l ; = 0
  SizeOfRawData.l 
  PointerToRawData.l
  PointerToRelocations.l
  PointerToLineNumbers.l ; = 0
  NumberOfRelocations.w
  NumberOfLineNumbers.w ; = 0
  Characteristics.l 
EndStructure
Structure _IMAGE_SYMBOL_TABLE
  Name.b[8]  ;8 bytes for a null-padded symbol name. UTF 8 format.
          ;If the symbol name is too long it goes into the string table.
          ;In this case, the lower dword is zero and the high dword contains the offset
          ;into the string table.
  Value.l
  SectionNumber.w
  Type.w
  StorageClass.b
  NumberOfAuxSymbols.b
EndStructure
;*****************************End of MS COFF Structures.


Procedure.l GetCoffCodeSymbols(filename.s,  strTable.s(1))
  Protected i, count, section, storageclass, currentsymname$
  Protected filesize, fileid, ptrMem
  Protected secchar, codesections.s
  Protected *ptrSym._IMAGE_SYMBOL_TABLE, *ptrIfh.IMAGE_FILE_HEADER, *ptrIsc._IMAGE_SECTION_HEADER
  Protected strtableoffset
  ;First job, check the filename.
    filesize = FileSize(filename)
    If filesize <=0 : ProcedureReturn 0 : EndIf
  ;Now open the MSCOFF file and deposit the whole thing into memory.
    fileid = ReadFile(#PB_Any, filename)
    If fileid = 0 : ProcedureReturn 0 : EndIf
    ;File is open!
    ptrMem = AllocateMemory(filesize)
    If ptrMem = 0 
      CloseFile(fileid)
      ProcedureReturn 0
    EndIf
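    ;Bail out if the whole file could not be read.
    If ReadData(fileid, ptrMem, filesize) <> filesize
      CloseFile(fileid)
      FreeMemory(ptrMem)
      ProcedureReturn 0
    EndIf
    CloseFile(fileid)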
  ;Determine some important values.
    *ptrIfh = ptrMem
    strtableoffset = *ptrIfh\PointerToSymbolTable + *ptrIfh\NumberOfSymbols*SizeOf(_IMAGE_SYMBOL_TABLE)
  ;Examine each section header to determine which sections contain code.
  ;If the current section does contain code, we place a corresponding entry in the
  ;codesections string.
    For i = 1 To *ptrIfh\NumberOfSections
      *ptrIsc = ptrMem+SizeOf(IMAGE_FILE_HEADER)+*ptrIfh\SizeOfOptionalHeader + (i-1)*SizeOf(_IMAGE_SECTION_HEADER)
      secchar = *ptrIsc\Characteristics
      If secchar&#IMAGE_SCN_CNT_CODE
        codesections+"<"+Str(i)+">"
      EndIf
    Next
  ;Proceed only if there are any relevant symbols.
    If *ptrIfh\PointerToSymbolTable And *ptrIfh\NumberOfSymbols And codesections
    ;Now we scan the entire symbol table and if the underlying symbol lies within a code section,
    ;we add its name to the list of symbols.
    ;We must be careful though because some symbols will have auxiliary symbols attached.
      i=0
      While i < *ptrIfh\NumberOfSymbols
        i+1
        *ptrSym = ptrMem+*ptrIfh\PointerToSymbolTable+(i-1)*SizeOf(_IMAGE_SYMBOL_TABLE)
        storageclass=*ptrSym\StorageClass
        If storageclass <> #IMAGE_SYM_CLASS_STATIC
          section = *ptrSym\SectionNumber
        ;Is this a code section?
          If FindString(codesections,"<"+Str(section)+">",1)
          ;Get the name of the symbol. This may or may not involve a trundle through the string table.
            If PeekL(*ptrSym) ;If these first 4 bytes are non-zero, then the 8 byte name
                              ;field contains the symbol name.
              currentsymname$ = PeekS(*ptrSym,8,#PB_UTF8)
            Else ;The next 4 bytes point to an entry in the string table.
              currentsymname$=PeekS(ptrMem+strtableoffset+PeekL(*ptrSym+4), -1, #PB_UTF8)

            EndIf
          ;Add the name to the strTable() array and increase the count.
            ReDim strTable.s(count)
            strTable(count)=currentsymname$
            count+1
          EndIf
        EndIf
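        ;Skip any auxiliary records. The count is an unsigned byte, so mask off the sign extension.
        i + (*ptrSym\NumberOfAuxSymbols & $FF)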
      Wend
    EndIf      
  ;Tidy up.
    FreeMemory(ptrMem)
  ProcedureReturn count
EndProcedure

;Test.
Dim names.s(0)
num=GetCoffCodeSymbols("string.obj", names())
For i = 0 To num-1
  Debug names(i)
Next
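
As for adjusting the code to list symbols from other kinds of section, here is a rough sketch of one way it might be done. The helper SectionHasFlags() is made up for illustration and is not part of the listing above; the two extra constants are the standard COFF section flags for initialised and uninitialised data. Bear in mind that the listing also skips static symbols, so the storage class test may need relaxing as well if the variables you are after are not public.

```purebasic
;Hypothetical variation on the listing above; SectionHasFlags() is not part of the original code.
#IMAGE_SCN_CNT_INITIALIZED_DATA = $40    ;Section holds initialised data (e.g. .data).
#IMAGE_SCN_CNT_UNINITIALIZED_DATA = $80  ;Section holds uninitialised data (e.g. .bss).

;Returns #True if any of the bits in 'mask' are set in a section's Characteristics field.
Procedure.l SectionHasFlags(characteristics.l, mask.l)
  If characteristics & mask
    ProcedureReturn #True
  EndIf
  ProcedureReturn #False
EndProcedure

;Within GetCoffCodeSymbols() the line
;    If secchar&#IMAGE_SCN_CNT_CODE
;could then be swapped for, say,
;    If SectionHasFlags(secchar, #IMAGE_SCN_CNT_UNINITIALIZED_DATA)

;Quick check with made-up Characteristics values.
Debug SectionHasFlags($80, #IMAGE_SCN_CNT_UNINITIALIZED_DATA)
Debug SectionHasFlags($20, #IMAGE_SCN_CNT_UNINITIALIZED_DATA)
```

Passing the mask in as a parameter means the same test can serve for code, .data and .bss sections without touching the rest of the procedure.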

Right, next challenge, list out all exported symbols in a dll. :)

Posted: Sat Mar 03, 2007 7:35 pm
by srod
Two bugs fixed! :oops:

Posted: Mon Mar 05, 2007 10:00 am
by srod
If you run this program and find that it gives you no output, then it is most likely that the object file you're looking at has no code sections and thus no code symbols (function names etc.) to report!

I point this out just to avoid any confusion! :)
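
Since GetCoffCodeSymbols() returns zero both for an error and for a file with nothing to report, a small check along the following lines might help tell the two apart. This is only a sketch: CountCoffCodeSections() is a hypothetical helper, it assumes the file really is an MSCOFF object file (it does no validation), and the offsets are those of the standard 20 byte file header and 40 byte section headers.

```purebasic
#IMAGE_SCN_CNT_CODE = $20  ;Same value as in the main listing.

;Hypothetical helper: returns the number of sections flagged as containing code,
;or -1 if the file could not be opened.
Procedure.l CountCoffCodeSections(filename.s)
  Protected fileid, i, count, sections, opthdr, chars
  fileid = ReadFile(#PB_Any, filename)
  If fileid = 0 : ProcedureReturn -1 : EndIf
  ;NumberOfSections is the word at offset 2 of the file header.
  FileSeek(fileid, 2)
  sections = ReadWord(fileid) & $FFFF
  ;SizeOfOptionalHeader is the word at offset 16 of the file header.
  FileSeek(fileid, 16)
  opthdr = ReadWord(fileid) & $FFFF
  ;The section headers follow the file header (20 bytes) and any optional header.
  ;Each is 40 bytes long with the Characteristics field at offset 36.
  For i = 0 To sections - 1
    FileSeek(fileid, 20 + opthdr + i*40 + 36)
    chars = ReadLong(fileid)
    If chars & #IMAGE_SCN_CNT_CODE : count + 1 : EndIf
  Next
  CloseFile(fileid)
  ProcedureReturn count
EndProcedure

;Test.
Debug CountCoffCodeSections("string.obj")
```

A result of zero would then point to an object file with no code sections, whereas -1 means the file could not even be opened.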