output data organisation.

SeregaZ
Enthusiast
Posts: 619
Joined: Fri Feb 20, 2009 9:24 am
Location: Almaty (Kazakhstan. not Borat, but Triple G)

Post by SeregaZ »

I am working on a disassembler and have a problem with organising the output data. Parsing the code works fine, but the rest of the file should be shown as raw data: dc.b $xx, $xx, $xx... with 16 bytes per line. Those lines can be interrupted by labels or by pieces of code, so some lines hold fewer than 16 bytes. For the output I build a list of lines for a canvas gadget. Showing the lines is no problem: I take the "from" and "to" limits from the scrollbar and draw only that small part of the list, which is fast. But adding the unparsed parts of the file to the list takes 95% of the time... it is killing me.

It works by this algorithm:
1. Parse the file and collect the code line by line into a list. At this point the parts of code sit next to each other in the list and are mixed up. Every line carries the address where it starts.
2. Add the raw data (dc.b) to the same list. The lines are built by collecting 16 bytes per line and are broken at label marks, so some lines hold fewer than 16 bytes.
3. Sort the list by the address field. Now the order is correct: code parts and raw data parts lie in the right places.
4. The list is now ready to be shown on the canvas.

Adding and formatting the raw data eats that 95% of the time. OK, I can accept that the long wait happens only once, when the file is loaded. But then I want to do a hot disassemble from some address that I click with the mouse. That means the raw data has to be formatted again, and it takes a lot of time again. So every time I have to wait minutes or longer until it is reformatted? It is killing me...

IDA shows raw data as 1 byte per line. Maybe that would work faster, but I am planning to save the output as an asm file, and with 1 byte = 1 line the file would have millions of lines...

I have no idea what to search for in my case :) Maybe someone has done the same kind of task and has ideas on how to organise the output data so that it is more flexible and fast? I will try to make demo code to show the problem.
SMaag
Enthusiast
Posts: 112
Joined: Sat Jan 14, 2023 6:55 pm

Post by SMaag »

Some code???
How should we know what you are doing???

I did a lot of speed research in PureBasic, especially in combination with strings.
If you give me a part of the code, I guess I can help you speed it up!

Post by SeregaZ »

I have been working on the demo code all this time :)

Code: Select all

Enumeration
  #Window
  #Canvas
  #Scroll
  #Button
EndEnumeration

DataSection
  startdata:
  Data.b $01, $02       ; command 01
  Data.b $03, $04, $05  ; command 02
  Data.b $06, $07, $08  ; command 03
  Data.b $09, $00, $23  ; command 04 url to other part of code. 09 code and $0023 address value
  Data.b $21            ; end of code
  
  Data.b $12, $13, $14, $15, $16, $17, $18, $19, $20 ; some raw data, not code
  
  ; start second code again
  Data.b $03, $04, $05  ; command 02       - jump should be get from 3 part
  Data.b $03, $04, $05  ; command 02
  Data.b $21            ; end of code
  
  Data.b $12, $13, $14, $15, $16, $17, $18, $19, $20 ; some raw data, not code
  
  ; start third code again
  Data.b $01, $02       ; command 01
  Data.b $03, $04, $05  ; command 02
  Data.b $06, $07, $08  ; command 03
  Data.b $09, $FF, $E5  ; command 04 url to other part of code. 09 code and -27 address value
  Data.b $21            ; end of code
  
  Data.b $12, $13, $14, $15, $16, $17, $18, $19, $20 ; some raw data, not code
  Data.b $12, $13, $14, $15, $16, $17, $18, $19, $20 ; some raw data, not code
  Data.b $12, $13, $14, $15, $16, $17, $18, $19, $20 ; some raw data, not code
  Data.b $12, $13, $14, $15, $16, $17, $18, $19, $20 ; some raw data, not code
  
  ; start fourth unknown code, that should be selected manually later
  Data.b $01, $02       ; command 01
  Data.b $03, $04, $05  ; command 02
  Data.b $06, $07, $08  ; command 03
  Data.b $21            ; end of code
  
  Data.b $12, $13, $14, $15, $16, $17, $18, $19, $20 ; some raw data, not code
  Data.b $12, $13, $14, $15, $16, $17, $18, $19, $20 ; some raw data, not code
  Data.b $12, $13, $14, $15, $16, $17, $18, $19, $20 ; some raw data, not code
  Data.b $12, $13, $14, $15, $16, $17, $18, $19, $20 ; some raw data, not code
  Data.b $12, $13, $14, $15, $16, $17, $18, $19, $22 ; some raw data, not code
  
  enddata:  
EndDataSection

Global Dim AddressesArray.l(0)
Global Dim AddressesArrayB.l(0)
Global *RomFileMemImage
Global globalarraysz.l
Global Dim ReadMarkersArray.a(0)
Global Dim ReadMarkersArrayB.a(0)
Global Dim EndLinesMarkersArray.a(0)
Global Dim EndLinesMarkersArrayB.a(0)

Global outtextstring$

Structure mainoutstr
  string.s
  color.a
  address.d
  command.s
  param1.s
  param2.s
EndStructure
Global NewList MainOutputList.mainoutstr()
Global NewList BaseOutputList.mainoutstr()

Global BaseResaveFlag.a

Global pagelines = 8

Global MemImageSz = 6000000
*RomFileMemImage = AllocateMemory(MemImageSz)
CopyMemory(?startdata, *RomFileMemImage, ?enddata - ?startdata)

Procedure.u ReadBE16M(*address)  
  ; *address is a pointer: a .l parameter would truncate 64-bit addresses
  Protected Result.u
  Result = PeekA(*address) << 8
  Result | PeekA(*address + 1)
  ProcedureReturn Result  
EndProcedure

Procedure AddMemMarker(address.l)
  
  bingo.a
  sz.l
  i.l
  
  If ReadMarkersArray(address) = 0
    ; add only if code was not parsed before
  
    sz = ArraySize(AddressesArray())
    If sz = 0
      ; it just first case. add into array
      ReDim AddressesArray(1)
      AddressesArray(1) = address
    Else
      For i = 1 To sz
        If AddressesArray(i) = address
          ; already parsed

          bingo = 1
          Break
        EndIf
      Next
    
      If bingo = 0
        ; didnt see that add before. need add

        globalarraysz = sz + 1
        ReDim AddressesArray(globalarraysz)
        AddressesArray(globalarraysz) = address

      EndIf
    EndIf
  
  Else
    ;Debug "that address already parsed " + Hex(address)
  EndIf

EndProcedure

Procedure GetAsmCode(memstart.l)
  
  param.w ; can be negative to jump back, not only forward

  For i = memstart To MemImageSz - 1
    
    currentromlocation = i
    
    codeid = PeekA(?startdata + i) ; read command id
    
    ReadMarkersArray(i) = 1        ; mark - that code was parsed
    
    commandname$    = ""
    outtextstring1$ = "" 
    outtextstring2$ = ""
    
    Select codeid
      Case $01        
        i + 1 ; 1 byte param for 01
        ReadMarkersArray(i) = 1
        param = PeekA(?startdata + i)
        commandname$ = "comand1"
        outtextstring1$ = "$" + Hex(param)
        ;Debug "rom:" + RSet(Hex(i), 4, "0") + "  comm" + RSet(Hex(codeid), 2, "0") + " $" + Hex(param, #PB_Word)
      Case $03, $06
        i + 1
        ReadMarkersArray(i) = 1 
        param = ReadBE16M(?startdata + i)
        commandname$ = "comand2or3"
        outtextstring1$ = "$" + Hex(param, #PB_Word)
        ;Debug "rom:" + RSet(Hex(i), 4, "0") + "  comm" + RSet(Hex(codeid), 2, "0") + " $" + RSet(Hex(param, #PB_Word), 4, "0")
        i + 1 ; shift i above 2bytes param
        ReadMarkersArray(i) = 1 
      Case $09
        whereiamnow = i
        i + 1
        ReadMarkersArray(i) = 1 
        param = ReadBE16M(?startdata + i)
        ;Debug "rom:" + RSet(Hex(i), 4, "0") + "  jumpto" + " $" + RSet(Hex(param, #PB_Word), 4, "0")
        commandname$ = "jumpto"
        outtextstring1$ = "$" + Hex(param, #PB_Word)
        i + 1 ; shift i above 2bytes param
        ReadMarkersArray(i) = 1 
        If param > -1
          AddMemMarker(param) ; add new founded address to job list
        Else
          AddMemMarker(whereiamnow + param)
        EndIf
      Case $21
        ;Debug "rom:" + RSet(Hex(i), 4, "0") + "  end of function"
        ;Debug ""
        commandname$ = "endfunction"
        EndLinesMarkersArray(i) = 1
        breakflag = 1        
    EndSelect
    
    ; add parsed data to list
    If commandname$
      ;Debug commandname$
      AddElement(MainOutputList()) 
      MainOutputList()\string  = commandname$ + " " + outtextstring1$
      MainOutputList()\address = currentromlocation
    EndIf
    
    If breakflag
      Break
    EndIf
  Next
  
EndProcedure

Procedure ScrollUpdate()
  
  ystep.a       = 16
  
  scrollvalue   = GetGadgetState(#Scroll)
  scrollvalueto = scrollvalue + pagelines 
  
  If StartDrawing(CanvasOutput(#Canvas))
    
    DrawingMode(#PB_2DDrawing_Transparent)
    
    ; background
    Box(0, 0, 700, 500, RGB(70, 70, 70))
    
    y = 10
    x = 10
    SelectElement(MainOutputList(), scrollvalue)  
    For l = scrollvalue To scrollvalueto
      DrawText(x, y, MainOutputList()\string, RGB(200, 200, 200))
      
      NextElement(MainOutputList()) 
        
      y + ystep
    Next
    
    StopDrawing()
  EndIf
  
EndProcedure

Procedure Formatlist(sz.l)
  
  ; format output list and add raw data
  labelarraysize = ArraySize(AddressesArray())

  ; adresses as not right order, but 1, 3, 2 for example
  SortArray(AddressesArray(), #PB_Sort_Ascending)
  ; now they are 1, 2, 3
  lastAddressesArrayindex = 1

  ; copy parsed code, but not formated
  If BaseResaveFlag = 0
  
    BaseResaveFlag = 1
    CopyList(MainOutputList(),    BaseOutputList())
    
    CopyArray(AddressesArray(),   AddressesArrayB())
    CopyArray(ReadMarkersArray(), ReadMarkersArrayB())
    CopyArray(EndLinesMarkersArray(), EndLinesMarkersArrayB())
    
  EndIf

  ; run all data from begining
  For i = 0 To sz - 1
  
    If ReadMarkersArray(i) = 1 ; if that bytes was parsed as code
      
      ; paint raw data bytes, if they exist
      If outputstring$
        
        AddElement(MainOutputList())
        MainOutputList()\string  = outputstring$
        MainOutputList()\address = b_start_addr; - 0.3;i - 0.3

        outputstring$   = ""
        stringscounter  = 0
      EndIf
    
      ; add labels lines, if they exist
      If labelarraysize
        For l = lastAddressesArrayindex To labelarraysize
          If AddressesArray(l) = i
            
            AddElement(MainOutputList())
            ;MainOutputList()\string  = ""     ; just empty line
            MainOutputList()\address = i - 0.2 ; - 0.2 make empty line above
            
            AddElement(MainOutputList())
            MainOutputList()\string  = "label_" + Hex(i) + ":"
            MainOutputList()\address = i - 0.1 ; - 0.1 make line with lavel above code
            
            lastAddressesArrayindex = l

            Break
          EndIf
        Next
      EndIf    
    
      ; empty line after endofcode comand
      If EndLinesMarkersArray(i)      
        AddElement(MainOutputList())
        ;MainOutputList()\string  = ""     ; just empty line
        MainOutputList()\address = i + 0.1 ; set after endofcode command address + 0.1
      EndIf
    
    Else ; this bytes raw code
    
    ; add labels lines, if they exist
    If labelarraysize
      For l = lastAddressesArrayindex To labelarraysize
          If AddressesArray(l) = i
            
            AddElement(MainOutputList())
            ;MainOutputList()\string  = ""     ; just empty line
            MainOutputList()\address = i - 0.2 ; - 0.2 make empty line above
            
            AddElement(MainOutputList())
            MainOutputList()\string  = "label_" + Hex(i) + ":"
            MainOutputList()\address = i - 0.1 ; - 0.1 make line with lavel above code or raw data
            
            lastAddressesArrayindex = l

            Break
          EndIf
      Next
    EndIf 
    
    valuetoshow = PeekA(*RomFileMemImage + i)
                  
    If stringscounter = 0
      outputstring$  = "dc.b  $" + RSet(Hex(valuetoshow), 2, "0")
      stringscounter = 1
      b_start_addr   = i ; remember address of first raw byte
    Else
      ; add new bytes into line
      outputstring$ + ",$" + RSet(Hex(valuetoshow), 2, "0")
      stringscounter + 1
      If stringscounter = 16
        ; shift to next line
        
        AddElement(MainOutputList())
        MainOutputList()\string  = outputstring$
        MainOutputList()\address = b_start_addr
        
        outputstring$ = ""
        stringscounter = 0
      EndIf
    EndIf
    
  EndIf
  
Next

; add last raw bytes
If outputstring$
    AddElement(MainOutputList())
    MainOutputList()\string  = outputstring$
    MainOutputList()\address = b_start_addr
                
    outputstring$   = ""
    stringscounter  = 0
EndIf

; sort, to make correct order of lines by address value
SortStructuredList(MainOutputList(), 0, OffsetOf(mainoutstr\address), TypeOf(mainoutstr\address))

EndProcedure

Procedure ParseCode(sz.l)
  
  ; start parse and add new finded addresses in a process
  globalarraysz = ArraySize(AddressesArray())
  For i = 1 To globalarraysz
    
    currentaddr = AddressesArray(i)
    
    If ReadMarkersArray(currentaddr) = 0
      ; that address is sure not parsed before                
      GetAsmCode(AddressesArray(i))
    Else
      ;Debug Hex(currentaddr) + " already parsed"
    EndIf
  Next
  
  Formatlist(sz)
  
EndProcedure

AddMemMarker($00) ; add first known address marker to start search code

ReDim ReadMarkersArray(MemImageSz)     ; prepare array of flags parse or not
ReDim EndLinesMarkersArray(MemImageSz) ; endofblock lines array

ParseCode(MemImageSz)

;ResetList(MainOutputList())
;While NextElement(MainOutputList())
;  Debug MainOutputList()\string
;Wend

;- Window
If OpenWindow(#Window, 100, 100, 620, 200, "", #PB_Window_MinimizeGadget | #PB_Window_ScreenCentered)
  
  CanvasGadget(#Canvas, 10, 10, 580, 140)
  ScrollBarGadget(#Scroll, 590, 10, 20, 140, 0, 100, pagelines, #PB_ScrollBar_Vertical)
  BindGadgetEvent(#Scroll, @ScrollUpdate())
  
  ButtonGadget(#Button, 10, 160, 80, 30, "hotdisasm")
  
  sz = ListSize(MainOutputList())
  SetGadgetAttribute(#Scroll, #PB_ScrollBar_Maximum, sz-1)
  
  ScrollUpdate()
  
  Repeat
     Select WaitWindowEvent()

       Case #PB_Event_Gadget

         Select EventGadget()
           
           Case #Button
             If EventType() = #PB_EventType_LeftClick
               SetGadgetState(#Scroll, 22) ; scroll to area, where that code lays inside raw data
               ScrollUpdate()               
               
               ; restore data like it was before formated
               CopyList(BaseOutputList(), MainOutputList())
               CopyArray(AddressesArrayB(), AddressesArray())
               AddressesArrayCount = ArraySize(AddressesArray())
               CopyArray(ReadMarkersArrayB(), ReadMarkersArray())
               CopyArray(EndLinesMarkersArrayB(), EndLinesMarkersArray())
                      
               ; add new address
               AddMemMarker(85)
               
               ParseCode(MemImageSz)
               
               ; set new limit for scroll, because was added new lines in main list
               sz = ListSize(MainOutputList())
               SetGadgetAttribute(#Scroll, #PB_ScrollBar_Maximum, sz-1)
               
               ScrollUpdate()
               
             EndIf

         EndSelect

       Case #PB_Event_CloseWindow
         qiut = 1
   
     EndSelect
   Until qiut = 1

EndIf

End

Pressing the button moves the scrollbar to the place where the raw data lies, then the code there is parsed and shown below. The lines of raw data get moved, so creating those new lines eats a long time :(((

Post by SMaag »

I found some issues!

1. The code quality is not optimal for debugging.
You do not use EnableExplicit, so PB defines the variables automatically.
This is a big problem when searching for bugs or problems.

2. The complete time is used by Formatlist: on my PC around 2.5 to 3 seconds.

3. You loop over 6,000,000 bytes:
Global MemImageSz = 6000000
The following code runs about 5.6 million times:

Code: Select all

   Else
      Protected cnt 
      ; add new bytes into line
      cnt + 1
      outputstring$ + ",$" + RSet(Hex(valuetoshow), 2, "0")
      stringscounter + 1
      If stringscounter = 16
        ; shift to next line
        
        AddElement(MainOutputList())
        MainOutputList()\string  = outputstring$
        MainOutputList()\address = b_start_addr
        
        outputstring$ = ""
        stringscounter = 0
      EndIf
    EndIf
and the line outputstring$ + ",$" + RSet(Hex(valuetoshow), 2, "0")
needs approx. 1 second.


------------------------------------

You use PeekA() and ReadBE16M().

Parsing works much faster if you use pointers.
But that is actually not your problem.
Your problem is the code quality and the concept you use. It's too complicated!
If you do this with pointers it will be easier.

Here is a function to convert the bytes of a buffer to a hex string.

Code: Select all

  Structure pChar   ; virtual CHAR-ARRAY, used as Pointer to overlay on strings 
    a.a[0]          ; fixed ARRAY Of CHAR Length 0
    c.c[0]          
  EndStructure

  Procedure.s HexStringFromBuffer(*Buffer, Bytes)    
  ; ============================================================================
  ; NAME: HexStringFromBuffer
  ; DESC: Converts a Buffer to a Hex-String
  ; DESC: Convert each Byte to a 2-char-Hex-String
  ; VAR(*Buffer) : Pointer to the Buffer
  ; VAR(Bytes) : Number of Bytes to convert    
  ; RET.s: The String with the Bytes Hex-Values
  ; ============================================================================
   Protected I, *src.pChar, *dest.pChar
   Protected hiNibble.a, loNibble.a 
   Protected sRet.s
      
   If *Buffer
      sRet.s = Space(Bytes * 2) ; for each Byte we need two HEX digits 255=FFh
      *dest = @sRet
      *src = *Buffer  
       
      For I=0 To (Bytes-1)
        hiNibble =  (*src\a[I] >> 4)  + '0'  ; Add Ascii-Code of '0'
        If hiNibble > '9' : hiNibble  + 7 : EndIf ; If 'A..F', we must add 7 for the correct Ascii-Code
        
        loNibble =  (*src\a[I] & $0F) + '0'
        If loNibble > '9' : loNibble  + 7 : EndIf
        
        *dest\c[I * 2]     = hiNibble   ; each source byte fills two characters
        *dest\c[I * 2 + 1] = loNibble
      Next
    
      ProcedureReturn sRet
    EndIf  
    ProcedureReturn #Null$
  EndProcedure

For such things I use a universal pointer structure to interpret an address as byte, word, long, quad ...
This trick is copied from the PureBasic IDE source code. I recommend you use this trick too.
If you have more questions, I will try to help you.

Here is the universal pointer structure:

Code: Select all

  Structure TUPtr  ; Universal Pointer (see PureBasic IDE Common.pb Structure PTR)
    StructureUnion
      a.a[0]    ; ASCII   : 8 Bit unsigned  [0..255] 
      b.b[0]    ; BYTE    : 8 Bit signed    [-128..127]
      c.c[0]    ; CHAR    : 2 Byte unsigned [0..65535]
      w.w[0]    ; WORD    : 2 Byte signed   [-32768..32767]
      u.u[0]    ; UNICODE : 2 Byte unsigned [0..65535]
      l.l[0]    ; LONG    : 4 Byte signed   [-2147483648..2147483647]
      f.f[0]    ; FLOAT   : 4 Byte
      q.q[0]    ; QUAD    : 8 Byte signed   [-9223372036854775808..9223372036854775807]
      d.d[0]    ; DOUBLE  : 8 Byte float    
      i.i[0]    ; INTEGER : 4 or 8 Byte INT, depending on System
      *p.TUPtr[0] ; Pointer for TUPtr (it's possible and it's done in PB-IDE Source, but why???)
    EndStructureUnion
  EndStructure


Post by SeregaZ »

Maybe I put it in the wrong place... but with HexStringFromBuffer it takes 10873 ms, while my PeekA version takes 6455 ms.

Code: Select all

    ; 10873 for HexStringFromBuffer(*RomFileMemImage + i, 1)
    ;valuetoshow = PeekA(*RomFileMemImage + i)
    ; 6455 for PeekA and rset + hex
                  
    If stringscounter = 0
      outputstring$  = "dc.b  $" + HexStringFromBuffer(*RomFileMemImage + i, 1)
      ;outputstring$  = "dc.b  $" + RSet(Hex(valuetoshow), 2, "0")
      stringscounter = 1
      b_start_addr   = i ; remember address of first raw byte
    Else
      ; add new bytes into line
      outputstring$ + ",$" + HexStringFromBuffer(*RomFileMemImage + i, 1)
      ;outputstring$ + ",$" + RSet(Hex(valuetoshow), 2, "0")
About ReadBE16M: it reads 2 bytes as big endian (or whatever the correct name is), where the bytes are reversed. The original code has 2-byte opcodes in this big-endian order. And "complicated": this is actually the light version :) The original has more code for additional address checks, selection, display...

But the main problem is formatting the raw data. The parse speed of the commands is fine. I think maybe I should make a second list and collect the raw data into it, 16 bytes per line. Then, when a hot disassembly happens, I trace which addresses it used, detect which lines of raw data it touched, and change only those lines instead of reformatting all raw data lines. That means the last raw line before the new code will not be a nice full line of 16 bytes, but a cut one that shows only what is left after the code was taken out. I just need to think about how to trace those changes correctly, because a hot disassembly can have more than one new address: it can be recursive and find more and more addresses along the way that need to be parsed too.
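The idea of changing only the touched raw lines could be sketched roughly like this. It is only an assumption of how it might look: the RawLine structure and the CarveCode procedure are hypothetical names that are not part of the demo above, and the list is assumed to be sorted by offset with lines that do not overlap.

```purebasic
EnableExplicit

; Hypothetical structure: one raw data line, stored as offset + size only
Structure RawLine
  offset.i  ; offset of the first byte of the line
  size.i    ; number of bytes on the line (1..16)
EndStructure

; Removes the bytes [codeStart, codeEnd) from the raw lines.
; Only lines that overlap this range are changed.
Procedure CarveCode(List Lines.RawLine(), codeStart, codeEnd)
  Protected lineEnd, tailSize
  
  ForEach Lines()
    lineEnd = Lines()\offset + Lines()\size
    If lineEnd <= codeStart : Continue : EndIf     ; line lies before the code
    If Lines()\offset >= codeEnd : Break : EndIf   ; sorted list: nothing more to do
    
    tailSize = lineEnd - codeEnd                   ; raw bytes left after the code
    If Lines()\offset < codeStart
      Lines()\size = codeStart - Lines()\offset    ; keep the head as a short line
      If tailSize > 0
        AddElement(Lines())                        ; the tail becomes a new line
        Lines()\offset = codeEnd
        Lines()\size   = tailSize
      EndIf
    ElseIf tailSize > 0
      Lines()\offset = codeEnd                     ; only the tail survives
      Lines()\size   = tailSize
    Else
      DeleteElement(Lines())                       ; line is completely code now
    EndIf
  Next
EndProcedure

; Demo: three full lines, then bytes 20..39 get disassembled as code
NewList Lines.RawLine()
Define i
For i = 0 To 2
  AddElement(Lines())
  Lines()\offset = i * 16
  Lines()\size   = 16
Next

CarveCode(Lines(), 20, 40)

ForEach Lines()
  Debug "offset " + Str(Lines()\offset) + ", size " + Str(Lines()\size)
Next
```

A recursive hot disassembly would call CarveCode once per newly parsed code range, so the cost depends on the number of touched lines and not on the size of the whole image.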

Post by SeregaZ »

Or I do not need to format the raw data as strings at all, but leave it as memory and convert it into a string only when the canvas shows exactly that place. Then only 20-30 lines need formatting (30 x 16 = 480 bytes, instead of 6 million at once). The result of the formatting can be saved and shown next time instead of formatting again... something like that. Tomorrow I will try to make it.
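A minimal sketch of this "format only when shown" idea, under some assumptions: the RawLine structure and the RawLineText procedure are hypothetical names, and each list element stores only an offset and a size until the line is drawn for the first time.

```purebasic
EnableExplicit

; Hypothetical structure: a raw line stays offset + size until it is displayed
Structure RawLine
  offset.i  ; offset of the first byte inside the image
  size.a    ; number of bytes on the line (1..16)
  text.s    ; cached text, empty until the line is shown for the first time
EndStructure

; Returns the "dc.b  $xx,$xx,..." text of one line.
; The text is built on the first call and reused afterwards.
Procedure.s RawLineText(*image, *line.RawLine)
  Protected i, text$
  
  If *line\text = ""
    text$ = "dc.b  $" + RSet(Hex(PeekA(*image + *line\offset)), 2, "0")
    For i = 1 To *line\size - 1
      text$ + ",$" + RSet(Hex(PeekA(*image + *line\offset + i)), 2, "0")
    Next
    *line\text = text$
  EndIf
  
  ProcedureReturn *line\text
EndProcedure

; Demo: a 32 byte image with the values 0..31 and two raw lines
Define *image = AllocateMemory(32)
Define i
For i = 0 To 31 : PokeA(*image + i, i) : Next

NewList Lines.RawLine()
AddElement(Lines()) : Lines()\offset = 0  : Lines()\size = 16
AddElement(Lines()) : Lines()\offset = 16 : Lines()\size = 4

ForEach Lines()
  Debug RawLineText(*image, @Lines())   ; @Lines() is the address of the current element
Next
```

In the drawing procedure only the lines between the scrollbar limits would go through RawLineText. If a hot disassembly changes a line, clearing its text field is enough to have it rebuilt on the next draw.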

Post by SMaag »

maybe i set it at wrong place... but that HexStringFromBuffer takes 10873mlsec, but my PeekA = 6455
You can't put HexStringFromBuffer() directly into your code in this way.
It was only an example to show the difference between your loop with
outputstring$ + ",$" + RSet(Hex(valuetoshow), 2, "0")
which copies the whole string every time, and a version that writes into a preallocated string using a pointer.
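To make that difference concrete, here is a rough sketch that builds one complete dc.b line inside a preallocated string, so nothing is copied for each byte. DcbLine is a hypothetical helper written for this illustration and not part of the code posted in this thread. It assumes the built-in Ascii and Character structures and writes the characters through a pointer.

```purebasic
EnableExplicit

; Hypothetical helper: formats count bytes at *src as "dc.b  $xx,$xx,..."
Procedure.s DcbLine(*src.Ascii, count)
  Protected line$, *dst.Character, i, nibble
  
  If count < 1 : ProcedureReturn "" : EndIf
  
  ; "dc.b  " is 6 characters, every byte needs "$xx", plus a comma between bytes
  line$ = "dc.b  " + Space(count * 4 - 1)
  *dst  = @line$ + 6 * SizeOf(Character)
  
  For i = 1 To count
    If i > 1
      *dst\c = ',' : *dst + SizeOf(Character)
    EndIf
    *dst\c = '$' : *dst + SizeOf(Character)
    
    nibble = *src\a >> 4                       ; high nibble
    If nibble > 9 : nibble + 7 : EndIf         ; skip from '9' to 'A'
    *dst\c = nibble + '0' : *dst + SizeOf(Character)
    
    nibble = *src\a & $0F                      ; low nibble
    If nibble > 9 : nibble + 7 : EndIf
    *dst\c = nibble + '0' : *dst + SizeOf(Character)
    
    *src + 1                                   ; next source byte
  Next
  
  ProcedureReturn line$
EndProcedure

; Demo: 16 bytes $00, $11, $22 ... $FF
Define *buf = AllocateMemory(16)
Define i
For i = 0 To 15 : PokeA(*buf + i, i * 17) : Next

Debug DcbLine(*buf, 16)
Debug DcbLine(*buf, 3)
```

Whether this is really faster than the RSet(Hex()) version in the real program would have to be measured. The point is only that the string is allocated once per line instead of once per byte.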

I can't modify your code to be faster because it is too confusing. You have so many general problems in the code!
Restart from zero. It's easier!

Here is a basic demo of how I would do the parsing. I added size information for the data sections.

Code: Select all

EnableExplicit

Enumeration
  #Window
  #Canvas
  #Scroll
  #Button
EndEnumeration

Structure TUPtr  ; Universal Pointer (see PureBasic IDE Common.pb Structure PTR)
    StructureUnion
      a.a[0]    ; ASCII   : 8 Bit unsigned  [0..255] 
      b.b[0]    ; BYTE    : 8 Bit signed    [-128..127]
      c.c[0]    ; CHAR    : 2 Byte unsigned [0..65535]
      w.w[0]    ; WORD    : 2 Byte signed   [-32768..32767]
      u.u[0]    ; UNICODE : 2 Byte unsigned [0..65535]
      l.l[0]    ; LONG    : 4 Byte signed   [-2147483648..2147483647]
      f.f[0]    ; FLOAT   : 4 Byte
      q.q[0]    ; QUAD    : 8 Byte signed   [-9223372036854775808..9223372036854775807]
      d.d[0]    ; DOUBLE  : 8 Byte float    
      i.i[0]    ; INTEGER : 4 or 8 Byte INT, depending on System
      *p.TUPtr[0] ; Pointer for TUPtr (it's possible and it's done in PB-IDE Source, but why???)
    EndStructureUnion
EndStructure

DataSection
  startdata:
  Data.b $01, $02       ; command 01
  Data.b $03, $04, $05  ; command 02
  Data.b $06, $07, $08  ; command 03
  Data.b $09, $00, $23  ; command 04 url to other part of code. 09 code and $0023 address value
  Data.b $21            ; end of code
  
  Data.i 9    ; SizeOfDataSection
  Data.b $12, $13, $14, $15, $16, $17, $18, $19, $20 ; some raw data, not code
  
  ; start second code again
  Data.b $03, $04, $05  ; command 02       - jump should be get from 3 part
  Data.b $03, $04, $05  ; command 02
  Data.b $21            ; end of code
  
  Data.i 9    ; SizeOfDataSection
  Data.b $12, $13, $14, $15, $16, $17, $18, $19, $20 ; some raw data, not code
  
  ; start third code again
  Data.b $01, $02       ; command 01
  Data.b $03, $04, $05  ; command 02
  Data.b $06, $07, $08  ; command 03
  Data.b $09, $FF, $E5  ; command 04 url to other part of code. 09 code and -27 address value
  Data.b $21            ; end of code
  
  Data.i 36    ; SizeOfDataSection
  Data.b $12, $13, $14, $15, $16, $17, $18, $19, $20 ; some raw data, not code
  Data.b $12, $13, $14, $15, $16, $17, $18, $19, $20 ; some raw data, not code
  Data.b $12, $13, $14, $15, $16, $17, $18, $19, $20 ; some raw data, not code
  Data.b $12, $13, $14, $15, $16, $17, $18, $19, $20 ; some raw data, not code
  
  ; start fourth unknown code, that should be selected manually later
  Data.b $01, $02       ; command 01
  Data.b $03, $04, $05  ; command 02
  Data.b $06, $07, $08  ; command 03
  Data.b $21            ; end of code
  
  Data.i 40    ; SizeOfDataSection
  Data.b $12, $13, $14, $15, $16, $17, $18, $19, $20 ; some raw data, not code
  Data.b $12, $13, $14, $15, $16, $17, $18, $19, $20 ; some raw data, not code
  Data.b $12, $13, $14, $15, $16, $17, $18, $19, $20 ; some raw data, not code
  Data.b $12, $13, $14, $15, $16, $17, $18, $19, $20 ; some raw data, not code
  Data.b $12, $13, $14, $15, $16, $17, $18, $19, $22 ; some raw data, not code
  
  enddata:  
EndDataSection

Global *RomFileMemImage


; This allocates a 6 MB memory block
Global MemImageSz = 6* 1024*1024  ; 6MB
*RomFileMemImage = AllocateMemory(MemImageSz)
CopyMemory(?startdata, *RomFileMemImage, ?enddata - ?startdata)

#Cmd1Byte =1
#Cmd2Bytes =2
#Cmd3Bytes = 3

Structure mainoutstr
  string.s
  color.l
  address.i
  command.s
  param1.s
  param2.s
EndStructure
Global NewList MainOutputList.mainoutstr()

  Structure pSwap ; Pointer Structure for swapping
    a.a[0]    ; unsigned Byte-Value
    u.u[0]    ; unsigned WORD-Value
  EndStructure

 Procedure.u BSWAP16(Value.u)
  ; ======================================================================
  ;  NAME: BSWAP16
  ;  DESC: LittleEndian<=>BigEndian conversion for 16Bit values
  ;  DESC: Swaps the Bytes of a 16 Bit value
  ;  VAR(Value.u): 16-Bit-Word Value
  ;  RET.u: Byte swapped 16-Bit value
  ; ====================================================================== 
            
   
   CompilerIf #PB_Compiler_Backend = #PB_Backend_Asm
     
        !xor eax, eax
        !mov ax, word [p.v_Value]
        !xchg al, ah  ; for 16 Bit ByteSwap it's the Exchange command 
        ProcedureReturn
              
   CompilerElse ; C-Backend
     
     CompilerIf #PB_Compiler_Processor = #PB_Processor_x86 Or #PB_Compiler_Processor = #PB_Processor_x64
        !return __builtin_bswap16(v_Value);
        ProcedureReturn
        
      CompilerElse
        Protected *Swap.pSwap
        *Swap = @Value
        Swap *Swap\a[0], *Swap\a[1]
        ProcedureReturn Value
      CompilerEndIf
      
    CompilerEndIf
       
  EndProcedure

Procedure ParseAsmCode(*memBuffer, Size, SegmentBaseAddress, List CodeList.mainoutstr())
  Protected *pCode.TUPtr    ; Pointer to Code
  Protected *pParam.TUPtr   ; Pointer to Parameter
  
  Protected *CodeEnd
  Protected CommandName$, Param1$, Param2$
  
  Protected I, Address, dsSize
  Protected xBreak, xDataSection  ; Flags
  
  *CodeEnd = *memBuffer + Size - 1
  
  *pCode = *memBuffer
  
  While (*pCode < *CodeEnd) And Not xBreak   ; test the moving pointer, not the fixed buffer start
    
    Select *pCode\a[0]
        
      Case $01
        *pParam = *pCode + 1
        CommandName$ = "comand1"
        Param1$ = "$" + Hex(*pParam\a[0], #PB_Byte)
        Address = *pCode
        *pCode  + #Cmd2Bytes
        
      Case $03, $06
        *pParam =  *pCode + 1
        CommandName$ = "comand2or3"
        Param1$ = "$" + Hex(BSWAP16(*pParam\u[0]), #PB_Word)
        Address = *pCode
        *pCode  + #Cmd3Bytes
        
      Case $09
       *pParam =  *pCode + 1
        CommandName$ = "jumpto"
        Param1$ = "$" + Hex(BSWAP16(*pParam\u[0]), #PB_Word)
        Address = *pCode
        *pCode  + #Cmd3Bytes
      
      Case $21
        CommandName$ = "endfunction"
        Param1$ =#Null$
        Address = *pCode
        *pCode + #Cmd1Byte
        xBreak = #True
    EndSelect
    
    If Address 
      AddElement(CodeList())
      With CodeList()
        \address = Address -  SegmentBaseAddress
        \param1 = Param1$
        \command = commandname$
      EndWith
      Address = 0
    EndIf
   
  Wend
  
  ;Check for Datasection after Code Segment
  dsSize = *pCode\i[0]  ; After the Code must be a 0 or the Size of Datasection
 
  If dsSize > 0
    Debug dsSize
    *pCode + SizeOf(Integer)
   
    Param1$ =""
    
    For I = 0 To dsSize -1 
      Param1$ + "$" + RSet(Hex(*pCode\a[I]), 2, "0") + ","   ; \a is unsigned, so values above $7F are not sign-extended
    Next
    
    AddElement(CodeList())
    With CodeList()
      \address = *pCode  - SegmentBaseAddress
      \param1 = Param1$
      \command =  "DataSection"
    EndWith
    
    *pCode + dsSize
  EndIf
  
  ProcedureReturn *pCode  ; returns the current pointer for further operations
  
EndProcedure

Define Line$
Define *ptr

OpenConsole("Output")


*ptr = *RomFileMemImage
*ptr = ParseAsmCode(*ptr, MemImageSz, *RomFileMemImage, MainOutputList())
*ptr = ParseAsmCode(*ptr, MemImageSz, *RomFileMemImage, MainOutputList())
*ptr = ParseAsmCode(*ptr, MemImageSz, *RomFileMemImage, MainOutputList())

ForEach MainOutputList()
  With MainOutputList()
    Line$ = Str(\address) + "  :  " + \command + "(" + \param1 + ")" 
    PrintN(Line$)
  EndWith

Next
PrintN("")
PrintN("Wait for Input")
Input()

Post by SeregaZ »

I did it my way, with memory addresses and sizes instead of building strings for the raw data lines, and it started to work much faster than before. Thanks. I will try to attach this variant to the main code and see what happens :)

Code: Select all

Enumeration
  #Window
  #Canvas
  #Scroll
  #Button
EndEnumeration

DataSection
  startdata:
  Data.b $01, $02       ; command 01
  Data.b $03, $04, $05  ; command 02
  Data.b $06, $07, $08  ; command 03
  Data.b $09, $00, $23  ; command 04 url to other part of code. 09 code and $0023 address value
  Data.b $21            ; end of code
  
  Data.b $12, $13, $14, $15, $16, $17, $18, $19, $20 ; some raw data, not code
  
  ; start second code again
  Data.b $03, $04, $05  ; command 02       - jump should be get from 3 part
  Data.b $03, $04, $05  ; command 02
  Data.b $21            ; end of code
  
  Data.b $12, $13, $14, $15, $16, $17, $18, $19, $20 ; some raw data, not code
  
  ; start third code again
  Data.b $01, $02       ; command 01
  Data.b $03, $04, $05  ; command 02
  Data.b $06, $07, $08  ; command 03
  Data.b $09, $FF, $E5  ; command 04 url to other part of code. 09 code and -27 address value
  Data.b $21            ; end of code
  
  Data.b $12, $13, $14, $15, $16, $17, $18, $19, $20 ; some raw data, not code
  Data.b $12, $13, $14, $15, $16, $17, $18, $19, $20 ; some raw data, not code
  Data.b $12, $13, $14, $15, $16, $17, $18, $19, $20 ; some raw data, not code
  Data.b $12, $13, $14, $15, $16, $17, $18, $19, $20 ; some raw data, not code
  
  ; start fourth unknown code, that should be selected manually later
  Data.b $01, $02       ; command 01
  Data.b $03, $04, $05  ; command 02
  Data.b $06, $07, $08  ; command 03
  Data.b $21            ; end of code
  
  Data.b $12, $13, $14, $15, $16, $17, $18, $19, $20 ; some raw data, not code
  Data.b $12, $13, $14, $15, $16, $17, $18, $19, $20 ; some raw data, not code
  Data.b $12, $13, $14, $15, $16, $17, $18, $19, $20 ; some raw data, not code
  Data.b $12, $13, $14, $15, $16, $17, $18, $19, $20 ; some raw data, not code
  Data.b $12, $13, $14, $15, $16, $17, $18, $19, $22 ; some raw data, not code
  
  enddata:  
EndDataSection

Global Dim AddressesArray.l(0)
Global Dim AddressesArrayB.l(0)
Global *RomFileMemImage
Global globalarraysz.l
Global Dim ReadMarkersArray.a(0)
Global Dim ReadMarkersArrayB.a(0)
Global Dim EndLinesMarkersArray.a(0)
Global Dim EndLinesMarkersArrayB.a(0)

Global outtextstring$

Structure mainoutstr
  string.s
  color.a
  address.d
  command.s
  param1.s
  param2.s
  size.a
EndStructure
Global NewList MainOutputList.mainoutstr()
Global NewList BaseOutputList.mainoutstr()

Global BaseResaveFlag.a

Global pagelines = 8

Global MemImageSz = 6000000
*RomFileMemImage = AllocateMemory(MemImageSz)
CopyMemory(?startdata, *RomFileMemImage, ?enddata - ?startdata)

Procedure.u ReadBE16M(address.l)  
  Result.u
  Result = PeekA(address) << 8
  Result | PeekA(address + 1)
  ProcedureReturn (Result)  
EndProcedure

  Structure pChar   ; virtual CHAR-ARRAY, used as Pointer to overlay on strings 
    a.a[0]          ; fixed ARRAY Of CHAR Length 0
    c.c[0]          
  EndStructure

Procedure.s HexStringFromBuffer(*Buffer, Bytes)    
  ; ============================================================================
  ; NAME: HexStringFromBuffer
  ; DESC: Converts a Buffer to a Hex-String
  ; DESC: Convert each Byte to a 2-char-Hex-String
  ; VAR(*Buffer) : Pointer to the Buffer
  ; VAR(Bytes) : Number of Bytes to convert    
  ; RET.s: The String with the Bytes Hex-Values
  ; ============================================================================
   Protected I, *src.pChar, *dest.pChar
   Protected hiNibble.a, loNibble.a 
   Protected sRet.s
      
   If *Buffer
      sRet.s = Space(Bytes * 2) ; for each Byte we need two HEX digits 255=FFh
      *dest = @sRet
      *src = *Buffer  
       
      For I=0 To (Bytes-1)
        hiNibble =  (*src\a[I] >> 4)  + '0'  ; Add Ascii-Code of '0'
        If hiNibble > '9' : hiNibble  + 7 : EndIf ; If 'A..F', we must add 7 for the correct Ascii-Code
        
        loNibble =  (*src\a[I] & $0F) + '0'
        If loNibble > '9' : loNibble  + 7 : EndIf
        
        *dest\c[I*2]   = hiNibble   ; each source byte fills two output characters
        *dest\c[I*2+1] = loNibble
      Next
    
      ProcedureReturn sRet
    EndIf  
    ProcedureReturn #Null$
EndProcedure

Procedure AddMemMarker(address.l)
  
  bingo.a
  sz.l
  i.l
  
  If ReadMarkersArray(address) = 0
    ; add only if code was not parsed before
  
    sz = ArraySize(AddressesArray())
    If sz = 0
      ; it just first case. add into array
      ReDim AddressesArray(1)
      AddressesArray(1) = address
    Else
      For i = 1 To sz
        If AddressesArray(i) = address
          ; already parsed

          bingo = 1
          Break
        EndIf
      Next
    
      If bingo = 0
        ; didnt see that add before. need add

        globalarraysz = sz + 1
        ReDim AddressesArray(globalarraysz)
        AddressesArray(globalarraysz) = address

      EndIf
    EndIf
  
  Else
    ;Debug "that address already parsed " + Hex(address)
  EndIf

EndProcedure

Procedure GetAsmCode(memstart.l)
  
  param.w ; can be negative to jump back, not only forward

  For i = memstart To MemImageSz - 1
    
    currentromlocation = i
    
    codeid = PeekA(?startdata + i) ; read command id
    
    ReadMarkersArray(i) = 1        ; mark - that code was parsed
    
    commandname$    = ""
    outtextstring1$ = "" 
    outtextstring2$ = ""
    
    Select codeid
      Case $01        
        i + 1 ; 1 byte param for 01
        ReadMarkersArray(i) = 1
        param = PeekA(?startdata + i)
        commandname$ = "comand1"
        outtextstring1$ = "$" + Hex(param)
        ;Debug "rom:" + RSet(Hex(i), 4, "0") + "  comm" + RSet(Hex(codeid), 2, "0") + " $" + Hex(param, #PB_Word)
      Case $03, $06
        i + 1
        ReadMarkersArray(i) = 1 
        param = ReadBE16M(?startdata + i)
        commandname$ = "comand2or3"
        outtextstring1$ = "$" + Hex(param, #PB_Word)
        ;Debug "rom:" + RSet(Hex(i), 4, "0") + "  comm" + RSet(Hex(codeid), 2, "0") + " $" + RSet(Hex(param, #PB_Word), 4, "0")
        i + 1 ; shift i above 2bytes param
        ReadMarkersArray(i) = 1 
      Case $09
        whereiamnow = i
        i + 1
        ReadMarkersArray(i) = 1 
        param = ReadBE16M(?startdata + i)
        ;Debug "rom:" + RSet(Hex(i), 4, "0") + "  jumpto" + " $" + RSet(Hex(param, #PB_Word), 4, "0")
        commandname$ = "jumpto"
        outtextstring1$ = "$" + Hex(param, #PB_Word)
        i + 1 ; shift i above 2bytes param
        ReadMarkersArray(i) = 1 
        If param > -1
          AddMemMarker(param) ; add new founded address to job list
        Else
          AddMemMarker(whereiamnow + param)
        EndIf
      Case $21
        ;Debug "rom:" + RSet(Hex(i), 4, "0") + "  end of function"
        ;Debug ""
        commandname$ = "endfunction"
        EndLinesMarkersArray(i) = 1
        breakflag = 1        
    EndSelect
    
    ; add parsed data to list
    If commandname$
      ;Debug commandname$
      AddElement(MainOutputList()) 
      MainOutputList()\string  = commandname$ + " " + outtextstring1$
      MainOutputList()\address = currentromlocation
    EndIf
    
    If breakflag
      Break
    EndIf
  Next
  
EndProcedure

Procedure ScrollUpdate()
  
  valuetoshow.a
  
  ystep.a       = 16  
  
  scrollvalue   = GetGadgetState(#Scroll)
  scrollvalueto = scrollvalue + pagelines 
  
  If StartDrawing(CanvasOutput(#Canvas))
    
    DrawingMode(#PB_2DDrawing_Transparent)
    
    ; background
    Box(0, 0, 700, 500, RGB(70, 70, 70))
    
    y = 10
    x = 10
    SelectElement(MainOutputList(), scrollvalue)  
    For l = scrollvalue To scrollvalueto
      
      sz = MainOutputList()\size ; marker - this is raw code
      If sz
        ; means that line not prepare to show. need to convert into string
        valuetoshow = PeekA(*RomFileMemImage + MainOutputList()\address)
        outputstring$  = "dc.b  $" + RSet(Hex(valuetoshow), 2, "0")
        
        sz - 1
        If sz
          For b = 1 To sz
            valuetoshow = PeekA(*RomFileMemImage + MainOutputList()\address + b)
            outputstring$ + ", $" + RSet(Hex(valuetoshow), 2, "0")
          Next
        EndIf
        
        MainOutputList()\string = outputstring$
          
        MainOutputList()\size = 0
      EndIf
      
      DrawText(x, y, MainOutputList()\string, RGB(200, 200, 200))
      
      NextElement(MainOutputList()) 
        
      y + ystep
    Next
    
    StopDrawing()
  EndIf
  
EndProcedure

Procedure Formatlist(sz.l)
  
  ; format output list and add raw data
  labelarraysize = ArraySize(AddressesArray())

  ; adresses as not right order, but 1, 3, 2 for example
  SortArray(AddressesArray(), #PB_Sort_Ascending)
  ; now they are 1, 2, 3
  lastAddressesArrayindex = 1

  ; copy parsed code, but not formated
  If BaseResaveFlag = 0
  
    BaseResaveFlag = 1
    CopyList(MainOutputList(),        BaseOutputList())
    
    CopyArray(AddressesArray(),       AddressesArrayB())
    CopyArray(ReadMarkersArray(),     ReadMarkersArrayB())
    CopyArray(EndLinesMarkersArray(), EndLinesMarkersArrayB())
    
  EndIf

  ; run all data from begining
  For i = 0 To sz - 1
  
    If ReadMarkersArray(i) = 1 ; if that bytes was parsed as code
      
      ; flush pending raw data bytes, if any (outputstring$ is never filled in this version, so test the counter)
      If stringscounter
        
        AddElement(MainOutputList())
        ;MainOutputList()\string  = outputstring$
        MainOutputList()\address = b_start_addr
        MainOutputList()\size    = stringscounter

        outputstring$   = ""
        stringscounter  = 0
      EndIf
    
      ;{ add labels lines, if they exist
      If labelarraysize
        For l = lastAddressesArrayindex To labelarraysize
          If AddressesArray(l) = i
            
            AddElement(MainOutputList())
            ;MainOutputList()\string  = ""     ; just empty line
            MainOutputList()\address = i - 0.2 ; - 0.2 make empty line above
            
            AddElement(MainOutputList())
            MainOutputList()\string  = "label_" + Hex(i) + ":"
            MainOutputList()\address = i - 0.1 ; - 0.1 make line with lavel above code
            
            lastAddressesArrayindex = l

            Break
          EndIf
        Next
      EndIf
      ;}
    
      ;{ empty line after endofcode comand
      If EndLinesMarkersArray(i)      
        AddElement(MainOutputList())
        ;MainOutputList()\string  = ""     ; just empty line
        MainOutputList()\address = i + 0.1 ; set after endofcode command address + 0.1
      EndIf
      ;}
    
    Else ; this bytes raw code
    
    ;{ add labels lines, if they exist
    If labelarraysize
      For l = lastAddressesArrayindex To labelarraysize
          If AddressesArray(l) = i
            
            AddElement(MainOutputList())
            ;MainOutputList()\string  = ""     ; just empty line
            MainOutputList()\address = i - 0.2 ; - 0.2 make empty line above
            
            AddElement(MainOutputList())
            MainOutputList()\string  = "label_" + Hex(i) + ":"
            MainOutputList()\address = i - 0.1 ; - 0.1 make line with lavel above code or raw data
            
            lastAddressesArrayindex = l

            Break
          EndIf
      Next
    EndIf
    ;}
    
    ; 10873 for HexStringFromBuffer(*RomFileMemImage + i, 1)
    ;valuetoshow = PeekA(*RomFileMemImage + i)
    ; 6455 for PeekA and rset + hex
                  
    If stringscounter = 0
      ;outputstring$  = "dc.b  $" + HexStringFromBuffer(*RomFileMemImage + i, 1)
      ;outputstring$  = "dc.b  $" + RSet(Hex(valuetoshow), 2, "0")
      stringscounter = 1
      b_start_addr   = i ; remember address of first raw byte
    Else
      ; add new bytes into line
      ;outputstring$ + ",$" + HexStringFromBuffer(*RomFileMemImage + i, 1)
      ;outputstring$ + ",$" + RSet(Hex(valuetoshow), 2, "0")
      stringscounter + 1
      If stringscounter = 16
        ; shift to next line
        
        AddElement(MainOutputList())
        ;MainOutputList()\string  = outputstring$
        MainOutputList()\address = b_start_addr
        MainOutputList()\size    = stringscounter
        
        outputstring$ = ""
        stringscounter = 0
      EndIf
    EndIf
    
  EndIf
  
Next

; add last raw bytes (test the counter; outputstring$ is never filled in this version)
If stringscounter
    AddElement(MainOutputList())
    ;MainOutputList()\string  = outputstring$
    MainOutputList()\address = b_start_addr
    MainOutputList()\size    = stringscounter
                
    outputstring$   = ""
    stringscounter  = 0
EndIf

; sort, to make correct order of lines by address value
SortStructuredList(MainOutputList(), 0, OffsetOf(mainoutstr\address), TypeOf(mainoutstr\address))

EndProcedure

Procedure ParseCode(sz.l)
  
  ; start parse and add new finded addresses in a process
  globalarraysz = ArraySize(AddressesArray())
  For i = 1 To globalarraysz
    
    currentaddr = AddressesArray(i)
    
    If ReadMarkersArray(currentaddr) = 0
      ; that address is sure not parsed before                
      GetAsmCode(AddressesArray(i))
    Else
      ;Debug Hex(currentaddr) + " already parsed"
    EndIf
  Next
  
  Formatlist(sz)
  
EndProcedure

StartTime.q = ElapsedMilliseconds()

AddMemMarker($00) ; add first known address marker to start search code

ReDim ReadMarkersArray(MemImageSz)     ; prepare array of flags parse or not
ReDim EndLinesMarkersArray(MemImageSz) ; endofblock lines array

ParseCode(MemImageSz)

Debug ElapsedMilliseconds() - StartTime
MessageRequester("dsdf", Str(ElapsedMilliseconds() - StartTime)) ; 3697, 3574

;ResetList(MainOutputList())
;While NextElement(MainOutputList())
;  Debug MainOutputList()\string
;Wend

;- Window
If OpenWindow(#Window, 100, 100, 620, 200, "", #PB_Window_MinimizeGadget | #PB_Window_ScreenCentered)
  
  CanvasGadget(#Canvas, 10, 10, 580, 140)
  ScrollBarGadget(#Scroll, 590, 10, 20, 140, 0, 100, pagelines, #PB_ScrollBar_Vertical)
  BindGadgetEvent(#Scroll, @ScrollUpdate())
  
  ButtonGadget(#Button, 10, 160, 80, 30, "hotdisasm")
  
  sz = ListSize(MainOutputList())
  SetGadgetAttribute(#Scroll, #PB_ScrollBar_Maximum, sz-1)
  
  ScrollUpdate()
  
  Repeat
     Select WaitWindowEvent()

       Case #PB_Event_Gadget

         Select EventGadget()
           
           Case #Button
             If EventType() = #PB_EventType_LeftClick
               SetGadgetState(#Scroll, 22) ; scroll to area, where that code lays inside raw data
               ScrollUpdate()               
               
               ; restore data like it was before formated
               CopyList(BaseOutputList(), MainOutputList())
               CopyArray(AddressesArrayB(), AddressesArray())
               AddressesArrayCount = ArraySize(AddressesArray())
               CopyArray(ReadMarkersArrayB(), ReadMarkersArray())
               CopyArray(EndLinesMarkersArrayB(), EndLinesMarkersArray())
                      
               ; add new address
               AddMemMarker(85)
               
               ParseCode(MemImageSz)
               
               ; set new limit for scroll, because was added new lines in main list
               sz = ListSize(MainOutputList())
               SetGadgetAttribute(#Scroll, #PB_ScrollBar_Maximum, sz-1)
               
               ScrollUpdate()
               
             EndIf

         EndSelect

       Case #PB_Event_CloseWindow
         quit = 1
   
     EndSelect
   Until quit = 1

EndIf

End


Oops... the size count is wrong :)))))

Re: output data organisation.

Post by SeregaZ »

No, it still spends a lot of time on processing... It got better, but it is still not good enough :) I will probably give up.
SMaag
Enthusiast
Enthusiast
Posts: 112
Joined: Sat Jan 14, 2023 6:55 pm

Re: output data organisation.

Post by SMaag »

Once again, your problem is Formatlist(sz).

Code: Select all

	Procedure Formatlist(sz.l)
		For i = 0 To sz - 1 ; For I =0 to 5.999.999 Why?
		Next
	EndProcedure
	
	Procedure ParseCode(sz.l)
	 	For i = 1 To globalarraysz ; For I = 0 To 6.000.001 Why?
		;...
		Next
		
		Formatlist(sz)
	EndProcedure

	ParseCode(MemImageSz)	; here you pass MemImageSz = 6.000.000 as the parameter
	; so your code runs 6.000.000 cycles in ParseCode while parsing only a few bytes,
	; and you call Formatlist() 5 million times, which internally always runs 6.000.000 cycles.
	; So the For/Next in Formatlist runs 6.000.000 * 6.000.000 times.
	; Why?? Maybe your Break condition is not working correctly!
	; For testing I added a counter, which shows 5.620.000 active calls of the string generation!
	; I don't know why!!!
Don't give up!!!

I can see from your code that you don't have much experience in programming, but you have the intrinsic motivation a programmer needs.

Parsing and disassembling code is definitely one of the advanced lessons in programming.

You still think in too complicated a way, and that is why you run into such problems!
Reduce your code to the very basics!

You don't need any of the following:
Global Dim AddressesArray.l(0)
Global Dim AddressesArrayB.l(0)
Global Dim ReadMarkersArray.a(0)
Global Dim ReadMarkersArrayB.a(0)
Global Dim EndLinesMarkersArray.a(0)
Global Dim EndLinesMarkersArrayB.a(0)

In the end it does not matter whether you use Peek() commands or pointers.
If you use Peek(), you have to handle the pointers as integer values.
If you use pointer arithmetic, you handle the pointers as *pointer. There is no
big difference!
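
To illustrate the point about Peek() versus pointers, here is a minimal sketch that reads the same big-endian word both ways. The ByteArray structure and the variable names are made up for this example; they are not from the posted code.

```purebasic
; Hypothetical helper structure: overlays a byte array on any memory address
Structure ByteArray
  a.a[0]
EndStructure

Define *buf = AllocateMemory(4)
If *buf
  PokeA(*buf + 0, $12)
  PokeA(*buf + 1, $34)

  ; 1) Peek commands: the address is handled as an ordinary integer value
  Define viaPeek = (PeekA(*buf) << 8) | PeekA(*buf + 1)

  ; 2) Structured pointer: the same address, accessed as *p\a[index]
  Define *p.ByteArray = *buf
  Define viaPointer = (*p\a[0] << 8) | *p\a[1]

  Debug Hex(viaPeek)     ; 1234
  Debug Hex(viaPointer)  ; 1234

  FreeMemory(*buf)
EndIf
```

Both variants read exactly the same bytes, so the choice is mostly a matter of readability.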

Re: output data organisation.

Post by SeregaZ »

Code: Select all

	Procedure ParseCode(sz.l)
	 	For i = 1 To globalarraysz ; For I = 0 To 6.000.001 Why?
Because globalarraysz is a variable. During the first pass the parser can find new addresses. Those addresses are added at the end of the array, and globalarraysz increases while the For loop is still running. And it is not 0 to 6 million: the loop runs only up to the ArraySize of the array that collects addresses. Usually not many addresses are found, so that array is small.

Code: Select all

	Procedure Formatlist(sz.l)
		For i = 0 To sz - 1 ; For I =0 to 5.999.999 Why?
Before Formatlist runs, the main list is a mess. It holds the code blocks in the order ParseCode() found them. The physical order in the file is 1, 2, 3, but ParseCode() can find them as 1, 3, 2 and add them to the list in that order. So I have to run the format from 0 to 6 million to recheck every address and decide where there is code, where there is raw data, where a label line is needed and where an empty line is needed (to split the data visually in the canvas output). At the end of the whole process I sort by address value. That is what all those arrays hold: one says whether a byte was parsed as code or is a raw byte, another holds the label addresses, another the end-of-code lines. Probably I should make just one array of structures that holds all of those params.
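
A rough sketch of what such a combined array could look like, using one flag byte per ROM byte instead of a full structure. The flag names, ByteFlags and romSize are invented for this example; it is one possible layout, not the layout of the main project.

```purebasic
; Hypothetical flags: one byte per ROM byte replaces the three parallel arrays
EnumerationBinary
  #Flag_Code       ; byte was parsed as code        (value 1)
  #Flag_Label      ; a label line goes above it     (value 2)
  #Flag_EndOfCode  ; an empty line follows it       (value 4)
EndEnumeration

Define romSize = 16            ; tiny demo size, the real image would be millions
Dim ByteFlags.a(romSize - 1)

; mark three bytes as one small code block with a label and an end marker
ByteFlags(0) = #Flag_Code | #Flag_Label
ByteFlags(1) = #Flag_Code
ByteFlags(2) = #Flag_Code | #Flag_EndOfCode

Define i
For i = 0 To romSize - 1
  If ByteFlags(i) & #Flag_Label
    Debug "label_" + Hex(i) + ":"
  EndIf
  If ByteFlags(i) & #Flag_Code
    Debug "code byte at $" + Hex(i)
  EndIf
  If ByteFlags(i) & #Flag_EndOfCode
    Debug ""                   ; empty line after the end of a code block
  EndIf
Next
```

With one array, restoring the state before a hot disassemble needs a single CopyArray() instead of three.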

But thanks for the idea about Formatlist. I think I should use For i = 0 To sz - 1 Step 2, because the commands have 2-byte opcodes. I am not sure about labels, though: they can point not only at the start of a command but also into the middle of its params. I will test it. Maybe it will be fine.

The main project is here:
https://www.emu-land.net/forum/index.ph ... ach=273451
but it needs a game ROM file to load (first button on the left of the window). The biggest ROM is a 6 MB hack of MK3Ultimate (that is why I set 6 million for the test). It can be downloaded here:
https://www.emu-land.net/forum/index.ph ... ach=272232
So the target is: parse the file from the first known address, where there is certainly code, and produce an asm file that is 100% compatible with ASM68K.exe, the assembler for Sega Genesis. It is old and glitchy and sometimes builds slightly wrong, but I love it. Maybe someday I will write my own assembler that fixes all the bugs ASM68K.exe has... but that is another story :)

Re: output data organisation.

Post by SMaag »

Here are the results of the counters I added to the For loops of ParseCode and Formatlist.

When you start the program, these are the numbers of processed loop iterations:

ParseCode = 3
Formatlist = 6.000.000

When you click the "hotdisasm" button:

ParseCode = 4
Formatlist = 6.000.000

It is as I said: the Formatlist loop runs 6 million times! That is what takes the time!

Code: Select all

Global cnt_Formatlist

Procedure Formatlist(sz.l)
  Debug "Formatlist " + Str(sz)
  Protected labelarraysize, lastAddressesArrayindex, i
  Protected outputstring$
  Protected b_start_addr, stringscounter, l, valuetoshow
  
  ; format output list and add raw data
  labelarraysize = ArraySize(AddressesArray())

  ; adresses as not right order, but 1, 3, 2 for example
  SortArray(AddressesArray(), #PB_Sort_Ascending)
  ; now they are 1, 2, 3
  lastAddressesArrayindex = 1

  ; copy parsed code, but not formated
  If BaseResaveFlag = 0
  
    BaseResaveFlag = 1
    CopyList(MainOutputList(),    BaseOutputList())
    
    CopyArray(AddressesArray(),   AddressesArrayB())
    CopyArray(ReadMarkersArray(), ReadMarkersArrayB())
    CopyArray(EndLinesMarkersArray(), EndLinesMarkersArrayB())
    
  EndIf

  ; run all data from begining
  For i = 0 To sz - 1
    cnt_Formatlist + 1
    If ReadMarkersArray(i) = 1 ; if that bytes was parsed as code
      
      ; paint raw data bytes, if they exist
      If outputstring$
        
        AddElement(MainOutputList())
        MainOutputList()\string  = outputstring$
        MainOutputList()\address = b_start_addr; - 0.3;i - 0.3

        outputstring$   = ""
        stringscounter  = 0
      EndIf
    
      ; add labels lines, if they exist
      If labelarraysize
        For l = lastAddressesArrayindex To labelarraysize
          If AddressesArray(l) = i
            
            AddElement(MainOutputList())
            ;MainOutputList()\string  = ""     ; just empty line
            MainOutputList()\address = i - 0.2 ; - 0.2 make empty line above
            
            AddElement(MainOutputList())
            MainOutputList()\string  = "label_" + Hex(i) + ":"
            MainOutputList()\address = i - 0.1 ; - 0.1 make line with lavel above code
            
            lastAddressesArrayindex = l

            Break
          EndIf
        Next
      EndIf    
    
      ; empty line after endofcode comand
      If EndLinesMarkersArray(i)      
        AddElement(MainOutputList())
        ;MainOutputList()\string  = ""     ; just empty line
        MainOutputList()\address = i + 0.1 ; set after endofcode command address + 0.1
      EndIf
    
    Else ; this bytes raw code
    
    ; add labels lines, if they exist
      If labelarraysize
        Protected cnta
        
      For l = lastAddressesArrayindex To labelarraysize
          If AddressesArray(l) = i
            cnta + 1
            AddElement(MainOutputList())
            ;MainOutputList()\string  = ""     ; just empty line
            MainOutputList()\address = i - 0.2 ; - 0.2 make empty line above
            
            AddElement(MainOutputList())
            MainOutputList()\string  = "label_" + Hex(i) + ":"
            MainOutputList()\address = i - 0.1 ; - 0.1 make line with lavel above code or raw data
            
            lastAddressesArrayindex = l

            Break
          EndIf
      Next
    EndIf 
    
    valuetoshow = PeekA(*RomFileMemImage + i)
                  
    If stringscounter = 0
      outputstring$  = "dc.b  $" + RSet(Hex(valuetoshow), 2, "0")
      stringscounter = 1
      b_start_addr   = i ; remember address of first raw byte
    Else
      Protected cnt 
      ; add new bytes into line
      cnt + 1
      outputstring$ + ",$" + RSet(Hex(valuetoshow), 2, "0")
      stringscounter + 1
      If stringscounter = 16
        ; shift to next line
        
        AddElement(MainOutputList())
        MainOutputList()\string  = outputstring$
        MainOutputList()\address = b_start_addr
        
        outputstring$ = ""
        stringscounter = 0
      EndIf
    EndIf
    
  EndIf
  
Next

;MessageRequester("FormatList ", " cnt = " + Str(cnt))
; add last raw bytes
If outputstring$
    AddElement(MainOutputList())
    MainOutputList()\string  = outputstring$
    MainOutputList()\address = b_start_addr
                
    outputstring$   = ""
    stringscounter  = 0
EndIf

; sort, to make correct order of lines by address value
SortStructuredList(MainOutputList(), 0, OffsetOf(mainoutstr\address), TypeOf(mainoutstr\address))

EndProcedure

Global cnt_ParseCode

Procedure ParseCode(sz.l)
  Protected t1, i, currentaddr
   
  t1= ElapsedMilliseconds()
  ; start parse and add new finded addresses in a process
  globalarraysz = ArraySize(AddressesArray())
  For i = 1 To globalarraysz
    cnt_ParseCode + 1
    currentaddr = AddressesArray(i)
    
    If ReadMarkersArray(currentaddr) = 0
      ; that address is sure not parsed before                
      GetAsmCode(AddressesArray(i))
    Else
      ;Debug Hex(currentaddr) + " already parsed"
    EndIf
  Next
  ; MessageRequester("Pasecode Time 1", Str(ElapsedMilliseconds()-t1))
  
  Formatlist(sz)
  ; MessageRequester("Parsecode Time 2", Str(ElapsedMilliseconds()-t1))
  
  Define msg$ = "ParseCode = "+ Str(cnt_ParseCode) + #CRLF$ + "Formatlist = " + Str(cnt_Formatlist)
  SetClipboardText(msg$)
  MessageRequester("Counter", msg$)
  
  cnt_ParseCode = 0
  cnt_Formatlist = 0
EndProcedure

Re: output data organisation.

Post by SMaag »


Re: output data organisation.

Post by SeregaZ »

When you click on the Button "hotdisasm"
It emulates selecting some raw byte, treating it as the address of code, and trying to parse from it. That is why I restart the format from the beginning. The parse can find new addresses inside, recursively. That means all labels, end-of-code lines and raw data strings will change and need to be recounted.

Ouch... did you run that disassembler? :) It has been running for ages :)))) and still has not finished. It looks like it parses absolutely everything, unlike mine, which takes the first known address (the second long from the beginning of the file), finds the jump targets inside that code and continues recursively. But mine cannot find everything, because sometimes jumps are hidden in Ax or Dx registers computed by some formula, so many things are missed. For that case you need to plug in an emulator with a debug option, like IDA does, but I have no idea how that works. With TCPView I can see some connections between them, and I even caught the first packet from IDA to the emulator, but when I recreate this packet nothing happens. I also have the source of those two programs (not PB), and the author writes something about "shared memory"... but I have no idea what happens there :)

The theory is: while the emulator runs, you play the game through most places, locations and actions, and the emulator records where code was read from. That tells you: here is code, here is gfx or sound, and so on. For now I have only the first stage.

Finally it is done... but where did it save the file? It is not a disassembler. It is a piece of crap :)

Re: output data organisation.

Post by SeregaZ »

ParseCode = 3 and then 4
When it starts parsing at the first known address, it finds a new one and stores it in the array. It continues parsing the first block until the endoffunction opcode. Then it parses the second item in the array of known addresses (the one added during the first parse). The same happens: it finds a new address, adds it to the array and continues until it reaches the endoffunction code. So we have 3 addresses. Those blocks of code are connected with each other. But some code uses more difficult formulas and cannot be found this way. As I said, a jmp can take Ax or Dx or some other param whose value is produced by the CPU at run time, not stored in the file. So, for example, the user finds such an address manually, clicks on it and presses the C button. That adds the new address to the address array and starts a parse from it. So now we get 4. It does not parse the full file, all 6 million bytes. It parses only from the starting address until it finds the endoffunction command. While that parse runs it can find new addresses (new jump commands), and those need to be parsed too.

If I do not format from the beginning, but only add the new address and parse from it, how will I add the new labels and end-of-code lines? And how can I track which lines changed from raw data to code? That is why I restart the format each time (pressing the button here; selecting a raw address and pressing C in the main project). I understand it eats a lot of time, but it is the best I can do for now.
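
One possible direction for that question, as a rough and untested idea rather than a finished answer: if the output list stays sorted by address, a hot disassemble could delete only the lines whose addresses fall inside the re-parsed range and then regenerate just those lines. OutLine, RemoveRange and the sample addresses below are hypothetical and not part of the main project.

```purebasic
Structure OutLine              ; hypothetical, reduced version of mainoutstr
  address.d
  text.s
EndStructure

; Remove every line whose address lies in [fromAddr, toAddr]; returns the count.
Procedure.i RemoveRange(List Lines.OutLine(), fromAddr.d, toAddr.d)
  Protected removed.i
  ForEach Lines()
    If Lines()\address >= fromAddr And Lines()\address <= toAddr
      DeleteElement(Lines())   ; current element becomes the previous one
      removed + 1
    EndIf
  Next
  ProcedureReturn removed
EndProcedure

; demo: five raw lines at addresses 0, 16, 32, 48, 64
NewList Demo.OutLine()
Define addr
For addr = 0 To 64 Step 16
  AddElement(Demo())
  Demo()\address = addr
  Demo()\text    = "dc.b line at $" + Hex(addr)
Next

Debug RemoveRange(Demo(), 16, 47)  ; 2 (the lines at 16 and 32)
Debug ListSize(Demo())             ; 3
```

After the removal, only the bytes of that range would need new code, label and dc.b lines. Because label and empty lines use fractional addresses such as i - 0.2, the range passed in would have to be widened slightly to catch them.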