How does PureBasic parse files?


Post by mao »

Marc56us wrote:Hi :P



id = Asc( Left(TmpLine$, 1) ) ; ASCII value of first char of line
i + 1 ; increment i (can also be written as i = i + 1)

Select id

Case 34
; Double-quote
Debug "Line " + Str(i) + " is Part 1 : " + TmpLine$
; Trim(string, Chr(34)) removes the double quotes at the beginning and end of the string
Data_One$(D1, 0) = Trim( StringField(TmpLine$, 1, ","), Chr(34) )
Data_One$(D1, 1) = Trim( StringField(TmpLine$, 2, ","), Chr(34) )
D1 + 1

Case 46
; Dot
Debug "Line " + Str(i) + " is a Separator : " + TmpLine$
; Nothing to do here

Case 48 To 57
; number 0 to 9
Debug "Line " + Str(i) + " is Part 2 : " + TmpLine$
TmpLine$ = RTrim(TmpLine$) ; remove spaces at end of line (RTrim returns the result, it does not change its argument)
k = 1
For j = 0 To CountString(TmpLine$, " ")
Data_Two$(D2, j) = StringField(TmpLine$, k, " ")
k + 1
Next
D2 + 1

EndSelect


I like the Case part! Thanks.

Post by Marc56us »

To help you more easily, it would be useful if you told us what other languages you know (and for how long).
Working by analogy is simpler when you start a new language. 8)

Some PB Tips:

Avoid using 'FindString' when a line contains several identical delimiters. It is more efficient (quicker and more readable) to use 'StringField'.
'StringField' in PB works much like 'split' in Perl or 'explode' in PHP (the argument order differs), except that it returns one field per call.

Your first QuickBasic code works, but it is not easy to read: nesting loops and then breaking out of them is the best way to get lost in the code.
A very old programmer said: "Good software can be recognized by looking at the listing from 10 feet away." (= a good program has beautiful code)
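
The StringField tip above can be sketched as follows. This is only a minimal illustration: the sample line and the variable names (Line$, FieldCount, k) are made up and do not come from the original data file.

```purebasic
; Minimal sketch: split one comma-separated line with StringField().
; The sample line is hypothetical, chosen only for illustration.
Line$ = "alpha,beta,gamma"

; CountString() counts the delimiters, so there is one more field than delimiters.
FieldCount = CountString(Line$, ",") + 1

For k = 1 To FieldCount            ; StringField() numbers the fields starting at 1
  Debug StringField(Line$, k, ",")
Next
```

With FindString you would have to track the position of each delimiter yourself; here the field index does that work.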

Other interesting stuff in PB:

Code: Select all

'i = i + 1' can be written 'i + 1' (like in C  'i++' or 'i += 1')

'While Eof(1) = 0' can be written 'While Not Eof(1)' (as in some other languages: 'while !eof(1)')
Virtually all functions return a value, so it is very useful to check it and stop the program immediately if something goes wrong.

For example:

Code: Select all

OpenFile(1,"example.TXT")
would be better as

Code: Select all

If Not OpenFile(1,"example.TXT")
  Debug "File not found, or bad name or something else. Bye :-/"
  End
Else
  ; what to do if file is Ok
EndIf
As you can see, I prefer to put the failure case first. That way I am sure not to forget it.

Next step: do not use a literal number as the file handle; use an Enumeration instead.

You will have something like

Code: Select all

Enumeration
  #File_IN
  #File_OUT
EndEnumeration
...
OpenFile(#File_IN,"example.TXT")
...
Try all the samples in the (very good) help file (press F1); this is the best way to learn.

Have fun, PB is a fantastic language.

:wink:

Post by Zebuddi123 »

Hi mao, yes you can! It was just put together very quickly as an example for you to use. The best thing is to use the debugger (I prefer the standalone debugger; the setting is in the menu File > Preferences > Debugger, option "Choose Debugger Type") and step through the code you change to see what is happening. Do the same with SortData() and make it do what you want!
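
A small sketch of the kind of thing you can step through, assuming the program is run with the debugger enabled. The loop and the variable names (i, Total) are hypothetical and only give the debugger something to stop in.

```purebasic
; Hypothetical loop, only here to give the debugger something to stop in.
For i = 1 To 3
  Total + i                        ; same as Total = Total + i
  Debug "i = " + Str(i) + ", Total = " + Str(Total)
  If i = 2
    CallDebugger                   ; halts here when the debugger is enabled
  EndIf
Next
```

Once execution is halted you can step line by line and inspect the variables.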

Zebuddi. :)
malleo, caput, bang. Ego, comprehendunt in tempore

Post by sancho2 »

Looking at the QB code I see some unusual things.

First the code seems to skip through the first part of the file looking for something or skipping something:

Code: Select all

DO
  INPUT #1, z$
LOOP UNTIL z$ = ""
But in my testing this loop reads the entire file and puts the file pointer at the end of the file.
So there must be a blank line at the beginning of the file that this QB loop is looking for.

Next, it seems there must be a separator line in the file: a blank line. The following code loops until INPUT # finds a blank line. It won't stop at the numbers unless there is a blank line before them.

Code: Select all

n = 0
DO
  n = n + 1
   INPUT #1, array1$(n)
   IF array1$(n) = "" THEN EXIT DO
   INPUT #1, array2$(n)
LOOP
Also, this loop leaves a blank entry in array1$(n) just before it exits the DO loop.

There is a relationship between the number of strings loaded into the two arrays (array1$, array2$) and the number of doubles being loaded into the matrix.
The next loop in the QB code fills the matrix array using the number of strings it loaded in the previous loop (n).
In your sample code n = 12, and there are 12 columns of numbers to load.
This code line sets the number of columns (notu) equal to the number of strings loaded (subtracting the last blank string). The FOR...NEXT loop that follows will (try to) read 11 rows (notu - 1) of numbers.

Code: Select all

notu = n - 1
I think the next loop is in error. I think what you want is for dm() to have the same matrix that you see in the file. You want the first row (dm(1,x)) to be the first 12 numbers in the file:

Code: Select all

4.25 3.78 5.90 5.44 3.29 5.65 5.99 4.15 3.27 5.14 4.92 4.78
Then the next row (dm(2,x)) to be the next 11 numbers in the file:

Code: Select all

3.43 5.17 2.18 4.73 6.83 2.11 1.29 3.27 2.84 0.97 1.13
...and so on.

The QB loop, however, will input 12 items for each row. Therefore dm(2,12) = 3.01, the first number of the next row (3). By the time you get to row 8 you are loading nothing and all values are 0.

Code: Select all

FOR i = 1 TO notu - 1
  FOR j = i + 1 TO notu
     INPUT #1, dm(i, j)
  NEXT
NEXT
CLOSE #1
' The 2-dimensional array is not complete; some fields are empty at this stage.
It's my opinion that the array is not being built properly in the QB code.

Post by sancho2 »

This code produces the three arrays I see in the QB code, hopefully properly loaded.

Code: Select all

; The following code loads the sample file into arrays. It uses three loops to read the file.
; The first loop skips over any blank lines at the beginning of the file.
; The second loop reads each line that begins with a quote character into a list.
; The third loop reads each of the remaining lines into a second list.
; The code then takes the two lists and parses them.
; The list of strings is separated into two arrays.
; The list of numbers is parsed into a matrix using
; infratec's method posted earlier (slightly modified).

EnableExplicit  ; this forces you to declare all your variables thus eliminating many errors

#FILENUM = 0  

Define s.s        ; defines a string variable used to store text read from the file

; these are lists - they are going to hold each line read in from the file
NewList stringlines.s()    
NewList numLines.s()

If ReadFile(#FILENUM, "c:\testfile.txt")
  
  ; this first loop dumps any blank lines at the beginning of the file
  While Not Eof(#FILENUM)
    
    s = ReadString(#FILENUM, #PB_Ascii)
    
    If Left(s,1) = Chr(34)    ; break out of loop if the first char of the line is a quote
      Break
    EndIf
    
  Wend
  
  ; Strings list
  While Not Eof(#FILENUM)
    
    AddElement(stringlines())   ; add an element to the list and put the file line in it
    stringlines() = s
    
    s = ReadString(#FILENUM, #PB_Ascii)
    
    If Left(s,1) <> Chr(34)   ; break out of loop when we don't have a quote character
      Break
    EndIf
        
  Wend
  
  ; Numbers list
  While Not Eof(#FILENUM)
    
    ; At this point the variable s will have the contents of the separator line, the blank line
    ; If we start reading from the file again, the blank line is not saved
    s = ReadString(#FILENUM, #PB_Ascii)   
    
    ; If there is no separator line, then you must move the previous ReadString line to after
    ; the following lines.
    AddElement(numLines())
    numLines() = s
    
  Wend
  
  CloseFile(#FILENUM)
  
  ; we size the arrays using the count of strings in the list
  Define n.i = ListSize(stringlines())
  Define row.i, col.i, i.i
  Define part.s
  
  ; n = 12 using the sample file. Dim Dm(12, 12) means that 
  ; there are actually 13 x 13 spots in the Array; 0 to 12
  Dim Dm.d(n, n)    ; .d means an array of doubles
  Dim array1.s(n)   
  Dim array2.s(n)
  
  ; this section creates the two string arrays
  row = 0
  
  ForEach stringlines()
    ; PureBasic arrays start at index 0; we will just leave that element blank and start at index 1
    row = row + 1       
    array1(row) = RemoveString(StringField(stringlines(), 1, ", "), Chr(34))
    array2(row) = RemoveString(StringField(stringlines(), 2, ", "), Chr(34))
  Next
  
  ; This section uses infratec's Repeat-Until method of parsing the string of text into numbers
  row = 0
  col = 0
  ForEach numLines()
    
    row = row + 1
    
    i = 0
    
    Repeat
      i = i + 1
      part = StringField(numLines(), i, " ")
      
      If Len(part) = 0 
        Break
      EndIf
      
      col = col + 1
      dm(row, col) = ValD(part)
      
    Until Len(part) = 0
    
    col = 0
    
  Next
  
  ; At this point the arrays have been loaded.
  ; The 'CallDebugger' command halts execution of the program so you can examine the variable contents.
  ; You can view them in the Variable Viewer, which is under the Debugger menu.
  CallDebugger

Else
  MessageRequester("Error", "Problem with the file")
  
EndIf



  

Post by Marc56us »

A GUI version of my proposal :wink:

Code: Select all

; Read Log 2.0
; (Quick GUI version)

; Simple, quick and dirty version for learning
; Shows how to use StringField to split a text line
; and Trim to remove unwanted characters
; Arrays are zero-based; change this if you want

; Let PB assign the numbers for the handles
Enumeration 
	#FileHandle
	#Window_0
	#Grid_1
	#Grid_2
EndEnumeration

; Classic array version (you can also start with a small array and use ReDim to expand it)
; Arrays in PB use () instead of []

n = 76
Dim Array1$(n)
Dim Array2$(n)	
Dim DM$(n, n)


; ReadFile, unlike OpenFile, opens the file read-only and can open it even if it is shared (Windows only)
If ReadFile(#FileHandle, "example.txt")
	
	While Not Eof(#FileHandle)
		
		TmpLine$ = ReadString(#FileHandle)
		id		 = Asc( Left(TmpLine$, 1) )		; ASCII value of first char of line	
		i + 1									; increment i (can also be written as i = i + 1)
		
		Select id
				
			Case 34		
				; First char is Double-quote, so fill array 1 and 2
				; Trim(string, Chr(34)) removes the double quotes at the beginning and end of the string
				; (double Trim() on Array2$ because of the quotes and the leading space)
				Array1$(D1) = Trim( StringField(TmpLine$, 1, ","), Chr(34) )
				Array2$(D1) = Trim( Trim( StringField(TmpLine$, 2, ","), Chr(34) ))
				D1 + 1
				
			Case 46  	
				; First char is a Dot (in sample) 
				; Debug "Line " + Str(i) + " is a Separator : " + TmpLine$
				; Nothing to do here
				
			Case 48 To 57
				; First char is a Number (0 to 9)
				TmpLine$ = RTrim(TmpLine$)  ; remove spaces at end of line (RTrim returns the result)
				k = 1
				For j = 0 To CountString(TmpLine$, " ")
					DM$(D2, j) = StringField(TmpLine$, k, " ")
					k + 1
				Next
				D2 + 1
				
		EndSelect
		
	Wend
	CloseFile(#FileHandle)	
Else 
	Debug "Uh? I can't find datafile :-/"
	End
EndIf


; Now draw a window to show the data
; Not the best way, but a light tutorial to show how to set up a minimal window in PB ;-)

; Draw the main window
OpenWindow(#Window_0, 0, 0, 800, 600, "Results", 
           #PB_Window_SystemMenu | #PB_Window_ScreenCentered | #PB_Window_MinimizeGadget )

; Draw the grid 1
ListIconGadget(#Grid_1, 10, 10, 780, 300, "#", 20, #PB_ListIcon_GridLines )
AddGadgetColumn(#Grid_1, 1, "Array 1", 250)
AddGadgetColumn(#Grid_1, 2, "Array 2", 250)

; Draw the grid 2
ListIconGadget(#Grid_2, 10, 320, 780, 275, "#", 20, #PB_ListIcon_GridLines )
AddGadgetColumn(#Grid_2, 01, "DM 01", 50)
AddGadgetColumn(#Grid_2, 02, "DM 02", 50)
AddGadgetColumn(#Grid_2, 03, "DM 03", 50)
AddGadgetColumn(#Grid_2, 04, "DM 04", 50)
AddGadgetColumn(#Grid_2, 05, "DM 05", 50)
AddGadgetColumn(#Grid_2, 06, "DM 06", 50)
AddGadgetColumn(#Grid_2, 07, "DM 07", 50)
AddGadgetColumn(#Grid_2, 08, "DM 08", 50)
AddGadgetColumn(#Grid_2, 09, "DM 09", 50)
AddGadgetColumn(#Grid_2, 10, "DM 10", 50)
AddGadgetColumn(#Grid_2, 11, "DM 11", 50)
AddGadgetColumn(#Grid_2, 12, "DM 12", 50)

; Now, fill the grids
; Each row is fed by one string
; PB moves to the next column at each Chr(10) in the string

; Fill Grid 1 with array 1 and 2
For i = 0 To 64
	AddGadgetItem(#Grid_1, i, Str(i) + Chr(10) + Array1$(i) + Chr(10) + Array2$(i))
Next

; Fill Grid 2 with matrix 
For i = 0 To 64
	AddGadgetItem(#Grid_2, i, Str(i) + Chr(10) + DM$(i, 0) + Chr(10) + DM$(i, 1) + Chr(10) + 
	                          DM$(i, 2) + Chr(10) + DM$(i, 3) + Chr(10) + DM$(i, 4) + Chr(10) + 
	                          DM$(i, 5) + Chr(10) + DM$(i, 6) + Chr(10) + DM$(i, 7) + Chr(10) + 
	                          DM$(i, 8) + Chr(10) + DM$(i, 9) + Chr(10) + DM$(i, 10) + Chr(10) + 
	                          DM$(i, 11) + Chr(10) + DM$(i, 12))
Next


; Main loop (wait for user to close it with closebox)
Repeat
	Event = WaitWindowEvent()
	If Event = #PB_Event_CloseWindow 
		End
	EndIf
ForEver

End

Post by infratec »

Since the programs are growing, here is my adapted version:

Code: Select all

#n = 76

Dim dm.f(#n, #n)
Dim Array1$(#n)
Dim Array2$(#n)

If ReadFile(0, "mao_01.txt")
  j = 0
  While Not Eof(0)
    Line$ = ReadString(0, #PB_Ascii)
    If Left(Line$, 1) = #DQUOTE$
      Array1$(n) = StringField(Line$, 1, ",")
      Array2$(n) = StringField(Line$, 2, ",")
      n + 1
    ElseIf Trim(Line$) <> ""
      i = 0
      Repeat
        Float$ = StringField(Line$, i + 1, " ")
        If Float$ <> ""
          dm(j, i) = ValF(Float$)
          i + 1
        EndIf
      Until Float$ = ""
      j + 1
    EndIf
  Wend
  CloseFile(0)
EndIf

; --------- only to check the result ------------

n = 0
While Array1$(n) <> ""
  Debug Array1$(n) + " - " + Array2$(n)
  n + 1
Wend

j = 0
While dm(j, 0)
  i = 0
  While dm(j, i)
    Debug StrF(dm(j, i), 2)
    i + 1
  Wend
  Debug ""
  j + 1
Wend
Bernd

Post by sancho2 »

infratec wrote:Since the programs are growing, here is my adapted version:

Code: Select all

...
    If Left(Line$, 1) = #DQUOTE$
...
    ElseIf Trim(Line$) <> ""
...
Bernd
It is possible that the beginning of the file is just a blank line or two. I think it's even more likely that the first loop in the original QB code is skipping over a header of some sort.
I think you might have to search for the first blank line before loading data.
The QB INPUT # command will load the strings without their surrounding quotes. I think your code includes the quotes (I'm not 100% sure).
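
A minimal sketch of the quote issue, using a made-up line in the quoted, comma-separated layout discussed in this thread (the real file contents are an assumption here). StringField() returns the field with its quotes still attached; Trim() with Chr(34) as second argument strips them.

```purebasic
; Hypothetical sample line: "first"," second"
Line$ = Chr(34) + "first" + Chr(34) + "," + Chr(34) + " second" + Chr(34)

; StringField() keeps the surrounding quotes.
Raw$ = StringField(Line$, 1, ",")

; Trim() with Chr(34) strips the quote characters from both ends.
Name1$ = Trim(Raw$, Chr(34))

; The second field has a leading space inside the quotes, so trim twice:
; first the quotes, then the spaces.
Name2$ = Trim(Trim(StringField(Line$, 2, ","), Chr(34)))

Debug Raw$
Debug Name1$
Debug Name2$
```

Whether the quotes should be kept depends on what the rest of the program expects; QB's INPUT # drops them, so stripping them gives the closest match.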

Post by Marc56us »

New version :P
It adds a second array for numeric values (choose whichever you need).

This code is not optimized; it is a PB tutorial for beginners.

Code: Select all

; Read Log 2.1
; (Quick GUI version)
;
; Changes:
; + Start array at 1
; + Add a 'Default' case for lines that don't match an expected value (the line is skipped)
; + Add a second array with the values as floats (all decimal digits)
; + Add a log view (show all original lines)

; Let PB assign the numbers for the handles
Enumeration
   #FileHandle
   #Window_0
   #Grid_1
   #Grid_2
   #Grid_3
   #All_Lines
EndEnumeration

n = 76
Dim AllLines$(n)	; Log file
Dim Array1$(n)		; First column  string data
Dim Array2$(n)   	; Second column string data
Dim DM$  (n, n)		; Second part: number stored as string (fixed decimal)
Dim DM.f (n, n)		; Second part: number as number (float)

DataFile$ = "example.txt"

; ReadFile, unlike OpenFile, opens the file read-only and can open it even if it is shared (Windows only)
If ReadFile(#FileHandle, DataFile$)
   
   While Not Eof(#FileHandle)
      
      TmpLine$ = ReadString(#FileHandle)
      id       = Asc( Left(TmpLine$, 1) )   ; ASCII value of first char of line   
      i + 1                           		; increment i (can also be written as i = i + 1)
      AllLines$(i) = TmpLine$
      
      Select id
            
         Case 34      
            ; First char is Double-quote, so fill array 1 and 2
            ; Trim(string, Chr(34)) removes the double quotes at the beginning and end of the string
            ; (double Trim() on Array2$ because of the quotes and the leading space)
         	D1 + 1
         	Array1$(D1) = Trim( 		StringField( TmpLine$, 1, "," ), Chr(34) )
            Array2$(D1) = Trim( Trim( 	StringField( TmpLine$, 2, "," ), Chr(34) ) )
            
         Case 48 To 57
            ; First char is a Number (0 to 9)
            TmpLine$ = RTrim(TmpLine$)  ; remove spaces at end of line (RTrim returns the result)
            k = 1
            D2 + 1
            For j = 0 To CountString(TmpLine$, " ")
            	DM$(D2, j) =      	StringField( TmpLine$, k, " " )
            	DM (D2, j) = ValF( 	StringField( TmpLine$, k, " " ) )
               k + 1
            Next
            
        Default 
        	; What else ?
        	Debug "Skip line #" + RSet(Str(i), 2, "0") + ": >" + TmpLine$ + "<"
        	
      EndSelect
      
   Wend
   CloseFile(#FileHandle)   
Else
   MessageRequester("Oh-oh", "Can't find data file " + DataFile$, 48)
   End
EndIf


; Now draw a window to show the data
; Not the best way, but a light tutorial to show how to set up a minimal window in PB ;-)

; Draw the main window
OpenWindow(#Window_0, 0, 0, 1000, 800, "Results",
           #PB_Window_SystemMenu | #PB_Window_ScreenCentered | #PB_Window_MinimizeGadget )

; Draw the grid 1
ColWidth = 250
ListIconGadget(#Grid_1, 10, 10, 780, 300, "#", 20, #PB_ListIcon_GridLines )
AddGadgetColumn(#Grid_1, 1, "Array 1", ColWidth)
AddGadgetColumn(#Grid_1, 2, "Array 2", ColWidth)

; Draw the grid 2 (string version)
ColWidth = 60
ListIconGadget(#Grid_2, 10, 320, 780, 190, "#", 20, #PB_ListIcon_GridLines )
AddGadgetColumn(#Grid_2, 01, "DM 01", ColWidth)
AddGadgetColumn(#Grid_2, 02, "DM 02", ColWidth)
AddGadgetColumn(#Grid_2, 03, "DM 03", ColWidth)
AddGadgetColumn(#Grid_2, 04, "DM 04", ColWidth)
AddGadgetColumn(#Grid_2, 05, "DM 05", ColWidth)
AddGadgetColumn(#Grid_2, 06, "DM 06", ColWidth)
AddGadgetColumn(#Grid_2, 07, "DM 07", ColWidth)
AddGadgetColumn(#Grid_2, 08, "DM 08", ColWidth)
AddGadgetColumn(#Grid_2, 09, "DM 09", ColWidth)
AddGadgetColumn(#Grid_2, 10, "DM 10", ColWidth)
AddGadgetColumn(#Grid_2, 11, "DM 11", ColWidth)
AddGadgetColumn(#Grid_2, 12, "DM 12", ColWidth)

; Draw the grid 3 (float version)
ListIconGadget(#Grid_3, 10, 520, 780, 270, "#", 20, #PB_ListIcon_GridLines )
AddGadgetColumn(#Grid_3, 01, "DM 01", ColWidth)
AddGadgetColumn(#Grid_3, 02, "DM 02", ColWidth)
AddGadgetColumn(#Grid_3, 03, "DM 03", ColWidth)
AddGadgetColumn(#Grid_3, 04, "DM 04", ColWidth)
AddGadgetColumn(#Grid_3, 05, "DM 05", ColWidth)
AddGadgetColumn(#Grid_3, 06, "DM 06", ColWidth)
AddGadgetColumn(#Grid_3, 07, "DM 07", ColWidth)
AddGadgetColumn(#Grid_3, 08, "DM 08", ColWidth)
AddGadgetColumn(#Grid_3, 09, "DM 09", ColWidth)
AddGadgetColumn(#Grid_3, 10, "DM 10", ColWidth)
AddGadgetColumn(#Grid_3, 11, "DM 11", ColWidth)
AddGadgetColumn(#Grid_3, 12, "DM 12", ColWidth)

; Draw "log" (all lines)
ListIconGadget(#All_Lines, 800, 10, 190, 780, "#", 20, #PB_ListIcon_GridLines)
AddGadgetColumn(#All_Lines, 01, "Original File", 165)
                
; Now, fill the grids
; Each row is fed by one string
; PB moves to the next column at each Chr(10) in the string

; Fill Grid 1 with array 1 and 2
For i = 1 To n
   AddGadgetItem(#Grid_1, i, Str(i) + Chr(10) + Array1$(i) + Chr(10) + Array2$(i))
Next

; Fill Grid 2 with matrix
For i = 1 To n
   AddGadgetItem(#Grid_2, i, Str(i) + Chr(10) + DM$(i, 0) + Chr(10) + DM$(i, 1) + Chr(10) +
                             DM$(i, 2) + Chr(10) + DM$(i, 3) + Chr(10) + DM$(i, 4) + Chr(10) +
                             DM$(i, 5) + Chr(10) + DM$(i, 6) + Chr(10) + DM$(i, 7) + Chr(10) +
                             DM$(i, 8) + Chr(10) + DM$(i, 9) + Chr(10) + DM$(i, 10) + Chr(10) +
                             DM$(i, 11) + Chr(10) + DM$(i, 12))
Next

; Fill Grid 3 with matrix
For i = 1 To n
   AddGadgetItem(#Grid_3, i, Str(i) + Chr(10) + DM(i, 0) + Chr(10) + DM(i, 1) + Chr(10) +
                             DM(i, 2) + Chr(10) + DM(i, 3) + Chr(10) + DM(i, 4) + Chr(10) +
                             DM(i, 5) + Chr(10) + DM(i, 6) + Chr(10) + DM(i, 7) + Chr(10) +
                             DM(i, 8) + Chr(10) + DM(i, 9) + Chr(10) + DM(i, 10) + Chr(10) +
                             DM(i, 11) + Chr(10) + DM(i, 12))
Next

; Tip: in the PB IDE, CTRL+D duplicates the current line (or the selected block)

; Show all lines (for info)
For i = 1 To n		
	AddGadgetItem(#All_Lines, i, Str(i) + Chr(10) + AllLines$(i))
Next


; Main loop (wait for user to close it with closebox)
Repeat
   Event = WaitWindowEvent()
   If Event = #PB_Event_CloseWindow
      End
   EndIf
ForEver

End
Copy and paste the code into the IDE
Select all (CTRL+A)
Let PB re-indent the code (CTRL+I)

Post by fromVB »

I think that there are many ways to do it. Not very different from the old VB or QB ways.

Post by Zebuddi123 »

Hi mao. A Regular Expression example, no need for checking the separator.

Zebuddi. :)

Code: Select all

; Description
; Reads the file into allocated memory and peeks it into a string variable for the regular expression extraction


regex_strings=CreateRegularExpression(#PB_Any, #DOUBLEQUOTE$+".+?"+#DOUBLEQUOTE$) ; matches "bla bla"
regex_numbers=CreateRegularExpression(#PB_Any, "\d+\.\d+") ; matches 2334.123 1.234 any combination

If ReadFile(0,"w.txt")
	*mem=AllocateMemory(Lof(0))
	buffer=ReadData(0,*mem,Lof(0))
	If buffer
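		file_content$=PeekS(*mem, buffer, #PB_Ascii) ; buffer = number of bytes read; the block has no terminating zero, so pass the length (assumes an ASCII file)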
		If MatchRegularExpression(regex_strings,file_content$)
			Dim string$(0)
			; extracts the string based on "any text" 
			Numb_Strings=ExtractRegularExpression(regex_strings,file_content$,string$()) ; Numb_Strings  = number of elements in the string$() array
			For i=0 To Numb_Strings-1 : string$(i)=RemoveString(string$(i),Chr(34)) : Next  ; remove quotemarks from the strings in the array
		EndIf
		If MatchRegularExpression(regex_numbers,file_content$)
			Dim number$(0)
			; extracts the floats based on  [ any number digits + decimal point+ any number of digits ] ie: 2334.1234 45.789987
			Numb_Numbers=ExtractRegularExpression(regex_numbers,file_content$,number$()) ; Numb_Numbers  = number of elements in the number$() array
		EndIf
	EndIf	
	CloseFile(0)
EndIf

;show contents of arrays  
Debug "content of string$() array"
For i=0 To Numb_Strings-1
	Debug string$(i)
Next
Debug ""
Debug "content of number$() array"
For i=0 To Numb_Numbers-1
	Debug number$(i)
Next

;Clean up -- the memory allocation, regular expressions and arrays should be freed when no longer needed or at the end of the program
FreeRegularExpression(regex_numbers)
FreeRegularExpression(regex_strings)
FreeArray(string$())
FreeArray(number$())
FreeMemory(*mem)
Last edited by Zebuddi123 on Tue Jun 02, 2015 12:52 pm, edited 3 times in total.

Post by IdeasVacuum »

Hi Zebuddi

Short and sweet, but it should show how to determine the size of the filled arrays and display their content.

Code: Select all

   For i = 0 To (ArraySize(s$()) - 1)
       Debug s$(i)
   Next i

   For i = 0 To (ArraySize(n$()) - 1)
       Debug n$(i)
   Next i
Also, these should be inside the If ReadFile statement.....

Code: Select all

FreeArray(s$())
FreeArray(n$())
FreeMemory(*mem)
..... otherwise, if the read fails, the code tries to free things that do not exist.
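
A sketch of that arrangement, in which each resource is released inside the branch that created it. The file name "w.txt" comes from the example above; the variable names (length, bytesRead, content$) are hypothetical, and an ASCII file is assumed.

```purebasic
; Sketch only: every resource is freed inside the branch where it was obtained.
If ReadFile(0, "w.txt")
  length = Lof(0)
  If length > 0
    *mem = AllocateMemory(length)
    If *mem
      bytesRead = ReadData(0, *mem, length)
      If bytesRead > 0
        ; Explicit length and format, because the buffer has no terminating zero.
        content$ = PeekS(*mem, bytesRead, #PB_Ascii)
        Debug Len(content$)
      EndIf
      FreeMemory(*mem)             ; only reached when the allocation succeeded
    EndIf
  EndIf
  CloseFile(0)                     ; only reached when the file was opened
Else
  Debug "Could not open the file"
EndIf
```

The same idea applies to the arrays and regular expressions: free them in the branch where they were successfully created.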
IdeasVacuum
If it sounds simple, you have not grasped the complexity.

Post by sancho2 »

Zebuddi123 wrote:Hi mao. A Regular Expression example, no need for checking the separator.
Great idea.

Post by Zebuddi123 »

Hi IdeasVacuum, I had left it that way so that mao had the two arrays available to do with as he wanted. It all depends on how you use the code; the cleanup was there to remind him to free things up.

My bad for not giving an explanation and notes :oops: :lol:

@Sancho2 thanks

Zebuddi. :)

Post by Marc56us »

My regular expression version
(for PB beginners: no use of pointer) :wink:

Code: Select all

; Read Log 2.2
; (Quick GUI version)
;
; Changes:
; + Use RegularExpression system

Enumeration
   #FileHandle
   #Window_0
   #Grid_1
   #Grid_2
   #Grid_3
   #All_Lines
   #RegEx_A1
   #RegEx_A2
   #RegEx_DM
EndEnumeration

n = 76
Dim AllLines$(n)	; Log file
Dim Array1$(n)		; First column  string data
Dim Array2$(n)   	; Second column string data
Dim DM$  (n, n)		; Second part: number stored as string (fixed decimal)
Dim DM.f (n, n)		; Second part: number as number (float)

DataFile$ = "example.txt"

If Not CreateRegularExpression(#RegEx_A1, "^" + Chr(34) + "(.+)" + Chr(34) + "\," + Chr(34) + "(?: )(.+)(?:" + Chr(34) + ")") 
	MessageRequester("Error", "RegEx A1 not OK")
	End
EndIf

If Not CreateRegularExpression(#RegEx_A2, "(\d+\.\d+)") 
	MessageRequester("Error", "RegEx A2 not OK")
	End
EndIf

If ReadFile(#FileHandle, DataFile$)
   
	While Not Eof(#FileHandle)
		
		TmpLine$ = ReadString(#FileHandle)
		i + 1
		AllLines$(i) = TmpLine$
		
		If ExamineRegularExpression(#RegEx_A1, TmpLine$)
			While NextRegularExpressionMatch(#RegEx_A1) 
				D1 + 1
				Array1$(D1) = RegularExpressionGroup(#RegEx_A1, 1)
				Array2$(D1) = RegularExpressionGroup(#RegEx_A1, 2)
			Wend
		EndIf
		
		If ExamineRegularExpression(#RegEx_A2, TmpLine$)	
			If MatchRegularExpression(#RegEx_A2, TmpLine$) 
				D2 + 1			
				While NextRegularExpressionMatch(#RegEx_A2) 
					DM$(D2, j) = RegularExpressionMatchString(#RegEx_A2)
					DM (D2, j) = ValF(RegularExpressionMatchString(#RegEx_A2))
					j + 1
				Wend
				j = 0
			EndIf
		EndIf
		      
   Wend
   CloseFile(#FileHandle)   
   FreeRegularExpression(#RegEx_A1)
   FreeRegularExpression(#RegEx_A2)
Else
   MessageRequester("Oh-oh", "Can't find data file " + DataFile$, 48)
   End
EndIf


; Draw the main window
OpenWindow(#Window_0, 0, 0, 1000, 800, "Results",
           #PB_Window_SystemMenu | #PB_Window_ScreenCentered | #PB_Window_MinimizeGadget )

; Draw the grid 1
ColWidth = 250
ListIconGadget(#Grid_1, 10, 10, 780, 300, "#", 20, #PB_ListIcon_GridLines )
AddGadgetColumn(#Grid_1, 1, "Array 1", ColWidth)
AddGadgetColumn(#Grid_1, 2, "Array 2", ColWidth)

; Draw the grid 2 (string version)
ColWidth = 60
ListIconGadget(#Grid_2, 10, 320, 780, 190, "#", 20, #PB_ListIcon_GridLines )
AddGadgetColumn(#Grid_2, 01, "DM 01", ColWidth)
AddGadgetColumn(#Grid_2, 02, "DM 02", ColWidth)
AddGadgetColumn(#Grid_2, 03, "DM 03", ColWidth)
AddGadgetColumn(#Grid_2, 04, "DM 04", ColWidth)
AddGadgetColumn(#Grid_2, 05, "DM 05", ColWidth)
AddGadgetColumn(#Grid_2, 06, "DM 06", ColWidth)
AddGadgetColumn(#Grid_2, 07, "DM 07", ColWidth)
AddGadgetColumn(#Grid_2, 08, "DM 08", ColWidth)
AddGadgetColumn(#Grid_2, 09, "DM 09", ColWidth)
AddGadgetColumn(#Grid_2, 10, "DM 10", ColWidth)
AddGadgetColumn(#Grid_2, 11, "DM 11", ColWidth)
AddGadgetColumn(#Grid_2, 12, "DM 12", ColWidth)

; Draw the grid 3 (float version)
ListIconGadget(#Grid_3, 10, 520, 780, 270, "#", 20, #PB_ListIcon_GridLines )
AddGadgetColumn(#Grid_3, 01, "DM 01", ColWidth)
AddGadgetColumn(#Grid_3, 02, "DM 02", ColWidth)
AddGadgetColumn(#Grid_3, 03, "DM 03", ColWidth)
AddGadgetColumn(#Grid_3, 04, "DM 04", ColWidth)
AddGadgetColumn(#Grid_3, 05, "DM 05", ColWidth)
AddGadgetColumn(#Grid_3, 06, "DM 06", ColWidth)
AddGadgetColumn(#Grid_3, 07, "DM 07", ColWidth)
AddGadgetColumn(#Grid_3, 08, "DM 08", ColWidth)
AddGadgetColumn(#Grid_3, 09, "DM 09", ColWidth)
AddGadgetColumn(#Grid_3, 10, "DM 10", ColWidth)
AddGadgetColumn(#Grid_3, 11, "DM 11", ColWidth)
AddGadgetColumn(#Grid_3, 12, "DM 12", ColWidth)

; Draw "log" (all lines)
ListIconGadget(#All_Lines, 800, 10, 190, 780, "#", 20, #PB_ListIcon_GridLines)
AddGadgetColumn(#All_Lines, 01, "Original File", 165)
                
; Now, fill the grids
; Each row is fed by one string
; PB moves to the next column at each Chr(10) in the string

; Fill Grid 1 with array 1 and 2
For i = 1 To n
   AddGadgetItem(#Grid_1, i, Str(i) + Chr(10) + Array1$(i) + Chr(10) + Array2$(i))
Next

; Fill Grid 2 with matrix
For i = 1 To n
   AddGadgetItem(#Grid_2, i, Str(i) + Chr(10) + DM$(i, 0) + Chr(10) + DM$(i, 1) + Chr(10) +
                             DM$(i, 2) + Chr(10) + DM$(i, 3) + Chr(10) + DM$(i, 4) + Chr(10) +
                             DM$(i, 5) + Chr(10) + DM$(i, 6) + Chr(10) + DM$(i, 7) + Chr(10) +
                             DM$(i, 8) + Chr(10) + DM$(i, 9) + Chr(10) + DM$(i, 10) + Chr(10) +
                             DM$(i, 11) + Chr(10) + DM$(i, 12))
Next

; Fill Grid 3 with matrix
For i = 1 To n
   AddGadgetItem(#Grid_3, i, Str(i) + Chr(10) + DM(i, 0) + Chr(10) + DM(i, 1) + Chr(10) +
                             DM(i, 2) + Chr(10) + DM(i, 3) + Chr(10) + DM(i, 4) + Chr(10) +
                             DM(i, 5) + Chr(10) + DM(i, 6) + Chr(10) + DM(i, 7) + Chr(10) +
                             DM(i, 8) + Chr(10) + DM(i, 9) + Chr(10) + DM(i, 10) + Chr(10) +
                             DM(i, 11) + Chr(10) + DM(i, 12))
Next

; Tip: in the PB IDE, CTRL+D duplicates the current line (or the selected block)

; Show all lines (for info)
For i = 1 To n		
	AddGadgetItem(#All_Lines, i, Str(i) + Chr(10) + AllLines$(i))
Next


; Main loop (wait for user to close it with closebox)
Repeat
   Event = WaitWindowEvent()
   If Event = #PB_Event_CloseWindow
      End
   EndIf
ForEver

End