How does PureBasic parse files?
Given a file like this:
"abcd","abcdsssss"
"aaaaaaaabbs","ffffffffffffffff"
.......
1.098 2.754 3.777 5.777
1.999 3.777 4.567
2.888 4.675
1.897
This file is divided into two main parts:
The first part has two sections per line, separated by a comma; each section is enclosed in double quotes and can be any length. I want to put the first section into one array and the second into another. I figured out how to do that, but with fairly long code. Could anyone give some guidelines on how to get the same result in a few shorter lines?
The second part of the file holds numeric values of a fixed length, separated by spaces, and every following line is shorter than the previous one by one value. I need to put these values into a matrix. I have no idea how PureBasic parses files compared with other languages, for example how it recognizes delimiters.
Could someone help me with this, or recommend some links or sources on how PB parses files?
Thanks.
Re: How does PureBasic parse files?
Parse string: StringField
http://www.purebasic.com/documentation/ ... field.html
Remove "" : Trim
http://www.purebasic.com/documentation/string/trim.html
Matrix:
See chapter Arrays, Lists & Structure in
http://www.purebasic.com/documentation/index.html
and more fun: Regular Expression
http://www.purebasic.com/documentation/ ... index.html
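For the first part of the file, the two functions named above can be combined roughly as follows. This is only a minimal sketch: the variable names row$, first$ and second$ are made up for illustration, and it assumes the quoted text never contains a comma itself.

```purebasic
; Sketch: split one quoted, comma-separated line into its two sections.
; Chr(34) is the double quote character.
Define row$ = Chr(34) + "abcd" + Chr(34) + "," + Chr(34) + "abcdsssss" + Chr(34)

; StringField() returns the n-th field; Trim() with a second argument
; strips that character from both ends of the field.
Define first$  = Trim(StringField(row$, 1, ","), Chr(34))
Define second$ = Trim(StringField(row$, 2, ","), Chr(34))

Debug first$   ; abcd
Debug second$  ; abcdsssss
```

In a real program first$ and second$ would be stored in the two arrays (or lists) instead of being sent to the debug window.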

Re: How does PureBasic parse files?
Well, it's how you parse files with the myriad of functions that PB provides.
Personally, I would use a Linked List for each data section, rather than an array for each, since with Lists you do not need to know how many data elements there are.
Read each file line as a string.
When reading, test the last char of the line - if it is not Chr(34) (a double quote), start adding data to your second List.
By the way, with a structured List, you do not need to have two Lists, you can read any data type into the list.
So, how you want to parse the file is dependent on how you will use the data later in the application.
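A rough sketch of the single structured List idea could look like the following. The FileLine structure, its field names and the file name w.txt are hypothetical, chosen only for illustration; the separator line ("......."), if present, would simply be stored as a non-text line here.

```purebasic
; Sketch only: FileLine and its fields are hypothetical names.
Structure FileLine
  IsText.i   ; #True for a quoted text line, #False otherwise
  Raw.s      ; the line as read from the file
EndStructure

NewList Lines.FileLine()
Define format.i, row$

If ReadFile(0, "w.txt")                 ; assumed file name
  format = ReadStringFormat(0)          ; check for a BOM once, at the start of the file
  While Not Eof(0)
    row$ = Trim(ReadString(0, format))
    If row$ <> ""
      AddElement(Lines())
      Lines()\Raw = row$
      If Right(row$, 1) = Chr(34)       ; last character is a double quote
        Lines()\IsText = #True
      EndIf
    EndIf
  Wend
  CloseFile(0)
EndIf

ForEach Lines()
  If Lines()\IsText
    Debug "text:    " + Lines()\Raw
  Else
    Debug "numbers: " + Lines()\Raw
  EndIf
Next
```

Because the List grows with AddElement(), the number of lines does not have to be known in advance.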
IdeasVacuum
If it sounds simple, you have not grasped the complexity.
Re: How does PureBasic parse files?
Marc56us wrote:
Parse string: StringField
http://www.purebasic.com/documentation/ ... field.html
Remove "" : Trim
http://www.purebasic.com/documentation/string/trim.html
Matrix:
See chapter Arrays, Lists & Structure in
http://www.purebasic.com/documentation/index.html
and more fun: Regular Expression
http://www.purebasic.com/documentation/ ... index.html
Regular expression is what I am looking for!
Thanks.
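For reference, a minimal sketch of the regular expression route, using PureBasic's RegularExpression library. The pattern and the variable names are assumptions made for illustration, and ExamineRegularExpression() needs a reasonably recent PureBasic version.

```purebasic
; Sketch: pull every quoted section out of a line with a regular expression.
Define row$ = Chr(34) + "aaaaaaaabbs" + Chr(34) + "," + Chr(34) + "ffffffffffffffff" + Chr(34)

; Pattern: a double quote, then any run of non-quote characters (group 1),
; then the closing double quote.
Define pattern$ = Chr(34) + "([^" + Chr(34) + "]*)" + Chr(34)

If CreateRegularExpression(0, pattern$)
  If ExamineRegularExpression(0, row$)
    While NextRegularExpressionMatch(0)
      Debug RegularExpressionGroup(0, 1)   ; the text between the quotes
    Wend
  EndIf
  FreeRegularExpression(0)
Else
  Debug RegularExpressionError()
EndIf
```

Unlike splitting on the comma, this keeps working when the quoted text itself contains a comma.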

Re: How does PureBasic parse files?
IdeasVacuum wrote:
Well, it's how you parse files with the myriad of functions that PB provides.
Personally, I would use a Linked List for each data section, rather than an array for each, since with Lists you do not need to know how many data elements there are.
Read each file line as a string.
When reading, test the last char of the line - if it is not Chr(34) (a double quote), start adding data to your second List.
By the way, with a structured List, you do not need to have two Lists, you can read any data type into the list.
So, how you want to parse the file is dependent on how you will use the data later in the application.
You are right. It is going to take me a while to choose the appropriate commands.
I wish I could handle these problems as well as you do some day!
Thanks!
Re: How does PureBasic parse files?
Hi mao. This parses the file as you have described and prints the results to the debug window. It reads the file until the separator line is reached; at that point the sep variable is set to 1, and all subsequent lines are placed in the numbers linked list. That list is also of type string, so it can be passed to the same SortData() procedure. Just an example for you to look at.
Zebuddi.

Code: Select all
; Save this data in the file w.txt in the same directory as the code, without the leading "; "
; "abcd","abcdsssss"
; "aaaaaaaabbs","ffffffffffffffff"
; .......
; 1.098 2.754 3.777 5.777
; 1.999 3.777 4.567
; 2.888 4.675
; 1.897
Global NewList string.s()
Global NewList numbers.s()

Procedure SortData(List l.s(), delim$) ; takes a linked list of type .s and the delimiter for StringField()
  commas = CountString(l(), delim$) ; count the delimiters in the current element
  If commas
    For i = 1 To commas + 1 ; +1 so that the last field is included
      Debug StringField(l(), i, delim$)
    Next
  Else
    Debug l() ; only one field in the string
  EndIf
EndProcedure

If ReadFile(0, "w.txt")
  format = ReadStringFormat(0) ; detect the encoding (ASCII/UTF-8 etc.) once, at the start of the file
  sep = 0 ; separator line not yet reached
  While Not Eof(0)
    a$ = ReadString(0, format) ; read the current line in the detected format
    If FindString(a$, Chr(34)) ; looking for Chr(34), the double quote "
      AddElement(string()) : string() = a$ ; text lines are identified by their quote marks
    Else
      If sep = 0 ; first line without quotes: this is the separator line
        a$ = ReadString(0, format) ; ditch the separator line and read the next line
        sep + 1 ; remember that we have passed the separator line
      EndIf
      AddElement(numbers()) : numbers() = a$ ; store the numeric line (still as a string)
    EndIf
  Wend
  CloseFile(0)
  ForEach string()
    SortData(string(), ",") ; split each text line on the comma
  Next
  ForEach numbers()
    SortData(numbers(), " ") ; split each numeric line on the space
  Next
EndIf
malleo, caput, bang. Ego, comprehendunt in tempore
Re: How does PureBasic parse files?
mao wrote:
Given a file like this:
"abcd","abcdsssss"
"aaaaaaaabbs","ffffffffffffffff"
.......
1.098 2.754 3.777 5.777
1.999 3.777 4.567
2.888 4.675
1.897
Hi mao. Without some fixed structure, it can be tricky to handle files with mixed data types. Although the best approach would be to use custom-defined structures and read/write the file through the ReadData() and WriteData() functions, it wouldn't work with dynamic strings. As such, the next best approach would be to read and write the strings and numbers separately, albeit in a structured manner.
This example does just that, displaying the read and written values in list views:
Code: Select all
Global.s fileName, text, doubles.d

Procedure readSampleFile()
  If OpenFile(1, fileName)
    ClearGadgetItems(2)
    For r = 1 To 3
      AddGadgetItem(2, -1, "Record #" + Str(r))
      AddGadgetItem(2, -1, "=========")
      For i = 1 To 4
        AddGadgetItem(2, -1, ReadString(1))
      Next i
      For i = 1 To 10
        AddGadgetItem(2, -1, StrD(ReadDouble(1)))
      Next i
      AddGadgetItem(2, -1, "")
    Next r
    CloseFile(1)
    DisableGadget(4, 1)
  Else
    MessageRequester("File System", "Unable to read the created file.")
  EndIf
EndProcedure

Procedure writeSampleFile()
  fileName = OpenFileRequester("Select file name and location:", "", "", 0)
  If fileName And OpenFile(1, fileName)
    ClearGadgetItems(1)
    Restore sampleData:
    For r = 1 To 3
      AddGadgetItem(1, -1, "Record #" + Str(r))
      AddGadgetItem(1, -1, "=========")
      For i = 0 To 3
        Read.s text
        WriteStringN(1, text)
        AddGadgetItem(1, -1, text)
      Next i
      For i = 0 To 9
        Read.d doubles
        WriteDouble(1, doubles)
        AddGadgetItem(1, -1, StrD(doubles))
      Next i
      AddGadgetItem(1, -1, "")
    Next r
    CloseFile(1)
    DisableGadget(3, 1)
    DisableGadget(4, 0)
  Else
    MessageRequester("File System", "Unable to create the file.")
  EndIf
EndProcedure

wFlags = #PB_Window_SystemMenu | #PB_Window_ScreenCentered
OpenWindow(0, #PB_Any, #PB_Any, 430, 420, "Structured File Example", wFlags)
ListViewGadget(1, 10, 60, 200, 350)
ListViewGadget(2, 220, 60, 200, 350)
ButtonGadget(3, 10, 10, 200, 40, "WRITE FILE")
ButtonGadget(4, 220, 10, 200, 40, "READ FILE")
DisableGadget(4, 1)
DeleteFile("testFile.mao") ;the sample file is deleted with every run

Repeat
  Select WaitWindowEvent()
    Case #PB_Event_CloseWindow
      appQuit = 1
    Case #PB_Event_Gadget
      Select EventGadget()
        Case 3
          writeSampleFile()
        Case 4
          readSampleFile()
      EndSelect
  EndSelect
Until appQuit = 1

DataSection
  sampleData:
  Data.s "first abcd", "first abcdefgh", "first pqrstuvwxyz", "first lmnopqrstuvwxyz"
  Data.d 1.111111, 1.22222, 1.3333, 1.444444, 1.55555, 1.6666, 1.777777, 1.88888, 1.9999, 1.101
  Data.s "second abcd", "second abcdefgh", "second pqrstuvwxyz", "second lmnopqrstuvwxyz"
  Data.d 2.111111, 2.22222, 2.3333, 2.444444, 2.55555, 2.6666, 2.777777, 2.88888, 2.9999, 2.101
  Data.s "third abcd", "third abcdefgh", "third pqrstuvwxyz", "third lmnopqrstuvwxyz"
  Data.d 3.111111, 3.22222, 3.3333, 3.444444, 3.55555, 3.6666, 3.777777, 3.88888, 3.9999, 3.101
EndDataSection
Hope it helps.

Texas Instruments TI-99/4A Home Computer: the first home computer with a 16bit processor, crammed into an 8bit architecture. Great hardware - Poor design - Wonderful BASIC engine. And it could talk too! Please visit my YouTube Channel 

Re: How does PureBasic parse files?
Another approach: browse to the file, parse it, display results on Window:
Edit: Once you have pasted the code into PB, it is much easier to follow because of the syntax highlighting.
Code: Select all
EnableExplicit

Enumeration
  #FileIO
  #WinMain
  #EdStrings
  #EdMatrix
  #TxtStrings
  #TxtMatix
  #BtnGetVals
EndEnumeration

Global NewList StringVals.s() ;strings
Global NewList MatrixVals.d() ;doubles

Procedure Msg(sMsg.s)
  ;--------------------
  MessageRequester("Problem", sMsg)
EndProcedure

Procedure ImportVals()
  ;---------------------
  Protected sPat.s = "Text File|*.txt;*.txt|All Files (*.*)|*.*;"
  Protected iStringFormat.i = 0
  Protected sVal.s = "", sChar.s = "", sStringValDelim.s = Chr(44), sMatValDelim.s = Chr(32)
  Protected iTotalVals.i = 0, iIndex.i = 0
  ;Ask the User to browse to the data file
  Protected sDataFile.s = OpenFileRequester("Browse and select Data File", "C:\", sPat, 0)
  ClearList(StringVals())
  ClearList(MatrixVals())
  If(sDataFile)
    If ReadFile(#FileIO, sDataFile)
      iStringFormat = ReadStringFormat(#FileIO) ;e.g. file could be ASCII, Unicode etc
      While(Eof(#FileIO) = 0) ;loop until end of file reached
        ;On read, remove any Prefix/Postfix space chars
        sVal = Trim(ReadString(#FileIO, iStringFormat))
        sChar = Right(sVal, 1) ;Is sVal a string?
        If sChar = Chr(34)
          ;Count the commas which delimit the String values
          iTotalVals = CountString(sVal, sStringValDelim)
          For iIndex = 1 To (iTotalVals + 1)
            AddElement(StringVals())
            StringVals() = Trim(StringField(sVal, iIndex, sStringValDelim), Chr(34))
          Next
        Else
          ;sVal contains Matrix Values (floats or doubles)
          ;Count the spaces which delimit the Matrix values
          iTotalVals = CountString(sVal, sMatValDelim)
          For iIndex = 1 To (iTotalVals + 1)
            AddElement(MatrixVals())
            MatrixVals() = ValD(StringField(sVal, iIndex, sMatValDelim))
          Next
        EndIf
      Wend
      CloseFile(#FileIO) ;release the file once all lines have been read
    Else
      Msg("Could not read data File: " + sDataFile)
    EndIf
  Else
    Msg("Data file was not selected")
  EndIf
  ClearGadgetItems(#EdStrings)
  ForEach StringVals()
    AddGadgetItem(#EdStrings, -1, StringVals())
  Next
  ClearGadgetItems(#EdMatrix)
  ForEach MatrixVals()
    AddGadgetItem(#EdMatrix, -1, StrD(MatrixVals(), 4))
  Next
EndProcedure

Procedure WinMain()
  ;------------------
  Protected iflags.i = #PB_Window_SystemMenu|#PB_Window_ScreenCentered
  If OpenWindow(#WinMain, 0, 0, 520, 252, "Parse Data File", iflags)
    TextGadget(#TxtStrings, 10, 2, 500, 18, "String Vals")
    EditorGadget(#EdStrings, 10, 20, 500, 84)
    TextGadget(#TxtMatix, 10, 112, 500, 18, "Matrix Vals")
    EditorGadget(#EdMatrix, 10, 130, 500, 84)
    ButtonGadget(#BtnGetVals, 10, 222, 500, 26, "Browse to Data File")
  EndIf
EndProcedure

Procedure WaitForUser()
  ;----------------------
  Protected iExit.i = #False
  Repeat
    Select WaitWindowEvent(1)
      Case #PB_Event_CloseWindow
        If EventWindow() = #WinMain: iExit = #True : EndIf
      Case #PB_Event_Gadget
        Select EventGadget()
          Case #BtnGetVals: ImportVals()
        EndSelect
    EndSelect
  Until iExit = #True
EndProcedure

;Startup the App
WinMain()
WaitForUser()
End
Last edited by IdeasVacuum on Wed May 27, 2015 8:27 pm, edited 5 times in total.
IdeasVacuum
If it sounds simple, you have not grasped the complexity.
Re: How does PureBasic parse files?
Thanks all!
I am just curious why it takes only 19 short lines of code (two DO-LOOP loops and one FOR-NEXT loop) in QuickBasic to parse this file, while in PureBasic we have to write much longer code. PureBasic is supposed to be more powerful than QB, right?

Re: How does PureBasic parse files?
Hello Mao
If you study everybody's examples, the parsing is done with very little code. All the extra lines of code are there to:
1) Make the code safe/durable for use with different files;
2) Define a Window to display the Results;
3) Allow any amount of files to be opened and the results displayed;
4) Show you the value of using Procedures in terms of making any amount of code more robust, easier to follow, easier to debug, easier to re-use in other projects.
IdeasVacuum
If it sounds simple, you have not grasped the complexity.
Re: How does PureBasic parse files?
...you can code really dirty and squeeze brilliant code into very few lines: Xmas Punch Contest
IdeasVacuum
If it sounds simple, you have not grasped the complexity.
Re: How does PureBasic parse files?
IdeasVacuum wrote:
Hello Mao
If you study everybody's examples, the parsing is done with very little code. All the extra lines of code are there to:
1) Make the code safe/durable for use with different files;
2) Define a Window to display the Results;
3) Allow any amount of files to be opened and the results displayed;
4) Show you the value of using Procedures in terms of making any amount of code more robust, easier to follow, easier to debug, easier to re-use in other projects.
Thanks! I will spend more time and slow down.
Re: How does PureBasic parse files?
Zebuddi123 wrote:
Its based on parsing the file till the separator is reached, whereby the sep variable is set to 1
Code: Select all
If ReadFile(0,"w.txt")
  sep=0 ; to show separator not yet reached
  While Not Eof(0)
    a$=ReadString(0,ReadStringFormat(0)) ; read in the current line in the correct format ascii/utf8 etc
    If FindString(a$, Chr(34)) ; looking for Chr(34), the double quote "
      AddElement(string()) : string()=a$ ; add an element to the linked list and store the string (strings identified by quote marks)
    Else
      If sep = 0 ; we have encountered the separator
        a$=ReadString(0,ReadStringFormat(0)) ; ditch the separator line and read the next line
        sep+1 ; remember that we have passed the separator line
      EndIf
      AddElement(numbers()) : numbers()=a$ ; add an element to the linked list and store the string (numbers)
    EndIf
  Wend
  CloseFile(0)
I like that you're reading in the file and doing the parsing afterwards.
I know it's for example purposes only...
I don't know if it's any faster (or even needs to be), but it looks like testing only the first character of each line for a quote could give you satisfactory results:
Code: Select all
;If FindString(a$, Chr(34)) ; looking for Chr(34), the double quote "
If Left(a$, 1) = Chr(34)
There was no mention of a separator line in the original post.
Re: How does PureBasic parse files?
19 Lines (without tricks):
And it runs on Windows, Linux and Mac OS X.
So PB is more powerful than QB
Bernd
Code: Select all
If ReadFile(0, "test.txt")
  While Not Eof(0)
    Line$ = ReadString(0, #PB_Ascii)
    i = 1
    If Left(Line$, 1) = #DQUOTE$
      Delimiter$ = ","
    Else
      Delimiter$ = " "
    EndIf
    Repeat
      Part$ = StringField(Line$, i, Delimiter$)
      If Len(Part$)
        Debug Part$
      EndIf
      i + 1
    Until Len(Part$) = 0
  Wend
  CloseFile(0)
EndIf
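The loop above prints each field. To get from there to the matrix asked about in the first post, one possible sketch follows. It is not taken from any poster in this thread: the 4x4 size, the left-aligned placement of each row, the #MatrixSize constant and the numericLines label are all assumptions, and the lines come from a DataSection instead of a file so that the snippet runs on its own.

```purebasic
; Sketch: fill a 2-D array from the space-separated numeric lines.
; Assumption: each line is stored left-aligned in its row, unused cells stay 0.
#MatrixSize = 4
Dim matrix.d(#MatrixSize - 1, #MatrixSize - 1)
Define row.i, col.i, count.i, text$

Restore numericLines
For row = 0 To #MatrixSize - 1
  Read.s text$
  count = CountString(text$, " ") + 1          ; number of values on this line
  For col = 1 To count
    matrix(row, col - 1) = ValD(StringField(text$, col, " "))
  Next
Next

Debug StrD(matrix(0, 3), 3)   ; 5.777
Debug StrD(matrix(3, 0), 3)   ; 1.897

DataSection
  numericLines:
  Data.s "1.098 2.754 3.777 5.777"
  Data.s "1.999 3.777 4.567"
  Data.s "2.888 4.675"
  Data.s "1.897"
EndDataSection
```

When reading from a real file, the Read.s line would be replaced by ReadString(), and the array size would have to be known (or the lines collected in a List first) before the Dim.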