Page 2 of 2
Re: Can a structure be defined dynamically by csv headers
Posted: Thu Mar 07, 2024 11:44 am
by craig7
Thanks all for the help, it is great to get insights when you are just starting on a language. My mind is a bit blown by how much code has to be written in PureBasic to do this, because in R or Python it is a one liner. For example in Python Panda:
df2 = pd.read_csv(filepath,encoding = "latin1")
In R reading in a CSV to a dataframe is in the base language, so even quicker than Python.
Re: Can a structure be defined dynamically by csv headers
Posted: Thu Mar 07, 2024 2:58 pm
by tored
craig7 wrote: Thu Mar 07, 2024 11:44 am
Thanks all for the help, it is great to get insights when you are just starting on a language. My mind is a bit blown by how much code has to be written in PureBasic to do this, because in R or Python it is a one liner. For example in Python Panda:
df2 = pd.read_csv(filepath,encoding = "latin1")
In R reading in a CSV to a dataframe is in the base language, so even quicker than Python.
That is because those languages have CSV support in their respective standard libraries where PureBasic has not. And also Python and R are both interpreted languages, thus have virtual machine and a garbage collector, where PureBasic is a compiled language without a garbage collector, which naturally leads to more code.
Another option is to use an existing C library and use that together with the C backend.
However there is an argument to be made that CSV support ought be included in the PureBasic standard library, but the maintainers are pretty much occupied with more important tasks.
Re: Can a structure be defined dynamically by csv headers
Posted: Thu Mar 07, 2024 3:17 pm
by tored
infratec wrote: Wed Mar 06, 2024 9:50 pm
A quick implementation with an user function:
Nice. You can simplify it even more if you have the CSV file on disk by letting sqlite import it after you created the table.
Code: Select all
.mode csv
.import myfile.csv mytable
Re: Can a structure be defined dynamically by csv headers
Posted: Thu Mar 07, 2024 4:32 pm
by Tenaja
My mind is a bit blown by how much code has to be written in PureBasic to do this
That is because those languages have CSV support in their respective standard libraries where PureBasic has not.
I agree... CSV support would be nice.
Keep in mind that there are many libraries native to pb that do not exist in python or r natively. You just happened to stumble across one that is absent here. But should you try to convert every pb file into another language, you would find similar deficiencies that pb has supported for many many years.
One nice thing about pb is the breadth of user-written libraries, as well as the helpful & friendly community! Sure, those exist elsewhere, but the large languages also come with a lot of chaff... More, in my experience, unfortunately.
Re: Can a structure be defined dynamically by csv headers
Posted: Thu Mar 07, 2024 5:06 pm
by infratec
tored wrote: Thu Mar 07, 2024 3:17 pm
Nice. You can simplify it even more if you have the CSV file on disk by letting sqlite import it after you created the table.
Code: Select all
.mode csv
.import myfile.csv mytable
This is only possible with the command line tool.
So you have to use RunProgram()
Re: Can a structure be defined dynamically by csv headers
Posted: Thu Mar 07, 2024 5:46 pm
by craig7
Tenaja wrote: Thu Mar 07, 2024 4:32 pm
My mind is a bit blown by how much code has to be written in PureBasic to do this
That is because those languages have CSV support in their respective standard libraries where PureBasic has not.
I agree... CSV support would be nice.
Keep in mind that there are many libraries native to pb that do not exist in python or r natively. You just happened to stumble across one that is absent here. But should you try to convert every pb file into another language, you would find similar deficiencies that pb has supported for many many years.
One nice thing about pb is the breadth of user-written libraries, as well as the helpful & friendly community! Sure, those exist elsewhere, but the large languages also come with a lot of chaff... More, in my experience, unfortunately.
Yes Python library distribution sites like pypi are a bit like the wild west. Only widely used packages like Pandas are well maintained and documented. R is different, in that Cran curates the libraries / packages to a high standard. However, unless you use virtual environments, OR stick to base language which is very stable, your code will break every few months due to changes in the packages when they update... especially if has anything to do with the tidyverse, where the lead dev. once stated: "I prefer to have a more utopian approach where I can be quite aggressive about making backward-incompatible changes while trying to figure out a better API."
It is nice the PB has so many built in libraries, and that everything is in one exe, as long as the libraries respect backwards compatibility.
Re: Can a structure be defined dynamically by csv headers
Posted: Fri Mar 08, 2024 9:59 am
by mk-soft
Nice function ...
Added error text if csv-file not found
Code: Select all
EnableExplicit
Structure IndexArray
*Index[0]
EndStructure
PrototypeC.l Prototype_CallBack(*pCtx, argc.l, *argv.IndexArray, *colnames.IndexArray)
ImportC "sqlite3.lib"
sqlite3_create_function16(DatabaseID, zFunctionName.s, nArg.l, eTextRep.l, *pApp, *xFunc, *xStep, *xFinal)
sqlite3_value_text16(*sqlite3_value)
sqlite3_context_db_handle.i(*context)
sqlite3_exec(*db, sql.p-utf8, callback.Prototype_CallBack, *firstArg, *errmsg)
sqlite3_result_error16(*context, msg.s, bytes)
EndImport
#SQLITE_UTF16 = 4
ProcedureC createTableFromCSV(*context, argc.i, *argv.IndexArray)
Protected Filename$, File.i, Line$, FieldCount.i, Table$, SQL$, Field$, *DB, i.i
Filename$ = PeekS(sqlite3_value_text16(*argv\Index[0]))
File = ReadFile(#PB_Any, Filename$)
If File
Table$ = GetFilePart(Filename$, #PB_FileSystem_NoExtension)
Line$ = ReadString(File)
;Debug Line$
SQL$ = "CREATE TABLE " + Table$ + "("
FieldCount = 1
Repeat
Field$ = StringField(Line$, FieldCount, ",")
If Field$ <> ""
SQL$ + Trim(Field$) + " TEXT,"
EndIf
FieldCount + 1
Until Field$ = ""
SQL$ = RTrim(SQL$, ",") + ")"
;Debug SQL$
*DB = sqlite3_context_db_handle(*context)
sqlite3_exec(*DB, SQL$, #Null, #Null, #Null)
While Not Eof(File)
Line$ = ReadString(File)
SQL$ = "INSERT INTO " + Table$ + " VALUES ("
For i = 1 To FieldCount
Field$ = StringField(Line$, i, ",")
SQL$ + Trim(Field$) + ","
Next i
SQL$ = RTrim(SQL$, ",") + ")"
;Debug SQL$
sqlite3_exec(*DB, SQL$, #Null, #Null, #Null)
Wend
CloseFile(File)
Else
sqlite3_result_error16(*context, "Error: CSV-File Not Found!", -1)
EndIf
EndProcedure
UseSQLiteDatabase()
If OpenDatabase(0, ":memory:", "", "")
sqlite3_create_function16(DatabaseID(0), "createTableFromCSV", 1, #SQLITE_UTF16, #Null, @createTableFromCSV(), #Null, #Null)
If Not DatabaseUpdate(0, "SELECT createTableFromCSV('c:\tmp\PureBasic\csvtest\Test.csv')")
Debug DatabaseError()
Else
If DatabaseQuery(0, "SELECT * FROM Test")
While NextDatabaseRow(0)
Debug GetDatabaseString(0, 0) + " " + GetDatabaseString(0, 1) + " " + GetDatabaseString(0, 2)
Wend
FinishDatabaseQuery(0)
Else
Debug DatabaseError()
EndIf
EndIf
CloseDatabase(0)
EndIf
[code-pb]
Re: Can a structure be defined dynamically by csv headers
Posted: Fri Mar 08, 2024 4:19 pm
by ChrisR
Nicely done with sqlite and probably the most appropriate way for csv
Another possible way without using sqllite:
Code: Select all
CompilerIf #PB_Compiler_IsMainFile
EnableExplicit
CompilerEndIf
Structure MyFieldTable
Array Field$(0)
EndStructure
Global NewList MyTable.MyFieldTable()
; mk-soft SplitString to List or Array with option double-quotes (CSV) (Array here): https://www.purebasic.fr/english/viewtopic.php?t=69557
XIncludeFile "SplitString.pbi"
CompilerIf #PB_Compiler_IsMainFile
; GetArtistNameFromID is only for the Test example below
Procedure.s GetArtistNameFromID(ID)
Protected ArtistName$
PushListPosition(MyTable())
ForEach MyTable()
If MyTable()\Field$(0) = Str(ID)
ArtistName$ = MyTable()\Field$(1)
Break
EndIf
Next
PopListPosition(MyTable())
ProcedureReturn ArtistName$
EndProcedure
; GetBioFromID is only for the Test example below
Procedure.s GetBioFromID(ID)
Protected Bio$
PushListPosition(MyTable())
ForEach MyTable()
If MyTable()\Field$(0) = Str(ID)
Bio$ = MyTable()\Field$(2)
Break
EndIf
Next
PopListPosition(MyTable())
ProcedureReturn Bio$
EndProcedure
CompilerEndIf
Procedure.s GetColValueFromID(Col, ID)
Protected Value$
If Col < 0 : ProcedureReturn : EndIf
PushListPosition(MyTable())
FirstElement(MyTable())
If Col <= ArraySize(MyTable()\Field$())
ForEach MyTable()
If MyTable()\Field$(0) = Str(ID)
Value$ = MyTable()\Field$(Col)
Break
EndIf
Next
EndIf
PopListPosition(MyTable())
ProcedureReturn Value$
EndProcedure
Procedure.s GetValueFromID(Field$, ID)
Protected Value$, ID$, CountField, Col = #PB_Default, I
PushListPosition(MyTable())
FirstElement(MyTable())
CountField = ArraySize(MyTable()\Field$())
; Check case-insensitive header fields to make life easier
Field$ = UCase(Field$)
For I = 0 To CountField
If UCase(MyTable()\Field$(I)) = Field$
Col = I
Break
EndIf
Next
If Col <> #PB_Default
ID$ = Str(ID)
ForEach MyTable()
If MyTable()\Field$(0) = ID$
Value$ = MyTable()\Field$(Col)
Break
EndIf
Next
EndIf
PopListPosition(MyTable())
ProcedureReturn Value$
EndProcedure
Procedure LoadCSV(Filename$)
Protected File, Line$, NbLine, CountHeaderField, CountField, ReturnVal = #True, I
Protected Dim StringArray.s(0)
File = ReadFile(#PB_Any, Filename$)
If File
NbLine + 1
Line$ = ReadString(File)
CountHeaderField = SplitStringArray(Line$, ",", StringArray(), #True) - 1
AddElement(MyTable())
ReDim MyTable()\Field$(CountHeaderField)
For I = 0 To CountHeaderField
MyTable()\Field$(I) = StringArray(I)
Next
While Not Eof(File)
NbLine + 1
Line$ = ReadString(File)
CountField = SplitStringArray(Line$, ",", StringArray(), #True) - 1
If CountField = CountHeaderField
AddElement(MyTable())
ReDim MyTable()\Field$(CountField)
For I = 0 To CountField
MyTable()\Field$(I) = StringArray(I)
Next
Else
If MessageRequester("Warning", "Line " + Str(NbLine) +", Number of fields = " + Str(CountField+1) + ", does Not match header fields count = " + Str(CountHeaderField+1) +
#CRLF$+#CRLF$+ "Continue loading the csv table, ignoring this line ?", #PB_MessageRequester_YesNo | #PB_MessageRequester_Warning) = #PB_MessageRequester_No
ClearList(MyTable())
ReturnVal = #False
Break
EndIf
EndIf
Wend
CloseFile(File)
ProcedureReturn ReturnVal
EndIf
EndProcedure
CompilerIf #PB_Compiler_IsMainFile
Define Filename$ = "E:\Temp\Test.csv"
Define OutPut$, CountField, I
If LoadCSV(Filename$)
ForEach MyTable()
If CountField = 0 : CountField = ArraySize(MyTable()\Field$()) : EndIf
OutPut$ = ""
For I = 0 To CountField
If I = 0
OutPut$ + MyTable()\Field$(I)
Else
OutPut$ + " | " + MyTable()\Field$(I)
EndIf
Next
Debug OutPut$
Next
Debug "-----"
Debug "ArtistName From ID = 8 : " + GetArtistNameFromID(8)
Debug "Bio From ID = 8 : " + GetBioFromID(8)
Debug "Value ArtistName From ID = 8 : " + GetValueFromID("ArtistName", 8)
Debug "Column 2 Value From ID = 8 : " + GetColValueFromID(2, 8)
EndIf
; Test.csv
; ArtistId,ArtistName,Bio
; 1,"Joe Satriani",""
; 2,"Steve Vai",""
; 3,"The Tea Party",""
; 4,"Noiseworks",""
; 5,"Wayne Jury",""
; 6,"Mr Percival",""
; 7,"Iron Maiden",""
; 8,"Atmasphere","Australian jazz band centred around polyrhythms."
; 9,"Ian Moss",""
; 10,"Magnum",""
; 13,"Primus",""
; 14,"Pat Metheny",""
; 15,"Frank Gambale",""
; 16,"Mothers of Invention","Like its name"
CompilerEndIf
Re: Can a structure be defined dynamically by csv headers
Posted: Fri Mar 08, 2024 5:41 pm
by mk-soft
With doublequotes this is a little more complicated. There can be separators in the field. Hence the doublequotes in the field.
Link:
SplitString to list and array with option double-quotes
Re: Can a structure be defined dynamically by csv headers
Posted: Fri Mar 08, 2024 6:34 pm
by ChrisR
Yes, it's much better with your SplitString CSV to Array.
I've updated my previous post.
Thanks