How to speed up multiple inserts in SQLite

Mistrel
Addict
Posts: 3415
Joined: Sat Jun 30, 2007 8:04 pm

How to speed up multiple inserts in SQLite

Post by Mistrel »

This is basically a copy from tanguay.info, since there was hardly anything that needed converting to PB.
http://www.tanguay.info/web/codeExample.php?id=728

Credit to pdwyer for bringing this to my attention in one of his posts.

Code: Select all

sql.s="Insert into MyTable(Column) values('Value')"
sqlite_exec(hDB,"Begin")
for i=1 to 500
	sqlite_exec(hDB,sql.s);
next i
sqlite_exec(hDB,"End");
Fangbeast
PureBasic Protozoa
Posts: 4790
Joined: Fri Apr 25, 2003 3:08 pm
Location: Not Sydney!!! (Bad water, no goats)

Post by Fangbeast »

This sounds interesting: a way of caching the transactions for speed reasons. It sounds like a good way to run an SQLite database on a machine that acts as the database and connection server, with a lot of people hitting that machine, given that SQLite is not client/server.

Brings possibilities..
Mistrel
Addict
Posts: 3415
Joined: Sat Jun 30, 2007 8:04 pm

Post by Mistrel »

I'm still trying to figure out how END works, but if you use COMMIT instead (END may be the same), the changes made to the database will be atomic.

You can use the same technique when selecting data from a database, too.
Fangbeast
PureBasic Protozoa
Posts: 4790
Joined: Fri Apr 25, 2003 3:08 pm
Location: Not Sydney!!! (Bad water, no goats)

Post by Fangbeast »

The way I read it, "End" tells SQLite that you have finished sending queries (which, as I understand it, are cached to disk), and then a "Commit" is performed to add the cached queries to the database permanently.

I think "Commit" normally performs the transactions immediately, without caching the statements enclosed by the "Begin" and "End" markers, but I could be wrong.

P.S. I've never understood what "atomic" means in relation to a database. Only for physics:):)
Mistrel
Addict
Posts: 3415
Joined: Sat Jun 30, 2007 8:04 pm

Post by Mistrel »

pdwyer explains the END/COMMIT confusion!
http://www.purebasic.fr/english/viewtopic.php?t=29414

Atomic means that either all of the changes occur or none at all. For example, if you're half-way through updating a database and your power goes out you never send a COMMIT so the changes are automatically rolled back to the previous state.
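As a rough sketch with the same sqlite_exec wrapper as in the first post:

Code: Select all

sqlite_exec(hDB, "Begin")
sqlite_exec(hDB, "Insert into MyTable(Column) values('Value')")
sqlite_exec(hDB, "Rollback")  ; undo everything since Begin; the insert never becomes visible
; Losing power before a Commit has the same effect: the next time the
; database is opened, SQLite rolls it back to the state before the transaction.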

I believe that's how Windows Vista (TM) handles files.
Wikipedia wrote:Transactional NTFS

Transactional NTFS (abbreviated TxF) brings the concept of atomic transactions to the NTFS file system, allowing Windows application developers to write file output routines that are guaranteed to either completely succeed or completely fail. Transactional NTFS allows for files and directories to be created, renamed, and deleted atomically. Using transaction ensures correctness of operation; in a series of file operations (done as a transaction), the operation will be committed if all the operations succeed. In case of any failure, the entire operation will rollback and fail.

Transactional NTFS is implemented on top of the Kernel Transaction Manager (KTM), which is a Windows kernel component first introduced in Windows Vista that provides transactioning of objects in the kernel. The NTFS file system already supports journaling of low-level operations, such as writing a block of data. Transactional NTFS expands on this capability to include:

* Atomic operations on a single file:

A common example of this is saving a file from an application; if the application or machine were to crash while writing the file, then only part of the file could be written, possibly resulting in a corrupted file. This would be a very significant problem if a previous version of the file was being over-written, as data would likely be lost.

* Atomic operations spanning multiple files:

If an application needs to update several files at once with a set of changes, all the necessary file operations can be performed as a single transaction, preventing inconsistent updates in the event of a failure.

* Atomic operations spanning multiple computers:

Performing the same operation on multiple computers is a fairly common administrative task in a corporate network; Transactional NTFS integrates with the Distributed Transaction Coordinator to ensure that the change is successfully applied to all machines.
Fangbeast
PureBasic Protozoa
Posts: 4790
Joined: Fri Apr 25, 2003 3:08 pm
Location: Not Sydney!!! (Bad water, no goats)

Post by Fangbeast »

Thanks for that; there are huge gaps in my knowledge regarding many database features that I don't use.
Dare
Addict
Posts: 1965
Joined: Mon May 29, 2006 1:01 am
Location: Outback

Post by Dare »

Thanks for the explanations! Sometimes a few words make a huge difference to understanding.
Dare2 cut down to size
Heathen
Enthusiast
Posts: 498
Joined: Tue Sep 27, 2005 6:54 pm
Location: At my pc coding..

Post by Heathen »

Also, it's a good idea to group them into one call, i.e.:

Code: Select all

sql.s = ""
For i = 1 To 500
  sql + "Insert into MyTable(Column) values('Value');"
Next i
; sqlite_exec runs each semicolon-separated statement in the string in turn
sqlite_exec(hDB, "Begin;" + sql + "End;")
I also suggest checking out the PRAGMA settings; there are many optimizations that can be made.
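For example (these particular values are only illustrative, and some of them trade crash safety for speed, so check the SQLite documentation before relying on them):

Code: Select all

sqlite_exec(hDB, "PRAGMA synchronous = OFF")   ; don't sync to disk after every write (faster, less durable)
sqlite_exec(hDB, "PRAGMA cache_size = 10000")  ; keep more pages in the page cache
sqlite_exec(hDB, "PRAGMA temp_store = MEMORY") ; keep temporary tables and indices in RAM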
User avatar
graves
Enthusiast
Posts: 160
Joined: Wed Oct 03, 2007 2:38 pm
Location: To the deal with a pepper

Post by graves »

I don't know SQLite, but in MySQL it's much, much faster to group everything into one statement, like this:

Code: Select all

INSERT INTO nametable [(field, field, ...)] VALUES (data1, data1, ...), (data2, data2, ...), (data3, data3, ...) ... (and so on, up to 4MB of text)
In PureBasic (assume 3 field values stored in a LinkedList):

Code: Select all

sqlcmd.s = "INSERT INTO nametable VALUES "
Foreach LnkLstData()
  sqlcmd + "("+ LnkLstData()\field1+","+ LnkLstData()\field2 +","+ LnkLstData()\field3 +"),"
next
sqlcmd = left(sqlcmd, len(sqlcmd)-1)  ;deletes last comma character
mysql_real_queryl(*mysql,sqlcmd,len(sqlcmd))
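(One caveat: if the fields are strings, they would need to be quoted, and escaped with something like mysql_real_escape_string from the MySQL C API, before being concatenated in.)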
pdwyer
Addict
Posts: 2813
Joined: Tue May 08, 2007 1:27 pm
Location: Chiba, Japan

Post by pdwyer »

From the SQLite page:
When all the INSERTs are put in a transaction, SQLite no longer has to close and reopen the database or invalidate its cache between each statement... When unshackled in this way, SQLite is much faster than either PostgreSQL or MySQL.
More details about performance considerations are here: http://www.sqlite.org/cvstrac/wiki?p=Pe ... iderations


If you want more speed and only need the data temporarily (say, to perform complex queries on some data using SQL), you can use "in memory" DBs too, which don't write to disk at all. When you close the DB the memory is freed.

http://www.purebasic.fr/english/viewtopic.php?t=27149
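A rough sketch with the built-in Database library again (the ":memory:" filename is SQLite's own convention for a purely in-memory database; the table is made up):

Code: Select all

UseSQLiteDatabase()

If OpenDatabase(0, ":memory:", "", "")
  DatabaseUpdate(0, "CREATE TABLE scratch(line TEXT)")
  DatabaseUpdate(0, "INSERT INTO scratch VALUES('hello')")
  CloseDatabase(0)  ; everything is discarded here
EndIf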
Paul Dwyer

“In nature, it’s not the strongest nor the most intelligent who survives. It’s the most adaptable to change” - Charles Darwin
“If you can't explain it to a six-year old you really don't understand it yourself.” - Albert Einstein
mskuma
Enthusiast
Posts: 573
Joined: Sat Dec 03, 2005 1:31 am
Location: Australia

Post by mskuma »

Yes, I agree that using 'begin transaction', a bunch of inserts, and 'commit transaction' makes a huge difference. To speed up DB operations, lately I've been playing with ramdrives, and this is better than using disk-based DBs, but I've never tried 'in-memory'. Is it easy enough to use SQL statements to copy an existing disk-based database into memory for use, then write the updates back to the disk-based one to preserve any changes?
pdwyer
Addict
Posts: 2813
Joined: Tue May 08, 2007 1:27 pm
Location: Chiba, Japan

Post by pdwyer »

I think that in-memory DBs lend themselves to a different kind of application. Rather than being a database in the traditional sense, an in-memory DB gives you complex memory structures and querying features beyond what a programming language alone will do for you.

You might use a linked list rather than an array for its features, even though it's not as fast at raw access. This is (IMO) just another step further in that direction. Rather than load a list, load an "in memory" DB; then you can execute complex joins, grouping, sorting and filtering with just a couple of lines of code and select out the result. None of that needs to go to disk at all.

One thing I used this for was to load two tab-delimited text files into two tables and perform joins on them for comparisons, then output the result in the same tabbed-text format. It would have taken quite a bit of code to do that with arrays, and using a "real" (disk) database would have been far slower because the text (log) files being loaded were very large.
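Something along these lines (just a sketch: the real table layout and the file-loading loop are left out, and all the names are made up):

Code: Select all

UseSQLiteDatabase()

If OpenDatabase(0, ":memory:", "", "")
  DatabaseUpdate(0, "CREATE TABLE a(key TEXT, val TEXT)")
  DatabaseUpdate(0, "CREATE TABLE b(key TEXT, val TEXT)")
  ; ... read each text file line by line and INSERT, inside one Begin/End ...
  If DatabaseQuery(0, "SELECT a.key, a.val, b.val FROM a JOIN b ON a.key = b.key")
    While NextDatabaseRow(0)
      Debug GetDatabaseString(0, 0) + Chr(9) + GetDatabaseString(0, 1) + Chr(9) + GetDatabaseString(0, 2)
    Wend
    FinishDatabaseQuery(0)
  EndIf
  CloseDatabase(0)
EndIf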

Just some thoughts...
Paul Dwyer

“In nature, it’s not the strongest nor the most intelligent who survives. It’s the most adaptable to change” - Charles Darwin
“If you can't explain it to a six-year old you really don't understand it yourself.” - Albert Einstein
Dare
Addict
Posts: 1965
Joined: Mon May 29, 2006 1:01 am
Location: Outback

Post by Dare »

pdwyer wrote:Just some thoughts...
Good ones, and food for more.

* Lateral thinking procedures envoked ... *
Dare2 cut down to size
pdwyer
Addict
Posts: 2813
Joined: Tue May 08, 2007 1:27 pm
Location: Chiba, Japan

Post by pdwyer »

Code: Select all

InitRightBrain()
:)
Paul Dwyer

“In nature, it’s not the strongest nor the most intelligent who survives. It’s the most adaptable to change” - Charles Darwin
“If you can't explain it to a six-year old you really don't understand it yourself.” - Albert Einstein
Dare
Addict
Posts: 1965
Joined: Mon May 29, 2006 1:01 am
Location: Outback

Post by Dare »

:D

.. invoked
Dare2 cut down to size