Millisecond time resolution with databases?

Just starting out? Need help? Post your questions and find answers here.
User avatar
Skipper
User
User
Posts: 59
Joined: Thu Dec 19, 2024 1:26 pm
Location: Europe

Millisecond time resolution with databases?

Post by Skipper »

Hi Group,

after many years, I decided to check on PureBasic once more. I liked what I saw and decided to return to using this compiler.

For now, I have a question about the databases to which PureBasic provides access:

I need to work with rather big time-series. These series are sorted by date/time, and as these series potentially contain multiple events within any single second, these time-series use time-date fields with a resolution up to a millisecond. The current logs are 50+GB, and growing.

Is it feasible to store and use this time-resolution with, say, SQLite? I do know that I have to write my own PB-code to handle milliseconds, but what about the database side of things?

Cheers
Skipper
User avatar
NicTheQuick
Addict
Addict
Posts: 1527
Joined: Sun Jun 22, 2003 7:43 pm
Location: Germany, Saarbrücken
Contact:

Re: Millisecond time resolution with databases?

Post by NicTheQuick »

For time series data SQLite or any other relational database is never a good idea.
I'm afraid you need a proper time series database like InfluxDB, TimescaleDB, OpenTSDB, Graphite or even Prometheus. However, there is no ready to use connection in Purebasic yet. You will have to write the communication yourself. But this should be quite easy with the HTTP library in most of the cases.
The english grammar is freeware, you can use it freely - But it's not Open Source, i.e. you can not change it or publish it in altered way.
User avatar
Skipper
User
User
Posts: 59
Joined: Thu Dec 19, 2024 1:26 pm
Location: Europe

Re: Millisecond time resolution with databases?

Post by Skipper »

Hi Nic,

thank you for pointing me to databases that are specifically meant for time-series data.

Cheers
Skipper

EDIT: I should have added that my app will not require real-time analysis, so I wonder if in that scenario it would still be best to not use an SQL database?
Last edited by Skipper on Mon Jan 13, 2025 4:34 pm, edited 1 time in total.
User avatar
mk-soft
Always Here
Always Here
Posts: 6318
Joined: Fri May 12, 2006 6:51 pm
Location: Germany

Re: Millisecond time resolution with databases?

Post by mk-soft »

SQLite support timestamp with milliseconds as TEXT as ISO8601 strings ("YYYY-MM-DD HH:MM:SS.SSS") and REAL (8 bytes) as Julian day numbers

Link:
https://www.sqlite.org/datatype3.html
https://www.sqlite.org/lang_datefunc.html
My Projects ThreadToGUI / OOP-BaseClass / EventDesigner V3
PB v3.30 / v5.75 - OS Mac Mini OSX 10.xx - VM Window Pro / Linux Ubuntu
Downloads on my Webspace / OneDrive
User avatar
NicTheQuick
Addict
Addict
Posts: 1527
Joined: Sun Jun 22, 2003 7:43 pm
Location: Germany, Saarbrücken
Contact:

Re: Millisecond time resolution with databases?

Post by NicTheQuick »

Skipper wrote: Mon Jan 13, 2025 1:48 pm EDIT: I should have added that my app will not require real-time analysis, so I wonder if in that scenario it would still be best to not use an SQL database?
I don't know if SQLite is the best idea with 50+GB of data. In our company we use InfluxDB to store and read realtime metrics of 40.000+ services and although we acquire these data over several year the size of the database is still not in the 50GB range. And we need a lot of memory (48GB+ RAM) to handle that data properly.
So... Of course SQlite can store that amount of data but I don't think you will be happy with it.
The english grammar is freeware, you can use it freely - But it's not Open Source, i.e. you can not change it or publish it in altered way.
User avatar
idle
Always Here
Always Here
Posts: 6026
Joined: Fri Sep 21, 2007 5:52 am
Location: New Zealand

Re: Millisecond time resolution with databases?

Post by idle »

Out of the the 50 gb how many entries does the db store?
I only ask as I'm writing a db engine for PB but it currently stores the primary index in memory which limits it somewhat to the available ram but it can easily deal with 200m entries on a desktop though that mightn't be enough.
However in the case of time series it would be easy enough to range index and then create dynamic multiple views by scanning through the ranges and indexing them. So an index of ranges which then indexes the views.
The core does around 1m reads per thread, 1m Enums per thread and 200k writes p/s concurrently, though that would probably translate to 20k writes to disk p/s with 16 byte keys.
In terms of performance it's faster than Facebooks RocksDB, which is an LSM tree, it's memtable does
4.5M - 7M read QPS for point lookups with sustained writes
4M - 6M QPS prefix range scans with sustained writes.

The writes are limited to 20k p/s and that's running on 32 threads with 131m entries and 144gb ram
my core with 8 readers 2 enums and 2 writers is doing around 8m reads, 2m prefix, and 200k writes on 12 threads concurrently on 16gb ram. though that's with 16 byte keys vs 19 byte keys. its O(k) vs O(log n) so it scales well but there's still a long way to go before it's production ready.
User avatar
Skipper
User
User
Posts: 59
Joined: Thu Dec 19, 2024 1:26 pm
Location: Europe

Re: Millisecond time resolution with databases?

Post by Skipper »

@mk-soft, @nic:
Thanks a lot for your input. Unfortunately, the date/time-stamp is not quite conforming to ISO, so some conversion needs to be done on import. That's OK though, see below:

@idle:
Thanks for the insights you provide about your project. It might become interesting down the line, I'll follow your progress.
The application I intend to create is a 'back-testing engine' for currency and future trading. The import files have very tiny records, four fields only, per underlying some 2-6 mio/month of them. Following 16 currency pairs and a handful of index futures, and you can imagine where that leads up to.
For this reason, I thought of range indexing as well, so on import I intend to store derived information as well (like week number, etc.). Also, the date/time-field I may well split into its constituents, which would offer another way to ease range indexing. Besides, I'm likely to create a single table per underlying, which also aids in segmentation.

Cheers
Skipper
<< Win-11 (x64: XEON + i5) / Mint linux (x64: i3) / MacOS Monterey (Intel x64) >>
User avatar
idle
Always Here
Always Here
Posts: 6026
Joined: Fri Sep 21, 2007 5:52 am
Location: New Zealand

Re: Millisecond time resolution with databases?

Post by idle »

Skipper wrote: Tue Jan 14, 2025 11:54 am I intend to create is a 'back-testing engine' for currency and future trading. The import files have very tiny records, four fields only, per underlying some 2-6 mio/month
you probably won't need a dB for that if it's all time stamped in increasing order just use a sliding windows and seek through the file loading into an array at what ever sample rate you want and that way you can do fft's to see any periodic cycles and correlate the lag or feed into a Neural network or do PCA principal component analysis
User avatar
Skipper
User
User
Posts: 59
Joined: Thu Dec 19, 2024 1:26 pm
Location: Europe

Re: Millisecond time resolution with databases?

Post by Skipper »

idle wrote: Tue Jan 14, 2025 9:15 pm you probably won't need a dB for that if it's all time stamped in increasing order just use a sliding windows and seek through the file loading into an array at what ever sample rate you want and that way you can do fft's to see any periodic cycles and correlate the lag or feed into a Neural network or do PCA principal component analysis
Idle,

you're not the first to suggest that, for this application, a full-grown DBMS might be overkill :wink: I have to test my individual alternatives before deciding on a strategy. A sliding window over the ASCII files would be possible, but I'd like to reduce disk I/O as much as possible. Time for some experiments...

Btw: those 50Gb I mentioned before are zipped ASCII files, so the unpacked size is probably way over sevenfold that size...

Cheers
Skipper
<< Win-11 (x64: XEON + i5) / Mint linux (x64: i3) / MacOS Monterey (Intel x64) >>
User avatar
idle
Always Here
Always Here
Posts: 6026
Joined: Fri Sep 21, 2007 5:52 am
Location: New Zealand

Re: Millisecond time resolution with databases?

Post by idle »

Don't think that would be a problem file reads are buffered anyhow but tou could page into a ringbuffer
Post Reply