BS and RM lists

Everything else that doesn't fall into one of the other PB categories.
Dare
Addict
Addict
Posts: 1965
Joined: Mon May 29, 2006 1:01 am
Location: Outback

BS and RM lists

Post by Dare »

This code is not really an announcement nor a tricks/tips so I am putting it here.

The zip file contains two linked list codes. Both are quick put-togethers to ensure I could handle things before I tackled something a bit more complex.

The one list is called BS - I leave you to decide what it stands for. It is a linked list at it's absolute simplest and can have management code and additional functionality built on top.

The second is called RM for Rail management and is the precursor to a bigger project (which will also use a slightly enhanced version of BS).

Like the midi files at http://www.purebasic.fr/english/viewtop ... e0d13a510f these are:

(a) Developed with 4.2 (but these should work with 4.1)
(b) Windows - but these may be cross platform
(c) In the public domain, so you can do what you like with them.
(d) Require that you remove an easily found (if you read the initial comments) "End" statement or they won't run.

By removing the "End" you modified the code and this takes the onus of responsibility off of me. :D


Zip file: http://www.gemcutters.net/pbOddments/RM.zip


Hope it is useful to someone, and hope for some suggestions on the RM.
Dare2 cut down to size
User avatar
pdwyer
Addict
Addict
Posts: 2813
Joined: Tue May 08, 2007 1:27 pm
Location: Chiba, Japan

Post by pdwyer »

Is the RM like a hash table with linked lists as chaining?
Paul Dwyer

“In nature, it’s not the strongest nor the most intelligent who survives. It’s the most adaptable to change” - Charles Darwin
“If you can't explain it to a six-year old you really don't understand it yourself.” - Albert Einstein
Dare
Addict
Addict
Posts: 1965
Joined: Mon May 29, 2006 1:01 am
Location: Outback

Post by Dare »

pdwyer wrote:Is the RM like a hash table with linked lists as chaining?
Hi mate.

Not being really conversant with hash tables I'll say no because I don't think it is.

With most linked lists, each element points forward and back to neighbouring elements or siblings.

With RM there is a single block of memory, an "index":
  • [addr1][addr2][addr3][etc]
with each entry pointing to an element.

There is a control block (a list of lists) and each list within that can be used for whatever.

Offhand I would say it is closer to a B+tree (with which I am familiar) than a hash table, but it is not really a B+tree either.

I intend to do something similar to this for an RMS. This will include B+tree for indexing by records, record management within disk files, dictionary and etc. The RM was just to see if (a) I could get my head around the approach and (b) if so, then the basic concept will be used to handle all aspects of the RMS.

The biggest downside of the RM re lists is the "rail" or list of elements because this has a contiguous chunk of memory allocated and it can grow. However if the size of the rail is fixed or within limits then there are some upsides, especially with sorting, binary searches (not having to step through a next/previous list) and etc.

Hope that made sense. :)
Dare2 cut down to size
User avatar
pdwyer
Addict
Addict
Posts: 2813
Joined: Tue May 08, 2007 1:27 pm
Location: Chiba, Japan

Post by pdwyer »

Sounds pretty cool, I often pull out my old VB Algorithms book and go through these things and port them to PB http://www.amazon.com/Ready-Run-Visual- ... 0471242683.

Recently though, as much as I want to implement some of these things I just find that using sqlite in-memory DB gives me a power b+tree with lots of features built in so I haven't looked at this kind of stuff for a while.

Why are you going about this in this way? What I mean is, what strengths of this are you planning on taking advantage of for what you're doing? The searching?

There's an "interpolation search" http://en.wikipedia.org/wiki/Interpolation_search that should be faster than a binary search (depending on the data) if you want to play with another algorithm to squeeze some more speed. Instead of halving the distance to the target of the search it tries to estimate it and jump right to it. It's more CPU cycles to guess a jump but less jumps.

It might not be better every time but fun to play with.

I'm interested in what you are wanting to do with this though, it sounds very specific to a task
Paul Dwyer

“In nature, it’s not the strongest nor the most intelligent who survives. It’s the most adaptable to change” - Charles Darwin
“If you can't explain it to a six-year old you really don't understand it yourself.” - Albert Einstein
Dare
Addict
Addict
Posts: 1965
Joined: Mon May 29, 2006 1:01 am
Location: Outback

Post by Dare »

Hi mate.

Mostly I am doing this because it grabbed my interest. A weakness I have, methinks. Last thing that really grabbed my interest with PureBasic was an attempt to make a generic interface for activeX, etc. I dug a hole so deep little red fellows with horns and tails started to help me deepen it. :)

What I want to do is create an RMS that successfully (and speedily, but that is later) handles variable length records, variable length fields, dictionaries, indexing on said variable length fields, multi-field keys, etc etc.

It will never challenge anything like SQLite. :) It might prove useful for the odd app or so. But mostly - if it happens I succeed - I will feel good.

One aspect of this is to have significant keys ordered and handled in such a way that it is quick to find/insert/remove them, etc. I think that the RM will provide such a way, especially with variable length keys and with the fact the keys can be disconnected from the rail. The rails (per node if a BTree+) can be pretty large in memory and in a file, with the keys being disjointed (in memory) by "hanging" below the rails. Lower level rail nodes would have higher level (leafy) nodes hanging from them.

The in-memory side is going to be fast enough, I think. The on-file side is something that needs to be rethought a bit.

Another aspect of interest is the free space management of the file, which I am thinking is going to be an index ordered by the space available per deleted record or chunk, so records can re-use space that is equal or only slightly larger than their needs. Just so it doesn't keep expanding the file and need a dedicated repacking process.

It is a huge job (for me) covering everything from dictionary through indexing through file storage management. So I anticipate it will take a while. RM was a concept check for one part and it was triggered by some of the LL posts recently. They broke a mindset and opened up possibilities. RM was mostly to see if my brain could cope with what was going on conceptually (brain to code).

If anything comes of this I will certainly send the results to you. If it gets that far this side of the next millenium then your part of the deal is that, on receiving the code, you will keep a straight face. :D

That link, BTW, was interesting. Will follow up on that. Thanks.
Dare2 cut down to size
User avatar
DeanH
Enthusiast
Enthusiast
Posts: 292
Joined: Wed May 07, 2008 4:57 am
Location: Adelaide, South Australia
Contact:

Re: BS and RM lists

Post by DeanH »

Hi,

I know it has been awhile but I thought I'd inquire to see how your effort has gone.

"What I want to do is create an RMS that successfully (and speedily, but that is later) handles variable length records, variable length fields, dictionaries, indexing on said variable length fields, multi-field keys, etc etc. "

I am very interested in this. I don't use SQL or related and am not that interested in doing so. The software I write is designed to allow concurrent users on a LAN to access the data files, so in-memory indexing is not an option, it has to be file-based. I have not found a suitable B tree or other type of file-based index for PureBasic. I did try PowerBasic's system but never got it to work in Pure.

As an aside, I do have a DLL developed by a friend several years ago that does most of the things you've mentioned. I've had some problems with it in PB but that could be on my side. The DLL is written in C+ and is based on QDBM's Villa, which uses a B-Tree index. Records and keys are variable-length. Keys are text strings sorted alphabetically. There is no provision for fields, you have to manage those yourself with the records. File locking and queing users is automatic for multiple access. Give me a ping if you want to have a look at it.

D
User avatar
netmaestro
PureBasic Bullfrog
PureBasic Bullfrog
Posts: 8453
Joined: Wed Jul 06, 2005 5:42 am
Location: Fort Nelson, BC, Canada

Re: BS and RM lists

Post by netmaestro »

Dare/Dare2 hasn't posted in over two years, I sent him a PM over two months ago to see if he is still alive and it's still sitting in my outbox, meaning he hasn't read it. A couple of bad signs unfortunately. He is/was valued on these boards.
BERESHEIT
srod
PureBasic Expert
PureBasic Expert
Posts: 10589
Joined: Wed Oct 29, 2003 4:35 pm
Location: Beyond the pale...

Re: BS and RM lists

Post by srod »

Yes, I do hope that he is okay, but I have to acknowledge that the signs do not look good on that score.

A nice fella all round.
I may look like a mule, but I'm not a complete ass.
Baldrick
Addict
Addict
Posts: 860
Joined: Fri Jul 02, 2004 6:49 pm
Location: Australia

Re: BS and RM lists

Post by Baldrick »

srod wrote:Yes, I do hope that he is okay, but I have to acknowledge that the signs do not look good on that score.

A nice fella all round.
+1 srod, but I do fear the worst for him now 2 years on. 1 of the last few pm conversations I had with him back then, he talked of going to hospital for an operation on his stomach. He came back online briefly for a while after this, but disappeared again a short time later & I have not seen or heard anything from him since.
Post Reply