Database implementation/organization
when you do a search for vocal band, you do the same process and split the text, and then in the IN () you just have IN ("vocal", "band")
This would return then "Male Vocal Band" and "Vocal Politician Versus Band Of Rioters".
This is the price of searching on a many to many list.
As a post process you can then "rank" the results, so words that are next to each other in the correct order have a higher ranking.
Of course if you are focussing on windows, get SQL Server Developer Edition (the enterprise edition without restrictions) and then look up "Full Text Search" which does what I'm talking about within it's own database
This would return then "Male Vocal Band" and "Vocal Politician Versus Band Of Rioters".
This is the price of searching on a many to many list.

As a post process you can then "rank" the results, so words that are next to each other in the correct order have a higher ranking.
Of course if you are focussing on windows, get SQL Server Developer Edition (the enterprise edition without restrictions) and then look up "Full Text Search" which does what I'm talking about within it's own database

Funny. For a while the crowd was fooling around with optimization, and lately it's all databases 

( PB6.00 LTS Win11 x64 Asrock AB350 Pro4 Ryzen 5 3600 32GB GTX1060 6GB)
( The path to enlightenment and the PureBasic Survival Guide right here... )
( The path to enlightenment and the PureBasic Survival Guide right here... )
I'm not talking about a search for vocal band, I'm talking about searching for "vocal band" (in quotes). It should not return Vocal Politician Versus Band Of Rioters".Foz wrote:when you do a search for vocal band, you do the same process and split the text, and then in the IN () you just have IN ("vocal", "band")
This would return then "Male Vocal Band" and "Vocal Politician Versus Band Of Rioters".
- Rook Zimbabwe
- Addict
- Posts: 4322
- Joined: Tue Jan 02, 2007 8:16 pm
- Location: Cypress TX
- Contact:
- Rook Zimbabwe
- Addict
- Posts: 4322
- Joined: Tue Jan 02, 2007 8:16 pm
- Location: Cypress TX
- Contact:
- Rook Zimbabwe
- Addict
- Posts: 4322
- Joined: Tue Jan 02, 2007 8:16 pm
- Location: Cypress TX
- Contact:
nope but I do know that before I left there were over 114 pieces of data for each child or varying lengths...
And yes... Data Migration was handled on a MASSIVE scale several times a year.
Took less time than you would think!
But you will probably use unicode because of the fonts (you know, special letter characters)... I have no idea what XML does in unicode.
As well... There are many things I DON'T like about XML... it is sloppy to my eyes and always has been, but it IS easy to deal with in a webpage!
And yes... Data Migration was handled on a MASSIVE scale several times a year.
Took less time than you would think!
But you will probably use unicode because of the fonts (you know, special letter characters)... I have no idea what XML does in unicode.
As well... There are many things I DON'T like about XML... it is sloppy to my eyes and always has been, but it IS easy to deal with in a webpage!

Well, to take things to the next level, a string could be set up as tags in multiple passes, i.e.Trond wrote:I'm not talking about a search for vocal band, I'm talking about searching for "vocal band" (in quotes). It should not return Vocal Politician Versus Band Of Rioters".Foz wrote:when you do a search for vocal band, you do the same process and split the text, and then in the IN () you just have IN ("vocal", "band")
This would return then "Male Vocal Band" and "Vocal Politician Versus Band Of Rioters".
"boy vocal band" is 6 tags:
boy
vocal
band
boy vocal
vocal band
boy vocal band
and when you do a search for "vocal band" you can add a processor to the search builder that looks for quoted strings as is, rather than splitting it down.
so a search for "vocal band" would exclude "Vocal Politician Versus Band Of Rioters", but doing a search for "band" or "vocal" would include it.
-
- User
- Posts: 42
- Joined: Sat Mar 04, 2006 5:19 pm
- Location: Rome, NY
Trond,
You may want to try MySQL. It has a free unlimited version. Use the MyISAM table type and use full text index to index the column.
It should be fast enough for you.
The whole key to leveraging database technology is design and index usage.
If you have overlapping inserts/deletes, you will need to use an InnoDB table type and use transactions.
If it's at all possible to build in some delay, you could store all the adds/deletes and run them periodically so indexes can be disabled or removed/added around them.
Jeff
You may want to try MySQL. It has a free unlimited version. Use the MyISAM table type and use full text index to index the column.
It should be fast enough for you.
The whole key to leveraging database technology is design and index usage.
If you have overlapping inserts/deletes, you will need to use an InnoDB table type and use transactions.
If it's at all possible to build in some delay, you could store all the adds/deletes and run them periodically so indexes can be disabled or removed/added around them.
Jeff
Where are the masses?
- the.weavster
- Addict
- Posts: 1576
- Joined: Thu Jul 03, 2003 6:53 pm
- Location: England
Shouldn't the combination of the two fields be the LINK tables primary key?srod wrote:Thomas this is a standard trick when using many-many relationships. The important thing is that the LINK table does not have a key-field, or at the very least the two link fields shown are not defined as being key fields or even unique fields etc.
A Firebird VARCHAR field can be up to 32k in size and can be queried using '=', '>', BETWEEN, IN (), case sensitive LIKE and STARTING and case insensitive CONTAINING. They are also RLE compressed (at maximum 128 bytes compresses to 2 bytes).Trond wrote:Basically I want a searchable index of this, and since like 90% of all songs are about love I figured I'd save some space by not saving the "love" tag so many times. Not to mention artist names and album names will be saved tons of times.
I want to search both the filename as raw text and extra attributes, also as raw text. I thought I could get around the raw text requirement by splitting both the extra attributes and the filename into words and treating them all like short attributes, but then phrase search would break (didn't think of that).
But will it be fast for thousands of entries?the.weavster wrote:A Firebird VARCHAR field can be up to 32k in size and can be queried using '=', '>', BETWEEN, IN (), case sensitive LIKE and STARTING and case insensitive CONTAINING. They are also RLE compressed (at maximum 128 bytes compresses to 2 bytes).Trond wrote:Basically I want a searchable index of this, and since like 90% of all songs are about love I figured I'd save some space by not saving the "love" tag so many times. Not to mention artist names and album names will be saved tons of times.
I want to search both the filename as raw text and extra attributes, also as raw text. I thought I could get around the raw text requirement by splitting both the extra attributes and the filename into words and treating them all like short attributes, but then phrase search would break (didn't think of that).
- the.weavster
- Addict
- Posts: 1576
- Joined: Thu Jul 03, 2003 6:53 pm
- Location: England
In my experience queries that search, sort or join on unindexed fields or queries that return tens of thousands of records in a recordset are what slow databases down rather than a query that selects a few records from a very large table.
It's one of those 'how long is a piece of string' type questions, it really depends what you're going to ask the server to fetch for you.
It's one of those 'how long is a piece of string' type questions, it really depends what you're going to ask the server to fetch for you.