Fangbeast wrote:I'm thinking that the proper design for my recipe database is 3NF using intermediate transaction tables
I doubt it, I really do. Most of the time people can manage quite happily with 2NF (maybe 2 and a bit). The only time that I've seen 3NF used in anger is in enterprise resource planning applications such as CICS or MRP. In these situations you need to keep track of a lot of data which is actually redundant for a significant amount of time but you can't avoid keeping track of it all somewhere.
As an example a factory's parts database will include finished products, subassemblies used to make them and discrete components used to make the subassembies. Assemblies and subassemblies will have bills of material (BOM), drawing lists (DL) and other documentation such as assembly instructions or test procedures etc, all of which will need version control.
However most of the time you won't want much of this data:-
1) There will be no BOM, drawing lists or test procedures for any discrete components - you'll buy them from someone else complete.
So storing BOMs for every part in the db is wasteful - you only want BOMs for assemblies and sub assemblies - and there will be far less of these than the components which are requried to build them.
2) Most of the time you will only be interested in the current BOM and DL, the only time you want old versions is to answer the questions like 'what did we change on that board five years ago?', 'how do we update the Y the way we did the X last year?' So when referencing an item you just need to know that there are multiple BOMs and DLs around - not their full contents.
3) Each discrete component will actually have two (or more) part numbers - yours and the catalogue number used by your supplier(s), however the only people who'll be interested in these latter numbers is the purchasing office (maybe stores too), nobody else will care. There will be far more of these other part numbers than your own if you have alternative suppliers for each component.
and so on... This IS a 3NF scenario.
I can't see that you are dealing with anything that truly falls into this area. A recipe with no ingredients is no use, nor is one with no method, a method with no quantities isn't going to help much... This is all met by 2NF though, I'm not seeing anything really warrants the extra design work of 3NF so far.
At each stage you need to ask yourself 'what advantages does doing it this way offer?':-
Fangbeast wrote:
Recipes ; Main recipes table
Title ; [Text] Recipe title
Recipeid ; [INTEGER PRIMARY KEY AUTOINCREMENT] Recipe record id
How does separating out the Title from the rest of the recipe help you? It's not big and it just means you have to 'join' it back in again in a query every time you want it, making the query that bit more complex. Now if you were talking about a picture of the finished product or multiple pictures illustrating the method...
Fangbeast wrote:
Category ; Category table
Itemcount ; [INTEGER] Items belonging to category
Categoryid ; [INTEGER PRIMARY KEY AUTOINCREMENT] Category record id
I can see that a lot of the time you won't want the 'Itemcount' but I can also see that a "SELECT COUNT(Categoryid) FROM Category;"
is not going to be arduous when Categoryid is an IPK. I'm inclined to think an index somewhere else might be more helpful though - probably on CategoryID. I think that separating the CategoryID from the name of the category will just be a nuisance.
Fangbeast wrote:
tr_Category ; Category transaction table
Recipeid ; FOREIGN KEY(Recipeid) REFERENCES Recipes(Recipeid) ON DELETE CASCADE)
Categoryid ; FOREIGN KEY(Categoryid) REFERENCES Categoryid) ON DELETE CASCADE)
Recordid ; [INTEGER PRIMARY KEY AUTOINCREMENT] tr_Category record id
Yes I'd agree with this one offering advantages:
1) this way a single recipe can belong to multiple categories by having mutiple records for a single Recipeid.
2) it will support searching or retrieving by categories.
I'd still want indexes though on both Recipeid and Categoryid. You don't need the two prior normalisations to implement this one.
ON UPDATE CASCADE will be far more important that ON DELETE. If the IPK gets changed for either of the foreign tables and it doesn't get updated here this will foul up the whole record rendering it useless, or worse pointing to something else entirely, but keep the ON DELETE too because otherwise that would become an administrative chore. (It's still 2NF not 3NF though!)
Fangbeast wrote:IU have to do it as the database has grown to a sluggish 2GB and I have more recipes to feed it. It's unmanageable now.
2Gb is not big for a database. There are lots of databases kicking around with single
tables bigger than 2Gb. SQLite's quoted max is 140 terabytes. The biggest problem with database performance at the design stage is failing to set up the right indices to support the implemented queries. The easiest improvement for database performance at the operational stage is to add a useful index for the problem queries.