I've never used MySQL before, and I'm wondering if I could use it for the following task:
I am making a website where people can go to it and basically submit a shopping list. They will give the list a Name such as "Sue's List" and then they can enter 1-15 items, and for each item they enter 4 standard bits of data which can be applied to any item IE:
1)Name of item
4)Description of Item
From there I want to have another webpage which is capable of displaying all the lists that have been created, and when clicked, it will then show the items in that list.
Since MySQL is not made to create a hierarchal set of objects, is it possible to do this.
Three possible work arounds that I was thinking about would be to either
A)Create a new table for each shopping list that is created, and then a master table for these subtables. Is there a limit to the number of tables that can be created though? If 100 different people create a list, then it may get pretty cluttered with tables. Also, is there even a way to store a reference to a table as an element in another table
B)Create a table of all the items from all the shopping lists and give each a reference number, then create a master table of the shopping list names IE "Sue's List" and a list of the associated reference numbers
C)Create one table where each element has 61 Fields, where the fields are:
1)Name of Listand if there are only, say, 2 items in the shopping list, then the rest of the fields are left null--It seems really inefficient to me though.
2)Item 1 Name
3)Item 1 Cost
4)Item 1 Quantity
5)Item 1 Description
6)Item 2 Name
61)Item 15 Description
Would it be possible to do this in MySQL. If so, which method would be best, or is there a better method altogether? Is there something better to do this in than MySQL?
In the database you have a set of tables, the tables and represent objects in your system. You can have relationships between tables to help define a link between the different sets of data in the tables between the relationships. It’s these relationships which allow you to represent a child parent relationship with the data.
A relationship is formed in the table by declaring a field/column in a table as a primary key and the corresponding field/column in the relational table is declared as a foreign key.
To understand how to structure you tables you should do some research on normalization. The number of tables you can have is limited by the hardware and disk space to support the amount of tables.
In the scenario above you can have a table with your items with your 4 columns. Each item is represented as a row in that table not a column as you suggested in c). You have another table which will hold your list ID and the list name. You need a 3rd table to represent the relationship between the items and the lists bas a list has many items and an item can belong to more than one list therefore it is called a many-to-many relationship. Your 3rd table holds the primary key from your items table and the primary key of the list table but in this table they are called foreign key. This table will also have a primary key.
If the list of items is not a set list, then you only need two tables, not three as suggested. Three tables would be used if the list of 15 was selected from a longer list of pre-set items.
In your case, your structure would be:
ListID | ListName
ListItemID | ListID | ItemName | ItemCost | Qty | Desc
That' s it. You may want to add a MemberID type field to the Lists table or whatever schema you are using to remember the users.
If you noticed the 3rd table was for the many-to-many relationship between items and the itemlist. With a 2 table structure like you suggested then items would be repeated in your ListItems table which is bad normalization.
If the list of items is not a set list, then you only need two tables
If user #1 creates a list of 15 unique items, user #2 creates a list of 15 unique items, etc, etc.. where will you ever get a list item that repeats?
I do not believe the OP mentioned displaying a list to choose from. All items are being created as a uniquely named item.
Using a third many-to-many table for this scenario is unnecessarily redundant as you will always have the same number of records in each table.
Difference in opinions here I guess. Or maybe different in opinion on what the OP is trying to achieve.
From what you have with your 2 tables then data would be something like this:
Imo that’s not normalized correctly.
ListID | Name
1 | list 1
2 | list 2
3 | list 3
ListItemID | ListID | ItemName |ItemCost | Qty | Desc
1 | 1 | Item 1 | 2.00 | 2 | description of item 1
2 | 1 | Item 2 | 3.00 | 2 | description of item 2
3 | 1 | Item 3 | 1.00 | 5 | description of item 3
4 | 2 | Item 1 | 2.00 | 2 | description of item 1
5 | 2 | Item 2 | 3.00 | 2 | description of item 2
6 | 2 | Item 3 | 1.00 | 5 | description of item 3
7 | 3 | Item 1 | 2.00 | 2 | description of item 1
8 | 3 | Item 2 | 3.00 | 2 | description of item 2
9 | 3 | Item 3 | 1.00 | 5 | description of item 3
I would be doing the following:
ListID | ListName
ItemID | ItemName | ItemCost | Qty | Desc
ListItemID | ItemID | ListID
Will it work? Yes, but still not necessary.. because it is NOT a many-to-many table.
You are creating a duplicate ONE-to-many-table. If a many-to-many table would work, than it would certainly be suggested. Your third table will NEVER have duplicate data with a foreign key.. that makes it unnecessary.
I disagree with you that the relationship between a List of Items and the Items is NOT a many-to-many situation. You have 1 or more lists, each list can have 1 or more items belonging to it. Many lists to many items, many items to many lists, clear as day thats a many-to-many relationship.
Normalizing the many-to-many relationship requires a third table to serve as a link between the 2 tables in the relationship. Once a thrid table is added the relationships between the relational table and the tables making the relationship become one-to-many relationships.
Your tables can work without a third but then it would not be normalized correctly, Description and itemname would be repeated over and over.
Anyway this is getting of topic and I only posted to help Alex get a better understanding of a relational database and I was trying to avoid a lengthy discussion on database structure.
It's not off-topic if we are teaching the proper use of normalization. It's healthy discussion and may help someone on the future if not the OP.
To better understand your concept of many-to-many, can we agree your ID field ListItemID is not needed? If it is, please tell us why.
If ListItemID is removed from the third table, we are left with ItemID and ListID
ListID will be duplicated, but ItemID will NEVER, EVER be duplicated. That, sir, is a one-to-many table. There is no need for any other identifier which would make this a many-to-many table.
I wish Bill Wilkinson would chime in here.
ListITemID in my 3rd table is required to prevent having a composite key which is 4th or 5th NF, if your asking me is it required then I would say to bring it to 5th NF then yes it is.
The sturcture that is left after normalizing will not have many-to-many relationships hence why its called normalized. Your sturcture is not normalized and still has a many-to-many relationship. In my structure there is 3 tables with one-to-many relationships and no many-to-many relationships.
I see both sides but it is highly dependent on the usage and data being stored. If one or more items could be identical then you would have a many-to-many relationship with only 2 tables and a little bit of data redundancy. If all items will be unique from each other then moving to a 3 table design is overkill since you would effectively have 2 almost identical tables (1 including a portion of a row from the other) and unnecessary obfuscation of the data across tables.
OOoo ... a little discussion about schema design. Delicious:
So, if these lists are going to be composed of similar items, particularly wherein the user might benefit from seeing "popular" items or autocomplete functionality, you'll probably want a structure similar to this:
There's a little redundancy -- multiple lists may have the same item listed at the same quantity. Similarly, the ItemCost for any given list of item X may equal the AverageItemCost, but the user is then free to change that cost on their personal list without destroying the relationship to the base item.
ListID | ListName
ItemID | ItemName | AverageItemCost | Desc | Popularity
ListItemID | ListID | ItemID | Qty | ItemCost
So, why have the base item? Well, you certainly don't need it; it is primarily beneficial in implement aggregate functions more efficiently. It's more efficient to pull suggested items from the system if you can pull a SINGLE row that matches certain criteria, rather than pulling MANY rows matching certain criteria and then GROUP()ing, SUM()ing, or COUNT()ing. You need perform your aggregate functions only when a row is added (or modified). The assumption is that the general items list will be SELECTed from more often than it is written to.
Also, depending on the average ItemName and Desc lengths, you could save on space by using that 3rd table (Items). That along may be sufficient justification.
But, if space isn't a concern, which it probably isn't in this case, and you're NEVER going to have pan-user autocomplete, autosuggest, or "popular items" features, then you can probably stand to forgo the more normalized form without any noticeable consequence -- aside from the burning fire in your soul that comes from knowing a more normalized form that performs equally well is just within reach. But, if you're not using any fancy features that benefit from the normalization, and you can cope with that burning fire, use the 2-table form suggested by yamaharuss, as it will result in slightly less complex looking queries (but probably no difference in efficiency):
ListID | ListName
ListItemID | ListID | ItemName | ItemCost | Qty | Desc
ListItemId is actually a pretty important field according to some pretty experiences database folks, provided that it's an auto-incrementing INT (or BIGINT). MyISAM stores the PK anyway, so you may as well expose it. And InnoDB yields greater efficiency a majority of the time for having it on oft-written to tables, since it stores records in order of the PK. An auto-incrementing (big)INT PK instructs InnoDB to write rows in-order as they're INSERTed, avoiding the overhead of reorganizing the base row. From a programmatic perspective it's faster, simpler, and typical/normal/expected to reference rows by PK than by FK1 + FK2 + ... + FKN. And of course, from a purely philosophical perspective, every row effectively has an ID anyway -- so we may as well be explicit about it and provide a concrete, knowable ID (aside form the sum of the information present in the row).
In terms of the expectation that these rows will NEVER be duplicated -- I disagree. We don't have enough information to make that assumption. In the very least, we can probably consolidate ItemName and/or Desc into their own table(s) to approach a more flexible and normalized design. Fields such as Qty and ItemCost may be good bonus fields (aggregate function caches) to cut back on repeating more expensive queries. But in short, the only significant benefit I see in this case for avoiding the more normalized form is to make the queries slightly more readable.
This is contingent, again, on the assumption that ItemName will at least be duplicated to some extent. And since the same ItemName might show up on multiple lists, it's also possible to save on space by also providing a suggested description when a user types in "boots" or something ... We're then left only with the unanswerable philosophical question of whether items with the same name are all the same or items with the same name and description ... The unanswerable question is often best "answered" by introducing 2 more tables, actually:
By doing this, we potentially save on space and allow ourselves more sophisticated auto-complete, auto-suggest, and "popular items" functionality to the users without sacrificing much at all on efficiency.
DescriptionId | DescriptionText | Popularity
ItemDescriptionId | ItemId | DescriptionId | Popularity
Just thought I'd take a hack at this discussion:
First a few things:
makes me think mysql SET type. Instead say 4 pieces of information- or 4 fields; less confusing that way.
4 standard bits of data
Of your A/B/C suggestions B is the best, but still not acceptable.
Now your question about "a better solution". You can do this:
serialized_list would be a blob- serialize the data at the application level- make a wrapper class for it. The list could be linked, doubly-linked, an array, who cares. The ID's (PK/FK) mean nothing to the application layer, and likewise the serialized_list means nothing to the database.
user_ID | userName | password | realName
list_ID | user_ID | listName | serializedList
My schema wont support this (Ribeyed's schema is better for it) but:
Since MySQL is not made to create a hierarchical set of objects, is it possible to do this.
You can SELECT XML using a query; get comfortable with sql + normalization first. I don't want to explain it- it depends on how normalized your tables are and what you want the XML to look like. You should be able to select the entire XML document in 1 sweep, otherwise you are not a dolphin tamer. I would rather write a very complex query and get exactly what I want. Otherwise you're just moving the complexity around, I find that making simpler parts means making the code harder to maintain. I Don't ask mySQL for the simplest of data- ask it for exactly what you want.
I like svidgen's schema the least- he is making assumptions about what the lists are going to be used for. You should not be storing meta-data in your database unless there is an exceptional reason- you should not be making reasons up. To be fair, he makes a good point about myISAM tables- you should always make the PK explicit since it's inherent.
you should be using a BTREE.
So, if these lists are going to be composed of similar items
Between the suggestions proposed by Ribeyed and yamaharuss: I would go with Ribeyed's implementation, but stress that either could potentially be better. There is no 1 best way
Last edited by eval(BadCode); 03-12-2011 at 07:34 PM.
... point is, it depends entirely on how the data is going to be used. And we don't have enough information here to produce the "right" answer.
Users Browsing this Thread
There are currently 1 users browsing this thread. (0 members and 1 guests)
Tags for this Thread