www.webdeveloper.com
Results 1 to 14 of 14

Thread: Data Structure

Hybrid View

  1. #1
    Join Date
    Feb 2011
    Posts
    24

    Data Structure

    I've never used MySQL before, and I'm wondering if I could use it for the following task:

    I am making a website where people can go to it and basically submit a shopping list. They will give the list a Name such as "Sue's List" and then they can enter 1-15 items, and for each item they enter 4 standard bits of data which can be applied to any item IE:
    1)Name of item
    2)Cost
    3)Quantity needed
    4)Description of Item

    From there I want to have another webpage which is capable of displaying all the lists that have been created, and when clicked, it will then show the items in that list.

    Since MySQL is not made to create a hierarchal set of objects, is it possible to do this.

    Three possible work arounds that I was thinking about would be to either
    A)Create a new table for each shopping list that is created, and then a master table for these subtables. Is there a limit to the number of tables that can be created though? If 100 different people create a list, then it may get pretty cluttered with tables. Also, is there even a way to store a reference to a table as an element in another table

    B)Create a table of all the items from all the shopping lists and give each a reference number, then create a master table of the shopping list names IE "Sue's List" and a list of the associated reference numbers

    C)Create one table where each element has 61 Fields, where the fields are:
    1)Name of List
    2)Item 1 Name
    3)Item 1 Cost
    4)Item 1 Quantity
    5)Item 1 Description
    6)Item 2 Name
    .......
    61)Item 15 Description
    and if there are only, say, 2 items in the shopping list, then the rest of the fields are left null--It seems really inefficient to me though.

    Would it be possible to do this in MySQL. If so, which method would be best, or is there a better method altogether? Is there something better to do this in than MySQL?

    Thanks,
    Alex

  2. #2
    Join Date
    Jan 2003
    Location
    Dundee, Scotland
    Posts
    1,367
    Hi Alex,

    In the database you have a set of tables, the tables and represent objects in your system. You can have relationships between tables to help define a link between the different sets of data in the tables between the relationships. It’s these relationships which allow you to represent a child parent relationship with the data.

    A relationship is formed in the table by declaring a field/column in a table as a primary key and the corresponding field/column in the relational table is declared as a foreign key.
    To understand how to structure you tables you should do some research on normalization. The number of tables you can have is limited by the hardware and disk space to support the amount of tables.

    In the scenario above you can have a table with your items with your 4 columns. Each item is represented as a row in that table not a column as you suggested in c). You have another table which will hold your list ID and the list name. You need a 3rd table to represent the relationship between the items and the lists bas a list has many items and an item can belong to more than one list therefore it is called a many-to-many relationship. Your 3rd table holds the primary key from your items table and the primary key of the list table but in this table they are called foreign key. This table will also have a primary key.
    Regards

    Ribs

  3. #3
    Join Date
    Jan 2008
    Location
    Florida
    Posts
    1,227
    If the list of items is not a set list, then you only need two tables, not three as suggested. Three tables would be used if the list of 15 was selected from a longer list of pre-set items.

    In your case, your structure would be:

    Lists
    ListID | ListName

    ListItems
    ListItemID | ListID | ItemName | ItemCost | Qty | Desc

    That' s it. You may want to add a MemberID type field to the Lists table or whatever schema you are using to remember the users.

  4. #4
    Join Date
    Jan 2003
    Location
    Dundee, Scotland
    Posts
    1,367
    If the list of items is not a set list, then you only need two tables
    If you noticed the 3rd table was for the many-to-many relationship between items and the itemlist. With a 2 table structure like you suggested then items would be repeated in your ListItems table which is bad normalization.

    regards


    Ribs

  5. #5
    Join Date
    Jan 2008
    Location
    Florida
    Posts
    1,227
    If user #1 creates a list of 15 unique items, user #2 creates a list of 15 unique items, etc, etc.. where will you ever get a list item that repeats?

    I do not believe the OP mentioned displaying a list to choose from. All items are being created as a uniquely named item.

    Using a third many-to-many table for this scenario is unnecessarily redundant as you will always have the same number of records in each table.

  6. #6
    Join Date
    Jan 2003
    Location
    Dundee, Scotland
    Posts
    1,367
    Hi yamaharuss,


    Difference in opinions here I guess. Or maybe different in opinion on what the OP is trying to achieve.

    From what you have with your 2 tables then data would be something like this:
    Code:
    Lists
    ListID | Name
    1 | list 1
    2 | list 2
    3 | list 3
    
    ListItems
    ListItemID | ListID | ItemName |ItemCost | Qty | Desc
    1 | 1 | Item 1 | 2.00 | 2 | description of item 1
    2 | 1 | Item 2 | 3.00 | 2 | description of item 2
    3 | 1 | Item 3 | 1.00 | 5 | description of item 3
    4 | 2 | Item 1 | 2.00 | 2 | description of item 1
    5 | 2 | Item 2 | 3.00 | 2 | description of item 2
    6 | 2 | Item 3 | 1.00 | 5 | description of item 3
    7 | 3 | Item 1 | 2.00 | 2 | description of item 1
    8 | 3 | Item 2 | 3.00 | 2 | description of item 2
    9 | 3 | Item 3 | 1.00 | 5 | description of item 3
    Imo that’s not normalized correctly.

    I would be doing the following:
    Code:
    List
    ListID | ListName
    
    Items
    ItemID | ItemName | ItemCost | Qty | Desc
    
    ListItems
    ListItemID | ItemID | ListID
    Regards

    Ribs

  7. #7
    Join Date
    Jan 2008
    Location
    Florida
    Posts
    1,227
    Will it work? Yes, but still not necessary.. because it is NOT a many-to-many table.

    You are creating a duplicate ONE-to-many-table. If a many-to-many table would work, than it would certainly be suggested. Your third table will NEVER have duplicate data with a foreign key.. that makes it unnecessary.

  8. #8
    Join Date
    Jan 2003
    Location
    Dundee, Scotland
    Posts
    1,367
    Hi yamaharuss,

    I disagree with you that the relationship between a List of Items and the Items is NOT a many-to-many situation. You have 1 or more lists, each list can have 1 or more items belonging to it. Many lists to many items, many items to many lists, clear as day thats a many-to-many relationship.

    Normalizing the many-to-many relationship requires a third table to serve as a link between the 2 tables in the relationship. Once a thrid table is added the relationships between the relational table and the tables making the relationship become one-to-many relationships.

    Your tables can work without a third but then it would not be normalized correctly, Description and itemname would be repeated over and over.

    Anyway this is getting of topic and I only posted to help Alex get a better understanding of a relational database and I was trying to avoid a lengthy discussion on database structure.

    regards


    Ribs

  9. #9
    Join Date
    Jan 2008
    Location
    Florida
    Posts
    1,227
    It's not off-topic if we are teaching the proper use of normalization. It's healthy discussion and may help someone on the future if not the OP.

    To better understand your concept of many-to-many, can we agree your ID field ListItemID is not needed? If it is, please tell us why.

    If ListItemID is removed from the third table, we are left with ItemID and ListID

    ListID will be duplicated, but ItemID will NEVER, EVER be duplicated. That, sir, is a one-to-many table. There is no need for any other identifier which would make this a many-to-many table.

    I wish Bill Wilkinson would chime in here.

  10. #10
    Join Date
    Jan 2003
    Location
    Dundee, Scotland
    Posts
    1,367
    Hi Yamaharuss,


    ListITemID in my 3rd table is required to prevent having a composite key which is 4th or 5th NF, if your asking me is it required then I would say to bring it to 5th NF then yes it is.

    The sturcture that is left after normalizing will not have many-to-many relationships hence why its called normalized. Your sturcture is not normalized and still has a many-to-many relationship. In my structure there is 3 tables with one-to-many relationships and no many-to-many relationships.

    regards


    Ribs

  11. #11
    Join Date
    Jan 2009
    Posts
    3,346
    I see both sides but it is highly dependent on the usage and data being stored. If one or more items could be identical then you would have a many-to-many relationship with only 2 tables and a little bit of data redundancy. If all items will be unique from each other then moving to a 3 table design is overkill since you would effectively have 2 almost identical tables (1 including a portion of a row from the other) and unnecessary obfuscation of the data across tables.

  12. #12
    Join Date
    Jan 2007
    Location
    Wisconsin
    Posts
    2,120
    OOoo ... a little discussion about schema design. Delicious:

    So, if these lists are going to be composed of similar items, particularly wherein the user might benefit from seeing "popular" items or autocomplete functionality, you'll probably want a structure similar to this:

    List
    ListID | ListName

    Items
    ItemID | ItemName | AverageItemCost | Desc | Popularity

    ListItems
    ListItemID | ListID | ItemID | Qty | ItemCost
    There's a little redundancy -- multiple lists may have the same item listed at the same quantity. Similarly, the ItemCost for any given list of item X may equal the AverageItemCost, but the user is then free to change that cost on their personal list without destroying the relationship to the base item.

    So, why have the base item? Well, you certainly don't need it; it is primarily beneficial in implement aggregate functions more efficiently. It's more efficient to pull suggested items from the system if you can pull a SINGLE row that matches certain criteria, rather than pulling MANY rows matching certain criteria and then GROUP()ing, SUM()ing, or COUNT()ing. You need perform your aggregate functions only when a row is added (or modified). The assumption is that the general items list will be SELECTed from more often than it is written to.

    Also, depending on the average ItemName and Desc lengths, you could save on space by using that 3rd table (Items). That along may be sufficient justification.

    But, if space isn't a concern, which it probably isn't in this case, and you're NEVER going to have pan-user autocomplete, autosuggest, or "popular items" features, then you can probably stand to forgo the more normalized form without any noticeable consequence -- aside from the burning fire in your soul that comes from knowing a more normalized form that performs equally well is just within reach. But, if you're not using any fancy features that benefit from the normalization, and you can cope with that burning fire, use the 2-table form suggested by yamaharuss, as it will result in slightly less complex looking queries (but probably no difference in efficiency):

    Lists
    ListID | ListName

    ListItems
    ListItemID | ListID | ItemName | ItemCost | Qty | Desc

    @yamaharuss:

    ListItemId is actually a pretty important field according to some pretty experiences database folks, provided that it's an auto-incrementing INT (or BIGINT). MyISAM stores the PK anyway, so you may as well expose it. And InnoDB yields greater efficiency a majority of the time for having it on oft-written to tables, since it stores records in order of the PK. An auto-incrementing (big)INT PK instructs InnoDB to write rows in-order as they're INSERTed, avoiding the overhead of reorganizing the base row. From a programmatic perspective it's faster, simpler, and typical/normal/expected to reference rows by PK than by FK1 + FK2 + ... + FKN. And of course, from a purely philosophical perspective, every row effectively has an ID anyway -- so we may as well be explicit about it and provide a concrete, knowable ID (aside form the sum of the information present in the row).

    In terms of the expectation that these rows will NEVER be duplicated -- I disagree. We don't have enough information to make that assumption. In the very least, we can probably consolidate ItemName and/or Desc into their own table(s) to approach a more flexible and normalized design. Fields such as Qty and ItemCost may be good bonus fields (aggregate function caches) to cut back on repeating more expensive queries. But in short, the only significant benefit I see in this case for avoiding the more normalized form is to make the queries slightly more readable.

    This is contingent, again, on the assumption that ItemName will at least be duplicated to some extent. And since the same ItemName might show up on multiple lists, it's also possible to save on space by also providing a suggested description when a user types in "boots" or something ... We're then left only with the unanswerable philosophical question of whether items with the same name are all the same or items with the same name and description ... The unanswerable question is often best "answered" by introducing 2 more tables, actually:

    Descriptions
    DescriptionId | DescriptionText | Popularity

    ItemDescriptions
    ItemDescriptionId | ItemId | DescriptionId | Popularity
    By doing this, we potentially save on space and allow ourselves more sophisticated auto-complete, auto-suggest, and "popular items" functionality to the users without sacrificing much at all on efficiency.

  13. #13
    Join Date
    Jul 2010
    Location
    /ramdisk/
    Posts
    865
    Just thought I'd take a hack at this discussion:

    First a few things:
    Code:
    4 standard bits of data
    makes me think mysql SET type. Instead say 4 pieces of information- or 4 fields; less confusing that way.

    Of your A/B/C suggestions B is the best, but still not acceptable.

    Now your question about "a better solution". You can do this:

    Code:
    table User:
    user_ID | userName | password | realName
    
    table List:
    list_ID | user_ID | listName | serializedList
    serialized_list would be a blob- serialize the data at the application level- make a wrapper class for it. The list could be linked, doubly-linked, an array, who cares. The ID's (PK/FK) mean nothing to the application layer, and likewise the serialized_list means nothing to the database.

    Since MySQL is not made to create a hierarchical set of objects, is it possible to do this.
    My schema wont support this (Ribeyed's schema is better for it) but:

    You can SELECT XML using a query; get comfortable with sql + normalization first. I don't want to explain it- it depends on how normalized your tables are and what you want the XML to look like. You should be able to select the entire XML document in 1 sweep, otherwise you are not a dolphin tamer. I would rather write a very complex query and get exactly what I want. Otherwise you're just moving the complexity around, I find that making simpler parts means making the code harder to maintain. I Don't ask mySQL for the simplest of data- ask it for exactly what you want.

    Edit:
    I like svidgen's schema the least- he is making assumptions about what the lists are going to be used for. You should not be storing meta-data in your database unless there is an exceptional reason- you should not be making reasons up. To be fair, he makes a good point about myISAM tables- you should always make the PK explicit since it's inherent.

    So, if these lists are going to be composed of similar items
    you should be using a BTREE.

    Between the suggestions proposed by Ribeyed and yamaharuss: I would go with Ribeyed's implementation, but stress that either could potentially be better. There is no 1 best way
    Last edited by eval(BadCode); 03-12-2011 at 08:34 PM.

  14. #14
    Join Date
    Jan 2007
    Location
    Wisconsin
    Posts
    2,120
    ... point is, it depends entirely on how the data is going to be used. And we don't have enough information here to produce the "right" answer.

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Tags for this Thread

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  
HTML5 Development Center



Recent Articles