Click to See Complete Forum and Search --> : Storing and querying user information efficiently


aditya.internet
05-01-2008, 11:09 AM
Hi,

I have implemented user information storage in following manner:


profiles
---------
id
email
password

profile_attributes
------------------
id
name
field_type

profile_options
----------------
id
profile_attribute_id
name

profile_details
---------------
profile_id // This record is of which profile
profile_attribute_id // This record indicates which attribute (gender etc)
profile_option_id // This value is set if the user selected from drop down
value // This value is filled when a user enters a text value


So this means my profile_attributes table has all the attributes for a profile.
When storing the data for a particular attribute, references are stored in "profile_details" table.

For example:

profiles
--------
id: 1
email: boy@website.com

profile_attributes
----------------------
id: 1
name: gender
field_type: select

profile_options
----------------
id: 1
profile_attribute_id: 1
name: Male

id: 2
profile_attribute_id: 1
name: Female

profile_details
---------------
id: 1
profile_id: 1
profile_attribute_id: 1
profile_options_id : 1
value: null



I would like to know if there is any down side of having this implementation. I have gone ahead with this approach keeping in mind scalability factors in mind and few other important things. I dint wanted to have a table with columns of each attribute coming out to be 30 columns plus.

And the most important thing I need help with is how should I query profiles with certain attributes.

Like all the profiles with gender male and country Germany.

Now I am not sure how to construct a query for that.
I am looking forward to doing this in a single query but have multiple where clauses is creating a problem in fetching results.


select from profiles
join profile_details on(profiles.id = profile_details.profile_id)
where (gender = 'Male') and (country = 'Germany')

This fails as it tries to get profiles with joining profile_details and looks for records which matches all the where clauses that is not possible a profile_detail to match both gender and country condition.

And if I change the clauses to "or" comparison then it gets me both:
gender: male and country: any
gender: any and country: Germany

There are several other ways to go around doing this like sub queries etc but I am looking for a better approach to get this working.

I will be glad to add clarifications if required.


Thanks in advance..

chazzy
05-01-2008, 12:49 PM
I think you want to use an INTERSECT for this one.


select * from profiles where profile_id in (
select profile_id from profile_Details where profile_attribute_id = 1 and profile_options_id=1
INTERSECT
select profile_id from profile_Details where profile_attribute_id = 2 and profile_options_id=3)

That's assuming that attribute 2 is country and option 3 is Germany.

As for if this approach is good or not, most people would say no. From a normalization standpoint it's great (except that you don't give option for localization). From a performance standpoint it's slow (because you'll have to join many tables to get a single "Object" back) and it makes the data very hard to read.

aditya.internet
05-02-2008, 06:19 AM
Hi Chazzy,

Thanks for your reply and your feedback.
As you were saying that it will be low on performance then can you please suggest some another implementation that is better for this requirement. But scalability and performance are both of very high importance for me while designing database.

As for localization we already have another system in place which can handle multiple languages.

I am really not sure if what you suggested is the best option for getting results as it has sub queries in a way. So it might be a good idea to have more than one query to get the results.

I am also thinking of using indexing solution like ferret and may be that will solve the problem here.

Once again thanks a lot for taking out time and replying back to my message.

chazzy
05-02-2008, 09:24 AM
Even though you might structure your database in this way, it doesn't meant that you'll never have to change your application. There's really nothing about this structure that makes it more scalable than a single profile table w/ all of the data there as columns (30 columns isn't a lot, and I take scalability as defined here (http://en.wikipedia.org/wiki/Scalability)). Your design has more to do with long term maintainability of the system - make it easier to add new features.

Even with that said, your database has a number of flaws. You need to realize these facts, as in your current structure you can't handle them properly:

- Foreign Key constraints between tables will always fail. (let's say you wanted to add a favorite product attribute to the profile - you can't use it directly as a FK)
- You can't require any fields from the database level.
- Duplicates require you to have a multi-column foreign key.
- Updates are harder to do when updating multiple rows in this table, vs updating multiple columns.

It is true that adding new attributes to the profile table requires no altering of the database, but it lends itself to incredibly large SQL statements to get the same data that could be gotten like this:

select * from profiles where profile_id = 1;


As for the comment about the query, it's not efficient, no, but it's the only solution in your given structure.