Quote Originally Posted by svidgen View Post
For instance, I would argue that storing colors as a comma-delimited string lowers the complexity of accessing a produce from O((M/N)*Log(M)*Log(N)) to (Log(M)) where N is the total number of color assignments used (relationships between a color and a product) and M is the number of products in the products table.

Log(N) is the cost of the product lookup, assuming a BTREE index.
Log(M) is the cost of the color lookup, assuming a BTREE index.
M/N is the average cost of joining the product to its colors.

So, by denormalizing the data, you're looking at saving a M/N Log(N) lookups, each of which will be a higher cost than the single Log(M) lookup needed for the denormalized form.
However this is not the end of the story because you now have to post process all your data to get it in a useable form. You also have potential issues with large indexes from TEXT fields used to store the CSV (Postgre in particular has issues), your tables are locked whilst maintaining these larger indexes, and you also have to use more storage for the TEXT field when you could have used an INT. There are more factors involved here.

The cost is the inability to efficiently search, update, or remove single colors globally across all product rows.
That's certainly one of the other costs, though there are more.