I recently developed an algorithm for average-rating calculations in content rating systems. It lets the average rating be recalculated on each new vote using only simple arithmetic, without retrieving the previous ratings one by one from the database, summing them, and dividing by their count, as most traditional solutions based on the standard arithmetic-mean equation do.
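The post doesn't show the formula itself, but the standard incremental (running) mean fits the description: given the current average and vote count, a new vote can be folded in with one multiplication, one addition, and one division. A minimal sketch, assuming this is the kind of update meant:

```python
def updated_average(old_avg, vote_count, new_rating):
    """Incremental arithmetic mean: fold one new rating into an
    existing average without re-reading the individual ratings.

    Assumption: this is the standard running-mean formula, used here
    to illustrate the idea; the post's exact algorithm isn't shown.
    """
    return (old_avg * vote_count + new_rating) / (vote_count + 1)

# Folding in ratings 4, 5, 3 one at a time:
avg = updated_average(0.0, 0, 4)   # average of [4] -> 4.0
avg = updated_average(avg, 1, 5)   # average of [4, 5] -> 4.5
avg = updated_average(avg, 2, 3)   # average of [4, 5, 3] -> 4.0
```

Only the current average and the vote count need to be stored per item, so the cost per new vote is constant regardless of how many ratings exist.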
Yes, I ran a quick benchmark on a small data set (12 rows). For each test I inserted a new grade into the database for a given ID, recalculated the new average rating for that ID, and output it to the screen.
Using MySQL AVG(): Finished in 0.001843 seconds
Using my algorithm: Finished in 0.001113 seconds
That's a 0.0007-second difference with only 12 rows, and with the database server and the web server located on the same machine.
That's without retrieving the total number of ratings that make up the average for the given ID. To get that as well, I would have to execute a statement like this:
SELECT AVG(rating) AS avg_rating, COUNT(rating) AS votes_count
FROM <ratings_table>
WHERE some_id = <some_value>
GROUP BY some_id
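The incremental alternative sidesteps that aggregation entirely: store the (average, count) pair per ID in a summary row and update it in place on every vote. A sketch using SQLite in place of MySQL; the table and column names (`rating_summary`, `some_id`, `avg_rating`, `votes_count`) are illustrative assumptions, not from the original post:

```python
import sqlite3

# One summary row per rated item holds the running average and count,
# so reads and updates never scan the individual ratings.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE rating_summary (
                    some_id     INTEGER PRIMARY KEY,
                    avg_rating  REAL    NOT NULL,
                    votes_count INTEGER NOT NULL)""")
conn.execute("INSERT INTO rating_summary VALUES (1, 4.0, 3)")

def add_vote(conn, some_id, rating):
    # A single UPDATE applies the incremental-mean formula in SQL.
    conn.execute("""UPDATE rating_summary
                    SET avg_rating  = (avg_rating * votes_count + ?)
                                      / (votes_count + 1),
                        votes_count = votes_count + 1
                    WHERE some_id = ?""", (rating, some_id))

add_vote(conn, 1, 5)
avg, n = conn.execute(
    "SELECT avg_rating, votes_count FROM rating_summary WHERE some_id = 1"
).fetchone()
# (4.0 * 3 + 5) / 4 = 4.25 across 4 votes
```

Both the average and the vote count come back from a single primary-key lookup, with no AVG() or COUNT() work at read time.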
In which case the benchmark results on the same data set are as follows:
Using MySQL AVG(): Finished in 0.001865 seconds
Using my algorithm: Finished in 0.001128 seconds
0.0007 seconds may seem like a tiny difference on small data sets, but imagine you're dealing with a popular website with a few hundred million unique visitors per month...
0.0007 seconds may seem like a tiny difference on small data sets
That may just be experimental variance (a delay caused by a hardware interrupt, for example), so you need to test on a large amount of data to obtain any meaningful comparison. It is of little use to compare times shorter than the time quantum the scheduler allots to each process (usually 30-50 ms), so you'd better generate data sets that take about one second to query...
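One way to follow that advice without generating huge data sets is to repeat the measured operation until the total wall-clock time is well above scheduler granularity, then divide by the repetition count. A hedged sketch with the standard `timeit` module (the values 4.2, 137 and 5 are arbitrary illustrative inputs):

```python
import timeit

# Repeat the update a million times so the total runtime dwarfs
# scheduler noise (~tens of ms), then report the per-call cost.
REPS = 1_000_000
total = timeit.timeit(
    stmt="(old_avg * n + new_rating) / (n + 1)",
    setup="old_avg, n, new_rating = 4.2, 137, 5",
    number=REPS,
)
per_call = total / REPS
print(f"{total:.3f}s total, {per_call * 1e9:.1f} ns per update")
```

The same principle applies to the SQL side: time a loop of many queries, not a single one, before drawing conclusions from sub-millisecond differences.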