Click to See Complete Forum and Search --> : MySQL performance on big amount of data


tim_kinder
07-05-2006, 03:09 AM
Hello,

I'm trying to run statistics on visiting logs of relatively busy site (few thousands unique ip's per day, about 10 registered clicks whithin the site from each of them - after 1 year it'll make ~5 millions records in one table and ~50 millions in another).

I have a dedicated computer (P4 3 Ghz, 500Mb RAM) running Windows XP SP2, IIS (just to provide a web interface to MySQL), MySQL 5.0.22 (set up as default). Currently there are ~ 350,000 records in ip's table and ~3,500,000 in click's table (both MyISAM with indexes on most of the fields) - and it runs queries very slow, much slower than I'd expect (5-10 sec. and more for simple queries). And I'm affraid what will happen in one year time.

Now the questions. Is MySQL suitable to operate with such amounts of data?
How can I increase the perfomance and make sure there are no hidden rocks?

Will appreciate any help.

Regards,

Tim Kinder

NogDog
07-05-2006, 10:39 AM
You might want to use EXPLAIN (http://dev.mysql.com/doc/refman/4.1/en/explain.html) on your queries to find out if any of those indexes are not being used, in which case getting rid of those unused indexes may help a bit. There is lots of other information to review in the Optimization section (http://dev.mysql.com/doc/refman/4.1/en/optimization.html) of the manual - it's a non-trivial subject (which is why a good DBA can demand a good-sized salary. :) ).

chazzy
07-06-2006, 08:20 PM
I think the problem lies here...

with indexes on most of the fields

I had a discussion once w/ a programmer who was telling me to individually index each column. He was telling me that his query could use anywhere from 1-3 different columns in the where clause and that since we can't predict the combination to use (in reality there are 7 different situations) it would be more efficient to just have 3 different indexes - colA, colB, and colC indexes. I argued with him to say that it would improve average seek time to have the 7 different indexes - since our requirements made clear that the 2 column and 3 column options were much more likely than the 1 column options. We decided to run a load test on the database - same entries on both, 1 night it had just the 3 single column indexes on it and the other night it had all 7 combinations on.

Wouldn't you believe that the 7 indexes, even though they were mostly trivial, out performed in the load test in return times for the queries (same queries issued). With the single columns on, all queries with just 1 column in the where returned very fast - but anything with more than 1 column was very slow.

So, to cut a long story short, while it might make sense to have "indexes on each column", if you have a where clause with multiple columns, then those multiple columns need to be indexed as a single key.

mattyblah
07-07-2006, 11:54 AM
If you could monitor what queries were being run it would be much more efficient. The problem with 7 different indexes is that though search results would be fast, inserts would be terribly slow.