Click to See Complete Forum and Search --> : spliting database into 2
drewex
03-31-2004, 01:36 PM
Hi all
I have an enormous sized database that i need to separate into 2 pieces ineed the system to look in 1 database first and if it cant find data take a look in the second one. Is this possible is it possible to split it and show it as one.
PeOfEo
03-31-2004, 05:16 PM
why? How big are we talking about. You can move data around and populate tables, but as for making a new data base... not very likely. It depends on what you are using, if you are using access forget it, it might be possible but I have never seen it done or ever heard of anything like tha tbeing possible. If is xml you can append it like a text file, no problem there, if its sql server you might be able to create a whole new data base, but I am not sure, I know you can create new tables though. But if you are using mssql server or msde it would be pointless to break up the data, they run much faster then access or xml, it would be more efficient to keep it as one, thats probably why I have never seen it. I know vb applications can create new data bases, but I do not know if the aspnet system account is able to do this.
drewex
03-31-2004, 05:22 PM
I got sql server 7 running and the database is at least 25 gb. I have data till 2000 and I want to separate them by year. Move old data in one place. and use it if its neeeded.
there are 5 10 tables that have to be changed. and most of them have triggers. and the previous programmer used Sp to do everything.
Im not really sure for using to databases maybe two tables but that could cause biiig amount of coding. I though there could be a easy way to this in spilting process or something.
Can you help?
PeOfEo
03-31-2004, 09:50 PM
oh wow... 25gb. Ok, yea I got cha. I have to say I have never done anything like this before so I have no idea how to do it myself, but I do know it can be done, I have seen it done, and I googled it. This might be of some use
http://www.aspfree.com/c/a/ASP/Creating-And-Altering-Tables-In-Microsoft-SQL-SERVER-2000/2/
I think you are going to use an sql statement to select all of the old stuff, maybe in a loop playing off of your identity feild and then put the old data in your new table and delete it as you go. This could take a bit of time to run with that much data. But I think trying to move it all at one time, well I can't think of how you would do it #1, but if you can that is going to take a bit of time too. No matter what script is going to take a noticable amount of time for the server to run... infact could be a good while moving 10+ gb around. So how often are you going to fire this?
drewex
04-01-2004, 11:19 AM
The creating and the other codes are easy the hard part is if it cant find it in one place it has to look in the old. See come think about it it looks impossible. but have to find away to do it. I have all the time i want. My job is to get this database to run faster. So my plan is to create an archive database. I it cant find it in the live database the system must have to look in the old and work from there. The problem is there is 2 different system thats using this database. One access code with vb script. and Asp.net. i wroth the asp.net part so i can change it but the access part is under someone elses hand and I think he cant do the changes i can do.
Still i haveto find a way to create split and move, realate to each other and get them workin togher. that sounds IMPOSSIBLE. Dont know what do to. Thanks anyaway.
CardboardHammer
04-01-2004, 02:28 PM
-Break the large tables into multiple tables having the same structure (and make that the table names are all different from the originals).
-Define partitioned views (having the name of the original tables) that logically combine the individual tables back into the original table.
-NOTE: Some complex queries can't use views under certain circumstances, so watch your step.
--See "Using Views with Partitioned Data" in SQL Server Books Online for more info.
In terms of getting better performance in general:
-upgrade and/or better utilize the I/O subsystems
-use indexes, wisely
-review code for inefficiencies
-review how the data is stored: Are there redundancies that can be eliminated, overly large field sizes, numeric data being stored as text, etc.?
-See SQL Server Books Online for more info
drewex
04-01-2004, 03:51 PM
that sound good. after researching for a while on the web i found
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/acdata/ac_8_qd_10_2z4x.asp
andfragmenting sound good but still i haave problems. I got around 900 sp that does every thing. when i had to add a parameter to all sp i had to work stopless 10 days on it. Now im looking dor an easy way. At least a example what to do. This looks really biiig work but dont know what to do. and suggestions.
PeOfEo
04-01-2004, 05:11 PM
good to see you are back w/ us cardboard :D
CardboardHammer
04-05-2004, 10:14 AM
Originally posted by PeOfEo
good to see you are back w/ us cardboard :D
Been busy/stressed/etc. (and have been doing more DB work than web, though that might change a bit) Will wander through from time to time, but unlikely to resume my prior level of participation.
CardboardHammer
04-05-2004, 10:36 AM
Originally posted by drewex
that sound good. after researching for a while on the web i found
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/acdata/ac_8_qd_10_2z4x.asp
andfragmenting sound good but still i haave problems. I got around 900 sp that does every thing. when i had to add a parameter to all sp i had to work stopless 10 days on it. Now im looking dor an easy way. At least a example what to do. This looks really biiig work but dont know what to do. and suggestions.
What you're working on is big to the point where an "easy fix" may not exist...
Tweaking the indexes MAY be your best bet at the moment for getting the best boost for the least effort...
Without more info, it's hard to say what's the best way to go on this...
Is that 510 tables, or 5 to 10 tables? 900 stored procedures... are there any redundancies (can you fold many into one, do several of them all do similar tasks which could be pulled out and made into a stored procedure which could then be called by the stored procedures that require those task done)? What IS the data? How is it used? How is it being added to / updated? Are there times when the system can be taken offline (nights or weekends, for example)? What is the hardware it is being run on? Is there any money available for hardware upgrades? Do you have test system(s) available? Are there any constraints you can put on the system to get better performance (I've got a system that puts reports online (among other things it does) and I was able to specify that the reports only update daily, which enables me to prepare the data nightly in a DTS package and transfer it to a second DB server (that the reports are then pulled from), which makes everything work nicer and MUCH faster.)? Which actions are the slowest, most frequent, most critical, etc.? And so on...
On a slightly different note, you should be glad the last guy used stored procedures... at least you don't have to look all over the place to find all the queries.
drewex
04-05-2004, 11:06 AM
Fisrt, thank you for your help.
Ill answer your questions one by one.
Is that 510 tables, or 5 to 10 tables?
There are around 10 tables. That hold main data
900 stored procedures... are there any redundancies?
Yes there is alot redundancies. The old programmer made lots of useless SP's. Sometimes it even goes in to a recursive turns.
What IS the data? How is it used?
The data ticket information. My company is processing parking tickets.
How is it being added to / updated?
Meaning?
Are there times when the system can be taken offline (nights or weekends, for example)?
There is always some processes running on it. But mostly the work is done on work hours (weekdays 8 - 5).
What is the hardware it is being run on? Is there any money available for hardware upgrades?
They just upgraded the system 9 months ago. To
Dual intel 2.8 (if im right of the speed),
2GB Memory
Enough hard drive
Do you have test system(s) available?
Yes I do. Have a place that i can play as much I want.
Are there any constraints you can put on the system to get better performance (I've got a system that puts reports online (among other things it does) and I was able to specify that the reports only update daily, which enables me to prepare the data nightly in a DTS package and transfer it to a second DB server (that the reports are then pulled from), which makes everything work nicer and MUCH faster.)?
Which actions are the slowest, most frequent, most critical, etc.?
The indexing isnt very well i think there are dozen of indexes in every table for different elements.
Im thinking this getting the database into 2 peices is a good idea. But im not sure if i have to change all the sp's. this is hard to get.
Thanks for your hhelp.
CardboardHammer
04-05-2004, 02:18 PM
OK...
10 tables with 75GB of data makes around 7.5GB per table, on average... That seems a bit much... Are images of the tickets being stored in the db? Or are you in the neighborhood of 100 million records?
"How is it being added to / updated?" Presumably, records are being updated from time to time (which is clear now, given what the system is). You're inserting/updating data one record at a time or in daily batch jobs, or ...?
More memory could help quite a bit. The less disk IO required, the better.
"Enough hard drive" There may be enough storage, but the actual hardware and how it is set up can have a major impact. If all the disk space is pooled as a RAID 5 array, that's a bad thing, performancewise, for a db. Ideally, the db files would be divided up among several independant drives (or pairs of drives, if using RAID 1 for redundancy (which is likely critical)). This enables multiple concurrent I/O operations (as opposed to multiple concurrent outstanding I/O requests, which is not the same thing), which will improve performance significantly, the degree being dependant on the choice of how the files are distributed.
Overuse and poor choice of columns to index can lead to degradation of performance. (More indexes can favor SELECT performance, at the penalty of adversely impacting INSERT, UPDATE and DELETE performance.) See "Designing an Index" an SQL Server Books Online for details. Tweaking the indexes you have at the moment could make things better or it could make them worse. It's a bit of a roll of the dice and the question to ask is "how likely is it that the last guy did it well?" You can dabble with this on your test system to get an idea what the consequences of changes will be, but completely cloning the live experience is likely impractical, so you may encounter more/less performance impact when going from test to live (so do changes to the live system on a Thurs. night and if performance sucks the next day, you can flip back to some/all of the old indexes on Fri. night, having only impacted what is likely the least productive day of the work week, if I guess correctly).
You can try cleaning up the stored procedure situation, but you'll need to be careful not to break anything along the way... Personnally, I'd start with the most frequently used procedures and find and fix significant inefficiencies. I wouldn't delete any stored procedure until I were certain it was completely without use (an idea to test to see if a procedure is unused would be to add another table (called "SP_usage", or something like that) and add an INSERT statement to every procedure that you think is no longer in use that will put a record in that table indicating the procedure called (and perhaps when, as well, in order to help tracking down where it's being called from). If after X amount of time passes and the procedures haven't been called, then toss them (X being an amount of time in which all branches of code should have been reached at least once).).
The recursive areas may be solid candidates for reworking, but just because there is recursion doesn't mean it is poorly done. Some problems just by nature have recursive solutions, and stored procedures are probably too high level to make managing the stack oneself a worthwhile performace boost (in C/C++, the cost of a function call can be worth avoiding recursive calls (depending on how often they'd happen and how deep the recursion goes), but one usually needs to maintain ones own stack to keep everything straight (except for the case of tail recursion, which no programmer worth a damn should be doing in the first place)).
With 900 stored procedures, it may be difficult (or wishful thinking), but it may prove helpful to map out which stored procedure uses what tables how (INSERT/UPDATE/etc., and which columns are used) and calls what other stored procedures (and perhaps also add which application uses each procedure and where, trigger/index info, etc.). You could pack all this info into its own db and, if kept up to date, be able to see where things connect so that the impact of a change can be better predicted before it is made (plus, you'd be able to tell when something no longer has anything dependant on it and is safe to drop). I'm somewhat tempted to do similar things with my own system, but making my system easier to handle would make me more expendable... politics are perverse, but why should I let my boss be the only one to play the game?