harlock
01-23-2005, 05:49 PM
This is more of a MySQL question than a PHP question, but the two go hand in hand in this case.
I am writing a web based app with PHP and MySQL that accepts resumes online. The software asks some basic questions to help categorize the submission, and then stashes the user information along with their resume (uploaded via a form) where it can be retrieved by an HR dept. The resumes can be .doc or .pdf files.
The question is - Should I store the files as real files or as binary data in the MySQL database?
The problems with storing the files in MySQL are that the database would become huge in a very short time. There are about 3 or 4 resumes submitted per day. That's probably about a thousand per year. They want to keep most of the resumes around for 2 years. Each file may be 100k to several megabytes in size, due to .doc bloat. Although the machine's only task will be to hold that databse, would it slow down to a crawl searching with that much data? What if the files were stored in a seperate database on the same MySQL server? Would it make a difference since it doesn't need to open the giant DB with the binary data until someone actually needs to retrieve a file?
The advantage of storing them in the database is that everything is handled through MySQL, from storage to retrieval.
On the other side, the advantage of storing them as files is that they can be accessed directly and can be searched by other programs such as desktop search appliances.
If I stored them as real files, I would probably have PHP automatically create a directory structure based on date and job applied for. It would create directories based on month and split up the files by month. So it would look like:
master resume directory/category/2005-01/2005-01-04_joe-blow.doc
The full submission date would be prepended onto the filename, along with the submitters first and last name, acquired from the form input. The directory and file location would be stored with the user's entry in the MySQL database. When viewing that entry in the database, you would click on a generated link that points to the local filesystem as mounted on the client PC in HR.
If anyone has any thoughts or musings or technical advice, feel free to chime in.
I am writing a web based app with PHP and MySQL that accepts resumes online. The software asks some basic questions to help categorize the submission, and then stashes the user information along with their resume (uploaded via a form) where it can be retrieved by an HR dept. The resumes can be .doc or .pdf files.
The question is - Should I store the files as real files or as binary data in the MySQL database?
The problems with storing the files in MySQL are that the database would become huge in a very short time. There are about 3 or 4 resumes submitted per day. That's probably about a thousand per year. They want to keep most of the resumes around for 2 years. Each file may be 100k to several megabytes in size, due to .doc bloat. Although the machine's only task will be to hold that databse, would it slow down to a crawl searching with that much data? What if the files were stored in a seperate database on the same MySQL server? Would it make a difference since it doesn't need to open the giant DB with the binary data until someone actually needs to retrieve a file?
The advantage of storing them in the database is that everything is handled through MySQL, from storage to retrieval.
On the other side, the advantage of storing them as files is that they can be accessed directly and can be searched by other programs such as desktop search appliances.
If I stored them as real files, I would probably have PHP automatically create a directory structure based on date and job applied for. It would create directories based on month and split up the files by month. So it would look like:
master resume directory/category/2005-01/2005-01-04_joe-blow.doc
The full submission date would be prepended onto the filename, along with the submitters first and last name, acquired from the form input. The directory and file location would be stored with the user's entry in the MySQL database. When viewing that entry in the database, you would click on a generated link that points to the local filesystem as mounted on the client PC in HR.
If anyone has any thoughts or musings or technical advice, feel free to chime in.