Normally, I'd get an email that says:
Error 400 from http://www.domain.com/page-with-broken-link.php
But I've been getting many emails that just say "Error 400 from" with no URL. Does anyone know why this is?
I'm thinking that it could be search engine bots/spiders, but I'm not sure.
Is this a stupid way to try to track down broken links? I was hoping that I'd find broken external links. I'm sure that there are better ways to do this. Do you know of any? I do use Analytics, if that can help.
The address of the page (if any) which referred the user agent to the current page. This is set by the user agent. Not all user agents will set this, and some provide the ability to modify HTTP_REFERER as a feature. In short, it cannot really be trusted.
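As a rough illustration of why the URL goes missing, here is a minimal Python sketch of an error handler that builds the email text from the referer. The function name `describe_referer` and the CGI/WSGI-style `environ` dictionary are assumptions for the example (the original site appears to use PHP, where the same value lives in `$_SERVER['HTTP_REFERER']`). When the user agent sends no referer, the sketch returns nothing instead of a message ending in a bare "from".

```python
def describe_referer(environ, status):
    """Return the notification text, or None when no referer was sent.

    `environ` is assumed to be a CGI/WSGI-style dict of request variables.
    Bots, direct visits, and privacy tools often omit HTTP_REFERER entirely.
    """
    referer = (environ.get("HTTP_REFERER") or "").strip()
    if not referer:
        # Nothing useful to report: skip the email rather than send "Error ... from".
        return None
    return f"Error {status} from {referer}"
```

A caller would send the email only when the result is not `None`, which should cut out the empty "Error 400 from" messages without hiding the ones that do carry a referring page.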
A few alternative solutions to consider:
Products like Webalizer and AWStats, which get the failed 404 URL directly from your web server log, can email you reports. For example, Webalizer has an extension called "Xtended" that generates additional stats for 404 errors in a more detailed report format that can easily be emailed or printed. This is a consolidated report which you might schedule periodically for delivery at non-peak times to reduce server load (and to maintain your sanity, which you might lose processing all those emails with the method you use now).
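If you want to see roughly what such a log-based report involves before installing anything, here is a small Python sketch that counts 404s per requested path and referer. It assumes the server writes the common "combined" access log format; the regular expression, the function name `count_404s`, and the "(no referer)" label are all made up for this example and are not how Webalizer or AWStats work internally.

```python
import re
from collections import Counter

# Assumed layout: Apache/Nginx "combined" log format, e.g.
# host ident user [time] "METHOD path PROTO" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "\S+ (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "[^"]*"'
)

def count_404s(lines):
    """Count 404 responses, keyed by (requested path, referer)."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match or match.group("status") != "404":
            continue
        referer = match.group("referer")
        if referer in ("", "-"):
            # The log records a missing referer as "-".
            referer = "(no referer)"
        counts[(match.group("path"), referer)] += 1
    return counts
```

Feeding it the lines of an access log and printing `counts.most_common()` gives one consolidated list instead of one email per hit, and a scheduled job could mail that list at a quiet time of day.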
Google Analytics also has a means to track 404s so they show up in your reports, which is closer to real time and just as easy to read.