I seem to be having my share of Google problems lately! I wrote a lengthy post about my problems with Adsense. There have been a few responses, but no one is really able to answer my questions.
I’ve also been having a problem on my family website. Almost every month the Googlebot goes crazy and crawls my site non-stop. I won’t go into the detail I did with my Adsense problem, but I did email them originally, and it took 10 days to get a response! In the meantime, they kept sucking up my bandwidth, which I have to pay for. They asked for some information from me, which I sent. They replied 5 days later asking for more information. Things seemed to settle down (I assumed they’d fixed the problem), so I waited a few weeks. But then it started again, so I sent along what they had asked for. Eleven days later, still not having heard from anyone, I emailed them again (I was obviously angry):
Do something about this!!!!!!!!!!!
Why is no one replying?????? You’ve already used up 12.5 GB this month, and I only have 15 for the entire month. Stop this. I want to be crawled, but not like this……………
Why is the customer service so bad? Is there someone I can email who will actually read this? This should have been corrected when I first commented on it 2 months ago. I provided all the necessary information, but was still told they needed more info (even though I had already given it to them).
Now here I am with the same problem. Fix this please
Unbelievably, it took another 15 days after that angry email before I got a reply. Wow, their customer service is awful!
The end result: their Googlebot doesn’t handle session IDs well. Here is what they wrote:
From the log snippet you provided, we can see that your site is using session IDs. As you’ve observed, session IDs can cause problems for our robots. Please disable session IDs for Googlebot so that our robots may crawl your site more efficiently.
They sent me a link to some information on how to modify a PHP script so that it doesn’t create a session for the Googlebot. Unfortunately, I don’t use the script that link talks about. So I have to figure it out for myself. It irks me that I have to make changes on my site when they are the ones that have the problem. I have 2 PHP scripts on the site: PhpGedView (for genealogy/family history information), and 4images (for family photos). Now I have to go figure out which one is the problem…
I realize I’m using open source scripts that are free and there is a risk in that. I realize that this is not a business site that has quality control over the scripts. I realize that the site isn’t really meant for anyone other than my family. But Google should be able to reply in a more timely manner, and provide better answers.
As of November 27, my site had 224.55 MB worth of regular traffic, and 21.55 GB worth of “not viewed traffic” (according to Awstats), with the Googlebot using 21.40 GB of that traffic!
I came across your description of your problem while researching a similar problem I was having. phpGedView is probably the culprit.
If you have access, edit your .htaccess file per the instructions at this page:
http://baheyeldin.com/drupal/how-to-get-rid-of-phpsessid-in-drupal-and-other-php-applications.html
You can test whether it stuck by disabling cookies in your browser and seeing whether PHPSESSID still shows up in the URLs when you browse around in phpGedView.
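If you would rather not click around by hand, the same check can be roughly scripted. The following is only a minimal sketch in Python, not anything phpGedView or Google provides: the names `LinkCollector`, `links_with_session_id` and `fetch_without_cookies` are made up for illustration, and it assumes the session ID appears in links as a `PHPSESSID=` query parameter.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def links_with_session_id(html, param="PHPSESSID"):
    """Return the hrefs in `html` that carry the session parameter."""
    collector = LinkCollector()
    collector.feed(html)
    needle = param.lower() + "="
    return [href for href in collector.hrefs if needle in href.lower()]


def fetch_without_cookies(url):
    """Fetch a page the way a cookie-less client would.

    urlopen() uses no cookie jar by default, so no cookies are stored
    or sent back, which is roughly how a crawler sees the site.
    """
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")
```

To try it, pass the HTML of one of your own pages (for example, the result of `fetch_without_cookies` on your phpGedView front page) to `links_with_session_id`. An empty list suggests the session ID is no longer being written into links; any URLs it returns are the ones a crawler would still see with a session ID attached.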
Thanks for the suggestion, Ron. I had already assumed the culprit was phpGedView, as I have 4images installed on other sites and don’t have the problem there. Google suggested I try http://www.top25web.com/blog/2004/02/disabling-session-ids-in-phpbb-forums.html and I just haven’t bothered to try it yet. I’ve temporarily stopped all robots from crawling my site and I’ll look into this further soon.