Hi,
I was reading a notice on the Google website about the behaviour of Googlebot. According to their Information for Webmasters, Googlebot doesn't crawl any page whose URL has a GET argument of &id=. So things like news articles, calendar items, guestbook entries, well, just about any content-generating module, won't get spidered.
Anyone know a way around this?
Cheers,
Tom
Google crawling of module content.
Re: Google crawling of module content.
I thought that I would share my solution for getting rid of ids in all other modules too.
I rewrote the following line in /lib/class.module.php:
Code:
$text .= '/'.$goto.'?module='.$this->GetName().'&id='.$id.'&'.$id.'action='.$action;
to
Code:
if($this->cms->config['assume_mod_rewrite'])
    $text .= '/'.$this->GetName().'.module.'.$id.'.php?'.$id.'action='.$action;
else
    $text .= '/'.$goto.'?module='.$this->GetName().'&id='.$id.'&'.$id.'action='.$action;
Then added the following rewrite rule to the .htaccess file:
Code:
RewriteRule ^([A-Za-z]+)\.module\.(.+)\.php$ moduleinterface.php?module=$1&id=$2 [QSA]
which has worked pretty well for all the modules I have encountered so far. Does anyone know of any modules that over-ride the CreateLink() method?
EDIT: It has however buggered the administration. Please ignore this until I figure out some way to get around it.
Last edited by ceilingfish on Fri Nov 18, 2005 11:07 am, edited 1 time in total.
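To make the rewrite rule from the post above concrete, here is a rough sketch in PHP that mimics what the RewriteRule does to a request path. Apache performs this mapping itself; the helper name map_module_url and the sample module id m3 are made up for illustration and are not part of the CMS.

```php
<?php
// Hypothetical helper: emulates the RewriteRule from the post with preg_match.
// Apache does the real work; this only illustrates the pattern and [QSA].
function map_module_url(string $path, string $query = ''): ?string
{
    // Same pattern as the RewriteRule, with PCRE delimiters added.
    if (!preg_match('/^([A-Za-z]+)\.module\.(.+)\.php$/', $path, $m)) {
        return null; // the rule does not apply, so the URL is left untouched
    }
    // $1 becomes the module name, $2 the module instance id.
    $target = 'moduleinterface.php?module=' . $m[1] . '&id=' . $m[2];
    // [QSA] appends the original query string to the rewritten one.
    return $query === '' ? $target : $target . '&' . $query;
}

// Example request (sample values are assumptions):
echo map_module_url('News.module.m3.php', 'm3action=detail'), "\n";
```

Note that the first group is [A-Za-z]+, so a module whose name contains digits or underscores would not match this pattern and would fall through to the normal request handling.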
Re: Google crawling of module content.
Preliminarily, the following seemed to work. The code for /lib/class.module.inc.php was rewritten to:
Code:
if($this->cms->config['assume_mod_rewrite'] && $returnid != '')
    $text .= '/'.$this->GetName().'.module.'.$id.'.php?'.$id.'action='.$action;
else
    $text .= '/'.$goto.'?module='.$this->GetName().'&id='.$id.'&'.$id.'action='.$action;
I shall test this more.
Tom
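For anyone wanting to try the conditional outside the CMS, here is a minimal sketch of the same branch as a standalone function. The function name, its parameters, and the default of moduleinterface.php for $goto are assumptions for illustration; in the real class these values come from the module object and its config. My reading, which the thread does not confirm, is that admin links are built with an empty $returnid and therefore keep the old query-string form.

```php
<?php
// Hypothetical standalone version of the link-building branch from the post.
// All parameter names and the $goto default are assumptions for illustration.
function build_module_link(
    string $module,
    string $id,
    string $action,
    string $returnid,
    bool $assumeModRewrite,
    string $goto = 'moduleinterface.php'
): string {
    // Only emit the rewritten form when a return id is set.
    if ($assumeModRewrite && $returnid != '') {
        return '/' . $module . '.module.' . $id . '.php?' . $id . 'action=' . $action;
    }
    // Otherwise fall back to the original query-string link.
    return '/' . $goto . '?module=' . $module . '&id=' . $id . '&' . $id . 'action=' . $action;
}

// With a return id: rewritten link.
echo build_module_link('News', 'm3', 'detail', '12', true), "\n";
// Without a return id: original link form.
echo build_module_link('News', 'm3', 'detail', '', true), "\n";
```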