News Pages not in Google
Hi there. I've given each of my news item pages (from the news module) a unique page title (roughly as described in http://forum.cmsmadesimple.org/index.ph ... 475.0.html) and yet these news pages are not showing up in Google organic search. Anybody got any idea why this might be?
Many thanks in advance and all the best.
Last edited by Schaboo on Tue Apr 27, 2010 12:54 pm, edited 1 time in total.
Re: News Pages not in Google
Implement the CGFeedMaker module and submit the rss feed to Google as a sitemap.
Take a penny, leave a penny.
Re: News Pages not in Google
I use Sitemapmadesimple and then add in logic to grab articles from the news page. My full template is:
Note: Change 60 to your detail page ID number. You can find that in the URL when viewing a news article.
I can't remember where I found this - either in the forums or someone's personal site. But, you can apply this approach to any module that generates Detail pages.
Code:
{* modified sitemap template *}
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
{* regular content pages *}
{foreach from=$output item='page'}
  <url>
    <loc>{$page->url}</loc>
    <lastmod>{$page->date|date_format:"%Y-%m-%d"}</lastmod>
    <priority>{$page->priority}</priority>
    <changefreq>{$page->frequency}</changefreq>
  </url>
{/foreach}
{* call the News module only to get $items; its own output is captured into $junk and never printed *}
{capture assign='junk'}{news number='1000'}{/capture}
{foreach from=$items item=entry}
  {* put the detail page ID (60) into the URL, then repair the "http://" that the first replace also changed *}
  {assign var=utmpNEWS value=$entry->moreurl|replace:'//':'/60/'|replace:'http:/60':'http:/'}
  <url>
    <loc>{$utmpNEWS}</loc>
    {if $entry->postdate}
    <lastmod>{$entry->postdate|date_format:"%Y-%m-%d"}</lastmod>
    {/if}
    {* $page is left over from the loop above, so articles reuse that page's priority and frequency *}
    <priority>{$page->priority}</priority>
    <changefreq>{$page->frequency}</changefreq>
  </url>
{/foreach}
</urlset>
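For anyone puzzling over the double replace in the news loop, here is a minimal Python sketch of the same string logic. It assumes (the thread does not show this) that the raw moreurl has an empty path segment where the detail page ID belongs. The function name fix_news_url and the example.com URL are made up for illustration.

```python
def fix_news_url(moreurl: str, detail_page_id: int) -> str:
    """Mimic the template's two chained Smarty replace modifiers."""
    # Step 1: turn every "//" into "/<id>/". This also hits the "//" after "http:".
    patched = moreurl.replace("//", f"/{detail_page_id}/")
    # Step 2: undo the damage to the scheme, so "http:/<id>/" becomes "http://" again.
    return patched.replace(f"http:/{detail_page_id}", "http:/")


# Hypothetical input: an article URL with an empty segment before the title.
raw = "http://www.example.com/index.php/news/5//sample-article.html"
fixed = fix_news_url(raw, 60)
print(fixed)
```

Note that the second replace only restores http: URLs. On an https site the scheme would stay broken, so the template would need 'https:/60' and 'https:/' in that second replace instead.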
"The art of life lies in a constant readjustment to our surroundings." -Okakura Kakuzo
--
LinkedIn profile
--
I only speak/write in English so I may not translate well on International posts.
--
Re: News Pages not in Google
Hi there. Thanks to you both. I used MWW's template to create a sitemap, as I couldn't get CGFeedMaker to work. I'll keep an eye on Google results and let you know if this has done the trick.
Just out of interest, why do you think Googlebot didn't find these pages? They were all well linked to from other pages (and other sites) that were indexing nicely. I just thought it would be a case of letting the crawlers do their work.
Thanks again.
Re: News Pages not in Google
Generally they should be able to find them. Can we see a link to your site?
Re: News Pages not in Google
Sure. It's http://roxan.co.uk.
News stories are listed on the following pages:
http://www.roxan.co.uk/index.php?page=news-poultry
http://www.roxan.co.uk/index.php?page=news-sports
http://www.roxan.co.uk/index.php?page=news-industrial
http://www.roxan.co.uk/index.php?page=news-livestock
And here is a sample news detail page: http://www.roxan.co.uk/index.php/news/2 ... tland.html
Thanks for having a look.
Re: News Pages not in Google
Right now the News URLs in the sitemap file are wrong.
In the sitemap template, change the number 60 to the ID of your news detail page. For example, your news article URLs look like http://www.roxan.co.uk/index.php/news/3 ... Rings.html, so for that example you would change the 60 to 58.
BUT, since different news categories have different detail pages for the full article, you'll need to set up the sitemap template for news category by category.
Find the news ID for each category by looking at an article detail URL from that category, and change '60' to that ID number. Here is an example template that targets one specific News category:
Change 'Your-Category-Name-Here' in both places to your category. Use this block of code once per category, and be sure to change the number 60 to the number in that category's news URL.
P.S. You might need to edit a regular content page to get the sitemap to regenerate the URLs after changing the template.
Code:
{* one copy of this block per News category; the News output itself is captured into $junk and never printed *}
{capture assign='junk'}{news number='1000' category='Your-Category-Name-Here'}{/capture}
{foreach from=$items item=entry}
  {if $entry->category == 'Your-Category-Name-Here'}
    {* change 60 to this category's detail page ID *}
    {assign var=utmpNEWS value=$entry->moreurl|replace:'//':'/60/'|replace:'http:/60':'http:/'}
    <url>
      <loc>{$utmpNEWS}</loc>
      {if $entry->postdate}
      <lastmod>{$entry->postdate|date_format:"%Y-%m-%d"}</lastmod>
      {/if}
      {* $page comes from the content page loop earlier in the sitemap template *}
      <priority>{$page->priority}</priority>
      <changefreq>{$page->frequency}</changefreq>
    </url>
  {/if}
{/foreach}
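To check the generated sitemap after editing the template, something like the following Python sketch could help. It is only a rough check under an assumption the thread does not confirm: that a wrong news URL shows up as an empty path segment ("//") inside the loc value. The function name find_suspect_locs and the sample XML are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Namespace used by the urlset element in the sitemap template.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def find_suspect_locs(sitemap_xml: str) -> list:
    """Return every <loc> URL whose path contains an empty segment."""
    root = ET.fromstring(sitemap_xml)
    suspects = []
    for loc in root.findall("sm:url/sm:loc", SITEMAP_NS):
        url = (loc.text or "").strip()
        # Only look at the path, so the "//" after the scheme is ignored.
        if "//" in urlparse(url).path:
            suspects.append(url)
    return suspects


# Hypothetical sitemap with one unpatched and one patched article URL.
SAMPLE = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>http://www.example.com/index.php/news/5//sample-article.html</loc></url>"
    "<url><loc>http://www.example.com/index.php/news/5/60/sample-article.html</loc></url>"
    "</urlset>"
)
print(find_suspect_locs(SAMPLE))
```

An empty list would suggest every article URL got its detail page ID. It would not show whether the ID is the right one for each category.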
Re: News Pages not in Google
I think the next step is to look at your Google Webmaster account and see if there are any crawl errors.
My only other thought is to use SEO 'pretty URLs' for all the links on your site using mod_rewrite, in case Google is not crawling URLs like http://www.roxan.co.uk/index.php?page=news-poultry properly. For example, the pretty URL for this is http://www.roxan.co.uk/news-poultry.html
Re: News Pages not in Google
Hi mmw. Thanks again. I'll have a crack at sorting the sitemap - I was being lazy!
I just noticed "Disallow: /index.php?mact" in the robots.txt file. Could this have been stopping the new pages indexing (before we applied pretty URLs)?
Re: News Pages not in Google
Yes, that is preventing those links from being crawled. If you try "site:roxan.co.uk" in the Google search box, you will see that no indexed pages use index.php? in their URLs. A Google Webmaster account lets you test your robots.txt file to see how Google will crawl your site.
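If you want a quick local check of what that Disallow line covers, here is a small sketch using Python's standard urllib.robotparser. It is not Google's own parser, so treat the result as a rough guide only. The "User-agent: *" line and the module-action URL on example.com are assumptions for illustration; the other two URLs are the ones from this thread.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt: only the Disallow line is quoted in the thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /index.php?mact
"""


def is_allowed(url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Report whether a generic crawler may fetch the URL under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("*", url)


urls = [
    # Hypothetical non-pretty news detail URL that starts with index.php?mact
    "http://www.example.com/index.php?mact=News,cntnt01,detail,0&cntnt01articleid=5",
    "http://www.roxan.co.uk/index.php?page=news-poultry",
    "http://www.roxan.co.uk/news-poultry.html",
]
for url in urls:
    print("allowed" if is_allowed(url) else "blocked", url)
```

In this parser the rule matches by prefix, so only URLs that begin with /index.php?mact are blocked; the ?page= URL and the pretty URL are not.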
Re: News Pages not in Google
Righto. I've changed the robots.txt file now and removed that line. We did not add that line, so I'm thinking it's a CMSMS default, in which case it's something for others to look out for. Thanks for all your help.