
Content duplication issue and mod_rewrite...

Posted: Tue Jun 02, 2009 12:25 pm
by sgabello
Hi all,
I've successfully installed CMSMS MLE and I'm very happy with it... (thanks Alberto, you did a great job).

Unfortunately I spotted a problem that could lead to serious trouble with Google.

Have a look here:
http://www.2italianholidays.com/
By default it should display the page in English...
Now go to, let's say:
http://www.2italianholidays.com/en_GB/d ... o-tuscany/
And now try to remove the "en_GB" part:
http://www.2italianholidays.com/destina ... o-tuscany/

Basically it displays the same page! Google could treat this as duplicate content and issue a ban!

But things get worse if you start to play with languages:
Let's go back to the homepage http://www.2italianholidays.com/
Switch to Italian (top right corner) and go here:
http://www.2italianholidays.com/it_IT/d ... o-tuscany/
http://www.2italianholidays.com/destina ... o-tuscany/
Again, different URLs with the same content!

And worst of all: http://www.2italianholidays.com/destina ... o-tuscany/
can display either Italian or English content depending on which page you came from.

Is there a way to fix this?

Thanks,
Andrea

Re: Content duplication issue and mod_rewrite...

Posted: Tue Jun 02, 2009 9:22 pm
by reneh
Google follows the menu and the links from the front page.
So just be sure not to MAKE your own links that are different in the content or template!


In other words:
Google doesn't manually remove parts from the links! You CAN, but you are a human; Google uses spiders.

Re: Content duplication issue and mod_rewrite...

Posted: Wed Jun 03, 2009 2:45 am
by mel
Hi,
I'm not sure what I should do. I didn't "set" links in my template, I use {lang text="true" class="lang" spacer=" - "}. Is that what you are referring to?
I don't really understand how spiders like Google's work, and I'm not sure if it's related to this thread. But in Webmaster Tools, under the "internal links" tab, all my French pages have a lot of internal links (probably because of menu links). On the English side, though, I get only 2 links (from the home page and the French counterpart page). Why is this? The menus have English links (but with the same alias as in French). Is it detected as duplicates?
Thanks if you could help me understand!
Mel

Re: Content duplication issue and mod_rewrite...

Posted: Thu Jun 04, 2009 8:00 pm
by alby
mel wrote: I'm not sure what I should do. I didn't "set" links in my template, I use {lang text="true" class="lang" spacer=" - "}. Is that what you are referring to?
No, Reneh is saying that Google follows your site's links (static and/or dynamic, e.g. from MenuManager).
The (false) problem is that the ID/alias is unique and leads to the same content (the same applies with the lang param in the URL):
index.php?page=ID
index.php?page=ALIAS
index.php/ALIAS   (with internal_pretty_url without hierarchy [DEPRECATED])
index.php/hierarchy/ALIAS   (with internal_pretty_url with hierarchy)
/ALIAS   (with mod_rewrite without hierarchy [DEPRECATED])
/hierarchy/ALIAS   (with mod_rewrite with hierarchy)
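
A minimal sketch of why these forms count as the same page: each of them ends in the same ID or alias. The function name `page_key` and the example.com URLs are hypothetical and are not part of CMSMS. Note that matching an ID to its alias would need a lookup in the site's database, which this sketch does not attempt.

```python
from urllib.parse import urlsplit, parse_qs


def page_key(url):
    """Return the ID or alias a URL points at (hypothetical helper, not CMSMS code)."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    # index.php?page=ID and index.php?page=ALIAS carry the key in the query string
    if "page" in query:
        return query["page"][0]
    # pretty URLs and mod_rewrite URLs carry the alias as the last path segment
    segments = [s for s in parts.path.split("/") if s and s != "index.php"]
    return segments[-1] if segments else ""


# Hypothetical URLs, one per URL style from the list above
urls = [
    "http://www.example.com/index.php?page=some-alias",
    "http://www.example.com/index.php/some-alias",
    "http://www.example.com/index.php/hierarchy/some-alias",
    "http://www.example.com/some-alias/",
    "http://www.example.com/hierarchy/some-alias/",
]
keys = {page_key(u) for u in urls}
```

Here all five URLs collapse to the single key `some-alias`, which is the sense in which they lead to the same content.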


So you can start with whichever method you want.
WHAT IS IMPORTANT IS THAT YOU DON'T CHANGE IT, because Google (or others) will look at both the old link and the new link (however, there is a method for dropping old links!)

mel wrote: I don't really understand how spiders like Google's work, and I'm not sure if it's related to this thread. But in Webmaster Tools, under the "internal links" tab, all my French pages have a lot of internal links (probably because of menu links). On the English side, though, I get only 2 links (from the home page and the French counterpart page). Why is this?
The Google spider is a browser: can you see those pages in your browser?
Look at the source of those pages and in your server logs: are there errors?
Do you have a robots.txt that denies those pages?
There are many tools that generate a sitemap for your site; try submitting one to Google.
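
As a rough illustration of what such a sitemap tool produces, here is a minimal sketch that writes one `<url>` entry per page using the sitemaps.org namespace. The function name `build_sitemap` and the example.com URLs are hypothetical; a real site would list its own pages, once per language.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages: list only the language-prefixed form of each page
sitemap_xml = build_sitemap([
    "http://www.example.com/en_GB/",
    "http://www.example.com/it_IT/",
])
```

Listing only one URL form per page in the sitemap tells the crawler which form you consider the real one.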

Alby

Re: Content duplication issue and mod_rewrite...

Posted: Fri Jun 05, 2009 11:25 am
by sgabello
Alby,
please... can you have a look at my problem too?

I think the best solution would be to force CMSMS to always generate URLs with the language specified... but I don't have a clue how to do that...

Thanks,
Andrea

Re: Content duplication issue and mod_rewrite...

Posted: Fri Jun 05, 2009 1:20 pm
by alby
sgabello wrote: please... can you have a look at my problem too?
Google cannot look just anywhere (it's different if you tell it where to look...)
Google follows links and does not try to change them  :)

What is important is that you start with one method and don't change it (e.g. by changing the language labels)

Alby

Re: Content duplication issue and mod_rewrite...

Posted: Sun Jun 07, 2009 2:19 pm
by sgabello
Of course it can't... and I haven't changed anything... this is the way MLE worked for me from the beginning...
The links are generated by the TinyMCE editor... with the self-link tool... I can't ask my client to learn HTML for this reason... otherwise what's the point of using a CMS? This problem simply shouldn't happen...
Or at least this is my point of view.... :D
I think that there's also an easy solution: force the URLs to be "rewritten" with the language when it is missing...
Don't you think so?
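
To make the idea concrete, here is a minimal sketch of the decision such a rewrite would make, written in Python only for illustration. It is not how CMSMS MLE or mod_rewrite actually work. The language codes come from the URLs earlier in the thread; the function name `redirect_target`, the choice of English as the fallback, and the example path are assumptions.

```python
LANGUAGES = ("en_GB", "it_IT")  # language prefixes seen in the thread's URLs
DEFAULT_LANGUAGE = "en_GB"      # assumption: English is the site default


def redirect_target(path, default=DEFAULT_LANGUAGE):
    """Return the path to redirect to, or None if a language prefix is already present."""
    segments = [s for s in path.split("/") if s]
    if segments and segments[0] in LANGUAGES:
        return None  # URL already names its language, nothing to do
    # Language missing: prepend the default so each page has one URL per language
    return "/" + "/".join([default] + segments) + "/"


# Hypothetical path, not one of the real pages on the site
target = redirect_target("/some-section/some-page/")
```

With this rule a request without a language would be sent (ideally with a permanent redirect) to its `en_GB` form, so the prefix-less URL never serves content of its own and can no longer show Italian or English depending on the previous page.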