We're getting whipped on SEO-friendly dynamic content URL's

nhaack

Re: We're getting whipped on SEO-friendly dynamic content URL's

Post by nhaack »

@brippon,

My experience is that when you use either pretty URLs or the standard URLs consistently throughout your site, you shouldn't have a problem. E.g. if you use pretty URLs, then make sure that you do not have any "unpretty" URLs hiding behind a link somewhere. If you ensure consistency in your template design (also check module, RSS or whatever other templates you use), everything should be fine.

OK, if people link to you with the wrong URL scheme, you still end up with wrong links. In that case, 301 redirect the wrong URL and contact the other webmaster, and you should be fine. But then again, I just copy and paste URLs... I don't think someone would purposely alter the scheme to dilute your results (a mean black hat method, now that I think of it).

If you already face the duplicate content problem, such as two identical pages ranking for the same keyword without Google identifying them as duplicates, correct the affected links and 301 redirect the wrong URL to the correct one. With time your results should become clean, and you keep the potential link juice. I admit, with 1000 pages, some automation would be nice. It can probably be solved with a generic rule redirecting the one pattern to the other. You could also use robots.txt to disallow the wrong URLs....

If content is actually different but gets flagged as duplicated content, try to use unique titles. If keyword and description meta data are used, make sure they are unique too. Also ensure that the automatically generated stuff like menus, teasers, etc. does not add up to overwhelmingly more text than the actual content.

Concerning the sitemap: it is a valuable tool for SEO activities... however, as suggested, it complements the regular crawling and can help you make sure that your important pages get crawled. Make sure that the sitemap and the content pages show the same link structure.

Concerning the whole pretty URLs discussion, I do not think that it is such a huge issue. First of all, search engines mostly understand http://example.com/index.php?page=home perfectly well. Yes, it's not pretty, and pretty URLs are a stronger ranking factor, but your keyword is there. If you want that last tiny bit, you have to create sexy rewrite rules. That's something your admin should be able to handle for you (or check the dozens of tutorials out there on the web). Regular expressions are hard to learn (I think) but definitely worth it.

Additionally, I think that well-written and well-marked-up content (e.g. using microformats, headlines, paragraph tags etc.) is much more important, if you have to decide where to invest your energy ;) Keep in mind that e.g. Google is estimated to use more than 200 ranking factors. See the following article for some more information:

http://www.seomoz.org/article/search-ranking-factors

A great tool for checking the link structure while the site is still in your sandbox is Xenu's Link Sleuth (http://home.snafu.de/tilman/xenulink.html). It will crawl the links in your site, and the results will quickly tell you if there's a problem.

Just some thoughts

Best
Nils

-------
edit: @jeremy... hehe you were just posting while I was still writing ;)
JeremyBASS

Re: We're getting whipped on SEO-friendly dynamic content URL's

Post by JeremyBASS »

quick like ninja  ;D
viebig

Re: We're getting whipped on SEO-friendly dynamic content URL's

Post by viebig »

When you have pretty URLs, your pages can still be accessed through the normal URLs too...

A good internal link structure doesn't solve the problem, because anyone can break that structure.

For example, if you have http://domain.com/test.html, anyone who knows CMSMS can maliciously create a link on another site to http://domain.com/index.php?page=test

This non-pretty URL will get indexed and marked as duplicated content (Google hates that), but what Google hates even more is duplicated content on the same site.

So, a quick fix would be to create a rewrite rule that permanently redirects (301) those ordinary URLs to the pretty ones.
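To make the redirect idea concrete, here is a minimal Python sketch of the mapping such a rule would perform. It is not CMSMS code and not an actual rewrite rule; the function name `canonical_redirect`, the `page` parameter and the `.html` suffix are assumptions taken from the example URLs above.

```python
import re
from urllib.parse import urlsplit, parse_qs

def canonical_redirect(url, param="page", suffix=".html"):
    """Return (301, pretty_url) for an index.php?page=alias URL, else None.

    Hypothetical helper: parameter name and suffix are assumptions.
    """
    parts = urlsplit(url)
    if parts.path not in ("/index.php", "/"):
        return None  # already pretty, or a path this sketch does not handle
    query = parse_qs(parts.query)
    aliases = query.get(param)
    if not aliases or len(query) != 1:
        return None  # no alias, or extra parameters (e.g. a module call)
    alias = aliases[0]
    if not re.fullmatch(r"[A-Za-z0-9_-]+", alias):
        return None  # refuse anything that does not look like a page alias
    return 301, f"{parts.scheme}://{parts.netloc}/{alias}{suffix}"

print(canonical_redirect("http://domain.com/index.php?page=test"))
```

URLs that carry additional query parameters are deliberately left alone here, since the thread does not say how module URLs should be mapped.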

A page that doesn't exist, or a module that gets an invalid id to display details for, should always return a 404 error.

Also, the pretty URLs for modules are fragile, since /news/1/2/News-Title is the same as /news/1/2/News-Title-Wherever or /news/509/2/News-Title

All of those pages exist and will return duplicated content. This behavior makes CMSMS lose a lot of potential SEO enthusiasts.
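One way to picture a stricter behavior is the sketch below: only the exact URL returns 200, a wrong title part gets a 301 to the canonical URL, and an unknown id gets a 404. This is an illustration, not how the News module works; `ARTICLES`, `slugify` and `resolve_news` are hypothetical names, and the second number in the path is simply passed through because the thread does not say what it means.

```python
import re

# Hypothetical article store (id -> title); a real site would ask the module.
ARTICLES = {1: "News Title"}

def slugify(title):
    """Turn a title into a URL part, e.g. 'News Title' -> 'News-Title'."""
    return re.sub(r"[^A-Za-z0-9]+", "-", title).strip("-")

def resolve_news(path):
    """Return (status, canonical_path) for a /news/<id>/<n>/<title> URL."""
    match = re.fullmatch(r"/news/(\d+)/(\d+)/([^/]+)", path)
    if not match:
        return 404, None
    article_id = int(match.group(1))
    title = ARTICLES.get(article_id)
    if title is None:
        return 404, None  # invalid id: nothing to show, so no 200
    canonical = f"/news/{article_id}/{match.group(2)}/{slugify(title)}"
    if path != canonical:
        return 301, canonical  # same article, wrong title part
    return 200, canonical

for p in ("/news/1/2/News-Title",
          "/news/1/2/News-Title-Wherever",
          "/news/509/2/News-Title"):
    print(p, resolve_news(p))
```

With this, each article has exactly one URL that answers with content, so the variants above can no longer be indexed as duplicates.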

I think any page or module item could have an option like a URL field, so we can set an exact-match URL that points to the page, and the URL structure would be independent (sure, this is optional; if not set, use the default URL scheme). I think Magento has a feature like that.
nhaack

Re: We're getting whipped on SEO-friendly dynamic content URL's

Post by nhaack »

Mhh... I see your point. But if my site exists in an environment where competitors place "bad links" so that I lose rankings, one could change the query parameter from "page" to "something_else". As long as this parameter is not discovered, I don't see such a big problem here.

I understand the issue with news, or with the pretty URLs of some other modules. I can imagine that it could even happen without bad intention: a number gets lost, whereas a missing word might catch the user's attention (e.g. when copying a URL into a forum post or something like that).

I totally agree with the 404 thing. Something that isn't there shouldn't return a different status code.

What would be the best solution? Someone mentioned an alias engine at the beginning. I think the best option would be something like a central alias module that other modules can use to reserve sort of a space in the content-hierarchy of CMSMS.

It would be able to represent a tree hierarchy with n levels. Now a module could say: "I am 3.1 and below. Call me $alias.". For each item of a module, like with pages, you create an alias. This alias gets registered in the hierarchy tree (so you can have the same alias appearing twice at different locations; if desired).

When a URL is called, the data required to build the query string first gets looked up in a sort of translation table and is then passed to the regular index.php functionality. This central hierarchy tree could be used to do a lot of fancy things: sitemaps that include every module's items if desired (sitemap.xml), easier menu manager handling, etc. Think of global last-updated lists or the option to easily link to every object from TinyMCE. Or you could extend the functionality a bit to maintain redirects and status codes as well.

A db table could probably look like this:
-hierarchy_position (position in the tree)
-alias (well... the alias)
-module_name (which module)
-mapping (which module parameter)
-status_code
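As a rough sketch of how such a table and the lookup could work, here is a small Python example using an in-memory SQLite database. The column names come from the list above; the table name `alias_map`, the dotted position format ("3.1" is a child of "3"), the module names and the sample rows are all assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE alias_map (
        hierarchy_position TEXT PRIMARY KEY,  -- e.g. '3.1', child of '3'
        alias TEXT NOT NULL,                  -- may repeat at other positions
        module_name TEXT NOT NULL,
        mapping TEXT NOT NULL,                -- module parameters to pass on
        status_code INTEGER NOT NULL DEFAULT 200
    )
""")
conn.executemany("INSERT INTO alias_map VALUES (?, ?, ?, ?, ?)", [
    ("3", "news", "News", "", 200),
    ("3.1", "my-article", "News", "articleid=1", 200),
    ("4", "blog", "Blog", "", 200),
    ("4.1", "my-article", "Blog", "entry=7", 200),  # same alias, other branch
])

def parent_of(position):
    """'3.1' -> '3'; a top-level position like '3' -> ''."""
    return position.rpartition(".")[0]

def resolve(conn, path):
    """Walk the path segment by segment; return (module, mapping, status)."""
    current, row = "", None
    for segment in path.strip("/").split("/"):
        candidates = conn.execute(
            "SELECT hierarchy_position, module_name, mapping, status_code "
            "FROM alias_map WHERE alias = ?", (segment,)).fetchall()
        row = next((r for r in candidates if parent_of(r[0]) == current), None)
        if row is None:
            return None  # unknown alias at this level: caller should send 404
        current = row[0]
    return row[1:]

print(resolve(conn, "/news/my-article"))
print(resolve(conn, "/blog/my-article"))
```

The lookup runs one query per path segment, which hints at the performance concern for sites with thousands of aliases.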

And an option to create and maintain status code pages would be nice (with placeholder ;D). This also allows modules/plug-ins to return a precise error code and message.

But as mentioned, this would require a lot of rework for the core, the DB layout, modules and probably hundreds or thousands of small scripts that query the database. And for large sites, I could imagine that matching thousands of aliases costs some performance.

If it's easy to use - it'll be a blast. I think this is much better than letting the modules maintain alias functionality in their own tables. Mhhh... now as I think of it, couldn't a module attached to events do the trick? With the correct rewrite rule, every request could be piped through it (something like that).

What do you think?

Best
Nils
calguy1000

Re: We're getting whipped on SEO-friendly dynamic content URL's

Post by calguy1000 »

this topic is closed..... just getting out of hand.