Clean Hierachical URL's
Posted: Thu Apr 06, 2006 9:16 am
by Russ
Now that we have the new Menu Manager and hierarchy, would it be possible to generate clean hierarchical URLs?
e.g.
www.domain.com/section/subsection/page.html
I've had a look at the Menu Manager code, but it is way beyond me.
Any ideas? I think this would be a really nice feature, especially if we could get it into a Google sitemap.
Russ
Re: Clean Hierachical URL's
Posted: Thu Apr 06, 2006 11:34 am
by tsw
I wrote something quick and dirty for this some time ago.
In lib/misc.functions.php, add this code at the bottom:
Code: Select all
function clean_url($my_ID) {
    global $config;

    $trail = "";
    $delimiter = "/";
    $allcontent = array();

    $onecontent = ContentManager::LoadContentFromId($my_ID, false);
    if ($onecontent === FALSE) {
        // Unknown page id: nothing to build.
        return $trail;
    }

    // Collect the page and all of its ancestors, deepest page first.
    array_push($allcontent, $onecontent);
    while ($onecontent->ParentId() > 0) {
        $onecontent = ContentManager::LoadContentFromId($onecontent->ParentId(), false);
        array_push($allcontent, $onecontent);
    }

    $first = 1;
    $skip = 0;
    // array_pop() walks the list from the top-level ancestor down to the page.
    while ($onecontent = array_pop($allcontent)) {
        // Content with a 'target' property (e.g. a link) keeps its own URL.
        if ($onecontent->HasProperty('target') && $onecontent->GetPropertyValue('target') != '') {
            $trail = $onecontent->getURL();
            $skip = 1;
            continue;
        }
        if ($first) {
            // Start from the top-level page's URL, minus the page extension.
            $trail = $onecontent->getURL();
            $strip = strlen($config["page_extension"]);
            if ($strip > 0) {
                $trail = substr($trail, 0, -$strip);
            }
            $trail .= $delimiter;
            $first = 0;
        } else {
            $trail .= $onecontent->MenuText() . $delimiter;
        }
    }
    if ($skip == 0) {
        // Swap the trailing delimiter for the page extension.
        $trail = substr($trail, 0, -1);
        $trail .= $config["page_extension"];
    }
    return $trail;
}
in modules/MenuManager/MenuManager.module.php in function
Code: Select all
function & FillNode(&$content, &$node, &$nodelist, &$gCms, &$count, &$prevdepth, $origdepth)
add this line after "$onenode->id = $content->Id();" (should be around line 186)
Code: Select all
$onenode->CleanUrl = clean_url($content->Id());
After this you can use "{$node->CleanUrl}" in the Menu Manager templates.
And here's the rewrite rule:
Code: Select all
RewriteCond %{REQUEST_FILENAME} !-f [NC]
RewriteRule ^(.*/)?([a-zA-Z0-9\.]+)\.html$ index.php?page=$2 [QSA]
NOTE: This code won't work right in some situations, and I'll try to get something similar into 1.0.
Re: Clean Hierachical URL's
Posted: Thu Apr 06, 2006 1:23 pm
by Russ
Thank you tsw, it worked well, though with some really big problems.
Your version seems to use the Menu Title or Menu Text. I found a way to do this with the Page Alias instead (obviously you have to set the page alias in Admin), and this seemed to work better with submenus.
In function clean_url($my_ID) {
Find:
Code: Select all
$trail .= $onecontent->MenuText() . $delimiter;
and replace with
Code: Select all
$trail .= $onecontent->Alias() . $delimiter;
Whilst it works and is great, it does cause lots of problems:
+ Internal links using FCKeditor
+ function.cms_selflink.php
+ Problems with PiSearch
+ SiteMap
+ function.ImageGallery.php
etc...
And I guess anything else that uses menus or returns a link? I guess we will have to wait until all modules follow a standard way of using the menu hierarchy, unless anyone has a better idea?
Bit of a shame really...
Russ
Re: Clean Hierachical URL's
Posted: Fri Apr 07, 2006 3:51 pm
by sloop
I've developed something as well, which uses the content_alias field in the cms_content table. I'll get it in shape for sharing and post it at some point. What's nice about hierarchical URLs is that they lend themselves well to caching in the filesystem, and to using Apache's mod_rewrite module to serve them as static content when available, or calling CMS when they're not.
One issue I've noticed, though: if your image references are relative (i.e. no leading slash), they'll become 404s/broken images when the page is accessed via a hierarchical URL. The browser sees that the reference is relative and expects to find the "images" folder within the current "folder" of the current page.
That problem can be solved by adding a leading slash, *however* I like sites to be relocatable, to be able to run from a subdirectory instead of the root if necessary, so: rather than putting '/' in front of a local URL, I've been prefixing image and link SRC and HREF attributes with this kind of thing: {html_blob name="image_prefix"} or {html_blob name="site_prefix"}.
Then, if/when the site gets moved or copied (say to a portfolio subfolder), you adjust the blobs for the images and URLs and you're ready to go. Performance for dynamically rendered pages can be a bit slower with all those blobs, if not cached, so make sure you have CMS caching enabled.
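To make the relative-path problem concrete, here is a minimal sketch of how a browser resolves a simple relative reference against the page path. The helper name resolve_relative() is made up for illustration, and it assumes the page path starts with a slash and that the reference contains no "..", host or query string.

```php
<?php
// Hypothetical helper, for illustration only: mimics how a browser resolves
// a simple relative reference against the path of the current page.
function resolve_relative($pagePath, $ref)
{
    // A leading slash means the reference is already root-relative.
    if (substr($ref, 0, 1) === '/') {
        return $ref;
    }
    // Otherwise it is resolved against the "folder" part of the page path.
    $folder = substr($pagePath, 0, strrpos($pagePath, '/') + 1);
    return $folder . $ref;
}

// Flat URL: the image is found under the site root.
echo resolve_relative('/index.php', 'images/logo.png'), "\n";
// Hierarchical URL: the same reference now points into a "folder" that does not exist.
echo resolve_relative('/section/subsection/page.html', 'images/logo.png'), "\n";
// With a leading slash the page depth no longer matters.
echo resolve_relative('/section/subsection/page.html', '/images/logo.png'), "\n";
```

The second call yields /section/subsection/images/logo.png, which is the broken-image case described above.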
Sometime next week I'll post my code for hierarchical URLs and some mod_rewrite recipes as well.
-sloop
Re: Clean Hierachical URL's
Posted: Sat Apr 08, 2006 6:33 am
by Russ
I look forward to it, sloop, but I guess we would have to rewrite a fair few modules unless we can do it all with mod_rewrite?
Russ
Re: Clean Hierachical URL's
Posted: Sat Apr 08, 2006 3:05 pm
by sloop
Russ,
The changes were made to ./index.php, plus the addition of an include file. It handles only in-bound requests that are handed to it by a mod_rewrite rule. It works by splitting the URL on the slashes, and then searching the content table by the content_alias field, until it reaches the end of the URL. So, each piece of the URL corresponds to the unique content_alias name on the Content -> Pages list.
The rest of CMS behaves like it normally does. The way content is stored doesn't change. You can even put .html on the end to make it look like static content, and that's what I've done for a couple of sites.
The one thing I'd change is to build a lookup cache of the content_aliases, so it doesn't have to do queries each time. The sites I've built are being cached by an Akamai-like network (like squid), so most page hits are not going to CMS, unless it needs to be dynamic (like a form response.)
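Roughly, the lookup described above could be sketched like this. It is not the actual index.php change: the $rows array stands in for the cms_content table, the ids and aliases are invented, resolve_path() is a hypothetical name, and -1 as the "no parent" marker is an assumption. The sketch also checks that each alias hangs under the previous one, which is stricter than a lookup by unique alias alone.

```php
<?php
// Stand-in for the cms_content table; ids, aliases and parents are invented.
$rows = array(
    array('content_id' => 1, 'content_alias' => 'section',    'parent_id' => -1),
    array('content_id' => 2, 'content_alias' => 'subsection', 'parent_id' => 1),
    array('content_id' => 3, 'content_alias' => 'page',       'parent_id' => 2),
);

// Hypothetical resolver: returns the content id for a path, or false.
function resolve_path($uri, $rows)
{
    // Drop surrounding slashes and an optional .html suffix, then split.
    $uri = preg_replace('/\.html$/', '', trim($uri, '/'));
    if ($uri === '') {
        return false;
    }
    $parent = -1; // assumed marker for "no parent"
    foreach (explode('/', $uri) as $alias) {
        $found = false;
        foreach ($rows as $row) {
            if ($row['content_alias'] === $alias && $row['parent_id'] === $parent) {
                $found = $row['content_id'];
                break;
            }
        }
        if ($found === false) {
            return false; // a segment did not match: treat it as not found
        }
        $parent = $found;
    }
    return $parent;
}

var_dump(resolve_path('/section/subsection/page.html', $rows)); // int(3)
var_dump(resolve_path('/section/page.html', $rows));            // bool(false)
```

Loading the rows once into an array keyed by alias would be one way to get the lookup cache mentioned above, instead of one query per URL segment.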
Regards,
-sloop
Re: Clean Hierachical URL's
Posted: Wed Apr 12, 2006 6:32 pm
by sloop
I may have misunderstood your question, Russ. You're correct, if you want existing modules to produce hierarchical URLs, they'll need a code change to do that. Currently I've been using something like this:
Code: Select all
function makeHierURL( $content, $ext='html' ) {
    // Start with the page's own alias, then prepend each ancestor's alias.
    $path = '/' . $content->Alias();
    $content = ContentManager::LoadContentFromId( $content->ParentId() );
    while ($content) {
        $path = '/' . $content->Alias() . $path;
        $content = ContentManager::LoadContentFromId( $content->ParentId() );
    }
    return $path . '.' . $ext;
}
In the context of, say, the sitemap plugin, you'd drop the above function into the file function.sitemap.php, then change this line in the body...
Code: Select all
$menu .= "<li><a href=\"".$onecontent->GetURL()."\"";
...to read...
Code: Select all
$menu .= "<li><a href=\"".getFolderizedSitemapURL($onecontent)."\"";
Same for other plugins that use $content->GetURL(). Ultimately, a variation on the makeHierURL() function could be added as a method to the class ContentBase, and save some CPU time.
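As a rough idea of what the method variant might look like, here is a self-contained sketch. StubContent is only a stand-in, not the real ContentBase class, and the method name HierURL() is invented. The stub holds a reference to its parent object, so no lookup is needed per level; whether the real class could do the same is an assumption.

```php
<?php
// Stub standing in for a CMS content object; not the real ContentBase class.
class StubContent
{
    private $alias;
    private $parent;

    public function __construct($alias, $parent = null)
    {
        $this->alias = $alias;
        $this->parent = $parent;
    }

    public function Alias()
    {
        return $this->alias;
    }

    // Hypothetical method: walks up the parent chain, prepending each alias.
    public function HierURL($ext = 'html')
    {
        $path = '';
        for ($node = $this; $node !== null; $node = $node->parent) {
            $path = '/' . $node->alias . $path;
        }
        return $path . '.' . $ext;
    }
}

$section = new StubContent('section');
$subsection = new StubContent('subsection', $section);
$page = new StubContent('page', $subsection);

echo $page->HierURL(), "\n";    // /section/subsection/page.html
echo $section->HierURL(), "\n"; // /section.html
```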
Re: Clean Hierachical URL's
Posted: Thu Apr 13, 2006 5:25 am
by iNSiPiD
mod_rewrite does the job nicely but you do have to write a new rule for each module or URL form.
For a sampling take a look at the rewrite for standard page URLs here:
http://www.ashm.org.au
And then the rewrite for a jobs module here:
http://www.ashm.org.au/employment
I'd be happy to share the rules here.
Another trick for deeper hierarchies propounded by someone some time ago is to name your page aliases accordingly in conjunction with a base rewrite.
e.g. jobs/vacancies, resources/member-publications
Re: Clean Hierachical URL's
Posted: Thu Apr 13, 2006 8:42 am
by Russ
Hi iNSiPiD and sloop
iNSiPiD: I for one would like to see your mod_rewrite rules and how you handle different modules.
sloop: I can see what you mean, although I don't quite understand the 'getFolderizedSitemapURL' bit. I was expecting 'makeHierURL'?
Also, could we not put this function into lib/misc.functions.php?
Thanks to both of you for your help. I feel that with a bit more effort we can probably crack this.
Russ
Re: Clean Hierachical URL's
Posted: Thu Apr 13, 2006 2:46 pm
by sloop
Russ,
That's right, it should've been "makeHierURL(..)"...
Yes, that function could be put into lib/misc.functions.php. I'm unclear on how code gets added to the core project code, so I've been keeping it separate in my own plugin file versions for now.
Here are the basic rewrite rules I use in conjunction with a modified index.php that supports looking up content from hierarchical URLs.
Code: Select all
RewriteEngine On
# The URI exists as a file, so let Apache serve it from the filesystem
RewriteCond /var/www/mysite%{REQUEST_URI} -f
RewriteRule ^(.+)$ $1 [QSA,L]
# The URI doesn't exist as a file, and it ends in .html, so send it to the path resolver
RewriteCond /var/www/mysite%{REQUEST_URI} !-f
RewriteRule ^(.+)(\.html)?$ /index.php?the_uri=$1 [QSA,L]
The first rule checks to see if the URI already exists as a file under the indicated root folder, and just serves it. This passes through images and pages that are statically written to the filesystem. The second rule is triggered if the file doesn't exist, which would be the case for content in the CMS that hasn't been written out to the filesystem. My index.php has been modified to look up content by following the chain of parent-to-child content_aliases.
I'll release that code when I have time to clean it up. It works pretty well: it makes the whole site look like it's static files.
Re: Clean Hierachical URL's
Posted: Fri Apr 14, 2006 7:20 am
by Russ
Thanks sloop, I'm going to have a look at this over the bank holiday weekend if I get the chance, and perhaps completely wreck our test site, but hey, that is what they are for!
Actually... I may mouse out and do it locally.
I'll let you know how I get on.
Russ
Re: Clean Hierachical URL's
Posted: Sat Apr 15, 2006 2:44 am
by Ted
Ok, so I had a revelation today...
Why not use the base tag? Then all images can still be inserted from fck relatively and having deep urls won't matter.
So, I tested it today. It's output from {metadata} and works like a champ.
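For anyone wondering what that amounts to, here is a minimal sketch of emitting a base tag from a root URL. It is not the code behind {metadata}; base_tag() is a made-up helper and the root URL is just the local test address used in this thread.

```php
<?php
// Hypothetical helper: builds a <base> tag from the site's root URL.
function base_tag($rootUrl)
{
    // Ensure exactly one trailing slash so relative paths resolve under the root.
    $href = rtrim($rootUrl, '/') . '/';
    return '<base href="' . htmlspecialchars($href, ENT_QUOTES) . '" />';
}

echo base_tag('http://localhost/cms'), "\n";
// <base href="http://localhost/cms/" />
```

With that tag in the page head, a relative reference such as images/logo.png resolves against the base URL rather than against the depth of the current page URL.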
My first test was to try non-htaccess pretty URLs (http://localhost/cms/index.php/pagealias). Works perfectly with an image inserted with fck.
Next will be to go hierarchical. In fact, that's why I'm reading this post. I'm going to toss the code into GetURL() in the contentbase class and I'll let you all know how it goes.
Looking like it could be a next release thing, though...
Oh, and BTW, I think I'm done with my "break". Let's get to work.
Re: Clean Hierachical URL's
Posted: Sat Apr 15, 2006 3:22 am
by Ted
Success!
After modifying sloop's makeHierURL method to use hierarchy manager and a few other tweaks, I've had some great results.
Stuff like http://localhost/cms/index.php/navigati ... vleft_1col works great, even with images inserted with fck. Now I just need some definitives on mod_rewrite scripts to support this, and I think we're set.
Re: Clean Hierachical URL's
Posted: Sat Apr 15, 2006 10:51 am
by Russ
Well, a revelation indeed. I'll abandon my experiments and look forward to seeing your results, Ted. I think this is probably the last real thing on my list of core improvements for CMS, apart from a decent search plugin and those already in the pipeline and planning.
A better Easter present I could not have... well, perhaps there are one or two others I can think of that might be a smidgen better...
Congrats on another great leap forward.
Russ
Re: Clean Hierachical URL's
Posted: Sat Apr 15, 2006 1:39 pm
by Ted
I committed the changes to svn last night before I went to bed. All of the internal URL mechanism stuff is there, as well as a nice change to config.php. Everything from last night on will be considered 0.13, since I need the upgrade script run for the new config.php.
Only thing left is to get hierarchical urls working with mod_rewrite.