Clean Hierarchical URLs

Russ
Power Poster
Posts: 813
Joined: Fri Nov 25, 2005 5:02 pm
Location: North West England

Post by Russ »

Now that we have the new Menu Manager and hierarchy, would it be possible to generate clean hierarchical URLs?
e.g. www.domain.com/section/subsection/page.html

I've had a look at the Menu Manager code, but it is way beyond me ;-)
Any ideas? I think this would be a really nice feature, especially if we could get it into a Google sitemap.

Russ
tsw
Power Poster
Posts: 1408
Joined: Tue Dec 13, 2005 10:50 pm
Location: Finland

Post by tsw »

I wrote a quick and dirty function for this some time ago.

In lib/misc.functions.php, add this code at the bottom:

Code: Select all

function clean_url($my_ID) {
  global $config;

  $trail = "";
  $delimiter = "/";

  // Collect the page and all of its ancestors, starting with the page itself.
  $allcontent = array();

  $onecontent = ContentManager::LoadContentFromId($my_ID, false);

  if ($onecontent !== FALSE)
    {
      array_push($allcontent, $onecontent);

      while ($onecontent->ParentId() > 0)
        {
          $onecontent = ContentManager::LoadContentFromId($onecontent->ParentId(), false);
          if ($onecontent === FALSE)
            {
              break;
            }
          array_push($allcontent, $onecontent);
        }

      // Walk from the top-level ancestor down to the page itself.
      $first=1;
      $skip=0;
      while ($onecontent = array_pop($allcontent))
        {
          // Content with a 'target' property set keeps its own URL.
          if ($onecontent->HasProperty('target') && $onecontent->GetPropertyValue('target') != '') {
            $trail = $onecontent->getURL();
            $skip=1;
            continue;
          }
          if($first) {
            // Start from the top-level URL with the page extension stripped.
            $trail = $onecontent->getURL();
            $strip = strlen($config["page_extension"]);
            if ($strip > 0) {
              $trail = substr($trail, 0, -$strip);
            }
            $trail .= $delimiter;
            $first=0;
          } else {
            $trail .= $onecontent->MenuText() . $delimiter;
          }
        }
      if ($skip==0) {
        // Replace the trailing delimiter with the page extension.
        $trail = substr($trail, 0, -1);
        $trail .= $config["page_extension"];
      }
      return $trail;
    }
}
In modules/MenuManager/MenuManager.module.php, in the function

Code: Select all

function & FillNode(&$content, &$node, &$nodelist, &$gCms, &$count, &$prevdepth, $origdepth)
add this line after "$onenode->id = $content->Id();"  (should be around line 186)

Code: Select all

$onenode->CleanUrl = clean_url($content->Id());
After this you can use "{$node->CleanUrl}" in your Menu Manager code.

And here's the rewrite rule:

Code: Select all

RewriteCond %{REQUEST_FILENAME} !-f [NC]
RewriteRule ^(.*/)?([a-zA-Z0-9\.]+)\.html$ index.php?page=$2 [QSA]

NOTE: This code won't work right in some situations, and I'll try to get something similar into 1.0.
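
To see what the rewrite rule above hands to index.php, the same pattern can be tried in PHP with preg_match. This is only an illustrative sketch: page_from_path() is a made-up helper, not part of CMSMS, and it assumes the path is given without a leading slash, as in an .htaccess context.

```php
<?php
// Hypothetical helper (not part of CMSMS): mimics the RewriteRule pattern
// to show which part of the path ends up as the "page" parameter.
function page_from_path($path) {
  // Same pattern as the RewriteRule: optional directories, then alias + ".html".
  if (preg_match('#^(.*/)?([a-zA-Z0-9\.]+)\.html$#', $path, $m)) {
    return $m[2];
  }
  return null; // the rule would not match this path
}

var_dump(page_from_path('section/subsection/page.html')); // "page"
var_dump(page_from_path('page.html'));                     // "page"
var_dump(page_from_path('section/my-page.html'));          // NULL
```

Only the last path segment reaches index.php, so the directories in front of it are purely cosmetic with this rule. The third call returns NULL because "-" is not in the character class, which may matter for aliases that contain hyphens or underscores.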

Post by Russ »

Thank you tsw, it worked well, though with some really big problems.
Your version seems to use the Menu Text. I found a way to do this with the Page Alias instead (obviously you have to set the page alias in Admin), and it seemed to work better with submenus.

In function clean_url($my_ID) {
Find:

Code: Select all

$trail .= $onecontent->MenuText() . $delimiter;
and replace with

Code: Select all

$trail .= $onecontent->Alias() . $delimiter;
Whilst it works and is great, it does cause lots of problems with:
+ Internal links using FCKeditor
+ function.cms_selflink.php
+ PiSearch
+ SiteMap
+ function.ImageGallery.php
etc...
And, I guess, anything else that uses menus or returns a link? I guess we will have to wait until all modules follow a standard way of using the menu hierarchy, unless anyone has a better idea?

Bit of a shame really...
Russ
Last edited by Russ on Thu Apr 06, 2006 2:25 pm, edited 1 time in total.

Post by sloop »

I've developed something as well, which uses the content_alias field in the cms_content table.  I'll get it in shape for sharing and post it at some point.  What's nice about hierarchical URLs is that they lend themselves well to caching in the filesystem, and to using Apache's mod_rewrite module to serve them as static content when available, or calling CMS when they're not.

One issue I've noticed, though: if your image references are relative (i.e. no leading slash), they'll become 404s/broken images when the page is accessed via a hierarchical URL.  The browser sees that the reference is relative and expects to find the "images" folder within the current "folder" of the current page.

That problem can be solved by adding a leading slash, *however* I like sites to be relocatable, to be able to run from a subdirectory instead of the root if necessary, so: rather than putting '/' in front of a local URL, I've been prefixing image and link SRC and HREF attributes with this kind of thing: {html_blob name="image_prefix"} or {html_blob name="site_prefix"}.

Then, if/when the site gets moved or copied (say to a portfolio subfolder), you adjust the blobs for the images and URLs and you're ready to go.  Performance for dynamically rendered pages can be a bit slower with all those blobs, if not cached, so make sure you have CMS caching enabled.
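
The relative-reference problem can be illustrated with a small PHP sketch. It only approximates what a browser does for the simplest case (no "../", no query string), and resolve_reference() is a hypothetical name used for illustration.

```php
<?php
// Illustrative only: approximates how a browser resolves a simple relative
// reference against the path of the current page.
function resolve_reference($pagePath, $ref) {
  if (substr($ref, 0, 1) === '/') {
    return $ref; // root-relative: the page's own "folder" is ignored
  }
  $slash = strrpos($pagePath, '/');
  $folder = ($slash === false) ? '/' : substr($pagePath, 0, $slash + 1);
  return $folder . $ref;
}

echo resolve_reference('/index.php', 'images/logo.gif'), "\n";
// /images/logo.gif
echo resolve_reference('/section/subsection/page.html', 'images/logo.gif'), "\n";
// /section/subsection/images/logo.gif
echo resolve_reference('/section/subsection/page.html', '/images/logo.gif'), "\n";
// /images/logo.gif
```

The second call shows the broken case: the same reference that works from index.php now points into a "folder" that does not exist. A prefix blob that expands to a leading slash (or to a subdirectory path) turns it into the third case.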

Sometime next week I'll post my code for hierarchical URLs and some mod_rewrite recipes as well.

-sloop

Post by Russ »

I look forward to it sloop, but I guess we would have to rewrite a fair few modules, unless we can do it all with mod_rewrite?

Russ

Post by sloop »

Russ,

The changes were made to ./index.php, plus the addition of an include file.  It handles only in-bound requests that are handed to it by a mod_rewrite rule.  It works by splitting the URL on the slashes, and then searching the content table by the content_alias field, until it reaches the end of the URL.  So, each piece of the URL corresponds to the unique content_alias name on the Content -> Pages list.

The rest of CMS behaves like it normally does.  The way content is stored doesn't change.  You can even put .html on the end to make it look like static content, and that's what I've done for a couple of sites.

The one thing I'd change is to build a lookup cache of the content_aliases, so it doesn't have to do queries each time.  The sites I've built are being cached by an Akamai-like network (like squid), so most page hits are not going to CMS, unless it needs to be dynamic (like a form response.)
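
A rough sketch of that lookup, using an in-memory alias cache instead of a query per segment, might look like the following. Everything here is an assumption made for illustration: the shape of $aliasCache, the name resolve_hier_path(), and the use of -1 as the parent id of top-level pages. It is not the modified index.php described above.

```php
<?php
// Hypothetical stand-in for a cache built once from the cms_content table:
// content_alias => array(id, parent). Aliases are assumed to be unique.
$aliasCache = array(
  'section'    => array('id' => 10, 'parent' => -1),
  'subsection' => array('id' => 11, 'parent' => 10),
  'page'       => array('id' => 12, 'parent' => 11),
);

// Walks the URL one segment at a time and returns the content id of the last
// segment, or false if an alias is unknown or not a child of the previous one.
function resolve_hier_path($uri, $aliasCache) {
  $uri = preg_replace('/\.html$/', '', trim($uri, '/'));
  if ($uri === '') {
    return false;
  }
  $parentId = -1; // assumption: top-level pages have a parent id of -1
  $id = false;
  foreach (explode('/', $uri) as $alias) {
    if (!isset($aliasCache[$alias]) || $aliasCache[$alias]['parent'] != $parentId) {
      return false;
    }
    $id = $aliasCache[$alias]['id'];
    $parentId = $id;
  }
  return $id;
}

var_dump(resolve_hier_path('/section/subsection/page.html', $aliasCache)); // int(12)
var_dump(resolve_hier_path('/section/page.html', $aliasCache));            // bool(false)
```

Checking the parent at each step means a URL only resolves when its segments follow the real hierarchy, at the cost of rejecting "shortcut" paths such as the second call.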

Regards,
-sloop

Post by sloop »

I may have misunderstood your question, Russ.  You're correct, if you want existing modules to produce hierarchical URLs, they'll need a code change to do that.  Currently I've been using something like this:

Code: Select all

function makeHierURL( $content, $ext='html' ) {
  // Start with the page's own alias, then prepend each ancestor's alias.
  $path = '/'.$content->Alias();
  while ($content->ParentId() > 0) {
    $content = ContentManager::LoadContentFromId( $content->ParentId() );
    if (!$content) {
      break;
    }
    $path = '/'.$content->Alias().$path;
  }
  return $path.'.'.$ext;
}
In the context of, say, the sitemap plugin, you'd drop the above function into the file function.sitemap.php, then change this line in the body...

Code: Select all

         $menu .= "<li><a href=\"".$onecontent->GetURL()."\"";
...to read...

Code: Select all

         $menu .= "<li><a href=\"".getFolderizedSitemapURL($onecontent)."\"";
Same for other plugins that use $content->GetURL().  Ultimately, a variation on the makeHierURL() function could be added as a method to the class ContentBase, and save some CPU time.
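
As a sketch of the method variant, here is a self-contained example with a stand-in class. DemoContent, its registry, and HierURL() are hypothetical names for illustration; the real ContentBase class is much larger and loads its data differently.

```php
<?php
// Stand-in for a content object; DemoContent and its registry are
// hypothetical and exist only to make this example runnable.
class DemoContent {
  private static $registry = array();
  private $id;
  private $parentId;
  private $alias;

  function __construct($id, $parentId, $alias) {
    $this->id = $id;
    $this->parentId = $parentId;
    $this->alias = $alias;
    self::$registry[$id] = $this;
  }
  function Alias() { return $this->alias; }
  function ParentId() { return $this->parentId; }

  // The method variant: builds "/grandparent/parent/alias.ext".
  function HierURL($ext = 'html') {
    $path = '/' . $this->alias;
    $node = $this;
    while (isset(self::$registry[$node->ParentId()])) {
      $node = self::$registry[$node->ParentId()];
      $path = '/' . $node->Alias() . $path;
    }
    return $path . '.' . $ext;
  }
}

new DemoContent(10, -1, 'section');
new DemoContent(11, 10, 'subsection');
$page = new DemoContent(12, 11, 'page');
echo $page->HierURL(), "\n"; // /section/subsection/page.html
```

Here the ancestors come from objects that are already in memory, which is one way a method on the content class could avoid reloading each parent.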

Post by iNSiPiD »

mod_rewrite does the job nicely but you do have to write a new rule for each module or URL form.

For a sampling take a look at the rewrite for standard page URLs here:

http://www.ashm.org.au

And then the rewrite for a jobs module here:

http://www.ashm.org.au/employment

I'd be happy to share the rules here.

Another trick for deeper hierarchies propounded by someone some time ago is to name your page aliases accordingly in conjunction with a base rewrite.

e.g. jobs/vacancies, resources/member-publications
Last edited by iNSiPiD on Thu Apr 13, 2006 5:27 am, edited 1 time in total.

Post by Russ »

Hi iNSiPiD and sloop :)

iNSiPiD: I for one would like to see your mod_rewrite rules and how you handle different modules.

sloop: I can see what you mean, although I don't quite understand the 'getFolderizedSitemapURL' bit; I was expecting 'makeHierURL'?

Also, could we not put this function into lib/misc.functions.php?

Thanks to both of you for your help. I feel that with a bit more effort we can probably crack this :)

Russ

Post by sloop »

Russ,

That's right, it should've been "makeHierURL(..)"...

Yes, that function could be put into lib/misc.functions.php.  I'm unclear on how code gets added to the core project code, so I've been keeping it separate in my own plugin file versions for now.

Here are the basic rewrite rules I use in conjunction with a modified index.php that supports looking up content from hierarchical URLs.

Code: Select all

RewriteEngine On

# The URI exists as a file, so let Apache serve it from the filesystem
RewriteCond /var/www/mysite%{REQUEST_URI} -f
RewriteRule ^(.+)$ $1 [QSA,L]

# The URI doesn't exist as a file, and it ends in .html, so send it to the path resolver
RewriteCond /var/www/mysite%{REQUEST_URI} !-f
RewriteRule ^(.+)(\.html)?$ /index.php?the_uri=$1 [QSA,L]
The first rule checks to see if the URI already exists as a file under the indicated root folder, and just serves it.  This passes through images and pages that are statically written to the filesystem.  The second rule is triggered if the file doesn't exist, which would be the case for content in the CMS that hasn't been written out to the filesystem.  My index.php has been modified to look up content by following the chain of parent-to-child content_aliases.
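
One detail of the second rule's pattern can be checked with PHP's PCRE functions: because (.+) is greedy, the optional (\.html)? group never captures anything, so the_uri still carries the .html suffix and the receiving code has to strip it. The helper below is a hypothetical sketch of that receiving side, not the modified index.php itself.

```php
<?php
// Same pattern as the second RewriteRule, tried with preg_match.
preg_match('#^(.+)(\.html)?$#', 'section/subsection/page.html', $m);
var_dump($m[1]); // "section/subsection/page.html": the greedy (.+) keeps the suffix

// Hypothetical helper for the index.php side: drop an optional ".html"
// and split the rest of the URI into content aliases.
function uri_to_aliases($theUri) {
  $theUri = preg_replace('#\.html$#', '', trim($theUri, '/'));
  return ($theUri === '') ? array() : explode('/', $theUri);
}

print_r(uri_to_aliases('/section/subsection/page.html'));
// section, subsection, page
```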

I'll release that code when I have time to clean it up.  It works pretty well: it makes the whole site look like it's static files.

Post by Russ »

Thanks sloop, I'm going to have a look at this over the bank holiday weekend if I get the chance - perhaps completely wreck our test site ;-) but hey, that is what they are for!

Actually ... I may mouse out and do it locally :)

I'll let you know how I get on.

Russ
Ted
Power Poster
Posts: 3329
Joined: Fri Jun 11, 2004 6:58 pm
Location: Fairless Hills, Pa USA

Post by Ted »

Ok, so I had a revelation today...

Why not use the base tag?  Then all images can still be inserted from fck relatively and having deep urls won't matter.

So, I tested it today.  It's output from {metadata} and works like a champ.

My first test was to try non-htaccess pretty urls (http://localhost/cms/index.php/pagealias).  Works perfectly with an image inserted with fck.
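
For anyone unfamiliar with the base tag, here is a minimal sketch of what such output could look like. base_tag() is a hypothetical helper, not the actual {metadata} code, and the root URL is just the localhost example from this post.

```php
<?php
// Hypothetical helper: builds a <base> tag from the site's root URL so that
// relative references resolve against the site root, not the current page path.
function base_tag($rootUrl) {
  $href = rtrim($rootUrl, '/') . '/';
  return '<base href="' . htmlspecialchars($href, ENT_QUOTES) . '" />';
}

echo base_tag('http://localhost/cms'), "\n";
// <base href="http://localhost/cms/" />
```

With that tag in the page head, a reference like images/logo.gif resolves to http://localhost/cms/images/logo.gif no matter how deep the page URL is.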

Next will be to go hierarchical.  In fact, that's why I'm reading this post.  I'm going to toss the code into GetURL() in the contentbase class and I'll let you all know how it goes.

Looking like it could be a next release thing, though...



Oh, and BTW, I think I'm done my "break".  Let's get to work.  ;)

Post by Ted »

Success!

After modifying sloop's makeHierURL method to use hierarchy manager and a few other tweaks, I've had some great results.

Stuff like http://localhost/cms/index.php/navigati ... vleft_1col works great, even with images inserted with fck.  Now I just need some definitives on mod_rewrite scripts to support this, and I think we're set.

Post by Russ »

Well - a revelation indeed. I'll abandon my experiments and look forward to seeing your results, Ted. I think this is probably the last real thing on my list of core improvements for CMS, apart from a decent search plugin and those already in the pipeline and planning.

A better Easter present I could not have ;-) well, perhaps there are one or two others I can think of that might be a smidgen better...

Congrats on another great leap forward.

Russ

Post by Ted »

I committed the changes to svn last night before I went to bed.  All of the internal URL mechanism stuff is there, as well as a nice change to config.php.  Everything from last night on will be considered 0.13, since I need the upgrade script run for the new config.php.

Only thing left is to get hierarchical urls working with mod_rewrite.