Alias and some international characters replacement

netfast

Re: Alias and some international characters replacement

Post by netfast »

I am running the latest version of fiji and noticed that the page alias doesn't allow '-'; it is overwritten by '_'.

I prefer '-' over '_' for SEO purposes. Search engines apparently prefer the '-' (according to a debate on an SEO forum where it was tested). The gain will be negligible, but a couple of sites I have set up use '-' rather than '_' in their page aliases, and I would be hesitant to convert those sites to a future CMSMS version until I find a fix.

So, reading what you have said here, I am not sure whether this is a minor bug or a decision to use '_' as the only replacement character from now on.

In the meantime I will look into it and see if I can find a fix.
netfast

Re: Alias and some international characters replacement

Post by netfast »

OK. In the meantime I have found a rough fix to get anyone started:

In lib/classes/class.content.inc.php, at about line 547 in the current fiji version.

Replace the existing code with the code below as a temporary fix. I would suggest adding an option for users to set their own replacement character (obviously only underscore or dash), or accepting either, in a future release. If I get a chance I will set it up properly so it can be controlled from config. I am just after this fix for now.


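//aj 13/03/2006 use dash instead of underscore as the default replacement for spaces
//        $alias = preg_replace("/[_-\W]+/", "_", $alias);
//        $alias = trim($alias, '_');

        // [\W_] matches the same characters as [_-\W], but also compiles on PCRE versions that reject a range ending in \W
        $alias = preg_replace("/[\W_]+/", "-", $alias);
        $alias = trim($alias, '-');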
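
The user-selectable replacement character suggested above could look roughly like this. It is only a sketch: the function name `build_alias` and its `$separator` parameter are hypothetical and not part of CMSMS, and the real code would read the setting from config.

```php
<?php
// Hypothetical helper, not CMSMS code: builds an alias with a
// caller-chosen separator ('-' or '_').
function build_alias($text, $separator = '-')
{
    // Only dash and underscore make sense here; fall back to dash otherwise.
    if ($separator !== '-' && $separator !== '_') {
        $separator = '-';
    }
    // Collapse each run of non-word characters and underscores into one separator.
    $alias = preg_replace('/[\W_]+/', $separator, $text);
    // Drop separators left over at either end.
    return trim($alias, $separator);
}

echo build_alias('About us'), "\n";      // About-us
echo build_alias('About us', '_'), "\n"; // About_us
```

Because the pattern ends in `+`, runs such as `a -- b` already come out as a single separator (`a-b`).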
radoado
Forum Members
Posts: 31
Joined: Fri Sep 09, 2005 12:33 pm
Location: Ostrava, Czech republic

Re: Alias and some international characters replacement

Post by radoado »

Patricia wrote: (this post is in the "Translations" sub-forum, as I cannot post attachments in General Discussion)

Hi there!

PAST:
As you may have seen, the content alias could until now be configured with auto_alias_content (which was the default); the alias was then built from the Menu Text you had entered in the "add a new content" page.

The alias is what is used in the URL of the page (alias or id), giving URLs like (the example page is Home): /index.php?page=Home,
or with mod_rewrite and an html extension: /Home.html
The URL must not contain accented characters; it's the page address.

This has led to an issue for content with "international characters" in the Menu Text: each such character in the alias was replaced by a hyphen (or underscore). For example, the Menu Text "Växjö" was transformed into the alias "V-xj-" (or "V_xj_"). The user then had to edit the page to set the alias manually to "Vaxjo", or set auto_alias_content to false in config.php so that the alias field always has to be filled in manually.

FUTURE:
For the next version, 0.12, to be released soon, the alias field is always present in the add/edit content page (under the "Options" tab), and will behave as follows:

- If you fill it in (or leave it empty, so it is built from the Menu Text) with accepted characters (basically a to z, 1 to 9, - and _), it will be accepted as is (the characters are only made lowercase and spaces are replaced by _).

- If you fill it in (or leave it empty, so it is built from the Menu Text) with accented characters (like éàèü etc.), it will look them up in a list of about 300 Unicode characters and replace each one with a given letter (see attachment).

- If a character in the Alias field (or in the Menu Text, when the Alias field is empty) is NOT found in the 300-character list, you will get a warning when you submit, informing you that you must fill in an alias with accepted characters. You can then add the desired alias manually.
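
The three rules above can be sketched roughly as follows. This is not the CMSMS implementation: the function name `alias_from_text` is made up, the six-entry table is a tiny stand-in for the real list of about 300 characters, returning `false` stands in for the warning, and the accepted set is assumed to include 0. The file must be saved as UTF-8.

```php
<?php
// Tiny stand-in for the real replacement list (about 300 entries).
$replacements = array(
    'ä' => 'a', 'ö' => 'o', 'é' => 'e', 'è' => 'e', 'à' => 'a', 'ü' => 'u',
);

// Hypothetical helper: returns the alias, or false when a character
// cannot be mapped (the case where a warning would be shown).
function alias_from_text($text, array $replacements)
{
    // Replace accented characters found in the table with plain letters.
    $alias = strtr($text, $replacements);
    // Make the result lowercase and turn spaces into underscores.
    $alias = str_replace(' ', '_', strtolower($alias));
    // Anything outside a-z, 0-9, '-' and '_' means "not in the list".
    if (!preg_match('/^[a-z0-9_-]+$/', $alias)) {
        return false;
    }
    return $alias;
}

echo alias_from_text('Växjö', $replacements), "\n";      // vaxjo
echo alias_from_text('Menu Text', $replacements), "\n";  // menu_text
var_dump(alias_from_text('Москва', $replacements));      // bool(false)
```

A real table would also need the uppercase accented letters, which this sketch leaves out.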

I NEED YOU FOR:

Below is the list of the characters replaced in the alias. To be sure they display correctly, I made an image (please scroll the image to the right to see it completely), as the encoding of this forum is iso-8859-1 and wouldn't display all those characters correctly, or identically for everybody.

In the first column you'll find the replacement letter and, beside it, the characters to be replaced by this letter. I do not know all languages ;) so I'm not exactly sure I chose the correct replacement letter. Please have a look and tell me if it's correct for the letters you know about. I have some doubts about the Icelandic and Slavic characters.

Note: Unfortunately, users of languages like Chinese, Korean, Japanese, Hebrew, Arabic, Russian, Greek, etc. will still have to fill in an "English-letters" alias manually for the URL. However, the Menu Text and title can be in those languages, as the letters do exist in Unicode.

Note 2: we chose to replace ö by o, not by oe (and likewise for similar digraphs). Remember this is only for the URL, not for content or menu. You wouldn't want "Vaexjoe" for "Växjö" (Westis said :) ), but rather "vaxjo". And you can always edit it manually if you prefer "muenchen" over "munchen", or "aalborg" over "alborg". A standard had to be chosen.

Note 3: you can also have a look at the replacement file here:
http://svn.cmsmadesimple.org/svn/cmsmad ... cement.php
BUT THEN be sure you force your browser to Unicode. In Firefox, it's under menu View->Character encoding->Unicode(UTF-8) and in Internet Explorer it's under View->Encoding->Unicode(UTF-8).

Ok! I hope my post is clear enough, it was a bit tricky :)
Thanks in advance for your comments.
Cheers
Patricia
I have added phonetic replacement characters for Cyrillic -> Latin.
Example: Советский Союз -> sovetskij sojuz
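
A phonetic mapping of this kind can be sketched with `strtr`. The table below is only illustrative: it covers just the letters needed for the example above, and the variable name `$cyrillic_to_latin` is made up, since the actual replacement table is not shown here. The file must be saved as UTF-8.

```php
<?php
// Partial, illustrative map: only the letters used in the example.
// One Cyrillic letter may map to several Latin letters (ю -> ju).
$cyrillic_to_latin = array(
    'С' => 's', 'с' => 's', 'о' => 'o', 'в' => 'v', 'е' => 'e',
    'т' => 't', 'к' => 'k', 'и' => 'i', 'й' => 'j', 'ю' => 'ju',
    'з' => 'z',
);

// strtr with an array replaces every key it finds and leaves the rest alone.
echo strtr('Советский Союз', $cyrillic_to_latin), "\n"; // sovetskij sojuz
```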



[attachment deleted by admin]
radoado
Forum Members
Posts: 31
Joined: Fri Sep 09, 2005 12:33 pm
Location: Ostrava, Czech republic

Re: Alias and some international characters replacement

Post by radoado »

netfast wrote: OK. In the meantime I have found a rough fix to get anyone started:

In lib/classes/class.content.inc.php, at about line 547 in the current fiji version.

Replace the existing code with the code below as a temporary fix. I would suggest adding an option for users to set their own replacement character (obviously only underscore or dash), or accepting either, in a future release. If I get a chance I will set it up properly so it can be controlled from config. I am just after this fix for now.


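//aj 13/03/2006 use dash instead of underscore as the default replacement for spaces
//        $alias = preg_replace("/[_-\W]+/", "_", $alias);
//        $alias = trim($alias, '_');

        // [\W_] matches the same characters as [_-\W], but also compiles on PCRE versions that reject a range ending in \W
        $alias = preg_replace("/[\W_]+/", "-", $alias);
        $alias = trim($alias, '-');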
... and add this code to replace multiple dashes with a single one:

$alias = preg_replace('/(-)+/', '-', $alias);