Multilanguage support in CMS: UTF8 Problem

Talk about writing modules and plugins for CMS Made Simple, or about specific core functionality. This board is for PHP programmers who are contributing to CMSMS, not for site developers.
Ted
Power Poster
Posts: 3329
Joined: Fri Jun 11, 2004 6:58 pm
Location: Fairless Hills, Pa USA

Re: Multilanguage support in CMS: UTF8 Problem

Post by Ted »

I would love to use UTF-8 across the board. It wouldn't cause much of an issue with the Western European languages, but it's going to cause problems with others, at least until operating systems start defaulting to UTF-8 in their input systems.

If there is an all-around better approach to handling this, I'd love to know.  I could be missing something.
mbvdk
Forum Members
Posts: 43
Joined: Wed Jun 08, 2005 3:30 pm

Re: Multilanguage support in CMS: UTF8 Problem

Post by mbvdk »

I agree that using UTF-8 across the board is the best approach; however, I don't think it should be hardcoded into CMSMS. Rather, it should be the default value set in config.php, and $nls['encoding']['...']= should be removed from the various language files. Finally, the encoding dropdown should be removed from the templates page.
Ted
Power Poster
Posts: 3329
Joined: Fri Jun 11, 2004 6:58 pm
Location: Fairless Hills, Pa USA

Re: Multilanguage support in CMS: UTF8 Problem

Post by Ted »

But the problem is that the translations themselves are not all in UTF-8. Most are in the local encoding. So people are going to switch languages in the admin and all of a sudden it's not going to work. We'd have to ensure that all translations are in UTF-8 only, and I'm not sure how that's going to happen without making the poor translators really have to mess with their systems...
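
For what it's worth, if the original encoding of a translation is known, re-encoding it to UTF-8 can be done in bulk rather than by each translator. Below is a minimal Python sketch of the idea, not CMSMS code: the helper name `to_utf8` and the sample Danish string are made up for illustration, and it assumes the source encoding (here ISO-8859-1) is known in advance.

```python
def to_utf8(raw: bytes, source_encoding: str) -> bytes:
    """Decode bytes from a known legacy encoding and re-encode them as UTF-8."""
    return raw.decode(source_encoding).encode("utf-8")


# Hypothetical sample: a Danish string as it would be stored
# in a language file saved as ISO-8859-1.
legacy = "Tilføj side".encode("iso-8859-1")

# Re-encode it; the text is unchanged, only the byte representation differs.
converted = to_utf8(legacy, "iso-8859-1")
```

The hard part is still knowing which encoding each file really uses; the conversion only works if that guess is right.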
mbvdk
Forum Members
Posts: 43
Joined: Wed Jun 08, 2005 3:30 pm

Re: Multilanguage support in CMS: UTF8 Problem

Post by mbvdk »

You sure have a point there. On the other hand, if someone wants to use more than one language file for some module, they have a problem unless they make sure to use different templates for pages in different languages. The safest approach would be to use HTML entities for non-English characters in language files.
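
To illustrate the entity idea, here is a small Python sketch (not CMSMS code; the function name `to_entities` and the sample string are invented for the example). It replaces every non-ASCII character with a numeric character reference, so the resulting string is pure ASCII and displays the same whatever encoding the template declares.

```python
import html


def to_entities(text: str) -> str:
    """Replace each non-ASCII character with a numeric reference such as &#248;."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


# Hypothetical sample string from a Danish language file.
encoded = to_entities("Tilføj side")

# html.unescape reverses the substitution and gives back the original text.
decoded = html.unescape(encoded)
```

Note that this only touches non-ASCII characters; markup characters such as & and < are left as they are.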

OK, this does not solve the problem of mixing Western and non-Western character sets, but that will probably create a host of other problems with CMSMS as well. If I remember correctly, most other CMSs I have tried use the same character encoding for the whole site. But then none of them offered the same ease of use as CMSMS :D