what's up with non-english characters / Umlauts?

Post by vioos »

Hello,

I am new to CMSMS and in general very happy with it :D

But I, and a lot of people from countries other than England, the USA, ..., have big problems with non-English characters like ä, ö, ü, ß and so on.
Could you please implement a method that automatically encodes these characters into their named entities or Unicode values in the "Menu Text", "Title", "Content" and "Head Tags" form fields, or better, in all form fields?
The "Page Alias" should have its own conversion table, like this one:


Ä -> AE
ä -> ae
ö -> oe
ß -> ss
Ö -> OE
Ü -> UE
€ -> euro
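
A conversion table like this one could be applied with a few lines of PHP. This is only a rough sketch, not code from CMS Made Simple: the function name make_page_alias() is made up for illustration, the input is assumed to be UTF-8, the 'ü' entry is added by analogy because the table above does not list it, and '€' is assumed to be the character that belongs to the "euro" entry.

```php
<?php
// Hypothetical helper: NOT part of CMS Made Simple.
function make_page_alias(string $title): string
{
    // Conversion table from the post above (assumes UTF-8 input).
    $table = [
        'Ä' => 'AE', 'ä' => 'ae',
        'Ö' => 'OE', 'ö' => 'oe',
        'Ü' => 'UE', 'ü' => 'ue',   // 'ü' is an assumption, added by analogy
        'ß' => 'ss', '€' => 'euro', // '€' is assumed for the "euro" entry
    ];

    // Replace every listed character with its ASCII spelling.
    $alias = strtr($title, $table);

    // Lowercase the ASCII letters.
    $alias = strtolower($alias);

    // Collapse every run of other characters into a single hyphen.
    $alias = preg_replace('/[^a-z0-9]+/', '-', $alias);

    // Drop hyphens left over at the start or end.
    return trim($alias, '-');
}

echo make_page_alias('Über uns: Größe & Maße'), "\n"; // prints "ueber-uns-groesse-masse"
```

Characters that are not in the table end up as hyphens here, so every further language would need its own entries.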

Post by Ted »

Well, you can set the encoding of the template by editing the template and changing the encoding to "ISO-8859-15" at the bottom, and it will handle German characters natively.

As far as doing the conversion on page alias (which HAS to be a valid URLable text), I guess we could do something like that, but there is no easy conversion.  We'd just have to start putting a huge list together for all of the different languages and encodings out there.

Post by vioos »

As you know, changing the encoding is not the best solution.

I wrote a very primitive function which replaces all non-English characters with their named entities:

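<?php
// Sample input: markup that must survive, plus non-English characters.
$text = "<a href=\"test\">I'm an example: ÄÖÜß</a>";
// Character => named-entity map, e.g. 'Ä' => '&Auml;'.
$arr = get_html_translation_table(HTML_ENTITIES);
// Remove the markup characters from the map so that html-tags will work:
unset($arr['<']);
unset($arr['>']);
unset($arr['"']);
// As posted; a plain space is not a key of the table, so this line has no effect.
unset($arr[' ']);
echo strtr($text, $arr);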
I think that, on the basis of this array, the characters could be translated into URL-able characters relatively quickly.
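
One way to read this idea, as a rough sketch only: many entity names start with the plain base letter ("&auml;" starts with "a"), so a character => ASCII map can be derived from the same table. The function name build_ascii_map() is made up for illustration, UTF-8 input is assumed, and 'ß' is handled by hand because "&szlig;" does not follow the pattern.

```php
<?php
// Hypothetical helper: derives an ASCII map from PHP's entity table.
function build_ascii_map(): array
{
    // Explicit flags and charset, so the result does not depend on PHP defaults.
    $entities = get_html_translation_table(HTML_ENTITIES, ENT_NOQUOTES, 'UTF-8');

    $map = [];
    foreach ($entities as $char => $entity) {
        // "&auml;" -> "a", "&Eacute;" -> "E", "&ntilde;" -> "n", ...
        if (preg_match('/^&([A-Za-z])(?:uml|acute|grave|circ|tilde|cedil|ring|slash);$/', $entity, $m)) {
            $map[$char] = $m[1];
        }
    }

    // "&szlig;" does not match the pattern above, so add it by hand.
    $map['ß'] = 'ss';

    return $map;
}

echo strtr('Größe für Çédille', build_ascii_map()), "\n"; // prints "Grosse fur Cedille"
```

Note that this gives "o" for "ö" and not "oe", so a German-style page alias would still need an explicit table on top of it.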

Btw, is there any reason why the text is encoded in "latin1_swedish_ci" and not in something international? I don't know if it affects anything, but I was wondering.

//edit: i forgot unset($arr[' ']);
Last edited by vioos on Sat Jun 04, 2005 8:56 pm, edited 1 time in total.

Post by nils73 »

Setting the character encoding alone won't do the trick. It would be better to have a conversion table that converts all special chars to their respective HTML entities. Coming back to Textpattern (and my question about a Textile module / option), there is a function called

function cmap()

in classTextile.php, where the conversion is done. It won't do any harm, and in countries where you need special chars and XHTML you are better off with this solution.
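
To illustrate the general idea of such a conversion, here is a rough sketch that turns every non-ASCII character into a numeric character reference. It is NOT the code of cmap() from classTextile.php; the function name to_numeric_entities() is made up, and the input is assumed to be valid UTF-8.

```php
<?php
// Hypothetical helper; this is NOT Textile's cmap().
function to_numeric_entities(string $text): string
{
    // The /u modifier makes the pattern match whole UTF-8 characters.
    $result = preg_replace_callback('/[^\x00-\x7F]/u', function (array $m): string {
        $bytes = array_values(unpack('C*', $m[0]));
        $count = count($bytes);

        // Strip the length-marker bits from the lead byte ...
        $code = $bytes[0] & (0xFF >> ($count + 1));

        // ... then append the six payload bits of each continuation byte.
        for ($i = 1; $i < $count; $i++) {
            $code = ($code << 6) | ($bytes[$i] & 0x3F);
        }

        return '&#' . $code . ';';
    }, $text);

    // preg_replace_callback() returns null on invalid UTF-8; keep the input then.
    return $result ?? $text;
}

echo to_numeric_entities('<p>ÄÖÜß</p>'), "\n"; // prints "<p>&#196;&#214;&#220;&#223;</p>"
```

Markup characters are plain ASCII, so tags pass through untouched, and the output displays the same no matter which encoding the page declares.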

Regards,
Nils