Question about the transform sql
Hi Alby,
one additional question:
Why did you set the charset for the new columns explicitly to 'utf8'?
At the moment I'm using the above example SQL to transform CMSMS to MLE.
(Of course, it would be a good idea to really convert the db to utf8.)
Regards,
Carsten
- Attachments: two .txt files, no longer available
Re: Question about the transform sql
Wiedmann wrote: Why did you set the charset for the new columns explicit to 'utf8'?
People who run multilingual sites generally have to use UTF-8, so this sets the db to use it instead of latin1 with Swedish collation (the MySQL default). It does not matter, though, as long as no strange characters appear.
Alby
Re: Question about the transform sql
Alby wrote: Generally people who use multilingual must use UTF-8
Right, and CMSMS is already using UTF-8.
But:
CMSMS itself has a really strange misbehaviour when using UTF-8 in databases, because it connects to the db with latin1. All UTF-8 chars are stored in the db as a sequence of 1, 2 or 3 latin1 chars, regardless of whether the db is set up as latin1 or utf-8.
If you want to verify this:
DB setup with latin1:
Just make a content page with e.g. the German word "König". The word looks fine in the browser. Now look at the db with phpMyAdmin. What can you see?
In a second test change the db to utf-8:
Now the same test as above. What can you see: Right... the same.
In a third test:
Change the content of the UTF-8 db with phpMyAdmin and put some valid utf-8 chars in it (e.g. change König to Müller). What can you see in your browser if you access your website?
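The effect behind these three tests can be reproduced without a database. The following is a minimal PHP sketch, assuming the mbstring extension is available. The function name `store_via_latin1_connection` is made up for illustration; it only imitates what MySQL does when a client sends UTF-8 bytes over a connection declared as latin1 into a utf8 column. ISO-8859-1 stands in for MySQL's latin1 here, which is close to it but not identical (MySQL's latin1 is based on cp1252).

```php
<?php
// Hypothetical helper: imitates MySQL converting "latin1" client bytes
// into a utf8 column. Every input byte is treated as one latin1 character.
function store_via_latin1_connection(string $bytesFromClient): string
{
    return mb_convert_encoding($bytesFromClient, 'UTF-8', 'ISO-8859-1');
}

$word   = "K\xC3\xB6nig";                      // "König" as real UTF-8 (6 bytes)
$stored = store_via_latin1_connection($word);  // what lands in a utf8 column

echo bin2hex($word), "\n";    // 4bc3b66e6967
echo bin2hex($stored), "\n";  // 4bc383c2b66e6967, shown as "KÃ¶nig" by a correct client
```

With a latin1 column the bytes would presumably be stored unchanged, which would explain why the page still looks fine as long as CMSMS reads them back over the same latin1 connection.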
At the moment it makes no sense to set only these (MLE) columns to utf-8.
Of course, the way CMSMS is using charsets in the db should be corrected:
- you can't really change utf-8 content with a 3rd.party program like phpMyAdmin.
- if the default charset of the server is utf-8 before you install CMSMS, and you install CMSMS, not all indexes will be generated, but you have no error message about this situation.
- And of course, no natural language sorting of the db will work at the moment.
(I think most of the cmsms devs are us-(ascii) people. And thus they don't have this problem )
Last edited by Wiedmann on Mon Apr 21, 2008 8:55 am, edited 1 time in total.
Re: Question about the transform sql
Wiedmann wrote: Of course, the way CMSMS is using charsets in the db should be corrected:
- you can't really change utf-8 content with a 3rd.party program like phpMyAdmin.
- if the default charset of the server is utf-8 before you install CMSMS, and you install CMSMS, not all indexes will be generated, but you have no error message about this situation.
- And of course, no natural language sorting of the db will work at the moment.
(I think, most of the cmsms devs are us-(ascii) people. And thus they don't have this problem )
Yes (most of the cmsms devs are us-(ascii)) and no (you really can change utf-8 content with a third-party program), but there are many related problems .... the most important one is intrinsic to PHP: the default everywhere is ISO-8859-1, not UTF-8
In one of my scripts I must use htmlspecialchars($query, ENT_QUOTES, 'UTF-8') when inserting into or displaying from the DB
And if you then use an editor that translates into HTML entities .....
UTF-8 is not a panacea for everything and should be evaluated case by case: what if a person does not want utf8 because, for example, they use languages covered by ISO-8859-15?
Alby
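The point about the charset argument can be illustrated with a short PHP sketch. The sample string is made up. The third argument names the encoding of the input, and passing it explicitly avoids depending on the default of the PHP version in use (ISO-8859-1 before PHP 5.4).

```php
<?php
// Hypothetical input: markup, a single quote and a non-ASCII letter (UTF-8).
$input = "<b>K\xC3\xB6nig's</b>";

// Escape only the HTML-special characters; tell PHP the input is UTF-8.
$safe = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

echo $safe, "\n";   // &lt;b&gt;König&#039;s&lt;/b&gt;
```

Note that the "ö" is left alone: htmlspecialchars() only touches &, <, > and the quote characters, so no entities are produced for ordinary letters.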
Re: Question about the transform sql
Alby wrote: and no (you can really change utf-8 content with a 3rd.party program) .... the most important is intrinsic with php: default is ALL ISO-8859-1 and not UTF8
PHP is not the problem. But CMSMS is not using "SET NAMES utf8" in the db connection, although it uses utf8 for input/output.
Alby wrote: UTF-8 is not the panacea for everything and should be evaluated case by case, if a person does not want utf8 because, for example, use languages ISO-8859-15?
That's not a problem. With a correct "SET NAMES" the db makes sure the client gets the correct chars. Thus you can output iso-8859-15 to the client but store the data in utf-8 tables. Or output utf-8 but store the text in columns which have a different charset for each language.
Well, the easiest thing is really to use utf-8 in both output and db. And this works with cmsms, after a small test... Of course, just setting the charset in the db to utf-8 is not enough if you transform a working cmsms. In this case you must first transform the wrong latin1(utf-8) chars to real utf-8 chars. But with a little script this is an easy thing
Alby wrote: In a my script I must use: htmlspecialchars($query, ENT_QUOTES, 'UTF-8')
Is this not also normal? IMHO that's a basic thing if you output text from a db.... (BTW: afaik there is a cmsms internal function for this) (Of course, no one should really store entities in the db...)
(But in summary: CMSMS MLE also works if the columns are latin1 or whatever, because cmsms doesn't use charsets in the db.)
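The "little script" mentioned above is not shown in the thread. As a rough idea of its core step, here is a hedged PHP sketch that reverses one round of double encoding for a single string. It assumes the text sits in a utf8 column and was written through a latin1 connection. `repair_double_encoded` is a hypothetical name, not a CMSMS function, and ISO-8859-1 is used as a stand-in for MySQL's cp1252-based latin1, so characters whose bytes fall in the range 0x80 to 0x9F would need extra care.

```php
<?php
// Hypothetical helper: undo one round of latin1 -> utf8 double encoding.
// Returns the input unchanged when the reversal would lose data or would
// not produce valid UTF-8 (i.e. the string was probably fine already).
function repair_double_encoded(string $stored): string
{
    $candidate = mb_convert_encoding($stored, 'ISO-8859-1', 'UTF-8');
    $lossless  = mb_convert_encoding($candidate, 'UTF-8', 'ISO-8859-1') === $stored;

    return ($lossless && mb_check_encoding($candidate, 'UTF-8')) ? $candidate : $stored;
}

$broken = "K\xC3\x83\xC2\xB6nig";            // "KÃ¶nig", as stored by the wrong connection
echo repair_double_encoded($broken), "\n";    // König
```

A real conversion would also have to walk over all tables and text columns and should only be run on a backup first; the check above is a heuristic and cannot tell a damaged string from one that legitimately contains "Ã¶".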
Re: Question about the transform sql
Wiedmann wrote: PHP is not the problem. But CMSMS is not using "SET NAMES utf8" in the db connection, although they use utf8 for out/input.
I wanted to say that PHP is a problem for general installations. PHP 5 is OK, but PHP 4 has problems (a mysql_fetch bug in a few versions of 4.4)
Wiedmann wrote: That's not a problem. With a correct "set names" the db make sure the client have the correct chars. Thus you can output iso-8859-15 to the client, but store the data in utf-8 tables. Or output utf-8, but store the text in columns which have different charsets for each language.
Well, the easiest thing is really using utf-8 in output and db. And this works with cmsms after a small test... Of course, just setting the charset in db to utf-8 is not enough, if you transform a working cmsms. In this case you must first transform the wrong latin1(utf-8) chars to real utf-8 chars. But with a little script this is a easy thing
Again, for general installations: "set names" (if you want, uncomment the query in include.php) implies that you must have MySQL 4.1 for it to work well, but until a month ago I had two MLE sites on MySQL 3.23.58!!
Wiedmann wrote: Is this not also normal? IMHO that's a basic thing if you output text from a db.... (BTW: afaik there is a cmsms internal function for this) (Of course, no one should really store entities in the db...)
(But in the summary: CMSMS MLE works also, if the columns are latin1 or whatever, because cmsms don't use charsets in the db.)
It was a bad example on my part, but the point was that you will still have problems with the WYSIWYG editor.
My personal experience is:
- If I have access to mysql/apache resources:
  - my.cnf:
    [mysqld]
    collation_server=utf8_unicode_ci
    character_set_server=utf8
    [client]
    default-character-set=utf8
  - htaccess or httpd.conf:
    AddDefaultCharset UTF-8
- No access to mysql/apache resources: php query:
  SET NAMES utf8 or SET CHARACTER_SET utf8
- database/table/columns text: UTF8
- header('Content-Type: text/html; charset=utf-8' );
Alby
Re: Question about the transform sql
Alby wrote: (mysql_fetch bug in few version of 4.4)
Can you explain this in more detail?
--> I've written some of the charset code for the mysql(i) drivers for MDB2, and I don't think we have any PHP 4-specific bug reports regarding charsets.
Of course, if you store data in the db with an application like phpMyAdmin (which handles charsets correctly) and then retrieve the data in PHP with a wrong client encoding, the result is not correct.
--> The same happens with cmsms at the moment.
Alby wrote: Again, for general installations "set names" (if you want uncomment query in include.php) implies that you must have mysql 4.1 to work well, but until a month ago I had two sites MLE with mysql 3.23.58!!
That's also no problem, because mysql in your (x)html output.
Alby wrote: - No access to mysql/apache resources: php query: SET NAMES utf8 or SET CHARACTER_SET utf8
What do you mean by "no access"?
BTW:
"SET CHARACTER SET" is most of the time not what you want: it sets the connection charset to that of the default database rather than to the charset you name. "SET NAMES" (or better, mysql(i)_set_charset) is the correct way.
Alby wrote: - database/table/columns text: UTF8
- header('Content-Type: text/html; charset=utf-8' );
That's what I also have. But as I've written above:
For the DB this only makes sense if you use "SET NAMES 'utf8'" and also correct the wrong chars in the db and adjust one index.
If you don't use "SET NAMES", better use "latin1" in the db (and save space).
That works without problems for the base cmsms. If you install additional modules, you can run into a problem if only your tables have utf8 as default charset but the db does not.
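As a hedged illustration of the "SET NAMES" / mysql(i)_set_charset recommendation in this post, the following PHP sketch sets the connection charset with mysqli right after connecting. Host, user, password and database name are placeholders, and `connect_utf8` is a hypothetical helper, not part of CMSMS or ADOdb.

```php
<?php
// Hypothetical helper: open a mysqli connection and declare that the client
// sends and expects UTF-8 (comparable to issuing "SET NAMES utf8").
function connect_utf8(string $host, string $user, string $pass, string $name): mysqli
{
    $db = new mysqli($host, $user, $pass, $name);
    if ($db->connect_errno) {
        throw new RuntimeException('Connect failed: ' . $db->connect_error);
    }
    if (!$db->set_charset('utf8')) {
        throw new RuntimeException('Could not set charset: ' . $db->error);
    }
    return $db;
}

// Usage with placeholder credentials:
// $db = connect_utf8('localhost', 'cms_user', 'secret', 'cmsms');
// echo $db->character_set_name();   // reports the connection charset
```

As noted above, switching the connection charset on an existing installation only makes sense after the wrongly stored chars in the db have been corrected.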
Last edited by Wiedmann on Tue Apr 22, 2008 5:01 am, edited 1 time in total.
Re: Question about the transform sql
Wiedmann wrote: Can you explain this in more detail? (mysql_fetch bug in few version of 4.4)
--> I've written some of the charset code for the mysql(i) drivers for MDB2, and I don't think we have special PHP4 bug reports regarding charsets.
WOW
I remember a problem, but I could not find it now; I only found this
Wiedmann wrote: (BTW: The prerequisite for cmsms is MySQL 4.1)
No, the requirement for < 2.0 is MySQL 3.23
Wiedmann wrote: And about enabling "set names" in include.php. Just enabling this is also wrong. First you must correct all wrong chars in the db.
I agree
Wiedmann wrote:
[client]
default-character-set=utf8
You remember: PHP doesn't use this.
True, it is because I use the mysql programs for backup/check
Wiedmann wrote: What do you mean by "no access"?
No access to my.cnf, and httpd.conf has AllowOverride None
Why don't you take a look at the 2.0 code?
Alby
Re: Question about the transform sql
Alby wrote: but I could not find it now, I only found this
The old problem: a MySQL utf-8 table with correct utf-8 chars (created with phpMyAdmin or MySQL Query Browser or...), but the PHP script connects with latin1. They should read up on how MySQL handles charsets
Alby wrote: No, requirement for < 2.0 is MySQL 3.23
Oops, my fault. I thought I had read this somewhere.
Wiedmann wrote: First you must correct all wrong chars in the db.
Alby wrote: I agree
If someone (or you) is interested: I've attached a script to convert all tables correctly to utf-8.
- just put this script in a subdir of your cmsms root and execute it (browser or shell)
- after that, apply the patch to adodb.functions.php to enable utf-8 db connections.
(you do not need to change anything in "my.ini")
(Back up your db and only use this script with MySQL! Only tested with a default installation of cmsms with sample data and some Chinese chars... )
Alby wrote: True, is because I use mysql programs for backup/check
And I always have latin1 as client encoding, because all my shells are set up for iso-8859-1 and I want to work with the command-line clients. For backup/restore I change this with the client options parameter.
Alby wrote: Why you don't you take a look at the 2.0 code?
Is this code (formatting/docblocks) as bad as that of version 1.2?
- Attachments: two .txt files, no longer available