[SOLVED] Spurious characters in content after upgrade
Posted: Wed May 21, 2014 7:23 pm
I recently upgraded from 1.6.10 on PostgreSQL to 1.11.10 on MySQL 5.1.73, stepping through every major release in between. PHP is version 5.3.3, and everything runs on CentOS 6.x.
After the switch to MySQL (at the 1.10.3 upgrade step), spurious high-byte characters showed up on some of the content pages. The characters in question are nowhere in the dump / load scripts generated during the database transition, and if I perform direct queries on cms_content_props for the affected pages, the characters aren't present in the database either.
They do show up in the admin editors, in both GUI and non-GUI mode. Suspecting a character encoding issue, I made sure the MySQL server uses utf8 as the default; the databases were created with utf8 encoding as per the install guidelines.
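For anyone unfamiliar with how this kind of symptom can arise: a minimal Python sketch, not a diagnosis of this particular setup, of how text stored in one charset and read back as another can grow extra high-byte characters. The helper name `misdecode` and the choice of a non-breaking space as the sample character are assumptions made for illustration only.

```python
# Illustrative only: a charset mismatch can turn one character into two.
def misdecode(text, stored_as="utf-8", read_as="latin-1"):
    """Encode text in one charset, then decode the same bytes as another."""
    return text.encode(stored_as).decode(read_as)

nbsp = "\u00a0"  # non-breaking space, easy to overlook in query output
garbled = misdecode(nbsp)

# Show the code points of the result rather than the glyphs themselves.
print([hex(ord(ch)) for ch in garbled])  # ['0xc2', '0xa0']
```

Plain ASCII text passes through such a mismatch unchanged, which would be consistent with only some pages being affected.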
If I remove the spurious characters in the editor and apply the changes, they are accepted and the characters disappear from the presented pages as well, so if all else fails I can do a manual audit of every page and hope we don't miss anything.
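To make a manual audit less error-prone, something like the following Python sketch could flag every non-ASCII character in exported page content. The function name `find_high_chars` and the `pages` dictionary are hypothetical; how the content gets exported from the CMS is left open here.

```python
def find_high_chars(text):
    """Return (offset, character, code point) for every non-ASCII character."""
    return [(i, ch, hex(ord(ch))) for i, ch in enumerate(text) if ord(ch) > 127]

# Hypothetical sample data standing in for exported page content.
pages = {
    "home": "Welcome\u00c2\u00a0to the site",
    "about": "Plain ASCII text",
}

for alias, content in pages.items():
    hits = find_high_chars(content)
    if hits:
        print(alias, hits)
```

Legitimate accented characters would be flagged as well, so the output is a list of places to look at rather than a list of errors.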
It'd be awfully nice to find a cause, though. I've got three different sites' worth of content all exhibiting the same spurious characters, so a general fix would be most welcome.
Any ideas?