Search Module - Greek Search
The search function doesn't seem to work with international characters. I enter specific terms in the search text field (I know these terms exist in my content) and get no results.
Any ideas? Any solutions?
Re: Search Module - Greek Search
What version are you using? I thought I solved this problem in 1.0.2.
Re: Search Module - Greek Search
I am using 1.0.2. Are there any related topics to read? It just doesn't return any results...

Re: Search Module - Greek Search
I tested the search module with international characters today. It seems to me that search works if the characters are encoded as HTML entities (&gamma;&epsilon; etc.) when stored in the database. If the characters are stored in the database as they are (not as HTML entities), searching doesn't find anything.
Do you use a WYSIWYG editor when adding content?
It seems to me that the admin console stores the Title and Menu Text fields in the database without encoding international characters as HTML entities. I think this prevents the search module from finding strings in Title and Menu Text that contain international characters.
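To make the mismatch concrete, here is a minimal sketch in Python (not CMSMS code). It only shows that the same text stored in the two ways described above does not compare equal until both are brought to one representation. The `normalize` helper is a hypothetical name introduced for illustration.

```python
import html


def normalize(text: str) -> str:
    """Decode HTML entities so both storage styles compare as plain Unicode."""
    return html.unescape(text)


stored_by_editor = "&gamma;&epsilon;"  # entity-encoded, as a WYSIWYG editor may store it
stored_raw = "γε"                      # raw characters, as a plain text field stores them

print(stored_by_editor == stored_raw)                        # False
print(normalize(stored_by_editor) == normalize(stored_raw))  # True
```

Whichever representation is chosen, the stored text, the search index and the search term all need to use the same one.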
Re: Search Module - Greek Search
Exactly. This is the second day straight that I've been playing with this. I used FCKeditor, and characters are stored in the database as "&Gamma&Epsilon" etc. In that case search works just fine. If I take FCKeditor out, data is stored like this in the database: "Σκατά". Both are searchable, but the point is: how am I supposed to store the data correctly in the database, I mean with the correct characters, like "Ενα Δυο Τρια", and still have it searchable?
Re: Search Module - Greek Search
Well, as you seem to have the same problem as me, I'll file a bug report in the tracker. I'm not 100% sure whether this is a CMSMS core problem or a search module problem.
There are quite a few variables (in php.ini, my.cnf, config.php, ...) that can mess up international characters. I've tried every combination I can imagine without success.
With the changes below I tried to make sure that the connection from CMSMS to the MySQL database uses UTF-8 encoding, so that international characters are stored in the database as they are. I made the changes in my.cnf because CMSMS is the only software using MySQL on my server.
I made these changes to my.cnf:
Code: Select all
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
# Default to using old password format for compatibility with mysql 3.x
# clients (those using the mysqlclient10 compatibility package).
old_passwords=1
init_connect='SET collation_connection = utf8_general_ci'
init_connect='SET NAMES utf8'
default-character-set = utf8
character-set-server = utf8
collation-server = utf8_general_ci
[mysql.server]
user=mysql
basedir=/var/lib
[mysqld_safe]
err-log=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
[client]
default-character-set=utf8
Re: Search Module - Greek Search
I am just evaluating CMSMS and I can confirm the issues with UTF-8.
My installation:
- CMSMS 1.0.2 on WAMP (Win2k, Apache 2, MySQL 4.1, PHP 4.4).
- CMSMS DB on MySQL 4.1 is set to collation utf8_general_ci
Issues:
Slovak characters stored with FCKeditor are encoded as HTML entities or completely damaged.
Search does not work.
Solution:
1. Disable HTML entities in FCKeditor. In fckconfig.js set:
FCKConfig.ProcessHTMLEntities = false ;
2. Always connect to MySQL with utf8. Patch the include.php file; add this line:
$db->Execute('set names utf8');
after the line
$db =& $gCms->GetDB();
With these changes it works for me.
Now I need to build a simple multilingual site. I have found several threads here in the forum. Which solution is the best one?
Thanks
Jozef
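Why the connection charset matters can be sketched outside PHP. The Python snippet below is only an illustration under a simplifying assumption: it treats a mismatched connection as "UTF-8 bytes read as latin-1", which is roughly what happens when the client sends UTF-8 but the connection is declared with a single-byte charset. MySQL's own latin1 differs in detail.

```python
text = "Ενα Δυο Τρια"

# The application sends UTF-8 bytes ...
utf8_bytes = text.encode("utf-8")

# ... but a connection assumed to be latin-1 reads every byte as one character.
misread = utf8_bytes.decode("latin-1")

print(misread == text)            # False: the stored text is garbled
print(len(misread) > len(text))   # True: each Greek letter became two characters

# When both sides agree on UTF-8, the round trip is lossless.
print(utf8_bytes.decode("utf-8") == text)   # True
```

Declaring the connection charset (as `set names utf8` does) is what makes both sides agree.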
Re: Search Module - Greek Search
Nope. This didn't work for Greek characters. Everything in my DB (MySQL 5.0) is set to UTF-8.
When I use FCKeditor, characters are stored as "&Gamma;&Theta; etc." and they are searchable.
When I deactivate FCKeditor, content goes into the DB as it should, but it's not searchable.

Re: Search Module - Greek Search
tsiger: I wonder, do you use ADOdb with MySQL 5.0 + utf8 in other PHP software without needing the 'set names utf8' query?
Re: Search Module - Greek Search
Jozef, I added the set names line and did everything you described.
This is the first time I am using MySQL 5.0, and that's because the site built with CMS Made Simple is going to be hosted on a MySQL 5.0 server.
Here's a step forward.
I edited fckconfig.js like this, and now the characters are stored in the DB just fine using FCKeditor:
Code: Select all
FCKConfig.ProcessHTMLEntities = false ;
FCKConfig.IncludeLatinEntities = true;
FCKConfig.IncludeGreekEntities = false;
I noticed that, e.g., for the word "Αποτέλεσμα", which is stored in the DB properly, the URL contains the following when I hit the search button:
"cntnt01searchinput=%CE%91%CF%80%CE%BF%CF%84%CE%AD%CE%BB%CE%B5%CF%83%CE%BC%CE%B1".
Would that help?
I noticed the same in Google as well, so I'm not sure at all, but it's probably something between the script and the DB, since the characters in the DB are stored properly. Any thoughts?
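For reference, the value in that URL is simply the percent-encoded UTF-8 form of the search word, which is how browsers normally submit non-ASCII form input. A short Python check, given only as an illustration:

```python
from urllib.parse import quote, unquote

encoded = "%CE%91%CF%80%CE%BF%CF%84%CE%AD%CE%BB%CE%B5%CF%83%CE%BC%CE%B1"

decoded = unquote(encoded)        # percent-decoding, interpreted as UTF-8 by default
print(decoded)                    # Αποτέλεσμα
print(quote(decoded) == encoded)  # True: encoding the word gives the URL value back
```

So the URL itself carries the word intact; it is decoded again on the server side before the search runs.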
Re: Search Module - Greek Search
OK, let's see, because it works, but I am not sure if this is the right way...
After doing everything jozef said, I did the following:
in the file action.dosearch.php I replaced this line:
Code: Select all
$ary[] = "word = " . $db->qstr(htmlentities($word, ENT_COMPAT, 'UTF-8'));
with this:
Code: Select all
$ary[] = "word = " . $db->qstr($word);
In other words, I removed the htmlentities call, and it works just fine. Data is stored in the DB properly and everything is searchable.
Would removing htmlentities affect any other functionality?
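The effect of that change can be modelled with a small Python sketch. Everything here is an assumption for illustration: the `search_index` table is a simplified stand-in for the real index, `to_named_entities` only roughly imitates PHP's htmlentities, and an in-memory SQLite database replaces MySQL.

```python
import sqlite3
from html.entities import codepoint2name


def to_named_entities(text: str) -> str:
    """Rough stand-in for htmlentities(): use a named entity where one exists."""
    return "".join(
        f"&{codepoint2name[ord(ch)]};" if ord(ch) in codepoint2name else ch
        for ch in text
    )


# Hypothetical, simplified index table; the real schema may differ.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE search_index (word TEXT)")
db.execute("INSERT INTO search_index VALUES (?)", ("αποτέλεσμα",))  # stored raw


def count_hits(term: str) -> int:
    # The ? placeholder leaves quoting of the term to the database driver.
    row = db.execute(
        "SELECT COUNT(*) FROM search_index WHERE word = ?", (term,)
    ).fetchone()
    return row[0]


word = "αποτέλεσμα"
print(count_hits(to_named_entities(word)))  # 0: entity-encoded term, raw index
print(count_hits(word))                     # 1: raw term, raw index
```

The lookup only succeeds when the term is compared in the same form as the indexed words, which is consistent with the search working once the encoding step is removed and the content is stored raw.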
Re: Search Module - Greek Search
Now I am confused. I thought that after my patching there would be no HTML entities in the DB. But they are there, and search works?!
tsiger: do you have the same words in the tables content_props and module_search_index?
Core developers: why are you using HTML entities?! It is a road to hell! If you use utf8 you should read about PHP and UTF-8:
http://www.phpwact.org/php/i18n/charsets, http://sourceforge.net/projects/phputf8
Re: Search Module - Greek Search
jozef wrote: Now I am confused. I thought that after my patching there would be no HTML entities in the DB. But they are there!
There are no HTML entities in the DB anymore, jozef. The characters I mentioned before appear in the URL, NOT in the database. Try it yourself: copy and paste the word "Αποτέλεσμα" from here and check the URL. You'll see what I mean. I mentioned it because I thought it had something to do with the problem, but apparently it doesn't. Forget about it.

Both tables have the same words in them. Both are stored properly. Everything is searchable.
Everything works now.

Re: Search Module - Greek Search
I am still confused. I have only my 2 patches, without yours. I inserted your Greek word into one content page using FCKeditor.
In both previously mentioned tables the word is stored as HTML entities:
"Αποτέλεσμα" = αποτέλεσμα
And the search finds your word without your patch?!
Re: Search Module - Greek Search
Apply those 2 changes in fckconfig.js:
Code: Select all
FCKConfig.IncludeLatinEntities = true;
FCKConfig.IncludeGreekEntities = false;
Then re-enter the Greek word in one content page and search again. Without my patch it should not find anything.