Hello,
I am running CMSMS 1.2.3 in German and noticed that when I search for a word containing an HTML entity I get no results.
Example: When a page contains text with German umlauts (e.g. äöü) marked up correctly as HTML entities, I get no search results. When I switch off the WYSIWYG editor and enter the text by hand, without masking the umlauts, the page can be found.
It seems that the search does not mask non-ASCII characters as entities.
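To illustrate the suspected mismatch, here is a minimal PHP sketch. It assumes the stored page content contains named entities while the query arrives as raw UTF-8; whether the Search module really works this way is not confirmed in this thread. The helper name `normalize_for_search` is hypothetical and not part of CMSMS. It only shows that decoding entities before comparing makes the two forms match.

```php
<?php
// Assumes this file is saved as UTF-8.
// Hypothetical helper, not part of CMSMS: decode HTML entities to UTF-8 text.
function normalize_for_search($text)
{
    return html_entity_decode($text, ENT_QUOTES, 'UTF-8');
}

$stored = 'Gr&uuml;&szlig;e aus K&ouml;ln'; // content as saved with named entities
$query  = 'Köln';                            // query as typed by the visitor

// Without decoding, the raw query is not a substring of the stored text.
var_dump(strpos($stored, $query) !== false);                       // bool(false)
// After decoding, it is.
var_dump(strpos(normalize_for_search($stored), $query) !== false); // bool(true)
```

If the index and the query are normalized the same way (both decoded, or both encoded), the form in which the editor saves the umlauts should no longer matter.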
Did I overlook something? Any solutions?
Cheers,
Alex
Search module: Problems with HTML entities
Last edited by faglork on Mon Feb 18, 2008 8:42 am, edited 1 time in total.
Re: Search module: Problems with HTML entities
Hi Alex,
I use CMSms 1.2.3 with TinyMCE as its content editor, and the search module finds words with German umlauts. But you are partly right: the search for words with umlauts is case-sensitive!
But I found one other curiosity or bug (or maybe a feature):
You have to type in the whole word to find it. If you search for only part of a word, the search fails. E.g. if you type in "Dokument" you will find all pages containing "Dokument", but the CMSms search does not show pages that contain the plural form "Dokumente" with a trailing "e".
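This is the behaviour one would expect if the index stores whole words and looks them up by exact match. That is an assumption, not something confirmed in this thread. The sketch below uses a made-up in-memory index and two hypothetical functions, `find_exact` and `find_prefix`, to show the difference between the two lookup strategies.

```php
<?php
// Hypothetical word index: word => list of page names.
// This is NOT the real Search module schema, only an illustration.
$index = array(
    'dokument'  => array('page-a'),
    'dokumente' => array('page-b'),
);

// Exact lookup: only the identical word matches.
function find_exact($index, $term)
{
    return isset($index[$term]) ? $index[$term] : array();
}

// Prefix lookup: every indexed word that starts with the term matches.
function find_prefix($index, $term)
{
    $pages = array();
    foreach ($index as $word => $hits) {
        if (strpos($word, $term) === 0) {
            $pages = array_merge($pages, $hits);
        }
    }
    return $pages;
}

print_r(find_exact($index, 'dokument'));  // page-a only
print_r(find_prefix($index, 'dokument')); // page-a and page-b
```

With an exact lookup, "Dokument" and "Dokumente" are simply two different keys, which would explain why the plural form is not found.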
Both characteristics make the search module less useful for surfers.
Regards,
Hani
Hanis Sammelsurium - How To's, Erfahrungs- und Meinungsberichte
Re: Search module: Problems with HTML entities
Hi,
well, I don't know how or why, but after re-indexing everything now works perfectly.
I do not see any case-sensitivity with umlauts ...
As to the other problem: Is there something like an advanced search, where you can use + and - for inclusion/exclusion and the like?
This would be a nice feature ...
Cheers,
Alex
Last edited by faglork on Thu Feb 14, 2008 1:29 pm, edited 1 time in total.
Re: [solved] Search module: Problems with HTML entities
OK, I cleared the cache and re-indexed the site, but I still have the case-sensitivity on umlauts.
But when I display the HTML source of the page, the umlauts are not marked up as HTML entities. I see the ÄäÖö etc. In my template I use the default utf:
The database uses UTF, and in config.php the encoding is:
$config['default_encoding'] = '';
$config['admin_encoding'] = 'utf-8';
as the default. The TinyMCE setting for entities was "raw", so I switched it to "named" and let TinyMCE create one page's content. The result was correct HTML entities in the content block of this page. But the search results were the same: case-sensitive for umlauts, even for this page.
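One possible explanation for the case-sensitivity, offered only as an assumption: if words are compared byte for byte, or lower-cased with a function that only handles ASCII letters, then "Ä" and "ä" never match, while plain ASCII words do. The sketch below assumes the mbstring extension is available; `fold_case` is a hypothetical helper and is not taken from CMSMS.

```php
<?php
// Assumes the mbstring extension is loaded and this file is saved as UTF-8.
// Hypothetical helper, not taken from CMSMS: Unicode-aware lower-casing.
function fold_case($text)
{
    return mb_strtolower($text, 'UTF-8');
}

$indexed = 'Ärger'; // word as it appears on the page
$query   = 'ärger'; // word as typed into the search box

// A plain comparison is case-sensitive, so the two differ.
var_dump($indexed === $query);                       // bool(false)
// After folding both sides with a multibyte-aware function they match.
var_dump(fold_case($indexed) === fold_case($query)); // bool(true)
```

The same folding would have to be applied both when the page is indexed and when the query is processed, which is why a re-index would be needed after any such change.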
What are your settings?
Regards, Hani
PS: Concerning advanced search: "+" works already. Enter two words separated by a space and you only get the page(s) that contain both words. Very nice.
Re: [solved] Search module: Problems with HTML entities
Däng!! That did it.
hibr wrote: Ok, I cleaned cache and re-indexed the site but I still have the case-sensitivity on umlauts.
But when I display the html-code of the page the umlauts are not marked-up with html-entities. I see the ÄäÖö etc.

I noticed that I had entered the text of my test page in "source code mode", so the umlauts on this page were not changed to entities. I corrected that. Now the coding is perfect, but search does not work on capitalized umlauts, just as in your case - with one difference: I have correct entities on the page.
Uh, oh - I just noticed: it is even worse. After re-indexing, the search with umlauts does not work at all.
So I am right where I started.
I have to cross-check with several systems, so it will take a day or two to get a complete picture.
I will post as soon as possible.
Alex
Last edited by faglork on Mon Feb 18, 2008 8:52 am, edited 1 time in total.