Re: Language detection?

Posted: Fri Dec 12, 2008 11:36 pm
by alby
bongobongo wrote: Before the script reaches the usort($browser_langs, ...) call, the array contains:
no
en
which is the priority of the languages as set in my browser.

After the usort it contains
en
no
Which now does not honor my priority given to no as set in my browser.

Since I have set priority to "no" (Norwegian) in my browser, I would like the site to show up in Norwegian.
You did not report the quality factors of these languages.
Accept-Language carries a language AND a quality factor. From RFC 2616:
Each media-range MAY be followed by one or more accept-params, beginning with the "q" parameter for indicating a relative quality factor. The first "q" parameter (if any) separates the media-range parameter(s) from the accept-params. Quality factors allow the user or user agent to indicate the relative degree of preference for that media-range, using the qvalue scale from 0 to 1 (section 3.9). The default value is q=1
If it is missing, q defaults to 1.
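
A minimal sketch of how the q factor described above might be read from the header. The function name parse_accept_language is hypothetical (it is not the CMSMS or php.net function), and the pattern only covers the simple "language;q=value" form:

```php
<?php
// Hypothetical helper: turn an Accept-Language header into a list of
// [language, quality] pairs, kept in the order the browser sent them.
function parse_accept_language(string $header): array
{
    $langs = [];
    foreach (explode(',', $header) as $part) {
        // A language range, optionally followed by ";q=<value>".
        // A "*" range does not match the pattern and is skipped.
        if (preg_match('/^\s*([a-zA-Z-]+)\s*(?:;\s*q\s*=\s*([0-9.]+))?\s*$/', $part, $m)) {
            // A missing q parameter means q=1
            $quality = isset($m[2]) ? (float) $m[2] : 1.0;
            $langs[] = [$m[1], $quality];
        }
    }
    return $langs;
}

// Header taken from the Firefox test later in this thread
print_r(parse_accept_language('no,en-us;q=0.8,it;q=0.6,it-it;q=0.4,en;q=0.2'));
```

The first entry has no q parameter, so it is stored with quality 1.0.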

bongobongo wrote: What is the meaning of the usort line.... what is it meant to do? I believe it should never have been there.
A browser can list the languages in ANY order, BUT each one must carry the right quality factor.
usort takes care of that: it puts them in the right order.
Anyway, my results:

FF 3.0.4:

Before:
it-it-1
it-0.8
en-us-0.5
en-0.3

After
it-it-1
it-0.8
en-us-0.5
en-0.3

IE6 (with no, en-us and it):
Before:
no-1
es-us-0.7
it-0.3

After:
no-1
es-us-0.7
it-0.3

bongobongo wrote: That said, the usort line might not result in a bug for everyone, since it depends on which languages are defined
in the browser and how the natural sort order of those languages (as found in the browser_langs array after the usort) compares to the priority you have given to each language.
I repeat: for me it is not a bug (as the www.php.net site also shows).
If I get further reports from users I will investigate, but for now I stop here.


EDIT:
I read that the IE7 Accept-Language header is ALSO calculated from the Windows default locale.


Best regards

Alby

Re: Language detection?

Posted: Fri Dec 12, 2008 11:51 pm
by bongobongo
Could you show me the code you used to show the order before and after?

Then I will test that code here tomorrow; it is a little too late now.

Regards

Re: Language detection?

Posted: Fri Dec 12, 2008 11:54 pm
by alby
bongobongo wrote: Could you show me the code you used to show the order before and after?
It's in my Reply #28

Alby

Re: Language detection?

Posted: Sat Dec 13, 2008 8:18 am
by bongobongo
Okay....

*********************************************
Using your code from post #28 I get this output in IE7:
*********************************************

Before:
no-1
en--1

After:
en--1
no-1

This gives the wrong result, and the wrong language is displayed.

*******************************
Now let's try IE6
*******************************

Before:
no-1
en--1

After:
en--1
no-1

This gives the wrong result, and the wrong language is displayed.

*******************************
Now let's try Chrome
*******************************

Before:
no-1
en--1

After:
en--1
no-1

This gives the wrong result, and the wrong language is displayed.


*******************************
Now let's see what happens in FF:
*******************************

Before:
no-1
en-us-0.5

After:
no-1
en-us-0.5

Which gives correct language....

*******************************
Now let's see what happens in OP:
*******************************

Before:
no-1
en-0.9

After:
no-1
en-0.9

Which gives correct language....


***********
Conclusion
***********

It fails in: IE6, IE7 and Chrome.
It works in: Firefox and Opera.

At least using the languages I have set up.
I do believe this will fail for others as well, but it depends on the browser used as well as on the languages defined and their order.

So IMHO, it is buggy with the usort line.

And... I do believe most people do not think much about the q factor when they alter their Language Preferences.
They enter a few languages in the preference list, and order them as they want.
Why not honor the order of the languages as the client browser (user) has it set?
Why sort this based on the q factor? Especially when it, for some, gives wrong result?

If you look at the original poster's question in this thread, it looks to me like he never got it fixed. I would not be surprised
if he encountered the same problems as me.

If you do not want to remove the usort line completely, then why not leave it off by default and optionally let
the admin activate it if he wants? As it is now, it is counterintuitive.

Best regards

Re: Language detection?

Posted: Sat Dec 13, 2008 11:20 am
by alby
bongobongo wrote: If you look at the original poster's question in this thread, it looks to me like he never got it fixed. I would not be surprised
if he encountered the same problems as me.
You cannot say that. I talked to him, and for me it is closed even if the topic is not marked SOLVED.

bongobongo wrote: If you do not want to remove the usort line completely, then why not leave it off by default and optionally let
the admin activate it if he wants? As it is now, it is counterintuitive.
Platform: Windows XP SP2
The language order is no, en (or en-us), but this time "Before" shows the raw $_SERVER['HTTP_ACCEPT_LANGUAGE'] value.


Browser FF 3.0.4:

Before: 'no,en-us;q=0.8,it;q=0.6,it-it;q=0.4,en;q=0.2'

After:
no/1
en-us/0.8
it/0.6
it-it/0.4
en/0.2

OK!


Browser IE6 (and IE7 because share same value in registry):

Before: 'no,es-us;q=0.7,it;q=0.3'

After:
no-1
es-us-0.7
it-0.3

OK!


Browser Chrome 1.0.154.36:

Before: 'no,en,it,en-US,it-IT'

After:
it-IT/1
en-US/1
it/1
en/1
no/1

NO!
This version of Chrome sends no q factor. It is the only browser in my test that does this, and it goes against the RFC.


Browser Opera 9.61 (browser has +item in last position: User-defined, it-IT):

Before: 'no,en;q=0.9,it;q=0.8,it-IT;q=0.7'

After:
no/1
en/0.9
it/0.8
it-IT/0.7

OK!



Conclusion:
1. I have no problem in IE, so (sorry #1) I cannot change this just because it does not work on your PC.
2. From the browser statistics: 93.5% OK!, 3.1% NO! and 3.4% UNKNOWN! Sorry, but I am not installing Safari for this.
Chrome (3.1% NO!) does not follow the RFC, so (sorry #2) I cannot change this just because you use a browser that does not follow the RFC and has that share. If you think it is a true bug, first file a bug report with Google Chrome and with IE (though I do not see the problem in IE), and ask me for a workaround while you wait for it to be fixed.
3. (sorry #3) I think this topic is too long (Reply #34) for this kind of argument, and I repeat (from my Replies #28 and #30) that I have no reports from other users. If I get no such report, this topic is CLOSED for me.
4. But I must also say thank you: reading the RFC more carefully, Accept values are case-insensitive, and I see that the function handles lower case only. I will change that in the next release.
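
A rough sketch of what case-insensitive matching of language tags could look like. The function name first_supported_language and the two input lists are assumptions made for illustration; this is not the code from function.mle.php:

```php
<?php
// Hypothetical helper: return the first browser language that the site
// supports, comparing tags case-insensitively ("en-US" matches "en-us").
function first_supported_language(array $browserLangs, array $siteLangs): ?string
{
    // Map lower-cased tag => tag as configured on the site
    $lookup = [];
    foreach ($siteLangs as $lang) {
        $lookup[strtolower($lang)] = $lang;
    }
    foreach ($browserLangs as $lang) {
        $key = strtolower($lang);
        if (isset($lookup[$key])) {
            return $lookup[$key];
        }
    }
    return null; // no overlap between browser and site languages
}

// Chrome in this thread sends mixed-case tags such as "en-US" and "it-IT"
var_dump(first_supported_language(['it-IT', 'en-US'], ['en-us', 'no']));
```

Lower-casing both sides before comparing means a header like Chrome's is not skipped just because of its capital letters.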

Best regards

Alby

Re: Language detection?

Posted: Sat Dec 13, 2008 2:30 pm
by bongobongo
Hi.

Having this line:

usort($browser_langs, "language_accept_order");
in this file:
mle/function.mle.php    and in this function: language_user_setting

will SOMETIMES cause cmsmadesimple to select a wrong language.

I showed examples of this in my previous post.

This bug will not be VISIBLE for everyone, as alby has stated.
Whether the bug is VISIBLE probably depends on a combination of operating system and version, browser type and version, and the languages the site is available in.

That this does not cause the bug I'm talking about for alby (and many others) on their computers DOES NOT MEAN that other VISITORS to the site do not see the result of the bug.

It is a particularly nasty bug, because it does not always produce a visible result for everyone.
Therefore, many site admins may believe it is working, while for visitors to that same site it may not work as intended.

Alby... In the examples run on your computer, you say it works.
Then I may ask.... what is really working?
The usort line does nothing in your examples, except for your Chrome test, where it makes the bug visible to the
visitor of the site (but then you blame the browser).

If that usort line were deleted, nobody would experience these problems, and your Chrome browser
would also get it right.

And in your conclusion you say among other things:
4. But I must also say thank you: reading the RFC more carefully, Accept values are case-insensitive, and I see that the function handles lower case only. I will change that in the next release.
Now.... what good will that change do?

The main issue here is that the order of the languages defined in a browser's Language preferences should be followed, with the priority in which each client browser lists them.
(This is the order in which the langs are listed in the Before examples from alby and me.)

If you do not agree with that, then there is not much to do... It might cause a lot of headaches for some site admins... as well as some end users not getting where they should on their first visit.

If you do agree with that, then please just delete that usort line. There is no need for it.

That said, now why on earth would I use hours to test and comment on this thing:

I installed cmsmadesimple a couple of days ago and liked what I saw, but sat up almost all of the first night after installing it.....
getting more and more frustrated because of the inconsistent results the BUG caused.

So I guess I wanted to get a sane answer to why that usort line was in the code (it has not been answered yet though).
And I also wanted to inform other current and new users of this otherwise excellent CMS software that there is a tricky bug
here, and that it is easily fixable by site admins as well.
I can only hope that this will be fixed in future versions.

Best regards

Re: Language detection?

Posted: Sat Dec 13, 2008 3:27 pm
by alby
bongobongo wrote: So I guess I wanted to get a sane answer to why that usort line was in the code (has not been answered yet though).
hummm, maybe there is a language barrier between us (quite possible, because I am not a native English speaker) but:

1. If you read carefully, I answered in Reply #28. The function is not mine but comes from www.php.net (I found it at least 5 years ago when I searched for a language function, and it is still there; I suppose the PHP guys know their stuff, also because they have many developers in many countries using many browsers ......). Open a report at php.net; maybe someone there is interested or has run into this problem.

2. The order is defined in the Accept-Language header and formalized in an RFC:
Through the Internet Society, engineers and computer scientists may publish discourse in the form of an RFC, either for peer review or simply to convey new concepts, information, or (occasionally) engineering humor. The IETF adopts some of the proposals published as RFCs as Internet standards.
and therefore should be followed by all browsers; I remember the history of IE.
If there is a browser that does not follow this, I REPEAT: open a bug report for that browser. Do you want the web address for submitting it?
A few times I have run into a conflict with an RFC and reported it to the author of the program, who changed it immediately (after checking the RFC).
I do not understand why I must change something just because one browser does not follow the RFC. I hope that Chrome grows, but until then (it makes no sense with that percentage) I honestly will not change a comma.
If everyone has to make workarounds because someone does not follow the standards ........ we have learned nothing from the lesson of Microsoft.

bongobongo wrote: And I also wanted to inform other current and new users of this otherwise excellent CMS software that there is a tricky bug
here, and that it is easily fixable by site admins as well.
I can only hope that this will be fixed in future versions.
This is good. Many times I have made workarounds (even here in the cmsms core/modules/plugins) because a system (OS / web server / PHP) was so unusual that things did not work properly, but that was a workaround for that one system, without asking for the code to be changed, because it worked for me and for others.

Best regards
Alby

Re: Language detection?

Posted: Sat Dec 13, 2008 6:17 pm
by bongobongo
It is interesting how you manage to say so much without saying anything about why
the usort line is there at all.

The line causes bugs sometimes. That is a fact. Not because of an error in the function or whatever.
I know you have not created that function..... (bahhh).

But some developer has decided to use it.

There is no need for it. And by putting that usort line in there some users may experience odd behaviour
from the site... that it returns a language it should not have done.


The above is the core issue here.... not who created that function, not the q factor ....

Simply put: There is no need to sort that array using usort.

So why on earth do you guys sort it then? Should not be that hard to answer.
And have you still not understood that this line of code can cause trouble for some, even if it did not do so for you on your computer?

My usage of "sane" had to come, since I do not get an answer related to my findings and questions.

Time for me to test the better features of this puppy (cmsmadesimple)  ;)

Have a nice day!

Best regards

Re: Language detection?

Posted: Sat Dec 13, 2008 10:07 pm
by alby
bongobongo wrote: It is interesting how you manage to say so much without saying anything about why
the usort line is there at all.
This is my last Reply in this topic.

usort is necessary because the RFC does NOT REQUIRE Accept-Language to list the languages in preference order (the q factor is there for THIS).
If FF 4.0 comes out tomorrow sending Accept-Language in reverse order, that IS CORRECT PER THE RFC, and THEN WHAT DO I DO?
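
To illustrate the point about header order, here is a small sketch. The comparator body is an assumption: it reuses the callback name from the thread, but the real function from php.net is not shown here. The input is a made-up header that lists the preferred language last:

```php
<?php
// Assumed comparator: higher quality first, 0 when qualities are equal.
function language_accept_order(array $a, array $b): int
{
    if ($a[1] == $b[1]) {
        return 0;
    }
    return ($a[1] > $b[1]) ? -1 : 1;
}

// Hypothetical header sent in reverse preference order:
// 'en;q=0.3,en-us;q=0.5,it;q=0.8,it-it'
$browser_langs = [['en', 0.3], ['en-us', 0.5], ['it', 0.8], ['it-it', 1.0]];

// Sorting by quality restores the preference order whatever the header order was
usort($browser_langs, 'language_accept_order');

foreach ($browser_langs as [$lang, $q]) {
    echo $lang, '/', $q, "\n";
}
```

Because every q value here is different, the result does not depend on how usort treats equal elements.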

However the real issue is:
How important are the RFC?
For you less, for me more

Have a good day

Close
Alby

Re: Language detection?

Posted: Sun Dec 14, 2008 7:14 am
by bongobongo

alby wrote: usort is necessary because the RFC does NOT REQUIRE Accept-Language to list the languages in preference order (the q factor is there for THIS).
If FF 4.0 comes out tomorrow sending Accept-Language in reverse order, that IS CORRECT PER THE RFC, and THEN WHAT DO I DO?
Thanks for that feedback, I did not know that the order of the languages as given in the accept-language header could have a different order than what was set in the client browser.

I have a proposal that will make the browser lang detect part of cmsmadesimple better (more reliable, both for site-admins and for endusers) and still follow the RFC:

Based on what we know, some browsers do not set the q value, so q will sometimes be set to 1 for every language that is missing the q part.

If you introduce a check if all the q values are different for each language in accept-language header then you can use
the following rules:

1.
Test if all q values are different, if true then use the usort line (thereby following the RFC).

2.
If some q values are equal then DO NOT USE the usort line, since the language detection will then be unpredictable at best and buggy at worst (based on what we now know from previous tests in this thread).
In this case, use the order of languages as given in accept-language header.
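
The two rules above could be sketched roughly as follows. The function name order_browser_langs is hypothetical, and the input is assumed to be a list of [language, quality] pairs in header order:

```php
<?php
// Sketch of the proposal: sort by quality only when all q values differ,
// otherwise keep the order in which the browser sent the languages.
function order_browser_langs(array $browser_langs): array
{
    $qualities = array_column($browser_langs, 1);
    // array_unique() drops repeated q values, so the counts differ on any tie
    $all_distinct = count(array_unique($qualities)) === count($qualities);

    if ($all_distinct) {
        // Rule 1: every q value is different, so sort by quality (highest first)
        usort($browser_langs, fn($a, $b) => $b[1] <=> $a[1]);
    }
    // Rule 2: some q values are equal, so the header order is returned unchanged
    return $browser_langs;
}

// IE-style header without q values: the order is left alone
print_r(order_browser_langs([['no', 1.0], ['en', 1.0]]));
```

One limitation of this sketch: a single tie anywhere in the list switches sorting off for the whole list, even for entries whose q values do differ.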

Maybe it would be a good thing to put a notice about this behaviour in your excellent thread here:
http://forum.cmsmadesimple.org/index.ph ... 318.0.html

Thanks for the replies (and for staying put, even if the discussion got a little hot (from my side))

Best regards

Re: Language detection?

Posted: Sun Dec 14, 2008 3:38 pm
by alby
bongobongo wrote: If you introduce a check if all the q values are different for each language in accept-language header then you can use
the following rules:

1.
Test if all q values are different, if true then use the usort line (thereby following the RFC).

2.
If some q values are equal then DO NOT USE the usort line, since the language detection will then be unpredictable at best and buggy at worst (based on what we now know from previous tests in this thread).
In this case, use the order of languages as given in accept-language header.

nahhh,
this is better:
    // Go through all language preference specs
    for ($i = 0; $i < count($browser_accept); $i++) {
        // The language part is either a code or a code with a quality
        // We cannot do anything with a * code, so it is skipped
        // If the quality is missing, it is assumed to be 1 according to the RFC
        if (preg_match("!([a-zA-Z-]+)(;q=([0-9\\.]+))?!", trim($browser_accept[$i]), $found)) {
            $quality = (isset($found[3]) ? (float) $found[3] : 1.0);
            $browser_langs[] = array($found[1], $quality);
        }
        unset($found);
    }
}

// Reverse order: maintain user order for same quality
$browser_langs = array_reverse($browser_langs);

// Order the codes by quality
usort($browser_langs, "language_accept_order");

Alby

Re: Language detection?

Posted: Sun Dec 14, 2008 5:28 pm
by bongobongo
Cool  :)

Problem solved

Re: Language detection?

Posted: Sun Dec 14, 2008 10:44 pm
by alby
bongobongo wrote: Problem solved
sorry but unfortunately no.
I have verified that the output does not always come out right, but I have not yet figured out why....

Alby

Re: Language detection?

Posted: Sun Dec 14, 2008 10:51 pm
by bongobongo
Is it q value related?

Re: Language detection?

Posted: Sun Dec 14, 2008 10:57 pm
by alby
bongobongo wrote: Is it q value related?
no, entries with the same q value get mixed up and do not keep their order, even though the function returns 0 when index 1 (the quality) is equal
for second and more values first third
????

Alby
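
One possible explanation for the mixed-up order, offered as an assumption and not confirmed in this thread: before PHP 8.0, usort() did not guarantee the relative order of elements that compare as equal, so reversing the array beforehand cannot reliably keep the user's order for equal q values. A sketch of a workaround is to store each entry's header position and use it as a tie-breaker. The function name sort_langs_stable is hypothetical, and the input is assumed to be a list of [language, quality] pairs with keys 0..n-1:

```php
<?php
// Sort [language, quality] pairs by quality (highest first) and keep the
// header order for equal qualities, without relying on usort() being stable.
function sort_langs_stable(array $browser_langs): array
{
    // Remember each entry's position in the header as a third element
    foreach (array_keys($browser_langs) as $i) {
        $browser_langs[$i][2] = $i;
    }
    usort($browser_langs, function (array $a, array $b): int {
        if ($a[1] != $b[1]) {
            return ($a[1] > $b[1]) ? -1 : 1;
        }
        // Equal quality: the entry that came first in the header wins
        return $a[2] <=> $b[2];
    });
    // Drop the helper index again
    return array_map(fn($l) => [$l[0], $l[1]], $browser_langs);
}

// Chrome header from this thread: 'no,en,it,en-US,it-IT' (no q values at all)
$langs = [['no', 1.0], ['en', 1.0], ['it', 1.0], ['en-US', 1.0], ['it-IT', 1.0]];
foreach (sort_langs_stable($langs) as [$lang, $q]) {
    echo $lang, "\n";
}
```

With the position as tie-breaker the comparator never returns 0 for two different entries, so the result is the same on any PHP version.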