Michal Čihař - ColorHug with non-English locales

ColorHug with non-English locales

Ever since the infamous erasure of the factory calibration on my ColorHug device and the subsequent restoration of the calibration matrix, I had noticed that screen calibration came out wrong. However, I had not found the time to investigate the issue properly. Yesterday's mail from Richard was the trigger for me to finally look into it.

In the end it turned out to be caused by Little CMS parsing CCMX files incorrectly when the current locale uses something other than . as the decimal point.

After a lot of googling, I realized there is probably no good way of parsing floats independently of the current locale, so I used one of the hacks I found, the one I think is least intrusive: get the current decimal point by printing a float with printf, then replace the decimal point in the string being parsed with it. I know it looks ugly, but bundling our own implementation of strtod is not nice either, and switching locales is definitely not a thread-safe thing to do inside a widely used library.
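The hack described above can be sketched roughly as follows. This is not the actual Little CMS patch: the helper names current_decimal_point and parse_double_dot are made up for illustration, and the sketch assumes the locale's decimal separator is a single byte.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: find out which character the current locale uses
 * as decimal point by formatting a known value.
 * Assumption: the separator is a single byte. */
static char current_decimal_point(void)
{
    char buf[16];
    /* Gives "0.5" in the C locale, "0,5" in e.g. de_DE or cs_CZ. */
    snprintf(buf, sizeof buf, "%.1f", 0.5);
    return buf[1];
}

/* Hypothetical helper: parse text that always uses '.' as decimal point,
 * whatever the current locale is. Returns 0 on success, -1 on failure. */
static int parse_double_dot(const char *text, double *out)
{
    char buf[64];
    char *end;
    char *dot;
    size_t len = strlen(text);

    if (len == 0 || len >= sizeof buf)
        return -1;
    memcpy(buf, text, len + 1);

    /* Swap '.' for whatever strtod expects in this locale. */
    dot = strchr(buf, '.');
    if (dot != NULL)
        *dot = current_decimal_point();

    *out = strtod(buf, &end);
    return (end != buf && *end == '\0') ? 0 : -1;
}
```

Calling parse_double_dot("3.14", &v) should then yield the same value under a German or Czech locale as under the C locale. Note that the sketch is not strict: input that already contains the locale's own separator would still be accepted by strtod.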

Anyway, I've asked upstream to merge my patches, so let's see what they think of them.


foo wrote on Feb. 9, 2012, 12:44 p.m.

Your patch is wrong; you should set the locale for LC_NUMERIC to "C" before parsing and reset it afterwards.
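For reference, that suggestion would look roughly like the sketch below. The helper name parse_double_setlocale is hypothetical. Keep in mind that setlocale changes the locale for the whole process, which is exactly the thread-safety concern raised in the post.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parse with LC_NUMERIC temporarily set to "C".
 * Not thread-safe: setlocale affects every thread in the process. */
static double parse_double_setlocale(const char *text)
{
    char saved[128];
    const char *cur = setlocale(LC_NUMERIC, NULL);  /* query only */
    double value;

    /* Copy the name: the returned string may be overwritten by later calls. */
    strncpy(saved, cur != NULL ? cur : "C", sizeof saved - 1);
    saved[sizeof saved - 1] = '\0';

    setlocale(LC_NUMERIC, "C");
    value = strtod(text, NULL);
    setlocale(LC_NUMERIC, saved);   /* put the previous locale back */
    return value;
}
```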

wrote on Feb. 9, 2012, 12:56 p.m.

Wrong approach. The parse routine MUST be changed to be not locale-dependent, because otherwise it parses “something” but not the specified format. This is kinda like PHP’s JSON bug…

@foo: Your approach does not work in a pthreaded environment, as Michal pointed out.

wrote on Feb. 9, 2012, 1:17 p.m.

@mirabilos: Yes, it would probably be cleaner, but what I did not want to do is bring in more code for parsing numbers. Unfortunately, Little CMS does not use any library which already provides locale-agnostic number parsing.

wrote on Feb. 9, 2012, 2:55 p.m.

Unfortunately, you must bring in that code. I know that you’d like to avoid it and understand why, but it’s necessary.
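To give an idea of how much code that means, here is a minimal sketch of a parser that never consults the locale. The name parse_double_plain is hypothetical, and this is only an illustration: it handles plain decimal notation with an optional exponent, does not handle inf, nan or hex floats, and is not guaranteed to be correctly rounded the way a real strtod is.

```c
#include <stddef.h>

/* Hypothetical locale-independent parser for [+-]digits[.digits][e[+-]digits].
 * Returns 0 on success, -1 on malformed input. Sketch only: results are
 * not guaranteed to be correctly rounded. */
static int parse_double_plain(const char *s, double *out)
{
    double value = 0.0, scale = 1.0;
    int sign = 1, digits = 0;
    int exp = 0, exp_sign = 1, i;

    if (s == NULL)
        return -1;
    if (*s == '+' || *s == '-') {
        if (*s == '-')
            sign = -1;
        s++;
    }
    /* Integer part. */
    while (*s >= '0' && *s <= '9') {
        value = value * 10.0 + (*s - '0');
        s++;
        digits++;
    }
    /* Fraction: the separator is always '.', never taken from the locale. */
    if (*s == '.') {
        s++;
        while (*s >= '0' && *s <= '9') {
            value = value * 10.0 + (*s - '0');
            scale *= 10.0;
            s++;
            digits++;
        }
    }
    if (digits == 0)
        return -1;
    /* Optional exponent. */
    if (*s == 'e' || *s == 'E') {
        int exp_digits = 0;
        s++;
        if (*s == '+' || *s == '-') {
            if (*s == '-')
                exp_sign = -1;
            s++;
        }
        while (*s >= '0' && *s <= '9') {
            if (exp < 1000)             /* cap to avoid int overflow */
                exp = exp * 10 + (*s - '0');
            s++;
            exp_digits++;
        }
        if (exp_digits == 0)
            return -1;
    }
    if (*s != '\0')
        return -1;                      /* trailing garbage, e.g. "1,5" */

    value = sign * value / scale;
    for (i = 0; i < exp; i++)
        value = (exp_sign > 0) ? value * 10.0 : value / 10.0;
    *out = value;
    return 0;
}
```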

wrote on Feb. 9, 2012, 3:19 p.m.

Well, with my changes it is pretty easy to replace my hack with real parsing, in case upstream prefers that. Personally, I would rather switch to an existing parser, for example the one in glib.

wrote on Feb. 9, 2012, 4:17 p.m.

These days POSIX specifies some features that make locales usable in multi-threaded programs.

First, call uselocale to retrieve the current thread's locale. Then call newlocale to create a new "C" locale, set it as the thread's current locale with uselocale, perform your formatting/parsing and finally restore the original locale with a final call to uselocale.

wrote on Feb. 9, 2012, 6:03 p.m.

@Sam: Well, as I don't have man pages for either of them, I assumed they were somehow missing on Linux...

wrote on Feb. 9, 2012, 6:47 p.m.

They have been available on Linux for a long time, but naturally neither man-pages nor the GLIBC documentation have been updated. They are part of POSIX.1-2008 though, so at least they are documented at <http://pubs.opengroup.org/onlinepubs/9699919799/functions/newlocale.html>.

Here's a quick example of the functions in use:

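#define _POSIX_C_SOURCE 200809L  /* for locale_t, newlocale, uselocale */
#include <stdio.h>
#include <locale.h>

int main () {
    /* Assumes the de_DE.utf8 locale is installed. */
    setlocale (LC_ALL, "de_DE.utf8");
    printf ("%f\n", 3.14);      /* German locale: decimal comma */

    locale_t old = uselocale ((locale_t) 0);   /* query the current locale */
    locale_t l = newlocale (LC_NUMERIC_MASK, "C", (locale_t) 0);
    if (!l) {
        perror ("newlocale");
    } else {
        uselocale (l);
        printf ("%f\n", 3.14);  /* "C" numeric locale: decimal point */
        uselocale (old);
        freelocale (l);
    }

    printf ("%f\n", 3.14);      /* original locale again */
    return 0;
}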

There are also some functions that take a locale_t directly, thereby avoiding the call to uselocale. These are non-standard though, and uselocale is fast, so there's no real need to use them.
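Applied to the parsing problem from the post, the uselocale approach could be wrapped up roughly like this. It is only a sketch: the helper name parse_double_c_locale is hypothetical, and creating a fresh locale object on every call is done for brevity, where real code would probably create it once and reuse it.

```c
#define _POSIX_C_SOURCE 200809L  /* for locale_t, newlocale, uselocale */
#include <locale.h>
#include <stdlib.h>

/* Hypothetical helper: parse a '.'-separated number regardless of the
 * calling thread's current locale. Returns 0 on success, -1 on failure. */
static int parse_double_c_locale(const char *text, double *out)
{
    locale_t c_loc = newlocale(LC_NUMERIC_MASK, "C", (locale_t)0);
    locale_t old;
    char *end;

    if (c_loc == (locale_t)0)
        return -1;

    old = uselocale(c_loc);      /* affects the calling thread only */
    *out = strtod(text, &end);
    uselocale(old);              /* restore before freeing c_loc */
    freelocale(c_loc);

    return (end != text && *end == '\0') ? 0 : -1;
}
```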