The IM, Contacts & Social hackfest is over and people are variously heading back to their respective corners of the world. A lot of good discussion was had, and I think everyone's got a clearer vision of where we need to be with IMs and contacts and what needs to be done to get there.
I'll be heading along to the IM, Contacts & Social hackfest next week at Collabora's offices. There, a plan for world domination by libfolks will be forged, along with plotting around GNOME's new SSO overlord system and work on the much-awaited GNOME Contacts.
Should be fun!
Relatedly, I've just released libgdata 0.9.0, which has sprouted support for OAuth 1.0 — so hopefully some GNOME Online Accounts goodness will soon make it into Evolution's Google Contacts and Calendar backends.
Now that exams are finally over, I can spend more time on GNOMEy things. One problem which has been sitting on my to-do list for a while is that of translatable Unicode strings in Python. It appears that my patch in bug #591496 to get Hamster to use Unicode em-dashes inadvertently broke translation of the strings. Whoops.
It turns out that in order for gettext to properly match and translate a C-locale string which contains Unicode characters, the encoding of the Python file must be specified using a coding: line at the top of the file, and the string in question must be a Unicode object. For example:
    # -*- coding: utf-8 -*-
    …
    import gettext
    gettext.textdomain('myapp')
    …
    my_translated_string = gettext.gettext(u'My Unicode string…')
    …
I don't think this is too common a problem, and I've checked that it doesn't affect any of the other Python modules I've fiddled with, but hopefully this will be useful to someone. As far as I understand it, all translatable strings in Python modules should ideally be u'Unicode objects rather than normal strings' anyway, but don't take my word for it because my Python-fu is weak.
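To see why the string has to match exactly, here is a minimal sketch in Python 3, where every str is already Unicode and UTF-8 is the default source encoding, so neither the coding: line nor the u prefix is strictly needed any more. The DictTranslations class and its one-entry catalogue are made up for illustration and stand in for a real compiled .mo file; only gettext.NullTranslations comes from the standard library.

```python
import gettext


class DictTranslations(gettext.NullTranslations):
    """Hypothetical stand-in for a compiled catalogue: msgid -> msgstr."""

    def __init__(self, messages):
        super().__init__()
        self._messages = messages

    def gettext(self, message):
        # Lookup is an exact match on the Unicode msgid; fall back to the
        # untranslated string when nothing matches.
        return self._messages.get(message, message)


# Made-up catalogue with a single entry whose msgid contains an em-dash.
catalogue = DictTranslations({"Start \u2014 End": "D\u00e9but \u2014 Fin"})
_ = catalogue.gettext

translated = _("Start \u2014 End")    # msgid matches, so it is translated
untranslated = _("Start -- End")      # ASCII variant is a different msgid
```

A msgid that differs by even one character (here, ASCII hyphens instead of the em-dash) simply comes back untranslated, which I believe is roughly what a mis-decoded byte string looks like to the catalogue.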
This is something I’ve been meaning to write about for a while and, I must admit, something I should have written about before I started pushing through changes in GNOME applications. I’m talking about the use of Unicode in GNOME: the use of the proper ellipsis character (“…”), proper en- and em-dashes (“–” and “—”, respectively) and fancy quotation marks.
This is something which has been brought up before, so I’ll try not to reignite the old arguments, and instead concentrate on the unresolved issues. Here are the main points:
- Proper Unicode characters look nicer than the ASCII versions which substitute for them. The ellipsis is correctly spaced (if one were to use full stops instead, they should technically have non-breaking spaces between them), and the quotation marks are pleasantly curved. This looks nicer, to my eye at least. The difference between en- and em-dashes and the ASCII hyphens used to simulate them is considerable.
- They’re harder to type on a conventional keyboard, though they are easily accessible through the use of the compose key.
- There are questions about the level of font support for such characters. On my Fedora 11 system, all the fonts except one (“PakTypeTehreer”) have the expected characters (ellipsis, dashes and quotation marks) at the right codepoints, although many of the glyphs are ugly and unloved (e.g. in Hershey and Khmer). DejaVu and Bitstream have excellent support for these characters. There is a suggestion that Pango should be extended to support decomposing the Unicode characters into their ASCII equivalents if a font doesn't support them.
- There was confusion over what exactly was allowed in source code, and whether UTF-8 characters were allowed in C-locale strings (regardless of their representation in source code). It was decided that they were, but that the most portable way to represent them in C was to use octal backslash escapes (e.g. “\342\200\246” instead of “…”). We’ve had Unicode characters in source code since GNOME 2.22, and (apparently) there have been no bug reports on the matter, but there was no conclusive answer about how embedded C compilers (and other, less well-known compilers) cope with such things.
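As a quick check of the escape quoted in the last point, the sketch below (in Python, with a made-up helper name, c_octal_escape) rewrites the non-ASCII characters of a string as the octal escapes of their UTF-8 bytes, and confirms that \342\200\246 really is the ellipsis. It is only an illustration and does not handle characters such as quotes or backslashes, which C would also need escaped.

```python
def c_octal_escape(text):
    """Return text with non-ASCII characters as C octal escapes (UTF-8)."""
    pieces = []
    for char in text:
        if ord(char) < 128:
            pieces.append(char)  # plain ASCII passes through unchanged
        else:
            # One three-digit octal escape per UTF-8 byte.
            pieces.extend("\\%03o" % byte for byte in char.encode("utf-8"))
    return "".join(pieces)


escaped = c_octal_escape("Open\u2026")
# Python byte literals accept the same octal escapes as C strings.
ellipsis = b"\342\200\246".decode("utf-8")
```

Three-digit octal escapes have the handy property that a C compiler never reads more than three digits, so a digit following the escape in the string cannot be swallowed into it (which can happen with \x hex escapes).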
Obviously, I’m thoroughly in the pro-Unicode camp. I believe it would make our desktop look more professional, and improve legibility of the interface in places. I’ve spoken to Calum Benson of HIG fame and he has no particular objections to mandating use of the appropriate Unicode characters by the HIG.
In the meantime, I’ve been filing bugs against applications to convert them to using proper Unicode characters; this probably wasn’t the best way to go about things, but at least it is a move in the right direction (in my view anyway). Unfortunately, this has come at the cost of inconsistency in the desktop. Most of the changes have been applied after branching for gnome-2-28, however, so if we can work out some guidelines about use of Unicode characters early in the 2.30 cycle (i.e. now), consistency could be maintained in the desktop for the 2.30 release. We might even be able to brag about nice typography for (dare I say it?) GNOME 3.0!
So, should we be expending effort on dealing with fonts which don’t support various Unicode characters, extending Pango to support the appropriate decompositions? Are there any problems with embedded C compilers and Unicode string literals? If we decide to go with a uniform usage of certain Unicode characters, what guidelines shall we go with, and how can we educate translators in how to type them?
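To make the decomposition idea concrete, here is a rough Python sketch of what such a fallback could look like. The mapping and the ascii_fallback and supported names are entirely hypothetical: this is not Pango's API, and a real implementation would ask the font which glyphs it has rather than take a set of characters.

```python
# Hypothetical ASCII stand-ins for the characters discussed in this post.
ASCII_FALLBACKS = {
    "\u2026": "...",   # ellipsis
    "\u2013": "-",     # en-dash
    "\u2014": "--",    # em-dash
    "\u2018": "'",     # left single quotation mark
    "\u2019": "'",     # right single quotation mark
    "\u201c": '"',     # left double quotation mark
    "\u201d": '"',     # right double quotation mark
}


def ascii_fallback(text, supported):
    """Replace characters missing from `supported` with ASCII stand-ins.

    `supported` is the set of non-ASCII characters the (imaginary) font
    can render; anything without a known stand-in is left untouched.
    """
    return "".join(
        char if char in supported or ord(char) < 128
        else ASCII_FALLBACKS.get(char, char)
        for char in text
    )


label = "Save As\u2026"
with_glyph = ascii_fallback(label, supported={"\u2026"})
without_glyph = ascii_fallback(label, supported=set())
```

With the ellipsis in the supported set the label is left alone; without it, the label degrades to "Save As..." rather than showing a missing-glyph box.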
Here I am, somehow successfully arrived in Gran Canaria, despite Iberia's best efforts. My first plane was delayed not quite long enough to give me hope that I'd catch my connection, but just too long for me to do so comfortably. Thankfully, the second plane was also delayed, so my running down the entire length of Madrid's Terminal 4 was somewhat unnecessary.
How am I here? I'm only here because of the nice people at the GNOME Foundation, who decided to sponsor me. Thank you, nice people!
In the past few days I've been looking at using en_GB translations to detect mistakes in the original C-locale strings. I've hacked a version of en_GB.pl to automatically translate the C-locale strings using its database of Americanisms (or should that be "Americanizms"?) and then compare them to the module's current en_GB strings, which have likely been caressed into shape manually.
The results are quite useful. About half of the strings are currently flagged up due to missing translations in en_GB.pl itself, which I've noted down in bug #524049. The other half are a combination of bad en_GB translations, en_GB translations which rectify mistakes in the original string and en_GB translations which improve the original string grammatically or punctuationally.
Using this hacked en_GB.pl and a few shell scripts to iterate through a directory containing all of GNOME 2.22's en_GB PO files, it hasn't been hard to come up with logs of all the problems in each module. I'll be spending some time going through and fixing all the broken en_GB translations, and then filing bugs about the string problems in each module.
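The comparison itself can be sketched roughly as follows. This is Python rather than the Perl of en_GB.pl, and the word list, the function names and the sample strings are all invented for illustration; the real script has a much larger database and works on PO files.

```python
import re

# Tiny made-up sample of Americanism -> British spelling pairs.
AMERICANISMS = {
    "color": "colour",
    "center": "centre",
    "dialog": "dialogue",
}

_WORD = re.compile(r"[A-Za-z]+")


def to_british(text):
    """Mechanically translate a C-locale string using AMERICANISMS."""
    def swap(match):
        word = match.group(0)
        british = AMERICANISMS.get(word.lower())
        if british is None:
            return word
        # Preserve a leading capital letter.
        return british.capitalize() if word[0].isupper() else british
    return _WORD.sub(swap, text)


def flag_differences(entries):
    """Yield (msgid, expected, msgstr) where the automatic translation
    and the existing en_GB translation disagree."""
    for msgid, msgstr in entries:
        expected = to_british(msgid)
        if expected != msgstr:
            yield msgid, expected, msgstr


# (msgid, existing en_GB msgstr) pairs, invented for this example.
entries = [
    ("Select a color", "Select a colour"),          # agrees: not flagged
    ("Could not open file", "Couldn't open file"),  # hand-edited: flagged
    ("Center the dialog", "Center the dialogue"),   # missed word: flagged
]
flagged = list(flag_differences(entries))
```

Each flagged entry still needs a human look, since the disagreement may point at a gap in the word list, a bad translation, or a genuine problem in the original string.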
I wonder if this – or something like it – could get put into use on a l10n tinderbox machine? I've yet to talk to someone about merging my changes with the proper en_GB.pl, which is the first step, but if it's possible to use something like this for l10n quality control, I'd happily put time in to get it working.
After a couple of days' work to push it up the last 10%, the British English translation of GNOME 2.22 is complete, barring any further string changes, and assuming the changes I committed to GTK+-2.12 actually turn up in the translation statistics (which, worryingly, they haven't done so far). It's verging on being late, I know, but it's better than nothing.
Hopefully this is OK for my first attempt at translating GNOME.
Hello Planet GNOME!
I'm Philip, a 16-year-old programmer and GNOME user from England. Although I've been using GNOME for a while, I think I've only been helping out for about a year. I've mostly been dealing with Totem bugs, and doing various other things to Totem; probably the most notable things I've done are adding Python and Vala support to Totem's plugin system, and converting the whole of Totem to use GtkBuilder, both of which are changes debuting in 2.20.
Some of you may have seen me at GUADEC; I was the lost-looking one without a laptop. I was there for the full duration of the event, and enjoyed it a lot, even if I didn't manage to take full opportunity to make more friends.
As far as personal life goes, I do a lot of web development (probably evident if you look at my previous posts). I've just finished the compulsory part of my education, gaining these grades, and I've now got two more years in full-time education before I hopefully go off to university, probably to study computer science. Away from school and computers I'm into hockey and model making.
Finally, if anybody manages to break anything on this site, please tell me, as I've only just upgraded the site, so there are probably still a few kinks to work out (Jeff noticed my feeds were slightly broken when he added me to planet, for example).