Improving language strings

Over the past few days I've been looking at using en_GB translations to detect mistakes in the original C-locale strings. I've hacked a version of the translation tool to automatically translate the C-locale strings using its database of Americanisms (or should that be "Americanizms"?) and then compare the results to the module's current en_GB strings, which have likely been caressed into shape manually.
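To make the idea concrete, here's a minimal Python sketch of the translate-then-compare approach. It isn't the actual hack: the `AMERICANISMS` table is a tiny hypothetical stand-in for the real database, the function names are made up for illustration, and it assumes the PO entries have already been parsed into (msgid, msgstr) pairs.

```python
import re

# Hypothetical, tiny stand-in for the real database of Americanisms.
AMERICANISMS = {
    "color": "colour",
    "center": "centre",
    "dialog": "dialogue",
    "canceled": "cancelled",
}

_WORD = re.compile(r"[A-Za-z]+")


def _match_case(source, replacement):
    """Give the replacement the same capitalisation as the source word."""
    if source.isupper():
        return replacement.upper()
    if source[0].isupper():
        return replacement[0].upper() + replacement[1:]
    return replacement


def auto_translate(msgid):
    """Translate a C-locale string to en_GB, one word at a time."""
    def swap(match):
        word = match.group(0)
        british = AMERICANISMS.get(word.lower())
        return _match_case(word, british) if british else word
    return _WORD.sub(swap, msgid)


def flag_differences(entries):
    """Yield (msgid, expected, actual) wherever the automatic translation
    disagrees with the existing, hand-edited en_GB msgstr."""
    for msgid, msgstr in entries:
        expected = auto_translate(msgid)
        # Untranslated entries (empty msgstr) are skipped, not flagged.
        if msgstr and expected != msgstr:
            yield msgid, expected, msgstr
```

Feeding it the pairs ("Color", "Colour") and ("Cant open file", "Can't open file") would flag only the second: the translator has fixed something a word-for-word swap wouldn't touch, which hints at a mistake in the original string.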

The results are quite useful. About half of the strings are currently flagged because of missing translations in the tool itself, which I've noted down in bug #524049. The other half are a combination of bad en_GB translations, en_GB translations which rectify mistakes in the original string, and en_GB translations which improve the original string's grammar or punctuation.

Using this hacked tool and a few shell scripts which help iterate through a directory containing all of GNOME 2.22's en_GB PO files, it hasn't been hard to come up with logs of all the problems in each module. I'll be spending some time going through and fixing all the broken en_GB translations, and then I'll file bugs about the string problems in each module.

I wonder if this – or something like it – could get put into use on a l10n tinderbox machine? I've yet to talk to anyone about merging my changes into the proper version of the tool, which is the first step, but if it's possible to use something like this for l10n quality control, I'd happily put time in to get it working.