The Free and Open Source Developers’ European Meeting (FOSDEM) took place in Brussels this weekend at the Université Libre, hosting talks on a range of fascinating topics.
Was it a geek-only event? Far from it.
Language lovers could attend several presentations on the semantic desktop and the semantic web by programmers with varying degrees of experience, including Kristof Van Tomme from Drupal, who has made the slides of his presentation on Drupal and the semantic web available here.
Axel Hecht from Mozilla gave a presentation about so-called l20n, or localization 2.0, the company’s project for integrating localization into website design (using XHTML5) and generally making everybody’s life easier, from programmers to linguists.
His talk focused mainly on the technical aspects of l20n or, to quote his own words, ‘I just work in localization, I don’t know shit about languages’.
Here are the main points of his contribution:
– current programming in the English-speaking world only takes the peculiarities of English into account, which complicates life for linguists and for programmers, who have to work around this even though languages are not their forte;
– l20n aims to make life easier for programmers and linguists alike by making it possible to take a variety of languages into account; code does not need to be changed for every translation to accommodate linguistic variations;
– these variations will instead be accounted for in libraries or galleries documented with a bundle of grammar rules. The example given at the talk was a translation from English to Polish, illustrated with three sentences: “Axel gave 1 presentation.” “Delphine gave 5 presentations.” “Seth did not give any presentations.” In Polish, the subject’s gender affects verb conjugation, a feature that does not exist in English. Linguists can document this grammatical rule in the library, and all future linguists can then benefit from it (a sort of overarching set of rules).
– l20n can only work for a finite number of languages.
– For localization, or l10n, the basic unit is a string. For l20n, the basic unit is an object. An object can vary according to the specific language it is being translated into (i.e. an object can be a complement, a verb…). The semantic properties of a sentence are thus taken into account more rigorously.
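To make the string-versus-object distinction concrete, here is a minimal sketch in Python. It is not l20n’s actual syntax or API: every name in it (Person, LIBRARY, gave_presentations and so on) is hypothetical, and the Polish wording is my own illustration rather than the text shown at the talk. The idea it tries to capture is that the calling code passes an object (a person and a count) and each language’s rules, written once by linguists, decide how to phrase it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    """Hypothetical message argument: an object rather than a bare string."""
    name: str
    gender: str  # "masculine" or "feminine"; English simply ignores it


def polish_plural_category(n: int) -> str:
    """Plural category of a non-negative integer in Polish."""
    if n == 1:
        return "one"
    if n % 10 in (2, 3, 4) and n % 100 not in (12, 13, 14):
        return "few"
    return "many"


def english(person: Person, count: int) -> str:
    # English only needs a singular/plural split and a negative form.
    if count == 0:
        return f"{person.name} did not give any presentations."
    noun = "presentation" if count == 1 else "presentations"
    return f"{person.name} gave {count} {noun}."


def polish(person: Person, count: int) -> str:
    # The past-tense verb agrees with the subject's gender.
    verb = {"masculine": "dał", "feminine": "dała"}[person.gender]
    if count == 0:
        return f"{person.name} nie {verb} żadnej prezentacji."
    # The noun form depends on the plural category of the count.
    nouns = {"one": "prezentację", "few": "prezentacje", "many": "prezentacji"}
    return f"{person.name} {verb} {count} {nouns[polish_plural_category(count)]}."


# The per-language "library" of rules; assumed to be maintained by linguists.
LIBRARY = {"en": english, "pl": polish}


def gave_presentations(locale: str, person: Person, count: int) -> str:
    """Calling code is identical for every language."""
    return LIBRARY[locale](person, count)


if __name__ == "__main__":
    for loc in ("en", "pl"):
        print(gave_presentations(loc, Person("Axel", "masculine"), 1))
        print(gave_presentations(loc, Person("Delphine", "feminine"), 5))
        print(gave_presentations(loc, Person("Seth", "masculine"), 0))
```

Adding a language in this sketch means adding one entry to LIBRARY; gave_presentations and its callers stay untouched, which is the property the talk was emphasising.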
What could future developments look like?
– First, if the basic unit of l20n is an object, then it points in the same direction as the semantic web. In the semantic web, content is readable by humans and machines alike. One can then imagine that localizers could be replaced in some instances by machines: it would be easier for machines to parse a sentence properly, and thus to translate it more accurately. Of course, localizers would still be needed to proofread the content. Hopefully it would be proofreading, not ‘fool-proofing’ (“let’s erase all this meaningless mumbo-jumbo and start from scratch, it’ll be easier.”)
– Second, companies may use the library/gallery to unify all of their copywriting guidelines. Brand consistency matters, even more so in a global context. Companies’ libraries, which would then be private, could include stylistic guidelines on top of grammatical guidelines, and could be shared more easily with their linguists.
L20n is still in the development stage. It offers the possibility of taking the old dream of the semantic web one step further and of creating more effective communication, not only between people speaking different languages but above all between humans and machines.