30% of all feature requests for ERPNext are about making it multi-lingual. So, as we started building the new version, we decided to give it a try. Conceptually it is very simple:
for every text message that is printed on the screen, there must be a translation dictionary that is used to replace the message with the appropriate text in the user's language.
But in implementation there are many issues.
The source of messages varies: In ERPNext, messages can come from:
- Database (DocTypes, labels, descriptions etc)
- Code (Error messages etc)
- Client-side Code
- Framework Code
Translations must be partitioned: There are more than 2000 messages. Clubbing them all together is a bad idea because, at any given time, you only need to work with a subset of them.
Loading translations: There must be a system to load parts of the dictionaries on demand, on both the server side and the client side.
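As a rough illustration of partitioned, on-demand loading, here is a minimal sketch. The class name `TranslationLoader`, the one-JSON-file-per-partition layout and the `<partition>/<lang>.json` naming are assumptions made for this example, not the actual webnotes design.

```python
import json
import os


class TranslationLoader:
    """Loads one partition (for example, one DocType) of a language at a time.

    Hypothetical layout: <base_dir>/<partition>/<lang>.json
    """

    def __init__(self, base_dir, lang):
        self.base_dir = base_dir
        self.lang = lang
        self._cache = {}  # partition name -> {source message: translation}

    def load(self, partition):
        # Read the partition's file only the first time it is requested.
        if partition not in self._cache:
            path = os.path.join(self.base_dir, partition, self.lang + ".json")
            if os.path.exists(path):
                with open(path, encoding="utf-8") as f:
                    self._cache[partition] = json.load(f)
            else:
                # No translation file yet: treat the partition as empty.
                self._cache[partition] = {}
        return self._cache[partition]

    def translate(self, partition, message):
        # Fall back to the original message when no translation exists.
        return self.load(partition).get(message, message)
```

On the server side, `translate` would be called while rendering. For the client side, the dictionary returned by `load` could be serialized with `json.dumps` and shipped along with the page that needs it.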
Gettext
gettext is the Unix standard way of building multi-lingual apps. I checked out implementations in Django (a Python web framework) and OpenERP (a Python ERP). Gettext seemed to have solved the problem of extracting messages from code by declaring a standard way of writing messages using the gettext method.
The problem with gettext was that it was a tedious system and its translation files were very verbose (low signal-to-noise ratio). Plus, it did not solve our problems of extracting messages from the database and of porting translations into the browser as and when needed.
Options
Once I understood how the gettext system worked, I decided to make my own translation interface. (I know what you are thinking.) The webnotes system will have declarations similar to gettext, but it will have its own extraction and loading system. Since we have our own framework, we can automagically (thanks Ryan for this term) attach parts of the translations when DocTypes are loaded (example: Sales Order).
Partitioning was already solved because we had a folder for each DocType and Page, and that would be the right place to put the translation dictionary too.
We can also smartly attach messages for the framework as and when needed (like when a request starts).
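To make the extraction side concrete, here is a minimal sketch of collecting the messages for one DocType from both its code and its metadata. The `_("...")` marker, the dict shape used for the DocType and all function names are assumptions for this example, not the actual webnotes code.

```python
import re

# Matches _("...") or _('...') with one plain string literal argument.
# Simplification: string literals containing escaped quotes are not handled.
MESSAGE_PATTERN = re.compile(r"""(?<!\w)_\(\s*(["'])(.*?)\1\s*\)""")


def messages_from_code(source):
    """Return the messages declared with _() in a piece of source code."""
    return [match.group(2) for match in MESSAGE_PATTERN.finditer(source)]


def messages_from_doctype(doctype):
    """Return the DocType name plus the labels and descriptions of its fields.

    `doctype` is a hypothetical dict form of the metadata stored in the database.
    """
    messages = [doctype["name"]]
    for field in doctype.get("fields", []):
        for key in ("label", "description"):
            if field.get(key):
                messages.append(field[key])
    return messages


def collect_messages(doctype, code_sources):
    """Merge database and code messages, dropping duplicates but keeping order."""
    candidates = messages_from_doctype(doctype)
    for source in code_sources:
        candidates.extend(messages_from_code(source))
    seen = set()
    ordered = []
    for message in candidates:
        if message not in seen:
            seen.add(message)
            ordered.append(message)
    return ordered
```

The resulting list is what would be written next to the DocType's other files, one list per folder, so that each partition stays small.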
So the idea was to use gettext as a starting point and then build a similar system. Also, a few months back, when I was working on a new framework prototype, I had implemented something like this. So parts of the code were also ready!
Heavy Lifting
Now that we had found a way to extract, partition and load messages, there had to be a way to automatically make translation files for any language. Who is going to translate 2000 messages in 2012? Google! So we wired in the Google Translate API to do the heavy lifting.
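The generation step could look something like the sketch below. It deliberately does not call any real translation service: `translate_fn` is a placeholder for whatever machine translation call is wired in, `fake_translate` is a stand-in used only to make the example runnable, and the two-column CSV output is an assumed file format.

```python
import csv


def build_translation_rows(messages, translate_fn, existing=None):
    """Pair each message with a translation, reusing existing ones if given."""
    existing = existing or {}
    rows = []
    for message in messages:
        if message in existing:
            # Keep translations that already exist (for example, hand-corrected ones).
            rows.append((message, existing[message]))
        else:
            rows.append((message, translate_fn(message)))
    return rows


def write_translation_csv(rows, out):
    """Write (source, translation) pairs to a file-like object as CSV."""
    csv.writer(out).writerows(rows)


def fake_translate(message):
    # Stand-in for the machine translation call; it only tags the message.
    return "[hi] " + message
```

Passing previously saved translations as `existing` means a rerun only sends new messages to the translator, which matters when the service is rate-limited or billed per request.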
Result
So we wired up this system and it worked!
We ran a Hindi translation and here are the results.