Internationalisation
fisharebest
The I18N features work great for languages written in the Latin script. But for languages written in other scripts, there are some difficulties.
For example, Arabic has its own symbols for the digits (٠١٢٣٤٥٦٧٨٩), and even if I translate the "showing X to Y of Z" string, I still get Latin digits:
عرض 1 إلى 20 من 39
For proper I18N, this should be displayed as:
١ إلى ٢٠ من ٣٩
One option would be an extra parameter "sDigits", default "0123456789", that is used as a one-to-one mapping. For Arabic, I would just pass in "٠١٢٣٤٥٦٧٨٩".
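A minimal sketch of what such a one-to-one mapping could look like. Note that "sDigits" is only the parameter proposed here, not an existing DataTables option, and localiseDigits is a hypothetical helper name used for illustration:

```javascript
// Hypothetical helper: maps each western digit to the character at the
// same position in sDigits ("sDigits" is the proposed parameter, not a
// real DataTables option).
function localiseDigits(value, sDigits) {
  // Default to western digits, so the mapping is a no-op unless overridden.
  var digits = Array.from(sDigits || "0123456789");
  return String(value).replace(/[0-9]/g, function (d) {
    return digits[Number(d)];
  });
}

var arabicDigits = "٠١٢٣٤٥٦٧٨٩";
console.log(localiseDigits(39, arabicDigits));
console.log(localiseDigits("1 إلى 20 من 39", arabicDigits));
console.log(localiseDigits(39)); // 39 (default mapping leaves digits unchanged)
```

Only the ASCII digits 0-9 are replaced; everything else in the string passes through untouched.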
Another issue is the actual formatting of numbers. For large numbers, it is helpful to use thousands separators, and these vary by language. For example:
Showing 1 to 20 of 1,234
Affichage de 1 à 20 parmi 1 234
Zeige 1 bis 20 von 1.234
Again, an extra parameter "sThousandsSeparator", default "," could be used.
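As a rough sketch, assuming whole numbers only, the separator could be applied like this. Again, "sThousandsSeparator" is just the proposed parameter and formatThousands is a hypothetical name:

```javascript
// Hypothetical helper: groups the digits of a whole number in threes,
// using the proposed "sThousandsSeparator" (default ",").
function formatThousands(n, sThousandsSeparator) {
  var sep = sThousandsSeparator === undefined ? "," : sThousandsSeparator;
  // Insert the separator at every position that is followed by a
  // multiple of three digits and is not at the start of the number.
  return String(n).replace(/\B(?=(\d{3})+(?!\d))/g, sep);
}

console.log(formatThousands(1234));      // 1,234
console.log(formatThousands(1234, ".")); // 1.234
console.log(formatThousands(1234, " ")); // 1 234
console.log(formatThousands(39));        // 39
```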
( As a lazy application developer (!), it would be great to simply pass in an IETF language tag (en, fr, de, ar, etc.), and have DataTables do all of this for me automatically! )
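For illustration only: in JavaScript runtimes that provide the standard ECMAScript Internationalization API, Intl.NumberFormat accepts a language tag and chooses the digits and separators itself. This is not something DataTables is described as doing in this thread, and formatForLocale is a hypothetical wrapper name. The exact output depends on the locale data shipped with the runtime, so no output is shown for most of the calls:

```javascript
// Hypothetical wrapper around the standard Intl.NumberFormat API.
// languageTag is a BCP 47 / IETF tag such as "en", "fr", "de" or "ar".
function formatForLocale(n, languageTag) {
  return new Intl.NumberFormat(languageTag).format(n);
}

console.log(formatForLocale(1234, "en")); // 1,234
console.log(formatForLocale(1234, "fr"));
console.log(formatForLocale(1234, "de"));
// The "-u-nu-arab" extension explicitly requests Arabic-Indic digits.
console.log(formatForLocale(1234, "ar-u-nu-arab"));
```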
Also, the default sorting uses a simple ASCII comparison. This works with English, but fails badly with languages that use diacritic marks or non-Latin scripts. Would it make more sense for the default "text" sort function to use localeCompare(x,y)? I know I can add this as a custom sort function, but IMHO, it would make a better default than the current function.
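A small sketch of the difference, using plain Array sorting rather than any DataTables API. localeTextCompare is a hypothetical name, and the exact ordering localeCompare produces depends on the browser's locale support:

```javascript
// Hypothetical comparator built on the standard String.prototype.localeCompare.
function localeTextCompare(x, y) {
  return String(x).localeCompare(String(y));
}

// "\u00e9lan" is "élan", written with an escape so the precomposed
// character U+00E9 is used regardless of how this text is stored.
var names = ["zebra", "\u00e9lan", "apple"];

// Default sort compares code units, so "é" (U+00E9) lands after "z".
console.log(names.slice().sort());

// Locale-aware sort; typically places "élan" between "apple" and "zebra".
console.log(names.slice().sort(localeTextCompare));
```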
This discussion has been closed.
Replies
> localeCompare
The reason I haven't used this by default is compatibility issues. My understanding is that it isn't supported by all browsers. It would be great if we could use this, but until it is widely supported and enough people have upgraded to compatible browsers, we are stuck with the plug-in method for now (presumably in countries where this would be of most use, people will be using decent browsers...).
> sDigits
Very interesting! I must confess I hadn't considered this before. Are all languages base 10? I would guess probably not... I suspect that a callback is going to be needed for this to make it properly compatible, and in fact there is a suitable callback already available: http://datatables.net/ref#fnInfoCallback . This is executed whenever the information display is updated, and you can modify the output based on the input parameters as required.
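A sketch of how such a callback might rewrite the info string with Arabic digits. The parameter list (oSettings, iStart, iEnd, iMax, iTotal, sPre) is an assumption about the callback's shape and should be checked against the reference page linked above; the function is called directly here so that the example runs without DataTables:

```javascript
// Assumed callback shape: (oSettings, iStart, iEnd, iMax, iTotal, sPre).
// Check the fnInfoCallback reference before relying on this signature.
function infoCallback(oSettings, iStart, iEnd, iMax, iTotal, sPre) {
  var arabic = "٠١٢٣٤٥٦٧٨٩";
  // Replace each western digit with the Arabic digit at the same index.
  function toArabic(n) {
    return String(n).replace(/[0-9]/g, function (d) {
      return arabic.charAt(Number(d));
    });
  }
  // Build the translated "showing X to Y of Z" string from the inputs.
  return "عرض " + toArabic(iStart) + " إلى " + toArabic(iEnd) +
    " من " + toArabic(iTotal);
}

// Called by hand for illustration; in a page it would be supplied as the
// fnInfoCallback initialisation option.
console.log(infoCallback(null, 1, 20, 39, 39, ""));
```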
Your thoughts on these options are very welcome indeed!
Regards,
Allan
To all intents and purposes, yes. Look at the Unicode Common Locale Data Repository (cldr.unicode.org), and extract the file numberingSystems.xml
Numbering systems are either based on a substitution of the digits 0-9, or an algorithm. For those languages that use algorithms (typically spelling the number out in words or using a roman-numeral style approach), the use of "western" digits is widely adopted and widely understood.
> localeCompare
We've started using it extensively on our project (webtrees.net). The function is included in the 3rd edition of the ECMAScript specification (Dec 1999), and so should always be present. However, there is no reference implementation and browsers are free to be as "locale aware" as they please. I would imagine that it always performs better than strcasecmp().
Should it also be applied to the numbers in the pagination buttons - or is there a separate callback for that?
Greg
Thanks for the extra information - I will look at using localeCompare and other internationalisation options as a priority once I finish my current work on DataTables (reorganising the core a bit just now).
Regarding the pagination buttons and fnFormatNumber - I must admit I hadn't actually thought of this: what would happen with page numbers > 1000? Yes, it almost certainly should do that rendering. Equally, the length change menu should also have this taken into account. I'll add these options in as soon as I can.
Regards,
Allan
The length change menu already works fine - as the contents are specified by the client application.
FYI, these two pages show I18N being applied to as many of the components as possible.
http://fisharebest.webtrees.net/indilist.php?surname=ROACH&lang=en_GB
http://fisharebest.webtrees.net/indilist.php?surname=ROACH&lang=ar
As a reminder to anyone developing an I18N site, it is best to assume that there is no such thing as "numeric" data. Once it has been localised, it will look and behave like text. Hence all the "numeric" columns in this example have a corresponding hidden column for sorting.
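The idea can be sketched without any DataTables API: each row keeps the localised text for display alongside a raw value that is used only for sorting. The field names display and sortKey are hypothetical:

```javascript
// Each row carries the localised text plus a raw numeric sort key
// (the equivalent of the hidden column described above).
var rows = [
  { display: "1.234", sortKey: 1234 },
  { display: "20", sortKey: 20 },
  { display: "300", sortKey: 300 }
];

// Sorting on the displayed text compares strings, so "1.234" comes first.
var byText = rows.slice().sort(function (a, b) {
  return a.display < b.display ? -1 : a.display > b.display ? 1 : 0;
});
console.log(byText.map(function (r) { return r.display; }));
// [ '1.234', '20', '300' ]

// Sorting on the hidden key gives the numeric order.
var byKey = rows.slice().sort(function (a, b) {
  return a.sortKey - b.sortKey;
});
console.log(byKey.map(function (r) { return r.display; }));
// [ '20', '300', '1.234' ]
```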
Greg
1. fnFormatNumber is now called when outputting the numbers for the full numbers pagination type
2. localeCompare is used for string comparisons. The downside is that there is a bit of a performance hit relative to operator checking - something to be aware of in future, so if this is a bottleneck for someone, and they are dealing with just ASCII, then they can override the default methods.
Thanks very much for the input!
Regards,
Allan