Tuesday Nov 14, 2006

Locale Demo on your own machine

"Seeing is believing", so I made the locale demo webstartable.

First, please check that your machine has Java 6 installed. If not, you can install it from here. Then just hit the following button:

Has the demo started? At this moment, it displays only the locales that are provided by the Java runtime itself. If you go to the "Locale Names" tab, it shows the total number of locales as "150".

Now the fun part begins. Download the CLDR Adapter into the extension directory of your Java runtime. For example, on Windows the path to the extension directory is typically "C:\Program Files\Java\jre1.6.0\lib\ext". Then hit the above "Launch" button again. It will take a bit longer to start, because this time the CLDR adapter retrieves the locale data from the Unicode Consortium's web site on the fly. Take a look at the "Locale Names" tab again and you will see the locale count jump to "365"!

This demo also showcases an interesting idea, i.e., "locale on demand". Locales can be installed into the runtime only when they are really needed. One of the things we are planning to look at for the next release (um, that means after Java 6 :-)) is how we can deploy locale data to the clients, and this demo shows one possible approach.

Wednesday Nov 08, 2006


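Next week I am going to give a presentation at the Internationalization and Unicode Conference in Washington D.C. about the internationalization features of the soon-to-be-released Java 6. I am currently preparing the demos for it, and one of them showcases the feature I introduced in the previous entry, the Locale Sensitive Services SPI (what a long name), so I would like to describe it briefly here.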


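The other component is invisible, but it is the core of this demo. The Unicode Consortium publishes locale data described in XML, called CLDR, and this component plugs the CLDR data into the Java runtime by using the Locale Sensitive Services SPI. At present Java does not support the three-letter language codes defined in ISO 639-2, so locales that use those language codes cannot be plugged in, but even so, more than 300 locales become available in the Java text processing classes through this CLDR adapter!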






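For this demo, unlike the Tamil case, I could not borrow a font from Windows XP (because Windows XP does not support the language either), so I used an open source TrueType font for Amharic called "Jiret" (which can be downloaded here).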

If you are coming to the Unicode Conference next week, please do come and say hello. I think I can show you the demo in many different languages.

Monday Oct 30, 2006

More than 300 locales in Java!

I am going to give a presentation at the upcoming Internationalization and Unicode Conference held in Washington D.C. next month. The main topic of the presentation is the new i18n features in the upcoming Java 6, and I am now preparing a demo to showcase the potential of the Locale Sensitive Services SPI feature (I know. Kinda long name), which I wrote about in the previous entry (wow, it was 8 months ago :-)).

The demo consists of two components. One is just a simple Swing application which displays a list of available locales in the platform (precisely, a list returned from Locale.getAvailableLocales()), and displays formatted dates, times, numbers, currencies, or language/country/timezone names in a particular locale which is selected in the locale list.
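The core of that first component can be sketched in a few lines. This is only a minimal illustration, not the demo's actual source: the class name LocaleListSketch and the helper describe() are made up for this example, and the output depends on which locales your runtime has.

```java
import java.text.DateFormat;
import java.text.NumberFormat;
import java.util.Date;
import java.util.Locale;

// Hypothetical class name; a console stand-in for the Swing demo's locale list.
public class LocaleListSketch {

    // Formats one date/time and one number the way the given locale expects.
    static String describe(Locale locale, Date when, double amount) {
        DateFormat dateTime =
            DateFormat.getDateTimeInstance(DateFormat.FULL, DateFormat.FULL, locale);
        NumberFormat number = NumberFormat.getNumberInstance(locale);
        return locale.getDisplayName(Locale.ENGLISH) + ": "
            + dateTime.format(when) + " | " + number.format(amount);
    }

    public static void main(String[] args) {
        // The same list the demo shows in its locale list.
        Locale[] locales = Locale.getAvailableLocales();
        System.out.println("Total locales: " + locales.length);

        // Print only the first few locales to keep the output short.
        Date now = new Date();
        for (int i = 0; i < Math.min(5, locales.length); i++) {
            System.out.println(describe(locales[i], now, 1234567.891));
        }
    }
}
```

The total printed on the first line is the number that changes once additional locale providers are installed.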

The other component is invisible, but it is the more interesting one. It is an adapter which reads the CLDR's locale data in XML format, transforms it, and plugs it into the Java runtime through the Locale Sensitive Services SPI. Java currently does not support the three-letter language codes defined in ISO 639-2, so some locales cannot be plugged in, but we still have more than 300 locales available in the Java text processing classes with this CLDR adapter!

Here is a preview of the demo. This demonstrates the date and time formatting in the Tamil (India) locale, which the Java runtime does not yet support out of the box:


Note that since Java does not support Tamil, it does not contain a suitable font either. However, the OS I used for running this demo was Windows XP, which contains fonts that are capable of displaying Tamil, such as the "Latha" font. And with the "font fallback" mechanism we introduced in Java 5.0, the demo worked like a charm with the formatted date and time, which are derived from the CLDR data.

Here is another screen shot of the demo, which displays date and time in Amharic (Ethiopia) locale.


For this demo, since Windows XP does not support Amharic, I used an open source TrueType font for Amharic called "Jiret", which is available here.

If you are interested, and planning to be at the next Unicode conference, please come to our presentation. I am sure I can show you more interesting bits!

Wednesday Feb 15, 2006

About the Locale Sensitive Services SPI in Mustang Beta

Now that the Mustang (a.k.a. Java SE 6) beta is out (yay!), I think it is about time to describe one of the new i18n features so that you can ride the Mustang beta with pleasure. The feature I want to mention here has the name "Locale Sensitive Services SPI". Sounds complicated? No, it's not! OK, let me introduce it to you.

With this Mustang Beta release, we support more than 100 locales in the java.text and java.util packages. Although this covers lots of areas of the globe, there are areas that are yet to be supported by the Java runtime, and some customers do want them supported. Believe it or not, supporting a locale and its data requires a lot of investigation, such as finding the most popular date format or the translation of a country name in that language. It gets more difficult for smaller countries/regions. Sometimes even political ramifications come in. To resolve such difficulties, there is an interesting project going on at the Unicode Consortium called the "Common Locale Data Repository (CLDR)", which standardizes commonly used locale data. However, we cannot include all of the defined data in the Java runtime, as it would bloat the runtime size (we did introduce a couple of new locales in Mustang from the CLDR, though). So, the idea we came up with was to open up an interface with which developers can plug in their own locale data and related services. I guess that the question in your mind now is, how do I use them, right?

To provide locale data/services, you first need to decide which functionality you want to provide for your own locale. With Mustang, your locale data implementation can provide the following functionalities.

  • java.text.BreakIterator
  • java.text.Collator
  • java.text.DateFormat
  • java.text.DateFormatSymbols
  • java.text.DecimalFormatSymbols
  • java.text.NumberFormat
  • java.util.Currency
  • java.util.Locale
  • java.util.TimeZone

Once you decide which functionality you want to provide with your locale, you will need to implement the corresponding SPI (which stands for "Service Provider Interface"), which resides in either the java.text.spi or the java.util.spi package. Let's say you want to provide a DateFormat object for a new locale. Then you need to implement java.text.spi.DateFormatProvider. Since java.text.spi.DateFormatProvider is an abstract class, you will need to extend it and implement the following 4 methods.

  • getAvailableLocales()
  • getDateInstance()
  • getDateTimeInstance()
  • getTimeInstance()

You'll notice that "getAvailableLocales()" is actually inherited from the parent class, i.e., LocaleServiceProvider, so all the SPI providers need to implement it so that they can "declare" which locales they claim as "supported". You will also notice that the other three methods mirror the factory methods of the corresponding API class, i.e., the java.text.DateFormat class. This means that the object which your implementation returns is passed down to the application as it is.
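Here is a minimal sketch of what such a provider could look like. Everything specific in it is an assumption made for illustration: the class name SketchDateFormatProvider, the placeholder locale "xx_XX", and the date/time patterns are invented, and a real provider would return properly localized formats.

```java
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.text.spi.DateFormatProvider;
import java.util.Locale;

// Hypothetical provider for a made-up locale; not a real locale implementation.
public class SketchDateFormatProvider extends DateFormatProvider {

    // Placeholder locale: "xx" and "XX" are not real language/country codes.
    public static final Locale SKETCH = new Locale("xx", "XX");

    // "Declares" which locales this provider claims to support.
    @Override
    public Locale[] getAvailableLocales() {
        return new Locale[] { SKETCH };
    }

    @Override
    public DateFormat getDateInstance(int style, Locale locale) {
        requireSupported(locale);
        return new SimpleDateFormat(datePattern(style), Locale.ROOT);
    }

    @Override
    public DateFormat getTimeInstance(int style, Locale locale) {
        requireSupported(locale);
        return new SimpleDateFormat(timePattern(style), Locale.ROOT);
    }

    @Override
    public DateFormat getDateTimeInstance(int dateStyle, int timeStyle, Locale locale) {
        requireSupported(locale);
        return new SimpleDateFormat(
            datePattern(dateStyle) + " " + timePattern(timeStyle), Locale.ROOT);
    }

    // Rejects locales that this provider did not declare.
    private static void requireSupported(Locale locale) {
        if (locale == null) {
            throw new NullPointerException("locale");
        }
        if (!SKETCH.equals(locale)) {
            throw new IllegalArgumentException("Unsupported locale: " + locale);
        }
    }

    // Simplification: SHORT gets a numeric pattern, every other style a longer one.
    private static String datePattern(int style) {
        return style == DateFormat.SHORT ? "yyyy-MM-dd" : "yyyy MMMM d";
    }

    private static String timePattern(int style) {
        return style == DateFormat.SHORT ? "HH:mm" : "HH:mm:ss";
    }

    public static void main(String[] args) {
        // Calls the provider directly; no installation into the runtime is needed for this.
        DateFormat df = new SketchDateFormatProvider().getDateInstance(DateFormat.SHORT, SKETCH);
        System.out.println(df.format(new java.util.Date()));
    }
}
```

The main method calls the provider directly, so the sketch runs without being installed. For the runtime to discover a provider like this, its JAR also needs a provider-configuration file named META-INF/services/java.text.spi.DateFormatProvider that lists the implementation class name.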

After implementing those required methods, you need to package your implementation so that you can deploy it with the Java runtime. Since the Locale Sensitive Services SPIs are based on the standard Java Extension Mechanism, you can package it as a JAR file (with a few tricks in its MANIFEST file, which can be found here) and place it in the extension directory.

That's it, folks. If an application requests a DateFormat object for your locale, your provider object is now created and used. Happy locale-adding!

Wednesday Jan 05, 2005

I only bought a pastry!

Happy new year! After a hiatus, I will try to resume this blog. Let's see :)

Some of you may have heard that the Turkish government has redenominated its currency. On January 1st, 2005, it introduced the New Turkish Lira (YTL), where one New Turkish Lira is equivalent to 1,000,000 (old) Turkish Lira (TL). The transition from the old currency to the new one seemingly went well, but it looks like some people are confused. After the new currency was introduced, a Turkish man bought a pastry at a pastry store. It turned out that the man's credit card was charged 1,400,000 New Turkish Lira (about $10,000) for the pastry! Neither the man nor the pastry shop was sure what had happened until the credit card bank gave the man a call to confirm the transaction. (cf. Man mistakenly buys Turkey's most costly pie)

Although the Java runtime cannot convert the amount in the old currency to the new one, this New Turkish Lira currency support has already been incorporated in the latest J2SE 5.0 Update 1 release. Let's take a look at the following piece of code:

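// Requires java.util.Currency and java.util.Locale.
Locale turkish = new Locale("tr", "TR");
Currency turkishCurrency = Currency.getInstance(turkish);
System.out.println(turkishCurrency.getSymbol(turkish));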

This code prints "YTL" on or after 1/1/2005, otherwise prints "TL". Please note that getSymbol() returns the ISO 4217 currency code if the display language is not Turkish, such as getSymbol(Locale.ENGLISH). In such a case, the above code prints "TRY" and "TRL" for the New Turkish Lira and the (old) Turkish Lira respectively.
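To see how the display language changes the result, here is a small self-contained sketch. The class name TurkishCurrencySketch and the helper describe() are made up for this example. What it prints depends on the currency data shipped with your runtime, so no output is shown here.

```java
import java.util.Currency;
import java.util.Locale;

// Hypothetical class name; compares currency symbols across display languages.
public class TurkishCurrencySketch {

    // Returns "<ISO 4217 code> / <symbol as shown to readers of displayLocale>".
    static String describe(Locale country, Locale displayLocale) {
        Currency currency = Currency.getInstance(country);
        return currency.getCurrencyCode() + " / " + currency.getSymbol(displayLocale);
    }

    public static void main(String[] args) {
        Locale turkey = new Locale("tr", "TR");
        // Symbol for a Turkish-language display.
        System.out.println("Turkish display: " + describe(turkey, turkey));
        // For a non-Turkish display language the symbol may simply be the ISO code.
        System.out.println("English display: " + describe(turkey, Locale.ENGLISH));
    }
}
```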

Monday Sep 06, 2004

New Internationalization Features of the Java™ Platform

That's the title of the talk Craig and I are going to give at the 26th Unicode Conference, which starts tomorrow. I am still swamped with lots of demos, machine setup, etc., and am spending the Labor Day long weekend on those, but I believe that we are in good shape now.

Friday Jul 16, 2004

Unicode Conference

Looks like I have to fill in for John at the next Unicode Conference, with Craig at Oracle. I knew that the tutorial sessions are longer than the technical sessions, but I did not know that the longest slot was assigned to our Java i18n tutorial session. I have done technical sessions a couple of times, which are 40 minutes long, but I am not sure how we will end up filling 2 hours and 40 minutes.

Well, let's forget it for now. Because I am off to Las Vegas!

Wednesday Jun 30, 2004

Swing vs. SWT

I am blogging at Moscone, waiting for our i18n BOF session to start...

One of the interesting sessions I saw today was the Swing vs SWT panel session (well, the title may be different, but something like that). The creators of both Swing and SWT were there and each explained their toolkit in 10 minutes, and then users of those toolkits (NetBeans for Swing, and another IDE for SWT) described what they think of them. After that, they answered the attendees' questions and went into an endless battle. Well, not really.

I expected something more like a heated battle, but it was kind of a cooperative discussion. The phrase I liked most was something like, "The concepts of those toolkits are different. Swing focuses on cross-platform compatibility, while SWT focuses on exploiting native platform functionality. They are complementary to each other." I think this explains it all and lets developers decide which toolkit is suitable for their applications.

Monday Jun 28, 2004

It's tough to keep a secret...

I don't have the exact number, but I think this year's JavaOne is well packed compared to last year's. In today's keynote, Graham announced that the upcoming J2SE's version number has now changed to 5.0! I knew about this change a couple of months ago, but of course I had not been able to mention it. In fact, we had to use the phrase "version 1.5" when we updated the Tiger related documents (such as my article about input methods), despite the fact that this would soon be a false version number... BTW, I like the acronym "JDK" a lot more than "SDK". This is a good change.

Friday Jun 25, 2004

Off to JavaOne

I am off to JavaOne next week, so I probably will not update this blog often. One session I do not want to miss is the Sun General Session, where they will discuss the possibility of open sourcing Java. Rod Smith of IBM and Rob Gingell of Sun, who are the sender and the recipient of the controversial open letter about open sourcing Java, will be there. I'd also very much like to hear what Lawrence Lessig will say about open sourcing Java.

I wish I could moblog from JavaOne..., which is impossible because a) I don't have a mobile phone with a camera integrated, and b) this blog system is not capable of moblogging :(

Wednesday Jun 23, 2004

JavaOne 2004 is just around the corner

At JavaOne 2004, we will hold the annual BOF of the Java internationalization team. Of course I will be there. We will be presenting the new features that are introduced in the upcoming J2SE version 1.5, such as Unicode 4.0 support (including JSR 204, supplementary character support), multilingual text rendering, Vietnamese locale support, and more!

This is a good opportunity for the people who are interested in the internationalization field, to hand us tons of homework in person ;) Your opinion matters. So come and join us!

Monday Jun 14, 2004

Picture taken for a demo

Sherman just stopped by my office and took a picture of me, which will be used in a demo for the upcoming JavaOne. Hope it looks decent at least :)

Thursday Jun 10, 2004

Lost in Technical Translation?

Recently, I had a chance to review a Japanese translation of a technical document written by my colleague (yes, I'm Japanese). It may be because I have never done translation for a living, but every time I review those technical translations, I realize how difficult it is to select the most appropriate Japanese word for a new technical term.

For example, we have introduced Unicode 4.0 in J2SE 1.5 (still in beta 2), and it brings a lot of new terms such as "supplementary character" or "surrogate". Simple translations of those words are not very familiar to most Japanese people. So when we translate those words, we have to carefully choose (or even weave) Japanese words that represent the new words most appropriately.

On the other hand, we Japanese have a handy workaround for importing those new words: Katakana. Katakana is one of the Japanese scripts, and it is mainly used for loanwords from foreign languages. It's a phonetic script, so it can import a foreign word just by how it sounds. For example, "keyboard" is spelled (in Katakana) as "ki-bo-do" (where '-' denotes the prolonged sound symbol). We do have an equivalent pure Japanese word for keyboard, but it is rarely used. We could use this workaround for those new technical terms, but the resulting words do not convey any meaning. That's why we should avoid this workaround where possible. Since this is a technical translation, it may not be very useful unless the translated word has a meaning.

Yes, it's a tough job. We don't want to be blamed as the ones who introduced those weird translations in the first place :)


