Translation segmentation

I've been hacking furiously on doc.java.sun.com, the community translation site for Java documentation (even if you're not interested in translation, it's still an interesting way to read javadoc). There's been a pile of bug fixing. The biggest change that people using the site will notice is that when you volunteer to do translations, it breaks large chunks of text down into more manageable segments. This isn't the full-up segmentation that professional translators need, but it's a step in the right direction. I also got a start on a statistics module. And I poured all of the JavaEE 5 javax.* sources into it. With all of JavaSE and JavaEE in it, it's clear that I need to work on the navigation facilities.

Some folks have looked at the sources and asked "where's the database?". I'm a big believer in RAM. If I used a real database, some pages would require close to 1000 database lookups to construct. Even with all kinds of caching, this turns into far too much time wasted waiting for a disk to spin. Instead, I just use in-memory data structures. Every user action (a translation or a vote) adds to the data structure and to a log file. When I reboot the server, I just reprocess the logs. Given the statistics and semantics of this application, this technique works well. It's moderately straightforward to make this work well on clusters. And it's fast.
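For readers who want to picture the log-and-replay idea, here is a minimal sketch in Java. It is not the site's actual code: the class name `VoteStore`, the one-segment-id-per-line log format, and the choice to track only votes are all assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical sketch: vote counts live in a HashMap, and every vote is
 * also appended to a plain-text log. Assumes segment ids contain no newlines.
 */
class VoteStore {
    private final Map<String, Integer> votes = new HashMap<>();
    private final Path log;

    /** On startup, rebuild the in-memory state by replaying the log. */
    VoteStore(Path log) throws IOException {
        this.log = log;
        if (Files.exists(log)) {
            for (String line : Files.readAllLines(log, StandardCharsets.UTF_8)) {
                if (!line.isEmpty()) {
                    apply(line);
                }
            }
        }
    }

    /** Record a vote: append to the log first, then update memory. */
    synchronized void vote(String segmentId) throws IOException {
        Files.write(log, List.of(segmentId), StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        apply(segmentId);
    }

    /** Reads never touch the disk. */
    synchronized int count(String segmentId) {
        return votes.getOrDefault(segmentId, 0);
    }

    /** The single state transition, shared by live updates and replay. */
    private void apply(String segmentId) {
        votes.merge(segmentId, 1, Integer::sum);
    }
}
```

Creating a second `VoteStore` on the same log file plays the role of a server reboot: the constructor reprocesses the log and ends up with the same counts. Routing both live updates and replay through one `apply` method is what keeps the two paths from drifting apart.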

Comments:

Hi James. Nice to see JavaEE there. I think it might be useful to have a top navigation per platform, perhaps listing entries for all the specifications in that platform. That way one can browse top-down. - eduard/o

Posted by eduardo pelegri-llopart on October 20, 2006 at 01:28 PM PDT #

Hi James Gosling, Regarding your documentation project (http://blogs.sun.com/jag/entry/translation_segmentation), what you really want to use is Prevayler! Unfortunately, they are currently fixing the web site, but what Prevayler does is to keep the entire application in RAM, while maintaining logs and fully recoverable transactional semantics. Neat and mature tool!

Posted by Hristo on October 21, 2006 at 06:09 AM PDT #

Good work, would it be possible to add "Turkish" to the language list? Thanks.

Posted by ahmetaa on October 21, 2006 at 01:22 PM PDT #

To Hristo: there are better object or XML databases (like db4o or Sleepycat) than Prevayler. Prevayler is a lost project with a lot of noise.

Posted by ahmetaa on October 21, 2006 at 01:27 PM PDT #

Translation is not enough. It would be better to implement a feature like the one in the PHP Manual, where a user can add a comment to a given page describing his/her experience with it, as well as provide some proof-of-concept examples. If that happens, the Java docs would be great. However, heavy moderation would be needed.

Posted by pcdinh on October 21, 2006 at 02:31 PM PDT #

pcdinh, to be honest, most of the time those PHP-manual-style comments create great confusion. Some say it works, some say it does not, old versions mix with new ones, etc. I never liked it, to be honest. I think a clean document translation is the way to go.

Posted by afsina on October 21, 2006 at 04:29 PM PDT #

Hi - great. Why not support more languages (Afrikaans being the one I want)? Or better yet, allow others to add languages to the list.

Posted by Grant on October 22, 2006 at 10:29 PM PDT #

Any chance of hooking onto a greater translation project, such as the Rosetta Project? This would expand your translation listing to approximately 2,376 languages. :)

Posted by Joel Buckley on October 23, 2006 at 01:18 AM PDT #

Awesome that you got Sun's legal to let you pull it off. Keep up the good work, James.

Posted by Dalibor Topic on October 23, 2006 at 07:30 AM PDT #

Is it down temporarily? I can't get to it.

Posted by Geir Magnusson Jr on October 24, 2006 at 09:13 PM PDT #

The site is down. James is currently on medical leave and things will have to wait for him to return. Sorry about that.

Posted by Ray Gans on October 25, 2006 at 04:10 AM PDT #
