Geertjan's Blog

  • August 17, 2007

Identifying Words in HTML Documents

Geertjan Wielenga
Product Manager
If you study Tim's Wicket tag parser, which underlies the navigator mentioned in yesterday's blog entry, you'll find it isn't hard to adapt to your own purposes. For example, here's the same parser in action, slightly modified: instead of looking for Wicket tags, it looks for all the words in the document and prints them to the navigator.
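To give a rough idea of what "looking for all words" can involve, here is a minimal standalone sketch. It is not Tim's parser and it doesn't touch the navigator at all; the class name HtmlWords and the method extractWords are made up for illustration, and it assumes that treating everything between '<' and '>' as markup is good enough. That assumption breaks for script blocks, comments containing '>', and entities such as &amp;amp;, which would show up as the word "amp".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the Wicket tag parser or any NetBeans API.
public class HtmlWords {

    // Simplification: anything between '<' and '>' is treated as markup.
    private static final Pattern TAG = Pattern.compile("<[^>]*>");

    // A word is a run of letters, optionally with inner apostrophes ("Don't").
    private static final Pattern WORD = Pattern.compile("\\p{L}+(?:'\\p{L}+)*");

    /** Returns the words found in the text of the document, in document order. */
    public static List<String> extractWords(String html) {
        // Replace each tag with a space so neighbouring text runs don't merge.
        String text = TAG.matcher(html).replaceAll(" ");
        List<String> words = new ArrayList<>();
        Matcher m = WORD.matcher(text);
        while (m.find()) {
            words.add(m.group());
        }
        return words;
    }

    public static void main(String[] args) {
        String html = "<html><body><h1>Hello</h1>"
                + "<p>Don't panic, <b>world</b>!</p></body></html>";
        // prints [Hello, Don't, panic, world]
        System.out.println(extractWords(html));
    }
}
```

In a real module, the resulting list would be handed to whatever populates the navigator, instead of being printed to standard output.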

Why might this be useful? Well, it's now a small step to a spell checker. The user would specify a file containing words, the words in the document would be compared against that file, and every word not found in the file would be printed to the navigator. Those words, because they don't match anything in the file, are the ones that are incorrect in one way or another. And that's all a spell checker should tell you: which words are incorrect. I've made a spell checker before, using annotations in the editor, but I haven't been able to find the code. Plus, I prefer this navigator approach to adding still more annotations to the editor. So, watch this space for HTML spell checker developments.
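The comparison step described above could be sketched roughly as follows. Nothing here comes from an existing spell checker: the class WordListCheck and its methods are hypothetical, and the sketch assumes a word file with one word per line and a case-insensitive match. A word that is missing from the file is simply reported, whether or not it is really misspelled.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper illustrating the word-list comparison; not an existing API.
public class WordListCheck {

    /** Reads a word file, assuming one word per line; blank lines are skipped. */
    public static Set<String> loadWordList(Path file) throws IOException {
        Set<String> words = new HashSet<>();
        for (String line : Files.readAllLines(file)) {
            String word = line.trim();
            if (!word.isEmpty()) {
                words.add(word);
            }
        }
        return words;
    }

    /** Returns the document words that are not in the word list, in first-seen order. */
    public static Set<String> unknownWords(List<String> documentWords, Set<String> knownWords) {
        // Lower-case the word list once so the lookup ignores case.
        Set<String> known = new HashSet<>();
        for (String k : knownWords) {
            known.add(k.toLowerCase(Locale.ROOT));
        }
        // LinkedHashSet keeps document order and reports each unknown word once.
        Set<String> unknown = new LinkedHashSet<>();
        for (String w : documentWords) {
            if (!known.contains(w.toLowerCase(Locale.ROOT))) {
                unknown.add(w);
            }
        }
        return unknown;
    }

    public static void main(String[] args) {
        List<String> documentWords =
                Arrays.asList("Wicket", "tag", "parsr", "navigator", "parsr");
        Set<String> knownWords =
                new HashSet<>(Arrays.asList("wicket", "tag", "parser", "navigator"));
        // prints [parsr]
        System.out.println(unknownWords(documentWords, knownWords));
    }
}
```

The set returned by unknownWords is what would end up in the navigator; loadWordList stands in for the file the user specifies.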


Comments ( 1 )
  • Jake Tuesday, August 21, 2007

    Spellchecker? I can remember one for the JavaDoc in NB5, but unfortunately I can't remember who provided it. What about figuring that out and using its features?


