Geertjan's Blog

  • November 20, 2010

HTML Parsing for NetBeans Platform Applications

Geertjan Wielenga
Product Manager
While creating a hyperlink for the JCite support I've been working on, I discovered how simple HTML parsing can be when you use the NetBeans Platform's "org.netbeans.api.html.lexer" (HTML Lexer module) and "org.netbeans.api.lexer" (Lexer module) packages:
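import javax.swing.text.Document;
import org.netbeans.api.html.lexer.HTMLTokenId;
import org.netbeans.api.lexer.Token;
import org.netbeans.api.lexer.TokenHierarchy;
import org.netbeans.api.lexer.TokenSequence;
import org.netbeans.lib.editor.hyperlink.spi.HyperlinkProvider;

public class JCiteHyperlinkProvider implements HyperlinkProvider {
    private static final String JCITE_IDENTIFIER = "[jc:";
    private int startOffset;
    private int endOffset;

    public boolean isHyperlinkPoint(Document doc, int offset) {
        TokenHierarchy<?> hi = TokenHierarchy.get(doc);
        TokenSequence<HTMLTokenId> ts = hi.tokenSequence(HTMLTokenId.language());
        if (ts != null) {
            //Position the sequence on the token at the given offset:
            ts.move(offset);
            if (ts.moveNext()) {
                Token<HTMLTokenId> tok = ts.token();
                int tokOffset = ts.offset();
                if (tok.text().toString().startsWith(JCITE_IDENTIFIER)) {
                    //Add 4 for the prefix "[jc:":
                    startOffset = tokOffset + 4;
                    //Remove 17: 4 for the prefix plus 13 for the suffix ":---fragment]":
                    endOffset = startOffset + tok.length() - 17;
                    return true;
                }
            }
        }
        return false;
    }
    //The other HyperlinkProvider methods (getHyperlinkSpan, performClickAction) are not shown here.
}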

The above is for identifying JCite markers in HTML text, that is, tokens that start with "[jc:" and end with ":---fragment]".
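
To make the offset arithmetic easier to check, here is a minimal plain-Java sketch of the same calculation, with no NetBeans APIs involved. The class name JCiteSpan, its members, and the sample marker text are all made up for illustration. It also assumes the marker fills the whole token, which is what the provider above relies on.

```java
// Plain-Java sketch of the prefix/suffix offset arithmetic.
// JCiteSpan and everything in it are hypothetical names, not part of JCite or NetBeans.
final class JCiteSpan {
    static final String PREFIX = "[jc:";          // 4 characters
    static final String SUFFIX = ":---fragment]"; // 13 characters

    final int start; // document offset of the first character after the prefix
    final int end;   // document offset just past the last character before the suffix

    private JCiteSpan(int start, int end) {
        this.start = start;
        this.end = end;
    }

    /**
     * Returns the span between prefix and suffix, or null when the token
     * text is not a marker of the assumed form.
     */
    static JCiteSpan of(String tokenText, int tokenOffset) {
        // The length check rejects text where prefix and suffix would overlap.
        if (!tokenText.startsWith(PREFIX) || !tokenText.endsWith(SUFFIX)
                || tokenText.length() < PREFIX.length() + SUFFIX.length()) {
            return null;
        }
        int start = tokenOffset + PREFIX.length();
        int end = tokenOffset + tokenText.length() - SUFFIX.length();
        return new JCiteSpan(start, end);
    }
}
```

For example, JCiteSpan.of("[jc:com.example.Foo:---fragment]", 10) gives start 14 and end 29, the same values the provider would compute as tokOffset + 4 and startOffset + tok.length() - 17 for a 32-character token at offset 10.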

Just posting it here so I can find it again when I need the above snippet later.
