HTML Parsing for NetBeans Platform Applications

I was creating a hyperlink for the JCite support I've been working on and discovered how simple HTML parsing can be when you use the NetBeans Platform's "org.netbeans.api.html.lexer" (HTML Lexer module) and "org.netbeans.api.lexer" (Lexer module) packages:

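import javax.swing.text.Document;
import org.netbeans.api.html.lexer.HTMLTokenId;
import org.netbeans.api.lexer.Token;
import org.netbeans.api.lexer.TokenHierarchy;
import org.netbeans.api.lexer.TokenSequence;
import org.netbeans.lib.editor.hyperlink.spi.HyperlinkProvider;

public class JCiteHyperlinkProvider implements HyperlinkProvider {

    private static final String JCITE_IDENTIFIER = "[jc:";
    private int startOffset;
    private int endOffset;

    @Override
    public boolean isHyperlinkPoint(Document doc, int offset) {
        TokenHierarchy<?> hi = TokenHierarchy.get(doc);
        TokenSequence<HTMLTokenId> ts = hi.tokenSequence(HTMLTokenId.language());
        if (ts != null) {
            ts.move(offset);
            //moveNext() returns false when there is no token at the offset:
            if (!ts.moveNext()) {
                return false;
            }
            Token<HTMLTokenId> tok = ts.token();
            int tokOffset = ts.offset();
            if (tok.text().toString().startsWith(JCITE_IDENTIFIER)) {
                //Add 4 to skip the prefix "[jc:":
                startOffset = tokOffset + 4;
                //Subtract 17: the 4-character prefix "[jc:" plus
                //the 13-character suffix ":---fragment]":
                endOffset = startOffset + tok.length() - 17;
                return true;
            }
        }
        return false;
    }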
    
    ...
    ...
    ...

The above is for identifying JCite citations in HTML text, that is, tokens that start with "[jc:" and end with ":---fragment]".
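
To make the offset arithmetic concrete, here is a minimal sketch in plain Java, without any NetBeans dependencies. The class name JCiteSpan, the span method, and the sample citation "com.example.Foo" are all hypothetical and only there for illustration. Unlike the provider above, which tests only the prefix, this sketch also checks for the ":---fragment]" suffix, which is an extra assumption on my part.

```java
public class JCiteSpan {

    // Delimiters taken from the comments in the provider code.
    static final String PREFIX = "[jc:";
    static final String SUFFIX = ":---fragment]";

    /**
     * Returns {start, end} document offsets of the text between the
     * prefix and the suffix, or null if the token text is not a citation.
     * tokenOffset is the offset of the token's first character.
     */
    static int[] span(String tokenText, int tokenOffset) {
        int delimiters = PREFIX.length() + SUFFIX.length(); // 4 + 13 = 17
        if (tokenText.length() < delimiters
                || !tokenText.startsWith(PREFIX)
                || !tokenText.endsWith(SUFFIX)) {
            return null;
        }
        int start = tokenOffset + PREFIX.length();
        int end = start + tokenText.length() - delimiters;
        return new int[] {start, end};
    }

    public static void main(String[] args) {
        // Hypothetical token text starting at document offset 10.
        String token = "[jc:com.example.Foo:---fragment]";
        int[] s = span(token, 10);
        System.out.println(s[0] + ".." + s[1]); // prints 14..29
    }
}
```

The 17 that is subtracted in the provider is the same sum as here: 4 characters of prefix plus 13 characters of suffix, so the end offset lands right after the cited name.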

Just posting it here so I can find it again when I need the above snippet later.

About

Geertjan Wielenga (@geertjanw) is a Principal Product Manager in the Oracle Developer Tools group living & working in Amsterdam. He is a Java technology enthusiast, evangelist, trainer, speaker, and writer. He blogs here daily.

The focus of this blog is mostly on NetBeans (a development tool primarily for Java programmers), with an occasional reference to NetBeans, and sometimes diverging to topics relating to NetBeans. And then there are days when NetBeans is mentioned, just for a change.
