X

An Oracle blog about BI Publisher

  • RTF
    November 20, 2007

Counting Distinctly

Scouring the forum this morning I almost had an 'instant message' session with Darice, we were pinging the forum back and forth so quickly - a great way to communicate and really shows the value of the forum. There were problems getting counting working when re-grouping. We ended up exchanging emails and I think things are going to be working now.
Another thread on counting came up today too and is worthy of mention here. It came from Vadim (nagornyi):








Here is sample XML:

<ROWSET>
<ROW>
<DEPT>A1</DEPT>
<DATA>2112</DATA>
</ROW>
<ROW>
<DEPT>A1</DEPT>
<DATA>7985</DATA>
</ROW>
<ROW>
<DEPT>B1</DEPT>
<DATA>8452</DATA>
</ROW>
<ROW>
<DEPT>C2</DEPT>
<DATA>2159</DATA>
</ROW>
</ROWSET>



Using Table Wizard, I am getting this report:

A1 2112
7985
B1 8452
C2 2159


Everything OK, but at the report header I need to count the number of DEPT, i.e. 3. How to get it? Thanks.






I have to admit I rushed at this one and missed the crux of the issue - Im making amends here and sharing a hidden feature of the RTF templates. I missed the fact that Vadim wanted the number of unique DEPTs not the total number ie 3 not 4.



There are ways to do this programmatically in XSL but its nasty, very nasty. Thats why we have a function that makes life tres simple - 'distinct_values'. Its a Publisher extension that does all the heavy lifting for you in the background, it takes the format:



xdoxslt:distinct_values(Node name)




and returns a numeric value.



In Vadim's case



<?count(xdoxslt:distinct_values(/ROWSET/ROW/DEPT))?>
will do the trick and return '3' rather than '4'.



A word of caution, its expensive, just like DISTINCT in SQL its going to require a lot of processing to read the node tree in and process it eliminating duplicates. There are other slightly less expensive ways to achieve the same using XPATH but for now, just use it wisely.





An update - Nilanshu came up with the XPATH solution I was aluding to above ... remember the data must be ordered by the data you want to count and remove the duplicates from.



If your data is ordered, then you may simply count all ROWS that have a different DEPT than the previous one, keeping into account the first row of course. This might give you a better performance for large XML files.

<?count(ROW[position()=1 or ./preceding-sibling::ROW[1]/DEPT!=DEPT])?>

XPATH 2.0 does provide something similar, but AFAIK XPATH 2.0 does not work in Oracle XMLP yet.

<?count(distinct-values('DEPT')?>

Join the discussion

Comments ( 5 )
  • Klaus Fabian Monday, February 1, 2010
    I don't expect that the XSL expression delivers good performance for large data sets:

    The algorithm is O(square(n)) - all good sorting algorithms are O(n*log(n)) - this is at most a sorting problem. Having a hash table or bins - the problem could be solved in nearly O(n).
    It may avoid running out of memory if the other expressions rely on the data set fitting into memory.
  • Ashwini Tuesday, April 20, 2010
    Hi,
    Distinct count syntax does not work for my RTF template. I'm using the syntax as follows:
    But when I deploy in the application or try to view from local machine the Excel/PDF/HTML outputs. The errors thrown is as follows:
    ************************************
    at java.lang.reflect.Method.invoke(Unknown Source)
    at oracle.apps.xdo.common.xml.XSLT10gR1.invokeNewXSLStylesheet(Unknown Source)
    at oracle.apps.xdo.common.xml.XSLT10gR1.transform(Unknown Source)
    at oracle.apps.xdo.common.xml.XSLTWrapper.transform(Unknown Source)
    at oracle.apps.xdo.template.fo.util.FOUtility.generateFO(Unknown Source)
    at oracle.apps.xdo.template.fo.util.FOUtility.generateFO(Unknown Source)
    at oracle.apps.xdo.template.FOProcessor.createFO(Unknown Source)
    at oracle.apps.xdo.template.FOProcessor.generate(Unknown Source)
    at RTF2PDF.runRTFto(RTF2PDF.java:629)
    at RTF2PDF.runXDO(RTF2PDF.java:439)
    at RTF2PDF.main(RTF2PDF.java:289)
    Caused by: oracle.xdo.parser.v2.XPathException: Function 'distinct-values' not found.
    at oracle.xdo.parser.v2.XSLProcessor.reportException(XSLProcessor.java:806)
    at oracle.xdo.parser.v2.XSLProcessor.newXSLStylesheet(XSLProcessor.java:571)
    ... 14 more
    ********************************************************
    Please let me know what needs to be done.
    Thanks
    Ashwini
  • Ashwini Tuesday, April 20, 2010
    I'm using the syntax as posted in the website count(distinct-values('Name') (enclosed in angle brackets and a question mark).
    But does not work!.
  • Ashwini Wednesday, April 21, 2010
    Thanks for the article. I could get Distinct values syntax right. I have used the following in my RTF file:
    if@inlines: count(xdoxslt:distinct_values(ProjectNumber))> ‘1’
    Enclosed in the angular brackets and a question mark.
  • guest Thursday, August 14, 2014

    Thanks a lot Tim Dexter.....u are Awesome.....


Please enter your name.Please provide a valid email address.Please enter a comment.CAPTCHA challenge response provided was incorrect. Please try again.