select min(lo_quantity), max(lo_ordtotalprice) from lineorder;The LINEORDER table in this example has 11,997,996 rows and has been populated into the IM column store using the default in-memory compression, which is "memcompress for query low"
From the statistic output we see that the query accessed 22 IMCUs (IM scan CUs memcompress for query low) and a total of 11,997,996 rows in those 22 IMCUs (IM scan rows), but only "returned" 22 rows (IM scan rows projected) from the IM column store. So how does Oracle return the correct aggregations when it only returns 22 rows from the query? The optimization that is happening is that Database In-Memory is returning the aggregations at the IMCU level back to the query layer. Most of the aggregation work occurs during the actual scan of the IMCUs and this is where considerable time is saved, especially when scanning very large objects. The query layer has much less work to do to complete the aggregation (i.e. over 11 million rows versus just 22).
We still looked at 22 IMCUs (IM scan CUs memcompress for query low) and we know that we scanned 11,997,996 rows (IM scan rows), but now we see that the "IM scan rows projected" statistic also reflects the total number of rows that are returned.
Let's look at one more query, this one finds "What is the most expensive air-mail order we have received to date?" It has both an aggregation and a filter predicate.
Now we see three additional statistics, specifically "IM scan CUs predicates applied", "IM scan CUs predicates received" and "IM scan segments minmax eligible". These statistics tell us two things. The first is that we applied our filter predicate (lo_shipmode = 'AIR') to the 22 IMCUs (IM scan CUs predicates applied/received) that comprise the LINEORDER table, and that we accessed the In-Memory Storage Index on the lo_shipmode column and compared min/max values for the 22 IMCUs (IM scan segments minmax eligible). And again we returned just 22 rows back to the query layer as a result of the aggregation performed at the IMCU level (IM scan rows projected). So we can conclude that not only was the aggregate "pushed" into the scan as in our first example, but the where clause predicate lo_shipmode = 'AIR' was also "pushed" into the scan and evaluated for each IMCU accessed. All of these optimizations result in a much faster scan with fewer rows returned than a traditional row based scan.
I hope that this posting was helpful in showing another one of the optimizations that have been made to make scans of the IM column store as fast as possible with Database In-Memory.