X

Everything about Table Scans including offload to Exadata, Big Data SQL, and External Tables

Create Quarantine

First if you want don't know what an Exadata Quarantine is read this. Someone asked whether you can create your own Exadata Cell quarantine and, if you can, why you might ever want to do it?  The first step when you don't know how to do something is try HELP in cellcli CellCLI> HELP ... ALTER QUARANTINE ... CREATE QUARANTINE ... DROP QUARANTINE ... LIST QUARANTINE So we see we can create a quarantine, so we use HELP again: CellCLI> HELP CREATE QUARANTINE  Usage: CREATE QUARANTINE...

Thursday, August 16, 2018 | Read More

Introduction to In-Memory External Tables (IMXT )

This article has been moved to the Oracle Database In-Memory blog

Friday, March 23, 2018 | Read More

Create External Table as Select

I was looking through a test script and saw something I didn't know you could do in Oracle. I mentioned it to an Oracle ACE and he didn't know it either. I then said to the External Table engineers "Oh I see you've added this cool new feature" and he replied dryly - "Yes, we added it in Oracle 10.1". Ouch! So just in case you also didn't know, you can create an External Table using a CTAS and the ORACLE_DATAPUMP driver. This feature only work with the ORACLE_DATAPUMP access...

Friday, March 9, 2018 | Read More

Understanding External Table URowids

I finally found time to get back to External Tables and have a list of blog posts I need to write on this topic. Here's a brief one. DBMS_ROWID will nicely break down a heap table's rowid for you into file number, block number, and row number but it doesn't handle the rowids coming from External Tables. So let's look at how to make sense of them. They fall under the datatype UROWID which is a nominally opaque rowid defined by the data source. The first byte of a UROWID tells...

Thursday, March 8, 2018 | Read More

How to tell if the Exadata column cache is fully loaded

When a background activity is happening on the cell you typically can't use RDBMS v$ views to monitor it in the same way. One such question is how to tell if a segment is fully loaded in the Exadata column cache since this does not appear in the equivalent In-Memory v$ views. When a segment is scanned by Smart Scan sufficiently often to be eligible the AUTOKEEP pool (typically that means at least twice an hour), the eligible 1MB chunks are written to flash in 12.1.0.2 style...

Tuesday, January 23, 2018 | Read More

Revisiting buffer cache Very Large Object Threshold (VLOT)

If you turn on NSMTIO tracing you will see references to VLOT: qertbFetch:[MTT < OBJECT_SIZE < VLOT]: Checking cost to read from caches (local/remote) and checking storage reduction factors (OLTP/EHCC Comp) I had said you could ignore VLOT and Frits Hoogland pointed out that tracing showed it had some impact, so let me clarify: VLOT is the absolute upper bound that cached reads can even be considered  This defaults to 500% of the number of buffers in the cache i.e. _very_large_obje...

Wednesday, November 15, 2017 | Read More

Random thoughts on block sizes

I heard "Oracle only tests on 8k and doesn't really test 16k" I heard someone assert that one reason you should only use 8k block sizes is that, and I quote, "Oracle only tests on 8k and doesn't really test 16k". I tried googling that rumour and tracked it back to AskTom. Also as disks and memory get bigger and CPUs get faster it is natural to ask if 8k is now "too small". So here are my thoughts: 1. A quick scan of the data layer regression tests showed a very large number...

Thursday, November 9, 2017 | Read More

External Tables Part 1: Project Columns All vs Referenced

I normally blog about table scans on Oracle native data but Oracle also supports a wide variety of features for scanning external tables and I need to cover these too. One of the ways I learn new things is being sent a bug report and saying to myself "Oh! I didn't know you could do that". So today I'm going to start with the grammar: Alter Table <xt> Project Columns [ All | Referenced ]   This DDL changes the value in the Property column displayed in user_external_tables: SQL>...

Wednesday, October 25, 2017 | Read More

The beginners guide to Oracle Table Scans

I was asked a question yesterday that reminded me there are always people completely new to the topic who need an introduction  - somewhere to start before the other articles make sense. So, here's my brief write-up of everything you need to know about the basic of Oracle Table Scans. Oracle has four main ways of scanning a table: the pre-9ir2 table scan, the 9ir2 TurboScan, the 11.1.0.1 Exadata SmartScan, and the 12.1.0.1 In-Memory Scan. Before we summarize each one,...

Wednesday, August 2, 2017 | Read More

What's new in 12.2: CELLMEMORY Part 3

The Cellmemory Stats in RDBMS  The RDBMS stats for Cellmemory are designed to closely follow the pattern used by the Inmemory stats  Query Stats   Each column in each one MB of disk blocks will be rewritten into one IMC format Column CU in flash and a set of Column CUs comprise an overall Compression Unit so these stats reflect the number of 1 MB rewrites that were processed (not the number of column CUs). "cellmemory IM scan CUs processed for query"- #1 MB chuncks scanned in...

Friday, April 14, 2017 | Read More

What's new in 12.2: CELLMEMORY Part 2

Question: do I need to know anything in this blog post? Answer: No, it is a true cache and CELLMEMORY works automatically Question: so why should I read this blog post? Answer:  because you like to keep a toolkit of useful ways to control the system when needed The DDL The Exadata engineering team has done a lot of work to make the flash cache automatically handle a variety of very different workloads happening simultaneously which is why we now typically discourage users from...

Monday, March 27, 2017 | Read More

What's new in 12.2: CELLMEMORY Part 1

Many people know that in 12.1.0.2 we introduced a ground-breaking columnar cache that rewrote 1 MB chunks of HCC format blocks in the flash cache into pure columnar form in a way that allowed us to only do the I/O for the columns needed but also to recreate the original block when that was required. This showed up in stats as "cell physical IO bytes saved by columnar cache". But in 12.1.0.2 we had also introduced Database In-Memory (or DBIM) that rewrote heap blocks into pure...

Friday, March 24, 2017 | Read More

More tricks with OPT_PARAM

Did you know you can set most parameters for the execution of a single statement without using an Alter Session by using an OPT_PARAM hint? For example, regular parameters (here forcing the storage clause in the query plan): SQL> select /*+ OPT_PARAM('cell_offload_plan_display' 'always') */ col1 From table1 and underscore parameters: SQL> select /*+ OPT_PARAM('_arch_comp_dbg_scan',2048) */ n_name from nation However if you try conflicting settings that set a different value...

Tuesday, March 21, 2017 | Read More

Improvements to HCC with wide tables in 12.2

HCC Compression Unit Sizing  Since the beginning Oracle has provided four compression levels to offer a trade-off between the compression ratio and various other factors including table scans and the performance of single-row retrieval. I can not emphasize enough that the various trade offs mean that YMMV with the different levels and you should always test what works best with your application and hardware when choosing the level. Historically people have rarely used Query...

Friday, March 17, 2017 | Read More

Why are HCC stats being bumped when Smart Scanning row major data in 12.2?

In 12.2, there is a stat "cell blocks pivoted" that points to a new optimization. When Smart Scan processes data, it has to create a new synthetic block that only contains the columns that are needed and the rows that pass the offloaded predicates. If Smart Scan has to create a new block, why would you create a new row-major block when we can just as easily create uncompressed HCC block? It is roughly the same amount of work but once the synthetic block is returned to the...

Friday, February 3, 2017 | Read More

When bloggers get it wrong - part 2

In Part 2 we are going to look at making use of the trace events that show what was discussed in Part 1.  NB: Oracle no longer adds new numeric trace events, going forward new trace events use the Unified Tracing Service whose grammer is much simpler. The elements we need are: trace[[x.]y] disk = [ lowest | low | medium | high | highest ] For example Table Scan tracing is in the DATA hierachy: [1] DATA [2] KDS    “Kernel Data Scan” [3] KDSFTS  “Full Table Scan”[3] KDSRID  ...

Saturday, January 21, 2017 | Read More

When bloggers get it wrong - part 1

I've read a number of blog entries from various people who've clearly put great diligence into trying to understand how the decision to use the buffer cache for a table scan or whether it is better to use direct read for the scan is made. Since this is critical decision from a performance perspective, I've decided to write a more definitive account. Part 1 deals with a few principles and part 2 (which will follow very shortly) will show how this works with tracing and clear...

Thursday, January 12, 2017 | Read More

Controlling the offload of specific operators

One of the joys of regexp is that you can write a pattern that is painfully expensive to match and offloading these to the cell can cause significant impact on other users and overall throughput (including heartbeat issues). If you have a user who is prone to writing bad regexp expressions you as DBA can prevent regexp (or any other operator) from being offloaded to the cells.Let's take a very simple example using a cut down version of TPC-H Query 16 and a NOT LIKE predicate: ...

Friday, November 18, 2016 | Read More

More on tracing the offload server

I posted a while back on how to use Tracing Hybrid Columnar Compression in an offload server so this is a quick follow up. 1. I have trouble remembering the syntax for setting a regular parameter in an offload server without bouncing it. Since I need to keep this written down somewhere I thought it might be use to support folks and dbas. 2. I forgot to show you how to specify which offload group to set the trace event So this example should do both:  CellCLI > alter cell...

Wednesday, June 29, 2016 | Read More

Shining some light on Database In-Memory vs the Exadata Columnar Cache in 12.1.0.2

Over the last few weeks I have gotten the following questions from a number of different sources,"what's the difference between Oracle's various columnar technologies and how do they fit together"? Or, put another way, "what scenarios does each benefit and would I ever need both or one will cover all the bases"? When a question comes up often enough it is always worth a blog post, so here goes. They say a picture is worth a thousand words so my own mental picture of the...

Wednesday, June 22, 2016 | Read More

What you need to know about Offload Quarantines

Several questions have couple up in the last few weeks about offload quarantines which means a blog post on this topic is overdue. We work hard to stress test every new rpm that is released but on rare occasions customers can encounter an issue with the thin database layer that exists in the offload server. This layer is known externally as "Smart Scan" and internally as "FPLIB" (a.k.a. Filter Projection Library). A crash in the thin database layer could because of either...

Tuesday, May 24, 2016 | Read More

Using INMEMORY with External Tables

Several people have asked if there is any way to use INMEMORY on External Tables because the INMEMORY syntax is not supported in Create Table and Alter Table with Organization External (actually Create Table parses it but then ignores it). While there is no way out of having to load the data into Oracle 12.1.0.2, there is a short cut to simplify life when you have an external process recreating flat file data on a regular basis and want to query the latest incarnation against...

Friday, April 8, 2016 | Read More

Compression in a well-balanced system

This continues the previous entry (Is there such a thing as fast enough?) Exadata is designed to present a well-balanced system between disk scan rates, cell CPU processing rates, network bandwidth, and RDBMS CPU such that on average across a lot of workloads and a lot of applications there should not be no one obvious bottleneck. Good performance engineering means avoiding a design like this: because that would be nuts. If you change one component in a car e.g. the engine you...

Tuesday, March 15, 2016 | Read More

Is there such a thing as fast enough?

Have you seen this video which went viral? What fascinates me is that I remember upgrading from Win 3.1 to Win 95 and was blown away by how fast and smooth it was. The feeling of "this is it", this is fast enough to feel really productive. But sadly so often today's amazement dies to quickly becomes tomorrow's frustration. We become addicted to the buzz of doing more interesting things and doing them more quickly. When I started in DW in the 90's, we thought an...

Thursday, March 10, 2016 | Read More

Hotsos 2016

If you want to understand more about Smart Scan works with Hybrid Columnar Compression I will be presenting a performance paper at Hotsos 2016 in Dallas March 6-10. Hotsos 2016 We will be looking at the effect of changing CPU, disk, and flash technologies in recent Exadata hardware revisions have on HCC and how that works with the 12.1.0.2 columnar cache and different workloads.

Monday, February 15, 2016 | Read More

Double counting of HCC stats prior to next major release

HCC stats have been confusing for many since their introduction in 11.2 both because of the lack of a common naming convention and because of double counting. When we did DBIM we spent a lot of time getting the stats right both in terms of naming convention that lent itself to straightforward wildcard searches, and in contents. For the next major release, the introduction of the IMC format Columnar Cache meant that the time had come to revamp the HCC stats more in line with...

Wednesday, December 16, 2015 | Read More

Offloading row level security to Smart Scan

The question came up this week about whether the predicates for row level security are offloaded to Smart Scan. The simple answer is yes as long as the policy_function contains off-loadable predicates. Let's see this in action. Using the public TPC-H schema we can set up the customer table so that customers can only see their own entries in that table. We need a row level security policy to see if each row is one that the current user is permitted to see and a function for the...

Tuesday, November 24, 2015 | Read More

Tracing Hybrid Columnar Compression in an offload server

I had previously commented on how to use the FPLIB facility in a trace event but the question came up today of how to trace HCC in an offload server.  The facility name in this case is ADVCMP (Advanced Compression) and the hierachy is: ADVCMP_MAIN ADVCMP_COMP ADVCMP_DECOMP No compression occurs on the cell so we are only interested in the decompression tracing in an offload server. So in this case the correct syntax is [facility.event] so:  cellcli -e 'alter cell offloadgroupEvents...

Tuesday, August 11, 2015 | Read More

Why you don't want to set _partition_large_extents = FALSE

I've seen some blogs recommending that _partition_large_extents be set to FALSE for a variety of space conserving reasons without the authors thinking about the negative impact this is going to have on Smart Scan. Large Extents cause an INITIAL allocation of 8 MB and a NEXT allocation of 1 MB and they have been the default for table spaces on Exadata since 11.2.0.2. You can verify that large extents are in use by a given table or partition by: Selectsegment_flags From sys_dba_seg...

Tuesday, June 16, 2015 | Read More

Alter Table Shrink Space Cascade and SmartScan

Over the years, updates can cause rows to become highly fragmented sapping performance on Exadata table scans. The offload server and hence SmartScan get data to process 1 MB at a time. Because of this, SmartScan is only able to process row pieces that are available in the current 1 MB chunk that it processes at a time. Unlike RDBMS table scans, SmartScan is not able to initiate disk I/Os to retrieve further row pieces and, even if it could, it in unlikely that they would be...

Friday, May 22, 2015 | Read More

Working around heatbeat issues caused by tracing or by regexp

I had noted in my first post that using the highest level of tracing caused timeout issues with the offload server heartbeat monitor. Heartbeat issues can also occur with expensive (and badly formed) regexp expressions. By default the heartbeat monitor is set to 6 seconds which is the maximum permitted to process 1MB data in the offload server and mark the task completed and is far more time than is reasonably expected to take.  Operations such as expensive tracing to disk or...

Thursday, March 5, 2015 | Read More

Examining the new Columnar Cache with v$cell_state

12.1.0.2 introduced the new Columnar Flash Cache where 1MB of  blocks that are all in HCC format are rewritten so as to make each column from each CU contiguous. This works by first writing all the block headers to an array, then writing all the CU headers to an array, finally writing all the Column 1 column-level-CUs, then writing all the Column2 column-level-CUs etc. The flash cache hash table maintains a simple mapping of column numbers to 64KB flash pages so, for any given...

Tuesday, February 24, 2015 | Read More

Using trace events with an offload server

I've noticed several people who were familiar with using trace events with cellsrv, were uncertain about how to use tracing the new offload server architecture. So whereas in the past you could have added tracing to SmartScan processing with: > cellcli -e 'alter cell events = "trace[fplib.sage_data] disk=lowest, memory=lowest"' the new syntax to turn it on is: > cellcli -e 'alter cell offloadgroupEvents = "trace[fplib.sage_data] disk=lowest, memory=lowest"' and to turn it off: >...

Monday, February 23, 2015 | Read More
Oracle

Integrated Cloud Applications & Platform Services