X

Information, tips, tricks and sample code for Big Data Warehousing in an autonomous, cloud-driven world

The first really hidden gem in Oracle Database 12c Release 2: runtime modification of external table parameters

Hermann Baer
Senior Director Product Management

We missed to document some functionality !!!

With the next milestone for Oracle Database 12c Release 2 just taking place - the availability on premise for Linux x86-64, Solaris Sparc64, and Solaris x86-64, in addition to the Oracle Cloud - I managed to use this as an excuse to play around with it for a bit .. and found that we somehow missed to document new functionality. Bummer. But still better than the other way around .. ;-)

We missed to document the capability to override some parameters of an external table at runtime.

So I decided to quickly blog about this to not only fill the gap in documentation (doc bug is filed already) but also to ruthlessly hijack the momentum and to start highlighting new functionality (there's more blogs to come, specifically around my pet peeve Partitioning, but that's for later).

So what does it mean to override some parameters of an external table at runtime?

It simply means hat you can use one external table definition stub as proxy for external data access of different files, with different reject limits, at different points in time. Without the need to do a DDL to modify the external table definition.

The usage is pretty simple and straightforward, so let me quickly demonstrate this with a not-so-business-relevant sample table. The pre-requirement SQL for this one to run is at the end of this blog and might make its way onto github as well; I have not managed that yet and just wanted to get this blog post out.

Here is my rather trivial external table definition. Works for me since version 9, so why not using it with 12.2 as well.

CREATE TABLE et1 (col1 NUMBER, col2 NUMBER, col3 NUMBER)
ORGANIZATION EXTERNAL
( TYPE ORACLE_LOADER
  DEFAULT DIRECTORY d1
  ACCESS PARAMETERS
  ( RECORDS DELIMITED BY NEWLINE CHARACTERSET US7ASCII
    NOBADFILE
    NOLOGFILE
    FIELDS TERMINATED BY ","
  )
  LOCATION ('file1.txt')
)
REJECT LIMIT UNLIMITED
;
Pretty straightforward vanilla external table. Let's now see how many rows this external table returns (the simple "data generation" is at the end of this blog):
SQL> SELECT count(*) FROM et1;

  COUNT(*)
----------
        99

So far, so good. And now the new functionality. We will now access the exact same external table but tell the database to do a runtime modification of the file (location) we are accessing:
SQL> SELECT count(*) FROM et1
     EXTERNAL MODIFY
     (LOCATION ('file2.txt'));

  COUNT(*)
----------
         9
As you can see, the row count changes without me having done any change to the external table definition like an ALTER TABLE. You will also see that nothing has changed in the external table definition:
SQL> SELECT table_name, location FROM user_external_locations WHERE table_name='ET1';

 TABLE_NAME                     LOCATION
 ------------------------------ ------------------------------
 ET1                            file1.txt

And there's one more thing. You might have asked yourself right now, right this moment ... why do I have to specify a location then for the initial external table creation? The answer is simple: you do not have to do this anymore.

Here is my external table without a location:
CREATE TABLE et2 (col1 NUMBER, col2 NUMBER, col3 NUMBER)
ORGANIZATION EXTERNAL
( TYPE ORACLE_LOADER
  DEFAULT DIRECTORY d1
  ACCESS PARAMETERS
 ( RECORDS DELIMITED BY NEWLINE CHARACTERSET US7ASCII
   NOBADFILE
   NOLOGFILE
   FIELDS TERMINATED BY ","
 )
)
REJECT LIMIT UNLIMITED
;
When I now select from it, guess what: you won't get any rows back. The location is NULL.
SQL> SELECT * FROM et2;

no rows selected
Using this stub table in the same way as before gives me access to my data.
SQL> SELECT count(*) FROM et1
     EXTERNAL MODIFY
     (LOCATION ('file2.txt'));

  COUNT(*)
----------
         9

You get the idea. Pretty cool stuff. 

Aaah, and to complete the short functional introduction: the following clauses can be over-ridden: DEFAULT DIRECTORY, LOCATION, ACCESS PARAMETERS (BADFILE, LOGFILE, DISCARDFILE) and REJECT LIMIT. 

That's about it for now for online modification capabilities for external tables. I am sure I have forgotten some little details here and there, but there's always soo many things to talk (write) about that you will never catch it all. And hopefully the documentation will cover it all rather sooner than later.

Stay tuned for now. There's more blog posts about 12.2 to come. And please, if you have any comments about this specific one or suggestions for future ones, then please let me know. You can always reach me at hermann.baer@oracle.com.

Cheers, over and out. 

And here's the most simple "data generation" I used for the examples above to get "something" in my files. Have fun playing.


rem my directory
rem create or replace directory d1 as '/tmp';

rem create some dummy data in /tmp
set line 300 pagesize 5000

spool /tmp/file1.txt
select rownum ||','|| 1 ||','|| 1 ||','  from dual connect by level < 100;
spool off

spool /tmp/file2.txt
select rownum ||','|| 22 ||','|| 22 ||','  from dual connect by level < 10;
spool off

 

Join the discussion

Comments ( 11 )
  • Yuan Tschang Monday, August 21, 2017
    In the past, Tom Kyte has advised against using the same external table concurrently to read different source files (https://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:37593123416931). Does this 12.2 enhancement change anything in that regard?
  • Jorge Rimblas Wednesday, August 23, 2017
    Can the file be a bind variable?
    I suspect that being part of the FROM we cannot
  • Hermann Baer Wednesday, August 23, 2017
    @Yuan: looking at the AskTOM discussion you forwarded I could not fully decipher what exactly the biggest concern was from Tom back then. There were two themes outstanding that I wanted to quickly comment on:
    - invalidation of dependent objects: this will not happen with this functionality; it is not a DDL, so any runtime override does not have any impact on dependent objects.
    - changing the file 'under the covers' in a concurrent access environment; this is not applicable here since the override is only for the session/statement that does the override.

    So in a nutshell: none of these concerns apply to these new functionality and are addressed with this implementation. Feel free to use the same ET definition for multiple files.
  • Hermann Baer Wednesday, August 23, 2017
    @Jorge: there is a two-fold answer, meaning yes and no. ;-)
    Some of the parameters can be using bind variables, some others don't. The variables that can be binds are LOCATION and REJECT LIMIT. The others can't
  • Yuan Tschang Thursday, August 24, 2017
    Thanks for the response, Hermann! That's great to hear.

    Thanks Jorge for your question because it is applicable to my use case as well. Hermann, if we cannot use bind variables for bad, log, and discard file names, is there a way to make them share the same root as the source file specified in LOCATION?
  • Hermann Baer Friday, August 25, 2017
    @Yuan - not fully sure what you mean with that. if you are asking how to change the other parameters .. well, you cannot use binds for that (internal implementation reasons as of now), so you would have to use dynamic sql to do that.
  • Yuan Tschang Friday, August 25, 2017
    Hermann, so what I'd like to do is if I'm specifying the location to be "file1.txt" through a bind variable, I'd like the resulting log file to be named "file1.log", bad file to be named "file1.bad", etc. A second session could be simultaneously querying against "file2.txt", and should be producing log file "file2.log", bad file "file2.bad", etc.
  • Hermann Baer Friday, August 25, 2017
    as of today, dynamic sql is your only friend here ...
  • Yuan Tschang Sunday, August 27, 2017
    Thanks again for the responses, Hermann. I'm a bit disappointed that dynamic sql is the only way to go for now. Am I missing some other easy way to associate resulting output files to their input file? I'm open to other suggestions.
  • Adriano Policastro Wednesday, May 8, 2019
    Hi Hermann.

    I've a question and a problem.
    The question is easy: is this a supported or not supported feature?

    The problem.
    I liked this feature ad use it a lot in a big DWH. If I used the syntax select * from my_ext_table EXTERNAL MODIFY (LOCATION (MY_DIR:'my_file.csv')), and I set 'CURSOR_SHARING = EXACT', I receive an ORA-00922: missing or invalid option."

    If I use the syntax select * from my_ext_table EXTERNAL MODIFY (LOCATION ('my_file.csv')) or I set 'CURSOR_SHARING != EXACT', it works fine.

    What do you think about this?
  • Hermann Baer Friday, May 17, 2019
    Hi Adriano -

    functionality exposed in SQL must not be dependent on anything like the setting of cursor sharing, so I consider this a bug. You should follow up with Support on this.

    It works in my environment btw, although the only env I had quickly available was 19c .. so there is a high likelihood that I am ahead of you WRT the version.


    SQL> CREATE TABLE "TOTO_XT"
    (COL1 number)
    ORGANIZATION EXTERNAL
    ( TYPE ORACLE_LOADER
    DEFAULT DIRECTORY DD
    ACCESS PARAMETERS
    ( RECORDS DELIMITED BY NEWLINE
    NOLOGFILE NOBADFILE NODISCARDFILE READSIZE=10000000
    FIELDS CSV WITH EMBEDDED NOTRIM
    )
    LOCATION
    ( 'toto.csv'
    )
    )
    PARALLEL ;

    2 3 4 5 6 7 8 9 10 11 12 13 14 15
    Table created.

    SQL> SQL> select * from toto_xt;

    COL1
    ----------
    1.1
    2.2
    3.3

    SQL> show parameters cursor_sharing

    NAME TYPE VALUE
    ------------------------------------ ----------- ------------------------------
    cursor_sharing string EXACT
    SQL> select * from toto_xt external modify (location (dd:'toto2.csv'));

    COL1
    ----------
    99.9
    88.8
    77.7

    SQL>
Please enter your name.Please provide a valid email address.Please enter a comment.CAPTCHA challenge response provided was incorrect. Please try again.