Using Data Studio Data Load APIs in PL/SQL

June 29, 2024 | 9 minute read
William Endress
Autonomous Database Product Management

Introduction

The Data Studio Data Load application provides a powerful but easy-to-use data-loading application that allows users to load a variety of file formats from object storage without any coding. The same REST APIs that are used by this application are documented and available to application developers. I last wrote about this in the post Data Loading using REST Data Services APIs, which provided a simple example using cURL.

In this post, I provide examples for creating cloud storage links, surveying input data, creating data load jobs, and monitoring progress in PL/SQL.

Advantages of using Data Studio Data Load APIs

If you have used the Data Studio Data Load application and looked at the SQL generated to execute data loads, you know that the Data Load tools, and thus the APIs, are layered over DBMS_CLOUD.  You might then wonder what advantages the Data Studio APIs offer above and beyond DBMS_CLOUD.  There are plenty of advantages!  Here are just a few:

  • Data Studio APIs automatically survey files on object stores and provide the inputs to DBMS_CLOUD.COPY_DATA and CREATE_EXTERNAL_TABLE (the field list, column list, and format parameters). 
  • Data Studio APIs automatically schedule data load jobs to run as background processes. While this is necessary for stateless REST APIs, the ability to run longer-running operations in the background is also very useful to PL/SQL-based applications.
  • Data Studio automatically logs jobs and provides APIs for tracking job progress.
  • Data Studio APIs allow you to create named cloud store links (cloud store URIs and credentials) for later reuse.

Those are just a few. My favorite is the survey endpoint because it saves time and takes the guesswork out of setting up calls to DBMS_CLOUD.  I never set up DBMS_CLOUD.COPY_DATA by hand.

Running REST APIs using PL/SQL

You probably know REST APIs as the foundation of web-based applications, but you might not know how easy it is to use REST APIs in PL/SQL.  All you need is a PL/SQL function that makes web requests and receives responses, and a way to extract data from JSON objects. Fortunately, Oracle provides both.  My colleague Nilay Panchal wrote about using DBMS_CLOUD.SEND_REQUEST in his post How to level up and invoke an Oracle Function (or any Cloud REST API) from within your Autonomous Database.  I spend a good amount of time in APEX, so I'll use APEX_WEB_SERVICE.MAKE_REST_REQUEST in my examples.  Knowledge of Oracle JSON functions such as JSON_VALUE, JSON_TABLE, and JSON_OBJECT is necessary when working with any REST API.

Using the database username and password is convenient in sample scripts such as these. In a real application, you should use a different authentication method with APEX_WEB_SERVICE.MAKE_REST_REQUEST, or a different package such as DBMS_CLOUD.SEND_REQUEST with a credential.  If you are using APEX_WEB_SERVICE.MAKE_REST_REQUEST in APEX, you can securely store credentials as workspace web credentials.
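As a sketch of the credential-based alternative, the same endpoints can be called with DBMS_CLOUD.SEND_REQUEST. This assumes a credential named MY_ORDS_CRED has already been created with DBMS_CLOUD.CREATE_CREDENTIAL; the URL is a placeholder for your instance:

```sql
DECLARE
  v_resp DBMS_CLOUD_TYPES.resp;
BEGIN
  -- MY_ORDS_CRED is an assumed credential created beforehand with
  -- DBMS_CLOUD.CREATE_CREDENTIAL; substitute your instance URL and schema.
  v_resp := DBMS_CLOUD.SEND_REQUEST(
    credential_name => 'MY_ORDS_CRED',
    uri             => 'https://xxxxxx.oraclecloudapps.com/ords/your_user/_/db-api/stable/data-tools/cloud-storage-links/',
    method          => DBMS_CLOUD.METHOD_GET);

  -- Inspect the HTTP status and response body.
  DBMS_OUTPUT.PUT_LINE(DBMS_CLOUD.GET_RESPONSE_STATUS_CODE(v_resp));
  DBMS_OUTPUT.PUT_LINE(DBMS_CLOUD.GET_RESPONSE_TEXT(v_resp));
END;
/
```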

Using APEX_WEB_SERVICE.MAKE_REST_REQUEST is very straightforward. It takes the form of:

DECLARE
  v_response CLOB;
BEGIN
  v_response := APEX_WEB_SERVICE.MAKE_REST_REQUEST (
    p_url => 'endpoint url'
    -- recommended for basic testing only
    , p_username => 'database username'
    , p_password => 'database user password'
    -- recommended for production applications (use instead of username/password)
    -- , p_credential_static_id => 'workspace credential static id'
    , p_body => 'body');
  DBMS_OUTPUT.PUT_LINE (v_response);
END;

Sample Script

It's very easy to create a PL/SQL script that executes each stage of a data load. The following script:

  1. Creates a cloud storage link using a preauthenticated request (PAR).
  2. Runs a survey to inspect data files.
  3. Creates a data load job.
  4. Gets the progress of the data load job.


Create a Cloud Store Link

A cloud storage link is a named cloud store URI and, if necessary, a credential name. Since this example uses a pre-authenticated request, a credential is not needed.

DECLARE
  v_response CLOB;
  v_base_url VARCHAR2(1000);
  v_database_user VARCHAR2(100);
  v_password VARCHAR2(100);
  v_url VARCHAR2(1000);
  v_body CLOB;
BEGIN

  -- The name of your database user (schema) and password.
  v_database_user := 'your_user';
  v_password := 'your_password';

  -- The URL to your Autonomous Database instance.  Note: Do not include a / at the end.
  v_base_url := 'https://xxxxxx.oraclecloudapps.com';

  -- The URL used in the web request (no substitutions required).
  v_url := v_base_url || '/ords/' || LOWER(v_database_user) || '/_/db-api/stable/data-tools/cloud-storage-links/';

 -- Parameters to the cloud store link (no substitutions required).
v_body :=
  '{
     "cloud_storage_links":[
        {
          "storage_link_name":"FUEL_PRICES",
          "storage_link_description":"Preauthenticated request to Oracle object store bucket",
          "uri":"https://objectstorage.uk-london-1.oraclecloud.com/p/5ehNh0k3kWPS1u9yNyQ8r1MXZH-iaWG0djjnfd8wWrPoJInfU8osmxMf08TNTApZ/n/adwc4pm/b/blog_data/o/"
        }
     ]
  }';

  -- Make the web request to create the cloud storage link.
  v_response := APEX_WEB_SERVICE.MAKE_REST_REQUEST(
    p_url => v_url
    , p_http_method => 'POST'
    , p_username => v_database_user
    , p_password => v_password
    , p_body => v_body);

  -- Print the response to the screen.
  DBMS_OUTPUT.PUT_LINE (v_response);

END;
/

Survey Input Data and Create Data Load Job

Creating a data load job is a two-part process:

  1. Survey the input data (gets the format, column list, and field list).
  2. Create the data load job.
DECLARE
  v_survey CLOB;
  v_response CLOB;
  v_base_url VARCHAR2(1000);
  v_database_user VARCHAR2(100);
  v_password VARCHAR2(100);
  v_url VARCHAR2(1000);
  v_body CLOB;
BEGIN

  -- The name of your database user (schema) and password.
  v_database_user := 'your_user';
  v_password := 'your_password';

  -- The URL to your Autonomous Database instance. Note: Do not include a / at the end.
  v_base_url := 'https://xxxxxx.oraclecloudapps.com';

  -- The URL used in the survey web request (no substitutions required).
  v_url := v_base_url || '/ords/' || LOWER(v_database_user) || '/_/db-api/stable/data-tools/surveys/';

  -- Create the body of the survey web request.  Note that in this example
  -- 'fuel_prices/' is a folder within the cloud store bucket.
   v_body :=
   '{
      "data_loads":[
        {
          "storage_link_name":"FUEL_PRICES",
          "objects":[
            {
               "object_name":"fuel_prices/"
            }
         ],
         "table_name":"FUEL_PRICES_TABLE"
        }
      ]
    }';

  -- Make the survey web request. 
  v_survey := APEX_WEB_SERVICE.MAKE_REST_REQUEST(
    p_url => v_url
    , p_http_method => 'POST'
    , p_username => v_database_user
    , p_password => v_password
    , p_body => v_body);

    DBMS_OUTPUT.PUT_LINE (JSON_QUERY(v_survey, '$' RETURNING CLOB PRETTY) );

  -- The URL used in the create data load job web request.
   v_url := v_base_url || '/ords/' || LOWER(v_database_user) || '/_/db-api/stable/data-tools/data-loads/';    

  -- Make the create load job request.  Note that the previous survey response
  -- serves as the body of the create data load job request.

  v_response := APEX_WEB_SERVICE.MAKE_REST_REQUEST(
    p_url => v_url
    , p_http_method => 'POST'
    , p_username => v_database_user
    , p_password => v_password
    , p_body => v_survey);   
  
  -- Print the response to the screen.
  DBMS_OUTPUT.PUT_LINE (JSON_QUERY(v_response, '$' RETURNING CLOB PRETTY) );

END;
/
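The create data load job response includes the job id needed to check progress. As a sketch, you can extract it with JSON_VALUE; the JSON path below is an assumption, so print the response and confirm the actual attribute name and location in your environment:

```sql
DECLARE
  v_job_id NUMBER;
  v_response CLOB := '{"data_loads":[{"data_load_id":21947}]}';  -- stand-in for the create-job response
BEGIN
  -- Hypothetical path: confirm against your actual create-job response.
  v_job_id := JSON_VALUE(v_response, '$.data_loads[0].data_load_id' RETURNING NUMBER);
  DBMS_OUTPUT.PUT_LINE('Job id: ' || v_job_id);
END;
/
```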

Check Status of the Data Load Job

Check the status of the data load job. Be sure to note the data_load_id from the create job request response.

DECLARE
  v_response CLOB;
  v_base_url VARCHAR2(1000);
  v_database_user VARCHAR2(100);
  v_password VARCHAR2(100);
  v_job_id INT;
  v_url VARCHAR2(1000);
BEGIN

  -- The name of your database user (schema) and password.
  v_database_user := 'your_user';
  v_password := 'your_password';

  -- The URL to your Autonomous Database instance. Note: Do not include a / at the end.
  v_base_url := 'https://xxxxxx.oraclecloudapps.com';
 
  -- Substitute your job id.
  v_job_id := 21947;

  -- The URL used in the survey web request (no substitutions required).
  v_url := v_base_url || '/ords/' || LOWER(v_database_user) || '/_/db-api/stable/data-tools/data-loads/' || v_database_user || ',' || v_job_id || '/progress';
 
  -- Make the progress web request. 
  v_response := APEX_WEB_SERVICE.MAKE_REST_REQUEST(
    p_url => v_url
    , p_http_method => 'GET'
    , p_username => v_database_user
    , p_password => v_password);

  -- Print the response to the screen.  
  DBMS_OUTPUT.PUT_LINE (JSON_QUERY(v_response, '$' RETURNING CLOB PRETTY) );  
END;
/
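For longer-running loads, you can poll the progress endpoint until the job finishes. A minimal sketch, assuming the progress response reports a status attribute as shown in the JSON_TABLE example later in this post; the status values tested here are assumptions, so check the values your jobs actually report:

```sql
DECLARE
  v_response CLOB;
  v_status   VARCHAR2(20);
BEGIN
  LOOP
    -- Same progress URL as in the script above.
    v_response := APEX_WEB_SERVICE.MAKE_REST_REQUEST(
      p_url => 'progress endpoint url'
      , p_http_method => 'GET'
      , p_username => 'your_user'
      , p_password => 'your_password');

    v_status := JSON_VALUE(v_response, '$.items[0].status');

    -- Assumed in-flight status values; adjust to match your responses.
    EXIT WHEN v_status NOT IN ('RUNNING', 'SCHEDULED');

    DBMS_SESSION.SLEEP(5);  -- wait five seconds between polls
  END LOOP;
  DBMS_OUTPUT.PUT_LINE('Final status: ' || v_status);
END;
/
```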

Query the Table

This job is small and will run quickly. You can now query the table.

SELECT * FROM FUEL_PRICES_TABLE;

Selecting from API Responses

Web requests return JSON objects. If you are working in PL/SQL, you must query or extract data from JSON. Pro tip: generative AI generally provides good examples of Oracle JSON functions using your JSON objects.  Use a prompt such as 'Using Oracle JSON_TABLE, select table_name, column_name, field_name from the following JSON object: {paste your object here}'.

Here are a few examples.

Survey Response

Select from survey response as a table.

SELECT
  table_name
  , column_name
  , field_name
  , data_type
  , data_length
  , data_format
  , column_id
FROM
    JSON_TABLE ({your survey response}, '$.data_loads[*]'
      COLUMNS (
        table_name VARCHAR2 ( 100 ) PATH '$.table_name'
      , NESTED PATH '$.columns[*]'
          COLUMNS (
            column_name VARCHAR2 ( 500 ) PATH '$.column_name'
          , field_name VARCHAR2 ( 500 ) PATH '$.field_name'
          , data_type VARCHAR2 ( 500 ) PATH '$.data_type'
          , data_length NUMBER PATH '$.data_length'
          , data_format VARCHAR2 ( 500 ) PATH '$.data_format'
          , column_id NUMBER PATH '$.column_id'
          )
      )
    )
  jt;

Extract the value ignoremissingcolumns from the survey response.

JSON_VALUE(json_column, '$.data_loads[0].format.ignoremissingcolumns' RETURNING BOOLEAN)

Progress Response

Select from progress response as a table.

SELECT
    owner
    , data_load_job_id
    , table_name
    , rows_loaded
    , rows_rejected
    , rows_total
    , estimated_total_rows
    , percentage_completed
    , status
    , time_start 
    , time_completion
    , exception_details
    , last_event
    , time_of_last_event
FROM
    JSON_TABLE ( '{your progress response}', '$.items[*]'
      COLUMNS (
        owner VARCHAR2 ( 20 ) PATH '$.owner'
      , data_load_job_id NUMBER PATH '$.data_load_job_id'
      , table_name VARCHAR2 ( 30 ) PATH '$.table_name'
      , rows_loaded NUMBER PATH '$.rows_loaded'
      , rows_rejected NUMBER PATH '$.rows_rejected'
      , rows_total NUMBER PATH '$.rows_total'
      , estimated_total_rows VARCHAR2 ( 10 ) PATH '$.estimated_total_rows'
      , percentage_completed NUMBER PATH '$.percentage_completed'
      , status VARCHAR2 ( 20 ) PATH '$.status'
      , time_start TIMESTAMP PATH '$.time_start'
      , time_completion TIMESTAMP PATH '$.time_completion'
      , exception_details VARCHAR2 ( 100 ) PATH '$.exception_details'
      , last_event VARCHAR2 ( 20 ) PATH '$.last_event'
      , time_of_last_event TIMESTAMP PATH '$.time_of_last_event'
      )
    )
  jt;

An expression to extract a single value from the progress response.

JSON_VALUE({your progress response},'$.items[0].percentage_completed')

Conclusion

The Oracle Autonomous Data Studio Data Load APIs can make defining and executing data-loading jobs much easier. The APIs can be run as web requests from within PL/SQL using database-provided packages, allowing you to build scripts and applications that leverage these important APIs.

Comments and Questions

Please feel free to get in touch at william.endress@oracle.com if you have any comments or questions about this post.

 


William (Bud) Endress is a member of the Oracle Autonomous Database and SQL product management team, responsible for analytic views, data loading in Oracle Autonomous Database, in-memory aggregation (a feature of Oracle Database In-Memory), and the OLAP option to Oracle Database.
