Using faker-js/SimpleFaker to produce test data in Oracle Database 23c

December 19, 2023 | 7 minute read
Text Size 100%:

Creating random sample data with the Oracle database is often perceived to be quite tricky. This is only partially true as you can read in this post.

Multilingual Engine offers new possibilities

Oracle Database 23c on Linux x86-64 introduced much-enhanced support for In-Database JavaScript. Multilingual Engine (MLE), powered by GraalVM, allows developers to use many open-source modules written for Deno or Node.js inside the database. FakerJS for example is a very popular data generator, and it can be used to generate heaps of realistic-looking test data.

Please refer to the project's GitHub project site for more details about its license and use implications. The article assumes your legal and IT Security departments (as well as any other party) agreed that using the module in your code is safe and compliant with your license. Using 3rd party code in your application typically requires specific compliance steps to be completed which are out of the scope of this article.

Using faker-js/SimpleFaker in the database

The BFILE clause is perhaps the most convenient way to load a JavaScript community module into the database. Assuming a directory object named SRC_CODE_DIR has been created pointing to a directory on the database server you can stage the ESM (ECMA Script Module) version of the module for deployment in the database. For the purpose of this article, the following directory object has been created:

SQL> SELECT
  2      directory_path
  3  FROM
  4      dba_directories
  5  WHERE
  6      directory_name = 'SRC_CODE_DIR';

DIRECTORY_PATH      
___________________ 
/opt/oracle/code

Next, download faker-js/faker to a convenient location on your laptop. The module's most current version can be found here for example:

https://cdn.jsdelivr.net/npm/@faker-js/faker/+esm

At the time of writing version 8.3.1 was the most current, stable release. In case you run into issues with newer versions, grab 8.3.1 from this URL:

https://cdn.jsdelivr.net/npm/@faker-js/faker@8.3.1/+esm

Alternative locations are provided in the documentation. After ensuring the source is trustworthy the next important step is to ensure you download the ESM version of the module. Once downloaded, transfer the file to the database server's SRC_CODE_DIR. The following snippet shows how to create the module in the database.

create mle module fakerJS
language javascript
version '8.3.1'
using bfile(SRC_CODE_DIR, 'faker.js')
/

If you get errors at this stage please ensure the requirements for creating MLE schema objects are met and that you have been granted execute privileges on SYS.JAVASCRIPT.

Let there be test data

To keep this example short, let's assume the only requirement is to get random ...

  • numbers,
  • strings,
  • dates.

Each function should allow the user to provide a lower and an upper bound. In the case of the random string generation there should also be an option to indicate the resulting string's casing.

Random data generator

A simple JavaScript module uses faker-js/SimpleFaker to generate the required strings, numbers, and dates. As with all JavaScript code you can reference functionality from other modules in your module. This is no exception with MLE. Unlike Node and Deno projects, however, you cannot reference a file from the file system - MLE modules are proper schema objects and reside inside the database. In-Database JavaScript uses MLE environments for name resolution. The code in the new module shown below uses the import name fakerjs, pointing to the fakerjs MLE module. The corresponding MLE environment is created as follows:

create or replace mle env faker_env imports (
    'fakerjs' module fakerjs
);

With the module in place the stage is set for the creation of the random number generator:

create or replace mle module mle_faker 
language javascript as

import { simpleFaker } from "fakerjs";

/**
 * Generate a random integer in the interval [minNumber, maxNumber]
 * @param {number} minNumber the lower bound of the interval
 * @param {number} maxNumber the upper bound of the interval
 * @returns number
 */
export function randomInt(minNumber, maxNumber) {

    const options = {
        max: maxNumber,
        min: minNumber
    }

    return simpleFaker.number.int(options);
}

/**
 * Generate a random, alpha-numeric string of a certain case (defaults
 * to mixed case), and a specific length
 * @param {string} casing the string's case, one of upper, lower, mixed
 * @param {*} minLength the string's minimum length
 * @param {*} maxLength the string's maximum length
 * @returns string
 */
export function randomString(casing, minLength, maxLength) {

    const options = {
        length: {
            max: maxLength,
            min: minLength
        }
    };

    switch (casing) {
        case 'upper':
            options.casing = 'upper';
            break;
        case 'lower':
            options.casing = 'lower';
            break;
        case 'mixed':
            options.casing = 'mixed';
            break;
        default:
            options.casing = 'mixed';
            break;
    }

    return simpleFaker.string.alpha(options);
}

/**
 * Generate a random date between in the interval [startDate, stopDate]
 * @param {Date} startDate the earliest possible date
 * @param {Date} stopDate the latest possible date
 * @returns Date
 */
export function randomDate(startDate, stopDate) {

    const options = {
        from: startDate,
        to: stopDate
    };

    return simpleFaker.date.between(options);
}
/

Call specifications

Before you can use randomInt(), randomString() and randomDate() in SQL or PL/SQL you must create a call specification. This PL/SQL code unit maps a (PL/SQL) name to the JavaScript code and environment as shown in this example:

create or replace package mle_faker_api as

    function random_number(
        p_min_number number,
        p_max_number number
    ) return number
        as mle module mle_faker
        env faker_env
        signature 'randomInt';

    function random_string(
        p_casing varchar2,
        p_min_length number,
        p_max_length number
    ) return varchar2
        as mle module mle_faker
        env faker_env
        signature 'randomString';

    function random_date(
        p_start_date date,
        p_stop_date date
    ) return date
        as mle module mle_faker
        env faker_env
        signature 'randomDate';

end mle_faker_api;
/

Note the use of the env clause. Since mle_faker references another module, the environment describing the mapping between the import name and MLE module must be provided, or an error will be thrown at runtime.

From this point on, anyone who can call SQL and PL/SQL from this schema will have the ability to invoke the JavaScript code via its call specification.

Let's try it

With the call-specification created, it's time to create some sample data:

SQL> select mle_faker_api.random_number(
  2      p_min_number => 10,
  3      p_max_number => 20
  4  ) as sample_integer;

   SAMPLE_INTEGER 
_________________ 
               10 

SQL> select mle_faker_api.random_string(
  2      p_casing => 'lower',
  3      p_min_length => 10,
  4      p_max_length => 20
  5  ) as simple_string;

SIMPLE_STRING           
_______________________ 
ezrdkbjjuvxctirnxoav    

SQL> select mle_faker_api.random_date(
  2      p_start_date => DATE '2023-01-01',
  3      p_stop_date  => DATE '2024-01-01'
  4  ) as sample_date;

SAMPLE_DATE    
______________ 
05-MAY-23      

With the basic building block available, it's easy to create lots of test data as well - for example, using PL/SQL for loops or Common Table Expressions (CTEs) as shown here:

with lots_of_dates as (
    select
        mle_faker_api.random_date(
            p_start_date => DATE '2023-01-01',
            p_stop_date  => DATE '2024-01-01'
        ) as rd
) 
select
    rd
from
    lots_of_dates
connect by level
    <= &how_much_data_do_you_want;

Summary

Generating simple test data isn’t particularly hard using SimpleFaker. Grab the module from your preferred CDN, load it into the database after carefully assessing license and security implications, expose the functionality you need to SQL, and off you go!

A follow-up to this article will show you how to create test data using @faker-js/faker. Stay tuned!

Martin Bach

Martin is a product manager at Oracle helping customers in Europe and around the world address their IT related problems. He is most interested in cloud technology, DevOps and how these can be used best with Oracle technology.


Previous Post

Oracle Linux: 2023 year in review

Honglin Su | 7 min read

Next Post


Automating OAuth 2.0 Authorization Flow For Connnector SuiteApps

Vlastimil Martinek | 6 min read