The Oracle Database ships with a set of sample schemas. This comes in quite handy for this post as it makes for a realistic data model. Rather than re-inventing the wheel this article is based on tables in the human resource, aka HR
schema.
For the sake of argument, let's assume you are assigned a task to provide sample data for the HR.EMPLOYEES
table. The EMPLOYEES
table is defined as follows:
SQL> info+ employees
TABLE: EMPLOYEES
LAST ANALYZED:
ROWS :
SAMPLE SIZE :
INMEMORY :DISABLED
COMMENTS :employees table. References with departments,
jobs, job_history tables. Contains a self reference.
Columns
NAME DATA TYPE NULL DEFAULT LOW_VALUE HIGH_VALUE NUM_DISTINCT HISTOGRAM
*EMPLOYEE_ID NUMBER(6,0) No
FIRST_NAME VARCHAR2(20 BYTE) Yes
LAST_NAME VARCHAR2(25 BYTE) No
EMAIL VARCHAR2(25 BYTE) No
PHONE_NUMBER VARCHAR2(20 BYTE) Yes
HIRE_DATE DATE No ..... .....
JOB_ID VARCHAR2(10 BYTE) No
SALARY NUMBER(8,2) Yes
COMMISSION_PCT NUMBER(2,2) Yes
MANAGER_ID NUMBER(6,0) Yes
DEPARTMENT_ID NUMBER(4,0) Yes
Indexes
INDEX_NAME UNIQUENESS STATUS FUNCIDX_STATUS COLUMNS
_______________________ _____________ _________ _________________ ________________________
HR.EMP_JOB_IX NONUNIQUE VALID JOB_ID
HR.EMP_NAME_IX NONUNIQUE VALID LAST_NAME, FIRST_NAME
HR.EMP_EMAIL_UK UNIQUE VALID EMAIL
HR.EMP_EMP_ID_PK UNIQUE VALID EMPLOYEE_ID
HR.EMP_MANAGER_IX NONUNIQUE VALID MANAGER_ID
HR.EMP_DEPARTMENT_IX NONUNIQUE VALID DEPARTMENT_ID
References
TABLE_NAME CONSTRAINT_NAME DELETE_RULE STATUS DEFERRABLE VALIDATED GENERATED
______________ __________________ ______________ __________ _________________ ____________ ____________
DEPARTMENTS DEPT_MGR_FK NO ACTION ENABLED NOT DEFERRABLE VALIDATED USER NAME
EMPLOYEES EMP_MANAGER_FK NO ACTION ENABLED NOT DEFERRABLE VALIDATED USER NAME
JOB_HISTORY JHIST_EMP_FK NO ACTION ENABLED NOT DEFERRABLE VALIDATED USER NAME
(info+
is a command available in sqlcl
)
The EMPLOYEES
table is a great candidate for a blog post: plenty of constraints and foreign keys to other tables add a certain degree of complexity.
FakerJS is a very popular data generator; it has been introduced in the previous article. It's again at the centre of attention and will be used to create the test data.
Please refer to the project’s GitHub project site for more details about its license and use implications. The article assumes your legal and IT Security departments (as well as any other party) agreed that using the module in your code is safe and compliant with your license. Using 3rd party code in your application typically requires specific compliance steps to be completed which are out of the scope of this article.
The article assumes that you deployed @faker-js/faker
8.3.1 as a Multilingual Engine (MLE) module in Oracle Database 23c. If you are unsure how to do this, please refer to the previous article for details. The post further assumes that you created the corresponding MLE environment FAKER_ENV
.
The following JavaScript module extends @faker-js/faker
with a few custom routines to generate the requested employee record. Some of the functions shown below have previously been introduced in an earlier article, they are kept to allow copy/pasting the code without creating problems at runtime.
The employee record generated by randomEmployee()
is returned as JSON
. Consumers of randomEmployee()
can store the data as is in a column or collection. Alternatively it's possible to use json_table()
to transform the JSON
into a relational model. You'll see an example of the latter a bit later in this post.
create or replace mle module mle_faker language javascript as
import { simpleFaker, faker } from "fakerjs";
/**
* Generate a random integer in the interval [minNumber, maxNumber]
* @param {number} minNumber the lower bound of the interval
* @param {number} maxNumber the upper bound of the interval
* @returns number
*/
export function randomInt(minNumber, maxNumber) {
const options = {
max: maxNumber,
min: minNumber
}
return simpleFaker.number.int(options);
}
/**
* Generate a random, alpha-numeric string of a certain case (defaults
* to mixed case), and a specific length
* @param {string} casing the string's case, one of upper, lower, mixed
* @param {*} minLength the string's minimum length
* @param {*} maxLength the string's maximum length
* @returns string
*/
export function randomString(casing, minLength, maxLength) {
const options = {
length: {
max: maxLength,
min: minLength
}
};
switch (casing) {
case 'upper':
options.casing = 'upper';
break;
case 'lower':
options.casing = 'lower';
break;
case 'mixed':
options.casing = 'mixed';
break;
default:
options.casing = 'mixed';
break;
}
return simpleFaker.string.alpha(options);
}
/**
* Generate a random date between in the interval [startDate, stopDate]
* @param {Date} startDate the earliest possible date
* @param {Date} stopDate the latest possible date
* @returns Date
*/
export function randomDate(startDate, stopDate) {
const options = {
from: startDate,
to: stopDate
};
return simpleFaker.date.between(options);
}
/**
* Create a new employee record for insertion into HR.EMPLOYEES. The main
* goal is to ensure referential integrity is preserved. Helper functions
* select the relevant details from the schema. Consumers of this function
* need to either store the record in JSON format, or use json_table() to
* convert JSON into the relational format.
* @param {string} gender the employee's gender
* @returns {JSON} a JSON representation of the new employee
*/
export function randomEmployee(gender) {
let employee = {};
const extension = randomInt(1, 9999);
// employee_id doesn't need to be created here
employee.firstName = faker.person.firstName(gender);
employee.lastName = faker.person.lastName();
employee.email = employee.firstName.substr(0,1) + employee.lastName.toUpperCase();
employee.phoneNumber = '1.650.555.' + faker.string.numeric(4);
employee.hireDate = randomDate('2020-01-01T00:00:00.000Z', '2023-12-01T00:00:00.000Z');
employee.jobId = randomJob();
employee.salary = faker.number.int(randomSalary(employee.jobId));
employee.commissionPct = null;
employee.departmentId = randomDepartment();
employee.managerId = randomManager(employee.departmentId);
return employee;
}
/**
* Get a random job from the HR.JOBS table
* @returns a random JOB_ID
*/
function randomJob() {
const result = session.execute(
`SELECT
job_id
FROM
jobs
ORDER BY
dbms_random.random
FETCH FIRST 1 ROWS ONLY`
);
if (result.rows === undefined) {
throw new Error(`database query went wrong: ${err}`);
} else {
return result.rows[0].JOB_ID;
}
}
/**
* Get a random department ID from HR.DEPARTMENTS. Makes sure the
* department has a manager. The department's manager will later
* be used as the new employee's manager as well.
* @returns a random DEPARTMENT_ID
*/
function randomDepartment() {
const result = session.execute(
`SELECT
department_id
FROM
departments
WHERE
manager_id is not null
ORDER BY
dbms_random.random
FETCH FIRST 1 ROWS ONLY`
);
if (result.rows === undefined) {
throw new Error(`database query went wrong: ${err}`);
} else {
return result.rows[0].DEPARTMENT_ID;
}
}
/**
* Get a specific department's manager from HR.DEPARTMENTS
* @param {number} departmentId the department ID for which a manager must be found
* @returns the corresponding MANAGER_ID
*/
function randomManager(departmentId) {
const result = session.execute(
`SELECT
manager_id
FROM
departments
WHERE
department_id = :1`,
[ departmentId ]
);
if (result.rows === undefined) {
throw new Error(`database query went wrong: ${err}`);
} else {
return result.rows[0].MANAGER_ID;
}
}
/**
* Get a random salary in the range defined by HR.JOBS, depending
* on the input parameter.
* @param {string} jobId matching HR.JOBS.JOB_ID
* @returns {JSON} a JSON object to be used in faker.number.int to calculate the appropriate salary
*/
function randomSalary(jobId) {
const result = session.execute(
`SELECT
min_salary as min,
max_salary as max
FROM
jobs
WHERE
job_id = :1`,
[ jobId ]
);
if (result.rows === undefined) {
throw new Error(`database query went wrong: ${err}`);
} else {
return {
min: result.rows[0].MIN,
max: result.rows[0].MAX
}
}
}
A call specification makes the JavaScript code available in SQL and PL/SQL. It can be created as follows:
create or replace package mle_faker_api as
function random_number(
p_min_number number,
p_max_number number
) return number
as mle module mle_faker
env faker_env
signature 'randomInt';
function random_string(
p_casing varchar2,
p_min_length number,
p_max_length number
) return varchar2
as mle module mle_faker
env faker_env
signature 'randomString';
function random_date(
p_start_date date,
p_stop_date date
) return date
as mle module mle_faker
env faker_env
signature 'randomDate';
function random_employee(
p_gender varchar2
) return JSON
as mle module mle_faker
env faker_env
signature 'randomEmployee';
end mle_faker_api;
/
With the JavaScript code in place and a call specification available, it's time for some testing.
Let's see what happens when you create a single employee:
SQL> select
2 mle_faker_api.random_employee('female') employee;
The query generates the following output:
EMPLOYEE
---------------------------------------------------
{
"firstName" : "Beatrice",
"lastName" : "Lynch",
"email" : "BLYNCH",
"phoneNumber" : "1.650.555.2607",
"hireDate" : "2023-01-11T05:42:26.380000+01:00",
"jobId" : "IT_PROG",
"salary" : 6458,
"commissionPct" : null,
"departmentId" : 90,
"managerId" : 100
}
A few things are noteworthy about this JSON:
EMAIL
column matches the rows in the table. The email address consists of the first letter of the first name + the last name in all upper-caseHR.JOBS
. Programmers earn between 4000 and 10000 monetary unitsThis looks good, all the values make sense in the context of the HR
sample schema.
What if you need to create more than a single employee record? No problem! The following example demonstrates how to create 10 new employee records, transformed to the relational format using json_table()
:
SQL> with lots_of_employees as (
2 select
3 case when mod(rownum, 2) = 0 then
4 mle_faker_api.random_employee('female')
5 else
6 mle_faker_api.random_employee('male')
7 end employee
8 from
9 dual
10 connect by level
11 <= 10
12 )
13 select
14 jt.*
15 from
16 lots_of_employees e,
17 json_table(
18 e.employee,
19 '$'
20 columns (
21 first_name varchar2 path '$.firstName',
22 last_name varchar2 path '$.lastName',
23 email varchar2 path '$.email',
24 phone_number varchar2 path '$.phoneNumber',
25 hire_date date path '$.hireDate',
26 job_id varchar2 path '$.jobId',
27 salary number path '$.salary',
28 commission_pct number path '$.commissionPct',
29 manager_id number path '$.managerId',
30 department_id number path '$.departmentId'
31 )
32* ) jt;
FIRST_NAME LAST_NAME EMAIL PHONE_NUMBER HIRE_DATE JOB_ID SALARY COMMISSION_PCT MANAGER_ID DEPARTMENT_ID
_____________ __________________ ___________________ _________________ ____________ __________ _________ _________________ _____________ ________________
Marc Mills MMILLS 1.650.555.4471 18-JUN-23 PU_MAN 14241 100 90
Marcia Welch MWELCH 1.650.555.6313 18-MAR-21 PU_MAN 13593 100 90
Mack Quitzon MQUITZON 1.650.555.7048 20-DEC-22 PR_REP 5533 200 10
Leticia Murphy LMURPHY 1.650.555.6040 29-AUG-22 AD_ASST 5066 103 60
Laurence Schuppe LSCHUPPE 1.650.555.8321 02-SEP-20 SA_MAN 19420 103 60
Phyllis Grant PGRANT 1.650.555.2625 14-MAY-22 AC_MGR 9475 100 90
Lowell Kilback LKILBACK 1.650.555.2167 20-DEC-21 PR_REP 4751 121 50
Mindy Upton MUPTON 1.650.555.4995 02-APR-23 ST_MAN 8356 200 10
Eddie Mitchell EMITCHELL 1.650.555.6041 26-MAR-23 SA_REP 11012 100 90
Gertrude O'Connell-Hoppe GO'CONNELL-HOPPE 1.650.555.8125 23-JAN-21 AC_MGR 9139 100 90
It’s very easy to add new records to the HR.EMPLOYEES table using this approach. Thankfully we don’t need to worry about the EMPLOYEE_ID, a sequence (EMPLOYEES_SEQ) helps populating this column:
SQL> insert into employees
2 with lots_of_employees as (
3 select
4 case when mod(rownum, 2) = 0 then
5 mle_faker_api.random_employee('female')
6 else
7 mle_faker_api.random_employee('male')
8 end employee
9 from
10 dual
11 connect by level
12 <= 10
13 )
14 select
15 employees_seq.nextval,
16 jt.*
17 from
18 lots_of_employees e,
19 json_table(
20 e.employee,
21 '$'
22 columns (
23 first_name varchar2 path '$.firstName',
24 last_name varchar2 path '$.lastName',
25 email varchar2 path '$.email',
26 phone_number varchar2 path '$.phoneNumber',
27 hire_date date path '$.hireDate',
28 job_id varchar2 path '$.jobId',
29 salary number path '$.salary',
30 commission_pct number path '$.commissionPct',
31 manager_id number path '$.managerId',
32 department_id number path '$.departmentId'
33 )
34* ) jt;
10 rows inserted.
Creating random dummy data isn't too hard, the challenge is somewhat higher in cases where data should make sense and respect primary and foreign key definitions. The previous article hopefully demonstrated that it's not too hard to fulfill these requirements.
Martin is a product manager at Oracle helping customers in Europe and around the world address their IT related problems. He is most interested in cloud technology, DevOps and how these can be used best with Oracle technology.
Next Post