Cloud Security Perspectives and Insights

A Simple Guide to Generating Fake Identity Test Data

Paul Toal
Distinguished Solution Engineer - Cyber Security

Whilst Cloud-based IAM services such as Oracle Identity Cloud Service are clearly the strategic direction for many customers, I still work with lots of companies that are either using, or still considering, traditional ‘on-premise’ IAM solutions. This can be for a variety of reasons, including:

  1. They cannot move to the cloud due to the sensitivity of their organisation, or possibly for regulatory reasons.
  2. They need the deep technical capabilities and flexibility of an on-premise IAM platform, which are not typically provided by cloud-based Identity-as-a-Service (IDaaS) solutions.
  3. They are in the process of migrating to Cloud, but still have many systems, including their IAM platform, running on-premise.

Of course, when I refer to ‘on-premise’ IAM I am talking about traditional IAM platforms, where the customer is responsible for installation and configuration of the software, as well as the day-to-day operation of it. Whether that software is actually running ‘on-premise’ within a customer’s own data centre, in a partner’s DC, or within a Cloud IaaS platform, it is still distinctly different to a Cloud-based IDaaS platform, where the customer is not installing and managing the underlying platform. Instead, they are just consuming the IDaaS service. For the remainder of this article, I will refer to this ‘on-premise’ IAM as Enterprise IAM.

As anyone who has looked at a true IDaaS solution such as Oracle Identity Cloud Service is aware, you are not responsible for many of the non-functional requirements of the platform, such as performance, monitoring, backup and recovery, DR etc. However, all of this is firmly your responsibility with Enterprise IAM, just like any other on-premise software.

At the moment, I am working on a project that uses Oracle Management Cloud (OMC) to monitor Enterprise IAM (in this case, Oracle Enterprise Identity Services Suite). In case you aren’t aware, OMC is a cloud-native suite of management services that eliminates the human effort associated with traditional solutions for monitoring, managing and securing applications and infrastructure. OMC leverages machine learning and big data techniques against the full breadth of the operational data set to help customers drive innovation while removing cost and risk from operational processes. More details on this project in a future post.

An Overview of Oracle Management Cloud


In my environment, I have a full demo platform of Oracle Enterprise IAM deployed and OMC agents deployed to monitor the activity and metrics for that platform. However, monitoring provides limited value without any throughput and, being my own demo platform, it’s not used heavily enough to generate any serious metrics or activity. For access management, I want to throw some load at the servers for different use cases and see how they perform, together with the underlying LDAP. Similarly, for identity governance, I want to perform a number of activities to kick off various actions and workflows etc. Therefore, I have been spending some time building some automated testing scripts using tools like Apache JMeter and Postman.

To make the testing realistic I needed to generate some fake test data. In the past, I have used Perl scripts to generate data but I didn’t really fancy brushing up on my very rusty Perl skills. Therefore, after asking a couple of colleagues, I was pointed at a Python module called Faker. If you are already familiar with Faker, then you can stop reading now. However, if you aren’t, then I found it extremely useful. In just a few lines of code I was able to generate a CSV containing completely random test data. The official home for Faker is on GitHub, where you will find installation instructions and simple usage instructions. As you will see, Faker has a wide range of different modules for generating different types of fake data. Below is the script I wrote to generate a simple CSV of random user details.

# Paul Toal, Oracle
# March 2019
# This file is used to generate a CSV file containing
# random user details for use with an Oracle Identity
# Governance test script

# Import the Faker module for generating fake data
from faker import Factory
# Import the random module to generate an employeeID
import random

fake = Factory.create()

# Define the file to write the output to
file = open("OIGTestUsers.csv", "w")

# How many entries to make
howMany = 10

# Create a random number to use as a starting point for employeeID
entropy = random.randint(10000, 99999)

# Write the CSV header row (no spaces after the commas, to match the data rows)
file.write("employeeNumber,title,familyName,givenName,organization,email,userName,userType,phone" + "\n")

# Create a new line in the CSV for each user
for n in range(1, howMany + 1):

    # Generate job title. Returned value can contain a comma, so will be stripped out later
    title = str(fake.job())

    # Generate phone number. Return value can contain an extension, i.e. x1234, so will be stripped out later
    phone = str(fake.phone_number())

    # Generate first and last names separately to re-use in fields such as userName and email
    lastName = str(fake.last_name())
    firstName = str(fake.first_name())

    # Write each entry line to the output file
    file.write(str(n + entropy) + ","
        + title.split(',')[0] + ","
        + lastName + ","
        + firstName + ","
        + "Finance,"
        + firstName + "." + lastName + "@oracledemo.com,"
        + firstName + "." + lastName + ","
        + "Full-Time,"
        + phone.split('x')[0]
        + "\n")

# Close the file handler
file.close()
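
One side effect of the script above is that anything after a comma in a job title is thrown away. As a rough sketch of an alternative, Python's standard csv module can quote such fields instead of truncating them. The names user_rows and write_users below are hypothetical helpers of my own, not part of Faker, and the fake argument is assumed to be a Faker instance (or any object offering the same four methods).

```python
import csv

HEADER = ["employeeNumber", "title", "familyName", "givenName",
          "organization", "email", "userName", "userType", "phone"]


def user_rows(fake, how_many, start_id):
    """Yield one list of field values per fake user.

    `fake` is assumed to offer job(), last_name(), first_name() and
    phone_number(), as a Faker instance does.
    """
    for n in range(1, how_many + 1):
        last_name = fake.last_name()
        first_name = fake.first_name()
        user_name = first_name + "." + last_name
        yield [
            start_id + n,
            fake.job(),                         # kept whole, commas included
            last_name,
            first_name,
            "Finance",
            user_name + "@oracledemo.com",
            user_name,
            "Full-Time",
            fake.phone_number().split("x")[0],  # drop any extension
        ]


def write_users(stream, rows):
    """Write the header and the rows to an open text stream as CSV."""
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerow(HEADER)
    writer.writerows(rows)


# Possible usage, assuming Faker is installed:
#   from faker import Faker
#   with open("OIGTestUsers.csv", "w", newline="") as f:
#       write_users(f, user_rows(Faker(), 10, 12345))
```

With the default quoting, csv.writer wraps a value such as "Engineer, civil" in double quotes, so the consuming tool needs to be reading the file as quoted CSV. If it isn't, stripping the comma as in the original script remains the simpler choice.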



I hope you find this useful as a guide to generating your own fake test data. Of course, there are many alternative ways to generate test data, in many different languages. However, I found this ideal for my particular purpose, and with very little effort.
