AI and Data Security: Protecting Sensitive Information with Oracle Autonomous Database Select AI

Oracle Autonomous Database Select AI provides a simple way to integrate large language models (LLMs) with your database applications so that you can query your data in natural language. However, some organizations are concerned that using LLMs and AI with their business data could introduce data security challenges, such as AI gaining access to sensitive data or that data being misused. Oracle Autonomous Database helps organizations take advantage of AI while addressing these data security concerns through a multi-layered security approach.

Greater security for AI data queries: Oracle’s zero-trust model for data access

Setting up Select AI requires a simple configuration step: you create credentials for the AI provider and a profile that lists the data to include. But let’s take a step back and start with how Autonomous Database handles data access for users, including those using AI-powered applications enabled by Select AI. Autonomous Database adopts Oracle’s zero-trust model for database access, in which users have no access to data until privileges are explicitly granted. This means that for an existing or new environment, you must carefully consider and grant data access based on least-privilege policies. In addition, Autonomous Database offers built-in security features such as role-based access control, Virtual Private Database, and Oracle Database Vault to help protect sensitive data, even from highly privileged users (e.g., DBAs). Select AI inherits the zero-trust model and Autonomous Database’s security features: users can only ask about and retrieve data for which they have privileges.

For example, let’s say an organization has several departments, such as HR, Finance, Support, and Sales. Each department has its own set of sensitive data, and users both inside and outside the department need privileges to access the appropriate data. A common practice is to grant that access through roles designed around the principle of least privilege.
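As a minimal sketch of that practice, the statements below create one read-only role per department and grant each role access to a single table. The role and user names are hypothetical, and the table names are borrowed from the example tables later in this post; a real deployment would map roles to its own schemas.

```sql
-- Hypothetical role and user names, for illustration only.
CREATE ROLE sales_reader;
CREATE ROLE hr_reader;

-- Grant each role only the object privileges it needs.
GRANT SELECT ON moviestream.sales TO sales_reader;
GRANT SELECT ON hr.sales_team TO hr_reader;

-- Assign roles to users according to their job function.
GRANT sales_reader TO sales_user1;
GRANT hr_reader TO hr_analyst1;
```

Because Select AI runs the generated SQL as the connected user, a user who holds only the sales role cannot retrieve HR data through a natural language question.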

To help you implement these privileges, Oracle offers automation, default policies, and tools. Oracle Data Safe and Privilege Analysis provide monitoring and reporting of database users, user privileges, and privilege changes in Autonomous Database. Data Safe can also discover sensitive data based on over 170 predefined data types to help organizations classify and protect data. This analysis of sensitive data and privileges matters because Select AI relies on the same underlying privileges. (For more on Data Safe, user assessment, and data discovery, check out: Autonomous Learning Lounge – https://www.youtube.com/watch?v=me9blY1kNQ0)
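One way to check whether granted privileges are actually used is a Privilege Analysis capture with the DBMS_PRIVILEGE_CAPTURE package. The sketch below assumes a hypothetical role named SALES_READER and a hypothetical capture name, and it assumes the user running it has the CAPTURE_ADMIN role; treat it as an outline rather than a complete procedure.

```sql
-- Sketch only: the capture name and role are hypothetical.
BEGIN
  DBMS_PRIVILEGE_CAPTURE.CREATE_CAPTURE(
    name        => 'sales_reader_capture',
    description => 'Privileges used through the SALES_READER role',
    type        => DBMS_PRIVILEGE_CAPTURE.G_ROLE,
    roles       => role_name_list('SALES_READER'));
  DBMS_PRIVILEGE_CAPTURE.ENABLE_CAPTURE(name => 'sales_reader_capture');
END;
/

-- Let the application run a representative workload, then stop the
-- capture and generate the report data.
BEGIN
  DBMS_PRIVILEGE_CAPTURE.DISABLE_CAPTURE(name => 'sales_reader_capture');
  DBMS_PRIVILEGE_CAPTURE.GENERATE_RESULT(name => 'sales_reader_capture');
END;
/

-- Privileges that were granted but never exercised during the capture.
SELECT username, rolename, obj_priv, object_owner, object_name
FROM   dba_unused_privs
WHERE  capture = 'sales_reader_capture';
```

Privileges that show up as unused are candidates for revocation before the data is exposed through Select AI.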

Finer-grained security restrictions for AI with Oracle Virtual Private Database

You can enable even finer-grained data access with Oracle Virtual Private Database (VPD), which lets you set security policies that control data access at the row and column level directly on database tables or views. With such a policy in place, natural language queries generated by an LLM via Select AI are subject to the additional data access conditions and return only the data that the user is allowed to access.

To show an example, let me extend the previous use case. Suppose a group of sales users is using Select AI to ask natural language questions about their customers and sales data. Now, the organization wants to enforce a new security policy: each user can only access their own region’s financial sales data.

The following SQL statements create the table structures, the security function, and the VPD policy. Once in place, the policy restricts each user’s data access to their own region, even if the specific restriction is not included in the query generated by Select AI. This technique is ideal for natural language queries because the database applies the filter itself: the user gets the right set of data without having to know the underlying table structures or the filters.

Example tables:

CREATE TABLE MOVIESTREAM.SALES
( PROD_ID        NUMBER,
  CUST_ID        NUMBER,
  REGION_ID      NUMBER,
  TIME_ID        DATE,
  CHANNEL_ID     NUMBER,
  PROMO_ID       NUMBER,
  QUANTITY_SOLD  NUMBER(10,2),
  AMOUNT_SOLD    NUMBER(10,2)
);

CREATE TABLE HR.SALES_TEAM
( USERNAME   VARCHAR2(80),
  TEAM_ID    NUMBER,
  REGION_ID  NUMBER
);

Security function to use as the filter for the VPD policy:

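CREATE OR REPLACE FUNCTION SEC_FUNCTION
  (OBJECT_SCHEMA VARCHAR2, OBJECT_NAME VARCHAR2)
RETURN VARCHAR2
AS
  REGION_NUM NUMBER;
BEGIN
  -- Look up the region assigned to the connected database user.
  SELECT region_id INTO region_num
  FROM   hr.sales_team
  WHERE  username = SYS_CONTEXT('userenv', 'SESSION_USER');
  -- Predicate that VPD appends to queries against the protected table.
  RETURN 'region_id=' || region_num;
EXCEPTION
  -- Added safeguard: a user with no row in hr.sales_team sees no rows
  -- instead of the query failing with an unhandled error.
  WHEN NO_DATA_FOUND THEN
    RETURN '1=0';
END;
/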

Create the VPD policy on the table and enable it:

BEGIN
  DBMS_RLS.ADD_POLICY(
    object_schema   => 'MOVIESTREAM',
    object_name     => 'SALES',
    policy_name     => 'by_region',
    policy_function => 'sec_function',
    -- Row-level filter on every SELECT. Do not combine sec_relevant_cols
    -- with DBMS_RLS.ALL_ROWS here: that masks the column as NULL and
    -- still returns every row.
    statement_types => 'SELECT');
END;
/

BEGIN
  DBMS_RLS.ENABLE_POLICY(
    object_schema => 'MOVIESTREAM',
    object_name   => 'SALES',
    policy_name   => 'by_region',
    enable        => TRUE);
END;
/

If a user of the sales data asks a question such as “What are the sales for the monthly promotion?”, the following happens:

Select AI helps formulate a query and returns results based on the promotion, quantity, and amount sold.
The security function limits the results to the user’s region.
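To check this behavior without involving an LLM, you can run the kind of query Select AI might generate while connected as a sales user. The sketch below assumes a hypothetical user SALES_USER1 assigned to a hypothetical region 10, and it assumes the policy is in effect as a row filter; the aggregate query is only a guess at what a generated query could look like.

```sql
-- Run as a privileged user. SALES_USER1 and region 10 are hypothetical.
INSERT INTO hr.sales_team (username, team_id, region_id)
VALUES ('SALES_USER1', 1, 10);
COMMIT;

-- Run while connected as SALES_USER1, who needs SELECT on MOVIESTREAM.SALES.
-- No region filter appears in the statement; the database applies the
-- predicate returned by the policy function.
SELECT promo_id,
       SUM(quantity_sold) AS total_quantity,
       SUM(amount_sold)   AS total_amount
FROM   moviestream.sales
GROUP  BY promo_id;

-- With the row filter in effect, this should list only the user's region.
SELECT DISTINCT region_id FROM moviestream.sales;
```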

Security control with Select AI’s Profile

Now that the data privileges are in place, let’s look at the next security layer: the Select AI profile. Within an AI profile, you define the AI provider, the credentials for that provider, and a schema or set of objects by using the DBMS_CLOUD_AI.CREATE_PROFILE procedure. This gives you the flexibility to choose which schemas or objects to include at the profile level, instead of exposing the entire database to an LLM. Here is a quick example of a profile for Select AI:

BEGIN
  DBMS_CLOUD_AI.CREATE_PROFILE(
    PROFILE_NAME => 'MOVIESTREAM_AI',
    -- "AI_PROVIDER" and "AI_CREDENTIAL_NAME" are placeholders; for example,
    -- the provider is "oci" for OCI Generative AI.
    ATTRIBUTES =>
      '{ "provider": "AI_PROVIDER",
         "credential_name": "AI_CREDENTIAL_NAME",
         "comments": "true",
         "object_list": [
           {"owner": "MOVIESTREAM", "name": "SALES"},
           {"owner": "MOVIESTREAM", "name": "CUSTOMER"},
           {"owner": "SUPPORT", "name": "TICKETS"},
           {"owner": "HR", "name": "DEPARTMENTS"},
           {"owner": "MOVIESTREAM", "name": "MOVIES"}]}'
  );
END;
/
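Once the profile exists, a session activates it and can then ask questions with the SELECT AI syntax. The following is a minimal sketch that reuses the MOVIESTREAM_AI profile name from above; it assumes a client that passes SELECT AI statements through to the database unchanged.

```sql
-- Activate the profile for the current session.
BEGIN
  DBMS_CLOUD_AI.SET_PROFILE(profile_name => 'MOVIESTREAM_AI');
END;
/

-- showsql returns the generated SQL without running it, which is
-- useful for reviewing what the LLM produced.
SELECT AI showsql what are the sales for the monthly promotion;

-- runsql executes the generated query as the connected user, so the
-- user's privileges and any VPD policies still apply.
SELECT AI runsql what are the sales for the monthly promotion;
```

Reviewing the output of showsql first is a simple way to confirm that the generated query touches only the objects listed in the profile.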

In the profile, the provider and credentials also define the LLM to use, which lets you choose an LLM based on the use case and even the sensitivity of the data. For example, using the LLMs within the OCI Generative AI Service can help reduce the risk of the data leaving Oracle Cloud Infrastructure (OCI).
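The credential that a profile references is created separately with DBMS_CLOUD.CREATE_CREDENTIAL. The sketch below assumes OCI Generative AI as the provider and uses OCI API signing key parameters; the values in angle brackets are placeholders to fill in, and SALES_USER1 is a hypothetical user. Other providers typically take a username and an API key as the password instead.

```sql
-- As an administrator: allow the user to call the Select AI package.
GRANT EXECUTE ON DBMS_CLOUD_AI TO sales_user1;

-- Store the provider credential in the database under the name that
-- the profile's "credential_name" attribute refers to.
BEGIN
  DBMS_CLOUD.CREATE_CREDENTIAL(
    credential_name => 'AI_CREDENTIAL_NAME',
    user_ocid       => '<user OCID>',
    tenancy_ocid    => '<tenancy OCID>',
    private_key     => '<API signing private key>',
    fingerprint     => '<API key fingerprint>');
END;
/
```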

Select AI features beyond text-to-SQL, such as ‘narrate’ and retrieval augmented generation (RAG), can send data to the LLM. There are, however, a couple of options to limit the potential risk and protect the data:

Bring the model to the data and keep it under your control by using an in-database transformer with Select AI RAG. Use DBMS_VECTOR.LOAD_ONNX_MODEL to load the transformer into the database.
Alternatively, send only the metadata to the LLMs. The DBMS_CLOUD_AI.DISABLE_DATA_ACCESS procedure helps prevent data from being sent to the LLMs.
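Both options can be sketched in a few lines. The directory object, file name, and model name below are hypothetical, and the JSON metadata is an assumption that depends on the inputs and outputs of the specific ONNX model being loaded.

```sql
-- As an administrator: stop Select AI from sending table data to the
-- LLM. Schema metadata is still used to generate SQL.
BEGIN
  DBMS_CLOUD_AI.DISABLE_DATA_ACCESS;
END;
/

-- Load an ONNX embedding model into the database so that embeddings are
-- generated in-database. Arguments, in order: directory object,
-- file name, model name, model metadata (all hypothetical values).
BEGIN
  DBMS_VECTOR.LOAD_ONNX_MODEL(
    'MODEL_DIR',
    'my_embedding_model.onnx',
    'MY_EMBEDDING_MODEL',
    JSON('{"function": "embedding", "embeddingOutput": "embedding", "input": {"input": ["DATA"]}}'));
END;
/
```

With data access disabled, text-to-SQL questions continue to work, while actions that need to send result data to the LLM are blocked.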

Gain multi-layered security for your data with Oracle Autonomous Database Select AI

Oracle Autonomous Database has multiple layers of security options to help safeguard your data while you use AI in your applications or ask natural language questions about your data. It is based on Oracle’s zero-trust model, which requires privileges to access data, and it lets you fine-tune users’ data access with features such as Virtual Private Database, Oracle Database Vault, and Data Safe. Select AI combines the data security in Autonomous Database with additional security controls defined for using the data with LLMs.

Oracle Autonomous Database and Select AI are available on OCI, Azure, Google Cloud, and AWS. Give it a try, and if you need more details on how to get started, our LiveLabs workshops provide hands-on experience with setting up and using Select AI. In addition, there is a sample code repository that walks you through these steps for Autonomous Database.

More information about Select AI

Unlocking Data for All with Sidecar: Empowering Business Users with AI-Driven Insights
Select AI Enhances Text-to-SQL Support
Select AI Retrieval Augmented Generation now supports in-database ONNX-format transformers

LiveLabs: https://livelabs.oracle.com/adb
Samples: https://github.com/oracle-devrel/oracle-autonomous-database-samples

Michelle Malcher

Director, Autonomous Database Product Management
