Python is a popular, easy-to-learn programming language commonly used for data science, machine learning, and artificial intelligence. In the context of generative AI, Python is widely used to build, train, and deploy models that can generate text, images, or other content. As Oracle continues to help organizations harness the power of generative AI with their private data, we give Python developers native access to Select AI functionality within Oracle Autonomous Database.
Oracle now offers Select AI for Python enabling you to ask questions of your database data using natural language (text-to-SQL), receive generative AI responses using your trusted content (retrieval-augmented generation), generate synthetic data using large language models, and other features – all from Python. With Select AI for Python, you can leverage the broader Python ecosystem in combination with generative AI and database functionality – bridging the gap between the Select AI’s DBMS_CLOUD_AI PL/SQL package and Python’s rich ecosystem. It provides intuitive objects and methods for pythonic AI model interaction.
With Select AI, there is no vendor lock-in. You can choose your favorite LLMs and transformers from a wide range of AI providers, like OpenAI, Anthropic, Cohere, and others. You can even work with AI models from private endpoints to keep your models, data, and metadata under your control for added security. Of course, you benefit from the same Oracle enterprise-grade database security and availability.
Select AI for Python Main Features
Here’s a summary of the features that are available in Select AI for Python:
- Choice of AI provider and AI models: Create and manage Select AI’s AI profile objects, which allow you to specify the AI provider, and the LLM and transformer models you wish to use, among other settings. Select AI provides default AI models for specific AI providers automatically, which you can override. For example, OCI Generative AI Service uses one of the more recent Meta Llama LLMs.
- Text-to-SQL: Query your database using natural language, whether to generate a SQL query, run or explain that query, or get a narrated response to the results of that query.
- Chat: Use your LLM directly for content generation based on your user prompt—for example, generating custom emails, answering questions, and sentiment analysis—just to name a few use cases.
- Retrieval-Augmented Generation (RAG): Enable LLMs to generate more relevant responses by augmenting your prompt with knowledge from your provided documents– without LLM fine-tuning, which can be costly in terms of time and resources and potentially introduce security issues.
- Automated vector index creation and maintenance: Quickly and easily create a vector index to be used with RAG using data from cloud storage and other sources. Select AI automatically vectorizes your content for use in subsequent RAG actions with updates as new content is added.
- Results integrated in Python objects: Receive AI-generated results directly into Python data structures, facilitating analysis and integration.
- Chatbot with conversation memory: Create and manage named conversations based on your interactions with the LLM.
- Synthetic data generation: Generate synthetic data for a single table or a set of tables with referential integrity constraints.
- Synchronous and asynchronous invocation: Build applications using Python with either synchronous or the more flexible asynchronous programming style using standalone Python clients. With asynchronous support, the API integrates easily with web frameworks like FastAPI or Flask, enabling real-time AI-driven applications.
Examples using Select AI for Python
Let’s look at a few examples to get you familiar with Select AI for Python.
When using Select AI for Python, first import the select_ai module. If you’re using a standalone Python client, use your database username, password, and connection string to connect to your database. In OML Notebooks, the connection is implicit.
import select_ai # Use explicit connection from external Python client select_ai.connect(user=user, password=password, dsn=dsn)
See the section below for instructions on how to install and get started with Select AI for Python.
Example 1: Using NL2SQL Generation Actions
In this example, we use Select AI to generate queries against our database tables that answer the questions “how many customers do I have” and “how many distinct products are there.”
We’re using the OCI Generative AI service as our AI provider from the Chicago region with the default model from OCI; but we could have specified one of the available OCI models. In the provider attributes, we specify our access credential and the schema with the tables we want to be able to query. We then define our AI profile giving it a profile name, the attributes, and a description. Finally, we iterate over the prompts using three of the Select AI actions: show_sql to show the generated SQL query, explain to explain the generated query, run_sql to run the query and return the result, and narrate to provide an LLM-narrated response using the SQL query result.
provider = select_ai.OCIGenAIProvider(region="us-chicago-1", oci_apiformat="GENERIC") profile_attributes = select_ai.ProfileAttributes( credential_name = "<my_oci_ai_profile_key>", object_list = [{"owner": "SH"}], provider = provider) profile = select_ai.Profile( profile_name = "oci_ai_profile", attributes = profile_attributes, description = "My OCI AI Profile", replace = True) prompts = ["How many customers do I have?", "How many distinct products are there?"] for prompt in prompts: print("Prompt: ", prompt) print("\Show SQL: ", profile.show_sql(prompt=prompt)) print("\nExplain SQL: ", profile.explain_sql(prompt=prompt)) print("\nRun SQL: ", profile.run_sql(prompt=prompt)) print("\nNarrate: ", profile.narrate(prompt=prompt))
Example 2: Create a vector index for using narrate with RAG
In this next example, we generate vectors for text files in object storage. We again use an OCI Generative AI service provider from the Chicago region and specify the embedding model for vector indexing. Note that we are using a default embedding model here, but you can specify one if you prefer different model. As before, we provide our OCI profile key and the provider as attributes for our AI profile.
We then specify the Oracle vector index attributes with the object storage file location and credential. Lastly, we create the vector index object by specifying the index name, attributes, description, and the AI profile to use.
We then use the AI profile with the narrate action to answer the question “what types of actions does select AI support” and print out the results.
# Configure an AI provider with an embedding model of your choice provider = select_ai.OCIGenAIProvider( region = "us-chicago-1", oci_apiformat = "GENERIC", embedding_model = "cohere.embed-english-v3.0") # Create an AI profile that uses the vector index profile_attributes = select_ai.ProfileAttributes( credential_name = "my_oci_ai_profile_key", provider = provider) profile = select_ai.Profile( profile_name ="oci_vector_ai_profile", attributes = profile_attributes, description = "MY OCI AI Profile", replace = True) # Specify text documents for creating vector # Here, the objects reside in Oracle Object Store and the vector database is Oracle vector_index_attributes = select_ai.OracleVectorIndexAttributes( location = "https://objectstorage.us-ashburn-1.oraclecloud.com/n/dwcsdev/b/conda-environment/o/tenant1-pdb3/graph", object_storage_credential_name = "my_oci_ai_profile_key") # Create a vector index object vector_index = select_ai.VectorIndex( index_name = "my_vector_index", attributes = vector_index_attributes, description = "My first vector index", profile = profile) vector_index.create(replace=True) # Use the narrate action to use RAG result = profile.narrate("What types of actions does Select AI support") print(result)
Example 3: Generate synthetic data
Synthetic data generation allows you to use generative AI to populate one or more tables with semi-realistic synthetic data. In our example, we populate a table MOVIE with 100 generated rows featuring movies from 2019. If the table has some data already in it, we can specify the number of sample rows we would like to have sent to the LLM to further help guide its data generation. Here, we specify 10 sample rows to be included. In the data parameters, we assign “HIGH” priority for this data generation process to allow for maximum parallelism in Autonomous Database.
profile = select_ai.Profile(profile_name="oci_ai_profile") synthetic_data_params = select_ai.SyntheticDataParams( sample_rows = 10, table_statistics = True, priority = "HIGH") synthetic_data_attributes = select_ai.SyntheticDataAttributes( object_name = "MOVIE", user_prompt = "the release date for the movies should be in 2019", params = synthetic_data_params, record_count = 100) profile.generate_synthetic_data(synthetic_data_attributes=synthetic_data_attributes)
To generate data for multiple tables, we list each one in the object list, the number of records we’d like generated, and an optional user prompt for the LLM to consider. Here’s the code for the tables “MOVIE”, “ACTOR”, and “DIRECTOR”.
profile = select_ai.Profile(profile_name="oci_ai_profile") synthetic_data_params = select_ai.SyntheticDataParams( priority = "HIGH" ) object_list = [ {"owner": user, "name": "MOVIE", "record_count": 100, "user_prompt": "the movie release date should be in 2019"}, {"owner": user, "name": "ACTOR", "record_count": 10}, {"owner": user, "name": "DIRECTOR", "record_count": 5} ] synthetic_data_attributes = select_ai.SyntheticDataAttributes( object_list = object_list, params = synthetic_data_params) profile.generate_synthetic_data(synthetic_data_attributes=synthetic_data_attributes)
Getting started with Select AI for Python
To use the select_ai Python module from a standalone client, you will need:
- Python 3.9, 3.10, 3.11, 3.12, or 3.13
- python-oracledb – This package is automatically installed as a dependency requirement.
- pandas – This package is automatically installed as a dependency requirement.
select_ai can be installed from Python’s package repository PyPI using pip. To install and test the select _ai module, follow these steps.
1. Install Python 3 if it is not already available. Use any version from Python 3.9 through 3.13.
2. Install select_ai:
python3 -m pip install select_ai --upgrade --user
3. If you are behind a proxy, use the –proxy option. For example:
python3 -m pip install select_ai --upgrade --user --proxy=http://proxy.example.com:80
4. Run the following from your Python command line:
import select_ai
user = "<your_db_user>" password = "<your_db_password>" dsn = "<your_db_dsn>" select_ai.connect(user=user, password=password, dsn=dsn)
If using Oracle Machine Learning Notebooks, Select AI for Python is pre-installed and ready for use of Python paragraphs in your notebook. Simply import the select_ai module. A default connection is already established for using the synchronous API. The asynchronous API is not supported in the notebook environment.
See the product documentation for more details.
Select AI for Python as Open Source
Select AI for Python code is visible to users as a Python wrapper around the DBMS_CLOUD_AI PL/SQL package functionality, and Oracle welcomes your code contributions through the GitHub repository. Contribute by simply signing up using the Oracle Contributor Agreement (OCA).
Resources
- Try Select AI for free on OCI: Autonomous Database Free Trial
- Video: Getting Started with Oracle Select AI
- Documentation
- LiveLabs