OCI Document Understanding: Using OJET apps with TypeScript

June 8, 2023 | 8 minute read

What is OCI Document Understanding?

OCI Document Understanding is a cloud-native, serverless service that provides out-of-the-box, learning-based custom models over REST APIs. The service can be accessed through the Oracle Cloud Console, REST APIs, the OCI CLI, and OCI software development kits (SDKs), which are available in languages such as TypeScript, Python, Go, .NET, Ruby, and Java.

Key features of OCI Document Understanding include:

Data extraction: Lets you identify and extract data items within a document as key-value pairs, lines of text, or tabular content. With OCI AI services, data can be located and easily extracted from the document.

Classification: With the AI services, documents can be classified into predefined categories such as invoices, passports, and receipts, based on features of the document such as textual content or item positioning.

As you may have gathered, this service has high potential for automating data-entry processes, as well as for digitally categorizing and searching large numbers of documents.

This blog has two parts:

  • Part one illustrates uploading documents to OCI buckets through an OJET application.
  • Part two details the data extraction feature of the OCI Document Understanding API.

A user-friendly screen is built with OJET and TypeScript to demonstrate selecting a visa document, which is then uploaded to the OCI bucket. Finally, the content of the uploaded document is mapped to the respective UI form elements. The uploaded document is also shown on the same screen so that the extracted data can be checked against it.

Document Understanding interface
Figure 1: OJET application UI screen with Document Understanding

Prerequisites

  • OCI SDK (TypeScript)
  • Node.js
  • ojet CLI
  • Any IDE, such as Visual Studio Code
  • Oracle Cloud account (free Oracle Cloud services are used in this blog)
  • Express server
  • Multer

Part 1

1. Setup

  • Log in to your Oracle Cloud account. If you don't have one yet, sign up at cloud.oracle.com.

screenshot

  • Create a compartment from the OCI “Identity and Security” menu and note the compartment ID.

    screenshot

  • Click the Profile icon, choose the tenancy option, and note the “namespace.” Note the user's OCID if required.

    screenshot

  • From the Storage menu, choose the “Bucket” option. Create a bucket in the compartment you created above and name it “oracle_bucket.” Note the OCID.

    screenshot

2. Set up an API signing key and OCI profile

  • Sign in to the Console as a functions developer.
  • Open the Profile menu and click User Settings.
  • Under Resources, click API Keys, and then click Add API Key.
  • Select Generate API Key Pair in the Add API Key dialog.
  • Click Download Private Key and save the private key file (as a .pem file) in the ~/.oci directory. (If the ~/.oci directory doesn't already exist, create it now).
  • Click Add to add the new API signing key to your user settings.
  • The Configuration File Preview dialog is displayed, containing a configuration file snippet with basic authentication information for a profile named DEFAULT (including the fingerprint of the API signing key you just created).
  • Copy the configuration file snippet shown in the text box and close the Configuration File Preview dialog.
  • In a text editor, open the ~/.oci/config file and paste the snippet into the file. (If the ~/.oci/config file doesn't already exist, create it now).
  • In the text editor, change the value of the key_file parameter of the profile to the path of the private key file (the .pem file) you downloaded earlier.
    Config file
  • In the text editor, save the changes you've made to the ~/.oci/config file, and close the text editor. Details about this process can be found in this documentation.
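For orientation, a profile generated by the console normally carries the keys user, fingerprint, tenancy, region, and key_file. The sketch below is only an illustration of that layout: `parseOciConfig` and `missingKeys` are hypothetical helpers written for this blog (the OCI SDK reads ~/.oci/config on its own), and every value in `sampleConfig` is a placeholder.

```typescript
// Illustrative only: the OCI SDK reads ~/.oci/config itself. These hypothetical
// helpers just show the INI-style layout of a profile and check its keys.
type Profile = Record<string, string>;

function parseOciConfig(text: string): Record<string, Profile> {
  const profiles: Record<string, Profile> = {};
  let current: Profile | undefined;
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue; // skip blanks and comments
    const header = /^\[(.+)\]$/.exec(line);
    if (header) {
      current = {};
      profiles[header[1]] = current; // start a new profile, e.g. DEFAULT
      continue;
    }
    const eq = line.indexOf("=");
    if (eq > 0 && current) {
      current[line.slice(0, eq).trim()] = line.slice(eq + 1).trim();
    }
  }
  return profiles;
}

// Keys the console-generated snippet normally contains.
const REQUIRED_KEYS = ["user", "fingerprint", "tenancy", "region", "key_file"];

function missingKeys(profile: Profile): string[] {
  return REQUIRED_KEYS.filter((key) => !profile[key]);
}

// Placeholder values: substitute the ones from your Configuration File Preview.
const sampleConfig = `
[DEFAULT]
user=ocid1.user.oc1..exampleuniqueid
fingerprint=aa:bb:cc:dd:ee:ff:00:11:22:33:44:55:66:77:88:99
tenancy=ocid1.tenancy.oc1..exampleuniqueid
region=us-ashburn-1
key_file=~/.oci/oci_api_key.pem
`;

const defaultProfile = parseOciConfig(sampleConfig)["DEFAULT"];
// An empty list means every expected key is present.
console.log(missingKeys(defaultProfile));
```

A check like this can be handy before starting the server, because a key_file entry that was never updated is an easy mistake to make in this step.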
     

3. Create a simple OJET application named “DocumentUnderstandingApp” with this command:

ojet create DocumentUnderstandingApp --template=navbar --typescript

screenshot

4. Build the view layer

Update dashboard.html: add an ‘oj-file-picker’ element to provide the file browse feature on screen, and attach a select listener to capture the selection event.

code screenshot

5. Build the model layer

  • In the dashboard.ts file, create a “selectFileListener” method and specify the OCI parameters namespaceName, bucketName, compartmentid, and userid, using the values noted from the OCI console.
  • selectFileListener receives the file picker event, extracts the selected file object, and retrieves the file stream from it. It then builds the ‘formData’ and posts it to the server's upload API, which uploads the file to the OCI Object Storage bucket.
  • Once the response is received, it constructs the server image URL path used to show the document on screen. The path is stored in an OJET observable that is bound to the UI screen.

    screenshot
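The listener described above could be sketched roughly as follows. This is not the original dashboard.ts: the `upload` and `showImage` callbacks are hypothetical stand-ins for the FormData POST and the OJET observable, the `/upload/` URL path and `SERVER_URL` are assumptions, and the parameter values are placeholders. The event shape (`detail.files`) follows my understanding of the oj-file-picker select event.

```typescript
// Minimal structural types so the sketch does not depend on OJET or DOM typings.
interface PickedFile { name: string; }
interface FileSelectEvent { detail: { files: ArrayLike<PickedFile> }; }

// Placeholder values: replace with the ones noted from the OCI console.
const ociParams = {
  namespaceName: "my_namespace",
  bucketName: "oracle_bucket",
  compartmentid: "ocid1.compartment.oc1..exampleuniqueid",
  userid: "ocid1.user.oc1..exampleuniqueid",
};

// Assumed base URL of the Express server built later in this blog.
const SERVER_URL = "http://localhost:3000";

// Returns the first file carried by the file picker's select event, if any.
function firstSelectedFile(event: FileSelectEvent): PickedFile | undefined {
  return event.detail.files.length > 0 ? event.detail.files[0] : undefined;
}

// Builds the URL under which the server is assumed to expose the uploaded file.
function buildImageUrl(baseUrl: string, fileName: string): string {
  return `${baseUrl}/upload/${encodeURIComponent(fileName)}`;
}

// `upload` stands in for building FormData and POSTing it to the server;
// `showImage` stands in for writing the URL into the observable bound to the UI.
async function selectFileListener(
  event: FileSelectEvent,
  upload: (file: PickedFile, params: typeof ociParams) => Promise<void>,
  showImage: (url: string) => void
): Promise<void> {
  const file = firstSelectedFile(event);
  if (!file) return; // nothing was picked
  await upload(file, ociParams);
  showImage(buildImageUrl(SERVER_URL, file.name));
}
```

Passing the upload and display steps in as callbacks keeps the listener easy to exercise without a browser; in the real view model they would be a fetch call with a FormData body and an observable setter.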

6. Create a server code base

  • Create a new folder named “aiservices” and run “npm init” in it to add package.json. Add a tsconfig.json as well, and install the required npm modules listed in the dependencies below.
  • Use ‘nodemon’ to start the server on port 3000 and restart it automatically when files change.

    screenshot

7. Configure Multer

In this blog, Express.js, or simply Express, is used as the back-end application framework for building RESTful APIs with Node.js. Any other back-end framework could be used instead.

Multer, an npm package, is used in this blog to handle file uploads to the server. Multer is a Node.js middleware for handling multipart/form-data. The upload API takes the POST data, extracts the content, and calls a custom ‘uploadtobucket’ function to upload the file to the OCI bucket. The file is first stored in a temporary folder, ‘upload’, and is removed from it once processed.

Create a new ‘.ts’ file and name it “server.ts”.

screenshot
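The original server.ts is not reproduced here, but the upload flow it describes (receive the file, push it to the bucket, remove the temporary copy) might look like the sketch below. To keep it runnable without Express, Multer, or OCI installed, the bucket upload and file removal are injected as hypothetical `deps`; the route path `/upload`, the form field name `file`, and the response bodies are assumptions. The commented wiring uses the Express and Multer calls as I understand them.

```typescript
// Subset of the file object Multer attaches to the request as `req.file`.
interface UploadedFile {
  path: string;         // where Multer stored the temporary copy
  originalname: string; // file name as sent by the browser
}

// Injected dependencies (hypothetical names) keep the sketch self-contained.
interface UploadDeps {
  uploadToBucket: (filePath: string, objectName: string) => Promise<void>;
  removeFile: (filePath: string) => Promise<void>; // e.g. fs.promises.unlink
}

interface UploadResult {
  status: number;
  body: { message: string; objectName?: string };
}

async function handleUpload(
  file: UploadedFile | undefined,
  deps: UploadDeps
): Promise<UploadResult> {
  if (!file) {
    return { status: 400, body: { message: "No file received" } };
  }
  try {
    await deps.uploadToBucket(file.path, file.originalname);
    return { status: 200, body: { message: "Uploaded", objectName: file.originalname } };
  } finally {
    // Remove the temporary copy whether or not the upload succeeded.
    await deps.removeFile(file.path);
  }
}

// Possible wiring in server.ts (not executed here; route and field name assumed):
//   import express from "express";
//   import multer from "multer";
//   const upload = multer({ dest: "upload/" });
//   const app = express();
//   app.post("/upload", upload.single("file"), async (req, res) => {
//     const result = await handleUpload(req.file, deps);
//     res.status(result.status).json(result.body);
//   });
//   app.listen(3000);
```

Cleaning up in a `finally` block is a design choice of this sketch, so that a failed bucket upload does not leave files behind in the temporary folder.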

8. File upload to the OCI bucket using the Upload Manager API and TypeScript

  • Create a new ‘.ts’ file and name it ‘fileupload.ts’.
    screenshot
  • Add the upload code to fileupload.ts.
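As a rough guide to what fileupload.ts needs to do, the sketch below builds an upload request and hands it to an uploader. The request shape (`content.filePath` plus `requestDetails`) mirrors my understanding of the OCI TypeScript SDK's `UploadManager.upload()` and should be confirmed against the SDK reference; the `Uploader` interface and the helper names are hypothetical, introduced so the example runs without the SDK installed.

```typescript
// Request shape assumed to match what the SDK's UploadManager.upload() accepts.
interface BucketUploadRequest {
  content: { filePath: string };
  requestDetails: { namespaceName: string; bucketName: string; objectName: string };
}

// Anything with a compatible upload() method; the SDK's UploadManager is
// expected to fit, and a fake can be used in tests.
interface Uploader {
  upload(request: BucketUploadRequest): Promise<unknown>;
}

// Takes the last path segment, accepting both "/" and "\" separators.
function objectNameFromPath(filePath: string): string {
  const parts = filePath.split(/[\\/]/);
  return parts[parts.length - 1];
}

function buildUploadRequest(
  filePath: string,
  namespaceName: string,
  bucketName: string,
  objectName: string = objectNameFromPath(filePath)
): BucketUploadRequest {
  return {
    content: { filePath },
    requestDetails: { namespaceName, bucketName, objectName },
  };
}

// Uploads the file and returns the object name it was stored under.
async function uploadToBucket(
  uploader: Uploader,
  request: BucketUploadRequest
): Promise<string> {
  await uploader.upload(request);
  return request.requestDetails.objectName;
}

// Possible wiring with the SDK (not executed here):
//   import * as common from "oci-common";
//   import * as os from "oci-objectstorage";
//   const provider = new common.ConfigFileAuthenticationDetailsProvider();
//   const client = new os.ObjectStorageClient({ authenticationDetailsProvider: provider });
//   const uploader = new os.UploadManager(client);
//   await uploadToBucket(uploader, buildUploadRequest(tempPath, "my_namespace", "oracle_bucket", "visa.png"));
```

Because Multer stores uploads under generated names, passing the original file name as `objectName` explicitly is usually preferable to deriving it from the temporary path.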

9.  Run the application

In the aiservices application, start the server with this command:

npm start

screenshot

In the OJET application, serve the JET app:

ojet serve

Users can now browse for the visa document; the selected document is uploaded to the OCI bucket and shown on screen.

screenshot

Part 2

Step 10: Key-value extraction from a document using the TypeScript SDK APIs

  • Create an API that extracts key-value pairs from a document using the OCI Document Understanding API and TypeScript.
  • Create an “ai_keyvalueextraction.ts” file in the “aiservices” package and use the code below to extract key values from the document and return the response as JSON.

    screenshot
Code: ai_keyvalueextraction.ts
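The following is a minimal sketch of the two pieces such a file needs, not the original code: building the analysis request for a document stored in Object Storage, and flattening the result into a JSON-friendly key-value map. The request and response shapes (`KEY_VALUE_EXTRACTION`, `OBJECT_STORAGE`, `pages[].documentFields[]`, `fieldLabel.name`, `fieldValue.text`) reflect my understanding of the Document Understanding API and should be checked against the API reference; the helper names are hypothetical.

```typescript
// Assumed subset of the analysis result returned by Document Understanding.
interface DocumentField {
  fieldType: string;
  fieldLabel?: { name?: string; confidence?: number };
  fieldValue: { text?: string };
}
interface AnalyzeResult {
  pages?: Array<{ documentFields?: DocumentField[] }>;
}

// Builds a request asking for key-value extraction on a bucket object.
function buildKeyValueRequest(
  namespaceName: string,
  bucketName: string,
  objectName: string,
  compartmentId: string
) {
  return {
    analyzeDocumentDetails: {
      features: [{ featureType: "KEY_VALUE_EXTRACTION" }],
      document: { source: "OBJECT_STORAGE", namespaceName, bucketName, objectName },
      compartmentId,
    },
  };
}

// Collects every labeled KEY_VALUE field across all pages into one map.
function toKeyValues(result: AnalyzeResult): Record<string, string> {
  const keyValues: Record<string, string> = {};
  for (const page of result.pages ?? []) {
    for (const field of page.documentFields ?? []) {
      const name = field.fieldLabel?.name;
      if (field.fieldType === "KEY_VALUE" && name) {
        keyValues[name] = field.fieldValue.text ?? "";
      }
    }
  }
  return keyValues;
}

// Possible wiring with the SDK (not executed here; names as I understand them):
//   import * as aidocument from "oci-aidocument";
//   const client = new aidocument.AIServiceDocumentClient({ authenticationDetailsProvider: provider });
//   const response = await client.analyzeDocument(buildKeyValueRequest(ns, bucket, objectName, compartmentId));
//   const keyValues = toKeyValues(response.analyzeDocumentResult);
```

If the same label appears on more than one page, this sketch keeps the last value seen; a real implementation might instead keep the value with the highest confidence.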

Step 11: Update server.ts

code screenshot

Step 12: Update dashboard.ts to call getObjectKeyvalue from the model layer and map the result to UI observables. Create an interface for each attribute.

code screenshot
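One way to picture the mapping step is shown below. The attribute names in `VisaForm` and the labels in `LABEL_TO_ATTRIBUTE` are hypothetical examples, since the actual labels depend on what the service returns for your documents; the sketch maps into a plain object, and in dashboard.ts each value would then be written into its Knockout observable.

```typescript
// Hypothetical attributes for the visa form shown on screen.
interface VisaForm {
  surname: string;
  givenName: string;
  passportNumber: string;
  expiryDate: string;
}

// Hypothetical mapping from extracted labels (lowercased) to form attributes.
const LABEL_TO_ATTRIBUTE: Record<string, keyof VisaForm> = {
  "surname": "surname",
  "given name": "givenName",
  "passport number": "passportNumber",
  "date of expiry": "expiryDate",
};

// Copies recognized labels into the form model and ignores everything else.
function mapToVisaForm(keyValues: Record<string, string>): VisaForm {
  const form: VisaForm = { surname: "", givenName: "", passportNumber: "", expiryDate: "" };
  for (const label of Object.keys(keyValues)) {
    const attribute = LABEL_TO_ATTRIBUTE[label.trim().toLowerCase()];
    if (attribute) {
      form[attribute] = keyValues[label];
    }
  }
  return form;
}

// In the view model, each attribute would typically be an observable, e.g.:
//   this.surname = ko.observable("");
//   ...
//   const form = mapToVisaForm(response);
//   this.surname(form.surname);
```

Normalizing labels with `trim()` and `toLowerCase()` makes the lookup tolerant of small differences in how a label is printed on the document.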


Step 13: Update dashboard.html and construct a layout in which the document at the retrieved object URL is placed beside the UI form that shows the extracted data.

screenshot

Serve the project again.

screenshot of passport

Photo by Scott Graham on Unsplash.

Namit Kakkar

