Telecommunications (telco) companies deal with massive amounts of data, including social media monitoring, customer information, network traffic, and operational data. With the growing demand to increase customer satisfaction, enhance customer loyalty, and provide a personalized customer experience, telcos need efficient solutions for intelligent contact centers, proactive social media engagement, brand reputation protection, and sound decision-making at every point in the customer lifecycle.
Telcos face several specific challenges when using call center interactive voice response (IVR) systems and social media interactions for customer support and marketing. First, speech recognition and natural language processing (NLP) technologies might not always accurately interpret customer speech, accents, or dialects, leading to miscommunication and frustration. Second, social media interactions require real-time monitoring and response capabilities to address customer inquiries, complaints, and feedback effectively. Third, for broader marketing, developing compelling and visually appealing content requires creativity and expertise in multimodal formats, including text, images, videos, and interactive elements.
To address these and other challenges, telcos can use GPUs on Oracle Cloud Infrastructure (OCI) to power machine learning (ML) and generative AI for various applications, such as network optimization, fraud detection, customer experience, and personalized marketing. GPUs on OCI excel in training and deploying deep learning models because of their parallel processing architecture, making them suitable for handling complex ML tasks in telecommunications. Generative AI can assist in generating content for social media posts and marketing materials, and in generating scripts for agents in call centers to improve customer service responses.
In this blog post, we explore how GPU-powered OCI instances can benefit telco customers with accelerated performance, faster time to insights, and increased efficiency and employee productivity for manually intensive tasks. We simulate the following IVR personalization use case: enhancing an IVR system's natural language understanding capabilities so that it interprets and responds to customer queries more accurately and conversationally.
This use case uses distinct AI models that consume resources independently from one another. Moreover, it can apply different AI models, such as JAIS for Arabic, and different ML libraries, such as PyTorch, Hugging Face, and so forth.
However, the throughput of incoming queries varies for each AI model, so we evaluated the projected throughput of each model independently to size its resources appropriately. The example below demonstrates a logical chain for IVR personalization that maps the AI model onto the necessary number of instances of that model.
IVR Personalization:
The logical chain includes the following nuances:
After completing the investigation, we mapped 24 instances of a single AI model for IVR personalization to handle the volume of consumer demands.
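The mapping from projected throughput to instance count can be sketched as a back-of-the-envelope calculation. Both figures below are illustrative assumptions, not numbers from the actual investigation; they are chosen only to reproduce the 24-instance result:

```python
import math

# Illustrative sizing sketch: both figures are assumptions,
# picked to arrive at the 24 instances reported by the investigation.
peak_queries_per_second = 1200   # assumed peak IVR query load
per_instance_qps = 50            # assumed sustained throughput of one model instance

# Round up: a fractional instance still requires a whole deployment.
instances = math.ceil(peak_queries_per_second / per_instance_qps)
print(instances)  # 24
```

In practice, per-instance throughput would be measured by benchmarking the model on the target GPU shape before committing to a layout.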
Now that we know we’ll have to run up to 24 instances of AI model for IVR Personalization at a time, we can calculate how many GPU devices these will consume.
Assuming that we use Llama 2 70B with a GPU RAM footprint of 320 GB, we consume up to 320 GB * 24 = 7,680 GB of GPU RAM.
Provided that we use the BM.GPU.H100.8 compute shape, with eight GPU devices of 80 GB each (640 GB of GPU RAM per instance), we must deploy 7,680 / (80 GB * 8) = 12 compute instances. This configuration goes on a production tier for processing up to 10M inference requests in the IVR personalization use case.
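The GPU RAM arithmetic above can be written out as a small sizing helper, using only the figures stated in the text (the 320 GB model footprint is the assumption made for Llama 2 70B):

```python
# Capacity sizing for the IVR personalization production tier.
MODEL_GPU_RAM_GB = 320         # assumed Llama 2 70B footprint per model instance
MODEL_INSTANCES = 24           # concurrent model instances needed for peak load
GPU_RAM_PER_DEVICE_GB = 80     # one NVIDIA H100 GPU
DEVICES_PER_NODE = 8           # BM.GPU.H100.8 shape carries eight GPUs

total_gpu_ram_gb = MODEL_GPU_RAM_GB * MODEL_INSTANCES        # 7,680 GB required
node_gpu_ram_gb = GPU_RAM_PER_DEVICE_GB * DEVICES_PER_NODE   # 640 GB per node

# Ceiling division: partially filled nodes still count as whole nodes.
nodes_needed = -(-total_gpu_ram_gb // node_gpu_ram_gb)

print(total_gpu_ram_gb, nodes_needed)  # 7680 12
```

The same helper can be rerun with a different model footprint, for example a quantized checkpoint, to see how the node count shrinks.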
Having estimated twelve compute instances for 10M requests in production, we have an opportunity to optimize the total amount of resources. In the following table, storage is mirrored in real time between the primary and disaster recovery (DR) sites:
| Tier | Primary Site | DR Site |
|---|---|---|
| PROD | 12x OCI BM.GPU.H100.8 | 1x OCI BM.GPU.A10.4, 12x OCI BM.GPU.H100.8 |
| PREPROD | 9x of DR prod utilized | |
| QA | 2x of DR prod utilized | |
| DEV | 1x of DR prod utilized, 1x OCI BM.GPU.A10.4 | |
We aim to spread the production, pre-production, quality assurance (QA), and development tiers across the primary and disaster recovery facilities. This layout requires a full duplication of the production tier on the disaster recovery site to enable a quick switchover.
Because the duplicated production tier is expected to stay idle 99% of the time, we can reuse its resources for the remaining tiers: pre-production, QA, and development. The only thing that differentiates production is its maximum (production) throughput of requests, while the hardware and software configuration remains the same.
So, we can reuse the disaster recovery production tier and saturate the following resources:
The following diagram shows the solution architecture for this use case. Head nodes 1 and 2 are used for login purposes and are configured for failover. The server node is in a private network with compute options (BM.GPU.A10.4, BM.GPU.4.8, and BM.GPU.H100.8) locally available to the server node for AI model training.
We are deploying two BM.GPU.A10.4 compute instances for fine-tuning and are distributing these across primary and DR as well.
This layout applies the following assumptions:
Even though these assumptions are not practical for every customer, they still allow customer-specific tuning, such as extending QA to make it constantly available across sites at the cost of pre-production mimicking a smaller share of the production throughput. Ultimately, this layout allows for 32% cost-cutting (25 BM.GPU.H100.8 compute instances instead of 37), while providing continuous development, a switchover time of a few seconds, and zero data loss.
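The quoted saving follows directly from the two node counts stated above; a quick check:

```python
# Cost comparison: full per-tier duplication vs. reusing the DR production tier.
full_duplication_nodes = 37   # dedicated BM.GPU.H100.8 nodes for every tier
reuse_layout_nodes = 25       # preprod, QA, and dev reuse the DR prod nodes

savings = 1 - reuse_layout_nodes / full_duplication_nodes
print(f"{savings:.0%}")  # 32%
```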
GPUs on Oracle Cloud Infrastructure can offer telcos significant performance improvements, scalability, and cost-efficiency across various tasks ranging from data processing and analysis to AI, video streaming, virtualization, and cybersecurity. Integrating GPU technology into telco infrastructure can help drive innovation, improve service quality, and meet the growing demands of modern telecommunications networks and services.
To learn more, see the following resources:
I bring more than 19 years' experience, with 2 years in end-to-end deployment of generative AI and LLM-based AI/ML on OCI, and the rest in high-performance computing infrastructure and solution development across a variety of cloud platforms. Experienced in the data center, automotive, financial, and healthcare domains, my areas of expertise include HPC infrastructure architecture, design and solution development with application profiling and benchmarking, scientific application deployments, and business systems.
I have been adding value in the high-performance computing world since 2007, with a focus on cloud since 2013 and GPU and AI infrastructure solution design since 2021. Building on my mathematics background at university, I have accomplished missions in different business development and technical presales roles throughout those years. In 2022, as an Oracle employee, I discovered a new way to solve our customers' problems even better: once we take HPC and GPU technologies out of their conventional context and apply them to enterprise problems, we create innovative solutions with enhanced performance and scalability. This has now become my brand and the heart of a community of like-minded innovators at Oracle, called HPC4Enterprise. Together, we make the world more efficient as we multiply the power of HPC technologies by solving the practical needs of enterprises.
Khalid Odeh is a Master Principal Account Cloud Engineer at Oracle, working closely with strategic customers to inspire them by demonstrating Oracle's unique value proposition and developing end-to-end integrated solutions that address all their needs and match their business transformation strategy, goals, and vision.
Khalid has been with Oracle for 7 years, having previously held the position of Cloud Architect leader for Oracle Apps on OCI. Before Oracle, Khalid was a technical consulting manager implementing innovative solutions for a wide range of customers.