A key step in machine learning model development is optimizing hyperparameters. Learn how ADSTuner streamlines this process.
Speed up data science performance. Oracle Cloud Infrastructure (OCI) Data Science offers an NVIDIA RAPIDS environment for running machine learning on GPUs.
Manage the lifecycle of Conda Environments with Oracle Cloud Infrastructure Data Science. Conda Environments provide the right level of isolation and flexibility for many machine learning use cases.
Here are road-tested techniques experts shared at Oracle's Make Machine Learning Work for You event to get started with powerful platforms and popular open source tools.
This blog post covers a few options that are available to a data scientist who wants to parallelize a workload done on a data frame. It covers approaches that offer multi-threading and multi-processing execution. Each method provides benchmarks in terms of speed of execution that you can run in your notebook session.
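As a rough illustration of the chunk-and-distribute idea behind that post, here is a minimal sketch that uses only the Python standard library. It is not the post's own code: a plain list of dicts stands in for a data frame, and the names `split_into_chunks`, `add_total`, and `parallel_apply` are hypothetical. It uses a thread pool; `concurrent.futures.ProcessPoolExecutor` exposes the same `map` interface for the multi-processing variant.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(rows, n_chunks):
    """Split a list of rows into at most n_chunks contiguous, roughly equal parts."""
    size = max(1, -(-len(rows) // n_chunks))  # ceiling division
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def add_total(chunk):
    """Hypothetical per-chunk workload: add a 'total' column to every row."""
    return [{**row, "total": row["price"] * row["qty"]} for row in chunk]


def parallel_apply(rows, func, n_workers=4):
    """Apply func to each chunk in a thread pool and stitch the results back in order."""
    chunks = split_into_chunks(rows, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Executor.map yields results in the same order as the input chunks.
        results = list(pool.map(func, chunks))
    return [row for chunk in results for row in chunk]


# Stand-in "data frame": ten rows with two made-up columns.
rows = [{"price": float(i), "qty": i % 3 + 1} for i in range(10)]
result = parallel_apply(rows, add_total)
```

Threads mostly help when the per-chunk work releases the GIL (I/O or native library calls); for CPU-bound pure-Python work, the process-based executor is usually the one worth benchmarking.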
After building and training machine learning models in Oracle Cloud Infrastructure Data Science, deploy them in Oracle Functions for faster machine learning results.
A short piece on how to use Oracle Cloud Data Science Services in conjunction with Oracle IoT and 5G solutions to accelerate the Experience Economy.
Applying for data science jobs? Check out this 365 Data Science research on the education, work experience, and skills that employers are seeking for data scientist jobs in 2020.
BotSupply's multilingual AI chatbots, which run on Oracle Cloud, help fight online abuse around the world.
Learn six ways the AI-powered Oracle Digital Assistant with pre-built app skills and the ability to customize can drive efficiency and productivity, improve customer and employee experience, and support digital transformation.
Run large-scale machine learning and deep learning models with the latest NVIDIA GPU offering from Oracle Cloud Infrastructure Data Science.
Today we are pleased to announce the availability of Tribuo, a Java Machine Learning (ML) library, as open source. We’re releasing it under an Apache 2.0 license on GitHub for the wider ML community to use.
Artificial intelligence and machine learning promise to bring unparalleled potential for HR. Both emerging technologies also bring ethical considerations for CHROs.
In our last blog [1], we covered several housekeeping items for your business applications that you and your team could tackle during this stay-at-home period. One of the suggestions was learning about an emerging technology, such as AI, IoT, or digital assistants. In this blog, I’m going to dig deeper into specific emerging technologies and how you can apply an emerging technology to add extra value to your company.

First, learn about your company

My first recommendation is to gain a better understanding of your company. If you’re working for a publicly traded company, a good place to start is the 10-K. It’s a document that your company is required to file every year with the SEC. It details financial performance, earnings per share, the organizational structure, subsidiaries, executive compensation, and much, much more information. The purpose of the 10-K is to make investors aware of the inner workings of a company so they can make timely buy/sell decisions. It’s a great place for you to gain a big-picture understanding of your company over and above what you may know about your specific line of business, geographical region, or the particular division in which you work.

Next, learn more about your company’s new objectives

Most likely, these will be internal documents, and they may be readily available. The strategic goals may have shifted, so be aware. These will help you understand the path forward: how your company plans to grow, new markets the leadership wants to enter, and possibly new products on the horizon. These strategic priorities may also give you an indication of challenges, struggles, and points of pain that your company needs to address during this time.

Apply emerging technologies to your company’s objectives

Once you have an understanding of your company’s growth strategies and challenges, you can apply the information you learn about emerging technologies to help solve a challenge, streamline a process, or improve the employee or customer experience. This is your opportunity to think outside of the box as they say, but be sure that your ideas are tied to strategic objectives. Here are some examples of emerging technologies in action that may get you thinking about ideas for your company.

Artificial Intelligence for Sales and Marketing

The current crisis situation has literally shifted markets in a matter of days. Some segments of the economy went from tremendous growth to double-digit losses in a week. Your company may need to shift its sales strategy in the short term to find ways to recover lost revenue or tap into new markets to keep pace with economic shifts. AI solutions apply machine learning algorithms to your customer data to identify an ideal customer profile and then match that profile to trusted 3rd-party data to create a target list of prospects for your sales and marketing teams.

AI for Supply Chain

Similar to entire markets that shifted quickly, suppliers all over the world found themselves dealing with shipping and receiving restrictions that may have prevented them from fulfilling orders. That situation may have put your company in scramble-mode as you tried to find new suppliers. Depending on your industry, component parts may have spiked in price, impacting sales projections and profit margins. An AI application can analyze your ERP data, including current suppliers, POs, invoices, payables, etc., and compare company information with 3rd-party data to give you insight into your supplier ecosystem. It can also help you rank suppliers, figure out where you can negotiate better discounts, and create models to help your company optimize procurement processes.

Digital Assistants/Chatbots for Workforce or Customer Experience

This conversational interface uses machine learning and natural language processing to guide people to the right source of information or the appropriate live support person, or to walk someone through a process. Digital assistants can be a relatively quick and easy way to fill a gap as your company shifts staff with people working from home. Chatbots can also greatly improve the user experience for both internal and external use cases. For example, your company can add a digital assistant to your employee portal to help people find specific HCM benefits information, or you can add a digital assistant to the home page of your website to help people get to the right resource to help with a specific question or issue.

Learning More

Please read my last blog that is referenced at the end of this article for more information on emerging technologies. There is a lot of free content online, from YouTube tutorials to online classes. According to a recent cnbc.com article, many 4-year colleges are offering free online classes, so people can learn while staying safe at home. You can search for the online classes about emerging technologies that fit your interests and company objectives. I hope this gives you inspiration to learn something new so you can apply emerging technologies to develop professionally and to help tackle the priorities and challenges in your company.

For more information on a complete suite of cloud-based applications that includes emerging technologies, go to www.oracle.com/applications

[1] https://blogs.oracle.com/saas/tackling-your-cloud-applications-to-do-list
Resource principals provide easier, more secure authentication for data science resources. Oracle Cloud Infrastructure Data Science makes resource principals readily available, and it secures resource principals for you.
The latest release includes resource principals in notebook sessions, accumulated local effects (ALEs) in MLX, a new "what-if" scenario diagnostic in MLX, and ADS updates.
By Suhas Uliyar, Vice President, AI and Digital Assistant

AI-based chatbots or digital assistants stand to change the way we interact with business applications, not just consumer ones. The main benefit is the ability to get immediate responses to queries via natural local language, without having to download apps or get training. While we have the freedom to engage in user-friendly experiences in our personal lives – such as Alexa and Siri – there have been few options for people in their professional lives. But that’s changing. As Steve Miranda, Oracle’s executive vice president of application development, remarked, “In HR, every common question or transaction has lent itself nicely to digital assistants. Within the next year, we will be calling HTML our ‘old UI.’ Every transaction you have will be through a digital assistant UI.”

Work-at-home requirements associated with the spread of COVID-19 have made it all the more important to give employees easy access to ever-changing information – on company policies, insurance coverage, and public health guidance, in addition to the usual cadence of questions on vacation balances, status of expenses, and IT workarounds. Here are a few key ways in which chatbots and digital assistants can help.

An assistant for every employee

Finding answers to simple questions can be a frustrating experience if there is no easy way to do so. Take, for example, basic questions like “how many vacation days do I have left?” or “what do I do if I have a change in marital status?” In some cases, employees need to log into their VPN to find the policy document, a web page, or the application, which they then need to navigate further to find answers to these straightforward questions. With a digital assistant, employees can simply speak the question out loud in a natural way or type it in, instead of having to navigate multiple screens or interfaces, and they will receive an immediate response.

Not only that, the digital assistant can further help them by recommending or taking action as a follow-up to their original interaction and be a true assistant for the employees. For example, rather than just informing the employee on what to do to change their marital status, the digital assistant can actually trigger the change process by gathering the necessary information and then updating the relevant systems with that information.

Answering general policy questions

With rapidly evolving governmental directives such as sheltering-in-place and social distancing, most organizations are quickly adapting their HR policies and guidelines. At the same time, employees need help and answers from their organizations more than ever. Questions may range widely from policies on employment, travel guidelines, and health and safety instructions to guidelines on dealing with and working during the pandemic. In some cases, the information is very dynamic and changes by the minute. Digital assistants give employees a consistent channel, which is available 24x7, to ask their questions so they can get an immediate response – while freeing up the HR and IT/support teams to manage the more complex challenges they are facing today. In fact, you can also use digital assistants to send proactive alerts and notifications like changes in policies, so that employees don’t need to keep checking or searching for the latest information time and again.

Supporting employee health and safety

Practicing social distancing has also had an impact on recruitment, onboarding, and training processes for organizations. In effect, these processes provide resources and support that most organizations may seriously need in these uncertain times. Using a digital assistant, businesses can drive candidates’ pre-screening and interview scheduling online, across any messaging channel. You can drive virtual onboarding by enabling easy remote online access to relevant trainings, policies, and materials all via a digital assistant. Data can also be safely recorded to keep track of employee health status based on the organization’s health policy and guidelines. A digital assistant can also save the employee from the time-consuming task of completing forms or reporting on any health-related issues at work.

Employee self-service

Whether working remotely or on-site as needed, employees may need access to both information and processes beyond just the HR systems. From submitting expenses to filing IT support tickets to making changes to travel plans, we touch a number of systems or applications as employees. Some processes even span multiple systems, like role- or location-based expense reimbursement policies, where the system requires role information from the HR system before interacting with the finance/ERP system for reimbursement. A digital assistant is one common interaction point for employees, contractors, or partners across multiple applications and can provide a quick, consistent, and concise response.

Leading an organization through this unprecedented time has put an increased demand on the HR function. As a result, organizations would be wise to leverage AI-powered technologies such as digital assistants to scale their functions, create online connection and engagement, and provide dynamic updates on policies and safety guidance without bogging down human communication channels – which need to be available for essential tasks.

A digital assistant can support an organization by providing benefits such as: lowering operational costs via online self-service and automation; expanding HR availability 24x7 across different channels; enabling easy access to information and processes delivered via text or natural language; delivering consistent information and maintaining employee engagement; and enabling proactive HR outreach.

Digital assistants can support the functions employees may need now while creating efficiencies for the long term. For more information or to discuss how a digital assistant can support your needs, email us here. Stay well, and be safe.
Previous deep learning models for auto composing have often used a huge number of parameters and decreased inference speed. Learn how parallel decoders with parameter sharing enable auto composition while using fewer parameters to compose understandable, reasonable compositions.
Oracle’s innovation is leading the way in artificial intelligence and machine learning-powered applications and platforms. Learn more about how Oracle’s embedded AI is giving companies more value from their data by using big data, advanced analytics, and modern machine learning to enhance business and IT operations.
One challenge for deep learning models is adjusting to change while maintaining previous learning. Read how knowledge distillation for incremental learning can help.
When searching for a new machine learning platform, a data science trial can help you find the solution that solves the everyday problems you experience as a data scientist so you can successfully drive business outcomes within your organization.
At their core, data science platforms have tools that need to support open-source library languages and frameworks. Collaborative platforms should offer a rich portfolio of integrated products and components that help with various stages of the machine learning lifecycle.
Follow this data science tutorial on how to execute ad hoc Python processes in Oracle Cloud Infrastructure Data Science without leaving the notebook session environment.
Ovum, a leading analyst firm and part of the global technology research organization, Omdia, has recognized Oracle Digital Assistant as a leader in the market in its latest research report, "Ovum Decision Matrix: Selecting an Intelligent Virtual Assistant Solution, 2020–21." The report analyzes the evolution of virtual intelligent assistants, the increasing scope of use cases, and the market landscape, and evaluates 10 niche and large technology vendors, naming Oracle Digital Assistant as one of the leaders in this market.

Oracle Digital Assistant is a comprehensive, AI-powered conversational interface for business applications. Oracle Digital Assistant interprets the user’s intent so it can automate processes and deliver contextual responses to their voice or text commands to enrich the user experience, eliminate helpdesk and support overhead, and enable scale for communications and engagement.

The Ovum report specifically highlights Oracle Digital Assistant as an easy-to-build solution, thanks to its no-code, design-by-example Conversational Design Interface that is intended to be used by non-developers to build, train, test, deploy, and monitor AI-powered digital assistants on channels of choice. Ovum also noted Oracle Digital Assistant’s advanced linguistic and deep learning-based natural language processing (NLP) models as a key strength that enables the Digital Assistant to better understand domain-specific vocabulary, and respond with contextual information and best next step actions accordingly.

Oracle Digital Assistant also received kudos in the report for providing an “enterprise-ready” solution. Organizations leveraging Oracle Digital Assistant know that their data is their own, stored securely in Oracle Cloud or via Cloud@Customer for organizations wanting to keep their data within their own boundaries. Furthermore, because it is a comprehensive platform, Oracle Digital Assistant can integrate with existing processes, routing rules, and contact center agents to support enterprises’ unique business needs. In fact, Ovum noted that “A differentiator for ODA [Oracle Digital Assistant] is that a business process engine sits beneath it and is tightly integrated to perform tasks emerging from the conversation. For example, when an end user informs the ODA of a change of address, several relevant processes kick in. Oracle's ODA and business process management R&D teams are also tightly integrated because of the overlap in functions.”

Oracle also offers out-of-the-box chatbot skills for Oracle Cloud HCM, Cloud ERP, and Cloud CX, as well as integration with Oracle CX Service to speed up deployment and provide seamless engagement for Oracle Cloud Applications customers – a point that was noted as a strength in the Ovum report.

With no apps to download and no training needed to use Oracle Digital Assistant, the use of intelligent assistants has picked up quite significantly in the industry. Over the past years more and more organizations – both public sector and commercial – have come to rely on Oracle Digital Assistant for their needs. Common use cases include enabling easy and 24x7 access to employee HR self-service functions and employee expense and finance functions, and offering customer or employee FAQs and information lookup. This enables Oracle Digital Assistant to be the first line of customer/employee helpdesk and drive seamless bot-agent handoff only where needed, and more. These use cases present massive sales and ROI opportunities, freeing up human resources to take on the more complex challenges while at the same time improving the user experience.

Oracle’s leadership position in the Ovum report is a testament to the significant R&D investments in its AI- and NLP-powered Cloud service in recent years. For more information on how your organization can leverage Oracle Digital Assistant, please visit our website. And to download the full report, click here.
AutoML automates common, repetitive machine learning steps. It expedites the capabilities of data scientists, just like machine learning itself.
Access the latest Oracle Cloud Infrastructure Data Science release, including a TensorFlow 2.0 upgrade, accessing Vault and Streaming from your notebook, and new launcher buttons to access notebook examples.
Data scientists can spend weeks tuning machine learning for specific needs. Artificial intelligence can do the job many times faster, Oracle Labs finds.
Research from Oracle and Enterprise Strategy Group (ESG) shows that organizations adopting emerging technologies are reaching new heights in efficiency.
Information architecture (IA) structures data for AI to use, so that your business can unlock AI-powered insights from data.
This blog explains Self-Attention and Multi-head Self-Attention, then shows their use as a replacement for conventional RNN-based models.
Learn about BiDirectional Attention Flow, a Natural Language Processing model for connecting the query and context within Question Answering.
Oracle Cloud Infrastructure VM for Data Science and AI offers data scientists a powerful cloud-based alternative to develop AI applications quickly and efficiently.
Want to become a data scientist? 365 Data Science provides their research on a typical data scientist's skills in 2020.
AI and ML can transform a data catalog into an enterprise-class engine for holistic insight. A data catalog can dramatically accelerate the time to value from data science initiatives.
Today, robotics, AI, and machine learning can be found outside of Star Wars movies. Yet, do you really understand the difference between the three technologies? Or what projects are under way at Oracle to build next-generation solutions?
When it comes to AI and machine learning, it's sometimes a Wild West out there. How can university business officers navigate the hype? We have some advice for Higher Education from the recent Educause Horizon Report.
Accelerated Data Science (ADS) provides a single library that covers all the steps in the lifecycle of predictive machine learning models.
Learn how the Bayes factor can help maintain fairness in machine learning models for domains like financial services and hiring.
Are fairness and accuracy inversely related in machine learning? Learn more in this blog post based on a NeurIPS 2019 paper.
85% of big data projects fail by imploding under their own complexity. Oracle Cloud Infrastructure Data Flow represents a more efficient, more secure, and simply easier way to take projects to completion.
Oracle announces Oracle Cloud Infrastructure Data Catalog, a brand new tool to manage and govern your big data. Organize, enrich, search, and consolidate data in a way that expedites and optimizes your data lake.
Oracle Cloud Infrastructure Data Science is an enterprise-grade data science service where teams of data scientists can collaborate to build, train, and deploy machine learning models.
See how Deep Learning can help reduce traffic congestion and pollution by providing real-time parking availability.
A new survey shows companies that have embraced emerging technologies are growing their profits 80% faster than peers who haven’t.
Learn about the unpredictability of intelligent systems, which poses a challenge to AI Safety by limiting the ability to understand the impact of intelligent systems.
Attend our Feb. 4 webinar to learn how to overcome key roadblocks in the AI development lifecycle and make AI a sustainable competitive advantage for your company.
Learn how to overcome AI roadblocks, and how to unite data scientists and business users for better AI adoption.
Artificial intelligence usage will grow in 2020. Here are some predictions of what to expect with AI in finance, supply chain, sales, marketing, and HR.
For a successful and scalable data science team, don’t leave it all to your data scientists. Data engineers and machine learning engineers can improve outcomes.
Learn how machine learning model lifecycle management compares to database patch lifecycle management.
Learn how automation, robots, and artificial intelligence are already changing how relationships between workers and managers will look in the future workplace.
Learn how the DataFox Data Cleaning team manages company data to help customers get a unique, more complete view of their market.
Due to the volume, heterogeneity, and interdependency of data, IoT benefits cannot be derived without artificial intelligence. This is why every Oracle IoT SaaS application is built with AI at its core.
Evaluate which machine learning models elicit the greatest profit for your business with the Expected Value Framework.
How can CFOs harness the power of new digital technologies such as AI and machine learning? Here are six strategies to help your organization compete in the digital age of finance.
Artificial Intelligence and the Oracle CX Cloud – The Power Behind Informed and Productive Human Interactions
Recommendation systems are key to solving recruitment challenges. Learn how we use Cold Start to enhance the process.
We’ve all heard news stories about self-driving cars or DeepMind beating the Go world champion. But is AI up to the hype? We separate reality from myth.
Amidst rising user privacy concerns, developers are working to protect data at the software level. Learn how differential privacy can help.
To learn how Oracle Cloud Platform and its technologies can help you in your cloud journey, please attend the Platform as a Service Cloud Day, a free virtual event hosted by the Quest Oracle Community.
Businesses will reap more success from AI projects by setting short-term, achievable goals instead of pursuing extremely ambitious ones, industry executives advise.
At my first Oracle OpenWorld as a recent employee, the excitement about artificial intelligence was everywhere: observations, customer stories, and product announcements.
Surprises in learning models can expand knowledge, but they also increase risk. Read insights from Roman Yampolskiy on how to make AI safer.
Graph data processing is already an integral part of big-data analytics, with applications in domains including Finance, Cyber Security, Compliance, Retail, and Health Sciences. The adoption of graph processing is expected to grow further in the coming years, partly because graphs naturally represent data that captures fine-grained relationships among entities. Graph analysis can provide valuable insights about such data by examining these relationships. Oracle Labs PGX has been providing graph solutions for both Big Data and Relational Database customers. In this post, I will describe our new distributed graph traversal solution, which significantly improves the performance and memory consumption of Oracle PGX's in-memory distributed graph query engine. That is especially true for very large graph queries, where our competitors either fail to execute due to memory usage (see the performance figures later in the post) or fall back to slow and inefficient disk-based joins.

Typically, graph analysis is performed with two distinct but correlated methods, namely computational analysis (a.k.a. graph algorithms) and pattern matching queries. Most graph engines nowadays, such as Oracle Labs PGX and Apache Spark GraphX/GraphFrames, support both graph algorithms and graph queries. With computational analysis, the user executes algorithms that traverse the graph, often repeatedly, and calculate certain (numeric) values to get the desired information, e.g., PageRank or shortest paths. Pattern matching queries are given declaratively as graph patterns. The system finds every subgraph of the target graph that is topologically isomorphic/homomorphic to the query graph and satisfies any accompanying filters. For example, a PGQL (Property Graph Query Language) query can return the persons p1 (called "John Doe") and p3 who have the largest number of common friends. Such queries can be used, for example, for friend recommendation.
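As a rough illustration of what the common-friends pattern described above computes, here is a minimal Python sketch on a tiny in-memory graph. The data, the `friends` dictionary, and the `top_common_friends` helper are hypothetical and chosen only for this example; this is not PGQL and not how PGX evaluates the query.

```python
from collections import Counter

# Hypothetical toy data: person -> set of people they list as friends.
friends = {
    "John Doe": {"Ann", "Bob", "Cid"},
    "Eve": {"Ann", "Bob"},
    "Dan": {"Bob"},
    "Ann": {"Bob"},
}

def top_common_friends(friends, person):
    """Rank every other person p3 by the number of friends p2 shared with `person`.

    Mirrors the pattern (p1)-[:friend]->(p2)<-[:friend]-(p3), where p2 must
    differ from both p1 and p3.
    """
    mine = friends.get(person, set())
    counts = Counter()
    for other, theirs in friends.items():
        if other == person:
            continue
        shared = (mine & theirs) - {person, other}
        if shared:
            counts[other] = len(shared)
    # most_common() sorts by descending count.
    return counts.most_common()

print(top_common_friends(friends, "John Doe"))
```

Even on this toy input, every candidate p3 has to be intersected with p1's friends, which hints at why such queries become expensive on large graphs.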
Graph queries are a very challenging workload because they focus on the connections in the data. By following connections, i.e., edges, graph query execution can explore large parts of the graph, generating large intermediate and final result sets with a combinatorial explosion effect. For example, on a very old snapshot of Twitter (known as the "Twitter graph" in academic graph-related research papers), a single-edge query (e.g., (v0)→(v1)) matches the whole graph, producing 1.4 billion results. A two-edge query (e.g., (v0)→(v1)→(v2)) returns more than nine trillion matches. Additionally, graph queries can exhibit extremely irregular access patterns and therefore require low-latency data accesses. For this reason, high-performance graph query engines try to keep data in main memory and scale out to a distributed system in order to handle graphs that exceed the capacity of a single node.

Graph data processing and querying is a growing market with applications in domains including Finance, Cyber Security, Compliance, Retail, and Health Sciences. These applications often require querying very large graph data quickly and efficiently. Oracle has been providing graph solutions for both Big Data and Relational Database audiences, while there are also commercial competitors such as Amazon Neptune and Neo4J, as well as open source alternatives including Spark GraphFrames. Our approach can significantly differentiate Oracle's solution from those competitors. It significantly improves the performance and memory consumption of Oracle's current in-memory distributed graph query engine, especially on very large graph queries where our competitors either fail to execute due to memory usage (e.g., Spark GraphFrames) or fall back to slow and inefficient disk-based joins (Neptune or Neo4J).
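The jump from billions of one-edge matches to trillions of two-edge matches follows from simple degree arithmetic. The sketch below assumes homomorphic matching (v0 and v2 may be the same vertex) and uses a hypothetical `count_matches` helper on a synthetic hub graph; it is not taken from the post and says nothing about the real Twitter graph.

```python
from collections import defaultdict

def count_matches(edges):
    """Return (#matches of (v0)->(v1), #matches of (v0)->(v1)->(v2)).

    Under homomorphic matching, every vertex v1 contributes
    in_degree(v1) * out_degree(v1) two-edge matches.
    """
    out_deg = defaultdict(int)
    in_deg = defaultdict(int)
    for src, dst in edges:
        out_deg[src] += 1
        in_deg[dst] += 1
    one_edge = len(edges)
    two_edge = sum(in_deg[v] * out_deg.get(v, 0) for v in in_deg)
    return one_edge, two_edge

# Hypothetical hub graph: n followers point at a hub, which points at n others.
n = 1000
edges = [(f"in{i}", "hub") for i in range(n)] + [("hub", f"out{i}") for i in range(n)]
print(count_matches(edges))
```

With only 2,000 edges, the single hub already yields one million two-edge matches, which is the same effect that high-degree vertices produce on real social graphs.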
Traditional Distributed Graph Traversal Approaches

In a distributed system, graphs are typically partitioned across machines by vertices, meaning that each machine stores a partition of the vertices of the overall graph, plus the edges corresponding to those vertices. For example, in the graph below, machine 0 stores vertices v0, v1, and v2, while machine 1 holds vertices v3 and v4. The edge v0→v1 is local to machine 0, while the edge v2→v3 is remote, as it spans machines 0 and 1. For large distributed graphs, none of the traditional graph exploration/traversal approaches is suitable for distributed queries. Breadth-first traversals and distributed joins quickly explode in terms of intermediate results and pose a performance challenge over the network. Depth-first traversals are challenging to parallelize and result in completely random data access patterns. In practice, most engines use breadth-style traversals combined with synchronous/blocking communication across machines.

Breadth-First Traversals

In breadth-first traversals, the execution expands the query in width. The query pattern is matched to the target graph edge after edge. For example, matching pattern (a)→(b)→(c) to the example graph above could proceed by matching edge (a)→(b) to all graph edges, namely (v0)→(v1), (v0)→(v2), etc., and then expanding these intermediate results to match edge (b)→(c). Typically, the execution proceeds with synchronous traversals, i.e., the first edge is completely matched before moving on to the next edge. Expanding the query breadth-first is not ideal for a distributed system. First, materializing large sets of intermediate results at every step leads to an intermediate-result explosion. Second, although breadth-first traversals typically have the benefit of locality (i.e., accessing adjacent edges one after the other), locality in distributed graphs is much more limited, since many of the edges that are followed are remote.
As a result, a large part of the intermediate results produced at each part of the query must be sent to the remote machines, creating large network bursts.

Distributed Joins

Graph traversals can also be expressed as relational joins. Following the edge (a)→(b) can be mapped to a join (or two) between the "vertex table" (holding the graph vertices) and the "edge table" (holding the edges). Distributed joins face the same problems as breadth-first traversals, plus an additional important problem: they perform table joins instead of graph traversals on top of specialized graph data structures. Unsurprisingly, traversals over graph-specific data structures are much faster than generic joins (see the performance comparison of Oracle Labs PGX Distributed to Apache Spark GraphFrames later in this post).

Depth-First Traversals

In depth-first traversals, the execution expands the query in depth. The query pattern is matched to the target graph as a whole, result by result. For example, matching the pattern (a)→(b)→(c) to the example graph above could proceed by matching (v0)→(v1)→(v2), then (v0)→(v2)→(v3), etc. The main advantage of expanding the query depth-first is that intermediate results can be eagerly expanded to final results, thus reducing the memory footprint of query execution. Nevertheless, depth-first traversals have the disadvantages of not leveraging locality and of more complicated parallelism. The lack of locality in depth-first traversals results in "edge chasing" (i.e., following one edge after the other as dictated by the query pattern), thus not accessing adjacent edges in order. The complication for parallelism arises because the runtime cannot know in advance whether there is enough work for all threads at any stage of the query. For instance, a query like the `MATCH (p1:person)-[:friend]->(p2:person)<-[:friend]-(p3:person) WHERE p1 <> p2 AND p2 <> p3 AND p1.name = "John Doe"` that I described at the beginning of the post will probably produce a single match for (p1).
If this intermediate result is expanded in a depth-first manner, the number of intermediate results (and hence the available parallelism) will grow slowly.

Dynamic Asynchronous Traversals For Distributed Graphs

The Parallel Graph AnalytiX (PGX) toolkit, developed at Oracle Labs, can execute graph analysis in a distributed way (i.e., across multiple servers); we refer to this capability as PGX.D. We are experimenting in PGX.D with a new hybrid approach to executing graph traversals that offers the best of breadth-first and depth-first traversals.

Competing graph engines face the classic trade-off between performance and memory consumption for graph query execution:

Sacrifice performance: One option is to use a fixed memory area (typically several gigabytes) for the execution, but spill intermediate results that do not fit to disk.
Sacrifice memory: Another option is to perform the whole computation in memory. If the intermediate results do not fit in memory, this approach cannot compute this query on that graph.

PGX.D enables the in-memory execution of any-size query without sacrificing memory or performance. In particular, graph queries in PGX.D:

Can operate with a fixed, predefined amount of memory for storing intermediate results;
Only use this memory for computations, i.e., do not spill any intermediate results to disk; and
Can essentially calculate queries of any size, because intermediate results are turned into final results "on demand" to keep memory consumption within limits.

On the technical side, PGX.D achieves the aforementioned characteristics by deploying:

Dynamic traversals, using depth-first execution when needed, thus aggressively completing intermediate results and keeping the memory consumption within limits, and breadth-first execution when possible, thus removing the performance complexities of depth-first traversals.
Asynchronous communication of intermediate results from one machine to the other, thus not blocking/delaying local computation due to remote edges.
Flow control and incremental termination to keep global memory consumption (including messaging) within limits and guarantee query termination (i.e., avoid deadlocks).

Example

Consider matching the pattern (a)→(b)→(c) to our example graph presented above, and consider a worker thread that starts matching from vertex v0. The thread can bind v0 as (a) and then try to expand to the edge (a)→(b). The worker could match (b) with v1. At this point, the dynamic traversal approach in PGX.D decides, based on how much memory the query already consumes, whether the worker continues matching (a)→(b) edges (breadth) or continues with the (b)→(c) edge (depth). In either case, the worker will eventually match v3 for (b), in which case PGX.D simply buffers the intermediate match in a message destined for machine 1 and continues matching the next (a)→(b) edge in a breadth-expanding manner. Of course, PGX.D controls the number of outgoing messages / intermediate results in order to keep the execution within limits. You can find more details in our GRADES 2017 publication, which describes the main query runtime in PGX.D (it does not cover the local dynamic behavior, which is described in a follow-up publication currently under submission).

Comparing PGX.D to Open-Source Systems

We use the LDBC social network benchmark graph (scale 100, 283 million vertices, 1.78 billion edges) and queries. We adapted the queries to reflect the current features of PGX.D; these changes mainly include the removal of the HAVING clause, subqueries, and regular path queries. We compare PGX.D to Apache Spark GraphFrames (version 0.7 on top of Spark 2.4.1) and PostgreSQL (version 11.2). Both PGX.D and GraphFrames execute on 8 machines connected with InfiniBand. We perform 15 repetitions and report the median run.
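The measurement protocol above (repeat each query and report the median) can be sketched in a few lines of Python. The `median_runtime` helper and the placeholder workload are hypothetical; the post does not show the actual benchmark harness.

```python
import statistics
import time

def median_runtime(run_query, repetitions=15):
    """Run `run_query` `repetitions` times and return the median wall-clock time in seconds."""
    timings = []
    for _ in range(repetitions):
        start = time.perf_counter()
        run_query()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)

# Placeholder workload standing in for a real query call.
elapsed = median_runtime(lambda: sum(range(100_000)))
print(f"median: {elapsed:.6f} s")
```

Reporting the median rather than the mean keeps a single slow run (for example, a cold cache) from dominating the result.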
Clearly, PGX.D is significantly faster than both GraphFrames and the traditional PostgreSQL RDBMS. PGX.D executes the total query suite 29.5 and 17.5 times faster than GraphFrames and PostgreSQL, respectively. In addition, PGX.D is configured to use approximately 16GB of runtime memory for intermediate results, while the other two engines are configured to use the whole 756GB (8x for GraphFrames) of memory available in the underlying machines. As I mentioned earlier in this post, GraphFrames implements graph traversals on top of distributed joins on dataframes.

Large-Scale Queries

In this experiment, we evaluate the engines with very large queries. In particular:

Q1: Simple cycle; pattern (v1)→(v2)→(v1) with (a) a COUNT(*) aggregation and (b) AVG aggregations of vertex data;
Q2: Two-hop match; pattern (v1)→(v2)→(v3) with (a) a COUNT(*) aggregation and (b) AVG aggregations of vertex data.

We execute these queries on graphs of increasing size:

Graph            # Vertices   # Edges   Description
LiveJournal      484K         68.9M     Users and friendships
Uniform Random   100M         1B        Uniform random edges
Twitter          42.6M        1.47B     Tweets and followers
Webgraph-UK      77.7M        2.97B     2006 .uk domains

This experiment highlights the true need for the scalable in-memory distributed graph traversal methodology of PGX.D. As the query exploration size increases, GraphFrames and PostgreSQL cannot keep up with the workload. Even with the two smallest graphs, PGX.D is on average 48 and 115 times faster than GraphFrames and PostgreSQL, respectively. Clearly, joins in PostgreSQL are significantly slower than graph traversals in PGX.D. With Q2 on Twitter and Webgraph-UK, we see that even the 8 x 756GB ≈ 6TB of total memory (backed by 1+TB of disk) is not sufficient for GraphFrames. As in the previous experiment, PGX.D completes these queries with approximately 16GB of memory in each machine.

What's Next

We have explored dynamic asynchronous traversals in PGX.D only for graph querying and pattern matching with PGQL.
We are currently exploring how to further leverage these fast in-memory explorations for machine learning. For example, we are developing large-scale random walks on top of PGX.D that will serve as the backbone for graph machine-learning solutions.

Conclusions

I briefly presented our new dynamic asynchronous traversal approach for distributed graphs in PGX distributed mode. Using this approach, PGX.D achieves fast, scalable, fully in-memory distributed graph queries with a small memory footprint, thus enabling graph processing on a whole new scale of graphs and queries. The hybrid/dynamic traversal functionality is on the PGX.D roadmap, so stay tuned for new information on its availability. For more information and to try PGX, you can visit the Oracle Labs PGX Technology Network page.
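To make the dynamic traversal idea from this post more concrete, here is a minimal single-process Python sketch that counts matches of (a)→(b)→(c) while holding at most a fixed number of intermediate (a, b) matches. It expands in breadth while the buffer has room and switches to depth (completing buffered matches into final results) when the limit is reached. The `count_two_hop` function, the `budget` parameter, and the toy graph are assumptions made for illustration; the sketch omits partitioning, asynchronous messaging, and flow control, and it is not the PGX.D implementation.

```python
def count_two_hop(adj, budget):
    """Count matches of (a)->(b)->(c) with at most `budget` buffered partial matches.

    Returns (number of matches, peak buffer size).
    """
    if budget < 1:
        raise ValueError("budget must be at least 1")
    buffer = []  # intermediate (a, b) matches awaiting expansion to (c)
    total = 0
    peak = 0

    def drain():
        # Depth step: complete every buffered partial match into final results.
        nonlocal total
        while buffer:
            _a, b = buffer.pop()
            total += len(adj.get(b, ()))

    for a, neighbors in adj.items():
        for b in neighbors:
            if len(buffer) >= budget:
                drain()  # memory limit reached: go deep to free the buffer
            buffer.append((a, b))  # otherwise keep expanding in breadth
            peak = max(peak, len(buffer))
    drain()
    return total, peak

# Hypothetical toy graph (adjacency lists), not the exact graph from the post.
adj = {"v0": ["v1", "v2"], "v1": ["v2"], "v2": ["v3"], "v3": ["v4"], "v4": []}
print(count_two_hop(adj, budget=2))
```

The result is the same for any budget; only the peak number of buffered intermediate results changes, which is the trade-off the hybrid approach exploits.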
Oracle Enterprise Cloud is putting the power of artificial intelligence into the hands of customers today. Join us at Oracle OpenWorld this September to learn how, and check out these featured sessions and demos about Oracle AI Applications.
KDnuggets is where data professionals go for answers. Editor Matthew Mayo answers questions about data science trends.
Still holding off on adopting AI and other advanced technologies via the cloud? Here are 3 reasons that’s putting your business at risk.
Kirk Borne, Principal Data Scientist at Booz Allen Hamilton, breaks down pattern discovery for common business use cases.
Read about Victor Lu's experience with Oracle Database, from the evolution of the database optimizer to the ways that the Autonomous Database helps customers develop data science and machine learning solutions.