Training a Machine Learning Model on Time-Series Telemetry to Optimize Manufacturing

A blast furnace heats raw iron ore to produce molten pig iron, and this furnace heating is the first major step in the steel-making process that is illustrated in Fig. 1 below.

Alternative text for Figure 1: Schematic shows a skip-cart loading a charge of raw iron ore and coke (purified coal) into the top of a blast furnace that is heated from below via a hot air blast of temperature T ~ 2000 °F. This hot air blast heats the furnace and smelts the ore into molten pig iron, which flows out via the iron notch, while waste furnace gases exit at the top, where bleeder valves protect the furnace from sudden pressure surges.

The blast furnace at one particular steel-making Oracle cloud customer was prone to problematic 'bleeder' events, where the furnace gas pressure can surge without warning by 20-100% (see Fig. 2 below), which triggers a bleeder valve to vent excess furnace gas. Though rare, these bleeder events are undesired because they interrupt production and can have an environmental impact.

The furnace's subject-matter experts (SMEs) had not determined the root cause of these events, so a traditional machine learning (ML) model was trained on the furnace telemetry to forecast, 30 minutes in advance, whether a bleeder event would occur. That ML model was only a partial success. Although it did alert on most events in sufficient time, testing revealed that it also emitted three times as many false-positive alerts, and those false positives caused furnace operators to ignore the alerting system. This Oracle customer then asked for assistance with this use case. This blog describes the recommended solution, which was developed using the Oracle Cloud Infrastructure (OCI) Data Science service on the furnace's historical telemetry that the customer had archived in its data lake within OCI.
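To see why operators would tune out such an alerting system, consider a small arithmetic sketch. It assumes one reading of the test result above, namely three false alerts for every correct alert; the function name and the counts are hypothetical.

```python
def alert_precision(true_alerts, false_alerts):
    """Fraction of emitted alerts that correspond to a real bleeder event."""
    return true_alerts / (true_alerts + false_alerts)

# Assumed ratio: 3 false alerts per correct alert (hypothetical counts).
precision = alert_precision(true_alerts=1, false_alerts=3)
print(f"Share of alerts that are real: {precision:.0%}")
```

Under that assumption only one alert in four would be followed by an actual bleeder event, which is consistent with operators learning to ignore the alerts.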

First note that this steel maker had achieved only partial success with its ML forecasting system, which tells us that this manufacturing mishap is a probabilistic event and is not deterministic. In other words, the furnace data does provide useful hints about whether a mishap is more or less likely to happen, but that data is not sufficiently diagnostic to allow an ML model to forecast when that mishap would happen again with the desired certainty.

In this circumstance, the better remedy is to use ML to optimize furnace operations instead, namely, to train an ML model that recommends to the furnace operator those furnace settings that would preserve furnace production while steering clear of conditions that are more likely to lead to a mishap. The recommended solution trains an ML model to forecast a derived quantity called furnace-health h, which is computed from the furnace pressure p, the furnace's mass-production-rate ṁ, and the so-called mishap-factor f, which is almost always unity except during rare moments that are detailed below. The goal of this optimization strategy is to always seek those furnace settings that minimize h in the near future, which implies that production is sustained in a way that prevents p from ramping up rapidly.
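As a minimal sketch, a furnace-health score with the properties described above could be computed as follows. The form h = f * p / m_dot is an assumption, chosen only because it rises with pressure and with the mishap-factor and falls as production increases; it is not necessarily the formula used in the actual solution. The function name and the sample readings are hypothetical.

```python
def furnace_health(p, m_dot, f=1.0):
    """Assumed furnace-health score h = f * p / m_dot (lower is healthier).

    p     : furnace pressure
    m_dot : mass-production-rate (must be positive)
    f     : mishap-factor, 1 in normal operation
    """
    if m_dot <= 0:
        raise ValueError("production rate must be positive")
    return f * p / m_dot

# Hypothetical readings in arbitrary units.
h_normal = furnace_health(p=2.0, m_dot=4.0)             # f = 1, normal operation
h_flagged = furnace_health(p=2.0, m_dot=4.0, f=3.0)     # f = 3, flagged timestep
print(h_normal, h_flagged)
```

With this assumed form, minimizing h favors settings that keep production high while pressure stays low, and any timestep flagged with a larger f is penalized proportionally.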

ML for optimization: This new ML model is trained on a short list of its most impactful features X, namely, furnace pressure p, furnace production-rate ṁ, furnace-health h, plus a few other features relevant to furnace operations. The model is also trained on the furnace's wind-rate w, which is the main control lever that the operator uses to manage furnace operations, so w is the furnace setting that is to be optimized.

To make the model time-sequence aware, it is also trained on δX₋₁ = X − X₋₁, which measures how feature X changed from its value X₋₁ at the previous timestep (aka moment-in-time) to its current value X. And because we want to use this model as a recommendation engine, the model is also trained on δw₊₁, which is the change that the furnace operator will make to the wind-rate w during the subsequent timestep. Note that δw₊₁ is a forward-looking quantity that is known when training the model on historical data but is of course unknown when the model is used in production.

Lastly, this ML model is trained to predict h₊₂, which is the furnace-health two timesteps into the future. The result is an ML model whose only unknown input is δw₊₁, the wind-rate change that the operator could make during the subsequent timestep, and whose output is the predicted furnace-health h₊₂ two timesteps out. The furnace operator can therefore use the model to optimize furnace settings, since the optimal wind-rate change δw₊₁ is the value that minimizes h₊₂. Consequently, this optimization strategy always recommends the furnace setting that would steadily nudge the furnace towards a lower value of h, such that the furnace production rate is maximal while furnace pressure p is kept under control.
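The feature construction described above can be sketched in plain Python. This is an illustration under stated assumptions, not the customer's pipeline: the record keys ('p', 'm_dot', 'h', 'w'), the function name, and the sample telemetry are all hypothetical, and the records are assumed to be evenly spaced in time and sorted.

```python
def build_training_rows(records):
    """Build (features, targets) from time-ordered telemetry records.

    Each feature row describes timestep t: the current values, their change
    since t-1, and the wind-rate change made between t and t+1.
    The target is the furnace-health h observed at t+2.
    """
    names = ("p", "m_dot", "h", "w")
    features, targets = [], []
    # Timestep t needs one earlier record (t-1) and two later ones (t+1, t+2).
    for t in range(1, len(records) - 2):
        prev, now, nxt, later = records[t - 1], records[t], records[t + 1], records[t + 2]
        row = {name: now[name] for name in names}
        for name in names:
            row["d_" + name + "_prev"] = now[name] - prev[name]   # backward change
        row["d_w_next"] = nxt["w"] - now["w"]                     # forward-looking wind-rate change
        features.append(row)
        targets.append(later["h"])                                # furnace-health two steps ahead
    return features, targets

# Hypothetical telemetry in arbitrary units.
records = [
    {"p": 1.0, "m_dot": 4.0, "h": 0.25,  "w": 10.0},
    {"p": 1.2, "m_dot": 4.0, "h": 0.30,  "w": 11.0},
    {"p": 1.1, "m_dot": 4.0, "h": 0.275, "w": 10.5},
    {"p": 1.4, "m_dot": 4.0, "h": 0.35,  "w": 10.5},
    {"p": 1.6, "m_dot": 4.0, "h": 0.40,  "w": 10.0},
]
X_rows, y = build_training_rows(records)
print(len(X_rows), y)
```

Any regression model could then be fitted to these rows. The first and the last two records produce no rows because they lack either a previous timestep or a target two timesteps ahead.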

Note that the mishap-factor f appearing in the furnace-health score is always unity except when the furnace is one timestep away from a mishap, for which f = 3. When the ML model is trained on historical data, this setting flags those furnace operations that resulted in a bleeder event. So f is also a forward-looking quantity that ordinarily would not be known when the ML model is used to predict future furnace-health h₊₂. Nonetheless, setting f = 1 when using ML to optimize the furnace's δw₊₁ constrains the ML model to consider only those possible future states where the furnace is more than one timestep away from a bleeder event. This f = 1 requirement thus allows the optimization strategy to navigate the furnace around or away from a future mishap.
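The recommendation step can be sketched as a simple sweep over candidate wind-rate changes. This is a minimal illustration, assuming the current feature row was built with f = 1 and that the candidate changes are ones the operator is allowed to make. The function names, the feature key 'd_w_next', and the toy predictor that stands in for the trained model are all hypothetical.

```python
def recommend_wind_change(predict_h2, current_row, candidates):
    """Return the candidate wind-rate change that minimizes predicted h two steps out.

    predict_h2  : callable mapping a feature row (dict) to predicted furnace-health
    current_row : features known now (assumed to be computed with f = 1)
    candidates  : wind-rate changes the operator could make next timestep
    """
    best_dw, best_h2 = None, float("inf")
    for dw in candidates:
        trial = dict(current_row, d_w_next=dw)   # fill in the single unknown input
        h2 = predict_h2(trial)
        if h2 < best_h2:
            best_dw, best_h2 = dw, h2
    return best_dw, best_h2

def toy_model(row):
    """Stand-in for a trained regressor; predicted health is lowest at d_w_next = 0.5."""
    return 0.3 + (row["d_w_next"] - 0.5) ** 2

candidates = [-1.0, -0.5, 0.0, 0.5, 1.0]
best_dw, best_h2 = recommend_wind_change(toy_model, {"p": 1.2, "w": 11.0}, candidates)
print(best_dw, best_h2)
```

A grid sweep is used here because the model has only one free input. A finer grid or a one-dimensional optimizer would work equally well.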

Next step: testing the solution. Lastly, note that an ML optimizer is a recommendation engine whose efficacy can only be assessed while the solution is used in production. That testing strategy is known as A/B testing, and it involves temporarily exposing the ML model's recommended furnace settings to the furnace operator and then monitoring whether production is sustained while bleeder frequency is reduced. This is the Oracle customer's next step, the results of which will be reported in a follow-up blog.
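One way such an A/B comparison of bleeder frequency might be evaluated is sketched below. It uses a conditional binomial test for comparing two event rates, which is a choice made here for illustration and not a method stated by the customer. The function name, event counts, and operating hours are hypothetical, and bleeder events are assumed to occur independently.

```python
from math import comb

def rate_reduction_p_value(events_a, hours_a, events_b, hours_b):
    """One-sided p-value that period B's event rate is lower than period A's.

    Conditional on the total event count n, equal rates imply
    events_b ~ Binomial(n, hours_b / (hours_a + hours_b)).
    The p-value is the probability of seeing events_b or fewer.
    """
    n = events_a + events_b
    q = hours_b / (hours_a + hours_b)
    return sum(comb(n, k) * q**k * (1 - q) ** (n - k) for k in range(events_b + 1))

# Hypothetical counts: A = operator alone, B = operator with ML recommendations.
p_value = rate_reduction_p_value(events_a=12, hours_a=1000.0, events_b=4, hours_b=1000.0)
print(f"p-value for reduced bleeder rate: {p_value:.4f}")
```

A small p-value would suggest that the reduction in bleeder frequency is unlikely to be chance alone. Production during the two periods would need to be compared separately to confirm that output was sustained.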

Joe Hahn

Senior Data Scientist
