Method of Transfer Learning for a Specific Production Process of an Industrial Plant

- ABB Schweiz AG

A method of transfer learning for a specific production process of an industrial plant includes providing data templates defining expected data for a production process, and providing plant data, wherein the data templates define groupings for the expected data according to their relation in the industrial plant; determining a process instance and defining a mapping with the plant data; determining historic process data; determining training data using the determined process instance and the determined historic process data, wherein the training data comprises a structured data matrix, wherein columns of the data matrix represent the sensor data that are grouped in accordance with the data template and wherein rows of the data matrix represent timestamps of obtaining the sensor data; providing a pre-trained machine learning model using the determined process instance; and training a new machine learning model using the provided pre-trained model and the determined training data.

Description
CROSS-REFERENCE TO RELATED APPLICATIONS

This patent application claims priority to International Patent Application No. PCT/EP2021/058477, filed on Mar. 31, 2021, which claims priority to International Application No. PCT/EP2020/059169, filed on Mar. 31, 2020, each of which is incorporated herein in its entirety by reference.

FIELD OF THE DISCLOSURE

The present disclosure relates to method of transfer learning for a specific production process of an industrial plant, a use of a new machine learning model, trained by such a method, a data processing system, and a computer program.

BACKGROUND OF THE INVENTION

Looking at the current state of machine learning in industry, there is a growing interest in utilizing it for different useful applications. Machine learning-based industrial applications play a role in different tasks like predictive maintenance, process monitoring, and quality control. Across these different tasks, certain signals, such as temperature, pressure, flow, etc., can be shared, and thus enable knowledge transfer among tasks. However, building a machine learning model for a specific problem of an industrial plant and then transferring its learning by reusing it to solve a similar problem of another plant is not trivial. This is due to the fact that even similar tasks and plants still have different signal spaces.

Each time a new problem in industrial plants and their processes needs to be addressed using machine learning, it is required to go through the tedious and time-consuming tasks of training and validating the model. To decrease this effort and its cost, it would be advantageous to reuse prior learning and knowledge acquired on industrial plants and processes and incorporate them when training new models for similar problems. However, reusing machine learning models or parts of them is a complex task in itself and requires better organization of the analyzed input signals. This challenge can be even harder when applied to industrial applications that may involve several signals related to one process or plant.

BRIEF SUMMARY OF THE INVENTION

According to an aspect of the disclosure, a method of transfer learning for a specific production process of an industrial plant comprises the following steps. In a step, a plurality of data templates defining expected data for a production process are provided. In another step, plant data of the industrial plant, comprising data points of the specific production process, are provided, wherein the data points comprise information about input and output of the specific production process. The data template defines a grouping for the expected data according to their relation in the industrial plant. In another step, a process instance of the specific production process is determined, defining a mapping between the plant data and the expected data of the specific production process.

Historic process data, being historic sensor data relating to the specific production process, is determined, using the determined process instance. In another step, training data is determined using the determined process instance and the determined historic process data; wherein the training data comprises a structured data matrix, wherein columns of the data matrix represent the sensor data that are grouped in accordance with the data template and wherein rows of the data matrix represent timestamps of obtaining the sensor data. In another step, a pre-trained machine learning model is provided using the determined process instance. In another step, a new machine learning model is trained using the provided pre-trained model and the determined training data.

BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWING(S)

FIG. 1 is a schematic of a training process for transfer learning in accordance with the disclosure.

FIG. 2 is a diagram of a relation between the data template and the pre-trained machine learning model in accordance with the disclosure.

FIG. 3 is a schematic of an arrangement for reusing layers of a pre-trained machine learning model in accordance with the disclosure.

FIG. 4 is a flowchart for a method of transfer learning for a specific production process in accordance with the disclosure.

DETAILED DESCRIPTION OF THE INVENTION

The reference symbols used in the drawings, and their meanings, are listed in summary form in the list of reference symbols. In principle, identical assembly parts are provided with the same reference symbols in the figures.

Preferably, the functional modules and/or the configuration mechanisms are implemented as programmed software modules or procedures, respectively; however, one skilled in the art will understand that the functional modules and/or the configuration mechanisms can be implemented fully or partially in hardware.

FIG. 1 shows a schematic view of a training process for transfer learning. In one step S30, a process instance is created manually by a human who defines the mapping between industrial plant data P, in particular inputs/outputs (I/Os), of the industrial plant and the data templates T. In other words, from a plurality of generic templates T, comprising expected data of specific assets or production processes, one template T is selected corresponding to the industrial plant data P of the current industrial plant. Alternatively, the process instance is created automatically using digital P&IDs, I/O lists and, where available, the C&E matrices of the plant, by applying pre-defined rules that map sensor locations to data points in the data template T.
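The rule-based mapping from plant I/O tags to template slots can be sketched as follows. This is a minimal illustration, not the claimed implementation: the tag naming scheme, the rule table and the template slots are hypothetical assumptions standing in for real P&ID/I/O-list data.

```python
# A data template T: ordered slots of expected data, here for a
# hypothetical distillation column.
TEMPLATE = ["top_temperature", "bottom_temperature", "top_pressure", "reflux_flow"]

# Pre-defined rules mapping I/O tag prefixes to template slots
# (illustrative naming convention, not from the disclosure).
RULES = {
    "TI-TOP": "top_temperature",
    "TI-BOT": "bottom_temperature",
    "PI-TOP": "top_pressure",
    "FI-RFX": "reflux_flow",
}

def build_process_instance(io_list):
    """Map plant I/O tags to template slots; unmapped slots stay None."""
    instance = {slot: None for slot in TEMPLATE}
    for tag in io_list:
        for prefix, slot in RULES.items():
            if tag.startswith(prefix):
                instance[slot] = tag
    return instance

io_list = ["TI-TOP-4711", "PI-TOP-0815", "FI-RFX-0007"]
instance = build_process_instance(io_list)
# bottom_temperature remains unmapped (None) and could be flagged for
# manual review by a human expert.
```

An unmapped slot signals that the plant does not provide a measurement the template expects, which is exactly the kind of gap a human would resolve in the manual variant of step S30.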

In another step S40, historic process data H is extracted from a historian, in particular using the I/O names. In other words, the process instance reflects the current asset or production process of the current industrial plant on which the new machine learning model M is to be used. Thus, the process instance for example defines the names of inputs and outputs of the current industrial plant for which historical production data H can be determined.

In another step S50, a standard data matrix is built, in which columns represent the data points of the historical production data H and rows represent the timestamps of the corresponding sensor readings. The individual data points are subject to various data pre-processing steps: adapting the sampling frequencies to the standard matrix format, e.g., downsampling from seconds to minutes or upsampling from minutes to 30-second intervals; scaling the data to a 0-1 domain; optionally fusing missing data points from available data points, e.g., estimating a bottom-section temperature based on a top-section temperature; and removing outliers.
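The pre-processing steps of step S50 can be sketched with plain Python as follows. This is a simplified stand-in (fixed-factor resampling, min-max scaling, z-score outlier handling), not the disclosed implementation; a production system would operate on timestamped historian data.

```python
import statistics

def downsample(values, factor):
    """Average consecutive groups of `factor` samples, e.g. seconds -> minutes."""
    return [statistics.mean(values[i:i + factor])
            for i in range(0, len(values) - factor + 1, factor)]

def scale_0_1(values):
    """Scale a signal to the 0-1 domain via min-max scaling."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in values]

def remove_outliers(values, z=3.0):
    """Replace z-score outliers by the mean, so row alignment with the
    timestamp index of the data matrix is preserved."""
    mu, sd = statistics.mean(values), statistics.stdev(values)
    return [mu if sd > 0 and abs(v - mu) / sd > z else v for v in values]

def build_matrix(columns):
    """Build the standard data matrix: rows = timestamps,
    columns = signals in the template-defined order."""
    return [list(row) for row in zip(*columns)]

temp = scale_0_1(downsample([20, 22, 30, 32], factor=2))  # -> [0.0, 1.0]
flow = scale_0_1(downsample([5, 5, 10, 10], factor=2))    # -> [0.0, 1.0]
matrix = build_matrix([temp, flow])
```

Keeping every signal on the same timestamp grid and value range is what makes the resulting matrix semantically identical to the one the pre-trained model was trained on.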

In another step S60, a new model is trained starting from a pre-trained model Mp, using weights obtained from previous trainings and allowing the training process to adjust these weights according to the loss generated from data samples of the current plant. This may involve using all or parts of the pre-trained model. Optionally, certain layers of the network can be excluded from weight changes, i.e., frozen, e.g., keeping the top layer as it is, or different learning rates can be chosen across the layers of the network. These two options can be explored and optimized automatically using hyper-parameter optimization.

FIG. 2 shows a relation between the data template and the pre-trained machine learning model. The data template T is a list of data points, for example, I1: temperature values, I2: pressure values, I3: level alarms, and I4: valve positions, with information on their location in the process or asset (e.g., temperature on the top section of a processing column). For each prediction, the order of the training data is maintained across all training runs of the new machine learning model M, in other words the transferred learning model. In this way, the weights the pre-trained machine learning model Mp has obtained during training can still be mapped to the same meaningful features F1-F5 across all training runs.

FIG. 3 shows a schematic view of reusing layers of a pre-trained machine learning model. A new machine learning model M comprises a plurality of layers, in this case, a first layer L1, a second layer L2, a third layer L3 and a fourth layer Ln. The first layer L1, the second layer L2, the third layer L3 and the fourth layer Ln are pre-trained layers that have been trained with plant data for a first plant A. In other words, weights obtained by training the first layer L1, the second layer L2, the third layer L3 and the fourth layer Ln are already known to the new machine learning model M. However, when training the new machine learning model M with plant data of a second plant B, not all weights are adjusted. In this case, the first layer L1, the second layer L2 and the third layer L3 are frozen. In other words, those weights are not adjusted during training with the plant data of the second plant B.
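The layer reuse of FIG. 3 can be sketched with a toy model in which each "layer" carries a single weight and a frozen flag. This is an illustrative simplification, not a real neural network: the gradients are supplied by hand and stand in for gradients computed from the loss on plant B data.

```python
class Layer:
    """Toy stand-in for a network layer: one weight and a frozen flag."""
    def __init__(self, name, weight, frozen=False):
        self.name = name
        self.weight = weight
        self.frozen = frozen

def training_step(layers, gradients, lr=0.1):
    """Apply one gradient step; frozen layers keep their plant-A weights."""
    for layer, grad in zip(layers, gradients):
        if not layer.frozen:
            layer.weight -= lr * grad

# Layers L1-L3 are frozen (reused from plant A); Ln is retrained on plant B.
model = [
    Layer("L1", 0.5, frozen=True),
    Layer("L2", -0.2, frozen=True),
    Layer("L3", 0.8, frozen=True),
    Layer("Ln", 0.1, frozen=False),
]
training_step(model, gradients=[1.0, 1.0, 1.0, 1.0])
# Only Ln changes: 0.1 - 0.1 * 1.0 = 0.0; L1-L3 keep their weights.
```

In a real framework the same effect is typically achieved by marking frozen layers as non-trainable so the optimizer skips their parameters.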

If the new machine learning model M that has been trained with the data of the second plant B does not perform to a predetermined satisfaction, an iterative process is executed in which it is decided which parts of the pre-trained machine learning model Mp can be reused and which parts should be dropped and retrained. The performance of the new machine learning model M is determined in an evaluation process using a score model, for example classification, regression values or anomaly scores. In other words, if the new machine learning model M does not perform satisfactorily, a number of frozen layers is iteratively unfrozen and retrained.
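The iterative unfreezing can be sketched as a loop over freeze flags. The `evaluate` and `retrain` callables below are hypothetical stubs standing in for the real scoring model (classification accuracy, regression error, anomaly score) and the real training run; the toy score simply improves as more layers are allowed to adapt.

```python
def iterative_unfreeze(frozen_flags, evaluate, retrain, target=0.9, max_rounds=10):
    """Retrain, then unfreeze the deepest still-frozen layer until the
    evaluation score reaches `target` or nothing is left to unfreeze."""
    flags = list(frozen_flags)
    for _ in range(max_rounds):
        retrain(flags)
        if evaluate(flags) >= target:
            return flags
        # Unfreeze the last still-frozen layer and try again.
        for i in reversed(range(len(flags))):
            if flags[i]:
                flags[i] = False
                break
        else:
            break  # all layers already unfrozen
    return flags

# Toy stand-ins: each frozen layer costs 0.1 of score.
score = lambda flags: 1.0 - 0.1 * sum(flags)
result = iterative_unfreeze([True, True, True, False], score, lambda flags: None)
# -> [True, False, False, False]: two extra layers had to be unfrozen
```

Unfreezing from the deepest layer first matches the intuition that decision logic near the output is the most plant-specific, while early feature-extraction layers transfer best.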

FIG. 4 shows a schematic view of a method of transfer learning for a specific production process.

In a first step S10, a plurality of data templates T defining expected data for a production process are provided. In a second step S20, plant data of the industrial plant, comprising data points of the specific production process, are provided, wherein the data points comprise information about input and output of the specific production process. The data template defines a grouping for the expected data according to their relation in the industrial plant. In a third step S30, a process instance I of the specific production process is determined, defining a mapping between the plant data to the expected data of the specific production process.

Historic process data H, being historic sensor data relating to the specific production process, is determined in a fourth step S40, using the determined process instance I. In a fifth step S50, training data is determined using the determined process instance I and the determined historic process data H, wherein the training data comprises a structured data matrix, wherein columns of the data matrix represent the sensor data that are grouped in accordance with the data template T and wherein rows of the data matrix represent timestamps of obtaining the sensor data. In a sixth step S60, a pre-trained machine learning model Mp is provided using the determined process instance I. In a seventh step S70, a new machine learning model M is trained using the provided pre-trained model Mp and the determined training data.

In one embodiment, the data points comprise information about the specific production process, in particular an asset of the production process, with basic semantic information, for example sensor positions and/or sensor types.

The term “data templates”, as used herein, comprises a list of the typical data points or measurements that are typically available from an asset, e.g., a drive train (pump, motor, drive) or a distillation column (temperatures, levels, pressures and flows at different height levels). Furthermore, the data template places measurements that are related in proximity in the list, e.g., the speed setpoint of the drive, the voltage/current of the motor and the vibration of pump and motor are subsequent elements of the list.

When the data templates are determined, typical signal combinations are identified in the expected data. Those typical signal combinations are always grouped together in the training data. Further preferably, the grouped signals are disposed in neighboring columns of the data matrix. Thus, a machine learning model, in particular an artificial neural network, ANN, processes the grouped signals together, for example by convolutions, or the grouping controls the network architecture, in particular which data is convoluted with which data. Thus, the performance of the new machine learning model is improved. Further, transfer learning is facilitated.

In other words, typical signal combinations (A&E), e.g., 2× level, 2× pressure, temperature, inflow and outflow of a processing column, are identified. These signals are always grouped together in the plant data, e.g., in neighboring columns, so that an artificial neural network processes the data together, e.g., by convolutions, or so that the network architecture can be controlled, e.g., which data is convoluted. This helps the performance of the machine learning model and can also be used to facilitate transfer learning. If a new model is trained that also uses data from a processing column, the network architecture and weights from previously learnt models can be partially reused.
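The template-driven column ordering can be sketched as a reordering step before the data matrix is built. The template order below (a drive-train grouping, per the data-template example above) is an illustrative assumption.

```python
# Template order keeps related drive-train signals in neighboring columns,
# so a convolution over adjacent columns sees them together.
TEMPLATE_ORDER = [
    "drive_speed_setpoint",
    "motor_voltage",
    "motor_current",
    "pump_vibration",
    "motor_vibration",
]

def order_columns(plant_columns):
    """Return (name, values) pairs in template order; signals the plant
    does not provide are simply skipped."""
    return [(name, plant_columns[name])
            for name in TEMPLATE_ORDER if name in plant_columns]

cols = {
    "motor_current": [1.2, 1.3],
    "drive_speed_setpoint": [50, 50],
    "pump_vibration": [0.01, 0.02],
}
ordered = order_columns(cols)
# Columns now appear in template order regardless of the plant's own
# (arbitrary) ordering, which keeps features aligned across plants.
```

Because every plant's matrix follows the same template order, a weight learned for column k at plant A still refers to the same semantic signal at plant B.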

Digital libraries of data templates that define what data is expected from production processes are provided as inputs. Additionally, plant data, comprising a list of data points of a specific asset or process with basic semantic information, e.g., sensor positions and types, are provided. Further, historic process data from the current process to which the machine learning model is to be transferred are provided.

As an output, a new working machine learning model is achieved by tuning the pre-trained model to the current industrial plant. In addition, the new model is used to present the production process or asset status to the human user or to trigger automated actions, e.g., closing a valve.

In one embodiment, the data templates comprise digital libraries that define what data are expected from a production process.

In one embodiment, the data points comprise temperature values, pressure values, level alarms, valve positions.

In one embodiment, the pre-trained machine learning model has been trained from at least one asset or production process of an industrial plant.

In other words, the method provides working machine learning model by tuning a pre-trained machine learning model to the current industrial plant or in particular a component of the current industrial plant.

The described method allows for providing transfer learning for industrial applications based on data templates of industrial plant signals. Thus, an improved method for transfer learning for a specific production process of an industrial plant is provided.

In a preferred embodiment, determining the training data comprises pre-processing the historic process data, thereby standardizing a format of the training data.

Preferably, the pre-processing steps format the historic process data so that a data matrix is determined that is semantically identical to what the pre-trained model has been trained on. The determined data matrix is used as input for the new machine learning model during training to obtain predictions from the new machine learning model that are either displayed to a human user or used to trigger automatic actions.

In one embodiment, pre-processing the historic process data comprises adapting a sampling frequency to a standardized data matrix format.

In one embodiment, pre-processing the historic process data comprises scaling the historic process data to a 0-1 domain.

In one embodiment, pre-processing the historic process data comprises fusing missing data points of the historic process data from available data points of the historic process data.

In one embodiment, pre-processing the historic process data comprises removing outliers of the historic process data.

In one embodiment, the pre-trained model comprises weights, wherein training the new machine learning model comprises adjusting the weights.

In other words, the weights are obtained from previous trainings of the pre-trained model.

Preferably, the weights are adjusted according to the loss generated from data samples of the current industrial plant, on which the new machine learning model is trained.

In a preferred embodiment, the pre-trained machine learning model comprises at least one layer wherein training the new machine learning model comprises the following steps. In a step, each layer is categorised, using the determined process instance, in one of the categories frozen or non-frozen. In another step, the frozen layers of the pre-trained machine learning model are reused and the non-frozen layers of the pre-trained machine learning model are retrained.

Preferably, for each layer, it is determined if the layer is a frozen layer that is not retrained or a non-frozen layer that is retrained, using the corresponding data template.

Preferably, reusing the frozen layers allows the network architecture and/or weights from the pre-trained machine learning model to be used for training the new machine learning model.

Preferably, the determination of whether a layer is a frozen layer or a non-frozen layer is automatically optimized using hyper-parameter optimization.

Preferably, the retraining is performed in an iterative way where additional layers are retrained until a satisfactory level of performance is achieved.

Preferably, determining which layer is a frozen layer and which layer is a non-frozen layer is done based on the type of the layer. The aim is to retrain mainly the decision logic of the machine learning network. Usually, these layers have a different type of architecture (densely connected) than the previous layers (e.g., convolutional and pooling layers or recurrent layers). Further preferably, the determination is done by trying out the reuse of different layers and selecting the configuration that yields the best results (best performance on a test data set, e.g., measured as root-mean-square error for regression or accuracy for classification).
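The type-based freeze heuristic can be sketched as a simple classification of layer types. The type labels below are illustrative assumptions; a real implementation would inspect the actual layer classes of the network.

```python
# Feature-extraction layer types transfer well between plants and are
# reused (frozen); densely connected decision layers are retrained.
FEATURE_TYPES = {"conv", "pool", "recurrent"}

def freeze_plan(layer_types):
    """Return one flag per layer: True = frozen (reused), False = retrained."""
    return [t in FEATURE_TYPES for t in layer_types]

plan = freeze_plan(["conv", "conv", "pool", "dense", "dense"])
# -> [True, True, True, False, False]: only the dense decision layers
#    are retrained on the new plant's data.
```

This plan can serve as the starting point for the trial-and-error search described above, with hyper-parameter optimization refining which layers actually stay frozen.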

Thus, an automatic matching of reusable pre-trained machine learning models based on their data templates is provided.

In a preferred embodiment, the pre-trained machine learning model comprises at least one layer, wherein training the new machine learning model comprises the following steps: In a step, each layer is categorised, using the determined process instance, in one of the categories frozen or non-frozen. In another step, different learning rates are applied on the at least one layer depending on the determination if the layer is a frozen layer or a non-frozen layer.

In other words, different learning rates can be chosen across the layers of the pre-trained machine learning model.

Preferably, the determination of whether a layer is a frozen layer or a non-frozen layer is automatically optimized using hyper-parameter optimization.

Preferably, the retraining is performed in an iterative way where additional layers are retrained until a satisfactory level of performance is achieved.

In a preferred embodiment, the data points comprise input/output names of the specific production process, wherein the historic process data is determined using the input/output names.

In a preferred embodiment, training the new machine learning model comprises using the data matrix as input for the new machine learning model to obtain a prediction as output from the new machine learning model.

Preferably, the prediction comprises a classification, regression values and/or an anomaly score.

According to an aspect of the disclosure, the new machine learning model, trained by a method, as described herein, is used to provide status data of the industrial plant.

In other words, the working new machine learning model allows presenting a process status or an asset status of the industrial plant to a human user or to trigger an automated action, for example closing a valve of the industrial plant.

According to an aspect of the invention, a data processing system comprising means for carrying out the steps of a method, as described herein, is provided.

According to an aspect of the invention, a computer program comprising instructions, which, when the program is executed by a computer, cause the computer to carry out the steps of a method, as used herein, is provided.

LIST OF REFERENCE SYMBOLS

  • T data template
  • M new machine learning model
  • Mp pre-trained machine learning model
  • H historic process data
  • P plant data
  • l1 first list
  • l2 second list
  • l3 third list
  • l4 fourth list
  • F1 first feature
  • F2 second feature
  • F3 third feature
  • F4 fourth feature
  • F5 fifth feature
  • L1 first layer
  • L2 second layer
  • L3 third layer
  • Ln fourth layer
  • A plant data of a first plant
  • B plant data of a second plant

All references, including publications, patent applications, and patents, cited herein are hereby incorporated by reference to the same extent as if each reference were individually and specifically indicated to be incorporated by reference and were set forth in its entirety herein.

The use of the terms “a” and “an” and “the” and “at least one” and similar referents in the context of describing the invention (especially in the context of the following claims) are to be construed to cover both the singular and the plural, unless otherwise indicated herein or clearly contradicted by context. The use of the term “at least one” followed by a list of one or more items (for example, “at least one of A and B”) is to be construed to mean one item selected from the listed items (A or B) or any combination of two or more of the listed items (A and B), unless otherwise indicated herein or clearly contradicted by context. The terms “comprising,” “having,” “including,” and “containing” are to be construed as open-ended terms (i.e., meaning “including, but not limited to,”) unless otherwise noted. Recitation of ranges of values herein are merely intended to serve as a shorthand method of referring individually to each separate value falling within the range, unless otherwise indicated herein, and each separate value is incorporated into the specification as if it were individually recited herein. All methods described herein can be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The use of any and all examples, or exemplary language (e.g., “such as”) provided herein, is intended merely to better illuminate the invention and does not pose a limitation on the scope of the invention unless otherwise claimed. No language in the specification should be construed as indicating any non-claimed element as essential to the practice of the invention.

Preferred embodiments of this invention are described herein, including the best mode known to the inventors for carrying out the invention. Variations of those preferred embodiments may become apparent to those of ordinary skill in the art upon reading the foregoing description. The inventors expect skilled artisans to employ such variations as appropriate, and the inventors intend for the invention to be practiced otherwise than as specifically described herein. Accordingly, this invention includes all modifications and equivalents of the subject matter recited in the claims appended hereto as permitted by applicable law. Moreover, any combination of the above-described elements in all possible variations thereof is encompassed by the invention unless otherwise indicated herein or otherwise clearly contradicted by context.

Claims

1. A method of transfer learning for a specific production process of an industrial plant, comprising:

providing a plurality of data templates defining expected data for a production process;
providing plant data of the industrial plant, comprising data points of the specific production process, wherein the data points comprise information about input and output of the specific production process;
wherein the data template defines a grouping for the expected data according to their relation in the industrial plant;
determining a process instance of the specific production process, defining a mapping between the plant data and the expected data of the specific production process;
determining historic process data, being historic sensor data relating to the specific production process using the determined process instance;
determining training data using the determined process instance and the determined historic process data, wherein the training data comprises a structured data matrix, wherein columns of the data matrix represent the sensor data that are grouped in accordance with the data template, and wherein rows of the data matrix represent timestamps of obtaining the sensor data;
providing a pre-trained machine learning model using the determined process instance; and
training a new machine learning model using the provided pre-trained model and the determined training data.

2. The method of claim 1, wherein determining the training data comprises preprocessing the historic process data, thereby standardizing a format of the training data.

3. The method of claim 2, wherein preprocessing the historic process data comprises adapting a sampling frequency to a standardized data matrix format.

4. The method of claim 2, wherein preprocessing the historic process data comprises scaling the historic process data to a 0-1 domain.

5. The method of claim 2, wherein preprocessing the historic process data comprises fusing missing data points of the historic process data from available data points of the historic process data.

6. The method of claim 2, wherein preprocessing the historic process data comprises removing outliers from the historic process data.

7. The method of claim 1, wherein the pre-trained model comprises trained weights, and wherein training the new machine learning model comprises adjusting the trained weights.

8. The method of claim 1, wherein the pre-trained machine learning model comprises at least one layer, and wherein training the new machine learning model comprises:

categorizing each layer using the determined process instance in one of the categories frozen or non-frozen; and
reusing the frozen layers of the pre-trained machine learning model and retraining the non-frozen layers of the pre-trained machine learning model.

9. The method of claim 1, wherein the pre-trained machine learning model comprises at least one layer, and wherein training the new machine learning model comprises:

categorizing each layer using the determined process instance in one of the categories frozen or non-frozen; and
applying different learning rates on the at least one layer depending on the determination if the layer is a frozen layer or a non-frozen layer.

10. The method of claim 1, wherein the data points comprise input/output names of the specific production process, and wherein the historic process data is determined using the input/output names.

11. The method of claim 1, wherein training the new machine learning model comprises using the data matrix as input for the new machine learning model to obtain a prediction as output from the new machine learning model.

Patent History
Publication number: 20230023896
Type: Application
Filed: Sep 30, 2022
Publication Date: Jan 26, 2023
Applicant: ABB Schweiz AG (Baden)
Inventors: Benedikt Schmidt (Heidelberg), Ido Amihai (Heppenheim), Arzam Muzaffar Kotriwala (Ladenburg), Moncef Chioua (Montreal), Dennis Janka (Heidelberg), Felix Lenders (Darmstadt), Jan Christoph Schlake (Darmstadt), Martin Hollender (Dossenheim), Hadil Abukwaik (Weinheim), Benjamin Kloepper (Mannheim)
Application Number: 17/957,592
Classifications
International Classification: G05B 19/418 (20060101); G06N 20/00 (20060101);