SYSTEM AND METHOD FOR USING AI-BASED MODEL TO PREDICT BOREHOLE SIZE IN HORIZONTAL CARBONATE WELLS

Info

Publication number: 20240254868
Type: Application
Filed: Jan 27, 2023
Publication Date: Aug 1, 2024
Inventors: Lailaa Helmi Alshammasi (Qatif), Yacine Meridji (Dhahran), Majed Fareed Kanfar (Dammam), Lautaro Rayo (Dhahran)
Application Number: 18/102,293

Abstract

Some implementations provide a method that includes: accessing a stream of input data from logging tools in a first well-bore, wherein the stream of input data comprises measurements of bore sizes inside the first well-bore; splitting the stream of input data into a training set of input data and a testing set of input data; training a machine learning model using the training set of input data, wherein the machine learning model is configured to predict a bore size parameter based on input features of the training set of input data; evaluating the machine learning model using the testing set of input data; and in response to evaluating the machine learning model as satisfactory, applying the machine learning model to a newly received stream of input data from a second well-bore such that the bore size parameter of the second well-bore is determined independent of measurements of bore sizes inside the second well-bore.

Description

TECHNICAL FIELD

This disclosure generally relates to bore completion design in the context of geo-exploration for oil and gas.

BACKGROUND

Logging-While-Drilling (LWD) generates logs of measurements during a drilling operation at a well-bore. The logs can include real-time and in-situ recordings capable of revealing the characteristics of the well-bore and the surrounding reservoir. The drilling operation can also include drilling deviated wells and horizontal wells in addition to vertical wells.

SUMMARY

In one aspect, some implementations provide a computer-implemented method, that includes: accessing a stream of input data from logging tools in a first well-bore, wherein the stream of input data comprises measurements of bore sizes inside the first well-bore; splitting the stream of input data into a training set of input data and a testing set of input data; training a machine learning model using the training set of input data, wherein the machine learning model is configured to predict a bore size parameter based on input features of the training set of input data; evaluating the machine learning model using the testing set of input data; and in response to evaluating the machine learning model as satisfactory, applying the machine learning model to a newly received stream of input data from a second well-bore such that the bore size parameter of the second well-bore is determined independent of measurements of bore sizes inside the second well-bore.

Implementations may include one or more of the following features.

The machine learning model may include at least one of: a Random Forest (RF) model, or a XGBoost (eXtreme Gradient Boosting) model. The computer-implemented method may include: selecting the input features for the machine learning model. In the computer-implemented method, evaluating the machine learning model may include: computing a Root Mean Square Error (RMSE) between the predicted bore size parameter and an actual measurement; and comparing the RMSE with a pre-determined threshold. The computer-implemented method may further include: in response to evaluating the machine learning model as unsatisfactory, refining the machine learning model. In the computer-implemented method, refining the machine learning model may include at least one of: providing at least one additional input feature to the machine learning model, or replacing at least one input feature with a different input feature. In the computer-implemented method, refining the machine learning model may include: adjusting at least one parameter of the machine learning model. The stream of input data may include: logging data encoding a resistivity, a density, a neutron recording, and a gamma ray recording. The bore size parameter may include: at least one of: a maximum size, or a minimum size.

In another aspect, some implementations of the present disclosure include a computer system comprising one or more hardware processors configured to perform operations of: accessing a stream of input data from logging tools in a first well-bore, wherein the stream of input data comprises measurements of bore sizes inside the first well-bore; splitting the stream of input data into a training set of input data and a testing set of input data; training a machine learning model using the training set of input data, wherein the machine learning model is configured to predict a bore size parameter based on input features of the training set of input data; evaluating the machine learning model using the testing set of input data; and in response to evaluating the machine learning model as satisfactory, applying the machine learning model to a newly received stream of input data from a second well-bore such that the bore size parameter of the second well-bore is determined independent of measurements of bore sizes inside the second well-bore.

Implementations may include one or more of the following features.

The machine learning model may include at least one of: a Random Forest (RF) model, or a XGBoost (eXtreme Gradient Boosting) model. The operations may include: selecting the input features for the machine learning model. Evaluating the machine learning model may include: computing a Root Mean Square Error (RMSE) between the predicted bore size parameter and an actual measurement; and comparing the RMSE with a pre-determined threshold. The operations may further include: in response to evaluating the machine learning model as unsatisfactory, refining the machine learning model. Refining the machine learning model may include at least one of: providing at least one additional input feature to the machine learning model, or replacing at least one input feature with a different input feature. Refining the machine learning model may include: adjusting at least one parameter of the machine learning model. The stream of input data may include: logging data encoding a resistivity, a density, a neutron recording, and a gamma ray recording. The bore size parameter may include: at least one of: a maximum size, or a minimum size.

Implementations according to the present disclosure may be realized in computer implemented methods, hardware computing systems, and tangible computer-readable media. For example, a system of one or more computers can be configured to perform particular actions by virtue of having software, firmware, hardware, or a combination of them installed on the system that in operation causes or cause the system to perform the actions. One or more computer programs can be configured to perform particular actions by virtue of including instructions that, when executed by data processing apparatus, cause the apparatus to perform the actions.

The details of one or more implementations of the subject matter of this specification are set forth in the description, the claims, and the accompanying drawings. Other features, aspects, and advantages of the subject matter will become apparent from the description, the claims, and the accompanying drawings.

DESCRIPTION OF DRAWINGS

FIG. 1 is a diagram illustrating an example of a workflow according to some implementations of the present disclosure.

FIG. 2 is another diagram illustrating an example of a workflow process that operates on the data illustrated in FIG. 1.

FIG. 3 shows an example of training a machine learning model and testing the machine learning model according to some implementations of the present disclosure.

FIG. 4 shows another example of training a machine learning model and testing the machine learning model according to some implementations of the present disclosure.

FIG. 5 is an example of a flow chart according to some implementations of the present disclosure.

FIG. 6 is a block diagram illustrating an example of a computer system used to provide computational functionalities associated with described algorithms, methods, functions, processes, flows, and procedures, according to an implementation of the present disclosure.

Like reference numbers and designations in the various drawings indicate like elements.

DETAILED DESCRIPTION

A well-bore can have uneven size and shape. Knowledge of the borehole shape and profile can provide additional insight to petroleum engineers during drilling, for example, to implement accurate completion design packers. However, acquiring log measurements of size and profile presents additional cost and operational risk, for example, in horizontal development wells. As a practical matter, caliper logs may not always be feasible in every horizontal well. Moreover, the available caliper from logging while drilling (LWD) tools is a synthetic caliper that is generally qualitative.

During a drilling operation, completion design packers of a range of sizes are strategically placed in the development well for proper zonal isolation. However, packers have very limited tolerance and may not be placed in enlarged intervals. Water management caused by mismatched packers can be costly. Knowledge of the location of the washouts can lead to improved prediction of the presence of fractures. To this end, an AI-based caliper with sufficient accuracy but with no boots on the ground can be particularly advantageous.

Implementations of the present disclosure can generate real-time (or pseudo real-time) estimates of bore sizes with no added hardware cost. Significantly, the implementations can be based solely on basic logs acquired during a drilling operation and can be applied to legacy wells where the caliper logs were not acquired. The innovative, low-risk AI-based solution of the present disclosure can predict caliper logs (e.g., for estimating borehole profile) from, for example, basic logs of resistivity-density-neutron-GR (gamma ray), as the stream of input data is received at the drilling site. The AI-based caliper according to some implementations of the present disclosure can be integrated with petrophysical evaluation for optimum reservoir characterization to facilitate, for example, accurate well completion in improved and optimized hydrocarbon-bearing intervals so as to avoid water-producing zones and circumvent costly produced-water management. As a result, implementations can eliminate the need to acquire caliper logs, which require 24-36 hours of rig time, while reducing the carbon footprint, by using the AI-based pseudo logs as an alternative.

In more detail, the implementations can incorporate a machine learning model that has been trained on several wells where actual caliper logs and basic log data have been obtained for a full petrophysical analysis. The machine learning model utilizes existing field data to find hidden relationships between actual caliper measurements and basic LWD logs (e.g., resistivity, density, neutron, and gamma ray logs). In some cases, the model can use a total of twenty-two (22) input features. The machine learning model can include, for example, a random forest algorithm selected based on the best root mean square error (RMSE). Other methods such as XGBoost, SVM, or neural networks can also be used.

After obtaining the model using measured caliper logs and basic logs, some implementations can test the model blindly in nearby wells. Blind testing in nearby wells can validate the model by generating a thorough comparison between measured calipers and AI-calipers in a controlled setting. The comparison reveals an excellent match between caliper measurements and AI-based calipers, attesting to the robustness and utility of the obtained model.

Implementations of the present disclosure can be used for both oil and gas carbonate bearing reservoirs when dealing with horizontal wells with LWD T-combo (Triple-combo: resistivity-density-neutron-GR) data.

The terminology used in the present disclosure includes the following terms.

The term “machine learning analytics” refers to the use of machine learning and applied statistics to predict unknown conditions based on the available data. Two general areas that fall under machine learning analytics are classification and regression. While classification refers to the prediction of categorical values, regression connotes the prediction of continuous numerical values. One machine learning implementation is also known as “supervised learning” where the “correct” target or y values are available. For illustration, the goal of some implementations is to learn from the available data to predict the unknown values with some defined error metrics. In supervised learning, for example, there are a set of known predictors (features) x₁, x₂, . . . , x_m, which are known to the system, as well as the target values y₁, y₂, . . . , y_n, which are to be inferred. The system's objective is to train a machine learning model to predict new target values y₁, y₂, . . . , y_n by observing new features.
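For illustration only, the supervised regression setting described above can be sketched with a single feature and an ordinary least-squares line. The function names and the numeric values below are assumptions introduced for this example and are not taken from the disclosure; the disclosed implementations use other models and many more features.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b with a single feature."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = y_bar - a * x_bar
    return a, b

def predict(model, x):
    """Apply the fitted line to a new feature value."""
    a, b = model
    return a * x + b

# Made-up training pairs: x is a known feature, y is the known target.
train_x = [1.0, 2.0, 3.0, 4.0]
train_y = [3.0, 5.0, 7.0, 9.0]  # generated from y = 2x + 1

model = fit_line(train_x, train_y)
new_y = predict(model, 5.0)  # target inferred for a feature value not seen in training
```

The training pairs play the role of wells where the target was measured, and the final call plays the role of a well where only the feature is available.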

The implementations can employ a variety of machine learning algorithms. For classification, examples of prediction algorithms can include logistic regression, decision trees, nearest neighbor, support vector machines, K-means clustering, boosting, and neural networks. For regression, examples of prediction algorithms can include least squares regression, Lasso, and others. The performance of an algorithm can depend on a number of factors, such as the selected set of features, training/validation methods, and hyper-parameter tuning. As such, machine learning analytics can manifest as an iterative approach of knowledge finding that includes trial and error. An iterative approach can repeatedly modify data preprocessing and model parameters until the result achieves the desired properties.

FIG. 1 shows a diagram 100 illustrating an example of a workflow process according to some implementations of the present disclosure. As illustrated, diagram 100 may operate on an input stream of well-log data from a wellbore, shown as input log 101. Well-log data can present concise and detailed plots of formation parameters measured versus the depth where the measurements are taken. From these plots, interpretations are often performed to recognize the significance of each measurement. For context, logging tools can record the magnitude of a specific formation property, such as resistivity, measured as the tool traverses an interval defined by depth; a well log is a chart that shows the value of that measurement plotted versus depth. Implementations may incorporate resistivity, density, neutron, and gamma ray logs. The resistivity log may present a measure of resistivity measured in ohm·m²/m, which is usually referred to simply as ohm·m, as a function of depth. The ability to conduct electrical current is a function of the conductivity of the water contained in the pore space of the rock. Fresh water does not conduct electricity; however, the salt ions found in most formation waters do. Thus, unless that water is fresh, water-saturated rocks have high conductivity and low resistivity. Hydrocarbons, which are nonconductive, cause resistivity values to increase as the pore spaces within a rock become more saturated with oil or gas.

The density log may present a measurement of the electron density as a function of measurement depth, for example, based on gamma ray (GR) measurements using a counter device. For example, the logging device may include a contact tool that emits gamma rays from a source. Emitted gamma rays collide with formation electrons and scatter. The detector, located a fixed distance from the tool source, counts the number of returning gamma rays. The number of returning gamma rays is an indicator of formation bulk density. The litho-density tool (LDT) also provides a photoelectric (P_e) cross section curve, an independent indicator of lithology.

The neutron log can measure hydrogen concentration in a formation. The logging device is a noncontact tool that emits neutrons from a source. Emitted neutrons collide with nuclei of the formation and lose some of their energy. Maximum energy loss occurs when emitted neutrons collide with hydrogen atoms because a neutron and a hydrogen atom have almost the same mass. Therefore, most neutron energy loss occurs in the part of the formation that has the highest hydrogen concentration. Neutron energy loss can be related to porosity because in porous formations, hydrogen is concentrated in the fluid filling the pores. Reservoirs whose pores are gas filled may show a lower neutron porosity reading than the same pores filled with oil or water because gas has a lower concentration of hydrogen atoms than either oil or water.

As mentioned above, mechanically-induced instabilities during the drilling process can perturb the local stress equilibrium. As a result, pre-existing stress can be redistributed around the wellbore, through which the stresses become concentrated in the rock adjacent to the wellbore wall. In cases where the stress concentration is too large for the rock to withstand, the rock around the wellbore can fail. While drilling fluid can provide hydrostatic pressure to carry some of the load previously carried by the excavated rock, the stress concentration around the wellbore can change, depending on the amount of hydrostatic pressure exerted by the drilling fluid in the wellbore. Implementations of the present disclosure can predict bore sizes (e.g., in horizontal wells) such that completion design packers can be installed with fitting sizes at intervals.

Further referring to FIG. 2, diagram 200 shows an example of a workflow that operates on input log 101. Here, basic logs 201 encode input log 101 and correspond to massive amounts of incoming data from sensors, e.g., at the bit level of a shaft. In various implementations, this incoming data is streamed from a range that can span kilometers underground or inside the wellbore. The speed and volume of such data can only be handled by processing machines, such as processors and dedicated hardware.

Based on basic logs 201 that encode input log 101, various implementations can predict a bore size parameter, for example, using machine learning models (202). In particular, implementations may distill the input log 101 for input features that are more relevant than those unselected. For example, implementations may focus on twenty-two (22) input features from the input log 101. As illustrated in box 102, the input features may include: resistivity log, photoelectric log, and density log from four quadrants (namely, bottom quadrant, upper quadrant, left quadrant, and right quadrant), volume_calcite, volume_dolomite, volume_anhydrite, borehole azimuth (HAZI), borehole deviation (DEVI), TVD (true vertical depth), neutron-density separation (NDS), gamma ray (GR), neutron (NPHI), and deep resistivity (ILD). The following table shows an example of twenty-two features used by some implementations of the present disclosure.

LWD LOG NAME | DESCRIPTION | LOG TYPE
ROBB | DENSITY TOOL BOTTOM QUADRANT | MEASURED
ROBL | DENSITY TOOL LEFT QUADRANT | MEASURED
ROBR | DENSITY TOOL RIGHT QUADRANT | MEASURED
ROBU | DENSITY TOOL UPPER QUADRANT | MEASURED
DRHB | DENSITY CORRECTION BOTTOM QUADRANT | MEASURED
DRHL | DENSITY CORRECTION LEFT QUADRANT | MEASURED
DRHR | DENSITY CORRECTION RIGHT QUADRANT | MEASURED
DRHU | DENSITY CORRECTION UPPER QUADRANT | MEASURED
PEB | DENSITY PHOTO ELECTRIC READING BOTTOM QUADRANT | MEASURED
PEL | DENSITY PHOTO ELECTRIC READING LEFT QUADRANT | MEASURED
PER | DENSITY PHOTO ELECTRIC READING RIGHT QUADRANT | MEASURED
PEU | DENSITY PHOTO ELECTRIC READING UPPER QUADRANT | MEASURED
VOL_ANHYDRITE | VOLUME ANHYDRITE FROM MULTIMINERAL ANALYSIS | CALCULATED
VOL_CALCITE | VOLUME CALCITE FROM MULTIMINERAL ANALYSIS | CALCULATED
VOL_DOLOMITE | VOLUME DOLOMITE FROM MULTIMINERAL ANALYSIS | CALCULATED
HAZI | BOREHOLE AZIMUTH FROM DIRECTIONAL SURVEY | MEASURED
DEVI | BOREHOLE DEVIATION FROM DIRECTIONAL SURVEY | MEASURED
LOG10-TVD | LOG10 OF TRUE VERTICAL DEPTH | CALCULATED
NDS | DENSITY NEUTRON SEPARATION (NDS = DENSITY POROSITY − NEUTRON POROSITY) | CALCULATED
GR | GAMMA RAY LOG | MEASURED
NPHI | NEUTRON POROSITY | MEASURED
LOG10-ILD | LOG10 OF DEEP RESISTIVITY | MEASURED
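As a minimal sketch, the three transformed or derived entries of the table (LOG10-TVD, NDS, and LOG10-ILD) could be computed from their underlying inputs as follows. The function name, the argument names, and the sample values are assumptions made for this example; the disclosure does not specify units or an implementation, and VOL_ANHYDRITE, VOL_CALCITE, and VOL_DOLOMITE come from a multimineral analysis that is not reproduced here.

```python
import math

def derived_features(tvd, deep_resistivity, density_porosity, neutron_porosity):
    """Compute the LOG10-TVD, NDS, and LOG10-ILD entries for one depth sample."""
    if tvd <= 0 or deep_resistivity <= 0:
        raise ValueError("log10 is defined only for positive values")
    return {
        "LOG10-TVD": math.log10(tvd),
        # NDS = density porosity - neutron porosity, per the table.
        "NDS": density_porosity - neutron_porosity,
        "LOG10-ILD": math.log10(deep_resistivity),
    }

# Made-up sample values; units are assumed, not taken from the disclosure.
row = derived_features(
    tvd=1000.0,
    deep_resistivity=100.0,
    density_porosity=0.25,
    neutron_porosity=0.125,
)
```

Taking the logarithm of depth and resistivity compresses quantities that span orders of magnitude, which is one plausible reason the table lists them in log10 form.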

In some implementations, the selection of the input features can be results driven. For example, the implementations may select twenty-two input features that jointly present the most accurate prediction compared to other combinations of input features. Implementations are not limited to the twenty-two (22) input features of the example.

Implementations can apply various machine learning (ML) models to predict a bore size of wells whose log data were not used to train the ML models (203). For example, the prediction can be based on analyzing the input features 102 distilled from well logs of wells that are different from the wells used for training the models. Examples of bore size parameters (103) can include a maximum size, which corresponds to an upper limit, and a minimum size, which corresponds to a lower limit. As explained above, completion design packers may be installed at intervals inside a well to, for example, mitigate mechanically-induced instabilities. To this end, the estimated size information may guide the petroleum engineer to select packers of the right size (or size range) based on field data. The ML models can include, for example, Logistic Regression (LR), Linear Discriminant Analysis (LDA), Gaussian Naïve Bayes (GNB), Random Forest (RF), eXtreme Gradient Boosting (XGBoost), Multilayer Perceptron (MLP), and Extra Trees (ET). In particular, RF (Random Forest) and XGBoost (eXtreme Gradient Boosting) models may demonstrate higher accuracy than other models. In some implementations, the selection process of the input features can be based on knowledge of the actual measurement size. The selection process can include multiple iterations before determining the exact input features.
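One hypothetical way the predicted maximum size could inform packer placement is to flag depth spans where the predicted CAL_MAX exceeds the bit size by more than an allowed tolerance. The sketch below assumes that criterion; the function name, the bit size, the tolerance, and the sample values are all illustrative assumptions and are not stated in the disclosure.

```python
def enlarged_intervals(depths, cal_max, bit_size, tolerance):
    """Return (start_depth, end_depth) spans where cal_max > bit_size + tolerance."""
    limit = bit_size + tolerance
    spans = []
    start = None      # depth at which the current enlarged span began
    previous = None   # depth of the previous sample
    for depth, size in zip(depths, cal_max):
        if size > limit:
            if start is None:
                start = depth
        elif start is not None:
            # The span ended at the previous sample.
            spans.append((start, previous))
            start = None
        previous = depth
    if start is not None:
        # The log ended while still inside an enlarged span.
        spans.append((start, previous))
    return spans

# Made-up predicted CAL_MAX values (inches) at made-up depths.
depths = [100.0, 101.0, 102.0, 103.0, 104.0, 105.0]
cal_max = [8.5, 8.6, 9.1, 9.3, 8.5, 9.0]
spans = enlarged_intervals(depths, cal_max, bit_size=8.5, tolerance=0.25)
```

With these sample values the flagged spans are 102.0 to 103.0 and the single sample at 105.0, which an engineer might then avoid when choosing packer depths.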

FIG. 3 shows an example of training (left) and applying (right) a machine learning model to predict a maximum bore size (CAL_MAX). As illustrated, the machine learning model trained by the training set is capable of predicting the maximum bore size in new wells with a root mean square error (RMSE) of 0.088 inches. Similarly, FIG. 4 shows an example of training (left) and applying (right) a machine learning model to predict a minimum bore size (CAL_MIN). As illustrated, the machine learning model trained by the training set is capable of predicting the minimum bore size in new wells with an RMSE of 0.124 inches. These results demonstrate the feasibility of applying machine learning models to predict bore size when the actual bore size is unknown. Indeed, implementations may use the log data from several wells (where the actual size is known) to train the machine-learning model. After reaching an acceptable RMSE on the training data set, the implementations may be deployed in multiple wells across the same field to benchmark against the measured logs. When the RMSE becomes unsatisfactory, the implementations may further train the machine learning model. For example, model parameters may be modified with updated and additional data to supplement the earlier training data. Additionally or alternatively, the machine learning model may be modified with a new set of input features that yields an improved RMSE (or a comparable RMSE with fewer input features).

FIG. 5 is a flow chart 500 illustrating an example of a process according to some implementations of the present disclosure. The process may access well log data received from sensors deployed in a well-bore during a drilling operation (501). The well log data may also be known as the input log received as a data stream. The sensors may also be known as logging tools for recording measurements taken in the well bore during drilling, as described above in association with FIGS. 1-2.
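By way of a hedged illustration of step 501, the sketch below reads an input log as a stream of rows and yields only the rows in which every requested column is numeric. The comma-separated layout, the column names in the sample, and the function name are assumptions for this example; the disclosure does not specify a transport or file format, and field systems may deliver the same data in other formats.

```python
import csv
import io

def stream_rows(text_stream, columns):
    """Yield one dict of floats per log row, skipping incomplete rows."""
    for record in csv.DictReader(text_stream):
        try:
            yield {name: float(record[name]) for name in columns}
        except (KeyError, TypeError, ValueError):
            # Missing column, short row, or non-numeric reading: skip the row.
            continue

# Made-up three-row log; the second row has no GR reading.
sample = io.StringIO(
    "DEPT,GR,NPHI\n"
    "1000.0,35.2,0.18\n"
    "1000.5,,0.19\n"
    "1001.0,36.0,0.20\n"
)
rows = list(stream_rows(sample, ["DEPT", "GR", "NPHI"]))
```

Because the function is a generator, rows can be consumed as they arrive rather than after the whole log has been received.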

The well log data may then be split into a training set and a testing set (502). In some cases, the training set can be about 80% of the available well log data, while the testing set may take up the remaining 20%. The training set, for example, can include actual measurement of the bore size. Similarly, the testing set may also include actually measured data of the bore size.
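The approximately 80%/20% split of step 502 can be sketched as below. The random shuffle and the fixed seed are assumptions made so the example is reproducible; the disclosure does not say whether rows are shuffled, split by depth interval, or split by well, and those choices can affect how well the test score reflects performance on a new well.

```python
import random

def split_rows(rows, train_fraction=0.8, seed=0):
    """Shuffle a copy of rows and return (training_rows, testing_rows)."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    shuffled = list(rows)                  # leave the caller's list untouched
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]

# Ten placeholder rows stand in for depth samples that include a measured caliper.
all_rows = list(range(10))
training_rows, testing_rows = split_rows(all_rows)
```

For very short inputs the rounding can leave the testing set empty, so a caller would typically check both lengths before training.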

The process may then train a machine learning model using data from the training set (503). The machine learning model can predict, based on a number of input features, the bore size. Examples of the machine learning model can include a Logistic Regression (LR) model, a Linear Discriminant Analysis (LDA) model, a Gaussian Naïve Bayes (GNB) model, a Random Forest (RF) model, an eXtreme Gradient Boosting (XGBoost) model, a Multilayer Perceptron (MLP) model, and an Extra Trees (ET) model. In particular, implementations may use the RF (Random Forest) model, or the XGBoost (eXtreme Gradient Boosting) model. As described above in association with FIGS. 1-2, a number of input features can be selected for the machine learning model.
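The following toy example is not the disclosed model. It is a simplified stand-in that illustrates only the bootstrap-and-average idea behind a random forest, using depth-1 regression trees (stumps) written with the standard library. A practical implementation would more likely rely on an established Random Forest or XGBoost library. All names and data values here are assumptions for illustration.

```python
import random
from statistics import mean

TARGET = "CAL_MAX"  # measured caliper column; the name follows FIG. 3

def fit_stump(rows, feature):
    """Fit a depth-1 regression tree: one threshold on one feature."""
    values = sorted({row[feature] for row in rows})
    best = None
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [r[TARGET] for r in rows if r[feature] <= threshold]
        right = [r[TARGET] for r in rows if r[feature] > threshold]
        left_mean, right_mean = mean(left), mean(right)
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:
        # Only one distinct feature value in this sample: predict the mean.
        overall = mean(r[TARGET] for r in rows)
        return feature, 0.0, overall, overall
    return feature, best[1], best[2], best[3]

def fit_bagged_stumps(rows, features, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample and a randomly chosen feature."""
    rng = random.Random(seed)
    ensemble = []
    for _ in range(n_trees):
        sample = [rng.choice(rows) for _ in rows]  # sample with replacement
        ensemble.append(fit_stump(sample, rng.choice(features)))
    return ensemble

def predict(ensemble, row):
    """Average the predictions of all stumps."""
    votes = [left if row[feature] <= threshold else right
             for feature, threshold, left, right in ensemble]
    return mean(votes)

# Made-up rows: lower bottom-quadrant density paired with a larger hole.
train_rows = [
    {"ROBB": 2.10, "CAL_MAX": 9.2}, {"ROBB": 2.20, "CAL_MAX": 9.0},
    {"ROBB": 2.30, "CAL_MAX": 8.8}, {"ROBB": 2.50, "CAL_MAX": 8.6},
    {"ROBB": 2.60, "CAL_MAX": 8.5}, {"ROBB": 2.70, "CAL_MAX": 8.5},
]
model = fit_bagged_stumps(train_rows, ["ROBB"])
low_density_size = predict(model, {"ROBB": 2.0})
high_density_size = predict(model, {"ROBB": 2.8})
```

Because every stump averages measured caliper values, each prediction stays within the range of sizes seen in training, which is a general property of tree ensembles worth keeping in mind when a new well contains larger washouts than any training well.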

The process may then validate the trained machine learning model using data from the testing set (504). For example, the process may apply the trained machine learning model to data from the testing set to derive a bore size, and then compare the derived bore size with the actual measurement. Examples of metrics to quantify the validation result can include a root mean square error (RMSE).
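For reference, the RMSE metric of step 504 can be computed as sketched below. The function name and the sample values are assumptions for this example.

```python
import math

def rmse(predicted, measured):
    """Root mean square error between two equal-length sequences."""
    if len(predicted) != len(measured) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return math.sqrt(sum(squared) / len(squared))

# Made-up bore sizes in inches: two predictions, each 3 inches off.
error = rmse([11.5, 5.5], [8.5, 8.5])
```

Because the errors are squared before averaging, a few large misses raise the RMSE more than many small ones, and the result is expressed in the same unit as the bore size.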

The process may then determine whether the validation is successful (505). For example, the process may compare the metric, such as the RMSE, to a pre-determined threshold level. Alternatively or additionally, the process may compare the computation time of applying the model with a threshold level, or the number of input features with a threshold number. In these cases, in response to determining that the validation is not successful, the process may perform additional training (503). For example, the process may refine the input features being used by adding additional input features. In response to determining that the validation is successful (e.g., when the RMSE is satisfactory), the process may apply the machine learning model to additional well log data, for example, a newly received input log that is without actual measurement of bore size (506).
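The decision at step 505 and the return to step 503 can be sketched as a loop over candidate feature sets that stops at the first set whose test RMSE meets the threshold. To keep the example self-contained, a one-nearest-neighbor regressor stands in for the trained model; that stand-in, the function names, the threshold, and the data values are all assumptions and are not the disclosed method.

```python
import math

TARGET = "CAL_MAX"  # measured caliper column; the name follows FIG. 3

def rmse(predicted, measured):
    squared = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return math.sqrt(sum(squared) / len(squared))

def train_nearest(rows, features):
    """Stand-in training step: store the selected columns with their targets."""
    return [([row[f] for f in features], row[TARGET]) for row in rows]

def predict_nearest(model, row, features):
    """Return the target of the stored sample closest to the query row."""
    query = [row[f] for f in features]

    def squared_distance(item):
        return sum((a - b) ** 2 for a, b in zip(item[0], query))

    return min(model, key=squared_distance)[1]

def first_satisfactory(candidates, train_rows, test_rows, threshold):
    """Train and validate per candidate feature set; return the first that passes."""
    for features in candidates:
        model = train_nearest(train_rows, features)                    # step 503
        predicted = [predict_nearest(model, r, features) for r in test_rows]
        score = rmse(predicted, [r[TARGET] for r in test_rows])        # step 504
        if score <= threshold:                                         # step 505
            return features, model, score
    return None  # no candidate met the threshold

# Made-up rows in which GR is uninformative and ROBB tracks the hole size.
train_rows = [
    {"GR": 10.0, "ROBB": 2.2, "CAL_MAX": 9.0},
    {"GR": 20.0, "ROBB": 2.6, "CAL_MAX": 8.5},
]
test_rows = [
    {"GR": 11.0, "ROBB": 2.58, "CAL_MAX": 8.5},
    {"GR": 19.0, "ROBB": 2.22, "CAL_MAX": 9.0},
]
result = first_satisfactory([("GR",), ("ROBB",)], train_rows, test_rows, threshold=0.1)
```

In this made-up data the GR-only candidate fails validation and the ROBB-only candidate passes, which mirrors the refinement option of replacing an input feature with a different one. A model returned by the loop would then be applied at step 506 to logs that lack a measured caliper.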

FIG. 6 is a block diagram illustrating an example of a computer system 600 used to provide computational functionalities associated with described algorithms, methods, functions, processes, flows, and procedures, according to an implementation of the present disclosure. The illustrated computer 602 is intended to encompass any computing device such as a server, desktop computer, laptop/notebook computer, wireless data port, smart phone, personal data assistant (PDA), tablet computing device, one or more processors within these devices, another computing device, or a combination of computing devices, including physical or virtual instances of the computing device, or a combination of physical or virtual instances of the computing device. Additionally, the computer 602 can comprise a computer that includes an input device, such as a keypad, keyboard, touch screen, another input device, or a combination of input devices that can accept user information, and an output device that conveys information associated with the operation of the computer 602, including digital data, visual, audio, another type of information, or a combination of types of information, on a graphical-type user interface (UI) (or GUI) or other UI.

The computer 602 can serve in a role in a computer system as a client, network component, a server, a database or another persistency, another role, or a combination of roles for performing the subject matter described in the present disclosure. The illustrated computer 602 is communicably coupled with a network 630. In some implementations, one or more components of the computer 602 can be configured to operate within an environment, including cloud-computing-based, local, global, another environment, or a combination of environments.

The computer 602 is an electronic computing device operable to receive, transmit, process, store, or manage data and information associated with the described subject matter. According to some implementations, the computer 602 can also include or be communicably coupled with a server, including an application server, e-mail server, web server, caching server, streaming data server, another server, or a combination of servers.

The computer 602 can receive requests over network 630 (for example, from a client software application executing on another computer 602) and respond to the received requests by processing the received requests using a software application or a combination of software applications. In addition, requests can also be sent to the computer 602 from internal users, external or third-parties, or other entities, individuals, systems, or computers.

Each of the components of the computer 602 can communicate using a system bus 603 and network 630. In some implementations, any or all of the components of the computer 602, including hardware, software, or a combination of hardware and software, can interface over the system bus 603 and network 630 using an application programming interface (API) 612, a service layer 613, or a combination of the API 612 and service layer 613. The API 612 can include specifications for routines, data structures, and object classes. The API 612 can be either computer-language independent or dependent and refer to a complete interface, a single function, or even a set of APIs. The service layer 613 provides software services to the computer 602 or other components (whether illustrated or not) that are communicably coupled to the computer 602. The functionality of the computer 602 can be accessible for all service consumers using this service layer. Software services, such as those provided by the service layer 613, provide reusable, defined functionalities through a defined interface. For example, the interface can be software written in JAVA, C++, another computing language, or a combination of computing languages providing data in extensible markup language (XML) format, another format, or a combination of formats. While illustrated as an integrated component of the computer 602, alternative implementations can illustrate the API 612 or the service layer 613 as stand-alone components in relation to other components of the computer 602 or other components (whether illustrated or not) that are communicably coupled to the computer 602. Moreover, any or all parts of the API 612 or the service layer 613 can be implemented as a child or a sub-module of another software module, enterprise application, or hardware module without departing from the scope of the present disclosure.

The computer 602 includes an interface 604. Although illustrated as a single interface 604 in FIG. 6, two or more interfaces 604 can be used according to particular needs, desires, or particular implementations of the computer 602. The interface 604 is used by the computer 602 for communicating with another computing system (whether illustrated or not) that is communicatively linked to the network 630 in a distributed environment. Generally, the interface 604 is operable to communicate with the network 630 and comprises logic encoded in software, hardware, or a combination of software and hardware. More specifically, the interface 604 can comprise software supporting one or more communication protocols associated with communications such that the network 630 or the interface's hardware is operable to communicate physical signals within and outside of the illustrated computer 602.

The computer 602 includes a processor 605. Although illustrated as a single processor 605 in FIG. 6, two or more processors can be used according to particular needs, desires, or particular implementations of the computer 602. Generally, the processor 605 executes instructions and manipulates data to perform the operations of the computer 602 and any algorithms, methods, functions, processes, flows, and procedures as described in the present disclosure.

The computer 602 also includes a database 606 that can hold data for the computer 602, another component communicatively linked to the network 630 (whether illustrated or not), or a combination of the computer 602 and another component. For example, database 606 can be an in-memory, conventional, or another type of database storing data consistent with the present disclosure. In some implementations, database 606 can be a combination of two or more different database types (for example, a hybrid in-memory and conventional database) according to particular needs, desires, or particular implementations of the computer 602 and the described functionality. Although illustrated as a single database 606 in FIG. 6, two or more databases of similar or differing types can be used according to particular needs, desires, or particular implementations of the computer 602 and the described functionality. While database 606 is illustrated as an integral component of the computer 602, in alternative implementations, database 606 can be external to the computer 602. As illustrated, the database 606 holds the previously described data 616 including, for example, input stream of data from various sensors, such as logging tools in the wellbore, as described above in association with FIGS. 1-2.

The computer 602 also includes a memory 607 that can hold data for the computer 602, another component or components communicatively linked to the network 630 (whether illustrated or not), or a combination of the computer 602 and another component. Memory 607 can store any data consistent with the present disclosure. In some implementations, memory 607 can be a combination of two or more different types of memory (for example, a combination of semiconductor and magnetic storage) according to particular needs, desires, or particular implementations of the computer 602 and the described functionality. Although illustrated as a single memory 607 in FIG. 6, two or more memories 607 of similar or differing types can be used according to particular needs, desires, or particular implementations of the computer 602 and the described functionality. While memory 607 is illustrated as an integral component of the computer 602, in alternative implementations, memory 607 can be external to the computer 602.

The application 608 is an algorithmic software engine providing functionality according to particular needs, desires, or particular implementations of the computer 602, particularly with respect to functionality described in the present disclosure. For example, application 608 can serve as one or more components, modules, or applications. Further, although illustrated as a single application 608, the application 608 can be implemented as multiple applications 608 on the computer 602. In addition, although illustrated as integral to the computer 602, in alternative implementations, the application 608 can be external to the computer 602.

The computer 602 can also include a power supply 614. The power supply 614 can include a rechargeable or non-rechargeable battery that can be configured to be either user- or non-user-replaceable. In some implementations, the power supply 614 can include power-conversion or management circuits (including recharging, standby, or another power management functionality). In some implementations, the power supply 614 can include a power plug to allow the computer 602 to be plugged into a wall socket or another power source to, for example, power the computer 602 or recharge a rechargeable battery.

There can be any number of computers 602 associated with, or external to, a computer system containing computer 602, each computer 602 communicating over network 630. Further, the terms “client,” “user,” or other appropriate terminology can be used interchangeably, as appropriate, without departing from the scope of the present disclosure. Moreover, the present disclosure contemplates that many users can use one computer 602, or that one user can use multiple computers 602.

Implementations of the subject matter and the functional operations described in this specification can be implemented in digital electronic circuitry, in tangibly embodied computer software or firmware, in computer hardware, including the structures disclosed in this specification and their structural equivalents, or in combinations of one or more of them. Software implementations of the described subject matter can be implemented as one or more computer programs, that is, one or more modules of computer program instructions encoded on a tangible, non-transitory, computer-readable computer-storage medium for execution by, or to control the operation of, data processing apparatus. Alternatively, or additionally, the program instructions can be encoded in/on an artificially generated propagated signal, for example, a machine-generated electrical, optical, or electromagnetic signal that is generated to encode information for transmission to a receiver apparatus for execution by a data processing apparatus. The computer-storage medium can be a machine-readable storage device, a machine-readable storage substrate, a random or serial access memory device, or a combination of computer-storage mediums. Configuring one or more computers means that the one or more computers have installed hardware, firmware, or software (or combinations of hardware, firmware, and software) so that when the software is executed by the one or more computers, particular computing operations are performed.

The term “real-time,” “real time,” “realtime,” “real (fast) time (RFT),” “near(ly) real-time (NRT),” “quasi real-time,” or similar terms (as understood by one of ordinary skill in the art), means that an action and a response are temporally proximate such that an individual perceives the action and the response occurring substantially simultaneously. For example, the time difference for a response to display (or for an initiation of a display) of data following the individual's action to access the data can be less than 1 millisecond (ms), less than 1 second (s), or less than 5 s. While the requested data need not be displayed (or initiated for display) instantaneously, it is displayed (or initiated for display) without any intentional delay, taking into account processing limitations of a described computing system and time required to, for example, gather, accurately measure, analyze, process, store, or transmit the data.

The terms “data processing apparatus,” “computer,” or “electronic computer device” (or equivalent as understood by one of ordinary skill in the art) refer to data processing hardware and encompass all kinds of apparatus, devices, and machines for processing data, including by way of example, a programmable processor, a computer, or multiple processors or computers. The apparatus can also be, or further include, special purpose logic circuitry, for example, a central processing unit (CPU), an FPGA (field programmable gate array), or an ASIC (application-specific integrated circuit). In some implementations, the data processing apparatus or special purpose logic circuitry (or a combination of the data processing apparatus or special purpose logic circuitry) can be hardware- or software-based (or a combination of both hardware- and software-based). The apparatus can optionally include code that creates an execution environment for computer programs, for example, code that constitutes processor firmware, a protocol stack, a database management system, an operating system, or a combination of execution environments. The present disclosure contemplates the use of data processing apparatuses with an operating system of some type, for example, LINUX, UNIX, WINDOWS, MAC OS, ANDROID, IOS, another operating system, or a combination of operating systems.

A computer program, which can also be referred to or described as a program, software, a software application, a unit, a module, a software module, a script, code, or other component can be written in any form of programming language, including compiled or interpreted languages, or declarative or procedural languages, and it can be deployed in any form, including, for example, as a stand-alone program, module, component, or subroutine, for use in a computing environment. A computer program can, but need not, correspond to a file in a file system. A program can be stored in a portion of a file that holds other programs or data, for example, one or more scripts stored in a markup language document, in a single file dedicated to the program in question, or in multiple coordinated files, for example, files that store one or more modules, sub-programs, or portions of code. A computer program can be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a communication network.

While portions of the programs illustrated in the various figures can be illustrated as individual components, such as units or modules, that implement described features and functionality using various objects, methods, or other processes, the programs can instead include a number of sub-units, sub-modules, third-party services, components, libraries, and other components, as appropriate. Conversely, the features and functionality of various components can be combined into single components, as appropriate. Thresholds used to make computational determinations can be statically, dynamically, or both statically and dynamically determined.
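As a non-limiting illustration of statically and dynamically determined thresholds, the following Python sketch contrasts a fixed limit with a limit derived from the data. The constant STATIC_RMSE_LIMIT, the function names, the fraction of 0.5, and the sample bore sizes are all hypothetical and are not taken from the disclosure.

```python
import statistics

# Hypothetical fixed (static) limit, in the same units as the bore-size log.
STATIC_RMSE_LIMIT = 0.25


def dynamic_rmse_limit(measured_bore_sizes, fraction=0.5):
    """Hypothetical dynamic limit: a fraction of the spread of the measurements."""
    return fraction * statistics.pstdev(measured_bore_sizes)


def combined_rmse_limit(measured_bore_sizes):
    """Static and dynamic together: use whichever limit is stricter."""
    return min(STATIC_RMSE_LIMIT, dynamic_rmse_limit(measured_bore_sizes))


# Made-up bore-size measurements, for illustration only.
measured = [6.0, 6.5, 7.0, 6.5]
print(dynamic_rmse_limit(measured), combined_rmse_limit(measured))
```

A static limit is simple to audit, whereas a dynamic limit can adapt to how much the measured quantity actually varies; taking the stricter of the two is one possible way to combine them.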

Described methods, processes, or logic flows represent one or more examples of functionality consistent with the present disclosure and are not intended to limit the disclosure to the described or illustrated implementations, but to be accorded the widest scope consistent with described principles and features. The described methods, processes, or logic flows can be performed by one or more programmable computers executing one or more computer programs to perform functions by operating on input data and generating output data. The methods, processes, or logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, for example, a CPU, an FPGA, or an ASIC.

Computers for the execution of a computer program can be based on general or special purpose microprocessors, both, or another type of CPU. Generally, a CPU will receive instructions and data from and write to a memory. The essential elements of a computer are a CPU, for performing or executing instructions, and one or more memory devices for storing instructions and data. Generally, a computer will also include, or be operatively coupled to, receive data from or transfer data to, or both, one or more mass storage devices for storing data, for example, magnetic, magneto-optical disks, or optical disks. However, a computer need not have such devices. Moreover, a computer can be embedded in another device, for example, a mobile telephone, a personal digital assistant (PDA), a mobile audio or video player, a game console, a global positioning system (GPS) receiver, or a portable memory storage device.

Non-transitory computer-readable media for storing computer program instructions and data can include all forms of media and memory devices, magnetic devices, magneto-optical disks, and optical memory devices. Memory devices include semiconductor memory devices, for example, random access memory (RAM), read-only memory (ROM), phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), and flash memory devices. Magnetic devices include, for example, tape, cartridges, cassettes, and internal/removable disks. Optical memory devices include, for example, digital video disc (DVD), CD-ROM, DVD+/−R, DVD-RAM, DVD-ROM, HD-DVD, BLU-RAY, and other optical memory technologies. The memory can store various objects or data, including caches, classes, frameworks, applications, modules, backup data, jobs, web pages, web page templates, data structures, database tables, repositories storing dynamic information, or other appropriate information including any parameters, variables, algorithms, instructions, rules, constraints, or references. Additionally, the memory can include other appropriate data, such as logs, policies, security or access data, or reporting files. The processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.

To provide for interaction with a user, implementations of the subject matter described in this specification can be implemented on a computer having a display device, for example, a CRT (cathode ray tube), LCD (liquid crystal display), LED (Light Emitting Diode), or plasma monitor, for displaying information to the user and a keyboard and a pointing device, for example, a mouse, trackball, or trackpad by which the user can provide input to the computer. Input can also be provided to the computer using a touchscreen, such as a tablet computer surface with pressure sensitivity, a multi-touch screen using capacitive or electric sensing, or another type of touchscreen. Other types of devices can be used to interact with the user. For example, feedback provided to the user can be any form of sensory feedback. Input from the user can be received in any form, including acoustic, speech, or tactile input. In addition, a computer can interact with the user by sending documents to and receiving documents from a client computing device that is used by the user.

The term “graphical user interface,” or “GUI,” can be used in the singular or the plural to describe one or more graphical user interfaces and each of the displays of a particular graphical user interface. Therefore, a GUI can represent any graphical user interface, including but not limited to, a web browser, a touch screen, or a command line interface (CLI) that processes information and efficiently presents the information results to the user. In general, a GUI can include a plurality of user interface (UI) elements, some or all associated with a web browser, such as interactive fields, pull-down lists, and buttons. These and other UI elements can be related to or represent the functions of the web browser.

Implementations of the subject matter described in this specification can be implemented in a computing system that includes a back-end component, for example, as a data server, or that includes a middleware component, for example, an application server, or that includes a front-end component, for example, a client computer having a graphical user interface or a Web browser through which a user can interact with an implementation of the subject matter described in this specification, or any combination of one or more such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of wireline or wireless digital data communication (or a combination of data communication), for example, a communication network. Examples of communication networks include a local area network (LAN), a radio access network (RAN), a metropolitan area network (MAN), a wide area network (WAN), Worldwide Interoperability for Microwave Access (WIMAX), a wireless local area network (WLAN) using, for example, 802.11 a/b/g/n or 802.20 (or a combination of 802.11x and 802.20 or other protocols consistent with the present disclosure), all or a portion of the Internet, another communication network, or a combination of communication networks. The communication network can communicate with, for example, Internet Protocol (IP) packets, Frame Relay frames, Asynchronous Transfer Mode (ATM) cells, voice, video, data, or other information between network addresses.

The computing system can include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.

While this specification contains many specific implementation details, these should not be construed as limitations on the scope of what can be claimed, but rather as descriptions of features that can be specific to particular implementations. Certain features that are described in this specification in the context of separate implementations can also be implemented, in combination, in a single implementation. Conversely, various features that are described in the context of a single implementation can also be implemented in multiple implementations, separately, or in any sub-combination. Moreover, although previously described features can be described as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can, in some cases, be excised from the combination, and the claimed combination can be directed to a sub-combination or variation of a sub-combination.

Particular implementations of the subject matter have been described. Other implementations, alterations, and permutations of the described implementations are within the scope of the following claims as will be apparent to those skilled in the art. While operations are depicted in the drawings or claims in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed (some operations can be considered optional), to achieve desirable results. In certain circumstances, multitasking or parallel processing (or a combination of multitasking and parallel processing) can be advantageous and performed as deemed appropriate.

Moreover, the separation or integration of various system modules and components in the previously described implementations should not be understood as requiring such separation or integration in all implementations, and it should be understood that the described program components and systems can generally be integrated together in a single software product or packaged into multiple software products.

Furthermore, any claimed implementation is considered to be applicable to at least a computer-implemented method; a non-transitory, computer-readable medium storing computer-readable instructions to perform the computer-implemented method; and a computer system comprising a computer memory interoperably coupled with a hardware processor configured to perform the computer-implemented method or the instructions stored on the non-transitory, computer-readable medium.
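As a non-limiting illustration, the computer-implemented method (splitting the input data, training a model, evaluating it with a Root Mean Square Error against a threshold, and applying it to a second well-bore that has no bore-size measurements) can be sketched in Python as follows. The sketch assumes synthetic, normalized log readings and a made-up relation between the readings and bore size. The function names, the 80/20 split, and the RMSE threshold of 0.5 are hypothetical and do not come from the disclosure. A simple nearest-neighbor averager stands in for the machine learning model so that the example needs only the standard library; it is not a Random Forest or XGBoost model.

```python
import math
import random


def split_rows(rows, test_fraction=0.2, seed=0):
    """Shuffle (features, bore_size) rows and split into training and testing sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def train_model(training_rows, k=3):
    """Stand-in model (NOT RF or XGBoost): mean bore size of the k nearest rows."""
    def predict(features):
        nearest = sorted(training_rows,
                         key=lambda row: math.dist(row[0], features))[:k]
        return sum(bore_size for _, bore_size in nearest) / len(nearest)
    return predict


def rmse(predict, rows):
    """Root mean square error between predicted and measured bore sizes."""
    squared = [(predict(features) - measured) ** 2 for features, measured in rows]
    return math.sqrt(sum(squared) / len(squared))


def run_workflow(first_well_rows, second_well_features, rmse_threshold):
    """Split, train, evaluate; predict for the second well only if satisfactory."""
    training_rows, testing_rows = split_rows(first_well_rows)
    model = train_model(training_rows)
    score = rmse(model, testing_rows)
    if score > rmse_threshold:
        return score, None  # unsatisfactory: the model would be refined instead
    return score, [model(features) for features in second_well_features]


def synthetic_features(n, seed):
    """Made-up normalized readings (resistivity, density, neutron, gamma ray)."""
    rng = random.Random(seed)
    return [tuple(rng.random() for _ in range(4)) for _ in range(n)]


def synthetic_bore_size(features):
    """Made-up relation, for illustration only."""
    return 6.0 + 2.0 * features[1]


# First well: features plus measured bore size. Second well: features only.
first_well = [(f, synthetic_bore_size(f)) for f in synthetic_features(200, seed=1)]
second_well = synthetic_features(50, seed=2)
score, predictions = run_workflow(first_well, second_well, rmse_threshold=0.5)
print(f"test RMSE = {score:.3f}; satisfactory = {predictions is not None}")
```

In practice, the stand-in predictor could be replaced by a Random Forest or XGBoost regressor, and an unsatisfactory evaluation would lead to refining the model (for example, by changing input features or adjusting model parameters) rather than simply returning no predictions.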

Claims

1. A computer-implemented method, comprising:

accessing a stream of input data from logging tools in a first well-bore, wherein the stream of input data comprises measurements of bore sizes inside the first well-bore;

splitting the stream of input data into a training set of input data and a testing set of input data;

training a machine learning model using the training set of input data, wherein the machine learning model is configured to predict a bore size parameter based on input features of the training set of input data;

evaluating the machine learning model using the testing set of input data; and

in response to evaluating the machine learning model as satisfactory, applying the machine learning model to a newly received stream of input data from a second well-bore such that the bore size parameter of the second well-bore is determined independent of measurements of bore sizes inside the second well-bore.

2. The computer-implemented method of claim 1, wherein the machine learning model comprises at least one of: a Random Forest (RF) model, or an XGBoost (eXtreme Gradient Boosting) model.

3. The computer-implemented method of claim 2, further comprising:

selecting the input features for the machine learning model.

4. The computer-implemented method of claim 1, wherein evaluating the machine learning model comprises:

computing a Root Mean Square Error (RMSE) between the predicted bore size parameter and an actual measurement; and

comparing the RMSE with a pre-determined threshold.

5. The computer-implemented method of claim 4, further comprising:

in response to evaluating the machine learning model as unsatisfactory, refining the machine learning model.

6. The computer-implemented method of claim 5, wherein refining the machine learning model comprises at least one of: providing at least one additional input feature to the machine learning model, or replacing at least one input feature with a different input feature.

7. The computer-implemented method of claim 5, wherein refining the machine learning model comprises:

adjusting at least one parameter of the machine learning model.

8. The computer-implemented method of claim 1, wherein the stream of input data comprises logging data encoding a resistivity, a density, a neutron recording, and a gamma ray recording.

9. The computer-implemented method of claim 1, wherein the bore size parameter comprises at least one of: a maximum size, or a minimum size.

10. A computer system comprising one or more hardware processors configured to perform operations of:

accessing a stream of input data from logging tools in a first well-bore, wherein the stream of input data comprises measurements of bore sizes inside the first well-bore;

splitting the stream of input data into a training set of input data and a testing set of input data;

training a machine learning model using the training set of input data, wherein the machine learning model is configured to predict a bore size parameter based on input features of the training set of input data;

evaluating the machine learning model using the testing set of input data; and

in response to evaluating the machine learning model as satisfactory, applying the machine learning model to a newly received stream of input data from a second well-bore such that the bore size parameter of the second well-bore is determined independent of measurements of bore sizes inside the second well-bore.

11. The computer system of claim 10, wherein the machine learning model comprises at least one of: a Random Forest (RF) model, or an XGBoost (eXtreme Gradient Boosting) model.

12. The computer system of claim 11, wherein the operations further comprise:

selecting the input features for the machine learning model.

13. The computer system of claim 11, wherein evaluating the machine learning model comprises:

computing a Root Mean Square Error (RMSE) between the predicted bore size parameter and an actual measurement; and

comparing the RMSE with a pre-determined threshold.

14. The computer system of claim 13, wherein the operations further comprise:

in response to evaluating the machine learning model as unsatisfactory, refining the machine learning model.

15. The computer system of claim 13, wherein refining the machine learning model comprises at least one of: providing at least one additional input feature to the machine learning model, or replacing at least one input feature with a different input feature.

16. The computer system of claim 15, wherein refining the machine learning model comprises:

adjusting at least one parameter of the machine learning model.

17. The computer system of claim 10, wherein the stream of input data comprises logging data encoding a resistivity, a density, a neutron recording, and a gamma ray recording.

18. The computer system of claim 10, wherein the bore size parameter comprises at least one of: a maximum size, or a minimum size.

19. A non-transitory computer-readable medium comprising software instructions which, when executed by one or more computer processors, cause the one or more computer processors to perform operations of:

accessing a stream of input data from logging tools in a first well-bore, wherein the stream of input data comprises measurements of bore sizes inside the first well-bore;

splitting the stream of input data into a training set of input data and a testing set of input data;

training a machine learning model using the training set of input data, wherein the machine learning model is configured to predict a bore size parameter based on input features of the training set of input data;

evaluating the machine learning model using the testing set of input data; and

in response to evaluating the machine learning model as satisfactory, applying the machine learning model to a newly received stream of input data from a second well-bore such that the bore size parameter of the second well-bore is determined independent of measurements of bore sizes inside the second well-bore.

20. The non-transitory computer-readable medium of claim 19, wherein evaluating the machine learning model comprises:

computing a Root Mean Square Error (RMSE) between the predicted bore size parameter and an actual measurement; and

comparing the RMSE with a pre-determined threshold.