DISTRIBUTED CACHE CLEANUP FOR ANALYTIC INSTANCE RUNS PROCESSING OPERATING DATA FROM INDUSTRIAL ASSETS
In some embodiments, a cloud-based services architecture may receive operating data associated with a set of assets from a set of enterprise system devices. The cloud-based services architecture may then process the received operating data. A plurality of computing platforms may execute instance runs of a plurality of analytics, with each instance run being associated with an industrial asset. A distributed cache may be shared by the plurality of analytic instance runs executing on the plurality of computing platforms. An orchestration run-time execution engine may maintain an overall count value that represents a number of analytic instance runs currently utilizing the distributed cache. Note that the distributed cache may be emptied when the overall count value indicates that no analytic instance runs are still utilizing the distributed cache.
The invention relates generally to cloud-based systems to facilitate enterprise analytics. In particular, embodiments may facilitate distributed cache cleanup for analytic instance runs processing operating data from industrial assets.
An enterprise may collect operating data from a set of enterprise system devices. For example, the enterprise may deploy sensors associated with one or more industrial assets (e.g., wind farm devices, turbine engines, etc.) and collect data as those assets operate. Note that the amount of industrial data that can be collected in this way may be significant in terms of volume, velocity, and/or variety. To help extract insight from the data, the enterprise may employ a “cloud-based” industrial internet platform to facilitate creation of applications to turn real-time operational data into insights. As used herein, a “cloud-based” industrial platform may help connect machines to collect key industrial data and stream the information to the cloud and/or leverage services and development tools to help the enterprise focus on solving problems. In this way, the cloud-based industrial platform may help an enterprise deploy scalable services and end-to-end applications in a secure environment. For example, analytic instance runs may be executed by a plurality of computing platforms to process operating data associated with the industrial assets.
In some cases, a distributed cache may be employed to facilitate execution of the analytic instance runs. The distributed cache might store, for example, information about how to fetch data, how to store data, etc. Note that the overall performance of the system may be degraded as a result of memory leaks and other problems associated with such a distributed cache. Thus, it may be desirable to provide systems and methods to automatically facilitate distributed cache cleanup for analytic instance runs in an efficient and accurate manner.
BRIEF DESCRIPTION
Some embodiments are associated with a cloud-based services architecture that may receive operating data associated with a set of assets from a set of enterprise system devices. The cloud-based services architecture may then process the received operating data. A plurality of computing platforms may execute instance runs of a plurality of analytics, with each instance run being associated with an industrial asset. A distributed cache may be shared by the plurality of analytic instance runs executing on the plurality of computing platforms. An orchestration run-time execution engine may maintain an overall count value that represents a number of analytic instance runs currently utilizing the distributed cache. Note that the distributed cache may be emptied when the overall count value indicates that no analytic instance runs are still utilizing the distributed cache.
Some embodiments are associated with: means for receiving, at a cloud-based services architecture from a set of enterprise system devices, operating data associated with a set of assets; means for processing, by the cloud-based services architecture, the received operating data; means for executing, by a plurality of computing platforms, instance runs of a plurality of analytics, each instance run being associated with an industrial asset; means for sharing a distributed cache by the plurality of analytic instance runs executing on the plurality of computing platforms; and means for maintaining, by an orchestration run-time execution engine, an overall count value that represents a number of analytic instance runs currently utilizing the distributed cache, wherein the distributed cache is emptied when the overall count value indicates that no analytic instance runs are still utilizing the distributed cache.
A technical feature of some embodiments is a computer system and method that automatically facilitates distributed cache cleanup for analytic instance runs in an efficient and accurate manner.
Other embodiments are associated with systems and/or computer-readable medium storing instructions to perform any of the methods described herein.
Some embodiments disclosed herein automatically facilitate distributed cache cleanup for analytic instance runs in an efficient and accurate manner. Some embodiments are associated with systems and/or computer-readable medium that may help perform such a method.
Reference will now be made in detail to present embodiments of the invention, one or more examples of which are illustrated in the accompanying drawings. The detailed description uses numerical and letter designations to refer to features in the drawings. Like or similar designations in the drawings and description have been used to refer to like or similar parts of the invention.
Each example is provided by way of explanation of the invention, not limitation of the invention. In fact, it will be apparent to those skilled in the art that modifications and variations can be made in the present invention without departing from the scope or spirit thereof. For instance, features illustrated or described as part of one embodiment may be used on another embodiment to yield a still further embodiment. Thus, it is intended that the present invention covers such modifications and variations as come within the scope of the appended claims and their equivalents.
An enterprise may collect operating data from a set of enterprise system devices. For example, the enterprise may deploy sensors associated with one or more industrial assets (e.g., wind farm devices, turbine engines, etc.) and collect data as those assets operate. Moreover, the amount of industrial data that can be collected in this way may be significant in terms of volume, velocity, and/or variety. To help extract insight from the data (and perhaps gain a competitive advantage), the enterprise may employ an industrial internet platform to facilitate creation of applications to turn real-time operational data into insights.
The cloud-based services architecture 150 includes a number of analytics 120 that may execute on computing platforms 130 to process the received operating data. Each analytic 120 may receive, process, and output information, and a designer may combine the analytics 120 in different ways to achieve the desired results 155 (e.g., the output of one analytic may be provided to another analytic as an input). An orchestration run-time engine 140 may help coordinate the execution of the analytics 120.
In some cases, a distributed cache may be employed to facilitate execution of the analytic instance runs. The distributed cache might store, for example, information about how to fetch data, how to store data, etc.
The orchestration run-time engine 240 may physically create a sequence or series of analytics 220 to be executed, ensure that any required data is available, check information quality, smooth data, implement anomaly detection algorithms, etc. As used herein, the phrase “orchestration step” may refer to a step that has an associated analytic 220 and an understanding of how to fetch/store data for that analytic (e.g., from which source and/or to which data store). Note that a series of analytics may be executed for an “asset group” (which could, for example, contain hundreds of thousands of individual industrial assets). The phrase “orchestration run” may refer to the execution of an orchestration for an entire asset group.
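By way of illustration only, an orchestration step and an orchestration run might be modeled as in the following minimal Python sketch. All names here (OrchestrationStep, run_orchestration, the in-memory readings and results) are hypothetical and are not taken from any embodiment; the sketch assumes each analytic is an ordinary function and that one instance run occurs per step per asset.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class OrchestrationStep:
    """Hypothetical step: an analytic plus how to fetch and store its data."""
    name: str
    analytic: Callable[[List[float]], float]
    fetch: Callable[[str], List[float]]        # asset id -> input data
    store: Callable[[str, str, float], None]   # asset id, step name, result


def run_orchestration(steps: List[OrchestrationStep], asset_group: List[str]) -> int:
    """Execute every step for every asset in the group; return the instance-run count."""
    instance_runs = 0
    for asset_id in asset_group:
        for step in steps:
            data = step.fetch(asset_id)
            step.store(asset_id, step.name, step.analytic(data))
            instance_runs += 1
    return instance_runs


# Illustrative in-memory source and data store (assumed for this sketch only).
readings: Dict[str, List[float]] = {"turbine-1": [1.0, 2.0, 3.0], "turbine-2": [4.0, 6.0]}
results: Dict[Tuple[str, str], float] = {}


def store_result(asset_id: str, step_name: str, value: float) -> None:
    results[(asset_id, step_name)] = value


steps = [
    OrchestrationStep("mean", lambda xs: sum(xs) / len(xs), readings.__getitem__, store_result),
    OrchestrationStep("peak", max, readings.__getitem__, store_result),
]
total_runs = run_orchestration(steps, list(readings))
```

In this sketch, two steps applied to an asset group of two assets yield four instance runs; a real asset group could be far larger.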
Note that the overall performance of the system 200 may be degraded as a result of memory leaks and other problems associated with use of such a distributed cache 260. Thus, it may be desirable to provide systems and methods to automatically facilitate distributed cache cleanup for analytic instance runs in an efficient and accurate manner.
Note that the systems 100, 200, 300 of
At S410, a cloud-based services architecture may receive operating data associated with a set of assets from a set of enterprise system devices. As used herein, the phrase “enterprise system devices” might refer to, for example, devices associated with sensors, a big data stream, an industrial asset, a power plant, a wind farm, a turbine, power distribution, fuel extraction, healthcare, transportation, aviation, manufacturing, water processing, etc. Note that the cloud-based services architecture might be further associated with edge software, data management, security, development operations, and/or mobile applications.
At S420, the cloud-based services architecture may process the received operating data (e.g., using a series of analytics). At S430, a plurality of computing platforms may execute instance runs of a plurality of analytics. For example, each instance run might be associated with an industrial asset. At S440, the system may arrange to share a distributed cache by the plurality of analytic instance runs executing on the plurality of computing platforms. According to some embodiments, the distributed cache might store, for run-time look-up during processing, data related to orchestration, asset instance data, steps, metadata, etc. Note that the distributed cache might comprise an in-memory distributed cache.
At S450, an orchestration run-time execution engine may maintain an overall count value (e.g., in software, a hardware register, etc.) that represents a number of analytic instance runs currently utilizing the distributed cache. For example, the maintenance of the overall count value might include incrementing the overall count value when a new analytic instance run utilizes the distributed cache. Similarly, the maintenance of the overall count value might include decrementing the overall count value when an analytic instance run is done using the distributed cache. According to some embodiments, the distributed cache is emptied when the overall count value indicates that no analytic instance runs are still utilizing the distributed cache (e.g., the overall count value equals zero).
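The count maintenance described above might be sketched as follows. This is a minimal, single-process illustration under stated assumptions: the class name CountedCache and its methods are hypothetical, a plain dictionary stands in for the distributed cache, and concurrency is ignored.

```python
class CountedCache:
    """Hypothetical stand-in for a shared cache with an overall count value."""

    def __init__(self):
        self.entries = {}        # cached look-up data
        self.overall_count = 0   # number of instance runs currently using the cache

    def begin_run(self):
        # A new analytic instance run starts utilizing the cache.
        self.overall_count += 1

    def end_run(self):
        # An analytic instance run is done using the cache.
        if self.overall_count == 0:
            raise RuntimeError("end_run called with no active instance runs")
        self.overall_count -= 1
        if self.overall_count == 0:
            # No instance runs remain, so the cache may be emptied.
            self.entries.clear()


cache = CountedCache()
cache.begin_run()
cache.begin_run()
cache.entries["orchestration-step-1"] = {"source": "time-series"}
cache.end_run()
still_cached = len(cache.entries)    # one run remains, so the entry survives
cache.end_run()
after_cleanup = len(cache.entries)   # last run finished, so the cache was emptied
```

Tying the cleanup to the count reaching zero means no instance run needs to know whether it is the last one to finish.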
Note that multiple analytic instance runs may execute simultaneously. To avoid race conditions, a first analytic instance run utilizing the distributed cache may lock the overall count value such that other analytic instance runs cannot update the overall count value until the first analytic instance run is done using the distributed cache. According to some embodiments, a “distributed” lock may be provided for the distributed cache. For example, a first analytic instance run executing on a first computing platform may implement a distributed lock on the overall count value such that other analytic instance runs executing on other computing platforms cannot update the overall count value until the first analytic instance run is done using the distributed cache.
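The effect of locking the overall count value might be illustrated with the following sketch. It rests on several assumptions that are not part of the description: threads stand in for computing platforms, a threading.Lock stands in for a distributed lock, and the lock is held only for the duration of each count update rather than for an entire instance run. The names LockedCountedCache and simulate are hypothetical.

```python
import threading


class LockedCountedCache:
    """Hypothetical cache whose overall count value is guarded by a lock."""

    def __init__(self):
        self._lock = threading.Lock()  # stand-in for a distributed lock
        self.entries = {}
        self.overall_count = 0
        self.times_emptied = 0

    def begin_run(self, run_id):
        with self._lock:  # only one instance run may update the count at a time
            self.overall_count += 1
            self.entries[run_id] = "working data"

    def end_run(self, run_id):
        with self._lock:
            self.overall_count -= 1
            if self.overall_count == 0:
                self.entries.clear()
                self.times_emptied += 1


def simulate(num_runs=50):
    cache = LockedCountedCache()
    all_started = threading.Barrier(num_runs)

    def instance_run(run_id):
        cache.begin_run(run_id)
        all_started.wait()  # every run registers before any run finishes
        cache.end_run(run_id)

    threads = [threading.Thread(target=instance_run, args=(i,)) for i in range(num_runs)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return cache


final_cache = simulate()
```

Because each read-modify-write of the count happens under the lock, no update is lost, and the check for zero is made by exactly one finishing run.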
The cloud services 550 may, for example, facilitate the presentation of interactive displays 560 (e.g., mobile display) to a user in accordance with any of the embodiments described herein. For example, the cloud services 550 may automatically facilitate distributed cache cleanup for analytic instance runs in an efficient and accurate manner. In this way, the system may comprise a machine-centric solution that supports heterogeneous data acquisition, storage, management, integration, and access. Moreover, the system may provide advanced predictive analytics and guide users with intuitive interfaces that are delivered securely in the cloud. In this way, users may rapidly build, securely deploy, and effectively operate industrial applications in connection with the industrial Internet of Things (“IoT”).
Note that a cloud services 550 platform may offer a standardized way to enable an enterprise to quickly take advantage of operational and business innovations. By using the platform which is designed around a re-usable building block approach, developers can build applications quickly, leverage customized work, reduce errors, develop and share best practices, lower any risk of cost and/or time overruns, and/or future-proof initial investments. Moreover, independent third parties may build applications and services for the platform, allowing businesses to extend capabilities easily by tapping an industrial ecosystem. In this way, the platform may drive insights that transform and/or improve Asset Performance Management (“APM”), operations, and/or business.
According to some embodiments, a distributed cache may be partitioned such that different partitions are associated with different information about analytic instance runs. For example,
Note that in some cases, it may be desirable to have an output of one analytic 620 act as an input to another analytic. In the example of
Note that operating data may be associated with a “big data” stream that is received by the cloud-based services architecture 650 on a periodic or asynchronous basis. Moreover, the client platforms 670 may, for example, be used to execute a web browser, smartphone application, etc. to provide results from and/or facilitate understanding of the big data. As used herein, the phrase “big data” may refer to data sets so large and/or complex that traditional data processing applications may be inadequate (e.g., to perform appropriate analysis, capture, data curation, search, sharing, storage, transfer, visualization, and/or information privacy for the data). Analysis of big data may lead to new correlations that can be used to spot business trends, prevent diseases, etc. Scientists, business executives, practitioners of media and advertising, and governments alike regularly meet difficulties with large data sets in areas including Internet search, finance, and business informatics. Scientists encounter limitations in meteorology, genomics, complex physics simulations, biological and environmental research, etc.
Any of the devices described with respect to the system 600 might be, for example, associated with a Personal Computer (“PC”), laptop computer, smartphone, an enterprise server, a server farm, and/or a database or similar storage devices. According to some embodiments, an “automated” cloud-based services architecture 650 may facilitate the collection and analysis of big data. As used herein, the term “automated” may refer to, for example, actions that can be performed with little (or no) intervention by a human.
As used herein, devices, including those associated with the cloud-based services architecture 650 and any other device described herein may exchange information via any communication network which may be one or more of a Local Area Network (“LAN”), a Metropolitan Area Network (“MAN”), a Wide Area Network (“WAN”), a proprietary network, a Public Switched Telephone Network (“PSTN”), a Wireless Application Protocol (“WAP”) network, a Bluetooth network, a wireless LAN network, and/or an Internet Protocol (“IP”) network such as the Internet, an intranet, or an extranet. Note that any devices described herein may communicate via one or more such communication networks.
Although a single cloud-based services architecture 650 is shown in
The analytics 830 may interact with analytic message queues 832, an analytic data/model service 860, and/or a cache 840 for data or a model (e.g., via get/put operations). The architecture 800 may use an overall count value to facilitate cleaning of the cache 840 in accordance with any of the embodiments described herein. The analytic data/model service 860 may provide results to an asset service 882 and/or a time-series service 884 as well as to an RDBMS 886 via a custom data connector service 862. Note that the cache 840 may store an analytic state 842 and be used to store an output of a first analytic model within the tenant-specific space before being provided as an input of a second analytic model. The cache 840 might comprise, for example, an in-memory cache of the tenant-specific space. Because this process is performed entirely “in memory” inside the tenant-specific space, the cache 840 may help make execution of the models efficient and relatively fast. According to some embodiments, tenant configuration management services 894 may receive information from cloud service brokers 892 and store information into a tenant configuration database 896.
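As a hedged illustration of passing an output of a first analytic to a second analytic through the cache via get/put operations, consider the following sketch. The dictionary-backed cache, the tenant key "tenant-a", and both analytic functions are assumptions made for this example only.

```python
cache = {}  # hypothetical in-memory cache keyed by (tenant, key)


def put(tenant, key, value):
    cache[(tenant, key)] = value


def get(tenant, key):
    return cache[(tenant, key)]


def first_analytic(samples):
    """Smooth the samples with a two-point moving average."""
    return [(a + b) / 2 for a, b in zip(samples, samples[1:])]


def second_analytic(smoothed, limit):
    """Return the indexes of smoothed values that exceed the limit."""
    return [i for i, value in enumerate(smoothed) if value > limit]


# The first analytic's output is put into the tenant-specific space...
put("tenant-a", "smoothed", first_analytic([1.0, 3.0, 9.0, 11.0]))
# ...and later fetched as the input of the second analytic.
flags = second_analytic(get("tenant-a", "smoothed"), limit=5.0)
```

Keying entries by tenant keeps one tenant's intermediate results separate from another's within the same in-memory store.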
The embodiments described herein may be implemented using any number of different hardware configurations. For example,
The processor 910 also communicates with a storage device 930. The storage device 930 may comprise any appropriate information storage device, including combinations of magnetic storage devices (e.g., a hard disk drive), optical storage devices, mobile telephones, and/or semiconductor memory devices. The storage device 930 stores a program 912 and/or an orchestration engine 914 for controlling the processor 910. The processor 910 performs instructions of the programs 912, 914, and thereby operates in accordance with any of the embodiments described herein. For example, the processor 910 might receive operating data associated with a set of assets from a set of enterprise system devices. The processor 910 may then process the received operating data. A plurality of computing platforms may execute instance runs of a plurality of analytics, with each instance run being associated with an industrial asset. A distributed cache may be shared by the plurality of analytic instance runs executing on the plurality of computing platforms. The processor 910 may also maintain an overall count value that represents a number of analytic instance runs currently utilizing the distributed cache. Note that the distributed cache may be emptied by the processor 910 when the overall count value indicates that no analytic instance runs are still utilizing the distributed cache.
The programs 912, 914 may be stored in a compressed, uncompiled and/or encrypted format. The programs 912, 914 may furthermore include other program elements, such as an operating system, a database management system, and/or device drivers used by the processor 910 to interface with peripheral devices.
As used herein, information may be “received” by or “transmitted” to, for example: (i) the apparatus 900 from another device; or (ii) a software application or module within the apparatus 900 from another software application, module, or any other source.
As shown in
The cache update identifier 1002 might be a unique alphanumeric code identifying an update to a distributed cache (and, in particular, to an overall count value maintained for the cache). The time 1004 and the analytic instance run identifier 1006 might indicate, for example, when a particular analytic instance run finished using the distributed cache identifier. The prior overall count value 1008 (which could, in some cases, have a value of hundreds of thousands) may then be adjusted by the overall count value change 1010 resulting in the new overall count value 1012. When the new overall count value 1012 equals zero, all analytic instance runs have finished using the distributed cache (as illustrated by the third entry in
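The relationship among the prior overall count value, the overall count value change, and the new overall count value might be expressed as in the following sketch. The record layout mirrors the fields named above, but the class name CacheUpdate, the identifier formats, and every data value are invented for illustration and do not reproduce any entry from the drawings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheUpdate:
    """Hypothetical record of one update to the overall count value."""
    cache_update_id: str
    time: str
    analytic_instance_run_id: str
    prior_overall_count: int
    overall_count_change: int

    @property
    def new_overall_count(self) -> int:
        # New value is the prior value adjusted by the change.
        return self.prior_overall_count + self.overall_count_change

    @property
    def cache_may_be_emptied(self) -> bool:
        # Zero means no analytic instance runs are still using the cache.
        return self.new_overall_count == 0


# Illustrative entries only (made-up values).
updates = [
    CacheUpdate("CU_1", "10:00:00", "AIR_1", 1, +1),
    CacheUpdate("CU_2", "10:00:05", "AIR_1", 2, -1),
    CacheUpdate("CU_3", "10:00:09", "AIR_2", 1, -1),
]
new_counts = [u.new_overall_count for u in updates]
cleanup_flags = [u.cache_may_be_emptied for u in updates]
```

Deriving the new count from the stored prior value and change, rather than storing it separately, keeps the three fields from drifting out of agreement in this sketch.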
Thus, some embodiments described herein may automatically facilitate distributed cache cleanup for analytic instance runs in an efficient and accurate manner. Moreover, such an approach may increase asset utilization with predictive analytics, improving performance and efficiency that can result in lower repair costs. Moreover, embodiments may achieve new levels of performance, reliability, and availability throughout the life cycle of an industrial asset.
The following illustrates various additional embodiments of the invention. These do not constitute a definition of all possible embodiments, and those skilled in the art will understand that the present invention is applicable to many other embodiments. Further, although the following embodiments are briefly described for clarity, those skilled in the art will understand how to make any changes, if necessary, to the above-described apparatus and methods to accommodate these and other embodiments and applications.
Although specific hardware and data configurations have been described herein, note that any number of other configurations may be provided in accordance with embodiments of the present invention (e.g., some of the information associated with the databases and apparatus described herein may be split, combined, and/or handled by external systems). Applicants have discovered that embodiments described herein may be particularly useful in connection with industrial asset management systems, although embodiments may be used in connection with any other type of asset.
While only certain features of the invention have been illustrated and described herein, many modifications and changes will occur to those skilled in the art. It is, therefore, to be understood that the appended claims are intended to cover all such modifications and changes as fall within the true spirit of the invention.
Claims
1. A system to facilitate enterprise analytics, comprising:
- a set of enterprise system devices to collect and transmit operating data associated with a set of assets; and
- a cloud-based services architecture, to receive the operating data from the set of enterprise system devices, including: a plurality of analytics to process the received operating data, a plurality of computing platforms to execute instance runs of a plurality of analytics, each instance run being associated with an industrial asset, a distributed cache to be shared by the plurality of analytic instance runs executing on the plurality of computing platforms, and an orchestration run-time execution engine to: maintain an overall count value that represents a number of analytic instance runs currently utilizing the distributed cache, wherein the distributed cache is emptied when the overall count value indicates that no analytic instance runs are still utilizing the distributed cache.
2. The system of claim 1, wherein said maintaining the overall count value includes:
- incrementing the overall count value when a new analytic instance run utilizes the distributed cache, and
- decrementing the overall count value when an analytic instance run is done using the distributed cache.
3. The system of claim 1, wherein a first analytic instance run utilizing the distributed cache locks the overall count value such that other analytic instance runs cannot update the overall count value until the first analytic instance run is done using the distributed cache.
4. The system of claim 3, wherein the first analytic instance run executes on a first computing platform and implements a distributed lock on the overall count value such that other analytic instance runs executing on other computing platforms cannot update the overall count value until the first analytic instance run is done using the distributed cache.
5. The system of claim 1, wherein the distributed cache is partitioned such that different partitions are associated with different information about analytic instance runs.
6. The system of claim 5, wherein at least one partition is associated with at least one of: (i) asset data, (ii) asset group data, (iii) orchestration step data, and (iv) analytic input output data.
7. The system of claim 1, wherein the distributed cache stores, for run-time look-up during processing, at least one of: (i) data related to orchestration, (ii) asset instance data, (iii) steps, and (iv) metadata.
8. The system of claim 1, wherein the distributed cache comprises an in-memory distributed cache.
9. The system of claim 1, wherein the set of enterprise system devices are associated with at least one of: (i) sensors, (ii) a big data stream, (iii) an industrial asset, (iv) a power plant, (v) a wind farm, (vi) a turbine, (vii) power distribution, (viii) fuel extraction, (ix) healthcare, (x) transportation, (xi) aviation, (xii) manufacturing, and (xiii) water processing.
10. The system of claim 1, wherein the cloud-based services architecture is further associated with at least one of: (i) edge software, (ii) data management, (iii) security, (iv) development operations, and (v) mobile applications.
11. A computer-implemented method to facilitate enterprise analytics, comprising:
- receiving, at a cloud-based services architecture from a set of enterprise system devices, operating data associated with a set of assets;
- processing, by the cloud-based services architecture, the received operating data;
- executing, by a plurality of computing platforms, instance runs of a plurality of analytics, each instance run being associated with an industrial asset;
- sharing a distributed cache by the plurality of analytic instance runs executing on the plurality of computing platforms; and
- maintaining, by an orchestration run-time execution engine, an overall count value that represents a number of analytic instance runs currently utilizing the distributed cache,
- wherein the distributed cache is emptied when the overall count value indicates that no analytic instance runs are still utilizing the distributed cache.
12. The method of claim 11, wherein said maintaining the overall count value includes:
- incrementing the overall count value when a new analytic instance run utilizes the distributed cache; and
- decrementing the overall count value when an analytic instance run is done using the distributed cache.
13. The method of claim 11, wherein a first analytic instance run utilizing the distributed cache locks the overall count value such that other analytic instance runs cannot update the overall count value until the first analytic instance run is done using the distributed cache.
14. The method of claim 13, wherein the first analytic instance run executes on a first computing platform and implements a distributed lock on the overall count value such that other analytic instance runs executing on other computing platforms cannot update the overall count value until the first analytic instance run is done using the distributed cache.
15. The method of claim 11, wherein the distributed cache is partitioned such that different partitions are associated with different information about analytic instance runs.
16. The method of claim 15, wherein at least one partition is associated with at least one of: (i) asset data, (ii) asset group data, (iii) orchestration step data, and (iv) analytic input output data.
17. A non-transitory, computer-readable medium storing instructions that, when executed by a computer processor, cause the computer processor to perform a method to facilitate enterprise analytics, the method comprising:
- receiving, at a cloud-based services architecture from a set of enterprise system devices, operating data associated with a set of assets;
- processing, by the cloud-based services architecture, the received operating data;
- executing, by a plurality of computing platforms, instance runs of a plurality of analytics, each instance run being associated with an industrial asset;
- sharing a distributed cache by the plurality of analytic instance runs executing on the plurality of computing platforms; and
- maintaining, by an orchestration run-time execution engine, an overall count value that represents a number of analytic instance runs currently utilizing the distributed cache,
- wherein the distributed cache is emptied when the overall count value indicates that no analytic instance runs are still utilizing the distributed cache.
18. The medium of claim 17, wherein the distributed cache comprises an in-memory distributed cache.
19. The medium of claim 17, wherein the set of enterprise system devices are associated with at least one of: (i) sensors, (ii) a big data stream, (iii) an industrial asset, (iv) a power plant, (v) a wind farm, (vi) a turbine, (vii) power distribution, (viii) fuel extraction, (ix) healthcare, (x) transportation, (xi) aviation, (xii) manufacturing, and (xiii) water processing.
20. The medium of claim 17, wherein the cloud-based services architecture is further associated with at least one of: (i) edge software, (ii) data management, (iii) security, (iv) development operations, and (v) mobile applications.
Type: Application
Filed: Dec 29, 2016
Publication Date: Jul 5, 2018
Inventors: Tun Chang (San Ramon, CA), Arnab Guin (San Ramon, CA)
Application Number: 15/393,755