METHODS AND APPARATUS TO OFFLOAD AND ONLOAD WORKLOADS IN AN EDGE ENVIRONMENT

Example methods, apparatus, systems, and articles of manufacture to offload and onload workloads in an edge environment are disclosed herein. Further examples and combinations thereof include the following: An example apparatus includes a telemetry controller to determine that a workload is to be offloaded from a first resource to a second resource of a platform, and a scheduler to determine an instance of the workload that is compatible with the second resource, and schedule the workload to continue execution based on an exchange of a workload state from the first resource to the second resource, the workload state indicative of a previous thread executed at the first resource.

Description
RELATED APPLICATION

This patent arises from a continuation of U.S. Provisional Patent Application Ser. No. 62/907,597, which was filed on Sep. 28, 2019, and U.S. Provisional Patent Application Ser. No. 62/939,303, which was filed on Nov. 22, 2019. U.S. Provisional Patent Application Ser. No. 62/907,597 and U.S. Provisional Patent Application Ser. No. 62/939,303 are hereby incorporated herein by reference in their entireties. Priority to U.S. Provisional Patent Application Ser. No. 62/907,597 and U.S. Provisional Patent Application Ser. No. 62/939,303 is hereby claimed.

FIELD OF THE DISCLOSURE

This disclosure relates generally to edge environments, and, more particularly, to methods and apparatus to offload and onload workloads in an edge environment.

BACKGROUND

Edge environments (e.g., an Edge, a network edge, Fog computing, multi-access edge computing (MEC), or Internet of Things (IoT) network) enable a workload execution (e.g., an execution of one or more computing tasks, an execution of a machine learning model using input data, etc.) closer to or near endpoint devices that request an execution of the workload. Edge environments may include infrastructure (e.g., network infrastructure), such as an edge service, that is connected to cloud infrastructure, endpoint devices, or additional edge infrastructure via networks such as the Internet. Edge services may be closer in proximity to endpoint devices than cloud infrastructure, such as centralized servers.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 depicts an example environment including an example cloud environment, an example edge environment, an example endpoint environment, and example edge services to offload and onload an example workload.

FIG. 2 depicts an example edge service of FIG. 1 to register an example edge platform with the edge environment of FIG. 1.

FIG. 3 depicts an example edge platform of FIG. 1 offloading and onloading a workload to example resource(s) of the example edge platform.

FIG. 4 is a flowchart representative of machine readable instructions which may be executed to implement the example edge service and edge platform of FIGS. 1 and 2 to register the example edge platform with the example edge service.

FIG. 5 is a flowchart representative of machine readable instructions which may be executed to implement the example edge service and the example edge platform of FIG. 1 to offload and onload a workload.

FIG. 6 is a flowchart representative of machine readable instructions which may be executed to implement an example telemetry controller of FIG. 1 to determine a resource to which to offload and/or onload the workload.

FIG. 7 is a block diagram of an example processing platform structured to execute the instructions of FIGS. 4, 5, and/or 6 to implement the example edge service and the example edge platform of FIG. 1.

The figures are not to scale. In general, the same reference numbers will be used throughout the drawing(s) and accompanying written description to refer to the same or like parts.

DETAILED DESCRIPTION

Descriptors “first,” “second,” “third,” etc. are used herein when identifying multiple elements or components which may be referred to separately. Unless otherwise specified or understood based on their context of use, such descriptors are not intended to impute any meaning of priority, physical order or arrangement in a list, or ordering in time but are merely used as labels for referring to multiple elements or components separately for ease of understanding the disclosed examples. In some examples, the descriptor “first” may be used to refer to an element in the detailed description, while the same element may be referred to in a claim with a different descriptor such as “second” or “third.” In such instances, it should be understood that such descriptors are used merely for ease of referencing multiple elements or components.

Edge computing, at a general level, refers to the transition of compute, network and storage resources closer to endpoint devices (e.g., consumer computing devices, user equipment, etc.) in order to optimize total cost of ownership, reduce application latency, reduce network backhaul traffic, improve service capabilities, and improve compliance with data privacy or security requirements. Edge computing may, in some scenarios, provide a cloud-like distributed service that offers orchestration and management for applications among many types of storage and compute resources. As a result, some implementations of edge computing have been referred to as the “edge cloud” or the “fog,” as powerful computing resources previously available only in large remote data centers are moved closer to endpoints and made available for use by consumers at the “edge” of the network.

Edge computing use cases in mobile network settings have been developed for integration with multi-access edge computing (MEC) approaches, also known as “mobile edge computing.” MEC approaches are designed to allow application developers and content providers to access computing capabilities and an information technology (IT) service environment in dynamic mobile network settings at the edge of the network. Limited standards have been developed by the European Telecommunications Standards Institute (ETSI) industry specification group (ISG) in an attempt to define common interfaces for operation of MEC systems, platforms, hosts, services, and applications.

Edge computing, MEC, and related technologies attempt to provide reduced latency, increased responsiveness, and more available computing power and storage than offered in traditional cloud network services and wide area network connections. However, the integration of mobility and dynamically launched services for some mobile use and device processing use cases has led to limitations and concerns with orchestration, functional coordination, and resource management, especially in complex mobility settings where many participants (e.g., devices, hosts, tenants, service providers, operators, etc.) are involved.

In a similar manner, Internet of Things (IoT) networks and devices are designed to offer a distributed compute arrangement from a variety of endpoints. IoT devices can be physical or virtualized objects that may communicate on a network, and can include sensors, actuators, and other input/output components, which may be used to collect data or perform actions in a real-world environment. For example, IoT devices can include low-powered endpoint devices that are embedded or attached to everyday things, such as buildings, vehicles, packages, etc., to provide an additional level of artificial sensory perception of those things. In recent years, IoT devices have become more popular and thus applications using these devices have proliferated.

In some examples, an edge environment can include an enterprise edge in which communication with and/or communication within the enterprise edge can be facilitated via wireless and/or wired connectivity. The deployment of various Edge, Fog, MEC, and IoT networks, devices, and services has introduced a number of advanced use cases and scenarios occurring at and towards the edge of the network. However, these advanced use cases have also introduced a number of corresponding technical challenges relating to security, processing, storage, and network resources, service availability, and efficiency, among many other issues. One such challenge is in relation to Edge, Fog, MEC, and IoT networks, devices, and services executing workloads on behalf of endpoint devices.

The present techniques and configurations may be utilized in connection with many aspects of current networking systems, but are provided with reference to Edge Cloud, IoT, Multi-access Edge Computing (MEC), and other distributed computing deployments. The following systems and techniques may be implemented in, or augment, a variety of distributed, virtualized, or managed edge computing systems. These include environments in which network services are implemented or managed using multi-access edge computing (MEC), fourth generation (4G), fifth generation (5G), or Wi-Fi wireless network configurations; or in wired network configurations involving fiber, copper, and other connections. Further, aspects of processing by the respective computing components may involve computational elements which are in geographical proximity of a user equipment or other endpoint locations, such as a smartphone, vehicular communication component, IoT device, etc. Further, the presently disclosed techniques may relate to other Edge/MEC/IoT network communication standards and configurations, and other intermediate processing entities and architectures.

Edge computing is a developing paradigm where computing is performed at or closer to the "edge" of a network, typically through the use of a computing platform implemented at base stations, gateways, network routers, or other devices which are much closer to endpoint devices producing and/or consuming the data. For example, edge gateway servers may be equipped with pools of compute, accelerator, memory, and storage resources to perform computation in real-time for low latency use-cases (e.g., autonomous driving or video surveillance) for connected client devices. Or as an example, base stations may be augmented with compute and acceleration resources to directly process service workloads for connected user equipment, without further communicating data via backhaul networks. Or as another example, central office network management hardware may be replaced with computing hardware that performs virtualized network functions and offers compute resources for the execution of services and consumer functions for connected devices.

Edge environments include networks and/or portions of networks that are located between a cloud environment and an endpoint environment. Edge environments enable computations of workloads at edges of a network. For example, an endpoint device may request a nearby base station to compute a workload rather than a central server in a cloud environment. Edge environments include edge services, which include pools of memory, storage resources, and processing resources. Edge services perform computations, such as an execution of a workload, on behalf of other edge services and/or edge nodes. Edge environments facilitate connections between producers (e.g., workload executors, edge services) and consumers (e.g., other edge services, endpoint devices).

Because edge services may be closer in proximity to endpoint devices than centralized servers in cloud environments, edge services enable computations of workloads with a lower latency (e.g., response time) than cloud environments. Edge services may also enable a localized execution of a workload based on geographic locations or network topologies. For example, an endpoint device may require a workload to be executed in a first geographic area, but a centralized server may be located in a second geographic area. The endpoint device can request a workload execution by an edge service located in the first geographic area to comply with corporate or regulatory restrictions.

Examples of workloads to be executed in an edge environment include autonomous driving computations, video surveillance monitoring, machine learning model executions, and real time data analytics. Additional examples of workloads include delivering and/or encoding media streams, measuring advertisement impression rates, object detection in media streams, cloud gaming, speech analytics, asset and/or inventory management, and augmented reality processing.

Edge services enable both the execution of workloads and a return of a result of an executed workload to endpoint devices with a response time lower than the response time of a server in a cloud environment. For example, if an edge service is located closer to an endpoint device on a network than a cloud server, the edge service may respond to workload execution requests from the endpoint device faster than the cloud server. An endpoint device may request an execution of a time-constrained workload which will be served from an edge service rather than a cloud server.

In addition, edge services enable the distribution and decentralization of workload executions. For example, an endpoint device may request a first workload execution and a second workload execution. In some examples, a cloud server may respond to both workload execution requests. With an edge environment, however, a first edge service may execute the first workload execution request, and a second edge service may execute the second workload execution request.

To meet the low-latency and high-bandwidth demands of endpoint devices, an edge service is operated on the basis of timely information about the utilization of many resources (e.g., hardware resources, software resources, virtual hardware and/or software resources, etc.), and the efficiency with which those resources are able to meet the demands placed on them. Such timely information is generally referred to as telemetry (e.g., telemetry data, telemetry information, etc.).

Telemetry can be generated from a plurality of sources including each hardware component or portion thereof, virtual machines (VMs), processes or containers, operating systems (OSes), applications, and orchestrators. Telemetry can be used by orchestrators, schedulers, etc., to determine a quantity and/or type of computation tasks to be scheduled for execution at which resource or portion(s) thereof, and an expected time to completion of such computation tasks based on historical and/or current (e.g., instant or near-instant) telemetry. For example, a core of a multi-core central processing unit (CPU) can generate over a thousand different varieties of information every fraction of a second using a performance monitoring unit (PMU) sampling the core and/or, more generally, the multi-core CPU. Periodically aggregating and processing all such telemetry in a given edge platform, edge service, etc., can be an arduous and cumbersome process. Prioritizing salient features of interest and extracting such salient features from telemetry to identify current or future problems, stressors, etc., associated with a resource is difficult. Furthermore, identifying a different resource to offload workloads from the burdened resource is a complex undertaking.
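For purposes of illustration only, the following Python sketch shows one way a salient feature, sustained high utilization, might be extracted from a rolling telemetry window to identify burdened resources. The class, names, and threshold are hypothetical assumptions for explanatory purposes, not a limitation of the examples disclosed herein.

from collections import deque

class TelemetryAggregator:
    """Keep a rolling window of utilization samples per resource and flag
    resources whose mean utilization exceeds a threshold (hypothetical)."""

    def __init__(self, window=100, threshold=0.9):
        self.threshold = threshold
        self.window = window
        self.samples = {}  # resource id -> recent utilization samples

    def record(self, resource_id, utilization):
        self.samples.setdefault(
            resource_id, deque(maxlen=self.window)).append(utilization)

    def offload_candidates(self):
        # Resources whose windowed mean utilization exceeds the threshold
        # are candidates for having workloads offloaded elsewhere.
        return [rid for rid, s in self.samples.items()
                if sum(s) / len(s) > self.threshold]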

Some edge environments desire to obtain capability data (e.g., telemetry data) associated with resources executing a variety of functions or services, such as data processing or video analytics functions (e.g., machine vision, image processing for autonomous vehicles, facial recognition detection, visual object detection, etc.).

In such an edge environment, services (e.g., orchestration services) may be provided on the basis of the capability data. For example, an edge environment includes different edge platforms (e.g., Edge-as-a-Service, edge devices, etc.) that may have different capabilities (e.g., computational capabilities, graphic processing capabilities, reconfigurable hardware function capabilities, networking capabilities, storage, etc.). The different edge platform capabilities are determined by the capability data and may depend on 1) the location of the edge platforms (e.g., the edge platform location at the edge network) and 2) the edge platform resource(s) (e.g., hardware resources, software resources, virtual hardware and/or software resources, etc. that include the physical and/or virtual capacity for memory, storage, power, etc.).

In some examples, the edge environment may be unaware of the edge platform capabilities due to the edge environment not having distributed monitoring software or hardware solutions, or a combination thereof, that are capable of monitoring highly-granular stateless functions that are executed on the edge platform (e.g., a resource platform, a hardware platform, a software platform, a virtualized platform, etc.). In such an example, conventional edge environments may be configured to statically orchestrate (e.g., offload) a full computing task to one of the edge platform's resources (e.g., a general purpose processor or an accelerator). Such static orchestration methods prevent the edge platform from being optimized for various properties subject to the load the computing task places on the edge platform. For example, the computing task may not meet tenant requirements (e.g., load requirements, requests, performance requirements, etc.) due to not having access to capability data. Conventional methods may offload the computing task to a single processor or accelerator, rather than splitting up the computing task among the resource(s) of the edge platform. As a result, resources of the edge platform that become dynamically available, or that can be dynamically reprogrammed to perform different functions at different times, are difficult to utilize with conventional static orchestration methods. Therefore, conventional methods do not utilize the edge platform to its maximum potential (e.g., not all the available resources are utilized to complete the computing task).

Additionally, the edge environment may operate on the basis of tenant (e.g., user, developer, etc.) requirements. Tenant requirements are desired and/or necessary conditions, determined by the tenant, that the edge platform is to meet when providing orchestration services. For example, tenant requirements may be represented as policies that determine whether the edge platform is to optimize for latency, power consumption, or CPU cycles, limit movement of workload data inside the edge platform, limit CPU temperature, and/or any other desired condition the tenant requests to be met. In some examples, it may be difficult for the edge service to manage the policies set forth by tenants with static orchestration capabilities. For example, the edge service may require the use of more than one edge platform to complete a computing task in order to meet the tenant requirements or may perform tradeoffs in order to meet the tenant requirements.

In conventional edge environments, acceleration is typically applied within local machines with fixed function-to-accelerator mappings. For example, with conventional approaches, a certain service workload (e.g., an edge computing workload) may invoke a certain accelerator; when this workload needs to migrate or transition to another location (based on changes in network access, to meet new latency conditions, or otherwise), the workload may need to be re-parsed and potentially re-analyzed.

In conventional edge environments, application developers who deliver and/or automate applications (e.g., computing tasks) to an edge environment are required to develop applications to meet the requirements and definitions of the edge environment. The task of delivering and/or automating such applications is arduous and burdensome because the developer is required to have the underlying knowledge of one or more edge platforms (e.g., the capabilities of the edge platform). For example, developers may attempt to fragment software solutions and services (e.g., computing tasks) in an effort to compose a service that may execute in one way on one resource while remaining migratable to a different resource. In such an example, composing such a service becomes difficult when there are multiple acceleration resources within an edge platform.

Examples disclosed herein improve distribution of computing tasks to resources of edge platforms based on an edge service that is distributed across multiple edge platforms. In examples disclosed herein, the edge service includes features that determine capability data, register applications, computer programs, etc., and register edge platforms with the edge service, and schedule workload execution and distribution to resources of the edge platforms. Such edge service features enable the coordination of different acceleration functions on different hosts (e.g., edge computing nodes). Rather than treating acceleration workloads in serial, the edge platform utilizes a parallel distribution approach to “divide and conquer” the workload and the function operations. This parallel distribution approach may be applied during use of the same or multiple forms of acceleration hardware (e.g., FPGAs, GPU arrays, AI accelerators) and the same or multiple types of workloads and invoked functions.

Examples disclosed herein enable late binding of workloads by generating one or more instances of the workload based on the capability data. As used herein, late binding, dynamic binding, dynamic linkage, etc., refers to a method in which workloads of an application are looked up at runtime by the target system (e.g., intended hardware and/or software resource). For example, late binding does not fix the arguments (e.g., variables, expressions, etc.) of the program to a resource during compilation time. Instead, late binding enables the application to be modified until execution.
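By way of illustration, the following Python sketch shows one way such a late-binding lookup might be expressed: resource-specific implementations are registered, and the binding to a concrete implementation occurs at run time, once the target resource is known. The registry, decorator, and function names are hypothetical assumptions, not part of the disclosed apparatus.

WORKLOAD_REGISTRY = {}

def implementation(workload, resource_type):
    """Register a resource-specific implementation of a workload."""
    def wrap(fn):
        WORKLOAD_REGISTRY[(workload, resource_type)] = fn
        return fn
    return wrap

@implementation("object_detection", "cpu")
def detect_on_cpu(frame):
    return "cpu result"

@implementation("object_detection", "fpga")
def detect_on_fpga(frame):
    return "fpga result"

def dispatch(workload, resource_type, *args):
    # Late binding: the implementation is looked up here, at run time,
    # rather than being fixed to a resource at compilation time.
    return WORKLOAD_REGISTRY[(workload, resource_type)](*args)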

In examples disclosed herein, when the edge platform(s) are registered with the edge service, capability discovery is enabled. For example, the edge service and/or the edge platforms can determine the capability information of the edge platforms' resources. In this example, the edge service enables an aggregation of telemetry data corresponding to the edge platforms' telemetry to generate capability data. Examples disclosed herein utilize the capability data to determine applications or workloads of an application to be distributed to the edge platform for processing. For example, the capability data informs the edge service of available resources that can be utilized to execute a workload. In this manner, the edge service can determine whether the workload will be fully optimized by the edge platform.

Examples disclosed herein integrate different edge platform resources (e.g., heterogeneous hardware, acceleration-driven computational capabilities, etc.) into an execution of an application or an application workload. For example, applications or services executing in an edge environment are no longer being distributed as monolithic preassembled units. Instead, applications or services are being distributed as collections of subunits (e.g., microservices, edge computing workloads, etc.) that can be integrated (e.g., into an application) according to a specification referred to as an assembly and/or composition graph. Before actual execution of the application or service, examples disclosed herein process the composition graph, such that different subunits of the application or service may use different edge platform resources (e.g., integrate different edge platform resources for application or service execution). During processing, the application or service is subject to at least three different groups of conditions evaluated at run time. The three groups of conditions are (a) the service objectives or orchestration objectives, (b) availabilities or utilizations of different resources, and (c) capabilities of different resources. In some examples, the subunits can be integrated in different forms (e.g., one form or implementation for CPUs, a different form or implementation for an FPGA, etc.) just-in-time and without manual intervention because these three conditions (e.g., a, b, and c) can be evaluated computationally at the very last moment before execution.

Further, there may be more conditions than just a, b, and c described above. For example, security requirements in a given edge infrastructure may be less or more stringent according to whether an application or service runs on an attackable component (e.g., a software module) or one that is not attackable (e.g., an FPGA, an ASIC, etc.). Similarly, some tenants may be restricted to certain types of edge platform resources according to business or metering-and-charging agreements. Thus security and business policies may also be at play in determining the dynamic integration.
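For purposes of illustration only, the following Python sketch evaluates conditions (a), (b), and (c), together with the security and business policies noted above, at the last moment before execution. The field names and the latency-based selection criterion are hypothetical assumptions, not limitations of the examples disclosed herein.

def select_resource(subunit, resources, objectives):
    """Pick a resource for a subunit just before execution."""
    candidates = [
        r for r in resources
        if subunit["kind"] in r["capabilities"]                    # (c) capability
        and r["utilization"] < objectives["max_utilization"]       # (b) availability
        and r["tenant_permitted"]                                  # business policy
        and (r["hardware_isolated"] or not subunit["needs_isolation"])  # security
    ]
    # (a) service/orchestration objective: minimize estimated latency
    return min(candidates, key=lambda r: r["est_latency_ms"], default=None)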

Examples disclosed herein act to integrate different edge platform resources based on the capabilities of those resources. For example, fully and partially reconfigurable gate arrays (e.g., variations of FPGAs) are trending toward reconfigurability (e.g., re-imaging, the process of removing software on a computer and reinstalling the software) on a scale of fractions of a millisecond. In such an example, the high speeds provided by the hardware accelerated functions (e.g., reconfigurability functions for FPGAs) make just-in-time offloading of functions more viable. For example, just-in-time offloading includes allocating edge computing workloads from general purpose processing units to accelerators. The offloading of edge computing workloads from one resource to another optimizes latency, data movement, and power consumption of the edge platform, which in turn boosts the overall density of edge computing workloads that may be accommodated by the edge platform.

In some examples, it may be advantageous to perform just-in-time onloading of edge computing workloads. For example, by utilizing telemetry data, edge computing workloads executing at an accelerator resource can be determined to be less important based on Quality of Service (QoS), energy consumption, etc. In such an example, the edge computing workload may be onloaded from the accelerator onto the general purpose processing unit.
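A minimal sketch of such a just-in-time offload/onload decision follows (Python; the QoS labels, thresholds, and fields are illustrative assumptions only).

from dataclasses import dataclass

@dataclass
class Resource:
    name: str
    utilization: float        # 0.0 to 1.0, from telemetry
    reconfig_ms: float = 0.0  # e.g., FPGA re-imaging time, if applicable

def place(workload_qos, latency_budget_ms, cpu, accelerator):
    # Offload: latency-critical work goes to the accelerator when it can
    # be reconfigured within the workload's latency budget.
    if workload_qos == "high" and accelerator.reconfig_ms < latency_budget_ms:
        return accelerator
    # Onload: less important work returns to the general purpose CPU when
    # the CPU has headroom, freeing the accelerator.
    if workload_qos == "low" and cpu.utilization < 0.5:
        return cpu
    return cpu if workload_qos == "low" else accelerator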

FIG. 1 depicts an example environment (e.g., a computing environment) 100 including an example cloud environment 105, an example edge environment 110, and an example endpoint environment 115 to schedule, distribute, and/or execute a workload (e.g., one or more computing or processing tasks). In FIG. 1, the cloud environment 105 is an edge cloud environment. For example, the cloud environment 105 may include any suitable number of edge clouds. Alternatively, the cloud environment 105 may include any suitable backend components in a data center, cloud infrastructure, etc. In FIG. 1, the cloud environment 105 includes a first example server 112, a second example server 114, a third example server 116, a first instance of an example edge service 130A, and an example database (e.g., a cloud database, a cloud environment database, etc.) 135. Alternatively, the cloud environment 105 may include fewer or more servers than the servers 112, 114, 116 depicted in FIG. 1. The servers 112, 114, 116 can execute centralized applications (e.g., website hosting, data management, machine learning model applications, responding to requests from client devices, etc.).

In the illustrated example of FIG. 1, the edge service 130A facilitates the generation and/or retrieval of example capability data 136A-C and policy data 138A-C associated with at least one of the cloud environment 105, the edge environment 110, or the endpoint environment 115. In FIG. 1, the database 135 stores the policy data 138A-C, the capability data 136A-C and example executables 137, 139 including at least a first example executable 137 and a second example executable 139. Alternatively, the database 135 may include fewer or more executables than the first executable 137 and the second executable 139. For example, the executables 137, 139 can be capability executables that, when executed, can generate the capability data 136A-C.

In the illustrated example of FIG. 1, the capability data 136A-C includes first example capability data 136A, second example capability data 136B, and third example capability data 136C. In FIG. 1, the first capability data 136A and the second capability data 136B can be generated by the edge environment 110. In FIG. 1, the third capability data 136C can be generated by one or more of the servers 112, 114, 116, the database 135, etc., and/or, more generally, the cloud environment 105.

In the illustrated example of FIG. 1, the policy data 138A-C includes first example policy data 138A, second example policy data 138B, and third example policy data 138C. In FIG. 1, the first policy data 138A and the second policy data 138B can be retrieved by the edge environment 110. In FIG. 1, the third policy data 138C can be retrieved by one or more of the servers 112, 114, 116, the database 135, etc., and/or, more generally, the cloud environment 105.

In the illustrated example of FIG. 1, the cloud environment 105 includes the database 135 to record data (e.g., the capability data 136A-C, the executables 137, 139, the policy data 138A-C, etc.). In some examples, the database 135 stores information including tenant requests, tenant requirements, database records, website requests, machine learning models, and results of executing machine learning models. The database 135 can be implemented by a volatile memory (e.g., a Synchronous Dynamic Random Access Memory (SDRAM), Dynamic Random Access Memory (DRAM), RAMBUS Dynamic Random Access Memory (RDRAM), etc.) and/or a non-volatile memory (e.g., flash memory). The database 135 can additionally or alternatively be implemented by one or more double data rate (DDR) memories, such as DDR, DDR2, DDR3, DDR4, mobile DDR (mDDR), etc. The database 135 can additionally or alternatively be implemented by one or more mass storage devices such as hard disk drive(s), compact disk drive(s), digital versatile disk drive(s), solid-state disk drive(s), etc. While in the illustrated example the database 135 is illustrated as a single database, the database 135 can be implemented by any number and/or type(s) of databases. Furthermore, the data stored in the database 135 can be in any data format such as, for example, binary data, comma delimited data, tab delimited data, structured query language (SQL) structures, etc.

In the illustrated example of FIG. 1, the servers 112, 114, 116 communicate to devices in the edge environment 110 and/or the endpoint environment 115 via a network such as the Internet. Likewise, the database 135 can provide and/or store data records in response to requests from devices in the cloud environment 105, the edge environment 110, and/or the endpoint environment 115.

In the illustrated example of FIG. 1, the edge environment 110 includes a first example edge platform 140 and a second example edge platform 150. In FIG. 1, the edge platforms 140, 150 are edge-computing platforms or platform services. For example, the edge platforms 140, 150 can include hardware and/or software resources, virtualizations of the hardware and/or software resources, containerization of virtualized or non-virtualized hardware and software resources, etc., and/or a combination thereof. In such examples, the edge platforms 140, 150 can execute a workload obtained from the database 135, an edge, or an endpoint device as illustrated in the example of FIG. 1.

In the illustrated example of FIG. 1, the first edge platform 140 is in communication with a second instance of the edge service 130B and includes a first example interface 131, a first example orchestrator 142, a first example scheduler 144, a first example capability controller 146, a first example edge service (ES) database 148, first example resource(s) 149, a first example telemetry controller 152, and a first example security controller 154. In FIG. 1, the first example interface 131, the first executable 137, the first example orchestrator 142, the first example scheduler 144, the first example capability controller 146, the first example edge service (ES) database 148, the first example resource(s) 149, the first example telemetry controller 152, and the first example security controller 154 are connected via a first example network communication interface 141. In FIG. 1, the first capability controller 146 includes the first executable 137 and/or otherwise implements the first executable 137. Alternatively, the first executable 137 may not be included in the first capability controller 146. For example, the first executable 137 can be provided to and/or otherwise accessed by the first edge platform 140 as a service (e.g., Function-as-a-Service (FaaS), Software-as-a-Service (SaaS), etc.). In such examples, the executable 137 can be hosted by one or more of the servers 112, 114, 116. In FIG. 1, the first ES database 148 includes the first capability data 136A and the first policy data 138A.

In the illustrated example of FIG. 1, the second edge platform 150 is in communication with a third instance of the edge service 130C and includes the second executable 139, a second example interface 132, a second example orchestrator 156, a second example scheduler 158, a second example capability controller 160, a second example edge service (ES) database 159, second example resource(s) 162, a second example telemetry controller 164, and a second example security controller 166. The second example interface 132, the second executable 139, the second example orchestrator 156, the second example scheduler 158, the second example capability controller 160, the second example edge service (ES) database 159, the second example resource(s) 162, the second example telemetry controller 164, and the second example security controller 166 are connected via a second example network communication interface 151. In FIG. 1, the second capability controller 160 includes and/or otherwise implements the second executable 139. Alternatively, the second executable 139 may not be included in the second capability controller 160. For example, the second executable 139 can be provided to and/or otherwise accessed by the second edge platform 150 as a service (e.g., FaaS, SaaS, etc.). In such examples, the second executable 139 can be hosted by one or more of the servers 112, 114, 116. In FIG. 1, the second ES database 159 includes the second capability data 136B and the second policy data 138B.

In the illustrated example of FIG. 1, the edge platforms 140, 150 include the first interface 131 and the second interface 132 to interface the edge platforms 140, 150 with the example edge services 130B-C. For example, the example edge services 130B-C are in communication with the example edge platforms 140, 150 via the example interfaces 131, 132. The edge platforms 140, 150 use the interfaces 131, 132 to communicate with one or more edge services (e.g., the edge services 130A-C), one or more edge platforms, one or more endpoint devices (e.g., the endpoint devices 170, 175, 180, 185, 190, 195), one or more servers (e.g., the servers 112, 114, 116), and/or, more generally, the example cloud environment 105, the example edge environment 110, and the example endpoint environment 115. In some examples, the interfaces 131, 132 may be hardware (e.g., a NIC, a network switch, a Bluetooth router, etc.), software (e.g., an API), or a combination of hardware and software.

In the illustrated example of FIG. 1, the edge platforms 140, 150 include the ES databases 148, 159 to record data (e.g., the first capability data 136A, the second capability data 136B, the first policy data 138A, the second policy data 138B, etc.). The ES databases 148, 159 can be implemented by a volatile memory (e.g., a SDRAM, DRAM, RDRAM, etc.) and/or a non-volatile memory (e.g., flash memory). The ES databases 148, 159 can additionally or alternatively be implemented by one or more DDR memories, such as DDR, DDR2, DDR3, DDR4, mDDR, etc. The ES databases 148, 159 can additionally or alternatively be implemented by one or more mass storage devices such as hard disk drive(s), compact disk drive(s), digital versatile disk drive(s), solid-state disk drive(s), etc. While in the illustrated example the ES databases 148, 159 are illustrated as single databases, the ES databases 148, 159 can be implemented by any number and/or type(s) of databases. Furthermore, the data stored in the ES databases 148, 159 can be in any data format such as, for example, binary data, comma delimited data, tab delimited data, SQL structures, etc.

In the example illustrated in FIG. 1, the first orchestrator 142, the first scheduler 144, the first capability controller 146, the first resource(s) 149, the first telemetry controller 152, and the first security controller 154 are included in, correspond to, and/or otherwise is/are representative of the first edge platform 140. However, in some examples, one or more of the edge service 130B, the first orchestrator 142, the first scheduler 144, the first capability controller 146, the first resource(s) 149, the first telemetry controller 152, and the first security controller 154 can be included in the edge environment 110 rather than be included in the first edge platform 140. For example, the first orchestrator 142 can be connected to the cloud environment 105 and/or the endpoint environment 115 while being outside of the first edge platform 140. In other examples, one or more of the edge service 130B, the first orchestrator 142, the first scheduler 144, the first capability controller 146, the first resource(s) 149, the first telemetry controller 152, and/or the first security controller 154 is/are separate devices included in the edge environment 110. Further, one or more of the edge service 130B, the first orchestrator 142, the first scheduler 144, the first capability controller 146, the first resource(s) 149, the first telemetry controller 152, and/or the first security controller 154 can be included in the cloud environment 105 or the endpoint environment 115. For example, the first orchestrator 142 can be included in the endpoint environment 115, or the first capability controller 146 can be included in the first server 112 in the cloud environment 105. In some examples, the first scheduler 144 can be included in and/or otherwise integrated or combined with the first orchestrator 142.

In the example illustrated in FIG. 1, the second orchestrator 156, the second scheduler 158, the second capability controller 160, the second resource(s) 162, the second telemetry controller 164, and the second security controller 166 are included in, correspond to, and/or otherwise is/are representative of the second edge platform 150. However, in some examples, one or more of the edge service 130C, the second orchestrator 156, the second scheduler 158, the second capability controller 160, the second resource(s) 162, the second telemetry controller 164, and the second security controller 166 can be included in the edge environment 110 rather than be included in the second edge platform 150. For example, the second orchestrator 156 can be connected to the cloud environment 105 and/or the endpoint environment 115 while being outside of the second edge platform 150. In other examples, one or more of the edge service 130C, the second orchestrator 156, the second scheduler 158, the second capability controller 160, the second resource(s) 162, the second telemetry controller 164, and/or the second security controller 166 is/are separate devices included in the edge environment 110. Further, one or more of the edge service 130C, the second orchestrator 156, the second scheduler 158, the second capability controller 160, the second resource(s) 162, the second telemetry controller 164, and/or the second security controller 166 can be included in the cloud environment 105 or the endpoint environment 115. For example, the second orchestrator 156 can be included in the endpoint environment 115, or the second capability controller 160 can be included in the first server 112 in the cloud environment 105. In some examples, the second scheduler 158 can be included in and/or otherwise integrated or combined with the second orchestrator 156.

In the illustrated example of FIG. 1, the resources 149, 162 are invoked to execute a workload (e.g., an edge computing workload) obtained from the endpoint environment 115. For example, the resources 149, 162 can correspond to and/or otherwise be representative of an edge node, such as processing, storage, networking capabilities, or portion(s) thereof. For example, the executable 137, 139, the capability controller 146, 160, the orchestrator 142, 156, the scheduler 144, 158, the telemetry controller 152, 164, the security controller 154, 166 and/or, more generally, the edge platform 140, 150 can invoke a respective one of the resources 149, 162 to execute one or more edge-computing workloads.

In some examples, the resources 149, 162 are representative of hardware resources, virtualizations of the hardware resources, software resources, virtualizations of the software resources, etc., and/or a combination thereof. For example, the resources 149, 162 can include, correspond to, and/or otherwise be representative of one or more CPUs (e.g., multi-core CPUs), one or more FPGAs, one or more GPUs, one or more dedicated accelerators for security or machine learning (ML), one or more network interface cards (NICs), one or more vision processing units (VPUs), etc., and/or any other type of hardware or hardware accelerator. In such examples, the resources 149, 162 can include, correspond to, and/or otherwise be representative of virtualization(s) of the one or more CPUs, the one or more FPGAs, the one or more GPUs, the one or more NICs, etc. In other examples, the edge services 130B, 130C, the orchestrators 142, 156, the schedulers 144, 158, the resources 149, 162, the telemetry controllers 152, 164, the security controllers 154, 166, and/or, more generally, the edge platform 140, 150, can include, correspond to, and/or otherwise be representative of one or more software resources, virtualizations of the software resources, etc., such as hypervisors, load balancers, OSes, VMs, etc., and/or a combination thereof.

In the illustrated example of FIG. 1, the edge platforms 140, 150 are connected to and/or otherwise in communication with each other and to the servers 112, 114, 116 in the cloud environment 105. The edge platforms 140, 150 can execute workloads on behalf of devices associated with the cloud environment 105, the edge environment 110, or the endpoint environment 115. The edge platforms 140, 150 can be connected to and/or otherwise in communication with devices in the environments 105, 110, 115 (e.g., the first server 112, the database 135, etc.) via a network such as the Internet. Additionally or alternatively, the edge platforms 140, 150 can communicate with devices in the environments 105, 110, 115 using any suitable wireless network including, for example, one or more wireless local area networks (WLANs), one or more cellular networks, one or more peer-to-peer networks (e.g., a Bluetooth network, a Wi-Fi Direct network, a vehicle-to-everything (V2X) network, etc.), one or more private networks, one or more public networks, etc. For example, the edge platforms 140, 150 can be connected to a cell tower included in the cloud environment 105 and connected to the first server 112 via the cell tower.

In the illustrated example of FIG. 1, the endpoint environment 115 includes a first example endpoint device 170, a second example endpoint device 175, a third example endpoint device 180, a fourth example endpoint device 185, a fifth example endpoint device 190, and a sixth example endpoint device 195. Alternatively, there may be fewer or more than the endpoint devices 170, 175, 180, 185, 190, 195 depicted in the endpoint environment 115 of FIG. 1.

In the illustrated example of FIG. 1, the endpoint devices 170, 175, 180, 185, 190, 195 are computing devices. For example, one or more of the endpoint devices 170, 175, 180, 185, 190, 195 can be an Internet-enabled tablet, mobile handset (e.g., a smartphone), watch (e.g., a smartwatch), fitness tracker, headset, vehicle control unit (e.g., an engine control unit, an electronic control unit, etc.), IoT device, etc. In other examples, one or more of the endpoint devices 170, 175, 180, 185, 190, 195 can be a physical server (e.g., a rack-mounted server, a blade server, etc.). In additional or alternative examples, the endpoint devices can include a camera, a sensor, etc. Further, the label “platform,” “node,” and/or “device” as used in the computing environment 100 does not necessarily mean that such platform, node, and/or device operates in a client or slave role; rather, any of the platforms, nodes, and/or devices in the computing environment 100 refer to individual entities, platforms, nodes, devices, and/or subsystems which include discrete and/or connected hardware and/or software configurations to facilitate and/or use the edge environment 110.

As such, the edge environment 110 is formed from network components and functional features operated by and within the edge platforms (e.g., the edge platforms 140, 150), edge gateways, etc. The edge environment 110 may be implemented as any type of network that provides edge computing and/or storage resources which are proximately located to radio access network (RAN) capable endpoint devices (e.g., mobile computing devices, IoT devices, smart devices, etc.), which are shown in FIG. 1 as the endpoint devices 170, 175, 180, 185, 190, 195. In other words, the edge environment 110 may be envisioned as an "edge" which connects the endpoint devices and traditional network access points that serve as an ingress point into service provider core networks, including mobile carrier networks (e.g., Global System for Mobile Communications (GSM) networks, Long-Term Evolution (LTE) networks, 5G/6G networks, etc.), while also providing storage and/or compute capabilities. Other types and forms of network access (e.g., Wi-Fi, long-range wireless, wired networks including optical networks) may also be utilized in place of or in combination with such 3GPP carrier networks.

In the illustrated example of FIG. 1, the first through third endpoint devices 170, 175, 180 are connected to the first edge platform 140. In FIG. 1, the fourth through sixth endpoint devices 185, 190, 195 are connected to the second edge platform 150. Additionally or alternatively, one or more of the endpoint devices 170, 175, 180, 185, 190, 195 may be connected to any number of edge platforms (e.g., the edge platforms 140, 150), servers (e.g., the servers 112, 114, 116), or any other suitable devices included in and/or otherwise associated with the environments 105, 110, 115 of FIG. 1. For example, the first endpoint device 170 can be connected to the edge platforms 140, 150 and to the second server 114.

In the illustrated example of FIG. 1, one or more of the endpoint devices 170, 175, 180, 185, 190, 195 can connect to one or more devices in the environments 105, 110, 115 via a network such as the Internet. Additionally or alternatively, one or more of the endpoint devices 170, 175, 180, 185, 190, 195 can communicate with devices in the environments 105, 110, 115 using any suitable wireless network including, for example, one or more WLANs, one or more cellular networks, one or more peer-to-peer networks, one or more private networks, one or more public networks, etc. In some examples, the endpoint devices 170, 175, 180, 185, 190, 195 can be connected to a cell tower included in one of the environments 105, 110, 115. For example, the first endpoint device 170 can be connected to a cell tower included in the edge environment 110, and the cell tower can be connected to the first edge platform 140.

In some examples, in response to a request to execute a workload from an endpoint device (e.g., the first endpoint device 170), an orchestrator (e.g., the first orchestrator 142) can communicate with at least one resource (e.g., the first resource(s) 149) and an endpoint device (e.g., the second endpoint device 175) to create a contract (e.g., a workload contract) associated with a description of the workload to be executed. The first endpoint device 170 can provide a task associated with the contract and the description of the workload to the first orchestrator 142, and the first orchestrator 142 can provide the task to a security controller (e.g., the first security controller 154). The task can include the contract and the description of the workload to be executed. In some examples, the task can include requests to acquire and/or otherwise allocate resources used to execute the workload. In some examples, the orchestrator 142, 156 can create a contract by archiving previously negotiated contracts and selecting from among them at runtime. The orchestrator 142, 156 may select contracts based on conditions at the endpoint device (e.g., the endpoint device 175) and in the edge infrastructure. In such an example, while the contract is dynamic, it can be quickly established by virtue of prior work and caching.
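By way of illustration, the following Python sketch shows one way archived contracts might be selected at run time based on current conditions; the cache structure and predicate form are hypothetical assumptions.

class ContractCache:
    """Archive previously negotiated contracts and select a match at run time."""

    def __init__(self):
        self._archive = []  # (predicate, contract) pairs

    def archive(self, predicate, contract):
        self._archive.append((predicate, contract))

    def select(self, conditions):
        for predicate, contract in self._archive:
            if predicate(conditions):
                return contract  # fast path: reuse prior negotiation
        return None  # no match: fall back to fresh negotiation

cache = ContractCache()
cache.archive(lambda c: c["latency_ms"] <= 10, {"sla": "low latency"})
contract = cache.select({"latency_ms": 8})  # reuses the cached contract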

In some examples, the orchestrators 142, 156 maintain records and/or logs of actions occurring in the environments 105, 110, 115. For example, the first resource(s) 149 can notify the first orchestrator 142 of receipt of a workload description. One or more of the orchestrators 142, 156, the schedulers 144, 158, and/or the resource(s) 149, 162 can provide records of actions and/or allocations of resources to the orchestrators 142, 156. For example, the first orchestrator 142 can maintain or store a record of receiving a request to execute a workload (e.g., a contract request provided by the first endpoint device 170).

In some examples, the schedulers 144, 158 can access a task received and/or otherwise obtained by the orchestrators 142, 156 and provide the task to one or more of the resource(s) 149, 162 to execute or complete. The resource(s) 149, 162 can execute a workload based on a description of the workload included in the task. The schedulers 144, 158 can access a result of the execution of the workload from one or more of the resource(s) 149, 162 that executed the workload. The schedulers 144, 158 can provide the result to the device that requested the workload to be executed, such as the first endpoint device 170.
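For purposes of illustration only, this Python sketch traces that flow: a task is provided to a capable resource, the workload described in the task is executed, and the result is returned to the requesting device. The interfaces shown (can_execute, execute, reply_to) are hypothetical assumptions.

class EchoResource:
    """Stand-in resource that accepts any workload description."""
    def can_execute(self, description):
        return True
    def execute(self, description):
        return "result of: " + description

class SchedulerSketch:
    def __init__(self, resources):
        self.resources = resources

    def run(self, task):
        # Provide the task to the first resource able to execute it.
        resource = next(
            (r for r in self.resources if r.can_execute(task["description"])),
            None)
        if resource is None:
            return None  # no capable resource on this platform
        result = resource.execute(task["description"])
        task["reply_to"](result)  # return the result to the requester
        return result

SchedulerSketch([EchoResource()]).run(
    {"description": "encode media stream", "reply_to": print})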

Advantageously, an execution of a workload in the edge environment 110 can reduce costs (e.g., compute or computation costs, network costs, storage costs, etc., and/or a combination thereof) and/or processing time used to execute the workload. For example, the first endpoint device 170 can request the first edge platform 140 to execute a workload at a first cost lower than a second cost associated with executing the workload in the cloud environment 105. In other examples, an endpoint device, such as the first through third endpoint devices 170, 175, 180, can be nearer to (e.g., spatially or geographically closer) and/or otherwise proximate to an edge service, such as the first edge platform 140, than a centralized server (e.g., the servers 112, 114, 116) in the cloud environment 105. For example, the first edge platform 140 is spatially closer to the first endpoint device 170 than the first server 112. The first endpoint device 170 can request a workload to be executed subject to certain constraints, which the example edge service 130A can determine and use to position the workload at the first edge platform 140, whose response time to deliver the executed workload result is lower than that of the first server 112 in the cloud environment 105. In some examples, the edge service 130A includes an orchestrator to obtain the workload and determine the constraints, optimal edge platforms for execution, etc.

In the illustrated example of FIG. 1, the edge service 130A-C improves the distribution and execution of edge computing workloads (e.g., among the edge platforms 140, 150) based on the capability data 136A-C, the policy data 138A-C, and registered workloads associated with at least one of the cloud environment 105, the edge environment 110, or the endpoint environment 115. For example, the edge service 130A-C is distributed at the edge platforms 140, 150 to enable the orchestrators 142, 156, the schedulers 144, 158, the capability controllers 146, 160, the telemetry controllers 152, 164, and/or the security controllers 154, 166 to dynamically offload and/or onload registered workloads to available resource(s) 149, 162 based on the capability data 136A-C and the policy data 138A-C. An example implementation of the edge service 130A-C is described in further detail below in connection with FIG. 2.

In the illustrated example of FIG. 1, the capability controllers 146, 160 can determine that the first edge platform 140 and/or the second edge platform 150 has available one(s) of the resource(s) 149, 162, such as hardware resources (e.g., compute, network, security, storage, etc., hardware resources), software resources (e.g., a firewall, a load balancer, a virtual machine (VM), a container, a guest operating system (OS), an application, the orchestrators 142, 156, a hypervisor, etc.), etc., and/or a combination thereof, based on the capability data 136A-C, from which edge computing workloads (e.g., registered workloads) can be executed.

In some examples, the first capability executable 137, when executed, generates the first capability data 136A. In some examples, the second capability executable 139, when executed, generates the second capability data 136B. In some examples, the capability executables 137, 139, when executed, can generate the capability data 136A-B by invoking a composition(s).

In some examples, the composition(s) can be resource composition(s) associated with one or more of the resource(s) 149, 162, edge service composition(s) associated with the edge platforms 140, 150, etc. In some examples, the composition(s) include(s), correspond(s) to, and/or otherwise is/are representative of machine readable resource models representative of abstractions and/or virtualizations of hardware resources, software resources, etc., of the resource(s) 149, 162, and/or, more generally, the edge platforms 140, 150, that can facilitate the aggregation and/or integration of edge computing telemetry and/or capabilities. For example, the composition(s) can be representative of one or more interfaces to generate and/or otherwise obtain the capability data 136A-C associated with the resource(s) 149, 162 of the edge platforms 140, 150. In some examples, the composition(s) include(s) one or more resource compositions that each may include one or more resource models. For example, a resource model can include, correspond to, and/or otherwise be representative of an abstraction and/or virtualization of a hardware resource or a software resource.

In some examples, the composition(s) include(s) at least a resource model corresponding to a virtualization of a compute resource (e.g., a CPU, an FPGA, a GPU, a NIC, etc.). In such examples, the resource model can include a resource object and a telemetry object. The resource object can be and/or otherwise correspond to a capability and/or function of a core of a multi-core CPU, one or more hardware portions of an FPGA, one or more threads of a GPU, etc. The telemetry object can be and/or otherwise correspond to an interface (e.g., a telemetry interface) to the core of the multi-core CPU, the one or more hardware portions of the FPGA, the one or more threads of the GPU, etc. In some examples, the telemetry object can include, correspond to, and/or otherwise be representative of one or more application programming interfaces (APIs), calls (e.g., hardware calls, system calls, etc.), hooks, etc., that, when executed, can obtain telemetry data from the compute resource.
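A minimal Python sketch of such a resource model, pairing a resource object with a telemetry object, is shown below; the structure and the stand-in telemetry hook are hypothetical assumptions, not the claimed composition interface.

from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class ResourceModel:
    """Abstraction of one compute resource: the resource object describes
    its capability; the telemetry object is a hook that, when called,
    returns telemetry data from the underlying resource."""
    name: str
    resource_object: Dict[str, object]
    telemetry_object: Callable[[], Dict[str, float]]

def cpu_telemetry():
    # Stand-in for a PMU read; a real hook would sample hardware counters.
    return {"utilization": 0.42, "power_w": 35.0}

model = ResourceModel("cpu0", {"kind": "cpu", "cores": 8}, cpu_telemetry)
capability_data = {model.name: {**model.resource_object, **model.telemetry_object()}}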

In the illustrated example of FIG. 1, the telemetry controllers 152, 164 collect telemetry data from resource(s) 149, 162 during workload execution. For example, telemetry controllers 152, 164 may operate in a similar manner as the capability controller 146, 160, such that the telemetry controllers 152, 164 may include executables that invoke resource compositions during execution of a workload. In some examples, the composition(s) include at least a resource model corresponding to a virtualization of a compute resource (e.g., a CPU, an FPGA, a GPU, a NIC, etc.). In such examples, the resource model can include a telemetry object. The telemetry object can be and/or otherwise correspond to an interface (e.g., a telemetry interface) to the core of the multi-core CPU, the one or more hardware portions of the FPGA, the one or more threads of the GPU, etc. In some examples, the telemetry object can include, correspond to, and/or otherwise be representative of one or more application programming interfaces (APIs), calls (e.g., hardware calls, system calls, etc.), hooks, etc., that, when executed, can obtain telemetry data from the compute resource.

In some examples, the telemetry controllers 152, 164 determine utilization metrics of a workload. Utilization metrics correspond to a measure of usage by a resource when the resource is executing the workload. For example, a utilization metric may be indicative of a percentage of CPU cores utilized during workload execution, bytes of memory utilized, amount of disk time, etc. The telemetry data can include a utilization (e.g., a percentage of a resource that is utilized or not utilized), a delay (e.g., an average delay) in receiving a service (e.g., latency), a rate (e.g., an average rate) at which a resource is available (e.g., bandwidth, throughput, etc.), power expenditure, etc., associated with one(s) of the resource(s) 149, 162 of at least one of the first edge platform 140 or the second edge platform 150. The example telemetry controllers 152, 164 may store telemetry data (e.g., utilization metrics) in the example ES databases 148, 159. For example, the orchestrators 142, 156 and/or schedulers 144, 158 may access telemetry data from corresponding databases 148, 159 to determine whether to offload and/or onload the workload or a portion of the workload to one or more different resource(s). In such an example, the orchestrators 142, 156 and/or schedulers 144, 158 apply the parallel distribution approach, by accessing telemetry data, to “divide and conquer” the edge computing workload among different resources (e.g., resource(s) 149, 162) available at the edge environment 110.
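
As a non-limiting sketch of collecting such utilization metrics, the third-party psutil library could be queried as shown below; the use of psutil and the metric names are assumptions made for illustration, as no particular collection mechanism is required:

import psutil

def collect_utilization_metrics() -> dict:
    # Gather the kinds of metrics described above: CPU utilization,
    # bytes of memory used, and disk time.
    vm = psutil.virtual_memory()
    io = psutil.disk_io_counters()
    return {
        "cpu_utilization_pct": psutil.cpu_percent(interval=0.1),
        "memory_bytes_used": vm.used,
        "disk_read_time_ms": io.read_time,
        "disk_write_time_ms": io.write_time,
    }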

In some examples, the telemetry controllers 152, 164 perform a fingerprinting analysis. As used herein, a fingerprinting analysis is a method that analyzes one or more workloads in an effort to identify, track, and/or monitor the workload(s) across an edge environment (e.g., the edge environment 110). For example, when the first orchestrator 142 generates a workload description, the first telemetry controller 152 may fingerprint the workload description to determine requirements of the workload, known or discoverable workload characteristics, and/or the workload execution topology (e.g., which microservices are collocated with each other, the speed with which the microservices communicate data, etc.). In some examples, the telemetry controllers 152, 164 store analysis results and telemetry data locally (e.g., in the respective ES database 148, 159). In other examples, the telemetry controllers 152, 164 provide analysis results and telemetry data directly to the orchestrators 142, 156.
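
One possible fingerprinting sketch, for illustration only, hashes a canonicalized workload description into a stable identifier; the field names below are assumptions:

import hashlib
import json

def fingerprint_workload(description: dict) -> str:
    # Canonicalize the workload description (requirements, characteristics,
    # execution topology) and hash it so the workload can be identified
    # and tracked across the edge environment.
    canonical = json.dumps(description, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()

fp = fingerprint_workload({
    "requirements": {"storage_mb": 10},
    "topology": {"colocated": [["svc_a", "svc_b"]], "link_mbps": 100},
})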

In the illustrated example of FIG. 1, the security controllers 154, 166 determine whether the resource(s) 149, 162 can be made discoverable to a workload and whether an edge platform (e.g., edge platforms 140, 150) is sufficiently trusted for assigning a workload to. In some examples, the example security controllers 154, 166 negotiate key exchange protocols (e.g., TLS, etc.) with a workload source (e.g., an endpoint device, a server, an edge platform, etc.) to establish a secure connection between the security controller and the workload source. In some examples, the security controllers 154, 166 perform cryptographic operations and/or algorithms (e.g., signing, verifying, generating a digest, encryption, decryption, random number generation, secure time computations, or any other cryptographic operations).

The example security controllers 154, 166 may include a hardware root of trust (RoT). The hardware RoT is a system on which secure operations of a computing system, such as an edge platform, depend. The hardware RoT provides an attestable device (e.g., edge platform) identity feature, where such a device identity feature is utilized in a security controller (e.g., security controllers 154, 166). The device identity feature attests the firmware, software, and hardware implementing the security controller (e.g., security controllers 154, 166). For example, the device identity feature generates and provides a digest (e.g., a result of a hash function) of the software layers between the security controllers 154, 166 and the hardware RoT to a verifier (e.g., a different edge platform than the edge platform including the security controller). The verifier verifies that the hardware RoT, firmware, software, etc. are trustworthy (e.g., not having vulnerabilities, on a whitelist, not on a blacklist, etc.).
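
A minimal sketch of such a layered digest, assuming a simple extend-style measurement chain (the layer contents below are placeholders, and the chaining rule is an assumption rather than the claimed mechanism), might look like:

import hashlib

def layered_digest(layers):
    # Fold each software layer into a running digest, anchored at the
    # hardware RoT, so the result attests the whole stack in order.
    digest = hashlib.sha256(b"hardware-root-of-trust")
    for layer in layers:
        digest = hashlib.sha256(digest.digest() + hashlib.sha256(layer).digest())
    return digest.hexdigest()

# Evidence a verifier could compare against known-good values.
evidence = layered_digest([b"firmware-image", b"bootloader", b"security-controller"])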

In some examples, the security controllers 154, 166 store cryptographic keys (e.g., a piece of information that determines the functional output of a cryptographic algorithm, such as specifying the transformation of plaintext into ciphertext) that may be used to securely interact with other edge platforms during verification. In some examples, the security controllers 154, 166 store policies corresponding to the intended use of the security controllers 154, 166. In some examples, the security controllers 154, 166 receive and verify edge platform security and/or authentication credentials (e.g., access control, single-sign-on tokens, tickets, and/or certificates) from other edge platforms to authenticate those other edge platforms or respond to an authentication challenge by other edge platforms.

In some examples, the edge services 130A-C may communicate with the security controllers 154, 166 to determine whether the resource(s) 149, 162 can be made discoverable. For example, in response to receiving an edge computing workload, an edge service (e.g., one or more of the edge services 130A-C) provides a contract and a description of the workload to the security controller (e.g., the first security controller 154). In such an example, the security controller (e.g., the first security controller 154) analyzes the requests of the workload to determine whether the resource(s) (e.g., the first resource(s) 149) are authorized and/or registered to take on the workload. For example, the security controllers 154, 166 include authentication information, security information, etc., that determine whether an edge computing workload meets edge platform credentials and whether an edge platform (e.g., edge platforms 140, 150) is sufficiently trusted for assigning a workload to. For example, edge platform credentials may correspond to the capability data 136A-C and may be determined during the distribution and/or registration of the edge platform 140, 150 with the edge service 130A-C. Examples of edge platform security and/or authentication credentials include certificates, resource attestation tokens, hardware and platform software verification proofs, compound device identity codes, etc.

In some examples, in response to a notification, message, or communication from the security controller 154, 166, the schedulers 144, 158 can access a task received and/or otherwise obtained by the orchestrators 142, 156 and provide the task to one or more of the resource(s) 149, 162 to execute or complete. For example, the schedulers 144, 158 are to generate thread scheduling policies. Thread scheduling policies are policies that assign workloads (e.g., sets of executable instructions also referred to as threads) to resource(s) 149, 162. The schedulers 144, 158 may generate and/or determine the thread scheduling policy for corresponding edge platforms 140, 150 based on capability data 136A-C, policy data 138A-C, and telemetry data (e.g., utilization metrics).
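
For illustration, a thread scheduling policy of this kind could be sketched as a rule over capability data, policy data, and telemetry data; the particular rule below (prefer the least-utilized compatible resource) is an assumption, not the claimed mechanism:

def choose_resource(thread_req: dict, resources: list):
    # Keep resources whose capability data matches the thread and whose
    # telemetry stays inside the policy's utilization bound.
    candidates = [
        r for r in resources
        if r["capability"].get("arch") == thread_req["arch"]
        and r["telemetry"]["utilization_pct"] < r["policy"]["max_util_pct"]
    ]
    if not candidates:
        return None
    # Prefer the least-utilized compatible resource.
    return min(candidates, key=lambda r: r["telemetry"]["utilization_pct"])["name"]

target = choose_resource(
    {"arch": "x86"},
    [{"name": "cpu0", "capability": {"arch": "x86"},
      "telemetry": {"utilization_pct": 35.0}, "policy": {"max_util_pct": 80.0}}],
)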

FIG. 2 depicts an example edge service 200 to register an edge platform (e.g., first edge platform 140 or the second edge platform 150) with the edge environment 110. In FIG. 2, the edge service 200 includes an example orchestrator 204, an example policy controller 208, an example registration controller 206, and an example capability controller 210. The example edge service 200 registers and/or communicates with the example edge platform (e.g., the first edge platform 140, the second edge platform 150) of FIG. 1 via an example interface (e.g., the first interface 131, the second interface 132). In examples disclosed herein, the edge service 200 illustrated in FIG. 2 may implement any of the edge services 130A-C of FIG. 1. For example, the first edge service 130A, the second edge service 130B, and the third edge service 130C may include the example orchestrator 204, the example policy controller 208, the example registration controller 206, and/or the example capability controller 210 to orchestrate workloads to edge platforms, register workloads, register edge platforms, etc.

In the illustrated example of FIG. 2, the orchestrator 204 controls edge computing workloads and edge platforms operating at the edge environment (e.g., edge environment 110). For example, the orchestrator 204 may orchestrate and/or otherwise facilitate the edge computing workloads to be registered by the registration controller 206. The orchestrator 204 may be an interface in which developers, users, tenants, etc., may upload, download, provide, and/or deploy workloads to be registered by the registration controller 206. The example orchestrator 204 may be implemented and/or otherwise be a part of any of the edge services 130A-C.

In edge environments and cloud environments (e.g., the cloud environment 105 of FIG. 1 and the edge environment 110 of FIG. 1), applications are increasingly developed as webs of interacting, loosely coupled workloads called microservices. For example, an application may be a group of interacting microservices that perform different functions of the application. Some or all of such microservices benefit from dynamic decisions about where (e.g., what resources) they may execute. Such decisions may be determined by the orchestrator 204. Alternatively, the example orchestrators 142, the example scheduler 144, the example capability controller 146, the example telemetry controller 152, and/or more generally the first example edge platform 140 generates decisions corresponding to microservice execution location.

In this manner, some parts of an application (e.g., one or more microservices) may execute on one of the resource(s) 149 (e.g., general purpose processors like Atom, Core, Xeon, AMD x86, IBM Power, RISC V, etc.), while other parts of the application (e.g., different microservices) may be configured to execute at a different one of the resource(s) 149 (e.g., acceleration hardware such as GPU platforms (like Nvidia, AMD ATI, integrated GPU, etc.), ASIC platforms (like Google TPU), custom logic on FPGAs, custom embedded-ware as on SmartNICs, etc.). A microservice may include a workload and/or executable instructions. Such execution of an application on one or more resources may be called parallel distribution.

In the illustrated example of FIG. 2, the registration controller 206 registers workloads and edge platforms (e.g., edge platform 140) with the edge environment 110. For example, the registration controller 206 onboards applications, services, microservices, etc., with the edge service 200. Additionally, the registration controller 206 onboards edge platforms 140, 150 with the edge service 200.

In some examples, registration controller 206 is initiated by the orchestrator 204. For example, an edge administrator, an edge platform developer, an edge platform manufacturer, and/or more generally, an administrative domain requests, via the orchestrator 204, to onboard an edge platform (e.g., 140, 150) with the edge service 200. The administrative domain may provision the edge platforms 140, 150 with cryptographic keys, credentials, policies, software, etc., that are specific to the edge platforms 140, 150. The example registration controller 206 receives the request from the orchestrator 204 and onboards the edge platform 140 with the edge service 200. In this manner, the administrative domain is no longer assigned to the edge platform, and the edge platform is assigned a new identity. In some examples, the new identity enables the edge platforms 140, 150 to be discoverable by multiple endpoint devices (e.g., endpoint devices 170, 175, 180, 185, 190, 195), multiple edge platforms (e.g., edge platform 150), multiple servers (e.g., servers 112, 114, 116), and any other entity that may be registered with the edge service 200.

In some examples, the registration controller 206 onboards edge computing workloads with the edge service 200. For example, an edge computing workload is a task that is developed by an edge environment user (e.g., a user utilizing the capabilities of the edge environment 110), an edge computing workload developer, etc. In some examples, the edge environment user and/or edge computing workload developer requests for the edge computing workload to be onboarded with the edge service 200. For example, an edge computing workload developer authorizes an edge platform (e.g., edge platform 140) to execute the edge computing workload on behalf of the user according to an agreement (e.g., service level agreement (SLA) or an e-contract). For example, the registration controller 206 generates an agreement for the orchestrator 204 to provide to the user, via an interface (e.g., a GUI, a visualization API, etc.). The example registration controller 206 receives a signature and/or an acceptance, from the user, indicative that the user accepts the terms of the agreement. In this manner, the edge computing workload is onboarded with the edge service 200 and corresponding edge platform.

In some examples, the edge service 200 (e.g., the orchestrator 204) is responsible for the edge computing workload lifecycle management, subsequent to the registration controller 206 onboarding the edge computing workload. For example, the orchestrator 204 accepts legal, fiduciary, contractual, and technical responsibility for execution of the edge computing workload in the edge environment 110. For example, the orchestrator 204 provides the edge platform 140 (e.g., the orchestrator 142, the scheduler 144, the telemetry controller 152, the security controller 154) responsibility of subsequent scheduling of resource(s) 149 to perform and/or execute the edge computing workload.

In some examples, the registration controller 206 announces the existence (e.g., the new identity) of the workloads and edge platforms to endpoint devices, cloud environments, and edge environments. For example, the edge platform 140 is made available to the endpoint devices 170, 175, 180, 185, 190, 195 and/or the servers 112, 114, 116 in the cloud environment 105, and the edge computing workloads are managed by the edge platforms 140, 150.

In the illustrated example of FIG. 2, the example policy controller 208 controls the receipt and storage of policy data (e.g., policy data 138A). The example policy controller 208 may be an interface, an API, a collection agent, etc. In some examples, a tenant, a developer, an endpoint device user, an information technology manager, etc., can provide policy data (e.g., policy data 138A) to the policy controller 208. Policy data includes requirements and/or conditions that the edge platforms (e.g., edge platforms 140, 150) are to meet. For example, an endpoint device user desires to optimize for resource performance during workload execution. In other examples, the endpoint device user desires to optimize for power consumption (e.g., save battery life) during workload execution. In some examples, the telemetry controller 152 compares these policies with telemetry data to determine if a workload is to be offloaded from a first resource to a second resource of the resource(s). In this manner, the telemetry controller 152 may periodically and/or aperiodically query the policy controller 208. Alternatively, the policy controller 208 stores policies in the database 148 and the telemetry controller 152 periodically and/or aperiodically queries the database 148 for policy data.
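
The comparison of telemetry data against such policy data might be sketched as follows, under assumed threshold semantics (the policy and telemetry keys are illustrative):

def should_offload(policy: dict, telemetry: dict) -> bool:
    # Policy expresses what the tenant optimizes for; telemetry is the
    # measured behavior of the resource executing the workload.
    if policy.get("optimize") == "performance":
        return telemetry["latency_ms"] > policy["max_latency_ms"]
    if policy.get("optimize") == "power":
        return telemetry["power_watts"] > policy["max_power_watts"]
    return False

offload = should_offload(
    {"optimize": "power", "max_power_watts": 15.0},
    {"latency_ms": 4.0, "power_watts": 22.5},
)  # True: measured power draw exceeds the policy bound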

In some examples, the policy controller 208 can determine how an edge platform orchestrator performs parallel distribution. For example, parallel distribution may be used where an endpoint device wants to execute an acceleration function upon a workload providing a large chunk of data (e.g., 10 GB, or some significantly sized amount for the type of device or network). If the registration controller 206 determines such a chunk of data supports parallel processing—where the data can be executed or analyzed with multiple accelerators in parallel—then acceleration distribution may be used to distribute and collect the results of the acceleration from among multiple resources (e.g., resource(s) 149, 162, processing nodes, etc.). Additionally, the policy controller 208 can determine that the parallel distribution approach may be used where an endpoint device wants to execute a large number of functions (e.g., more than 100 functions at one time) which can be executed in parallel, in order to fulfill the workload in a more efficient or timely manner. The endpoint device sends the data and the workload data to be executed with a given SLA and given cost. The workload is distributed, coordinated, and collected in response, from among multiple processing nodes—each of which offers different flavors or permutations of acceleration.
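
A parallel distribution of this kind might be sketched as below, where the 10 GB threshold and the worker callables are illustrative assumptions:

def parallel_distribute(data: bytes, workers: list, min_parallel_bytes: int = 10 * 2**30):
    # Small or unsplittable payloads run on a single resource.
    if len(data) < min_parallel_bytes or len(workers) < 2:
        return [workers[0](data)]
    # Divide the payload across the available accelerators ...
    chunk = len(data) // len(workers)
    parts = [data[i * chunk:(i + 1) * chunk] for i in range(len(workers) - 1)]
    parts.append(data[(len(workers) - 1) * chunk:])  # remainder to last worker
    # ... and collect the partial results.
    return [w(p) for w, p in zip(workers, parts)]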

In the illustrated example of FIG. 2, the capability controller 210 determines the edge platform 140 capabilities during registration and onboarding of the edge platform 140. For example, the capability controller 210 invokes an executable (e.g., the executable 137), of the edge platform capability controller 146, to generate capability data (e.g., capability data 136A). In some examples, the capability controller 210 retrieves the capability data from the database 148. In this manner, the capability controller 210 enables the registration controller 206 to register the edge platform 140 as including such capabilities. For example, when the orchestrator 204 receives a request to execute a workload, the orchestrator 204 identifies, via the capability controller 210, whether the capabilities of the edge platform 140 include proper resource(s) to fulfill the workload task.

In the illustrated example of FIG. 2, the orchestrator 204, registration controller 206, policy controller 208, capability controller 210, and/or more generally the example edge service 200 may operate in a registration phase. For example, the edge service 200 prepares edge platforms for operation in the edge environment (e.g., the edge environment 110).

In an example operation, the orchestrator 204 orchestrates the registration of the edge platform 140. For example, the orchestrator 204 notifies the registration controller 206 to begin the onboarding process of the edge platform 140. The registration controller 206 tags and/or otherwise identifies the edge platform 140 with an edge platform identifier. In some examples, the edge platform identifier is utilized by endpoint devices 170, 175, 180, the edge environment 110, the servers 112, 114, 116, and the edge platform 150. In this manner, the endpoint devices 170, 175, 180 have the ability to offload a registered edge computing workload onto the edge platform that includes an edge platform identifier (e.g., edge platform 140 is registered with identifier platform A).

During edge platform onboarding, the example registration controller 206 queries the capability controller 210 to determine the edge platform 140 capabilities. For example, the registration controller 206 may utilize the edge platform capabilities to assign the edge platform with a new identity. The example capability controller 210 queries and/or otherwise invokes the capability controller 146 of the edge platform 140 to generate capability data (e.g., capability data 136A). In some examples, the capability controller 210 notifies the registration controller 206 of the capability data. In this manner, the registration controller 206 utilizes the capability data to onboard or register the edge platform 140 and further to generate agreements with edge computing workloads.

For example, during onboarding of edge computing workloads, the orchestrator 204 obtains the edge computing workloads (e.g., a load balancer service, a firewall service, a user plane function, etc.) that a provider desires to be implemented and/or managed by the edge environment 110. Further, the example registration controller 206 generates an agreement. For example, the registration controller 206 generates a contract indicative that the edge service 200 will provide particular aspects (e.g., quality, availability, responsibility, etc.) for the edge computing workload. In some examples, the registration controller 206 notifies the capability controller 210 to initiate one or more platform capability controllers (e.g., capability controller 146) to identify capability data. In this manner, the registration controller 206 can obtain the capability data and generate an agreement associated with the edge computing workload description. In some examples, the registration controller 206 receives an agreement acceptance from the edge computing workload provider and thus, the edge computing workload is onboarded. When the edge computing workload is onboarded, it is operable on one or more edge platforms (e.g., edge platforms 140, 150).

In some examples, the orchestrator 204 determines whether an edge platform (e.g., edge platform 140) includes sufficient capabilities to meet the edge computing workload requests. For example, the orchestrator 204 may identify whether an edge platform (e.g., edge platform 140 and/or 150) can take on the edge computing workload. For example, the capability controller 210 confirms with the edge platform capability controllers whether the description of the workload matches the capability data.

When the example edge service 200 onboards the edge platforms (e.g., edge platform 140, 150) and the edge computing workloads, the edge service 200 orchestrates edge computing workloads to the edge platform 140, and the edge platform 140 manages the edge computing workload lifecycle. For example, the edge platform (e.g., edge platform 140) facilitates integration of its resources (e.g., resource(s) 149) for edge computing workload execution, management and distribution. For example, the edge platform (e.g., edge platform 140) facilitates parallel computing, distributed computing, and/or a combination of parallel and distributed computing.

FIG. 3 depicts the example resource(s) 149 of FIG. 1 offloading and/or onloading an edge computing workload (e.g., an edge computing service). Alternatively, FIG. 3 depicts the example resource(s) 162 of FIG. 1. The example of FIG. 3 includes a first example resource 305, a second example resource 310, a third example resource 315, an example configuration controller 320, a fourth example resource 330, a fifth example resource 335, and a sixth example resource 340. The example resource(s) 149 in FIG. 3 may include more or fewer resources than the resources 305, 310, 315, 330, 335, 340 depicted.

In the example of FIG. 3, the edge computing workload is an application formed by microservices. For example, the edge computing workload includes a first microservice, a second microservice, and a third microservice coupled together through a graph-like mechanism to constitute the workload. In some examples, the microservices are in communication with each other. In some examples, the microservices include similar workload tasks. Alternatively, microservices include dissimilar workload tasks. In such an example, the first microservice and the second microservice are workloads including executable instructions formatted in a first implementation (e.g., x86 architecture) and the third microservice is a workload including executable instructions formatted in a second implementation (e.g., an FPGA architecture).

As used herein, an implementation, a software implementation, a flavor of code, and/or a variant of code corresponds to a type of programming language and a corresponding resource. For example, an application may be developed to execute on an FPGA. In this manner, the microservices of the application may be written in a programming language that the FPGA can understand. Some resources (e.g., resource(s) 149) require specific instructions to execute a task. For example, a CPU requires different instructions than a GPU. In some examples, a microservice including a first implementation can be transformed to include a second implementation.

In the illustrated example of FIG. 3, the first resource 305 is a general purpose processing resource (e.g., a CPU), the second resource 310 is an interface resource (e.g., a NIC, smart NIC, etc.), and the third resource 315 is a datastore. In some examples, the first resource 305 may, by default, obtain the edge computing workload. For example, the scheduler 144 may initially schedule the edge computing workload to execute at the first resource 305. In some examples, the second resource 310 may, by default, obtain the edge computing workload. For example, the scheduler 144 may initially provide the edge computing workload to the second resource 310 for distribution across the resources 305, 315, 330, 335, 340.

In some examples, the second resource 310 includes features that communicate with ones of the resources 305, 315, 330, 335, 340. For example, the second resource 310 may include a hardware abstraction layer (HAL) interface, a bit stream generator, a load balancer, and any other features that operate within a network interface to control data distribution (e.g., instructions, workloads, etc.) across resource(s) 149. In some examples, the second resource 310 is an interface between the resources 305, 315, 330, 335, 340 and the orchestrator 142, the scheduler 144, the capability controller 146, the telemetry controller 152, the security controller 154, and applications (e.g., edge computing workloads, software programs, etc.). For example, the second resource 310 provides a platform (e.g., a hardware platform) on which to run applications.

Additionally, the second resource 310 is coupled to the configuration controller 320 to generate one or more implementations of the microservices, and/or more generally the edge computing workload. For example, the configuration controller 320 may be a compiler which transforms input code (e.g., edge computing workload) into a new format. In some examples, the configuration controller 320 transforms input code into a first implementation corresponding to the first resource 305, a second implementation corresponding to the fourth resource 330, a third implementation corresponding to the fifth resource 335, and a fourth implementation corresponding to the sixth resource 340. In this manner, the configuration controller 320 may be configured with transformation functions that dynamically translate a particular implementation to a different implementation.
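
For illustration only, such a configuration controller might be sketched as a registry of transformation functions keyed by source and destination flavor; the placeholder transformations below are assumptions, not actual compilers:

class ConfigurationController:
    """Hypothetical sketch: translate one implementation into another."""

    def __init__(self):
        self._transforms = {}  # (src_flavor, dst_flavor) -> callable

    def register(self, src: str, dst: str, fn):
        self._transforms[(src, dst)] = fn

    def transform(self, code: bytes, src: str, dst: str) -> bytes:
        return self._transforms[(src, dst)](code)

ctrl = ConfigurationController()
ctrl.register("x86", "fpga", lambda code: b"BITSTREAM:" + code)  # placeholder
fpga_instance = ctrl.transform(b"fft_kernel", "x86", "fpga")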

In some examples, the configuration controller 320 stores all implementations of the edge computing workload into the third resource 315. For example, the third resource 315 is a datastore that includes one or more implementations of a microservice. In this manner, the third resource 315 can be accessed by any of the resources 305, 310, 330, 335, 340 when instructed by the orchestrator 142 and/or scheduler 144.

In the illustrated example of FIG. 3, the second example resource 310 is in communication with the example orchestrator 142, the example scheduler 144, the example capability controller 146, the example telemetry controller 152, and/or the example security controller 154 via the example network communication interface 141. For example, the network communication interface 141 is a network connection between the example orchestrator 142, the example scheduler 144, the example capability controller 146, the example resource(s) 149, the example telemetry controller 152, and/or the example security controller 154. For example, the network communication interface 141 may be any hardware and/or wireless interface that provides communication capabilities.

In an example operation, the orchestrator 204 of the edge service 200 obtains an edge computing workload. The example orchestrator 204 determines an edge platform available to take the edge computing workload and to fulfill the workload description. For example, the orchestrator 204 determines whether the edge platform 140 is registered and/or capable of being utilized. For example, the orchestrator 204 provides the edge computing workload description to the security controller 154. The security controller 154 performs cryptographic operations and/or algorithms to determine whether the edge platform 140 is sufficiently trusted to take on the edge computing workload. For example, the security controller 154 generates a digest for a verifier (e.g., the second edge platform 150) to verify that the edge platform 140 is trustworthy.

Additionally, the example orchestrator 204 determines whether edge platform 140 resource(s) 149 are capable of executing the edge computing workload. For example, the orchestrator 204 determines whether the capability data, corresponding to the edge platform 140, meets workload requirements of the edge computing workload. For example, if the edge computing workload requires 10 MB of storage but the resource(s) 149 of the edge platform 140 only have 1 MB of storage, then the orchestrator 204 determines the edge platform 140 does not meet the workload requirements. In this manner, the orchestrator 204 identifies a different edge platform to take on the edge computing workload. In examples where the orchestrator 204 determines the capability data meets workload requirements of the edge computing workload, the example orchestrator 142 is provided the edge computing workload for execution.

In some examples, the orchestrator 142 requests that the edge computing workload be instantiated. For example, the orchestrator 142 orchestrates generation of multiple instances of the edge computing workload based on capability data. For example, the orchestrator 142 notifies the configuration controller 320 to generate multiple instances (e.g., multiple variations and/or multiple implementations) of the edge computing workload based on capability data. The capability data, indicative of available resources 305, 310, 330, 335, 340, is used to generate multiple instances of the edge computing workload in a manner that enables the resources 305, 310, 330, 335, 340 to execute the edge computing workload upon request by the scheduler 144. Generating multiple instances of the edge computing workload avoids static hardware implementation of the edge computing workload. For example, only one of the resources 305, 310, 330, 335, 340 can execute the workload in a static hardware implementation, rather than any of the resources 305, 310, 330, 335, 340.
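
Generating one instance per available flavor could be sketched as below; the flavors and transform callables are illustrative assumptions:

def instantiate_for_platform(code: bytes, src_flavor: str, transforms: dict) -> dict:
    # transforms maps a destination flavor to a callable that translates
    # `code` into that flavor, so any listed resource can later execute
    # the workload without a static hardware binding.
    instances = {src_flavor: code}
    for flavor, fn in transforms.items():
        instances[flavor] = fn(code)
    return instances

instances = instantiate_for_platform(
    b"fft_kernel", "x86",
    {"fpga": lambda c: b"BITSTREAM:" + c, "gpu": lambda c: b"PTX:" + c},
)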

Based on the workload description of the edge computing workload, the orchestrator 142 determines a target resource at which the workload is to execute. For example, if the workload description includes calculations, the orchestrator 142 determines the first resource 305 (e.g., indicative of a general purpose processing unit) is the target resource. The scheduler 144 configures the edge computing workload to execute at the target resource. The workload implementation matches the implementation corresponding to the target resource.

In some examples, the scheduler 144 schedules the first microservice to execute at the target resource and the second and third microservices to execute at different resources. For example, the orchestrator 142 analyzes the workload description in connection with the capability data to dynamically decide where to offload the microservices. In some examples, the orchestrator 142 analyzes the workload description in connection with the capability data and the policy data. For example, when a microservice (e.g., the first microservice) includes tasks that are known to reduce throughput, and policy data is indicative to optimize throughput, the orchestrator 142 decides to offload the first microservice to the fourth resource 330 (e.g., the first accelerator). In this manner, the scheduler 144 configures the second and third microservices to execute at the first resource 305 (e.g., the CPU) and the first microservice to execute at the fourth resource 330 to maximize the edge platform 140 capabilities while additionally meeting user requirements (e.g., policy data).

During workload execution, the telemetry controller 152 fingerprints the resources at which the workloads are executing to determine workload utilization metrics. For example, the telemetry controller 152 may query the performance monitoring units (PMUs) of the resources to determine performance metrics and utilization metrics (e.g., CPU cycles used, CPU vs. memory vs. IO bound, latency incurred by the microservice, data movement such as cache/memory activity generated by the microservice, etc.).

Telemetry data collection and fingerprinting of the pipeline of the edge computing workload enables the telemetry controller 152 to decide the resource(s) (e.g., the optimal resource) at which the microservice is to execute, to fulfill the policy data (e.g., desired requirements). For example, if the policy data is indicative to optimize for latency and the telemetry controller 152 indicates that the first microservice executing at the first resource 305 is the bottleneck in the overall latency budget (e.g., the latency allocated to the resource), then the telemetry controller 152 decides the first microservice is a candidate to be offloaded to a fourth, fifth or sixth resource 330, 335, 340 (e.g., an accelerator). In some examples, this process is referred to as accelerating.

In some examples, an edge platform 140 with multiple capabilities may be seen as a group resource (e.g., resource 149), and a microservice to be offloaded to the resource(s) 149 of the edge platform 140 may originate from a near-neighborhood edge platform (e.g., the second edge platform 150). In such an example, the orchestrator 204 of the edge service 200 may communicate telemetry data, capability data, and policy data with an orchestrator of the edge service 130C to make decisions about offloading a service.

In some examples, the orchestrator 142 and/or the scheduler 144 implement flexible acceleration capabilities by utilizing storage across the edge environment 110. For example, in a collection of edge platforms (e.g., edge platforms 140, 150) it is possible to utilize storage resources between edge platforms to increase the speed at which microservices are executed. In some examples, the orchestrator 142 and/or scheduler 144 couple persistent memory, if available on the edge platform 140, with a storage stack that is on a nearby edge platform (e.g., second edge platform 150). Persistent memory is any apparatus that efficiently stores data structures (e.g., workloads of the edge computing workload) such that the data structures can continue to be accessed using memory instructions or memory APIs even after the structure was modified or the modifying tasks have terminated across a power reset operation. A storage stack is a data structure that supports procedure or block invocation (e.g., call and return). For example, a storage stack is used to provide both the storage required for the application (e.g., workload) initialization and any automatic storage used by the called routine. Each thread (e.g., instruction in a workload) has a separate and distinct stack. The combination of the persistent memory implementation and the storage stack implementation enables critical data to be moved into persistent memory synchronously, and further allows data to move asynchronously to slower storage (e.g., solid state drives, hard disks, etc.).

In other examples, if the policy data is indicative to optimize for power consumption and the telemetry controller 152 determines the microservice load on the first resource 305 is light (e.g., not compute intensive) but the second and third microservices are consuming significant power from the fourth resource 330, then the telemetry controller 152 determines that the second and third microservices are candidates to be onloaded to the first resource 305. In some examples, this process is referred to as onloading. Onloading is the process of loading (e.g., moving) a task from an accelerator back onto a general purpose processor (e.g., CPU, multicore CPU, etc.).

In some examples, when the telemetry controller 152 determines candidate ones of microservices to be offloaded, the scheduler 144 may determine whether a correct instance or implementation of that workload is available. For example, when the telemetry controller 152 decides to offload the first microservice from the first resource 305 to the fourth resource 330, the scheduler 144 determines whether this is possible. In such an example, the scheduler 144 may query the third resource 315 (e.g., the datastore) to determine if an instance of the microservice exists that is compatible with the fourth resource 330. For example, the first microservice representative of a fast Fourier Transform (FFT) is implemented in a first flavor (e.g., x86) and the scheduler 144 determines if there is an instance of the FFT that is implemented in a second flavor (e.g., FPGA). In such a manner, the scheduler 144 determines the instance of the microservice (e.g., workload) that is compatible with the resource at which the microservice is to execute (e.g., the fourth resource 330).
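
That compatibility check might be sketched as a lookup keyed by microservice and flavor; the datastore layout below is an assumption:

def find_compatible_instance(datastore: dict, microservice: str, dst_flavor: str):
    # Returns the instance in the destination flavor, or None if no
    # compatible implementation exists (in which case the offload is
    # not possible as-is).
    return datastore.get(microservice, {}).get(dst_flavor)

instance = find_compatible_instance(
    {"fft": {"x86": b"fft_x86", "fpga": b"fft_bitstream"}},
    "fft",
    "fpga",
)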

When a microservice has been identified as a candidate to be offloaded and/or onloaded from one resource to another, the scheduler 144 pauses the workload execution and determines a workload state of the microservice, the workload state indicative of a previous thread executed at a resource. For example, the scheduler 144 performs a decoupling method. Decoupling is the task of removing and/or shutting down a microservice task at a target resource and adding and/or starting the microservice task on a different resource. The scheduler 144 may implement persistent queuing and dequeuing operations through the means of persistent memory of the edge platform 140. In this manner, the scheduler 144 allows workloads (e.g., microservices) to achieve resilient operation, even as instances of the workloads are shutdown on one resource and started on a different resource. The implementation of decoupling allows the scheduler 144 to determine a workload state. For example, the scheduler 144 snapshots (e.g., saves) the state of the microservice at the point of shutdown for immediate use a few tens of milliseconds later, to resume at a different resource. By implementing decoupling, the scheduler 144 is able to change microservice execution at any time.

The scheduler 144 utilizes the workload state to schedule the microservice to execute at a different resource. For example, the scheduler 144 captures the workload state at the first resource 305 and stores the workload state in a memory. In some examples, the scheduler 144 exchanges the workload state with the fourth resource 330 (e.g., when the microservice is to be offloaded to the fourth resource 330). In this manner, the fourth resource 330 obtains the workload state from the memory for continued execution of the workload at the workload state.
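
For illustration, the pause/snapshot/resume exchange could be sketched as below, with persistent memory simulated by a dictionary (an assumption made for brevity):

persistent_memory = {}

def snapshot(workload_id: str, state: dict):
    # Pause point: save the workload state, including the previous
    # thread executed at the first resource.
    persistent_memory[workload_id] = dict(state)

def resume(workload_id: str) -> dict:
    # The destination resource obtains the state and continues execution.
    return persistent_memory.pop(workload_id)

snapshot("fft", {"last_thread": 17, "partial_results": [0.5, 0.25]})
state = resume("fft")  # e.g., the fourth resource resumes at thread 18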

In some examples, this operation continues until the microservices and/or more generally the edge computing workload, have been executed. For example, the telemetry controller 152 continues to collect telemetry data and utilization metrics throughout execution. Additionally, the telemetry data and utilization metrics are constantly being compared to the policy data by the telemetry controller 152.

While an example manner of implementing the edge services 130A-C and the edge platform 140 of FIG. 1 is illustrated in FIGS. 2 and 3, one or more of the elements, processes and/or devices illustrated in FIGS. 2 and 3 may be combined, divided, re-arranged, omitted, eliminated and/or implemented in any other way. Further, the example orchestrator 142, the example scheduler 144, the example capability controller 146, the example resource(s) 149, the example telemetry controller 152, the example security controller 154, and/or more generally, the example edge platform 140 of FIGS. 1 and 3 may be implemented by hardware, software, firmware and/or any combination of hardware, software and/or firmware. Additionally, the example orchestrator 204, the example registration controller 206, the example policy controller 208, the example capability controller 210, and/or, more generally, the example edge services 130A-C of FIG. 2 may be implemented by hardware, software, firmware and/or any combination of hardware, software and/or firmware. Thus, for example, any of the example orchestrator 142, the example scheduler 144, the example capability controller 146, the example resource(s) 149, the example telemetry controller 152, the example security controller 154, the example orchestrator 204, the example registration controller 206, the example policy controller 208, the example capability controller 210 and/or, more generally, the example edge platform 140 and the example edge services 130A-C could be implemented by one or more analog or digital circuit(s), logic circuits, programmable processor(s), programmable controller(s), graphics processing unit(s) (GPU(s)), digital signal processor(s) (DSP(s)), application specific integrated circuit(s) (ASIC(s)), programmable logic device(s) (PLD(s)) and/or field programmable logic device(s) (FPLD(s)). When reading any of the apparatus or system claims of this patent to cover a purely software and/or firmware implementation, at least one of the example orchestrator 142, the example scheduler 144, the example capability controller 146, the example resource(s) 149, the example telemetry controller 152, the example security controller 154, the example orchestrator 204, the example registration controller 206, the example policy controller 208, and/or the example capability controller 210 is/are hereby expressly defined to include a non-transitory computer readable storage device or storage disk such as a memory, a digital versatile disk (DVD), a compact disk (CD), a Blu-ray disk, etc. including the software and/or firmware. Further still, the example edge services 130A-C and/or the example edge platform 140 of FIG. 1 may include one or more elements, processes and/or devices in addition to, or instead of, those illustrated in FIGS. 2 and 3, and/or may include more than one of any or all of the illustrated elements, processes and devices. As used herein, the phrase “in communication,” including variations thereof, encompasses direct communication and/or indirect communication through one or more intermediary components, and does not require direct physical (e.g., wired) communication and/or constant communication, but rather additionally includes selective communication at periodic intervals, scheduled intervals, aperiodic intervals, and/or one-time events.

Flowcharts representative of example hardware logic, machine readable instructions, hardware implemented state machines, and/or any combination thereof for implementing the edge services 130A-C of FIG. 2 and the edge platform 140 of FIG. 3 are shown in FIGS. 4-6. The machine readable instructions may be one or more executable programs or portion(s) of an executable program for execution by a computer processor such as the processor 712 shown in the example processor platform 700 discussed below in connection with FIG. 7. The program may be embodied in software stored on a non-transitory computer readable storage medium such as a CD-ROM, a floppy disk, a hard drive, a DVD, a Blu-ray disk, or a memory associated with the processor 712, but the entire program and/or parts thereof could alternatively be executed by a device other than the processor 712 and/or embodied in firmware or dedicated hardware. Further, although example programs are described with reference to the flowcharts illustrated in FIGS. 4-6, many other methods of implementing the example edge platform 140 may alternatively be used. For example, the order of execution of the blocks may be changed, and/or some of the blocks described may be changed, eliminated, or combined. Additionally or alternatively, any or all of the blocks may be implemented by one or more hardware circuits (e.g., discrete and/or integrated analog and/or digital circuitry, an FPGA, an ASIC, a comparator, an operational-amplifier (op-amp), a logic circuit, etc.) structured to perform the corresponding operation without executing software or firmware.

The machine readable instructions described herein may be stored in one or more of a compressed format, an encrypted format, a fragmented format, a compiled format, an executable format, a packaged format, etc. Machine readable instructions as described herein may be stored as data (e.g., portions of instructions, code, representations of code, etc.) that may be utilized to create, manufacture, and/or produce machine executable instructions. For example, the machine readable instructions may be fragmented and stored on one or more storage devices and/or computing devices (e.g., servers). The machine readable instructions may require one or more of installation, modification, adaptation, updating, combining, supplementing, configuring, decryption, decompression, unpacking, distribution, reassignment, compilation, etc. in order to make them directly readable, interpretable, and/or executable by a computing device and/or other machine. For example, the machine readable instructions may be stored in multiple parts, which are individually compressed, encrypted, and stored on separate computing devices, wherein the parts when decrypted, decompressed, and combined form a set of executable instructions that implement a program such as that described herein.

In another example, the machine readable instructions may be stored in a state in which they may be read by a computer, but require addition of a library (e.g., a dynamic link library (DLL)), a software development kit (SDK), an application programming interface (API), etc. in order to execute the instructions on a particular computing device or other device. In another example, the machine readable instructions may need to be configured (e.g., settings stored, data input, network addresses recorded, etc.) before the machine readable instructions and/or the corresponding program(s) can be executed in whole or in part. Thus, the disclosed machine readable instructions and/or corresponding program(s) are intended to encompass such machine readable instructions and/or program(s) regardless of the particular format or state of the machine readable instructions and/or program(s) when stored or otherwise at rest or in transit.

The machine readable instructions described herein can be represented by any past, present, or future instruction language, scripting language, programming language, etc. For example, the machine readable instructions may be represented using any of the following languages: C, C++, Java, C#, Perl, Python, JavaScript, HyperText Markup Language (HTML), Structured Query Language (SQL), Swift, etc.

As mentioned above, the example processes of FIGS. 4-6 may be implemented using executable instructions (e.g., computer and/or machine readable instructions) stored on a non-transitory computer and/or machine readable medium such as a hard disk drive, a flash memory, a read-only memory, a compact disk, a digital versatile disk, a cache, a random-access memory and/or any other storage device or storage disk in which information is stored for any duration (e.g., for extended time periods, permanently, for brief instances, for temporarily buffering, and/or for caching of the information). As used herein, the term non-transitory computer readable medium is expressly defined to include any type of computer readable storage device and/or storage disk and to exclude propagating signals and to exclude transmission media.

“Including” and “comprising” (and all forms and tenses thereof) are used herein to be open ended terms. Thus, whenever a claim employs any form of “include” or “comprise” (e.g., comprises, includes, comprising, including, having, etc.) as a preamble or within a claim recitation of any kind, it is to be understood that additional elements, terms, etc. may be present without falling outside the scope of the corresponding claim or recitation. As used herein, when the phrase “at least” is used as the transition term in, for example, a preamble of a claim, it is open-ended in the same manner as the term “comprising” and “including” are open ended. The term “and/or” when used, for example, in a form such as A, B, and/or C refers to any combination or subset of A, B, C such as (1) A alone, (2) B alone, (3) C alone, (4) A with B, (5) A with C, (6) B with C, and (7) A with B and with C. As used herein in the context of describing structures, components, items, objects and/or things, the phrase “at least one of A and B” is intended to refer to implementations including any of (1) at least one A, (2) at least one B, and (3) at least one A and at least one B. Similarly, as used herein in the context of describing structures, components, items, objects and/or things, the phrase “at least one of A or B” is intended to refer to implementations including any of (1) at least one A, (2) at least one B, and (3) at least one A and at least one B. As used herein in the context of describing the performance or execution of processes, instructions, actions, activities and/or steps, the phrase “at least one of A and B” is intended to refer to implementations including any of (1) at least one A, (2) at least one B, and (3) at least one A and at least one B. Similarly, as used herein in the context of describing the performance or execution of processes, instructions, actions, activities and/or steps, the phrase “at least one of A or B” is intended to refer to implementations including any of (1) at least one A, (2) at least one B, and (3) at least one A and at least one B.

As used herein, singular references (e.g., “a”, “an”, “first”, “second”, etc.) do not exclude a plurality. The term “a” or “an” entity, as used herein, refers to one or more of that entity. The terms “a” (or “an”), “one or more”, and “at least one” can be used interchangeably herein. Furthermore, although individually listed, a plurality of means, elements or method actions may be implemented by, e.g., a single unit or processor. Additionally, although individual features may be included in different examples or claims, these may possibly be combined, and the inclusion in different examples or claims does not imply that a combination of features is not feasible and/or advantageous.

FIG. 4 is a flowchart representative of machine readable instructions which may be executed to implement the example edge service 200 of FIG. 2 to register the example edge platform 140 with the example edge service 200. The registration program 400 begins at block 402, where the example orchestrator 204 obtains instructions to onboard an edge platform (e.g., edge platform 140). For example, the orchestrator 204 is provided with a request from an administrative domain indicating that the edge platform is to be implemented in an edge environment (e.g., the edge environment 110).

The example orchestrator 204 notifies the example registration controller 206 of the request (e.g., edge platform 140). The example registration controller 206 onboards the edge platform 140 with an edge service (e.g., edge service 200) (block 404). For example, the registration controller 206 assigns a new identity to the edge platform 140 which enables the edge platform 140 to be discoverable by multiple endpoint devices (e.g., endpoint devices 170, 175, 180, 185, 190, 195), multiple edge platforms (e.g., edge platform 150), multiple servers (e.g., servers 112, 114, 116), and any other entity that may be registered with the edge service 200.

The example registration controller 206 may request capability data from the edge platform 140 as a part of the edge platform registration. In this manner, the example capability controller 210 is initiated to determine edge platform capabilities (block 406). For example, the capability controller 210 may invoke an executable (e.g., executable 137) to generate capability data. Such an executable may be implemented by an edge platform capability controller (e.g., the example capability controller 146) implemented by the edge platform (e.g., edge platform 140). In some examples, the registration controller 206 utilizes the capability data to generate the new identity for the edge platform 140, such that the new identity includes information and/or a meaning indicative that the edge platform 140 includes specific capabilities.

The example capability controller 210 stores the edge platform capability data (block 408). For example, the capability controller 210 stores capability data in a datastore, a non-volatile memory, a database, etc., that is accessible by the orchestrator 204.

The example orchestrator 204 obtains workloads (block 410). For example, the orchestrator may receive and/or acquire edge computing workloads, services, applications, etc., from an endpoint device, an edge environment user, an edge computing workload developer, etc., that desires to execute the workload at the edge environment 110. The example orchestrator 204 notifies the registration controller 206. The example registration controller 206 generates an agreement for the workload provider (block 412). For example, the registration controller 206 generates an agreement (e.g., an SLA, e-contract, etc.) for the orchestrator 204 to provide to the user, via an interface (e.g., a GUI, a visualization API, etc.). In some examples, the registration controller 206 generates the agreement based on platform capabilities, determined by the capability controller 210.

The example registration controller 206 determines if the agreement has been accepted (block 414). For example, the registration controller 206 receives a signature and/or an acceptance, from the user, indicative that the user accepts the terms of the agreement (e.g., block 414=YES). In this manner, the registration controller 206 onboards the workload with the edge service (block 416). For example, when the workload provider accepts the agreement, the edge service (e.g., edge service 200) is responsible for the lifecycle and management of the workload. In some examples, if the agreement is not accepted, the workload is not onboarded and the registration of the workload ends.

The registration program 400 ends when an edge platform has been onboarded by the example edge service (e.g., edge service 200) and when obtained workloads have been onboarded or not onboarded with the edge service. The registration program 400 can be repeated when the edge service 200 (e.g., edge services 130A-C) obtains new edge platforms and/or new workloads.

FIG. 5 is a flowchart representative of machine readable instructions which may be executed to implement the example edge platform 140 of FIG. 1 to integrate resource(s) to execute an edge computing workload. The integration program 500 of FIG. 5 begins at block 502 when the orchestrator 204 obtains a workload. For example, the edge service 200 (e.g., edge services 130A-C) may receive a registered workload, to be executed by the edge environment 110, from an endpoint device.

The example orchestrator 204 identifies an edge platform capable of executing the workload (block 504). For example, the orchestrator 204 queries the capability controller 210 for capability data of different edge platforms. The example orchestrator 204 determines if the edge platform is available (block 506). For example, the capability controller 210 communicates with capability controller 146 of the first edge platform 140 to determine whether the first edge platform 140 includes capability data that can meet the requirements indicated in the workload description. When the orchestrator 204 determines an edge platform is not available (e.g., block 506=NO), control returns to block 504.

When the example orchestrator 204 determines the edge platform 140 is available (e.g., block 506=YES), the example orchestrator 204 initiates the example security controller 154 to verify edge platform 140 security credentials (block 508). For example, the security controller 154 obtains security credentials and generates a digest to provide to a verifier (e.g., the second edge platform 150). In some examples, security credentials are verified by verifying a public key certificate, or a similar signed credential, from a root authority known to the edge environment 110. In other examples, the edge platform may be verified by obtaining a hash or a digital measurement of the workload's image and checking that it matches a presented credential.

In some examples, the orchestrator 204 determines whether the security credentials are indicative of a trusted edge platform (block 510). For example, if the verifier does not indicate and/or otherwise notify the orchestrator 204 that the first edge platform includes security credentials indicative of a trusted edge platform (e.g., block 510=NO), the orchestrator identifies a different edge platform to take the workload (block 504).
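
As one hedged illustration of the verification at blocks 504-510, assuming the presented credential is a SHA-256 measurement of the workload's image (the hash choice and helper names are assumptions, not taught by the disclosure):

```python
import hashlib

def is_trusted(workload_image: bytes, presented_credential: str) -> bool:
    # Block 508: obtain a digital measurement (hash) of the workload's image.
    digest = hashlib.sha256(workload_image).hexdigest()
    # Block 510: trusted only if the measurement matches the presented credential.
    return digest == presented_credential

def pick_trusted_platform(platforms, image):
    # Block 504 retry: skip platforms whose credentials do not verify.
    for platform in platforms:
        if is_trusted(image, platform["credential"]):
            return platform
    return None
```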

If the verifier indicates and/or otherwise notifies the example orchestrator 204 that the first edge platform 140 includes security credentials indicative of a trusted edge platform (e.g., block 510=YES), the first edge platform 140 takes on the workload and the example orchestrator 142 generates multiple instances of the workload based on the capability data (block 512). For example, the orchestrator 142 notifies a configuration controller (e.g., configuration controller 320) to generate multiple instances (e.g., multiple variations and/or implementations) of the workload (e.g., edge computing workload) based on capability data. The capability data, indicative of available resources (e.g., resource(s) 149), is used to generate multiple instances of the workload in a manner that enables the resource(s) to execute the workload upon request.
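
One possible shape for the instance generation of block 512 is sketched below; the capability-data layout and the per-resource build functions are invented for illustration:

```python
def generate_instances(workload: str, capability_data: dict) -> dict:
    # Block 512: generate one workload variant per resource type reported
    # as available, so any resource can execute the workload upon request.
    builders = {
        "cpu": lambda w: f"{w}-x86-binary",
        "gpu": lambda w: f"{w}-cuda-kernel",
        "fpga": lambda w: f"{w}-bitstream",
    }
    return {resource: build(workload)
            for resource, build in builders.items()
            if capability_data.get(resource, False)}

instances = generate_instances("inference", {"cpu": True, "gpu": True})
```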

The example orchestrator 142 determines a target resource at which the workload is to execute (block 514) based on the workload description. For example, if the workload description includes calculations, the orchestrator 142 determines a general purpose processing unit is the target resource. The scheduler 144 configures the workload to execute at the target resource (block 516). For example, the scheduler 144 generates threads to be executed at the target resource.
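
A minimal sketch of the target selection at block 514, using an invented heuristic over the workload description (the description keys are assumptions):

```python
def select_target_resource(description: dict) -> str:
    # Block 514: map workload traits to a resource class.
    if description.get("accelerator_friendly"):
        return "gpu"
    # A description that includes calculations maps to a general
    # purpose processing unit, per the example above.
    return "cpu"

target = select_target_resource({"calculations": True})
```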

The example telemetry controller 152 fingerprints the target resource to determine utilization metrics (block 518). For example, the telemetry controller 152 queries the performance monitoring units (PMUs) of the target resource to determine performance metrics and utilization metrics (e.g., CPU cycles used, CPU vs. memory vs. IO bound, latency incurred by the microservice, data movement such as cache/memory activity generated by the microservice, etc.).
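
The fingerprinting of block 518 might be modeled as below; the PMU read is stubbed with a lambda, since the counters available vary by resource:

```python
from dataclasses import dataclass

@dataclass
class UtilizationMetrics:
    cpu_cycles: int     # cycles consumed by the microservice
    latency_ms: float   # latency incurred by the microservice
    cache_misses: int   # proxy for data movement (cache/memory activity)

def fingerprint(read_pmu_counters) -> UtilizationMetrics:
    # Block 518: query the target resource's PMUs for utilization metrics.
    cycles, latency, misses = read_pmu_counters()
    return UtilizationMetrics(cycles, latency, misses)

metrics = fingerprint(lambda: (1_000_000, 4.2, 3_500))
```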

The example telemetry controller 152 compares the utilization metrics with policy data (block 520). For example, the telemetry controller 152 determines whether the utilization metrics meet a policy threshold. Further example instructions that may be used to implement block 520 are described below in connection with FIG. 6.

The example orchestrator 142 determines if the comparison of utilization metrics to policy data indicates the workload is to be offloaded (block 522). For example, if the orchestrator 142 obtains a notification from the telemetry controller 152 indicating that the workload is not to be offloaded (e.g., block 522=NO), then the example scheduler 144 is notified to continue execution of the workload (block 532). If the orchestrator 142 determines the workload is to be offloaded (e.g., block 522=YES), the orchestrator 142 determines the correct workload instance for the second resource (block 524). For example, if the workload is to be offloaded from a general purpose processing unit to an acceleration unit, the example orchestrator 142 queries a database for a variant and/or transformation of the workload that corresponds to the acceleration unit.

Further, the example scheduler 144 pauses execution of the workload (block 526). For example, the scheduler 144 pauses threads, processes, or container execution at the target resource. The example scheduler 144 offloads the workload from the target resource to the second resource (block 528). For example, the scheduler 144 obtains the workload instance for the second resource and configures the workload instance to execute at the second resource. Additionally, the example scheduler 144 exchanges a workload state from the target resource to the second resource (block 530). For example, the scheduler 144 performs a decoupling method. The implementation of decoupling allows the scheduler 144 to determine a workload state. For example, the scheduler 144 snapshots (e.g., saves) the state of the workload at the point of shutdown (e.g., at block 526) for immediate use a few milliseconds later, to resume at the second resource.

The example scheduler 144 continues execution of the workload restarting at the workload state (block 532). For example, the scheduler 144 configures threads, processes, or images to be executed, at the second resource, at the point of shutdown on the target resource.
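
The sequence of blocks 524-532 could be expressed as follows; the scheduler methods (pause, deploy, resume) and the resource's restore are placeholders for whatever pause, snapshot, and deployment mechanisms the platform provides:

```python
def offload_workload(scheduler, workload, target, second, instances):
    # Block 524: pick the instance variant compatible with the second resource.
    variant = instances[second.kind]
    # Block 526: pause threads/processes/containers at the target resource,
    # snapshotting the workload state at the point of shutdown.
    state = scheduler.pause(workload, target)
    # Block 528: configure the variant to execute at the second resource.
    scheduler.deploy(variant, second)
    # Block 530: exchange the workload state to the second resource.
    second.restore(state)
    # Block 532: continue execution at the saved workload state.
    scheduler.resume(workload, second)
```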

During workload execution, the telemetry controller 152 may periodically and/or aperiodically collect utilization metrics and telemetry data from the resources the workload is executing at. Additionally, the example telemetry controller 152 periodically and/or aperiodically performs comparisons of the utilization metrics to the policy data. In this manner, the orchestrator 142 is constantly making decisions about how to optimize usage of the edge platform resources during workload executions.

In this manner, the example orchestrator 142 and/or scheduler 144 determines if the workload execution is complete (block 534). For example, the scheduler 144 determines the workload threads have been executed and there are no more threads configured to be scheduled (e.g., block 534=YES), and the program of FIG. 5 ends. In other examples, the scheduler 144 determines there are still threads scheduled to be executed (e.g., block 534=NO), and control returns to block 518.

The example integration program 500 of FIG. 5 may be repeated when the edge service and/or otherwise the example orchestrator 204 obtains a new workload.

Turning to FIG. 6, the example comparison program 520 begins when the example telemetry controller 152 obtains policy data from a database (block 602). For example, the telemetry controller 152 utilizes the policy data and the utilization metrics for the comparison program.

The example telemetry controller 152 analyzes the policy data to determine if the policy data is indicative to optimize for performance (block 604). For example, it may be desirable to optimize (e.g., enhance) the quality of workload execution (e.g., the quality of video streaming). If the example telemetry controller 152 determines the policy data is indicative to optimize for performance (e.g., block 604=YES), then the example telemetry controller 152 analyzes the utilization metrics with regard to performance. The example telemetry controller 152 determines a performance metric from the utilization metrics (block 606). For example, the telemetry controller 152 determines network throughput, bandwidth, bit rate, latency, etc., of the resource executing the workload.

The example telemetry controller 152 determines if the performance metric(s) meet a performance threshold corresponding to the policy data (block 608). A performance threshold is indicative of the smallest allowable performance metric that the workload is to meet, as required by the policy data. If the telemetry controller 152 determines the workload performance metric(s) do meet a performance threshold corresponding to the policy data (e.g., block 608=YES), the telemetry controller 152 generates a notification (block 612) indicative that the comparison results are not indicative to offload the workload, and the comparison program 520 returns to the program of FIG. 5.

If the telemetry controller 152 determines the workload performance metric(s) do not meet a performance threshold corresponding to the policy data (e.g., block 608=NO), the telemetry controller 152 determines a second resource at which the performance of the workload will meet the performance threshold (block 610). For example, the capability data may be obtained by the telemetry controller 152. The telemetry controller 152 may analyze the capability models corresponding to other resources in the edge platform to make a decision based on the capability model. For example, the capability model may indicate that an accelerator resource can perform two tera operations per second, and the telemetry controller 152 makes a decision to execute the workload at the accelerator resource.

The example telemetry controller 152 generates a notification (block 612) corresponding to the comparison result and the second resource, and control returns to the program of FIG. 5. For example, the telemetry controller 152 generates a notification indicative that the workload is to be offloaded from the target resource to the second resource.

In some examples, if the example telemetry controller 152 determines the policy data is not indicative to optimize for performance (e.g., block 604=NO), then the example telemetry controller 152 determines if the policy data is indicative to optimize for power consumption (block 614). For example, it may be desirable to optimize the power consumption during workload execution (e.g., the saving of battery life while video streaming). If the example telemetry controller 152 determines the policy data is indicative to optimize for power consumption (e.g., block 614=YES), then the example telemetry controller 152 analyzes the utilization metrics with regard to power usage.

The example telemetry controller 152 determines a power consumption metric from the utilization metrics (block 616). For example, the telemetry controller 152 determines CPU cycles used, CPU cores used, etc. during workload execution.

The example telemetry controller 152 determines if the power consumption metric(s) exceed a consumption threshold corresponding to the policy data (block 618). A consumption threshold is indicative of the largest allowable power usage metric that the workload is not to exceed, as required by the policy data. If the telemetry controller 152 determines the power consumption metric(s) do not exceed the consumption threshold corresponding to the policy data (e.g., block 618=NO), the telemetry controller 152 generates a notification (block 622) indicative that the comparison results are not indicative to offload a workload and control returns to the program of FIG. 5.

If the telemetry controller 152 determines the power consumption metric(s) do exceed the consumption threshold corresponding to the policy data (e.g., block 618=YES), the telemetry controller 152 determines a second resource at which the power usage of the workload will be reduced (block 620). For example, the capability data may be obtained by the telemetry controller 152. The telemetry controller 152 may analyze the capability models corresponding to other resources in the edge platform to make a decision based on the capability model. For example, the capability model may indicate that a general purpose processing unit includes multiple unused cores, and the telemetry controller 152 makes a decision to execute the workload at the general purpose processing unit resource. The example telemetry controller 152 generates a notification (block 622) indicative of the second resource, and the comparison program 520 returns to the program of FIG. 5.

In some examples, at block 614, the telemetry controller 152 determines the policy data is not indicative to optimize for power consumption (e.g., block 614=NO). In this manner, the example telemetry controller 152 determines the optimization policy (block 624). For example, the telemetry controller 152 analyzes the policy data to determine the specifications set forth by a tenant.

The example telemetry controller 152 determines if the utilization metrics and/or telemetry data meet policy data specifications (block 626). For example, if the policy specifications are indicative to limit temperature of hardware (e.g., CPU temperature) and the telemetry data is indicative that the temperature of the target resource is at an above-average level, then the telemetry controller 152 determines the utilization metrics and/or telemetry data do not meet policy data specifications (e.g., block 626=NO). In this manner, the example telemetry controller 152 determines a second resource to offload the workload (block 628). For example, the telemetry controller 152 determines a resource that will reduce the temperature of the resource executing the workload. The example telemetry controller 152 generates a notification (block 630) indicative of the second resource.

If the example telemetry controller 152 determines the utilization metrics and/or telemetry data do meet policy data specifications (e.g., block 626=YES), the example telemetry controller 152 generates a notification (block 630) indicative that the workload is not to be offloaded. Control returns to the program of FIG. 5 after the telemetry controller 152 generates the notification.
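
Pulling the branches of FIG. 6 together, the comparison program might be sketched as the decision tree below; the threshold names, metric keys, and capability-model scoring are all invented for illustration:

```python
def comparison_program(metrics: dict, policy: dict, capability_models: list):
    """Return (offload, second_resource) per blocks 604-630."""
    if policy.get("optimize") == "performance":                  # block 604
        if metrics["latency_ms"] <= policy["max_latency_ms"]:    # block 608=YES
            return False, None                                   # block 612: no offload
        return True, best_resource(capability_models, "perf")    # block 610
    if policy.get("optimize") == "power":                        # block 614
        if metrics["watts"] <= policy["max_watts"]:              # block 618=NO
            return False, None                                   # block 622: no offload
        return True, best_resource(capability_models, "power")   # block 620
    # Blocks 624-628: other tenant-specified policies, e.g., a temperature limit.
    if metrics.get("temp_c", 0) > policy.get("max_temp_c", float("inf")):
        return True, best_resource(capability_models, "thermal")
    return False, None                                           # block 630

def best_resource(capability_models, goal):
    # Blocks 610/620/628: choose the candidate whose capability model
    # scores highest for the active optimization goal.
    return max(capability_models, key=lambda m: m["score"].get(goal, 0))["name"]
```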

FIG. 7 is a block diagram of an example processor platform 700 structured to execute the instructions of FIGS. 4-6 to implement the example edge platform 140 and/or the example edge services 130A-C (e.g., edge service 200) of FIG. 1. The processor platform 700 can be, for example, a server, a personal computer, a workstation, a self-learning machine (e.g., a neural network), a mobile device (e.g., a cell phone, a smart phone, a tablet such as an iPad™), a personal digital assistant (PDA), an Internet appliance, a DVD player, a CD player, a digital video recorder, a Blu-ray player, a gaming console, a personal video recorder, a set top box, a headset or other wearable device, or any other type of computing device.

The processor platform 700 of the illustrated example includes a processor 712. The processor 712 of the illustrated example is hardware. For example, the processor 712 can be implemented by one or more integrated circuits, logic circuits, microprocessors, GPUs, DSPs, security modules, co-processors, accelerators, ASICs, CPUs that operate in a secure (e.g., isolated) mode, or controllers from any desired family or manufacturer. The hardware processor may be a semiconductor based (e.g., silicon based) device. In this example, the processor implements the example orchestrator 142, the example scheduler 144, the example capability controller 146, the example resource(s) 149, the example telemetry controller 152, the example security controller 154, the example orchestrator 204, the example registration controller 206, the example policy controller 208, and the example capabilities controller 210.

The processor 712 of the illustrated example includes a local memory 713 (e.g., a cache). The processor 712 of the illustrated example is in communication with a main memory including a volatile memory 714 and a non-volatile memory 716 via a bus 718. The bus 718 may implement the example network communication interface 141. The volatile memory 714 may be implemented by Synchronous Dynamic Random Access Memory (SDRAM), Dynamic Random Access Memory (DRAM), RAMBUS® Dynamic Random Access Memory (RDRAM®) and/or any other type of random access memory device. The non-volatile memory 716 may be implemented by flash memory and/or any other desired type of memory device. Access to the main memory 714, 716 is controlled by a memory controller.

The processor platform 700 of the illustrated example also includes an interface circuit 720. The interface circuit 720 may be implemented by any type of interface standard, such as an Ethernet interface, a universal serial bus (USB), a Bluetooth® interface, a near field communication (NFC) interface, and/or a PCI express interface. The interface circuit 720 implements the example interface 131 and/or the example second resource (e.g., an interface resource) 310.

In the illustrated example, one or more input devices 722 are connected to the interface circuit 720. The input device(s) 722 permit(s) a user to enter data and/or commands into the processor 712. The input device(s) can be implemented by, for example, an audio sensor, a microphone, a camera (still or video), a keyboard, a button, a mouse, a touchscreen, a track-pad, a trackball, isopoint and/or a voice recognition system.

One or more output devices 724 are also connected to the interface circuit 720 of the illustrated example. The output devices 724 can be implemented, for example, by display devices (e.g., a light emitting diode (LED), an organic light emitting diode (OLED), a liquid crystal display (LCD), a cathode ray tube display (CRT), an in-place switching (IPS) display, a touchscreen, etc.), a tactile output device, a printer and/or speaker. The interface circuit 720 of the illustrated example, thus, typically includes a graphics driver card, a graphics driver chip and/or a graphics driver processor.

The interface circuit 720 of the illustrated example also includes a communication device such as a transmitter, a receiver, a transceiver, a modem, a residential gateway, a wireless access point, and/or a network interface to facilitate exchange of data with external machines (e.g., computing devices of any kind) via a network 726. The communication can be via, for example, an Ethernet connection, a digital subscriber line (DSL) connection, a telephone line connection, a coaxial cable system, a satellite system, a line-of-site wireless system, a cellular telephone system, etc.

The processor platform 700 of the illustrated example also includes one or more mass storage devices 728 for storing software and/or data. Examples of such mass storage devices 728 include floppy disk drives, hard drive disks, compact disk drives, Blu-ray disk drives, redundant array of independent disks (RAID) systems, and digital versatile disk (DVD) drives.

The machine executable instructions 732 of FIGS. 4-6 may be stored in the mass storage device 728, in the volatile memory 714, in the non-volatile memory 716, and/or on a removable non-transitory computer readable storage medium such as a CD or DVD.

From the foregoing, it will be appreciated that example methods, apparatus and articles of manufacture have been disclosed that utilize the full computing capabilities at the edge of the network to provide the desired optimizations corresponding to workload execution. Additionally, examples disclosed herein reduce application and/or software development burden both for the developers of the application software and the managers automating the application software for edge installation. The disclosed methods, apparatus and articles of manufacture improve the efficiency of using a computing device by allocating edge computing workloads to available resource(s) of the edge platform or by directing edge computing workloads away from a stressed or overutilized resource of the edge platform. The disclosed methods, apparatus and articles of manufacture are accordingly directed to one or more improvement(s) in the functioning of a computer.

Although certain example methods, apparatus and articles of manufacture have been disclosed herein, the scope of coverage of this patent is not limited thereto. On the contrary, this patent covers all methods, apparatus and articles of manufacture fairly falling within the scope of the claims of this patent.

Example methods, apparatus, systems, and articles of manufacture to offload and onload workloads in an edge environment are disclosed herein. Further examples and combinations thereof include the following: Example 1 includes an apparatus comprising a telemetry controller to determine that a workload is to be offloaded from a first resource to a second resource of a platform, and a scheduler to determine an instance of the workload that is compatible with the second resource, and schedule the workload to continue execution based on an exchange of a workload state from the first resource to the second resource, the workload state indicative of a previous thread executed at the first resource.

Example 2 includes the apparatus of example 1, further including a capability controller to generate a resource model indicative of one or more resources of the platform based on invoking a composition.

Example 3 includes the apparatus of example 1, wherein the telemetry controller is to obtain utilization metrics corresponding to the workload, compare the utilization metrics to a policy, and based on the comparison, determine that the workload is to be offloaded from the first resource to the second resource.

Example 4 includes the apparatus of example 1, wherein the scheduler is to pause execution of the workload at the first resource, save the workload state of the workload into a memory, and offload the workload to the second resource, the second resource to obtain the workload state from the memory for continued execution of the workload at the workload state.

Example 5 includes the apparatus of example 1, wherein the telemetry controller is to periodically compare utilization metrics to policy data to optimize execution of the workload at the platform.

Example 6 includes the apparatus of example 1, further including an orchestrator to distribute the workload between two or more resources when first threads corresponding to a first task of the workload are optimizable on the first resource and second threads corresponding to a second task of the workload are optimizable on the second resource.

Example 7 includes the apparatus of example 1, further including an orchestrator to orchestrate generation of multiple instances of a workload based on capability information, the capability information corresponding to one or more available resources of the platform in which the workload is configured to execute.

Example 8 includes the apparatus of example 1, wherein the telemetry controller is to obtain utilization metrics corresponding to the workload, compare the utilization metrics to a policy, and based on the comparison, determine that the workload is to be onloaded from the second resource to the first resource.

Example 9 includes a non-transitory computer readable storage medium comprising instructions that, when executed, cause a machine to at least determine that a workload is to be offloaded from a first resource to a second resource, determine an instance of the workload that is compatible with the second resource, and schedule the workload to continue execution based on an exchange of a workload state from the first resource to the second resource, the workload state indicative of a previous thread executed at the first resource.

Example 10 includes the non-transitory computer readable storage medium of example 9, wherein the instructions, when executed, cause the machine to generate a resource model indicative of one or more resources of a platform based on invoking a composition.

Example 11 includes the non-transitory computer readable storage medium of example 9, wherein the instructions, when executed, cause the machine to obtain utilization metrics corresponding to the workload, compare the utilization metrics to a policy, and based on the comparison, determine that the workload is to be offloaded from the first resource to the second resource.

Example 12 includes the non-transitory computer readable storage medium of example 9, wherein the instructions, when executed, cause the machine to pause execution of the workload at the first resource, save the workload state of the workload into a memory, and offload the workload to the second resource, the second resource to obtain the workload state from the memory for continued execution of the workload at the workload state.

Example 13 includes the non-transitory computer readable storage medium of example 9, wherein the instructions, when executed, cause the machine to periodically compare utilization metrics to policy data to optimize execution of the workload at a platform.

Example 14 includes the non-transitory computer readable storage medium of example 9, wherein the instructions, when executed, cause the machine to distribute the workload between two or more resources when first threads corresponding to a first task of the workload are optimizable on the first resource and second threads corresponding to a second task of the workload are optimizable on the second resource.

Example 15 includes the non-transitory computer readable storage medium of example 9, wherein the instructions, when executed, cause the machine to orchestrate generation of multiple instances of the workload based on capability information, the capability information corresponding to one or more available resources of a platform in which the workload is configured to execute.

Example 16 includes the non-transitory computer readable storage medium of example 9, wherein the instructions, when executed, cause the machine to obtain utilization metrics corresponding to the workload, compare the utilization metrics to a policy, and based on the comparison, determine that the workload is to be onloaded from the second resource to the first resource.

Example 17 includes a method comprising determining that a workload is to be offloaded from a first resource to a second resource, determining an instance of the workload that is compatible with the second resource, and scheduling the workload to continue execution based on an exchange of a workload state from the first resource to the second resource, the workload state indicative of a previous thread executed at the first resource.

Example 18 includes the method of example 17, further including generating a resource model indicative of one or more resources of a platform based on invoking a composition.

Example 19 includes the method of example 17, further including obtaining utilization metrics corresponding to the workload, comparing the utilization metrics to a policy, and based on the comparison, determining that the workload is to be offloaded from the first resource to the second resource.

Example 20 includes the method of example 17, further including pausing execution of the workload at the first resource, saving the workload state of the workload into a memory, and offloading the workload to the second resource, the second resource to obtain the workload state from the memory for continued execution of the workload at the workload state.

Example 21 includes the method of example 17, further including periodically comparing utilization metrics to policy data to optimize execution of the workload at a platform.

Example 22 includes the method of example 17, further including distributing the workload between two or more resources when first threads corresponding to a first task of the workload are optimizable on the first resource and second threads corresponding to a second task of the workload are optimizable on the second resource.

Example 23 includes the method of example 17, further including orchestrating a generation of multiple instances of the workload based on capability information, the capability information corresponding to one or more resources of a platform in which the workload is configured to execute.

Example 24 includes the method of example 17, further including obtaining utilization metrics corresponding to the workload, comparing the utilization metrics to a policy, and based on the comparison, determining that the workload is to be onloaded from the second resource to the first resource.

Example 25 includes an apparatus to distribute a workload at an edge platform, the apparatus comprising means for determining to determine that the workload is to be offloaded from a first resource to a second resource, and means for scheduling to determine an instance of the workload that is compatible with the second resource, and schedule the workload to continue execution based on an exchange of a workload state from the first resource to the second resource, the workload state indicative of a previous thread executed at the first resource.

Example 26 includes the apparatus of example 25, wherein the determining means is configured to obtain utilization metrics corresponding to the workload, compare the utilization metrics to a policy, and based on the comparison, determine that the workload is to be offloaded from the first resource to the second resource.

Example 27 includes the apparatus of example 25, wherein the scheduling means is configured to pause execution of the workload at the first resource, save the workload state of the workload into a memory, and offload the workload to the second resource, the second resource to obtain the workload state from the memory for continued execution of the workload at the workload state.

Example 28 includes the apparatus of example 25, further including means for orchestrating to distribute the workload between two or more resources when first threads corresponding to a first task of the workload are optimizable on the first resource and second threads corresponding to a second task of the workload are optimizable on the second resource.

Example 29 includes the apparatus of example 25, wherein the determining means is configured to periodically compare utilization metrics to policy data to optimize execution of the workload at the platform.

Example 30 includes the apparatus of example 25, wherein the determining means is configured to obtain utilization metrics corresponding to the workload, compare the utilization metrics to a policy, and based on the comparison, determine that the workload is to be onloaded from the second resource to the first resource. The following claims are hereby incorporated into this Detailed Description by this reference, with each claim standing on its own as a separate embodiment of the present disclosure.

Claims

1. An apparatus comprising:

a telemetry controller to determine that a workload is to be offloaded from a first resource to a second resource of a platform; and
a scheduler to: determine an instance of the workload that is compatible with the second resource; and schedule the workload to continue execution based on an exchange of a workload state from the first resource to the second resource, the workload state indicative of a previous thread executed at the first resource.

2. The apparatus of claim 1, further including a capability controller to generate a resource model indicative of one or more resources of the platform based on invoking a composition.

3. The apparatus of claim 1, wherein the telemetry controller is to:

obtain utilization metrics corresponding to the workload;
compare the utilization metrics to a policy; and
based on the comparison, determine that the workload is to be offloaded from the first resource to the second resource.

4. The apparatus of claim 1, wherein the scheduler is to:

pause execution of the workload at the first resource;
save the workload state of the workload into a memory; and
offload the workload to the second resource, the second resource to obtain the workload state from the memory for continued execution of the workload at the workload state.

5. The apparatus of claim 1, wherein the telemetry controller is to periodically compare utilization metrics to policy data to optimize execution of the workload at the platform.

6. The apparatus of claim 1, further including an orchestrator to distribute the workload between two or more resources when first threads corresponding to a first task of the workload are optimizable on the first resource and second threads corresponding to a second task of the workload are optimizable on the second resource.

7. The apparatus of claim 1, further including an orchestrator to orchestrate generation of multiple instances of a workload based on capability information, the capability information corresponding to one or more available resources of the platform in which the workload is configured to execute.

8. The apparatus of claim 1, wherein the telemetry controller is to:

obtain utilization metrics corresponding to the workload;
compare the utilization metrics to a policy; and
based on the comparison, determine that the workload is to be onloaded from the second resource to the first resource.

9. A non-transitory computer readable storage medium comprising instructions that, when executed, cause a machine to at least:

determine that a workload is to be offloaded from a first resource to a second resource;
determine an instance of the workload that is compatible with the second resource; and
schedule the workload to continue execution based on an exchange of a workload state from the first resource to the second resource, the workload state indicative of a previous thread executed at the first resource.

10. The non-transitory computer readable storage medium of claim 9, wherein the instructions, when executed, cause the machine to generate a resource model indicative of one or more resources of a platform based on invoking a composition.

11. The non-transitory computer readable storage medium of claim 9, wherein the instructions, when executed, cause the machine to:

obtain utilization metrics corresponding to the workload;
compare the utilization metrics to a policy; and
based on the comparison, determine that the workload is to be offloaded from the first resource to the second resource.

12. The non-transitory computer readable storage medium of claim 9, wherein the instructions, when executed, cause the machine to:

pause execution of the workload at the first resource;
save the workload state of the workload into a memory; and
offload the workload to the second resource, the second resource to obtain the workload state from the memory for continued execution of the workload at the workload state.

13. The non-transitory computer readable storage medium of claim 9, wherein the instructions, when executed, cause the machine to periodically compare utilization metrics to policy data to optimize execution of the workload at a platform.

14. The non-transitory computer readable storage medium of claim 9, wherein the instructions, when executed, cause the machine to distribute the workload between two or more resources when first threads corresponding to a first task of the workload are optimizable on the first resource and second threads corresponding to a second task of the workload are optimizable on the second resource.

15. The non-transitory computer readable storage medium of claim 9, wherein the instructions, when executed, cause the machine to orchestrate generation of multiple instances of the workload based on capability information, the capability information corresponding to one or more available resources of a platform in which the workload is configured to execute.

16. The non-transitory computer readable storage medium of claim 9, wherein the instructions, when executed, cause the machine to:

obtain utilization metrics corresponding to the workload;
compare the utilization metrics to a policy; and
based on the comparison, determine that the workload is to be onloaded from the second resource to the first resource.

17. A method comprising:

determining that a workload is to be offloaded from a first resource to a second resource;
determining an instance of the workload that is compatible with the second resource; and
scheduling the workload to continue execution based on an exchange of a workload state from the first resource to the second resource, the workload state indicative of a previous thread executed at the first resource.

18. The method of claim 17, further including generating a resource model indicative of one or more resources of a platform based on invoking a composition.

19. The method of claim 17, further including:

obtaining utilization metrics corresponding to the workload;
comparing the utilization metrics to a policy; and
based on the comparison, determining that the workload is to be offloaded from the first resource to the second resource.

20. The method of claim 17, further including:

pausing execution of the workload at the first resource;
saving the workload state of the workload into a memory; and
offloading the workload to the second resource, the second resource to obtain the workload state from the memory for continued execution of the workload at the workload state.

21. The method of claim 17, further including periodically comparing utilization metrics to policy data to optimize execution of the workload at a platform.

22. The method of claim 17, further including distributing the workload between two or more resources when first threads corresponding to a first task of the workload are optimizable on the first resource and second threads corresponding to a second task of the workload are optimizable on the second resource.

23. The method of claim 17, further including orchestrating a generation of multiple instances of the workload based on capability information, the capability information corresponding to one or more resources of a platform in which the workload is configured to execute.

24. The method of claim 17, further including:

obtaining utilization metrics corresponding to the workload;
comparing the utilization metrics to a policy; and
based on the comparison, determining that the workload is to be onloaded from the second resource to the first resource.

25. An apparatus to distribute a workload at an edge platform, the apparatus comprising:

means for determining to determine that the workload is to be offloaded from a first resource to a second resource; and
means for scheduling to: determine an instance of the workload that is compatible with the second resource; and schedule the workload to continue execution based on an exchange of a workload state from the first resource to the second resource, the workload state indicative of a previous thread executed at the first resource.

26.-30. (canceled)

Patent History
Publication number: 20200142735
Type: Application
Filed: Dec 20, 2019
Publication Date: May 7, 2020
Inventors: Christian Maciocco (Portland, OR), Kshitij Doshi (Tempe, AZ), Francesc Guim Bernat (Barcelona), Ned Smith (Beaverton, OR)
Application Number: 16/723,702
Classifications
International Classification: G06F 9/48 (20060101); G06F 11/34 (20060101); G06F 9/50 (20060101); G06F 9/38 (20060101); G06F 9/445 (20060101); G06F 8/41 (20060101);