DISTRIBUTED COMPUTING SYSTEM WITH MULTI TENANCY BASED ON APPLICATION SLICES

A distributed computing system has interconnected clusters with compute nodes executing a set of microservices in containers organized into multi-container pods. The system includes application slice components distributed among the clusters to define and operate a plurality of application slices providing application slice services for respective sets of pods distributed among the clusters. The clusters are configured in a multi-tenancy in which distinct tenants each include a respective distinct set of the application slices and are configured according to respective per-tenant configuration data.

Description
BACKGROUND

The present invention relates to the field of computing systems with automated deployment, scaling, and management of containerized applications across multiple clusters (e.g., Kubernetes clusters) in a hybrid/multi-cloud, or multi-datacenter environment.

SUMMARY

A distributed computing system has interconnected clusters with compute nodes executing a set of microservices in containers organized into multi-container pods. The system includes application slice components distributed among the clusters to define and operate a plurality of application slices providing application slice services for respective sets of pods distributed among the clusters. The clusters are configured in a multi-tenancy in which distinct tenants each include a respective distinct set of the application slices and are configured according to respective per-tenant configuration data.

BRIEF DESCRIPTION OF THE DRAWINGS

The foregoing and other objects, features and advantages will be apparent from the following description of embodiments of the invention, as illustrated in the accompanying drawings in which like reference characters refer to the same parts throughout the different views.

FIG. 1 is a block diagram of a multi-cluster Kubernetes-based distributed computing system;

FIG. 2 is a block diagram of a cluster;

FIG. 3 is a schematic block diagram of a service model;

FIG. 4 is a block diagram of a multi-cluster arrangement employing application slice overlay networks;

FIG. 5 is a block diagram of slice network components;

FIG. 6 is a block diagram of a multi-cluster arrangement employing application slices without overlay networks;

FIG. 7 is a schematic diagram of multi-tenancy using application slices across a set of clusters;

FIG. 8 is a flow diagram for a workflow of deploying application slices for multi-tenancy.

DETAILED DESCRIPTION

The content of U.S. Application No. 63/183,244 filed on May 3, 2021, entitled “Smart Application Framework”, is hereby incorporated by reference in its entirety.

Overview

The disclosure is directed to a container-based service deployment system having Pod/Node/Cluster architecture and corresponding management and operational functions, which in one embodiment may be realized using Kubernetes® components. In one major aspect the disclosure is directed to “multi-tenancy” in an application environment, which provides an ability to run applications from different customers, teams, or other administrative units (“tenants”) simultaneously sharing cluster resources while providing compute resource, network, security, and policy isolation among the tenants.

Existing Kubernetes practices do not provide multi-tenancy as a first-class construct for users and resources. When teams deploy applications on one or more Kubernetes clusters, operational challenges arise in managing the namespaces and associated shared resources across all the deployed applications. In some cases, this can lead to security concerns and resource contention due to resource-intensive applications. In addition, with multi-cluster deployments, admins face tedious operational management challenges to extend normalized resource quota management, namespace sameness, and configuration drift management. They lack a normalized way to support multi-tenancy related configuration and features like a secure overlay network for network traffic isolation, application namespace association, namespace sameness, resource quota management and isolation based on container and overlay network policies, zero-trust security related features, and slice optimization specific to customer/tenant applications across one or more clusters.

The present disclosure is directed to methods and apparatus that address the above shortcomings using a construct called the Application Slice, which can exhibit some or all of the following:

The Mesh platform (also known as "Mesh" or "KubeSlice") combines network, application, Kubernetes, and deployment services in a framework to accelerate application deployment in a multi-cluster, multi-tenant environment. KubeSlice achieves this by creating logical application slice boundaries that allow pods and services to communicate seamlessly across clusters, clouds, edges, and data centers. As enterprises expand application architectures to span multiple clusters located in data centers or cloud provider regions, or across cloud providers, Kubernetes clusters need the ability to fully integrate connectivity and pod-to-pod communications with namespace propagation across clusters. The Smart Application Framework makes it easier to scale and operate a cloud business. It infuses intelligence and automation on top of the existing infrastructure to make the application infrastructure smarter and grow efficiently while improving quality. The framework includes: (1) the Smart Application Mesh (KubeSlice/Mesh Platform); (2) the Application Slice; and (3) Smart Applications like an AIOps-driven load balancer or workload placement.

The platform enables creating multiple logical slices in a single cluster or group of clusters regardless of their physical location. Existing intra-cluster communication remains local to the cluster utilizing the CNI interface. An application slice provides isolation of network traffic between clusters by creating an overlay network for inter-cluster communication. Clusters are interconnected using secure gateways. One or more clusters may be attached to the slice. Each slice has its own separate L3 domain address space—a separate subnet. Each cluster that is part of the slice has a part of the slice subnet. Application Pods are connected to a slice and can connect to each other on the slice subnet, creating an overlay L3 network using slice routers across the slice. The overlay L3 network is a collection of virtual wires (vWires), and the connectivity is driven by the network service names (namespace-driven) associating workloads/applications to a slice. Applications/Pods that are attached to the slice have an IP interface to the slice-specific L3 address space. Each slice may include a global namespace that is normalized across the slice—in all the clusters that are attached to the slice. All the services that are attached to the slice (across one or more clusters) are visible to each other via slice-wide service discovery. Services are exported from one attached cluster in the slice to all the clusters that are attached to the slice. Exported services are only visible to the applications/services attached to the slice.

The platform architecture consists of several components that interact with each other to manage the lifecycle of the slice components and its overlay network. The Mesh platform enables creation of a collection of microservices and/or virtual machines, irrespective of location, be it in a data center or in multi-cloud, to form a domain. This domain acts as micro-segmentation with respect to the rest of the workloads. A slice has the capability of spanning across clusters and geographical boundaries. An application slice is an overlay on an existing service mesh or hybrid footprint. The platform enables zero-trust security across all workloads/microservices. The system federates security for service-to-service communication. A security controller works as a typical Kubernetes-native application with Custom Resources and Controllers, with no additional infrastructure or custom configuration formats.

The platform enables customers to extend compute resources to the Edge. A small footprint enables workloads to scale out to edge compute and appear as a cloud extension to the rest of the services.

The system can employ Reinforcement Learning (RL) for load balancing of service-to-service communication. RL-based load balancing of service-to-service communication improves utilization of resources and has a strong positive impact on customer experience. RL-based load balancing also helps identify bottlenecks in service-to-service communication proactively.

The Smart Application Overlay works in a multi-cluster environment with slices. In a multi-cluster environment, service discovery, security, and namespaces are normalized to create a surface area which has fine-grained traffic control and security posture.

The Mesh provides a seamless way to manage, connect, secure, and observe applications that need to run workloads on the edge as well as public cloud.

The disclosed system addresses an opportunity that has arisen from the development of the ‘Service Mesh’ (like Istio™) and ‘Network Service Mesh (NSM)’ constructs originating from the development of Kubernetes, microservices, and other technologies under the umbrella of ‘Cloud Native Computing.’ These technologies have enabled multi-cloud distributed applications with Kubernetes microservices clusters deployed across multiple public clouds, edge clouds and customer premise private clouds. It is now possible to create an application overlay infrastructure that interconnects distributed application clusters/Pods across domains. These application specific overlays can now provide a tight binding between an application and its overlay network. Applications can now specify the exact connectivity and QOS requirements required for the application. This allows application developers to build and deploy application overlay networks that support application driven traffic engineering/steering with network-level QOS on the underlying infrastructure.

In accordance with certain embodiments, disclosed herein is an “Application Slice”—a key feature of the Mesh Platform. The platform allows operators to build application slices—application overlays—that are a way of grouping application pods based on one or more organizing principles such as velocity of deployment, security, governance, teams, deployment environments like production/development/pre-production, etc.

The Mesh provides mechanisms to create and manage slices—creating an overlay network, applying network policy and service discovery across the slice, continuously monitoring slices, and observing slice telemetry, service-to-service relationships, and traffic prioritization and management.

In some embodiments, the Mesh supports combinations of the following:

    • Operators that create, monitor, and manage application slice overlay networks that are specific to each set of distributed applications.
    • Connecting, securing, and deploying the microservices across multiple Kubernetes clusters using application slices. A cluster can be part of multiple slices simultaneously.
    • Applying Network Policies and Service Discovery across the slice, ensuring traffic isolation within their respective overlay networks while also managing traffic and its prioritization.
    • Observing slice telemetry and service-to-service relationships.
    • Provides separate independent L3 domain per slice
    • Provides an ability to create multiple slices in one or more clusters
    • Provides micro-segmentation in one or more clusters using application slices
    • Provides a mechanism to create and manage global namespace for application slice and normalize that across the slice worker clusters
    • Provides mechanism to associate namespaces to application slices and normalize that across the slice worker clusters
    • Provides a mechanism to associate resource quotas with an application slice and associated namespaces and normalize that across the slice worker clusters
    • Provides a mechanism to create and apply network policies to an application slice and normalize them across the slice worker clusters
    • Provides secure inter-domain connectivity across slice worker clusters
      • Separate VPN/IPSEC/L2TP/etc. tunnels, per slice network namespace
    • Namespace-driven connectivity across the slice—using network service mesh
    • Provides mechanism to integrate Service Mesh(es) across the slice
    • Provides mechanisms to import/export services from/to separate Service Mesh control planes across clusters
    • Provides mechanism to incorporate ingress/egress gateways to scale the service deployment and discovery
    • Provides declarative mechanisms for slice management
    • Provides an overlay data plane (CNI agnostic) and an associated control plane to build the overlay networks
    • Provides mechanisms for namespace-driven intra-domain (within a cluster) and inter-domain (across clusters) connectivity over an overlay network.

Embodiments

FIG. 1 shows a wide-area distributed computer network having a plurality of distinct sections referred to as “clouds” 10 interconnected by a network referred to as a “substrate network” (S-NW) 12. The substrate network 12 has portions 14 within respective clouds 10 and a portion 16 that interconnects the clouds 10, as shown. Each cloud 10 includes a plurality of clusters C interconnected by the local substrate network portion 14. As generally known in the art, a “cloud” 10 is a collection of networked computing resources that exclusively share certain attributes such as common ownership/administration, physical infrastructure (e.g., datacenter(s)), Internet connectivity, etc. In the present context, a given cloud 10 may be of a variety of types, including for example a private cloud (e.g., company datacenter or campus), public cloud (e.g., leased service provider such as Amazon AWS, Microsoft Azure), etc. The substrate network 12 includes basic network infrastructure such as cabling, interfaces, physical switches, and routers, etc., along with higher-level components contributing to basic local-area and wide-area network interconnection (e.g., DNS servers, etc.).

Also shown in FIG. 1 is an application mesh controller (APP MESH CTRLLR) 18 connected into the substrate network 12 to be capable of communicating with components of the clusters C. The application mesh controller 18 is shown as a standalone subsystem connected to network portion 16 in FIG. 1, but in alternative embodiments it may reside in one of the clouds 10 or in a separate cloud. In the present description the application mesh controller 18 may also be referred to as the “Backend” or “KubeSlice Controller”.

FIG. 2 shows certain internals of a cluster C. Its basic configuration is a collection of compute nodes 20 interconnected by a cluster network 22, which is specialized local network functionality within the cluster C defined on top of underlying local network infrastructure (cabling, switches, routers, etc.). It also includes an overall cluster controller shown as a cluster master 24. A compute node 20 has hardware computing components such as memory, processors, I/O interface circuitry, etc. (not shown) as generally known. Within each compute node 20, computing resources are virtualized into containers 26, which are arranged into groups or sets referred to as “pods” 28. As generally known in the art, a container 26 serves as a lightweight environment for execution of an application, which in the present context is typically a “microservice” as described more below. The pod-based structuring is used to facilitate certain system-level activities such as network communications, compute scheduling, etc. To that end, each pod 28 has a respective CNI network address (IP) 30, defined in the cluster network 22. Each node 20 also includes certain cluster management components including a cluster controller (CLUS-CTL) 32 and a cluster proxy (CLUS-PROXY) 34. In one embodiment, the general arrangement of FIG. 2 may be realized using a set of distributed-computing components known as Kubernetes®.

FIG. 3 illustrates a service model employed in the distributed network. A service 40 is realized using a distributed set of sub-services, also called "microservices", that are executed on respective containers 26 of various pods 28. In the illustrated simplified example, the service 40 has microservice components executing on containers C1 and C3 of one pod 28, and containers C2, C4 and C5 of another pod 28. The depiction in FIG. 3 highlights that the service 40 is a higher-level logic arrangement of components executing in the containers 26.

One important aspect of the disclosed technique is its use of a specialized construct referred to as "application slice." In general, the application slice construct can be used across distinct clusters C to ease the deployment and management of services (specifically, to provide an application-oriented view and organization as opposed to the more structural pod/cluster view and organization, which can have drawbacks as mentioned below). In some embodiments, as described in examples herein, an application slice can include respective overlay network services that provide several communications-related functionalities. More generally, an application slice may minimally include application namespace bindings to the slice and associated resource quota management and namespace-based isolation. Application slices can also be used in conjunction with multi-tenancy as described further below.

FIG. 4 illustrates use of an application slice with overlay network and related functionality. The clusters C are shown as including specialized gateway (GW) nodes 50, each including one or more slice operators (Slice OP) 52, one or more slice DNS servers 53, one or more NetOps (network operator) pods 55, and one or more sets of slice components (COMPS) 54 each for a corresponding slice (e.g., Slice-1 and Slice-2 as shown). Slice components (COMPS) 54 can include slice-specific slice routers, ingress/egress gateways and VPN gateways. Corresponding sets of slice components 54 among the clusters C function to realize corresponding slice overlay networks 56, which are a specialized type of virtual private network (VPN). Thus, the Slice-1 components 54-1 on the clusters C realize an APP1 Slice VPN 56-1, and Slice-2 components 54-2 on the clusters C realize an APP2 Slice VPN 56-2, etc. Within each cluster C, each App Slice VPN 56 is used by a corresponding set of application components (microservices) executing in respective pods 28. Microservices or application components in pods 28 can communicate with other applications or microservices using any of the IP protocols like UDP, TCP, HTTP, GRPC, etc.

FIG. 5 shows the structure of the slice components 54, namely as including a slice-specific router (slice router) 60 and a set of VPN gateways (GW) 62 that interconnect the local cluster C to the various remote clusters C. It will be appreciated that the router 60 effects routing in the overlay network 56 using network addresses defined therein, as distinct from network addresses defined in the underlying substrate network 12. Of course, all overlay network traffic is carried as application-level payloads in the substrate network 12, which may use known VPN techniques. As shown, the slice components 54 may also include one or more ingress and/or egress gateways 64.

Thus, in this embodiment an application slice is an application overlay infrastructure that includes network services/components distributed across multiple clusters C to provide a surface area with its own layer-3 (L3) domain and IP address space. Application slices may extend over multiple clusters C that are deployed in one or more public/private clouds 10 or data centers/edges. The application slice mechanism provides a framework for scalable secure segmentation of pods 28 that allows traffic prioritization, security isolation, service discovery for service-to-service communication across the slice, granular governance, and failover containment. In addition, this mechanism enables granular application performance management using artificial intelligence/machine learning (AI/ML) algorithms and AI driven AppNetOps (AIOps). Finally, an application slice is considered as an “overlay” because it can work with existing cloud-service infrastructure (such as Kubernetes) and may not require significant changes to existing code. For example, a Pod 28 may be included in an application slice by simple addition of an annotation to a Pod specification in the Kubernetes system.
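
For illustration only, the following is a minimal sketch of a Pod specification annotated to join a slice; the annotation key, slice name, and image shown are hypothetical, and the exact annotation key is implementation-specific:

    apiVersion: v1
    kind: Pod
    metadata:
      name: payments-api                       # hypothetical application Pod
      annotations:
        kubeslice.io/slice: green-slice        # assumed annotation key naming the slice to join
    spec:
      containers:
        - name: payments-api
          image: registry.example.com/payments-api:1.0   # hypothetical image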

FIG. 6 illustrates use of application slices 70 (70-1, 70-2) without overlay networks. Slice functionality is realized by the slice components 72 (slice gateway, ingress and/or egress gateways) and application components (microservices) executing in respective pods 28, with configuration and other slice management functionality provided by the slice operator 52 of each cluster C. In this embodiment there is no L3 overlay network, so the slice components 72 do not include a router or VPN gateways, and connectivity is provided via service export/import using ingress/egress gateways or proxies.

FIG. 7 illustrates structure related to support for multi-tenancy. A “tenant” is an administrative grouping of resources and functions that enable seamless operation within each tenant while providing for desired separation and protection between different tenants. In some embodiments the tenants may be different customers or clients, for example, of a compute cloud provider or other compute service provider. In other embodiments tenants may be distinct teams or functions within an organization or across separate organizations. Multi-tenancy provides for desired tailoring of the allocation and use of resources, and in the disclosed system this tailoring is supported through the application slice construct, as described further below.

FIG. 7 shows a simplified example with three tenants 80 (80-1, 80-2 and 80-3). In the clusters C, for each tenant 80 there is a respective distinct set of slices 70. The slices 70 may have different scopes (reach across clusters) as shown. The per-tenant grouping of slices 70 is managed by a slice controller 82, specifically control/management (CNTL/MGMT) logic 84 and a database 86 which stores per-tenant identification and configuration data 88 (88-1, 88-2, 88-3) as shown.

Referring again to FIGS. 4-6, the slices 70 can have a variety of features and functions according to the listing below. Not all of these may be utilized in all embodiments. In the listing, the indication “NW” is used to indicate presence of the respective feature when slices include overlay network functionality as described above with reference to FIG. 4.

Application Slice Features

    • Slice across multiple attached clusters C
      • Slice per application or per set of applications 28, application PODs deployed across the slice
    • Separate L3 domain per slice (NW), overlay network
    • Secure slice with VPN Gateways for inter-cluster communication and mTLS tunnels for intra-cluster communication
    • Slice specific service discovery using service export/import across the slice and lookups using Mesh/Slice DNS
      • Service discovery in headless service deployments
      • Service discovery with Egress and Ingress gateways to scale the service deployment
    • Slice specific ingress/egress gateways for east/west (E/W) and north-south (N/S) traffic
    • Slice specific overlay network policies normalized across the slice
    • Slice specific namespace-based network policies normalized across the slice
    • Slice specific resource quota management normalized across all the slice worker clusters
    • Slice specific slice namespace, namespace associations, global namespace, and normalization of namespace sameness across the slice worker clusters
    • Slice specific Traffic Management—for inter-cluster traffic
      • Traffic Control—Bandwidth control (NW)
      • Slice specific QOS profiles (NW)
      • Slice Priority—e.g., high, medium, and low priority slices
      • HTB (token bucket) and DSCP code-based prioritization
      • Traffic Segmentation across slices and other applications in the slice worker clusters (NW)
    • Slice gateways 62 specific to application slice (NW)
      • Carries only its application inter-domain traffic
    • Namespace-driven connectivity
      • Application Pods 28 connect using network service name (slice name) or namespace name
      • Provides AI/RL policy application driven AIOps mechanisms for tuning global/local load balancers, resource schedulers, auto-scalars, deployment specs, selection of edge resources, path selection, and workload placement across the slice worker clusters
    • Secure Inter-domain connectivity across slice worker clusters

Discovery and Orchestration of Application Slices

During an application deployment, network services are discovered using the slice network namespace, and inter-domain secure overlay links (VPN, etc.) are established to build a distributed-application-specific application overlay network slice.

Slices can use service export/import functions to export/import Kubernetes services and Istio virtual services for slice-wide service discovery. In addition, a Slice Ingress gateway can be used to export services and a Slice Egress gateway can be used for imported services. One or more application namespaces can be associated with these slices. Slice isolation can be enabled by implementing network policies for these namespaces. Slices are defined across clusters C, but in some deployments, it may be beneficial to use slices that exist within a single cluster.

Slice Namespace

The slice namespace is an association of the application-slice-wide L3 network namespace and one or more cluster namespaces with the slice. The slice namespace provides the slice-specific namespace associations with which all the services on the application slice are associated. All the services that are deployed on the slice across all the clusters are associated with the slice namespace associations and are discovered across the slice. The services that are registered with the application slice namespace can be looked up by any of the services on the application slice. The Slice Operators (Slice Controllers) 52 in all the slice-associated clusters C coordinate to normalize the slice namespace across those clusters. They also monitor and enforce the slice namespace associations within the slice. Any application/service to be deployed on the slice must be in one of the associated namespaces of the slice. These services are not visible or accessible outside of the slice (unless exception rules are applied). The slice namespace provides isolation of services to the application slice. Slice network policies can be associated with namespaces that are associated with the slice namespace. These slice network policies provide isolation of traffic and traffic control within the slice and between the slice and the other cluster resources.

Federated Security

The Application Slice offers an important feature—federated security—that automates the creation of Secure Overlay Links (SOL) such as VPNs/VPCs or other wide-area secure interconnection technologies, applies global security policies, removes the burden of security management from the operational staff, and further improves the overall security of the network through automation.

AIOps on Application Slice

During application runtime, an AIOps (AI Ops) component feeds telemetry from the overlay network services to ML/RL agents. The RL agents assist in tuning the overlay network services parameters to optimize the distributed application performance.

Mesh system components include the network service mesh Control plane and Dataplane components to create and manage the Application Slice L3 overlay network. These components include the network service manager, network service Dataplane daemons, network service registry, forwarders and Webhooks management functions. Network service mesh control plane enables the automation of orchestration of slice connectivity between the slice network service clients (Application Pods 28) and slice network services/components 54 such as Slice Routers 60.

Application Mesh Controller (“Backend,” “KubeSlice Controller”) 18

The Backend 18 provides management, visualization, dashboard functions and APIs to manage the life cycle of the slice and slice policy deployment across multiple clusters. In one embodiment the Backend can be implemented using cloud services and, in another embodiment, as the "KubeSlice/Mesh Controller," it can be implemented using Kubernetes-native constructs and custom resource definitions (CRDs).

The Backend/KubeSlice Controller is installed in one of the clusters and provides a central configuration management system for slices across multiple clusters. The KubeSlice Controller can be installed in one of the worker clusters or in a separate cluster.

The Backend/KubeSlice Controller 18 provides:

    • A communication interface through which Slice Operators on multiple clusters can connect to it. The slice configuration, which includes slice VPN gateway, service discovery with service import/export, and ingress/egress gateway related parameters, is relayed to the Slice Operators on registered clusters.
    • Creation and management of cryptographic certificates for secure slice VPN gateways.
    • APIs through the API Gateway for the KubeSlice Manager to create and manage the application slices.

Slice Management Functions and APIs

    • Provides APIs for UI Slice Management and Application onboarding
    • Manages encrypted resources, keys, etc. needed for Slice VPN gateways 62
    • Interacts with the Slice Operators 52 in the clusters for management of slice and slice resources, slice configuration, service export/import for discovery, status, and telemetry, etc.
    • Management of certificates for secure Slice VPN gateways 62
    • Tenants/Customers/User/Operators management
    • Provides mechanisms to create customers/tenants and associated user and service accounts and related projects and other per customer/tenant data
    • Customer/Tenant/User roles and responsibilities, RBP/RBAC management
    • Define roles and responsibilities for slice management
    • Slice Policy management functions and APIs
    • Provides APIs for UI and Slice Policy Operator and ML/RL cluster Slice Policy Functions

Slice Operator 52

In accordance with certain embodiments, the Slice Operator 52 may be a Kubernetes Operator component that manages the life cycle of application-slice-related custom resource definitions (CRDs). It helps to manage the application slices with declarative management support for GitOps based workflows. A SliceCtl tool may be used to manage the Slice CRD resources. Application Slice CRDs can be managed using the Cluster Controller 32 as well.

    • Reconciles slice resources in the cluster and with application mesh controller (KubeSlice Controller) 18
    • Creates slice components needed for Slice VPN Gateway connectivity, Slice Service Discovery, and Slice Policy
    • Auto insertion/deletion of the slice components to accommodate slice cluster topology changes
    • Interacts with slice components for Config/Status updates
    • Interacts with the Backend/Controller to manage the life cycle of the slices, Slice Configuration, Slice Status, Slice Namespace, Slice Telemetry and Slice Policy
    • Interacts with the Backend to facilitate Network Policy Deployment and Service Discovery across the slice
    • Interacts with the Backend to Export Istio Services labelled for export to other clusters attached to the slice
    • Interacts with the Backend to Import Istio Services from other clusters attached to slice
    • Interacts with Backend for RBAC for managing the Slice Components
    • Supports Slice and Slice Policy resource management using GitOps workflows and declarative management.

SliceCtl

In accordance with certain embodiments, SliceCtl is a CLI tool to interact with Slice Operator 52 and manage slices and slice related resources on the cluster. SliceCtl commands include Login, register cluster, attach/detach slice, delete slice, service import/export, etc.

Slice Overlay Network

In an embodiment such as that of FIG. 4, each slice 70 has its own IP L3 domain (e.g., subnet /16) and each cluster C that is attached to the slice gets a part of the subnet (e.g., /24). Slice VPN Gateways 62 connect to local Slice Routers 60.
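
As an illustrative sketch of this address allocation (the subnet values are hypothetical), a slice might be assigned a /16 overlay subnet from which each attached cluster is allocated a /24:

    # Hypothetical slice overlay addressing
    sliceSubnet: 10.1.0.0/16        # L3 domain for the whole slice
    # Per-cluster allocations carved out of the slice subnet:
    #   worker-cluster-1: 10.1.1.0/24
    #   worker-cluster-2: 10.1.2.0/24
    # Application Pods on each cluster receive overlay addresses from their
    # cluster's /24, and the Slice Routers 60 route between the clusters.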

Slice VPN Gateway 62

Slice VPN Gateway 62 is a slice network service component that provides a secure VPN link connection endpoint for the slice on a cluster C. A pair of Slice VPN Gateways 62 are deployed to connect every pair of clusters C attached to a Slice. A VPN Gateway 62 connects to a remote VPN Gateway 62 in a remote cluster C. The Slice Operator 52 manages the life cycle of the Slice VPN Gateways 62. The Slice Operator 52 deploys and manages the configuration and keys/certificates for the operation of the Slice VPN Gateways. The Slice Operator 52 interacts with the Backend to get the slice configuration and auto inserts the slice components like VPN Gateways 62 and Slice Routers 60 for the slice. The Slice Operator 52 constantly interacts with the Slice VPN Gateways 62 for status, keys/certificates, and configuration changes. The Backend manages the VPN gateway pairs for slice-attached clusters and creates the keys and configuration for their operation.

Slice Traffic Control

Slice VPN Gateways 62 are the exit/entry points for all the traffic to/from the Applications Pods 28 on the slice to remote cluster Slice VPN Gateways 62. Slice VPN Gateways 62 are configured with Traffic Control (TC) Policies (with a QOS profile) to manage the traffic shaping for the slice. Slice TC on VPN Gateways 62 support marking the packets with DSCP/COS code points to provide prioritization of the slice traffic.

Slice Router 60

Slice Router 60 is a slice network service (VL3 NSE) component that provides a virtual L3 IP switching functionality for the slice. Each slice in a cluster C has one Slice Router 60, with the possibility of a redundant pair option. Slice Operator 52 manages the life cycle of the Slice Router 60, which includes deploying, configuring and continuously monitoring/managing the Slice Router 60 for the slice. All the Application 28 Pods of the cluster C on the slice connect to Slice Router 60 of the slice. Slice Router 60 provides the connectivity to the rest of the slice components, which are Applications distributed across the clusters C.

When an Application Pod 28 connects to the slice (as a network service client NSC) on a cluster C, the Slice Router 60 manages the establishment of the Slice Interface (NSM interface) on the Application Pod 28—done automatically via injection into the Pod 28. The Application Pods 28 use this Slice Interface to communicate with the other Applications/Network Services (local or remote) on the slice. Slice Router 60 manages the IPAM/routes for the slice cluster applications/components.

NetOps

Each slice in a cluster is associated with a QoS profile. The QoS profile is applied on the tunnel interface of the VPN gateways 62. In addition, on the Gateway nodes 50 the NetOps pods 55 enforce the QoS profiles for all the slices. They use Linux TC (Traffic Control) to apply Hierarchical Token Bucket (HTB), priority, and DSCP values for slice traffic classification.

Mesh DNS (KubeSlice DNS)

Mesh DNS is a core DNS server that is used to resolve service names exposed on application slices. The Slice Operator 52 manages the DNS entries for all the services running on the Slice overlay network(s). When a service is exported on the slice by installing a ServiceExport object, the Slice Operator 52 creates a DNS entry for the service in the Mesh DNS and a similar entry is created in the other clusters that are a part of the slice.
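
A minimal sketch of such a ServiceExport object follows; the API group/version and field names are assumptions for illustration and may differ in a given implementation:

    apiVersion: networking.kubeslice.io/v1beta1   # assumed API group/version
    kind: ServiceExport
    metadata:
      name: payments-api             # hypothetical service to expose on the slice
      namespace: payments            # a namespace associated with the slice
    spec:
      slice: green-slice             # hypothetical slice name
      selector:
        matchLabels:
          app: payments-api
      ports:
        - name: http
          containerPort: 8080
          protocol: TCP

Installing such an object in one cluster would cause the Slice Operator 52 to create the corresponding Mesh DNS entry locally, with a similar entry created in the other clusters that are part of the slice, as described above.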

Slice Istio Components

The application mesh works with Istio service mesh components in a cluster. If Istio is deployed on a cluster, it uses Istio ingress/egress gateway resources to create Slice Ingress/Egress Gateways. These Slice Ingress/Egress Gateways can be manually deployed or auto deployed as part of the slice. Slice Ingress/Egress Gateways can be deployed for E/W traffic.

Slice Egress/Ingress Gateways can be used to export/import slice connected application services across the slice clusters. A Slice Ingress Gateway can be used to export the services from a slice cluster. A Slice Egress Gateway can be used to import the slice services from remote slice clusters. Slice Service Discovery uses the Slice Ingress/Egress Gateways to export/import the application services across the slice clusters. Deployment of the Slice Ingress/Egress Gateways on a slice is optional.

User Interface (UI)

UI (KubeSlice Manager) is a web interface to manage the mesh network across multiple clusters C. The UI can be used for Slice and Slice Policy management. It allows users to register clusters, create slices, and connect clusters. Slice dashboards provide observability into the slice operations—slice network services and application services deployed on the slice across multiple clusters. It allows users to view and explore the slice services topology, slice service discovery data, traffic, latency, and real time health status.

Deploying Application Slice across Multiple Clusters

The mesh allows users to create and manage application slices across multiple clusters C. Based on role-based permissions (RBP), a user can be Cluster Admin, Slice Admin, Application TL, Developer, etc. The Mesh allows multiple ways to create and deploy the slices—UI, Helm Charts/GitOps and Backend APIs.

In some embodiments, the following tasks are performed in preparation for deploying a slice on a cluster:

1. Create worker clusters C, and configure and deploy Istio and other system components

2. Deploy the Mesh/KubeSlice System and Slice Operator 52 components

3. Identify and label a node 20 in a cluster C as a Gateway Node 50 and open appropriate ports (UDP/TCP) for communication

Registering Clusters

Once the KubeSlice/Mesh system components and Operators are installed, users can register the worker clusters C with the Controller 18. The user can use Helm charts or the UI (KubeSlice Manager) to register the clusters. Once clusters are registered, users can create slices.

Installing Slice

There are multiple ways a slice can be created with worker clusters C:

1. Helm chart: Users can specify the slice parameters as values and apply the slice helm chart to the Backend/Controller 18 (a hypothetical values sketch appears after this list). The Slice Controller creates appropriate SliceConfig resources (CRDs) for the configuration. The Slice Operator 52 interacts with the Controller 18 to get the SliceConfig and uses these parameters to create and deploy the slice components on the worker cluster.

2. UI: Users can use the UI to register clusters and create slices. The UI interacts with the Slice Controller 18 using Controller APIs to create SliceConfig resources (CRDs). The Slice Operator 52 interacts with the Controller 18 to get the SliceConfig and uses these parameters to create and deploy the slice components on the worker cluster.
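
The following is a hypothetical values sketch for such a slice helm chart; the value names are assumptions that simply mirror the SliceConfig fields shown later in this description, and the actual names depend on the chart used:

    # Hypothetical values.yaml for a slice helm chart
    sliceName: green-slice
    sliceSubnet: 10.1.0.0/16
    clusters:
      - worker-cluster-1
      - worker-cluster-2
    qosProfile:
      queueType: HTB
      priority: 1
      tcType: BANDWIDTH_CONTROL
      bandwidthCeilingKbps: 5120
      bandwidthGuaranteedKbps: 2560
      dscpClass: AF11
    namespaceIsolation:
      isolationEnabled: true
      applicationNamespaces:
        - worker-cluster-1:payments
        - worker-cluster-2:payments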

Once the slice components are deployed the Slice VPN gateways in worker clusters connect to each other to form a full mesh connectivity.

Deploying Applications Over Application Slice

Users can deploy the Application Services (App Pods 28) on to the slice on a cluster C to access other Application Services that are deployed on the slice in other attached clusters. Slice provides the network connectivity and service discovery to enable service-to-service communication. Users can deploy the Application Service on to a slice in multiple ways.

Users can update the service deployment specifications with slice related annotations to onboard the service and related replicas on to the slice.

Users can also associate namespaces with a slice. In auto onboarding mode, all the services that are deployed on the associated namespaces are onboarded onto the slice by the Slice Operator 52, which updates the deployment specs of the services.

Users can also use UI to onboard the applications/services on to a slice. Users can select and associate namespaces to slice. SliceConfig will be updated with selected namespace associations. Slice Operator 52 onboards the services that belong to the namespaces.

In one embodiment, onboarding of a service on to the slice will result in adding an overlay network interface (NSM interface) to the POD. The POD is attached to the slice overlay network. This will allow that service/POD to communicate with all the other PODs/Services that are attached (onboarded) to slice overlay network using IP/TCP/HTTP/GRPC/UDP/etc. protocols.

Multi-Tenancy with Application Slices

As described above with reference to FIG. 7, the system enables multi-tenancy across one or more clusters C using application slices 70 to support service and namespaces isolation, micro-segmentation for north-south ingress and east-west ingress/egress traffic isolation, resource quota management, normalized network and RL policies and security among customers or applications of a single customer.

Enterprise customers with multiple teams/departments/environments can share one or more cluster resources using one or more application slices per team/department/environments. Each team/department/environment can be a separate customer in the platform. The teams and departments can further have team members or sub-departments.

A service provider with multiple customers from different enterprises or individuals can share one or more cluster resources using one or more application slices per customer.

Application Slices provide features to support multi-tenancy deployment models. Platform provides mechanisms to support multi-tenant application deployments using application slices.

The slice controller 82 allows administrative users (Admins) to create separate customers/tenants 80 in the platform. Each customer/tenant data is kept in isolation in the controller database 86. For each tenant 80, Admins can create one or more slices 70 on which to deploy their applications.

Admins can also configure tenant-wide settings that would be applied to all the slices 70 that are created for the tenant across all the clusters C. The controller 82 may reside in a separate controller cluster, and multi-tenant configuration and resource data are kept separate from the registered slice worker clusters C. Access control to customer data is controlled using service accounts and appropriate RBAC policies.

Each slice configuration has information about the multi-tenancy requirements for the customer and the slice. The platform controller uses that information to orchestrate the slice for the customer across one or more clusters. The slice operator in each cluster implements the orchestration of the slice on its cluster. The slice operator implements and enforces the application slice multi-tenancy requirements. The slice operator constantly monitors the slice metrics and configuration to enforce the multi-tenancy requirements.

Controller maintains the customer/slice associations and configuration details including multi-tenancy related configuration. The controller provides APIs, helm charts/YAMLs and gitOps mechanisms to orchestrate customers/tenants and slices.

The following is an example slice configuration description:

    apiVersion: hub.kubeslice.io/v1alpha1
    kind: SliceConfig
    metadata:
      name: <slice-name>
      namespace: <kubeslice-projectname>
    spec:
      sliceSubnet: <slice-subnet>
      sliceType: Application
      sliceGatewayProvider:
        sliceGatewayType: OpenVPN
        sliceCaType: Local
      sliceIpamType: Local
      clusters:
        - <registered-cluster-name-1>
        - <registered-cluster-name-2>
      qosProfileDetails:
        queueType: HTB
        priority: <qos_priority>
        tcType: BANDWIDTH_CONTROL
        bandwidthCeilingKbps: 5120
        bandwidthGuaranteedKbps: 2560
        dscpClass: AF11
      externalGatewayConfig:
        - ingress:
            enabled: false
          egress:
            enabled: true
          gatewayType: istio
          clusters:
            - <cluster-name>
        - ingress:
            enabled: true
          egress:
            enabled: false
          gatewayType: istio
          clusters:
            - <cluster-name>
      namespaceIsolationProfile:
        isolationEnabled: <enable isolation>
        applicationNamespaces:
          - <cluster name>:<namespace>
        allowedNamespaces:
          - <cluster name>:<namespace>
      resourceQuota:
        enabled: true
        profile: green-quota
      authentication:
        mTLS: true
        profile: auth-profile
      optimization-policy:
        enabled: true
        rl-slice-lb-policy: true
        lb-profile: green-lb-profile
        rl-slice-auto-scalar-policy: true
        auto-scalar-profile: green-auto-scalar-profile
        rl-slice-workload-policy: true
        slice-auto-scalar-profile: green-as-profile

It will be appreciated that in an embodiment such as FIG. 6 having no overlay network, the corresponding configuration statements/parameters are omitted (e.g., subnet configuration, etc.).

The following describes various aspects of the configuration and use of slices in additional detail:

Multiple clusters: Each slice can be deployed across one or more clusters. The registered clusters can be associated with the slice. Platform allows customer configurations related to resource quota, service mesh and overlay network configuration, etc. to support multi-tenancy with application slices across the clusters.

Isolation: To provide application namespaces and associated network traffic isolation the platform supports the namespace association with network policies and overlay network.

Global namespace/namespaces association: Each slice has a global namespace associated with it. This global namespace is the root namespace that would be present in every cluster that is associated with the slice. The global namespace root can be a K8S hierarchical namespace root. Admins can create sub-namespaces under this root namespace and attach them to the slice. In addition, each slice can be associated with one or more k8s native namespaces. The slice configuration carries the application namespaces associated with the slice. In addition, each slice can be configured with a list of allowed namespaces that are allowed to communicate with the associated application namespaces.

Network policy: Each slice is configured to apply network policy for the application namespaces associated with the slice. The network policy allows communication with all the associated application namespaces and listed allowed namespaces. It blocks the communication with other namespaces/applications/Pods. Slice operator implements the namespace association and network policy for the slice. It creates and applies appropriate k8s native container networking and overlay networking resources to provide isolation for both north-south ingress and east-west ingress/egress traffic.
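
For illustration, the following is a minimal sketch of the kind of Kubernetes-native NetworkPolicy a Slice Operator might generate for an associated namespace; the namespace name and the slice label used in the selectors are hypothetical:

    apiVersion: networking.k8s.io/v1
    kind: NetworkPolicy
    metadata:
      name: green-slice-isolation        # hypothetical policy name
      namespace: payments                # an application namespace associated with the slice
    spec:
      podSelector: {}                    # applies to all Pods in the associated namespace
      policyTypes:
        - Ingress
        - Egress
      ingress:
        - from:
            - namespaceSelector:
                matchLabels:
                  kubeslice.io/slice: green-slice   # assumed label on slice-associated namespaces
      egress:
        - to:
            - namespaceSelector:
                matchLabels:
                  kubeslice.io/slice: green-slice

In this sketch, traffic to and from namespaces that do not carry the slice label (other tenants, other slices) is blocked, while additional rules could admit the allowed namespaces listed in the slice configuration.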

Secure Overlay Network: Each slice can be configured with its own secure overlay network that spans across one or more clusters. The secure overlay network provides the needed micro-segmentation traffic isolation and security for multi-tenancy. The secure overlay is created using networking services like VPN gateways, slice IP routers, layer 2 overlay data plane and control plane network services. The secure overlay also integrates with the service mesh control and data plane to provide service discovery across the clusters. The overlay also integrates with north-south ingress and east-west ingress/egress gateways. These services are all specific to each slice. The communication between these services can be purely on the overlay network or a combination of the overlay and container networks. The overlay network is a collection of point-to-point virtual wire (vwire) tunnels. The overlay supports different tunnel protocols like GRE/IP-in-IP/etc. To provide isolation for multi-tenancy, the traffic in the tunnels can be encrypted. The VPN tunnel between the clusters is encrypted with different encryption techniques like OpenVPN, L2TP/IPSEC, WireGuard, PPTP, IKEV2, etc. The controller manages the orchestration of the tunnels by generating configuration and associated keys/certificates and other parameters. In addition, the communication between services inside the cluster and across the clusters can be secured with mTLS authentication.

The secure overlay network in addition to namespace-based network policies and authentication and authorization provides a zero-trust security—essential for multi-tenancy—with the application slices.

RBAC and access control: On both controller clusters and slice clusters, service accounts (and other identity management solution-based tokens) and appropriate RBAC policies are used to provide access control to the customer/slice resources. The allowed service accounts will be able to configure and manage the customer/slice resources, while access to the resources is blocked for others.

Resource quotas and optimized utilization: Admins can configure appropriate resource quotas for customers/tenants/slices. The controller passes these resource quota requirements to all the clusters during slice orchestration. The slice operator in each cluster implements the resource requirements across all the associated namespaces and overlay network services and other slice services. The slice operator also monitors the resource usage and takes appropriate actions like generating alerts and events that can trigger corrective actions by the controller.
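
A minimal sketch of the kind of Kubernetes-native ResourceQuota the slice operator might apply in an associated namespace is shown below; the namespace name and quota values are hypothetical:

    apiVersion: v1
    kind: ResourceQuota
    metadata:
      name: green-slice-quota            # hypothetical quota name derived from the slice
      namespace: payments                # an application namespace associated with the slice
    spec:
      hard:
        requests.cpu: "8"
        requests.memory: 16Gi
        limits.cpu: "16"
        limits.memory: 32Gi
        pods: "50"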

QOS profiles and traffic control: The platform allows Admins to create separate QOS profiles for each customer/slice. This allows the platform to support multi-tenancy with applications/slices with different traffic control/priorities for each tenant. Different tenants can have different QOS profiles—high priority slices, medium priority slices, and low priority slices. Admins will be able to support and enforce multi-tenancy with different traffic control requirements for each customer/tenant.
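
As an illustrative sketch (the profile values are hypothetical), two tenants might be given different QOS profiles using the qosProfileDetails fields of the example slice configuration above:

    # Hypothetical high priority tenant slice
    qosProfileDetails:
      queueType: HTB
      priority: 1
      tcType: BANDWIDTH_CONTROL
      bandwidthCeilingKbps: 20480
      bandwidthGuaranteedKbps: 10240
      dscpClass: EF

    # Hypothetical low priority tenant slice
    qosProfileDetails:
      queueType: HTB
      priority: 3
      tcType: BANDWIDTH_CONTROL
      bandwidthCeilingKbps: 2048
      bandwidthGuaranteedKbps: 512
      dscpClass: AF11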

Slice monitoring for multi-tenancy: Controller and slice operators in all the clusters work together to ingest telemetry from multi-tenancy related resources like namespaces, network policy, overlay network services and other slice services. The configuration drift and other violations are detected, and appropriate alerts and events are generated to the controller so appropriate corrective actions can be taken.

Slice optimization: The platform allows Admins to configure and implement different RL driven slice optimization policies. Some of the RL policies are 1) load-balancer optimization for efficient traffic control and distribution, 2) workload placement for the cost and resource optimization and 3) slice wide auto-scalar policy to optimize the cost and resources used for auto-scaling the application/services deployment.

Service discovery and isolation: Each slice has its own slice-specific service discovery, and services discovered across the slice are isolated from other customers and slices. Each slice can have a dedicated slice DNS to provide isolation across the tenants. Since each overlay has its own L3 IP domain, services from other slices and clusters will not be able to resolve or access the services of the slice over the overlay network. In addition, network policies at the namespaces/pods provide access control as well.

Multi-tenancy with application slices in a single cluster: The platform allows Admins to provide multi-tenancy with application slices for application deployments for multiple customers/tenants in a single cluster. The application slice features that are discussed above apply to single cluster deployments as well.

FIG. 8 is a flow diagram describing an example workflow for multi-tenancy using application slices.

    • 1. At 90, the Slice Controller 82 is installed, for example as a SaaS or Cluster deployment
      • a. Separate customer/tenant data in the database
      • b. Create separate customer/tenant service accounts/IAM accounts, projects, roles, RBACs, etc. data to manage the customers/tenants' slices
      • c. Slice Controller manages the life cycle of the tenants/customer data and associated application slices
    • 2. At 92, Slice Operators are installed in the worker clusters
      • a. Worker clusters are registered with Slice Controller
      • b. One or more worker clusters are shared across one or more tenants/customers
    • 3. At 94, a slice of a given tenant is installed or configured using Slice UI Manager or Helm charts (94 begins a loop of operations for all slices of all tenants, see below).
      • a. Slice configuration has customer/slice specific configuration
      • b. Slice configuration has configuration required to support multi-tenancy related features
      • c. Apply the configuration to Controller
    • 4. At 96, the slice controller 82 validates the slice configuration and generates all the dynamic configuration required by all the clusters to orchestrate the slice related components
      • a. Controller generates the overlay network configuration
      • b. Controller generates the network policy configuration
      • c. Controller generates namespace associations and associated resource quotas
      • d. Controller generates the service discovery, authentication, and related configuration data
      • e. Controller communicates the slice optimization policy configuration to slice RL agent SPC controller
    • 5. At 98, the slice controller 82 sends the slice configuration details to all the clusters that are specified in the configuration. These clusters will participate in the slice.
    • 6. At 100, the slice operators 52 in the clusters implement the slice configuration
      • a. Orchestrates slice components
      • b. Creates required helm charts, deployment YAML files, specs and applies them to create custom and native resources
      • c. Creates overlay network services
      • d. Creates associated namespaces, if needed—global namespace and associated sub-namespaces
      • e. Creates and applies appropriate network policies to associated namespaces and blocks traffic from other customers/slices/pods/etc.
      • f. Creates deployment specs for ingress/egress gateways and configures traffic management rules
      • g. Creates and applies appropriate slice optimization policies
    • 7. At 102, the applications/services of namespaces that are associated with the application slice are onboarded on to the tenant's application slice
      • a. Slice Operator in auto mode will onboard all the services that belong to the namespaces that are associated with the slice on to the slice.
      • b. In Manual mode Admin can use UI or Helm charts/Deployment specs to onboard the services that belong to the namespaces that are associated with the slice.
      • c. One or more or all the services that belong to the namespaces that are associated with slice can be onboarded on to the slice.
    • 8. Steps 94-102 are repeated for each slice of each tenant.
    • 9. At 104, the slice operators 52 in worker clusters continue to monitor the application slices for enforcement of any multi-tenancy policies and detection of network/resource usage violations and configuration drifts. The Slice Operator reports violations and drifts using alerts and events, which can surface any issues up to an administrative user. The Slice Operator enforces the policies and the network/resource usage by the namespaces associated with slices.

While various embodiments of the invention have been particularly shown and described, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the invention as defined by the appended claims.

Claims

1. A distributed computing system having a plurality of clusters interconnected by a network, each cluster including a plurality of compute nodes connected by a cluster network and collectively executing a set of microservices in respective containers organized into multi-container pods, the pods being individually addressable in the cluster network, the distributed computing system including application slice components distributed among the clusters to define and operate a plurality of application slices each providing application slice services for respective sets of pods distributed among the clusters, the clusters further configured in a multi-tenancy in which a plurality of distinct tenants each includes a respective distinct set of the application slices and is configured according to respective per-tenant configuration data.

2. The distributed computing system of claim 1, wherein each tenant is an administrative grouping of resources and functions that enable inter-service operation within each tenant while providing for desired separation and protection between services of different tenants.

3. The distributed computing system of claim 2, wherein the tenants are distinct customers of a compute cloud provider or other compute service provider.

4. The distributed computing system of claim 1, wherein the tenants share cluster resources with isolation of per-tenant compute resources, network resources, security mechanisms, and policy among the tenants.

5. The distributed computing system of claim 1, wherein the per-tenant configuration data for a given tenant includes tenant-wide settings that are applied to all the slices of the tenant across all the clusters.

6. The distributed computing system of claim 1, wherein the per-tenant configuration data is stored and applied by a controller residing in a controller cluster separate from the clusters, and access control is used to provide controlled access to the per-tenant configuration data.

7. The distributed computing system of claim 1, wherein each cluster includes a slice operator configured and operative to implement orchestration of the slices in its cluster, the slice operator (1) implementing and enforcing application slice multi-tenancy requirements, and (2) monitoring slice metrics and configuration to enforce the multi-tenancy requirements.

8. The distributed computing system of claim 1, wherein the per-tenant configuration data includes resource quotas for the tenants and slices, the resource quotas being passed to respective slice operators of the clusters during slice orchestration, the slice operator in each cluster implementing the resource quotas across associated namespaces and other slice services, and monitoring resource usage and taking actions including generating alerts and events that can trigger corrective actions by a system controller.

9. The distributed computing system of claim 1, wherein the per-tenant configuration data includes respective quality of service (QOS) profiles for the tenants and slices, whereby distinct traffic control/priorities are provided for each of the tenants.

10. The distributed computing system of claim 1, wherein a system controller and respective slice operators of the clusters ingest telemetry from multi-tenancy related resources including namespaces, network policy, and other slice services, and upon detection of configuration drift or other violations, corresponding alerts and events are generated, and corresponding corrective actions are taken.

11. The distributed computing system of claim 1, wherein per-slice functionality includes configuration and implementation of different slice optimization policies selected from 1) load-balancer optimization for efficient traffic control and distribution, 2) workload placement for cost and resource optimization and 3) slice-wide auto-scalar policy to optimize cost and resources used for auto-scaling application/services deployment.

12. The distributed computing system of claim 1, wherein each slice has its own slice specific service discovery, and services discovered across the slice are isolated from other tenants and slices.

13. A method performed in a multi-cluster distributed computing system, each cluster including a plurality of compute nodes connected by a cluster network and collectively executing a set of microservices in respective containers organized into multi-container pods, the pods being individually addressable in the cluster network, the method comprising the steps, performed for each slice of respective sets of slices for respective tenants of the system, of:

by a slice controller, receiving and validating slice configuration data, generating dynamic slice configuration data, and sending the dynamic slice configuration data to slice operators of respective clusters as specified in the slice configuration data; and
by each of the slice operators of the selected clusters, (1) implementing the slices of the tenants in the respective cluster according to the dynamic slice configuration data from the slice controller, (2) onboarding the microservices of namespaces associated with the application slices onto the respective slices, and (3) during subsequent operation, monitoring the application slices for enforcement of multi-tenancy policies and for detection and reporting of resource usage violations and configuration drifts.

14. The method of claim 13, wherein generating dynamic slice configuration data includes:

generating overlay network configuration;
generating network policy configuration;
generating namespace associations and associated resource quotas; and
generating service discovery, authentication, and related configuration data.

15. The method of claim 13, wherein implementing the slices includes:

orchestrating slice components;
creating helm charts, deployment files, and specifications, and applying them to create custom and native resources;
creating overlay network services;
creating associated namespaces if needed, the namespaces including global namespace and associated sub-namespaces;
creating and applying appropriate network policies to associate namespaces and block traffic from other tenants and slices;
creating deployment specifications for ingress/egress gateways and configuring traffic management rules;
creating and applying appropriate slice optimization policies.
Patent History
Publication number: 20220350675
Type: Application
Filed: May 3, 2022
Publication Date: Nov 3, 2022
Inventors: Prabhudev Navali (Westford, MA), Raj Nair (Lexington, MA), Prasad Dorbala (Lexington, MA), Sudhir Halbhavi (Bangalore)
Application Number: 17/735,339
Classifications
International Classification: G06F 9/50 (20060101); G06F 9/455 (20060101);