Patents by Inventor Yamini Nimmagadda

Yamini Nimmagadda has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11941437
    Abstract: Systems, apparatuses and methods provide technology for batch-level parallelism, including partitioning a graph into a plurality of clusters comprising batched clusters that support batched data and non-batched clusters that fail to support batched data, establishing an execution queue for execution of the plurality of clusters based on cluster dependencies, and scheduling inference execution of the plurality of clusters in the execution queue based on batch size. The technology can include identifying nodes of the graph as batched or non-batched, generating a batched cluster comprising a plurality of batched nodes based on a relationship between two or more of the batched nodes, and generating a non-batched cluster comprising a plurality of non-batched nodes based on a relationship between two or more of the non-batched nodes. The technology can also include generating a set of cluster dependencies, where the cluster dependencies are used to determine an execution order for the clusters.
    Type: Grant
    Filed: June 25, 2021
    Date of Patent: March 26, 2024
    Assignee: Intel Corporation
    Inventors: Mustafa Cavus, Yamini Nimmagadda
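    Illustrative sketch: the abstract describes marking graph nodes as batched or non-batched, grouping like nodes into clusters, and ordering the clusters by their dependencies. A minimal Python sketch of that idea follows; the flood-fill grouping, the Kahn-style ordering, and every name are hypothetical illustrations rather than the patented method.
      from collections import defaultdict, deque

      def partition_into_clusters(nodes, edges, is_batched):
          """Group connected nodes that share the same batched/non-batched
          property into one cluster (hypothetical flood-fill grouping)."""
          adj = defaultdict(list)
          for src, dst in edges:
              adj[src].append(dst)
              adj[dst].append(src)
          cluster_of, next_id = {}, 0
          for node in nodes:
              if node in cluster_of:
                  continue
              cluster_of[node] = next_id
              queue = deque([node])
              while queue:
                  cur = queue.popleft()
                  for nb in adj[cur]:
                      if nb not in cluster_of and is_batched[nb] == is_batched[node]:
                          cluster_of[nb] = next_id
                          queue.append(nb)
              next_id += 1
          return cluster_of

      def cluster_execution_order(cluster_of, edges):
          """Derive cluster-level dependencies from node edges and return a
          topological execution order (Kahn's algorithm)."""
          deps = defaultdict(set)
          for src, dst in edges:
              if cluster_of[src] != cluster_of[dst]:
                  deps[cluster_of[dst]].add(cluster_of[src])
          clusters = set(cluster_of.values())
          indegree = {c: len(deps[c]) for c in clusters}
          ready = deque(c for c in clusters if indegree[c] == 0)
          order = []
          while ready:
              c = ready.popleft()
              order.append(c)
              for other in clusters:
                  if c in deps[other]:
                      deps[other].discard(c)
                      indegree[other] -= 1
                      if indegree[other] == 0:
                          ready.append(other)
          return order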
  • Patent number: 11640326
    Abstract: Systems, apparatuses and methods may provide for technology that identifies telemetry data associated with an execution of a cluster of artificial intelligence (AI) operations on an accelerated backend system, wherein the telemetry data includes one or more of temperature classifier data, compute classifier data or failure data, and determines whether to send a current instance of the cluster of AI operations to the accelerated backend system or a default backend system based on the telemetry data.
    Type: Grant
    Filed: August 3, 2021
    Date of Patent: May 2, 2023
    Assignee: Intel Corporation
    Inventors: N Maajid Khan, Yamini Nimmagadda, Surya Siddharth Pemmaraju
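    Illustrative sketch: the abstract's decision, choosing between an accelerated backend and a default backend from temperature, compute and failure telemetry, can be pictured as a threshold check. The fields, thresholds and names below are hypothetical, not taken from the patent.
      from dataclasses import dataclass

      @dataclass
      class Telemetry:
          # Hypothetical stand-ins for the temperature, compute and failure classifier data.
          temperature_c: float
          compute_utilization: float  # 0.0 .. 1.0
          recent_failures: int

      def choose_backend(t, max_temp_c=85.0, max_util=0.9, max_failures=0):
          """Send the next cluster of AI operations to the accelerated backend
          unless it looks overheated, saturated, or failing."""
          if (t.temperature_c > max_temp_c
                  or t.compute_utilization > max_util
                  or t.recent_failures > max_failures):
              return "default"
          return "accelerated"

      # A hot accelerator routes the next cluster to the default backend.
      print(choose_backend(Telemetry(92.0, 0.4, 0)))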
  • Publication number: 20220222584
    Abstract: Systems and techniques for heterogeneous compute-based artificial intelligence model partitioning are described herein. An intermediate representation of an input machine learning model may be generated. The intermediate representation may be analyzed to determine compute metrics for execution of the input machine learning model. An input processing device may be analyzed to determine normalization metrics for execution of the input machine learning model on the input processing device. A partition of the intermediate representation may be generated for the input processing device based on the compute metrics and the normalization metrics. The partition may be transmitted to the input processing device for execution.
    Type: Application
    Filed: April 1, 2022
    Publication date: July 14, 2022
    Inventors: Yamini Nimmagadda, Divya Prakash, Akhila Vidiyala, Venkata Sai Pavan Kumar Akkisetty
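    Illustrative sketch: one way to read the abstract is that per-layer compute metrics from the intermediate representation are split across devices in proportion to each device's normalized capability. The proportional prefix split and all names below are hypothetical, not the claimed method.
      def partition_by_capability(layer_costs, device_throughputs):
          """Split an ordered list of per-layer compute costs across devices in
          proportion to their normalized throughput (greedy prefix split)."""
          total_cost = sum(layer_costs)
          total_throughput = sum(device_throughputs.values())
          layers = list(range(len(layer_costs)))
          partitions, start = {}, 0
          devices = list(device_throughputs.items())
          for i, (device, throughput) in enumerate(devices):
              if i == len(devices) - 1:
                  partitions[device] = layers[start:]
                  break
              budget = total_cost * throughput / total_throughput
              spent, end = 0.0, start
              while end < len(layers) and spent + layer_costs[end] <= budget:
                  spent += layer_costs[end]
                  end += 1
              partitions[device] = layers[start:end]
              start = end
          return partitions

      # Six layers split between a CPU and a GPU with a 1:3 throughput ratio.
      print(partition_by_capability([4, 4, 2, 6, 8, 6], {"cpu": 1.0, "gpu": 3.0}))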
  • Publication number: 20220207358
    Abstract: An Infrastructure Processing Unit (IPU), including: a model optimization processor configured to optimize an artificial intelligence (AI) model for an accelerator managed by the IPU, and deploy the optimized AI model to the accelerator for execution of an inference; and a local memory configured to store data related to the AI model optimization.
    Type: Application
    Filed: September 21, 2021
    Publication date: June 30, 2022
    Inventors: Yamini Nimmagadda, Susanne M. Balle, Olugbemisola Oniyinde
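    Illustrative sketch: the claim is architectural, an IPU that optimizes a model for a managed accelerator, keeps the optimization data in local memory, and deploys the result. A hypothetical object sketch (all names and the placeholder "optimization" are invented for illustration):
      class InfrastructureProcessingUnit:
          def __init__(self, accelerators):
              self.accelerators = accelerators   # name -> accelerator handle
              self.local_memory = {}             # cached optimization artifacts

          def optimize(self, model, accelerator_name):
              # Placeholder pass; a real IPU would apply accelerator-specific
              # graph rewrites, quantization, layout changes, and so on.
              optimized = {"model": model, "target": accelerator_name}
              self.local_memory[(model["name"], accelerator_name)] = optimized
              return optimized

          def deploy(self, optimized_model):
              accel = self.accelerators[optimized_model["target"]]
              return accel.run_inference(optimized_model["model"])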
  • Publication number: 20220019461
    Abstract: A platform health engine for autonomous self-healing in platforms served by an Infrastructure Processing Unit (IPU), including: an analysis processor configured to apply analytics to telemetry data received from a telemetry agent of a monitored platform managed by the IPU, and to generate relevant platform health data; a prediction processor configured to predict, based on the relevant platform health data, a future health status of the monitored platform; and a dispatch processor configured to dispatch a workload of the monitored platform to another platform managed by the IPU if the predicted future health status of the monitored platform is failure.
    Type: Application
    Filed: September 24, 2021
    Publication date: January 20, 2022
    Inventors: Susanne M. Balle, Yamini Nimmagadda, Olugbemisola Oniyinde
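    Illustrative sketch: the abstract chains three roles, analyze telemetry into health data, predict future health, and dispatch the workload elsewhere if failure is predicted. The threshold predictor and every name below are hypothetical.
      def monitor_and_dispatch(telemetry_history, workload, platforms, predict_failure):
          """Derive simple health data from telemetry, predict future health,
          and migrate the workload to another managed platform on predicted failure."""
          health = {
              "avg_temp": sum(t["temp"] for t in telemetry_history) / len(telemetry_history),
              "error_rate": sum(t["errors"] for t in telemetry_history) / len(telemetry_history),
          }
          if predict_failure(health):
              healthy = [p for p in platforms if p["status"] == "healthy"]
              if healthy:
                  return {"action": "migrate", "target": healthy[0]["name"], "workload": workload}
          return {"action": "keep", "workload": workload}

      # A trivial threshold rule stands in for the prediction processor.
      print(monitor_and_dispatch(
          telemetry_history=[{"temp": 95, "errors": 3}, {"temp": 97, "errors": 5}],
          workload="inference-service",
          platforms=[{"name": "node-b", "status": "healthy"}],
          predict_failure=lambda h: h["avg_temp"] > 90 or h["error_rate"] > 1,
      ))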
  • Publication number: 20210406777
    Abstract: Systems, apparatuses and methods include technology that identifies compute capacities of edge nodes and memory capacities of the edge nodes. The technology further identifies a first variant of an Artificial Intelligence (AI) model, and assigns the first variant to a first edge node of the edge nodes based on a compute capacity requirement associated with execution of the first variant, a memory resource requirement associated with execution of the first variant, the compute capacities and the memory capacities.
    Type: Application
    Filed: September 9, 2021
    Publication date: December 30, 2021
    Inventors: Suryaprakash Shanmugam, Yamini Nimmagadda, Akhila Vidiyala
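    Illustrative sketch: the abstract matches AI model variants to edge nodes by comparing each variant's compute and memory requirements against node capacities. A greedy first-fit placement, with hypothetical names and numbers, is one simple way to picture it:
      def assign_variants(variants, edge_nodes):
          """Place each variant on the first edge node whose remaining compute
          and memory capacity covers the variant's requirements (first fit)."""
          remaining = {n["name"]: {"compute": n["compute"], "memory": n["memory"]}
                       for n in edge_nodes}
          assignments = {}
          for v in variants:
              for name, cap in remaining.items():
                  if cap["compute"] >= v["compute_req"] and cap["memory"] >= v["memory_req"]:
                      assignments[v["name"]] = name
                      cap["compute"] -= v["compute_req"]
                      cap["memory"] -= v["memory_req"]
                      break
          return assignments

      print(assign_variants(
          variants=[{"name": "resnet50-int8", "compute_req": 2, "memory_req": 1}],
          edge_nodes=[{"name": "edge-1", "compute": 4, "memory": 2}],
      ))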
  • Publication number: 20210390460
    Abstract: Systems, apparatuses and methods include technology that converts an artificial intelligence (AI) model graph into an intermediate representation. The technology partitions the intermediate representation of the AI model graph into a plurality of subgraphs based on computations associated with the AI model graph, each subgraph being associated with one or more memory resources and one or more of a plurality of hardware devices.
    Type: Application
    Filed: August 27, 2021
    Publication date: December 16, 2021
    Applicant: Intel Corporation
    Inventors: Yamini Nimmagadda, Suryaprakash Shanmugam, Akhila Vidiyala, Divya Prakash
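    Illustrative sketch: the abstract lowers the AI model graph to an intermediate representation and then cuts it into subgraphs, each tied to memory resources and a hardware device. The flattening and contiguous-chunk split below are hypothetical simplifications, not the claimed partitioning.
      def to_intermediate_representation(model_graph):
          """Flatten a nested model description into (op, flops, bytes) records."""
          return [(op["type"], op.get("flops", 1), op.get("bytes", 1)) for op in model_graph]

      def partition_ir(ir, devices):
          """Cut the IR into contiguous chunks, tagging each subgraph with a
          device name and that device's memory budget."""
          chunk = max(1, len(ir) // len(devices))
          subgraphs = []
          for i, device in enumerate(devices):
              ops = ir[i * chunk:] if i == len(devices) - 1 else ir[i * chunk:(i + 1) * chunk]
              subgraphs.append({"device": device["name"],
                                "memory_budget": device["memory"],
                                "ops": ops})
          return subgraphs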
  • Publication number: 20210383026
    Abstract: Systems, apparatuses and methods include technology that generates a signature based on one or more characteristics of an artificial intelligence (AI) model. The AI model is in a source code. The technology generates a compiled blob based on the AI model and embeds an identifier based on the signature into a metadata field of the compiled blob.
    Type: Application
    Filed: August 19, 2021
    Publication date: December 9, 2021
    Applicant: Intel Corporation
    Inventors: Yamini Nimmagadda, Akhila Vidiyala, Suryaprakash Shanmugam
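    Illustrative sketch: the abstract hashes characteristics of the source-level model into a signature, compiles the model to a blob, and embeds a signature-derived identifier in the blob's metadata. The characteristics chosen and all names below are hypothetical.
      import hashlib
      import json

      def sign_and_compile(model_source, compile_fn):
          """Derive a signature from model characteristics, compile the model,
          and embed an identifier based on the signature in the blob metadata."""
          characteristics = {
              "num_layers": model_source.count("layer"),
              "source_length": len(model_source),
          }
          signature = hashlib.sha256(
              json.dumps(characteristics, sort_keys=True).encode()
          ).hexdigest()
          return {
              "code": compile_fn(model_source),
              "metadata": {"model_id": signature[:16]},
          }

      blob = sign_and_compile("layer conv; layer relu;", compile_fn=lambda src: src.encode())
      print(blob["metadata"]["model_id"])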
  • Publication number: 20210382754
    Abstract: Systems, apparatuses and methods include technology that analyzes an input stream and an artificial intelligence (AI) model graph to generate a workload characterization. The workload characterization characterizes one or more of compute resources or memory resources, and the one or more of the compute resources or the memory resources is associated with execution of the AI model graph based on the input stream. The technology partitions the AI model graph into subgraphs based on the workload characterization. The technology selects a plurality of hardware devices to execute the subgraphs.
    Type: Application
    Filed: August 19, 2021
    Publication date: December 9, 2021
    Applicant: Intel Corporation
    Inventors: Yamini Nimmagadda, Akhila Vidiyala, Suryaprakash Shanmugam, Divya Prakash
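    Illustrative sketch: here the partitioning is driven by a workload characterization built from both the input stream and the model graph. The per-pixel cost model and greedy device selection below are invented placeholders for that idea.
      def characterize_workload(input_stream, model_graph):
          """Combine stream properties with per-layer costs to estimate
          compute and memory demand (illustrative units)."""
          pixels = input_stream["width"] * input_stream["height"]
          return {
              "compute_demand": input_stream["fps"] * sum(
                  layer["flops_per_pixel"] * pixels for layer in model_graph),
              "memory_demand": max(layer["activation_bytes"] for layer in model_graph),
          }

      def select_devices(workload, available_devices):
          """Greedily add the cheapest devices until their combined throughput
          and memory cover the characterized workload."""
          chosen, compute, memory = [], 0, 0
          for device in sorted(available_devices, key=lambda d: d["cost"]):
              chosen.append(device["name"])
              compute += device["throughput"]
              memory += device["memory"]
              if compute >= workload["compute_demand"] and memory >= workload["memory_demand"]:
                  break
          return chosen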
  • Publication number: 20210374554
    Abstract: Systems, apparatuses and methods may provide for technology that parses, at runtime, a deep learning graph in topological order to identify a plurality of nodes, marks a first set of nodes in the plurality of nodes as unsupported by target hardware, and marks a second set of nodes in the plurality of nodes as supported by the target hardware, wherein the first set of nodes and the second set of nodes are marked based on one or more attributes defining operation functionality, and wherein the one or more attributes include one or more of an input node parameter, a dimension, or a shape.
    Type: Application
    Filed: August 13, 2021
    Publication date: December 2, 2021
    Inventors: Chandrakant Khandelwal, Ritesh Kumar Rajore, Laxmi Ganesan, Sai Jayanthi, Yamini Nimmagadda
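    Illustrative sketch: the abstract walks the deep learning graph in topological order and marks each node as supported or unsupported by the target hardware from attributes such as input parameters, dimension, and shape. The capability checks below are hypothetical examples of such rules.
      def topological_order(graph):
          """DFS-based topological sort over a dict of node -> attrs whose
          'inputs' list names predecessor nodes (assumes an acyclic graph)."""
          visited, order = set(), []
          def visit(node):
              if node in visited:
                  return
              visited.add(node)
              for pred in graph[node]["inputs"]:
                  visit(pred)
              order.append(node)
          for node in graph:
              visit(node)
          return order

      def mark_supported_nodes(graph, target):
          """Mark each node as supported or unsupported by the target hardware
          based on its operator, rank, and shape attributes."""
          supported, unsupported = [], []
          for node in topological_order(graph):
              attrs = graph[node]
              ok = (attrs["op"] in target["ops"]
                    and attrs["rank"] <= target["max_rank"]
                    and all(d <= target["max_dim"] for d in attrs["shape"]))
              (supported if ok else unsupported).append(node)
          return supported, unsupported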
  • Publication number: 20210365304
    Abstract: Systems, apparatuses and methods may provide for technology that identifies telemetry data associated with an execution of a cluster of artificial intelligence (AI) operations on an accelerated backend system, wherein the telemetry data includes one or more of temperature classifier data, compute classifier data or failure data, and determines whether to send a current instance of the cluster of AI operations to the accelerated backend system or a default backend system based on the telemetry data.
    Type: Application
    Filed: August 3, 2021
    Publication date: November 25, 2021
    Inventors: N Maajid Khan, Yamini Nimmagadda, Surya Siddharth Pemmaraju
  • Publication number: 20210365804
    Abstract: Systems, apparatuses and methods may provide for technology that detects a transfer condition with respect to an artificial intelligence (AI) workload that is active on a source edge node, conducts intra-node tuning on a destination edge node in response to the transfer condition, and moves the AI workload to the destination edge node after the intra-node tuning is complete.
    Type: Application
    Filed: August 5, 2021
    Publication date: November 25, 2021
    Inventors: Yamini Nimmagadda, Akhila Vidiyala, Suryaprakash Shanmugam
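    Illustrative sketch: the abstract's ordering matters, tune the destination node first, then move the workload. The transfer condition, tuning step, and node layout below are hypothetical.
      def migrate_workload(workload, source_node, destination_node, transfer_condition, tune):
          """When the transfer condition fires on the source node, run intra-node
          tuning on the destination and only then move the AI workload over."""
          if not transfer_condition(source_node):
              return "kept-on-source"
          tune(destination_node, workload)              # intra-node tuning first
          source_node["workloads"].remove(workload)     # then the actual move
          destination_node["workloads"].append(workload)
          return "moved-to-destination"

      # A load spike on edge-1 triggers tuning on edge-2, then the move.
      src = {"name": "edge-1", "load": 0.95, "workloads": ["detector"]}
      dst = {"name": "edge-2", "load": 0.10, "workloads": []}
      print(migrate_workload("detector", src, dst,
                             transfer_condition=lambda n: n["load"] > 0.9,
                             tune=lambda node, w: {"batch_size": 4}))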
  • Publication number: 20210318908
    Abstract: Systems, apparatuses and methods provide technology for batch-level parallelism, including partitioning a graph into a plurality of clusters comprising batched clusters that support batched data and non-batched clusters that fail to support batched data, establishing an execution queue for execution of the plurality of clusters based on cluster dependencies, and scheduling inference execution of the plurality of clusters in the execution queue based on batch size. The technology can include identifying nodes of the graph as batched or non-batched, generating a batched cluster comprising a plurality of batched nodes based on a relationship between two or more of the batched nodes, and generating a non-batched cluster comprising a plurality of non-batched nodes based on a relationship between two or more of the non-batched nodes. The technology can also include generating a set of cluster dependencies, where the cluster dependencies are used to determine an execution order for the clusters.
    Type: Application
    Filed: June 25, 2021
    Publication date: October 14, 2021
    Inventors: Mustafa Cavus, Yamini Nimmagadda
  • Publication number: 20210319369
    Abstract: Systems, apparatuses and methods provide technology for model generation with intermediate stage caching and re-use, including generating, via a model pipeline, a multi-level set of intermediate stages for a model, caching each of the set of intermediate stages, and responsive to a change in the model pipeline, regenerating an executable for the model using a first one of the cached intermediate stages to bypass regeneration of at least one of the intermediate stages. The multi-level set of intermediate stages can correspond to a hierarchy of processing stages in the model pipeline, where using the first one of the cached intermediate stages results in bypassing regeneration of a corresponding intermediate stage and of all intermediate stages preceding the corresponding intermediate stage in the hierarchy. Further, regenerating an executable for the model can include regenerating one or more intermediate stages following the corresponding intermediate stage in the hierarchy.
    Type: Application
    Filed: June 25, 2021
    Publication date: October 14, 2021
    Inventors: Yamini Nimmagadda, Mustafa Cavus, Surya Siddharth Pemmaraju, Srinivasa Manohar Karlapalem
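    Illustrative sketch: the abstract caches each intermediate stage of a model build pipeline so that a change late in the pipeline re-uses earlier cached stages instead of regenerating them. The (name, version, input) cache key below is an invented stand-in for however the patent identifies a stage.
      def build_executable(model, pipeline_stages, cache):
          """Run the staged build, re-using any cached intermediate stage and
          regenerating only stages whose key is not in the cache."""
          artifact = model
          for name, version, stage_fn in pipeline_stages:
              key = (name, version, artifact)
              if key in cache:
                  artifact = cache[key]          # re-use the cached stage
              else:
                  artifact = stage_fn(artifact)  # regenerate this stage
                  cache[key] = artifact
          return artifact

      # Bumping only the codegen stage re-uses the cached parse/optimize results.
      cache = {}
      v1 = [("parse", 1, lambda m: m + "|parsed"),
            ("optimize", 1, lambda m: m + "|opt"),
            ("codegen", 1, lambda m: m + "|bin1")]
      v2 = v1[:2] + [("codegen", 2, lambda m: m + "|bin2")]
      build_executable("model", v1, cache)
      print(build_executable("model", v2, cache))   # parse/optimize come from the cache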
  • Publication number: 20210319298
    Abstract: Systems, apparatuses and methods provide technology for efficient subgraph partitioning, including generating a first set of subgraphs based on supported nodes of a model graph, wherein the supported nodes have operators that are supported by a hardware backend device, evaluating a compute efficiency of each subgraph of the first set of subgraphs with respect to the hardware backend device and to a default CPU associated with a default runtime, and selecting, from the first set of subgraphs, a second set of subgraphs to be run on the hardware backend device based on the evaluated compute efficiency. The technology can include calculating a backend performance factor for each subgraph for the hardware backend device, calculating a default performance factor for each subgraph for the default CPU, and comparing, for each respective subgraph of the first set of subgraphs, the backend performance factor and the default performance factor.
    Type: Application
    Filed: June 24, 2021
    Publication date: October 14, 2021
    Inventors: Surya Siddharth Pemmaraju, Yamini Nimmagadda, Srinivasa Manohar Karlapalem
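    Illustrative sketch: the abstract compares a backend performance factor against a default-CPU performance factor for each candidate subgraph and keeps a subgraph on the hardware backend only when it wins. The latency estimates standing in for those factors below are hypothetical.
      def select_subgraphs_for_backend(subgraphs, backend_perf, cpu_perf):
          """Keep each subgraph on the hardware backend only if its backend
          performance factor beats the default-CPU performance factor."""
          on_backend, on_cpu = [], []
          for sg in subgraphs:
              if backend_perf(sg) < cpu_perf(sg):
                  on_backend.append(sg["name"])
              else:
                  on_cpu.append(sg["name"])
          return on_backend, on_cpu

      # Estimated latencies stand in for the performance factors; small subgraphs
      # stay on the CPU because the fixed offload overhead dominates.
      print(select_subgraphs_for_backend(
          subgraphs=[{"name": "sg0", "ops": 120}, {"name": "sg1", "ops": 4}],
          backend_perf=lambda sg: 0.2 * sg["ops"] + 10,
          cpu_perf=lambda sg: 1.0 * sg["ops"],
      ))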