Patents by Inventor Yamini Nimmagadda

Yamini Nimmagadda has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11941437
    Abstract: Systems, apparatuses and methods provide technology for batch-level parallelism, including partitioning a graph into a plurality of clusters comprising batched clusters that support batched data and non-batched clusters that fail to support batched data, establishing an execution queue for execution of the plurality of clusters based on cluster dependencies, and scheduling inference execution of the plurality of clusters in the execution queue based on batch size. The technology can include identifying nodes of the graph as batched or non-batched, generating a batched cluster comprising a plurality of batched nodes based on a relationship between two or more of the batched nodes, and generating a non-batched cluster comprising a plurality of non-batched nodes based on a relationship between two or more of the non-batched nodes. The technology can also include generating a set of cluster dependencies, where the cluster dependencies are used to determine an execution order for the clusters.
    Type: Grant
    Filed: June 25, 2021
    Date of Patent: March 26, 2024
    Assignee: Intel Corporation
    Inventors: Mustafa Cavus, Yamini Nimmagadda
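    Illustrative sketch: the abstract describes marking graph nodes as batched or non-batched, grouping like nodes into clusters, and ordering the clusters by their dependencies. A minimal Python sketch of that idea follows; the flood-fill grouping, the Kahn-style ordering, and every name are hypothetical illustrations rather than the patented method.
      from collections import defaultdict, deque

      def partition_into_clusters(nodes, edges, is_batched):
          """Group connected nodes that share the same batched/non-batched
          property into one cluster (hypothetical flood-fill grouping)."""
          adj = defaultdict(list)
          for src, dst in edges:
              adj[src].append(dst)
              adj[dst].append(src)
          cluster_of, next_id = {}, 0
          for node in nodes:
              if node in cluster_of:
                  continue
              cluster_of[node] = next_id
              queue = deque([node])
              while queue:
                  cur = queue.popleft()
                  for nb in adj[cur]:
                      if nb not in cluster_of and is_batched[nb] == is_batched[node]:
                          cluster_of[nb] = next_id
                          queue.append(nb)
              next_id += 1
          return cluster_of

      def cluster_execution_order(cluster_of, edges):
          """Derive cluster-level dependencies from node edges and return a
          topological execution order (Kahn's algorithm)."""
          deps = defaultdict(set)
          for src, dst in edges:
              if cluster_of[src] != cluster_of[dst]:
                  deps[cluster_of[dst]].add(cluster_of[src])
          clusters = set(cluster_of.values())
          indegree = {c: len(deps[c]) for c in clusters}
          ready = deque(c for c in clusters if indegree[c] == 0)
          order = []
          while ready:
              c = ready.popleft()
              order.append(c)
              for other in clusters:
                  if c in deps[other]:
                      deps[other].discard(c)
                      indegree[other] -= 1
                      if indegree[other] == 0:
                          ready.append(other)
          return order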
  • Patent number: 11640326
    Abstract: Systems, apparatuses and methods may provide for technology that identifies telemetry data associated with an execution of a cluster of artificial intelligence (AI) operations on an accelerated backend system, wherein the telemetry data includes one or more of temperature classifier data, compute classifier data or failure data, and determines whether to send a current instance of the cluster of AI operations to the accelerated backend system or a default backend system based on the telemetry data.
    Type: Grant
    Filed: August 3, 2021
    Date of Patent: May 2, 2023
    Assignee: Intel Corporation
    Inventors: N Maajid Khan, Yamini Nimmagadda, Surya Siddharth Pemmaraju
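    Illustrative sketch: the abstract's decision, choosing between an accelerated backend and a default backend from temperature, compute and failure telemetry, can be pictured as a threshold check. The fields, thresholds and names below are hypothetical, not taken from the patent.
      from dataclasses import dataclass

      @dataclass
      class Telemetry:
          # Hypothetical stand-ins for the temperature, compute and failure classifier data.
          temperature_c: float
          compute_utilization: float  # 0.0 .. 1.0
          recent_failures: int

      def choose_backend(t, max_temp_c=85.0, max_util=0.9, max_failures=0):
          """Send the next cluster of AI operations to the accelerated backend
          unless it looks overheated, saturated, or failing."""
          if (t.temperature_c > max_temp_c
                  or t.compute_utilization > max_util
                  or t.recent_failures > max_failures):
              return "default"
          return "accelerated"

      # A hot accelerator routes the next cluster to the default backend.
      print(choose_backend(Telemetry(92.0, 0.4, 0)))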
  • Publication number: 20220222584
    Abstract: Systems and techniques for heterogeneous compute-based artificial intelligence model partitioning are described herein. An intermediate representation of an input machine learning model may be generated. The intermediate representation may be analyzed to determine compute metrics for execution of the input machine learning model. An input processing device may be analyzed to determine normalization metrics for execution of the input machine learning model on the input processing device. A partition of the intermediate representation may be generated for the input processing device based on the compute metrics and the normalization metrics. The partition may be transmitted to the input processing device for execution.
    Type: Application
    Filed: April 1, 2022
    Publication date: July 14, 2022
    Inventors: Yamini Nimmagadda, Divya Prakash, Akhila Vidiyala, Venkata Sai Pavan Kumar Akkisetty
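    Illustrative sketch: one way to read the abstract is that per-layer compute metrics from the intermediate representation are split across devices in proportion to each device's normalized capability. The proportional prefix split and all names below are hypothetical, not the claimed method.
      def partition_by_capability(layer_costs, device_throughputs):
          """Split an ordered list of per-layer compute costs across devices in
          proportion to their normalized throughput (greedy prefix split)."""
          total_cost = sum(layer_costs)
          total_throughput = sum(device_throughputs.values())
          layers = list(range(len(layer_costs)))
          partitions, start = {}, 0
          devices = list(device_throughputs.items())
          for i, (device, throughput) in enumerate(devices):
              if i == len(devices) - 1:
                  partitions[device] = layers[start:]
                  break
              budget = total_cost * throughput / total_throughput
              spent, end = 0.0, start
              while end < len(layers) and spent + layer_costs[end] <= budget:
                  spent += layer_costs[end]
                  end += 1
              partitions[device] = layers[start:end]
              start = end
          return partitions

      # Six layers split between a CPU and a GPU with a 1:3 throughput ratio.
      print(partition_by_capability([4, 4, 2, 6, 8, 6], {"cpu": 1.0, "gpu": 3.0}))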
  • Publication number: 20220207358
    Abstract: An Infrastructure Processing Unit (IPU), including: a model optimization processor configured to optimize an artificial intelligence (AI) model for an accelerator managed by the IPU, and deploy the optimized AI model to the accelerator for execution of an inference; and a local memory configured to store data related to the AI model optimization.
    Type: Application
    Filed: September 21, 2021
    Publication date: June 30, 2022
    Inventors: Yamini Nimmagadda, Susanne M. Balle, Olugbemisola Oniyinde
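    Illustrative sketch: the claim is architectural, an IPU that optimizes a model for a managed accelerator, keeps the optimization data in local memory, and deploys the result. A hypothetical object sketch (all names and the placeholder "optimization" are invented for illustration):
      class InfrastructureProcessingUnit:
          def __init__(self, accelerators):
              self.accelerators = accelerators   # name -> accelerator handle
              self.local_memory = {}             # cached optimization artifacts

          def optimize(self, model, accelerator_name):
              # Placeholder pass; a real IPU would apply accelerator-specific
              # graph rewrites, quantization, layout changes, and so on.
              optimized = {"model": model, "target": accelerator_name}
              self.local_memory[(model["name"], accelerator_name)] = optimized
              return optimized

          def deploy(self, optimized_model):
              accel = self.accelerators[optimized_model["target"]]
              return accel.run_inference(optimized_model["model"])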
  • Publication number: 20220019461
    Abstract: A platform health engine for autonomous self-healing in platforms served by an Infrastructure Processing Unit (IPU), including: an analysis processor configured to apply analytics to telemetry data received from a telemetry agent of a monitored platform managed by the IPU, and to generate relevant platform health data; a prediction processor configured to predict, based on the relevant platform health data, a future health status of the monitored platform; and a dispatch processor configured to dispatch a workload of the monitored platform to another platform managed by the IPU if the predicted future health status of the monitored platform is failure.
    Type: Application
    Filed: September 24, 2021
    Publication date: January 20, 2022
    Inventors: Susanne M. Balle, Yamini Nimmagadda, Olugbemisola Oniyinde
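    Illustrative sketch: the abstract chains three roles, analyze telemetry into health data, predict future health, and dispatch the workload elsewhere if failure is predicted. The threshold predictor and every name below are hypothetical.
      def monitor_and_dispatch(telemetry_history, workload, platforms, predict_failure):
          """Derive simple health data from telemetry, predict future health,
          and migrate the workload to another managed platform on predicted failure."""
          health = {
              "avg_temp": sum(t["temp"] for t in telemetry_history) / len(telemetry_history),
              "error_rate": sum(t["errors"] for t in telemetry_history) / len(telemetry_history),
          }
          if predict_failure(health):
              healthy = [p for p in platforms if p["status"] == "healthy"]
              if healthy:
                  return {"action": "migrate", "target": healthy[0]["name"], "workload": workload}
          return {"action": "keep", "workload": workload}

      # A trivial threshold rule stands in for the prediction processor.
      print(monitor_and_dispatch(
          telemetry_history=[{"temp": 95, "errors": 3}, {"temp": 97, "errors": 5}],
          workload="inference-service",
          platforms=[{"name": "node-b", "status": "healthy"}],
          predict_failure=lambda h: h["avg_temp"] > 90 or h["error_rate"] > 1,
      ))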
  • Publication number: 20210406777
    Abstract: Systems, apparatuses and methods include technology that identifies compute capacities of edge nodes and memory capacities of the edge nodes. The technology further identifies a first variant of an Artificial Intelligence (AI) model, and assigns the first variant to a first edge node of the edge nodes based on a compute capacity requirement associated with execution of the first variant, a memory resource requirement associated with execution of the first variant, the compute capacities and the memory capacities.
    Type: Application
    Filed: September 9, 2021
    Publication date: December 30, 2021
    Inventors: Suryaprakash Shanmugam, Yamini Nimmagadda, Akhila Vidiyala
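    Illustrative sketch: the abstract matches AI model variants to edge nodes by comparing each variant's compute and memory requirements against node capacities. A greedy first-fit placement, with hypothetical names and numbers, is one simple way to picture it:
      def assign_variants(variants, edge_nodes):
          """Place each variant on the first edge node whose remaining compute
          and memory capacity covers the variant's requirements (first fit)."""
          remaining = {n["name"]: {"compute": n["compute"], "memory": n["memory"]}
                       for n in edge_nodes}
          assignments = {}
          for v in variants:
              for name, cap in remaining.items():
                  if cap["compute"] >= v["compute_req"] and cap["memory"] >= v["memory_req"]:
                      assignments[v["name"]] = name
                      cap["compute"] -= v["compute_req"]
                      cap["memory"] -= v["memory_req"]
                      break
          return assignments

      print(assign_variants(
          variants=[{"name": "resnet50-int8", "compute_req": 2, "memory_req": 1}],
          edge_nodes=[{"name": "edge-1", "compute": 4, "memory": 2}],
      ))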
  • Publication number: 20210390460
    Abstract: Systems, apparatuses and methods include technology that converts an artificial intelligence (AI) model graph into an intermediate representation. The technology partitions the intermediate representation of the AI model graph into a plurality of subgraphs based on computations associated with the AI model graph, each subgraph being associated with one or more memory resources and one or more of a plurality of hardware devices.
    Type: Application
    Filed: August 27, 2021
    Publication date: December 16, 2021
    Applicant: Intel Corporation
    Inventors: Yamini Nimmagadda, Suryaprakash Shanmugam, Akhila Vidiyala, Divya Prakash
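    Illustrative sketch: the abstract lowers the AI model graph to an intermediate representation and then cuts it into subgraphs, each tied to memory resources and a hardware device. The flattening and contiguous-chunk split below are hypothetical simplifications, not the claimed partitioning.
      def to_intermediate_representation(model_graph):
          """Flatten a nested model description into (op, flops, bytes) records."""
          return [(op["type"], op.get("flops", 1), op.get("bytes", 1)) for op in model_graph]

      def partition_ir(ir, devices):
          """Cut the IR into contiguous chunks, tagging each subgraph with a
          device name and that device's memory budget."""
          chunk = max(1, len(ir) // len(devices))
          subgraphs = []
          for i, device in enumerate(devices):
              ops = ir[i * chunk:] if i == len(devices) - 1 else ir[i * chunk:(i + 1) * chunk]
              subgraphs.append({"device": device["name"],
                                "memory_budget": device["memory"],
                                "ops": ops})
          return subgraphs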
  • Publication number: 20210383026
    Abstract: Systems, apparatuses and methods include technology that generates a signature based on one or more characteristics of an artificial intelligence (AI) model. The AI model is in a source code. The technology generates a compiled blob based on the AI model and embeds an identifier based on the signature into a metadata field of the compiled blob.
    Type: Application
    Filed: August 19, 2021
    Publication date: December 9, 2021
    Applicant: Intel Corporation
    Inventors: Yamini Nimmagadda, Akhila Vidiyala, Suryaprakash Shanmugam
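    Illustrative sketch: the abstract hashes characteristics of the source-level model into a signature, compiles the model to a blob, and embeds a signature-derived identifier in the blob's metadata. The characteristics chosen and all names below are hypothetical.
      import hashlib
      import json

      def sign_and_compile(model_source, compile_fn):
          """Derive a signature from model characteristics, compile the model,
          and embed an identifier based on the signature in the blob metadata."""
          characteristics = {
              "num_layers": model_source.count("layer"),
              "source_length": len(model_source),
          }
          signature = hashlib.sha256(
              json.dumps(characteristics, sort_keys=True).encode()
          ).hexdigest()
          return {
              "code": compile_fn(model_source),
              "metadata": {"model_id": signature[:16]},
          }

      blob = sign_and_compile("layer conv; layer relu;", compile_fn=lambda src: src.encode())
      print(blob["metadata"]["model_id"])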
  • Publication number: 20210382754
    Abstract: Systems, apparatuses and methods include technology that analyzes an input stream and an artificial intelligence (AI) model graph to generate a workload characterization. The workload characterization characterizes one or more of compute resources or memory resources, and the one or more of the compute resources or the memory resources is associated with execution of the AI model graph based on the input stream. The technology partitions the AI model graph into subgraphs based on the workload characterization. The technology selects a plurality of hardware devices to execute the subgraphs.
    Type: Application
    Filed: August 19, 2021
    Publication date: December 9, 2021
    Applicant: Intel Corporation
    Inventors: Yamini Nimmagadda, Akhila Vidiyala, Suryaprakash Shanmugam, Divya Prakash
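    Illustrative sketch: here the partitioning is driven by a workload characterization built from both the input stream and the model graph. The per-pixel cost model and greedy device selection below are invented placeholders for that idea.
      def characterize_workload(input_stream, model_graph):
          """Combine stream properties with per-layer costs to estimate
          compute and memory demand (illustrative units)."""
          pixels = input_stream["width"] * input_stream["height"]
          return {
              "compute_demand": input_stream["fps"] * sum(
                  layer["flops_per_pixel"] * pixels for layer in model_graph),
              "memory_demand": max(layer["activation_bytes"] for layer in model_graph),
          }

      def select_devices(workload, available_devices):
          """Greedily add the cheapest devices until their combined throughput
          and memory cover the characterized workload."""
          chosen, compute, memory = [], 0, 0
          for device in sorted(available_devices, key=lambda d: d["cost"]):
              chosen.append(device["name"])
              compute += device["throughput"]
              memory += device["memory"]
              if compute >= workload["compute_demand"] and memory >= workload["memory_demand"]:
                  break
          return chosen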
  • Publication number: 20210374554
    Abstract: Systems, apparatuses and methods may provide for technology that parses, at runtime, a deep learning graph in topological order to identify a plurality of nodes, marks a first set of nodes in the plurality of nodes as unsupported by target hardware, and marks a second set of nodes in the plurality of nodes as supported by the target hardware, wherein the first set of nodes and the second set of nodes are marked based on one or more attributes defining operation functionality, and wherein the one or more attributes include one or more of an input node parameter, a dimension, or a shape.
    Type: Application
    Filed: August 13, 2021
    Publication date: December 2, 2021
    Inventors: Chandrakant Khandelwal, Ritesh Kumar Rajore, Laxmi Ganesan, Sai Jayanthi, Yamini Nimmagadda
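    Illustrative sketch: the abstract walks the deep learning graph in topological order and marks each node as supported or unsupported by the target hardware from attributes such as input parameters, dimension, and shape. The capability checks below are hypothetical examples of such rules.
      def topological_order(graph):
          """DFS-based topological sort over a dict of node -> attrs whose
          'inputs' list names predecessor nodes (assumes an acyclic graph)."""
          visited, order = set(), []
          def visit(node):
              if node in visited:
                  return
              visited.add(node)
              for pred in graph[node]["inputs"]:
                  visit(pred)
              order.append(node)
          for node in graph:
              visit(node)
          return order

      def mark_supported_nodes(graph, target):
          """Mark each node as supported or unsupported by the target hardware
          based on its operator, rank, and shape attributes."""
          supported, unsupported = [], []
          for node in topological_order(graph):
              attrs = graph[node]
              ok = (attrs["op"] in target["ops"]
                    and attrs["rank"] <= target["max_rank"]
                    and all(d <= target["max_dim"] for d in attrs["shape"]))
              (supported if ok else unsupported).append(node)
          return supported, unsupported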
  • Publication number: 20210365304
    Abstract: Systems, apparatuses and methods may provide for technology that identifies telemetry data associated with an execution of a cluster of artificial intelligence (AI) operations on an accelerated backend system, wherein the telemetry data includes one or more of temperature classifier data, compute classifier data or failure data, and determines whether to send a current instance of the cluster of AI operations to the accelerated backend system or a default backend system based on the telemetry data.
    Type: Application
    Filed: August 3, 2021
    Publication date: November 25, 2021
    Inventors: N Maajid Khan, Yamini Nimmagadda, Surya Siddharth Pemmaraju
  • Publication number: 20210365804
    Abstract: Systems, apparatuses and methods may provide for technology that detects a transfer condition with respect to an artificial intelligence (AI) workload that is active on a source edge node, conducts intra-node tuning on a destination edge node in response to the transfer condition, and moves the AI workload to the destination edge node after the intra-node tuning is complete.
    Type: Application
    Filed: August 5, 2021
    Publication date: November 25, 2021
    Inventors: Yamini Nimmagadda, Akhila Vidiyala, Suryaprakash Shanmugam
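    Illustrative sketch: the abstract's ordering matters, tune the destination node first, then move the workload. The transfer condition, tuning step, and node layout below are hypothetical.
      def migrate_workload(workload, source_node, destination_node, transfer_condition, tune):
          """When the transfer condition fires on the source node, run intra-node
          tuning on the destination and only then move the AI workload over."""
          if not transfer_condition(source_node):
              return "kept-on-source"
          tune(destination_node, workload)              # intra-node tuning first
          source_node["workloads"].remove(workload)     # then the actual move
          destination_node["workloads"].append(workload)
          return "moved-to-destination"

      # A load spike on edge-1 triggers tuning on edge-2, then the move.
      src = {"name": "edge-1", "load": 0.95, "workloads": ["detector"]}
      dst = {"name": "edge-2", "load": 0.10, "workloads": []}
      print(migrate_workload("detector", src, dst,
                             transfer_condition=lambda n: n["load"] > 0.9,
                             tune=lambda node, w: {"batch_size": 4}))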
  • Publication number: 20210318908
    Abstract: Systems, apparatuses and methods provide technology for batch-level parallelism, including partitioning a graph into a plurality of clusters comprising batched clusters that support batched data and non-batched clusters that fail to support batched data, establishing an execution queue for execution of the plurality of clusters based on cluster dependencies, and scheduling inference execution of the plurality of clusters in the execution queue based on batch size. The technology can include identifying nodes of the graph as batched or non-batched, generating a batched cluster comprising a plurality of batched nodes based on a relationship between two or more of the batched nodes, and generating a non-batched cluster comprising a plurality of non-batched nodes based on a relationship between two or more of the non-batched nodes. The technology can also include generating a set of cluster dependencies, where the cluster dependencies are used to determine an execution order for the clusters.
    Type: Application
    Filed: June 25, 2021
    Publication date: October 14, 2021
    Inventors: Mustafa Cavus, Yamini Nimmagadda
  • Publication number: 20210319369
    Abstract: Systems, apparatuses and methods provide technology for model generation with intermediate stage caching and re-use, including generating, via a model pipeline, a multi-level set of intermediate stages for a model, caching each of the set of intermediate stages, and responsive to a change in the model pipeline, regenerating an executable for the model using a first one of the cached intermediate stages to bypass regeneration of at least one of the intermediate stages. The multi-level set of intermediate stages can correspond to a hierarchy of processing stages in the model pipeline, where using the first one of the cached intermediate stages results in bypassing regeneration of a corresponding intermediate stage and of all intermediate stages preceding the corresponding intermediate stage in the hierarchy. Further, regenerating an executable for the model can include regenerating one or more intermediate stages following the corresponding intermediate stage in the hierarchy.
    Type: Application
    Filed: June 25, 2021
    Publication date: October 14, 2021
    Inventors: Yamini Nimmagadda, Mustafa Cavus, Surya Siddharth Pemmaraju, Srinivasa Manohar Karlapalem
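    Illustrative sketch: the abstract caches each intermediate stage of a model build pipeline so that a change late in the pipeline re-uses earlier cached stages instead of regenerating them. The (name, version, input) cache key below is an invented stand-in for however the patent identifies a stage.
      def build_executable(model, pipeline_stages, cache):
          """Run the staged build, re-using any cached intermediate stage and
          regenerating only stages whose key is not in the cache."""
          artifact = model
          for name, version, stage_fn in pipeline_stages:
              key = (name, version, artifact)
              if key in cache:
                  artifact = cache[key]          # re-use the cached stage
              else:
                  artifact = stage_fn(artifact)  # regenerate this stage
                  cache[key] = artifact
          return artifact

      # Bumping only the codegen stage re-uses the cached parse/optimize results.
      cache = {}
      v1 = [("parse", 1, lambda m: m + "|parsed"),
            ("optimize", 1, lambda m: m + "|opt"),
            ("codegen", 1, lambda m: m + "|bin1")]
      v2 = v1[:2] + [("codegen", 2, lambda m: m + "|bin2")]
      build_executable("model", v1, cache)
      print(build_executable("model", v2, cache))   # parse/optimize come from the cache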
  • Publication number: 20210319298
    Abstract: Systems, apparatuses and methods provide technology for efficient subgraph partitioning, including generating a first set of subgraphs based on supported nodes of a model graph, wherein the supported nodes have operators that are supported by a hardware backend device, evaluating a compute efficiency of each subgraph of the first set of subgraphs with respect to the hardware backend device and to a default CPU associated with a default runtime, and selecting, from the first set of subgraphs, a second set of subgraphs to be run on the hardware backend device based on the evaluated compute efficiency. The technology can include calculating a backend performance factor for each subgraph for the hardware backend device, calculating a default performance factor for each subgraph for the default CPU, and comparing, for each respective subgraph of the first set of subgraphs, the backend performance factor and the default performance factor.
    Type: Application
    Filed: June 24, 2021
    Publication date: October 14, 2021
    Inventors: Surya Siddharth Pemmaraju, Yamini Nimmagadda, Srinivasa Manohar Karlapalem
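    Illustrative sketch: the abstract compares a backend performance factor against a default-CPU performance factor for each candidate subgraph and keeps a subgraph on the hardware backend only when it wins. The latency estimates standing in for those factors below are hypothetical.
      def select_subgraphs_for_backend(subgraphs, backend_perf, cpu_perf):
          """Keep each subgraph on the hardware backend only if its backend
          performance factor beats the default-CPU performance factor."""
          on_backend, on_cpu = [], []
          for sg in subgraphs:
              if backend_perf(sg) < cpu_perf(sg):
                  on_backend.append(sg["name"])
              else:
                  on_cpu.append(sg["name"])
          return on_backend, on_cpu

      # Estimated latencies stand in for the performance factors; small subgraphs
      # stay on the CPU because the fixed offload overhead dominates.
      print(select_subgraphs_for_backend(
          subgraphs=[{"name": "sg0", "ops": 120}, {"name": "sg1", "ops": 4}],
          backend_perf=lambda sg: 0.2 * sg["ops"] + 10,
          cpu_perf=lambda sg: 1.0 * sg["ops"],
      ))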