Patents by Inventor Suresh Kumar Vennam

Suresh Kumar Vennam has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11941440
    Abstract: A method includes: dequeuing a signal primitive from a signaling command queue in the set of command queues, the signal primitive pointing to a waiting command queue; in response to the signal primitive pointing to the waiting command queue, incrementing a number of pending signal primitives in the signal-wait counter matrix; dequeuing a wait primitive from the waiting command queue, the wait primitive pointing to the signaling command queue; in response to the wait primitive pointing to the signaling command queue, accessing the register to read the number of pending signal primitives; in response to the number of pending signal primitives indicating at least one pending signal primitive: decrementing the number of pending signal primitives; and dequeuing an instruction from the waiting command queue; and dispatching a control signal representing the instruction to a resource.
    Type: Grant
    Filed: October 25, 2022
    Date of Patent: March 26, 2024
    Assignee: Deep Vision Inc.
    Inventors: Mohamed Shahim, Sreenivas Aerra Reddy, Raju Datla, Lava Kumar Bokam, Suresh Kumar Vennam, Sameek Banerjee
  • Publication number: 20230409936
    Abstract: Proxy systems and methods for multiprocessing architectures are described. One method includes receiving an inference request and a statistics request from a client computing system. The method may access a load state of each processing device in a subset of processing devices preloaded with the neural network model, and select a target processing device from the subset based on the load states. One aspect includes transmitting the inference request to the target processing device, and monitoring an execution of the inference request by the target processing device based on the neural network model. The method may receive an inference result generated by the target processing device after executing the inference request, and compute an average inference time for the inference request execution based on the monitoring. The method may transmit the inference result and the average inference time to the client computing system.
    Type: Application
    Filed: May 17, 2023
    Publication date: December 21, 2023
    Inventors: Lava Kumar Bokam, Sriduth Jayhari, Divya Vipin, Rajashekar Reddy Ereddy, Snigdha Alkanti, Venkateswara Rao Andole Mankali, Suresh Kumar Vennam, Mohammed Mujahid, Sreenivas Aerra Reddy
  • Publication number: 20230376728
    Abstract: Proxy systems and methods for multiprocessing architectures are described. One method includes receiving a neural network model from a client computing system. System resource availability on a plurality of processing devices may be assessed, and a subset of available processing devices may be selected based on the system resource availability. In one aspect, the neural network model is loaded into each processing device in the subset. The method may include receiving an inference request from the client computing system. A load state of each processing device in the subset may be accessed, and a target processing device from the subset may be selected based on the load states. The inference request may be transmitted to the target processing device.
    Type: Application
    Filed: May 17, 2023
    Publication date: November 23, 2023
    Inventors: Lava Kumar Bokam, Sriduth Jayhari, Divya Vipin, Rajashekar Reddy Ereddy, Snigdha Alkanti, Venkateswara Rao Andole Mankali, Suresh Kumar Vennam, Mohammed Mujahid, Sreenivas Aerra Reddy
  • Publication number: 20230315464
    Abstract: A tensor traversal engine in a processor system comprising a source memory component and a destination memory component, the tensor traversal engine comprising: a control signal register storing a control signal for a strided data transfer operation from the source memory component to the destination memory component, the control signal comprising an initial source address, an initial destination address, a first source stride length in a first dimension, and a first source stride count in the first dimension; a source address register communicatively coupled to the control signal register; a destination address register communicatively coupled to the control signal register; a first source stride counter communicatively coupled to the control signal register; and control logic communicatively coupled to the control signal register, the source address register, and the first source stride counter.
    Type: Application
    Filed: June 8, 2023
    Publication date: October 5, 2023
    Inventors: Mohamed Shahim, Raju Datla, Abhilash Bharath Ghanore, Lava Kumar Bokam, Suresh Kumar Vennam, Rajashekar Reddy Ereddy
  • Patent number: 11714651
    Abstract: A tensor traversal engine in a processor system comprising a source memory component and a destination memory component, the tensor traversal engine comprising: a control signal register storing a control signal for a strided data transfer operation from the source memory component to the destination memory component, the control signal comprising an initial source address, an initial destination address, a first source stride length in a first dimension, and a first source stride count in the first dimension; a source address register communicatively coupled to the control signal register; a destination address register communicatively coupled to the control signal register; a first source stride counter communicatively coupled to the control signal register; and control logic communicatively coupled to the control signal register, the source address register, and the first source stride counter.
    Type: Grant
    Filed: May 26, 2021
    Date of Patent: August 1, 2023
    Assignee: Deep Vision Inc.
    Inventors: Mohamed Shahim, Raju Datla, Abhilash Bharath Ghanore, Lava Kumar Bokam, Suresh Kumar Vennam, Rajashekar Reddy Ereddy
  • Publication number: 20230195590
    Abstract: A method includes: accessing a static schedule of a target neural network for execution by a processing device, the target neural network including a set of layers; generating a set of expected performance metrics of the target neural network based on the static schedule, the set of expected performance metrics including a first expected performance metric for a first layer in the set of layers; accessing a set of runtime performance metrics captured during execution of the target neural network by the processing device, the set of runtime performance metrics including a first runtime performance metric for the first layer; and, in response to detecting a difference between the first runtime performance metric and the first expected performance metric exceeding a threshold, serving an alert at a user interface.
    Type: Application
    Filed: December 20, 2022
    Publication date: June 22, 2023
    Inventors: Satyanarayana Raju Uppalapati, Rajasekhar Reddy Ereddy, Sameek Banerjee, Mohammed Shahim, Shilpa Kallem, Suresh Kumar Vennam, Abhilash Bharath Ghanore, Raju Datla, Wajahat Qadeer, Rehan Hameed
  • Publication number: 20230063751
    Abstract: A broadcast subsystem of a processor system includes: a set of broadcast buses, each broadcast bus in the set of broadcast buses electrically coupled to a subset of primary memory units in the set of primary memory units; a primary memory unit queue: configured to store a first set of data transfer requests associated with the set of primary memory units; electrically coupled to the data buffer a broadcast scheduler: electrically coupled to the primary memory unit queue; electrically coupled to the set of broadcast buses; and configured to transfer source data from the data buffer to a target subset of primary memory units in the set of primary memory units via the set of broadcast buses based on the set of data transfer requests stored in the primary memory unit queue.
    Type: Application
    Filed: November 10, 2022
    Publication date: March 2, 2023
    Inventors: Raju Datla, Mohamed Shahim, Suresh Kumar Vennam, Sreenivas Aerra Reddy
  • Publication number: 20230052277
    Abstract: A method includes: dequeuing a signal primitive from a signaling command queue in the set of command queues, the signal primitive pointing to a waiting command queue; in response to the signal primitive pointing to the waiting command queue, incrementing a number of pending signal primitives in the signal-wait counter matrix; dequeuing a wait primitive from the waiting command queue, the wait primitive pointing to the signaling command queue; in response to the wait primitive pointing to the signaling command queue, accessing the register to read the number of pending signal primitives; in response to the number of pending signal primitives indicating at least one pending signal primitive: decrementing the number of pending signal primitives; and dequeuing an instruction from the waiting command queue; and dispatching a control signal representing the instruction to a resource.
    Type: Application
    Filed: October 25, 2022
    Publication date: February 16, 2023
    Inventors: Mohamed Shahim, Sreenivas Aerra Reddy, Raju Datla, Lava Kumar Bokam, Suresh Kumar Vennam, Sameek Banerjee
  • Patent number: 11526767
    Abstract: A broadcast subsystem of a processor system includes: a set of broadcast buses, each broadcast bus in the set of broadcast buses electrically coupled to a subset of primary memory units in the set of primary memory units; a primary memory unit queue: configured to store a first set of data transfer requests associated with the set of primary memory units; and electrically coupled to the data buffer a broadcast scheduler: electrically coupled to the primary memory unit queue; electrically coupled to the set of broadcast buses; and configured to transfer source data from the data buffer to a target subset of primary memory units in the set of primary memory units via the set of broadcast buses based on the set of data transfer requests stored in the primary memory unit queue.
    Type: Grant
    Filed: August 30, 2021
    Date of Patent: December 13, 2022
    Assignee: Deep Vision Inc.
    Inventors: Raju Datla, Mohamed Shahim, Suresh Kumar Vennam, Sreenivas Aerra Reddy
  • Patent number: 11513847
    Abstract: A method includes: dequeuing a signal primitive from a signaling command queue in the set of command queues, the signal primitive pointing to a waiting command queue; in response to the signal primitive pointing to the waiting command queue, incrementing a number of pending signal primitives in the signal-wait counter matrix; dequeuing a wait primitive from the waiting command queue, the wait primitive pointing to the signaling command queue; in response to the wait primitive pointing to the signaling command queue, accessing the register to read the number of pending signal primitives; in response to the number of pending signal primitives indicating at least one pending signal primitive: decrementing the number of pending signal primitives; and dequeuing an instruction from the waiting command queue; and dispatching a control signal representing the instruction to a resource.
    Type: Grant
    Filed: March 24, 2021
    Date of Patent: November 29, 2022
    Assignee: Deep Vision Inc.
    Inventors: Mohamed Shahim, Sreenivas Aerra Reddy, Raju Datla, Lava Kumar Bokam, Suresh Kumar Vennam, Sameek Banerjee
  • Publication number: 20220066963
    Abstract: A broadcast subsystem of a processor system includes: a set of broadcast buses, each broadcast bus in the set of broadcast buses electrically coupled to a subset of primary memory units in the set of primary memory units; a primary memory unit queue: configured to store a first set of data transfer requests associated with the set of primary memory units; and electrically coupled to the data buffer a broadcast scheduler: electrically coupled to the primary memory unit queue; electrically coupled to the set of broadcast buses; and configured to transfer source data from the data buffer to a target subset of primary memory units in the set of primary memory units via the set of broadcast buses based on the set of data transfer requests stored in the primary memory unit queue.
    Type: Application
    Filed: August 30, 2021
    Publication date: March 3, 2022
    Inventors: Raju Datla, Mohamed Shahim, Suresh Kumar Vennam, Sreenivas Aerra Reddy
  • Publication number: 20220067536
    Abstract: A broadcast subsystem of a processor system includes: a set of broadcast buses, each broadcast bus in the set of broadcast buses electrically coupled to a subset of primary memory units in the set of primary memory units; a primary memory unit queue: configured to store a first set of data transfer requests associated with the set of primary memory units; and electrically coupled to the data buffer a broadcast scheduler: electrically coupled to the primary memory unit queue; electrically coupled to the set of broadcast buses; and configured to transfer source data from the data buffer to a target subset of primary memory units in the set of primary memory units via the set of broadcast buses based on the set of data transfer requests stored in the primary memory unit queue.
    Type: Application
    Filed: August 30, 2021
    Publication date: March 3, 2022
    Inventors: Raju Datla, Mohamed Shahim, Suresh Kumar Vennam, Sreenivas Aerra Reddy
  • Publication number: 20210373792
    Abstract: A tensor traversal engine in a processor system comprising a source memory component and a destination memory component, the tensor traversal engine comprising: a control signal register storing a control signal for a strided data transfer operation from the source memory component to the destination memory component, the control signal comprising an initial source address, an initial destination address, a first source stride length in a first dimension, and a first source stride count in the first dimension; a source address register communicatively coupled to the control signal register; a destination address register communicatively coupled to the control signal register; a first source stride counter communicatively coupled to the control signal register; and control logic communicatively coupled to the control signal register, the source address register, and the first source stride counter.
    Type: Application
    Filed: May 26, 2021
    Publication date: December 2, 2021
    Inventors: Mohamed Shahim, Raju Datla, Abhilash Bharath Ghanore, Lava Kumar Bokam, Suresh Kumar Vennam, Rajashekar Reddy Ereddy
  • Publication number: 20210303346
    Abstract: A method includes: dequeuing a signal primitive from a signaling command queue in the set of command queues, the signal primitive pointing to a waiting command queue; in response to the signal primitive pointing to the waiting command queue, incrementing a number of pending signal primitives in the signal-wait counter matrix; dequeuing a wait primitive from the waiting command queue, the wait primitive pointing to the signaling command queue; in response to the wait primitive pointing to the signaling command queue, accessing the register to read the number of pending signal primitives; in response to the number of pending signal primitives indicating at least one pending signal primitive: decrementing the number of pending signal primitives; and dequeuing an instruction from the waiting command queue; and dispatching a control signal representing the instruction to a resource.
    Type: Application
    Filed: March 24, 2021
    Publication date: September 30, 2021
    Inventors: Mohamed Shahim, Sreenivas Aerra Reddy, Raju Datla, Lava Kumar Bokam, Suresh Kumar Vennam, Sameek Banerjee