Patents by Inventor Avinash Sodani

Avinash Sodani has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230096994
    Abstract: A method of converting a data stored in a memory from a first format to a second format is disclosed. The method includes extending a number of bits in the data stored in a double data rate (DDR) memory by one bit to form an extended data. The method further includes determining whether the data stored in the DDR is signed or unsigned data. Moreover, responsive to determining that the data is signed, a sign value is added to the most significant bit of the extended data and the data is copied to lower order bits of the extended data. Responsive to determining that the data is unsigned, the data is copied to lower order bits of the extended data and the most significant bit is set to an unsigned value, e.g., zero. The extended data is stored in an on-chip memory (OCM) of a processing tile of a machine learning computer array.
    Type: Application
    Filed: December 6, 2022
    Publication date: March 30, 2023
    Inventors: Avinash Sodani, Ulf Hanebutte, Chia-Hsin Chen
  • Patent number: 11604683
    Abstract: A new approach for supporting tag-based synchronization among different tasks of a machine learning (ML) operation. When a first task tagged with a set tag indicating that one or more subsequent tasks need to be synchronized with it is received at an instruction streaming engine, the engine saves the set tag in a tag table and transmits instructions of the first task to a set of processing tiles for execution. When a second task having an instruction sync tag indicating that it needs to be synchronized with one or more prior tasks is received at the engine, the engine matches the instruction sync tag with the set tags in the tag table to identify prior tasks that the second task depends on. The engine holds instructions of the second task until these matching prior tasks have been completed and then releases the instructions to the processing tiles for execution.
    Type: Grant
    Filed: April 30, 2020
    Date of Patent: March 14, 2023
    Assignee: Marvell Asia Pte Ltd
    Inventors: Avinash Sodani, Gopal Nalamalapu
  • Patent number: 11551148
    Abstract: A method of converting a data stored in a memory from a first format to a second format is disclosed. The method includes extending a number of bits in the data stored in a double data rate (DDR) memory by one bit to form an extended data. The method further includes determining whether the data stored in the DDR is signed or unsigned data. Moreover, responsive to determining that the data is signed, a sign value is added to the most significant bit of the extended data and the data is copied to lower order bits of the extended data. Responsive to determining that the data is unsigned, the data is copied to lower order bits of the extended data and the most significant bit is set to an unsigned value, e.g., zero. The extended data is stored in an on-chip memory (OCM) of a processing tile of a machine learning computer array.
    Type: Grant
    Filed: April 29, 2020
    Date of Patent: January 10, 2023
    Assignee: Marvell Asia Pte Ltd
    Inventors: Avinash Sodani, Ulf Hanebutte, Chia-Hsin Chen
  • Publication number: 20220405937
    Abstract: A new approach is proposed to support real-time camera-based agriculture inspection. One or more cameras are associated with a vehicle moving through a farm, wherein the one or more cameras each captures a plurality of images and/or video streams for an up-close, under-the-canopy view of crops on the farm. A compute box onboard the vehicle retrieves and processes the captured images and/or video streams to extract insights about current status of the crops on the farm and transmit the insights to a monitoring app running on a mobile computing device to be viewed as an inspection record by a user, e.g., farmer in real time as soon as the images and/or the video streams have been processed. The user may then control the compute box and/or the one or more cameras via the monitoring app accordingly while the vehicle is moving through the farm.
    Type: Application
    Filed: June 9, 2022
    Publication date: December 22, 2022
    Inventor: Avinash Sodani
  • Patent number: 11526440
    Abstract: In one embodiment, a processor comprises: at least one core formed on a die to execute instructions; a first memory controller to interface with an in-package memory; a second memory controller to interface with a platform memory to couple to the processor; and the in-package memory located within a package of the processor, where the in-package memory is to be identified as a more distant memory with respect to the at least one core than the platform memory. Other embodiments are described and claimed.
    Type: Grant
    Filed: June 6, 2019
    Date of Patent: December 13, 2022
    Assignee: Intel Corporation
    Inventors: Avinash Sodani, Robert J. Kyanko, Richard J. Greco, Andreas Kleen, Milind B. Girkar, Christopher M. Cantalupo
  • Patent number: 11526204
    Abstract: A system includes a plurality of cores. Each core includes a processing unit, an on-chip memory (OCM), and an idle detector unit. Data is received and stored in the OCM. Instructions are received to process data in the OCM. The core enters an idle mode if the idle detector unit detects that the core has been idle for a first number of clocking signals. The core receives a command to process when in idle mode and transitions from the idle mode to an operational mode. A number of no operation (No-Op) commands is inserted for each time segment. A No-Op command prevents the core from processing instructions for a certain number of clocking signals. A number of No-Op commands inserted for a first time segment is greater than a number of No-Op commands inserted for a last time segment. After the last time segment no No-Op command is inserted.
    Type: Grant
    Filed: October 22, 2021
    Date of Patent: December 13, 2022
    Assignee: Marvell Asia Pte Ltd
    Inventors: Chia-Hsin Chen, Avinash Sodani, Atul Bhattarai, Srinivas Sripada
  • Publication number: 20220374774
    Abstract: A system to support a machine learning (ML) operation comprises a core configured to receive and interpret commands into a set of instructions for the ML operation and a memory unit configured to maintain data for the ML operation. The system further comprises an inference engine having a plurality of processing tiles, each comprising an on-chip memory (OCM) configured to maintain data for local access by components in the processing tile and one or more processing units configured to perform tasks of the ML operation on the data in the OCM. The system also comprises an instruction streaming engine configured to distribute the instructions to the processing tiles to control their operations and to synchronize data communication between the core and the inference engine so that data transmitted between them correctly reaches the corresponding processing tiles while ensuring coherence of data shared and distributed among the core and the OCMs.
    Type: Application
    Filed: June 27, 2022
    Publication date: November 24, 2022
    Inventors: Avinash Sodani, Gopal Nalamalapu
  • Patent number: 11507170
    Abstract: A system includes a multicore chip configured to perform machine learning (ML) operations. The system also includes a power monitoring module configured to measure power consumption of the multicore chip on a main power rail of the multicore chip. The power monitoring module is further configured to assert a signal in response to the measured power consumption exceeding a first threshold. The power monitoring module is further configured to transmit the asserted signal to a power throttling module to initiate a power throttling for the multicore chip.
    Type: Grant
    Filed: October 30, 2020
    Date of Patent: November 22, 2022
    Assignee: Marvell Asia Pte Ltd
    Inventors: Atul Bhattarai, Srinivas Sripada, Avinash Sodani, Michael Dudek, Darren Walworth, Roshan Fernando, James Irvine, Mani Gopal
  • Patent number: 11494676
    Abstract: A processing unit to support inference acceleration for machine learning (ML) comprises an inline post processing unit configured to accept and maintain one or more lookup tables for performing each of one or more non-linear mathematical operations. The inline post processing unit is further configured to accept data from a set of registers maintaining output from a processing block instead of streaming the data from an on-chip memory (OCM), perform the one or more non-linear mathematical operations on elements of the data from the processing block via their corresponding lookup tables, and stream post processing result of the one or more non-linear mathematical operations back to the OCM after the one or more non-linear mathematical operations are complete.
    Type: Grant
    Filed: December 23, 2020
    Date of Patent: November 8, 2022
    Assignee: Marvell Asia Pte Ltd
    Inventors: Avinash Sodani, Ulf Hanebutte, Chia-Hsin Chen
  • Patent number: 11455575
    Abstract: A multi-dimensional mesh architecture is proposed to support transmitting data packets from one source to a plurality of destinations in multicasting or broadcasting modes. Each data packet to be transmitted to the destinations carries a destination mask, wherein each bit in the destination mask represents a corresponding destination processing block in the mesh architecture the data packet is sent to. The data packet traverses through the mesh architecture based on a routing scheme, wherein the data packet first traverses in a first direction across a first set of processing blocks and then traverses in a second direction across a second set of processing blocks to the first destination. During the process, the data packet is only replicated when it reaches a splitting processing block where the paths to different destinations diverge. The original and the replicated data packets are then routed in different directions until they reach their respective destinations.
    Type: Grant
    Filed: April 30, 2020
    Date of Patent: September 27, 2022
    Assignee: Marvell Asia Pte Ltd
    Inventors: Dan Tu, Enrique Musoll, Chia-Hsin Chen, Avinash Sodani
  • Publication number: 20220244767
    Abstract: A power throttling engine includes a register configured to receive a power throttling signal. The power throttling engine further includes a decoder configured to generate a vector based on a value of the power throttling signal. The value of the power throttling signal is an amount of power throttling of a device. The power throttling engine further includes a clock gating logic configured to receive the vector and further configured to receive a clocking signal. The clock gating logic is configured to remove clock edges of the clocking signal based on the vector to generate a throttled clocking signal.
    Type: Application
    Filed: April 22, 2022
    Publication date: August 4, 2022
    Inventors: Avinash Sodani, Srinivas Sripada, Ramacharan Sundararaman, Chia-Hsin Chen, Nikhil Jayakumar
  • Patent number: 11403561
    Abstract: A system to support a machine learning (ML) operation comprises a core configured to receive and interpret commands into a set of instructions for the ML operation and a memory unit configured to maintain data for the ML operation. The system further comprises an inference engine having a plurality of processing tiles, each comprising an on-chip memory (OCM) configured to maintain data for local access by components in the processing tile and one or more processing units configured to perform tasks of the ML operation on the data in the OCM. The system also comprises an instruction streaming engine configured to distribute the instructions to the processing tiles to control their operations and to synchronize data communication between the core and the inference engine so that data transmitted between them correctly reaches the corresponding processing tiles while ensuring coherence of data shared and distributed among the core and the OCMs.
    Type: Grant
    Filed: November 30, 2020
    Date of Patent: August 2, 2022
    Assignee: Marvell Asia Pte Ltd
    Inventors: Avinash Sodani, Gopal Nalamalapu
  • Publication number: 20220188108
    Abstract: A method includes receiving an input data at a floating point arithmetic operating unit, wherein the floating point operating unit is configured to perform a floating point arithmetic operation on the input data to generate an output result. The method also includes determining whether the output result is going to cause a floating point hardware exception responsive to the floating point arithmetic operation on the input data. The method further includes converting a value of the output result to a modified value responsive to the determining that the output result is going to cause the floating point hardware exception, wherein the modified value eliminates the floating point hardware exception responsive to the floating point arithmetic operation on the input data.
    Type: Application
    Filed: March 4, 2022
    Publication date: June 16, 2022
    Inventors: Chia-Hsin Chen, Avinash Sodani, Ulf Hanebutte, Rishan Tan, Soumya Gollamudi
  • Publication number: 20220188109
    Abstract: A method includes receiving an input data at a floating point arithmetic operating unit, wherein the floating point operating unit is configured to perform a floating point arithmetic operation on the input data. The method includes determining whether the received input data is a qnan (quiet not-a-number) or whether the received input data is an snan (signaling not-a-number) prior to performing the floating point arithmetic operation. The method also includes converting a value of the received input data to a modified value prior to performing the floating point arithmetic operation if the received input data is either qnan or snan, wherein the converting eliminates special handling associated with the floating point arithmetic operation on the input data being either qnan or snan.
    Type: Application
    Filed: March 4, 2022
    Publication date: June 16, 2022
    Inventors: Chia-Hsin Chen, Avinash Sodani, Ulf Hanebutte, Rishan Tan, Soumya Gollamudi
  • Publication number: 20220188111
    Abstract: A method includes receiving an input data at a floating point arithmetic operating unit, wherein the floating point operating unit is configured to perform a floating point arithmetic operation on the input data. The method also includes determining whether the received input data is positive infinity or negative infinity prior to performing the floating point arithmetic operation. The method further includes converting a value of the received input data to a modified value prior to performing the floating point arithmetic operation if the received input data is positive infinity or negative infinity.
    Type: Application
    Filed: March 4, 2022
    Publication date: June 16, 2022
    Inventors: Chia-Hsin Chen, Avinash Sodani, Ulf Hanebutte, Rishan Tan, Soumya Gollamudi
  • Publication number: 20220188110
    Abstract: A method includes receiving a first input data and a second input data at a floating point arithmetic operating unit, wherein the first input data and the second input data are associated with operands of a floating point arithmetic operation respectively, wherein the floating point operating unit is configured to perform a floating point arithmetic operation on the first input data and the second input data. The method further includes determining whether the first input data is a qnan (quiet not-a-number) or whether the first input data is an snan (signaling not-a-number) prior to performing the floating point arithmetic operation. A value of the first input data is modified prior to performing the floating point arithmetic operation if the first input data is either qnan or snan, wherein the converting eliminates special handling associated with the floating point arithmetic operation on the first input data being either qnan or snan.
    Type: Application
    Filed: March 4, 2022
    Publication date: June 16, 2022
    Inventors: Chia-Hsin Chen, Avinash Sodani, Ulf Hanebutte, Rishan Tan, Soumya Gollamudi
  • Patent number: 11340673
    Abstract: A power throttling engine includes a register configured to receive a power throttling signal. The power throttling engine further includes a decoder configured to generate a vector based on a value of the power throttling signal. The value of the power throttling signal is an amount of power throttling of a device. The power throttling engine further includes a clock gating logic configured to receive the vector and further configured to receive a clocking signal. The clock gating logic is configured to remove clock edges of the clocking signal based on the vector to generate a throttled clocking signal.
    Type: Grant
    Filed: April 30, 2020
    Date of Patent: May 24, 2022
    Assignee: Marvell Asia Pte Ltd
    Inventors: Avinash Sodani, Srinivas Sripada, Ramacharan Sundararaman, Chia-Hsin Chen, Nikhil Jayakumar
  • Publication number: 20220129270
    Abstract: A method includes receiving a TopK instruction to sort a highest K elements of a vector data. A first K elements of the vector data are sorted and stored in a first register. Another element of the vector data is read and determined whether it has a value that is greater than or is within a range of values of the first K elements. A position of the another element within the first K elements is determined if the another element has a value within that is within the range of values. A subset of the elements of the first K elements that are smaller than the another element are shifted down after determining the position of the another element in the first K elements. The another element is inserted in the determined position after the shifting. The process is repeated for each remaining element of the vector data.
    Type: Application
    Filed: October 21, 2021
    Publication date: April 28, 2022
    Inventors: Avinash Sodani, Ulf Hanebutte
  • Patent number: 11301247
    Abstract: A method includes receiving an input data at a FP arithmetic operating unit configured to perform a FP arithmetic operation on the input data. The method further includes determining whether the received input data generates a FP hardware exception responsive to the FP arithmetic operation on the input data, wherein the determining occurs prior to performing the FP arithmetic operation. The method also includes converting a value of the received input data to a modified value responsive to the determining that the received input data generates the FP hardware exception, wherein the converting eliminates generation of the FP hardware exception responsive to the FP arithmetic operation on the input data.
    Type: Grant
    Filed: April 30, 2020
    Date of Patent: April 12, 2022
    Assignee: Marvell Asia Pte Ltd
    Inventors: Chia-Hsin Chen, Avinash Sodani, Ulf Hanebutte, Rishan Tan, Soumya Gollamudi
  • Patent number: 11256517
    Abstract: A programmable hardware system for machine learning (ML) includes a core and an inference engine. The core receives commands from a host. The commands are in a first instruction set architecture (ISA) format. The core divides the commands into a first set for performance-critical operations, in the first ISA format, and a second set of performance non-critical operations, in the first ISA format. The core executes the second set to perform the performance non-critical operations of the ML operations and streams the first set to inference engine. The inference engine generates a stream of the first set of commands in a second ISA format based on the first set of commands in the first ISA format. The first set of commands in the second ISA format programs components within the inference engine to execute the ML operations to infer data.
    Type: Grant
    Filed: December 19, 2018
    Date of Patent: February 22, 2022
    Assignee: Marvell Asia Pte Ltd
    Inventors: Avinash Sodani, Ulf Hanebutte, Senad Durakovic, Hamid Reza Ghasemi, Chia-Hsin Chen