Patents by Inventor Avinash Sodani

Avinash Sodani has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

SYSTEM AND METHOD FOR INT9 QUANTIZATION

Publication number: 20230096994

Abstract: A method of converting a data stored in a memory from a first format to a second format is disclosed. The method includes extending a number of bits in the data stored in a double data rate (DDR) memory by one bit to form an extended data. The method further includes determining whether the data stored in the DDR is signed or unsigned data. Moreover, responsive to determining that the data is signed, a sign value is added to the most significant bit of the extended data and the data is copied to lower order bits of the extended data. Responsive to determining that the data is unsigned, the data is copied to lower order bits of the extended data and the most significant bit is set to an unsigned value, e.g., zero. The extended data is stored in an on-chip memory (OCM) of a processing tile of a machine learning computer array.

Type: Application

Filed: December 6, 2022

Publication date: March 30, 2023

Inventors: Avinash Sodani, Ulf Hanebutte, Chia-Hsin Chen
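The bit extension described in the abstract above can be illustrated with a minimal sketch. It assumes 8-bit source data and a two's-complement reading of the 9-bit result; the abstract states neither, and the function names `extend_to_int9` and `int9_value` are hypothetical.

```python
def extend_to_int9(byte: int, signed: bool) -> int:
    """Return a 9-bit pattern (0..511) for an 8-bit pattern read from memory.

    Assumption: the source data is 8 bits wide, so one extra bit gives INT9.
    """
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected an 8-bit pattern")
    low = byte & 0xFF                # original data is copied to the lower-order bits
    if signed:
        sign = (byte >> 7) & 1       # sign of the 8-bit value
        return (sign << 8) | low     # sign value placed in the most significant bit
    return low                       # unsigned: most significant bit is set to zero


def int9_value(pattern: int) -> int:
    """Interpret a 9-bit pattern as a two's-complement integer (assumption)."""
    return pattern - 512 if pattern & 0x100 else pattern


print(int9_value(extend_to_int9(0xFF, signed=True)))    # int8 -1 stays -1
print(int9_value(extend_to_int9(0xFF, signed=False)))   # uint8 255 stays 255
```

Under these assumptions both signed and unsigned 8-bit inputs fit a single 9-bit signed representation, so later arithmetic does not need to know which kind it was given.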
System and methods for tag-based synchronization of tasks for machine learning operations

Patent number: 11604683

Abstract: A new approach for supporting tag-based synchronization among different tasks of a machine learning (ML) operation. When a first task tagged with a set tag indicating that one or more subsequent tasks need to be synchronized with it is received at an instruction streaming engine, the engine saves the set tag in a tag table and transmits instructions of the first task to a set of processing tiles for execution. When a second task having an instruction sync tag indicating that it needs to be synchronized with one or more prior tasks is received at the engine, the engine matches the instruction sync tag with the set tags in the tag table to identify prior tasks that the second task depends on. The engine holds instructions of the second task until these matching prior tasks have been completed and then releases the instructions to the processing tiles for execution.

Type: Grant

Filed: April 30, 2020

Date of Patent: March 14, 2023

Assignee: Marvell Asia Pte Ltd

Inventors: Avinash Sodani, Gopal Nalamalapu
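A rough software model of the tag table described in the abstract above might look as follows. It is only a sketch: the class and field names (`Task`, `StreamingEngine`, `set_tag`, `sync_tag`) are hypothetical, and letting untagged tasks issue past a held task is an assumption the abstract does not address.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class Task:
    name: str
    set_tag: Optional[str] = None    # later tasks may need to synchronize with this task
    sync_tag: Optional[str] = None   # this task must wait for prior tasks with a matching set tag


class StreamingEngine:
    def __init__(self) -> None:
        self.tag_table: Dict[str, Set[str]] = {}   # set tag -> unfinished tasks carrying it
        self.held: List[Task] = []                 # tasks waiting on prior tasks
        self.issued: List[str] = []                # order in which tasks reached the tiles

    def receive(self, task: Task) -> None:
        # Hold the task if its sync tag matches a set tag with unfinished tasks.
        if task.sync_tag is not None and self.tag_table.get(task.sync_tag):
            self.held.append(task)
        else:
            self._issue(task)

    def _issue(self, task: Task) -> None:
        if task.set_tag is not None:
            self.tag_table.setdefault(task.set_tag, set()).add(task.name)
        self.issued.append(task.name)

    def complete(self, name: str) -> None:
        # A tile reports completion: clear the task, then release what is unblocked.
        for pending in self.tag_table.values():
            pending.discard(name)
        waiting, self.held = self.held, []
        for task in waiting:
            self.receive(task)


engine = StreamingEngine()
engine.receive(Task("load", set_tag="t0"))
engine.receive(Task("matmul", sync_tag="t0"))   # held until "load" completes
engine.complete("load")
print(engine.issued)
```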
System and method for INT9 quantization

Patent number: 11551148

Abstract: A method of converting a data stored in a memory from a first format to a second format is disclosed. The method includes extending a number of bits in the data stored in a double data rate (DDR) memory by one bit to form an extended data. The method further includes determining whether the data stored in the DDR is signed or unsigned data. Moreover, responsive to determining that the data is signed, a sign value is added to the most significant bit of the extended data and the data is copied to lower order bits of the extended data. Responsive to determining that the data is unsigned, the data is copied to lower order bits of the extended data and the most significant bit is set to an unsigned value, e.g., zero. The extended data is stored in an on-chip memory (OCM) of a processing tile of a machine learning computer array.

Type: Grant

Filed: April 29, 2020

Date of Patent: January 10, 2023

Assignee: Marvell Asia Pte Ltd

Inventors: Avinash Sodani, Ulf Hanebutte, Chia-Hsin Chen

SYSTEM AND METHOD FOR REAL-TIME CAMERA-BASED INSPECTION FOR AGRICULTURE

Publication number: 20220405937

Abstract: A new approach is proposed to support real-time camera-based agriculture inspection. One or more cameras are associated with a vehicle moving through a farm, wherein the one or more cameras each captures a plurality of images and/or video streams for an up-close, under-the-canopy view of crops on the farm. A compute box onboard the vehicle retrieves and processes the captured images and/or video streams to extract insights about the current status of the crops on the farm and transmit the insights to a monitoring app running on a mobile computing device to be viewed as an inspection record by a user, e.g., a farmer, in real time as soon as the images and/or the video streams have been processed. The user may then control the compute box and/or the one or more cameras via the monitoring app accordingly while the vehicle is moving through the farm.

Type: Application

Filed: June 9, 2022

Publication date: December 22, 2022

Inventor: Avinash Sodani

Providing multiple memory modes for a processor including internal memory

Patent number: 11526440

Abstract: In one embodiment, a processor comprises: at least one core formed on a die to execute instructions; a first memory controller to interface with an in-package memory; a second memory controller to interface with a platform memory to couple to the processor; and the in-package memory located within a package of the processor, where the in-package memory is to be identified as a more distant memory with respect to the at least one core than the platform memory. Other embodiments are described and claimed.

Type: Grant

Filed: June 6, 2019

Date of Patent: December 13, 2022

Assignee: Intel Corporation

Inventors: Avinash Sodani, Robert J. Kyanko, Richard J. Greco, Andreas Kleen, Milind B. Girkar, Christopher M. Cantalupo

Power management and transitioning cores within a multicore system from idle mode to operational mode over a period of time

Patent number: 11526204

Abstract: A system includes a plurality of cores. Each core includes a processing unit, an on-chip memory (OCM), and an idle detector unit. Data is received and stored in the OCM. Instructions are received to process data in the OCM. The core enters an idle mode if the idle detector unit detects that the core has been idle for a first number of clocking signals. The core receives a command to process when in idle mode and transitions from the idle mode to an operational mode. A number of no operation (No-Op) commands is inserted for each time segment. A No-Op command prevents the core from processing instructions for a certain number of clocking signals. A number of No-Op commands inserted for a first time segment is greater than a number of No-Op commands inserted for a last time segment. After the last time segment no No-Op command is inserted.

Type: Grant

Filed: October 22, 2021

Date of Patent: December 13, 2022

Assignee: Marvell Asia Pte Ltd

Inventors: Chia-Hsin Chen, Avinash Sodani, Atul Bhattarai, Srinivas Sripada
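The tapering No-Op insertion in the abstract above can be sketched as a schedule of No-Op counts per time segment. The linear taper, the function names `noop_schedule` and `wake_up_stream`, and the `"NOOP"` marker are assumptions for illustration; the abstract only requires that the first segment receives more No-Ops than the last and that none are inserted afterwards.

```python
from itertools import islice
from typing import Iterable, List


def noop_schedule(segments: int, first_count: int) -> List[int]:
    """No-Op commands to insert in each wake-up segment (linear taper, an assumption)."""
    if segments < 2 or first_count < 2:
        raise ValueError("need at least two segments and two initial No-Ops")
    return [max(1, first_count * (segments - i) // segments) for i in range(segments)]


def wake_up_stream(instructions: Iterable[str], schedule: List[int],
                   per_segment: int) -> List[str]:
    """Interleave No-Ops with real instructions while the core ramps up."""
    stream: List[str] = []
    it = iter(instructions)
    for count in schedule:
        stream.extend(["NOOP"] * count)          # throttle this segment
        stream.extend(islice(it, per_segment))   # then run a few real instructions
    stream.extend(it)                            # after the last segment: no No-Ops
    return stream


print(noop_schedule(4, 8))
print(wake_up_stream("abcdef", [2, 1], per_segment=2))
```

Spreading the ramp over several segments limits how quickly current draw rises when an idle core resumes work, which appears to be the motivation for the scheme.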
ARCHITECTURE TO SUPPORT SYNCHRONIZATION BETWEEN CORE AND INFERENCE ENGINE FOR MACHINE LEARNING

Publication number: 20220374774

Abstract: A system to support a machine learning (ML) operation comprises a core configured to receive and interpret commands into a set of instructions for the ML operation and a memory unit configured to maintain data for the ML operation. The system further comprises an inference engine having a plurality of processing tiles, each comprising an on-chip memory (OCM) configured to maintain data for local access by components in the processing tile and one or more processing units configured to perform tasks of the ML operation on the data in the OCM. The system also comprises an instruction streaming engine configured to distribute the instructions to the processing tiles to control their operations and to synchronize data communication between the core and the inference engine so that data transmitted between them correctly reaches the corresponding processing tiles while ensuring coherence of data shared and distributed among the core and the OCMs.

Type: Application

Filed: June 27, 2022

Publication date: November 24, 2022

Inventors: Avinash Sodani, Gopal Nalamalapu

Power management and current/ramp detection mechanism

Patent number: 11507170

Abstract: A system includes a multicore chip configured to perform machine learning (ML) operations. The system also includes a power monitoring module configured to measure power consumption of the multicore chip on a main power rail of the multicore chip. The power monitoring module is further configured to assert a signal in response to the measured power consumption exceeding a first threshold. The power monitoring module is further configured to transmit the asserted signal to a power throttling module to initiate a power throttling for the multicore chip.

Type: Grant

Filed: October 30, 2020

Date of Patent: November 22, 2022

Assignee: Marvell Asia Pte Ltd

Inventors: Atul Bhattarai, Srinivas Sripada, Avinash Sodani, Michael Dudek, Darren Walworth, Roshan Fernando, James Irvine, Mani Gopal

Architecture for table-based mathematical operations for inference acceleration in machine learning

Patent number: 11494676

Abstract: A processing unit to support inference acceleration for machine learning (ML) comprises an inline post processing unit configured to accept and maintain one or more lookup tables for performing each of one or more non-linear mathematical operations. The inline post processing unit is further configured to accept data from a set of registers maintaining output from a processing block instead of streaming the data from an on-chip memory (OCM), perform the one or more non-linear mathematical operations on elements of the data from the processing block via their corresponding lookup tables, and stream post processing result of the one or more non-linear mathematical operations back to the OCM after the one or more non-linear mathematical operations are complete.

Type: Grant

Filed: December 23, 2020

Date of Patent: November 8, 2022

Assignee: Marvell Asia Pte Ltd

Inventors: Avinash Sodani, Ulf Hanebutte, Chia-Hsin Chen
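As a software analogy for the lookup-table approach in the abstract above, the following sketch precomputes a non-linear function and applies it to values held in a list standing in for the output registers. The table size, input range, nearest-entry lookup, and all function names are assumptions; the abstract does not specify how the tables are indexed.

```python
import math
from typing import Callable, List


def build_table(fn: Callable[[float], float], lo: float, hi: float,
                entries: int) -> List[float]:
    """Sample fn at evenly spaced points in [lo, hi]."""
    step = (hi - lo) / (entries - 1)
    return [fn(lo + i * step) for i in range(entries)]


def lookup(table: List[float], lo: float, hi: float, x: float) -> float:
    """Nearest-entry lookup; inputs outside [lo, hi] are clamped."""
    x = min(max(x, lo), hi)
    index = round((x - lo) / (hi - lo) * (len(table) - 1))
    return table[index]


def post_process(registers: List[float], table: List[float],
                 lo: float, hi: float) -> List[float]:
    """Apply the table to each register value; the result would be written back to memory."""
    return [lookup(table, lo, hi, value) for value in registers]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


table = build_table(sigmoid, -8.0, 8.0, 257)
print(post_process([-1.0, 0.0, 2.5], table, -8.0, 8.0))
```

A larger table or interpolation between entries would trade memory for accuracy.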
System and methods for mesh architecture for high bandwidth multicast and broadcast network

Patent number: 11455575

Abstract: A multi-dimensional mesh architecture is proposed to support transmitting data packets from one source to a plurality of destinations in multicasting or broadcasting modes. Each data packet to be transmitted to the destinations carries a destination mask, wherein each bit in the destination mask represents a corresponding destination processing block in the mesh architecture the data packet is sent to. The data packet traverses through the mesh architecture based on a routing scheme, wherein the data packet first traverses in a first direction across a first set of processing blocks and then traverses in a second direction across a second set of processing blocks to the first destination. During the process, the data packet is only replicated when it reaches a splitting processing block where the paths to different destinations diverge. The original and the replicated data packets are then routed in different directions until they reach their respective destinations.

Type: Grant

Filed: April 30, 2020

Date of Patent: September 27, 2022

Assignee: Marvell Asia Pte Ltd

Inventors: Dan Tu, Enrique Musoll, Chia-Hsin Chen, Avinash Sodani
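The destination mask and split-point replication described in the abstract above can be sketched for a two-dimensional mesh. The bit-to-coordinate mapping (bit `i` maps to column `i % width`, row `i // width`) and the X-then-Y traversal order are assumptions, as are all function names.

```python
from collections import Counter
from typing import List, Set, Tuple

Node = Tuple[int, int]
Link = Tuple[Node, Node]


def decode_mask(mask: int, width: int, height: int) -> List[Node]:
    """Each set bit in the destination mask names one processing block."""
    return [(i % width, i // width) for i in range(width * height) if (mask >> i) & 1]


def xy_path(src: Node, dst: Node) -> List[Node]:
    """Travel along the first direction (X) to the target column, then along Y."""
    x, y = src
    path = [(x, y)]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


def multicast_links(src: Node, mask: int, width: int, height: int) -> Set[Link]:
    """Links used by the multicast; a shared link carries the packet only once."""
    links: Set[Link] = set()
    for dst in decode_mask(mask, width, height):
        path = xy_path(src, dst)
        links.update(zip(path, path[1:]))
    return links


def split_nodes(links: Set[Link]) -> Set[Node]:
    """Blocks where paths diverge, i.e. where the packet must be replicated."""
    fanout = Counter(a for a, _ in links)
    return {node for node, count in fanout.items() if count > 1}


mask = (1 << 3) | (1 << 6)            # destinations (3, 0) and (2, 1) in a 4x4 mesh
links = multicast_links((0, 0), mask, 4, 4)
print(len(links), split_nodes(links))
```

In this example two separate unicast packets would cross six links in total, while the shared path with a single replication at the split block crosses four.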
SYSTEM AND METHOD TO MANAGE POWER THROTTLING

Publication number: 20220244767

Abstract: A power throttling engine includes a register configured to receive a power throttling signal. The power throttling engine further includes a decoder configured to generate a vector based on a value of the power throttling signal. The value of the power throttling signal is an amount of power throttling of a device. The power throttling engine further includes a clock gating logic configured to receive the vector and further configured to receive a clocking signal. The clock gating logic is configured to remove clock edges of the clocking signal based on the vector to generate a throttled clocking signal.

Type: Application

Filed: April 22, 2022

Publication date: August 4, 2022

Inventors: Avinash Sodani, Srinivas Sripada, Ramacharan Sundararaman, Chia-Hsin Chen, Nikhil Jayakumar
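The decoder and clock-gating steps in the abstract above can be modeled with a short sketch. It assumes the throttling value means "pulses to drop out of every `width` clock cycles" and that kept pulses are spread evenly; the abstract fixes neither, and `decode` and `gate_clock` are hypothetical names.

```python
from typing import List


def decode(level: int, width: int = 8) -> List[int]:
    """Turn a throttling level into a keep/drop vector of `width` bits.

    Assumption: `level` pulses are removed out of every `width` pulses.
    """
    if not 0 <= level <= width:
        raise ValueError("level must be between 0 and width")
    keep = width - level
    # Bresenham-style spreading so the removed edges are evenly spaced.
    return [((i + 1) * keep) // width - (i * keep) // width for i in range(width)]


def gate_clock(clock: List[int], vector: List[int]) -> List[int]:
    """Pass a clock pulse only where the repeating vector has a 1."""
    return [pulse & vector[i % len(vector)] for i, pulse in enumerate(clock)]


vector = decode(2)                      # drop 2 of every 8 pulses
throttled = gate_clock([1] * 16, vector)
print(vector, sum(throttled))
```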
Architecture to support synchronization between core and inference engine for machine learning

Patent number: 11403561

Abstract: A system to support a machine learning (ML) operation comprises a core configured to receive and interpret commands into a set of instructions for the ML operation and a memory unit configured to maintain data for the ML operation. The system further comprises an inference engine having a plurality of processing tiles, each comprising an on-chip memory (OCM) configured to maintain data for local access by components in the processing tile and one or more processing units configured to perform tasks of the ML operation on the data in the OCM. The system also comprises an instruction streaming engine configured to distribute the instructions to the processing tiles to control their operations and to synchronize data communication between the core and the inference engine so that data transmitted between them correctly reaches the corresponding processing tiles while ensuring coherence of data shared and distributed among the core and the OCMs.

Type: Grant

Filed: November 30, 2020

Date of Patent: August 2, 2022

Assignee: Marvell Asia Pte Ltd

Inventors: Avinash Sodani, Gopal Nalamalapu

SYSTEM AND METHOD FOR HANDLING FLOATING POINT HARDWARE EXCEPTION

Publication number: 20220188108

Abstract: A method includes receiving an input data at a floating point arithmetic operating unit, wherein the floating point operating unit is configured to perform a floating point arithmetic operation on the input data to generate an output result. The method also includes determining whether the output result is going to cause a floating point hardware exception responsive to the floating point arithmetic operation on the input data. The method further includes converting a value of the output result to a modified value responsive to the determining that the output result is going to cause the floating point hardware exception, wherein the modified value eliminates the floating point hardware exception responsive to the floating point arithmetic operation on the input data.

Type: Application

Filed: March 4, 2022

Publication date: June 16, 2022

Inventors: Chia-Hsin Chen, Avinash Sodani, Ulf Hanebutte, Rishan Tan, Soumya Gollamudi
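The output-side conversion in the abstract above can be approximated in software as a saturating operation. This is only an analogy: Python floats are 64-bit and return infinity or NaN instead of trapping, and the replacement values chosen here (largest finite value for overflow, zero for NaN) are assumptions, since the abstract does not state what the modified value is.

```python
import math
import sys

MAX_FINITE = sys.float_info.max


def saturating_mul(a: float, b: float) -> float:
    """Multiply, then replace results that would signal an exceptional condition."""
    result = a * b                       # Python returns inf/nan here rather than trapping
    if math.isnan(result):
        return 0.0                       # assumed replacement for an invalid result
    if math.isinf(result):
        return math.copysign(MAX_FINITE, result)   # assumed: clamp overflow, keep the sign
    return result


print(saturating_mul(1e308, 10.0))
print(saturating_mul(2.0, 3.0))
```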
SYSTEM AND METHOD FOR HANDLING FLOATING POINT HARDWARE EXCEPTION

Publication number: 20220188109

Abstract: A method includes receiving an input data at a floating point arithmetic operating unit, wherein the floating point operating unit is configured to perform a floating point arithmetic operation on the input data. The method includes determining whether the received input data is a qnan (quiet not-a-number) or whether the received input data is an snan (signaling not-a-number) prior to performing the floating point arithmetic operation. The method also includes converting a value of the received input data to a modified value prior to performing the floating point arithmetic operation if the received input data is either qnan or snan, wherein the converting eliminates special handling associated with the floating point arithmetic operation on the input data being either qnan or snan.

Type: Application

Filed: March 4, 2022

Publication date: June 16, 2022

Inventors: Chia-Hsin Chen, Avinash Sodani, Ulf Hanebutte, Rishan Tan, Soumya Gollamudi

SYSTEM AND METHOD FOR HANDLING FLOATING POINT HARDWARE EXCEPTION

Publication number: 20220188111

Abstract: A method includes receiving an input data at a floating point arithmetic operating unit, wherein the floating point operating unit is configured to perform a floating point arithmetic operation on the input data. The method also includes determining whether the received input data is positive infinity or negative infinity prior to performing the floating point arithmetic operation. The method further includes converting a value of the received input data to a modified value prior to performing the floating point arithmetic operation if the received input data is positive infinity or negative infinity.

Type: Application

Filed: March 4, 2022

Publication date: June 16, 2022

Inventors: Chia-Hsin Chen, Avinash Sodani, Ulf Hanebutte, Rishan Tan, Soumya Gollamudi
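The input-side checks in the abstract above, together with the qNaN/sNaN variant listed before it, can be sketched on raw IEEE 754 single-precision bit patterns. The classification follows the standard binary32 layout; the replacement values (zero for NaNs, largest finite magnitude for infinities) and the function names are assumptions, because the abstracts only say the input is converted to "a modified value".

```python
EXP_MASK = 0x7F800000      # 8 exponent bits of binary32
FRAC_MASK = 0x007FFFFF     # 23 fraction bits
QUIET_BIT = 0x00400000     # most significant fraction bit marks a quiet NaN
SIGN_BIT = 0x80000000
MAX_FINITE = 0x7F7FFFFF    # largest finite binary32 magnitude


def classify(bits: int) -> str:
    """Classify a 32-bit pattern before any arithmetic is performed on it."""
    if bits & EXP_MASK != EXP_MASK:
        return "finite"
    if bits & FRAC_MASK == 0:
        return "-inf" if bits & SIGN_BIT else "+inf"
    return "qnan" if bits & QUIET_BIT else "snan"


def sanitize(bits: int) -> int:
    """Replace special inputs so the arithmetic unit needs no special handling."""
    kind = classify(bits)
    if kind in ("qnan", "snan"):
        return 0x00000000                         # assumed replacement: +0.0
    if kind in ("+inf", "-inf"):
        return (bits & SIGN_BIT) | MAX_FINITE     # assumed: clamp, keep the sign
    return bits


print(classify(0x7FC00000), hex(sanitize(0x7F800000)))
```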
SYSTEM AND METHOD FOR HANDLING FLOATING POINT HARDWARE EXCEPTION

Publication number: 20220188110

Abstract: A method includes receiving a first input data and a second input data at a floating point arithmetic operating unit, wherein the first input data and the second input data are associated with operands of a floating point arithmetic operation respectively, wherein the floating point operating unit is configured to perform a floating point arithmetic operation on the first input data and the second input data. The method further includes determining whether the first input data is a qnan (quiet not-a-number) or whether the first input data is an snan (signaling not-a-number) prior to performing the floating point arithmetic operation. A value of the first input data is modified prior to performing the floating point arithmetic operation if the first input data is either qnan or snan, wherein the converting eliminates special handling associated with the floating point arithmetic operation on the first input data being either qnan or snan.

Type: Application

Filed: March 4, 2022

Publication date: June 16, 2022

Inventors: Chia-Hsin Chen, Avinash Sodani, Ulf Hanebutte, Rishan Tan, Soumya Gollamudi

System and method to manage power throttling

Patent number: 11340673

Abstract: A power throttling engine includes a register configured to receive a power throttling signal. The power throttling engine further includes a decoder configured to generate a vector based on a value of the power throttling signal. The value of the power throttling signal is an amount of power throttling of a device. The power throttling engine further includes a clock gating logic configured to receive the vector and further configured to receive a clocking signal. The clock gating logic is configured to remove clock edges of the clocking signal based on the vector to generate a throttled clocking signal.

Type: Grant

Filed: April 30, 2020

Date of Patent: May 24, 2022

Assignee: Marvell Asia Pte Ltd

Inventors: Avinash Sodani, Srinivas Sripada, Ramacharan Sundararaman, Chia-Hsin Chen, Nikhil Jayakumar

METHOD AND SYSTEM FOR TOPK OPERATION

Publication number: 20220129270

Abstract: A method includes receiving a TopK instruction to sort a highest K elements of a vector data. A first K elements of the vector data are sorted and stored in a first register. Another element of the vector data is read, and it is determined whether it has a value that is greater than or is within a range of values of the first K elements. A position of the another element within the first K elements is determined if the another element has a value that is within the range of values. A subset of the elements of the first K elements that are smaller than the another element are shifted down after determining the position of the another element in the first K elements. The another element is inserted in the determined position after the shifting. The process is repeated for each remaining element of the vector data.

Type: Application

Filed: October 21, 2021

Publication date: April 28, 2022

Inventors: Avinash Sodani, Ulf Hanebutte
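The steps in the abstract above amount to an insertion-based selection, which can be sketched as follows. A Python list stands in for the register, and the function name `top_k` and the descending storage order are assumptions.

```python
from typing import List


def top_k(values: List[float], k: int) -> List[float]:
    """Return the k largest values in descending order using sort, shift, and insert."""
    if not 0 < k <= len(values):
        raise ValueError("k must be between 1 and len(values)")
    register = sorted(values[:k], reverse=True)   # first K elements, sorted
    for element in values[k:]:
        if element <= register[-1]:
            continue                              # not larger than the current K-th value
        position = 0
        while register[position] >= element:      # find where the element belongs
            position += 1
        for i in range(k - 1, position, -1):      # shift smaller entries down one slot
            register[i] = register[i - 1]
        register[position] = element              # insert at the determined position
    return register


print(top_k([5, 1, 9, 3, 7, 8, 2], 3))   # [9, 8, 7]
```

Only K values are held at any time, so the approach suits a fixed-size register regardless of the vector length.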
System and method for handling floating point hardware exception

Patent number: 11301247

Abstract: A method includes receiving an input data at a FP arithmetic operating unit configured to perform a FP arithmetic operation on the input data. The method further includes determining whether the received input data generates a FP hardware exception responsive to the FP arithmetic operation on the input data, wherein the determining occurs prior to performing the FP arithmetic operation. The method also includes converting a value of the received input data to a modified value responsive to the determining that the received input data generates the FP hardware exception, wherein the converting eliminates generation of the FP hardware exception responsive to the FP arithmetic operation on the input data.

Type: Grant

Filed: April 30, 2020

Date of Patent: April 12, 2022

Assignee: Marvell Asia Pte Ltd

Inventors: Chia-Hsin Chen, Avinash Sodani, Ulf Hanebutte, Rishan Tan, Soumya Gollamudi

Architecture of crossbar of inference engine

Patent number: 11256517

Abstract: A programmable hardware system for machine learning (ML) includes a core and an inference engine. The core receives commands from a host. The commands are in a first instruction set architecture (ISA) format. The core divides the commands into a first set for performance-critical operations, in the first ISA format, and a second set of performance non-critical operations, in the first ISA format. The core executes the second set to perform the performance non-critical operations of the ML operations and streams the first set to the inference engine. The inference engine generates a stream of the first set of commands in a second ISA format based on the first set of commands in the first ISA format. The first set of commands in the second ISA format programs components within the inference engine to execute the ML operations to infer data.

Type: Grant

Filed: December 19, 2018

Date of Patent: February 22, 2022

Assignee: Marvell Asia Pte Ltd

Inventors: Avinash Sodani, Ulf Hanebutte, Senad Durakovic, Hamid Reza Ghasemi, Chia-Hsin Chen
