Patents by Inventor Natarajan Vaidhyanathan

Natarajan Vaidhyanathan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

MEMORY STORAGE FORMAT FOR SUPPORTING MACHINE LEARNING ACCELERATION

Publication number: 20240095872

Abstract: A processor-implemented method for a memory storage format to accelerate machine learning (ML) on a computing device is described. The method includes receiving an image in a first layer storage format of a neural network. The method also includes assigning addresses to image pixels of each of three channels of the first layer storage format for accessing the image pixels in a blocked ML storage acceleration format. The method further includes storing the image pixels in the blocked ML storage acceleration format according to the assigned addresses of the image pixels. The method also includes accelerating inference video processing of the image according to the assigned addresses for the image pixels corresponding to the blocked ML storage acceleration format.

Type: Application

Filed: September 16, 2022

Publication date: March 21, 2024

Inventors: Colin Beaton VERRILLI, Natarajan VAIDHYANATHAN, Matthew SIMPSON, Geoffrey Carlton BERRY, Sandeep PANDE

INLINE DECOMPRESSION

Publication number: 20230223954

Abstract: Techniques and apparatuses to decompress data that has been stack compressed is described. Stack compression refers to compression of data in one or more dimensions. For uncompressed data blocks that are very sparse, i.e., data blocks that contain many zeros, stack compression can be effective. In stack compression, uncompressed data block is compressed into compressed data block by removing one or more zero words from the uncompressed data block. A map metadata that maps the zero words of the uncompressed data block is generated during compression. With the use of the map metadata, the compressed data block can be decompressed to restore the uncompressed data block.

Type: Application

Filed: May 7, 2021

Publication date: July 13, 2023

Inventors: Colin Beaton VERRILLI, Natarajan VAIDHYANATHAN
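
As an illustration of the stack compression idea described in the abstract above, here is a minimal Python sketch. It assumes a data block is a list of integer words and represents the map metadata as one boolean per word; the function names `stack_compress` and `stack_decompress` are hypothetical and do not come from the application. The sketch removes every zero word, which is only one possible reading of "one or more zero words".

```python
def stack_compress(block):
    """Drop zero words and return (compressed words, zero map)."""
    # Assumed metadata layout: True marks a position that held a zero word.
    zero_map = [word == 0 for word in block]
    compressed = [word for word in block if word != 0]
    return compressed, zero_map


def stack_decompress(compressed, zero_map):
    """Rebuild the uncompressed block from the compressed words and the map."""
    words = iter(compressed)
    # Re-insert a zero wherever the map says one was removed.
    return [0 if is_zero else next(words) for is_zero in zero_map]


block = [0, 7, 0, 0, 3, 0, 0, 9]
compressed, zero_map = stack_compress(block)
restored = stack_decompress(compressed, zero_map)
```

For a sparse block such as this one, only three of the eight words need to be stored alongside the map.
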
HYBRID COMPUTE-IN-MEMORY

Publication number: 20230078079

Abstract: A compute-in-memory array is provided that implements a filter for a layer in a neural network. The filter multiplies a plurality of activation bits by a plurality of filter weight bits for each channel in a plurality of channels through a charge accumulation from a plurality of capacitors. The accumulated charge is digitized to provide the output of the filter.

Type: Application

Filed: September 10, 2021

Publication date: March 16, 2023

Inventors: Francois Ibrahim ATALLAH, Hoan Huu NGUYEN, Colin Beaton VERRILLI, Natarajan VAIDHYANATHAN

Inline decompression

Patent number: 11362672

Abstract: Stack compression refers to compression of data in one or more dimensions. For uncompressed data blocks that are very sparse, i.e., data blocks that contain many zeros, stack compression can be effective. In stack compression, uncompressed data block is compressed into compressed data block by removing one or more zero words from the uncompressed data block. A map metadata that maps the zero words of the uncompressed data block is generated during compression. With the use of the map metadata, the compressed data block can be decompressed to restore the uncompressed data block.

Type: Grant

Filed: May 8, 2020

Date of Patent: June 14, 2022

Assignee: Qualcomm Incorporated

Inventors: Colin Beaton Verrilli, Natarajan Vaidhyanathan

INLINE DECOMPRESSION

Publication number: 20210351789

Abstract: Techniques and apparatuses to decompress data that has been stack compressed is described. Stack compression refers to compression of data in one or more dimensions. For uncompressed data blocks that are very sparse, i.e., data blocks that contain many zeros, stack compression can be effective. In stack compression, uncompressed data block is compressed into compressed data block by removing one or more zero words from the uncompressed data block. A map metadata that maps the zero words of the uncompressed data block is generated during compression. With the use of the map metadata, the compressed data block can be decompressed to restore the uncompressed data block.

Type: Application

Filed: May 8, 2020

Publication date: November 11, 2021

Inventors: Colin Beaton VERRILLI, Natarajan VAIDHYANATHAN

Providing self-resetting multi-producer multi-consumer semaphores in distributed processor-based systems

Patent number: 11144368

Abstract: Providing self-resetting multi-producer multi-consumer semaphores in distributed processor-based systems is disclosed. In one aspect, a synchronization management circuit provides a semaphore including a counting semaphore value indicator, a current wait count indicator, and a target wait count indicator. When a consumer completes a wait operation, the synchronization management circuit adjusts the value of the current wait count indicator towards the value of the target wait count indicator, and compares the value of the current wait count indicator to the value of the target wait count indicator. If the value of the current wait count indicator has reached the value of the target wait count indicator, the synchronization management circuit infers that all consumers have observed the semaphore, and accordingly resets both the counting semaphore value indicator and the current wait count indicator to an initial wait value to place the semaphore in its initial state for reuse.

Type: Grant

Filed: June 18, 2019

Date of Patent: October 12, 2021

Assignee: Qualcomm Incorporated

Inventors: Colin Beaton Verrilli, Natarajan Vaidhyanathan
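
The self-resetting behavior described in the abstract above can be modeled in a few lines of Python. This is a single-threaded sketch under stated assumptions, not the patented circuit: the class name, the `signal` method, counting upward from zero, and using zero as the initial value for both indicators are all hypothetical choices made for illustration.

```python
class SelfResettingSemaphore:
    """Toy model of a semaphore that resets once all consumers have waited."""

    def __init__(self, target_wait_count):
        self.value = 0                      # counting semaphore value indicator
        self.current_wait_count = 0         # consumers that have completed a wait
        self.target_wait_count = target_wait_count

    def signal(self):
        # Assumed producer operation: bump the counting semaphore value.
        self.value += 1

    def complete_wait(self):
        """Record one completed wait; return True if the semaphore was reset."""
        # Move the current wait count toward the target and compare.
        self.current_wait_count += 1
        if self.current_wait_count == self.target_wait_count:
            # All consumers have observed the semaphore: restore the initial state.
            self.value = 0
            self.current_wait_count = 0
            return True
        return False
```

With a target of three consumers, the first two calls to `complete_wait()` leave the state alone and the third one resets it, so the same semaphore can be reused without a separate reset step.
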
Method, apparatus, and system for an architecture for machine learning acceleration

Patent number: 11010313

Abstract: A method, apparatus, and system for an architecture for machine learning acceleration is presented. An apparatus includes a plurality of processing elements, each including a tightly-coupled memory, and a memory system coupled to the processing elements. A global synchronization manager is coupled to the plurality of the processing elements and to the memory system. The processing elements do not implement a coherency protocol with respect to the memory system. The processing elements implement direct memory access with respect to the memory system, and the global synchronization manager is configured to synchronize operations of the plurality of processing elements through the TCMs.

Type: Grant

Filed: August 29, 2019

Date of Patent: May 18, 2021

Assignee: Qualcomm Incorporated

Inventors: Colin Beaton Verrilli, Natarajan Vaidhyanathan, Rexford Alan Hill

Providing flexible matrix processors for performing neural network convolution in matrix-processor-based devices

Patent number: 10936943

Abstract: Providing flexible matrix processors for performing neural network convolution in matrix-processor-based devices is disclosed. In this regard, a matrix-processor-based device provides a central processing unit (CPU) and a matrix processor. The matrix processor reorganizes a plurality of weight matrices and a plurality of input matrices into swizzled weight matrices and swizzled input matrices, respectively, that have regular dimensions natively supported by the matrix processor. The matrix-processor-based device then performs a convolution operation using the matrix processor to perform matrix multiplication/accumulation operations for the regular dimensions of the weight matrices and the input matrices, and further uses the CPU to execute instructions for handling the irregular dimensions of the weight matrices and the input matrices (e.g., by executing a series of nested loops, as a non-limiting example).

Type: Grant

Filed: August 30, 2018

Date of Patent: March 2, 2021

Assignee: Qualcomm Incorporated

Inventors: Colin Beaton Verrilli, Mattheus Cornelis Antonius Adrianus Heddes, Natarajan Vaidhyanathan, Koustav Bhattacharya, Robert Dreyer

Network control software notification and invalidation of static entries

Patent number: 10877951

Abstract: Techniques are disclosed for notifying network control software of new and moved source MAC addresses. In one embodiment, a switch may redirect a packet sent by a new or migrated virtual machine to the network control software as a notification. The switch does not forward the packet, thereby protecting against denial of service attacks. The switch further adds to a forwarding database a temporary entry which includes a “No_Redirect” flag for a new source MAC address, or updates an existing entry for a source MAC address that hits in the forwarding database by setting the “No_Redirect” flag. The “No_Redirect” flag indicates whether a notification has already been sent to the network control software for this source MAC address. The switch may periodically retry the notification to the network control software, until the network control software validates the source MAC address, depending on whether the “No_Redirect” is set.

Type: Grant

Filed: January 22, 2014

Date of Patent: December 29, 2020

Assignee: International Business Machines Corporation

Inventors: Claude Basso, Josep Cors, Venkatesh K. Janakiraman, Sze-Wa Lao, Sameer M. Shah, David A. Shedivy, Ethan M. Spiegel, Natarajan Vaidhyanathan, Colin B. Verrilli

PROVIDING SELF-RESETTING MULTI-PRODUCER MULTI-CONSUMER SEMAPHORES IN DISTRIBUTED PROCESSOR-BASED SYSTEMS

Publication number: 20200401461

Abstract: Providing self-resetting multi-producer multi-consumer semaphores in distributed processor-based systems is disclosed. In one aspect, a synchronization management circuit provides a semaphore including a counting semaphore value indicator, a current wait count indicator, and a target wait count indicator. When a consumer completes a wait operation, the synchronization management circuit adjusts the value of the current wait count indicator towards the value of the target wait count indicator, and compares the value of the current wait count indicator to the value of the target wait count indicator. If the value of the current wait count indicator has reached the value of the target wait count indicator, the synchronization management circuit infers that all consumers have observed the semaphore, and accordingly resets both the counting semaphore value indicator and the current wait count indicator to an initial wait value to place the semaphore in its initial state for reuse.

Type: Application

Filed: June 18, 2019

Publication date: December 24, 2020

Inventors: Colin Beaton Verrilli, Natarajan Vaidhyanathan

Memory controllers employing memory capacity compression, and related processor-based systems and methods

Patent number: 10838862

Abstract: Aspects disclosed herein include memory controllers employing memory capacity compression, and related processor-based systems and methods. In certain aspects, compressed memory controllers are employed that can provide memory capacity compression. In some aspects, a line-based memory capacity compression scheme can be employed where additional translation of a physical address (PA) to a physical buffer address is performed to allow compressed data in a system memory at the physical buffer address for efficient compressed data storage. A translation lookaside buffer (TLB) may also be employed to store TLB entries comprising PA tags corresponding to a physical buffer address in the system memory to more efficiently perform the translation of the PA to the physical buffer address in the system memory. In certain aspects, a line-based memory capacity compression scheme, a page-based memory capacity compression scheme, or a hybrid line-page-based memory capacity compression scheme can be employed.

Type: Grant

Filed: May 19, 2015

Date of Patent: November 17, 2020

Assignee: QUALCOMM Incorporated

Inventors: Mattheus Cornelis Antonius Adrianus Heddes, Natarajan Vaidhyanathan, Colin Beaton Verrilli

Network control software notification and invalidation of static entries

Patent number: 10838942

Abstract: Techniques are disclosed for notifying network control software of new and moved source MAC addresses. In one embodiment, a switch may redirect a packet sent by a new or migrated virtual machine to the network control software as a notification. The switch does not forward the packet, thereby protecting against denial of service attacks. The switch further adds to a forwarding database a temporary entry which includes a “No_Redirect” flag for a new source MAC address, or updates an existing entry for a source MAC address that hits in the forwarding database by setting the “No_Redirect” flag. The “No_Redirect” flag indicates whether a notification has already been sent to the network control software for this source MAC address. The switch may periodically retry the notification to the network control software, until the network control software validates the source MAC address, depending on whether the “No_Redirect” is set.

Type: Grant

Filed: February 11, 2014

Date of Patent: November 17, 2020

Assignee: International Business Machines Corporation

Inventors: Claude Basso, Josep Cors, Venkatesh K. Janakiraman, Sze-Wa Lao, Sameer M. Shah, David A. Shedivy, Ethan M. Spiegel, Natarajan Vaidhyanathan, Colin B. Verrilli

Providing efficient floating-point operations using matrix processors in processor-based systems

Patent number: 10747501

Abstract: Providing efficient floating-point operations using matrix processors in processor-based systems is disclosed. In this regard, a matrix-processor-based device provides a matrix processor comprising a positive partial sum accumulator and a negative partial sum accumulator. As the matrix processor processes pairs of floating-point operands, the matrix processor calculates an intermediate product based on a first floating-point operand and a second floating-point operand and determines a sign of the intermediate product. Based on the sign, the matrix processor normalizes the intermediate product with a partial sum fraction of the positive partial sum accumulator or the negative partial sum accumulator, then adds the intermediate product to the positive sum accumulator or the negative sum accumulator.

Type: Grant

Filed: August 30, 2018

Date of Patent: August 18, 2020

Assignee: Qualcomm Incorporated

Inventors: Mattheus Cornelis Antonius Adrianus Heddes, Natarajan Vaidhyanathan, Robert Dreyer, Colin Beaton Verrilli, Koustav Bhattacharya
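
The split-accumulator idea in the abstract above can be sketched in software as follows. This is only a rough analogy: it routes each product to a positive or a negative running sum based on its sign, and it leaves out the hardware-level normalization against the accumulator's partial sum fraction. The function name and the choice to treat a zero product as positive are assumptions made for illustration.

```python
def dot_with_split_accumulators(xs, ys):
    """Dot product that keeps positive and negative partial sums apart."""
    positive_sum = 0.0   # stands in for the positive partial sum accumulator
    negative_sum = 0.0   # stands in for the negative partial sum accumulator
    for x, y in zip(xs, ys):
        product = x * y                  # intermediate product
        if product >= 0:                 # the sign selects the accumulator
            positive_sum += product
        else:
            negative_sum += product
    # The two partial sums are combined only once, at the end.
    return positive_sum + negative_sum, positive_sum, negative_sum


total, pos, neg = dot_with_split_accumulators([1.5, -2.0, 4.0], [2.0, 3.0, -0.5])
```

Because each accumulator only ever receives values of one sign, every addition inside the loop is a same-sign addition, and the single mixed-sign operation is deferred to the final combine step.
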
Providing efficient multiplication of sparse matrices in matrix-processor-based devices

Patent number: 10725740

Abstract: Providing efficient multiplication of sparse matrices in matrix-processor-based devices is disclosed herein. In one aspect, a matrix processor of a matrix-processor-based device includes a plurality of sequencers coupled to a plurality of multiply/accumulate (MAC) units for performing multiplication and accumulation operations. Each sequencer determines whether a product of an element of a first input matrix to be multiplied with an element of a second input matrix has a value of zero (e.g., by determining whether the element of the first input matrix has a value of zero, or by determining whether either the element of the first input matrix or that of the second input matrix has a value of zero). If the product of the elements of the first input matrix and the second input matrix does not have a value of zero, the sequencer provides the elements to a MAC unit to perform a multiplication and accumulation operation.

Type: Grant

Filed: August 30, 2018

Date of Patent: July 28, 2020

Assignee: Qualcomm Incorporated

Inventors: Mattheus Cornelis Antonius Adrianus Heddes, Robert Dreyer, Colin Beaton Verrilli, Natarajan Vaidhyanathan, Koustav Bhattacharya
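
The zero-skipping check described in the abstract above can be sketched in plain Python. The sketch assumes matrices are lists of lists and uses a simple loop in place of the sequencers and MAC units; the function name `sparse_matmul` and the returned operation counter are hypothetical additions for illustration.

```python
def sparse_matmul(a, b):
    """Multiply a by b, skipping multiply/accumulate work for zero products."""
    rows, inner, cols = len(a), len(b), len(b[0])
    result = [[0] * cols for _ in range(rows)]
    mac_ops = 0  # number of multiply/accumulate operations actually issued
    for i in range(rows):
        for j in range(cols):
            for k in range(inner):
                x, y = a[i][k], b[k][j]
                # "Sequencer" check: if either element is zero, the product
                # is zero and no MAC operation is needed.
                if x == 0 or y == 0:
                    continue
                result[i][j] += x * y
                mac_ops += 1
    return result, mac_ops


product, mac_ops = sparse_matmul([[1, 0], [0, 2]], [[0, 3], [4, 0]])
```

In this 2x2 example a dense multiplication would issue eight multiply/accumulate operations, while the zero check reduces that to two.
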
METHOD, APPARATUS, AND SYSTEM FOR AN ARCHITECTURE FOR MACHINE LEARNING ACCELERATION

Publication number: 20200073830

Abstract: A method, apparatus, and system for an architecture for machine learning acceleration is presented. An apparatus includes a plurality of processing elements, each including a tightly-coupled memory, and a memory system coupled to the processing elements. A global synchronization manager is coupled to the plurality of the processing elements and to the memory system. The processing elements do not implement a coherency protocol with respect to the memory system. The processing elements implement direct memory access with respect to the memory system, and the global synchronization manager is configured to synchronize operations of the plurality of processing elements through the TCMs.

Type: Application

Filed: August 29, 2019

Publication date: March 5, 2020

Inventors: Colin Beaton VERRILLI, Natarajan VAIDHYANATHAN, Rexford Alan HILL

Supporting access control list rules that apply to TCP segments belonging to ‘established’ connection

Patent number: 10541921

Abstract: Embodiments provide a TCAM-based access control list that supports disjunction operations in rules. A network frame is received. Embodiments determine set TCP flags of the network frame. Upon determining that the set TCP flags match a first entry in a numeric range table, bits of a search key corresponding to the first entry are updated. The search key accesses a second entry stored in a TCAM. The first entry further comprises an encode field to scan a TCP header of the network frame for set TCP flags, a first mask field to a condition corresponding to unset TCP flags to identify in the network frame, a second mask field to a condition corresponding to set TCP flags to identify in the network frame, and an operation field specifying a disjunction operation for comparing the set TCP flags with the first mask field and the second mask field.

Type: Grant

Filed: November 21, 2017

Date of Patent: January 21, 2020

Assignee: International Business Machines Corporation

Inventors: Claude Basso, Joseph A. Kirscht, Natarajan Vaidhyanathan

Providing memory bandwidth compression using compressed memory controllers (CMCs) in a central processing unit (CPU)-based system

Patent number: 10503661

Abstract: Providing memory bandwidth compression using compressed memory controllers (CMCs) in a central processing unit (CPU)-based system is disclosed. In this regard, in some aspects, a CMC is configured to receive a memory read request to a physical address in a system memory, and read a compression indicator (CI) for the physical address from a master directory and/or from error correcting code (ECC) bits of the physical address. Based on the CI, the CMC determines a number of memory blocks to be read for the memory read request, and reads the determined number of memory blocks. In some aspects, a CMC is configured to receive a memory write request to a physical address in the system memory, and generate a CI for write data based on a compression pattern of the write data. The CMC updates the master directory and/or the ECC bits of the physical address with the generated CI.

Type: Grant

Filed: May 20, 2015

Date of Patent: December 10, 2019

Assignee: QUALCOMM Incorporated

Inventors: Mattheus Cornelis Antonius Adrianus Heddes, Natarajan Vaidhyanathan, Colin Beaton Verrilli
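
The read and write flow in the abstract above can be illustrated with a toy Python model. Everything concrete here is an assumption for the sake of the sketch: the 32-byte block and 128-byte line sizes, the CI being stored as a block count in a dictionary that stands in for the master directory, and the "compression" itself, which simply drops trailing zero bytes and is not the scheme from the patent. The class name `ToyCMC` is hypothetical.

```python
import math

BLOCK_BYTES = 32    # assumed memory block size
LINE_BYTES = 128    # assumed uncompressed line size (four blocks)


class ToyCMC:
    """Toy compressed memory controller that tracks a CI per physical address."""

    def __init__(self):
        self.master_directory = {}  # physical address -> compression indicator
        self.memory = {}            # physical address -> list of stored blocks

    def write(self, pa, data):
        assert len(data) == LINE_BYTES
        # Stand-in compression: drop trailing zero bytes.
        payload = data.rstrip(b"\x00")
        # The CI records how many blocks the compressed data occupies.
        ci = max(1, math.ceil(len(payload) / BLOCK_BYTES))
        padded = payload.ljust(ci * BLOCK_BYTES, b"\x00")
        self.memory[pa] = [padded[i:i + BLOCK_BYTES]
                           for i in range(0, len(padded), BLOCK_BYTES)]
        self.master_directory[pa] = ci

    def read(self, pa):
        # Look up the CI first, then read only that many blocks.
        ci = self.master_directory[pa]
        blocks = self.memory[pa][:ci]
        return b"".join(blocks).ljust(LINE_BYTES, b"\x00"), ci
```

A line holding 40 bytes of data followed by zeros is stored in two blocks instead of four, and a later read fetches only those two blocks, which is where the bandwidth saving would come from in this model.
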
Providing space-efficient storage for dynamic random access memory (DRAM) cache tags

Patent number: 10467092

Abstract: Providing space-efficient storage for dynamic random access memory (DRAM) cache tags is provided. In one aspect, a DRAM cache management circuit provides a plurality of cache entries, each of which contains a tag storage region, a data storage region, and an error protection region. The DRAM cache management circuit is configured to store data to be cached in the data storage region of each cache entry. The DRAM cache management circuit is also configured to use an error detection code (EDC) instead of an error correcting code (ECC), and to store a tag and the EDC for each cache entry in the error protection region of the cache entry. In this manner, the capacity of a DRAM cache can be increased by avoiding the need for the tag storage region for each cache entry, while still providing error detection for the cache entry.

Type: Grant

Filed: March 30, 2016

Date of Patent: November 5, 2019

Assignee: QUALCOMM Incorporated

Inventors: Natarajan Vaidhyanathan, Mattheus Cornelis Antonius Adrianus Heddes, Colin Beaton Verrilli

Network control software notification with advance learning

Patent number: 10419267

Abstract: Techniques are disclosed for notifying network control software of new and moved source MAC addresses. In one embodiment, a switch detects packets sent by a new or migrated virtual machine, and sends a copy of a detected packet to the network control software as a notification. The switch further learns the source MAC address, thereby permitting the entry to be used for normal forwarding prior to validation of the entry and the VM associated therewith by the network control software. Until the network control software has validated the VM, the switch may periodically retry the notification to the network control software. “No_Redirect” and “Not_Validated” flags may be used to indicate whether a notification has already been attempted and thus no retry is necessary, and that the VM associated with the VM has not yet been validated, respectively.

Type: Grant

Filed: January 22, 2014

Date of Patent: September 17, 2019

Assignee: LENOVO ENTERPRISE SOLUTIONS (SINGAPORE) PTE. LTD.

Inventors: Claude Basso, Josep Cors, Venkatesh K. Janakiraman, Sze-Wa Lao, Sameer M. Shah, David A. Shedivy, Ethan M. Spiegel, Natarajan Vaidhyanathan, Colin B. Verrilli

Providing memory bandwidth compression in chipkill-correct memory architectures

Patent number: 10236917

Abstract: Providing memory bandwidth compression in chipkill-correct memory architectures is disclosed. In this regard, a compressed memory controller (CMC) introduces a specified error pattern into chipkill-correct error correcting code (ECC) bits to indicate compressed data. To encode data, the CMC applies a compression algorithm to an uncompressed data block to generate a compressed data block. The CMC then generates ECC data for the compressed data block (i.e., an “inner” ECC segment), appends the inner ECC segment to the compressed data block, and generates ECC data for the compressed data block and the inner ECC segment (i.e., an “outer” ECC segment). The CMC then intentionally inverts a specified plurality of bytes of the outer ECC segment (e.g., in portions of the outer ECC segment stored in different physical memory chips by a chipkill-correct ECC mechanism). The outer ECC segment is then appended to the compressed data block and the inner ECC segment.

Type: Grant

Filed: September 15, 2016

Date of Patent: March 19, 2019

Assignee: QUALCOMM Incorporated

Inventors: Natarajan Vaidhyanathan, Luther James Blackwood, Mattheus Cornelis Antonius Adrianus Heddes, Michael Raymond Trombley, Colin Beaton Verrilli