Patents by Inventor Natarajan Vaidhyanathan

Natarajan Vaidhyanathan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240095872
    Abstract: A processor-implemented method for a memory storage format to accelerate machine learning (ML) on a computing device is described. The method includes receiving an image in a first layer storage format of a neural network. The method also includes assigning addresses to image pixels of each of three channels of the first layer storage format for accessing the image pixels in a blocked ML storage acceleration format. The method further includes storing the image pixels in the blocked ML storage acceleration format according to the assigned addresses of the image pixels. The method also includes accelerating inference video processing of the image according to the assigned addresses for the image pixels corresponding to the blocked ML storage acceleration format.
    Type: Application
    Filed: September 16, 2022
    Publication date: March 21, 2024
    Inventors: Colin Beaton VERRILLI, Natarajan VAIDHYANATHAN, Matthew SIMPSON, Geoffrey Carlton BERRY, Sandeep PANDE
  • Publication number: 20230223954
    Abstract: Techniques and apparatuses to decompress data that has been stack compressed is described. Stack compression refers to compression of data in one or more dimensions. For uncompressed data blocks that are very sparse, i.e., data blocks that contain many zeros, stack compression can be effective. In stack compression, uncompressed data block is compressed into compressed data block by removing one or more zero words from the uncompressed data block. A map metadata that maps the zero words of the uncompressed data block is generated during compression. With the use of the map metadata, the compressed data block can be decompressed to restore the uncompressed data block.
    Type: Application
    Filed: May 7, 2021
    Publication date: July 13, 2023
    Inventors: Colin Beaton VERRILLI, Natarajan VAIDHYANATHAN
  • Publication number: 20230078079
    Abstract: A compute-in-memory array is provided that implements a filter for a layer in a neural network. The filter multiplies a plurality of activation bits by a plurality of filter weight bits for each channel in a plurality of channels through a charge accumulation from a plurality of capacitors. The accumulated charge is digitized to provide the output of the filter.
    Type: Application
    Filed: September 10, 2021
    Publication date: March 16, 2023
    Inventors: Francois Ibrahim ATALLAH, Hoan Huu NGUYEN, Colin Beaton VERRILLI, Natarajan VAIDHYANATHAN
  • Patent number: 11362672
    Abstract: Stack compression refers to compression of data in one or more dimensions. For uncompressed data blocks that are very sparse, i.e., data blocks that contain many zeros, stack compression can be effective. In stack compression, uncompressed data block is compressed into compressed data block by removing one or more zero words from the uncompressed data block. A map metadata that maps the zero words of the uncompressed data block is generated during compression. With the use of the map metadata, the compressed data block can be decompressed to restore the uncompressed data block.
    Type: Grant
    Filed: May 8, 2020
    Date of Patent: June 14, 2022
    Assignee: Qualcomm Incorporated
    Inventors: Colin Beaton Verrilli, Natarajan Vaidhyanathan
  • Publication number: 20210351789
    Abstract: Techniques and apparatuses to decompress data that has been stack compressed is described. Stack compression refers to compression of data in one or more dimensions. For uncompressed data blocks that are very sparse, i.e., data blocks that contain many zeros, stack compression can be effective. In stack compression, uncompressed data block is compressed into compressed data block by removing one or more zero words from the uncompressed data block. A map metadata that maps the zero words of the uncompressed data block is generated during compression. With the use of the map metadata, the compressed data block can be decompressed to restore the uncompressed data block.
    Type: Application
    Filed: May 8, 2020
    Publication date: November 11, 2021
    Inventors: Colin Beaton VERRILLI, Natarajan VAIDHYANATHAN
  • Patent number: 11144368
    Abstract: Providing self-resetting multi-producer multi-consumer semaphores in distributed processor-based systems is disclosed. In one aspect, a synchronization management circuit provides a semaphore including a counting semaphore value indicator, a current wait count indicator, and a target wait count indicator. When a consumer completes a wait operation, the synchronization management circuit adjusts the value of the current wait count indicator towards the value of the target wait count indicator, and compares the value of the current wait count indicator to the value of the target wait count indicator. If the value of the current wait count indicator has reached the value of the target wait count indicator, the synchronization management circuit infers that all consumers have observed the semaphore, and accordingly resets both the counting semaphore value indicator and the current wait count indicator to an initial wait value to place the semaphore in its initial state for reuse.
    Type: Grant
    Filed: June 18, 2019
    Date of Patent: October 12, 2021
    Assignee: Qualcomm Incorproated
    Inventors: Colin Beaton Verrilli, Natarajan Vaidhyanathan
  • Patent number: 11010313
    Abstract: A method, apparatus, and system for an architecture for machine learning acceleration is presented. An apparatus includes a plurality of processing elements, each including a tightly-coupled memory, and a memory system coupled to the processing elements. A global synchronization manager is coupled to the plurality of the processing elements and to the memory system. The processing elements do not implement a coherency protocol with respect to the memory system. The processing elements implement direct memory access with respect to the memory system, and the global synchronization manager is configured to synchronize operations of the plurality of processing elements through the TCMs.
    Type: Grant
    Filed: August 29, 2019
    Date of Patent: May 18, 2021
    Assignee: Qualcomm Incorporated
    Inventors: Colin Beaton Verrilli, Natarajan Vaidhyanathan, Rexford Alan Hill
  • Patent number: 10936943
    Abstract: Providing flexible matrix processors for performing neural network convolution in matrix-processor-based devices is disclosed. In this regard, a matrix-processor-based device provides a central processing unit (CPU) and a matrix processor. The matrix processor reorganizes a plurality of weight matrices and a plurality of input matrices into swizzled weight matrices and swizzled input matrices, respectively, that have regular dimensions natively supported by the matrix processor. The matrix-processor-based device then performs a convolution operation using the matrix processor to perform matrix multiplication/accumulation operations for the regular dimensions of the weight matrices and the input matrices, and further uses the CPU to execute instructions for handling the irregular dimensions of the weight matrices and the input matrices (e.g., by executing a series of nested loops, as a non-limiting example).
    Type: Grant
    Filed: August 30, 2018
    Date of Patent: March 2, 2021
    Assignee: Qualcomm Incorporated
    Inventors: Colin Beaton Verrilli, Mattheus Cornelis Antonius Adrianus Heddes, Natarajan Vaidhyanathan, Koustav Bhattacharya, Robert Dreyer
  • Patent number: 10877951
    Abstract: Techniques are disclosed for notifying network control software of new and moved source MAC addresses. In one embodiment, a switch may redirect a packet sent by a new or migrated virtual machine to the network control software as a notification. The switch does not forward the packet, thereby protecting against denial of service attacks. The switch further adds to a forwarding database a temporary entry which includes a “No_Redirect” flag for a new source MAC address, or updates an existing entry for a source MAC address that hits in the forwarding database by setting the “No_Redirect” flag. The “No_Redirect” flag indicates whether a notification has already been sent to the network control software for this source MAC address. The switch may periodically retry the notification to the network control software, until the network control software validates the source MAC address, depending on whether the “No_Redirect” is set.
    Type: Grant
    Filed: January 22, 2014
    Date of Patent: December 29, 2020
    Assignee: International Business Machines Corporation
    Inventors: Claude Basso, Josep Cors, Venkatesh K. Janakiraman, Sze-Wa Lao, Sameer M. Shah, David A. Shedivy, Ethan M. Spiegel, Natarajan Vaidhyanathan, Colin B. Verrilli
  • Publication number: 20200401461
    Abstract: Providing self-resetting multi-producer multi-consumer semaphores in distributed processor-based systems is disclosed. In one aspect, a synchronization management circuit provides a semaphore including a counting semaphore value indicator, a current wait count indicator, and a target wait count indicator. When a consumer completes a wait operation, the synchronization management circuit adjusts the value of the current wait count indicator towards the value of the target wait count indicator, and compares the value of the current wait count indicator to the value of the target wait count indicator. If the value of the current wait count indicator has reached the value of the target wait count indicator, the synchronization management circuit infers that all consumers have observed the semaphore, and accordingly resets both the counting semaphore value indicator and the current wait count indicator to an initial wait value to place the semaphore in its initial state for reuse.
    Type: Application
    Filed: June 18, 2019
    Publication date: December 24, 2020
    Inventors: Colin Beaton Verrilli, Natarajan Vaidhyanathan
  • Patent number: 10838862
    Abstract: Aspects disclosed herein include memory controllers employing memory capacity compression, and related processor-based systems and methods. In certain aspects, compressed memory controllers are employed that can provide memory capacity compression. In some aspects, a line-based memory capacity compression scheme can be employed where additional translation of a physical address (PA) to a physical buffer address is performed to allow compressed data in a system memory at the physical buffer address for efficient compressed data storage. A translation lookaside buffer (TLB) may also be employed to store TLB entries comprising PA tags corresponding to a physical buffer address in the system memory to more efficiently perform the translation of the PA to the physical buffer address in the system memory. In certain aspects, a line-based memory capacity compression scheme, a page-based memory capacity compression scheme, or a hybrid line-page-based memory capacity compression scheme can be employed.
    Type: Grant
    Filed: May 19, 2015
    Date of Patent: November 17, 2020
    Assignee: QUALCOMM Incorporated
    Inventors: Mattheus Cornelis Antonius Adrianus Heddes, Natarajan Vaidhyanathan, Colin Beaton Verrilli
  • Patent number: 10838942
    Abstract: Techniques are disclosed for notifying network control software of new and moved source MAC addresses. In one embodiment, a switch may redirect a packet sent by a new or migrated virtual machine to the network control software as a notification. The switch does not forward the packet, thereby protecting against denial of service attacks. The switch further adds to a forwarding database a temporary entry which includes a “No_Redirect” flag for a new source MAC address, or updates an existing entry for a source MAC address that hits in the forwarding database by setting the “No_Redirect” flag. The “No_Redirect” flag indicates whether a notification has already been sent to the network control software for this source MAC address. The switch may periodically retry the notification to the network control software, until the network control software validates the source MAC address, depending on whether the “No_Redirect” is set.
    Type: Grant
    Filed: February 11, 2014
    Date of Patent: November 17, 2020
    Assignee: International Business Machines Corporation
    Inventors: Claude Basso, Josep Cors, Venkatesh K. Janakiraman, Sze-Wa Lao, Sameer M. Shah, David A. Shedivy, Ethan M. Spiegel, Natarajan Vaidhyanathan, Colin B. Verrilli
  • Patent number: 10747501
    Abstract: Providing efficient floating-point operations using matrix processors in processor-based systems is disclosed. In this regard, a matrix-processor-based device provides a matrix processor comprising a positive partial sum accumulator and a negative partial sum accumulator. As the matrix processor processes pairs of floating-point operands, the matrix processor calculates an intermediate product based on a first floating-point operand and a second floating-point operand and determines a sign of the intermediate product. Based on the sign, the matrix processor normalizes the intermediate product with a partial sum fraction of the positive partial sum accumulator or the negative partial sum accumulator, then adds the intermediate product to the positive sum accumulator or the negative sum accumulator.
    Type: Grant
    Filed: August 30, 2018
    Date of Patent: August 18, 2020
    Assignee: Qualcomm Incorporated
    Inventors: Mattheus Cornelis Antonius Adrianus Heddes, Natarajan Vaidhyanathan, Robert Dreyer, Colin Beaton Verrilli, Koustav Bhattacharya
  • Patent number: 10725740
    Abstract: Providing efficient multiplication of sparse matrices in matrix-processor-based devices is disclosed herein. In one aspect, a matrix processor of a matrix-processor-based device includes a plurality of sequencers coupled to a plurality of multiply/accumulate (MAC) units for performing multiplication and accumulation operations. Each sequencer determines whether a product of an element of a first input matrix to be multiplied with an element of a second input matrix has a value of zero (e.g., by determining whether the element of the first input matrix has a value of zero, or by determining whether either the element of the first input matrix or that of the second input matrix has a value of zero). If the product of the elements of the first input matrix and the second input matrix does not have a value of zero, the sequencer provides the elements to a MAC unit to perform a multiplication and accumulation operation.
    Type: Grant
    Filed: August 30, 2018
    Date of Patent: July 28, 2020
    Assignee: Qualcomm Incorporated
    Inventors: Mattheus Cornelis Antonius Adrianus Heddes, Robert Dreyer, Colin Beaton Verrilli, Natarajan Vaidhyanathan, Koustav Bhattacharya
  • Publication number: 20200073830
    Abstract: A method, apparatus, and system for an architecture for machine learning acceleration is presented. An apparatus includes a plurality of processing elements, each including a tightly-coupled memory, and a memory system coupled to the processing elements. A global synchronization manager is coupled to the plurality of the processing elements and to the memory system. The processing elements do not implement a coherency protocol with respect to the memory system. The processing elements implement direct memory access with respect to the memory system, and the global synchronization manager is configured to synchronize operations of the plurality of processing elements through the TCMs.
    Type: Application
    Filed: August 29, 2019
    Publication date: March 5, 2020
    Inventors: Colin Beaton VERRILLI, Natarajan VAIDHYANATHAN, Rexford Alan HILL
  • Patent number: 10541921
    Abstract: Embodiments provide a TCAM-based access control list that supports disjunction operations in rules. A network frame is received. Embodiments determine set TCP flags of the network frame. Upon determining that the set TCP flags match a first entry in a numeric range table, bits of a search key corresponding to the first entry are updated. The search key accesses a second entry stored in a TCAM. The first entry further comprises an encode field to scan a TCP header of the network frame for set TCP flags, a first mask field to a condition corresponding to unset TCP flags to identify in the network frame, a second mask field to a condition corresponding to set TCP flags to identify in the network frame, and an operation field specifying a disjunction operation for comparing the set TCP flags with the first mask field and the second mask field.
    Type: Grant
    Filed: November 21, 2017
    Date of Patent: January 21, 2020
    Assignee: International Business Machines Corporation
    Inventors: Claude Basso, Joseph A. Kirscht, Natarajan Vaidhyanathan
  • Patent number: 10503661
    Abstract: Providing memory bandwidth compression using compressed memory controllers (CMCs) in a central processing unit (CPU)-based system is disclosed. In this regard, in some aspects, a CMC is configured to receive a memory read request to a physical address in a system memory, and read a compression indicator (CI) for the physical address from a master directory and/or from error correcting code (ECC) bits of the physical address. Based on the CI, the CMC determines a number of memory blocks to be read for the memory read request, and reads the determined number of memory blocks. In some aspects, a CMC is configured to receive a memory write request to a physical address in the system memory, and generate a CI for write data based on a compression pattern of the write data. The CMC updates the master directory and/or the ECC bits of the physical address with the generated CI.
    Type: Grant
    Filed: May 20, 2015
    Date of Patent: December 10, 2019
    Assignee: QUALCOMM Incorporated
    Inventors: Mattheus Cornelis Antonius Adrianus Heddes, Natarajan Vaidhyanathan, Colin Beaton Verrilli
  • Patent number: 10467092
    Abstract: Providing space-efficient storage for dynamic random access memory (DRAM) cache tags is provided. In one aspect, a DRAM cache management circuit provides a plurality of cache entries, each of which contains a tag storage region, a data storage region, and an error protection region. The DRAM cache management circuit is configured to store data to be cached in the data storage region of each cache entry. The DRAM cache management circuit is also configured to use an error detection code (EDC) instead of an error correcting code (ECC), and to store a tag and the EDC for each cache entry in the error protection region of the cache entry. In this manner, the capacity of a DRAM cache can be increased by avoiding the need for the tag storage region for each cache entry, while still providing error detection for the cache entry.
    Type: Grant
    Filed: March 30, 2016
    Date of Patent: November 5, 2019
    Assignee: QUALCOMM Incorporated
    Inventors: Natarajan Vaidhyanathan, Mattheus Cornelis Antonius Adrianus Heddes, Colin Beaton Verrilli
  • Patent number: 10419267
    Abstract: Techniques are disclosed for notifying network control software of new and moved source MAC addresses. In one embodiment, a switch detects packets sent by a new or migrated virtual machine, and sends a copy of a detected packet to the network control software as a notification. The switch further learns the source MAC address, thereby permitting the entry to be used for normal forwarding prior to validation of the entry and the VM associated therewith by the network control software. Until the network control software has validated the VM, the switch may periodically retry the notification to the network control software. “No_Redirect” and “Not_Validated” flags may be used to indicate whether a notification has already been attempted and thus no retry is necessary, and that the VM associated with the VM has not yet been validated, respectively.
    Type: Grant
    Filed: January 22, 2014
    Date of Patent: September 17, 2019
    Assignee: LENOVO ENTERPRISE SOLUTIONS (SINGAPORE) PTE. LTD.
    Inventors: Claude Basso, Josep Cors, Venkatesh K. Janakiraman, Sze-Wa Lao, Sameer M. Shah, David A. Shedivy, Ethan M. Spiegel, Natarajan Vaidhyanathan, Colin B. Verrilli
  • Patent number: 10236917
    Abstract: Providing memory bandwidth compression in chipkill-correct memory architectures is disclosed. In this regard, a compressed memory controller (CMC) introduces a specified error pattern into chipkill-correct error correcting code (ECC) bits to indicate compressed data. To encode data, the CMC applies a compression algorithm to an uncompressed data block to generate a compressed data block. The CMC then generates ECC data for the compressed data block (i.e., an “inner” ECC segment), appends the inner ECC segment to the compressed data block, and generates ECC data for the compressed data block and the inner ECC segment (i.e., an “outer” ECC segment). The CMC then intentionally inverts a specified plurality of bytes of the outer ECC segment (e.g., in portions of the outer ECC segment stored in different physical memory chips by a chipkill-correct ECC mechanism). The outer ECC segment is then appended to the compressed data block and the inner ECC segment.
    Type: Grant
    Filed: September 15, 2016
    Date of Patent: March 19, 2019
    Assignee: QUALCOMM Incorporated
    Inventors: Natarajan Vaidhyanathan, Luther James Blackwood, Mattheus Cornelis Antonius Adrianus Heddes, Michael Raymond Trombley, Colin Beaton Verrilli