Patents by Inventor Karthikeyan Avudaiyappan

Karthikeyan Avudaiyappan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11853717
    Abstract: Embodiments of the present disclosure include systems and methods for accelerating processing based on sparsity for neural network hardware processors. An input manager determines a pair of non-zero values from a pair of data streams in a plurality of pairs of data streams and retrieves the pair of non-zero values from the pair of data streams. A multiplier performs a multiplication operation on the pair of non-zero values and generates a product of the pair of non-zero values. An accumulator manager receives the product of the pair of non-zero values from the multiplier and sends the product of the pair of non-zero values to a corresponding accumulator in a plurality of accumulators.
    Type: Grant
    Filed: January 14, 2021
    Date of Patent: December 26, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Karthikeyan Avudaiyappan, Jeffrey Andrews
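
The granted claim above describes hardware, but its control flow is easy to model in software. Below is a minimal Python sketch of the sparsity-skipping idea, assuming one accumulator per stream pair; all names are illustrative, not taken from the patent.

    # Skip multiplies whenever either operand in a stream pair is zero;
    # only non-zero products are routed to an accumulator.
    def sparse_mac(stream_pairs, num_accumulators):
        """stream_pairs: list of (a_stream, b_stream) equal-length lists."""
        accumulators = [0.0] * num_accumulators
        for pair_idx, (a_stream, b_stream) in enumerate(stream_pairs):
            for a, b in zip(a_stream, b_stream):
                if a == 0 or b == 0:          # sparsity: no work for zeros
                    continue
                accumulators[pair_idx % num_accumulators] += a * b
        return accumulators

    print(sparse_mac([([0, 2, 3], [4, 0, 5])], 1))  # -> [15.0]
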
  • Publication number: 20230343374
    Abstract: Embodiments of the present disclosure include techniques for storing and retrieving data. In one embodiment, sub-matrices of data are stored as row slices and column slices. A fetch circuit determines if particular slices of one sub-matrix, when combined with corresponding slices of another sub-matrix, produce a zero result and need not be retrieved. In another embodiment, the present disclosure includes a memory circuit comprising memory banks and sub-banks. The sub-banks store slices of sub-matrices. A request moves between serially configured memory banks, and slices in different sub-banks may be retrieved at the same time.
    Type: Application
    Filed: April 26, 2022
    Publication date: October 26, 2023
    Inventors: Karthikeyan Avudaiyappan, Jeffrey A. Andrews
  • Publication number: 20230342291
    Abstract: Embodiments of the present disclosure include techniques for storing and retrieving data. In one embodiment, sub-matrices of data are stored as row slices and column slices. A fetch circuit determines if particular slices of one sub-matrix, when combined with corresponding slices of another sub-matrix, produce a zero result and need not be retrieved. In another embodiment, the present disclosure includes a memory circuit comprising memory banks and sub-banks. The sub-banks store slices of sub-matrices. A request moves between serially configured memory banks, and slices in different sub-banks may be retrieved at the same time.
    Type: Application
    Filed: April 26, 2022
    Publication date: October 26, 2023
    Inventors: Karthikeyan Avudaiyappan, Jeffrey A. Andrews
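
The two publications above share one abstract, so a single sketch suffices. Here is a rough Python model of the zero-slice fetch skip, under the assumption that a row slice of one sub-matrix pairs with a column slice of the other; function and variable names are invented.

    # If a row slice of A or the matching column slice of B is all zero,
    # their product is zero, so neither slice needs to be fetched.
    def dot_with_slice_skipping(a_rows, b_cols):
        fetched = 0
        result = [[0] * len(b_cols) for _ in a_rows]
        for i, row in enumerate(a_rows):
            for j, col in enumerate(b_cols):
                if not any(row) or not any(col):   # zero slice detected
                    continue                        # skip the fetch entirely
                fetched += 2                        # one fetch per slice
                result[i][j] = sum(x * y for x, y in zip(row, col))
        return result, fetched

    print(dot_with_slice_skipping([[1, 2], [0, 0]], [[3, 4], [0, 0]]))
    # -> ([[11, 0], [0, 0]], 2): only two slice fetches were needed
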
  • Publication number: 20230244448
    Abstract: Embodiments of the present disclosure include a multiply-accumulator (MAC) array circuit comprising an activation cache and a plurality of multiply-accumulator (MA) groups. The activation cache comprises cache lines configured to store sub-slices of an input activation array. The cache lines are coupled to particular MA groups. Activations stored in the cache lines may be used and reused across multiple MA groups.
    Type: Application
    Filed: February 1, 2022
    Publication date: August 3, 2023
    Inventors: Karthikeyan Avudaiyappan, Jeffrey A. Andrews
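
A small Python sketch of the reuse pattern described above, assuming each cache line is filled once on first touch and then read by several MA groups; the class and its interface are illustrative only.

    # Activation slices are loaded into a cache line once and reused by
    # every MAC group that needs them, instead of re-fetching per group.
    class ActivationCache:
        def __init__(self):
            self.lines = {}
            self.loads = 0

        def get(self, slice_id, load_fn):
            if slice_id not in self.lines:
                self.lines[slice_id] = load_fn(slice_id)  # fill on first use
                self.loads += 1
            return self.lines[slice_id]

    activations = {0: [1, 2]}
    cache = ActivationCache()
    weights = [[1, 1], [2, 2], [3, 3]]      # three MA groups share slice 0
    outputs = [sum(w * a for w, a in zip(wg, cache.get(0, activations.get)))
               for wg in weights]
    print(outputs, cache.loads)             # [3, 6, 9] 1 -> one load, reused
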
  • Publication number: 20230214185
    Abstract: Embodiments of the present disclosure include a multipurpose multiply-accumulator (MAC) array circuit comprising one or more input memories for receiving operands and a plurality of multiply-accumulator circuits each selectively coupled to the one or more input memories to receive at least a pair of operands and generate a result. Each of the plurality of multiply-accumulator circuits receives operands from the one or more input memories independently. Additionally, selection of operands from the one or more input memories is controlled based on at least an operation and/or data types, where different operation and/or data types configure the plurality of multiply-accumulator circuits to receive different pairs of operands from the one or more input memories to execute particular operation types.
    Type: Application
    Filed: December 28, 2021
    Publication date: July 6, 2023
    Inventors: Karthikeyan Avudaiyappan, Jeffrey A. Andrews
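
The operand-routing idea lends itself to a short sketch. The two routing rules below (one for a matrix-style operation, one elementwise) are invented for illustration; the publication itself does not specify them.

    # Operand selection is driven by the operation type: different types
    # steer different operand pairs from the input memories to each unit.
    def select_operands(op_type, mem_a, mem_b, unit_idx):
        if op_type == "matmul":        # all units share mem_a[0], stride mem_b
            return mem_a[0], mem_b[unit_idx]
        if op_type == "elementwise":   # unit i pairs mem_a[i] with mem_b[i]
            return mem_a[unit_idx], mem_b[unit_idx]
        raise ValueError(op_type)

    mem_a, mem_b = [2, 3, 4], [10, 20, 30]
    for op in ("matmul", "elementwise"):
        pairs = (select_operands(op, mem_a, mem_b, i) for i in range(3))
        print(op, [a * b for a, b in pairs])
    # matmul [20, 40, 60]; elementwise [20, 60, 120]
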
  • Publication number: 20220222043
    Abstract: Embodiments of the present disclosure include systems and methods for accelerating processing based on sparsity for neural network hardware processors. An input manager determines a pair of non-zero values from a pair of data streams in a plurality of pairs of data streams and retrieves the pair of non-zero values from the pair of data streams. A multiplier performs a multiplication operation on the pair of non-zero values and generates a product of the pair of non-zero values. An accumulator manager receives the product of the pair of non-zero values from the multiplier and sends the product of the pair of non-zero values to a corresponding accumulator in a plurality of accumulators.
    Type: Application
    Filed: January 14, 2021
    Publication date: July 14, 2022
    Inventors: Karthikeyan Avudaiyappan, Jeffrey Andrews
  • Publication number: 20220215234
    Abstract: Embodiments of the present disclosure include systems and methods for determining schedules for processing neural networks on hardware. A set of instructions for processing data through a neural network is received. Based on a hardware definition specifying a set of hardware units and the functions that each hardware unit in the set is configured to perform, a schedule of a set of operations to be performed by a subset of the hardware units to implement the set of instructions is determined. The schedule of the set of operations is distributed to the subset of the hardware units.
    Type: Application
    Filed: January 7, 2021
    Publication date: July 7, 2022
    Inventors: Karthikeyan Avudaiyappan, Jeffrey Andrews
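
A toy Python version of the scheduling step described above: a hardware definition maps each unit to the functions it can perform, and each instruction is assigned to a capable unit. The greedy least-loaded policy is an assumption made for this sketch, not something the publication specifies.

    def schedule(instructions, hardware_def):
        busy = {unit: 0 for unit in hardware_def}     # per-unit queue depth
        plan = []
        for op in instructions:
            capable = [u for u, fns in hardware_def.items() if op in fns]
            unit = min(capable, key=busy.get)         # least-loaded capable unit
            busy[unit] += 1
            plan.append((op, unit))
        return plan

    hw = {"mac0": {"matmul", "conv"}, "mac1": {"matmul"}, "vec0": {"relu"}}
    print(schedule(["conv", "matmul", "relu", "matmul"], hw))
    # [('conv', 'mac0'), ('matmul', 'mac1'), ('relu', 'vec0'), ('matmul', 'mac0')]
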
  • Patent number: 11314647
    Abstract: Methods and systems for managing synonyms in VIPT caches are disclosed. A method includes tracking lines of a copied cache using a directory, examining a specified bit of a virtual address that is associated with a load request and determining its status, and making an entry in one of a plurality of parts of the directory based on the status of the examined bit. The method further includes updating one of, and invalidating the other of, a cache line that is associated with the virtual address that is stored at a first index of the copied cache, and a cache line that is associated with a synonym of the virtual address that is stored at a second index of the copied cache, upon receiving a request to update a physical address associated with the virtual address.
    Type: Grant
    Filed: December 23, 2019
    Date of Patent: April 26, 2022
    Assignee: Intel Corporation
    Inventor: Karthikeyan Avudaiyappan
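
A rough Python model of the synonym rule may help. It assumes bit 12 is the aliasing-prone virtual-address bit and collapses many details (tags, sets, ways) into dictionaries; the structure of the update-one/invalidate-the-other step, not the hardware, is the point.

    SYNONYM_BIT = 12                    # assumed position of the aliasing bit

    directory = {0: {}, 1: {}}          # part -> {phys_addr: cache index}
    cache = {}                          # cache index -> (phys_addr, data)

    def on_load(vaddr, paddr, data):
        part = (vaddr >> SYNONYM_BIT) & 1
        index = vaddr & ((1 << (SYNONYM_BIT + 1)) - 1)   # VA-derived index
        directory[part][paddr] = index
        cache[index] = (paddr, data)

    def on_store(paddr, data):
        # update the copy tracked in one part, invalidate its synonym
        if paddr in directory[0]:
            cache[directory[0][paddr]] = (paddr, data)
        stale = directory[1].pop(paddr, None)
        if stale is not None:
            cache.pop(stale, None)

    on_load(0x0234, 0xAAAA, "v1")       # synonym with bit 12 clear
    on_load(0x1234, 0xAAAA, "v1")       # synonym with bit 12 set
    on_store(0xAAAA, "v2")
    print(cache)                        # only the updated copy remains
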
  • Patent number: 10884739
    Abstract: Systems and methods for load canceling in a processor that is connected to an external interconnect fabric are disclosed. As a part of a method for load canceling in a processor that is connected to an external bus, and responsive to a flush request and a corresponding cancellation of pending speculative loads from a load queue, the type of one or more of the pending speculative loads that are positioned in the instruction pipeline external to the processor is converted from load to prefetch. Data corresponding to one or more of the pending speculative loads that are positioned in the instruction pipeline external to the processor is accessed and returned to cache as prefetch data. The prefetch data is retired in a cache location of the processor.
    Type: Grant
    Filed: May 24, 2018
    Date of Patent: January 5, 2021
    Assignee: Intel Corporation
    Inventors: Karthikeyan Avudaiyappan, Mohammad Abdallah
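
The conversion step is simple to express in Python-flavored pseudocode; the request representation below is invented for the sketch.

    # On a flush, speculative loads already out on the external fabric are
    # retagged as prefetches: the returning data still warms the cache, but
    # no architectural state is written.
    def flush_pending_loads(pending):
        for req in pending:
            if req["type"] == "load":
                req["type"] = "prefetch"
        return pending

    def on_data_return(req, cache):
        cache[req["addr"]] = "filled"    # retire into cache only
        return cache

    pending = [{"addr": 0x100, "type": "load"}]
    print(on_data_return(flush_pending_loads(pending)[0], {}))
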
  • Patent number: 10698833
    Abstract: A method for supporting a plurality of requests for access to a data cache memory (“cache”) is disclosed. The method comprises accessing a first set of requests to access the cache, wherein the cache comprises a plurality of blocks. Further, responsive to the first set of requests to access the cache, the method comprises accessing a tag memory that maintains a plurality of copies of tags for each entry in the cache and identifying tags that correspond to individual requests of the first set. The method also comprises performing arbitration in a same clock cycle as the accessing and identifying of tags, wherein the arbitration comprises: (a) identifying a second set of requests to access the cache from the first set, wherein the second set accesses a same block within the cache; and (b) selecting each request from the second set to receive data from the same block.
    Type: Grant
    Filed: January 26, 2018
    Date of Patent: June 30, 2020
    Assignee: Intel Corporation
    Inventors: Karthikeyan Avudaiyappan, Sourabh Alurkar
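
The arbitration rule reduces to grouping: requests whose tags resolve to the same block are granted together, so one block read serves them all. A minimal sketch, with the parallel tag lookups abstracted to a dictionary:

    from collections import defaultdict

    def arbitrate(requests, tag_to_block):
        groups = defaultdict(list)
        for req in requests:
            groups[tag_to_block[req]].append(req)   # per-request tag copies
        return groups

    tags = {"rA": "blk0", "rB": "blk0", "rC": "blk1"}
    print(dict(arbitrate(["rA", "rB", "rC"], tags)))
    # {'blk0': ['rA', 'rB'], 'blk1': ['rC']} -> rA and rB share one read
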
  • Patent number: 10585804
    Abstract: Systems and methods for non-blocking implementation of cache flush instructions are disclosed. As a part of a method, data received in a write-back data holding buffer from a cache flushing operation is accessed, the data is flagged with a processor identifier and a serialization flag, and, responsive to the flagging, the cache is notified that the cache flush is completed. Subsequent to the notifying, access is provided to the data then present in the write-back data holding buffer to determine whether it is flagged.
    Type: Grant
    Filed: November 7, 2017
    Date of Patent: March 10, 2020
    Assignee: Intel Corporation
    Inventors: Karthikeyan Avudaiyappan, Mohammad Abdallah
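
A compact sketch of the non-blocking behavior, with an invented buffer class: the flush reports completion as soon as the data is parked and flagged, and a later serializing operation need only scan for still-flagged entries.

    class WriteBackBuffer:
        def __init__(self):
            self.entries = []

        def accept_flush(self, data, cpu_id):
            self.entries.append({"data": data, "cpu": cpu_id, "serial": True})
            return "flush-complete"          # cache is notified immediately

        def pending_for(self, cpu_id):
            return [e for e in self.entries
                    if e["cpu"] == cpu_id and e["serial"]]

    buf = WriteBackBuffer()
    print(buf.accept_flush("dirty-line-0", cpu_id=0))
    print(len(buf.pending_for(0)))           # 1 flagged entry still draining
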
  • Patent number: 10565113
    Abstract: Methods and systems for managing synonyms in VIPT caches are disclosed. A method includes tracking lines of a copied cache using a directory, examining a specified bit of a virtual address that is associated with a load request and determining its status, and making an entry in one of a plurality of parts of the directory based on the status of the examined bit. The method further includes updating one of, and invalidating the other of, a cache line that is associated with the virtual address that is stored at a first index of the copied cache, and a cache line that is associated with a synonym of the virtual address that is stored at a second index of the copied cache, upon receiving a request to update a physical address associated with the virtual address.
    Type: Grant
    Filed: August 13, 2015
    Date of Patent: February 18, 2020
    Assignee: Intel Corporation
    Inventor: Karthikeyan Avudaiyappan
  • Patent number: 10552334
    Abstract: A method and system acquire cache line data associated with a load from respective hierarchical cache data storage components. As a part of the method and system, a store queue is accessed for one or more portions of a cache line associated with the load, and, if the one or more portions of the cache line are held in the store queue, they are stored in a load queue location associated with the load. The load is completed if the one or more portions of the cache line stored in the load queue location include all portions of the cache line associated with the load.
    Type: Grant
    Filed: March 24, 2017
    Date of Patent: February 4, 2020
    Assignee: Intel Corporation
    Inventors: Karthikeyan Avudaiyappan, Paul G. Chan
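
In Python terms, with byte offsets standing in for cache-line portions (a representation chosen for the sketch, not taken from the patent):

    # Forward whatever bytes of the line the store queue holds into the
    # load-queue entry; the load completes only when the line is whole.
    def try_complete_load(bytes_needed, store_queue, load_entry):
        for offset, value in store_queue.items():
            if offset in bytes_needed:
                load_entry[offset] = value
        return set(load_entry) == bytes_needed

    sq = {0: 0xAA, 1: 0xBB}
    print(try_complete_load({0, 1}, sq, {}))      # True: full line forwarded
    print(try_complete_load({0, 1, 2}, sq, {}))   # False: byte 2 missing
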
  • Patent number: 10402322
    Abstract: Methods for read-after-write forwarding using a virtual address are disclosed. A method includes determining when a virtual address has been remapped from corresponding to a first physical address to a second physical address and determining if all stores occupying a store queue before the remapping have been retired from the store queue. Loads that are younger than the stores that occupied the store queue before the remapping are prevented from being dispatched and executed until the stores that occupied the store queue before the remapping have left the store queue and become globally visible.
    Type: Grant
    Filed: August 25, 2017
    Date of Patent: September 3, 2019
    Assignee: Intel Corporation
    Inventors: Karthikeyan Avudaiyappan, Paul Chan
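
The safety rule can be stated in a few lines. Ages are used as a stand-in for program order, an assumption made for this sketch:

    # A load younger than the remap may not dispatch while any store issued
    # under the old mapping is still in the store queue.
    def may_dispatch_load(load_age, store_queue_ages, remap_age):
        old_mapping_stores = [s for s in store_queue_ages if s < remap_age]
        return not (load_age > remap_age and old_mapping_stores)

    print(may_dispatch_load(load_age=5, store_queue_ages=[1, 2], remap_age=3))
    # False: pre-remap stores must retire and become globally visible first
    print(may_dispatch_load(load_age=5, store_queue_ages=[4], remap_age=3))
    # True: no stores from the old mapping remain
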
  • Patent number: 10346302
    Abstract: A method for maintaining the coherency of a store coalescing cache and a load cache is disclosed. As a part of the method, responsive to a write-back of an entry from a level one store coalescing cache to a level two cache, the entry is written into the level two cache and into the level one load cache. The writing of the entry into the level two cache and into the level one load cache is executed at the speed of access of the level two cache.
    Type: Grant
    Filed: July 19, 2017
    Date of Patent: July 9, 2019
    Assignee: Intel Corporation
    Inventors: Karthikeyan Avudaiyappan, Mohammad Abdallah
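
The invariant fits in one function: the write-back lands in both places as a single operation, so the load cache can never hold a stale copy of a coalesced line. A sketch with dictionaries standing in for the caches:

    def write_back(addr, data, l2, l1_load):
        l2[addr] = data          # write-back target
        l1_load[addr] = data     # kept coherent in the same operation
        return l2, l1_load

    l2, l1_load = write_back(0x40, "coalesced-line", {}, {})
    print(l2 == l1_load)         # True: both caches hold the identical line
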
  • Patent number: 10310987
    Abstract: Systems and methods for accessing a unified translation lookaside buffer (TLB) are disclosed. A method includes receiving an indicator of a level one translation lookaside buffer (L1TLB) miss corresponding to a request for a virtual address to physical address translation, searching a cache that includes virtual addresses and page sizes that correspond to translation table entries (TTEs) that have been evicted from the L1TLB, where a page size is identified, and searching a second level TLB and identifying a physical address that is contained in the second level TLB. Access is provided to the identified physical address.
    Type: Grant
    Filed: August 15, 2017
    Date of Patent: June 4, 2019
    Assignee: Intel Corporation
    Inventors: Karthikeyan Avudaiyappan, Mohammad Abdallah
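
A sketch of the two-step lookup. The tag of a second-level TLB entry depends on the page size, so the evicted-entry cache supplies that size first; the bit widths and the 4 KiB default below are assumptions for illustration.

    def l2_tlb_lookup(vaddr, evicted_cache, l2_tlb):
        page_size = evicted_cache.get(vaddr >> 21, 4096)   # assumed default
        vpn = vaddr // page_size                           # size-dependent tag
        return l2_tlb.get((vpn, page_size))

    evicted = {1: 2 * 1024 * 1024}                 # a 2 MiB page was evicted
    l2 = {(1, 2 * 1024 * 1024): 0xBEEF000}
    print(hex(l2_tlb_lookup(0x3A0000, evicted, l2)))   # 0xbeef000
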
  • Patent number: 10296432
    Abstract: Methods for invasive debug of a processor without processor execution of instructions are disclosed. As a part of a method, a memory mapped I/O of the processor is accessed using a debug bus and an operation is initiated that causes a debug port to gain access to registers of the processor using the memory mapped I/O. The invasive debug of the processor is executed from the debug port via registers of the processor.
    Type: Grant
    Filed: May 3, 2017
    Date of Patent: May 21, 2019
    Assignee: Intel Corporation
    Inventors: Karthikeyan Avudaiyappan, Brian McGee
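
The access path can be mimicked with a dictionary standing in for the memory-mapped window; the base address and register stride below are invented for the sketch.

    MMIO_REG_BASE = 0xF0000000          # hypothetical debug MMIO window

    class DebugPort:
        def __init__(self, mmio):
            self.mmio = mmio            # dict standing in for the MMIO bus

        def read_reg(self, n):          # no instructions run on the core
            return self.mmio[MMIO_REG_BASE + 8 * n]

        def write_reg(self, n, value):
            self.mmio[MMIO_REG_BASE + 8 * n] = value

    port = DebugPort({MMIO_REG_BASE + 8: 0x2A})
    print(port.read_reg(1))             # 42, read without core execution
    port.write_reg(1, 0x2B)             # invasive write, same path
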
  • Patent number: 10255187
    Abstract: A method for weak stream software data and instruction prefetching using a hardware data prefetcher is disclosed. The method includes determining, using a hardware data prefetcher, whether software includes software prefetch instructions, and accessing the software prefetch instructions if it does. Using the hardware data prefetcher, weak stream software data and instruction prefetching operations are executed based on the software prefetch instructions, free of training operations.
    Type: Grant
    Filed: May 3, 2016
    Date of Patent: April 9, 2019
    Assignee: Intel Corporation
    Inventors: Karthikeyan Avudaiyappan, Mohammad Abdallah
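
The contrast with a trained prefetcher is the point: the software hints drive issue directly. A sketch, with the stride and lookahead depth chosen arbitrarily:

    # Each software hint seeds a weak stream immediately: no training on
    # observed misses, just a fixed lookahead from the hinted address.
    def issue_prefetches(hints, stride, depth):
        return [addr + i * stride for addr in hints for i in range(1, depth + 1)]

    print([hex(a) for a in issue_prefetches([0x1000], stride=64, depth=3)])
    # ['0x1040', '0x1080', '0x10c0']
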
  • Patent number: 10248570
    Abstract: A method for predicting a way of a set associative shadow cache is disclosed. As a part of a method, a request to fetch a first far taken branch instruction of a first cache line from an instruction cache is received, and responsive to a hit in the instruction cache, a predicted way is selected from a way array using a way that corresponds to the hit in the instruction cache. A second cache line is selected from a shadow cache using the predicted way and the first cache line and the second cache line are forwarded in the same clock cycle.
    Type: Grant
    Filed: January 4, 2018
    Date of Patent: April 2, 2019
    Assignee: Intel Corporation
    Inventors: Mohammad Abdallah, Ravishankar Rao, Karthikeyan Avudaiyappan
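
A dictionary-level sketch of the prediction path; indexing the way array by the hitting way alone is a simplification assumed for this sketch.

    def fetch_pair(line_idx, icache_ways, way_array, shadow_cache):
        hit_way = icache_ways[line_idx]          # way that hit for line 1
        predicted_way = way_array[hit_way]       # predicted shadow-cache way
        return line_idx, shadow_cache[predicted_way]   # both, same cycle

    icache_ways = {0x10: 2}                      # line 0x10 hits in way 2
    way_array = {2: 1}                           # way 2 predicts shadow way 1
    shadow_cache = {1: 0x11}                     # shadow way 1 holds next line
    print(fetch_pair(0x10, icache_ways, way_array, shadow_cache))  # (16, 17)
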
  • Patent number: 10210101
    Abstract: Systems and methods for flushing a cache with modified data are disclosed. Responsive to a request to flush data from a cache with modified data to a next level cache that does not include the cache with modified data, the cache with modified data is accessed using an index and a way and an address associated with the index and the way is secured. Using the address, the cache with modified data is accessed a second time and an entry that is associated with the address is retrieved from the cache with modified data. The entry is placed into a location of the next level cache.
    Type: Grant
    Filed: November 27, 2017
    Date of Patent: February 19, 2019
    Assignee: Intel Corporation
    Inventors: Karthikeyan Avudaiyappan, Mohammad Abdallah
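
The two-pass structure is the essence: one access by index and way secures the address, and a second access by that address retrieves the entry. A minimal sketch with invented container names:

    def flush_line(tags, data, index, way, next_level):
        addr = tags[(index, way)]       # pass 1: secure the address
        entry = data[addr]              # pass 2: retrieve by address
        next_level[addr] = entry        # place into the next-level cache
        return next_level

    tags = {(3, 1): 0x80}
    data = {0x80: "modified-line"}
    print(flush_line(tags, data, 3, 1, {}))   # {128: 'modified-line'}
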