Patents by Inventor Shomit N. Das

Shomit N. Das has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11842199
    Abstract: An asynchronous pipeline includes a first stage and one or more second stages. A controller provides control signals to the first stage to indicate a modification to an operating speed of the first stage. The modification is determined based on a comparison of a completion status of the first stage to one or more completion statuses of the one or more second stages. In some cases, the controller provides control signals indicating modifications to an operating voltage applied to the first stage and a drive strength of a buffer in the first stage. Modules can be used to determine the completion statuses of the first stage and the one or more second stages based on the monitored output signals generated by the stages, output signals from replica critical paths associated with the stages, or a lookup table that indicates estimated completion times.
    Type: Grant
    Filed: June 26, 2020
    Date of Patent: December 12, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Greg Sadowski, John Kalamatianos, Shomit N. Das
  • Patent number: 11740791
    Abstract: In some embodiments, a memory controller in a processor includes a base value cache, a compressor, and a metadata cache. The compressor is coupled to the base value cache and the metadata cache. The compressor compresses a data block using at least a base value and delta values. The compressor determines whether the size of the data block exceeds a data block threshold value. Based on the determination of whether the size of the compressed data block generated by the compressor exceeds the data block threshold value, the memory controller transfers only a set of the compressed delta values to memory for storage. A decompressor located in the lower level cache of the processor decompresses the compressed data block using the base value stored in the base value cache, metadata stored in the metadata cache and the delta values stored in memory.
    Type: Grant
    Filed: October 8, 2021
    Date of Patent: August 29, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Seyed Mohammad Seyedzadehdelcheh, Xianwei Zhang, Bradford Beckmann, Shomit N. Das
  • Patent number: 11726546
    Abstract: Systems, methods, devices, and computer-implemented instructions for processor power management implemented in a compiler. In some implementations, a characteristic of code is determined. An instruction based on the determined characteristic is inserted into the code. The code and inserted instruction are compiled to generate compiled code. The compiled code is output.
    Type: Grant
    Filed: September 25, 2020
    Date of Patent: August 15, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Vedula Venkata Srikant Bharadwaj, Shomit N. Das, Anthony T. Gutierrez, Vignesh Adhinarayanan
  • Patent number: 11726837
    Abstract: In some examples, thermal aware optimization logic determines a characteristic (e.g., a workload or type) of a wavefront (e.g., multiple threads). For example, the characteristic indicates whether the wavefront is compute intensive, memory intensive, mixed, and/or another type of wavefront. The thermal aware optimization logic determines temperature information for one or more compute units (CUs) in one or more processing cores. The temperature information includes predictive thermal information indicating expected temperatures corresponding to the one or more CUs and historical thermal information indicating current or past thermal temperatures of at least a portion of a graphics processing unit (GPU). The logic selects the one or more compute units to process the plurality of threads based on the determined characteristic and the temperature information. The logic provides instructions to the selected subset of the plurality of CUs to execute the wavefront.
    Type: Grant
    Filed: November 4, 2021
    Date of Patent: August 15, 2023
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Karthik Rao, Shomit N. Das, Xudong An, Wei Huang
  • Publication number: 20230110376
    Abstract: Systems, apparatuses, and methods for implementing a multi-tiered approach to cache compression are disclosed. A cache includes a cache controller, light compressor, and heavy compressor. The decision on which compressor to use for compressing cache lines is made based on certain resource availability such as cache capacity or memory bandwidth. This allows the cache to opportunistically use complex algorithms for compression while limiting the adverse effects of high decompression latency on system performance. To address the above issue, the proposed design takes advantage of the heavy compressors for effectively reducing memory bandwidth in high bandwidth memory (HBM) interfaces as long as they do not sacrifice system performance. Accordingly, the cache combines light and heavy compressors with a decision-making unit to achieve reduced off-chip memory traffic without sacrificing system performance.
    Type: Application
    Filed: November 23, 2022
    Publication date: April 13, 2023
    Inventors: SeyedMohammad SeyedzadehDelcheh, Shomit N. Das, Bradford Michael Beckmann
  • Patent number: 11604738
    Abstract: A processing device is provided which includes memory comprising data cache memory configured to store compressed data and metadata cache memory configured to store metadata, each portion of metadata comprising an encoding used to compress a portion of data. The processing device also includes at least one processor configured to compress portions of data and select, based on one or more utility level metrics, portions of metadata to be stored in the metadata cache memory. The at least one processor is also configured to store, in the metadata cache memory, the portions of metadata selected to be stored in the metadata cache memory, store, in the data cache memory, each portion of compressed data having a selected portion of corresponding metadata stored in the metadata cache memory. Each portion of compressed data, having the selected portion of corresponding metadata stored in the metadata cache memory, is decompressed.
    Type: Grant
    Filed: September 28, 2018
    Date of Patent: March 14, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Shomit N. Das, Matthew Tomei, David A. Wood
  • Patent number: 11544196
    Abstract: Systems, apparatuses, and methods for implementing a multi-tiered approach to cache compression are disclosed. A cache includes a cache controller, light compressor, and heavy compressor. The decision on which compressor to use for compressing cache lines is made based on certain resource availability such as cache capacity or memory bandwidth. This allows the cache to opportunistically use complex algorithms for compression while limiting the adverse effects of high decompression latency on system performance. To address the above issue, the proposed design takes advantage of the heavy compressors for effectively reducing memory bandwidth in high bandwidth memory (HBM) interfaces as long as they do not sacrifice system performance. Accordingly, the cache combines light and heavy compressors with a decision-making unit to achieve reduced off-chip memory traffic without sacrificing system performance.
    Type: Grant
    Filed: December 23, 2019
    Date of Patent: January 3, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: SeyedMohammad SeyedzadehDelcheh, Shomit N. Das, Bradford Michael Beckmann
  • Patent number: 11362673
    Abstract: Entropy agnostic data encoding includes: receiving, by an encoder, input data including a bit string; generating a plurality of candidate codewords, including encoding the input data bit string with a plurality of binary vectors, wherein the plurality of binary vectors includes a set of deterministic biased binary vectors and a set of random binary vectors; selecting, in dependence upon a predefined criteria, one of the plurality of candidate codewords; and transmitting the selected candidate codeword to a decoder.
    Type: Grant
    Filed: November 4, 2020
    Date of Patent: June 14, 2022
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Seyedmohammad Seyedzadehdelcheh, Shomit N. Das
  • Publication number: 20220107849
    Abstract: In some examples, thermal aware optimization logic determines a characteristic (e.g., a workload or type) of a wavefront (e.g., multiple threads). For example, the characteristic indicates whether the wavefront is compute intensive, memory intensive, mixed, and/or another type of wavefront. The thermal aware optimization logic determines temperature information for one or more compute units (CUs) in one or more processing cores. The temperature information includes predictive thermal information indicating expected temperatures corresponding to the one or more CUs and historical thermal information indicating current or past thermal temperatures of at least a portion of a graphics processing unit (GPU). The logic selects the one or more compute units to process the plurality of threads based on the determined characteristic and the temperature information. The logic provides instructions to the selected subset of the plurality of CUs to execute the wavefront.
    Type: Application
    Filed: November 4, 2021
    Publication date: April 7, 2022
    Inventors: KARTHIK RAO, SHOMIT N. DAS, XUDONG AN, WEI HUANG
  • Publication number: 20220100257
    Abstract: Systems, methods, devices, and computer-implemented instructions for processor power management implemented in a compiler. In some implementations, a characteristic of code is determined. An instruction based on the determined characteristic is inserted into the code. The code and inserted instruction are compiled to generate compiled code. The compiled code is output.
    Type: Application
    Filed: September 25, 2020
    Publication date: March 31, 2022
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Vedula Venkata Srikant Bharadwaj, Shomit N. Das, Anthony T. Gutierrez, Vignesh Adhinarayanan
  • Publication number: 20220083233
    Abstract: In some embodiments, a memory controller in a processor includes a base value cache, a compressor, and a metadata cache. The compressor is coupled to the base value cache and the metadata cache. The compressor compresses a data block using at least a base value and delta values. The compressor determines whether the size of the data block exceeds a data block threshold value. Based on the determination of whether the size of the compressed data block generated by the compressor exceeds the data block threshold value, the memory controller transfers only a set of the compressed delta values to memory for storage. A decompressor located in the lower level cache of the processor decompresses the compressed data block using the base value stored in the base value cache, metadata stored in the metadata cache and the delta values stored in memory.
    Type: Application
    Filed: October 8, 2021
    Publication date: March 17, 2022
    Inventors: Seyed Mohammad SEYEDZADEHDELCHEH, Xianwei ZHANG, Bradford BECKMANN, Shomit N. DAS
  • Patent number: 11194634
    Abstract: In some examples, thermal aware optimization logic determines a characteristic (e.g., a workload or type) of a wavefront (e.g., multiple threads). For example, the characteristic indicates whether the wavefront is compute intensive, memory intensive, mixed, and/or another type of wavefront. The thermal aware optimization logic determines temperature information for one or more compute units (CUs) in one or more processing cores. The temperature information includes predictive thermal information indicating expected temperatures corresponding to the one or more CUs and historical thermal information indicating current or past thermal temperatures of at least a portion of a graphics processing unit (GPU). The logic selects the one or more compute units to process the plurality of threads based on the determined characteristic and the temperature information. The logic provides instructions to the selected subset of the plurality of CUs to execute the wavefront.
    Type: Grant
    Filed: December 14, 2018
    Date of Patent: December 7, 2021
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Karthik Rao, Shomit N. Das, Xudong An, Wei Huang
  • Patent number: 11144208
    Abstract: In some embodiments, a memory controller in a processor includes a base value cache, a compressor, and a metadata cache. The compressor is coupled to the base value cache and the metadata cache. The compressor compresses a data block using at least a base value and delta values. The compressor determines whether the size of the data block exceeds a data block threshold value. Based on the determination of whether the size of the compressed data block generated by the compressor exceeds the data block threshold value, the memory controller transfers only a set of the compressed delta values to memory for storage. A decompressor located in the lower level cache of the processor decompresses the compressed data block using the base value stored in the base value cache, metadata stored in the metadata cache and the delta values stored in memory.
    Type: Grant
    Filed: December 23, 2019
    Date of Patent: October 12, 2021
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: SeyedMohammad Seyedzadehdelcheh, Xianwei Zhang, Bradford Beckmann, Shomit N. Das
  • Patent number: 11119665
    Abstract: A processing system scales power to memory and memory channels based on identifying causes of stalls of threads of a wavefront. If the cause is other than an outstanding memory request, the processing system throttles power to the memory to save power. If the stall is due to memory stalls for a subset of the memory channels servicing memory access requests for threads of a wavefront, the processing system adjusts power of the memory channels servicing memory access request for the wavefront based on the subset. By boosting power to the subset of channels, the processing system enables the wavefront to complete processing more quickly, resulting in increased processing speed. Conversely, by throttling power to the remainder of channels, the processing system saves power without affecting processing speed.
    Type: Grant
    Filed: December 6, 2018
    Date of Patent: September 14, 2021
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Shomit N. Das, Kishore Punniyamurthy
  • Patent number: 11061429
    Abstract: A technique for fine-granularity speed binning for a processing device is provided. The processing device includes a plurality of clock domains, each of which may be clocked with independent clock signals. The clock frequency at which a particular clock domain may operate is determined based on the longest propagation delay between clocked elements in that particular clock domain. The processing device includes measurement circuits for each clock domain that measure such propagation delay. The measurement circuits are replica propagation delay paths of actual circuit elements within each particular clock domain. A speed bin for each clock domain is determined based on the propagation delay measured for the measurement circuits for a particular clock domain. Specifically, a speed bin is chosen that is associated with the fastest clock speed whose clock period is longer than the slowest propagation delay measured for the measurement circuit for the clock domain.
    Type: Grant
    Filed: October 26, 2017
    Date of Patent: July 13, 2021
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Greg Sadowski, Shomit N. Das
  • Publication number: 20210191869
    Abstract: Systems, apparatuses, and methods for implementing a multi-tiered approach to cache compression are disclosed. A cache includes a cache controller, light compressor, and heavy compressor. The decision on which compressor to use for compressing cache lines is made based on certain resource availability such as cache capacity or memory bandwidth. This allows the cache to opportunistically use complex algorithms for compression while limiting the adverse effects of high decompression latency on system performance. To address the above issue, the proposed design takes advantage of the heavy compressors for effectively reducing memory bandwidth in high bandwidth memory (HBM) interfaces as long as they do not sacrifice system performance. Accordingly, the cache combines light and heavy compressors with a decision-making unit to achieve reduced off-chip memory traffic without sacrificing system performance.
    Type: Application
    Filed: December 23, 2019
    Publication date: June 24, 2021
    Inventors: SeyedMohammad SeyedzadehDelcheh, Shomit N. Das, Bradford Michael Beckmann
  • Publication number: 20210191620
    Abstract: In some embodiments, a memory controller in a processor includes a base value cache, a compressor, and a metadata cache. The compressor is coupled to the base value cache and the metadata cache. The compressor compresses a data block using at least a base value and delta values. The compressor determines whether the size of the data block exceeds a data block threshold value. Based on the determination of whether the size of the compressed data block generated by the compressor exceeds the data block threshold value, the memory controller transfers only a set of the compressed delta values to memory for storage. A decompressor located in the lower level cache of the processor decompresses the compressed data block using the base value stored in the base value cache, metadata stored in the metadata cache and the delta values stored in memory.
    Type: Application
    Filed: December 23, 2019
    Publication date: June 24, 2021
    Inventors: SeyedMohammad SEYEDZADEHDELCHEH, Xianwei ZHANG, Bradford BECKMANN, Shomit N. DAS
  • Publication number: 20210191770
    Abstract: A processing unit preemptively cools selected compute units prior to initiating execution of a wavefront at the selected compute units. A scheduler of the processing unit identifies that a wavefront is to be executed at a selected subset of compute units of the processing unit. In response, the processing unit's temperature control subsystem activates one or more cooling elements to reduce the temperature of the subset of compute units, prior to the scheduler initiating execution of the wavefront. By preemptively cooling the compute units, the temperature control subsystem increases the difference between the initial temperature of the compute units and a thermal throttling threshold that triggers performance-impacting temperature control measures, such as the reduction of a compute unit clock frequency.
    Type: Application
    Filed: December 18, 2019
    Publication date: June 24, 2021
    Inventors: Karthik RAO, Shomit N. DAS, Manish ARORA
  • Publication number: 20210157485
    Abstract: Systems, methods, and devices for performing pattern-based cache block compression and decompression. An uncompressed cache block is input to the compressor. Byte values are identified within the uncompressed cache block. A cache block pattern is searched for in a set of cache block patterns based on the byte values. A compressed cache block is output based on the byte values and the cache block pattern. A compressed cache block is input to the decompressor. A cache block pattern is identified based on metadata of the cache block. The cache block pattern is applied to a byte dictionary of the cache block. An uncompressed cache block is output based on the cache block pattern and the byte dictionary. A subset of cache block patterns is determined from a training cache trace based on a set of compressed sizes and a target number of patterns for each size.
    Type: Application
    Filed: September 23, 2020
    Publication date: May 27, 2021
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Matthew Tomei, Shomit N. Das, David A. Wood
  • Publication number: 20210091788
    Abstract: Entropy agnostic data encoding includes: receiving, by an encoder, input data including a bit string; generating a plurality of candidate codewords, including encoding the input data bit string with a plurality of binary vectors, wherein the plurality of binary vectors includes a set of deterministic biased binary vectors and a set of random binary vectors; selecting, in dependence upon a predefined criteria, one of the plurality of candidate codewords; and transmitting the selected candidate codeword to a decoder.
    Type: Application
    Filed: November 4, 2020
    Publication date: March 25, 2021
    Inventors: SEYEDMOHAMMAD SEYEDZADEHDELCHEH, SHOMIT N. DAS