Patents by Inventor Shomit N. Das
Shomit N. Das has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11842199
Abstract: An asynchronous pipeline includes a first stage and one or more second stages. A controller provides control signals to the first stage to indicate a modification to an operating speed of the first stage. The modification is determined based on a comparison of a completion status of the first stage to one or more completion statuses of the one or more second stages. In some cases, the controller provides control signals indicating modifications to an operating voltage applied to the first stage and a drive strength of a buffer in the first stage. Modules can be used to determine the completion statuses of the first stage and the one or more second stages based on the monitored output signals generated by the stages, output signals from replica critical paths associated with the stages, or a lookup table that indicates estimated completion times.
Type: Grant
Filed: June 26, 2020
Date of Patent: December 12, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: Greg Sadowski, John Kalamatianos, Shomit N. Das
-
Patent number: 11740791
Abstract: In some embodiments, a memory controller in a processor includes a base value cache, a compressor, and a metadata cache. The compressor is coupled to the base value cache and the metadata cache. The compressor compresses a data block using at least a base value and delta values. The compressor determines whether the size of the data block exceeds a data block threshold value. Based on the determination of whether the size of the compressed data block generated by the compressor exceeds the data block threshold value, the memory controller transfers only a set of the compressed delta values to memory for storage. A decompressor located in the lower level cache of the processor decompresses the compressed data block using the base value stored in the base value cache, metadata stored in the metadata cache and the delta values stored in memory.
Type: Grant
Filed: October 8, 2021
Date of Patent: August 29, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: Seyed Mohammad Seyedzadehdelcheh, Xianwei Zhang, Bradford Beckmann, Shomit N. Das
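The base-plus-delta idea in this abstract can be illustrated with a minimal software sketch. It is not the patented hardware design. The function names, the choice of the first word as the base, the signed one-byte delta width, and the 32-byte threshold are all assumptions made for illustration; the abstract does not specify them.

```python
def compress_block(words, word_bytes=8, delta_bytes=1, threshold_bytes=32):
    """Return (base, deltas) if the block compresses under the threshold, else None.

    Assumptions (not from the abstract): the base is the first word, deltas are
    signed integers of delta_bytes each, and the threshold is given in bytes.
    """
    base = words[0]
    deltas = [w - base for w in words]
    limit = 1 << (8 * delta_bytes - 1)  # signed range representable by one delta
    if not all(-limit <= d < limit for d in deltas):
        return None  # some delta does not fit; leave the block uncompressed
    compressed_size = word_bytes + delta_bytes * len(deltas)
    if compressed_size > threshold_bytes:
        return None  # compressed block exceeds the threshold
    return base, deltas


def decompress_block(base, deltas):
    """Rebuild the original words from the base value and the stored deltas."""
    return [base + d for d in deltas]


block = [1000, 1003, 998, 1010]
base, deltas = compress_block(block)
restored = decompress_block(base, deltas)
```

In this sketch only `deltas` would need to travel to memory, while `base` stays in a separate structure standing in for the base value cache.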
-
Patent number: 11726546
Abstract: Systems, methods, devices, and computer-implemented instructions for processor power management implemented in a compiler. In some implementations, a characteristic of code is determined. An instruction based on the determined characteristic is inserted into the code. The code and inserted instruction are compiled to generate compiled code. The compiled code is output.
Type: Grant
Filed: September 25, 2020
Date of Patent: August 15, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: Vedula Venkata Srikant Bharadwaj, Shomit N. Das, Anthony T. Gutierrez, Vignesh Adhinarayanan
-
Patent number: 11726837
Abstract: In some examples, thermal aware optimization logic determines a characteristic (e.g., a workload or type) of a wavefront (e.g., multiple threads). For example, the characteristic indicates whether the wavefront is compute intensive, memory intensive, mixed, and/or another type of wavefront. The thermal aware optimization logic determines temperature information for one or more compute units (CUs) in one or more processing cores. The temperature information includes predictive thermal information indicating expected temperatures corresponding to the one or more CUs and historical thermal information indicating current or past thermal temperatures of at least a portion of a graphics processing unit (GPU). The logic selects the one or more compute units to process the plurality of threads based on the determined characteristic and the temperature information. The logic provides instructions to the selected subset of the plurality of CUs to execute the wavefront.
Type: Grant
Filed: November 4, 2021
Date of Patent: August 15, 2023
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: Karthik Rao, Shomit N. Das, Xudong An, Wei Huang
-
Publication number: 20230110376
Abstract: Systems, apparatuses, and methods for implementing a multi-tiered approach to cache compression are disclosed. A cache includes a cache controller, light compressor, and heavy compressor. The decision on which compressor to use for compressing cache lines is made based on certain resource availability such as cache capacity or memory bandwidth. This allows the cache to opportunistically use complex algorithms for compression while limiting the adverse effects of high decompression latency on system performance. To address the above issue, the proposed design takes advantage of the heavy compressors for effectively reducing memory bandwidth in high bandwidth memory (HBM) interfaces as long as they do not sacrifice system performance. Accordingly, the cache combines light and heavy compressors with a decision-making unit to achieve reduced off-chip memory traffic without sacrificing system performance.
Type: Application
Filed: November 23, 2022
Publication date: April 13, 2023
Inventors: SeyedMohammad SeyedzadehDelcheh, Shomit N. Das, Bradford Michael Beckmann
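As a rough software analogy for the light/heavy decision described in this abstract, the sketch below picks a compressor from a single resource signal. It is not the disclosed cache design. Using `zlib` at level 1 as the "light" tier, `lzma` as the "heavy" tier, memory bandwidth utilization as the only input, and a 0.8 watermark are all assumptions chosen for illustration.

```python
import lzma
import zlib


def choose_compressor(bandwidth_utilization, high_watermark=0.8):
    """Pick the heavy tier only when memory bandwidth is scarce (assumed policy)."""
    return "heavy" if bandwidth_utilization >= high_watermark else "light"


def compress_line(line, bandwidth_utilization):
    """Compress one cache line with the tier selected for the current load."""
    tier = choose_compressor(bandwidth_utilization)
    if tier == "heavy":
        return tier, lzma.compress(line)   # slower, usually denser
    return tier, zlib.compress(line, 1)    # fast, low-effort setting


def decompress_line(tier, payload):
    """Undo compress_line; the tier tag says which decompressor to use."""
    if tier == "heavy":
        return lzma.decompress(payload)
    return zlib.decompress(payload)


line = bytes(64)  # a 64-byte cache line of zeros
tier, payload = compress_line(line, bandwidth_utilization=0.9)
restored = decompress_line(tier, payload)
```

The tier tag returned alongside the payload plays the role of per-line metadata: without it the reader of the line could not know which decompressor applies.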
-
Patent number: 11604738
Abstract: A processing device is provided which includes memory comprising data cache memory configured to store compressed data and metadata cache memory configured to store metadata, each portion of metadata comprising an encoding used to compress a portion of data. The processing device also includes at least one processor configured to compress portions of data and select, based on one or more utility level metrics, portions of metadata to be stored in the metadata cache memory. The at least one processor is also configured to store, in the metadata cache memory, the portions of metadata selected to be stored in the metadata cache memory, store, in the data cache memory, each portion of compressed data having a selected portion of corresponding metadata stored in the metadata cache memory. Each portion of compressed data, having the selected portion of corresponding metadata stored in the metadata cache memory, is decompressed.
Type: Grant
Filed: September 28, 2018
Date of Patent: March 14, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: Shomit N. Das, Matthew Tomei, David A. Wood
-
Patent number: 11544196
Abstract: Systems, apparatuses, and methods for implementing a multi-tiered approach to cache compression are disclosed. A cache includes a cache controller, light compressor, and heavy compressor. The decision on which compressor to use for compressing cache lines is made based on certain resource availability such as cache capacity or memory bandwidth. This allows the cache to opportunistically use complex algorithms for compression while limiting the adverse effects of high decompression latency on system performance. To address the above issue, the proposed design takes advantage of the heavy compressors for effectively reducing memory bandwidth in high bandwidth memory (HBM) interfaces as long as they do not sacrifice system performance. Accordingly, the cache combines light and heavy compressors with a decision-making unit to achieve reduced off-chip memory traffic without sacrificing system performance.
Type: Grant
Filed: December 23, 2019
Date of Patent: January 3, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: SeyedMohammad SeyedzadehDelcheh, Shomit N. Das, Bradford Michael Beckmann
-
Patent number: 11362673
Abstract: Entropy agnostic data encoding includes: receiving, by an encoder, input data including a bit string; generating a plurality of candidate codewords, including encoding the input data bit string with a plurality of binary vectors, wherein the plurality of binary vectors includes a set of deterministic biased binary vectors and a set of random binary vectors; selecting, in dependence upon a predefined criteria, one of the plurality of candidate codewords; and transmitting the selected candidate codeword to a decoder.
Type: Grant
Filed: November 4, 2020
Date of Patent: June 14, 2022
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: Seyedmohammad Seyedzadehdelcheh, Shomit N. Das
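The candidate-codeword scheme in this abstract can be sketched as follows. The abstract does not say how the input is combined with each binary vector or what the selection criterion is. This sketch assumes XOR as the combining operation and "fewest 1 bits" as the criterion, and it sends the index of the chosen vector so the decoder can invert the encoding. All function names are hypothetical.

```python
import random


def make_vectors(width, n_random=2, seed=0):
    """Build deterministic biased vectors plus seeded random vectors (assumed set)."""
    mask = (1 << width) - 1
    alternating = sum(1 << i for i in range(0, width, 2))
    biased = [0, mask, alternating]           # all zeros, all ones, 0101... pattern
    rng = random.Random(seed)                 # encoder and decoder share the seed
    randoms = [rng.getrandbits(width) for _ in range(n_random)]
    return biased + randoms


def encode(data, vectors):
    """XOR the data with every vector and keep the candidate with the fewest 1 bits."""
    candidates = [(i, data ^ v) for i, v in enumerate(vectors)]
    return min(candidates, key=lambda c: bin(c[1]).count("1"))


def decode(index, codeword, vectors):
    """Recover the data by XORing with the same vector the encoder chose."""
    return codeword ^ vectors[index]


vectors = make_vectors(width=8)
index, codeword = encode(0b11111110, vectors)
recovered = decode(index, codeword, vectors)
```

Because the all-zeros vector is always a candidate, the selected codeword in this sketch never has more 1 bits than the raw input.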
-
Publication number: 20220107849
Abstract: In some examples, thermal aware optimization logic determines a characteristic (e.g., a workload or type) of a wavefront (e.g., multiple threads). For example, the characteristic indicates whether the wavefront is compute intensive, memory intensive, mixed, and/or another type of wavefront. The thermal aware optimization logic determines temperature information for one or more compute units (CUs) in one or more processing cores. The temperature information includes predictive thermal information indicating expected temperatures corresponding to the one or more CUs and historical thermal information indicating current or past thermal temperatures of at least a portion of a graphics processing unit (GPU). The logic selects the one or more compute units to process the plurality of threads based on the determined characteristic and the temperature information. The logic provides instructions to the selected subset of the plurality of CUs to execute the wavefront.
Type: Application
Filed: November 4, 2021
Publication date: April 7, 2022
Inventors: KARTHIK RAO, SHOMIT N. DAS, XUDONG AN, WEI HUANG
-
Publication number: 20220100257
Abstract: Systems, methods, devices, and computer-implemented instructions for processor power management implemented in a compiler. In some implementations, a characteristic of code is determined. An instruction based on the determined characteristic is inserted into the code. The code and inserted instruction are compiled to generate compiled code. The compiled code is output.
Type: Application
Filed: September 25, 2020
Publication date: March 31, 2022
Applicant: Advanced Micro Devices, Inc.
Inventors: Vedula Venkata Srikant Bharadwaj, Shomit N. Das, Anthony T. Gutierrez, Vignesh Adhinarayanan
-
Publication number: 20220083233
Abstract: In some embodiments, a memory controller in a processor includes a base value cache, a compressor, and a metadata cache. The compressor is coupled to the base value cache and the metadata cache. The compressor compresses a data block using at least a base value and delta values. The compressor determines whether the size of the data block exceeds a data block threshold value. Based on the determination of whether the size of the compressed data block generated by the compressor exceeds the data block threshold value, the memory controller transfers only a set of the compressed delta values to memory for storage. A decompressor located in the lower level cache of the processor decompresses the compressed data block using the base value stored in the base value cache, metadata stored in the metadata cache and the delta values stored in memory.
Type: Application
Filed: October 8, 2021
Publication date: March 17, 2022
Inventors: Seyed Mohammad SEYEDZADEHDELCHEH, Xianwei ZHANG, Bradford BECKMANN, Shomit N. DAS
-
Patent number: 11194634
Abstract: In some examples, thermal aware optimization logic determines a characteristic (e.g., a workload or type) of a wavefront (e.g., multiple threads). For example, the characteristic indicates whether the wavefront is compute intensive, memory intensive, mixed, and/or another type of wavefront. The thermal aware optimization logic determines temperature information for one or more compute units (CUs) in one or more processing cores. The temperature information includes predictive thermal information indicating expected temperatures corresponding to the one or more CUs and historical thermal information indicating current or past thermal temperatures of at least a portion of a graphics processing unit (GPU). The logic selects the one or more compute units to process the plurality of threads based on the determined characteristic and the temperature information. The logic provides instructions to the selected subset of the plurality of CUs to execute the wavefront.
Type: Grant
Filed: December 14, 2018
Date of Patent: December 7, 2021
Assignee: Advanced Micro Devices, Inc.
Inventors: Karthik Rao, Shomit N. Das, Xudong An, Wei Huang
-
Patent number: 11144208
Abstract: In some embodiments, a memory controller in a processor includes a base value cache, a compressor, and a metadata cache. The compressor is coupled to the base value cache and the metadata cache. The compressor compresses a data block using at least a base value and delta values. The compressor determines whether the size of the data block exceeds a data block threshold value. Based on the determination of whether the size of the compressed data block generated by the compressor exceeds the data block threshold value, the memory controller transfers only a set of the compressed delta values to memory for storage. A decompressor located in the lower level cache of the processor decompresses the compressed data block using the base value stored in the base value cache, metadata stored in the metadata cache and the delta values stored in memory.
Type: Grant
Filed: December 23, 2019
Date of Patent: October 12, 2021
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: SeyedMohammad Seyedzadehdelcheh, Xianwei Zhang, Bradford Beckmann, Shomit N. Das
-
Patent number: 11119665
Abstract: A processing system scales power to memory and memory channels based on identifying causes of stalls of threads of a wavefront. If the cause is other than an outstanding memory request, the processing system throttles power to the memory to save power. If the stall is due to memory stalls for a subset of the memory channels servicing memory access requests for threads of a wavefront, the processing system adjusts power of the memory channels servicing memory access request for the wavefront based on the subset. By boosting power to the subset of channels, the processing system enables the wavefront to complete processing more quickly, resulting in increased processing speed. Conversely, by throttling power to the remainder of channels, the processing system saves power without affecting processing speed.
Type: Grant
Filed: December 6, 2018
Date of Patent: September 14, 2021
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: Shomit N. Das, Kishore Punniyamurthy
-
Patent number: 11061429
Abstract: A technique for fine-granularity speed binning for a processing device is provided. The processing device includes a plurality of clock domains, each of which may be clocked with independent clock signals. The clock frequency at which a particular clock domain may operate is determined based on the longest propagation delay between clocked elements in that particular clock domain. The processing device includes measurement circuits for each clock domain that measure such propagation delay. The measurement circuits are replica propagation delay paths of actual circuit elements within each particular clock domain. A speed bin for each clock domain is determined based on the propagation delay measured for the measurement circuits for a particular clock domain. Specifically, a speed bin is chosen that is associated with the fastest clock speed whose clock period is longer than the slowest propagation delay measured for the measurement circuit for the clock domain.
Type: Grant
Filed: October 26, 2017
Date of Patent: July 13, 2021
Assignee: Advanced Micro Devices, Inc.
Inventors: Greg Sadowski, Shomit N. Das
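The selection rule stated at the end of this abstract (the fastest clock whose period is longer than the slowest measured propagation delay) can be written down directly. The sketch below is a software illustration only. The function name, the bin frequencies, and the per-domain delay figures are hypothetical values, not data from the patent.

```python
def choose_speed_bin(bin_frequencies_mhz, measured_delays_ns):
    """Return the fastest bin whose clock period exceeds the slowest measured delay.

    Returns None when no bin is slow enough for the measured delays.
    """
    worst_delay_ns = max(measured_delays_ns)
    # A clock running at f MHz has a period of 1000 / f nanoseconds.
    eligible = [f for f in bin_frequencies_mhz if 1000.0 / f > worst_delay_ns]
    return max(eligible) if eligible else None


bins_mhz = [1000, 1500, 2000, 2500]          # hypothetical speed bins
domain_delays_ns = {                          # hypothetical replica-path readings
    "core": [0.42, 0.45],
    "cache": [0.55, 0.61],
}
assignments = {
    name: choose_speed_bin(bins_mhz, delays)
    for name, delays in domain_delays_ns.items()
}
```

With these made-up numbers each clock domain receives its own bin, which is the point of binning per domain rather than per chip: a slow path in one domain does not force every other domain down to the same frequency.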
-
Publication number: 20210191869
Abstract: Systems, apparatuses, and methods for implementing a multi-tiered approach to cache compression are disclosed. A cache includes a cache controller, light compressor, and heavy compressor. The decision on which compressor to use for compressing cache lines is made based on certain resource availability such as cache capacity or memory bandwidth. This allows the cache to opportunistically use complex algorithms for compression while limiting the adverse effects of high decompression latency on system performance. To address the above issue, the proposed design takes advantage of the heavy compressors for effectively reducing memory bandwidth in high bandwidth memory (HBM) interfaces as long as they do not sacrifice system performance. Accordingly, the cache combines light and heavy compressors with a decision-making unit to achieve reduced off-chip memory traffic without sacrificing system performance.
Type: Application
Filed: December 23, 2019
Publication date: June 24, 2021
Inventors: SeyedMohammad SeyedzadehDelcheh, Shomit N. Das, Bradford Michael Beckmann
-
Publication number: 20210191620
Abstract: In some embodiments, a memory controller in a processor includes a base value cache, a compressor, and a metadata cache. The compressor is coupled to the base value cache and the metadata cache. The compressor compresses a data block using at least a base value and delta values. The compressor determines whether the size of the data block exceeds a data block threshold value. Based on the determination of whether the size of the compressed data block generated by the compressor exceeds the data block threshold value, the memory controller transfers only a set of the compressed delta values to memory for storage. A decompressor located in the lower level cache of the processor decompresses the compressed data block using the base value stored in the base value cache, metadata stored in the metadata cache and the delta values stored in memory.
Type: Application
Filed: December 23, 2019
Publication date: June 24, 2021
Inventors: SeyedMohammad SEYEDZADEHDELCHEH, Xianwei ZHANG, Bradford BECKMANN, Shomit N. DAS
-
Publication number: 20210191770
Abstract: A processing unit preemptively cools selected compute units prior to initiating execution of a wavefront at the selected compute units. A scheduler of the processing unit identifies that a wavefront is to be executed at a selected subset of compute units of the processing unit. In response, the processing unit's temperature control subsystem activates one or more cooling elements to reduce the temperature of the subset of compute units, prior to the scheduler initiating execution of the wavefront. By preemptively cooling the compute units, the temperature control subsystem increases the difference between the initial temperature of the compute units and a thermal throttling threshold that triggers performance-impacting temperature control measures, such as the reduction of a compute unit clock frequency.
Type: Application
Filed: December 18, 2019
Publication date: June 24, 2021
Inventors: Karthik RAO, Shomit N. DAS, Manish ARORA
-
Publication number: 20210157485
Abstract: Systems, methods, and devices for performing pattern-based cache block compression and decompression. An uncompressed cache block is input to the compressor. Byte values are identified within the uncompressed cache block. A cache block pattern is searched for in a set of cache block patterns based on the byte values. A compressed cache block is output based on the byte values and the cache block pattern. A compressed cache block is input to the decompressor. A cache block pattern is identified based on metadata of the cache block. The cache block pattern is applied to a byte dictionary of the cache block. An uncompressed cache block is output based on the cache block pattern and the byte dictionary. A subset of cache block patterns is determined from a training cache trace based on a set of compressed sizes and a target number of patterns for each size.
Type: Application
Filed: September 23, 2020
Publication date: May 27, 2021
Applicant: Advanced Micro Devices, Inc.
Inventors: Matthew Tomei, Shomit N. Das, David A. Wood
-
Publication number: 20210091788
Abstract: Entropy agnostic data encoding includes: receiving, by an encoder, input data including a bit string; generating a plurality of candidate codewords, including encoding the input data bit string with a plurality of binary vectors, wherein the plurality of binary vectors includes a set of deterministic biased binary vectors and a set of random binary vectors; selecting, in dependence upon a predefined criteria, one of the plurality of candidate codewords; and transmitting the selected candidate codeword to a decoder.
Type: Application
Filed: November 4, 2020
Publication date: March 25, 2021
Inventors: SEYEDMOHAMMAD SEYEDZADEHDELCHEH, SHOMIT N. DAS