Patents by Inventor Shomit N. Das

Shomit N. Das has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

ENTROPY AGNOSTIC DATA ENCODING AND DECODING

Publication number: 20210091787

Abstract: Entropy agnostic data encoding includes: receiving, by an encoder, input data including a bit string; generating a plurality of candidate codewords, including encoding the input data bit string with a plurality of binary vectors, wherein the plurality of binary vectors includes a set of deterministic biased binary vectors and a set of random binary vectors; selecting, in dependence upon a predefined criteria, one of the plurality of candidate codewords; and transmitting the selected candidate codeword to a decoder.

Type: Application

Filed: September 23, 2019

Publication date: March 25, 2021

Inventors: Seyedmohammad Seyedzadehdelcheh, Shomit N. Das
CONTROLLING THE OPERATING SPEED OF STAGES OF AN ASYNCHRONOUS PIPELINE

Publication number: 20210089324

Abstract: An asynchronous pipeline includes a first stage and one or more second stages. A controller provides control signals to the first stage to indicate a modification to an operating speed of the first stage. The modification is determined based on a comparison of a completion status of the first stage to one or more completion statuses of the one or more second stages. In some cases, the controller provides control signals indicating modifications to an operating voltage applied to the first stage and a drive strength of a buffer in the first stage. Modules can be used to determine the completion statuses of the first stage and the one or more second stages based on the monitored output signals generated by the stages, output signals from replica critical paths associated with the stages, or a lookup table that indicates estimated completion times.

Type: Application

Filed: June 26, 2020

Publication date: March 25, 2021

Inventors: Greg Sadowski, John Kalamatianos, Shomit N. Das
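
The comparison step in the abstract above can be sketched as a small decision function. This is only an illustration under stated assumptions: completion statuses are modeled as completion times in nanoseconds, the first stage is compared against the slowest of the other stages, and the names `speed_adjustment` and `tolerance_ns` are hypothetical. The abstract does not specify the policy.

```python
def speed_adjustment(first_done_ns, other_done_ns, tolerance_ns=0.0):
    """Return 'speed_up', 'slow_down', or 'hold' for the first stage.

    Assumption: a completion status is a completion time, and the first
    stage is compared against the slowest of the other stages.
    """
    slowest_other = max(other_done_ns)
    if first_done_ns > slowest_other + tolerance_ns:
        return "speed_up"    # first stage is the bottleneck
    if first_done_ns < slowest_other - tolerance_ns:
        return "slow_down"   # first stage finishes early; speed is wasted
    return "hold"


print(speed_adjustment(12.0, [8.0, 9.5]))
print(speed_adjustment(5.0, [8.0, 9.5]))
```

In a design along the lines of the abstract, the returned action would presumably be translated into control signals for the operating voltage or buffer drive strength of the first stage.
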
Entropy agnostic data encoding and decoding

Patent number: 10944422

Abstract: Entropy agnostic data encoding includes: receiving, by an encoder, input data including a bit string; generating a plurality of candidate codewords, including encoding the input data bit string with a plurality of binary vectors, wherein the plurality of binary vectors includes a set of deterministic biased binary vectors and a set of random binary vectors; selecting, in dependence upon a predefined criteria, one of the plurality of candidate codewords; and transmitting the selected candidate codeword to a decoder.

Type: Grant

Filed: September 23, 2019

Date of Patent: March 9, 2021

Assignee: ADVANCED MICRO DEVICES, INC.

Inventors: Seyedmohammad Seyedzadehdelcheh, Shomit N. Das
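
The candidate-generation and selection steps in the abstract above can be illustrated with a short sketch. It assumes that "encoding with a binary vector" means a bitwise XOR, that the selection criterion is the fewest 1 bits in the codeword, and that the index of the chosen vector is sent alongside the codeword. The abstract states none of these, and all names (`make_vectors`, `encode`, `decode`) are hypothetical.

```python
import random

WIDTH = 8                 # bits per input word (assumed for illustration)
MASK = (1 << WIDTH) - 1


def make_vectors(num_random=4, seed=0):
    # Deterministic biased vectors (assumed): all zeros, all ones, alternating.
    biased = [0x00, 0xFF, 0xAA, 0x55]
    # Random vectors from a shared seed so encoder and decoder agree on them.
    rng = random.Random(seed)
    rand = [rng.getrandbits(WIDTH) for _ in range(num_random)]
    return biased + rand


def encode(data, vectors):
    # One candidate codeword per vector (XOR is an assumption).
    candidates = [(data ^ v) & MASK for v in vectors]
    # Assumed criterion: pick the candidate with the fewest 1 bits.
    index = min(range(len(candidates)),
                key=lambda i: bin(candidates[i]).count("1"))
    return index, candidates[index]


def decode(index, codeword, vectors):
    # XOR with the same vector undoes the encoding.
    return (codeword ^ vectors[index]) & MASK


vectors = make_vectors()
index, codeword = encode(0xFE, vectors)
assert decode(index, codeword, vectors) == 0xFE
```

Because the all-zeros vector is among the candidates, the selected codeword in this sketch never has more 1 bits than the input itself.
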
Routing flits in a network-on-chip based on operating states of routers

Patent number: 10944693

Abstract: A system is described that includes an integrated circuit chip having a network-on-chip. The network-on-chip includes multiple routers arranged in a topology and a separate communication link coupled between each router and each of one or more neighboring routers of that router among the multiple routers in the topology. The integrated circuit chip also includes multiple nodes, each node coupled to a router of the multiple routers. When operating, a given router of the multiple routers keeps a record of operating states of some or all of the multiple routers and corresponding communication links. The given router then routes flits to destination nodes via one or more other routers of the multiple routers based at least in part on the operating states of the some or all of the multiple routers and the corresponding communication links.

Type: Grant

Filed: November 13, 2018

Date of Patent: March 9, 2021

Assignee: ADVANCED MICRO DEVICES, INC.

Inventors: Srikant Bharadwaj, Shomit N. Das
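
As a rough illustration of state-aware routing as described in the abstract above, the sketch below finds a path through a small mesh while skipping routers that are not active. The state names (`"active"`, `"sleep"`), the breadth-first search, and the function name `route` are assumptions made for the example; the abstract does not say how the recorded operating states are used, and link states are left out for brevity.

```python
from collections import deque


def route(links, states, src, dst):
    """Shortest path from src to dst that only visits 'active' routers.

    links:  dict mapping a router to its neighboring routers.
    states: the routing router's record of operating states (assumed labels).
    Returns a list of routers, or None if no usable path exists.
    """
    if states.get(src) != "active" or states.get(dst) != "active":
        return None
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in links.get(node, ()):
            if nxt not in parent and states.get(nxt) == "active":
                parent[nxt] = node
                queue.append(nxt)
    return None


# 2x2 mesh: routers 0 and 1 on the top row, 2 and 3 on the bottom row.
links = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
states = {0: "active", 1: "sleep", 2: "active", 3: "active"}
print(route(links, states, 0, 3))  # [0, 2, 3]
```
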
Byte select cache compression

Patent number: 10860489

Abstract: Techniques are disclosed for designing cache compression algorithms that control how data in caches are compressed. The techniques generate a custom “byte select algorithm” by repeatedly applying transforms to an initial compression algorithm until a set of suitability criteria is met. The suitability criteria include that the “cost” is below a threshold and that a metadata constraint is met. The “cost” is the number of blocks that can be compressed by an algorithm as compared with the “ideal” algorithm. The metadata constraint is the number of bits required for metadata.

Type: Grant

Filed: October 31, 2018

Date of Patent: December 8, 2020

Assignee: Advanced Micro Devices, Inc.

Inventors: Shomit N. Das, Matthew Tomei, David A. Wood
Device and method for cache utilization aware data compression

Patent number: 10838727

Abstract: A processing device is provided which includes memory and at least one processor. The memory includes main memory and cache memory in communication with the main memory via a link. The at least one processor is configured to receive a request for a cache line and read the cache line from main memory. The at least one processor is also configured to compress the cache line according to a compression algorithm and, when the compressed cache line includes at least one byte predicted not to be accessed, drop the at least one byte from the compressed cache line based on whether the compression algorithm is determined to successfully compress the cache line according to a compression parameter.

Type: Grant

Filed: December 14, 2018

Date of Patent: November 17, 2020

Assignee: ADVANCED MICRO DEVICES, INC.

Inventors: Shomit N. Das, Kishore Punniyamurthy, Matthew Tomei, Bradford M. Beckmann
Compressing data for storage in cache memories in a hierarchy of cache memories

Patent number: 10795825

Abstract: An electronic device includes at least one compression-decompression functional block and a hierarchy of cache memories with a first cache memory and a second cache memory. The at least one compression-decompression functional block receives data in an uncompressed state, compresses the data using one of a first compression or a second compression, and, after compressing the data, provides the data to the first cache memory for storage therein. When the data is retrieved from the first cache memory to be stored in the second cache memory and the data is compressed using the first compression, the compression-decompression functional block decompresses the data to reverse effects of the first compression on the data, thereby restoring the data to the uncompressed state, and provides the data compressed using the second compression or in the uncompressed state to the second cache memory for storage therein.

Type: Grant

Filed: December 26, 2018

Date of Patent: October 6, 2020

Assignee: ADVANCED MICRO DEVICES, INC.

Inventors: Matthew J. Tomei, Philip B. Bedoukian, Shomit N. Das
Compressing Data for Storage in Cache Memories in a Hierarchy of Cache Memories

Publication number: 20200210343

Abstract: An electronic device includes at least one compression-decompression functional block and a hierarchy of cache memories with a first cache memory and a second cache memory. The at least one compression-decompression functional block receives data in an uncompressed state, compresses the data using one of a first compression or a second compression, and, after compressing the data, provides the data to the first cache memory for storage therein. When the data is retrieved from the first cache memory to be stored in the second cache memory and the data is compressed using the first compression, the compression-decompression functional block decompresses the data to reverse effects of the first compression on the data, thereby restoring the data to the uncompressed state, and provides the data compressed using the second compression or in the uncompressed state to the second cache memory for storage therein.

Type: Application

Filed: December 26, 2018

Publication date: July 2, 2020

Inventors: Matthew J. Tomei, Philip B. Bedoukian, Shomit N. Das
Controlling the operating speed of stages of an asynchronous pipeline

Patent number: 10698692

Abstract: An asynchronous pipeline includes a first stage and one or more second stages. A controller provides control signals to the first stage to indicate a modification to an operating speed of the first stage. The modification is determined based on a comparison of a completion status of the first stage to one or more completion statuses of the one or more second stages. In some cases, the controller provides control signals indicating modifications to an operating voltage applied to the first stage and a drive strength of a buffer in the first stage. Modules can be used to determine the completion statuses of the first stage and the one or more second stages based on the monitored output signals generated by the stages, output signals from replica critical paths associated with the stages, or a lookup table that indicates estimated completion times.

Type: Grant

Filed: July 21, 2016

Date of Patent: June 30, 2020

Assignee: ADVANCED MICRO DEVICES, INC.

Inventors: Greg Sadowski, John Kalamatianos, Shomit N. Das
APPARATUS AND METHOD FOR PROVIDING WORKLOAD DISTRIBUTION OF THREADS AMONG MULTIPLE COMPUTE UNITS

Publication number: 20200192705

Abstract: In some examples, thermal aware optimization logic determines a characteristic (e.g., a workload or type) of a wavefront (e.g., multiple threads). For example, the characteristic indicates whether the wavefront is compute intensive, memory intensive, mixed, and/or another type of wavefront. The thermal aware optimization logic determines temperature information for one or more compute units (CUs) in one or more processing cores. The temperature information includes predictive thermal information indicating expected temperatures corresponding to the one or more CUs and historical thermal information indicating current or past thermal temperatures of at least a portion of a graphics processing unit (GPU). The logic selects the one or more compute units to process the plurality of threads based on the determined characteristic and the temperature information. The logic provides instructions to the selected subset of the plurality of CUs to execute the wavefront.

Type: Application

Filed: December 14, 2018

Publication date: June 18, 2020

Inventors: Karthik Rao, Shomit N. Das, Xudong An, Wei Huang
DEVICE AND METHOD FOR CACHE UTILIZATION AWARE DATA COMPRESSION

Publication number: 20200192671

Abstract: A processing device is provided which includes memory and at least one processor. The memory includes main memory and cache memory in communication with the main memory via a link. The at least one processor is configured to receive a request for a cache line and read the cache line from main memory. The at least one processor is also configured to compress the cache line according to a compression algorithm and, when the compressed cache line includes at least one byte predicted not to be accessed, drop the at least one byte from the compressed cache line based on whether the compression algorithm is determined to successfully compress the cache line according to a compression parameter.

Type: Application

Filed: December 14, 2018

Publication date: June 18, 2020

Applicant: Advanced Micro Devices, Inc.

Inventors: Shomit N. Das, Kishore Punniyamurthy, Matthew Tomei, Bradford M. Beckmann
DYNAMIC VOLTAGE AND FREQUENCY SCALING BASED ON MEMORY CHANNEL SLACK

Publication number: 20200183597

Abstract: A processing system scales power to memory and memory channels based on identifying causes of stalls of threads of a wavefront. If the cause is other than an outstanding memory request, the processing system throttles power to the memory to save power. If the stall is due to memory stalls for a subset of the memory channels servicing memory access requests for threads of a wavefront, the processing system adjusts power of the memory channels servicing memory access requests for the wavefront based on the subset. By boosting power to the subset of channels, the processing system enables the wavefront to complete processing more quickly, resulting in increased processing speed. Conversely, by throttling power to the remainder of channels, the processing system saves power without affecting processing speed.

Type: Application

Filed: December 6, 2018

Publication date: June 11, 2020

Inventors: Shomit N. Das, Kishore Punniyamurthy
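
The per-channel decision described in the abstract above can be sketched as follows. The stall-cause label `"memory"`, the action names `"boost"` and `"throttle"`, and the function `channel_power_plan` are hypothetical; the sketch only mirrors the two cases the abstract names and is not a statement of how the patented system is implemented.

```python
def channel_power_plan(stall_cause, all_channels, stalled_channels=()):
    """Map each memory channel to 'boost' or 'throttle'.

    stall_cause:      'memory' if the wavefront waits on memory requests,
                      anything else otherwise (labels are assumed).
    stalled_channels: the subset of channels servicing the stalled requests.
    """
    if stall_cause != "memory":
        # Stall is not memory-related: throttle every channel to save power.
        return {ch: "throttle" for ch in all_channels}
    stalled = set(stalled_channels)
    # Boost the channels the wavefront is waiting on, throttle the rest.
    return {ch: ("boost" if ch in stalled else "throttle")
            for ch in all_channels}


print(channel_power_plan("memory", [0, 1, 2, 3], stalled_channels=[1, 3]))
# {0: 'throttle', 1: 'boost', 2: 'throttle', 3: 'boost'}
```
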
HINT-BASED FINE-GRAINED DYNAMIC VOLTAGE AND FREQUENCY SCALING IN GPUS

Publication number: 20200183485

Abstract: A processing system dynamically scales at least one of voltage and frequency at a subset of a plurality of compute units of a graphics processing unit (GPU) based on characteristics of a kernel or workload to be executed at the subset. A system management unit for the processing system receives a compute unit mask, designating the subset of a plurality of compute units of a GPU to execute the kernel or workload, and workload characteristics indicating the compute-boundedness or memory bandwidth-boundedness of the kernel or workload from a central processing unit of the processing system. The system management unit determines a dynamic voltage and frequency scaling policy for the subset of the plurality of compute units of the GPU based on the compute unit mask and the workload characteristics.

Type: Application

Filed: December 7, 2018

Publication date: June 11, 2020

Inventors: Shomit N. Das, Joseph L. Greathouse
Routing Flits in a Network-on-Chip Based on Operating States of Routers

Publication number: 20200153757

Abstract: A system is described that includes an integrated circuit chip having a network-on-chip. The network-on-chip includes multiple routers arranged in a topology and a separate communication link coupled between each router and each of one or more neighboring routers of that router among the multiple routers in the topology. The integrated circuit chip also includes multiple nodes, each node coupled to a router of the multiple routers. When operating, a given router of the multiple routers keeps a record of operating states of some or all of the multiple routers and corresponding communication links. The given router then routes flits to destination nodes via one or more other routers of the multiple routers based at least in part on the operating states of the some or all of the multiple routers and the corresponding communication links.

Type: Application

Filed: November 13, 2018

Publication date: May 14, 2020

Inventors: Srikant Bharadwaj, Shomit N. Das
DYNAMIC PRECISION SCALING AT EPOCH GRANULARITY IN NEURAL NETWORKS

Publication number: 20200151573

Abstract: A processor determines losses of samples within an input volume that is provided to a neural network during a first epoch, groups the samples into subsets based on losses, and assigns the subsets to operands in the neural network that represent the samples at different precisions. Each subset is associated with a different precision. The processor then processes the subsets in the neural network at the different precisions during the first epoch. In some cases, the samples in the subsets are used in a forward pass and a backward pass through the neural network. A memory is configured to store information representing the samples in the subsets at the different precisions. In some cases, the processor stores information representing model parameters of the neural network in the memory at the different precisions of the subsets of the corresponding samples.

Type: Application

Filed: May 29, 2019

Publication date: May 14, 2020

Inventors: Shomit N. Das, Abhinav Vishnu
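
The grouping step in the abstract above (samples grouped into subsets by loss, each subset tied to a precision) can be illustrated with a minimal sketch. It assumes equal-sized subsets ranked by loss, with lower-loss samples assigned to the lower precision; the abstract does not state which losses map to which precision, and the names `assign_precisions`, `"fp16"`, and `"fp32"` are illustrative only.

```python
def assign_precisions(losses, precisions=("fp16", "fp32")):
    """Group sample indices into one subset per precision, ranked by loss.

    Assumption: samples are sorted by ascending loss and split into
    equal-sized buckets, lowest losses going to the first precision.
    """
    order = sorted(range(len(losses)), key=lambda i: losses[i])
    groups = {p: [] for p in precisions}
    n = len(precisions)
    for rank, idx in enumerate(order):
        bucket = min(rank * n // max(len(order), 1), n - 1)
        groups[precisions[bucket]].append(idx)
    return groups


print(assign_precisions([0.1, 2.0, 0.5, 1.5]))
# {'fp16': [0, 2], 'fp32': [3, 1]}
```
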
BYTE SELECT CACHE COMPRESSION

Publication number: 20200133866

Abstract: The disclosure herein provides techniques for designing cache compression algorithms that control how data in caches are compressed. The techniques generate a custom “byte select algorithm” by repeatedly applying transforms to an initial compression algorithm until a set of suitability criteria is met. The suitability criteria include that the “cost” is below a threshold and that a metadata constraint is met. The “cost” is the number of blocks that can be compressed by an algorithm as compared with the “ideal” algorithm. The metadata constraint is the number of bits required for metadata.

Type: Application

Filed: October 31, 2018

Publication date: April 30, 2020

Applicant: Advanced Micro Devices, Inc.

Inventors: Shomit N. Das, Matthew Tomei, David A. Wood
DEVICE AND METHOD FOR DATA COMPRESSION USING A METADATA CACHE

Publication number: 20200104262

Abstract: A processing device is provided which includes memory comprising data cache memory configured to store compressed data and metadata cache memory configured to store metadata, each portion of metadata comprising an encoding used to compress a portion of data. The processing device also includes at least one processor configured to compress portions of data and select, based on one or more utility level metrics, portions of metadata to be stored in the metadata cache memory. The at least one processor is also configured to store, in the metadata cache memory, the portions of metadata selected to be stored in the metadata cache memory, and to store, in the data cache memory, each portion of compressed data having a selected portion of corresponding metadata stored in the metadata cache memory. Each portion of compressed data, having the selected portion of corresponding metadata stored in the metadata cache memory, is decompressed.

Type: Application

Filed: September 28, 2018

Publication date: April 2, 2020

Applicant: Advanced Micro Devices, Inc.

Inventors: Shomit N. Das, Matthew Tomei, David A. Wood
RELIABLE VOLTAGE SCALED LINKS FOR COMPRESSED DATA

Publication number: 20200073845

Abstract: Systems, apparatuses, and methods for reliably transmitting data over voltage scaled links are disclosed. A computing system includes at least first and second devices connected via a link. In one implementation, if a data block can be compressed to less than or equal to half the original size of the data block, then the data block is compressed and sent on the link in a single clock cycle rather than two clock cycles. If the data block cannot be compressed to half the original size, but if the data block can be compressed enough to include error correction code (ECC) bits without exceeding the original size, then ECC bits are added to the compressed block which is sent on the link at a reduced voltage. The ECC bits help to correct for any errors that are generated as a result of operating the link at the reduced voltage.

Type: Application

Filed: August 30, 2018

Publication date: March 5, 2020

Inventors: Shomit N. Das, Matthew Tomei, Shrikanth Ganapathy, John Kalamatianos
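
The two-way decision in the abstract above lends itself to a short sketch. The first two branches follow the abstract; the fallback branch (send uncompressed at nominal voltage), the cycle count for the ECC case, the voltage used in the single-cycle case, and the names `plan_transfer`, `"nominal"`, and `"reduced"` are assumptions added for illustration.

```python
def plan_transfer(original_bits, compressed_bits, ecc_bits):
    """Decide how a data block is sent over the link."""
    if compressed_bits * 2 <= original_bits:
        # Compresses to half or less: send in one clock cycle instead of two.
        # (Nominal voltage here is an assumption.)
        return {"payload": "compressed", "cycles": 1,
                "voltage": "nominal", "ecc": False}
    if compressed_bits + ecc_bits <= original_bits:
        # Enough room for ECC within the original size: add ECC bits and
        # send at reduced voltage. (Two cycles is an assumption.)
        return {"payload": "compressed+ecc", "cycles": 2,
                "voltage": "reduced", "ecc": True}
    # Assumed fallback, not described in the abstract.
    return {"payload": "uncompressed", "cycles": 2,
            "voltage": "nominal", "ecc": False}


print(plan_transfer(512, 200, 64)["payload"])  # compressed
print(plan_transfer(512, 400, 64)["payload"])  # compressed+ecc
print(plan_transfer(512, 500, 64)["payload"])  # uncompressed
```
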
Reliable voltage scaled links for compressed data

Patent number: 10558606

Abstract: Systems, apparatuses, and methods for reliably transmitting data over voltage scaled links are disclosed. A computing system includes at least first and second devices connected via a link. In one implementation, if a data block can be compressed to less than or equal to half the original size of the data block, then the data block is compressed and sent on the link in a single clock cycle rather than two clock cycles. If the data block cannot be compressed to half the original size, but if the data block can be compressed enough to include error correction code (ECC) bits without exceeding the original size, then ECC bits are added to the compressed block which is sent on the link at a reduced voltage. The ECC bits help to correct for any errors that are generated as a result of operating the link at the reduced voltage.

Type: Grant

Filed: August 30, 2018

Date of Patent: February 11, 2020

Assignee: Advanced Micro Devices, Inc.

Inventors: Shomit N. Das, Matthew Tomei, Shrikanth Ganapathy, John Kalamatianos
Device and method of compressing data using tiered data compression

Patent number: 10411731

Abstract: A processing device is provided which includes a plurality of encoders each configured to compress a portion of data using a different compression algorithm. The processing device also includes one or more processors configured to cause an encoder, of the plurality of encoders, to compress the portion of data when it is determined that the portion of data, which is compressed by another encoder configured to compress the portion of data prior to the encoder in an encoder hierarchy, is not successfully compressed according to a compression metric by the other encoder in the encoder hierarchy. The one or more processors are also configured to prevent the encoder from compressing the portion of data when it is determined that the portion of data is successfully compressed according to the compression metric by the other encoder in the encoder hierarchy.

Type: Grant

Filed: September 24, 2018

Date of Patent: September 10, 2019

Assignee: ADVANCED MICRO DEVICES, INC.

Inventors: Shomit N. Das, Matthew Tomei
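
The encoder hierarchy in the abstract above can be sketched as a loop that tries encoders in order and stops at the first one that meets the compression metric, so later encoders never run. The toy run-length encoder, the use of `zlib` as the second tier, the size-ratio metric, and the names `rle` and `tiered_compress` are all assumptions for illustration; the abstract does not name specific algorithms or metrics.

```python
import zlib


def rle(data: bytes) -> bytes:
    """Toy run-length encoder: (count, value) byte pairs, runs capped at 255."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        out += bytes((run, data[i]))
        i += run
    return bytes(out)


def tiered_compress(data, encoders, max_ratio=0.5):
    """Try each encoder in hierarchy order; return the first success.

    Assumed metric: output size is at most max_ratio times the input size.
    """
    for name, encode in encoders:
        out = encode(data)
        if len(out) <= max_ratio * len(data):
            return name, out      # success: later encoders are not invoked
    return "raw", data            # no encoder met the metric


ENCODERS = [("rle", rle), ("zlib", zlib.compress)]

print(tiered_compress(b"\x00" * 64, ENCODERS)[0])          # rle
print(tiered_compress(bytes(range(16)) * 16, ENCODERS)[0])  # zlib
```
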
