Patents by Inventor Alok Garg
Alok Garg has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240111674
Abstract: Data reuse cache techniques are described. In one example, a load instruction is generated by an execution unit of a processor unit. In response to the load instruction, data is loaded by a load-store unit for processing by the execution unit and is also stored to a data reuse cache communicatively coupled between the load-store unit and the execution unit. Upon receipt of a subsequent load instruction for the data from the execution unit, the data is loaded from the data reuse cache for processing by the execution unit.
Type: Application
Filed: September 29, 2022
Publication date: April 4, 2024
Applicant: Advanced Micro Devices, Inc.
Inventors: Alok Garg, Neil N Marketkar, Matthew T. Sobel
-
Publication number: 20230305849
Abstract: Array of pointers prefetching is described. In accordance with described techniques, a pointer target instruction is detected by identifying that a destination location of a load instruction is used in an address compute for a memory operation and the load instruction is included in a sequence of load instructions having addresses separated by a step size. An instruction for fetching data of a future load instruction is injected in an instruction stream of a processor. The data of the future load instruction is stored in a temporary register. An additional instruction is injected in the instruction stream for prefetching a pointer target based on an address of the memory operation and the data of the future load instruction.
Type: Application
Filed: March 25, 2022
Publication date: September 28, 2023
Applicant: Advanced Micro Devices, Inc.
Inventors: Chetana N. Keltcher, Alok Garg, Paul S. Keltcher
-
Publication number: 20230297381
Abstract: Load dependent branch prediction is described. In accordance with described techniques, a load dependent branch instruction is detected by identifying that a destination location of a load instruction is used in an operation for determining whether a conditional branch is taken or not taken. The load instruction is included in a sequence of load instructions having addresses separated by a step size. An instruction is injected in an instruction stream of a processor for fetching data of a future load instruction using an address of the load instruction offset by a distance based on the step size. An additional instruction is injected in the instruction stream of the processor for precomputing an outcome of a load dependent branch using an address computed based on an address of the operation and the data of the future load instruction.
Type: Application
Filed: March 21, 2022
Publication date: September 21, 2023
Applicant: Advanced Micro Devices, Inc.
Inventors: Chetana N. Keltcher, Alok Garg, Paul S Keltcher
-
Publication number: 20230205872
Abstract: A method includes receiving an indication that a number of activations of a memory structure exceeds a threshold number of activations for a time period, and in response to the indication, throttling instruction execution for a thread issuing the activations.
Type: Application
Filed: December 23, 2021
Publication date: June 29, 2023
Inventors: Jagadish B. Kotra, Onur Kayiran, John Kalamatianos, Alok Garg
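The throttling idea in this abstract can be illustrated with a minimal sketch. It is not taken from the application: the class name, the per-thread counting, and the fixed-length window measured in cycles are all assumptions made for illustration.

```python
from collections import defaultdict


class ActivationMonitor:
    """Hypothetical model: count activations per thread within a time window."""

    def __init__(self, threshold, window):
        self.threshold = threshold      # assumed: max activations allowed per window
        self.window = window            # assumed: window length in cycles
        self.window_start = 0
        self.counts = defaultdict(int)
        self.throttled = set()

    def record(self, thread_id, cycle):
        """Record one activation; return True if the thread should be throttled."""
        if cycle - self.window_start >= self.window:
            # A new time period begins: align the window start and reset state.
            self.window_start = cycle - (cycle - self.window_start) % self.window
            self.counts.clear()
            self.throttled.clear()
        self.counts[thread_id] += 1
        if self.counts[thread_id] > self.threshold:
            # The count exceeds the threshold for this period.
            self.throttled.add(thread_id)
        return thread_id in self.throttled
```

In this sketch a thread stays throttled until the next window starts; how long throttling actually lasts is not stated in the abstract.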
-
Patent number: 11676073
Abstract: Methods, apparatus, systems, and articles of manufacture are disclosed that analyze performance of manufacturer independent devices. An example apparatus includes a software development kit (SDK) deployment engine to deploy an SDK to a manufacturer of a device, the SDK to define heartbeat data to be collected from the device and interfacing techniques to transmit the heartbeat data to a measurement entity. In some examples, the apparatus includes a machine learning engine to predict whether the device is associated with one or more failure modes. The example apparatus also includes an alert generator to generate an alert based on a prediction, the alert to indicate at least one of a type of a first one of the failure modes or at least one component of the device to be remedied according to the first one of the one or more failure modes, and transmit the alert to a management agent.
Type: Grant
Filed: July 12, 2021
Date of Patent: June 13, 2023
Assignee: The Nielsen Company (US), LLC
Inventors: John T. Livoti, Susan Cimino, Stanley Wellington Woodruff, Rajakumar Madhanganesh, Alok Garg
-
Publication number: 20220318017
Abstract: Techniques are described for a hardware processor to dynamically configure a component that improves a processor function with a configuration setting based on invariant statistics. The invariant statistics are generated by execution of the instructions from one or more applications and are independent of the performance metrics of the processor function for the execution. In an embodiment, the configuration setting for the component is generated using a machine learning model.
Type: Application
Filed: March 30, 2021
Publication date: October 6, 2022
Inventors: ALOK GARG, PAUL KELTCHER, MAYANK CHHABLANI, FURKAN ERIS
-
Patent number: 11455252
Abstract: Techniques for generating a model for predicting when different hybrid prefetcher configurations should be used are disclosed. Techniques for using the model to predict when different hybrid prefetcher configurations should be used are also disclosed. The techniques for generating the model include obtaining a set of input data, and generating trees based on the training data. Each tree is associated with a different hybrid prefetcher configuration and the trees output certainty scores for the associated hybrid prefetcher configuration based on hardware feature measurements. To decide on a hybrid prefetcher configuration to use, a prefetcher traverses multiple trees to obtain certainty scores for different hybrid prefetcher configurations and identifies a hybrid prefetcher configuration to use based on a comparison of the certainty scores.
Type: Grant
Filed: June 26, 2019
Date of Patent: September 27, 2022
Assignee: Advanced Micro Devices, Inc.
Inventors: John Kalamatianos, Paul S. Keltcher, Mayank Chhablani, Alok Garg, Furkan Eris
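As a rough illustration of the selection step described in this abstract, the sketch below traverses one small decision tree per configuration and compares the resulting certainty scores. The tree encoding, the feature name, the configuration names, and the rule of picking the highest score are assumptions, not details from the patent.

```python
def traverse(tree, features):
    """Walk a tree until a leaf (a float certainty score) is reached.

    Assumed encoding: an internal node is a tuple
    (feature_name, threshold, subtree_if_at_or_below, subtree_if_above).
    """
    node = tree
    while isinstance(node, tuple):
        name, threshold, at_or_below, above = node
        node = at_or_below if features[name] <= threshold else above
    return node


def pick_configuration(trees, features):
    """Score every configuration and return (best_configuration, all_scores)."""
    scores = {config: traverse(tree, features) for config, tree in trees.items()}
    # Assumption: the comparison simply selects the highest certainty score.
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical trees keyed by hypothetical configuration names.
trees = {
    "stride_only": ("miss_rate", 0.5, 0.9, 0.2),
    "stride_plus_stream": ("miss_rate", 0.5, 0.1, 0.8),
}
best, scores = pick_configuration(trees, {"miss_rate": 0.7})
```

With the sample measurement above, `best` is `"stride_plus_stream"` because its tree yields 0.8 against 0.2 for the other configuration.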
-
Publication number: 20220206798
Abstract: Systems, apparatuses, and methods for implementing scheduler queue assignment logic are disclosed. A processor includes at least a decode unit, scheduler queue assignment logic, scheduler queues, pickers, and execution units. The assignment logic receives a plurality of operations from a decode unit in each clock cycle. The assignment logic includes a separate logical unit for each different type of operation which is executable by the different execution units of the processor. For each different type of operation, the assignment logic determines which of the possible assignment permutations are valid for assigning different numbers of operations to scheduler queues in a given clock cycle. The assignment logic receives an indication of how many operations to assign in the given clock cycle, and then the assignment logic selects one of the valid assignment permutations for the number of operations specified by the indication.
Type: Application
Filed: March 18, 2022
Publication date: June 30, 2022
Inventors: Matthew T. Sobel, Donald A. Priore, Alok Garg
-
Publication number: 20220197809
Abstract: Techniques for identifying a hardware configuration for operation are disclosed. The techniques include applying feature measurements to a trained model; obtaining output values from the trained model, the output values corresponding to different hardware configurations; and operating according to the output values, wherein the output values include one of a certainty score, a ranking, or a regression value.
Type: Application
Filed: December 23, 2020
Publication date: June 23, 2022
Applicant: Advanced Micro Devices, Inc.
Inventors: Furkan Eris, Paul S. Keltcher, John Kalamatianos, Mayank Chhablani, Alok Garg
-
Patent number: 11334384
Abstract: Systems, apparatuses, and methods for implementing scheduler queue assignment burst mode are disclosed. A scheduler queue assignment unit receives a dispatch packet with a plurality of operations from a decode unit in each clock cycle. The scheduler queue assignment unit determines if the number of operations in the dispatch packet for any class of operations is greater than a corresponding threshold for dispatching to the scheduler queues in a single cycle. If the number of operations for a given class is greater than the corresponding threshold, and if a burst mode counter is less than a burst mode window threshold, the scheduler queue assignment unit dispatches the extra number of operations for the given class in a single cycle. By operating in burst mode for a given operation class during a small number of cycles, processor throughput can be increased without starving the processor of other operation classes.
Type: Grant
Filed: December 10, 2019
Date of Patent: May 17, 2022
Assignee: Advanced Micro Devices, Inc.
Inventors: Alok Garg, Scott Andrew McLelland, Marius Evers, Matthew T. Sobel
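The burst-mode condition in this abstract can be sketched in software as follows. This is only an illustrative model: the class and method names are hypothetical, and the rule that the burst counter resets once a class no longer exceeds its threshold is an assumption the abstract does not state.

```python
class BurstDispatcher:
    """Hypothetical model of per-class dispatch with a bounded burst mode."""

    def __init__(self, class_threshold, burst_window):
        # class_threshold: normal per-cycle dispatch limit for each operation class.
        self.class_threshold = dict(class_threshold)
        # burst_window: how many cycles a class may exceed its limit.
        self.burst_window = burst_window
        self.burst_counter = {cls: 0 for cls in class_threshold}

    def dispatch(self, packet):
        """packet maps operation class -> ops requested this cycle.

        Returns a dict mapping operation class -> ops dispatched this cycle.
        """
        dispatched = {}
        for cls, requested in packet.items():
            limit = self.class_threshold[cls]
            if requested > limit and self.burst_counter[cls] < self.burst_window:
                # Burst mode: dispatch the extra operations in this cycle.
                dispatched[cls] = requested
                self.burst_counter[cls] += 1
            else:
                dispatched[cls] = min(requested, limit)
                if requested <= limit:
                    # Assumed reset policy, not stated in the abstract.
                    self.burst_counter[cls] = 0
        return dispatched
```

Bounding the burst with a counter is what keeps one operation class from monopolizing dispatch bandwidth for long stretches, which matches the stated goal of not starving other classes.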
-
Patent number: 11294678
Abstract: Systems, apparatuses, and methods for implementing scheduler queue assignment logic are disclosed. A processor includes at least a decode unit, scheduler queue assignment logic, scheduler queues, pickers, and execution units. The assignment logic receives a plurality of operations from a decode unit in each clock cycle. The assignment logic includes a separate logical unit for each different type of operation which is executable by the different execution units of the processor. For each different type of operation, the assignment logic determines which of the possible assignment permutations are valid for assigning different numbers of operations to scheduler queues in a given clock cycle. The assignment logic receives an indication of how many operations to assign in the given clock cycle, and then the assignment logic selects one of the valid assignment permutations for the number of operations specified by the indication.
Type: Grant
Filed: May 29, 2018
Date of Patent: April 5, 2022
Assignee: Advanced Micro Devices, Inc.
Inventors: Matthew T. Sobel, Donald A. Priore, Alok Garg
-
Publication number: 20210342746
Abstract: Methods, apparatus, systems, and articles of manufacture are disclosed that analyze performance of manufacturer independent devices. An example apparatus includes a software development kit (SDK) deployment engine to deploy an SDK to a manufacturer of a device, the SDK to define heartbeat data to be collected from the device and interfacing techniques to transmit the heartbeat data to a measurement entity. In some examples, the apparatus includes a machine learning engine to predict whether the device is associated with one or more failure modes. The example apparatus also includes an alert generator to generate an alert based on a prediction, the alert to indicate at least one of a type of a first one of the failure modes or at least one component of the device to be remedied according to the first one of the one or more failure modes, and transmit the alert to a management agent.
Type: Application
Filed: July 12, 2021
Publication date: November 4, 2021
Inventors: John T. Livoti, Susan Cimino, Stanley Wellington Woodruff, Rajakumar Madhanganesh, Alok Garg
-
Patent number: 11062233
Abstract: Methods, apparatus, systems and articles of manufacture are disclosed that provide an apparatus to monitor watermark encoder operation, the apparatus comprising: a data collector to collect one or more types of heartbeat data from a watermark encoder, the heartbeat data including time varying data, the one or more types of the heartbeat data defined by a software development kit (SDK); a machine learning engine to process the heartbeat data to predict whether the watermark encoder is associated with respective ones of a plurality of failure modes; and an alert generator to, in response to the machine learning engine predicting the watermark encoder is associated with a first one of the failure modes: generate an alert indicating the at least one of the one or more components to be remedied according to the first one of the failure modes; and transmit the alert to a watermark encoder management agent.
Type: Grant
Filed: December 21, 2018
Date of Patent: July 13, 2021
Assignee: THE NIELSEN COMPANY (US), LLC
Inventors: John T. Livoti, Susan Cimino, Stanley Wellington Woodruff, Rajakumar Madhanganesh, Alok Garg
-
Publication number: 20210173702
Abstract: Systems, apparatuses, and methods for implementing scheduler queue assignment burst mode are disclosed. A scheduler queue assignment unit receives a dispatch packet with a plurality of operations from a decode unit in each clock cycle. The scheduler queue assignment unit determines if the number of operations in the dispatch packet for any class of operations is greater than a corresponding threshold for dispatching to the scheduler queues in a single cycle. If the number of operations for a given class is greater than the corresponding threshold, and if a burst mode counter is less than a burst mode window threshold, the scheduler queue assignment unit dispatches the extra number of operations for the given class in a single cycle. By operating in burst mode for a given operation class during a small number of cycles, processor throughput can be increased without starving the processor of other operation classes.
Type: Application
Filed: December 10, 2019
Publication date: June 10, 2021
Inventors: Alok Garg, Scott Andrew McLelland, Marius Evers, Matthew T. Sobel
-
Publication number: 20200409851
Abstract: Techniques for generating a model for predicting when different hybrid prefetcher configurations should be used are disclosed. Techniques for using the model to predict when different hybrid prefetcher configurations should be used are also disclosed. The techniques for generating the model include obtaining a set of input data, and generating trees based on the training data. Each tree is associated with a different hybrid prefetcher configuration and the trees output certainty scores for the associated hybrid prefetcher configuration based on hardware feature measurements. To decide on a hybrid prefetcher configuration to use, a prefetcher traverses multiple trees to obtain certainty scores for different hybrid prefetcher configurations and identifies a hybrid prefetcher configuration to use based on a comparison of the certainty scores.
Type: Application
Filed: June 26, 2019
Publication date: December 31, 2020
Applicant: Advanced Micro Devices, Inc.
Inventors: John Kalamatianos, Paul S. Keltcher, Mayank Chhablani, Alok Garg, Furkan Eris
-
Publication number: 20200201735
Abstract: Methods, apparatus, systems and articles of manufacture are disclosed that provide an apparatus to monitor watermark encoder operation, the apparatus comprising: a data collector to collect one or more types of heartbeat data from a watermark encoder, the heartbeat data including time varying data, the one or more types of the heartbeat data defined by a software development kit (SDK); a machine learning engine to process the heartbeat data to predict whether the watermark encoder is associated with respective ones of a plurality of failure modes; and an alert generator to, in response to the machine learning engine predicting the watermark encoder is associated with a first one of the failure modes: generate an alert indicating the at least one of the one or more components to be remedied according to the first one of the failure modes; and transmit the alert to a watermark encoder management agent.
Type: Application
Filed: December 21, 2018
Publication date: June 25, 2020
Inventors: John T. Livoti, Susan Cimino, Stanley Wellington Woodruff, Rajakumar Madhanganesh, Alok Garg
-
Publication number: 20190369991
Abstract: Systems, apparatuses, and methods for implementing scheduler queue assignment logic are disclosed. A processor includes at least a decode unit, scheduler queue assignment logic, scheduler queues, pickers, and execution units. The assignment logic receives a plurality of operations from a decode unit in each clock cycle. The assignment logic includes a separate logical unit for each different type of operation which is executable by the different execution units of the processor. For each different type of operation, the assignment logic determines which of the possible assignment permutations are valid for assigning different numbers of operations to scheduler queues in a given clock cycle. The assignment logic receives an indication of how many operations to assign in the given clock cycle, and then the assignment logic selects one of the valid assignment permutations for the number of operations specified by the indication.
Type: Application
Filed: May 29, 2018
Publication date: December 5, 2019
Inventors: Matthew T. Sobel, Donald A. Priore, Alok Garg
-
Patent number: 9916246
Abstract: A processing system includes a shadow tag memory, which stores a plurality of entries containing coherency information for the cachelines residing at the various levels of private caches. If a cache miss occurs at a private cache, or if coherency information for a cacheline requires updating, a probe is sent to the shadow tag memory maintained at the shared cache to determine whether the requested (or affected) cacheline is stored at another private cache. The probe includes a tag which can be divided into two or more portions. To more efficiently compare the probe tag to the shadow tag entries, the comparison is performed in multiple stages based on the portions of the probe tag.
Type: Grant
Filed: August 16, 2016
Date of Patent: March 13, 2018
Assignee: Advanced Micro Devices, Inc.
Inventors: Carson Donahue Henrion, Michael K. Ciraula, Gregg Donley, Alok Garg, Eric Busta
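The multi-stage comparison mentioned in this abstract might look like the sketch below in software terms. The two-way split into low and high bits, the 8-bit width of the first portion, and the function name are assumptions chosen for illustration; the patent only says the tag is divided into two or more portions.

```python
def staged_tag_match(probe_tag, shadow_tags, low_bits=8):
    """Return indices of shadow tag entries that fully match the probe tag.

    Stage 1 compares only the low portion of each tag; stage 2 compares the
    remaining high portion, and only for entries that survived stage 1.
    """
    mask = (1 << low_bits) - 1
    probe_low = probe_tag & mask
    probe_high = probe_tag >> low_bits

    # Stage 1: cheap partial comparison across all entries.
    candidates = [i for i, tag in enumerate(shadow_tags) if (tag & mask) == probe_low]

    # Stage 2: finish the comparison only for the stage-1 survivors.
    return [i for i in candidates if (shadow_tags[i] >> low_bits) == probe_high]


# Hypothetical shadow tag entries.
entries = [0x1234, 0x5634, 0x1235, 0x1234]
matches = staged_tag_match(0x1234, entries)
```

Here stage 1 keeps entries 0, 1, and 3 (low byte 0x34), and stage 2 narrows that to entries 0 and 3, so the second stage examines fewer entries than a single full-width comparison would.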
-
Publication number: 20180052770
Abstract: A processing system includes a shadow tag memory, which stores a plurality of entries containing coherency information for the cachelines residing at the various levels of private caches. If a cache miss occurs at a private cache, or if coherency information for a cacheline requires updating, a probe is sent to the shadow tag memory maintained at the shared cache to determine whether the requested (or affected) cacheline is stored at another private cache. The probe includes a tag which can be divided into two or more portions. To more efficiently compare the probe tag to the shadow tag entries, the comparison is performed in multiple stages based on the portions of the probe tag.
Type: Application
Filed: August 16, 2016
Publication date: February 22, 2018
Inventors: Carson Donahue Henrion, Michael K. Ciraula, Gregg Donley, Alok Garg, Eric Busta
-
Patent number: 9058277
Abstract: Methods and systems for prefetching data for a processor are provided. A system is configured for and a method includes selecting one of a first prefetching control logic and a second prefetching control logic of the processor as a candidate feature, capturing the performance metric of the processor over an inactive sample period when the candidate feature is inactive, capturing a performance metric of the processor over an active sample period when the candidate feature is active, comparing the performance metric of the processor for the active and inactive sample periods, and setting a status of the candidate feature as enabled when the performance metric in the active period indicates improvement over the performance metric in the inactive period, and as disabled when the performance metric in the inactive period indicates improvement over the performance metric in the active period.
Type: Grant
Filed: November 8, 2012
Date of Patent: June 16, 2015
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: Sharad Dilip Bade, Alok Garg, John Kalamatianos, Paul Keltcher, Marius Evers, Chitresh Narasimhaiah
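The sample-and-compare procedure in this abstract can be sketched as below. The function name, the callback used to capture the metric, and the assumption that a larger metric value means improvement are hypothetical; the abstract does not say what happens when the two periods tie, so the sketch returns None in that case.

```python
def evaluate_candidate(capture_metric):
    """Decide the status of a candidate prefetching feature.

    capture_metric(active) is a hypothetical callback that runs one sample
    period with the candidate feature active or inactive and returns a
    performance metric where larger is assumed to be better.
    """
    inactive_metric = capture_metric(False)   # inactive sample period
    active_metric = capture_metric(True)      # active sample period

    if active_metric > inactive_metric:
        return "enabled"
    if inactive_metric > active_metric:
        return "disabled"
    return None  # tie: status left unchanged (assumption)


# Hypothetical measurements standing in for hardware counters.
status = evaluate_candidate(lambda active: 1.2 if active else 1.0)
```

With the sample numbers above, `status` is `"enabled"` because the active period reports the higher metric.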