Patents by Inventor Mitesh R. Meswani
Mitesh R. Meswani has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11573724Abstract: A processing apparatus is provided that includes NVRAM and one or more processors configured to process a first set and a second set of instructions according to a hierarchical processing scope and process a scoped persistence barrier residing in the program after the first instruction set and before the second instruction set. The barrier includes an instruction to cause first data to persist in the NVRAM before second data persists in the NVRAM. The first data results from execution of each of the first set of instructions processed according to the one hierarchical processing scope. The second data results from execution of each of the second set of instructions processed according to the one hierarchical processing scope. The processing apparatus also includes a controller configured to cause the first data to persist in the NVRAM before the second data persists in the NVRAM based on the scoped persistence barrier.Type: GrantFiled: June 5, 2019Date of Patent: February 7, 2023Assignee: Advanced Micro Devices, Inc.Inventors: Arkaprava Basu, Mitesh R. Meswani, Dibakar Gope, Sooraj Puthoor
-
Patent number: 10713059Abstract: A compute unit configured to execute multiple threads in parallel is presented. The compute unit includes one or more single instruction multiple data (SIMD) units and a fetch and decode logic. The SIMD units have differing numbers of arithmetic logic units (ALUs), such that each SIMD unit can execute a different number of threads. The fetch and decode logic is in communication with each of the SIMD units, and is configured to assign the threads to the SIMD units for execution based on such differing numbers of ALUs.Type: GrantFiled: September 18, 2014Date of Patent: July 14, 2020Assignee: ADVANCED MICRO DEVICES, INC.Inventors: Joseph L. Greathouse, Mitesh R. Meswani, Sooraj Puthoor, Dmitri Yudanov, James M. O'Connor
-
Patent number: 10592279Abstract: A method and processing apparatus for accelerating program processing is provided that includes a plurality of processors configured to process a plurality of tasks of a program and a controller. The controller is configured to determine, from the plurality of tasks being processed by the plurality of processors, a task being processed on a first processor to be a lagging task causing a delay in execution of one or more other tasks of the plurality of tasks. The controller is further configured to provide the determined lagging task to a second processor to be executed by the second processor to accelerate execution of the lagging task.Type: GrantFiled: June 23, 2016Date of Patent: March 17, 2020Assignee: ADVANCED MICRO DEVICES, INC.Inventors: Arkaprava Basu, Dmitri Yudanov, David A. Roberts, Mitesh R. Meswani, Sergey Blagodurov
-
Patent number: 10503655Abstract: The described embodiments include a computing device that caches data acquired from a main memory in a high-bandwidth memory (HBM), the computing device including channels for accessing data stored in corresponding portions of the HBM. During operation, the computing device sets each of the channels so that data blocks stored in the corresponding portions of the HBM include corresponding numbers of cache lines. Based on records of accesses of cache lines in the HBM that were acquired from pages in the main memory, the computing device sets a data block size for each of the pages, the data block size being a number of cache lines. The computing device stores, in the HBM, data blocks acquired from each of the pages in the main memory using a channel having a data block size corresponding to the data block size for each of the pages.Type: GrantFiled: July 21, 2016Date of Patent: December 10, 2019Assignee: ADVANCED MICRO DEVICES, INC.Inventors: Mitesh R. Meswani, Jee Ho Ryoo
-
Publication number: 20190286362Abstract: A processing apparatus is provided that includes NVRAM and one or more processors configured to process a first set and a second set of instructions according to a hierarchical processing scope and process a scoped persistence barrier residing in the program after the first instruction set and before the second instruction set. The barrier includes an instruction to cause first data to persist in the NVRAM before second data persists in the NVRAM. The first data results from execution of each of the first set of instructions processed according to the one hierarchical processing scope. The second data results from execution of each of the second set of instructions processed according to the one hierarchical processing scope. The processing apparatus also includes a controller configured to cause the first data to persist in the NVRAM before the second data persists in the NVRAM based on the scoped persistence barrier.Type: ApplicationFiled: June 5, 2019Publication date: September 19, 2019Applicant: Advanced Micro Devices, Inc.Inventors: Arkaprava Basu, Mitesh R. Meswani, Dibakar Gope, Sooraj Puthoor
-
Patent number: 10365996Abstract: Techniques for selecting one of a plurality of heterogeneous memory units for placement of blocks of data (e.g., memory pages), based on both reliability and performance, are disclosed. A “cost” for each data block/memory unit combination is determined, based on the frequency of access of the data block, the latency of the memory unit, and, optionally, an architectural vulnerability factor (which represents the level of exposure of a particular memory data value to memory faults such as bit flips). A memory unit is selected for the data block for which the determined cost is the lowest, out of all memory units considered, and the data block is placed into that memory unit.Type: GrantFiled: October 21, 2016Date of Patent: July 30, 2019Assignee: Advanced Micro Devices, Inc.Inventors: Manish Gupta, David A. Roberts, Mitesh R. Meswani, Vilas Sridharan, Steven Raasch, Daniel I. Lowell
-
Patent number: 10324650Abstract: A processing apparatus is provided that includes NVRAM and one or more processors configured to process a first set and a second set of instructions according to a hierarchical processing scope and process a scoped persistence barrier residing in the program after the first instruction set and before the second instruction set. The barrier includes an instruction to cause first data to persist in the NVRAM before second data persists in the NVRAM. The first data results from execution of each of the first set of instructions processed according to the one hierarchical processing scope. The second data results from execution of each of the second set of instructions processed according to the one hierarchical processing scope. The processing apparatus also includes a controller configured to cause the first data to persist in the NVRAM before the second data persists in the NVRAM based on the scoped persistence barrier.Type: GrantFiled: September 23, 2016Date of Patent: June 18, 2019Assignee: Advanced Micro Devices, Inc.Inventors: Arkaprava Basu, Mitesh R. Meswani, Dibakar Gope, Sooraj Puthoor
-
Patent number: 10282292Abstract: Cluster manager functional blocks perform operations for migrating pages in portions in corresponding migration clusters. During operation, each cluster manager keeps an access record that includes information indicating accesses of pages in the portions in the corresponding migration cluster. Based on the access record and one or more migration policies, each cluster manager migrates pages between the portions in the corresponding migration cluster.Type: GrantFiled: October 17, 2016Date of Patent: May 7, 2019Assignee: ADVANCED MICRO DEVICES, INC.Inventors: Andreas Prodromou, Mitesh R. Meswani, Arkaprava Basu, Nuwan S. Jayasena, Gabriel H. Loh
-
Patent number: 10235290Abstract: Systems, apparatuses, and methods for sorting memory pages in a multi-level heterogeneous memory architecture. The system may classify pages into a first “hot” category or a second “cold” category. The system may attempt to place the “hot” pages into the memory level(s) closest to the systems' processor cores. The system may track parameters associated with each page, with the parameters including number of accesses, types of accesses, power consumed per access, temperature, wearability, and/or other parameters. Based on these parameters, the system may generate a score for each page. Then, the system may compare the score of each page to a threshold. If the score of a given page is greater than the threshold, the given page may be designated as “hot”. If the score of the given page is less than the threshold, the given page may be designated as “cold”.Type: GrantFiled: June 26, 2015Date of Patent: March 19, 2019Assignee: Advanced Micro Devices, Inc.Inventors: Sergey Blagodurov, Gabriel H. Loh, Mitesh R. Meswani
-
Patent number: 10089014Abstract: Systems, apparatuses, and methods for implementing a memory sampling based migrating page cache are disclosed. In one embodiment, a system includes one or more processors and a multi-level memory hierarchy. The system is configured to record metadata associated with a portion of memory access instructions executed by one or more processors in one or more sampling intervals. The system generates predictions on which memory pages will be accessed in a subsequent sampling interval based on the recorded metadata. The system migrates one or more memory pages to a first memory level from a second memory level responsive to predicting that the one or more memory pages will be accessed in the subsequent sampling interval. The system also adjusts a duration of the sampling interval based on the number of memory accesses or a number of page faults per interval.Type: GrantFiled: September 22, 2016Date of Patent: October 2, 2018Assignee: Advanced Micro Devices, Inc.Inventors: Ahsen Jamshed Uppal, Mitesh R. Meswani
-
Patent number: 9983655Abstract: A method and apparatus for performing inter-lane power management includes de-energizing one or more execution lanes upon a determination that the one or more execution lanes are to be predicated. Energy from the predicated execution lanes is redistributed to one or more active execution lanes.Type: GrantFiled: December 9, 2015Date of Patent: May 29, 2018Assignee: Advanced Micro Devices, Inc.Inventors: Mitesh R. Meswani, David A. Roberts, Dmitri Yudanov, Arkaprava Basu, Sergey Blagodurov
-
Publication number: 20180107598Abstract: Cluster manager functional blocks perform operations for migrating pages in portions in corresponding migration clusters. During operation, each cluster manager keeps an access record that includes information indicating accesses of pages in the portions in the corresponding migration cluster. Based on the access record and one or more migration policies, each cluster manager migrates pages between the portions in the corresponding migration cluster.Type: ApplicationFiled: October 17, 2016Publication date: April 19, 2018Inventors: Andreas Prodromou, Mitesh R. Meswani, Arkaprava Basu, Nuwan S. Jayasena, Gabriel H. Loh
-
Publication number: 20180088858Abstract: A processing apparatus is provided that includes NVRAM and one or more processors configured to process a first set and a second set of instructions according to a hierarchical processing scope and process a scoped persistence barrier residing in the program after the first instruction set and before the second instruction set. The barrier includes an instruction to cause first data to persist in the NVRAM before second data persists in the NVRAM. The first data results from execution of each of the first set of instructions processed according to the one hierarchical processing scope. The second data results from execution of each of the second set of instructions processed according to the one hierarchical processing scope. The processing apparatus also includes a controller configured to cause the first data to persist in the NVRAM before the second data persists in the NVRAM based on the scoped persistence barrier.Type: ApplicationFiled: September 23, 2016Publication date: March 29, 2018Applicant: Advanced Micro Devices, Inc.Inventors: Arkaprava Basu, Mitesh R. Meswani, Dibakar Gope, Sooraj Puthoor
-
Publication number: 20180081541Abstract: Systems, apparatuses, and methods for implementing a memory sampling based migrating page cache are disclosed. In one embodiment, a system includes one or more processors and a multi-level memory hierarchy. The system is configured to record metadata associated with a portion of memory access instructions executed by one or more processors in one or more sampling intervals. The system generates predictions on which memory pages will be accessed in a subsequent sampling interval based on the recorded metadata. The system migrates one or more memory pages to a first memory level from a second memory level responsive to predicting that the one or more memory pages will be accessed in the subsequent sampling interval. The system also adjusts a duration of the sampling interval based on the number of memory accesses or a number of page faults per interval.Type: ApplicationFiled: September 22, 2016Publication date: March 22, 2018Inventors: Ahsen Jamshed Uppal, Mitesh R. Meswani
-
Publication number: 20180024935Abstract: The described embodiments include a computing device that caches data acquired from a main memory in a high-bandwidth memory (HBM), the computing device including channels for accessing data stored in corresponding portions of the HBM. During operation, the computing device sets each of the channels so that data blocks stored in the corresponding portions of the HBM include corresponding numbers of cache lines. Based on records of accesses of cache lines in the HBM that were acquired from pages in the main memory, the computing device sets a data block size for each of the pages, the data block size being a number of cache lines. The computing device stores, in the HBM, data blocks acquired from each of the pages in the main memory using a channel having a data block size corresponding to the data block size for each of the pages.Type: ApplicationFiled: July 21, 2016Publication date: January 25, 2018Inventors: Mitesh R. Meswani, Jee Ho Ryoo
-
Publication number: 20170371720Abstract: A method and processing apparatus for accelerating program processing is provided that includes a plurality of processors configured to process a plurality of tasks of a program and a controller. The controller is configured to determine, from the plurality of tasks being processed by the plurality of processors, a task being processed on a first processor to be a lagging task causing a delay in execution of one or more other tasks of the plurality of tasks. The controller is further configured to provide the determined lagging task to a second processor to be executed by the second processor to accelerate execution of the lagging task.Type: ApplicationFiled: June 23, 2016Publication date: December 28, 2017Applicant: Advanced Micro Devices, Inc.Inventors: Arkaprava Basu, Dmitri Yudanov, David A. Roberts, Mitesh R. Meswani, Sergey Blagodurov
-
Publication number: 20170277441Abstract: Techniques for selecting one of a plurality of heterogeneous memory units for placement of blocks of data (e.g., memory pages), based on both reliability and performance, are disclosed. A “cost” for each data block/memory unit combination is determined, based on the frequency of access of the data block, the latency of the memory unit, and, optionally, an architectural vulnerability factor (which represents the level of exposure of a particular memory data value to memory faults such as bit flips). A memory unit is selected for the data block for which the determined cost is the lowest, out of all memory units considered, and the data block is placed into that memory unit.Type: ApplicationFiled: October 21, 2016Publication date: September 28, 2017Applicant: Advanced Micro Devices, Inc.Inventors: Manish Gupta, David A. Roberts, Mitesh R. Meswani, Vilas Sridharan, Steven Raasch, Daniel I. Lowell
-
Patent number: 9727241Abstract: A processor maintains a count of accesses to each memory page. When the accesses to a memory page exceed a threshold amount for that memory page, the processor sets an indicator for the page. Based on the indicators for the memory pages, the processor manages data at one or more levels of the processor's memory hierarchy.Type: GrantFiled: February 6, 2015Date of Patent: August 8, 2017Assignee: Advanced Micro Devices, Inc.Inventors: Gabriel H. Loh, David A. Roberts, Mitesh R. Meswani, Mark R. Nutter, John R. Slice, Prashant Nair, Michael Ignatowski
-
Publication number: 20170168546Abstract: A method and apparatus for performing inter-lane power management includes de-energizing one or more execution lanes upon a determination that the one or more execution lanes are to be predicated. Energy from the predicated execution lanes is redistributed to one or more active execution lanes.Type: ApplicationFiled: December 9, 2015Publication date: June 15, 2017Applicant: Advanced Micro Devices, Inc.Inventors: Mitesh R. Meswani, David A. Roberts, Dmitri Yudanov, Arkaprava Basu, Sergey Blagodurov
-
Publication number: 20170147228Abstract: A plurality of memory blocks are connected to a computation-enabled switch that provides data paths between the plurality of memory blocks. The computation-enabled switch performs one or more computations on data stored in one or more of the plurality of memory blocks during transfer of the data along one or more of the data paths between the plurality of memory blocks.Type: ApplicationFiled: November 25, 2015Publication date: May 25, 2017Inventors: Dmitri Yudanov, Sergey Blagodurov, David A. Roberts, Mitesh R. Meswani, Nuwan Jayasena, Michael Ignatowski