Patents by Inventor Arun Radhakrishnan
Arun Radhakrishnan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11899588
Abstract: A graphics processing unit (GPU) includes a table located in a memory of the GPU and a cache hierarchy. The table contains an address of inactive data in a cache hierarchy of the GPU in which the inactive data is associated with an intermediate render target. The cache hierarchy is responsive to an eviction event by discarding the inactive data from the cache hierarchy without performing a writeback to a system memory associated with the GPU based on the address of the inactive data being contained in the table. The cache hierarchy may obtain the address of the inactive data from the table, and the inactive data may be located in a last-level cache of the cache hierarchy. In one embodiment, the address of inactive data in a cache hierarchy of the GPU includes a range of addresses for the inactive data.
Type: Grant
Filed: February 12, 2021
Date of Patent: February 13, 2024
Inventors: Anshujit Sharma, Sushant Kondguli, Zhenhong Liu, Wilson Wai Lun Fung, Arun Radhakrishnan, Wayne Yamamoto
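The mechanism above can be sketched in a few lines of Python. This is an illustration of the general idea, not the patented hardware: a last-level cache consults a table of address ranges for inactive render-target data on eviction and simply drops a matching dirty line instead of writing it back. The class and method names are invented for the example.

```python
class LastLevelCache:
    """Toy last-level cache with a discard table of inactive address ranges."""

    def __init__(self, discard_ranges):
        # discard_ranges: list of (start, end) address ranges whose data is
        # inactive (e.g. a consumed intermediate render target) and need not
        # be written back to system memory
        self.discard_ranges = discard_ranges
        self.lines = {}          # address -> (data, dirty flag)
        self.system_memory = {}  # stand-in for DRAM

    def _is_discardable(self, addr):
        return any(start <= addr < end for start, end in self.discard_ranges)

    def write(self, addr, data):
        self.lines[addr] = (data, True)  # mark the line dirty

    def evict(self, addr):
        data, dirty = self.lines.pop(addr)
        if dirty and not self._is_discardable(addr):
            self.system_memory[addr] = data  # normal dirty writeback
        # otherwise the line is dropped silently, saving memory bandwidth
```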
-
Patent number: 11675782
Abstract: A disparate data source aggregation system and methods are provided which may pull or retrieve talent data or features from disparate data sources, automatically correlate the data across the different data sources, build a self-adjusting system database that captures the talent data from the disparate data sources, and let users search, query, and build model insights on the aggregated data of the system database without human intervention. A method for disparate data source aggregation may include: extracting a first feature set having a first extracted feature and a second feature set having a second extracted feature; determining if the first extracted feature of the first feature set matches the second extracted feature of the second feature set; and aggregating the first feature set with the second feature set if the first extracted feature of the first feature set matches the second extracted feature of the second feature set.
Type: Grant
Filed: December 20, 2021
Date of Patent: June 13, 2023
Assignee: SAP SE
Inventors: Stacy Chapman, Satish Sallakonda, Arun Radhakrishnan
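The claimed match-then-aggregate step can be sketched as follows. This is a minimal illustration under assumed data shapes, not the patented system: each source yields a feature set as a dict, and two sets are merged when a chosen key feature (here "email", an assumption for the example) matches in both.

```python
def aggregate(feature_set_a, feature_set_b, key="email"):
    """Merge two feature sets when their key feature matches; else None."""
    value_a, value_b = feature_set_a.get(key), feature_set_b.get(key)
    if value_a is not None and value_a == value_b:
        # matching key feature: aggregate the two records into one
        merged = dict(feature_set_a)
        merged.update(feature_set_b)
        return merged
    return None  # no match: the records stay separate
```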
-
Patent number: 11610281
Abstract: A method of processing a workload in a graphics processing unit (GPU) may include detecting a work item of the workload in the GPU, determining a cache policy for the work item, and operating at least a portion of a cache memory hierarchy in the GPU for at least a portion of the work item based on the cache policy. The work item may be detected based on information received from an application and/or monitoring one or more performance counters by a driver and/or hardware detection logic. The method may further include monitoring one or more performance counters, wherein the cache policy for the work item may be determined and/or changed based on the one or more performance counters. The cache policy for the work item may be selected based on a runtime learning model.
Type: Grant
Filed: January 11, 2021
Date of Patent: March 21, 2023
Inventors: Sushant Kondguli, Arun Radhakrishnan, Zachary D. Neyland, David C. Tannenbaum
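A toy version of the counter-driven policy selection might look like the function below. The counter names and policy labels are placeholders invented for this sketch, not the patent's terminology or a real driver API.

```python
def select_cache_policy(counters):
    """Pick a cache policy for a detected work item from counter readings.

    counters: assumed dict with a 'hit_rate' fraction (0..1) and an
    estimated 'working_set_kb'. The policy names are illustrative only.
    """
    if counters["hit_rate"] < 0.10:
        return "bypass"            # streaming accesses: don't pollute the cache
    if counters["working_set_kb"] > 4096:
        return "partial-allocate"  # working set exceeds cache: allocate selectively
    return "normal"                # default allocate-on-miss behavior
```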
-
Publication number: 20220114167
Abstract: A disparate data source aggregation system and methods are provided which may pull or retrieve talent data or features from disparate data sources, automatically correlate the data across the different data sources, build a self-adjusting system database that captures the talent data from the disparate data sources, and let users search, query, and build model insights on the aggregated data of the system database without human intervention. A method for disparate data source aggregation may include: extracting a first feature set having a first extracted feature and a second feature set having a second extracted feature; determining if the first extracted feature of the first feature set matches the second extracted feature of the second feature set; and aggregating the first feature set with the second feature set if the first extracted feature of the first feature set matches the second extracted feature of the second feature set.
Type: Application
Filed: December 20, 2021
Publication date: April 14, 2022
Inventors: Stacy Chapman, Satish Sallakonda, Arun Radhakrishnan
-
Publication number: 20220083473
Abstract: A graphics processing unit (GPU) includes a table located in a memory of the GPU and a cache hierarchy. The table contains an address of inactive data in a cache hierarchy of the GPU in which the inactive data is associated with an intermediate render target. The cache hierarchy is responsive to an eviction event by discarding the inactive data from the cache hierarchy without performing a writeback to a system memory associated with the GPU based on the address of the inactive data being contained in the table. The cache hierarchy may obtain the address of the inactive data from the table, and the inactive data may be located in a last-level cache of the cache hierarchy. In one embodiment, the address of inactive data in a cache hierarchy of the GPU includes a range of addresses for the inactive data.
Type: Application
Filed: February 12, 2021
Publication date: March 17, 2022
Inventors: Anshujit Sharma, Sushant Kondguli, Zhenhong Liu, Wilson Wai Lun Fung, Arun Radhakrishnan, Wayne Yamamoto
-
Publication number: 20220067876
Abstract: A method of processing a workload in a graphics processing unit (GPU) may include detecting a work item of the workload in the GPU, determining a cache policy for the work item, and operating at least a portion of a cache memory hierarchy in the GPU for at least a portion of the work item based on the cache policy. The work item may be detected based on information received from an application and/or monitoring one or more performance counters by a driver and/or hardware detection logic. The method may further include monitoring one or more performance counters, wherein the cache policy for the work item may be determined and/or changed based on the one or more performance counters. The cache policy for the work item may be selected based on a runtime learning model.
Type: Application
Filed: January 11, 2021
Publication date: March 3, 2022
Inventors: Sushant Kondguli, Arun Radhakrishnan, Zachary D. Neyland, David C. Tannenbaum
-
Patent number: 11216452
Abstract: A disparate data source aggregation system and methods are provided which may pull or retrieve talent data or features from disparate data sources, automatically correlate the data across the different data sources, build a self-adjusting system database that captures the talent data from the disparate data sources, and let users search, query, and build model insights on the aggregated data of the system database without human intervention. A method for disparate data source aggregation may include: extracting a first feature set having a first extracted feature and a second feature set having a second extracted feature; determining if the first extracted feature of the first feature set matches the second extracted feature of the second feature set; and aggregating the first feature set with the second feature set if the first extracted feature of the first feature set matches the second extracted feature of the second feature set.
Type: Grant
Filed: October 27, 2018
Date of Patent: January 4, 2022
Assignee: SAP SE
Inventors: Stacy Chapman, Satish Sallakonda, Arun Radhakrishnan
-
Patent number: 10540287
Abstract: Apparatuses and methods of manufacturing same, systems, and methods for a spatial memory streaming (SMS) prefetch engine are described. In one aspect, the SMS prefetch engine includes a pattern history table (PHT), which has a table in which each entry has an offset list field comprising sub-fields for offset values from a base offset value within a region and a per-offset confidence field comprising sub-fields for per-offset confidence levels corresponding to each offset value. When a PHT entry is activated, the per-offset confidence values corresponding to each offset value in the offset list field of the PHT entry are updated by matching current accesses to the stored offset values in the offset list field of the activated PHT entry. Continuous learning may be provided to the SMS engine at least by the per-offset confidence levels.
Type: Grant
Filed: August 30, 2017
Date of Patent: January 21, 2020
Assignee: Samsung Electronics Co., Ltd.
Inventors: Edward A. Brekelbaum, Arun Radhakrishnan
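The per-offset confidence update can be sketched as a small saturating-counter model. This is an illustrative simplification of the described PHT entry, with invented names and arbitrary counter widths, not the hardware design: each stored offset keeps a confidence counter that is bumped when a current access matches it and decayed when it does not, and only offsets above a threshold drive prefetches.

```python
class PHTEntry:
    """Toy pattern-history-table entry with per-offset confidence counters."""

    def __init__(self, offsets, max_conf=3):
        self.offsets = list(offsets)       # offsets from the region's base
        self.confidence = {off: 1 for off in self.offsets}
        self.max_conf = max_conf           # saturating counter ceiling

    def update(self, observed_offsets):
        # On activation, match current accesses against the stored offsets
        for off in self.offsets:
            if off in observed_offsets:
                self.confidence[off] = min(self.confidence[off] + 1, self.max_conf)
            else:
                self.confidence[off] = max(self.confidence[off] - 1, 0)

    def prefetch_offsets(self, threshold=2):
        # Only confidently-learned offsets are worth prefetching
        return [o for o in self.offsets if self.confidence[o] >= threshold]
```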
-
Patent number: 10417130
Abstract: Apparatuses, systems, methods for a spatial memory streaming (SMS) prefetch engine are described. In one aspect, an SMS prefetch engine uses trigger-to-trigger stride detection to promote training table entries to pattern history table (PHT) entries and to drive spatially related prefetches in more distant regions. In another aspect, an SMS prefetch engine maintains a blacklist of program counter (PC) values to not use as trigger values. In yet another aspect, an SMS prefetch engine uses hashed values of certain fields, such as the trigger PC, in entries of, e.g., filter tables, training tables, and PHTs, as index values for the table.
Type: Grant
Filed: October 10, 2017
Date of Patent: September 17, 2019
Assignee: Samsung Electronics Co., Ltd.
Inventors: Edward A. Brekelbaum, Arun Radhakrishnan
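Two of the aspects above lend themselves to short sketches: detecting a repeating stride between successive trigger addresses (so the next, more distant region can be prefetched) and folding a trigger PC into a small table index. Both snippets are illustrations with invented names and an arbitrary hash, not the patented design.

```python
def hashed_index(pc, bits=8):
    """Fold a trigger PC into a small table index (illustrative hash only)."""
    return (pc ^ (pc >> bits)) & ((1 << bits) - 1)

class TriggerStrideDetector:
    """Detect a repeating stride between successive trigger addresses."""

    def __init__(self):
        self.last_trigger = None
        self.last_stride = None
        self.confirmed = False

    def observe(self, trigger_addr):
        if self.last_trigger is not None:
            stride = trigger_addr - self.last_trigger
            # Stride is confirmed once it repeats trigger-to-trigger
            self.confirmed = (stride == self.last_stride)
            self.last_stride = stride
        self.last_trigger = trigger_addr

    def predicted_next_region(self):
        # With a confirmed stride, prefetch into the more distant region
        if self.confirmed:
            return self.last_trigger + self.last_stride
        return None
```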
-
Publication number: 20190129996
Abstract: A disparate data source aggregation system and methods are provided which may pull or retrieve talent data or features from disparate data sources, automatically correlate the data across the different data sources, build a self-adjusting system database that captures the talent data from the disparate data sources, and let users search, query, and build model insights on the aggregated data of the system database without human intervention. A method for disparate data source aggregation may include: extracting a first feature set having a first extracted feature and a second feature set having a second extracted feature; determining if the first extracted feature of the first feature set matches the second extracted feature of the second feature set; and aggregating the first feature set with the second feature set if the first extracted feature of the first feature set matches the second extracted feature of the second feature set.
Type: Application
Filed: October 27, 2018
Publication date: May 2, 2019
Applicant: Tahlent Inc DBA SwoopTalent
Inventors: Stacy Chapman, Satish Sallakonda, Arun Radhakrishnan
-
Publication number: 20180329823
Abstract: Apparatuses, systems, methods for a spatial memory streaming (SMS) prefetch engine are described. In one aspect, an SMS prefetch engine uses trigger-to-trigger stride detection to promote training table entries to pattern history table (PHT) entries and to drive spatially related prefetches in more distant regions. In another aspect, an SMS prefetch engine maintains a blacklist of program counter (PC) values to not use as trigger values. In yet another aspect, an SMS prefetch engine uses hashed values of certain fields, such as the trigger PC, in entries of, e.g., filter tables, training tables, and PHTs, as index values for the table.
Type: Application
Filed: October 10, 2017
Publication date: November 15, 2018
Inventors: Edward A. Brekelbaum, Arun Radhakrishnan
-
Publication number: 20180329822
Abstract: Apparatuses and methods of manufacturing same, systems, and methods for a spatial memory streaming (SMS) prefetch engine are described. In one aspect, the SMS prefetch engine includes a pattern history table (PHT), which has a table in which each entry has an offset list field comprising sub-fields for offset values from a base offset value within a region and a per-offset confidence field comprising sub-fields for per-offset confidence levels corresponding to each offset value. When a PHT entry is activated, the per-offset confidence values corresponding to each offset value in the offset list field of the PHT entry are updated by matching current accesses to the stored offset values in the offset list field of the activated PHT entry. Continuous learning may be provided to the SMS engine at least by the per-offset confidence levels.
Type: Application
Filed: August 30, 2017
Publication date: November 15, 2018
Inventors: Edward A. Brekelbaum, Arun Radhakrishnan
-
Patent number: 10031851
Abstract: A computing system includes: an instruction dispatch module configured to receive a program instruction; and an address reordering module, coupled to the instruction dispatch module, configured to filter the program instruction when the program instruction is a hit in a cache-line in a prefetch filter. The computing system further includes: an instruction dispatch module configured to receive a program instruction; an address reordering module, coupled to the instruction dispatch module, configured to: allocate a tag in a tag module for the program instruction in a program order, allocate a virtual address in a virtual address module for the program instruction and out-of-order relative to the program order, and insert a pointer associated with the tag to link the tag to the virtual address.
Type: Grant
Filed: January 9, 2017
Date of Patent: July 24, 2018
Assignee: Samsung Electronics Co., Ltd.
Inventors: Karthik Sundaram, Arun Radhakrishnan
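The tag/virtual-address split can be sketched as follows: tags are allocated strictly in program order, virtual addresses arrive out of order as they resolve, and a per-tag link lets program order be recovered. This is a minimal behavioral sketch with invented names, not the claimed hardware modules.

```python
class AddressReorderer:
    """Tags allocate in program order; virtual addresses fill in out of order."""

    def __init__(self):
        self.tags = []     # tag allocation preserves program order
        self.va_of = {}    # tag -> virtual address (the linking "pointer")

    def dispatch(self, tag):
        # Called in program order as instructions dispatch
        self.tags.append(tag)

    def resolve(self, tag, vaddr):
        # Virtual addresses may resolve in any order
        self.va_of[tag] = vaddr

    def ordered_addresses(self):
        # Recover program order by walking tags and following the links
        return [self.va_of[t] for t in self.tags if t in self.va_of]
```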
-
Patent number: 9665491
Abstract: According to one general aspect, a method may include monitoring a plurality of pre-fetch cache requests associated with a data stream. The method may also include evaluating an accuracy of the pre-fetch cache requests. The method may further include, based at least in part upon the accuracy of the pre-fetch cache requests, adjusting a maximum amount of data that is allowably pre-fetched in excess of a data stream's current actual demand for data.
Type: Grant
Filed: June 3, 2015
Date of Patent: May 30, 2017
Assignee: Samsung Electronics Co., Ltd.
Inventors: Arun Radhakrishnan, Karthik Sundaram
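A throttling loop of this shape can be sketched briefly. The thresholds and the doubling/halving policy are arbitrary choices for the example, not values from the patent: the accuracy of past prefetches (how many were actually demanded) raises or lowers the cap on how far ahead of demand the prefetcher may run.

```python
class PrefetchThrottle:
    """Adjust the max prefetch-ahead distance based on observed accuracy."""

    def __init__(self, max_ahead=8):
        self.max_ahead = max_ahead  # cap on data prefetched beyond demand
        self.issued = 0
        self.useful = 0

    def record(self, was_useful):
        # Track whether each issued prefetch was eventually demanded
        self.issued += 1
        self.useful += int(was_useful)

    def adjust(self):
        accuracy = self.useful / self.issued if self.issued else 0.0
        if accuracy > 0.75:
            self.max_ahead = min(self.max_ahead * 2, 64)   # accurate: run further ahead
        elif accuracy < 0.25:
            self.max_ahead = max(self.max_ahead // 2, 1)   # wasteful: pull back
        return self.max_ahead
```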
-
Publication number: 20170116128
Abstract: A computing system includes: an instruction dispatch module configured to receive a program instruction; and an address reordering module, coupled to the instruction dispatch module, configured to filter the program instruction when the program instruction is a hit in a cache-line in a prefetch filter. The computing system further includes: an instruction dispatch module configured to receive a program instruction; an address reordering module, coupled to the instruction dispatch module, configured to: allocate a tag in a tag module for the program instruction in a program order, allocate a virtual address in a virtual address module for the program instruction and out-of-order relative to the program order, and insert a pointer associated with the tag to link the tag to the virtual address.
Type: Application
Filed: January 9, 2017
Publication date: April 27, 2017
Inventors: Karthik Sundaram, Arun Radhakrishnan
-
Patent number: 9569361
Abstract: According to one general aspect, an apparatus may include a cache pre-fetcher, and a pre-fetch scheduler. The cache pre-fetcher may be configured to predict, based at least in part upon a virtual address, data to be retrieved from a memory system. The pre-fetch scheduler may be configured to convert the virtual address of the data to a physical address of the data, and request the data from one of a plurality of levels of the memory system. The memory system may include a plurality of levels, each level of the memory system configured to store data.
Type: Grant
Filed: July 7, 2014
Date of Patent: February 14, 2017
Assignee: Samsung Electronics Co., Ltd.
Inventors: Arun Radhakrishnan, Kevin Lepak, Rama Gopal, Murali Chinnakonda, Karthik Sundaram, Brian Grayson
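The predict-translate-schedule flow can be sketched in a few lines. The level-selection policy below (first level with spare request capacity) is an assumption made for the example; the patent does not specify it, and all names are invented.

```python
class MemoryLevel:
    """Toy memory-system level with a bounded prefetch request queue."""

    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity
        self.pending = []

    def has_capacity(self):
        return len(self.pending) < self.capacity

    def request(self, paddr):
        self.pending.append(paddr)

def schedule_prefetch(vaddr, translate, levels):
    """Convert a predicted virtual address to physical, then pick a level."""
    paddr = translate(vaddr)        # VA -> PA conversion
    for level in levels:            # assumed ordered nearest-first
        if level.has_capacity():
            level.request(paddr)
            return level.name
    return None                     # all levels busy: drop the prefetch
```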
-
Patent number: 9542323
Abstract: A computing system includes: an instruction dispatch module configured to receive a program instruction; an address reordering module, coupled to the instruction dispatch module, configured to filter the program instruction when the program instruction is a hit in a cache-line in a prefetch filter. The computing system further includes: an instruction dispatch module configured to receive a program instruction; an address reordering module, coupled to the instruction dispatch module, configured to: allocate a tag in a tag module for the program instruction in a program order, allocate a virtual address in a virtual address module for the program instruction out of order relative to the program order, and insert a pointer associated with the tag to link the tag to the virtual address.
Type: Grant
Filed: September 26, 2014
Date of Patent: January 10, 2017
Assignee: Samsung Electronics Co., Ltd.
Inventors: Karthik Sundaram, Arun Radhakrishnan
-
Publication number: 20160054997
Abstract: A computing system includes: an instruction dispatch module configured to receive an address stream; a prefetch module, coupled to the instruction dispatch module, configured to: train to concurrently detect a single-stride pattern or a multi-stride pattern from the address stream, speculatively fetch a program data based on the single-stride pattern or the multi-stride pattern, and continue to train for the single-stride pattern with a larger value for a stride count or for the multi-stride pattern.
Type: Application
Filed: August 21, 2015
Publication date: February 25, 2016
Inventors: Arun Radhakrishnan, Karthik Sundaram, Brian Grayson
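Concurrent single-stride and multi-stride detection can be sketched over an address stream's deltas. This toy detector checks a period-1 pattern first and then a period-2 pattern; it is a simplification with invented names, not the claimed training logic, and real engines track this state incrementally rather than over a stored history.

```python
class StrideDetector:
    """Detect a single-stride (period 1) or multi-stride (period 2+) pattern."""

    def __init__(self):
        self.history = []

    def observe(self, addr):
        self.history.append(addr)

    def detect(self, max_period=2):
        # Deltas between consecutive addresses in the stream
        deltas = [b - a for a, b in zip(self.history, self.history[1:])]
        for period in range(1, max_period + 1):
            pattern = deltas[:period]
            # Require the candidate pattern to repeat at least once in full
            if len(deltas) >= 2 * period and all(
                d == pattern[i % period] for i, d in enumerate(deltas)
            ):
                return pattern  # e.g. [4] single-stride, [4, 2] multi-stride
        return None
```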
-
Publication number: 20160019155
Abstract: According to one general aspect, a method may include monitoring a plurality of pre-fetch cache requests associated with a data stream. The method may also include evaluating an accuracy of the pre-fetch cache requests. The method may further include, based at least in part upon the accuracy of the pre-fetch cache requests, adjusting a maximum amount of data that is allowably pre-fetched in excess of a data stream's current actual demand for data.
Type: Application
Filed: June 3, 2015
Publication date: January 21, 2016
Inventors: Arun Radhakrishnan, Karthik Sundaram
-
Publication number: 20150278100
Abstract: A computing system includes: an instruction dispatch module configured to receive a program instruction; an address reordering module, coupled to the instruction dispatch module, configured to filter the program instruction when the program instruction is a hit in a cache-line in a prefetch filter. The computing system further includes: an instruction dispatch module configured to receive a program instruction; an address reordering module, coupled to the instruction dispatch module, configured to: allocate a tag in a tag module for the program instruction in a program order, allocate a virtual address in a virtual address module for the program instruction out of order relative to the program order, and insert a pointer associated with the tag to link the tag to the virtual address.
Type: Application
Filed: September 26, 2014
Publication date: October 1, 2015
Inventors: Karthik Sundaram, Arun Radhakrishnan