Patents by Inventor Madhavi G. Valluri
Madhavi G. Valluri has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 9298630Abstract: A computer processor collects information for a dominant data access loop and reference code patterns based on data reference pattern analysis, and for pointer aliasing and data shape based on pointer escape analysis. The computer processor selects a candidate array for data splitting wherein the candidate array is referenced by a dominant data access loop. The computer processor determines a data splitting mode by which to split the data of the candidate array, based on the reference code patterns, the pointer aliasing, and the data shape information, and splits the data into two or more split arrays. The computer processor creates a software cache that includes a portion of the data of the two or more split arrays in a transposed format, and maintains the portion of the transposed data within the software cache and consults the software cache during an access of the split arrays.Type: GrantFiled: June 13, 2014Date of Patent: March 29, 2016Assignee: GLOBALFOUNDRIES INC.Inventors: Christopher M. Barton, Shimin Cui, Satish K. Sadasivam, Raul E. Silvera, Madhavi G. Valluri, Steven W White
-
Patent number: 9229745Abstract: A computing device identifies a load instruction and store instruction pair that causes a load-hit-store conflict. A processor tags a first load instruction that instructs the processor to load a first data set from memory. The processor stores an address at which the first load instruction is located in memory in a special purpose register. The processor determines where the first load instruction has a load-hit-store conflict with a first store instruction. If the processor determines the first load instruction has a load-hit store conflict with the first store instruction, the processor stores an address at which the first data set is located in memory in a second special purpose register, tags the first data set being stored by the first store instruction, stores an address at which the first store instruction is located in memory in a third special purpose register and increases a conflict counter.Type: GrantFiled: September 12, 2012Date of Patent: January 5, 2016Assignee: International Business Machines CorporationInventors: Venkat R. Indukuru, Alexander E. Mericas, Satish K. Sadasivam, Madhavi G. Valluri
-
Patent number: 9229746Abstract: A computing device identifies a load instruction and store instruction pair that causes a load-hit-store conflict. A processor tags a first load instruction that instructs the processor to load a first data set from memory. The processor stores an address at which the first load instruction is located in memory in a special purpose register. The processor determines where the first load instruction has a load-hit-store conflict with a first store instruction. If the processor determines the first load instruction has a load-hit store conflict with the first store instruction, the processor stores an address at which the first data set is located in memory in a second special purpose register, tags the first data set being stored by the first store instruction, stores an address at which the first store instruction is located in memory in a third special purpose register and increases a conflict counter.Type: GrantFiled: December 18, 2013Date of Patent: January 5, 2016Assignee: International Business Machines CorporationInventors: Venkat R. Indukuru, Alexander E. Mericas, Satish K. Sadasivam, Madhavi G. Valluri
-
Patent number: 9032375Abstract: A computer program product for identifying bottlenecks includes a computer readable storage medium with stored computer readable program instructions. The computer readable program instructions, when executed, provide a data collector module, a mapper module, and an analyzer module that are collectively configured to read mapped data and configuration files, and identify, based upon the mapped data and the configuration files, an undesirable bottleneck condition that causes a computer program to run inefficiently. A method includes reading a configuration file that includes data regarding processor components, and collecting data from hardware activity counters based upon the configuration file.Type: GrantFiled: April 27, 2011Date of Patent: May 12, 2015Assignee: International Business Machines CorporationInventors: Prathiba Kumar, Rajan Ravindran, Satish K. Sadasivam, Madhavi G. Valluri
-
Publication number: 20150067268Abstract: A computer processor collects information for a dominant data access loop and reference code patterns based on data reference pattern analysis, and for pointer aliasing and data shape based on pointer escape analysis. The computer processor selects a candidate array for data splitting wherein the candidate array is referenced by a dominant data access loop. The computer processor determines a data splitting mode by which to split the data of the candidate array, based on the reference code patterns, the pointer aliasing, and the data shape information, and splits the data into two or more split arrays. The computer processor creates a software cache that includes a portion of the data of the two or more split arrays in a transposed format, and maintains the portion of the transposed data within the software cache and consults the software cache during an access of the split arrays.Type: ApplicationFiled: June 13, 2014Publication date: March 5, 2015Inventors: Christopher M. Barton, Shimin Cui, Satish K. Sadasivam, Raul E. Silvera, Madhavi G. Valluri, Steven W. White
-
Patent number: 8745607Abstract: According to one aspect of the present disclosure, a method and technique for reducing branch misprediction impact for nested loop code is disclosed. The method includes: responsive to identifying code having an outer loop and an inner loop, determining a quantity of iterations of the inner loop for an initial number of iterations of the outer loop; determining a number of processor cycles for executing the quantity of iterations of the inner loop for the initial number of iterations of the outer loop; determining whether the number of processor cycles is less than a threshold; and responsive to determining that the number of processor cycles is less than the threshold, fully unrolling the inner loop for the initial number of iterations of the outer loop.Type: GrantFiled: November 11, 2011Date of Patent: June 3, 2014Assignee: International Business Machines CorporationInventors: Madhavi G. Valluri, Steven W. White
-
Publication number: 20140108770Abstract: A computing device identifies a load instruction and store instruction pair that causes a load-hit-store conflict. A processor tags a first load instruction that instructs the processor to load a first data set from memory. The processor stores an address at which the first load instruction is located in memory in a special purpose register. The processor determines where the first load instruction has a load-hit-store conflict with a first store instruction. If the processor determines the first load instruction has a load-hit store conflict with the first store instruction, the processor stores an address at which the first data set is located in memory in a second special purpose register, tags the first data set being stored by the first store instruction, stores an address at which the first store instruction is located in memory in a third special purpose register and increases a conflict counter.Type: ApplicationFiled: December 18, 2013Publication date: April 17, 2014Inventors: Venkat R. Indukuru, Alexander E. Mericas, Satish K. Sadasivam, Madhavi G. Valluri
-
Publication number: 20140075158Abstract: A computing device identifies a load instruction and store instruction pair that causes a load-hit-store conflict. A processor tags a first load instruction that instructs the processor to load a first data set from memory. The processor stores an address at which the first load instruction is located in memory in a special purpose register. The processor determines where the first load instruction has a load-hit-store conflict with a first store instruction. If the processor determines the first load instruction has a load-hit store conflict with the first store instruction, the processor stores an address at which the first data set is located in memory in a second special purpose register, tags the first data set being stored by the first store instruction, stores an address at which the first store instruction is located in memory in a third special purpose register and increases a conflict counter.Type: ApplicationFiled: September 12, 2012Publication date: March 13, 2014Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Venkat R. Indukuru, Alexander E. Mericas, Satish K. Sadasivam, Madhavi G. Valluri
-
Publication number: 20130125104Abstract: According to one aspect of the present disclosure, a method and technique for reducing branch misprediction impact for nested loop code is disclosed. The method includes: responsive to identifying code having an outer loop and an inner loop, determining a quantity of iterations of the inner loop for an initial number of iterations of the outer loop; determining a number of processor cycles for executing the quantity of iterations of the inner loop for the initial number of iterations of the outer loop; determining whether the number of processor cycles is less than a threshold; and responsive to determining that the number of processor cycles is less than the threshold, fully unrolling the inner loop for the initial number of iterations of the outer loop.Type: ApplicationFiled: November 11, 2011Publication date: May 16, 2013Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Madhavi G. Valluri, Steven W. White
-
Publication number: 20120278594Abstract: A computer program product for identifying bottlenecks includes a computer readable storage medium with stored computer readable program instructions. The computer readable program instructions, when executed, provide a data collector module, a mapper module, and an analyzer module that are collectively configured to read mapped data and configuration files, and identify, based upon the mapped data and the configuration files, an undesirable bottleneck condition that causes a computer program to run inefficiently. A method includes reading a configuration file that includes data regarding processor components, and collecting data from hardware activity counters based upon the configuration file.Type: ApplicationFiled: April 27, 2011Publication date: November 1, 2012Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Prathiba Kumar, Rajan Ravindran, Satish K. Sadasivam, Madhavi G. Valluri
-
Patent number: 8091073Abstract: A method, system, and computer program product are provided for identifying instructions to obtain representative traces. A phase instruction budget is calculated for each phase in a set of phases. The phase instruction budget is based on a weight associated with each phase and a global instruction budget. A starting index and an ending index are identified for instructions within a set of intervals in each phase in order to meet the phase instruction budget for that phase, thereby forming a set of interval indices. A determination is made as to whether the instructions within the set of interval indices meet the global instruction budget. Responsive to the global instruction budget being met, the set of interval indices are output as collection points for the representative traces.Type: GrantFiled: June 5, 2007Date of Patent: January 3, 2012Assignee: International Business Machines CorporationInventors: Robert H. Bell, Jr., Wen-Tzer T. Chen, Richard J. Eickemeyer, Venkat R. Indukuru, Pattabi M. Seshadri, Madhavi G. Valluri
-
Patent number: 8010334Abstract: A test system or simulator includes an integrated circuit (IC) benchmark software program that executes workload program software on a semiconductor die IC design model. The benchmark software program includes trace, simulation point, basic block vector (BBV) generation, cycles per instruction (CPI) error, clustering and other programs. The test system also includes CPI stack program software that generates CPI stack data that includes microarchitecture dependent information for each instruction interval of workload program software. The CPI stack data may also include an overall analysis of CPI data for the entire workload program. IC designers may utilize the benchmark software and CPI stack program to develop a reduced representative workload program that includes CPI data as well as microarchitecture dependent information.Type: GrantFiled: April 30, 2008Date of Patent: August 30, 2011Assignee: International Business Machines CorporationInventors: Robert H Bell, Thomas W Chen, Jr., Venkat R Indukuru, Alex E Mericas, Pattabi M Seshadri, Madhavi G Valluri
-
Patent number: 7844928Abstract: A test system or simulator includes an IC benchmark software program that executes application software on a semiconductor die IC design model. The benchmark software includes trace, simulation point, clustering and other programs. IC designers utilize the benchmark software to evaluate the performance characteristics of IC designs with customer user software applications. The benchmark software generates basic block vectors BBVs from instruction traces of application software. The benchmark software analyzes data dependent information that it appends to BBVs to create enhanced BBVs or EBBVs. The benchmark software may graph the EBBV information in a cluster diagram and selects a subset of EBBVs as a representative sample for each program phase. Benchmarking software generates a reduced application software program from the representative EBBV samples.Type: GrantFiled: January 11, 2008Date of Patent: November 30, 2010Assignee: International Business Machines CorporationInventors: Robert H. Bell, Thomas W. Chen, Richard J. Eickemeyer, Venkat R. Indukuru, Pattabi M. Seshadri, Madhavi G. Valluri
-
Patent number: 7770140Abstract: A test system or simulator includes an IC test application sampling software program that executes test application software on a semiconductor die IC design model. The test application sampling software includes trace, simulation point, CPI error, clustering and other programs. IC designers utilize the test application sampling software to evaluate the performance characteristics of IC designs with test software applications. The test application sampling software generates basic block vectors (BBVs) and fly-by vectors (FBVs) from instruction trace analysis of test application software. The test application sampling software analyzes microarchitecture dependent information that it uses to generate the FBVs. Test application sampling software generates a reduced representative test application software program from the BBV and FBV data utilizing an instruction budgeting method.Type: GrantFiled: February 5, 2008Date of Patent: August 3, 2010Assignee: International Business Machines CorporationInventors: Robert H. Bell, Thomas W. Chen, Venkat R. Indukuru, Pattabi M. Seshadri, Madhavi G. Valluri
-
Publication number: 20090276191Abstract: A test system or simulator includes an enhanced IC test application sampling software program that executes test application software on a semiconductor die IC design model. The enhanced test application sampling software may include trace, simulation point, CPI error, clustering, instruction budgeting, and other programs. The enhanced test application sampling software generates basic block vectors (BBVs) and fly-by vectors (FBVs) from instruction trace analysis of test application software workloads. The enhanced test application sampling software utilizes the microarchitecture dependent information to generate the FBVs to select representative instruction intervals from the test application software. The enhanced test application sampling software generates a reduced representative test application software program from the BBV and FBV data utilizing a global instruction budgeting analysis method.Type: ApplicationFiled: April 30, 2008Publication date: November 5, 2009Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Robert H. Bell, JR., Wen-Tzer Thomas Chen, Venkat R. Indukuru, Pattabi M. Seshadri, Madhavi G. Valluri
-
Publication number: 20090276190Abstract: A test system or simulator includes an integrated circuit (IC) benchmark software program that executes workload program software on a semiconductor die IC design model. The benchmark software program includes trace, simulation point, basic block vector (BBV) generation, cycles per instruction (CPI) error, clustering and other programs. The test system also includes CPI stack program software that generates CPI stack data that includes microarchitecture dependent information for each instruction interval of workload program software. The CPI stack data may also include an overall analysis of CPI data for the entire workload program. IC designers may utilize the benchmark software and CPI stack program to develop a reduced representative workload program that includes CPI data as well as microarchitecture dependent information.Type: ApplicationFiled: April 30, 2008Publication date: November 5, 2009Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Robert H. Bell, JR., Thomas W. Chen, Venkat R. Indukuru, Alexander E. Mericas, Pattabi M. Seshadri, Madhavi G. Valluri
-
Publication number: 20090199138Abstract: A test system or simulator includes an IC test application sampling software program that executes test application software on a semiconductor die IC design model. The test application sampling software includes trace, simulation point, CPI error, clustering and other programs. IC designers utilize the test application sampling software to evaluate the performance characteristics of IC designs with test software applications. The test application sampling software generates basic block vectors (BBVs) and fly-by vectors (FBVs) from instruction trace analysis of test application software. The test application sampling software analyzes microarchitecture dependent information that it uses to generate the FBVs. Test application sampling software generates a reduced representative test application software program from the BBV and FBV data utilizing an instruction budgeting method.Type: ApplicationFiled: February 5, 2008Publication date: August 6, 2009Applicant: IBM CorporationInventors: Robert H. Bell, Thomas W. Chen, Venkat R. Indukuru, Pattabi M. Seshadri, Madhavi G. Valluri
-
Publication number: 20090183127Abstract: A test system or simulator includes an IC benchmark software program that executes application software on a semiconductor die IC design model. The benchmark software includes trace, simulation point, clustering and other programs. IC designers utilize the benchmark software to evaluate the performance characteristics of IC designs with customer user software applications. The benchmark software generates basic block vectors BBVs from instruction traces of application software. The benchmark software analyzes data dependent information that it appends to BBVs to create enhanced BBVs or EBBVs. The benchmark software may graph the EBBV information in a cluster diagram and selects a subset of EBBVs as a representative sample for each program phase. Benchmarking software generates a reduced application software program from the representative EBBV samples.Type: ApplicationFiled: January 11, 2008Publication date: July 16, 2009Applicant: IBM CorporationInventors: Robert H. Bell, Thomas W. Chen, Richard J. Eickemeyer, Venkat R. Indukuru, Pattabi M. Seshadri, Madhavi G. Valluri
-
Publication number: 20080307203Abstract: A method, system, and computer program product are provided for identifying instructions to obtain representative traces. A phase instruction budget is calculated for each phase in a set of phases. The phase instruction budget is based on a weight associated with each phase and a global instruction budget. A starting index and an ending index are identified for instructions within a set of intervals in each phase in order to meet the phase instruction budget for that phase, thereby forming a set of interval indices. A determination is made as to whether the instructions within the set of interval indices meet the global instruction budget. Responsive to the global instruction budget being met, the set of interval indices are output as collection points for the representative traces.Type: ApplicationFiled: June 5, 2007Publication date: December 11, 2008Inventors: Robert H. Bell, JR., Wen-Tzer T. Chen, Richard J. Eickemeyer, Venkat R. Indukuru, Pattabi M. Seshadri, Madhavi G. Valluri