Patents by Inventor Geoff Lowney
Geoff Lowney has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20210326504Abstract: Methods, apparatus, systems and articles of manufacture are disclosed to improve FPGA pipeline emulation efficiency on CPUs. An example disclosed apparatus includes a loop detector to identify a register shift loop in field programmable gate array (FPGA) code, an unroller to shift and store pipeline stages in the register shift loop to a temporary unroll array, an intermediate canceller to cancel out intermediate load and store values of the temporary unroll array to retain last shifted values of the pipeline stages, and a propagator to improve emulation efficiency of the FPGA code by generating a scalar loop of the retained last shifted values for a vectorization input.Type: ApplicationFiled: December 26, 2020Publication date: October 21, 2021Inventors: Xinmin Tian, Geoff Lowney
-
Patent number: 10909287Abstract: Methods, apparatus, systems and articles of manufacture are disclosed to improve FPGA pipeline emulation efficiency on CPUs. An example disclosed apparatus includes a loop detector to identify a register shift loop in field programmable gate array (FPGA) code, an unroller to shift and store pipeline stages in the register shift loop to a temporary unroll array, an intermediate canceller to cancel out intermediate load and store values of the temporary unroll array to retain last shifted values of the pipeline stages, and a propagator to improve emulation efficiency of the FPGA code by generating a scalar loop of the retained last shifted values for a vectorization input.Type: GrantFiled: June 28, 2017Date of Patent: February 2, 2021Assignee: INTEL CORPORATIONInventors: Xinmin Tian, Geoff Lowney
-
Publication number: 20190005175Abstract: Methods, apparatus, systems and articles of manufacture are disclosed to improve FPGA pipeline emulation efficiency on CPUs. An example disclosed apparatus includes a loop detector to identify a register shift loop in field programmable gate array (FPGA) code, an unroller to shift and store pipeline stages in the register shift loop to a temporary unroll array, an intermediate canceller to cancel out intermediate load and store values of the temporary unroll array to retain last shifted values of the pipeline stages, and a propagator to improve emulation efficiency of the FPGA code by generating a scalar loop of the retained last shifted values for a vectorization input.Type: ApplicationFiled: June 28, 2017Publication date: January 3, 2019Inventors: Xinmin Tian, Geoff Lowney
-
Patent number: 9720663Abstract: Methods, apparatus, systems and articles of manufacture are disclosed to optimize sparse matrix execution. An example disclosed apparatus includes a context former to identify a matrix function call from a matrix function library, the matrix function call associated with a sparse matrix, a pattern matcher to identify an operational pattern associated with the matrix function call, and a code generator to associate a function data structure with the matrix function call exhibiting the operational pattern, the function data structure stored external to the matrix function library, and facilitate a runtime link between the function data structure and the matrix function call.Type: GrantFiled: June 25, 2015Date of Patent: August 1, 2017Assignee: INTEL CORPORATIONInventors: Hongbo Rong, Jong Soo Park, Mikhail Smelyanskiy, Geoff Lowney
-
Publication number: 20160378442Abstract: Methods, apparatus, systems and articles of manufacture are disclosed to optimize sparse matrix execution. An example disclosed apparatus includes a context former to identify a matrix function call from a matrix function library, the matrix function call associated with a sparse matrix, a pattern matcher to identify an operational pattern associated with the matrix function call, and a code generator to associate a function data structure with the matrix function call exhibiting the operational pattern, the function data structure stored external to the matrix function library, and facilitate a runtime link between the function data structure and the matrix function call.Type: ApplicationFiled: June 25, 2015Publication date: December 29, 2016Inventors: Hongbo Rong, Jong Soo Park, Mikhail Smelyanskiy, Geoff Lowney
-
Patent number: 8707012Abstract: In one embodiment, the present invention includes an apparatus having a register file to store vector data, an address generator coupled to the register file to generate addresses for a vector memory operation, and a controller to generate an output slice from one or more slices each including multiple addresses, where the output slice includes addresses each corresponding to a separately addressable portion of a memory. Other embodiments are described and claimed.Type: GrantFiled: October 12, 2012Date of Patent: April 22, 2014Assignee: Intel CorporationInventors: Roger Espasa, Joel Emer, Geoff Lowney, Roger Gramunt, Santiago Galan, Toni Juan, Jesus Corbal, Federico Ardanaz, Isaac Hernandez
-
Publication number: 20130036268Abstract: In one embodiment, the present invention includes an apparatus having a register file to store vector data, an address generator coupled to the register file to generate addresses for a vector memory operation, and a controller to generate an output slice from one or more slices each including multiple addresses, where the output slice includes addresses each corresponding to a separately addressable portion of a memory. Other embodiments are described and claimed.Type: ApplicationFiled: October 12, 2012Publication date: February 7, 2013Inventors: Roger Espasa, Joel Emer, Geoff Lowney, Roger Gramunt, Santiago Galan, Toni Juan, Jesus Corbal, Federico Ardanaz, Isaac Hernandez
-
Patent number: 8316216Abstract: In one embodiment, the present invention includes an apparatus having a register file to store vector data, an address generator coupled to the register file to generate addresses for a vector memory operation, and a controller to generate an output slice from one or more slices each including multiple addresses, where the output slice includes addresses each corresponding to a separately addressable portion of a memory. Other embodiments are described and claimed.Type: GrantFiled: October 21, 2009Date of Patent: November 20, 2012Assignee: Intel CorporationInventors: Roger Espasa, Joel Emer, Geoff Lowney, Roger Gramunt, Santiago Galan, Toni Juan, Jesus Corbal, Federico Ardanaz, Isaac Hernandez
-
Patent number: 8181170Abstract: Analyzing a first binary version of a program and unwind information associated with the first binary version of the program, performing optimization on the first binary version of the program to produce a second binary version of the program based at least in part on the results of the analysis, and generating new unwind information for the second binary version of the program based at least in part on the results of the analysis and at least in part on the optimization performed.Type: GrantFiled: January 16, 2009Date of Patent: May 15, 2012Assignee: Intel CorporationInventors: Harish G. Patil, Robert Muth, Geoff Lowney
-
Publication number: 20100042779Abstract: In one embodiment, the present invention includes an apparatus having a register file to store vector data, an address generator coupled to the register file to generate addresses for a vector memory operation, and a controller to generate an output slice from one or more slices each including multiple addresses, where the output slice includes addresses each corresponding to a separately addressable portion of a memory. Other embodiments are described and claimed.Type: ApplicationFiled: October 21, 2009Publication date: February 18, 2010Inventors: Roger Espasa, Joel Emer, Geoff Lowney, Roger Gramunt, Santiago Galan, Toni Juan, Jesus Corbal, Federico Ardanaz, Isaac Hernandez
-
Patent number: 7627735Abstract: In one embodiment, the present invention includes an apparatus having a register file to store vector data, an address generator coupled to the register file to generate addresses for a vector memory operation, and a controller to generate an output slice from one or more slices each including multiple addresses, where the output slice includes addresses each corresponding to a separately addressable portion of a memory. Other embodiments are described and claimed.Type: GrantFiled: October 21, 2005Date of Patent: December 1, 2009Assignee: Intel CorporationInventors: Roger Espasa, Joel Emer, Geoff Lowney, Roger Gramunt, Santiago Galan, Toni Juan, Jesus Corbal, Federico Ardanaz, Isaac Hernandez
-
Publication number: 20090133008Abstract: Analyzing a first binary version of a program and unwind information associated with the first binary version of the program, performing optimization on the first binary version of the program to produce a second binary version of the program based at least in part on the results of the analysis, and generating new unwind information for the second binary version of the program based at least in part on the results of the analysis and at least in part on the optimization performed.Type: ApplicationFiled: January 16, 2009Publication date: May 21, 2009Applicant: INTEL CORPORATIONInventors: Harish G. Patil, Robert Muth, Geoff Lowney
-
Patent number: 7480902Abstract: Analyzing a first binary version of a program and unwind information associated with the first binary version of the program, performing optimization on the first binary version of the program to produce a second binary version of the program based at least in part on the results of the analysis, and generating new unwind information for the second binary version of the program based at least in part on the results of the analysis and at least in part on the optimization performed.Type: GrantFiled: July 8, 2004Date of Patent: January 20, 2009Assignee: Intel CorporationInventors: Harish G. Patil, Robert Muth, Geoff Lowney
-
Publication number: 20070094477Abstract: In one embodiment, the present invention includes an apparatus having a register file to store vector data, an address generator coupled to the register file to generate addresses for a vector memory operation, and a controller to generate an output slice from one or more slices each including multiple addresses, where the output slice includes addresses each corresponding to a separately addressable portion of a memory. Other embodiments are described and claimed.Type: ApplicationFiled: October 21, 2005Publication date: April 26, 2007Inventors: Roger Espasa, Joel Emer, Geoff Lowney, Roger Gramunt, Santiago Galan, Toni Juan, Jesus Corbal, Federico Ardanaz, Isaac Hernandez
-
Patent number: 7181723Abstract: Methods and an apparatus for stride profiling a software application are disclosed. An example system uses a hardware performance counter to report instruction addresses and data addresses associated with memory access instructions triggered by some event, such as a data cache miss. When the same instruction address is associated with more than one data address, the difference between the two data addresses is recorded. When two or more of these data address differences are recorded for the same instruction, the system determines a stride associated with the instruction to be the greatest common divisor of the two or more differences. This stride may be used by a compiler to optimize data cache prefetching. In addition, any overhead associated with monitoring addresses of data cache misses may be reduced by cycling between an inspection phase and a skipping phase. More data cache misses are monitored during the inspection phase than during the skipping phase.Type: GrantFiled: May 27, 2003Date of Patent: February 20, 2007Assignee: Intel CorporationInventors: Chi-Keung Luk, Geoff Lowney
-
Publication number: 20060010431Abstract: Analyzing a first binary version of a program and unwind information associated with the first binary version of the program, performing optimization on the first binary version of the program to produce a second binary version of the program based at least in part on the results of the analysis, and generating new unwind information for the second binary version of the program based at least in part on the results of the analysis and at least in part on the optimization performed.Type: ApplicationFiled: July 8, 2004Publication date: January 12, 2006Inventors: Harish Patil, Robert Muth, Geoff Lowney
-
Publication number: 20040243981Abstract: Methods and an apparatus for stride profiling a software application are disclosed. An example system uses a hardware performance counter to report instruction addresses and data addresses associated with memory access instructions triggered by some event, such as a data cache miss. When the same instruction address is associated with more than one data address, the difference between the two data addresses is recorded. When two or more of these data address differences are recorded for the same instruction, the system determines a stride associated with the instruction to be the greatest common divisor of the two or more differences. This stride may be used by a compiler to optimize data cache prefetching. In addition, any overhead associated with monitoring addresses of data cache misses may be reduced by cycling between an inspection phase and a skipping phase. More data cache misses are monitored during the inspection phase than during the skipping phase.Type: ApplicationFiled: May 27, 2003Publication date: December 2, 2004Inventors: Chi-Keung Luk, Geoff Lowney