Patents by Inventor Paul Keltcher
Paul Keltcher has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20220318017
Abstract: Techniques are described for a hardware processor to dynamically configure a component that improves a processor function with a configuration setting based on invariant statistics. The invariant statistics are generated by execution of the instructions from one or more applications and are independent of the performance metrics of the processor function for the execution. In an embodiment, the configuration setting for the component is generated using a machine learning model.
Type: Application
Filed: March 30, 2021
Publication date: October 6, 2022
Inventors: ALOK GARG, PAUL KELTCHER, MAYANK CHHABLANI, FURKAN ERIS
-
Patent number: 10671535
Abstract: A prefetcher maintains the state of stored prefetch information, such as a prefetch confidence level, when a prefetch would cross a memory page boundary. The maintained prefetch information can be used both to identify whether the stride pattern for a particular sequence of demand requests persists after the memory page boundary has been crossed, and to continue to issue prefetch requests according to the identified pattern. The prefetcher therefore does not have to re-identify a stride pattern each time a page boundary is crossed by a sequence of demand requests, thereby improving the efficiency and accuracy of the prefetcher.
Type: Grant
Filed: July 17, 2013
Date of Patent: June 2, 2020
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: John Kalamatianos, Paul Keltcher, Marius Evers, Chitresh Narasimhaiah
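The effect described in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented circuit: the class name `StridePrefetcher`, the 4 KiB page size, the confidence threshold of 2, and the `keep_state_across_pages` switch are all hypothetical and chosen for illustration.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


class StridePrefetcher:
    """Toy stride detector with a confidence counter (hypothetical model)."""

    def __init__(self, keep_state_across_pages=True, threshold=2):
        self.keep = keep_state_across_pages
        self.threshold = threshold  # assumed confidence needed before prefetching
        self.last_addr = None
        self.stride = 0
        self.confidence = 0

    def access(self, addr):
        """Observe one demand address; return a prefetch address or None."""
        if self.last_addr is not None:
            crossed = addr // PAGE_SIZE != self.last_addr // PAGE_SIZE
            if crossed and not self.keep:
                # Baseline for comparison: forget the pattern at a page boundary.
                self.stride, self.confidence = 0, 0
            else:
                stride = addr - self.last_addr
                if stride == self.stride and stride != 0:
                    self.confidence += 1
                else:
                    self.stride, self.confidence = stride, 0
        self.last_addr = addr
        if self.confidence >= self.threshold:
            return addr + self.stride
        return None


def count_prefetches(keep_state):
    """Run a 64-byte stride stream that crosses one page boundary."""
    prefetcher = StridePrefetcher(keep_state_across_pages=keep_state)
    stream = [3840 + 64 * i for i in range(8)]  # crosses from page 0 into page 1
    return sum(prefetcher.access(a) is not None for a in stream)


print(count_prefetches(True), count_prefetches(False))
```

In this model the variant that keeps its stride and confidence issues a prefetch on every access once trained, while the baseline has to retrain after the boundary and issues fewer prefetches over the same stream.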
-
Patent number: 9058278
Abstract: A method, an apparatus, and a non-transitory computer readable medium for tracking accuracy and coverage of a prefetcher in a processor are presented. A table is maintained and indexed by an address, wherein each entry in the table corresponds to one address. A number of demand requests that hit in the table on a prefetch, a total number of demand requests, and a number of prefetch requests are counted. The accuracy of the prefetcher is calculated by dividing the number of demand requests that hit in the table on a prefetch by the number of prefetch requests. The coverage of the prefetcher is calculated by dividing the number of demand requests that hit in the table on a prefetch by the total number of demand requests. The table and the counters are reset when a reset condition is reached.
Type: Grant
Filed: December 19, 2012
Date of Patent: June 16, 2015
Assignee: Advanced Micro Devices, Inc.
Inventors: John Kalamatianos, Paul Keltcher
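The two ratios in this abstract can be worked through with a minimal sketch. The class `PrefetchTracker` and its method names are hypothetical, the table is modeled as a plain set of addresses, and counting each prefetched address at most once is an assumption the abstract does not state.

```python
class PrefetchTracker:
    """Hypothetical software model of the tracking table and its three counters."""

    def __init__(self):
        self.reset()

    def reset(self):
        """Clear the table and counters (the reset condition itself is left open)."""
        self.table = set()    # addresses that were prefetched
        self.prefetches = 0   # number of prefetch requests
        self.demands = 0      # total number of demand requests
        self.useful = 0       # demand requests that hit on a prefetched address

    def on_prefetch(self, addr):
        self.table.add(addr)
        self.prefetches += 1

    def on_demand(self, addr):
        self.demands += 1
        if addr in self.table:
            self.useful += 1
            self.table.discard(addr)  # assumption: credit each prefetch once

    def accuracy(self):
        """Useful prefetch hits divided by prefetch requests."""
        return self.useful / self.prefetches if self.prefetches else 0.0

    def coverage(self):
        """Useful prefetch hits divided by all demand requests."""
        return self.useful / self.demands if self.demands else 0.0


tracker = PrefetchTracker()
for addr in (0x100, 0x140, 0x180, 0x1C0):
    tracker.on_prefetch(addr)
for addr in (0x100, 0x140, 0x200, 0x240, 0x280):
    tracker.on_demand(addr)
print(tracker.accuracy(), tracker.coverage())
```

With four prefetches, five demand requests, and two demand hits on prefetched addresses, accuracy is 2/4 and coverage is 2/5.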
-
Patent number: 9058277
Abstract: Methods and systems for prefetching data for a processor are provided. A system is configured for and a method includes selecting one of a first prefetching control logic and a second prefetching control logic of the processor as a candidate feature, capturing the performance metric of the processor over an inactive sample period when the candidate feature is inactive, capturing a performance metric of the processor over an active sample period when the candidate feature is active, comparing the performance metric of the processor for the active and inactive sample periods, and setting a status of the candidate feature as enabled when the performance metric in the active period indicates improvement over the performance metric in the inactive period, and as disabled when the performance metric in the inactive period indicates improvement over the performance metric in the active period.
Type: Grant
Filed: November 8, 2012
Date of Patent: June 16, 2015
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: Sharad Dilip Bade, Alok Garg, John Kalamatianos, Paul Keltcher, Marius Evers, Chitresh Narasimhaiah
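The enable/disable decision in this abstract amounts to comparing two sampled metrics. A minimal sketch follows; the function name `update_status`, the `higher_is_better` flag, and the rule of keeping the previous status on a tie are assumptions for illustration, since the abstract names neither the metric nor the tie behavior.

```python
def update_status(previous, inactive_metric, active_metric, higher_is_better=True):
    """Return True (enabled) or False (disabled) for the candidate feature.

    previous        -- status before sampling, kept on a tie (assumption)
    inactive_metric -- metric captured while the feature was off
    active_metric   -- metric captured while the feature was on
    """
    if active_metric == inactive_metric:
        return previous
    active_wins = active_metric > inactive_metric
    if not higher_is_better:
        # For metrics such as miss counts, a lower value is the improvement.
        active_wins = not active_wins
    return active_wins


# Example with an IPC-like metric (higher is better): the feature helps.
print(update_status(False, inactive_metric=1.00, active_metric=1.08))
# Example with a miss-count metric (lower is better): the feature hurts.
print(update_status(True, inactive_metric=90, active_metric=120, higher_is_better=False))
```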
-
Patent number: 9026739
Abstract: One or more lines of a cache are prefetched according to a first prefetch routine while training a prefetcher to prefetch one or more lines of the cache according to a second prefetch routine. In response to determining that the prefetcher has been trained, one or more lines of the cache may be prefetched according to the second prefetch routine.
Type: Grant
Filed: March 7, 2012
Date of Patent: May 5, 2015
Assignee: Advanced Micro Devices, Inc.
Inventors: Srilatha Manne, Nitya Ranganathan, Paul Keltcher, Donald W. McCauley
-
Patent number: 9021207
Abstract: In response to a processor core exiting a low-power state, a cache is set to a minimum size so that fewer than all of the cache's entries are available to store data, thus reducing the cache's power consumption. Over time, the size of the cache can be increased to account for heightened processor activity, thus ensuring that processing efficiency is not significantly impacted by a reduced cache size. In some embodiments, the cache size is increased based on a measured processor performance metric, such as an eviction rate of the cache. In some embodiments, the cache size is increased at regular intervals until a maximum size is reached.
Type: Grant
Filed: December 20, 2012
Date of Patent: April 28, 2015
Assignee: Advanced Micro Devices, Inc.
Inventors: John Kalamatianos, Edward J. McLellan, Paul Keltcher, Srilatha Manne, Richard E. Klass, James M. O'Connor
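The eviction-rate embodiment can be sketched as a simple growth policy. Everything specific here is an assumption for illustration: the cache size is measured in ways, it doubles per step, the threshold is 0.10, and the function names `grow_cache` and `simulate_wakeup` are hypothetical.

```python
def grow_cache(ways, eviction_rate, threshold=0.10, max_ways=16):
    """Double the active ways when the eviction rate exceeds the threshold."""
    if eviction_rate > threshold:
        return min(ways * 2, max_ways)
    return ways


def simulate_wakeup(eviction_rates, min_ways=2, max_ways=16):
    """Start at the minimum size on wake-up, then resize once per interval."""
    ways = min_ways
    history = [ways]
    for rate in eviction_rates:
        ways = grow_cache(ways, rate, max_ways=max_ways)
        history.append(ways)
    return history


print(simulate_wakeup([0.30, 0.05, 0.20, 0.25, 0.40]))
# -> [2, 4, 4, 8, 16, 16]
```

The cache stays small while the eviction rate is low and stops growing once the assumed maximum of 16 ways is reached.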
-
Publication number: 20150026414
Abstract: A prefetcher maintains the state of stored prefetch information, such as a prefetch confidence level, when a prefetch would cross a memory page boundary. The maintained prefetch information can be used both to identify whether the stride pattern for a particular sequence of demand requests persists after the memory page boundary has been crossed, and to continue to issue prefetch requests according to the identified pattern. The prefetcher therefore does not have to re-identify a stride pattern each time a page boundary is crossed by a sequence of demand requests, thereby improving the efficiency and accuracy of the prefetcher.
Type: Application
Filed: July 17, 2013
Publication date: January 22, 2015
Applicant: Advanced Micro Devices, Inc.
Inventors: John Kalamatianos, Paul Keltcher, Marius Evers, Chitresh Narasimhaiah
-
Publication number: 20140181410
Abstract: In response to a processor core exiting a low-power state, a cache is set to a minimum size so that fewer than all of the cache's entries are available to store data, thus reducing the cache's power consumption. Over time, the size of the cache can be increased to account for heightened processor activity, thus ensuring that processing efficiency is not significantly impacted by a reduced cache size. In some embodiments, the cache size is increased based on a measured processor performance metric, such as an eviction rate of the cache. In some embodiments, the cache size is increased at regular intervals until a maximum size is reached.
Type: Application
Filed: December 20, 2012
Publication date: June 26, 2014
Applicant: Advanced Micro Devices, Inc.
Inventors: John Kalamatianos, Edward J. McLellan, Paul Keltcher, Srilatha Manne, Richard E. Klass, James M. O'Connor
-
Publication number: 20140173217
Abstract: A method, an apparatus, and a non-transitory computer readable medium for tracking accuracy and coverage of a prefetcher in a processor are presented. A table is maintained and indexed by an address, wherein each entry in the table corresponds to one address. A number of demand requests that hit in the table on a prefetch, a total number of demand requests, and a number of prefetch requests are counted. The accuracy of the prefetcher is calculated by dividing the number of demand requests that hit in the table on a prefetch by the number of prefetch requests. The coverage of the prefetcher is calculated by dividing the number of demand requests that hit in the table on a prefetch by the total number of demand requests. The table and the counters are reset when a reset condition is reached.
Type: Application
Filed: December 19, 2012
Publication date: June 19, 2014
Applicant: ADVANCED MICRO DEVICES, INC.
Inventors: John Kalamatianos, Paul Keltcher
-
Publication number: 20140129780
Abstract: Methods and systems for prefetching data for a processor are provided. A system is configured for and a method includes selecting one of a first prefetching control logic and a second prefetching control logic of the processor as a candidate feature, capturing the performance metric of the processor over an inactive sample period when the candidate feature is inactive, capturing a performance metric of the processor over an active sample period when the candidate feature is active, comparing the performance metric of the processor for the active and inactive sample periods, and setting a status of the candidate feature as enabled when the performance metric in the active period indicates improvement over the performance metric in the inactive period, and as disabled when the performance metric in the inactive period indicates improvement over the performance metric in the active period.
Type: Application
Filed: November 8, 2012
Publication date: May 8, 2014
Applicant: ADVANCED MICRO DEVICES, INC.
Inventors: Sharad Dilip Bade, Alok Garg, John Kalamatianos, Paul Keltcher, Marius Evers, Chitresh Narasimhaiah
-
Publication number: 20130238861
Abstract: One or more lines of a cache are prefetched according to a first prefetch routine while training a prefetcher to prefetch one or more lines of the cache according to a second prefetch routine. In response to determining that the prefetcher has been trained, one or more lines of the cache may be prefetched according to the second prefetch routine.
Type: Application
Filed: March 7, 2012
Publication date: September 12, 2013
Inventors: Srilatha Manne, Nitya Ranganathan, Paul Keltcher, Donald W. McCauley
-
Patent number: 7139897
Abstract: Circuit arrangement and method for dispatching computer instructions. In a processor having a plurality of types of execution units, the computer instructions are grouped in bundles, and each bundle includes a plurality of instructions and an associated index code. Template values are stored in a plurality of template registers, and each template value specifies types of execution units for a bundle of instructions and those instructions in a bundle that are executable in parallel. A dispatch logic circuit is coupled to the template registers and is responsive to an input bundle of instructions and associated index value. The dispatch logic circuit reads a code from a selected one of the plurality of template registers referenced by the index value and issues one or more selected instructions in the bundle to at least one execution unit of a selected type responsive to the code read from the selected one of the plurality of template registers.
Type: Grant
Filed: April 1, 2002
Date of Patent: November 21, 2006
Assignee: Hewlett-Packard Development Company, L.P.
Inventors: Paul Keltcher, Gary Vondran
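One way to picture the template lookup is a table keyed by the bundle's index. This is a loose software sketch, not the circuit in the patent: the encoding of a template as a tuple of unit types plus per-slot stop flags, the unit letters, and the names `TEMPLATE_REGISTERS` and `dispatch` are all assumptions.

```python
# Hypothetical template registers: index -> (unit type per slot, stop flag per slot).
# A stop flag of True ends the group of instructions that may issue in parallel.
TEMPLATE_REGISTERS = {
    0: (("M", "I", "I"), (False, False, True)),  # all three slots issue together
    1: (("M", "I", "B"), (True, False, True)),   # slot 0 alone, then slots 1 and 2
}


def dispatch(instructions, index):
    """Split a bundle into parallel issue groups of (unit_type, instruction)."""
    unit_types, stops = TEMPLATE_REGISTERS[index]
    groups, current = [], []
    for instr, unit, stop in zip(instructions, unit_types, stops):
        current.append((unit, instr))
        if stop:
            groups.append(current)
            current = []
    if current:
        groups.append(current)
    return groups


print(dispatch(["ld", "add", "br"], index=1))
# -> [[('M', 'ld')], [('I', 'add'), ('B', 'br')]]
```

Because the templates live in writable registers rather than being fixed in the instruction encoding, changing a register's contents changes how the same index is dispatched; the dictionary stands in for that here.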
-
Patent number: 6915402
Abstract: A method and system providing dynamic allocation of memory through hardware is disclosed. An embodiment provides for a multi-processor system providing for a secure partitioned memory. The system comprises a processor(s), a hardware implemented memory router coupled to the processor(s), and memory coupled to the memory router. The memory router stores memory partition information, which describes the memory allocated to the processor(s). Furthermore, the memory router maps a memory access request from a processor to an address in the memory.
Type: Grant
Filed: May 23, 2001
Date of Patent: July 5, 2005
Assignee: Hewlett-Packard Development Company, L.P.
Inventors: Kenneth Mark Wilson, Paul Keltcher, Yoshio Turner
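The router's role can be sketched as a per-processor base-and-size table. The abstract does not say how partition information is encoded, so the contiguous base/size layout, the class name `MemoryRouter`, and the use of `PermissionError` for out-of-partition accesses are assumptions for illustration.

```python
class MemoryRouter:
    """Hypothetical model: maps processor-relative addresses into partitions."""

    def __init__(self):
        self.partitions = {}  # processor id -> (base address, size in bytes)

    def allocate(self, cpu, base, size):
        """Record (or change) the memory partition assigned to a processor."""
        self.partitions[cpu] = (base, size)

    def translate(self, cpu, addr):
        """Map a processor's request to a memory address, or reject it."""
        base, size = self.partitions[cpu]
        if not 0 <= addr < size:
            raise PermissionError(f"cpu {cpu}: address {addr:#x} outside partition")
        return base + addr


router = MemoryRouter()
router.allocate(cpu=0, base=0x0000, size=0x1000)
router.allocate(cpu=1, base=0x1000, size=0x1000)
print(hex(router.translate(1, 0x10)))
```

Since each processor only ever presents partition-relative addresses, it cannot name memory that belongs to another partition, and calling `allocate` again models reassigning memory at run time.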
-
Patent number: 6874014
Abstract: Multiple processors are mounted on a single die. The die is connected to a memory storing multiple operating systems or images of multiple operating systems. Each of the processors or a group of one or more of the processors is operable to execute a distinct one of the multiple operating systems. Therefore, resources for a single operating system may be dedicated to one processor or a group of processors. Consequently, a large number of processors mounted on a single die can operate efficiently.
Type: Grant
Filed: May 29, 2001
Date of Patent: March 29, 2005
Assignee: Hewlett-Packard Development Company, L.P.
Inventors: Stephen E. Richardson, Gary Vondran, Stuart Siu, Paul Keltcher, Shankar Venkataraman, Padmanabha Venkitakrishnan, Joseph Ku
-
Patent number: 6782453
Abstract: In response to determining a requested line of data is not stored within a local memory, the requested line of data is written to the local memory from a remote memory. Additionally, a victim page is selected in the local memory in response to the requested line of data not being in the local memory and it is determined whether one or more lines of the victim page are dirty. Furthermore, the one or more dirty lines are written to the remote memory in response to determining that the one or more lines are dirty and the requested line of data is fetched from the remote memory. Moreover, the requested line of data is stored within the page of data at a location previously occupied by the victim page.
Type: Grant
Filed: February 12, 2002
Date of Patent: August 24, 2004
Assignee: Hewlett-Packard Development Company, L.P.
Inventors: Paul Keltcher, Stephen Richardson
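The miss handling in this abstract (pick a victim page, write back only its dirty lines, then fetch the requested line) can be modeled in a few lines. This is a sketch under assumptions: the class `LocalMemory`, the dictionary standing in for remote memory, and the first-in-first-out victim choice are hypothetical, since the abstract does not name a replacement policy.

```python
class LocalMemory:
    """Hypothetical page-granular local memory with per-line dirty tracking."""

    def __init__(self, remote, num_frames):
        self.remote = remote          # dict: (page, line) -> data
        self.num_frames = num_frames  # how many pages fit locally
        self.frames = {}              # page -> {"lines": {...}, "dirty": set()}

    def _frame_for(self, page):
        if page not in self.frames:
            if len(self.frames) >= self.num_frames:
                self._evict()
            self.frames[page] = {"lines": {}, "dirty": set()}
        return self.frames[page]

    def _evict(self):
        victim = next(iter(self.frames))  # oldest page (FIFO is an assumption)
        frame = self.frames.pop(victim)
        for line in frame["dirty"]:       # write back only the dirty lines
            self.remote[(victim, line)] = frame["lines"][line]

    def read(self, page, line):
        frame = self._frame_for(page)
        if line not in frame["lines"]:    # line miss: fetch it from remote memory
            frame["lines"][line] = self.remote[(page, line)]
        return frame["lines"][line]

    def write(self, page, line, data):
        frame = self._frame_for(page)
        frame["lines"][line] = data
        frame["dirty"].add(line)


remote = {(p, l): f"p{p}l{l}" for p in range(3) for l in range(4)}
local = LocalMemory(remote, num_frames=1)
local.write(0, 1, "new")   # dirty line, not yet visible in remote memory
local.read(1, 0)           # miss: page 0 is the victim, its dirty line is written back
print(remote[(0, 1)])
```

Tracking dirtiness per line rather than per page means an eviction transfers only the modified lines, which is the saving the abstract points at.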
-
Publication number: 20030188132
Abstract: Circuit arrangement and method for dispatching computer instructions. In a processor having a plurality of types of execution units, the computer instructions are grouped in bundles, and each bundle includes a plurality of instructions and an associated index code. Template values are stored in a plurality of template registers, and each template value specifies types of execution units for a bundle of instructions and those instructions in a bundle that are executable in parallel. A dispatch logic circuit is coupled to the template registers and is responsive to an input bundle of instructions and associated index value. The dispatch logic circuit reads a code from a selected one of the plurality of template registers referenced by the index value and issues one or more selected instructions in the bundle to at least one execution unit of a selected type responsive to the code read from the selected one of the plurality of template registers.
Type: Application
Filed: April 1, 2002
Publication date: October 2, 2003
Applicant: Hewlett-Packard Company
Inventors: Paul Keltcher, Gary Vondran
-
Publication number: 20030154348
Abstract: In response to determining a requested line of data is not stored within a local memory, the requested line of data is written to the local memory from a remote memory. Additionally, a victim page is selected in the local memory in response to the requested line of data not being in the local memory and it is determined whether one or more lines of the victim page are dirty. Furthermore, the one or more dirty lines are written to the remote memory in response to determining that the one or more lines are dirty and the requested line of data is fetched from the remote memory. Moreover, the requested line of data is stored within the page of data at a location previously occupied by the victim page.
Type: Application
Filed: February 12, 2002
Publication date: August 14, 2003
Inventors: Paul Keltcher, Stephen Richardson
-
Publication number: 20030023794
Abstract: A cache coherent multiple processor integrated circuit. The circuit includes a plurality of processor units. The processor units are each provided with a cache unit. An embedded RAM unit is included for storing instructions and data for the processor units. A cache coherent bus is coupled to the processor units and the embedded RAM unit. The bus is configured to provide cache coherent snooping commands to enable the processor units to ensure cache coherency between their respective cache units and the embedded RAM unit. The multiple processor integrated circuit can further include an input output unit coupled to the bus to provide input and output transactions for the processor units. The bus is configured to provide split transactions for the processor units coupled to the bus, providing better bandwidth utilization of the bus. The bus can be configured to transfer an entire cache line for the cache units of the processor units in a single clock cycle, wherein the bus is 256 bits wide.
Type: Application
Filed: July 26, 2001
Publication date: January 30, 2003
Inventors: Padmanabha I. Venkitakrishnan, Shankar Venkataraman, Paul Keltcher, Stuart C. Siu, Stephen E. Richardson, Gary Lee Vondran
-
Publication number: 20020184328
Abstract: Multiple processors are mounted on a single die. The die is connected to a memory storing multiple operating systems or images of multiple operating systems. Each of the processors or a group of one or more of the processors is operable to execute a distinct one of the multiple operating systems. Therefore, resources for a single operating system may be dedicated to one processor or a group of processors. Consequently, a large number of processors mounted on a single die can operate efficiently.
Type: Application
Filed: May 29, 2001
Publication date: December 5, 2002
Inventors: Stephen E. Richardson, Gary Lee Vondran, Stuart C. Siu, Paul Keltcher, Shankar Venkataraman, Padmanabha I. Venkitakrishnan, Joseph Weiyeh Ku
-
Publication number: 20020178337
Abstract: A method and system providing dynamic allocation of memory through hardware is disclosed. An embodiment provides for a multi-processor system providing for a secure partitioned memory. The system comprises a processor(s), a hardware implemented memory router coupled to the processor(s), and memory coupled to the memory router. The memory router stores memory partition information, which describes the memory allocated to the processor(s). Furthermore, the memory router maps a memory access request from a processor to an address in the memory.
Type: Application
Filed: May 23, 2001
Publication date: November 28, 2002
Inventors: Kenneth Mark Wilson, Paul Keltcher, Yoshio Turner