Patents by Inventor Paul Keltcher

Paul Keltcher has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20220318017
    Abstract: Techniques are described for a hardware processor to dynamically configure a component that improves a processor function with a configuration setting based on invariant statistics. The invariant statistics are generated by execution of the instructions from one or more applications and are independent of the performance metrics of the processor function for the execution. In an embodiment, the configuration setting for the component is generated using a machine learning model.
    Type: Application
    Filed: March 30, 2021
    Publication date: October 6, 2022
    Inventors: Alok Garg, Paul Keltcher, Mayank Chhablani, Furkan Eris
  • Patent number: 10671535
    Abstract: A prefetcher maintains the state of stored prefetch information, such as a prefetch confidence level, when a prefetch would cross a memory page boundary. The maintained prefetch information can be used both to identify whether the stride pattern for a particular sequence of demand requests persists after the memory page boundary has been crossed, and to continue to issue prefetch requests according to the identified pattern. The prefetcher therefore does not have to re-identify a stride pattern each time a page boundary is crossed by a sequence of demand requests, thereby improving the efficiency and accuracy of the prefetcher.
    Type: Grant
    Filed: July 17, 2013
    Date of Patent: June 2, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: John Kalamatianos, Paul Keltcher, Marius Evers, Chitresh Narasimhaiah
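The following is an illustrative sketch only, not taken from the patent record. It models a single-stream stride detector whose confidence either survives or is discarded when the demand stream moves to a new page. The 4 KiB page size, the confidence threshold, and every name (`StridePrefetcher`, `keep_state_across_pages`, `access`) are assumptions made for the example.

```python
from typing import List, Optional

PAGE_SIZE = 4096  # assumed page size in bytes


class StridePrefetcher:
    """Toy stride detector for one demand stream (hypothetical design)."""

    def __init__(self, threshold: int = 2, keep_state_across_pages: bool = True):
        self.threshold = threshold
        self.keep_state = keep_state_across_pages
        self.last_addr: Optional[int] = None
        self.stride = 0
        self.confidence = 0

    def access(self, addr: int) -> Optional[int]:
        """Observe a demand address; return a prefetch address or None."""
        if self.last_addr is not None:
            crossed = addr // PAGE_SIZE != self.last_addr // PAGE_SIZE
            if crossed and not self.keep_state:
                # Baseline behaviour: forget the pattern at the page boundary.
                self.stride = 0
                self.confidence = 0
            else:
                stride = addr - self.last_addr
                if stride != 0 and stride == self.stride:
                    self.confidence += 1
                else:
                    self.stride = stride
                    self.confidence = 0
        self.last_addr = addr
        if self.confidence >= self.threshold:
            return addr + self.stride
        return None


def run(prefetcher: StridePrefetcher, addrs: List[int]) -> List[Optional[int]]:
    return [prefetcher.access(a) for a in addrs]


# A 64-byte stride stream that crosses from page 0 into page 1.
stream = [3840, 3904, 3968, 4032, 4096, 4160]
kept = run(StridePrefetcher(keep_state_across_pages=True), stream)
dropped = run(StridePrefetcher(keep_state_across_pages=False), stream)
```

With state kept, the sketch continues issuing prefetches immediately after the boundary; with state dropped, it has to relearn the stride before prefetching again.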
  • Patent number: 9058278
    Abstract: A method, an apparatus, and a non-transitory computer readable medium for tracking accuracy and coverage of a prefetcher in a processor are presented. A table is maintained and indexed by an address, wherein each entry in the table corresponds to one address. A number of demand requests that hit in the table on a prefetch, a total number of demand requests, and a number of prefetch requests are counted. The accuracy of the prefetcher is calculated by dividing the number of demand requests that hit in the table on a prefetch by the number of prefetch requests. The coverage of the prefetcher is calculated by dividing the number of demand requests that hit in the table on a prefetch by the total number of demand requests. The table and the counters are reset when a reset condition is reached.
    Type: Grant
    Filed: December 19, 2012
    Date of Patent: June 16, 2015
    Assignee: Advanced Micro Devices, Inc.
    Inventors: John Kalamatianos, Paul Keltcher
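As an illustration of the two ratios described in this abstract, here is a minimal sketch. It is not part of the patent record: the class and method names are hypothetical, the table is modeled as a plain set of prefetched addresses, and the sketch assumes every demand hit on a prefetched address is counted.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class PrefetchStats:
    """Counters for prefetcher accuracy and coverage (hypothetical names)."""

    table: Set[int] = field(default_factory=set)  # indexed by address
    prefetch_requests: int = 0
    demand_requests: int = 0
    demand_hits_on_prefetch: int = 0

    def record_prefetch(self, address: int) -> None:
        self.table.add(address)
        self.prefetch_requests += 1

    def record_demand(self, address: int) -> None:
        self.demand_requests += 1
        if address in self.table:
            self.demand_hits_on_prefetch += 1

    def accuracy(self) -> float:
        # Demand hits on a prefetch divided by prefetch requests.
        if self.prefetch_requests == 0:
            return 0.0
        return self.demand_hits_on_prefetch / self.prefetch_requests

    def coverage(self) -> float:
        # Demand hits on a prefetch divided by all demand requests.
        if self.demand_requests == 0:
            return 0.0
        return self.demand_hits_on_prefetch / self.demand_requests

    def reset(self) -> None:
        # The reset condition itself is not modeled; the caller decides when.
        self.table.clear()
        self.prefetch_requests = 0
        self.demand_requests = 0
        self.demand_hits_on_prefetch = 0


stats = PrefetchStats()
for addr in (0x100, 0x140, 0x180, 0x1C0):
    stats.record_prefetch(addr)
for addr in (0x100, 0x140, 0x200, 0x240, 0x280):
    stats.record_demand(addr)
```

In this example two of four prefetches were used (accuracy 0.5) and two of five demand requests were covered (coverage 0.4).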
  • Patent number: 9058277
    Abstract: Methods and systems for prefetching data for a processor are provided. A system is configured for and a method includes selecting one of a first prefetching control logic and a second prefetching control logic of the processor as a candidate feature, capturing the performance metric of the processor over an inactive sample period when the candidate feature is inactive, capturing a performance metric of the processor over an active sample period when the candidate feature is active, comparing the performance metric of the processor for the active and inactive sample periods, and setting a status of the candidate feature as enabled when the performance metric in the active period indicates improvement over the performance metric in the inactive period, and as disabled when the performance metric in the inactive period indicates improvement over the performance metric in the active period.
    Type: Grant
    Filed: November 8, 2012
    Date of Patent: June 16, 2015
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Sharad Dilip Bade, Alok Garg, John Kalamatianos, Paul Keltcher, Marius Evers, Chitresh Narasimhaiah
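The comparison step in this abstract can be sketched as below. This is an illustration, not the patented implementation: the function name is hypothetical, the metric is assumed to be "higher is better" (for example, instructions per cycle), and keeping the current status on a tie is an assumption, since the abstract does not say what happens when neither period shows improvement.

```python
def update_status(currently_enabled: bool,
                  metric_inactive: float,
                  metric_active: float) -> bool:
    """Return the new enabled/disabled status of a candidate feature.

    metric_inactive: metric sampled while the feature was inactive.
    metric_active:   metric sampled while the feature was active.
    Assumes a larger metric value means better performance.
    """
    if metric_active > metric_inactive:
        return True   # active period indicates improvement: enable
    if metric_inactive > metric_active:
        return False  # inactive period indicates improvement: disable
    return currently_enabled  # tie: keep current status (assumption)


# Hypothetical sample values for one candidate prefetching control logic.
enabled = update_status(False, metric_inactive=1.0, metric_active=1.2)
```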
  • Patent number: 9026739
    Abstract: One or more lines of a cache are prefetched according to a first prefetch routine while training a prefetcher to prefetch one or more lines of the cache according to a second prefetch routine. In response to determining that the prefetcher has been trained, one or more lines of the cache may be prefetched according to the second prefetch routine.
    Type: Grant
    Filed: March 7, 2012
    Date of Patent: May 5, 2015
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Srilatha Manne, Nitya Ranganathan, Paul Keltcher, Donald W. McCauley
  • Patent number: 9021207
    Abstract: In response to a processor core exiting a low-power state, a cache is set to a minimum size so that fewer than all of the cache's entries are available to store data, thus reducing the cache's power consumption. Over time, the size of the cache can be increased to account for heightened processor activity, thus ensuring that processing efficiency is not significantly impacted by a reduced cache size. In some embodiments, the cache size is increased based on a measured processor performance metric, such as an eviction rate of the cache. In some embodiments, the cache size is increased at regular intervals until a maximum size is reached.
    Type: Grant
    Filed: December 20, 2012
    Date of Patent: April 28, 2015
    Assignee: Advanced Micro Devices, Inc.
    Inventors: John Kalamatianos, Edward J. McLellan, Paul Keltcher, Srilatha Manne, Richard E. Klass, James M. O'Connor
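A rough sketch of the resizing policy described in this abstract follows. It is illustrative only: cache size is modeled as a count of active ways, growth is by doubling, and the eviction-rate threshold, class name, and method names are all assumptions rather than details from the patent.

```python
from typing import Optional


class ResizableCache:
    """Toy model of a cache that grows after a low-power exit (hypothetical)."""

    def __init__(self, min_ways: int = 2, max_ways: int = 16,
                 eviction_threshold: float = 0.1):
        self.min_ways = min_ways
        self.max_ways = max_ways
        self.eviction_threshold = eviction_threshold
        self.active_ways = max_ways

    def on_exit_low_power(self) -> None:
        # Start at the minimum size to reduce power consumption.
        self.active_ways = self.min_ways

    def on_interval(self, evictions: Optional[int] = None,
                    accesses: Optional[int] = None) -> None:
        """Called once per interval.

        With counters supplied, grow only when the eviction rate exceeds the
        threshold (metric-driven variant). Without counters, grow every
        interval (regular-interval variant).
        """
        if evictions is not None and accesses:
            if evictions / accesses <= self.eviction_threshold:
                return
        self.active_ways = min(self.max_ways, self.active_ways * 2)


cache = ResizableCache(min_ways=2, max_ways=8)
cache.on_exit_low_power()
cache.on_interval(evictions=5, accesses=100)   # 5% eviction rate: no growth
size_after_quiet_interval = cache.active_ways
cache.on_interval(evictions=50, accesses=100)  # 50% eviction rate: grow
cache.on_interval()                            # regular-interval growth
cache.on_interval()                            # already at maximum
```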
  • Publication number: 20150026414
    Abstract: A prefetcher maintains the state of stored prefetch information, such as a prefetch confidence level, when a prefetch would cross a memory page boundary. The maintained prefetch information can be used both to identify whether the stride pattern for a particular sequence of demand requests persists after the memory page boundary has been crossed, and to continue to issue prefetch requests according to the identified pattern. The prefetcher therefore does not have to re-identify a stride pattern each time a page boundary is crossed by a sequence of demand requests, thereby improving the efficiency and accuracy of the prefetcher.
    Type: Application
    Filed: July 17, 2013
    Publication date: January 22, 2015
    Applicant: Advanced Micro Devices, Inc.
    Inventors: John Kalamatianos, Paul Keltcher, Marius Evers, Chitresh Narasimhaiah
  • Publication number: 20140181410
    Abstract: In response to a processor core exiting a low-power state, a cache is set to a minimum size so that fewer than all of the cache's entries are available to store data, thus reducing the cache's power consumption. Over time, the size of the cache can be increased to account for heightened processor activity, thus ensuring that processing efficiency is not significantly impacted by a reduced cache size. In some embodiments, the cache size is increased based on a measured processor performance metric, such as an eviction rate of the cache. In some embodiments, the cache size is increased at regular intervals until a maximum size is reached.
    Type: Application
    Filed: December 20, 2012
    Publication date: June 26, 2014
    Applicant: Advanced Micro Devices, Inc.
    Inventors: John Kalamatianos, Edward J. McLellan, Paul Keltcher, Srilatha Manne, Richard E. Klass, James M. O'Connor
  • Publication number: 20140173217
    Abstract: A method, an apparatus, and a non-transitory computer readable medium for tracking accuracy and coverage of a prefetcher in a processor are presented. A table is maintained and indexed by an address, wherein each entry in the table corresponds to one address. A number of demand requests that hit in the table on a prefetch, a total number of demand requests, and a number of prefetch requests are counted. The accuracy of the prefetcher is calculated by dividing the number of demand requests that hit in the table on a prefetch by the number of prefetch requests. The coverage of the prefetcher is calculated by dividing the number of demand requests that hit in the table on a prefetch by the total number of demand requests. The table and the counters are reset when a reset condition is reached.
    Type: Application
    Filed: December 19, 2012
    Publication date: June 19, 2014
    Applicant: Advanced Micro Devices, Inc.
    Inventors: John Kalamatianos, Paul Keltcher
  • Publication number: 20140129780
    Abstract: Methods and systems for prefetching data for a processor are provided. A system is configured for and a method includes selecting one of a first prefetching control logic and a second prefetching control logic of the processor as a candidate feature, capturing the performance metric of the processor over an inactive sample period when the candidate feature is inactive, capturing a performance metric of the processor over an active sample period when the candidate feature is active, comparing the performance metric of the processor for the active and inactive sample periods, and setting a status of the candidate feature as enabled when the performance metric in the active period indicates improvement over the performance metric in the inactive period, and as disabled when the performance metric in the inactive period indicates improvement over the performance metric in the active period.
    Type: Application
    Filed: November 8, 2012
    Publication date: May 8, 2014
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Sharad Dilip Bade, Alok Garg, John Kalamatianos, Paul Keltcher, Marius Evers, Chitresh Narasimhaiah
  • Publication number: 20130238861
    Abstract: One or more lines of a cache are prefetched according to a first prefetch routine while training a prefetcher to prefetch one or more lines of the cache according to a second prefetch routine. In response to determining that the prefetcher has been trained, one or more lines of the cache may be prefetched according to the second prefetch routine.
    Type: Application
    Filed: March 7, 2012
    Publication date: September 12, 2013
    Inventors: Srilatha Manne, Nitya Ranganathan, Paul Keltcher, Donald W. McCauley
  • Patent number: 7139897
    Abstract: Circuit arrangement and method for dispatching computer instructions. In a processor having a plurality of types of execution units, the computer instructions are grouped in bundles, and each bundle includes a plurality of instructions and an associated index code. Template values are stored in a plurality of template registers, and each template value specifies types of execution units for a bundle of instructions and those instructions in a bundle that are executable in parallel. A dispatch logic circuit is coupled to the template registers and is responsive to an input bundle of instructions and associated index value. The dispatch logic circuit reads a code from a selected one of the plurality of template registers referenced by the index value and issues one or more selected instructions in the bundle to at least one execution unit of a selected type responsive to the code read from the selected one of the plurality of template registers.
    Type: Grant
    Filed: April 1, 2002
    Date of Patent: November 21, 2006
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Paul Keltcher, Gary Vondran
  • Patent number: 6915402
    Abstract: A method and system providing dynamic allocation of memory through hardware is disclosed. An embodiment provides for a multi-processor system providing for a secure partitioned memory. The system comprises a processor(s), a hardware implemented memory router coupled to the processor(s), and memory coupled to the memory router. The memory router stores memory partition information, which describes the memory allocated to the processor(s). Furthermore, the memory router maps a memory access request from a processor to an address in the memory.
    Type: Grant
    Filed: May 23, 2001
    Date of Patent: July 5, 2005
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Kenneth Mark Wilson, Paul Keltcher, Yoshio Turner
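The mapping role of the memory router in this abstract can be illustrated in software, although the patent describes a hardware component. The sketch below is an assumption-laden model: each processor gets one contiguous partition described by a base and a size, and requests outside the partition are rejected. The class and method names are hypothetical.

```python
from typing import Dict, Tuple


class MemoryRouter:
    """Software model of a partition-aware memory router (hypothetical)."""

    def __init__(self) -> None:
        # Partition information: processor id -> (base address, size in bytes).
        self.partitions: Dict[int, Tuple[int, int]] = {}

    def allocate(self, cpu: int, base: int, size: int) -> None:
        self.partitions[cpu] = (base, size)

    def translate(self, cpu: int, offset: int) -> int:
        """Map a processor-relative offset to an address in memory."""
        base, size = self.partitions[cpu]
        if not 0 <= offset < size:
            # Keeps a processor from reaching outside its own partition.
            raise PermissionError(f"cpu {cpu}: offset {offset:#x} out of range")
        return base + offset


router = MemoryRouter()
router.allocate(cpu=0, base=0x0000, size=0x1000)
router.allocate(cpu=1, base=0x1000, size=0x1000)
mapped = router.translate(cpu=1, offset=0x10)
```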
  • Patent number: 6874014
    Abstract: Multiple processors are mounted on a single die. The die is connected to a memory storing multiple operating systems or images of multiple operating systems. Each of the processors or a group of one or more of the processors is operable to execute a distinct one of the multiple operating systems. Therefore, resources for a single operating system may be dedicated to one processor or a group of processors. Consequently, a large number of processors mounted on a single die can operate efficiently.
    Type: Grant
    Filed: May 29, 2001
    Date of Patent: March 29, 2005
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Stephen E. Richardson, Gary Vondran, Stuart Siu, Paul Keltcher, Shankar Venkataraman, Padmanabha Venkitakrishnan, Joseph Ku
  • Patent number: 6782453
    Abstract: In response to determining a requested line of data is not stored within a local memory, the requested line of data is written to the local memory from a remote memory. Additionally, a victim page is selected in the local memory in response to the requested line of data not being in the local memory and it is determined whether one or more lines of the victim page are dirty. Furthermore, the one or more dirty lines are written to the remote memory in response to determining that the one or more lines are dirty and the requested line of data is fetched from the remote memory. Moreover, the requested line of data is stored within the page of data at a location previously occupied by the victim page.
    Type: Grant
    Filed: February 12, 2002
    Date of Patent: August 24, 2004
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Paul Keltcher, Stephen Richardson
  • Publication number: 20030188132
    Abstract: Circuit arrangement and method for dispatching computer instructions. In a processor having a plurality of types of execution units, the computer instructions are grouped in bundles, and each bundle includes a plurality of instructions and an associated index code. Template values are stored in a plurality of template registers, and each template value specifies types of execution units for a bundle of instructions and those instructions in a bundle that are executable in parallel. A dispatch logic circuit is coupled to the template registers and is responsive to an input bundle of instructions and associated index value. The dispatch logic circuit reads a code from a selected one of the plurality of template registers referenced by the index value and issues one or more selected instructions in the bundle to at least one execution unit of a selected type responsive to the code read from the selected one of the plurality of template registers.
    Type: Application
    Filed: April 1, 2002
    Publication date: October 2, 2003
    Applicant: Hewlett-Packard Company
    Inventors: Paul Keltcher, Gary Vondran
  • Publication number: 20030154348
    Abstract: In response to determining a requested line of data is not stored within a local memory, the requested line of data is written to the local memory from a remote memory. Additionally, a victim page is selected in the local memory in response to the requested line of data not being in the local memory and it is determined whether one or more lines of the victim page are dirty. Furthermore, the one or more dirty lines are written to the remote memory in response to determining that the one or more lines are dirty and the requested line of data is fetched from the remote memory. Moreover, the requested line of data is stored within the page of data at a location previously occupied by the victim page.
    Type: Application
    Filed: February 12, 2002
    Publication date: August 14, 2003
    Inventors: Paul Keltcher, Stephen Richardson
  • Publication number: 20030023794
    Abstract: A cache coherent multiple processor integrated circuit. The circuit includes a plurality of processor units. The processor units are each provided with a cache unit. An embedded RAM unit is included for storing instructions and data for the processor units. A cache coherent bus is coupled to the processor units and the embedded RAM unit. The bus is configured to provide cache coherent snooping commands to enable the processor units to ensure cache coherency between their respective cache units and the embedded RAM unit. The multiple processor integrated circuit can further include an input output unit coupled to the bus to provide input and output transactions for the processor units. The bus is configured to provide split transactions for the processor units coupled to the bus, providing better bandwidth utilization of the bus. The bus can be configured to transfer an entire cache line for the cache units of the processor units in a single clock cycle, wherein the bus is 256 bits wide.
    Type: Application
    Filed: July 26, 2001
    Publication date: January 30, 2003
    Inventors: Padmanabha I. Venkitakrishnan, Shankar Venkataraman, Paul Keltcher, Stuart C. Siu, Stephen E. Richardson, Gary Lee Vondran
  • Publication number: 20020184328
    Abstract: Multiple processors are mounted on a single die. The die is connected to a memory storing multiple operating systems or images of multiple operating systems. Each of the processors or a group of one or more of the processors is operable to execute a distinct one of the multiple operating systems. Therefore, resources for a single operating system may be dedicated to one processor or a group of processors. Consequently, a large number of processors mounted on a single die can operate efficiently.
    Type: Application
    Filed: May 29, 2001
    Publication date: December 5, 2002
    Inventors: Stephen E. Richardson, Gary Lee Vondran, Stuart C. Siu, Paul Keltcher, Shankar Venkataraman, Padmanabha I. Venkitakrishnan, Joseph Weiyeh Ku
  • Publication number: 20020178337
    Abstract: A method and system providing dynamic allocation of memory through hardware is disclosed. An embodiment provides for a multi-processor system providing for a secure partitioned memory. The system comprises a processor(s), a hardware implemented memory router coupled to the processor(s), and memory coupled to the memory router. The memory router stores memory partition information, which describes the memory allocated to the processor(s). Furthermore, the memory router maps a memory access request from a processor to an address in the memory.
    Type: Application
    Filed: May 23, 2001
    Publication date: November 28, 2002
    Inventors: Kenneth Mark Wilson, Paul Keltcher, Yoshio Turner