Patents by Inventor Per H. Hammarlund

Per H. Hammarlund has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 7502912
    Abstract: A method and apparatus for rescheduling operations in a processor. More particularly, the present invention relates to optimally using a scheduler resource in a processor by analyzing, predicting, and sorting the write order of instructions into the scheduler so that the duration the instructions sit idle in the scheduler is minimized. The analyses, prediction, and sorting may be done between an instruction queue and a scheduler by using delay units. The prediction can be based on history (latency, dependency, and resource) or on a general prediction scheme.
    Type: Grant
    Filed: December 30, 2003
    Date of Patent: March 10, 2009
    Assignee: Intel Corporation
    Inventors: Avinash Sodani, Per H. Hammarlund, Stephan J. Jourdan
  • Patent number: 7502892
    Abstract: Embodiments of the present invention relate to cache coherency. In an embodiment of the invention, a cache includes one or more cache lines. A store pipeline may retrieve a tag associated with one of the cache lines. The data associated with the cache line may not retrieved and the cache line may be updated if, based on the tag, the cache line is determined to be in a modified or exclusive state.
    Type: Grant
    Filed: December 30, 2003
    Date of Patent: March 10, 2009
    Assignee: Intel Corporation
    Inventors: Per H. Hammarlund, Hermann W. Gartler
  • Patent number: 7174428
    Abstract: Embodiments of the present invention provide a method, apparatus and system for memory renaming. In one embodiment, a decode unit may decode a load instruction. If the load instruction is predicted to be memory renamed, the load instruction may have a predicted store identifier associated with the load instruction. The decode unit may transform the load instruction that is predicted to be memory renamed into a data move instruction and a load check instruction. The data move instruction may read data from the cache based on the predicted store identifier and load check instruction may compare an identifier associated with an identified source store with the predicted store identifier. A retirement unit may retire the load instruction if the predicted store identifier matches an identifier associated with the identified source store.
    Type: Grant
    Filed: December 29, 2003
    Date of Patent: February 6, 2007
    Assignee: Intel Corporation
    Inventors: Sebastien Hily, Per H. Hammarlund, Avinash Sodani
  • Patent number: 7130965
    Abstract: Embodiments of the present invention relate to a memory management scheme and apparatus that enables efficient cache memory management. The method includes writing an entry to a store buffer at execute time; determining if the entry's address is in a first-level cache associated with the store buffer before retirement; and setting a status bit associated with the entry in said store buffer, if the address is in the cache in either exclusive or modified state. The method further includes immediately writing the entry to the first-level cache at or after retirement when the status bit is set; and de-allocating the entry from said store buffer at retirement. The method further may comprise resetting the status bit if the cacheline is allocated over or is evicted from the cache before the store buffer entry attempts to write to the cache.
    Type: Grant
    Filed: December 23, 2003
    Date of Patent: October 31, 2006
    Assignee: Intel Corporation
    Inventors: Per H. Hammarlund, Stephan Jourdan, Sebastien Hily, Aravindh Baktha, Hermann Gartler
  • Patent number: 6990551
    Abstract: A system and method for reducing linear address aliasing is described. In one embodiment, a portion of a linear address is combined with a process identifier, e.g., a page directory base pointer to form an adjusted-linear address. The page directory base pointer is unique to a process and combining it with a portion of the linear address produces an adjusted-linear address that provides a high probability of no aliasing. A portion of the adjusted-linear address is used to search an adjusted-linear-addressed cache memory for a data block specified by the linear address. If the data block does not reside in the adjusted-linear-addressed cache memory, then a replacement policy selects one of the cache lines in the adjusted-linear-addressed cache memory and replaces the data block of the selected cache line with a data block located at a physical address produced from translating the linear address.
    Type: Grant
    Filed: August 13, 2004
    Date of Patent: January 24, 2006
    Assignee: Intel Corporation
    Inventors: Herbert H. J. Hum, Stephan J. Jourdan, Per H. Hammarlund
  • Patent number: 6981129
    Abstract: Breaking replay dependency loops in a processor using a rescheduled replay queue. The processor comprises a replay queue to receive a plurality of instructions, and an execution unit to execute the plurality of instructions. A scheduler is coupled between the replay queue and the execution unit. The scheduler speculatively schedules instructions for execution and increments a counter for each of the plurality of instructions to reflect the number of times each of the plurality of instructions has been executed. The scheduler also dispatches each instruction to the execution unit either when the counter does not exceed a maximum number of replays or, if the counter exceeds the maximum number of replays, when the instruction is safe to execute. A checker is coupled to the execution unit to determine whether each instruction has executed successfully. The checker is also coupled to the replay queue to communicate to the replay queue each instruction that has not executed successfully.
    Type: Grant
    Filed: November 2, 2000
    Date of Patent: December 27, 2005
    Assignee: Intel Corporation
    Inventors: Darrell D. Boggs, Douglas M. Carmean, Per H. Hammarlund, Francis X. McKeen, David J. Sager, Ronak Singhal
  • Patent number: 6877086
    Abstract: Rescheduling multiple micro-operations in a processor using a replay queue. The processor comprises a replay queue to receive a plurality of instructions and an execution unit to execute the plurality of instructions. A scheduler is coupled between the replay queue and the execution unit. The scheduler speculatively schedules instructions for execution and dispatches each instruction to the execution unit. A checker is coupled to the execution unit to determine whether each instruction has executed successfully. The checker is also coupled to the replay queue to communicate to the replay queue each instruction that has not executed successfully.
    Type: Grant
    Filed: November 2, 2000
    Date of Patent: April 5, 2005
    Assignee: Intel Corporation
    Inventors: Darrell D. Boggs, Douglas M. Carmean, Per H. Hammarlund, Francis X. McKeen, David J. Sager, Ronak Singhal
  • Patent number: 6798364
    Abstract: A method and apparatus for variable length coding is described. A method comprises receiving a group of data having a group of set values, identifying a group of positions of the group of set values within the group of data without branching, for each of the group of positions, encoding a run of non-set values preceding each of the group of positions.
    Type: Grant
    Filed: February 5, 2002
    Date of Patent: September 28, 2004
    Assignee: Intel Corporation
    Inventors: Yen-Kuang Chen, Matthew J. Holliman, Herbert Hum, Per H. Hammarlund, Thomas Huff, William W. Macy
  • Patent number: 6675282
    Abstract: A system and method for storing only one copy of a data block that is shared by two or more processes is described. In one embodiment, a global/non-global predictor predicts whether a data block, specified by a linear address, is shared or not shared by two or more processes. If the data block is predicted to be non-shared, then a portion of the linear address referencing the data block is combined with a process identifier that is unique to form a global/non-global linear address. If the data block is predicted to be shared, then the global/non-global linear address is the linear address itself. If the prediction as to whether or not the data block is shared is incorrect, then the actual value of whether or not the data block is shared is used in computing a corrected global/non-global linear address.
    Type: Grant
    Filed: February 12, 2003
    Date of Patent: January 6, 2004
    Assignee: Intel Corporation
    Inventors: Herbert H. J. Hum, Stephan J. Jourdan, Deborrah Marr, Per H. Hammarlund
  • Patent number: 6643747
    Abstract: A request received from a requester to access a processor cache or register file or the like is buffered, by storing requestor identification, request type, address, and a status of the request. This buffered request may be forwarded to the cache if it has the highest priority among a number of buffered requests that also wish to access the cache. The priority is a function of at least the requestor identification, the requester type, and the status of the request. For buffered requests which include a read, the buffered request is deleted after, not before, receiving an indication that the requestor has received the data read from the cache.
    Type: Grant
    Filed: December 27, 2000
    Date of Patent: November 4, 2003
    Assignee: Intel Corporation
    Inventors: Per H. Hammarlund, Douglas M. Carmean, Michael D. Upton
  • Publication number: 20030146858
    Abstract: A method and apparatus for variable length coding is described. A method comprises receiving a group of data having a group of set values, identifying a group of positions of the group of set values within the group of data without branching, for each of the group of positions, encoding a run of non-set values preceding each of the group of positions.
    Type: Application
    Filed: February 5, 2002
    Publication date: August 7, 2003
    Inventors: Yen-Kuang Chen, Matthew J. Holliman, Herbert Hum, Per H. Hammarlund, Thomas Huff, William W. Macy
  • Publication number: 20030120892
    Abstract: A system and method for storing only one copy of a data block that is shared by two or more processes is described. In one embodiment, a global/non-global predictor predicts whether a data block, specified by a linear address, is shared or not shared by two or more processes. If the data block is predicted to be non-shared, then a portion of the linear address referencing the data block is combined with a process identifier that is unique to form a global/non-global linear address. If the data block is predicted to be shared, then the global/non-global linear address is the linear address itself. If the prediction as to whether or not the data block is shared is incorrect, then the actual value of whether or not the data block is shared is used in computing a corrected global/non-global linear address.
    Type: Application
    Filed: February 12, 2003
    Publication date: June 26, 2003
    Applicant: Intel Corporation
    Inventors: Herbert H.J. Hum, Stephan J. Jourdan, Deborrah Marr, Per H. Hammarlund
  • Patent number: 6560690
    Abstract: A system and method for storing only one copy of a data block that is shared by two or more processes is described. In one embodiment, a global/non-global predictor predicts whether a data block, specified by a linear address, is shared or not shared by two or more processes. If the data block is predicted to be non-shared, then a portion of the linear address referencing the data block is combined with a process identifier that is unique to form a global/non-global linear address. If the data block is predicted to be shared, then the global/non-global linear address is the linear address itself. If the prediction as to whether or not the data block is shared is incorrect, then the actual value of whether or not the data block is shared is used in computing a corrected global/non-global linear address.
    Type: Grant
    Filed: December 29, 2000
    Date of Patent: May 6, 2003
    Assignee: Intel Corporation
    Inventors: Herbert H. J. Hum, Stephan J. Jourdan, Deborrah Marr, Per H. Hammarlund
  • Publication number: 20020087824
    Abstract: A system and method for reducing linear address aliasing is described. In one embodiment, a portion of a linear address is combined with a process identifier, e.g., a page directory base pointer to form an adjusted-linear address. The page directory base pointer is unique to a process and combining it with a portion of the linear address produces an adjusted-linear address that provides a high probability of no aliasing. A portion of the adjusted-linear address is used to search an adjusted-linear-addressed cache memory for a data block specified by the linear address. If the data block does not reside in the adjusted-linear-addressed cache memory, then a replacement policy selects one of the cache lines in the adjusted-linear-addressed cache memory and replaces the data block of the selected cache line with a data block located at a physical address produced from translating the linear address.
    Type: Application
    Filed: December 29, 2000
    Publication date: July 4, 2002
    Inventors: Herbert H.J. Hum, Stephan J. Jourdan, Per H. Hammarlund
  • Publication number: 20020087795
    Abstract: A system and method for storing only one copy of a data block that is shared by two or more processes is described. In one embodiment, a global/non-global predictor predicts whether a data block, specified by a linear address, is shared or not shared by two or more processes. If the data block is predicted to be non-shared, then a portion of the linear address referencing the data block is combined with a process identifier that is unique to form a global/non-global linear address. If the data block is predicted to be shared, then the global/non-global linear address is the linear address itself. If the prediction as to whether or not the data block is shared is incorrect, then the actual value of whether or not the data block is shared is used in computing a corrected global/non-global linear address.
    Type: Application
    Filed: December 29, 2000
    Publication date: July 4, 2002
    Inventors: Herbert H.J. Hum, Stephan J. Jourdan, Deborrah Marr, Per H. Hammarlund
  • Publication number: 20020083244
    Abstract: A request received from a requester to access a processor cache or register file or the like is buffered, by storing requestor identification, request type, address, and a status of the request. This buffered request may be forwarded to the cache if it has the highest priority among a number of buffered requests that also wish to access the cache. The priority is a function of at least the requestor identification, the requester type, and the status of the request. For buffered requests which include a read, the buffered request is deleted after, not before, receiving an indication that the requestor has received the data read from the cache.
    Type: Application
    Filed: December 27, 2000
    Publication date: June 27, 2002
    Inventors: Per H. Hammarlund, Douglas M. Carmean, Michael D. Upton
  • Patent number: 6105111
    Abstract: A cache technique for maximizing cache efficiency by assigning ages to elements which access the cache, is described. In one embodiment, the cache technique includes receiving a first element of a first type by a cache and writing the first element in a set of the cache. The first element has a first age. The cache technique further includes receiving a second element of a second type by the cache and writing the second element in the set of the cache. The second element has a middle age, where the first age is a more recently used age than the middle age. In another embodiment, the cache technique includes receiving a first element of a first stream by a cache and writing the first element in a set of the cache. The first element has a first age. The cache technique further includes receiving a second element of a second stream by the cache and writing the second element in the set of the cache. The second element has a middle age, where the first age is a more recently used age than the middle age.
    Type: Grant
    Filed: March 31, 1998
    Date of Patent: August 15, 2000
    Assignee: Intel Corporation
    Inventors: Per H. Hammarlund, Glenn J. Hinton