Patents by Inventor Per Hammarlund

Per Hammarlund has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 6912648
    Abstract: A method for stick and spoke replay in a processor. The method of one embodiment comprises dispatching an instruction for execution. The instruction is speculatively executed. It is determined whether the instruction executed correctly. The instruction is routed to a replay mechanism if the instruction did not execute correctly. It is determined incorrect execution of the instruction is due to a long latency operation. The instruction is routed for immediate re-execution if the incorrect execution is not due to the long latency operation. The routing of the instruction for re-execution is delayed if the incorrect execution is due to the long latency operation. The instruction is re-executed if the instruction did not execute correctly. The instruction is retired if the instruction executed correctly.
    Type: Grant
    Filed: December 31, 2001
    Date of Patent: June 28, 2005
    Assignee: Intel Corporation
    Inventors: Per Hammarlund, Stephan Jourdan
  • Publication number: 20050138339
    Abstract: Embodiments of the present invention relate to a memory management scheme and apparatus that enables efficient memory renaming. The method includes computing a store address, writing the store address in a first storage, writing data associated with the store address to a memory, and de-allocating the store address from the first storage, allocating the store address in a second storage, predicting a load instruction to be memory renamed, computing a load store source index, computing a load address, disambiguating the memory renamed load instruction, and retiring the memory renamed load instruction, if the store instruction is still allocated in at least one of the first storage and the second storage and should have effectively provided to the load the full data. The method may also include re-executing the load instruction without memory renaming, if the store instruction is not in at the first storage or in the second storage.
    Type: Application
    Filed: December 23, 2003
    Publication date: June 23, 2005
    Inventors: Sebastien Hily, Per Hammarlund
  • Publication number: 20050138290
    Abstract: Embodiments of the present invention relate to selectively re-executing instructions in a computer processor based on their association with a particular cache miss.
    Type: Application
    Filed: December 23, 2003
    Publication date: June 23, 2005
    Inventors: Per Hammarlund, Avinash Sodani, James Allen, Ronak Singhal, Francis McKeen, Hermann Gartler
  • Publication number: 20050138295
    Abstract: Embodiments of the present invention relate to a memory management scheme and apparatus that enables efficient cache memory management. The method includes writing an entry to a store buffer at execute time; determining if the entry's address is in a first-level cache associated with the store buffer before retirement; and setting a status bit associated with the entry in said store buffer, if the address is in the cache in either exclusive or modified state. The method further includes immediately writing the entry to the first-level cache at or after retirement when the status bit is set; and de-allocating the entry from said store buffer at retirement. The method further may comprise resetting the status bit if the cacheline is allocated over or is evicted from the cache before the store buffer entry attempts to write to the cache.
    Type: Application
    Filed: December 23, 2003
    Publication date: June 23, 2005
    Inventors: Per Hammarlund, Stephan Jourdan, Sebastien Hily, Aravindh Baktha, Hermann Gartler
  • Publication number: 20050138297
    Abstract: Embodiments of the present invention relate to a system and method for associating a register file cache with a register file in a computer processor.
    Type: Application
    Filed: December 23, 2003
    Publication date: June 23, 2005
    Inventors: Avinash Sodani, Per Hammarlund, Samie Samaan, Kurt Kreitzer, Tom Fletcher
  • Publication number: 20050138334
    Abstract: Embodiments of the present invention relate to a method and system for providing virtual identifiers corresponding to physical registers in a computer processor. According to the embodiments, the virtual identifiers may be used to represent the physical registers during operations in a pipeline of the processor.
    Type: Application
    Filed: December 23, 2003
    Publication date: June 23, 2005
    Inventors: Avinash Sodani, Per Hammarlund, Stephan Jourdan
  • Publication number: 20050071518
    Abstract: According to an embodiment of the invention, a method and apparatus for flag value renaming. An embodiment of a method comprises setting a flag for a processor via a first instruction, the first instruction being either a direct update instruction or an indirect update instruction; if the setting of the flag is by a direct update instruction, executing a succeeding second instruction that reads the flag prior to completion of the first instruction; and if the setting of the flag is by an indirect update instruction, delaying the second instruction until after completion of the first instruction.
    Type: Application
    Filed: September 30, 2003
    Publication date: March 31, 2005
    Inventors: Nicholas Samra, Stephan Jourdan, Jonathan Combs, Avinash Sodani, Per Hammarlund, Michael Cornaby
  • Publication number: 20050071614
    Abstract: A method and system for multiple branch paths in a microprocessor is described. The method includes assigning an identification number (ID) to each of a plurality of micro-operations (uops) to identify a branch path to which the uop belongs, determining whether one or more branches are predicted correctly, determining which of the one or more branch paths are dependent on a mispredicted branch, and determining whether one or more of the plurality of uops belong to a branch path that is dependent on a mispredicted branch based on their assigned IDs.
    Type: Application
    Filed: September 30, 2003
    Publication date: March 31, 2005
    Inventors: Stephan Jourdan, Per Hammarlund, Avinash Sodani, James Allen, Francis McKeen, Pierre Michaud
  • Publication number: 20050055541
    Abstract: Embodiments of an apparatus, system and method enhance the efficiency of processor resource utilization during instruction prefetching via one or more speculative threads. Renamer logic and a map table are utilized to perform filtering of instructions in a speculative thread instruction stream. The map table includes a yes-a-thing bit to indicate whether the associated physical register's content reflects the value that would be computed by the main thread. A thread progress beacon table is utilized to track relative progress of a main thread and a speculative helper thread. Based upon information in the thread progress beacon table, the main thread may effect termination of a helper thread that is not likely to provide a performance benefit for the main thread.
    Type: Application
    Filed: September 8, 2003
    Publication date: March 10, 2005
    Inventors: Tor Aamodt, Hong Wang, Per Hammarlund, John Shen, Steve Liao, Perry Wang
  • Publication number: 20050027941
    Abstract: Apparatus, system and methods are provided for performing speculative data prefetching in a chip multiprocessor (CMP). Data is prefetched by a helper thread that runs on one core of the CMP while a main program runs concurrently on another core of the CMP. Data prefetched by the helper thread is provided to the helper core. For one embodiment, the data prefetched by the helper thread is pushed to the main core. It may or may not be provided to the helper core as well. A push of prefetched data to the main core may occur during a broadcast of the data to all cores of an affinity group. For at least one other embodiment, the data prefetched by a helper thread is provided, upon request from the main core, to the main core from the helper core's local cache.
    Type: Application
    Filed: July 31, 2003
    Publication date: February 3, 2005
    Inventors: Hong Wang, Perry Wang, Jeffery Brown, Per Hammarlund, George Chrysos, Doron Orenstein, Steve Liao, John Shen
  • Publication number: 20050027963
    Abstract: A system and method for reducing linear address aliasing is described. In one embodiment, a portion of a linear address is combined with a process identifier, e.g., a page directory base pointer to form an adjusted-linear address. The page directory base pointer is unique to a process and combining it with a portion of the linear address produces an adjusted-linear address that provides a high probability of no aliasing. A portion of the adjusted-linear address is used to search an adjusted-linear-addressed cache memory for a data block specified by the linear address. If the data block does not reside in the adjusted-linear-addressed cache memory, then a replacement policy selects one of the cache lines in the adjusted-linear-addressed cache memory and replaces the data block of the selected cache line with a data block located at a physical address produced from translating the linear address.
    Type: Application
    Filed: August 13, 2004
    Publication date: February 3, 2005
    Inventors: Herbert Hum, Stephan Jourdan, Per Hammarlund
  • Publication number: 20040267996
    Abstract: A method, apparatus, and system are provided for monitoring locks using monitor-memory wait. According to one embodiment, a node associated with a contended lock is monitored; and a processor seeking the contended lock is put to sleep until a monitor event occurs.
    Type: Application
    Filed: June 27, 2003
    Publication date: December 30, 2004
    Inventors: Per Hammarlund, James B. Crossland, Anil Aggarwal, Shivnandan D. Kaushik
  • Publication number: 20040225867
    Abstract: An article comprising an instruction stored on a storage medium. The instruction includes opcode field storing an opcode signal and an operand field storing an operand signal. The operand is compressed prior to being stored in the operand field.
    Type: Application
    Filed: February 5, 2004
    Publication date: November 11, 2004
    Inventors: Alan B. Kyker, Per Hammarlund, Chan Lee, Robert F. Krick, Hitesh Ahuja, William Alexander, Joseph Rohlman
  • Publication number: 20040163083
    Abstract: Method, apparatus, and program means for a programmable event driven yield mechanism that may activate other threads. In one embodiment, an apparatus includes execution resources to execute a plurality of instructions and a monitor to detect a condition indicating a low level of progress. The monitor can disrupt processing of a program by transferring to a handler in response to detecting the condition indicating a low level of progress. In another embodiment, thread switch logic may be coupled to a plurality of event monitors which monitor events within the multithreading execution logic. The thread switch logic switches threads based at least partially on a programmable condition of one or more of the performance monitors.
    Type: Application
    Filed: February 19, 2003
    Publication date: August 19, 2004
    Inventors: Hong Wang, Per Hammarlund, Xiang Zou, John Shen, Xinmin Tian, Milind Girkar, Perry Wang, Piyush Desai
  • Publication number: 20040154010
    Abstract: A method for generating instructions to facilitate control-quasi-independent-point multithreading is provided. A spawn point and control-quasi-independent-point are determined. An instruction stream is generated to partition a program so that portions of the program are parallelized by speculative threads. A method of performing control-quasi-independent-point guided speculative multithreading includes spawning a speculative thread when the spawn point is encountered. An embodiment of the method further includes performing speculative precomputation to determine a live-in value for the speculative thread.
    Type: Application
    Filed: January 31, 2003
    Publication date: August 5, 2004
    Inventors: Pedro Marcuello, Antonio Gonzalez, Hong Wang, John P. Shen, Per Hammarlund, Gerolf F. Hoflehner, Perry H. Wang, Steve Shih-wei Liao
  • Publication number: 20040154012
    Abstract: The latencies associated with retrieving instruction information for a main thread are decreased through the use of a simultaneous helper thread. The helper thread is permitted to execute Store instructions. Store blocker logic operates to prevent data associated with a Store instruction in a helper thread from being committed to memory. Dependence blocker logic operates to prevent data associated with a Store instruction in a speculative helper thread from being bypassed to a Load instruction in a non-speculative thread.
    Type: Application
    Filed: August 1, 2003
    Publication date: August 5, 2004
    Inventors: Hong Wang, Tor Aamodt, Per Hammarlund, John Shen, Xinmin Tian, Milind Girkar, Perry Wang, Steve Shih-wei Liao
  • Publication number: 20040154011
    Abstract: The latencies associated with retrieving instruction information for a main thread are decreased through the use of a simultaneous helper thread. The helper thread is a speculative prefetch thread to perform instruction prefetch and/or trace pre-build for the main thread.
    Type: Application
    Filed: April 24, 2003
    Publication date: August 5, 2004
    Inventors: Hong Wang, Tor M. Aamodt, Pedro Marcuello, Jared W. Stark, John P. Shen, Antonio Gonzalez, Per Hammarlund, Gerolf F. Hoflehner, Perry H. Wang, Steve Shih-wei Liao
  • Publication number: 20040154019
    Abstract: Methods and an apparatus for generating a speculative helper thread for cache prefetch are disclosed. The disclosed techniques select spawn-target pairs based on profile data and a series of calculations. Helper threads are then generated to launch at the selected spawn points in order to prefetch software instructions (or data) for a single-threaded software application. The generated helper threads are then attached to the single-threaded software application to create a multi-threaded software application.
    Type: Application
    Filed: April 24, 2003
    Publication date: August 5, 2004
    Inventors: Tor M. Aamodt, Hong Wang, John Shen, Per Hammarlund
  • Patent number: 6711669
    Abstract: An article comprising an instruction stored on a storage medium. The instruction includes opcode field storing an opcode signal and an operand field storing an operand signal. The operand is compressed prior to being stored in the operand field.
    Type: Grant
    Filed: January 10, 2003
    Date of Patent: March 23, 2004
    Assignee: Intel Corporation
    Inventors: Alan B. Kyker, Per Hammarlund, Chan Lee, Robert F. Krick, Hitesh Ahuja, William Alexander, Joseph Rohlman
  • Publication number: 20040049491
    Abstract: A resource including a plurality of elements, such as a cache memory having a plurality of addressable blocks or ways, is shared between two or more components based on the operation of an access controller. The access controller, controls which of the elements are accessed exclusively by a component and which are shared by two or more components. In one embodiment, the components include the execution of instructions in first and second threads in a multi-threaded processor environment. To prevent one thread from dominating the cache memory, a first mask value is provided for each thread. The access of the components to the cache memory is controlled by the first mask values. For example, the mask values can be selected so as to prevent a thread from accessing one or more of the ways in the cache (e.g., to evict, erase, delete, etc. a particular way in the cache). Also, the mask values can be set to allow certain of the ways in the cache to be shared between threads.
    Type: Application
    Filed: September 10, 2003
    Publication date: March 11, 2004
    Inventors: Per Hammarlund, Robert Greiner