Patents by Inventor Per Hammarlund

Per Hammarlund has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20070006231
    Abstract: In an embodiment, a method is provided. The method includes managing user-level threads on a first instruction sequencer in response to executing user-level instructions on a second instruction sequencer that is under control of an application-level program. A first user-level thread is run on the second instruction sequencer and contains one or more user-level instructions. A first user-level instruction has at least one of 1) a field that references one or more instruction sequencers or 2) an implicit reference, through a pointer, to code that specifically addresses one or more instruction sequencers when the code is executed.
    Type: Application
    Filed: June 30, 2005
    Publication date: January 4, 2007
    Inventors: Hong Wang, John Shen, Ed Grochowski, James Held, Bryant Bigbee, Shivnandan Kaushik, Gautham Chinya, Xiang Zou, Per Hammarlund, Xinmin Tian, Anil Aggarwal, Scott Rodgers, Prashant Sethi, Baiju Patel, Richard Hankins
  • Publication number: 20060294347
    Abstract: Method, apparatus, and system for a programmable event driven yield mechanism that may activate other threads. The yield mechanism may allow triggering of a service thread that may execute concurrently with a main thread upon occurrence of an architecturally-defined condition. The service thread may be activated, in response to the condition, with limited intervention of an operating system. In one embodiment, an apparatus includes execution resources to execute a plurality of instructions and a monitor to detect an architecturally-defined condition. The apparatus may include an event handler to handle a yield event generated when the architecturally-defined condition has been detected. An architectural mechanism, including processor instructions and channel registers, may be utilized to allow user-level code to enable the yield event mechanism. Other embodiments are also described and claimed.
    Type: Application
    Filed: May 19, 2005
    Publication date: December 28, 2006
    Inventors: Xiang Zou, Hong Wang, Scott Rodgers, Darrell Boggs, Bryant Bigbee, Shivanandan Kaushik, Anil Aggarwal, Ittai Anati, Doron Orenstein, Per Hammarlund, John Shen, Larry Smith, James Crossland, Chris Newburn
  • Publication number: 20060294326
    Abstract: A processor may include an address monitor table and an atomic update table to support speculative threading. The processor may also include one or more registers to maintain state associated with execution of speculative threads. The processor may support one or more of the following primitives: an instruction to write to a register of the state, an instruction to trigger the committing of buffered memory updates, an instruction to read a status register of the state, and/or an instruction to clear one of the state bits associated with trap/exception/interrupt handling. Other embodiments are also described and claimed.
    Type: Application
    Filed: June 23, 2005
    Publication date: December 28, 2006
    Inventors: Quinn Jacobson, Hong Wang, John Shen, Gautham Chinya, Per Hammarlund, Xiang Zou, Bryant Bigbee, Shivnandan Kaushik
  • Patent number: 7149883
    Abstract: A buffer mechanism for buffering microinstructions between a trace cache and an allocator performs a compacting operation by overwriting entries within a queue, known not to store valid instructions or data, with valid instructions or data. Following a write operation to a queue included within the buffer mechanism, pointer logic determines whether the entries to which instructions or data have been written include the valid data or instructions. If an entry is shown to be invalid, the write pointer is not advanced past the relevant entry. In this way, an immediately following write operation will overwrite the invalid data or instruction with data or instruction. The overwriting instruction or data will again be subject to scrutiny (e.g., a qualitative determination) to determine whether it is valid or invalid, and will only be retained within the queue if valid.
    Type: Grant
    Filed: March 30, 2000
    Date of Patent: December 12, 2006
    Assignee: Intel Corporation
    Inventors: Per Hammarlund, Robert Krick
  • Publication number: 20060224858
    Abstract: Disclosed are embodiments of a system, methods and mechanism for management and translation of mapping between logical sequencer addresses and physical or logical sequencers in a multi-sequencer multithreading system. A mapping manager may manage assignment and mapping of logical sequencer addresses or pages to actual sequencers or frames of the system. Rationing logic associated with the mapping manager may take into account sequencer attributes when such mapping is performed. Relocation logic associated with the mapping manager may manage spill and fill of context information to/from a backing store when re-mapping actual sequencers. Sequencers may be allocated singly, or may be allocated as part of partitioned blocks. The mapping manager may also include translation logic that provides an identifier for the mapped sequencer each time a logical sequencer address is used in a user program. Other embodiments are also described and claimed.
    Type: Application
    Filed: April 5, 2005
    Publication date: October 5, 2006
    Inventors: Hong Wang, Gautham Chinya, Richard Hankins, Shivnandan Kaushik, Bryant Bigbee, Per Hammarlund, Xiang Zou, Jason Brandt, Prashant Sethi, Douglas Carmean, Baiju Patel, John Shen, Scott Rodgers, Ryan Rakvic, John Reid, David Poulsen, Sanjiv Shah, James Held, James Abel
  • Patent number: 7114057
    Abstract: An article comprising an instruction stored on a storage medium. The instruction includes an opcode field storing an opcode signal and an operand field storing an operand signal. The operand is compressed prior to being stored in the operand field.
    Type: Grant
    Filed: October 30, 2001
    Date of Patent: September 26, 2006
    Assignee: Intel Corporation
    Inventors: Alan B. Kyker, Per Hammarlund, Chan Lee, Robert F. Krick, Hitesh Ahuja, William Alexander, Joseph Rohlman
  • Patent number: 7085889
    Abstract: A context identifier is used in a cache memory apparatus. The context identifier may be written into the tag of a cache line or may be written as an addition to the tag of a cache line, during a cache write operation. During a cache read operation, the context identifier of an issued instruction may be compared with the context identifier in the cache line's tag. The cache line's data block may be transferred if the context identifiers and the tags match.
    Type: Grant
    Filed: March 22, 2002
    Date of Patent: August 1, 2006
    Assignee: Intel Corporation
    Inventors: Per Hammarlund, Aravindh Baktha, Michael D Upton, Venkat K. S. Venkatraman
  • Publication number: 20060161738
    Abstract: In one embodiment, the present invention includes a predictor to predict contention of an operation to be executed in a program. The operation may be processed based on a result of the prediction, which may be based on multiple independent predictions. In one embodiment, the operation may be optimized if no contention is predicted. Other embodiments are described and claimed.
    Type: Application
    Filed: December 29, 2004
    Publication date: July 20, 2006
    Inventors: Bratin Saha, Matthew Merten, Sebastien Hily, David Koufaty, Per Hammarlund
  • Publication number: 20060004998
    Abstract: A method and apparatus for executing lock instructions speculatively in an out-of-order processor are disclosed. In one embodiment, a prediction is made whether a given lock instruction will actually be contended. If not, then the lock instruction may be treated as having a normal load micro-operation which may be speculatively executed. Monitor logic may look for indications that the lock instruction is actually contended. If no such indications are found, the speculative load micro-operation and other micro-operations corresponding to the lock instruction may retire. However, if such indications are in fact found, the lock instruction may be restarted, and the prediction mechanism may be updated.
    Type: Application
    Filed: June 30, 2004
    Publication date: January 5, 2006
    Inventors: Bratin Saha, Matthew Merten, Per Hammarlund
  • Publication number: 20060005197
    Abstract: A method, apparatus, and system are provided for performing compare and exchange operations using a sleep-wakeup mechanism. According to one embodiment, an instruction at a processor is executed to help acquire a lock on behalf of the processor. If the lock is unavailable to be acquired by the processor, the instruction is put to sleep until an event has occurred.
    Type: Application
    Filed: June 30, 2004
    Publication date: January 5, 2006
    Inventors: Bratin Saha, Matthew Merten, Per Hammarlund
  • Patent number: 6952764
    Abstract: A method for stopping replay tornadoes in a processor. The method of one embodiment comprises scheduling an instruction for execution speculatively. A determination is made whether the instruction executed correctly. The instruction is routed to a replay mechanism if the instruction did not execute correctly. A determination is made whether a replay tornado exists. The instruction is routed for re-execution if the instruction executed incorrectly and no replay tornado exists. The replay tornado is broken if the replay tornado exists. Replay safe instructions in the pipeline are retired. Non-replay safe instructions in the pipeline are marked for re-execution. The non-replay safe instructions are rescheduled for re-execution.
    Type: Grant
    Filed: December 31, 2001
    Date of Patent: October 4, 2005
    Assignee: Intel Corporation
    Inventors: David J. Sager, Stephan Jourdan, Per Hammarlund
  • Publication number: 20050193278
    Abstract: Systems and methods of managing threads provide for supporting a plurality of logical threads with a plurality of simultaneous physical threads in which the number of logical threads may be greater than or less than the number of physical threads. In one approach, each of the plurality of logical threads is maintained in one of a wait state, an active state, a drain state, and a stall state. A state machine and hardware sequencer can be used to transition the logical threads between states based on triggering events and whether or not an interruptible point has been encountered in the logical threads. The logical threads are scheduled on the physical threads to meet, for example, priority, performance or fairness goals. It is also possible to specify the resources that are available to each logical thread in order to meet these and other goals. In one example, a single logical thread can speculatively use more than one physical thread, pending a selection of which physical thread should be committed.
    Type: Application
    Filed: December 29, 2003
    Publication date: September 1, 2005
    Inventors: Per Hammarlund, Stephan Jourdan, Pierre Michaud, Alexandre Farcy, Morris Marden, Robert Hinton, Douglas Carmean
  • Publication number: 20050166039
    Abstract: Method, apparatus, and program means for a programmable event driven yield mechanism that may activate other threads. In one embodiment, an apparatus includes execution resources to execute a plurality of instructions and a monitor to detect a condition indicating a low level of progress. The monitor can disrupt processing of a program by transferring to a handler in response to detecting the condition indicating a low level of progress. In another embodiment, thread switch logic may be coupled to a plurality of event monitors which monitor events within the multithreading execution logic. The thread switch logic switches threads based at least partially on a programmable condition of one or more of the performance monitors.
    Type: Application
    Filed: November 5, 2004
    Publication date: July 28, 2005
    Inventors: Hong Wang, Per Hammarlund, Xiang Zou, John Shen, Xinmin Tian, Milind Girkar, Perry Wang, Piyush Desai
  • Publication number: 20050147036
    Abstract: A method and apparatus for enabling an adaptive replay loop in a processor. More particularly, the present invention relates to allowing instructions in the replay loop to change their relative positions, thereby decreasing the latency for execution of instructions, resolving dynamic resource conflicts, and also increasing the overall efficiency of the processor.
    Type: Application
    Filed: December 30, 2003
    Publication date: July 7, 2005
    Inventors: Per Hammarlund, Stephan Jourdan
  • Publication number: 20050149702
    Abstract: Embodiments of the present invention provide a method, apparatus and system for memory renaming. In one embodiment, a decode unit may decode a load instruction. If the load instruction is predicted to be memory renamed, the load instruction may have a predicted store identifier associated with the load instruction. The decode unit may transform the load instruction that is predicted to be memory renamed into a data move instruction and a load check instruction. The data move instruction may read data from the cache based on the predicted store identifier, and the load check instruction may compare an identifier associated with an identified source store with the predicted store identifier. A retirement unit may retire the load instruction if the predicted store identifier matches an identifier associated with the identified source store.
    Type: Application
    Filed: December 29, 2003
    Publication date: July 7, 2005
    Inventors: Sebastien Hily, Per Hammarlund, Avinash Sodani
  • Publication number: 20050149697
    Abstract: Method, apparatus, and program means for a programmable event driven yield mechanism that may activate other threads. In one embodiment, an apparatus includes execution resources to execute a plurality of instructions and an event detector to detect a long latency event associated with a synchronization object. The event detector can cause a first thread switch in response to the long latency event associated with the synchronization object. The apparatus may also include a spin detector to detect that the synchronization object is a contended synchronization object. The spin detector can cause a second thread switch in response to the detection of the contended synchronization object to enable a spin detect response.
    Type: Application
    Filed: March 2, 2005
    Publication date: July 7, 2005
    Inventors: Natalie Enright, Jamison Collins, Perry Wang, Hong Wang, Xinmin Tran, John Shen, Gad Sheaffer, Per Hammarlund
  • Publication number: 20050149912
    Abstract: A system and method for optimizing a series of traces to be executed by a processing core is disclosed. The lines of a trace are sent to an optimizer each time they are sent to a processing core to be executed. Runtime information may be collected on a line of a trace each time that trace is executed by a processing core. The runtime information may be used by the optimizer to better optimize the micro-operations of the lines of the trace. The optimizer optimizes a trace each time the trace is executed to improve the efficiency of future iterations of the trace. Most of the optimizations result in a reduction of the number of µops within the trace. The optimizer may optimize two or more lines at a time in order to find more opportunities to remove µops and shorten the trace. The two lines may be alternately offset so that each line has the maximum allowed number of micro-operations.
    Type: Application
    Filed: December 29, 2003
    Publication date: July 7, 2005
    Inventors: Alexandre Farcy, Stephan Jourdan, Avinash Sodani, Per Hammarlund
  • Publication number: 20050149689
    Abstract: A method and apparatus for enabling an adaptive replay loop in a processor. More particularly, the present invention relates to allowing instructions in the replay loop to change their relative positions, thereby decreasing the latency for execution of instructions, resolving dynamic resource conflicts, and also increasing the overall efficiency of the processor.
    Type: Application
    Filed: December 30, 2003
    Publication date: July 7, 2005
    Inventors: Avinash Sodani, Per Hammarlund, Stephan Jourdan
  • Publication number: 20050141554
    Abstract: Embodiments of the present invention provide a dynamic resource allocator to allocate resources for performance optimization in, for example, a computer system. The dynamic resource allocator is to allocate a resource to one or more threads associated with an application based on a performance rate. Embodiments of the present invention may further include a performance monitor to monitor the performance rate of the one or more threads. The dynamic resource allocator is to allocate an additional resource to the one or more threads if the thread is performing above a performance threshold. In embodiments of the present invention, the dynamic resource allocation strategy may be decided based on, for example, optimizing the overall system throughput, minimizing power consumption, meeting system performance goals (e.g., real time requirements), user specified performance priorities and/or application specified performance priorities.
    Type: Application
    Filed: December 29, 2003
    Publication date: June 30, 2005
    Inventors: Per Hammarlund, Melih Ozgul
  • Publication number: 20050144398
    Abstract: Embodiments of the present invention relate to cache coherency. In an embodiment of the invention, a cache includes one or more cache lines. A store pipeline may retrieve a tag associated with one of the cache lines. The data associated with the cache line may not be retrieved and the cache line may be updated if, based on the tag, the cache line is determined to be in a modified or exclusive state.
    Type: Application
    Filed: December 30, 2003
    Publication date: June 30, 2005
    Inventors: Per Hammarlund, Hermann Gartler
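
The tag-only store described in the abstract of publication 20050144398 can be pictured with a small software model. The following is only an illustrative sketch, not the claimed hardware: the names `State`, `CacheLine`, and `store` are hypothetical, the four states assume a conventional MESI-style protocol, and marking the line modified after the write is an assumption that the abstract does not state.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    # Assumed MESI-style coherency states (not named in the abstract).
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


@dataclass
class CacheLine:
    tag: int          # address tag, with the coherency state kept alongside it
    state: State
    data: bytearray   # the line's data block


def store(line: CacheLine, addr_tag: int, offset: int, value: int) -> bool:
    """Try to complete a one-byte store using only the line's tag and state.

    Returns True if the line was updated in place, or False if the store
    would need another path (tag mismatch, or line not owned).
    """
    # Only the tag and state are consulted; the existing data is never read.
    if line.tag != addr_tag:
        return False
    if line.state not in (State.MODIFIED, State.EXCLUSIVE):
        return False
    line.data[offset] = value & 0xFF
    line.state = State.MODIFIED  # assumption: a written line becomes modified
    return True
```

In this model, a store to a line held in the exclusive or modified state completes without reading the data block, while a store to a shared or invalid line returns False and would have to be handled some other way.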