Patents by Inventor David J. Sager

David J. Sager has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20040083351
    Abstract: A processor includes a memory execution unit for executing load and store instructions and a replay system for replaying instructions which have not executed properly. The memory execution unit includes an invalid store flag that is set for a store instruction if the replay system detects that the store instruction has not executed properly and is cleared if the store instruction has executed properly. If an invalid store flag is set for a store instruction, the replay system replays load instructions which are programmatically younger than the invalid store instruction until the store instruction executes properly.
    Type: Application
    Filed: October 23, 2003
    Publication date: April 29, 2004
    Inventors: Amit A. Merchant, Darrell D. Boggs, David J. Sager
  • Patent number: 6704861
    Abstract: A mechanism for executing computer instructions in parallel includes a compiler for generating and grouping instructions into a plurality of sets of instructions to be executed in parallel, each set having a unique identification. A computer system having a real state and a speculative state executes the sets in parallel, the computer system executing a particular set of instructions in the speculative state if the instructions of the particular set have dependencies which cannot be resolved until the instructions are actually executed. The computer system generates speculative data while executing instructions in the speculative state. Logic circuits are provided to detect any exception conditions which occur while executing the particular set in the speculative state. If the particular set is subject to an exception condition, the instructions of the set are re-executed to resolve the exception condition, and to incorporate the speculative data in the real state of the computer system.
    Type: Grant
    Filed: November 19, 1996
    Date of Patent: March 9, 2004
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Francis X. McKeen, Michael C. Adler, Joel S. Emer, Robert P. Nix, David J. Sager, P. Geoffrey Lowney
  • Publication number: 20040019746
    Abstract: A processor includes a cache memory with a data storage unit operating at a first clock frequency, and a tag unit and hit/miss logic operating at a second clock frequency different than the first clock frequency. The data storage unit may advantageously be clocked faster than the tag unit and hit/miss logic, such as two times (2×) faster. The processor may also include a replay mechanism for recovering from data speculation when the hit/miss logic or the tag unit signals that speculated data from the higher clocked data storage unit is, in fact, invalid.
    Type: Application
    Filed: July 28, 2003
    Publication date: January 29, 2004
    Inventor: David J. Sager
  • Publication number: 20030236966
    Abstract: Fusing a load micro-operation (uop) together with an arithmetic uop. Intra-instruction fusing can increase cache memory storage efficiency and computer instruction processing bandwidth within a microprocessor without incurring significant computer system cost. Uops are fused, stored in a cache memory, un-fused, executed in parallel, and retired in order to optimize cost and performance.
    Type: Application
    Filed: June 25, 2002
    Publication date: December 25, 2003
    Inventors: Nicholas G. Samra, Stephan J. Jourdan, David J. Sager, Glenn J. Hinton
  • Patent number: 6665792
    Abstract: A processor includes a memory execution unit for executing load and store instructions and a replay system for replaying instructions which have not executed properly. The memory execution unit includes an invalid store flag that is set for a store instruction if the replay system detects that the store instruction has not executed properly and is cleared if the store instruction has executed properly. If an invalid store flag is set for a store instruction, the replay system replays load instructions which are programmatically younger than the invalid store instruction until the store instruction executes properly.
    Type: Grant
    Filed: December 30, 1999
    Date of Patent: December 16, 2003
    Assignee: Intel Corporation
    Inventors: Amit A. Merchant, Darrell D. Boggs, David J. Sager
  • Patent number: 6633970
    Abstract: A mechanism is provided for allowing a processor to recover from a failure of a predicted path of instructions (e.g., from a mispredicted branch or other event). The mechanism includes a plurality of physical registers, each of which can store either architectural data or speculative data. The apparatus also includes a primary array to store a mapping from logical registers to physical registers, the primary array storing a speculative state of the processor. The apparatus also includes a buffer coupled to the primary array to store information identifying which physical registers store architectural data and which physical registers store speculative data. According to another embodiment, a history buffer is coupled to the secondary array and stores historical physical register to logical register mappings performed for each of a plurality of instructions that are part of a predicted path.
    Type: Grant
    Filed: December 28, 1999
    Date of Patent: October 14, 2003
    Assignee: Intel Corporation
    Inventors: David W. Clift, Darrell D. Boggs, David J. Sager
  • Patent number: 6631454
    Abstract: A processor includes a cache memory with a data storage unit operating at a first clock frequency, and a tag unit and hit/miss logic operating at a second clock frequency different than the first clock frequency. The data storage unit may advantageously be clocked faster than the tag unit and hit/miss logic, such as two times (2×) faster. The processor may also include a replay mechanism for recovering from data speculation when the hit/miss logic or the tag unit signals that speculated data from the higher clocked data storage unit is, in fact, invalid.
    Type: Grant
    Filed: November 13, 1996
    Date of Patent: October 7, 2003
    Assignee: Intel Corporation
    Inventor: David J. Sager
  • Publication number: 20030188125
    Abstract: A dual-cycle address generation unit is described to generate linear addresses. The dual-cycle address generation unit includes a first adder to add a product of an index and a scaling factor to an offset and a segment base during a first clock cycle and a second adder to add output of the first adder with a base during a second clock cycle.
    Type: Application
    Filed: March 28, 2002
    Publication date: October 2, 2003
    Inventors: Ross A. Segelken, Feng Chen, David J. Sager
  • Publication number: 20030158885
    Abstract: The present invention provides a method and apparatus for controlling a processing priority assigned alternately to a first thread and a second thread in a multithreaded processor to prevent deadlock and livelock problems between the first thread and the second thread. In one embodiment, the processing priority is initially assigned to the first thread for a first duration. It is then determined whether the first duration has expired in a given processing cycle. If the first duration has expired, the processing priority is assigned to the second thread for a second duration.
    Type: Application
    Filed: February 13, 2003
    Publication date: August 21, 2003
    Inventor: David J. Sager
  • Publication number: 20030154235
    Abstract: The present invention provides a method and apparatus for controlling a processing priority assigned alternately to a first thread and a second thread in a multithreaded processor to prevent deadlock and livelock problems between the first thread and the second thread. In one embodiment, the processing priority is initially assigned to the first thread for a first duration. It is then determined whether the first duration has expired in a given processing cycle. If the first duration has expired, the processing priority is assigned to the second thread for a second duration.
    Type: Application
    Filed: February 10, 2003
    Publication date: August 14, 2003
    Inventor: David J. Sager
  • Publication number: 20030126417
    Abstract: A method and apparatus to execute data speculative instructions in a processor comprising at least one source register, each source register comprising a bit to indicate validity of data in the at least one source register. A data validity circuit is coupled to the one or more source registers to determine the validity of the data in the source registers, and to indicate the validity of the data in a destination register based upon the validity bit in the at least one source register. The processor optionally comprises a checker unit to retire those instructions from the execution unit which write valid data to the destination register, and to reschedule for execution those instructions which write invalid data to the destination register.
    Type: Application
    Filed: January 2, 2002
    Publication date: July 3, 2003
    Inventors: Eric Sprangle, Michael J. Haertel, David J. Sager
  • Publication number: 20030126405
    Abstract: A method for stopping replay tornadoes in a processor. The method of one embodiment comprises scheduling an instruction for execution speculatively. A determination is made whether the instruction executed correctly. The instruction is routed to a replay mechanism if the instruction did not execute correctly. A determination is made whether a replay tornado exists. The instruction is routed for re-execution if the instruction executed incorrectly and no replay tornado exists. The replay tornado is broken if the replay tornado exists. Replay safe instructions in the pipeline are retired. Non-replay safe instructions in the pipeline are marked for re-execution. The non-replay safe instructions are rescheduled for re-execution.
    Type: Application
    Filed: December 31, 2001
    Publication date: July 3, 2003
    Inventors: David J. Sager, Stephan Jourdan, Per Hammarlund
  • Patent number: 6542921
    Abstract: The present invention provides a method and apparatus for controlling a processing priority assigned alternately to a first thread and a second thread in a multithreaded processor to prevent deadlock and livelock problems between the first thread and the second thread. In one embodiment, the processing priority is initially assigned to the first thread for a first duration. It is then determined whether the first duration has expired in a given processing cycle. If the first duration has expired, the processing priority is assigned to the second thread for a second duration.
    Type: Grant
    Filed: July 8, 1999
    Date of Patent: April 1, 2003
    Assignee: Intel Corporation
    Inventor: David J. Sager
  • Publication number: 20020199088
    Abstract: In a multi-threaded processor, thread priority variables are set up in memory. The actual assignment of thread priority is based on the expiration of a thread precedence counter. To further augment the effectiveness of the thread precedence counters, starting counters are associated with each thread that serve as a multiplier for the value to be used in the thread precedence counter. The values in the starting counters are manipulated so as to prevent one thread from getting undue priority to the resources of the multi-threaded processor.
    Type: Application
    Filed: June 22, 2001
    Publication date: December 26, 2002
    Inventors: David W. Burns, James D. Allen, Michael D. Upton, Darrell D. Boggs, David J. Sager
  • Patent number: 6487675
    Abstract: A processor including a first execution core section clocked to perform execution operations at a first clock frequency, and a second execution core section clocked to perform execution operations at a second clock frequency which is different than the first clock frequency. The second execution core section runs faster and includes a data cache and critical ALU functions, while the first execution core section includes latency-tolerant functions such as instruction fetch and decode units and non-critical ALU functions. The processor may further include an I/O ring which may be still slower than the first execution core section. Optionally, the first execution core section may include a third execution core section whose clock rate is between that of the first and second execution core sections. Clock multipliers/dividers may be used between the various sections to derive their clocks from a single source, such as the I/O clock.
    Type: Grant
    Filed: February 2, 2001
    Date of Patent: November 26, 2002
    Assignee: Intel Corporation
    Inventors: David J. Sager, Thomas D. Fletcher, Glenn J. Hinton, Michael D. Upton
  • Patent number: 6425055
    Abstract: An apparatus and method for accessing a cache memory. In a cache memory, an address is received that includes a set field and a partial tag field, the set field and the partial tag field together including fewer bits than necessary to uniquely identify a region of memory equal in size to a cache line of the cache memory. The set field is decoded to select one of a plurality of storage units within the cache memory, each of the plurality of storage units including a plurality of cache lines of the cache memory. The partial tag field is compared to a plurality of previously stored partial tags that correspond to the plurality of cache lines within the selected one of the plurality of storage units to determine if the partial tag field matches one of the plurality of previously stored partial tags. If the one of the previously stored partial tags matches the partial tag field, one of the plurality of cache lines that corresponds to the one of the plurality of previously stored partial tags is output.
    Type: Grant
    Filed: February 24, 1999
    Date of Patent: July 23, 2002
    Assignee: Intel Corporation
    Inventors: David J. Sager, Glenn J. Hinton
  • Publication number: 20020091914
    Abstract: A processor is provided that includes an execution unit for executing instructions and a replay system for replaying instructions which have not executed properly. The replay system is coupled to the execution unit and includes a checker for determining whether each instruction has executed properly and a plurality of replay queues or replay queue sections coupled to the checker for temporarily storing one or more instructions for replay. In one embodiment, thread-specific replay queue sections may each be used to store a long latency instruction for each thread until the long latency instruction is ready to be executed (e.g., data for a load instruction has been retrieved from external memory). By storing the long latency instruction and its dependents in a replay queue section for one thread which has stalled, execution resources are made available for improving the speed of execution of other threads which have not stalled.
    Type: Application
    Filed: February 1, 2002
    Publication date: July 11, 2002
    Inventors: Amit A. Merchant, Darrell D. Boggs, David J. Sager
  • Publication number: 20020083307
    Abstract: Embodiments of the present invention relate to systems and methods for partial merges for sub-register data operations. An instruction is examined before execution to determine which portion of a source register identified in the instruction should remain unchanged into a destination register. A portion of the source register determined to remain unchanged is moved into the destination register before instruction execution is complete.
    Type: Application
    Filed: December 26, 2000
    Publication date: June 27, 2002
    Inventors: David J. Sager, Alan B. Kyker, Andy F. Glew
  • Patent number: 6385715
    Abstract: A processor is provided that includes an execution unit for executing instructions and a replay system for replaying instructions which have not executed properly. The replay system is coupled to the execution unit and includes a checker for determining whether each instruction has executed properly and a plurality of replay queues or replay queue sections coupled to the checker for temporarily storing one or more instructions for replay. In one embodiment, thread-specific replay queue sections may each be used to store a long latency instruction for each thread until the long latency instruction is ready to be executed (e.g., data for a load instruction has been retrieved from external memory). By storing the long latency instruction and its dependents in a replay queue section for one thread which has stalled, execution resources are made available for improving the speed of execution of other threads which have not stalled.
    Type: Grant
    Filed: May 4, 2001
    Date of Patent: May 7, 2002
    Assignee: Intel Corporation
    Inventors: Amit A. Merchant, Darrell D. Boggs, David J. Sager
  • Patent number: 6341327
    Abstract: A content addressable memory compares the value of redundant form input data to non-redundant form values stored in registers of the memory. The memory decodes the redundant form input data in a data decoder. Thereafter, the CAM performs match detection on the decoded data. The present invention performs decoding and match detection more quickly than traditional adders and even more quickly than complete redundant form adders.
    Type: Grant
    Filed: August 13, 1998
    Date of Patent: January 22, 2002
    Assignee: Intel Corporation
    Inventor: David J. Sager
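
Illustrative note on publication 20030188125 (dual-cycle address generation unit): the two-step addition described in that abstract can be sketched in software as follows. This is only a rough model of the arithmetic, not the patented circuit. The function names, the 32-bit address width, and the wraparound behavior are assumptions made for the example and do not come from the filing.

```python
# Rough software model of the two-adder split described in the abstract of
# publication 20030188125. All names and the 32-bit width are hypothetical.

MASK32 = 0xFFFFFFFF  # assumed linear address width (not stated in the abstract)


def first_cycle(index, scale, offset, segment_base):
    """First adder: product of index and scaling factor, plus offset and segment base."""
    return (index * scale + offset + segment_base) & MASK32


def second_cycle(partial, base):
    """Second adder: output of the first adder plus the base."""
    return (partial + base) & MASK32


def linear_address(base, index, scale, offset, segment_base):
    """Chain the two steps, one per modeled clock cycle."""
    partial = first_cycle(index, scale, offset, segment_base)
    return second_cycle(partial, base)
```

For example, with a base of 0x1000, an index of 4, a scaling factor of 8, an offset of 0x10, and a segment base of 0x20000, the first step yields 0x20030 and the second step yields 0x21030.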