Patents by Inventor Amit A. Merchant

Amit A. Merchant has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 7219349
    Abstract: A processor is provided that includes an execution unit for executing instructions and a replay system for replaying instructions which have not executed properly. The replay system is coupled to the execution unit and includes a checker for determining whether each instruction has executed properly and a plurality of replay queues or replay queue sections coupled to the checker for temporarily storing one or more instructions for replay. In one embodiment, thread-specific replay queue sections may each be used to store a long latency instruction for each thread until the long latency instruction is ready to be executed (e.g., data for a load instruction has been retrieved from external memory). By storing the long latency instruction and its dependents in a replay queue section for one thread which has stalled, execution resources are made available for improving the speed of execution of other threads which have not stalled.
    Type: Grant
    Filed: March 2, 2004
    Date of Patent: May 15, 2007
    Assignee: Intel Corporation
    Inventors: Amit A. Merchant, Darrell D. Boggs, David J. Sager
  • Patent number: 7200737
    Abstract: A processor is provided that includes an execution unit for executing instructions and a replay system for replaying instructions which have not executed properly. The replay system is coupled to the execution unit and includes a checker for determining whether each instruction has executed properly and a replay queue coupled to the checker for temporarily storing one or more instructions for replay. The replay queue may be used to store a long latency instruction, such as a load in which data must be retrieved from an external memory device. The long latency instruction and possibly one or more dependent instructions are stored in the replay queue until the long latency instruction is ready to be executed (e.g., data for the load instruction has been retrieved from external memory). Once the long latency instruction is ready to be executed (e.g., the data is available), the long latency instruction may then be unloaded from the replay queue for re-execution.
    Type: Grant
    Filed: December 29, 1999
    Date of Patent: April 3, 2007
    Assignee: Intel Corporation
    Inventors: Amit A. Merchant, Darrell D. Boggs, David J. Sager
  • Patent number: 7089409
    Abstract: A processor includes a memory execution unit for executing load and store instructions and a replay system for replaying instructions which have not executed properly. The memory execution unit includes an invalid store flag that is set for a store instruction if the replay system detects that the store instruction has not executed properly and is cleared if the store instruction has executed properly. If an invalid store flag is set for a store instruction, the replay system replays load instructions which are programmatically younger than the invalid store instruction until the store instruction executes properly.
    Type: Grant
    Filed: October 23, 2003
    Date of Patent: August 8, 2006
    Assignee: Intel Corporation
    Inventors: Amit A. Merchant, Darrell D. Boggs, David J. Sager
  • Patent number: 6792446
    Abstract: A processor is provided that includes an execution unit for executing instructions and a replay system for replaying instructions which have not executed properly. The replay system is coupled to the execution unit and includes a checker for determining whether each instruction has executed properly and a plurality of replay queues or replay queue sections coupled to the checker for temporarily storing one or more instructions for replay. In one embodiment, thread-specific replay queue sections may each be used to store a long latency instruction for each thread until the long latency instruction is ready to be executed (e.g., data for a load instruction has been retrieved from external memory). By storing the long latency instruction and its dependents in a replay queue section for one thread which has stalled, execution resources are made available for improving the speed of execution of other threads which have not stalled.
    Type: Grant
    Filed: February 1, 2002
    Date of Patent: September 14, 2004
    Assignee: Intel Corporation
    Inventors: Amit A. Merchant, Darrell D. Boggs, David J. Sager
  • Publication number: 20040172523
    Abstract: A processor is provided that includes an execution unit for executing instructions and a replay system for replaying instructions which have not executed properly. The replay system is coupled to the execution unit and includes a checker for determining whether each instruction has executed properly and a plurality of replay queues or replay queue sections coupled to the checker for temporarily storing one or more instructions for replay. In one embodiment, thread-specific replay queue sections may each be used to store a long latency instruction for each thread until the long latency instruction is ready to be executed (e.g., data for a load instruction has been retrieved from external memory). By storing the long latency instruction and its dependents in a replay queue section for one thread which has stalled, execution resources are made available for improving the speed of execution of other threads which have not stalled.
    Type: Application
    Filed: March 2, 2004
    Publication date: September 2, 2004
    Inventors: Amit A. Merchant, Darrell D. Boggs, David J. Sager
  • Patent number: 6785803
    Abstract: A technique is provided for breaking a stalled condition or livelock in a processor having a replay queue. A livelock or stalled condition is detected. One or more instructions are temporarily stored in a replay queue. A release or break in the livelock or stalled condition is detected, and the instructions are then unloaded from the replay queue for replay or re-execution. For a multi-threaded processor, a stall is detected in one of the threads. Instructions of the stalled thread are temporarily stored in a replay queue, except the oldest instruction of the stalled thread which is allowed to replay or re-execute. This allows other threads to have access to execution and replay resources. Eventually, the oldest instruction will execute and retire, which breaks or releases the stalled thread. The instructions stored in the replay queue are then unloaded from the replay queue.
    Type: Grant
    Filed: September 22, 2000
    Date of Patent: August 31, 2004
    Assignee: Intel Corporation
    Inventors: Amit A. Merchant, David J. Sager, James D. Allen
  • Publication number: 20040083351
    Abstract: A processor includes a memory execution unit for executing load and store instructions and a replay system for replaying instructions which have not executed properly. The memory execution unit includes an invalid store flag that is set for a store instruction if the replay system detects that the store instruction has not executed properly and is cleared if the store instruction has executed properly. If an invalid store flag is set for a store instruction, the replay system replays load instructions which are programmatically younger than the invalid store instruction until the store instruction executes properly.
    Type: Application
    Filed: October 23, 2003
    Publication date: April 29, 2004
    Inventors: Amit A. Merchant, Darrell D. Boggs, David J. Sager
  • Patent number: 6665792
    Abstract: A processor includes a memory execution unit for executing load and store instructions and a replay system for replaying instructions which have not executed properly. The memory execution unit includes an invalid store flag that is set for a store instruction if the replay system detects that the store instruction has not executed properly and is cleared if the store instruction has executed properly. If an invalid store flag is set for a store instruction, the replay system replays load instructions which are programmatically younger than the invalid store instruction until the store instruction executes properly.
    Type: Grant
    Filed: December 30, 1999
    Date of Patent: December 16, 2003
    Assignee: Intel Corporation
    Inventors: Amit A. Merchant, Darrell D. Boggs, David J. Sager
  • Publication number: 20020091914
    Abstract: A processor is provided that includes an execution unit for executing instructions and a replay system for replaying instructions which have not executed properly. The replay system is coupled to the execution unit and includes a checker for determining whether each instruction has executed properly and a plurality of replay queues or replay queue sections coupled to the checker for temporarily storing one or more instructions for replay. In one embodiment, thread-specific replay queue sections may each be used to store a long latency instruction for each thread until the long latency instruction is ready to be executed (e.g., data for a load instruction has been retrieved from external memory). By storing the long latency instruction and its dependents in a replay queue section for one thread which has stalled, execution resources are made available for improving the speed of execution of other threads which have not stalled.
    Type: Application
    Filed: February 1, 2002
    Publication date: July 11, 2002
    Inventors: Amit A. Merchant, Darrell D. Boggs, David J. Sager
  • Patent number: 6385715
    Abstract: A processor is provided that includes an execution unit for executing instructions and a replay system for replaying instructions which have not executed properly. The replay system is coupled to the execution unit and includes a checker for determining whether each instruction has executed properly and a plurality of replay queues or replay queue sections coupled to the checker for temporarily storing one or more instructions for replay. In one embodiment, thread-specific replay queue sections may each be used to store a long latency instruction for each thread until the long latency instruction is ready to be executed (e.g., data for a load instruction has been retrieved from external memory). By storing the long latency instruction and its dependents in a replay queue section for one thread which has stalled, execution resources are made available for improving the speed of execution of other threads which have not stalled.
    Type: Grant
    Filed: May 4, 2001
    Date of Patent: May 7, 2002
    Assignee: Intel Corporation
    Inventors: Amit A. Merchant, Darrell D. Boggs, David J. Sager
  • Patent number: 6334182
    Abstract: A method and apparatus for scheduling operations using a dependency matrix. A child operation, such as a micro-operation, is received for scheduling. The child operation is dependent on the completion of a parent operation, such as when one of the child operation's sources is the parent operation's destination. An entry corresponding to the child operation is placed in a scheduling queue and the child operation is compared with other entries in the scheduling queue. The result of this comparison is stored in a dependency matrix. Each row in the dependency matrix corresponds to an entry in the scheduling queue, and each column corresponds to a dependency on an entry in the scheduling queue. Entries in the scheduling queue can then be scheduled based on the information in the dependency matrix, such as when the entire row associated with an entry is clear.
    Type: Grant
    Filed: August 18, 1998
    Date of Patent: December 25, 2001
    Assignee: Intel Corporation
    Inventors: Amit A. Merchant, David J. Sager
  • Patent number: 6212626
    Abstract: A computer processor that has a checker for receiving an instruction. The checker includes a scoreboard, an input for receiving an external replay signal, and decision logic coupled to the scoreboard and the input. The decision logic determines whether the instruction executed correctly based on both the scoreboard and the external replay signal.
    Type: Grant
    Filed: December 30, 1998
    Date of Patent: April 3, 2001
    Assignee: Intel Corporation
    Inventors: Amit A. Merchant, David J. Sager
  • Patent number: 6163838
    Abstract: A computer processor includes a multiplexer having a first input, a second input, and an output, and a scheduler coupled to the multiplexer first input. The processor further includes an execution unit coupled to the multiplexer output. The execution unit is adapted to receive a plurality of instructions from the multiplexer. The processor further includes a replay system coupled to the second multiplexer input and the scheduler. The replay system replays an instruction that has not correctly executed by sending a stop scheduler signal to the scheduler and sending the instruction to the multiplexer.
    Type: Grant
    Filed: June 30, 1998
    Date of Patent: December 19, 2000
    Assignee: Intel Corporation
    Inventors: Amit A. Merchant, David J. Sager, Darrell D. Boggs
  • Patent number: 6094717
    Abstract: A computer processor includes a multiplexer having a first input, a second input, a third input, and an output. The processor further includes a scheduler coupled to the multiplexer first input, an execution unit coupled to the multiplexer output, and a replay system that has an input coupled to the multiplexer output. The replay system includes a first checker coupled to the replay system input and the second multiplexer input, and a second checker coupled to the first checker and the third multiplexer input.
    Type: Grant
    Filed: July 31, 1998
    Date of Patent: July 25, 2000
    Assignee: Intel Corporation
    Inventors: Amit A. Merchant, David J. Sager, Darrell D. Boggs, Michael D. Upton
  • Patent number: 5909699
    Abstract: Requests to memory issued by an agent on a bus are satisfied while maintaining cache consistency. The requesting agent may issue a request to another agent, or the memory unit, by placing the request on the bus. Each agent on the bus snoops the bus to determine whether the issued request can be satisfied by accessing its cache. An agent which can satisfy the request using its cache, i.e., the snooping agent, issues a signal to the requesting agent indicating so. The snooping agent places the cache line which corresponds to the request onto the bus, which is retrieved by the requesting agent. In the event of a read request, the memory unit also retrieves the cache line data from the bus and stores the cache line in main memory. In the event of a write request, the requesting agent transfers write data over the bus along with the request. This write data is retrieved by both the memory unit, which temporarily stores the data, and the snooping agent.
    Type: Grant
    Filed: June 28, 1996
    Date of Patent: June 1, 1999
    Assignee: Intel Corporation
    Inventors: Nitin V. Sarangdhar, Michael W. Rhodehamel, Amit A. Merchant, Matthew A. Fisch, James M. Brayton
  • Patent number: 5893151
    Abstract: An apparatus for maintaining cache coherency for snoop operations includes a processor core for fetching, decoding, and executing instructions, a data cache coupled to the processor core for providing data to the processor core and for receiving data from the processor core, and a system bus coupling the processor core to the data cache. The apparatus further includes a snoop scheduler coupled to the processor core, the data cache, and the system bus, where the snoop scheduler is coupled to receive addresses from the system bus. The snoop scheduler also determines if snoop operations are orthogonal and schedules one or more out-of-order and at least partially overlapping snoop operations. Determining which snoop operations are orthogonal includes utilizing a block bit, a sleep bit, and a plurality of previously pending snoop request bits in a snoop queue entry to determine if the entry is orthogonal or not.
    Type: Grant
    Filed: April 4, 1997
    Date of Patent: April 6, 1999
    Assignee: Intel Corporation
    Inventor: Amit A. Merchant
  • Patent number: 5890200
    Abstract: An apparatus for maintaining cache coherency for snoop operations includes a snoop scheduler coupled to receive addresses from a system bus. The snoop scheduler utilizes a content addressable memory array. The snoop scheduler determines if snoop operations are orthogonal and schedules one or more out-of-order and at least partially overlapping snoop operations. Determining which snoop operations are orthogonal includes utilizing a block bit, a sleep bit, and a plurality of previously pending snoop request bits in a snoop queue entry to determine if the entry is orthogonal or not.
    Type: Grant
    Filed: April 4, 1997
    Date of Patent: March 30, 1999
    Assignee: Intel Corporation
    Inventor: Amit A. Merchant
  • Patent number: 5875467
    Abstract: A method of maintaining cache coherency for snoop operations includes initiating a first snoop operation in response to a snoop request while one or more previous snoop operations are pending in a queue. Furthermore, one or more subsequent snoop operations can be queued, wherein a step of determining if the one or more snoop requests are orthogonal, i.e. the processing of one request is not dependent on the outcome of a previous request, is included. The step of determining if the one or more snoop requests are orthogonal includes utilizing a block bit, a sleep bit, and a plurality of previously pending snoop request bits in a snoop queue entry to determine if the entry is orthogonal or not.
    Type: Grant
    Filed: April 4, 1997
    Date of Patent: February 23, 1999
    Assignee: Intel Corporation
    Inventor: Amit A. Merchant
  • Patent number: 5797026
    Abstract: A self-snooping mechanism for enabling a processor being coupled to dedicated cache memory and a processor-system bus to snoop its own request issued on the processor-system bus. The processor-system bus enables communication between the processor and other bus agents such as a memory subsystem, I/O subsystem and/or other processors. The self-snooping mechanism is commenced upon determination that the request is based on a boundary condition so that initial internal cache lookup is bypassed to improve system efficiency.
    Type: Grant
    Filed: September 2, 1997
    Date of Patent: August 18, 1998
    Assignee: Intel Corporation
    Inventors: Michael W. Rhodehamel, Nitin V. Sarangdhar, Amit A. Merchant, Matthew A. Fisch, James M. Brayton
  • Patent number: 5778438
    Abstract: A method of maintaining cache coherency. The method comprises the steps of allocating an entry in a snoop queue to a snoopable request and blocking the snoopable request to delay performing a snoop operation in response to the snoopable request until a blocking condition is satisfied. The method also comprises the steps of performing the snoop operations in response to the snoopable request after the blocking condition is satisfied and deallocating the entry from the snoop queue.
    Type: Grant
    Filed: December 6, 1995
    Date of Patent: July 7, 1998
    Assignee: Intel Corporation
    Inventor: Amit A. Merchant
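
The dependency-matrix scheduling described in the abstract of patent 6334182 above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware design: the class name DependencyMatrixScheduler, its method names, and the register names in the usage lines are all hypothetical, completion is modeled as a single step that also frees the queue entry, and only source-on-destination dependencies are tracked.

```python
class DependencyMatrixScheduler:
    """Toy scheduling queue with a dependency matrix (hypothetical model)."""

    def __init__(self, size):
        self.size = size
        # Each queue entry is (name, sources, dest), or None when free.
        self.entries = [None] * size
        # matrix[r][c] is True when the entry in row r depends on the entry in column c.
        self.matrix = [[False] * size for _ in range(size)]

    def enqueue(self, name, sources, dest):
        """Place a child operation in the queue and record its dependencies."""
        row = self.entries.index(None)  # raises ValueError if the queue is full
        self.matrix[row] = [False] * self.size
        # Compare the new operation's sources with the destinations of queued entries.
        for col, other in enumerate(self.entries):
            if other is not None and other[2] in sources:
                self.matrix[row][col] = True
        self.entries[row] = (name, tuple(sources), dest)
        return row

    def ready(self):
        """Return the rows whose entire dependency row is clear."""
        return [r for r, entry in enumerate(self.entries)
                if entry is not None and not any(self.matrix[r])]

    def complete(self, row):
        """Schedule and complete a ready entry, clearing its column for dependents."""
        if row not in self.ready():
            raise ValueError("entry is empty or still has pending dependencies")
        name = self.entries[row][0]
        self.entries[row] = None
        for r in range(self.size):
            self.matrix[r][row] = False
        return name


# Usage: a three-operation dependency chain (register names are made up).
sched = DependencyMatrixScheduler(4)
load = sched.enqueue("load", ["r1"], "r2")
add = sched.enqueue("add", ["r2", "r3"], "r4")   # waits on load
mul = sched.enqueue("mul", ["r4"], "r5")         # waits on add
order = []
while sched.ready():
    order.append(sched.complete(sched.ready()[0]))
print(order)
```

In this model a child entry becomes schedulable exactly when its row holds no set bits, which mirrors the "entire row is clear" condition in the abstract; clearing a column when a parent completes is the assumption that stands in for whatever completion signal real hardware would use.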