Patents by Inventor Dung Quoc Nguyen

Dung Quoc Nguyen has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 7631308
    Abstract: A method is disclosed in a data processing system for ensuring processing fairness in simultaneous multi-threading (SMT) microprocessors that concurrently execute multiple threads during each clock cycle. A clock cycle priority is assigned to a first thread and to a second thread during a standard selection state that lasts for an expected number of clock cycles. The clock cycle priority is assigned according to a standard selection definition during the standard selection state by selecting the first thread to be a primary thread and the second thread to be a secondary thread during the standard selection state. If a condition exists that requires overriding the standard selection definition, an override state is executed during which the standard selection definition is overridden by selecting the second thread to be the primary thread and the first thread to be the secondary thread.
    Type: Grant
    Filed: February 11, 2005
    Date of Patent: December 8, 2009
    Assignee: International Business Machines Corporation
    Inventors: James Wilson Bishop, Hung Qui Le, Dung Quoc Nguyen, Balaram Sinharoy, Brian William Thompto, Raymond Cheung Yeung
  • Publication number: 20090292892
    Abstract: A method for reducing the power consumption of a register file of a microprocessor supporting simultaneous multithreading (SMT) is disclosed. Mapping logic and associated table entries monitor a total number of processing threads currently executing in the processor and signal control logic to disable specific register file entries not required for currently executing or pending instruction threads or register file entries not meeting a minimum access threshold using a least recently used algorithm (LRU). The register file utilization is controlled such that a register file address range selected for deactivation is not assigned for pending or future instruction threads. One or more power saving techniques are then applied to disabled register files to reduce overall power dissipation in the system.
    Type: Application
    Filed: May 15, 2008
    Publication date: November 26, 2009
    Inventors: Christopher M. Abernathy, Jens Leenstra, Nicolas Maeding, Dung Quoc Nguyen
  • Patent number: 7620799
    Abstract: Mechanisms to identify and speculatively execute future instructions during a stall condition are provided. In speculative mode, instruction operands may be invalid due to a number of reasons. Dependency and dirty bits are tracked and used to determine which speculative instructions are valid for execution. A modified value register storage and bit vector are used to improve the availability of speculative results that would otherwise be discarded once they leave the execution pipeline because they cannot be written to the architected registers. The modified general purpose registers are used to store speculative results when the corresponding instruction reaches writeback and the modified bit vector tracks the results that have been stored there. Younger speculative instructions that do not bypass directly from older instructions use this modified data when the corresponding bit in the modified bit vector indicates the data has been modified. Otherwise, data from the architected registers is used.
    Type: Grant
    Filed: April 2, 2008
    Date of Patent: November 17, 2009
    Assignee: International Business Machines Corporation
    Inventors: Richard James Eickemeyer, Hung Qui Le, Dung Quoc Nguyen, Benjamin Walter Stolt, Brian William Thompto
  • Patent number: 7620801
    Abstract: A method for pseudo-randomly, without bias, selecting instructions for marking in a microprocessor. Responsive to reading an instruction from an instruction cache, an instruction tag associated with the instruction is compared against a pseudo-randomly generated value in a linear feedback shift register (LFSR). If the instruction tag matches the value in the LFSR, a mark bit, indicating the instruction is a marked instruction, is sent with the instruction to an execution unit. Responsive to an indication from the performance monitor, the value in the LFSR is incremented prior to selecting a next instruction to mark. If the value equals a predetermined prime number of increments, the value is reset to all ones to avoid any harmonics with the code stream being executed. Upon receiving the marked instruction, the execution unit combines the marked bit with a selected event and reports the marked event to the performance monitor.
    Type: Grant
    Filed: February 11, 2005
    Date of Patent: November 17, 2009
    Assignee: International Business Machines Corporation
    Inventors: James Wilson Bishop, Michael Stephen Floyd, Alexander Erik Mericas, Robert Dominick Mirabella, Dung Quoc Nguyen, Philip Lee Vitale
  • Patent number: 7600099
    Abstract: A system and method for predictive early allocation of stores in a microprocessor is presented. During instruction dispatch, an instruction dispatch unit retrieves an instruction from an instruction cache (Icache). When the retrieved instruction is an interruptible instruction, the instruction dispatch unit loads the interruptible instruction's instruction tag (IITAG) into an interruptible instruction tag register. A load store unit loads subsequent instruction information (instruction tag and store data) along with the interruptible instruction tag in a store data queue entry. Comparison logic receives a completing instruction tag from completion logic, and compares the completing instruction tag with the interruptible instruction tags included in the store data queue entries. In turn, deallocation logic deallocates those store data queue entries that include an interruptible instruction tag that matches the completing instruction tag.
    Type: Grant
    Filed: March 8, 2007
    Date of Patent: October 6, 2009
    Assignee: International Business Machines Corporation
    Inventors: Hung Qui Le, Dung Quoc Nguyen
  • Patent number: 7594096
    Abstract: The present invention allows a microprocessor to identify and speculatively execute future load instructions during a stall condition. This allows forward progress to be made through the instruction stream during the stall condition which would otherwise cause the microprocessor or thread of execution to be idle. The data for such future load instructions can be prefetched from a distant cache or main memory such that when the load instruction is re-executed (non speculative executed) after the stall condition expires, its data will reside either in the L1 cache, or will be enroute to the processor, resulting in a reduced execution latency. When an extended stall condition is detected, load lookahead prefetch is started allowing speculative execution of instructions that would normally have been stalled.
    Type: Grant
    Filed: December 5, 2007
    Date of Patent: September 22, 2009
    Assignee: International Business Machines Corporation
    Inventors: Richard James Eickemeyer, Hung Qui Le, Dung Quoc Nguyen, Benjamin Walter Stolt, Brian William Thompto
  • Patent number: 7552318
    Abstract: A method of handling program instructions in a microprocessor which reduces delays associated with mispredicted branch instructions, by detecting the occurrence of a stall condition during execution of the program instructions, speculatively executing one or more pending instructions which include at least one branch instruction during the stall condition, and determining the validity of data utilized by the speculative execution. Dispatch logic determines the validity of the data by marking one or more registers of an instruction dispatch unit to indicate which results of the pending instructions are invalid. The speculative execution of instructions can occur across multiple pipeline stages of the microprocessor, and the validity of the data is tracked during their execution in the multiple pipeline stages while monitoring a dependency of the speculatively executed instructions relative to one another during their execution in the multiple pipeline stages.
    Type: Grant
    Filed: December 17, 2004
    Date of Patent: June 23, 2009
    Assignee: International Business Machines Corporation
    Inventors: Richard James Eickemeyer, Hung Qui Le, Dung Quoc Nguyen, Benjamin Walter Stolt, Brian William Thompto
  • Publication number: 20090063898
    Abstract: Recovery circuits react to errors in a processor core by waiting for an error-free completion of any pending store-conditional instruction or a cache-inhibited load before ceasing to checkpoint or backup progress of a processor core. Recovery circuits remove the processor core from the logical configuration of the symmetric multiprocessor system, potentially reducing propagation of errors to other parts of the system. The processor core is reset and the checkpointed values may be restored to registers of the processor core. The core processor is allowed not just to resume execution just prior to the instructions that failed to execute correctly the first time, but is allowed to operate in a reduced execution mode for a preprogrammed number of groups. If the preprogrammed number of instruction groups execute without error, the processor core is allowed to resume normal execution.
    Type: Application
    Filed: November 13, 2008
    Publication date: March 5, 2009
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Susan Elizabeth Eisen, Hung Qui Le, Michael James Mack, Dung Quoc Nguyen, Jose Angel Paredes, Scott Barnett Swaney
  • Patent number: 7490226
    Abstract: A method and related apparatus is provided for a processor having a number of registers, wherein instructions are sequentially issued to move through a sequence of execution stages, from an initial stage to a final write back stage. As a method, an embodiment includes the step of issuing a first instruction, such as an FMA instruction, to move through the sequence of execution stages, the first instruction being directed to a specified one of the registers. The method further includes issuing a second instruction to move through the execution stages, the second instruction being issued after the first instruction has issued, but before the first instruction reaches the final write back stage. The second instruction is likewise directed to the specified register, and comprises either a store instruction or a load instruction, selectively.
    Type: Grant
    Filed: February 9, 2005
    Date of Patent: February 10, 2009
    Assignee: International Business Machines Corporation
    Inventors: Hung Qui Le, Dung Quoc Nguyen, Raymond Cheung Yeung
  • Patent number: 7478276
    Abstract: A method and apparatus are provided for dispatch group checkpointing in a microprocessor, including provisions for handling partially completed dispatch groups and instructions which modify system coherent state prior to completion. An instruction checkpoint retry mechanism is implemented to recover from soft errors in logic. The processor is able to dispatch fixed point unit (FXU), load/store unit (LSU), and floating point unit (FPU) or vector multimedia extension (VMX) instructions on the same cycle. Store data is written to a store queue when a store instruction finishes executing. The data is held in the store queue until the store instruction is checkpointed, at which point it can be released to the coherently shared level 2 (L2) cache.
    Type: Grant
    Filed: February 10, 2005
    Date of Patent: January 13, 2009
    Assignee: International Business Machines Corporation
    Inventors: James Wilson Bishop, Hung Qui Le, Michael James Mack, Jafar Nahidi, Dung Quoc Nguyen, Jose Angel Paredes, Scott Barnett Swaney, Brian William Thompto
  • Patent number: 7467325
    Abstract: Recovery circuits react to errors in a processor core by waiting for an error-free completion of any pending store-conditional instruction or a cache-inhibited load before ceasing to checkpoint or backup progress of a processor core. Recovery circuits remove the processor core from the logical configuration of the symmetric multiprocessor system, potentially reducing propagation of errors to other parts of the system. The processor core is reset and the checkpointed values may be restored to registers of the processor core. The core processor is allowed not just to resume execution just prior to the instructions that failed to execute correctly the first time, but is allowed to operate in a reduced execution mode for a preprogrammed number of groups. If the preprogrammed number of instruction groups execute without error, the processor core is allowed to resume normal execution.
    Type: Grant
    Filed: February 10, 2005
    Date of Patent: December 16, 2008
    Assignee: International Business Machines Corporation
    Inventors: Susan Elizabeth Eisen, Hung Qui Le, Michael James Mack, Dung Quoc Nguyen, Jose Angel Paredes, Scott Barnett Swaney
  • Publication number: 20080294884
    Abstract: A method, apparatus, and computer program product are disclosed in a data processing system for ensuring processing fairness in simultaneous multi-threading (SMT) microprocessors that concurrently execute multiple threads during each clock cycle. A clock cycle priority is assigned to a first thread and to a second thread during a standard selection state that lasts for an expected number of clock cycles. The clock cycle priority is assigned according to a standard selection definition during the standard selection state by selecting the first thread to be a primary thread and the second thread to be a secondary thread during the standard selection state. If a condition exists that requires overriding the standard selection definition, an override state is executed during which the standard selection definition is overridden by selecting the second thread to be the primary thread and the first thread to be the secondary thread.
    Type: Application
    Filed: May 30, 2008
    Publication date: November 27, 2008
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: James Wilson Bishop, Hung Qui Le, Dung Quoc Nguyen, Balaram Sinharoy, Brian William Thompto, Raymond Cheung Yeung
  • Patent number: 7444498
    Abstract: The present invention allows a microprocessor to identify and speculatively execute future load instructions during a stall condition. This allows forward progress to be made through the instruction stream during the stall condition which would otherwise cause the microprocessor or thread of execution to be idle. The data for such future load instructions can be prefetched from a distant cache or main memory such that when the load instruction is re-executed (non speculative executed) after the stall condition expires, its data will reside either in the L1 cache, or will be enroute to the processor, resulting in a reduced execution latency. When an extended stall condition is detected, load lookahead prefetch is started allowing speculative execution of instructions that would normally have been stalled.
    Type: Grant
    Filed: December 17, 2004
    Date of Patent: October 28, 2008
    Assignee: International Business Machines Corporation
    Inventors: Richard James Eickemeyer, Hung Qui Le, Dung Quoc Nguyen, Benjamin Walter Stolt, Brian William Thompto
  • Publication number: 20080250230
    Abstract: The present invention allows a microprocessor to identify and speculatively execute future instructions during a stall condition. This allows forward progress to be made through the instruction stream during the stall condition which would otherwise cause the microprocessor or thread of execution to be idle. The execution of such future instructions can initiate a prefetch of data or instructions from a distant cache or main memory, or otherwise make forward progress through the instruction stream. In this manner, when the instructions are re-executed (non speculatively executed) after the stall condition expires, they will execute with a reduced execution latency; e.g. by accessing data prefetched into the L1 cache, or enroute to the processor, or by executing the target instructions following a speculatively resolved mispredicted branch.
    Type: Application
    Filed: April 2, 2008
    Publication date: October 9, 2008
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Richard James Eickemeyer, Hung Qui Le, Dung Quoc Nguyen, Benjamin Walter Stolt, Brian William Thompto
  • Publication number: 20080250226
    Abstract: A multi-mode register rename mechanism which allows a simultaneous multi-threaded processor to support full out-of-order thread execution when the number of threads is low and in-order thread execution when the number of threads increases. Responsive to changing an execution mode of a processor to operate in in-order thread execution mode, the illustrative embodiments switch a physical register in the data processing system to an architected facility, thereby forming a switched physical register. When an instruction is issued to an execution unit, wherein the issued instruction comprises a thread bit, the thread bit is examined to determine if the instruction accesses an architected facility. If the issued instruction accesses an architected facility, the instruction is executed, and the results of the executed instruction are written to the switched physical register.
    Type: Application
    Filed: April 4, 2007
    Publication date: October 9, 2008
    Inventors: Richard James Eickemeyer, Hung Qui Le, Dung Quoc Nguyen, Balaram Sinharoy
  • Publication number: 20080239860
    Abstract: An apparatus and method are provided for reading a plurality of consecutive entries and writing a plurality of consecutive entries with only one read address and one write address using a 2Read/2Write register file. In one exemplary embodiment, a 64 entry register file array is partitioned into four sub-arrays. Each sub-array contains sixteen entries having one or more 2Read/2Write SRAM cells. The apparatus and method provide a mechanism to write the consecutive entries by only having a 4 to 16 decode of one address. In addition, the apparatus and method provide a mechanism for reading data from the register file array using a starting read word address and two read word lines generated based on the starting read word address. The two read word lines are used to access the two read ports of the entries in the sub-arrays.
    Type: Application
    Filed: June 6, 2008
    Publication date: October 2, 2008
    Applicant: International Business Machines Corporation
    Inventors: Sam Gat-Shang Chu, Maureen Anne Delaney, Saiful Islam, Dung Quoc Nguyen, Jafar Nahidi
  • Publication number: 20080229058
    Abstract: A configurable microprocessor that handles low computing-intensive workloads by partitioning a single processor core into two smaller corelets. The process partitions resources of a single microprocessor core to form a plurality of corelets and assigns a set of the partitioned resources to each corelet. Each set of partitioned resources is dedicated to one corelet to allow each corelet to function independently of other corelets in the plurality of corelets. The process also combines a plurality of corelets into a single microprocessor core by combining corelet resources to form a single microprocessor core. The combined resources feed the single microprocessor core.
    Type: Application
    Filed: March 13, 2007
    Publication date: September 18, 2008
    Inventors: Hung Qui Le, Dung Quoc Nguyen, Balaram Sinharoy
  • Publication number: 20080229065
    Abstract: A configurable microprocessor which combines a plurality of corelets into a single microprocessor core to handle high computing-intensive workloads. The process first selects two or more corelets in the plurality of corelets. The process combines resources of the two or more corelets to form combined resources, wherein each combined resource comprises a larger amount of a resource available to each individual corelet. The process then forms a single microprocessor core from the two or more corelets by assigning the combined resources to the single microprocessor core, wherein the combined resources are dedicated to the single microprocessor core, and wherein the single microprocessor core processes instructions with the dedicated combined resources.
    Type: Application
    Filed: March 13, 2007
    Publication date: September 18, 2008
    Inventors: Hung Qui Le, Dung Quoc Nguyen, Balaram Sinharoy
  • Publication number: 20080222395
    Abstract: A system and method for predictive early allocation of stores in a microprocessor is presented. During instruction dispatch, an instruction dispatch unit retrieves an instruction from an instruction cache (Icache). When the retrieved instruction is an interruptible instruction, the instruction dispatch unit loads the interruptible instruction's instruction tag (IITAG) into an interruptible instruction tag register. A load store unit loads subsequent instruction information (instruction tag and store data) along with the interruptible instruction tag in a store data queue entry. Comparison logic receives a completing instruction tag from completion logic, and compares the completing instruction tag with the interruptible instruction tags included in the store data queue entries. In turn, deallocation logic deallocates those store data queue entries that include an interruptible instruction tag that matches the completing instruction tag.
    Type: Application
    Filed: March 8, 2007
    Publication date: September 11, 2008
    Inventors: Hung Qui Le, Dung Quoc Nguyen
  • Patent number: 7421567
    Abstract: The present invention allows a microprocessor to identify and speculatively execute future instructions during a stall condition. This allows forward progress to be made through the instruction stream during the stall condition which would otherwise cause the microprocessor or thread of execution to be idle. The execution of such future instructions can initiate a prefetch of data or instructions from a distant cache or main memory, or otherwise make forward progress through the instruction stream. In this manner, when the instructions are re-executed (non speculatively executed) after the stall condition expires, they will execute with a reduced execution latency; e.g. by accessing data prefetched into the L1 cache, or enroute to the processor, or by executing the target instructions following a speculatively resolved mispredicted branch.
    Type: Grant
    Filed: December 17, 2004
    Date of Patent: September 2, 2008
    Assignee: International Business Machines Corporation
    Inventors: Richard James Eickemeyer, Hung Qui Le, Dung Quoc Nguyen, Benjamin Walter Stolt, Brian William Thompto