Patents by Inventor Hung Qui Le

Hung Qui Le has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 9665372
    Abstract: A processor core having multiple parallel instruction execution slices and coupled to multiple dispatch queues by a dispatch routing network provides flexible and efficient use of internal resources. The dispatch routing network is controlled to dynamically vary the relationship between the slices and instruction streams according to execution requirements for the instruction streams and the availability of resources in the instruction execution slices. The instruction execution slices may be dynamically reconfigured as between single-instruction-multiple-data (SIMD) instruction execution and ordinary instruction execution on a per-instruction basis, permitting the mixture of those instruction types. Instructions having an operand width greater than the width of a single instruction execution slice may be processed by multiple instruction execution slices configured to act in concert for the particular instructions.
    Type: Grant
    Filed: May 12, 2014
    Date of Patent: May 30, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Lee Evan Eisen, Hung Qui Le, Jentje Leenstra, Jose Eduardo Moreira, Bruce Joseph Ronchetti, Brian William Thompto, Albert James Van Norstrand, Jr.
  • Publication number: 20160202990
    Abstract: An execution slice circuit for a processor core has multiple parallel instruction execution slices and provides flexible and efficient use of internal resources. The execution slice circuit includes a master execution slice for receiving instructions of a first instruction stream and a slave execution slice for receiving instructions of a second instruction stream and instructions of the first instruction stream that require an execution width greater than a width of the slices. The execution slice circuit also includes a control logic that detects when a first instruction of the first instruction stream has the greater width and controls the slave execution slice to reserve a first issue cycle for issuing the first instruction in parallel across the master execution slice and the slave execution slice.
    Type: Application
    Filed: January 13, 2015
    Publication date: July 14, 2016
    Inventors: Jeffrey Carl Brownscheidle, Sundeep Chadha, Maureen Anne Delaney, Hung Qui Le, Dung Quoc Nguyen, Brian William Thompto
  • Publication number: 20160202989
    Abstract: A processor core having multiple parallel instruction execution slices and coupled to multiple dispatch queues by a dispatch routing network provides flexible and efficient use of internal resources. The configuration of the execution slices is selectable so that capabilities of the processor core can be adjusted according to execution requirements for the instruction streams. Two or more execution slices can be combined as super-slices to handle wider data, wider operands and/or vector operations, according to one or more mode control signal that also serves as a configuration control signal. The mode control signal is also used to partition clusters of the execution slices within the processor core according to whether single-threaded or multi-threaded operation is selected, and additionally according to a number of hardware threads that are active.
    Type: Application
    Filed: January 12, 2015
    Publication date: July 14, 2016
    Inventors: Lee Evan Eisen, Hung Qui Le, Jentje Leenstra, Jose Eduardo Moreira, Bruce Joseph Ronchetti, Brian William Thompto, Albert James Van Norstrand, JR.
  • Publication number: 20160202988
    Abstract: A method of operation of a processor core execution unit circuit provides efficient use of area and energy by reducing the per-entry storage requirement of a load-store unit issue queue. The execution unit circuit includes a recirculation queue that stores the effective address of the load and store operations and the values to be stored by the store operations. A queue control logic controls the recirculation queue and issue queue so that that after the effective address of a load or store operation has been computed, the effective address of the load operation or the store operation is written to the recirculation queue and the operation is removed from the issue queue, so that address operands and other values that were in the issue queue entry no longer require storage. When a load or store operation is rejected by the cache unit, it is subsequently reissued from the recirculation queue.
    Type: Application
    Filed: May 28, 2015
    Publication date: July 14, 2016
    Inventors: Salma Ayub, Sundeep Chadha, Robert Allen Cordes, David Allen Hrusecky, Hung Qui Le, Dung Quoc Nguyen, Brian William Thompto
  • Publication number: 20160202991
    Abstract: A method of operating a processor core having multiple parallel instruction execution slices and coupled to multiple dispatch queues by a dispatch routing network provides flexible and efficient use of internal resources. The configuration of the execution slices is selectable so that capabilities of the processor core can be adjusted according to execution requirements for the instruction streams. Two or more execution slices can be combined as super-slices to handle wider data, wider operands and/or vector operations, according to one or more mode control signal that also serves as a configuration control signal. The mode control signal is also used to partition clusters of the execution slices within the processor core according to whether single-threaded or multi-threaded operation is selected, and additionally according to a number of hardware threads that are active.
    Type: Application
    Filed: May 28, 2015
    Publication date: July 14, 2016
    Inventors: Lee Evan Eisen, Hung Qui Le, Jentje Leenstra, Jose Eduardo Moreira, Bruce Joseph Ronchetti, Brian William Thompto, Albert James Van Norstrand, JR.
  • Publication number: 20160202992
    Abstract: A method of processing using an execution slice circuit including multiple parallel instruction execution slices provides flexible and efficient use of internal resources. The execution slice circuit includes a master execution slice for receiving instructions of a first instruction stream and a slave execution slice for receiving instructions of a second instruction stream and instructions of the first instruction stream that require an execution width greater than a width of the slices. The method also detects when a first instruction of the first instruction stream has the greater width and controls the slave execution slice to reserve a first issue cycle for issuing the first instruction in parallel across the master execution slice and the slave execution slice.
    Type: Application
    Filed: May 28, 2015
    Publication date: July 14, 2016
    Inventors: Jeffrey Carl Brownscheidle, Sundeep Chadha, Maureen Anne Delaney, Hung Qui Le, Dung Quoc Nguyen, Brian William Thompto
  • Publication number: 20160202986
    Abstract: An execution unit circuit for use in a processor core provides efficient use of area and energy by reducing the per-entry storage requirement of a load-store unit issue queue. The execution unit circuit includes a recirculation queue that stores the effective address of the load and store operations and the values to be stored by the store operations. A queue control logic controls the recirculation queue and issue queue so that that after the effective address of a load or store operation has been computed, the effective address of the load operation or the store operation is written to the recirculation queue and the operation is removed from the issue queue, so that address operands and other values that were in the issue queue entry no longer require storage. When a load or store operation is rejected by the cache unit, it is subsequently reissued from the recirculation queue.
    Type: Application
    Filed: January 13, 2015
    Publication date: July 14, 2016
    Inventors: Salma Ayub, Sundeep Chadha, Robert Allen Cordes, David Allen Hrusecky, Hung Qui Le, Dung Quoc Nguyen, Brian William Thompto
  • Publication number: 20150324206
    Abstract: A method of operation of a processor core having multiple parallel instruction execution slices and coupled to multiple dispatch queues coupled by a dispatch routing network provides flexible and efficient use of internal resources. The dispatch routing network is controlled to dynamically vary the relationship between the slices and instruction streams according to execution requirements for the instruction streams and the availability of resources in the instruction execution slices. The instruction execution slices may be dynamically reconfigured as between single-instruction-multiple-data (SIMD) instruction execution and ordinary instruction execution on a per-instruction basis. Instructions having an operand width greater than the width of a single instruction execution slice may be processed by multiple instruction execution slices configured to act in concert for the particular instructions.
    Type: Application
    Filed: June 10, 2014
    Publication date: November 12, 2015
    Inventors: Lee Evan Eisen, Hung Qui Le, Jentje Leenstra, Jose Eduardo Moreira, Bruce Joseph Ronchetti, Brian William Thompto, Albert James Van Norstrand, JR.
  • Publication number: 20150324205
    Abstract: Techniques for managing instruction execution for multiple instruction streams using a processor core having multiple parallel instruction execution slices provide flexibility in execution of program instructions by a processor core. An event is detected indicating that either resource requirement or resource availability will not be met by the execution slice currently executing the instruction stream. In response to detecting the event, dispatch of at least a portion of the subsequent instruction is made to another instruction execution slice. The event may be a compiler-inserted directive, may be an event detected by logic in the processor core, or may be determined by a thread sequencer. The instruction execution slices may be dynamically reconfigured as between single-instruction-multiple-data (SIMD) instruction execution, ordinary instruction execution, wide instruction execution.
    Type: Application
    Filed: May 12, 2014
    Publication date: November 12, 2015
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Lee Evan Eisen, Hung Qui Le, Jentje Leenstra, Jose Eduardo Moreira, Bruce Joseph Ronchetti, Brian William Thompto, Albert James Van Norstrand, JR.
  • Publication number: 20150324207
    Abstract: A method of managing instruction execution for multiple instruction streams using a processor core having multiple parallel instruction execution slices provides instruction processing flexibility. An event is detected indicating that either resource requirement or resource availability for a subsequent instruction of an instruction stream will not be met by the instruction execution slice currently executing the instruction stream. In response to detecting the event, dispatch of at least a portion of the subsequent instruction is made to another instruction execution slice. The event may be a compiler-inserted directive, may be an event detected by logic in the processor core, or may be determined by a thread sequencer. The execution slices may be dynamically reconfigured as between single-instruction-multiple-data (SIMD) instruction execution, ordinary instruction execution, wide instruction execution.
    Type: Application
    Filed: June 12, 2014
    Publication date: November 12, 2015
    Inventors: Lee Evan Eisen, Hung Qui Le, Jentje Leenstra, Jose Eduardo Moreira, Bruce Joseph Ronchetti, Brian William Thompto, Albert James Van Norstrand, JR.
  • Publication number: 20150324204
    Abstract: A processor core having multiple parallel instruction execution slices and coupled to multiple dispatch queues by a dispatch routing network provides flexible and efficient use of internal resources. The dispatch routing network is controlled to dynamically vary the relationship between the slices and instruction streams according to execution requirements for the instruction streams and the availability of resources in the instruction execution slices. The instruction execution slices may be dynamically reconfigured as between single-instruction-multiple-data (SIMD) instruction execution and ordinary instruction execution on a per-instruction basis, permitting the mixture of those instruction types. Instructions having an operand width greater than the width of a single instruction execution slice may be processed by multiple instruction execution slices configured to act in concert for the particular instructions.
    Type: Application
    Filed: May 12, 2014
    Publication date: November 12, 2015
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Lee Evan Eisen, Hung Qui Le, Jentje Leenstra, Jose Eduardo Moreira, Bruce Joseph Ronchetti, Brian William Thompto, Albert James Van Norstrand, JR.
  • Patent number: 8521998
    Abstract: A method and apparatus for tracking instructions in a processor. A completion unit in the processor receives an instruction group to add to a table to form a received instruction group. In response to receiving the received instruction group, the completion unit determines whether an entry is present that contains a previously stored instruction group in a first location and has space for storing the received instruction group. In response to the entry being present, the completion unit stores the received instruction group in a second location in the entry to form a stored instruction group.
    Type: Grant
    Filed: June 4, 2010
    Date of Patent: August 27, 2013
    Assignee: International Business Machines Corporation
    Inventors: Christopher Michael Abernathy, Hung Qui Le, Dung Quoc Nguyen, Benjamin Walter Stolt
  • Patent number: 8418180
    Abstract: A method, apparatus, and computer program product are disclosed for ensuring processing fairness in simultaneous multi-threading (SMT) microprocessors. A clock cycle priority is assigned to a first thread and to a second thread during a standard selection state that lasts for an expected number of clock cycles by selecting the first thread to be a primary thread and the second thread to be a secondary thread. If a condition exists that requires overriding, an override state is executed by selecting the second thread to be the primary thread and the first thread to be the secondary thread. The override state is forced to be executed for an override period of time which equals the expected number of clock cycles plus a forced number of clock cycles. The forced number of clock cycles is granted to the first thread in response to the first thread again becoming the primary thread.
    Type: Grant
    Filed: May 30, 2008
    Date of Patent: April 9, 2013
    Assignee: International Business Machines Corporation
    Inventors: James Wilson Bishop, Hung Qui Le, Dung Quoc Nguyen, Balaram Sinharoy, Brian William Thompto, Raymond Cheung Yeung
  • Patent number: 8347068
    Abstract: A multi-mode register rename mechanism which allows a simultaneous multi-threaded processor to support full out-of-order thread execution when the number of threads is low and in-order thread execution when the number of threads increases. Responsive to changing an execution mode of a processor to operate in in-order thread execution mode, the illustrative embodiments switch a physical register in the data processing system to an architected facility, thereby forming a switched physical register. When an instruction is issued to an execution unit, wherein the issued instruction comprises a thread bit, the thread bit is examined to determine if the instruction accesses an architected facility. If the issued instruction accesses an architected facility, the instruction is executed, and the results of the executed instruction are written to the switched physical register.
    Type: Grant
    Filed: April 4, 2007
    Date of Patent: January 1, 2013
    Assignee: International Business Machines Corporation
    Inventors: Richard James Eickemeyer, Hung Qui Le, Dung Quoc Nguyen, Balaram Sinharoy
  • Patent number: 8117403
    Abstract: A computing system uses specialized “Set Associative Transaction Tables” and additional “Summary Transaction Tables” to speed the processing of common transactional memory conflict cases and those which employ assist threads using an Address History Table and processes memory transactions with a Transaction Table in memory for parallel processing of multiple threads of execution by support of which an application need not be aware. Special instructions may mark the boundaries of a transaction and identify memory locations applicable to a transaction. A ‘private to transaction’ (PTRAN) tag, directly addressable as part of the main data storage memory location, enables a quick detection of potential conflicts with other transactions that are concurrently executing on another thread of said computing system. The tag indicates whether (or not) a data entry in memory is part of a speculative memory state of an uncommitted transaction that is currently active in the system.
    Type: Grant
    Filed: October 30, 2007
    Date of Patent: February 14, 2012
    Assignee: International Business Machines Corporation
    Inventors: Thomas J. Heller, Jr., Hung Qui Le
  • Patent number: 8108655
    Abstract: Issue logic identifies a simple fixed point instruction, included in a unified payload, which is ready to issue. The simple fixed point instruction is a type of instruction that is executable by both a fixed point execution unit and a load-store execution unit. In turn, the issue logic determines that the unified payload does not include a load-store instruction that is ready to issue. As a result, the issue logic issues the simple fixed point instruction to the load-store execution unit in response to determining that the simple fixed point instruction is ready to issue and determining that the unified payload does not include a load-store instruction that is ready to issue.
    Type: Grant
    Filed: March 24, 2009
    Date of Patent: January 31, 2012
    Assignee: International Business Machines Corporation
    Inventors: Christopher Michael Abernathy, James Wilson Bishop, Mary Douglass Brown, William Elton Burky, Robert Allen Cordes, Hung Qui Le, Dung Quoc Nguyen, Todd Alan Venton
  • Patent number: 7904661
    Abstract: A method of prefetching data in a microprocessor includes identifying a data stream associated with a process and determining a depth associated with the data stream based upon prefetch factors including the number of currently concurrent data streams and data consumption rates associated with the concurrent data streams. Data prefetch requests are allocated with the data stream to reflect the determined depth of the data stream. Allocating data prefetch requests may include allocating prefetch requests for a number of cache lines away from the cache line currently being referenced, wherein the number of cache lines is equal to the determined depth. The method may include, responsive to determining the depth associated with a data stream, configuring prefetch hardware to reflect the determined depth for the identified data stream. Prefetch control bits in an instruction executed by the processor control the prefetch hardware configuration.
    Type: Grant
    Filed: December 10, 2007
    Date of Patent: March 8, 2011
    Assignee: International Business Machines Corporation
    Inventors: Eric Jason Fluhr, Bradly George Frey, John Barry Griswell, Jr., Hung Qui Le, Cathy May, Francis Patrick O'Connell, Edward John Silha, Albert Thomas Williams
  • Patent number: 7877580
    Abstract: A method of handling program instructions in a microprocessor which reduces delays associated with mispredicted branch instructions, by detecting the occurrence of a stall condition during execution of the program instructions, speculatively executing one or more pending instructions which include at least one branch instruction during the stall condition, and determining the validity of data utilized by the speculative execution. Dispatch logic determines the validity of the data by marking one or more registers of an instruction dispatch unit to indicate which results of the pending instructions are invalid. The speculative execution of instructions can occur across multiple pipeline stages of the microprocessor, and the validity of the data is tracked during their execution in the multiple pipeline stages while monitoring a dependency of the speculatively executed instructions relative to one another during their execution in the multiple pipeline stages.
    Type: Grant
    Filed: December 10, 2007
    Date of Patent: January 25, 2011
    Assignee: International Business Machines Corporation
    Inventors: Richard James Eickemeyer, Hung Qui Le, Dung Quoc Nguyen, Benjamin Walter Stolt, Brian William Thompto
  • Patent number: 7827443
    Abstract: Recovery circuits react to errors in a processor core by waiting for an error-free completion of any pending store-conditional instruction or a cache-inhibited load before ceasing to checkpoint or backup progress of a processor core. Recovery circuits remove the processor core from the logical configuration of the symmetric multiprocessor system, potentially reducing propagation of errors to other parts of the system. The processor core is reset and the checkpointed values may be restored to registers of the processor core. The core processor is allowed not just to resume execution just prior to the instructions that failed to execute correctly the first time, but is allowed to operate in a reduced execution mode for a preprogrammed number of groups. If the preprogrammed number of instruction groups execute without error, the processor core is allowed to resume normal execution.
    Type: Grant
    Filed: November 13, 2008
    Date of Patent: November 2, 2010
    Assignee: International Business Machines Corporation
    Inventors: Susan Elizabeth Eisen, Hung Qui Le, Michael James Mack, Dung Quoc Nguyen, Jose Angel Paredes, Scott Barnett Swaney
  • Publication number: 20100250901
    Abstract: Issue logic identifies a simple fixed point instruction, included in a unified payload, which is ready to issue. The simple fixed point instruction is a type of instruction that is executable by both a fixed point execution unit and a load-store execution unit. In turn, the issue logic determines that the unified payload does not include a load-store instruction that is ready to issue. As a result, the issue logic issues the simple fixed point instruction to the load-store execution unit in response to determining that the simple fixed point instruction is ready to issue and determining that the unified payload does not include a load-store instruction that is ready to issue.
    Type: Application
    Filed: March 24, 2009
    Publication date: September 30, 2010
    Applicant: International Business Machines Corporation
    Inventors: Christopher Michael Abernathy, James Wilson Bishop, Mary Douglass Brown, William Elton Burky, Robert Allen Cordes, Hung Qui Le, Dung Quoc Nguyen, Todd Alan Venton