Patents by Inventor John G. Rell

John G. Rell has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 7921279
    Abstract: Result and operand forwarding is provided between differently sized operands in a superscalar processor by grouping a first set of instructions for operand forwarding, and grouping a second set of instructions for result forwarding, the first set of instructions comprising a first source instruction having a first operand and a first dependent instruction having a second operand, the first dependent instruction depending from the first source instruction; the second set of instructions comprising a second source instruction having a third operand and a second dependent instruction having a fourth operand, the second dependent instruction depending from the second source instruction, performing operand forwarding by forwarding the first operand, either whole or in part, as it is being read to the first dependent instruction prior to execution; performing result forwarding by forwarding a result of the second source instruction, either whole or in part, to the second dependent instruction, after execution; wher
    Type: Grant
    Filed: March 19, 2008
    Date of Patent: April 5, 2011
    Assignee: International Business Machines Corporation
    Inventors: David S. Hutton, Fadi Y. Busaba, Bruce C. Giamei, Christopher A. Krygowski, Edward T. Malley, Jeffrey S. Plate, John G. Rell, Jr., Chung-Lung Kevin Shum, Timothy J. Slegel
  • Patent number: 7913068
    Abstract: A system and method for asynchronous dynamic millicode entry prediction in a processor are provided. The system includes a branch target buffer (BTB) to hold branch information. The branch information includes: a branch type indicating that the branch represents a millicode entry (mcentry) instruction targeting a millicode subroutine, and an instruction length code (ILC) associated with the mcentry instruction. The system also includes search logic to perform a method. The method includes locating a branch address in the BTB for the mcentry instruction targeting the millicode subroutine, and determining a return address to return from the millicode subroutine as a function of an instruction address of the mcentry instruction and the ILC. The system further includes instruction fetch controls to fetch instructions of the millicode subroutine asynchronous to the search logic. The search logic may also operate asynchronous with respect to an instruction decode unit.
    Type: Grant
    Filed: February 21, 2008
    Date of Patent: March 22, 2011
    Assignee: International Business Machines Corporation
    Inventors: James J. Bonanno, Brian R. Prasky, John G. Rell, Jr., Anthony Saporito, Chung-Lung Kevin Shum
  • Patent number: 7913067
    Abstract: A system and method for overlapping execution (OE) of instructions through non-uniform execution pipelines in an in-order processor are provided. The system includes a first execution unit to perform instruction execution in a first execution pipeline. The system also includes a second execution unit to perform instruction execution in a second execution pipeline, where the second execution pipeline includes a greater number of stages than the first execution pipeline. The system further includes an instruction dispatch unit (IDU), the IDU including OE registers and logic for dispatching an OE-capable instruction to the first execution unit such that the instruction completes execution prior to completing execution of a previously dispatched instruction to the second execution unit. The system additionally includes a latch to hold a result of the execution of the OE-capable instruction until after the second execution unit completes the execution of the previously dispatched instruction.
    Type: Grant
    Filed: February 20, 2008
    Date of Patent: March 22, 2011
    Assignee: International Business Machines Corporation
    Inventors: David S. Hutton, Khary J. Alexander, Fadi Y. Busaba, Bruce C. Giamei, John G. Rell, Jr., Eric M. Schwarz, Chung-Lung Kevin Shum
  • Patent number: 7853635
    Abstract: A system for binary multiplication in a superscalar processor includes a first pipeline, an execution unit, and a first multiplexer; a first rotator in communication with one register of the first pipeline and the execution unit; and a leading zero detection register in communication with the execution unit and another register of the first pipeline; a second pipeline, a second execution unit, and a second multiplexer; a rotator in communication with one register of the second pipeline and the second execution unit; and a leading zero detection register in communication with the second execution unit and another register of the first pipeline; and a third pipeline, a binary multiplier in communication with a pair of registers of the third pipeline; a general register; an operand buffer for obtaining first and second operands; and a bus for communication between the pipelines, the general register and the operand buffer.
    Type: Grant
    Filed: May 16, 2007
    Date of Patent: December 14, 2010
    Assignee: International Business Machines Corporation
    Inventors: Fadi Y. Busaba, Steven R. Carlough, David S. Hutton, Christopher A. Krygowski, John G. Rell, Jr., Sheryll H. Veneracion
  • Publication number: 20090240922
    Abstract: Result and operand forwarding is provided between differently sized operands in a superscalar processor by grouping a first set of instructions for operand forwarding, and grouping a second set of instructions for result forwarding, the first set of instructions comprising a first source instruction having a first operand and a first dependent instruction having a second operand, the first dependent instruction depending from the first source instruction; the second set of instructions comprising a second source instruction having a third operand and a second dependent instruction having a fourth operand, the second dependent instruction depending from the second source instruction, performing operand forwarding by forwarding the first operand, either whole or in part, as it is being read to the first dependent instruction prior to execution; performing result forwarding by forwarding a result of the second source instruction, either whole or in part, to the second dependent instruction, after execution; wher
    Type: Application
    Filed: March 19, 2008
    Publication date: September 24, 2009
    Applicant: International Business Machines Corporation
    Inventors: David S. Hutton, Fadi Y. Busaba, Bruce C. Giamei, Christopher A. Krygowski, Edward T. Malley, Jeffrey S. Plate, John G. Rell, Jr., Chung-Lung Kevin Shum, Timothy J. Slegel
  • Publication number: 20090240914
    Abstract: A pipelined microprocessor configured for long operand instructions is disclosed. The microprocessor includes a memory unit and a load-store unit. The load-store unit is coupled to the memory unit and includes a data formatter receiving information from the memory unit and including an operand selector and a shift register portion. The microprocessor also includes an execution unit coupled to the load-store unit and receiving operand information therefrom. The execution unit includes output latches coupled to a storage location within the execution unit for storing output information from the execution unit.
    Type: Application
    Filed: March 19, 2008
    Publication date: September 24, 2009
    Applicant: International Business Machines Corporation
    Inventors: Edward T. Malley, Khary J. Alexander, Fadi Y. Busaba, Vimal M. Kapadia, Jeffrey S. Plate, John G. Rell, Jr., Chung-Lung Kevin Shum
  • Publication number: 20090241084
    Abstract: Systems, methods and computer program products for exploiting orthogonal control vectors in timing driven systems. An exemplary embodiment includes running an initial logic synthesis run on the system, identifying critical inputs to a logic cone related to the run, identifying orthogonal vectors in the logic cone, adding vectors to the logic cone, obtaining logical solutions and selecting a solution from the logical solutions.
    Type: Application
    Filed: March 19, 2008
    Publication date: September 24, 2009
    Applicant: International Business Machines Corporation
    Inventors: Edward T. Malley, Fadi Y. Busaba, David S. Hutton, Christopher A. Krygowski, Jeffrey S. Plate, John G. Rell
  • Publication number: 20090240918
    Abstract: Eliminating or reducing an operand line crossing penalty by performing an initial fetch for an operand from a data cache of a processor. The initial fetch is performed by allowing or permitting the initial fetch to occur unaligned with reference to a quadword boundary. A plurality of subsequent fetches for a corresponding plurality of operands from the data cache are performed wherein each of the plurality of subsequent fetches is aligned to any of a plurality of quadword boundaries to prevent each of a plurality of individual fetch requests from spanning a plurality of lines in the data cache. A steady stream of data is maintained by placing an operand buffer at an output of the data cache to store and merge data from the initial fetch and the plurality of subsequent fetches, and to return the stored and merged data to the processor.
    Type: Application
    Filed: March 19, 2008
    Publication date: September 24, 2009
    Applicant: International Business Machines Corporation
    Inventors: Vimal M. Kapadia, Fadi Y. Busaba, Edward T. Malley, John G. Rell, Jr., Chung-Lung Kevin Shum
  • Publication number: 20090240929
    Abstract: A method, system, and computer program product for reduced overhead address mode change management in a pipelined, recycling microprocessor are provided. The recycling microprocessor includes logic executing thereon. The microprocessor also includes an instruction fetch unit (IFU) supporting computation of address adds in selected address modes and reporting non-equal comparison of the computation to the logic. The microprocessor further includes a fixed point unit determining whether the mode has changed and reporting changes to the logic. Upon determining the comparison yields an equal result but the mode has changed, a recycle event is triggered to ensure subsequent ofetches are relaunched in the correct mode and that no execution writebacks occur from work performed in an incorrect mode.
    Type: Application
    Filed: March 19, 2008
    Publication date: September 24, 2009
    Applicant: International Business Machines Corporation
    Inventors: David S. Hutton, Michael Billeci, Fadi Y. Busaba, Brian R. Prasky, John G. Rell, Jr., Chung-Lung Kevin Shum, Charles F. Webb
  • Publication number: 20090217002
    Abstract: A system and method for asynchronous dynamic millicode entry prediction in a processor are provided. The system includes a branch target buffer (BTB) to hold branch information. The branch information includes: a branch type indicating that the branch represents a millicode entry (mcentry) instruction targeting a millicode subroutine, and an instruction length code (ILC) associated with the mcentry instruction. The system also includes search logic to perform a method. The method includes locating a branch address in the BTB for the mcentry instruction targeting the millicode subroutine, and determining a return address to return from the millicode subroutine as a function of an instruction address of the mcentry instruction and the ILC. The system further includes instruction fetch controls to fetch instructions of the millicode subroutine asynchronous to the search logic. The search logic may also operate asynchronous with respect to an instruction decode unit.
    Type: Application
    Filed: February 21, 2008
    Publication date: August 27, 2009
    Applicant: International Business Machines Corporation
    Inventors: James J. Bonanno, Brian R. Prasky, John G. Rell, Jr., Anthony Saporito, Chung-Lung Kevin Shum
  • Publication number: 20090210656
    Abstract: A system and method for overlapping execution (OE) of instructions through non-uniform execution pipelines in an in-order processor are provided. The system includes a first execution unit to perform instruction execution in a first execution pipeline. The system also includes a second execution unit to perform instruction execution in a second execution pipeline, where the second execution pipeline includes a greater number of stages than the first execution pipeline. The system further includes an instruction dispatch unit (IDU), the IDU including OE registers and logic for dispatching an OE-capable instruction to the first execution unit such that the instruction completes execution prior to completing execution of a previously dispatched instruction to the second execution unit. The system additionally includes a latch to hold a result of the execution of the OE-capable instruction until after the second execution unit completes the execution of the previously dispatched instruction.
    Type: Application
    Filed: February 20, 2008
    Publication date: August 20, 2009
    Applicant: International Business Machines Corporation
    Inventors: David S. Hutton, Khary J. Alexander, Fadi Y. Busaba, Bruce C. Giamei, John G. Rell, Jr., Eric M. Schwarz, Chung-Lung Kevin Shum
  • Patent number: 7490121
    Abstract: A method of implementing binary multiplication in a processing device includes obtaining a multiplicand and a multiplier from a storage device; in the event the multiplier is larger than a selected length, partitioning the multiplier into a plurality of multiplier subgroups; in the event the multiplicand is larger than a selected length, partitioning the multiplicand into a plurality of multiplicand subgroups and at least one of zeroing out of unused bits of the multiplicand subgroup and sign-extending a smaller portion of the multiplicand subgroup; establishing a plurality of multiplicand multiples based on at least one of a selected multiplicand subgroup of the plurality of multiplicand subgroups and the multiplicand; selecting one or more of the multiplicand multiples of the plurality of multiplicand multiples based on each multiplier subgroup of the plurality of multiplier subgroups; and generating a first modular product based on the selected multiplicand multiples.
    Type: Grant
    Filed: May 16, 2007
    Date of Patent: February 10, 2009
    Assignee: International Business Machines Corporation
    Inventors: Fadi Y. Busaba, Steven R. Carlough, David S. Hutton, Christopher A. Krygowski, John G. Rell, Jr., Sheryll H. Veneracion
  • Patent number: 7412476
    Abstract: A method for decimal multiplication in a superscalar processor comprising: obtaining a first operand and a second operand; establishing a multiplier and an effective multiplicand from the first operand and the second operand; and generating and accumulating a partial product term every two cycles. The partial product terms are created from the effective multiplicand and multiples of the multiplier, where the effective multiplicand is stored in a first register file, the multiples being one times the effective multiplier, two times the effective multiplier, four times the effective multiplier and eight times the effective multiplier and the partial product terms are added to an accumulation of previous partial product terms shifted one digit right such that a digit shifted off is preserved as a result digit.
    Type: Grant
    Filed: July 27, 2006
    Date of Patent: August 12, 2008
    Assignee: International Business Machines Corporation
    Inventors: Fadi Y. Busaba, Steven R. Carlough, Christopher A. Krygowski, John G. Rell, Jr.
  • Patent number: 7266580
    Abstract: A method and apparatuses for performing binary multiplication on signed and unsigned operands of various lengths are discussed herein. The concept may be split into two parts. The first is the multiplication hardware itself: a compact, less-than-full-sized multiplier that employs Booth or another type of recoding method on the multiplier to reduce the number of partial products per scan, implemented so that a multiplication operation with large operands may be broken into subgroups of operations that fit into this mid-sized multiplier, whose results, here called modular products, may be knitted back together to form a correct final product. The second part of the concept is the supporting hardware used to separate the operands into subgroups and input the data and control signals to the multiplier, and the algorithms and apparatuses used to align and combine the modular products properly to obtain the final product.
    Type: Grant
    Filed: May 12, 2003
    Date of Patent: September 4, 2007
    Assignee: International Business Machines Corporation
    Inventors: Fadi Y. Busaba, Steven R. Carlough, David S. Hutton, Christopher A. Krygowski, John G. Rell, Jr., Sheryll H. Veneracion
  • Patent number: 7200742
    Abstract: A method for creating precise exceptions including checkpointing an exception-causing instruction. The checkpointing results in a current checkpointed state. The current checkpointed state is locked. It is determined if any of a plurality of registers require restoration to the current checkpointed state. One or more of the registers are restored to the current checkpointed state in response to the results of the determining indicating that the one or more registers require the restoring. The execution unit is restarted at the exception handler or the next sequential instruction, depending on whether traps are enabled for the exception.
    Type: Grant
    Filed: February 10, 2005
    Date of Patent: April 3, 2007
    Assignee: International Business Machines Corporation
    Inventors: Fadi Y. Busaba, Michael J. Mack, John G. Rell, Jr., Eric M. Schwarz, Chung-Lung K. Shum, Timothy J. Slegel, Scott B. Swaney, Sheryll H. Veneracion
  • Patent number: 7167968
    Abstract: A method of pre-aligning data for storage during instruction execution improves performance by eliminating the cycles otherwise required for data alignment. The method can convert data between ASCII and Packed Decimal format, and between Unicode Basic Latin and Packed Decimal format. Conversion to Packed Decimal format is needed for decimal hardware in a microprocessor designed to generate decimal results. Converting from Packed Decimal to ASCII and Unicode Basic Latin is necessary to report Decimal Arithmetic results in a required format for the application program. To further improve performance, all available write ports in the fixed point unit (FXU) are utilized to reduce the number of cycles necessary to store results. To prevent data fetching of the unused destination data from slowing down instruction execution, the destination locations are tested for storage access exceptions, but the data for these operands are not actually fetched.
    Type: Grant
    Filed: April 29, 2004
    Date of Patent: January 23, 2007
    Assignee: International Business Machines Corporation
    Inventors: Fadi Y. Busaba, Steven R. Carlough, Mark A. Check, Christopher A. Krygowski, John G. Rell, Jr., Frank Tanzi
  • Patent number: 7167889
    Abstract: A method for decimal multiplication in a superscalar processor comprising: obtaining a first operand and a second operand; establishing a multiplier and an effective multiplicand from the first operand and the second operand; and generating and accumulating a partial product term every two cycles. The partial product terms are created from the effective multiplicand and multiples of the multiplier, where the effective multiplicand is stored in a first register file, the multiples being one times the effective multiplier, two times the effective multiplier, four times the effective multiplier and eight times the effective multiplier and the partial product terms are added to an accumulation of previous partial product terms shifted one digit right such that a digit shifted off is preserved as a result digit.
    Type: Grant
    Filed: May 12, 2003
    Date of Patent: January 23, 2007
    Assignee: International Business Machines Corporation
    Inventors: Fadi Y. Busaba, Steven R. Carlough, Christopher A. Krygowski, John G. Rell, Jr.
  • Patent number: 7149767
    Abstract: A method of decimal division in a superscalar processor comprising: obtaining a first operand and a second operand; establishing a dividend and a divisor from the first operand and the second operand; and determining a quotient digit and a resulting partial remainder based on multiple parallel/simultaneous subtractions of at least one of the divisor and a multiple of the divisor from the dividend, utilizing dataflow elements of multiple execution pipes of the superscalar processor.
    Type: Grant
    Filed: May 12, 2003
    Date of Patent: December 12, 2006
    Assignee: International Business Machines Corporation
    Inventors: Fadi Y. Busaba, Steven R. Carlough, Christopher A. Krygowski, John G. Rell, Jr.
  • Patent number: 7085917
    Abstract: In a computer system, a method and apparatus for dispatching and executing multi-cycle and complex instructions. The method results in maximum performance for such instructions without impacting other areas in the processor such as decode, grouping or dispatch units. This invention allows multi-cycle and complex instructions to be dispatched to one port but executed in multiple execution pipes without cracking the instruction and without limiting it to a single execution pipe. Some control signals are generated in the dispatch unit and dispatched with the instruction to the Fixed Point Unit (FXU). The FXU logic then executes these instructions on the available FXU pipes. This method results in optimum performance with little or no other complications. The presented technique places the flexibility of how these instructions will be executed in the FXU, where the actual execution takes place, instead of in the instruction decode or dispatch units or cracking by the compiler.
    Type: Grant
    Filed: May 12, 2003
    Date of Patent: August 1, 2006
    Assignee: International Business Machines Corporation
    Inventors: Fadi Y. Busaba, Steven R. Carlough, Christopher A. Krygowski, John G. Rell, Jr., Timothy J. Slegel
  • Publication number: 20040230626
    Abstract: In a computer system, a method for executing a Test under Mask instruction in the Fixed Execution Unit (FXU) allows for the execution of these instructions in a single execution cycle inside the FXU without adding any dedicated data flow circuitry by giving the highest priority to the leftmost selected bit in the operand.
    Type: Application
    Filed: May 12, 2003
    Publication date: November 18, 2004
    Applicant: International Business Machines Corporation
    Inventors: Fadi Y. Busaba, Steven R. Carlough, Christopher A. Krygowski, Wen H. Li, John G. Rell,