Patents by Inventor Brian W. Thompto

Brian W. Thompto has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20220050682
    Abstract: A computer system, processor, and method for processing information are disclosed that include at least one computer processor; a main register file associated with the at least one processor, the main register file having a plurality of entries for storing data, one or more write ports to write data to the main register file entries, and one or more read ports to read data from the main register file entries; one or more execution units including a dense math execution unit; and at least one accumulator register file having a plurality of entries for storing data. The results of the dense math execution unit in an aspect are written to the accumulator register file, preferably to the same accumulator register file entry multiple times, and the data from the accumulator register file is written to the main register file.
    Type: Application
    Filed: August 27, 2021
    Publication date: February 17, 2022
    Inventors: Brian W. Thompto, Maarten J. Boersma, Andreas Wagner, Jose E. Moreira, Hung Q. Le, Silvia Melitta Mueller, Dung Q. Nguyen
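    A minimal Python sketch (editor's illustration, not the patented design) of the accumulate-in-place flow this entry describes: a dense-math unit writes repeatedly into the same accumulator register file entry, and only the final value is copied to the main register file. All class names and sizes below are assumptions.

        # Sketch only: the dense-math unit accumulates into one accumulator
        # register file entry across many operations, and the result is
        # written once to the main register file.  Sizes are assumed values.
        class AccumulatorSketch:
            def __init__(self, acc_entries=8, main_entries=32):
                self.acc_file = [0] * acc_entries      # accumulator register file
                self.main_file = [0] * main_entries    # main (architected) register file

            def dense_math_op(self, acc_idx, a, b):
                # Each multiply-accumulate lands in the same accumulator entry,
                # so it does not consume main register file write ports.
                self.acc_file[acc_idx] += a * b

            def write_back(self, acc_idx, main_idx):
                # Only the final accumulated value moves to the main register file.
                self.main_file[main_idx] = self.acc_file[acc_idx]

        unit = AccumulatorSketch()
        for a, b in [(1, 2), (3, 4), (5, 6)]:   # three updates into entry 0
            unit.dense_math_op(0, a, b)
        unit.write_back(0, 5)
        print(unit.main_file[5])                # 44 = 1*2 + 3*4 + 5*6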
  • Publication number: 20220050684
    Abstract: Load store addressing can include a processor, which fuses two consecutive instructions determined to be prefix instructions and treats the two instructions as a single fused instruction. The prefix instruction of the fused instruction is auto-finished at dispatch time in an issue unit of the processor. A suffix instruction of the fused instruction and its fields and the prefix instruction's fields are issued from an issue queue of the issue unit, wherein an opcode of the suffix instruction is issued to a load store unit of the processor, and fields of the fused instruction are issued to the execution unit of the processor. The execution unit forms operands of the suffix instruction, at least one operand formed based on a current instruction address of the single fused instruction. The load store unit executes the suffix instruction using the operands formed by the execution unit.
    Type: Application
    Filed: August 14, 2020
    Publication date: February 17, 2022
    Inventors: Nicholas R. Orzol, Christian Gerhard Zoellin, Brian W. Thompto, Dung Q. Nguyen, Niels Fricke, Sheldon Bernard Levenstein, Phillip G. Williams, Brian D. Barrick
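    A brief Python sketch of the operand formation this entry describes: a displacement split across the prefix and suffix fields is concatenated and added to the fused instruction's current instruction address. The field widths (18-bit prefix part, 16-bit suffix part, 34-bit displacement) are assumptions for illustration, not values taken from the patent.

        # Sketch only: form an effective address for a fused prefix+suffix load
        # from the current instruction address (CIA) plus a concatenated
        # displacement.  Field widths here are assumed, not from the patent.
        def sign_extend(value, bits):
            value &= (1 << bits) - 1
            if value & (1 << (bits - 1)):
                value -= 1 << bits
            return value

        def form_operand(cia, prefix_disp_hi, suffix_disp_lo, pc_relative=True):
            # Concatenate the prefix and suffix displacement fields, then
            # optionally add the fused instruction's current address.
            disp = sign_extend((prefix_disp_hi << 16) | (suffix_disp_lo & 0xFFFF), 34)
            return (cia + disp) if pc_relative else disp

        ea = form_operand(cia=0x1000_0000, prefix_disp_hi=0x1, suffix_disp_lo=0x0010)
        print(hex(ea))   # 0x10010010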
  • Publication number: 20220050679
    Abstract: A system, processor, and/or technique configured to: determine whether two or more load instructions are fusible for execution in a load store unit as a fused load instruction; in response to determining that two or more load instructions are fusible, transmit information to process the two or more fusible load instructions into a single entry of an issue queue; issue the information to process the two or more fusible load instructions from the single entry in the issue queue as a fused load instruction to the load store unit using a single issue port of the issue queue, wherein the fused load instruction contains the information to process the two or more fusible load instructions; execute the fused load instruction in the load store unit; and write back data obtained by executing the fused load instruction simultaneously to multiple entries in the register file.
    Type: Application
    Filed: August 14, 2020
    Publication date: February 17, 2022
    Inventors: Bryan Lloyd, Brian W. Thompto, Dung Q. Nguyen, Sheldon Bernard Levenstein, Brian D. Barrick, Christian Gerhard Zoellin
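    An illustrative Python sketch of the fused-load flow in this entry: two loads packed into one issue-queue entry execute as a single wide access whose data is written back to two register-file entries. The adjacency test used as the fusibility check is an assumption made for this example only.

        # Sketch only: two adjacent loads from the same base are issued as one
        # entry through one port and written back to two register entries.
        WIDTH = 8  # assumed element size in bytes

        def fusible(load_a, load_b):
            return (load_a["base"] == load_b["base"]
                    and load_b["offset"] - load_a["offset"] == WIDTH)

        def issue_and_execute(load_a, load_b, memory, regfile):
            if fusible(load_a, load_b):
                addr = load_a["base"] + load_a["offset"]
                wide = memory[addr], memory[addr + WIDTH]            # one wide access
                regfile[load_a["rt"]], regfile[load_b["rt"]] = wide  # dual write-back
            else:
                for ld in (load_a, load_b):                          # fall back: separate issue
                    regfile[ld["rt"]] = memory[ld["base"] + ld["offset"]]

        memory = {0x100: 11, 0x108: 22}
        regfile = [0] * 32
        issue_and_execute({"base": 0x100, "offset": 0, "rt": 3},
                          {"base": 0x100, "offset": 8, "rt": 4}, memory, regfile)
        print(regfile[3], regfile[4])   # 11 22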
  • Patent number: 11249757
    Abstract: A system, processor, and/or technique configured to: determine whether two or more load instructions are fusible for execution in a load store unit as a fused load instruction; in response to determining that two or more load instructions are fusible, transmit information to process the two or more fusible load instructions into a single entry of an issue queue; issue the information to process the two or more fusible load instructions from the single entry in the issue queue as a fused load instruction to the load store unit using a single issue port of the issue queue, wherein the fused load instruction contains the information to process the two or more fusible load instructions; execute the fused load instruction in the load store unit; and write back data obtained by executing the fused load instruction simultaneously to multiple entries in the register file.
    Type: Grant
    Filed: August 14, 2020
    Date of Patent: February 15, 2022
    Assignee: International Business Machines Corporation
    Inventors: Bryan Lloyd, Brian W. Thompto, Dung Q. Nguyen, Sheldon Bernard Levenstein, Brian D. Barrick, Christian Gerhard Zoellin
  • Publication number: 20220035634
    Abstract: Technology for fusing certain load instructions and compare-immediate instructions in a computer processor having a load-store architecture with respect to transferring data between memory and registers of the computer processor. In some embodiments the load and compare-immediate instructions are consecutive. In some embodiments, the instructions are only merged if: (i) the respective RA and RT fields of the two instructions match; (ii) the immediate field of the compare-immediate instruction has a certain value, or falls within a range of certain values; and/or (iii) the instructions are received in a consecutive manner.
    Type: Application
    Filed: July 29, 2020
    Publication date: February 3, 2022
    Inventors: Bryan Lloyd, David A. Hrusecky, Sundeep Chadha, Dung Q. Nguyen, Christian Gerhard Zoellin, Brian W. Thompto, Sheldon Bernard Levenstein, Phillip G. Williams
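    A short Python sketch of the merge test named in this entry's abstract: matching RA/RT fields, an immediate within an allowed range, and consecutive instructions. The allowed range used here (a signed 16-bit field) is an assumption, not a value from the patent.

        # Sketch only: check the three fusion conditions for a load followed by
        # a compare-immediate.  The immediate range is an assumed value.
        def can_fuse(load, cmpi, consecutive, imm_range=(-32768, 32767)):
            same_regs = load["rt"] == cmpi["ra"]                  # (i)  RA/RT fields match
            imm_ok = imm_range[0] <= cmpi["imm"] <= imm_range[1]  # (ii) immediate in range
            return same_regs and imm_ok and consecutive           # (iii) consecutive in the stream

        load = {"op": "lwz", "rt": 5, "ra": 2, "disp": 16}
        cmpi = {"op": "cmpi", "ra": 5, "imm": 0}
        print(can_fuse(load, cmpi, consecutive=True))   # True: eligible for fusion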
  • Publication number: 20220035636
    Abstract: A method of instruction dispatch routing comprises receiving an instruction for dispatch to one of a plurality of issue queues; determining a priority status of the instruction; selecting a rotation order based on the priority status, wherein a first rotation order is associated with priority instructions and a second rotation order, different from the first rotation order, is associated with non-priority instructions; selecting an issue queue of the plurality of issue queues based on the selected rotation order; and dispatching the instruction to the selected issue queue.
    Type: Application
    Filed: July 31, 2020
    Publication date: February 3, 2022
    Inventors: Eric Mark Schwarz, Brian W. Thompto, Kurt A. Feiste, Michael Joseph Genden, Dung Q. Nguyen, Susan E. Eisen
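    A small Python sketch of the routing idea in this entry: priority and non-priority instructions walk the issue queues in different rotation orders, each with its own round-robin pointer. Four queues and these particular orders are assumptions for illustration.

        # Sketch only: two rotation orders over the issue queues, selected by
        # the instruction's priority status.
        from itertools import cycle

        class DispatchRouterSketch:
            def __init__(self):
                self.priority_order = cycle([0, 2, 1, 3])   # rotation for priority ops
                self.normal_order   = cycle([3, 1, 2, 0])   # different rotation otherwise

            def route(self, instruction, is_priority):
                order = self.priority_order if is_priority else self.normal_order
                return next(order)                          # next queue in that order

        router = DispatchRouterSketch()
        print([router.route("addi", is_priority=True) for _ in range(3)])   # [0, 2, 1]
        print([router.route("lwz",  is_priority=False) for _ in range(3)])  # [3, 1, 2]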
  • Publication number: 20220035637
    Abstract: A system and method for avoiding write back collisions. The system receives a plurality of instructions at a pipeline queue. Next, an issue queue determines a number of cycles for each instruction of the plurality of instructions. The issue queue further determines whether a collision will occur between at least two of the instructions. In response to a collision between at least two of the instructions, the system determines a number of cycles to delay at least one of the at least two instructions. The instructions are then executed. For instructions that had a calculated delay, the system places the results in a result buffer for the determined number of cycles of delay. After the determined number of cycles of delay, the system sends the results to a results mux. Once received at the results mux, the results are written back to the register file.
    Type: Application
    Filed: July 30, 2020
    Publication date: February 3, 2022
    Inventors: Brian D. Barrick, Maarten J. Boersma, Niels Fricke, Dung Q. Nguyen, Brian W. Thompto, Andreas Wagner
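    A Python sketch of the collision check in this entry: if two queued instructions would reach write-back in the same cycle, the later one's result is held in a buffer for enough cycles to clear the collision. The latencies and the single-write-per-cycle limit are assumptions.

        # Sketch only: compute write-back cycles and buffer delays so no two
        # results claim the same cycle.  Latencies are assumed values.
        def schedule_writebacks(instructions):
            # instructions: list of (name, issue_cycle, latency)
            taken = set()            # write-back cycles already claimed
            plan = []
            for name, issue, latency in instructions:
                wb = issue + latency
                delay = 0
                while wb + delay in taken:   # collision: hold result in the buffer
                    delay += 1
                taken.add(wb + delay)
                plan.append((name, wb, delay))
            return plan

        # Both would write back in cycle 5; the second is delayed one cycle.
        for name, wb, delay in schedule_writebacks([("fma", 1, 4), ("add", 3, 2)]):
            print(f"{name}: natural WB cycle {wb}, buffered {delay} cycle(s)")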
  • Publication number: 20220019436
    Abstract: Provided is a method for fusing store instructions in a microprocessor. The method includes identifying two instructions in an execution pipeline of a microprocessor. The method further includes determining that the two instructions meet fusion criteria. In response to determining that the two instructions meet the fusion criteria, the two instructions are recoded into a fused instruction. The fused instruction is executed.
    Type: Application
    Filed: July 20, 2020
    Publication date: January 20, 2022
    Inventors: Bryan Lloyd, Sundeep Chadha, Dung Q. Nguyen, Christian Gerhard Zoellin, Brian W. Thompto, Sheldon Bernard Levenstein, Phillip G. Williams, Robert A. Cordes, Brian Chen
  • Patent number: 11226817
    Abstract: A set of dependence relationships in a set of program instructions is detected by a processor. The set of dependence relationships comprises a first load instruction to load a first data object and a second load instruction to load a second data object from a second address that is provided by address data within the first data object. The processor identifies a number of dependence instances in the set of dependence relationships and determines that the number is over a pattern threshold. The processor sends an enhanced load request to a memory controller. The enhanced load request comprises instructions to load the first data object from a first address on a physical page, locate address data in the first data object based on a memory offset, load the second data object from a second address in the address data, and transmit the first and second data objects to the processor.
    Type: Grant
    Filed: July 11, 2019
    Date of Patent: January 18, 2022
    Assignee: International Business Machines Corporation
    Inventors: Mohit Karve, Donald R. Stence, John B. Griswell, Jr., Brian W. Thompto
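    A Python sketch of the enhanced-load idea in this entry: once a load-to-load dependence pattern has been seen more than a threshold number of times, one request asks a (simulated) memory controller to fetch the first object, read the embedded pointer at a known offset, fetch the second object, and return both. The threshold value and data layout are assumptions.

        # Sketch only: the memory controller chases the embedded pointer so the
        # processor gets both data objects from a single enhanced request.
        PATTERN_THRESHOLD = 3   # assumed value for illustration

        def enhanced_load(memory, first_addr, pointer_offset):
            first_obj = memory[first_addr]            # load the first data object
            second_addr = first_obj[pointer_offset]   # pointer located inside it
            return first_obj, memory[second_addr]     # both objects in one round trip

        memory = {
            0x1000: {"payload": 7, "next": 0x2000},   # first object holds a pointer
            0x2000: {"payload": 9, "next": None},     # second (dependent) object
        }

        dependence_count = 4
        if dependence_count > PATTERN_THRESHOLD:      # pattern seen often enough
            a, b = enhanced_load(memory, 0x1000, "next")
            print(a["payload"], b["payload"])         # 7 9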
  • Publication number: 20220012183
    Abstract: An information handling system and method for translating virtual addresses to real addresses including a processor for processing data; memory devices for storing the data; and a memory controller configured to control accesses to the memory devices, where the processor is configured, in response to a request to translate a first virtual address to a second physical address, to send from the processor to the memory controller a page directory base and a plurality of memory offsets. The memory controller is configured to: read from the memory devices a first level page directory table using the page directory base and a first level memory offset; combine the first level page directory table with a second level memory offset; and read from the memory devices a second level page directory table using the first level page directory table and the second level memory offset.
    Type: Application
    Filed: September 23, 2021
    Publication date: January 13, 2022
    Inventors: Mohit Karve, Brian W. Thompto
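    A Python sketch of the two-level walk this entry assigns to the memory controller: the processor hands over a page directory base plus per-level offsets, and each level's table entry is combined with the next offset to locate the next table. The dictionary-based "memory" and entry format are assumptions.

        # Sketch only: walk the page directory levels inside the memory
        # controller using the base and the supplied offsets.
        def walk(memory, page_dir_base, offsets):
            table = page_dir_base
            for offset in offsets:
                table = memory[table + offset]   # read level entry, combine with next offset
            return table                         # final value: translated real address

        memory = {
            0x1000 + 0x3: 0x4000,    # level-1 directory entry
            0x4000 + 0x7: 0x9F000,   # level-2 entry -> real page address
        }
        real = walk(memory, page_dir_base=0x1000, offsets=[0x3, 0x7])
        print(hex(real))             # 0x9f000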
  • Publication number: 20220004386
    Abstract: Aspects include a compute array of a processor with mixed-precision numerical linear algebra support. A first precision and a first shape of a first input matrix and a second precision and a second shape of a second input matrix to the compute array are determined. A plurality of linear algebra operations is repeated in parallel within the compute array to update a result matrix in an accumulator register based on the first input matrix, the second input matrix, and a number of rank updates of the result matrix to store in the accumulator register.
    Type: Application
    Filed: September 21, 2021
    Publication date: January 6, 2022
    Inventors: Jose E. Moreira, Brett Olsson, Brian W. Thompto, Silvia Melitta Mueller, Andreas Wagner
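    This entry and granted patent 11188328 below describe the same mixed-precision compute-array mechanism; a sketch of the rank-update bookkeeping appears after that later entry.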
  • Patent number: 11188340
    Abstract: Techniques for parallel execution of instructions in an instruction set are described. The techniques include determining a plurality of instruction streams and paths for a branch in an instruction set and executing the determined paths in parallel such that a mis-predicted path does not cause significant mis-prediction penalties.
    Type: Grant
    Filed: December 20, 2018
    Date of Patent: November 30, 2021
    Assignee: International Business Machines Corporation
    Inventors: Brian W. Thompto, Hung Q. Le, Dung Q. Nguyen
  • Patent number: 11188328
    Abstract: Aspects include a compute array of a processor with mixed-precision numerical linear algebra support. A first precision and a first shape of a first input matrix and a second precision and a second shape of a second input matrix to the compute array are determined. A number of rank updates of a result matrix to store in an accumulator register having a predetermined size are determined, where the number of rank updates is based on the first precision and the first shape of the first input matrix, the second precision and the second shape of the second input matrix, and the predetermined size of the accumulator register. A plurality of linear algebra operations is repeated in parallel within the compute array to update the result matrix in the accumulator register based on the first input matrix, the second input matrix, and the number of rank updates.
    Type: Grant
    Filed: December 12, 2019
    Date of Patent: November 30, 2021
    Assignee: International Business Machines Corporation
    Inventors: Jose E. Moreira, Brett Olsson, Brian W. Thompto, Silvia Melitta Mueller, Andreas Wagner
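    A Python sketch of the rank-update bookkeeping in this entry: how many rank updates of an m-by-n result fit in a fixed-size accumulator register depends on the operand precisions and shapes, and the updates are then applied repeatedly. The 512-bit accumulator size and the capacity rule used here are assumptions for illustration.

        # Sketch only: derive the number of rank updates from precision, shape,
        # and accumulator size, then apply that many rank-1 updates.
        ACC_BITS = 512   # assumed accumulator register size

        def rank_updates_that_fit(m, n, result_precision_bits):
            return ACC_BITS // (m * n * result_precision_bits)

        def rank_update(acc, col_a, row_b):
            # One rank-1 update of the result matrix: acc += outer(col_a, row_b).
            for i, a in enumerate(col_a):
                for j, b in enumerate(row_b):
                    acc[i][j] += a * b

        m, n = 2, 2
        k = rank_updates_that_fit(m, n, result_precision_bits=32)   # 4 updates fit
        acc = [[0] * n for _ in range(m)]
        a_cols = [[1, 2], [3, 4], [5, 6], [7, 8]]    # k columns of the first matrix
        b_rows = [[1, 0], [0, 1], [1, 1], [2, 2]]    # k rows of the second matrix
        for col, row in zip(a_cols[:k], b_rows[:k]):
            rank_update(acc, col, row)
        print(k, acc)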
  • Patent number: 11182458
    Abstract: Embodiments of the present invention are directed to a new instruction set extension and a method for providing 3D lane predication for matrix operations. In a non-limiting embodiment of the invention, a first input matrix having m rows and k columns and a second input matrix having k rows and n columns are received by a compute array of a processor. A three-dimensional predicate mask having an M-bit row mask, an N-bit column mask, and a K-bit rank mask is generated. A result matrix of up to m rows, up to n columns, and up to k rank updates is determined based on the first input matrix, the second input matrix, and the predicate mask.
    Type: Grant
    Filed: December 12, 2019
    Date of Patent: November 23, 2021
    Assignee: International Business Machines Corporation
    Inventors: Brett Olsson, Brian W. Thompto, Jose E. Moreira, Silvia Melitta Mueller, Andreas Wagner
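    A Python sketch of the 3D lane predication in this entry: element (i, j) of the result is updated by rank k only when the row bit i, the column bit j, and the rank bit k of the predicate mask are all set. The mask encoding (plain lists of 0/1) is an assumption chosen for readability.

        # Sketch only: masked matrix multiply-accumulate under a row mask,
        # a column mask, and a rank mask.
        def masked_matmul(a, b, row_mask, col_mask, rank_mask):
            m, k, n = len(a), len(a[0]), len(b[0])
            result = [[0] * n for _ in range(m)]
            for i in range(m):
                for j in range(n):
                    for r in range(k):
                        if row_mask[i] and col_mask[j] and rank_mask[r]:
                            result[i][j] += a[i][r] * b[r][j]
            return result

        a = [[1, 2], [3, 4]]            # 2 x 2 (m x k)
        b = [[5, 6], [7, 8]]            # 2 x 2 (k x n)
        # Keep row 0 only, keep both columns, keep rank 0 only.
        print(masked_matmul(a, b, row_mask=[1, 0], col_mask=[1, 1], rank_mask=[1, 0]))
        # [[5, 6], [0, 0]]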
  • Patent number: 11182164
    Abstract: Support for instruction fusion is provided. An indication whether an instruction is a paired instruction is received from an instruction decoder. Based on the indication, one dispatch slot or a paired dispatch slot is allocated in the instruction dispatcher queue. A mapper converts logical addresses of sources and targets of the instruction to physical addresses. Either one issue slot or a paired issue slot is allocated in an issue queue based on the indication from the instruction decoder. The instruction execution environment is loaded into the issue queue and issued to an execution unit.
    Type: Grant
    Filed: July 23, 2020
    Date of Patent: November 23, 2021
    Assignee: International Business Machines Corporation
    Inventors: Brian D. Barrick, John B. Griswell, Jr., Dung Q. Nguyen, Brian W. Thompto
  • Publication number: 20210342268
    Abstract: In at least one embodiment, a processing unit includes a processor core and a vertical cache hierarchy including at least a store-through upper-level cache and a store-in lower-level cache. The upper-level cache includes a data array and an effective address (EA) directory. The processor core includes an execution unit, an address translation unit, and a prefetch unit configured to initiate allocation of a directory entry in the EA directory for a store target EA without prefetching a cache line of data into the corresponding data entry in the data array. The processor core caches EA-to-RA address translation information for the store target EA in the directory entry, such that a subsequent demand store access that hits in the directory entry can avoid a performance penalty associated with address translation by the translation unit.
    Type: Application
    Filed: April 1, 2021
    Publication date: November 4, 2021
    Inventors: Bryan Lloyd, Brian W. Thompto, George W. Rohrbaugh, III, Mohit Karve, Vivek Britto
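    A Python sketch of the EA-directory idea in this entry: a prefetch allocates a directory entry for a store target EA and caches the EA-to-RA translation there without pulling in the data line, so a later demand store that hits the entry reuses the cached translation. The page size and translation function are stand-ins, not the patent's.

        # Sketch only: cache EA->RA translations in an EA directory, keyed by
        # EA page, with no data prefetched.  PAGE and translate() are stand-ins.
        PAGE = 4096

        def translate(ea):                    # stand-in for the address translation unit
            return 0x80000000 | (ea & ~(PAGE - 1))

        class EADirectorySketch:
            def __init__(self):
                self.entries = {}             # EA page -> cached RA page (no data fetched)

            def prefetch_store_target(self, ea):
                self.entries[ea // PAGE] = translate(ea)

            def demand_store(self, ea):
                cached = self.entries.get(ea // PAGE)
                if cached is not None:
                    return cached | (ea & (PAGE - 1)), "hit: translation reused"
                return translate(ea) | (ea & (PAGE - 1)), "miss: translated on demand"

        d = EADirectorySketch()
        d.prefetch_store_target(0x12345000)
        print(d.demand_store(0x12345678))     # hit path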
  • Publication number: 20210342150
    Abstract: In at least one embodiment, a processor includes architected register file and non-architected register files for buffering operands. The processor additionally includes an instruction fetch unit that fetches instructions to be executed and at least one execution unit. The at least one execution unit is configured to execute a first class of instructions that access operands in the architected register file and a second class of instructions that access operands in the non-architected register file. The processor also includes a mapper circuit that assigns physical registers to the instructions for buffering of operands. The processor additionally includes a dispatch circuit configured, based on detection of an instruction in one of the first and second classes of instructions for which correct operands do not reside in a respective one of the architected and non-architected register files, to automatically initiate transfer of operands between the architected and non-architected register files.
    Type: Application
    Filed: December 14, 2020
    Publication date: November 4, 2021
    Inventors: Steven J. Battle, Kurt A. Feiste, Susan E. Eisen, Dung Q. Nguyen, Christian Gerhard Zoellin, Kent Li, Brian W. Thompto, Dhivya Jeganathan, Kenneth L. Ward, Brian D. Barrick
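    A Python sketch of the dispatch-time check in this entry: each register's current value lives in either the architected or the non-architected file, and an instruction whose class expects it in the other file triggers an automatic transfer first. The two-dictionary model and class labels are illustrative only.

        # Sketch only: move an operand between the architected and
        # non-architected register files when the instruction class needs it.
        class RegisterFilesSketch:
            def __init__(self):
                self.architected = {}       # operands for the first instruction class
                self.non_architected = {}   # operands for the second instruction class

            def dispatch(self, instr_class, reg, value_if_new=None):
                src, dst = ((self.non_architected, self.architected)
                            if instr_class == "architected"
                            else (self.architected, self.non_architected))
                if reg in src:              # operand is in the other file: transfer it
                    dst[reg] = src.pop(reg)
                elif reg not in dst and value_if_new is not None:
                    dst[reg] = value_if_new
                return dst[reg]

        rf = RegisterFilesSketch()
        rf.dispatch("architected", "r3", value_if_new=42)    # r3 produced in architected file
        print(rf.dispatch("non_architected", "r3"))          # 42, after automatic transfer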
  • Patent number: 11163571
    Abstract: Technology for fusing an add-immediate instruction with a load-immediate instruction (or store-immediate instruction) in a microprocessor. This can result in quicker address generation while performing a load and store operation.
    Type: Grant
    Filed: July 29, 2020
    Date of Patent: November 2, 2021
    Assignee: International Business Machines Corporation
    Inventors: Brian D. Barrick, Sundeep Chadha, Sheldon Bernard Levenstein, Phillip G. Williams, Niels Fricke, Dung Q. Nguyen, Brian W. Thompto, Christian Gerhard Zoellin
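    A Python sketch of the address-generation benefit in this entry: instead of computing ra = rb + imm1 and then loading from ra + imm2, the fused form computes the effective address as rb + (imm1 + imm2) in a single step. Register and field names are illustrative.

        # Sketch only: fold the add-immediate constant into the load's
        # displacement so one address computation replaces two instructions.
        def fused_addi_load(regs, memory, base_reg, addi_imm, load_disp):
            ea = regs[base_reg] + addi_imm + load_disp   # one combined address generation
            return memory[ea]

        regs = {"r2": 0x1000}
        memory = {0x1000 + 0x30 + 0x8: 99}
        print(fused_addi_load(regs, memory, "r2", addi_imm=0x30, load_disp=0x8))   # 99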
  • Patent number: 11163695
    Abstract: An information handling system and method for translating virtual addresses to real addresses including a processor for processing data; memory devices for storing the data; and a memory controller configured to control accesses to the memory devices, where the processor is configured, in response to a request to translate a first virtual address to a second physical address, to send from the processor to the memory controller a page directory base and a plurality of memory offsets. The memory controller is configured to: read from the memory devices a first level page directory table using the page directory base and a first level memory offset; combine the first level page directory table with a second level memory offset; and read from the memory devices a second level page directory table using the first level page directory table and the second level memory offset.
    Type: Grant
    Filed: December 3, 2019
    Date of Patent: November 2, 2021
    Assignee: International Business Machines Corporation
    Inventors: Mohit Karve, Brian W. Thompto
  • Patent number: 11163577
    Abstract: A processor reads at least one instruction comprising at least one of a branch instruction and a non-branch instruction. In response to the branch instruction comprising a conditional branch instruction and set in dynamic mode, the processor dynamically predicts a branch path as taken or not taken. The processor, in response to the instruction fetch unit set in static mode for a conditional branch instruction and static branch prediction setting bits received with the conditional branch instruction specifying static branch prediction, statically sets the branch path as taken or not taken according to the static branch prediction setting bits received with the branch instruction. The processor selectively sets the operation of the processor temporarily from the dynamic mode to the static mode only in response to detecting a type of the at least one instruction matches a type of instruction qualifying to trigger static branch prediction.
    Type: Grant
    Filed: November 26, 2018
    Date of Patent: November 2, 2021
    Assignee: International Business Machines Corporation
    Inventors: Sheldon Levenstein, Brian W. Thompto, David S. Levitan
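    A Python sketch of the mode selection in this entry: a qualifying branch type temporarily switches the predictor to static mode, where hint bits carried with the branch decide taken or not taken; otherwise a dynamic predictor (a simple 2-bit counter here, as a stand-in) makes the call. The hint encoding is an assumption.

        # Sketch only: choose between static hint bits and a dynamic predictor.
        class BranchPredictorSketch:
            def __init__(self):
                self.counter = 2                   # 2-bit saturating counter, weakly taken

            def predict(self, qualifies_for_static, static_hint_bits=None):
                if qualifies_for_static and static_hint_bits is not None:
                    return static_hint_bits == 0b11        # static mode: hint bits decide
                return self.counter >= 2                   # dynamic mode: counter decides

            def update(self, taken):
                self.counter = min(3, self.counter + 1) if taken else max(0, self.counter - 1)

        bp = BranchPredictorSketch()
        print(bp.predict(qualifies_for_static=True, static_hint_bits=0b11))   # True (static: taken)
        print(bp.predict(qualifies_for_static=False))                         # True (dynamic)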