Patents by Inventor Jose E. Moreira

Jose E. Moreira has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11900116
    Abstract: A system may determine that two instructions may be combined based on a processing power of the processor and a size of the instructions, fuse the two instructions into a pair, map the two instructions with a single register tag, write the register tag into a mapper with bits indicating that the register tag is for a first instruction of the two instructions, write the register tag into the mapper with bits indicating that the register tag is for a second instruction of the two instructions, write the fused instruction pair into an issue queue, issue the fused instruction pair to a vector-scalar transformation units (VSU), and execute the two instructions.
    Type: Grant
    Filed: September 29, 2021
    Date of Patent: February 13, 2024
    Assignee: International Business Machines Corporation
    Inventors: Dung Q. Nguyen, Brian W. Thompto, Jose E. Moreira, Jessica Hui-Chun Tseng, Pratap C. Pattnaik, Kattamuri Ekanadham, Manoj Kumar
  • Patent number: 11868275
    Abstract: Aspects of the present disclosure relate to encrypted data processing (EDAP). A processor includes a register file configured to store ciphertext data, an instruction fetch and decode unit configured to fetch and decode instructions, and a functional unit configured to process the stored ciphertext data. The functional unit further includes a decryption module configured to decrypt ciphertext data from the register file to receive cleartext data using an encryption key stored within the functional unit. The functional unit further includes a local buffer configured to store the cleartext data. The functional unit further includes an arithmetic logical unit configured to generate cleartext computation results using the cleartext data The functional unit further includes an encryption module configured to encrypt the cleartext computation results to generate ciphertext computation results for storage back into the register file.
    Type: Grant
    Filed: June 24, 2021
    Date of Patent: January 9, 2024
    Assignee: International Business Machines Corporation
    Inventors: Manoj Kumar, Gianfranco Bilardi, Kattamuri Ekanadham, Jose E. Moreira, Pratap C. Pattnaik, Jessica Hui-Chun Tseng
  • Patent number: 11836493
    Abstract: Embodiments for providing memory access operations for graph analytics by a processor are disclosed. An entire chunk of load and store instructions may be atomically and concurrently executed, where the entire chunk of the load and store instructions are delineated from a plurality of alternative load and store instructions.
    Type: Grant
    Filed: February 10, 2022
    Date of Patent: December 5, 2023
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Manoj Kumar, Gianfranco Bilardi, Kattamuri Ekanadham, Jose E. Moreira, Pratap C. Pattnaik, Jessica Hui-Chun Tseng
  • Publication number: 20230367597
    Abstract: A computer system, processor, and method for processing information is disclosed that includes at least one computer processor; a main register file associated with the at least one processor, the main register file having a plurality of entries for storing data, one or more write ports to write data to the main register file entries, and one or more read ports to read data from the main register file entries; one or more execution units including a dense math execution unit; and at least one accumulator register file having a plurality of entries for storing data. The results of the dense math execution unit in an aspect are written to the accumulator register file, preferably to the same accumulator register file entry multiple times, and the data from the accumulator register file is written to the main register file.
    Type: Application
    Filed: July 28, 2023
    Publication date: November 16, 2023
    Inventors: Brian W. Thompto, Maarten J. Boersma, Andreas Wagner, Jose E. Moreira, Hung Q. Le, Silvia Melitta Mueller, Dung Q. Nguyen
  • Patent number: 11755320
    Abstract: Aspects include a compute array of a processor with mixed-precision numerical linear algebra support. A first precision and a first shape of a first input matrix and a second precision and a second shape of a second input matrix to the compute array are determined. A plurality of linear algebra operations is repeated in parallel within the compute array to update a result matrix in an accumulator register based on the first input matrix, the second input matrix, and a number of rank updates of the result matrix to store in the accumulator register.
    Type: Grant
    Filed: September 21, 2021
    Date of Patent: September 12, 2023
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Jose E. Moreira, Brett Olsson, Brian W. Thompto, Silvia Melitta Mueller, Andreas Wagner
  • Patent number: 11755325
    Abstract: A computer system, processor, and method for processing information is disclosed that includes at least one computer processor; a main register file associated with the at least one processor, the main register file having a plurality of entries for storing data, one or more write ports to write data to the main register file entries, and one or more read ports to read data from the main register file entries; one or more execution units including a dense math execution unit; and at least one accumulator register file having a plurality of entries for storing data. The results of the dense math execution unit in an aspect are written to the accumulator register file, preferably to the same accumulator register file entry multiple times, and the data from the accumulator register file is written to the main register file.
    Type: Grant
    Filed: August 27, 2021
    Date of Patent: September 12, 2023
    Assignee: International Business Machines Corporation
    Inventors: Brian W. Thompto, Maarten J. Boersma, Andreas Wagner, Jose E. Moreira, Hung Q. Le, Silvia Melitta Mueller, Dung Q. Nguyen
  • Publication number: 20230251862
    Abstract: Embodiments for providing memory access operations for graph analytics by a processor are disclosed. An entire chunk of load and store instructions may be atomically and concurrently executed, where the entire chunk of the load and store instructions are delineated from a plurality of alternative load and store instructions.
    Type: Application
    Filed: February 10, 2022
    Publication date: August 10, 2023
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Manoj KUMAR, Gianfranco BILARDI, Kattamuri EKANADHAM, Jose E. MOREIRA, Pratap C. PATTNAIK, Jessica Hui-Chun TSENG
  • Patent number: 11663009
    Abstract: A Reduced Instruction Set Computer (“RISC”) supporting large-word operations in a computing environment is disclosed. In one implementation, in response to receiving one or more control signals from a central processing unit (“CPU”), a set of operations are executed on a state of a special purpose execution unit (“SPU”) having a plurality of SPU registers, the SPU being associated with the CPU and the state of the SPU having word widths of one or more of the plurality of registers being greater in size than word widths of a plurality of CPU registers of a computing system and a set of state-master bits to synchronize the state of the SPU and a state of the CPU. The results of the set of operations are stored in the plurality of CPU registers or an alternative set of the plurality of SPU registers.
    Type: Grant
    Filed: October 14, 2021
    Date of Patent: May 30, 2023
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Sandhya Koteshwara, Kattamuri Ekanadham, Manoj Kumar, Jose E. Moreira, Pratap C. Pattnaik
  • Publication number: 20230124185
    Abstract: A Reduced Instruction Set Computer (“RISC”) supporting large-word operations in a computing environment is disclosed. In one implementation, in response to receiving one or more control signals from a central processing unit (“CPU”), a set of operations are executed on a state of a special purpose execution unit (“SPU”) having a plurality of SPU registers, the SPU being associated with the CPU and the state of the SPU having word widths of one or more of the plurality of registers being greater in size than word widths of a plurality of CPU registers of a computing system and a set of state-master bits to synchronize the state of the SPU and a state of the CPU. The results of the set of operations are stored in the plurality of CPU registers or an alternative set of the plurality of SPU registers.
    Type: Application
    Filed: October 14, 2021
    Publication date: April 20, 2023
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Sandhya KOTESHWARA, Kattamuri EKANADHAM, Manoj KUMAR, Jose E. MOREIRA, Pratap C. PATTNAIK
  • Publication number: 20220414023
    Abstract: Aspects of the present disclosure relate to encrypted data processing (EDAP). A processor includes a register file configured to store ciphertext data, an instruction fetch and decode unit configured to fetch and decode instructions, and a functional unit configured to process the stored ciphertext data. The functional unit further includes a decryption module configured to decrypt ciphertext data from the register file to receive cleartext data using an encryption key stored within the functional unit. The functional unit further includes a local buffer configured to store the cleartext data. The functional unit further includes an arithmetic logical unit configured to generate cleartext computation results using the cleartext data The functional unit further includes an encryption module configured to encrypt the cleartext computation results to generate ciphertext computation results for storage back into the register file.
    Type: Application
    Filed: June 24, 2021
    Publication date: December 29, 2022
    Inventors: Manoj Kumar, Gianfranco Bilardi, Kattamuri Ekanadham, Jose E. Moreira, Pratap C. Pattnaik, Jessica Hui-Chun Tseng
  • Publication number: 20220414270
    Abstract: Aspects of the present disclosure relate to encrypted data processing (EDAP). Encrypted data from a cache to be loaded into a register file can be accessed. The encrypted data can be decrypted to receive cleartext data. The cleartext data can be written to the register file. The cleartext data can be processed using at least one functional unit to receive cleartext computation results. The cleartext computation results can then be written back to the register file.
    Type: Application
    Filed: June 24, 2021
    Publication date: December 29, 2022
    Inventors: Jessica Hui-Chun Tseng, Jose E. Moreira, Pratap C. Pattnaik, Manoj Kumar, Kattamuri Ekanadham, Gianfranco Bilardi
  • Publication number: 20220191020
    Abstract: A processor and method for processing information is disclosed that in response to encountering a function entry instruction while running an application, computes an entry hash value using a hash of three hash input parameters, wherein one of the input parameters is a secret key stored in the special purpose register; and in response to encountering a function exit instruction, computes an exit hash value using the same three input parameters and the same hash used when computing the entry hash value; and determines if the entry hash value is the same as the exit hash value.
    Type: Application
    Filed: December 16, 2020
    Publication date: June 16, 2022
    Inventors: Jose E. Moreira, Arnold Flores, Debapriya Chatterjee, Kattamuri Ekanadham
  • Publication number: 20220188463
    Abstract: A computer system, processor, computer program product, and method for executing instructions in a software application that includes a processor that can be dynamically controlled, in response to a value set in a control register, to operate in either a secure mode or a performance mode. In the secure mode, the processor: upon encountering a secure mode entry instruction, computes an entry hash value using a hash function and stores the entry hash value; and upon encountering a secure mode exit instruction, computes an exit hash value, loads the entry hash value, and determines whether the entry hash value is the same as the exit hash value, and depending upon verification of the hash values can execute the return function or transfer control to the operating system. In the performance mode, the processor: executes both the secure mode entry instruction and the secure mode exit instruction as no-operations.
    Type: Application
    Filed: December 16, 2020
    Publication date: June 16, 2022
    Inventors: Debapriya Chatterjee, Brian W. Thompto, Jose E. Moreira
  • Patent number: 11294685
    Abstract: Method and systems for creating a sequence of fused instructions. An instruction stream is obtained, and a window of instructions from the instruction stream is examined and one or more groups of instructions that satisfy one or more fusion rules are identified. One or more of the groups of instructions that satisfy the one or more fusion rules are fused and a maximal length data dependence chain in the instruction stream is analyzed by analyzing every node in a dependence graph in a selected window of instructions. Fusion of an instruction group is prevented based on the maximal length data dependence chain.
    Type: Grant
    Filed: June 4, 2019
    Date of Patent: April 5, 2022
    Assignee: International Business Machines Corporation
    Inventors: Jessica Hui-Chun Tseng, Manoj Kumar, Kattamuri Ekanadham, Jose E. Moreira, Pratap C. Pattnaik
  • Patent number: 11281745
    Abstract: Methods and systems of matrix multiplication are described. In an example, a processor can multiply a first entry of a first vector of a first data array with a second vector of a second data array to generate a third vector of a third data array. The processor can store the third vector of the third data array in the second register file. The processor can multiply a second entry of the first vector with the second vector to generate a fourth vector of the third data array. The processor can store the fourth vector of the third data array in the second register file. The processor can combine vectors of the third data array that are stored in the second register file to produce the third data array.
    Type: Grant
    Filed: August 16, 2019
    Date of Patent: March 22, 2022
    Assignee: International Business Machines Corporation
    Inventors: Bruce Fleischer, Jose E. Moreira, Joel A. Silberman
  • Publication number: 20220050682
    Abstract: A computer system, processor, and method for processing information is disclosed that includes at least one computer processor; a main register file associated with the at least one processor, the main register file having a plurality of entries for storing data, one or more write ports to write data to the main register file entries, and one or more read ports to read data from the main register file entries; one or more execution units including a dense math execution unit; and at least one accumulator register file having a plurality of entries for storing data. The results of the dense math execution unit in an aspect are written to the accumulator register file, preferably to the same accumulator register file entry multiple times, and the data from the accumulator register file is written to the main register file.
    Type: Application
    Filed: August 27, 2021
    Publication date: February 17, 2022
    Inventors: Brian W. Thompto, Maarten J. Boersma, Andreas Wagner, Jose E. Moreira, Hung Q. Le, Silvia Melitta Mueller, Dung Q. Nguyen
  • Publication number: 20220012010
    Abstract: A data ordering device includes a plurality of inputs N and a plurality of outputs M. There is a sorting network coupled between the plurality of inputs N and the plurality of outputs M. There are one or more latches comprising a buffer coupled between each input of the plurality of inputs N and a corresponding input of the sorting network. There are one or more latches comprising a buffer coupled between each output of the plurality of outputs M and a corresponding output of the sorting network. There is an input for a control signal operative to initiate a sorting of data between the plurality of inputs N and the plurality of outputs M. The data ordering device is coupled to a core of a central processing unit.
    Type: Application
    Filed: September 26, 2021
    Publication date: January 13, 2022
    Inventors: Manoj Kumar, Pratap C. Pattnaik, Kattamuri Ekanadham, Jessica Tseng, Jose E. Moreira
  • Publication number: 20220004386
    Abstract: Aspects include a compute array of a processor with mixed-precision numerical linear algebra support. A first precision and a first shape of a first input matrix and a second precision and a second shape of a second input matrix to the compute array are determined. A plurality of linear algebra operations is repeated in parallel within the compute array to update a result matrix in an accumulator register based on the first input matrix, the second input matrix, and a number of rank updates of the result matrix to store in the accumulator register.
    Type: Application
    Filed: September 21, 2021
    Publication date: January 6, 2022
    Inventors: Jose E. Moreira, Brett Olsson, Brian W. Thompto, Silvia Melitta Mueller, Andreas Wagner
  • Patent number: 11188328
    Abstract: Aspects include a compute array of a processor with mixed-precision numerical linear algebra support. A first precision and a first shape of a first input matrix and a second precision and a second shape of a second input matrix to the compute array are determined. A number of rank updates of a result matrix to store in an accumulator register having a predetermined size are determined, where the number of rank updates is based on the first precision and the first shape of the first input matrix, the second precision and the second shape of the second input matrix, and the predetermined size of the accumulator register. A plurality of linear algebra operations is repeated in parallel within the compute array to update the result matrix in the accumulator register based on the first input matrix, the second input matrix, and the number of rank updates.
    Type: Grant
    Filed: December 12, 2019
    Date of Patent: November 30, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Jose E. Moreira, Brett Olsson, Brian W. Thompto, Silvia Melitta Mueller, Andreas Wagner
  • Patent number: 11182458
    Abstract: Embodiments of the present invention are directed to a new instruction set extension and a method for providing 3D lane predication for matrix operations. In a non-limiting embodiment of the invention, a first input matrix having m rows and k columns and a second input matrix having k rows and n columns are received by a compute array of a processor. A three-dimensional predicate mask having an M-bit row mask, an N-bit column mask, and a K-bit rank mask is generated. A result matrix of up to m rows, up to n columns, and up to k rank updates is determined based on the first input matrix, the second input matrix, and the predicate mask.
    Type: Grant
    Filed: December 12, 2019
    Date of Patent: November 23, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Brett Olsson, Brian W. Thompto, Jose E. Moreira, Silvia Melitta Mueller, Andreas Wagner