Patents by Inventor Silvia Melitta Mueller

Silvia Melitta Mueller has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Parallel rounding for conversion from binary floating point to binary coded decimal

Patent number: 11221826

Abstract: Embodiments of the invention are directed to a computer-implemented method for parallel conversion to binary coded decimal format. The method includes receiving, by a floating point unit (FPU), a value in binary floating point (BFP) format. The BFP value includes an integer part and a fractional part. The FPU converts the BFP value to a binary coded decimal (BCD) value. In parallel to converting the BFP value to a BCD value, the FPU performs a rounding operation on the BFP value. The FPU receives the rounding information and operates on the BCD value accordingly.

Type: Grant

Filed: July 30, 2019

Date of Patent: January 11, 2022

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Stefan Payer, Silvia Melitta Mueller, Razvan Peter Figuli, Revital Arieli

COMPUTE ARRAY OF A PROCESSOR WITH MIXED-PRECISION NUMERICAL LINEAR ALGEBRA SUPPORT

Publication number: 20220004386

Abstract: Aspects include a compute array of a processor with mixed-precision numerical linear algebra support. A first precision and a first shape of a first input matrix and a second precision and a second shape of a second input matrix to the compute array are determined. A plurality of linear algebra operations is repeated in parallel within the compute array to update a result matrix in an accumulator register based on the first input matrix, the second input matrix, and a number of rank updates of the result matrix to store in the accumulator register.

Type: Application

Filed: September 21, 2021

Publication date: January 6, 2022

Inventors: Jose E. Moreira, Brett Olsson, Brian W. Thompto, Silvia Melitta Mueller, Andreas Wagner

Parallelized rounding for decimal floating point to binary coded decimal conversion

Patent number: 11210064

Abstract: A computer-implemented method includes: receiving, using a processor, a decimal floating point number; and using a floating point unit within the processor to convert the decimal floating point number into a binary coded decimal number, wherein the floating point unit starts a conversion loop subsequent to a rounding loop starting, wherein the rounding loop and the conversion loop run in parallel once started.

Type: Grant

Filed: July 30, 2019

Date of Patent: December 28, 2021

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Stefan Payer, Silvia Melitta Mueller, Nicol Hofmann, Razvan Peter Figuli

Compute array of a processor with mixed-precision numerical linear algebra support

Patent number: 11188328

Abstract: Aspects include a compute array of a processor with mixed-precision numerical linear algebra support. A first precision and a first shape of a first input matrix and a second precision and a second shape of a second input matrix to the compute array are determined. A number of rank updates of a result matrix to store in an accumulator register having a predetermined size are determined, where the number of rank updates is based on the first precision and the first shape of the first input matrix, the second precision and the second shape of the second input matrix, and the predetermined size of the accumulator register. A plurality of linear algebra operations is repeated in parallel within the compute array to update the result matrix in the accumulator register based on the first input matrix, the second input matrix, and the number of rank updates.

Type: Grant

Filed: December 12, 2019

Date of Patent: November 30, 2021

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Jose E. Moreira, Brett Olsson, Brian W. Thompto, Silvia Melitta Mueller, Andreas Wagner

Validating microprocessor performance

Patent number: 11188304

Abstract: Validating microprocessor instruction execution by receiving a floating-point exception selection, receiving a validation method selection, generating validation data according to the floating-point exception selection and the validation method selection by randomly generating a first tensor element value and randomly generating a second tensor element value according to the first tensor element value and the floating-point exception selection, and executing a floating-point computation according to the validation data.

Type: Grant

Filed: July 1, 2020

Date of Patent: November 30, 2021

Assignee: International Business Machines Corporation

Inventors: Gal Ashour, Oz Dov Hershkovitz, Michal Rimon, Karen Holtz, Silvia Melitta Mueller, Avishai Moshe Fedida

Three-dimensional lane predication for matrix operations

Patent number: 11182458

Abstract: Embodiments of the present invention are directed to a new instruction set extension and a method for providing 3D lane predication for matrix operations. In a non-limiting embodiment of the invention, a first input matrix having m rows and k columns and a second input matrix having k rows and n columns are received by a compute array of a processor. A three-dimensional predicate mask having an M-bit row mask, an N-bit column mask, and a K-bit rank mask is generated. A result matrix of up to m rows, up to n columns, and up to k rank updates is determined based on the first input matrix, the second input matrix, and the predicate mask.
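
The masking described in this abstract can be illustrated with a small software sketch. It is not the patented hardware or instruction set: the function name `predicated_matmul`, the use of Python lists of booleans for the row, column, and rank masks, and the choice to leave masked-out result lanes unchanged are all assumptions made for illustration.

```python
def predicated_matmul(a, b, row_mask, col_mask, rank_mask, acc=None):
    """Multiply an m x k matrix by a k x n matrix under three lane masks.

    row_mask (length m) selects result rows, col_mask (length n) selects
    result columns, and rank_mask (length k) selects which rank-1 updates
    contribute. Masked-out result lanes keep their previous value
    (an assumption; the abstract does not specify this).
    """
    m, k, n = len(a), len(b), len(b[0])
    if acc is None:
        acc = [[0] * n for _ in range(m)]
    for i in range(m):
        if not row_mask[i]:
            continue  # whole result row disabled
        for j in range(n):
            if not col_mask[j]:
                continue  # whole result column disabled
            for p in range(k):
                if rank_mask[p]:
                    # rank update p contributes a[i][p] * b[p][j]
                    acc[i][j] += a[i][p] * b[p][j]
    return acc


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
full = predicated_matmul(a, b, [True, True], [True, True], [True, True])
partial = predicated_matmul(a, b, [True, False], [True, True], [True, False])
```

With all masks enabled the sketch reduces to an ordinary matrix product; disabling a rank lane drops one outer-product term from every enabled result element.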

Type: Grant

Filed: December 12, 2019

Date of Patent: November 23, 2021

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Brett Olsson, Brian W. Thompto, Jose E. Moreira, Silvia Melitta Mueller, Andreas Wagner

Binary floating-point multiply and scale operation for compute-intensive numerical applications and apparatuses

Patent number: 11182127

Abstract: Techniques facilitating binary floating-point multiply and scale operation for compute-intensive numerical applications and apparatuses are provided. An embodiment relates to a system that can comprise a memory that stores computer executable components and a processor that executes the computer executable components stored in the memory. The computer executable components can comprise a receiver component that receives an instruction to perform a multiply and scale operation of the first floating point operand value, the second floating point operand value, and the integer operand value, wherein the multiplication component obtains the floating-point product in response to the instruction to perform the multiply and scale operation. The multiplication can be performed as a single instruction.

Type: Grant

Filed: March 25, 2019

Date of Patent: November 23, 2021

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Silvia Melitta Mueller, Bruce Fleischer, Ankur Agrawal, Kailash Gopalakrishnan

Floating point unit for exponential function implementation

Patent number: 11163533

Abstract: A computer-implemented method performs an exponential calculation using only two fully-pipelined instructions in a floating point unit. The method includes computing an intermediate value y′ by multiplying an input operand with a predetermined constant value. The input operand is received in floating point representation. The method further includes computing an exponential result for the input operand by executing a fused instruction. The fused instruction includes converting the intermediate value y′ to an integer representation z represented by v most significant bits (MSB), and w least significant bits (LSB). The fused instruction further includes determining exponent bits of the exponential result based on the v MSB from the integer representation z. The method further includes determining mantissa bits of the exponential result according to a piece-wise linear mapping function using a predetermined number of segments based on the w LSB from the integer representation z.
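
The two-step scheme in this abstract can be sketched in software as follows. This is only an illustration under stated assumptions, not the patented circuit: the constant is taken to be log2(e) scaled by 2^w, and the values `W = 8` and `SEGMENTS = 4`, the function name `approx_exp`, and the use of exact segment endpoints for the linear pieces are all hypothetical choices.

```python
import math

W = 8          # assumed number of least significant (fractional) bits w
SEGMENTS = 4   # assumed number of piece-wise linear segments


def approx_exp(x: float) -> float:
    """Approximate e**x as 2**(x * log2(e)) using an integer split."""
    # Step 1: multiply the input by a predetermined constant.
    y = x * math.log2(math.e) * (1 << W)

    # Step 2 (the "fused" step): convert to an integer z, then split it.
    z = math.floor(y)
    exponent = z >> W                # upper bits give the result exponent
    frac_bits = z & ((1 << W) - 1)   # lower W bits select the mantissa
    f = frac_bits / (1 << W)         # fraction in [0, 1)

    # Piece-wise linear approximation of 2**f on the segment containing f.
    seg = min(int(f * SEGMENTS), SEGMENTS - 1)
    f0, f1 = seg / SEGMENTS, (seg + 1) / SEGMENTS
    m0, m1 = 2.0 ** f0, 2.0 ** f1
    mantissa = m0 + (m1 - m0) * (f - f0) / (f1 - f0)

    return math.ldexp(mantissa, exponent)  # mantissa * 2**exponent


for x in (-5.0, 0.5, 3.0):
    print(x, approx_exp(x), math.exp(x))
```

Increasing `W` or `SEGMENTS` in this sketch tightens the approximation, at the cost of a wider integer split or a larger table of segment endpoints.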

Type: Grant

Filed: July 18, 2019

Date of Patent: November 2, 2021

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Xiao Sun, Ankur Agrawal, Kailash Gopalakrishnan, Silvia Melitta Mueller, Kerstin Claudia Schelm

SMT processor to create a virtual vector register file for a borrower thread from a number of donated vector register files

Patent number: 11132228

Abstract: A computing device and a method of allocating vector register files in a simultaneously-multithreaded (SMT) processor core are provided. A request for a first number (M) of vector register files is received from a borrower thread of the processor core. One or more available donor threads of the processor core are identified. A second number (N) of the vector register files, of the identified one or more available donor threads, are assigned to the borrower thread, where N is ≤ M. The borrower thread is parameterized to create a virtualized vector register file for the borrower thread, based on a width of the N vector register files of the identified one or more donor threads.

Type: Grant

Filed: March 21, 2018

Date of Patent: September 28, 2021

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Mauricio Serrano, Giles Frazier, Silvia Melitta Mueller

Instruction handling for accumulation of register results in a microprocessor

Patent number: 11132198

Abstract: A computer system, processor, and method for processing information is disclosed that includes at least one computer processor; a main register file associated with the at least one processor, the main register file having a plurality of entries for storing data, one or more write ports to write data to the main register file entries, and one or more read ports to read data from the main register file entries; one or more execution units including a dense math execution unit; and at least one accumulator register file having a plurality of entries for storing data. The results of the dense math execution unit in an aspect are written to the accumulator register file, preferably to the same accumulator register file entry multiple times, and the data from the accumulator register file is written to the main register file.

Type: Grant

Filed: August 29, 2019

Date of Patent: September 28, 2021

Assignee: International Business Machines Corporation

Inventors: Brian W. Thompto, Maarten J. Boersma, Andreas Wagner, Jose E. Moreira, Hung Q. Le, Silvia Melitta Mueller, Dung Q. Nguyen

THREE-DIMENSIONAL LANE PREDICATION FOR MATRIX OPERATIONS

Publication number: 20210182359

Abstract: Embodiments of the present invention are directed to a new instruction set extension and a method for providing 3D lane predication for matrix operations. In a non-limiting embodiment of the invention, a first input matrix having m rows and k columns and a second input matrix having k rows and n columns are received by a compute array of a processor. A three-dimensional predicate mask having an M-bit row mask, an N-bit column mask, and a K-bit rank mask is generated. A result matrix of up to m rows, up to n columns, and up to k rank updates is determined based on the first input matrix, the second input matrix, and the predicate mask.

Type: Application

Filed: December 12, 2019

Publication date: June 17, 2021

Inventors: Brett Olsson, Brian W. Thompto, Jose E. Moreira, Silvia Melitta Mueller, Andreas Wagner

COMPUTE ARRAY OF A PROCESSOR WITH MIXED-PRECISION NUMERICAL LINEAR ALGEBRA SUPPORT

Publication number: 20210182060

Abstract: Aspects include a compute array of a processor with mixed-precision numerical linear algebra support. A first precision and a first shape of a first input matrix and a second precision and a second shape of a second input matrix to the compute array are determined. A number of rank updates of a result matrix to store in an accumulator register having a predetermined size are determined, where the number of rank updates is based on the first precision and the first shape of the first input matrix, the second precision and the second shape of the second input matrix, and the predetermined size of the accumulator register. A plurality of linear algebra operations is repeated in parallel within the compute array to update the result matrix in the accumulator register based on the first input matrix, the second input matrix, and the number of rank updates.

Type: Application

Filed: December 12, 2019

Publication date: June 17, 2021

Inventors: Jose E. Moreira, Brett Olsson, Brian W. Thompto, Silvia Melitta Mueller, Andreas Wagner

MIXED PRECISION FLOATING-POINT MULTIPLY-ADD OPERATION

Publication number: 20210182024

Abstract: An example computer-implemented method includes receiving a first value, a second value, a third value, and a fourth value, wherein the first value, the second value, the third value, and the fourth value are 16-bit or smaller precision floating-point numbers. The method further includes multiplying the first value and the second value to generate a first product, wherein the first product is a 32-bit floating-point number. The method further includes multiplying the third value and the fourth value to generate a second product, wherein the second product is a 32-bit floating-point number. The method further includes summing the first product and the second product to generate a summed value, wherein the summed value is a 32-bit floating-point number. The method further includes adding the summed value to an addend value to generate a result value, wherein the addend value and the result value are 32-bit floating-point numbers.
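
The sequence of operations in this abstract can be modeled in software with a minimal sketch. It assumes IEEE 754 binary16 inputs and binary32 intermediates, emulated by round-tripping Python floats through the standard `struct` module; the helper names `to_f16`, `to_f32`, and `mixed_precision_fma` are hypothetical, and actual hardware may round differently from this step-by-step model.

```python
import struct


def to_f16(x: float) -> float:
    """Round x to the nearest IEEE 754 binary16 value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_f32(x: float) -> float:
    """Round x to the nearest IEEE 754 binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def mixed_precision_fma(a, b, c, d, addend):
    """Compute (a*b + c*d) + addend with 16-bit inputs and 32-bit results."""
    a, b, c, d = (to_f16(v) for v in (a, b, c, d))  # 16-bit operands
    first_product = to_f32(a * b)                   # 32-bit product
    second_product = to_f32(c * d)                  # 32-bit product
    summed = to_f32(first_product + second_product) # 32-bit sum
    return to_f32(summed + to_f32(addend))          # 32-bit result


result = mixed_precision_fma(1.5, 2.0, 0.5, 4.0, 1.0)
print(result)
```

Keeping the products and sums in 32 bits means that only the inputs lose precision to the 16-bit format; in this sketch, 0.1 is first rounded to its nearest binary16 neighbor before any arithmetic takes place.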

Type: Application

Filed: December 12, 2019

Publication date: June 17, 2021

Inventors: Silvia Melitta Mueller, Andreas Wagner, Brian W. Thompto

LOW LATENCY FLOATING-POINT DIVISION OPERATIONS

Publication number: 20210149633

Abstract: Methods and systems for division operation are described. A processor can initialize an estimated quotient between the dividend and the divisor separately from a floating-point unit (FPU) pipeline. The processor can implement the FPU pipeline to execute a refinement process that can include at least a first iteration of operations and a second iteration of operations. The refinement process can include, in the first iteration of operations, generating a first unnormalized floating-point value using the initialized estimated quotient. The refinement process can include, in the second iteration of operations, generating a second unnormalized floating-point value using the first unnormalized floating-point value. The processor can determine a final quotient based on the second unnormalized floating-point value.

Type: Application

Filed: November 14, 2019

Publication date: May 20, 2021

Inventors: Silvia Melitta Mueller, Thomas Winters Fox, Bruce Fleischer

Decimal load immediate instruction

Patent number: 10990390

Abstract: An instruction generates a value for use in processing within a computing environment. The instruction obtains a sign control associated with the instruction, and shifts an input value of the instruction in a specified direction by a selected amount to provide a result. The result is placed in a first designated location in a register, and the sign, which is based on the sign control, is placed in a second designated location of the register. The result and the sign provide a signed value to be used in processing within the computing environment.

Type: Grant

Filed: August 19, 2019

Date of Patent: April 27, 2021

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Jonathan D. Bradbury, Reid T. Copeland, Silvia Melitta Mueller

HYBRID FLOATING POINT REPRESENTATION FOR DEEP LEARNING ACCELERATION

Publication number: 20210109709

Abstract: In an embodiment, a method includes configuring a specialized circuit for floating point computations using numbers represented by a hybrid format, wherein the hybrid format includes a first format and a second format. In the embodiment, the method includes operating the further configured specialized circuit to store an approximation of a numeric value in the first format during a forward pass for training a deep learning network. In the embodiment, the method includes operating the further configured specialized circuit to store an approximation of a second numeric value in the second format during a backward pass for training the deep learning network.

Type: Application

Filed: December 21, 2020

Publication date: April 15, 2021

Applicant: International Business Machines Corporation

Inventors: Naigang Wang, Jungwook Choi, Kailash Gopalakrishnan, Ankur Agrawal, Silvia Melitta Mueller

Hybrid floating point representation for deep learning acceleration

Patent number: 10963219

Abstract: In an embodiment, a method includes configuring a specialized circuit for floating point computations using numbers represented by a hybrid format, wherein the hybrid format includes a first format and a second format. In the embodiment, the method includes operating the further configured specialized circuit to store an approximation of a numeric value in the first format during a forward pass for training a deep learning network. In the embodiment, the method includes operating the further configured specialized circuit to store an approximation of a second numeric value in the second format during a backward pass for training the deep learning network.

Type: Grant

Filed: February 6, 2019

Date of Patent: March 30, 2021

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Naigang Wang, Jungwook Choi, Kailash Gopalakrishnan, Ankur Agrawal, Silvia Melitta Mueller

INSTRUCTION HANDLING FOR ACCUMULATION OF REGISTER RESULTS IN A MICROPROCESSOR

Publication number: 20210064365

Abstract: A computer system, processor, and method for processing information is disclosed that includes at least one computer processor; a main register file associated with the at least one processor, the main register file having a plurality of entries for storing data, one or more write ports to write data to the main register file entries, and one or more read ports to read data from the main register file entries; one or more execution units including a dense math execution unit; and at least one accumulator register file having a plurality of entries for storing data. The results of the dense math execution unit in an aspect are written to the accumulator register file, preferably to the same accumulator register file entry multiple times, and the data from the accumulator register file is written to the main register file.

Type: Application

Filed: August 29, 2019

Publication date: March 4, 2021

Inventors: Brian W. Thompto, Maarten J. Boersma, Andreas Wagner, Jose E. Moreira, Hung Q. Le, Silvia Melitta Mueller, Dung Q. Nguyen

CONDITION CODE ANTICIPATOR FOR HEXADECIMAL FLOATING POINT

Publication number: 20210042088

Abstract: An aspect includes executing, by a binary based floating-point arithmetic unit of a processor, a calculation having two or more operands in hexadecimal format based on a hexadecimal floating-point (HFP) instruction and providing a condition code for a calculation result of the calculation. The floating-point arithmetic unit includes a condition code anticipator circuit that is configured to provide the condition code to the processor prior to availability of the calculation result.

Type: Application

Filed: August 9, 2019

Publication date: February 11, 2021

Inventors: Silvia Melitta Mueller, Petra Leber, Kerstin Claudia Schelm, Cedric Lichtenau

PARALLELIZED ROUNDING FOR DECIMAL FLOATING POINT TO BINARY CODED DECIMAL CONVERSION

Publication number: 20210034328

Abstract: A computer-implemented method includes: receiving, using a processor, a decimal floating point number; and using a floating point unit within the processor to convert the decimal floating point number into a binary coded decimal number, wherein the floating point unit starts a conversion loop subsequent to a rounding loop starting, wherein the rounding loop and the conversion loop run in parallel once started.

Type: Application

Filed: July 30, 2019

Publication date: February 4, 2021

Inventors: Stefan Payer, Silvia Melitta Mueller, Nicol Hofmann, Razvan Peter Figuli
