Microprocessor Patents (Class 708/510)

Systems, methods, and apparatuses for tile store

Patent number: 11288069

Abstract: Embodiments detailed herein relate to matrix operations. In particular, the loading of a matrix (tile) from memory.

Type: Grant

Filed: July 1, 2017

Date of Patent: March 29, 2022

Assignee: Intel Corporation

Inventors: Robert Valentine, Menachem Adelman, Elmoustapha Ould-Ahmed-Vall, Bret L. Toll, Milind B. Girkar, Zeev Sperber, Mark J. Charney, Rinat Rappoport, Jesus Corbal, Stanislav Shwartsman, Igor Yanover, Alexander F. Heinecke, Barukh Ziv, Dan Baum, Yuri Gebil
Generating source and destination addresses for repeated accelerator instruction

Patent number: 11182160

Abstract: A method and circuit for a data processing system provide a hardware accelerator repeat control instruction (402A) which is executed with a hardware accelerator instruction (402B) to extract and latch repeat parameters from the hardware accelerator repeat control instruction, such as a repeat count value (RPT_CNT), a source address offset value (ADDR_INCR0), and a destination address offset value (ADDR_INCR1), and to generate a command to the hardware accelerator (205) to execute the hardware accelerator instruction a specified plurality of times based on instruction parameters from the hardware accelerator instruction by using the repeat count value to track how many times the hardware accelerator instruction is executed and by automatically generating, at each execution of the hardware accelerator instruction, additional source and destination addresses for the hardware accelerator from the repeat parameters until the hardware accelerator instruction has been executed the specified plurality of times by the
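
As a rough illustration of the repeat mechanism this abstract describes, the sketch below models the address generation in Python. The parameters rpt_cnt, addr_incr0, and addr_incr1 mirror RPT_CNT, ADDR_INCR0, and ADDR_INCR1 from the abstract. The function name, the base addresses, and the assumption that the first execution uses the unmodified base addresses are hypothetical.

```python
from typing import Iterator, Tuple


def repeat_addresses(src_base: int, dst_base: int, rpt_cnt: int,
                     addr_incr0: int, addr_incr1: int) -> Iterator[Tuple[int, int]]:
    """Yield one (source, destination) address pair per repeated execution."""
    src, dst = src_base, dst_base
    for _ in range(rpt_cnt):          # the repeat count tracks the executions
        yield src, dst
        src += addr_incr0             # source address offset
        dst += addr_incr1             # destination address offset


# Hypothetical base addresses and increments, chosen only for illustration.
pairs = list(repeat_addresses(0x1000, 0x8000, rpt_cnt=3,
                              addr_incr0=0x40, addr_incr1=0x10))
```

A single repeat control instruction therefore stands in for three separately issued accelerator instructions in this example.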

Type: Grant

Filed: November 24, 2020

Date of Patent: November 23, 2021

Assignee: NXP USA, Inc.

Inventors: Maik Brett, Christian Tuschen, Sidhartha Taneja, Tejbal Prasad, Saurabh Arora, Anurag Jain, Pranshu Agrawal, Mukul Aggarwal, Ajay Sharma
Microprocessor with booth multiplication

Patent number: 10776108

Abstract: A microprocessor provides at least two storage areas and uses a datapath for Booth multiplication. According to a first and second field of a microinstruction, the datapath gets multiplicand number supply data from the first storage area and multiplier number supply data from the second storage area. The datapath operates according to a word length indicated in a third field of the microinstruction. The datapath gets multi-bit acquisitions for Booth multiplication from the multiplier number supply data. The datapath divides the multiplicand number supply data into multiplicand numbers according to the word length, and performs Booth multiplication on the multiplicand numbers based on the multi-bit acquisitions to get partial products. According to the word length, the datapath selects a part of the partial products to be shifted and added for generation of a plurality of products.
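
The following is a minimal software sketch of radix-4 Booth multiplication applied per word-length lane, in the spirit of this abstract. It assumes that the "multi-bit acquisitions" correspond to the overlapping 3-bit groups of textbook radix-4 Booth recoding, and that operands are signed two's-complement values; the abstract confirms neither. All function names are hypothetical.

```python
# Radix-4 Booth digit for the bit triple (y[i+1], y[i], y[i-1]).
BOOTH_DIGIT = (0, 1, 1, 2, -2, -1, -1, 0)


def to_signed(value, width):
    """Interpret the low `width` bits of value as two's complement."""
    value &= (1 << width) - 1
    return value - (1 << width) if value >> (width - 1) else value


def booth_multiply(x, y, width):
    """Signed width-bit multiply by summing shifted Booth partial products."""
    assert width % 2 == 0
    x = to_signed(x, width)
    padded = (y & ((1 << width) - 1)) << 1    # append the implicit y[-1] = 0
    product = 0
    for i in range(0, width, 2):
        digit = BOOTH_DIGIT[(padded >> i) & 0b111]
        product += (digit * x) << i           # shift and add the partial product
    return product


def packed_multiply(multiplicand_supply, multiplier_supply, word_len, total_bits=32):
    """Split both supply words into word_len-bit lanes and multiply lane by lane."""
    mask = (1 << word_len) - 1
    return [booth_multiply((multiplicand_supply >> s) & mask,
                           (multiplier_supply >> s) & mask, word_len)
            for s in range(0, total_bits, word_len)]
```

With word_len=16 a 32-bit supply word yields two products, and with word_len=8 it yields four, which is the software analogue of one datapath serving several word lengths.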

Type: Grant

Filed: October 18, 2018

Date of Patent: September 15, 2020

Assignee: SHANGHAI ZHAOXIN SEMICONDUCTOR CO., LTD.

Inventors: Jing Chen, Xiaoyang Li, Juanli Song, Zhenhua Huang, Weilin Wang, Jiin Lai
Microprocessor with booth multiplication

Patent number: 10754646

Abstract: A microprocessor with Booth multiplication, in which several acquisition registers are used. In a first word length, a first acquisition register stores an unsigned ending acquisition of a first multiplier number carried in multiplier number supply data, and a third acquisition register stores a starting acquisition of a second multiplier number carried in the multiplier number supply data. In a second word length that is longer than the first word length, a fourth acquisition register stores a middle acquisition of a third multiplier number carried in the multiplier number supply data. A partial product selection circuit is required for selection of a partial product, to get the partial product from Booth multiplication based on the third acquisition register (corresponding to the first word length) or based on the fourth acquisition register (corresponding to the second word length).

Type: Grant

Filed: October 18, 2018

Date of Patent: August 25, 2020

Assignee: SHANGHAI ZHAOXIN SEMICONDUCTOR CO., LTD.

Inventors: Jing Chen, Xiaoyang Li, Juanli Song, Zhenhua Huang, Weilin Wang, Jiin Lai
Computer-based square root and division operations

Patent number: 9910638

Abstract: Square root operations in a computer processor are disclosed. A first iteration for calculating partial results of a square root operation is performed in a larger number of cycles than remaining iterations. The first iteration requires calculation of a first digit that is larger than the subsequent digits. The first iteration thus requires multiplication of values that are larger than corresponding values for the subsequent other digits. By splitting the first digit into two parts, the required multiplications can be performed in less time than if the first digit were not split. Performing these multiplications in less time reduces the total delay for clock cycles associated with the first digit calculations, which increases the possible clock frequency allowed. A multiply-and-accumulate unit that performs either packed-single operations or double-precision operations may be used, along with a combined division/square root unit for simultaneous execution of division and square root operations.

Type: Grant

Filed: August 25, 2016

Date of Patent: March 6, 2018

Assignee: Advanced Micro Devices, Inc.

Inventors: Hanbing Liu, John Kelley, Michael Estlick, Erik Swanson, Jay Fleischman
Method and apparatus for implementing a multiple operand vector floating point summation to scalar function

Patent number: 8239438

Abstract: Embodiments of the invention provide methods and apparatus for executing a multiple operand instruction. Executing the multiple operand instruction comprises computing an arithmetic result of a pair of operands in each processing lane of a vector unit. The arithmetic results generated in each processing lane of the vector unit may be transferred to a dot product unit. The dot product unit may compute an arithmetic result using the arithmetic result computed by each processing lane of the vector unit to generate an arithmetic result of more than two operands.
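
A small sketch of the two-stage flow in this abstract: each lane combines a pair of operands, and a final reduction combines the lane results into one scalar. Using addition in both stages is an assumption taken from the title ("summation"); the abstract itself only says "arithmetic result". The function names and the four-lane width are hypothetical.

```python
def lane_results(a, b):
    """Stage 1: every processing lane combines one pair of operands."""
    return [x + y for x, y in zip(a, b)]


def sum_to_scalar(a, b):
    """Stage 2: the reduction stage folds all lane results into one scalar."""
    total = 0.0
    for lane_value in lane_results(a, b):
        total += lane_value
    return total


# Eight operands in total, summed by four lanes plus one reduction.
result = sum_to_scalar([1.0, 2.0, 3.0, 4.0], [0.5, 0.5, 0.5, 0.5])
```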

Type: Grant

Filed: August 17, 2007

Date of Patent: August 7, 2012

Assignee: International Business Machines Corporation

Inventors: Adam James Muff, Matthew Ray Tubbs
FAST FLOATING POINT RESULT FORWARDING USING NON-ARCHITECTED DATA FORMAT

Publication number: 20110060785

Abstract: A microprocessor having an instruction set architecture (ISA) that specifies at least one architected data format (ADF) for floating-point operands. The microprocessor includes a plurality of floating-point units, each comprising an arithmetic unit configured to receive non-ADF source operands and to perform a floating-point operation on the non-ADF source operands to generate a non-ADF result. The microprocessor also includes forwarding buses, configured to forward the non-ADF result generated by each arithmetic unit of the plurality of floating-point units to each of the plurality of floating-point units for selective use as one of the non-ADF source operands.

Type: Application

Filed: June 22, 2010

Publication date: March 10, 2011

Applicant: VIA TECHNOLOGIES, INC.

Inventors: G. Glenn Henry, Terry Parks
Efficient error-check and exact-check for Newton-Raphson divide and square-root operations

Patent number: 7899859

Abstract: One embodiment of the present invention provides a system that performs both error-check and exact-check operations for a Newton-Raphson divide or square-root computation. During operation, the system performs Newton-Raphson iterations followed by a multiply for a divide or a square-root operation to produce a result, which includes one or more additional bits of accuracy beyond a desired accuracy for the result. Next, the system rounds the result to the desired accuracy to produce a rounded result t. The system then analyzes the additional bits of accuracy to determine whether t is correct and whether t is exact.
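
The sketch below illustrates the general flow only: Newton-Raphson reciprocal refinement, a final multiply, rounding to a narrower target precision, and two checks. It makes several assumptions that are not from the patent: the result is computed in double precision and rounded to a hypothetical 40-bit target, the error check simply tests whether the extra bits sit near a rounding boundary, and the exact check uses exact rational back-multiplication as a stand-in for the patent's analysis of the extra bits.

```python
import math
from fractions import Fraction


def nr_divide(a, d, iterations=4):
    """Approximate a / d by refining a reciprocal, then multiplying."""
    if d == 0:
        raise ZeroDivisionError("division by zero")
    m, e = math.frexp(abs(d))            # abs(d) == m * 2**e, 0.5 <= m < 1
    x = 48 / 17 - (32 / 17) * m          # linear first guess for 1 / m
    for _ in range(iterations):
        x = x * (2.0 - m * x)            # each step roughly squares the error
    q = a * math.ldexp(x, -e)
    return -q if d < 0 else q


def round_to_bits(q, bits):
    """Round q to `bits` significant bits; also return the discarded fraction."""
    m, e = math.frexp(q)
    scaled = math.ldexp(m, bits)
    n = round(scaled)
    return math.ldexp(n, e - bits), scaled - n


def check_divide(a, d, bits=40, tol=2.0 ** -8):
    q = nr_divide(a, d)                  # carries extra bits beyond `bits`
    t, frac = round_to_bits(q, bits)
    # Error check: the rounding is trusted unless the extra bits lie within
    # `tol` of the halfway point between two representable results.
    correct = abs(abs(frac) - 0.5) > tol
    # Exact check (stand-in technique): exact rational back-multiplication.
    exact = Fraction(t) * Fraction(d) == Fraction(a)
    return t, correct, exact
```

For example, check_divide(6.0, 3.0) rounds the slightly inexact Newton-Raphson result to exactly 2.0 and reports it as exact, while check_divide(1.0, 3.0) reports an inexact result.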

Type: Grant

Filed: December 20, 2005

Date of Patent: March 1, 2011

Assignee: Oracle America, Inc.

Inventors: Allen Lyu, Leonard D. Rarick
SYSTEM AND METHOD FOR STORING NUMBERS IN FIRST AND SECOND FORMATS IN A REGISTER FILE

Publication number: 20100095099

Abstract: A system and a method for storing numbers in a register file are provided. The system and the method store single precision numbers in double precision format in a register file that is shared between floating point computational units and computational units not supporting floating point numbers.
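
To show why a single precision number can live in a double precision register slot without loss, the sketch below widens a single precision value and prints both bit patterns. It uses IEEE 754 binary32 and binary64 encodings as an assumption; the abstract does not name the formats' bit layouts. The function names are hypothetical.

```python
import struct


def single_bits(x):
    """Bit pattern of x rounded to single precision."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def single_as_double_bits(x):
    """Bit pattern of the same single precision value held in double format."""
    as_single = struct.unpack("<f", struct.pack("<f", x))[0]   # round to single
    return struct.unpack("<Q", struct.pack("<d", as_single))[0]


print(hex(single_bits(1.5)), hex(single_as_double_bits(1.5)))
```

Because the double format has a wider exponent and significand, the widened value always has its low 29 significand bits clear.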

Type: Application

Filed: October 14, 2008

Publication date: April 15, 2010

Applicant: International Business Machines Corporation

Inventors: Maarten Boersma, Michael Kroener, Petra Leber, Silvia M. Mueller, Jochen Preiss, Kerstin Schelm
DETECTION OF POTENTIAL NEED TO USE A LARGER DATA FORMAT IN PERFORMING FLOATING POINT OPERATIONS

Publication number: 20080270508

Abstract: Detection of whether a result of a floating point operation is safe. Characteristics of the result are examined to determine whether the result is safe or potentially unsafe, as defined by the user. An instruction is provided to facilitate detection of safe or potentially unsafe results.

Type: Application

Filed: April 25, 2007

Publication date: October 30, 2008

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Shawn D. Lundvall, Ronald M. Smith, Phil C. Yeh
EXTRACT BIASED EXPONENT OF DECIMAL FLOATING POINT DATA

Publication number: 20080270509

Abstract: A decimal floating point finite number in a decimal floating point format is composed from the number in a different format. A decimal floating point format includes fields to hold information relating to the sign, exponent and significand of the decimal floating point finite number. Other decimal floating point data, including infinities and NaNs (not a number), are also composed. Decimal floating point data are also decomposed from the decimal floating point format to a different format. For composition and decomposition, one or more instructions may be employed, including an insert biased exponent or extract biased exponent instruction.
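
A rough software analogue of the extract and insert biased exponent operations, using Python's decimal module. The bias of 398 is that of the IEEE 754-2008 decimal64 format and is an assumption about which format is in use. The negative return codes for infinities and NaNs are hypothetical placeholders, and both function names are invented for illustration.

```python
from decimal import Decimal

DECIMAL64_BIAS = 398   # exponent bias of IEEE 754-2008 decimal64 (assumed format)


def extract_biased_exponent(value: Decimal) -> int:
    """Return the biased exponent of a finite value, or a special code."""
    if value.is_infinite():
        return -1                      # placeholder code for infinity
    if value.is_qnan():
        return -2                      # placeholder code for a quiet NaN
    if value.is_snan():
        return -3                      # placeholder code for a signaling NaN
    return value.as_tuple().exponent + DECIMAL64_BIAS


def insert_biased_exponent(coefficient: int, biased_exponent: int) -> Decimal:
    """Compose a finite decimal from an integer coefficient and a biased exponent."""
    return Decimal(coefficient).scaleb(biased_exponent - DECIMAL64_BIAS)
```

Extracting from Decimal("1.23") gives 396, and inserting 396 with coefficient 123 composes the same number again.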

Type: Application

Filed: June 29, 2007

Publication date: October 30, 2008

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Shawn D. Lundvall, Eric M. Schwarz, Ronald M. Smith, Phil C. Yeh
Economical on-the-fly rounding for digit-recurrence algorithms

Patent number: 6792443

Abstract: Apparatus and methods are provided for an improved on-the-fly rounding technique for digit-recurrence algorithms, such as division and square root calculations. According to one embodiment, only two forms of an intermediate result of an operation to be performed by a digit-recurrence algorithm are maintained. A first form is maintained in a first register and a second form is maintained in a second register. Responsive to receiving digits 1 to L−2 of the intermediate result from a digit recurrence unit, where L represents a number of digits that satisfies a predetermined precision for the operation, both forms of the intermediate result are updated by register swapping or concatenation under the control of load and shift control logic and on-the-fly conversion logic. Then, a rounded result is generated by determining digits $d_{L-1}$ and $d_L$ and appending a rounded last digit to the appropriate form of the intermediate result.
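
For background, the sketch below shows the textbook on-the-fly conversion that keeps two forms of the partial result, Q and QM = Q - 1, and updates both by concatenation only. It is not the rounding scheme claimed in the patent, whose handling of the last two digits is not reproduced here. The radix of 4 and the signed digit set are assumptions.

```python
def on_the_fly_convert(digits, radix=4):
    """Convert signed digits (most significant first) without carry propagation.

    Returns (q, qm) with qm == q - 1. Every update appends one ordinary digit
    in the range 0..radix-1 to either q or qm, so no subtraction ripples
    through the digits already produced.
    """
    q, qm = 0, -1
    for d in digits:
        if d >= 0:
            new_q = q * radix + d                  # append d to Q
        else:
            new_q = qm * radix + (radix + d)       # append radix - |d| to QM
        if d > 0:
            new_qm = q * radix + (d - 1)           # append d - 1 to Q
        else:
            new_qm = qm * radix + (radix - 1 + d)  # append radix - 1 - |d| to QM
        q, qm = new_q, new_qm
    return q, qm


# Digits 1, -2, 2, -1 in radix 4 represent ((1*4 - 2)*4 + 2)*4 - 1 = 39.
converted = on_the_fly_convert([1, -2, 2, -1])
```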

Type: Grant

Filed: June 29, 2001

Date of Patent: September 14, 2004

Assignee: Intel Corporation

Inventor: Ping Tak Peter Tang
Higher precision divide and square root approximations

Publication number: 20030149712

Abstract: A floating point unit includes a multiplier, an approximation circuit, and a control circuit coupled to the multiplier and the approximation circuit. The approximation circuit is configured to generate an approximation of a difference of the first result from the multiplier and a constant. The control circuit is configured to approximate a function specified by a floating point instruction provided to the floating point unit for execution using an approximation algorithm. The approximation algorithm comprises at least two iterations through the multiplier and optionally the approximation circuit. The control circuit is configured to correct the approximation from the approximation circuit from a first iteration of the approximation algorithm during a second iteration of the approximation algorithm by supplying a correction vector to the multiplier during the second iteration. The multiplier is configured to incorporate the correction vector into the first result during the second iteration.

Type: Application

Filed: February 1, 2002

Publication date: August 7, 2003

Inventors: Robert Rogenmoser, Michael C. Kim
Circuitry for carrying out at least one of a square root and a division operation

Publication number: 20030131035

Abstract: The invention provides circuitry for carrying out at least one of a square root operation and a division operation. The circuitry utilizes a carry save adder and a carry propagate adder part. The carry save adder and the carry propagate adder part are arranged in parallel.

Type: Application

Filed: November 7, 2002

Publication date: July 10, 2003

Inventor: Tariq Kurd
Circuitry for carrying out square root and division operations

Publication number: 20030126175

Abstract: The invention provides circuitry for carrying out a square root operation and a division operation. The circuitry utilizes common iteration circuitry for carrying out a plurality of iterations and means for identifying if a square root operation or a division operation is to be performed. The iteration circuitry is controlled in accordance with whether a square root or division operation is to be performed.

Type: Application

Filed: November 8, 2002

Publication date: July 3, 2003

Inventor: Tariq Kurd
Method for determining the square root of a long-bit number using a short-bit processor

Publication number: 20030028573

Abstract: In a method for determining the square root of a long-bit number using a short-bit processor, the long-bit number is assumed to be $c \times 2^{2k} + d$, where $c, d < 2^{2k}$, and its solution is assumed to be $(a \times 2^{k} + b)^2$. The value of $a$ is determined by using a bisection method to obtain the floor value of the square root of $c$. In order to obtain the value of $b$, a successive substitution equation is derived: $b[n] = (c - a^2) \times 2^{2k} + (d - b[n-1]^2) / 2^{2(k+1)}$. An initial value is given to $b$, and the successive substitution equation is executed recursively several times until it converges.
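
The sketch below follows the outline of this abstract: split the number into a high part c and a low part d, find a as the floor square root of c by bisection, then refine b by successive substitution. The recurrence used here is derived by expanding (a * 2^k + b)^2 and solving for b, which gives a divisor of a * 2^(k+1); that divisor is an assumption and may differ from the formula printed in the abstract. The final integer cleanup loop, the iteration count, and all names are also additions for illustration.

```python
import math


def floor_sqrt_bisect(c):
    """Floor of sqrt(c) by bisection; invariant: lo*lo <= c < hi*hi."""
    lo, hi = 0, c + 1
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if mid * mid <= c:
            lo = mid
        else:
            hi = mid
    return lo


def split_sqrt(n, k, iterations=8):
    """Floor square root of n = c * 2**(2k) + d, assuming c >= 1."""
    c, d = n >> (2 * k), n & ((1 << (2 * k)) - 1)
    a = floor_sqrt_bisect(c)
    if a == 0:
        raise ValueError("this sketch assumes the high part c is at least 1")
    big_a = a << k                                # a * 2**k
    rest = ((c - a * a) << (2 * k)) + d           # n - (a * 2**k)**2
    b = 0                                         # initial value for b
    for _ in range(iterations):
        b = (rest - b * b) // (2 * big_a)         # divisor is a * 2**(k + 1)
    # Integer cleanup (an addition): settle on the exact floor value.
    while (big_a + b) ** 2 > n:
        b -= 1
    while (big_a + b + 1) ** 2 <= n:
        b += 1
    return big_a + b
```

With k = 16, a 64-bit input is handled through a high part and a low part of at most 32 bits each, which is the sense in which a short-bit processor can process a long-bit number.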

Type: Application

Filed: October 19, 2001

Publication date: February 6, 2003

Inventor: Sheng-Hung Wu
Rapid execution of floating point load control word instructions

Patent number: 6405305

Abstract: A microprocessor with a floating point unit configured to rapidly execute floating point load control word (FLDCW) type instructions in an out of program order context is disclosed. The floating point unit is configured to schedule instructions older than the FLDCW-type instruction before the FLDCW-type instruction is scheduled. The FLDCW-type instruction acts as a barrier to prevent instructions occurring after the FLDCW-type instruction in program order from executing before the FLDCW-type instruction. Indicator bits may be used to simplify instruction scheduling, and copies of the floating point control word may be stored for instructions that have long execution cycles. A method and computer configured to rapidly execute FLDCW-type instructions in an out of program order context are also disclosed.

Type: Grant

Filed: September 10, 1999

Date of Patent: June 11, 2002

Assignee: Advanced Micro Devices, Inc.

Inventors: Stephan G. Meier, Jeffrey E. Trull, Derrick R. Meyer, Norbert Juffa
System and method for floating-point computation

Patent number: 6289365

Abstract: A system is disclosed for performing floating point computation in connection with numbers in a base floating point representation (such as the representation defined in IEEE Std. 754) that defines a plurality of formats, including a normalized format and a de-normalized format, using a common floating point representation that defines a unitary normalized format. The system includes a base to common representation converter, a processor and a common to base representation converter. The base to common representation converter converts numbers from the base floating point representation to the common floating point representation, so that all numbers involved in a computation will be expressed in the unitary normalized format. The processor is configured to perform a mathematical operation of at least one predetermined type in connection with the converted numbers generated by the base to common representation converter to generate a floating point result in the common representation.
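
A minimal sketch of the idea of converting every finite, non-zero number into one normalized form, including numbers that the base representation stores as denormals. It assumes IEEE 754 double precision as the base representation and models the common representation as a (sign, exponent, significand) tuple with an unbounded Python integer exponent; both choices, and the function names, are hypothetical.

```python
import math


def to_common(x):
    """Return (sign, exponent, significand) with 1 <= significand < 2."""
    if x == 0.0 or math.isinf(x) or math.isnan(x):
        raise ValueError("this sketch covers finite, non-zero values only")
    m, e = math.frexp(abs(x))            # abs(x) == m * 2**e, 0.5 <= m < 1
    return (1 if x < 0 else 0, e - 1, m * 2.0)


def to_base(sign, exponent, significand):
    """Convert a common-representation tuple back to a double."""
    return math.ldexp(-significand if sign else significand, exponent)


# The smallest positive denormal double becomes 1.0 * 2**-1074.
smallest = to_common(5e-324)
```

Once operands are in this form, arithmetic code no longer needs a separate path for denormal inputs.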

Type: Grant

Filed: December 9, 1997

Date of Patent: September 11, 2001

Assignee: Sun Microsystems, Inc.

Inventor: Guy L. Steele, Jr.
Conversion from packed floating point data to packed 8-bit integer data in different architectural registers

Publication number: 20010016902

Abstract: A method and instruction for converting a number from a floating point format to an integer format are described. Numbers are stored in the floating point format in a register of a first set of architectural registers in a packed format. At least one of the numbers in the floating point format is converted to at least one 8-bit number in the integer format. The 8-bit number in the integer format is placed in a register of a second set of architectural registers in the packed format.
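
The conversion can be pictured with the sketch below, which turns four packed single precision values into four packed signed 8-bit integers. The lane count of four, round-to-nearest-even rounding, saturation to the range -128..127, and little-endian packing are all assumptions; the abstract specifies none of them. NaN and infinite inputs are not handled.

```python
import struct


def convert_packed(packed_floats: bytes) -> bytes:
    """Convert four packed float32 values into four packed int8 values."""
    values = struct.unpack("<4f", packed_floats)
    # round() uses round-half-to-even; the clamp models saturation.
    clamped = [max(-128, min(127, round(v))) for v in values]
    return struct.pack("<4b", *clamped)


source = struct.pack("<4f", 1.4, -2.6, 200.0, -300.0)
result = convert_packed(source)
```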

Type: Application

Filed: April 27, 2001

Publication date: August 23, 2001

Inventors: Mohammad A.F. Abdallah, Hsien-Cheng E. Hsieh, Thomas R. Huff, Vladimir Pentkovski, Patrice Roussel, Shreekant S. Thakkar
Piping rounding mode bits with floating point instructions to eliminate serialization

Patent number: 6233672

Abstract: A floating point unit is provided which conveys the rounding mode in effect upon dispatch of a particular instruction with that particular instruction into the execution pipeline of the floating point unit. Upon dispatch of a control word update instruction into the execution pipeline, the rounding mode is updated according to the updated control word provided for the control word update instruction. Instructions subsequent to the control word update instruction thereby receive the updated rounding mode as those instructions are dispatched. The updated rounding mode is available to the subsequent instructions prior to retiring the control word update instruction. The rounding mode is therefore updated without serializing the update. If the control word update instruction modifies the value in a field other than the rounding mode, the instructions subsequent to the control word update instruction may be discarded and re-executed subsequent to updating the control word register with the updated control word.

Type: Grant

Filed: March 6, 1997

Date of Patent: May 15, 2001

Assignee: Advanced Micro Devices, Inc.

Inventor: Thomas W. Lynch
Apparatus and method useful for evaluating periodic functions

Patent number: 6141670

Abstract: A computer and a method of using the computer to reduce an original argument to obtain a periodic function of the argument. A special number $P_j$ is employed that is close to a nontrivial even-integral multiple of $\pi$. The technique subtracts a non-negative integral multiple of $P_j$ from the original argument to obtain a first reduced argument. Then, a second non-negative integer multiple of a floating-point representation of $\pi/2$ is subtracted from the first reduced argument to obtain a second reduced argument. Next, a periodic function of a third argument equal to a sum of the second reduced argument plus the product of the first non-negative integral multiple and a floating-point representation of an offset $\delta_j$ is evaluated to obtain a result.
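
A simplified sketch of the first and last steps described in this abstract. It assumes a special number of 710.0, which is close to 226 times pi, and takes the offset to be the difference between that number and 226 times pi; both are hypothetical choices for illustration and are not taken from the patent. The intermediate subtraction of a multiple of pi/2 is omitted, and the library sine is left to handle the already small argument.

```python
import math
from fractions import Fraction

PI = Fraction("3.14159265358979323846264338327950288")  # pi to 35 decimal places
P_J = 710.0                                  # hypothetical special number, near 226 * pi
DELTA_J = float(Fraction(710) - 226 * PI)    # offset: P_J minus the true multiple of pi


def reduced_sin(x):
    """sin(x) for moderately large positive x using the special-number reduction."""
    n = math.floor(x / P_J)                  # non-negative integral multiple
    first_reduced = x - n * P_J              # exact here: n * P_J is a small integer
    return math.sin(first_reduced + n * DELTA_J)   # add back the accumulated offset


value = reduced_sin(1e6)
```

Because P_J has few significant bits, the product n * P_J is exact, so the only rounding comes from the small correction term n * DELTA_J.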

Type: Grant

Filed: September 30, 1997

Date of Patent: October 31, 2000

Assignee: Intel Corporation

Inventors: Shane A. Story, Ping Tak Peter Tang
Apparatus and method for absolute floating point register addressing

Patent number: 6134573

Abstract: An apparatus and method for improving the execution of floating point instructions in a microprocessor is provided. During decode of a floating point instruction, translation logic generates absolute addresses of specified registers in a floating point register file. These absolute references, as opposed to relative references to a top-of-stack, are inserted into associated micro instructions. In the event of an exception, synchronization logic provides an architected top-of-stack for the floating point instruction associated with the exception to the translation logic so that subsequent instructions will properly reference floating point registers.

Type: Grant

Filed: April 20, 1998

Date of Patent: October 17, 2000

Assignee: IP-First, L.L.C.

Inventors: G. Glenn Henry, Albert J. Loper, Jr., Terry Parks
System and method for floating-point computation for numbers in delimited floating point representation

Patent number: 6131106

Abstract: Floating point numbers and other values are represented in a "delimited" representation in which all numbers, including those which would in the IEEE Std. 754 representation, be in the de-normalized format, are in a format which is normalized with an implicit most significant digit having the value "one." For numbers which would, in the IEEE Std.

Type: Grant

Filed: January 30, 1998

Date of Patent: October 10, 2000

Assignee: Sun Microsystems, Inc.

Inventor: Guy L. Steele, Jr.
Method and apparatus for trading performance for precision when processing denormal numbers in a computer system

Patent number: 6105047

Abstract: An apparatus to improve the speed of handling of denormal numbers in a computer system, the apparatus comprising a mode bit and a selector, the mode bit set when denormals are to be replaced by zero, the selector having a first input and an output, the first input comprising a floating point number, the selector selecting zero to become the output when the floating point number is denormal and the mode bit is set, the selector selecting the floating point number to become the output otherwise.
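
The selector in this abstract can be mimicked in a few lines. The sketch assumes IEEE 754 double precision operands and treats any non-zero value below the smallest normal number as denormal. Preserving the sign of the replaced value is an assumption; the abstract only says that zero is selected.

```python
import math
import sys


def daz_select(x: float, mode_bit: bool) -> float:
    """Pass x through unless it is denormal and the mode bit is set."""
    is_denormal = x != 0.0 and abs(x) < sys.float_info.min
    if mode_bit and is_denormal:
        return math.copysign(0.0, x)     # keeping the sign is an assumption
    return x
```

With the mode bit clear, daz_select(5e-324, False) returns the denormal unchanged; with it set, the same input yields zero and downstream hardware never sees a denormal operand.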

Type: Grant

Filed: November 9, 1998

Date of Patent: August 15, 2000

Assignee: Intel Corporation

Inventors: Harshvardhan Sharangpani, Roger Golliver
Floating-point processor with operand-format precision greater than execution precision

Patent number: 6029243

Abstract: A floating-point processor nominally capable of single and double, but not extended, precision execution stores operands in extended-precision format. A format converter converts single and double precision source values to extended-precision format. Trap logic checks the apparent precision of the extended-precision operands and the requested result precision to determine whether the floating-point processor can execute the requested operation and yield the appropriate result. If the maximum of the requested precision and the maximum apparent precision of the operands is single or double, the requested operation is executed in hardware. Otherwise, a trap is issued to call an extended precision floating-point subroutine. This approach augments the class of operations that can be handled in hardware by a double-precision floating-point processor, and thus improves the floating-point computational throughput of an incorporating computer system.
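
The trap decision can be sketched as follows. Operands are modelled only by their integer significands, and "apparent precision" is taken to mean the narrowest format whose significand width holds all significant bits; the exponent range is ignored. These modelling choices, the significand widths of 24, 53, and 64 bits, and the function names are assumptions.

```python
ORDER = ["single", "double", "extended"]
SIGNIFICAND_BITS = {"single": 24, "double": 53, "extended": 64}


def apparent_precision(significand: int) -> str:
    """Narrowest format that holds the operand's significant bits."""
    while significand and significand % 2 == 0:
        significand //= 2                # trailing zeros carry no information
    bits = significand.bit_length()
    for name in ORDER:
        if bits <= SIGNIFICAND_BITS[name]:
            return name
    raise ValueError("significand wider than the extended format")


def dispatch(requested: str, operand_significands) -> str:
    """Execute in hardware unless any precision involved is extended."""
    involved = [requested] + [apparent_precision(s) for s in operand_significands]
    needed = max(involved, key=ORDER.index)
    return "trap" if needed == "extended" else "hardware"
```

An operand stored in extended format but holding the value 1.0 has an apparent precision of single, so a double precision operation on it still runs in hardware.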

Type: Grant

Filed: September 19, 1997

Date of Patent: February 22, 2000

Assignee: VLSI Technology, Inc.

Inventors: Timothy A. Pontius, Kenneth A. Dockser
Partitioning of binary quad word format multiply instruction on S/390 processor

Patent number: 6021422

Abstract: There is a unique partitioning problem in determining how to execute the floating point multiply instruction defined by IEEE 754 standard for the quad word format on a S/390 processor. Several manufacturers including IBM and HP define the binary quad word format to have a 113 bit significand. IBM S/390 hexadecimal long floating point format has a 56 bit significand and most S/390 floating point units only contain a long format multiplier. Quad word format multiplication must be executed as a series of several long precision multiplications and extended precision or long precision additions. The S/390 hexadecimal quad word format is easier to implement than binary format since it has a 112 bit significand and can easily be partitioned into two 56 bit parts. But a 113 bit significand would just exceed two partitions and require a third.
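
The partitioning problem can be made concrete with a short sketch that splits a significand into 56-bit pieces and multiplies piece by piece. It uses plain schoolbook accumulation of partial products, which is not the partitioning scheme of the patent, and the function names are hypothetical.

```python
LIMB_BITS = 56    # width of the long-format multiplier named in the abstract


def split_limbs(significand, limb_bits=LIMB_BITS):
    """Split an integer significand into limb_bits-wide pieces, low piece first."""
    limbs = []
    while significand:
        limbs.append(significand & ((1 << limb_bits) - 1))
        significand >>= limb_bits
    return limbs or [0]


def multiply_by_limbs(x, y):
    """Multiply two significands using only limb-sized partial products."""
    total = 0
    for i, xi in enumerate(split_limbs(x)):
        for j, yj in enumerate(split_limbs(y)):
            total += (xi * yj) << (LIMB_BITS * (i + j))
    return total


hex_quad = (1 << 112) - 1      # 112-bit significand: two pieces
bin_quad = (1 << 113) - 1      # 113-bit significand: a third piece is needed
```

The single extra bit of the binary format raises the piece count from two to three, so a naive scheme needs nine partial products instead of four.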

Type: Grant

Filed: March 5, 1998

Date of Patent: February 1, 2000

Assignee: International Business Machines Corporation

Inventor: Eric Mark Schwarz