Microprocessor Patents (Class 708/510)
-
Patent number: 11288069
Abstract: Embodiments detailed herein relate to matrix operations. In particular, the loading of a matrix (tile) from memory.
Type: Grant
Filed: July 1, 2017
Date of Patent: March 29, 2022
Assignee: Intel Corporation
Inventors: Robert Valentine, Menachem Adelman, Elmoustapha Ould-Ahmed-Vall, Bret L. Toll, Milind B. Girkar, Zeev Sperber, Mark J. Charney, Rinat Rappoport, Jesus Corbal, Stanislav Shwartsman, Igor Yanover, Alexander F. Heinecke, Barukh Ziv, Dan Baum, Yuri Gebil
-
Patent number: 11182160
Abstract: A method and circuit for a data processing system provide a hardware accelerator repeat control instruction (402A) which is executed with a hardware accelerator instruction (402B) to extract and latch repeat parameters from the hardware accelerator repeat control instruction, such as a repeat count value (RPT_CNT), a source address offset value (ADDR_INCR0), and a destination address offset value (ADDR_INCR1), and to generate a command to the hardware accelerator (205) to execute the hardware accelerator instruction a specified plurality of times based on instruction parameters from the hardware accelerator instruction by using the repeat count value to track how many times the hardware accelerator instruction is executed and by automatically generating, at each execution of the hardware accelerator instruction, additional source and destination addresses for the hardware accelerator from the repeat parameters until the hardware accelerator instruction has been executed the specified plurality of times by the
Type: Grant
Filed: November 24, 2020
Date of Patent: November 23, 2021
Assignee: NXP USA, Inc.
Inventors: Maik Brett, Christian Tuschen, Sidhartha Taneja, Tejbal Prasad, Saurabh Arora, Anurag Jain, Pranshu Agrawal, Mukul Aggarwal, Ajay Sharma
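The address generation described in the abstract above can be pictured with a small sketch. It assumes the offsets are added once per repetition, starting from a base address pair; the function name and the base-address parameters are hypothetical, and only RPT_CNT, ADDR_INCR0 and ADDR_INCR1 come from the abstract.

```python
def repeat_addresses(src_base, dst_base, rpt_cnt, addr_incr0, addr_incr1):
    """Return the (source, destination) address pair used by each repetition.

    Assumption: repetition i uses base + i * increment. The abstract only says
    that additional addresses are generated from the repeat parameters.
    """
    return [
        (src_base + i * addr_incr0, dst_base + i * addr_incr1)
        for i in range(rpt_cnt)
    ]


if __name__ == "__main__":
    # Print the address pairs for three repetitions.
    for src, dst in repeat_addresses(0x1000, 0x2000, 3, 16, 32):
        print(hex(src), hex(dst))
```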
-
Patent number: 10776108
Abstract: A microprocessor provides at least two storage areas and uses a datapath for Booth multiplication. According to a first and second field of a microinstruction, the datapath gets multiplicand number supply data from the first storage area and multiplier number supply data from the second storage area. The datapath operates according to a word length indicated in a third field of the microinstruction. The datapath gets multi-bit acquisitions for Booth multiplication from the multiplier number supply data. The datapath divides the multiplicand number supply data into multiplicand numbers according to the word length, and performs Booth multiplication on the multiplicand numbers based on the multi-bit acquisitions to get partial products. According to the word length, the datapath selects a part of the partial products to be shifted and added for generation of a plurality of products.
Type: Grant
Filed: October 18, 2018
Date of Patent: September 15, 2020
Assignee: SHANGHAI ZHAOXIN SEMICONDUCTOR CO., LTD.
Inventors: Jing Chen, Xiaoyang Li, Juanli Song, Zhenhua Huang, Weilin Wang, Jiin Lai
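For readers unfamiliar with Booth multiplication, the following is a minimal sketch of the textbook radix-4 variant for a single word length. It is not the patented datapath: the function names are hypothetical, and reading the abstract's "multi-bit acquisitions" as the overlapping 3-bit groups of the multiplier is an assumption.

```python
# Radix-4 Booth table: 3-bit group (b[2i+1], b[2i], b[2i-1]) -> signed digit.
BOOTH_DIGIT = {
    0b000: 0, 0b001: 1, 0b010: 1, 0b011: 2,
    0b100: -2, 0b101: -1, 0b110: -1, 0b111: 0,
}


def booth_radix4_digits(multiplier, width):
    """Recode a two's-complement multiplier of `width` bits (width even)
    into radix-4 Booth digits, least significant digit first."""
    assert width % 2 == 0
    padded = (multiplier & ((1 << width) - 1)) << 1  # implicit 0 below the LSB
    return [BOOTH_DIGIT[(padded >> (2 * i)) & 0b111] for i in range(width // 2)]


def booth_multiply(multiplicand, multiplier, width):
    """Sum the shifted partial products selected by the Booth digits."""
    digits = booth_radix4_digits(multiplier, width)
    return sum((d * multiplicand) << (2 * i) for i, d in enumerate(digits))


if __name__ == "__main__":
    print(booth_radix4_digits(-3, 8))
    print(booth_multiply(7, -3, 8))
```

Each digit selects one partial product from 0, ±1 or ±2 times the multiplicand, so an n-bit multiplier needs only n/2 partial products.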
-
Patent number: 10754646
Abstract: A microprocessor with Booth multiplication, in which several acquisition registers are used. In a first word length, a first acquisition register stores an unsigned ending acquisition of a first multiplier number carried in multiplier number supply data, and a third acquisition register stores a starting acquisition of a second multiplier number carried in the multiplier number supply data. In a second word length that is longer than the first word length, a fourth acquisition register stores a middle acquisition of a third multiplier number carried in the multiplier number supply data. A partial product selection circuit is required for selection of a partial product, to get the partial product from Booth multiplication based on the third acquisition register (corresponding to the first word length) or based on the fourth acquisition register (corresponding to the second word length).
Type: Grant
Filed: October 18, 2018
Date of Patent: August 25, 2020
Assignee: SHANGHAI ZHAOXIN SEMICONDUCTOR CO., LTD.
Inventors: Jing Chen, Xiaoyang Li, Juanli Song, Zhenhua Huang, Weilin Wang, Jiin Lai
-
Patent number: 9910638
Abstract: Square root operations in a computer processor are disclosed. A first iteration for calculating partial results of a square root operation is performed in a larger number of cycles than remaining iterations. The first iteration requires calculation of a first digit that is larger than the subsequent digits. The first iteration thus requires multiplication of values that are larger than corresponding values for the subsequent other digits. By splitting the first digit into two parts, the required multiplications can be performed in less time than if the first digit were not split. Performing these multiplications in less time reduces the total delay for clock cycles associated with the first digit calculations, which increases the possible clock frequency allowed. A multiply-and-accumulate unit that performs either packed-single operations or double-precision operations may be used, along with a combined division/square root unit for simultaneous execution of division and square root operations.
Type: Grant
Filed: August 25, 2016
Date of Patent: March 6, 2018
Assignee: Advanced Micro Devices, Inc.
Inventors: Hanbing Liu, John Kelley, Michael Estlick, Erik Swanson, Jay Fleischman
-
Patent number: 8239438
Abstract: Embodiments of the invention provide methods and apparatus for executing a multiple operand instruction. Executing the multiple operand instruction comprises computing an arithmetic result of a pair of operands in each processing lane of a vector unit. The arithmetic results generated in each processing lane of the vector unit may be transferred to a dot product unit. The dot product unit may compute an arithmetic result using the arithmetic result computed by each processing lane of the vector unit to generate an arithmetic result of more than two operands.
Type: Grant
Filed: August 17, 2007
Date of Patent: August 7, 2012
Assignee: International Business Machines Corporation
Inventors: Adam James Muff, Matthew Ray Tubbs
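A rough software analogy of the two-stage flow in the abstract above: each lane combines one operand pair, and a separate unit then combines the lane results. The sketch assumes the per-lane operation is a multiply and the final operation is a sum, neither of which the abstract states; all names are hypothetical.

```python
def lane_results(a, b):
    """One arithmetic result per processing lane (assumed here: a product)."""
    return [x * y for x, y in zip(a, b)]


def dot_product_unit(results):
    """Combine the per-lane results into one value (assumed here: a sum)."""
    return sum(results)


def multi_operand(a, b):
    """Result that depends on more than two operands in one pass."""
    return dot_product_unit(lane_results(a, b))


if __name__ == "__main__":
    print(multi_operand([1, 2, 3, 4], [5, 6, 7, 8]))
```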
-
Publication number: 20110060785
Abstract: A microprocessor having an instruction set architecture (ISA) that specifies at least one architected data format (ADF) for floating-point operands. The microprocessor includes a plurality of floating-point units, each comprising an arithmetic unit configured to receive non-ADF source operands and to perform a floating-point operation on the non-ADF source operands to generate a non-ADF result. The microprocessor also includes forwarding buses, configured to forward the non-ADF result generated by each arithmetic unit of the plurality of floating-point units to each of the plurality of floating-point units for selective use as one of the non-ADF source operands.
Type: Application
Filed: June 22, 2010
Publication date: March 10, 2011
Applicant: VIA TECHNOLOGIES, INC.
Inventors: G. Glenn Henry, Terry Parks
-
Patent number: 7899859
Abstract: One embodiment of the present invention provides a system that performs both error-check and exact-check operations for a Newton-Raphson divide or square-root computation. During operation, the system performs Newton-Raphson iterations followed by a multiply for a divide or a square-root operation to produce a result, which includes one or more additional bits of accuracy beyond a desired accuracy for the result. Next, the system rounds the result to the desired accuracy to produce a rounded result t. The system then analyzes the additional bits of accuracy to determine whether t is correct and whether t is exact.
Type: Grant
Filed: December 20, 2005
Date of Patent: March 1, 2011
Assignee: Oracle America, Inc.
Inventors: Allen Lyu, Leonard D. Rarick
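The general shape of a Newton-Raphson divide followed by an exactness check can be sketched as below. This is not the patented check, which inspects the extra result bits: the sketch substitutes an exact rational residual test, uses a standard linear initial guess for the reciprocal, and all function names are hypothetical.

```python
import math
from fractions import Fraction


def nr_divide(n, d, iterations=5):
    """Approximate n / d (d > 0) with Newton-Raphson reciprocal iterations
    followed by one multiply."""
    m, e = math.frexp(d)             # d = m * 2**e with m in [0.5, 1)
    x = 48 / 17 - (32 / 17) * m      # linear initial guess for 1/m
    for _ in range(iterations):
        x = x * (2.0 - m * x)        # Newton-Raphson step for the reciprocal
    return math.ldexp(n * x, -e)     # final multiply, then undo the scaling


def is_exact(t, n, d):
    """True when t * d equals n exactly (substitute for the bit analysis)."""
    return Fraction(t) * Fraction(d) == Fraction(n)


if __name__ == "__main__":
    q = nr_divide(1.0, 3.0)
    print(q, is_exact(q, 1.0, 3.0))
```

The quotient from `nr_divide` may differ from the correctly rounded result in the last bit, which is the situation the error check in the abstract is meant to detect.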
-
Publication number: 20100095099
Abstract: A system and a method for storing numbers in a register file are provided. The system and the method store single precision numbers in double precision format in a register file that is shared between floating point computational units and computational units not supporting floating point numbers.
Type: Application
Filed: October 14, 2008
Publication date: April 15, 2010
Applicant: International Business Machines Corporation
Inventors: Maarten Boersma, Michael Kroener, Petra Leber, Silvia M. Mueller, Jochen Preiss, Kerstin Schelm
-
Publication number: 20080270508
Abstract: Detection of whether a result of a floating point operation is safe. Characteristics of the result are examined to determine whether the result is safe or potentially unsafe, as defined by the user. An instruction is provided to facilitate detection of safe or potentially unsafe results.
Type: Application
Filed: April 25, 2007
Publication date: October 30, 2008
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Shawn D. Lundvall, Ronald M. Smith, Phil C. Yeh
-
Publication number: 20080270509
Abstract: A decimal floating point finite number in a decimal floating point format is composed from the number in a different format. A decimal floating point format includes fields to hold information relating to the sign, exponent and significand of the decimal floating point finite number. Other decimal floating point data, including infinities and NaNs (not a number), are also composed. Decimal floating point data are also decomposed from the decimal floating point format to a different format. For composition and decomposition, one or more instructions may be employed, including an insert biased exponent or extract biased exponent instruction.
Type: Application
Filed: June 29, 2007
Publication date: October 30, 2008
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Shawn D. Lundvall, Eric M. Schwarz, Ronald M. Smith, Phil C. Yeh
-
Patent number: 6792443
Abstract: Apparatus and methods are provided for an improved on-the-fly rounding technique for digit-recurrence algorithms, such as division and square root calculations. According to one embodiment, only two forms of an intermediate result of an operation to be performed by a digit-recurrence algorithm are maintained. A first form is maintained in a first register and a second form is maintained in a second register. Responsive to receiving digits 1 to L−2 of the intermediate result from a digit recurrence unit, where L represents a number of digits that satisfies a predetermined precision for the operation, both forms of the intermediate result are updated by register swapping or concatenation under the control of load and shift control logic and on-the-fly conversion logic. Then, a rounded result is generated by determining digits d_(L−1) and d_L and appending a rounded last digit to the appropriate form of the intermediate result.
Type: Grant
Filed: June 29, 2001
Date of Patent: September 14, 2004
Assignee: Intel Corporation
Inventor: Ping Tak Peter Tang
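As background for the two-register idea, here is a sketch of the classic on-the-fly conversion of signed digits, which keeps the running result Q and its decrement QM = Q − 1 so that a negative digit never triggers a carry-propagating subtraction. It is the textbook scheme, not the rounding technique claimed in the patent; the function name and the use of Python integers in place of register concatenation are assumptions.

```python
def on_the_fly_convert(digits, radix):
    """Convert signed digits (most significant first, |d| < radix) into
    a conventional integer, maintaining two forms: Q and QM = Q - 1."""
    q, qm = 0, -1
    for d in digits:
        if d >= 0:
            new_q = q * radix + d              # append d to Q
        else:
            new_q = qm * radix + (radix + d)   # append radix - |d| to QM
        if d > 0:
            new_qm = q * radix + (d - 1)       # append d - 1 to Q
        else:
            new_qm = qm * radix + (radix - 1 + d)  # append radix - 1 - |d| to QM
        q, qm = new_q, new_qm
    return q, qm


if __name__ == "__main__":
    # Radix-4 digits 1, -2, 3 encode 1*16 - 2*4 + 3.
    print(on_the_fly_convert([1, -2, 3], 4))
```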
-
Publication number: 20030149712
Abstract: A floating point unit includes a multiplier, an approximation circuit, and a control circuit coupled to the multiplier and the approximation circuit. The approximation circuit is configured to generate an approximation of a difference of the first result from the multiplier and a constant. The control circuit is configured to approximate a function specified by a floating point instruction provided to the floating point unit for execution using an approximation algorithm. The approximation algorithm comprises at least two iterations through the multiplier and optionally the approximation circuit. The control circuit is configured to correct the approximation from the approximation circuit from a first iteration of the approximation algorithm during a second iteration of the approximation algorithm by supplying a correction vector to the multiplier during the second iteration. The multiplier is configured to incorporate the correction vector into the first result during the second iteration.
Type: Application
Filed: February 1, 2002
Publication date: August 7, 2003
Inventors: Robert Rogenmoser, Michael C. Kim
-
Publication number: 20030131035
Abstract: The invention provides circuitry for carrying out at least one of a square root operation and a division operation. The circuitry utilizes a carry save adder and a carry propagate adder part. The carry save adder and the carry propagate adder part are arranged in parallel.
Type: Application
Filed: November 7, 2002
Publication date: July 10, 2003
Inventor: Tariq Kurd
-
Publication number: 20030126175
Abstract: The invention provides circuitry for carrying out a square root operation and a division operation. The circuitry utilizes common iteration circuitry for carrying out a plurality of iterations and means for identifying if a square root operation or a division operation is to be performed. The iteration circuitry is controlled in accordance with whether a square root or division operation is to be performed.
Type: Application
Filed: November 8, 2002
Publication date: July 3, 2003
Inventor: Tariq Kurd
-
Publication number: 20030028573
Abstract: In a method for determining the square root of a long-bit number using a short-bit processor, the long-bit number is assumed to be c×2^(2K)+d, where c, d<2^(2k), and its solution is assumed to be (a×2^K+b)^2. The ‘a’ is determined by using a bisection method to obtain the floor value of the square root of ‘c’. To obtain the value of ‘b’, a successive substitution equation is derived: b[n]=(c−a^2)×2^(2k)+(d−b[n−1]^2)/2^(2(k+1)). An initial value is given to ‘b’ to execute the successive substitution equation recursively several times until the equation converges.
Type: Application
Filed: October 19, 2001
Publication date: February 6, 2003
Inventor: Sheng-Hung Wu
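The first step of the method above, finding 'a' as the floor of the square root of 'c' by bisection, can be sketched as follows. Only that step and the split into c and d are shown; the successive substitution for 'b' is not attempted here. The symbols c, d, a and k follow the abstract, while the function names are hypothetical.

```python
def split_long_number(n, k):
    """Split n into (c, d) with n = c * 2**(2k) + d and 0 <= d < 2**(2k)."""
    return n >> (2 * k), n & ((1 << (2 * k)) - 1)


def floor_sqrt_bisect(c):
    """Largest integer a with a*a <= c, for c >= 0, found by bisection."""
    lo, hi = 0, c + 1            # invariant: lo*lo <= c < hi*hi
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if mid * mid <= c:
            lo = mid
        else:
            hi = mid
    return lo


if __name__ == "__main__":
    c, d = split_long_number(123456789, 8)
    print(c, d, floor_sqrt_bisect(c))
```

Because d is smaller than 2^(2k), the value found for 'a' equals the upper part of the full square root, so only 'b' remains to be determined.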
-
Patent number: 6405305
Abstract: A microprocessor with a floating point unit configured to rapidly execute floating point load control word (FLDCW) type instructions in an out of program order context is disclosed. The floating point unit is configured to schedule instructions older than the FLDCW-type instruction before the FLDCW-type instruction is scheduled. The FLDCW-type instruction acts as a barrier to prevent instructions occurring after the FLDCW-type instruction in program order from executing before the FLDCW-type instruction. Indicator bits may be used to simplify instruction scheduling, and copies of the floating point control word may be stored for instructions that have long execution cycles. A method and computer configured to rapidly execute FLDCW-type instructions in an out of program order context are also disclosed.
Type: Grant
Filed: September 10, 1999
Date of Patent: June 11, 2002
Assignee: Advanced Micro Devices, Inc.
Inventors: Stephan G. Meier, Jeffrey E. Trull, Derrick R. Meyer, Norbert Juffa
-
Patent number: 6289365
Abstract: A system is disclosed for performing floating point computation in connection with numbers in a base floating point representation (such as the representation defined in IEEE Std. 754) that defines a plurality of formats, including a normalized format and a de-normalized format, using a common floating point representation that defines a unitary normalized format. The system includes a base to common representation converter, a processor and a common to base representation converter. The base to common representation converter converts numbers from the base floating point representation to the common floating point representation, so that all numbers involved in a computation will be expressed in the unitary normalized format. The processor is configured to perform a mathematical operation of at least one predetermined type in connection with the converted numbers generated by the base to common representation converter to generate a floating point result in the common representation.
Type: Grant
Filed: December 9, 1997
Date of Patent: September 11, 2001
Assignee: Sun Microsystems, Inc.
Inventor: Guy L. Steele, Jr.
-
Publication number: 20010016902
Abstract: A method and instruction for converting a number from a floating point format to an integer format are described. Numbers are stored in the floating point format in a register of a first set of architectural registers in a packed format. At least one of the numbers in the floating point format is converted to at least one 8-bit number in the integer format. The 8-bit number in the integer format is placed in a register of a second set of architectural registers in the packed format.
Type: Application
Filed: April 27, 2001
Publication date: August 23, 2001
Inventors: Mohammad A.F. Abdallah, Hsien-Cheng E. Hsieh, Thomas R. Huff, Vladimir Pentkovski, Patrice Roussel, Shreekant S. Thakkar
-
Patent number: 6233672
Abstract: A floating point unit is provided which conveys the rounding mode in effect upon dispatch of a particular instruction with that particular instruction into the execution pipeline of the floating point unit. Upon dispatch of a control word update instruction into the execution pipeline, the rounding mode is updated according to the updated control word provided for the control word update instruction. Instructions subsequent to the control word update instruction thereby receive the updated rounding mode as those instructions are dispatched. The updated rounding mode is available to the subsequent instructions prior to retiring the control word update instruction. The rounding mode is therefore updated without serializing the update. If the control word update instruction modifies the value in a field other than the rounding mode, the instructions subsequent to the control word update instruction may be discarded and re-executed subsequent to updating the control word register with the updated control word.
Type: Grant
Filed: March 6, 1997
Date of Patent: May 15, 2001
Assignee: Advanced Micro Devices, Inc.
Inventor: Thomas W. Lynch
-
Patent number: 6141670
Abstract: A computer and a method of using the computer to reduce an original argument to obtain a periodic function of the argument. A special number P_j is employed that is close to a nontrivial even-integral multiple of π. The technique subtracts a non-negative integral multiple of P_j from the original argument to obtain a first reduced argument. Then, a second non-negative integer multiple of a floating-point representation of π/2 is subtracted from the first reduced argument to obtain a second reduced argument. Next, a periodic function of a third argument equal to a sum of the second reduced argument plus the product of the first non-negative integral multiple and a floating-point representation of an offset δ_j is evaluated to obtain a result.
Type: Grant
Filed: September 30, 1997
Date of Patent: October 31, 2000
Assignee: Intel Corporation
Inventors: Shane A. Story, Ping Tak Peter Tang
-
Patent number: 6134573
Abstract: An apparatus and method for improving the execution of floating point instructions in a microprocessor is provided. During decode of a floating point instruction, translation logic generates absolute addresses of specified registers in a floating point register file. These absolute references, as opposed to relative references to a top-of-stack, are inserted into associated micro instructions. In the event of an exception, synchronization logic provides an architected top-of-stack for the floating point instruction associated with the exception to the translation logic so that subsequent instructions will properly reference floating point registers.
Type: Grant
Filed: April 20, 1998
Date of Patent: October 17, 2000
Assignee: IP-First, L.L.C.
Inventors: G. Glenn Henry, Albert J. Loper, Jr., Terry Parks
-
Patent number: 6131106
Abstract: Floating point numbers and other values are represented in a "delimited" representation in which all numbers, including those which would, in the IEEE Std. 754 representation, be in the de-normalized format, are in a format which is normalized with an implicit most significant digit having the value "one." For numbers which would, in the IEEE Std.
Type: Grant
Filed: January 30, 1998
Date of Patent: October 10, 2000
Assignee: Sun Microsystems Inc
Inventor: Guy L. Steele, Jr.
-
Patent number: 6105047
Abstract: An apparatus to improve the speed of handling of denormal numbers in a computer system, the apparatus comprising a mode bit and a selector, the mode bit set when denormals are to be replaced by zero, the selector having a first input and an output, the first input comprising a floating point number, the selector selecting zero to become the output when the floating point number is denormal and the mode bit is set, the selector selecting the floating point number to become the output otherwise.
Type: Grant
Filed: November 9, 1998
Date of Patent: August 15, 2000
Assignee: Intel Corporation
Inventors: Harshvardhan Sharangpani, Roger Golliver
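The selector behaviour in the abstract above maps directly onto a few lines of software. The sketch assumes IEEE 754 double precision, where a denormal has a zero exponent field and a nonzero fraction field, and it returns positive zero regardless of sign; the function names are hypothetical.

```python
import struct


def is_denormal(x):
    """True when x is a denormal (subnormal) IEEE 754 double."""
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    exponent = (bits >> 52) & 0x7FF
    fraction = bits & ((1 << 52) - 1)
    return exponent == 0 and fraction != 0


def select_operand(x, mode_bit):
    """Output zero when the mode bit is set and x is denormal, else x."""
    return 0.0 if (mode_bit and is_denormal(x)) else x


if __name__ == "__main__":
    print(select_operand(5e-324, True), select_operand(5e-324, False))
```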
-
Patent number: 6029243
Abstract: A floating-point processor nominally capable of single and double, but not extended, precision execution stores operands in extended-precision format. A format converter converts single and double precision source values to extended-precision format. Trap logic checks the apparent precision of the extended-precision operands and the requested result precision to determine whether the floating-point processor can execute the requested operation and yield the appropriate result. If the maximum of the requested precision and the maximum apparent precision of the operands is single or double, the requested operation is executed in hardware. Otherwise, a trap is issued to call an extended precision floating-point subroutine. This approach augments the class of operations that can be handled in hardware by a double-precision floating-point processor, and thus improves the floating-point computational throughput of an incorporating computer system.
Type: Grant
Filed: September 19, 1997
Date of Patent: February 22, 2000
Assignee: VLSI Technology, Inc.
Inventors: Timothy A. Pontius, Kenneth A. Dockser
-
Patent number: 6021422
Abstract: There is a unique partitioning problem in determining how to execute the floating point multiply instruction defined by IEEE 754 standard for the quad word format on a S/390 processor. Several manufacturers including IBM and HP define the binary quad word format to have a 113 bit significand. IBM S/390 hexadecimal long floating point format has a 56 bit significand and most S/390 floating point units only contain a long format multiplier. Quad word format multiplication must be executed as a series of several long precision multiplications and extended precision or long precision additions. The S/390 hexadecimal quad word format is easier to implement than binary format since it has a 112 bit significand and can easily be partitioned into two 56 bit parts. But a 113 bit significand would just exceed two partitions and require a third.
Type: Grant
Filed: March 5, 1998
Date of Patent: February 1, 2000
Assignee: International Business Machines Corporation
Inventor: Eric Mark Schwarz
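The partitioning arithmetic in the abstract above can be checked with a short sketch. It shows only the easy 112-bit case, where each significand splits into two 56-bit parts and the product is assembled from four 56-by-56 multiplications; how the patent handles the 113-bit case is not shown. Function and constant names are hypothetical.

```python
PART_BITS = 56
PART_MASK = (1 << PART_BITS) - 1


def split112(x):
    """Split a 112-bit significand into (high, low) 56-bit parts."""
    return x >> PART_BITS, x & PART_MASK


def mul112(x, y):
    """Multiply two 112-bit significands using four 56x56 products."""
    xh, xl = split112(x)
    yh, yl = split112(y)
    return ((xh * yh) << (2 * PART_BITS)) + ((xh * yl + xl * yh) << PART_BITS) + xl * yl


def partitions_needed(significand_bits):
    """Number of 56-bit parts needed to hold a significand."""
    return -(-significand_bits // PART_BITS)  # ceiling division


if __name__ == "__main__":
    print(partitions_needed(112), partitions_needed(113))
```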