Floating Point Or Vector Patents (Class 712/222)

HARDWARE INSTRUCTIONS TO ACCELERATE TABLE-DRIVEN MATHEMATICAL FUNCTION EVALUATION

Publication number: 20110296146

Abstract: A set of instructions for implementation in a floating-point unit or other computer processor hardware is disclosed herein. In one embodiment, an extended-range fused multiply-add operation, a first look-up operation, and a second look-up operation are each embodied in hardware instructions configured to be operably executed in a processor. These operations are accompanied by a table which provides a set of defined values in response to various function types, supporting the computation of elementary functions such as reciprocal, square, cube, fourth roots and their reciprocals, exponential, and logarithmic functions. By allowing each of these functions to be computed with a hardware instruction, branching and predicated execution may be reduced or eliminated, while also permitting the use of distributed instructions across a number of execution units.

Type: Application

Filed: May 27, 2010

Publication date: December 1, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Christopher K. Anand, Robert F. Enenkel, Anuroop Sharma, Daniel M. Zabawa
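
Illustration (not part of the filing): the abstract for publication 20110296146 describes table look-ups combined with fused multiply-add steps. The sketch below shows only the general table-plus-refinement idea for one function, the reciprocal. The table size, the indexing scheme, and the name `reciprocal` are assumptions made for illustration, and ordinary Python multiply and add stand in for a hardware fused multiply-add.

```python
import math

# Hypothetical 64-entry table: reciprocal of the midpoint of each
# sub-interval of [0.5, 1). The real instruction's table is not described
# in the abstract, so this layout is an assumption.
TABLE = [1.0 / (0.5 + (i + 0.5) / 128.0) for i in range(64)]


def reciprocal(x: float) -> float:
    """Approximate 1/x for positive finite x without a division in the loop."""
    if not (x > 0.0) or math.isinf(x):
        raise ValueError("sketch handles positive finite inputs only")
    m, e = math.frexp(x)                 # x = m * 2**e with 0.5 <= m < 1
    y = TABLE[int((m - 0.5) * 128.0)]    # look-up: initial guess for 1/m
    for _ in range(3):
        # Newton step written as multiply-add pairs: y <- y + y * (1 - m*y)
        y = y + y * (1.0 - m * y)
    return math.ldexp(y, -e)             # undo the exponent scaling
```

The loop contains no data-dependent branch, which is the property the abstract highlights; the input check at the top exists only to keep the sketch honest about its limits.
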
METHOD OF TESTING COMPUTER, COMPUTER TEST APPARATUS AND NON-TRANSITORY COMPUTER-READABLE MEDIUM

Publication number: 20110296147

Abstract: A method of testing a computer includes: designating a register as an input-only register set to a value that does not cause an exception interruption when a specific type of instruction is executed; generating a test instruction array having a plurality of instructions for a test, by assigning a register other than the input-only register as the output destination of the execution result of each of the plurality of instructions; executing the plurality of instructions included in the generated test instruction array; and evaluating the execution results by the computer.

Type: Application

Filed: April 13, 2011

Publication date: December 1, 2011

Applicant: FUJITSU LIMITED

Inventors: Fumio Ichikawa, Tamoru Inoue
GETFIRST AND ASSIGNLAST INSTRUCTIONS FOR PROCESSING VECTORS

Publication number: 20110283092

Abstract: The described embodiments comprise a processor that executes vector instructions. In the described embodiments, while executing program code, the processor receives a vector instruction that indicates an input vector that includes N elements, wherein receiving the vector instruction comprises optionally receiving a predicate vector that includes N elements. The processor then executes the vector instruction. When executing the vector instruction, if the predicate vector is received, based on active elements in the predicate vector, otherwise, if the predicate vector is not received, based on an assumed predicate vector for which each element is active, the processor sets a value in a scalar register equal to a predetermined element of the input vector. In the described embodiments, the vector instruction can be a GetFirst, an AssignLast1P, or an AssignLast2P instruction.

Type: Application

Filed: July 22, 2011

Publication date: November 17, 2011

Applicant: APPLE INC.

Inventor: Jeffry E. Gonion
INSTRUCTION SUPPORT FOR PERFORMING MONTGOMERY MULTIPLICATION

Publication number: 20110276790

Abstract: Techniques are disclosed relating to a processor including instruction support for performing a Montgomery multiplication. The processor may issue, for execution, programmer-selectable instructions from a defined instruction set architecture (ISA). The processor may include an instruction execution unit configured to receive instructions including a first instance of a Montgomery-multiply instruction defined within the ISA. The Montgomery-multiply instruction is executable by the processor to operate on at least operands A, B, and N residing in respective portions of a general-purpose register file of the processor, where at least one of operands A, B, N spans at least two registers of the general-purpose register file. The instruction execution unit is configured to calculate P mod N in response to receiving the first instance of the Montgomery-multiply instruction, where P is the product of at least operand A, operand B, and R^-1.

Type: Application

Filed: May 7, 2010

Publication date: November 10, 2011

Inventors: Christopher H. Olson, Gregory F. Grohoski, Lawrence Spracklen, Nils Gura
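
Illustration (not part of the filing): the Montgomery product A * B * R^-1 mod N named in the abstract for publication 20110276790 can be sketched in software as follows. The function name, the choice R = 2**r_bits, and the use of Python's arbitrary-precision integers in place of multi-register operands are assumptions for illustration only.

```python
def montgomery_multiply(a: int, b: int, n: int, r_bits: int) -> int:
    """Return a * b * R^-1 mod n with R = 2**r_bits (n odd, a, b < n < R)."""
    r = 1 << r_bits
    if n <= 0 or n % 2 == 0 or n >= r:
        raise ValueError("modulus must be odd, positive, and smaller than R")
    mask = r - 1
    # n_prime satisfies n * n_prime == -1 (mod R); needs Python 3.8+ for pow(n, -1, r)
    n_prime = (-pow(n, -1, r)) % r
    t = a * b
    m = ((t & mask) * n_prime) & mask    # chosen so that t + m*n is divisible by R
    u = (t + m * n) >> r_bits            # exact division by R
    return u - n if u >= n else u        # single conditional subtraction
```

The reduction uses only multiplications, a mask, and a shift, which is why the operation suits a dedicated instruction: no trial division by N is needed.
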
Reconfigurable paired processing element array configured with context generated each cycle by FSM controller for multi-cycle floating point operation

Patent number: 8046564

Abstract: Techniques, systems and apparatus are described for providing a processing element (PE) structure forming a floating point unit (FPU)-processing element. Each processing element includes two multiplexers (MUXes), each of which receives data from one or more sources, including another PE, and selects one value from the received data. The processing element includes an arithmetic logic unit (ALU) in communication with the two multiplexers to receive the selected value from each multiplexer as two input values, and process the received two input values to generate results of the ALU.

Type: Grant

Filed: September 19, 2008

Date of Patent: October 25, 2011

Assignee: Core Logic, Inc.

Inventors: Hoon Mo Yang, Man Hwee Jo, Il Hyun Park, Ki Young Choi
Load/Move Duplicate Instructions for a Processor

Publication number: 20110258418

Abstract: A method includes, in a processor, loading/moving a first portion of bits of a source into a first portion of a destination register and duplicating that first portion of bits in a subsequent portion of the destination register.

Type: Application

Filed: April 15, 2011

Publication date: October 20, 2011

Inventor: Patrice Roussel
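
Illustration (not part of the filing): the load/move-duplicate behavior in the abstract for publication 20110258418 can be modeled on plain integers. The name `move_duplicate`, the default 64-bit portion, and the two-copy destination are assumptions; the abstract does not state widths.

```python
def move_duplicate(source: int, portion_bits: int = 64, copies: int = 2) -> int:
    """Copy the low portion of source into every portion of the destination."""
    portion = source & ((1 << portion_bits) - 1)   # first portion of the source
    dest = 0
    for i in range(copies):
        dest |= portion << (i * portion_bits)      # replicate into the next portion
    return dest
```
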
Data dependent instruction decode

Patent number: 8028153

Abstract: A circuit arrangement and method support data dependent instruction decoding, whereby instructions are decoded, in part, using decode data that is stored in operand registers identified by such instructions. An instruction may include an opcode and at least one operand that identifies a register. During execution of the instruction, the instruction is first decoded using the opcode, and then decode data stored in the operand register is retrieved and used to further decode the instruction, e.g., to select from among a plurality of operations or instruction types associated with the same opcode.

Type: Grant

Filed: August 14, 2008

Date of Patent: September 27, 2011

Assignee: International Business Machines Corporation

Inventors: Mark J Hickey, Adam J Muff, Matthew R Tubbs, Charles D Wait
APPARATUS AND METHOD FOR IMPLEMENTING INSTRUCTION SUPPORT FOR PERFORMING A CYCLIC REDUNDANCY CHECK (CRC)

Publication number: 20110231636

Abstract: Techniques relating to a processor including instruction support for implementing a cyclic redundancy check (CRC) operation. The processor may issue, for execution, programmer-selectable instructions from a defined instruction set architecture (ISA). The processor may include a cryptographic unit configured to receive instructions that include a first instance of a cyclic redundancy check (CRC) instruction defined within the ISA, where the first instance of the CRC instruction is executable by the cryptographic unit to perform a first CRC operation on a set of data that produces a checksum value. In one embodiment, the cryptographic unit is configured to generate the checksum value using a generator polynomial of 0x11EDC6F41.

Type: Application

Filed: March 16, 2010

Publication date: September 22, 2011

Inventors: Christopher H. Olson, Gregory F. Grohoski, Lawrence A. Spracklen
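
Illustration (not part of the filing): the generator polynomial 0x11EDC6F41 named in the abstract for publication 20110231636 is the one used by CRC-32C (Castagnoli). The bitwise sketch below assumes the usual CRC-32C conventions (reflected bit order, initial value 0xFFFFFFFF, final XOR with 0xFFFFFFFF); the abstract itself specifies only the polynomial.

```python
# 0x82F63B78 is the bit-reflected form of 0x1EDC6F41 (the polynomial 0x11EDC6F41
# with its implicit top bit dropped).
CRC32C_POLY_REFLECTED = 0x82F63B78


def crc32c(data: bytes) -> int:
    """Bit-at-a-time CRC-32C under the assumed conventions described above."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift out one bit; fold in the polynomial when that bit was set.
            crc = (crc >> 1) ^ (CRC32C_POLY_REFLECTED if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF
```

With these conventions, `crc32c(b"123456789")` yields the standard CRC-32C check value 0xE3069283.
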
Interfacing with a dynamically configurable arithmetic unit

Patent number: 8024678

Abstract: An interface to a dynamically configurable arithmetic unit can include data alignment modules, where each data alignment module receives input variables being associated with one or more arithmetic expressions. The interface can include multiplexers coupled to the data alignment modules, wherein a data alignment module has outputs coupled to a first multiplexer. The first multiplexer can have a selection line and an output coupled to an input port of the dynamically configurable arithmetic unit. The interface can include a second multiplexer having input instructions and the selection line, where each instruction is associated with one of the arithmetic expressions and has an operation to be performed by the dynamically configurable arithmetic unit. The second multiplexer is configurable to provide selected ones of the input instructions to the dynamically configurable arithmetic unit through an output of the second multiplexer responsive to the selection line.

Type: Grant

Filed: April 1, 2009

Date of Patent: September 20, 2011

Assignee: Xilinx, Inc.

Inventors: Bradley L. Taylor, Arvind Sundararajan, Shay Ping Seng, L. James Hwang
ASSIGNING FLOATING-POINT OPERATIONS TO A FLOATING-POINT UNIT AND AN ARITHMETIC LOGIC UNIT

Publication number: 20110208505

Abstract: A processor may include a floating-point unit (FPU) and an arithmetic logic unit (ALU). Instructions to the processor may include greater or lesser amounts of floating-point operations and integer operations. In a circumstance where instructions include predominantly integer operations, power to the FPU may be reduced or turned completely off. In such a circumstance, occasional floating-point operations may be emulated and performed by the ALU. If the processor subsequently determines that incoming instructions include a greater proportion of floating-point operations, the FPU may be powered back on and used to perform the floating-point operations.

Type: Application

Filed: February 24, 2010

Publication date: August 25, 2011

Applicant: ADVANCED MICRO DEVICES, INC.

Inventors: David E. Mayhew, Mark D. Hummel
Method and software for partitioned group element selection operation

Patent number: 8001360

Abstract: A system and software for improving the performance of processors by incorporating an execution unit operable to decode and execute single instructions specifying a data selection operand and a first and a second register providing a plurality of data elements, the data selection operand comprising a plurality of fields each selecting one of the plurality of data elements, the execution unit operable to provide the data element selected by each field of the data selection operand to a predetermined position in a catenated result.

Type: Grant

Filed: January 16, 2004

Date of Patent: August 16, 2011

Assignee: Microunity Systems Engineering, Inc.

Inventors: Craig Hansen, John Moussouris
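
Illustration (not part of the filing): the element-selection operation in the abstract for patent 8001360 resembles a two-source permute. In this sketch the registers are lists, each field of the selection operand is an index into the concatenation of the two registers, and the name `group_select` is hypothetical.

```python
def group_select(selector, reg_a, reg_b):
    """Build the catenated result: result[i] is the element named by selector[i]."""
    pool = list(reg_a) + list(reg_b)     # elements offered by the two registers
    return [pool[field] for field in selector]
```

For example, `group_select([3, 0, 5, 5], [10, 11, 12, 13], [20, 21, 22, 23])` picks elements 3 and 0 from the first register and element 1 of the second register twice.
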
Multithreaded programmable processor and system with partitioned operations

Patent number: 7987344

Abstract: A programmable processor and method for improving the performance of processors by incorporating an execution unit configurable to execute a plurality of instruction streams from the plurality of threads, wherein each instruction stream includes a group instruction that operates on a plurality of data elements in partitioned fields of at least one of the registers to produce a catenated result.

Type: Grant

Filed: January 16, 2004

Date of Patent: July 26, 2011

Assignee: Microunity Systems Engineering, Inc.

Inventors: Craig Hansen, John Moussouris
MULTI-INPUT AND BINARY REPRODUCIBLE, HIGH BANDWIDTH FLOATING POINT ADDER IN A COLLECTIVE NETWORK

Publication number: 20110173421

Abstract: To add floating point numbers in a parallel computing system, a collective logic device receives the floating point numbers from computing nodes. The collective logic device converts the floating point numbers to integer numbers. The collective logic device adds the integer numbers and generates a summation of the integer numbers. The collective logic device converts the summation to a floating point number. The collective logic device performs the receiving, the converting the floating point numbers, the adding, the generating and the converting the summation in one pass. One pass indicates that the computing nodes send inputs only once to the collective logic device and receive outputs only once from the collective logic device.

Type: Application

Filed: January 8, 2010

Publication date: July 14, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Dong Chen, Noel A. Eisley, Philip Heidelberger, Burkhard Steinmacher-Burow
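
Illustration (not part of the filing): the abstract for publication 20110173421 obtains a reproducible sum by converting floating point inputs to integers before adding. The sketch below shows why that works, since integer addition is associative. It assumes a fixed scale of 2**1074 (enough to represent every finite double exactly) and relies on Python's unbounded integers, whereas real hardware would use a bounded width; `reproducible_sum` is a hypothetical name.

```python
SCALE_BITS = 1074  # every finite double is an integer multiple of 2**-1074


def reproducible_sum(values) -> float:
    """Sum doubles through exact integers so the result ignores input order."""
    total = 0
    for v in values:
        num, den = float(v).as_integer_ratio()     # exact; den is a power of two
        total += num * ((1 << SCALE_BITS) // den)  # convert to a scaled integer
    return total / (1 << SCALE_BITS)               # one rounding, at the very end
```

A plain left-to-right sum of `[1e100, 1.0, -1e100]` returns 0.0 because 1.0 is absorbed by 1e100; the integer route returns 1.0 regardless of the order in which the inputs arrive.
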
Floating Point Collect and Operate

Publication number: 20110161624

Abstract: Mechanisms are provided for performing a floating point collect and operate for a summation across a vector for a dot product operation. A routing network placed before the single instruction multiple data (SIMD) unit allows the SIMD unit to perform a summation across a vector with a single stage of adders. The routing network routes the vector elements to the adders in a first cycle. The SIMD unit stores the results of the adders into a results vector register. The routing network routes the summation results from the results vector register to the adders in a second cycle. The SIMD unit then stores the results from the second cycle in the results vector register.

Type: Application

Filed: December 29, 2009

Publication date: June 30, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Brian K. Flachs, Seiji Maeda, Steven Osman
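
Illustration (not part of the filing): the two-cycle summation in the abstract for publication 20110161624 can be modeled by reusing one stage of adders twice. The four-element vector, the adjacent-pair routing, and the function names are assumptions for illustration.

```python
def adder_stage(vec):
    """A single stage of adders: each adder sums one adjacent pair."""
    return [vec[i] + vec[i + 1] for i in range(0, len(vec), 2)]


def dot_product_4(a, b):
    """Dot product of two 4-element vectors using the adder stage twice."""
    products = [x * y for x, y in zip(a, b)]
    results = adder_stage(products)   # first cycle: two partial sums
    results = adder_stage(results)    # second cycle: partial sums routed back in
    return results[0]
```
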
Parallel and Vectored Gilbert-Johnson-Keerthi Graphics Processing

Publication number: 20110153996

Abstract: Parallel and vectored data structures may be used in a single instruction multiple data processor that applies the Gilbert-Johnson-Keerthi algorithm. As a result, the performance of multi-core processors doing graphics processing may be increased in some cases.

Type: Application

Filed: December 23, 2009

Publication date: June 23, 2011

Inventors: Aleksey A. Bader, Mikhail Smelyanskiy, Jatin Chhugani
VECTOR COMPUTER AND INSTRUCTION CONTROL METHOD THEREFOR

Publication number: 20110138155

Abstract: A vector computer executing vector operations via vector pipeline processing is restructured to dynamically perform an overtaking control on vector gather/scatter instructions. Minimum/maximum values among vector elements of vector registers are determined based on the result of fixed-point calculation defining an address dependency source instruction in accordance with a vector gather/scatter instruction, wherein minimum/maximum values are determined in a redundant time owing to a short turnaround time of the fixed-point calculation compared to floating-point calculation. An access range of addresses attributed to the vector gather/scatter instruction is specified based on minimum/maximum values. An overtaking control is performed on the vector gather/scatter instruction in light of the access range of addresses.

Type: Application

Filed: December 1, 2010

Publication date: June 9, 2011

Inventor: EIICHIRO KAWAGUCHI
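
Illustration (not part of the filing): the abstract for publication 20110138155 bounds the addresses touched by a gather/scatter using the minimum and maximum index values. The sketch below assumes byte addresses, a fixed element size, and that overtaking is allowed when the bounded range does not overlap the range of an earlier memory operation; the abstract does not spell out that last condition, and both function names are hypothetical.

```python
def access_range(base: int, indices, element_size: int = 8):
    """Inclusive byte range covered by a gather/scatter with these indices."""
    lo = base + min(indices) * element_size
    hi = base + max(indices) * element_size + element_size - 1
    return lo, hi


def may_overtake(later_range, earlier_range) -> bool:
    """True when the two inclusive ranges are disjoint (assumed safety test)."""
    later_lo, later_hi = later_range
    earlier_lo, earlier_hi = earlier_range
    return later_hi < earlier_lo or earlier_hi < later_lo
```
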
Architecture for joint detection hardware accelerator

Patent number: 7953958

Abstract: A joint detection system is configured to perform joint detection of received signals and includes a joint detection accelerator and a host processor. The joint detection accelerator may include a memory unit to store input data values, intermediate results and output data values; one or more computation units to process the input data values and the intermediate results, and to provide output data values to the memory unit; a controller to control the memory and the one or more computation units to perform joint detection processing; and an external interface to receive the input data values from the host processor and to provide output data values to the host processor. The computation units may include a complex multiply accumulate unit, a simplified complex multiply accumulate unit and a normalized floating point divider. The memory unit may include an input memory, a matrix memory, a main memory and an output memory.

Type: Grant

Filed: June 12, 2007

Date of Patent: May 31, 2011

Assignee: MediaTek Inc.

Inventors: John Zijun Shen, Paul D. Krivacek, Thomas J. Barber, Jr., Lidwine Martinot, Aiguo Yan, Marko Kocic
Multifunction hexadecimal instruction form system and program product

Patent number: 7949858

Abstract: A new zSeries floating-point unit has a fused multiply-add dataflow capable of supporting two architectures and fused MULTIPLY and ADD and MULTIPLY and SUBTRACT in both RRF and RXF formats for the fused functions. Both binary and hexadecimal floating-point instructions are supported for a total of 6 formats. The floating-point unit is capable of performing a multiply-add instruction for hexadecimal or binary every cycle with a latency of 5 cycles. This supports two architectures with two internal formats with their own biases. This has eliminated format conversion cycles and has optimized the width of the dataflow. The unit is optimized for both hexadecimal and binary floating-point architecture supporting a multiply-add/subtract per cycle.

Type: Grant

Filed: February 2, 2009

Date of Patent: May 24, 2011

Assignee: International Business Machines Corporation

Inventors: Eric M. Schwarz, Ronald M. Smith, Sr.
Method and apparatus to extract integer and fractional components from floating-point data

Publication number: 20110119471

Abstract: A method is presented including decomposing a first value into many parts. Decomposing includes: shifting (310) a rounded integer portion of the first value to generate a second value; generating (320) a third value; extracting (330) a plurality of significand bits from the second value to generate a fourth value; extracting (340) a portion of bits from the fourth value to generate an integer component; and generating (350) a fifth value. The third value, the fifth value, and the integer component are either stored (360, 380) in a memory or transmitted to an arithmetic logical unit (ALU).

Type: Application

Filed: July 13, 2001

Publication date: May 19, 2011

Inventors: Robert S Norin, Olga Dryzhakova, Alexander Isaev, Andrey Naralkin
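
Illustration (not part of the filing): one well-known way to pull an integer and a fractional part out of a double by shifting and reading significand bits is the "magic constant" trick sketched below. Whether this matches the numbered steps in the abstract for publication 20110119471 is an assumption; the constant, the 32-bit integer width, and the name `split_integer_fraction` are chosen for illustration, and the sketch is valid only for |x| < 2**31 under the default round-to-nearest mode.

```python
import struct

MAGIC = 1.5 * 2.0 ** 52  # adding this forces the rounded integer into the low significand bits


def split_integer_fraction(x: float):
    """Return (n, f) with n = x rounded to nearest and f = x - n."""
    shifted = x + MAGIC                                   # rounded integer now sits in the significand
    bits = struct.unpack("<q", struct.pack("<d", shifted))[0]
    significand = bits & ((1 << 52) - 1)                  # extract the significand field
    n = significand & 0xFFFFFFFF                          # low 32 bits hold the integer
    if n >= 1 << 31:
        n -= 1 << 32                                      # reinterpret as signed
    fraction = x - (shifted - MAGIC)                      # remainder after removing the integer
    return n, fraction
```
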
Conditional execution of floating point store instruction by simultaneously reading condition code and store data from multi-port register file

Patent number: 7945766

Abstract: A processor capable of executing conditional store instructions without being limited by the number of condition codes is provided. Condition data is stored in floating-point registers, and an operation unit executes a conditional floating-point store instruction of determining whether to store, in cache, store data.

Type: Grant

Filed: November 25, 2008

Date of Patent: May 17, 2011

Assignee: Fujitsu Limited

Inventor: Toshio Yoshida
Adaptive execution cycle control method for enhanced instruction throughput

Patent number: 7937568

Abstract: A method, system and processor for increasing the instruction throughput in a processor executing longer latency instructions within the instruction pipeline. Logic associated with specific stages of the execution pipeline, responsible for executing the particular type of instructions, determines when at least a threshold number of the particular-type instructions is scheduled to be executed. The logic then automatically changes an execution cycle frequency of the specific pipeline stages from a first cycle frequency to a second, pre-established higher cycle frequency, which enables more efficient execution and higher execution throughput of the particular-type instructions. The cycle frequency of only the one or more functional stages are switched to the higher cycle frequency independent of the cycle frequency of the other functional stages in the processor pipeline.

Type: Grant

Filed: July 11, 2007

Date of Patent: May 3, 2011

Assignee: International Business Machines Corporation

Inventors: Anthony Correale, Jr., Kenichi Tsuchiya
Register state saving and restoring

Publication number: 20110093686

Abstract: In a data processing apparatus 1 having registers 6, when a state saving trigger event occurs while a result value of a data processing operation is still to be written to a destination register then saving and restoring control circuitry 12 selects a state saving sequence defining a temporal order for saving register values to a backup data store 10. The sequence is selected to provide the destination register with a position within the sequence corresponding to a time after the result value has been written to the destination register. The register values are then saved to the backup data store 10 in the order of the selected state saving sequence. A similar technique can be used when a state restoring trigger event triggers loading of the data values from the backup data store 10 to the registers 6.

Type: Application

Filed: September 16, 2010

Publication date: April 21, 2011

Applicant: ARM LIMITED

Inventors: Antony John Penton, Simon Axford
Early exit processing of iterative refinement algorithm using register dependency disable

Patent number: 7921278

Abstract: An “early exit” of an iterative refinement algorithm is implemented by effectively disabling read after write dependency stalls of newer instructions, as well as disabling the register write enable of these instructions, for the remainder of the algorithm. By doing so, the latency of the algorithm is reduced and the performance is increased without the complexity and potential poor performance of compare and branch instructions that might otherwise be required.

Type: Grant

Filed: March 10, 2008

Date of Patent: April 5, 2011

Assignee: International Business Machines Corporation

Inventors: Adam James Muff, Matthew Ray Tubbs
Method and system for overlapping execution of instructions through non-uniform execution pipelines in an in-order processor

Patent number: 7913067

Abstract: A system and method for overlapping execution (OE) of instructions through non-uniform execution pipelines in an in-order processor are provided. The system includes a first execution unit to perform instruction execution in a first execution pipeline. The system also includes a second execution unit to perform instruction execution in a second execution pipeline, where the second execution pipeline includes a greater number of stages than the first execution pipeline. The system further includes an instruction dispatch unit (IDU), the IDU including OE registers and logic for dispatching an OE-capable instruction to the first execution unit such that the instruction completes execution prior to completing execution of a previously dispatched instruction to the second execution unit. The system additionally includes a latch to hold a result of the execution of the OE-capable instruction until after the second execution unit completes the execution of the previously dispatched instruction.

Type: Grant

Filed: February 20, 2008

Date of Patent: March 22, 2011

Assignee: International Business Machines Corporation

Inventors: David S. Hutton, Khary J. Alexander, Fadi Y. Busaba, Bruce C. Giamei, John G. Rell, Jr., Eric M. Schwarz, Chung-Lung Kevin Shum
Early exit processing of iterative refinement algorithm using register dependency disable and programmable early exit condition

Patent number: 7913066

Abstract: A programmable “early exit” of an iterative refinement algorithm is implemented by effectively disabling read after write dependency stalls of newer instructions, as well as disabling the register write enable of these instructions, for the remainder of the algorithm. In addition, programmable logic is provided to enable a custom early exit condition to be specified for the iterative refinement algorithm so that the underlying hardware can be configured for optimal execution of particular iterative refinement algorithms. By doing so, the latency of the algorithm is reduced and the performance is increased without the complexity and potential poor performance of compare and branch instructions that might otherwise be required.

Type: Grant

Filed: March 10, 2008

Date of Patent: March 22, 2011

Assignee: International Business Machines Corporation

Inventors: Adam James Muff, Matthew Ray Tubbs
SPECULATIVE FORWARDING OF NON-ARCHITECTED DATA FORMAT FLOATING POINT RESULTS

Publication number: 20110060892

Abstract: A microprocessor having an instruction set architecture (ISA) that specifies at least one architected data format (ADF) for floating-point operands includes first and second floating-point units. The first floating-point unit is configured to speculatively forward a non-ADF result generated by the first floating-point unit to the second floating-point unit. The non-ADF result is associated with a first instruction. The second floating-point unit is configured to use the speculatively forwarded non-ADF result associated with the first instruction as a source operand to generate a result of a second instruction. The second floating-point unit is further configured to convert the non-ADF result to an ADF result and to determine whether the non-ADF result creates an exception condition when converted to the ADF result. The microprocessor is configured to cancel the second instruction, in response to determining that the non-ADF result creates an exception condition when converted to the ADF result.

Type: Application

Filed: June 22, 2010

Publication date: March 10, 2011

Applicant: VIA TECHNOLOGIES, INC.

Inventors: G. Glenn Henry, Terry Parks
Processing unit incorporating special purpose register for use with instruction-based persistent vector multiplexer control

Patent number: 7904700

Abstract: A software-accessible special purpose register is architected into a processing unit in order to implement persistent vector multiplexer control of a vector-based execution unit. A persistent swizzle instruction is defined in an instruction set for the vector-based execution unit and is used to cause state information to be stored in the special purpose register such that the operand vectors processed by subsequent vector instructions executed by the vector-based execution unit will be selectively shuffled using the persisted state information. As a result, when multiple vector instructions require a common custom word ordering for one or more operand vectors, a single persistent swizzle instruction may be used to select the desired custom word ordering for all of the vector instructions.

Type: Grant

Filed: March 10, 2008

Date of Patent: March 8, 2011

Assignee: International Business Machines Corporation

Inventors: Eric Oliver Mejdrich, Adam James Muff, Robert Allen Shearer, Matthew Ray Tubbs
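
Illustration (not part of the filing): the persistent swizzle in the abstract for patent 7904700 can be modeled as state that is set once and then applied to an operand of every later vector instruction. The class, its method names, the four-word vectors, and the choice to shuffle only the first operand are assumptions for illustration.

```python
class VectorUnit:
    """Toy vector unit with a special purpose register holding a word ordering."""

    def __init__(self):
        self.swizzle = (0, 1, 2, 3)      # identity ordering until set otherwise

    def persistent_swizzle(self, order):
        """Store the word ordering; it stays in effect for later instructions."""
        self.swizzle = tuple(order)

    def _shuffle(self, vec):
        return [vec[i] for i in self.swizzle]

    def vadd(self, a, b):
        return [x + y for x, y in zip(self._shuffle(a), b)]

    def vmul(self, a, b):
        return [x * y for x, y in zip(self._shuffle(a), b)]
```

After a single `persistent_swizzle((3, 2, 1, 0))`, both `vadd` and `vmul` see their first operand reversed, so the ordering does not have to be repeated per instruction.
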
Processing unit incorporating instruction-based persistent vector multiplexer control

Patent number: 7904699

Abstract: Persistent vector multiplexer control is used in a vector-based execution unit to control the shuffling of words in operand vectors processed by the execution unit. In addition, a persistent swizzle instruction is defined in an instruction set for the vector-based execution unit and is used to cause state information to be persisted such that the operand vectors processed by subsequent vector instructions executed by the vector-based execution unit will be selectively shuffled using the persisted state information. As a result, when multiple vector instructions require a common custom word ordering for one or more operand vectors, a single persistent swizzle instruction may be used to select the desired custom word ordering for all of the vector instructions.

Type: Grant

Filed: March 10, 2008

Date of Patent: March 8, 2011

Assignee: International Business Machines Corporation

Inventors: Eric Oliver Mejdrich, Adam James Muff, Robert Allen Shearer, Matthew Ray Tubbs
Floating point only SIMD instruction set architecture including compare, select, Boolean, and alignment operations

Patent number: 7900025

Abstract: Mechanisms for implementing a floating point only single instruction multiple data instruction set architecture are provided. A processor is provided that comprises an issue unit, an execution unit coupled to the issue unit, and a vector register file coupled to the execution unit. The execution unit has logic that implements a floating point (FP) only single instruction multiple data (SIMD) instruction set architecture (ISA). The floating point vector registers of the vector register file store both scalar and floating point values as vectors having a plurality of vector elements. The processor may be part of a data processing system.

Type: Grant

Filed: October 14, 2008

Date of Patent: March 1, 2011

Assignee: International Business Machines Corporation

Inventor: Michael K. Gschwind
In-Data Path Tracking of Floating Point Exceptions and Store-Based Exception Indication

Publication number: 20110047358

Abstract: Mechanisms are provided for tracking exceptions in the execution of vectorized code. A speculative instruction is executed on a vector element of a vector. An exception condition is detected in association with the vector element based on a result of executing the speculative instruction on the vector element. A special exception value is stored in the vector element in a vector register corresponding to the vector, indicative of the exception condition, without invoking an exception handler for the exception condition. The special exception value is propagated with the vector element of the vector through a processor architecture of the processor, without invoking the exception handler for the exception condition. An exception corresponding to the exception condition indicated by the special exception value is generated only in response to a non-speculative instruction being executed that performs a non-speculative operation on the vector element.

Type: Application

Filed: August 19, 2009

Publication date: February 24, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Alexandre E. Eichenberger, Alan Gara, Michael K. Gschwind
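
Illustration (not part of the filing): the deferred exception scheme in the abstract for publication 20110047358 can be sketched with NaN standing in for the special exception value. The function names, the use of division as the speculative operation, and the `active` mask that selects which elements the non-speculative store touches are assumptions for illustration.

```python
import math

EXCEPTION_VALUE = float("nan")  # stand-in for the special exception value


def speculative_divide(a, b):
    """Divide element-wise; record a marker instead of raising on a bad element."""
    out = []
    for x, y in zip(a, b):
        if math.isnan(x) or math.isnan(y) or y == 0.0:
            out.append(EXCEPTION_VALUE)   # marker propagates; no handler runs
        else:
            out.append(x / y)
    return out


def nonspeculative_store(vec, memory, active):
    """Store active elements; only here does a marked element raise."""
    for i, value in enumerate(vec):
        if active[i]:
            if math.isnan(value):
                raise FloatingPointError(f"deferred exception in element {i}")
            memory[i] = value
```

An element that was computed speculatively but never used by a non-speculative operation therefore never triggers the exception.
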
Insertion of Operation-and-Indicate Instructions for Optimized SIMD Code

Publication number: 20110047359

Abstract: Mechanisms are provided for inserting indicated instructions for tracking and indicating exceptions in the execution of vectorized code. A portion of first code is received for compilation. The portion of first code is analyzed to identify non-speculative instructions performing designated non-speculative operations in the first code that are candidates for replacement by replacement operation-and-indicate instructions that perform the designated non-speculative operations and further perform an indication operation for indicating any exception conditions corresponding to special exception values present in vector register inputs to the replacement operation-and-indicate instructions. The replacement is performed and second code is generated based on the replacement of the at least one non-speculative instruction.

Type: Application

Filed: August 19, 2009

Publication date: February 24, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Alexandre E. Eichenberger, Alan Gara, Michael K. Gschwind
Operand queue for use in a floating point unit to reduce read-after-write latency and method of operation

Patent number: 7895418

Abstract: There is disclosed an operand queue for use in a floating point unit. The floating point unit comprises floating point processing units for executing floating point instructions that write operands to an external memory and for executing floating point instructions that read operands from the external memory. The floating point unit also comprises an operand queue for storing a plurality of operands associated with one or more operations being processed in the floating point unit. The operand queue stores a first operand being written to an external memory by a floating point write instruction executed by a first one of the plurality of floating point processing units and supplies the first operand to a floating point read instruction executed by a second one of the plurality of floating point processing units subsequent to the execution of the floating point write instruction.

Type: Grant

Filed: November 28, 2005

Date of Patent: February 22, 2011

Assignee: National Semiconductor Corporation

Inventor: Daniel W. Green
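
As a rough illustration of the forwarding behaviour described in the abstract of patent 7895418 above, the sketch below models a small queue that holds operands on their way to memory and hands them to a later read of the same address. It is not the patented design: the class name `OperandQueue`, the `capacity` parameter, the drain policy, and the dict standing in for external memory are all assumptions made for the example.

```python
from collections import OrderedDict


class OperandQueue:
    """Hypothetical model: pending writes sit in a small queue so that a
    later read can be served without waiting for external memory."""

    def __init__(self, capacity=4):
        self.capacity = capacity
        self.pending = OrderedDict()  # address -> operand, oldest first
        self.memory = {}              # stand-in for external memory

    def write(self, addr, value):
        # The newest write to an address replaces any older pending one.
        self.pending.pop(addr, None)
        self.pending[addr] = value
        if len(self.pending) > self.capacity:
            # Assumed policy: drain the oldest entry to memory when full.
            old_addr, old_value = self.pending.popitem(last=False)
            self.memory[old_addr] = old_value

    def read(self, addr):
        # Returns (value, forwarded). forwarded is True when the queue,
        # rather than external memory, supplied the operand.
        if addr in self.pending:
            return self.pending[addr], True
        return self.memory.get(addr), False
```

A read that follows a write to the same address is answered from the queue (`forwarded` is `True`), which is the read-after-write case the title refers to.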
RISC PROCESSOR APPARATUS AND METHOD FOR SUPPORTING X86 VIRTUAL MACHINE

Publication number: 20110035745

Abstract: A RISC processor apparatus and method for supporting an X86 virtual machine.

Type: Application

Filed: December 17, 2008

Publication date: February 10, 2011

Applicant: Institute of Computing Technology of the Chinese Academy of Sciences

Inventors: Guojie Li, Weiwu Hu, Xiaoyu Li, Menghao Su
NON-ATOMIC SCHEDULING OF MICRO-OPERATIONS TO PERFORM ROUND INSTRUCTION

Publication number: 20110029760

Abstract: A microprocessor executes an instruction specifying a floating-point input operand having a predetermined size and that instructs the microprocessor to round the floating-point input operand to an integer value using a rounding mode and to return a floating-point result having the same predetermined size. An instruction translator translates the instruction into first and second microinstructions. An execution unit executes the first and second microinstructions. The first microinstruction receives as an input operand the instruction floating-point input operand and generates an intermediate result from the input operand. The second microinstruction receives as an input operand the intermediate result of the first microinstruction and generates the floating-point result of the instruction from the intermediate result. The intermediate result is the same predetermined size as the instruction floating-point input operand.

Type: Application

Filed: May 20, 2010

Publication date: February 3, 2011

Applicant: VIA TECHNOLOGIES, INC.

Inventors: Tom Elmer, Terry Parks
DYNAMIC FLOATING POINT REGISTER PRECISION CONTROL

Publication number: 20110004644

Abstract: Apparatus and methods are provided to perform floating point operations that are adaptive to the precision formats of input operands. The apparatus includes adaptive conversion logic and a tagged register file. The adaptive conversion logic receives the input operands, where each of the input operands is of a corresponding precision. The adaptive conversion logic also records the corresponding precision for use in subsequent floating point operations. The tagged register file is coupled to the adaptive conversion logic. The tagged register file stores each of the input operands, stores the corresponding precision, and furthermore associates the corresponding precision with each of the input operands. The subsequent floating point operations are performed at a precision level according to the corresponding precision.

Type: Application

Filed: July 3, 2009

Publication date: January 6, 2011

Applicant: VIA Technologies, Inc.

Inventors: G. Glenn Henry, Rodney E. Hooker, Terry Parks
Aligning precision converted vector data using mask indicating offset relative to element boundary corresponding to precision type

Patent number: 7865693

Abstract: Mechanisms for aligning enhanced precision vectors based on reduced precision data values are provided. At least one data value, having a first precision type, is received for storing in a vector register. The vector register stores data as a vector having a plurality of vector elements. The first precision type is modified to have a second precision type different in precision than the first precision type to thereby generate at least one modified data value. The at least one modified data value is stored in at least one vector element of the plurality of vector elements. An alignment of the at least one modified data value is determined relative to a boundary of a vector element of the vector register. An alignment operation to re-align the at least one modified data value based on the boundary of the vector element of the vector register is performed.

Type: Grant

Filed: October 14, 2008

Date of Patent: January 4, 2011

Assignee: International Business Machines Corporation

Inventors: Alexandre E. Eichenberger, Bruce M. Fleischer, Michael K. Gschwind
Remapping source registers to aid instruction scheduling within a processor

Publication number: 20100332805

Abstract: An out-of-order renaming processor is provided with a register file within which aliasing between registers of different sizes may occur. In this way a program instruction having a source register of a double precision size may alias with two single precision registers being used as destinations of one or more preceding program instructions. In order to track this data dependency the double precision register may be remapped into a micro-operation specifying two single precision registers as its source register. In this way, scheduling circuitry may use its existing hazard detection and management mechanisms to handle potential data hazards and dependencies. Not all program instructions having such data hazards between registers of different sizes are handled by this source register remapping. For these other program instructions a slower mechanism for dealing with the data dependency hazard is provided.

Type: Application

Filed: June 24, 2009

Publication date: December 30, 2010

Applicant: ARM Limited

Inventors: Conrado Blasco Allue, David James Williamson, James Nolan Hardage, Glen Andrew Harris, Robert Gregory McDonald
Data processing apparatus and method

Publication number: 20100325397

Abstract: A data processing apparatus is described which comprises processing circuitry responsive to data processing instructions to execute integer data processing operations and floating point data processing operations, a first set of integer registers useable by the processing circuitry in executing the integer data processing operations, and a second set of floating point registers useable by the processing circuitry in executing the floating point data processing operations.

Type: Application

Filed: May 3, 2010

Publication date: December 23, 2010

Inventor: Simon John Craske
VECTOR TEST INSTRUCTION FOR PROCESSING VECTORS

Publication number: 20100325399

Abstract: The described embodiments provide a processor that executes a vector instruction. The processor starts by receiving a vector instruction that uses at least one vector of values that includes N elements as an input. In addition, the processor optionally receives a predicate vector that includes N elements. The processor then executes the vector instruction. In the described embodiments, when executing the vector instruction, the processor checks one or more selected elements in the vector of values to determine if the selected elements contain a predetermined value; if the predicate vector is received, the check covers only selected elements for which a corresponding element in the predicate vector is active. When the selected elements contain the predetermined value, the processor sets a corresponding status flag.

Type: Application

Filed: August 31, 2010

Publication date: December 23, 2010

Applicant: APPLE INC.

Inventors: Jeffry E. Gonion, Keith E. Diefendorff
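
The element selection described in the abstract of publication 20100325399 above can be sketched in a few lines. This is only a software analogy under stated assumptions: the function name `vector_test` is hypothetical, the status flag is modelled as a returned boolean, and the flag is assumed to be set when any selected element matches (the abstract does not say whether "any" or "all" is meant).

```python
def vector_test(values, predetermined, predicate=None):
    """Return a status flag for a vector of N values.

    Assumption: the flag is True when ANY selected element equals
    `predetermined`. When `predicate` is given, only elements whose
    predicate entry is active (truthy) are selected.
    """
    if predicate is not None and len(predicate) != len(values):
        raise ValueError("predicate must have the same length as values")
    if predicate is None:
        selected = list(values)
    else:
        selected = [v for v, active in zip(values, predicate) if active]
    return any(v == predetermined for v in selected)
```

For example, testing `[1, 0, 3, 0]` for the value `0` sets the flag without a predicate, but not when the predicate `[1, 0, 1, 0]` masks out both zero elements.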
RUNNING-MIN AND RUNNING-MAX INSTRUCTIONS FOR PROCESSING VECTORS

Publication number: 20100325398

Abstract: The described embodiments provide a processor for generating a result vector that contains results from a comparison operation. During operation, the processor receives a first input vector, a second input vector, and a control vector. When subsequently generating a result vector, the processor first captures a base value from a key element position in the first input vector. For selected elements in the result vector, the processor compares the base value and values from relevant elements to the left of a corresponding element in the second input vector, and writes the result into the element in the result vector. In the described embodiments, the key element position and the relevant elements can be defined by the control vector and an optional predicate vector.

Type: Application

Filed: August 31, 2010

Publication date: December 23, 2010

Applicant: APPLE INC.

Inventors: Jeffry E. Gonion, Keith E. Diefendorff
SUPERSCALAR REGISTER-RENAMING FOR A STACK-ADDRESSED ARCHITECTURE

Publication number: 20100318772

Abstract: A system and method for increasing processor throughput by decreasing a loop critical path. In one embodiment, a table comprises multiple stack entries, each comprising an x87 floating-point (FP) stack specifier. The combinatorial logic for operand translation of N FP instructions per clock cycle may require N instantiated copies of a combinatorial logic block. Each instantiated copy may determine a new ordering of the stack entries. Control logic may receive necessary information from the corresponding N FP instructions and determine a corresponding combined computational effect, or stack reordering, on entries within the table based on two or more instructions. Resulting control signals are conveyed to the N instantiated copies. A resulting accumulative delay from an input of the first copy to the output of the Nth copy may be less than or equal to (N-1)*time_delay versus a longer N*time_delay.

Type: Application

Filed: June 11, 2009

Publication date: December 16, 2010

Inventors: Ranganathan Sudhakar, Daryl Lieu, Debjit Das Sarma
Method and apparatus for performing improved group instructions

Patent number: 7849291

Abstract: Systems and apparatuses are presented relating to a programmable processor comprising an execution unit that is operable to decode and execute instructions received from an instruction path and partition data stored in registers in the register file into multiple data elements, the execution unit capable of executing a plurality of different group floating-point and group integer arithmetic operations that each arithmetically operates on multiple data elements stored in registers in a register file to produce a catenated result that is returned to a register in the register file, wherein the catenated result comprises a plurality of individual results, wherein the execution unit is capable of executing group data handling operations that re-arrange data elements in different ways in response to data handling instructions.

Type: Grant

Filed: October 29, 2007

Date of Patent: December 7, 2010

Assignee: Microunity Systems Engineering, Inc.

Inventors: Craig Hansen, John Moussouris
Sharing data in internal and memory representations with dynamic data-driven conversion

Patent number: 7849294

Abstract: Illustrative embodiments determine the data type of the operand being accessed as well as analyze the data value subrange of the input operand data type. If the operand's data type does not match the required format of the instruction being processed, a determination is made as to whether a subrange of data values of the data type of the input operand is supported natively. If the subrange of data values of the input operand is not supported natively, then a format conversion is performed on the data and the instruction may then operate on the data. Otherwise, the data may be operated on directly by the instruction without a format conversion operation and thus, the conversion is not performed.

Type: Grant

Filed: January 31, 2008

Date of Patent: December 7, 2010

Assignee: International Business Machines Corporation

Inventors: Michael K. Gschwind, Brett Olsson
SINGLE CYCLE DATA MOVEMENT BETWEEN GENERAL PURPOSE AND FLOATING-POINT REGISTERS

Publication number: 20100306510

Abstract: Systems and methods for providing single cycle movement of data between a floating-point register file (FRF) and a general purpose or integer register file (IRF) of a microprocessor system are provided. The system may include an integer execution unit operative to execute instructions with single cycle latency, a floating-point execution unit, a working register file (WRF), an FRF, and an IRF. To achieve the single cycle movement functionality, the integer execution unit may physically own the WRF, IRF, and FRF, and may monitor and control any dependencies between them. Thus, since the integer execution unit has direct read access to both the IRF and the FRF, data may be moved between the two register files using the single cycle operation of the integer execution unit, without the need to store and load the data from memory.

Type: Application

Filed: June 2, 2009

Publication date: December 2, 2010

Applicant: Sun Microsystems, Inc.

Inventors: Christopher Olson, Robert T. Golla, Jeffrey S. Brooks
Checking for exception by floating point instruction reordered across branch by comparing current status in FP status register against last status copied in shadow register

Patent number: 7840788

Abstract: A process which automatically inserts commands that test for and raise exceptions indicating floating point status exceptions into a sequence of instructions to be executed, re-orders pipelined instructions by moving a floating point instruction from after a branch instruction to before the branch instruction, and responds to exceptions in execution of the sequence of instructions by returning execution to a point in the sequence of instructions at which correct state is known and then executing each instruction in the sequence singly to completion so that exceptions in pipelined floating point instructions can be automatically detected and handled precisely.

Type: Grant

Filed: February 26, 2008

Date of Patent: November 23, 2010

Inventors: Guillermo J. Rozas, David Dunn, Robert F. Cmelik
Method and floating point unit to convert a hexadecimal floating point number to a binary floating point number

Patent number: 7840622

Abstract: Method to convert a hexadecimal floating point number (H) into a binary floating point number by using a Floating Point Unit (FPU) with fused multiply add with an A-register and a B-register for two multiplicand operands and a C-register for an addend operand, wherein a leading zero counting unit (LZC) is associated to the addend C-register, wherein the difference of the leading zero result provided by the LZC and the input exponent (E) is calculated by a control unit, which determines, based on the Raw-Result-Exponent, a force signal (F) with special conditions like ‘Exponent Overflow’, ‘Exponent Underflow’, and ‘Zero Result’.

Type: Grant

Filed: July 20, 2006

Date of Patent: November 23, 2010

Assignee: International Business Machines Corporation

Inventors: Guenter Gerwig, Klaus Michael Kroener
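
For readers unfamiliar with hexadecimal floating point, the numeric relationship behind the conversion in patent 7840622 above can be sketched in software. This is not the FPU datapath from the abstract (no A/B/C registers, leading zero counter, or force signal are modelled). It assumes the common 64-bit hexadecimal layout of 1 sign bit, a 7-bit excess-64 base-16 exponent, and a 56-bit fraction; the function name `hfp64_to_float` is hypothetical.

```python
import math


def hfp64_to_float(bits):
    """Convert a 64-bit hexadecimal floating-point pattern to a Python float.

    Assumed layout: bit 63 = sign, bits 62..56 = excess-64 exponent
    (power of 16), bits 55..0 = fraction with the radix point on its left.
    """
    sign = -1.0 if (bits >> 63) & 1 else 1.0
    exponent = ((bits >> 56) & 0x7F) - 64
    fraction = bits & ((1 << 56) - 1)
    if fraction == 0:
        return sign * 0.0
    # value = fraction * 2**-56 * 16**exponent
    # The 56-bit fraction is rounded to the 53-bit binary significand here,
    # so up to three low-order bits can be lost.
    return sign * math.ldexp(fraction, 4 * exponent - 56)
```

Because a normalized hexadecimal fraction may start with up to three zero bits, a binary result generally needs a shift that depends on the leading zero count, which is presumably why the abstract ties a leading zero counter to the exponent calculation.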
RELIABLE EXECUTION USING COMPARE AND TRANSFER INSTRUCTION ON AN SMT MACHINE

Publication number: 20100281239

Abstract: A system and method for efficient reliable execution on a simultaneous multithreading machine. A processor is placed in a reliable execution mode (REM) to detect possible errors during execution of a mission critical software application. Only two threads may be configured to operate in this mode. Floating-point store and integer-transfer unary instructions may be converted to new binary instructions. Each new instruction has two source operands, each of which corresponds to a different thread and is specified by the same logical register number as the single source operand of the original unary instruction. All other instructions are replicated, wherein the original instruction and its twin are assigned to different threads. Simultaneous multi-threaded (SMT) floating-point logic may only be able to provide lockstep execution when it communicates using the new instruction with instantiated integer independent clusters.

Type: Application

Filed: April 29, 2009

Publication date: November 4, 2010

Inventors: Ranganathan Sudhakar, Nhon T. Quach
Processing unit for generating control signal, controller with the processing unit for controlling actuator, and program executed in the processing unit

Patent number: 7826935

Abstract: A controller with a processing unit and a floating-point processor is disposed in a vehicle to control an actuator according to detected parameters sent from sensors. The processor generates a floating-point control parameter indicating a first quantity of control from the detected parameters using floating-point calculations, and the unit generates a fixed-point control parameter indicating a second quantity of control from the detected parameters using fixed-point calculations. The unit converts the floating-point control parameter into a converted control parameter of fixed-point representation and judges from both the converted control parameter and the fixed-point control parameter whether a failure has occurred in the processor. When no failure has occurred in the processor, the unit generates a control signal from the floating-point control parameter. The actuator is operated according to the control signal so as to give the first quantity of control to a controlled object.

Type: Grant

Filed: April 5, 2007

Date of Patent: November 2, 2010

Assignee: Denso Corporation

Inventors: Hidetoshi Kobayashi, Eiji Takayama
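
The cross-check described in the abstract of patent 7826935 above (compute the control quantity twice, once in floating point and once in fixed point, then compare) can be sketched as follows. Everything concrete here is an assumption for illustration: the Q8 scaling, the linear control law, the tolerance, and the choice to return `None` on failure (the abstract only states what happens when no failure is found).

```python
SCALE = 1 << 8  # hypothetical Q8 fixed-point scaling (8 fractional bits)


def to_fixed(x):
    """Convert a float to the assumed Q8 fixed-point representation."""
    return int(round(x * SCALE))


def control_float(sensor):
    # Hypothetical control law evaluated by the floating-point processor.
    return 0.75 * sensor + 1.5


def control_fixed(sensor_fx):
    # The same law in Q8 integer arithmetic: 0.75 -> 192/256, 1.5 -> 384.
    return (192 * sensor_fx) // SCALE + 384


def control_signal(sensor, tolerance=2, float_path=control_float):
    """Return (signal, failed).

    The floating-point result is converted to fixed point and compared
    with the independently computed fixed-point result. If they agree
    within `tolerance`, the floating-point value is used as the signal.
    """
    fp_value = float_path(sensor)
    fx_value = control_fixed(to_fixed(sensor))
    failed = abs(to_fixed(fp_value) - fx_value) > tolerance
    return (None if failed else fp_value), failed
```

Passing a deliberately wrong `float_path` simulates a faulty floating-point processor and makes the comparison report a failure.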
LOGICAL MAP TABLE FOR DETECTING DEPENDENCY CONDITIONS

Publication number: 20100274993

Abstract: Techniques and structures are described which allow the detection of certain dependency conditions, including evil twin conditions, during the execution of computer instructions. Information used to detect dependencies may be stored in a logical map table, which may include a content-addressable memory. The logical map table may maintain a logical register to physical register mapping, including entries dedicated to physical registers available as rename registers. In one embodiment, each entry in the logical map table includes a first value usable to indicate whether only a portion of the physical register is valid and whether the physical register includes the most recent update to the logical register being renamed. Use of this first value may allow precise detection of dependency conditions, including evil twin conditions, upon an instruction reading from at least two portions of a logical register having an entry in the logical map table whose first value is set.

Type: Application

Filed: April 22, 2009

Publication date: October 28, 2010

Inventors: Robert T. Golla, Jama I. Barreh, Jeffrey S. Brooks, Howard L. Levy
APPARATUS AND METHOD FOR HANDLING DEPENDENCY CONDITIONS

Publication number: 20100274992

Abstract: Techniques for handling dependency conditions, including evil twin conditions, are disclosed herein. An instruction may designate a source register comprising two portions. The source register may be a double-precision register and its two portions may be single-precision portions, each specified as destinations by two other single-precision instructions. Execution of these two single-precision instructions, especially on a register renaming machine, may result in the appropriate values for the two portions of the source register being stored in different physical locations, which can complicate execution of an instruction stream. In response to detecting a potential dependency, one or more instructions may be inserted in an instruction stream to enable the appropriate values to be stored within one physical double precision register, eliminating an actual or potential evil twin dependency.

Type: Application

Filed: April 22, 2009

Publication date: October 28, 2010

Inventors: Yuan C. Chou, Jared C. Smolens, Jeffrey S. Brooks
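
The instruction insertion described in the abstract of publication 20100274992 above can be illustrated with a toy pass over an instruction list. This is a sketch under assumptions, not the disclosed mechanism: the register naming (`s2k` and `s2k+1` alias `dk`), the `merge` pseudo-instruction, and the function name `insert_merges` are all hypothetical.

```python
def insert_merges(instructions):
    """Insert a merge before any double-precision read whose register had
    a half written by a single-precision instruction.

    instructions: list of (op, dest, sources) tuples.
    Assumed aliasing: single registers s(2k) and s(2k+1) overlay double dk.
    """
    split = set()  # double registers with a half written on its own
    out = []
    for op, dest, sources in instructions:
        for src in sources:
            if src.startswith("d") and src in split:
                k = int(src[1:])
                # Gather both halves into one physical double register.
                out.append(("merge", src, [f"s{2 * k}", f"s{2 * k + 1}"]))
                split.discard(src)
        out.append((op, dest, sources))
        if dest.startswith("s"):
            split.add(f"d{int(dest[1:]) // 2}")
        elif dest.startswith("d"):
            split.discard(dest)  # a full-width write makes the register whole
    return out
```

In this model, two single-precision writes to `s0` and `s1` followed by a double-precision read of `d0` cause one `merge` to be placed directly before the read.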
