Floating Point Or Vector Patents (Class 712/222)

Floating point scaling processors, methods, systems, and instructions

Patent number: 10228909

Abstract: A method of an aspect includes receiving a floating point scaling instruction. The floating point scaling instruction indicates a first source including one or more floating point data elements, a second source including one or more corresponding floating point data elements, and a destination. A result is stored in the destination in response to the floating point scaling instruction. The result includes one or more corresponding result floating point data elements each including a corresponding floating point data element of the second source multiplied by a base of the one or more floating point data elements of the first source raised to a power of an integer representative of the corresponding floating point data element of the first source. Other methods, apparatus, systems, and instructions are disclosed.

Type: Grant

Filed: March 30, 2018

Date of Patent: March 12, 2019

Assignee: Intel Corporation

Inventors: Cristina S. Anderson, Amit Gradstein, Robert Valentine, Simon Rubanovich, Benny Eitan
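
The scaling operation in the abstract of patent 10228909 can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes a base of 2 (binary floating point) and takes the "integer representative" of each first-source element to be its floor; the function name `scale_elements` is hypothetical, and special values such as NaN or infinity are not handled.

```python
import math


def scale_elements(first_source, second_source):
    """Return second[i] * 2**floor(first[i]) for each pair of elements."""
    results = []
    for exponent_source, value in zip(first_source, second_source):
        # Assumption: the integer representative is the floor of the element.
        power = math.floor(exponent_source)
        # ldexp(value, power) computes value * 2**power.
        results.append(math.ldexp(value, power))
    return results


print(scale_elements([3.7, -1.2], [1.5, 8.0]))  # [12.0, 2.0]
```

Here 3.7 floors to 3, so 1.5 is multiplied by 8, and -1.2 floors to -2, so 8.0 is multiplied by 0.25.
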
Processors, methods, systems, and instructions to store consecutive source elements to unmasked result elements with propagation to masked result elements

Patent number: 10223113

Abstract: A processor of an aspect includes a decode unit to decode an instruction indicating a first source packed data operand including at least four data elements, a source mask including at least four mask elements, and a destination storage location. An execution unit, in response to the instruction, stores a result packed data operand having a series of at least two unmasked result data elements. Each of the unmasked result data elements stores a value of a different one of at least two consecutive data elements of the first source packed data operand in a relative order. All masked result elements, which are between a nearest corresponding pair of unmasked result data elements, have a same value as an unmasked result data element of the corresponding pair, which is closest to a first end of the result packed data operand. The masked result data elements correspond to masked mask elements.

Type: Grant

Filed: March 27, 2014

Date of Patent: March 5, 2019

Assignee: Intel Corporation

Inventor: Mikhail Plotnikov
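
A rough Python sketch of the behavior described in the abstract of patent 10223113 follows. It rests on several assumptions: a truthy mask bit means "unmasked", the "first end" is the lowest index, and masked positions before the first unmasked one receive a placeholder `fill` value (the abstract does not specify this case, nor the case of masked elements after the last unmasked one, which this sketch also propagates into). The function name is hypothetical.

```python
def expand_with_propagation(source, mask, fill=None):
    """Place consecutive source elements at unmasked positions and
    propagate the most recent unmasked value into masked positions."""
    result = []
    next_source_index = 0
    last_unmasked_value = fill  # assumption: used before any unmasked element
    for mask_bit in mask:
        if mask_bit:
            # Unmasked position: take the next consecutive source element.
            last_unmasked_value = source[next_source_index]
            next_source_index += 1
        # Masked positions repeat the nearest unmasked value on the low side.
        result.append(last_unmasked_value)
    return result


print(expand_with_propagation([10, 20, 30, 40], [1, 0, 0, 1]))  # [10, 10, 10, 20]
```
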
Accelerated interlane vector reduction instructions

Patent number: 10209989

Abstract: A vector reduction instruction is executed by a processor to provide efficient reduction operations on an array of data elements. The processor includes vector registers. Each vector register is divided into a plurality of lanes, and each lane stores the same number of data elements. The processor also includes execution circuitry that receives the vector reduction instruction to reduce the array of data elements stored in a source operand into a result in a destination operand using a reduction operator. Each of the source operand and the destination operand is one of the vector registers. Responsive to the vector reduction instruction, the execution circuitry applies the reduction operator to two of the data elements in each lane, and shifts one or more remaining data elements when there is at least one of the data elements remaining in each lane.

Type: Grant

Filed: March 7, 2017

Date of Patent: February 19, 2019

Assignee: Intel Corporation

Inventors: Paul Caprioli, Abhay S. Kanhere, Jeffrey J. Cook, Muawya M. Al-Otoom
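
The per-lane reduce-and-shift step in the abstract of patent 10209989 can be sketched in Python as below. This is only an illustration: lanes are modeled as non-empty lists that shrink after each step, whereas a hardware register has fixed width, and the names `reduce_step` and `reduce_lanes` are hypothetical.

```python
from operator import add


def reduce_step(lanes, op):
    """Apply op to two elements in each lane and shift the rest down."""
    new_lanes = []
    for lane in lanes:
        if len(lane) < 2:
            # Nothing left to combine in this lane.
            new_lanes.append(list(lane))
            continue
        combined = op(lane[0], lane[1])
        # The remaining elements move down by one position.
        new_lanes.append([combined] + list(lane[2:]))
    return new_lanes


def reduce_lanes(lanes, op=add):
    """Repeat the step until every lane holds a single reduced value."""
    while any(len(lane) > 1 for lane in lanes):
        lanes = reduce_step(lanes, op)
    return [lane[0] for lane in lanes]


print(reduce_lanes([[1, 2, 3, 4], [5, 6, 7, 8]]))  # [10, 26]
```
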
Pessimistic dependency handling based on storage regions

Patent number: 10114650

Abstract: Techniques are disclosed relating to handling dependencies between instructions. In one embodiment, an apparatus includes decode circuitry and dependency circuitry. In this embodiment, the decode circuitry is configured to receive an instruction that specifies a destination location and determine a first storage region that includes the destination location. In this embodiment, the storage region is one of a plurality of different storage regions accessible by instructions processed by the apparatus. In this embodiment, the dependency circuitry is configured to stall the instruction until one or more older instructions that specify source locations in the first storage region have read their source locations. The disclosed techniques may be described as “pessimistic” dependency handling, which may, in some instances, maintain performance while limiting complexity, power consumption, and area of dependency logic.

Type: Grant

Filed: February 23, 2015

Date of Patent: October 30, 2018

Assignee: Apple Inc.

Inventors: Robert D. Kenney, Liang-Kai Wang
Floating point scaling processors, methods, systems, and instructions

Patent number: 10089076

Abstract: A method of an aspect includes receiving a floating point scaling instruction. The floating point scaling instruction indicates a first source including one or more floating point data elements, a second source including one or more corresponding floating point data elements, and a destination. A result is stored in the destination in response to the floating point scaling instruction. The result includes one or more corresponding result floating point data elements each including a corresponding floating point data element of the second source multiplied by a base of the one or more floating point data elements of the first source raised to a power of an integer representative of the corresponding floating point data element of the first source. Other methods, apparatus, systems, and instructions are disclosed.

Type: Grant

Filed: March 15, 2018

Date of Patent: October 2, 2018

Assignee: Intel Corporation

Inventors: Cristina S. Anderson, Amit Gradstein, Robert Valentine, Simon Rubanovich, Benny Eitan
Multiple-core computer processor for reverse time migration

Patent number: 10078593

Abstract: A multi-core computer processor including a plurality of processor cores interconnected in a Network-on-Chip (NoC) architecture, a plurality of caches, each of the plurality of caches being associated with one and only one of the plurality of processor cores, and a plurality of memories, each of the plurality of memories being associated with a different set of at least one of the plurality of processor cores and each of the plurality of memories being configured to be visible in a global memory address space such that the plurality of memories are visible to two or more of the plurality of processor cores, wherein at least one of a number of the processor cores, a size of each of the plurality of caches, or a size of each of the plurality of memories is configured for performing a reverse-time-migration (RTM) computation.

Type: Grant

Filed: October 26, 2012

Date of Patent: September 18, 2018

Assignee: THE REGENTS OF THE UNIVERSITY OF CALIFORNIA

Inventors: John Shalf, David Donofrio, Leonid Oliker, Jens Kruger, Samuel Williams
Executing perform floating point operation instructions

Patent number: 10061560

Abstract: A method and system are disclosed for executing a machine instruction in a central processing unit. The method comprises the steps of obtaining a perform floating-point operation instruction; obtaining a test bit; and determining a value of the test bit. If the test bit has a first value, (a) a specified floating-point operation function is performed, and (b) a condition code is set to a value determined by said specified function. If the test bit has a second value, (c) a check is made to determine if said specified function is valid and installed on the machine, (d) if said specified function is valid and installed on the machine, the condition code is set to one code value, and (e) if said specified function is either not valid or not installed on the machine, the condition code is set to a second code value.

Type: Grant

Filed: April 25, 2016

Date of Patent: August 28, 2018

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Michel H. T. Hack, Ronald M. Smith, Sr.
Multifunctional hexadecimal instruction form system and program product

Patent number: 9996346

Abstract: A new zSeries floating-point unit has a fused multiply-add dataflow capable of supporting two architectures and fused MULTIPLY AND ADD and MULTIPLY AND SUBTRACT in both RRF and RXF formats for the fused functions. Both binary and hexadecimal floating-point instructions are supported for a total of 6 formats. The floating-point unit is capable of performing a multiply-add instruction for hexadecimal or binary every cycle with a latency of 5 cycles. This supports two architectures with two internal formats with their own biases. This has eliminated format conversion cycles and has optimized the width of the dataflow. The unit is optimized for both hexadecimal and binary floating-point architecture supporting a multiply-add/subtract per cycle.

Type: Grant

Filed: June 28, 2017

Date of Patent: June 12, 2018

Assignee: International Business Machines Corporation

Inventors: Eric M. Schwarz, Ronald M. Smith, Sr.
Instruction to reduce elements in a vector register with strided access pattern

Patent number: 9921832

Abstract: A vector reduction instruction with non-unit strided access pattern is received and executed by the execution circuitry of a processor. In response to the instruction, the execution circuitry performs an associative reduction operation on data elements of a first vector register. Based on values of the mask register and a current element position being processed, the execution circuitry sequentially sets one or more data elements of the first vector register to a result, which is generated by the associative reduction operation applied to both a previous data element of the first vector register and a data element of a third vector register. The previous data element is located more than one element position away from the current element position.

Type: Grant

Filed: December 28, 2012

Date of Patent: March 20, 2018

Assignee: Intel Corporation

Inventors: Albert Hartono, Jayashankar Bharadwaj, Nalini Vasudevan, Sara S. Baghsorkhi, Victor W. Lee, Daehyun Kim
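
One way to picture the strided reduction in the abstract of patent 9921832 is the Python sketch below. It makes assumptions the abstract leaves open: the stride is a fixed parameter (2 here), a truthy mask value enables an element position, and addition stands in for the associative operation. The function name `strided_reduce` is hypothetical.

```python
from operator import add


def strided_reduce(first, third, mask, stride=2, op=add):
    """Sequentially update first[i] from first[i - stride] and third[i]."""
    result = list(first)
    for position in range(stride, len(result)):
        if mask[position]:
            # The previous element is more than one position away (stride > 1),
            # and may itself have been updated earlier in this loop.
            result[position] = op(result[position - stride], third[position])
    return result


print(strided_reduce([1, 1, 1, 1, 1, 1], [10, 20, 30, 40, 50, 60], [1] * 6))
# [1, 1, 31, 41, 81, 101]
```

Because the loop is sequential, position 4 sees the updated value at position 2, which is what makes the operation a running reduction along each stride.
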
Methods, apparatus, instructions and logic to provide permute controls with leading zero count functionality

Patent number: 9804850

Abstract: Instructions and logic provide SIMD permute controls with leading zero count functionality. Some embodiments include processors with a register with a plurality of data fields, each of the data fields to store a second plurality of bits. A destination register has corresponding data fields, each of these data fields to store a count of the number of most significant contiguous bits set to zero for corresponding data fields. Responsive to decoding a vector leading zero count instruction, execution units count the number of most significant contiguous bits set to zero for each of data fields in the register, and store the counts in corresponding data fields of the first destination register. Vector leading zero count instructions can be used to generate permute controls and completion masks to be used along with the set of permute controls, to resolve dependencies in gather-modify-scatter SIMD operations.

Type: Grant

Filed: June 21, 2016

Date of Patent: October 31, 2017

Assignee: Intel Corporation

Inventors: Christopher J. Hughes, Mikhail Plotnikov, Andrey Naraikin, Robert Valentine
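
The per-element count in the abstract of patent 9804850 can be sketched in Python as follows. The 32-bit element width and the function name are assumptions made for illustration; the sketch covers only the counting step, not the generation of permute controls.

```python
def vector_leading_zero_count(elements, width=32):
    """Count the most significant contiguous zero bits of each element."""
    counts = []
    for value in elements:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{value} does not fit in {width} bits")
        # bit_length() is the position of the highest set bit (0 for zero).
        counts.append(width - value.bit_length())
    return counts


print(vector_leading_zero_count([0, 1, 0x80000000, 0x00FF0000]))  # [32, 31, 0, 8]
```
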
Data synchronization in a cloud infrastructure

Patent number: 9734224

Abstract: A synchronization infrastructure that synchronizes data stored between components in a cloud infrastructure system is described. A first component in the cloud infrastructure system may store subscription information related to a subscription order which may in turn be utilized by a second component in the cloud infrastructure system to orchestrate the provisioning of services and resources for the order placed by the customer. The synchronization architecture utilizes transactionally consistent checkpoints that describe the state of the data stored in the components to synchronize the data between these components.

Type: Grant

Filed: March 20, 2015

Date of Patent: August 15, 2017

Assignee: Oracle International Corporation

Inventors: Ramkrishna Chatterjee, Ramesh Vasudevan, Anjani Kalyan Prathipati, Gopalan Arun
Super multiply add (super madd) instruction

Patent number: 9733935

Abstract: A method of processing an instruction is described that includes fetching and decoding the instruction. The instruction has separate destination address, first operand source address and second operand source address components. The first operand source address identifies a location of a first mask pattern in mask register space. The second operand source address identifies a location of a second mask pattern in the mask register space. The method further includes fetching the first mask pattern from the mask register space; fetching the second mask pattern from the mask register space; merging the first and second mask patterns into a merged mask pattern; and, storing the merged mask pattern at a storage location identified by the destination address.

Type: Grant

Filed: December 23, 2011

Date of Patent: August 15, 2017

Assignee: Intel Corporation

Inventors: Jesus Corbal, Andrew T. Forsyth, Roger Espasa, Manel Fernandez, Thomas D. Fletcher
Exception control method, system, and program

Patent number: 9710270

Abstract: A method for programmably controlling an exception includes performing, by a processor, a step of executing a control specification instruction for exception control specification that indicates whether an exception is enabled or not and setting a control specification value for the exception in a register and a step of executing a control execution instruction for exception control execution that indicates whether the exception is to be raised or not, determining whether the control specification value set in the register is a value for enabling the exception, and, when the control specification value is the value for enabling the exception, raising the exception. The method further includes performing a step of not raising the exception when the control specification value set in the register is not the value for enabling the exception.

Type: Grant

Filed: December 20, 2011

Date of Patent: July 18, 2017

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor: Noriaki Asamoto
Vector shuffle instructions operating on multiple lanes each having a plurality of data elements using a same set of per-lane control bits

Patent number: 9672034

Abstract: In-lane vector shuffle operations are described. In one embodiment a shuffle instruction specifies a field of per-lane control bits, a source operand and a destination operand, these operands having corresponding lanes, each lane divided into corresponding portions of multiple data elements. Sets of data elements are selected from corresponding portions of every lane of the source operand according to per-lane control bits. Elements of these sets are copied to specified fields in corresponding portions of every lane of the destination operand. Another embodiment of the shuffle instruction also specifies a second source operand, all operands having corresponding lanes divided into multiple data elements. A set selected according to per-lane control bits contains data elements from every lane portion of a first source operand and data elements from every corresponding lane portion of the second source operand. Set elements are copied to specified fields in every lane of the destination operand.

Type: Grant

Filed: March 15, 2013

Date of Patent: June 6, 2017

Assignee: Intel Corporation

Inventors: Zeev Sperber, Robert Valentine, Benny Eitan, Doron Orenstein
Computer processor providing exception handling with reduced state storage

Patent number: 9619233

Abstract: A computer architecture allows for simplified exception handling by restarting the program after exceptions at the beginning of idempotent regions, the idempotent regions allowing re-execution without the need for restoring complex state information from checkpoints. Recovery from mis-speculation may be provided by a similar mechanism but using smaller idempotent regions reflecting a more frequent occurrence of mis-speculation. A compiler generating different idempotent regions for speculation and exception handling is also disclosed.

Type: Grant

Filed: February 19, 2016

Date of Patent: April 11, 2017

Assignee: Wisconsin Alumni Research Foundation

Inventors: Jaikrishnan Menon, Marc Asher De Kruijf, Karthikeyan Sankaralingam
Message dispatcher for payment system

Patent number: 9613350

Abstract: A payment reader includes a contactless interface for communicating with a contactless device. The payment reader has a processor that executes instructions stored in memory, and the instructions include instructions for a plurality of firmware modules including a message dispatcher module and a plurality of functional modules. The functional modules generate messages and the message dispatcher module stores the messages in a queued data structure such as a stack or a queue. The messages are provided to the functional modules from the queued data structure. Some of the messages are timed messages that are returned to the queued data structure.

Type: Grant

Filed: February 24, 2016

Date of Patent: April 4, 2017

Inventor: Kshitiz Vadera
Hazard check instructions for enhanced predicate vector operations

Patent number: 9600280

Abstract: A hazard check instruction has operands that specify addresses of vector elements to be read by first and second vector memory operations. The hazard check instruction outputs a dependency vector identifying, for each element position of the first vector corresponding to the first vector memory operation, which element position of the second vector that the element of the first vector depends on (if any). In an embodiment, at least one of the vector memory operations has addresses specified using a scalar address in the operands (and a vector attribute associated with the vector). In an embodiment, the operands may include predicates for one or both of the vector memory operations, indicating which vector elements are active. The dependency vector may be qualified by the predicates, indicating dependencies only for active elements.

Type: Grant

Filed: September 24, 2013

Date of Patent: March 21, 2017

Assignee: Apple Inc.

Inventor: Jeffry E. Gonion
Collaboratively reconstituting tables

Patent number: 9588952

Abstract: Reconstituting an attribute associated with data. Data in a tabular form may be received. The data is analyzed for a field that is likely to be determined by a formula. Responsive to identifying the field likely to be determined by the formula, an indication of the field and the formula with the data are stored in a repository. The indication of the field and the formula with the data from the repository may be retrieved to facilitate incorporating the data in an application with the formula for the field integrated into the application.

Type: Grant

Filed: June 22, 2015

Date of Patent: March 7, 2017

Assignee: International Business Machines Corporation

Inventors: Al Chakra, Liam Harpur, John Rice
Accelerated interlane vector reduction instructions

Patent number: 9588766

Abstract: A vector reduction instruction is executed by a processor to provide efficient reduction operations on an array of data elements. The processor includes vector registers. Each vector register is divided into a plurality of lanes, and each lane stores the same number of data elements. The processor also includes execution circuitry that receives the vector reduction instruction to reduce the array of data elements stored in a source operand into a result in a destination operand using a reduction operator. Each of the source operand and the destination operand is one of the vector registers. Responsive to the vector reduction instruction, the execution circuitry applies the reduction operator to two of the data elements in each lane, and shifts one or more remaining data elements when there is at least one of the data elements remaining in each lane.

Type: Grant

Filed: September 28, 2012

Date of Patent: March 7, 2017

Assignee: Intel Corporation

Inventors: Paul Caprioli, Abhay S. Kanhere, Jeffrey J. Cook, Muawya M. Al-Otoom
Compiling source code to reduce run-time execution of vector element reverse operations

Patent number: 9569188

Abstract: Compiling source code to reduce run-time execution of vector element reverse operations, includes: identifying, by a compiler, a first loop nested within a second loop in a computer program; identifying, by the compiler, a vector element reverse operation within the first loop; moving, by the compiler, the vector element reverse operation from the first loop to the second loop.

Type: Grant

Filed: July 25, 2016

Date of Patent: February 14, 2017

Assignee: International Business Machines Corporation

Inventors: Michael Karl Gschwind, William J. Schmidt
Compiling source code to reduce run-time execution of vector element reverse operations

Patent number: 9569190

Abstract: Compiling source code to reduce run-time execution of vector element reverse operations, includes: identifying, by a compiler, a first loop nested within a second loop in a computer program; identifying, by the compiler, a vector element reverse operation within the first loop; moving, by the compiler, the vector element reverse operation from the first loop to the second loop.

Type: Grant

Filed: August 4, 2015

Date of Patent: February 14, 2017

Assignee: International Business Machines Corporation

Inventors: Michael Karl Gschwind, William J. Schmidt
Secured comparison method of two operands and corresponding device

Patent number: 9501277

Abstract: A first operation of comparison of the first initial operand with the second initial operand uses at least one comparison operator in such a way as to obtain a first final result word. A second operation of comparison of the second initial operand with the first initial operand uses the at least one comparison operator in such a way as to obtain a second final result word. Another operation checks the values of the bits of the two final result words in relation to a part at least of r combinations of reference values taken from possible combinations of values of these two final result words. These reference combinations represent a valid result of comparison of the two operands including an equality, a relationship of inferiority and a relationship of superiority between the two operands.

Type: Grant

Filed: June 12, 2014

Date of Patent: November 22, 2016

Assignee: STMICROELECTRONICS (ROUSSET) SAS

Inventors: Pierre Guillemin, Yannick Teglia
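
The double comparison and consistency check in the abstract of patent 9501277 can be sketched in Python. This is a simplification: the abstract describes checking bits of two result words against reference combinations, while the sketch uses symbolic results. All names (`compare`, `VALID_COMBINATIONS`, `secured_compare`) are hypothetical.

```python
EQUAL, LESS, GREATER = "equal", "less", "greater"


def compare(a, b):
    """Plain comparison used for both the forward and the reversed check."""
    if a == b:
        return EQUAL
    return LESS if a < b else GREATER


# Only these (forward, reversed) pairs represent a valid comparison.
VALID_COMBINATIONS = {
    (EQUAL, EQUAL): EQUAL,
    (LESS, GREATER): LESS,
    (GREATER, LESS): GREATER,
}


def secured_compare(a, b, compare_fn=compare):
    """Compare a with b and b with a, then verify the results agree."""
    first_result = compare_fn(a, b)
    second_result = compare_fn(b, a)
    try:
        return VALID_COMBINATIONS[(first_result, second_result)]
    except KeyError:
        # An inconsistent pair suggests a fault in one of the comparisons.
        raise RuntimeError("inconsistent comparison results") from None


print(secured_compare(3, 5))  # less
```

A disturbed comparison that returns the same answer in both directions falls outside the valid combinations and is rejected rather than trusted.
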
Two level re-order buffer

Patent number: 9495159

Abstract: In response to detecting that one or more conditions are met, a checkpoint of a current state of a thread may be created. One or more incomplete instructions may be moved from a first level of a re-order buffer to a second level of the re-order buffer. Each incomplete instruction may be currently executing or awaiting execution.

Type: Grant

Filed: September 27, 2013

Date of Patent: November 15, 2016

Assignee: Intel Corporation

Inventors: Mark J. Dechene, Srikanth T. Srinivasan, Matthew C. Merten, Tong Li, Christine E. Wang
Methods, apparatus, and instructions for converting vector data

Patent number: 9495153

Abstract: A computer processor includes a decoder for decoding machine instructions and an execution unit for executing those instructions. The decoder and the execution unit are capable of decoding and executing vector instructions that include one or more format conversion indicators. For instance, the processor may be capable of executing a vector-load-convert-and-write (VLoadConWr) instruction that provides for loading data from memory to a vector register. The VLoadConWr instruction may include a format conversion indicator to indicate that the data from memory should be converted from a first format to a second format before the data is loaded into the vector register. Other embodiments are described and claimed.

Type: Grant

Filed: March 15, 2013

Date of Patent: November 15, 2016

Assignee: Intel Corporation

Inventors: Eric Sprangle, Robert D. Cavin, Anwar Rohillah, Douglas M. Carmean
Methods of and apparatus for approximating a function

Patent number: 9489344

Abstract: A data processor of a processing system, such as a graphics processing system, converts an input data value into an output data value by approximating a function which maps input values to output values. The data processor approximates the function using first and second predetermined ranges of values which are quantized into plural corresponding pairs of range sections, a predetermined gradient for each pair of range sections, and predetermined section end values for each pair of range sections. By using these predetermined parameters, the approximation of the function can be implemented efficiently by the data processor of the processing system.

Type: Grant

Filed: June 27, 2013

Date of Patent: November 8, 2016

Assignee: ARM LIMITED

Inventors: Jorn Nystad, Sean Tristram Ellis
Restricted instructions in transactional execution

Patent number: 9448797

Abstract: Restricted instructions are prohibited from execution within a transaction. There are classes of instructions that are restricted regardless of type of transaction: constrained or nonconstrained. There are instructions only restricted in constrained transactions, and there are instructions that are selectively restricted for given transactions based on controls specified on instructions used to initiate the transactions.

Type: Grant

Filed: March 4, 2013

Date of Patent: September 20, 2016

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Dan F. Greiner, Christian Jacobi, Timothy J. Slegel
Restricted instructions in transactional execution

Patent number: 9448796

Abstract: Restricted instructions are prohibited from execution within a transaction. There are classes of instructions that are restricted regardless of type of transaction: constrained or nonconstrained. There are instructions only restricted in constrained transactions, and there are instructions that are selectively restricted for given transactions based on controls specified on instructions used to initiate the transactions.

Type: Grant

Filed: June 15, 2012

Date of Patent: September 20, 2016

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Dan F. Greiner, Christian Jacobi, Timothy J. Slegel
Data processing apparatus and method for performing scan operations

Patent number: 9355061

Abstract: A data processing apparatus and method are provided for executing a vector scan instruction. The data processing apparatus comprises a vector register store configured to store vector operands, and processing circuitry configured to perform operations on vector operands retrieved from said vector register store. Further, control circuitry is configured to control the processing circuitry to perform the operations required by one or more instructions, said one or more instructions including a vector scan instruction specifying a vector operand comprising N vector elements and defining a scan operation to be performed on a sequence of vector elements within the vector operand.

Type: Grant

Filed: January 28, 2014

Date of Patent: May 31, 2016

Assignee: ARM Limited

Inventors: Matthias Lothar Boettcher, Mbou Eyole-Monono, Giacomo Gabrielli
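
A scan over a sequence of elements within a vector, as mentioned in the abstract of patent 9355061, can be illustrated with an inclusive prefix operation in Python. The choice of an inclusive scan, the `start`/`stop` parameters selecting the sequence, and the function name are assumptions; elements outside the selected sequence are simply copied.

```python
from itertools import accumulate
from operator import add


def vector_scan(vector, op=add, start=0, stop=None):
    """Inclusive scan of vector[start:stop]; other elements are unchanged."""
    stop = len(vector) if stop is None else stop
    result = list(vector)
    # accumulate yields running results: x0, op(x0, x1), op(op(x0, x1), x2), ...
    result[start:stop] = accumulate(vector[start:stop], op)
    return result


print(vector_scan([1, 2, 3, 4]))                   # [1, 3, 6, 10]
print(vector_scan([1, 2, 3, 4], start=1, stop=3))  # [1, 2, 5, 4]
```
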
Reducing register read ports for register pairs

Patent number: 9329868

Abstract: Embodiments relate to reducing a number of read ports for register pairs. An aspect includes executing an instruction. The instruction identifies a pair of registers as containing a wide operand which spans the pair of registers. It is determined if a pairing indicator associated with the pair of registers has a first value or a second value. The first value indicates that the wide operand is stored in a wide register, and the second value indicates that the wide operand is not stored in the wide register. Based on the pairing indicator having the first value, the wide operand is read from the wide register. Based on the pairing indicator having the second value, the wide operand is read from the pair of registers. An operation is performed using the wide operand.

Type: Grant

Filed: October 22, 2014

Date of Patent: May 3, 2016

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Jonathan D. Bradbury, Michael K. Gschwind
Reducing register read ports for register pairs

Patent number: 9323529

Abstract: Embodiments relate to reducing a number of read ports for register pairs. An aspect includes executing an instruction. The instruction identifies a pair of registers as containing a wide operand which spans the pair of registers. It is determined if a pairing indicator associated with the pair of registers has a first value or a second value. The first value indicates that the wide operand is stored in a wide register, and the second value indicates that the wide operand is not stored in the wide register. Based on the pairing indicator having the first value, the wide operand is read from the wide register. Based on the pairing indicator having the second value, the wide operand is read from the pair of registers. An operation is performed using the wide operand.

Type: Grant

Filed: July 18, 2012

Date of Patent: April 26, 2016

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Jonathan D. Bradbury, Michael K. Gschwind
Predicting register pairs

Patent number: 9323532

Abstract: Embodiments relate to reducing a number of read ports for register pairs. An aspect includes executing an instruction. The instruction identifies a pair of registers as containing a wide operand which spans the pair of registers. The executing of the instruction includes determining whether a pairing indicator associated with the pair of registers has a first value, a second value or a third value. Based on the pairing indicator having the first value, the wide operand is read from the wide register. Based on the pairing indicator having the second value the wide operand is read from the pair of registers. Based on the pairing indicator having the third value, the wide operand is speculatively read from a predetermined register. The predetermined register consists of the wide register or the pair of registers.

Type: Grant

Filed: July 18, 2012

Date of Patent: April 26, 2016

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Jonathan D. Bradbury, Michael K. Gschwind
Efficient correction of normalizer shift amount errors in fused multiply add operations

Patent number: 9317251

Abstract: A method for correcting a shift error in a fused multiply add operation. The method comprises adjusting a normalized floating-point number before performing a shift error correction to produce an adjusted normalized floating-point number, and correcting a shift error in the adjusted normalized floating-point number. Correcting the shift error comprises shifting a mantissa of the adjusted normalized floating-point number in one direction. A fused multiply add module comprises a normalizer module, a compensation logic, and a rounder. The normalizer module is operable to normalize a floating-point number to produce a normalized floating-point number. The floating-point number is normalized based upon an estimated quantity of leading zeros. The compensation logic is operable to manage a correction of a shift error in the normalized floating-point number. The rounder is operable to correct the shift error with a mantissa shift in only one direction.

Type: Grant

Filed: December 31, 2012

Date of Patent: April 19, 2016

Assignee: NVIDIA CORPORATION

Inventors: Charles Tsen, Adam Dreyer
Computer processor providing exception handling with reduced state storage

Patent number: 9298497

Abstract: A computer architecture allows for simplified exception handling by restarting the program after exceptions at the beginning of idempotent regions, the idempotent regions allowing re-execution without the need for restoring complex state information from checkpoints. Recovery from mis-speculation may be provided by a similar mechanism but using smaller idempotent regions reflecting a more frequent occurrence of mis-speculation. A compiler generating different idempotent regions for speculation and exception handling is also disclosed.

Type: Grant

Filed: July 13, 2012

Date of Patent: March 29, 2016

Assignee: Wisconsin Alumni Research Foundation

Inventors: Jaikrishnan Menon, Marc Asher De Kruijf, Karthikeyan Sankaralingam
Vector processing engines having programmable data path configurations for providing multi-mode Radix-2X butterfly vector processing circuits, and related vector processors, systems, and methods

Patent number: 9275014

Abstract: Vector processing engines (VPEs) having programmable data path configurations for providing multi-mode Radix-2X butterfly vector processing circuits. Related vector processors, systems, and methods are also disclosed. The VPEs disclosed herein include a plurality of vector processing stages each having vector processing blocks that have programmable data path configurations for performing Radix-2X butterfly vector operations to perform Fast Fourier Transform (FFT) vector processing operations efficiently. The data path configurations of the vector processing blocks can be programmed to provide different types of Radix-2X butterfly vector operations as well as other arithmetic logic vector operations.

Type: Grant

Filed: March 13, 2013

Date of Patent: March 1, 2016

Assignee: QUALCOMM Incorporated

Inventor: Raheel Khan
Method and device for generating an exception

Patent number: 9268527

Abstract: A floating point value can represent a number or something that is not a number (NaN). A floating point value that is a NaN can have a data field that stores information, such as a propagation count that indicates the number of times a NaN value has been propagated through instructions. A NaN evaluation instruction can determine whether one or more operands is a NaN operand of a particular type, and if so can generate a result that is a NaN of a different type. An exception can be generated based upon the NaN of the different type being provided as a resultant.

Type: Grant

Filed: March 15, 2013

Date of Patent: February 23, 2016

Assignee: FREESCALE SEMICONDUCTOR, INC.

Inventor: William C. Moyer
Data processing apparatus

Patent number: 9239702

Abstract: A programmable data processing apparatus having a bit-plane extraction operation is described, for extracting data from a value of, for example, 32 bits containing 4 bytes, 1a to 1d. Each byte 1a to 1d comprises 8 bits, (a0-a7, b0-b7, c0-c7 and d0-d7, respectively). The bit-plane extraction operation retrieves one bit from each of these bytes, for example the second bit (a1, b1, c1, d1), which is specified by an argument. The operation involves concatenating these bits (a1, b1, c1, d1) and returning a result value 5. Depending on the particular data processing application, the result value may be bit-reversed to provide a result value 7 (for example, if a bit-reversal is required to deal with endianness, or other reasons). The bit-plane extraction operation can be used as a pre-processing operation in data processing operations such as “sum-of-absolute-differences” in the processing of video data.

Type: Grant

Filed: June 8, 2005

Date of Patent: January 19, 2016

Assignee: Intel Corporation

Inventor: Antonius Adrianus Maria Van Wel
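
The bit-plane extraction described in the abstract of patent 9239702 can be illustrated with a minimal Python sketch. The function name, the byte order (byte 0 is the least significant byte), and the bit numbering (bit 0 is the least significant bit of each byte) are assumptions made for illustration; the abstract does not fix them.

```python
def bit_plane_extract(value: int, bit: int, num_bytes: int = 4,
                      reverse: bool = False) -> int:
    """Collect bit `bit` of every byte of `value` into one small integer.

    Assumed layout: byte i occupies bits 8*i .. 8*i+7 of `value`,
    and the bit taken from byte i lands at position i of the result.
    """
    result = 0
    for i in range(num_bytes):
        byte = (value >> (8 * i)) & 0xFF          # isolate byte i
        result |= ((byte >> bit) & 1) << i        # place its selected bit
    if reverse:
        # Optional bit reversal of the concatenated bits.
        result = int(format(result, f"0{num_bytes}b")[::-1], 2)
    return result
```

For example, with `value = 0x02000200` and `bit = 1`, the second and fourth bytes contribute a set bit, so the call yields `0b1010`; with `reverse=True` it yields `0b0101`.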
Arithmetic unit, processor, compiler and compiling method

Patent number: 9223543

Abstract: An arithmetic unit which includes: a data supply section which supplies floating-point type object data to which a sign is to be added and condition data which includes a condition under which the sign is added; a sign data generating section which extracts the condition included in the condition data and generates sign data for adding the sign to the object data on the basis of the extracted condition; and an integer arithmetic operation section which performs an integer arithmetic operation while treating the object data as integer type data so as to add the sign to the object data on the basis of the sign data and the object data.

Type: Grant

Filed: January 4, 2010

Date of Patent: December 29, 2015

Assignee: Sony Corporation

Inventors: Yukihiko Mogi, Masato Kamata, Yuki Kawaguchi
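
The idea in the abstract of patent 9223543, adding a sign to floating-point data through an integer operation, can be sketched as follows. This is only an illustration under assumptions: the data is taken to be IEEE 754 single precision, the "condition" is reduced to a boolean, and the helper name `apply_sign` is hypothetical.

```python
import struct

SIGN_BIT = 0x80000000  # sign bit of an IEEE 754 single-precision value


def apply_sign(value: float, negative: bool) -> float:
    """Set or clear the sign of `value` using only integer bit operations."""
    # Reinterpret the float's 32 bits as an unsigned integer.
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    # Sign data derived from the condition (assumed here to be a boolean).
    sign = SIGN_BIT if negative else 0
    bits = (bits & ~SIGN_BIT & 0xFFFFFFFF) | sign
    # Reinterpret the integer as a float again.
    return struct.unpack("<f", struct.pack("<I", bits))[0]
```

Because only the top bit changes, the magnitude is untouched; `apply_sign(1.5, True)` gives `-1.5` and `apply_sign(-2.0, False)` gives `2.0`.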
Number representation and memory system for arithmetic

Patent number: 9223544

Abstract: A method, device and system for representing numbers in a computer including storing a floating-point number M in a computer memory; representing the floating-point number M as an interval with lower and upper bounds A and B when it is accessed by using at least two floating-point numbers in the memory; and then representing M as an interval with lower and upper bounds A and B when it is used in a calculation by using at least three floating-point numbers in the memory. Calculations are performed using the interval and when the data is written back to the memory it may be stored as an interval if the size of the interval is significant, i.e. larger than a first threshold value. A warning regarding the suspect accuracy of any data stored as an interval may be issued if the interval is too large, i.e. larger than a second threshold value.

Type: Grant

Filed: September 7, 2012

Date of Patent: December 29, 2015

Assignee: Intel Corporation

Inventors: Helia Naeimi, Ralph Nathan, Shih-Lien L. Lu, John L. Gustafson
Processor exploiting trivial arithmetic operations

Patent number: 9223575

Abstract: The present application relates to the field of processors and in particular to the carrying out of arithmetic operations. Many of the computations performed by processors consist of a large number of simple operations. As a result, a multiplication operation may take a significant number of clock cycles to complete. The present application provides a processor having a trivial operand register, which is used in the carrying out of arithmetic or storage operations for data values stored in a data store.

Type: Grant

Filed: March 16, 2008

Date of Patent: December 29, 2015

Assignee: LINEAR ALGEBRA TECHNOLOGIES LIMITED

Inventor: David Moloney
Double rounded combined floating-point multiply and add

Patent number: 9213523

Abstract: Methods, apparatus, instructions and logic are disclosed providing double rounded combined floating-point multiply and add functionality as scalar or vector SIMD instructions or as fused micro-operations. Embodiments include detecting floating-point (FP) multiplication operations and subsequent FP operations specifying as source operands results of the FP multiplications. The FP multiplications and the subsequent FP operations are encoded as combined FP operations including rounding of the results of FP multiplication followed by the subsequent FP operations. The encoding of said combined FP operations may be stored and executed as part of an executable thread portion using fused-multiply-add hardware that includes overflow detection for the product of FP multipliers, first and second FP adders to add third operand addend mantissas and the products of the FP multipliers with different rounding inputs based on overflow, or no overflow, in the products of the FP multiplier.

Type: Grant

Filed: June 29, 2012

Date of Patent: December 15, 2015

Assignee: Intel Corporation

Inventors: Sridhar Samudrala, Grigorios Magklis, Marc Lupon, David R. Ditzel
Allocation of memory space to individual processor cores

Patent number: 9208093

Abstract: Techniques are generally described for a multi-core processor with a plurality of processor cores. At least one cache is accessible to at least two of the plurality of processor cores. The multi-core processor can be configured for separately allocating a memory space within the cache to the individual processor cores accessing the cache.

Type: Grant

Filed: April 21, 2009

Date of Patent: December 8, 2015

Assignee: Empire Technology Development LLC

Inventors: Thomas Martin Conte, Andrew Wolfe
Controlled-precision iterative arithmetic logic unit

Patent number: 9146706

Abstract: A controlled-precision Iterative Arithmetic Logic Unit (IALU) included in a processor produces sub-precision results, i.e. results having a bit precision less than full precision. In one embodiment, the controlled-precision IALU comprises an arithmetic logic circuit and a precision control circuit. The arithmetic logic circuit is configured to iteratively process operands of a first bit precision to obtain a result. The precision control circuit is configured to end the iterative operand processing when the result achieves a programmed second bit precision less than the first bit precision. In one embodiment, the precision control circuit causes the arithmetic logic circuit to end the iterative operand processing in response to an indicator received by the control circuit. The controlled-precision IALU further comprises rounding logic configured to round the sub-precision result.

Type: Grant

Filed: May 5, 2006

Date of Patent: September 29, 2015

Assignee: QUALCOMM Incorporated

Inventor: Kenneth Alan Dockser
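
As a rough illustration of the controlled-precision idea in the abstract of patent 9146706, the sketch below runs a bit-serial restoring division and simply stops after a programmed number of result bits. The choice of division as the iterative operation and the name `iterative_divide` are assumptions; the rounding logic mentioned in the abstract is not modeled.

```python
def iterative_divide(numerator: int, denominator: int, precision_bits: int):
    """Produce `precision_bits` fractional quotient bits of numerator/denominator.

    Assumes 0 <= numerator < denominator. One quotient bit is produced per
    iteration, so a smaller `precision_bits` ends the loop earlier.
    Returns (quotient, remainder) with
    quotient * denominator + remainder == numerator << precision_bits.
    """
    remainder = numerator
    quotient = 0
    for _ in range(precision_bits):
        remainder <<= 1
        quotient <<= 1
        if remainder >= denominator:
            remainder -= denominator
            quotient |= 1
    return quotient, remainder
```

Calling `iterative_divide(1, 3, 8)` yields the quotient `0b01010101`, the first eight fractional bits of 1/3, while `iterative_divide(1, 3, 4)` stops after four iterations with `0b0101`.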
Vector floating point argument reduction

Patent number: 9146901

Abstract: A processing apparatus is provided with processing circuitry 6, 8 and decoder circuitry 10 responsive to a received argument reduction instruction FREDUCE4, FDOT3R to generate control signals 16 for controlling the processing circuitry 6, 8. The action of the argument reduction instruction is to subject each component of an input vector to a scaling which adds or subtracts an exponent shift value C to the exponent of the input vector component. The exponent shift value C is selected such that a sum of this exponent shift value C with the maximum exponent value B of any of the input vector components lies within a range between a first predetermined value and a second predetermined value. A consequence of execution of this argument reduction instruction is that the result vector when subject to a dot-product operation will be resistant to floating point underflows or overflows.

Type: Grant

Filed: August 26, 2011

Date of Patent: September 29, 2015

Assignee: ARM Limited

Inventor: Jorn Nystad
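
The exponent-shift scaling described in the abstract of patent 9146901 can be sketched in Python with `math.frexp` and `math.ldexp`. The function name, the single `target_exponent` parameter (standing in for the range between the two predetermined values), and the restriction to finite inputs are assumptions for illustration.

```python
import math


def reduce_arguments(vec, target_exponent=0):
    """Scale every component by 2**shift so the largest exponent equals
    `target_exponent`. Returns (scaled_vector, shift).

    Assumes finite inputs; zero components are ignored when finding the
    maximum exponent.
    """
    exponents = [math.frexp(x)[1] for x in vec if x != 0.0]
    if not exponents:
        return list(vec), 0
    shift = target_exponent - max(exponents)   # exponent shift value C
    return [math.ldexp(x, shift) for x in vec], shift
```

With `vec = [1e200, 3e200, -2e200]`, a direct sum of squares overflows to infinity, whereas the sum of squares of the reduced vector stays finite, and the true norm can be recovered as `math.ldexp(math.sqrt(dot), -shift)`.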
MERGED FLOATING POINT OPERATION USING A MODEBIT

Publication number: 20150121044

Abstract: A first floating-point operation unit receives first and second variables and performs a first operation generating a first output. A first rounding unit receives and rounds the first output to generate a second output if a control bit is in a first state. A second floating-point operation unit receives a third variable and either the first output or the second output and performs a second operation on the third variable and either the first output or the second output, to generate a third output. The second floating-point operation unit receives and operates on the first output if the control bit is in the first state, or the second output if the control bit is in the second state. A second rounding unit receives and rounds the third output.

Type: Application

Filed: December 29, 2014

Publication date: April 30, 2015

Inventor: David Yiu-Man Lau
COMPUTER AND METHODS FOR SOLVING MATH FUNCTIONS

Publication number: 20150121043

Abstract: Computers and methods for performing mathematical functions are disclosed. An embodiment of a computer includes an operations level and a driver level. The operations level performs mathematical operations. The driver level includes a first lookup table and a second lookup table, wherein the first lookup table includes first data for calculating at least one mathematical function using a first level of accuracy. The second lookup table includes second data for calculating the at least one mathematical function using a second level of accuracy, wherein the first level of accuracy is greater than the second level of accuracy. A driver executes either the first data or the second data depending on a selected level of accuracy.

Type: Application

Filed: October 30, 2013

Publication date: April 30, 2015

Applicant: Texas Instruments Incorporated

Inventors: Kyong Ho Lee, Seok-Jun Lee, Manish Goel
Vector math instruction execution by DSP processor approximating division and complex number magnitude

Patent number: 9015452

Abstract: A digital signal processor (DSP) includes an instruction fetch unit, an instruction decode unit, a register set and a plurality of work units in communication with the instruction decode unit. A first embodiment calculates two divisions on packed numerators and packed denominators. The DSP work units calculate indexes into a 1/d look-up table and make a final sign correction. A second embodiment calculates an approximation of a vector magnitude of a complex number x+jy. The approximation is based upon √(x²+y²) ≈ α*max(|x|, |y|) + β*min(|x|, |y|). The DSP work units calculate the absolute values, find the maxima and minima, and form the packed results of two vector magnitude calculations.

Type: Grant

Filed: February 18, 2010

Date of Patent: April 21, 2015

Assignee: Texas Instruments Incorporated

Inventor: Udayan Dasgupta
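
The max/min magnitude approximation named in the abstract of patent 9015452 can be illustrated with a short sketch. The abstract gives no coefficient values, so the two constants below are an assumption: they are a commonly cited pair for this kind of approximation, which keeps the relative error to roughly 4 percent. The function names are hypothetical.

```python
# Assumed coefficients (not taken from the patent).
ALPHA = 0.96043387
BETA = 0.39782473


def approx_magnitude(x: float, y: float) -> float:
    """Approximate sqrt(x*x + y*y) without a square root."""
    ax, ay = abs(x), abs(y)
    return ALPHA * max(ax, ay) + BETA * min(ax, ay)


def packed_magnitudes(pairs):
    """Apply the approximation to several (x, y) pairs, mimicking packed results."""
    return [approx_magnitude(x, y) for x, y in pairs]
```

For the pair (3, 4), whose exact magnitude is 5, the sketch returns a value within a few percent of 5 using only absolute values, a comparison, two multiplications, and an addition.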
Packing odd bytes from two source registers of packed data

Patent number: 9015453

Abstract: An apparatus includes an instruction decoder, first and second source registers and a circuit coupled to the decoder to receive packed data from the source registers and to unpack the packed data responsive to an unpack instruction received by the decoder. A first packed data element and a third packed data element are received from the first source register. A second packed data element and a fourth packed data element are received from the second source register. The circuit copies the packed data elements into a destination register resulting with the second packed data element adjacent to the first packed data element, the third packed data element adjacent to the second packed data element, and the fourth packed data element adjacent to the third packed data element.

Type: Grant

Filed: December 29, 2012

Date of Patent: April 21, 2015

Assignee: Intel Corporation

Inventors: Alexander D. Peleg, Yaakov Yaari, Millind Mittal, Larry M. Mennemeier, Benny Eitan
VECTOR INDEXED MEMORY ACCESS PLUS ARITHMETIC AND/OR LOGICAL OPERATION PROCESSORS, METHODS, SYSTEMS, AND INSTRUCTIONS

Publication number: 20150095623

Abstract: A processor including a decode unit to receive a vector indexed load plus arithmetic and/or logical (A/L) operation plus store instruction. The instruction is to indicate a source packed memory indices operand that is to have a plurality of packed memory indices. The instruction is also to indicate a source packed data operand that is to have a plurality of packed data elements. The processor also includes an execution unit coupled with the decode unit. The execution unit, in response to the instruction, is to load a plurality of data elements from memory locations corresponding to the plurality of packed memory indices, perform A/L operations on the plurality of packed data elements of the source packed data operand and the loaded plurality of data elements, and store a plurality of result data elements in the memory locations corresponding to the plurality of packed memory indices.

Type: Application

Filed: September 27, 2013

Publication date: April 2, 2015

Inventors: Igor Ermolaev, Bret L. Toll, Robert Valentine, Jesus Corbal San Adrian, Gautam B. Doshi, Rama Kishan V. Malladi, Prasenjit Chakraborty
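
The load, operate, and store sequence in the abstract of publication 20150095623 can be sketched with a plain Python list standing in for memory. The function name and the use of a Python callable for the arithmetic/logical operation are assumptions; how duplicate indices behave here (the last store wins) is a property of this sketch, not a statement about the instruction.

```python
def indexed_load_op_store(memory, indices, operand, op):
    """Gather memory[indices], combine with `operand` element-wise, scatter back."""
    # 1. Load the data elements addressed by the packed indices.
    loaded = [memory[i] for i in indices]
    # 2. Apply the operation to each loaded element and its source element.
    results = [op(m, s) for m, s in zip(loaded, operand)]
    # 3. Store the results to the same memory locations.
    for i, r in zip(indices, results):
        memory[i] = r
    return memory
```

For instance, with `memory = [10, 20, 30, 40]`, `indices = [3, 0]`, `operand = [1, 2]` and addition as the operation, memory becomes `[12, 20, 30, 41]`.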
VECTOR FLOATING POINT TEST DATA CLASS IMMEDIATE INSTRUCTION

Publication number: 20150095624

Abstract: A Vector Floating Point Test Data Class Immediate instruction is provided that determines whether one or more elements of a vector specified in the instruction are of one or more selected classes and signs. If a vector element is of a selected class and sign, an element in an operand of the instruction corresponding to the vector element is set to a first defined value, and if the vector element is not of the selected class and sign, the operand element corresponding to the vector element is set to a second defined value.

Type: Application

Filed: December 5, 2014

Publication date: April 2, 2015

Inventors: Jonathan D. Bradbury, Eric M. Schwarz
CONVERT TO ZONED FORMAT FROM DECIMAL FLOATING POINT FORMAT

Publication number: 20150089206

Abstract: Machine instructions, referred to herein as a long Convert from Zoned instruction (CDZT) and extended Convert from Zoned instruction (CXZT), are provided that read EBCDIC or ASCII data from memory, convert it to the appropriate decimal floating point format, and write it to a target floating point register or floating point register pair. Further, machine instructions, referred to herein as a long Convert to Zoned instruction (CZDT) and extended Convert to Zoned instruction (CZXT), are provided that convert a decimal floating point (DFP) operand in a source floating point register or floating point register pair to EBCDIC or ASCII data and store it to a target memory location.

Type: Application

Filed: December 4, 2014

Publication date: March 26, 2015

Inventors: Steven R. Carlough, Reid T. Copeland, Charles W. Gainey, JR., Marcel Mitran, Eric M. Schwarz, Timothy J. Slegel
