Floating Point Or Vector Patents (Class 712/222)
  • Patent number: 10761970
    Abstract: The invention is notably directed to a computer-implemented method for performing safety check operations. The method comprises steps that are implemented while executing a computer program, which is instrumented with safety check operations. As a result, this computer program forms a sequence of ordered instructions. Such instructions comprise safety check operation instructions, in addition to generic execution instructions and system inputs. System inputs allow the executing program to interact with an operating system, which manages resources for the computer program to execute. A series of instructions are identified while executing the computer program. Namely, a first instruction is identified in the sequence, as one of the safety check operation instructions, in view of its subsequent execution. After having identified the first instruction, a second instruction is identified in the sequence. The second instruction is identified as one of the generic computer program instructions.
    Type: Grant
    Filed: October 20, 2017
    Date of Patent: September 1, 2020
    Assignee: International Business Machines Corporation
    Inventors: Anil Kurmus, Matthias Neugschwandtner, Alessandro Sorniotti
  • Patent number: 10761728
    Abstract: Provided herein may be a memory controller and a method of operating the same. The memory controller may include a command processor configured to generate a flush command in response to a flush request input from an external host and to assign a slot number corresponding to the flush command; a sequence generator configured to determine flush data to be stored in response to the flush command, and to generate a write sequence in which the flush data is to be stored based on a size of the flush data and an assigned device sequence of the plurality of memory devices; and a memory operation controller configured to control the plurality of memory devices to store the flush data in the plurality of memory devices.
    Type: Grant
    Filed: December 10, 2018
    Date of Patent: September 1, 2020
    Assignee: SK hynix Inc.
    Inventors: Sung Kwan Hong, Yeong Sik Yi
  • Patent number: 10740067
    Abstract: Setting or updating of floating point controls is managed. Floating point controls include controls used for floating point operations, such as rounding mode and/or other controls. Further, floating point controls include status associated with floating point operations, such as floating point exceptions and/or others. The management of the floating point controls includes efficiently updating the controls, while reducing costs associated therewith.
    Type: Grant
    Filed: June 23, 2017
    Date of Patent: August 11, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Michael K. Gschwind, Valentina Salapura
  • Patent number: 10726329
    Abstract: Techniques in advanced deep learning provide improvements in one or more of accuracy, performance, and energy efficiency. An array of processing elements performs flow-based computations on wavelets of data. Each processing element has a respective compute element and a respective routing element. Instructions executed by the compute element include operand specifiers, some specifying a data structure register storing a data structure descriptor describing an operand as a fabric vector or a memory vector. The data structure descriptor further describes the memory vector as one of a one-dimensional vector, a four-dimensional vector, or a circular buffer vector. Optionally, the data structure descriptor specifies an extended data structure register storing an extended data structure descriptor. The extended data structure descriptor specifies parameters relating to a four-dimensional vector or a circular buffer vector.
    Type: Grant
    Filed: April 17, 2018
    Date of Patent: July 28, 2020
    Assignee: Cerebras Systems Inc.
    Inventors: Sean Lie, Michael Morrison, Srikanth Arekapudi, Gary R. Lauterbach, Michael Edwin James
  • Patent number: 10705842
    Abstract: Methods and apparatuses relating to high-performance authenticated encryption are described.
    Type: Grant
    Filed: April 2, 2018
    Date of Patent: July 7, 2020
    Assignee: INTEL CORPORATION
    Inventors: Vikram Suresh, Sanu Mathew, Sudhir Satpathy, Vinodh Gopal
  • Patent number: 10672100
    Abstract: An apparatus determines a second pixel range of an uncorrected image necessary to generate a first pixel range having pixels in a preset range of a corrected image, including a cache unit determining the second pixel range and reading and holding the second pixel range from memory before executing correction. Correspondences indicating positions of the uncorrected image corresponding to positions of pixels of the corrected image, respectively, are preset. The cache unit specifies a position of the uncorrected image corresponding to a pixel of one of four corners of a rectangular third pixel range including the first pixel range based on the correspondence, specifies pixel ranges of the uncorrected image necessary for pixel value generation, respectively, at the four corners of the third pixel range based on the specified position, and determines a pixel range including a convex set including the specified pixel ranges as the second pixel range.
    Type: Grant
    Filed: October 12, 2016
    Date of Patent: June 2, 2020
    Assignee: Hitachi Automotive Systems, Ltd.
    Inventors: Yusuke Uchida, Tetsuya Yamada, Shigeru Matsuo, Manabu Sasamoto
  • Patent number: 10592243
    Abstract: A streaming engine employed in a digital data processor specifies a fixed read only data stream defined by plural nested loops. An address generator produces addresses of data elements. A stream head register stores data elements next to be supplied to functional units for use as operands. The streaming engine fetches stream data ahead of use by the central processing unit core in a stream buffer constructed like a cache. The stream buffer cache includes plural cache lines, each of which includes tag bits, at least one valid bit and data bits. Cache lines are allocated to store newly fetched stream data. Cache lines are deallocated upon consumption of the data by a central processing unit core functional unit. Instructions preferably include operand fields with a first subset of codings corresponding to registers, a stream read only operand coding and a stream read and advance operand coding.
    Type: Grant
    Filed: September 10, 2018
    Date of Patent: March 17, 2020
    Assignee: TEXAS INSTRUMENTS INCORPORATED
    Inventor: Joseph Zbiciak
  • Patent number: 10565283
    Abstract: A method of an aspect includes receiving an instruction indicating a destination storage location. A result is stored in the destination storage location in response to the instruction. The result includes a sequence of at least four consecutive non-negative integers in numerical order. In an aspect, the instruction does not indicate a source packed data operand having a plurality of packed data elements in an architecturally-visible storage location. Other methods, apparatus, systems, and instructions are disclosed.
    Type: Grant
    Filed: December 22, 2011
    Date of Patent: February 18, 2020
    Assignee: Intel Corporation
    Inventors: Seth Abraham, Robert Valentine, Elmoustapha Ould-Ahmed-Vall, Zeev Sperber, Amit Gradstein
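    Sketch: The result described in this abstract (a run of at least four consecutive non-negative integers in numerical order, produced without a packed source operand) can be modeled in a few lines of Python. The function name, the `start` parameter and the use of a list to stand in for the destination register are assumptions made for illustration; the abstract does not name the instruction or its encoding.

```python
def consecutive_integers(num_elements: int, start: int = 0) -> list[int]:
    """Model the destination contents: consecutive non-negative integers.

    Hypothetical helper: `start` and the list representation are
    illustrative assumptions, not details taken from the patent.
    """
    if num_elements < 4:
        raise ValueError("the abstract describes at least four result elements")
    if start < 0:
        raise ValueError("start must be non-negative")
    # No source vector is read; the values depend only on the element index.
    return [start + i for i in range(num_elements)]


if __name__ == "__main__":
    print(consecutive_integers(8))
```

    A sequence like this is commonly used as a vector of lane indices, for example as input to a later compare or permute step.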
  • Patent number: 10567249
    Abstract: A method and system are described. The method and system include determining a grouping characteristic for a plurality of nodes and a corresponding plurality of links. The nodes and the links correspond to components of a network and are associated with network performance information. The grouping characteristic includes at least one of partitionability into pages and a hop distance. The method and system also include generating a graphical visualization based on the grouping characteristic, the nodes and the links.
    Type: Grant
    Filed: March 18, 2019
    Date of Patent: February 18, 2020
    Assignee: ThousandEyes, Inc.
    Inventors: John Moeses Ercia Bauan, Sunil Bandla, Ricardo V. Oliveira
  • Patent number: 10521383
    Abstract: A first operation identifier is assigned to a first operation directed to a memory component, the first operation identifier having an entry in a first data structure that associates the first operation identifier with a first plurality of buffer identifiers. It is determined whether the first operation collides with a prior operation assigned a second operation identifier, the second operation identifier having an entry in the first data structure that associates the second operation identifier with a second plurality of buffer identifiers. It is determined whether the first operation is a read or a write operation. In response to determining that the first operation collides with the prior operation and that the first operation is a read operation, the first plurality of buffer identifiers are updated with a buffer identifier included in the second plurality of buffer identifiers.
    Type: Grant
    Filed: December 17, 2018
    Date of Patent: December 31, 2019
    Assignee: MICRON TECHNOLOGY, INC.
    Inventors: Lyle E. Adams, Mark Ish, Pushpa Seetamraju, Karl D. Schuh, Dan Tupy
  • Patent number: 10467185
    Abstract: An apparatus is described having instruction execution logic circuitry. The instruction execution logic circuitry has input vector element routing circuitry to perform the following for each of three different instructions: for each of a plurality of output vector element locations, route into an output vector element location an input vector element from one of a plurality of input vector element locations that are available to source the output vector element. The output vector element and each of the input vector element locations are one of three available bit widths for the three different instructions. The apparatus further includes masking layer circuitry coupled to the input vector element routing circuitry to mask a data structure created by the input vector routing element circuitry. The masking layer circuitry is designed to mask at three different levels of granularity that correspond to the three available bit widths.
    Type: Grant
    Filed: April 24, 2017
    Date of Patent: November 5, 2019
    Assignee: Intel Corporation
    Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Suleyman Sair
  • Patent number: 10360036
    Abstract: A computer processing system is provided. The computer processing system includes a processor configured to crack a Move-To-FPSCR instruction into two internal instructions. A first one of the two internal instructions executes out-of-order to update a control field and a second one of the two internal instructions executes in-order to compute a trap decision.
    Type: Grant
    Filed: July 12, 2017
    Date of Patent: July 23, 2019
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Brian J. D. Barrick, Maarten J. Boersma, Niels Fricke, Michael J. Genden
  • Patent number: 10339057
    Abstract: A streaming engine employed in a digital data processor specifies a fixed read only data stream defined by plural nested loops. An address generator produces addresses of data elements for the nested loops. A stream head register stores data elements next to be supplied to functional units for use as operands. A stream template specifies loop count and loop dimension for each nested loop. A format definition field in the stream template specifies the number of loops and the stream template bits devoted to the loop counts and loop dimensions. This permits the same bits of the stream template to be interpreted differently enabling trade off between the number of loops supported and the size of the loop counts and loop dimensions.
    Type: Grant
    Filed: December 20, 2016
    Date of Patent: July 2, 2019
    Assignee: TEXAS INSTRUMENTS INCORPORATED
    Inventor: Joseph Zbiciak
  • Patent number: 10318433
    Abstract: A streaming engine employed in a digital data processor specifies a fixed read only data stream defined by plural nested loops. An address generator produces addresses of data elements for the nested loops. A stream head register stores data elements next to be supplied to functional units for use as operands. A stream template register independently specifies a linear address or a circular address mode for each of the nested loops.
    Type: Grant
    Filed: December 20, 2016
    Date of Patent: June 11, 2019
    Assignee: TEXAS INSTRUMENTS INCORPORATED
    Inventor: Joseph Zbiciak
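    Sketch: The address generation described in this abstract (nested loops, each independently linear or circular) can be illustrated with a small Python generator. Everything concrete here is an assumption: the `(count, stride, wrap)` tuple layout, byte strides, and the choice to model circular mode by wrapping a loop's own offset modulo a block size. The patent's actual template format and wrap rule are not given in the abstract.

```python
from itertools import product
from typing import Iterator, Optional, Sequence, Tuple

# (iteration count, stride in bytes, circular block size or None for linear)
Loop = Tuple[int, int, Optional[int]]


def stream_addresses(base: int, loops: Sequence[Loop]) -> Iterator[int]:
    """Yield element addresses for nested loops; loops[0] is the innermost.

    Hypothetical model: a circular loop wraps its own offset modulo `wrap`,
    a linear loop contributes index * stride unchanged.
    """
    # product() varies its last argument fastest, so list loops outermost first.
    ranges = [range(count) for count, _, _ in reversed(loops)]
    for outer_first in product(*ranges):
        offset = 0
        # Re-reverse the indices so they pair with loops (innermost first).
        for index, (_, stride, wrap) in zip(reversed(outer_first), loops):
            step = index * stride
            offset += step % wrap if wrap is not None else step
        yield base + offset


if __name__ == "__main__":
    # Inner loop: 4 elements, 4-byte stride, circular over an 8-byte block.
    # Outer loop: 2 iterations, 16-byte stride, linear.
    print(list(stream_addresses(0x100, [(4, 4, 8), (2, 16, None)])))
```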
  • Patent number: 10296334
    Abstract: Apparatus, method, and system for performing a vector bit gather are described herein. One embodiment of a processor includes: a first vector register storing one or more source data elements, a second vector register storing one or more control elements, and a vector bit gather logic. Each of the control elements includes a plurality of bit fields, each of which is associated with a plurality of corresponding bit positions in a destination vector register and is to identify a bit from the one or more corresponding source data elements to be copied to each of the plurality of corresponding bit positions. The vector bit shuffle logic is to read the bit fields from the second vector register and, for each bit field, to identify a bit from the source data elements and responsively copy it to each of the plurality of corresponding bit positions in the destination vector register.
    Type: Grant
    Filed: December 27, 2014
    Date of Patent: May 21, 2019
    Assignee: Intel Corporation
    Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal San Adrian, Mark J. Charney, Guillem Sole, Roger Espasa
  • Patent number: 10291406
    Abstract: Embodiments include a method of adding first and second binary numbers having C bits and divided into D words to provide a third binary number in E successive adding operations, C, D and E being plural positive integers, the method comprising: a first group of D adding operations adding together respective words of the first and second binary numbers to provide D sum and carry outputs ranging from a least significant to a most significant sum and carry output; one or more subsequent groups of adding operations adding together sum and carry outputs from an immediately preceding group of adding operations, a final group of the one or more subsequent groups resulting in the third binary number consisting of the sum outputs from the final group and a carry from the most significant carry output of the final group, wherein E is less than D.
    Type: Grant
    Filed: June 7, 2017
    Date of Patent: May 14, 2019
    Assignee: NXP B.V.
    Inventors: Marinus van Splunter, Artur Tadeusz Burchard
  • Patent number: 10275216
    Abstract: A method of an aspect includes receiving a floating point scaling instruction. The floating point scaling instruction indicates a first source including one or more floating point data elements, a second source including one or more corresponding floating point data elements, and a destination. A result is stored in the destination in response to the floating point scaling instruction. The result includes one or more corresponding result floating point data elements each including a corresponding floating point data element of the second source multiplied by a base of the one or more floating point data elements of the first source raised to a power of an integer representative of the corresponding floating point data element of the first source. Other methods, apparatus, systems, and instructions are disclosed.
    Type: Grant
    Filed: March 30, 2018
    Date of Patent: April 30, 2019
    Assignee: Intel Corporation
    Inventors: Cristina S. Anderson, Amit Gradstein, Robert Valentine, Simon Rubanovich, Benny Eitan
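    Sketch: The scaling described in this abstract computes, per element, the second source multiplied by the base raised to an integer derived from the first source. The Python below is a minimal model under two assumptions that the abstract does not state: the base is 2, and the "integer representative" is obtained with floor. Special values (NaN, infinities) and overflow are not modeled.

```python
import math
from typing import Sequence


def scale_elements(first_source: Sequence[float],
                   second_source: Sequence[float],
                   base: float = 2.0) -> list[float]:
    """Element-wise: second[i] * base ** floor(first[i]).

    Assumptions for illustration only: base 2 and floor() as the
    integer conversion. Non-finite inputs raise in math.floor.
    """
    if len(first_source) != len(second_source):
        raise ValueError("sources must have the same number of elements")
    return [b * float(base) ** math.floor(a)
            for a, b in zip(first_source, second_source)]


if __name__ == "__main__":
    print(scale_elements([3.0, -1.5, 0.9], [1.5, 8.0, 7.0]))
```

    With base 2 the same per-element value can also be obtained from `math.ldexp(b, math.floor(a))`, which adjusts the exponent directly instead of performing a multiplication.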
  • Patent number: 10228909
    Abstract: A method of an aspect includes receiving a floating point scaling instruction. The floating point scaling instruction indicates a first source including one or more floating point data elements, a second source including one or more corresponding floating point data elements, and a destination. A result is stored in the destination in response to the floating point scaling instruction. The result includes one or more corresponding result floating point data elements each including a corresponding floating point data element of the second source multiplied by a base of the one or more floating point data elements of the first source raised to a power of an integer representative of the corresponding floating point data element of the first source. Other methods, apparatus, systems, and instructions are disclosed.
    Type: Grant
    Filed: March 30, 2018
    Date of Patent: March 12, 2019
    Assignee: Intel Corporation
    Inventors: Cristina S. Anderson, Amit Gradstein, Robert Valentine, Simon Rubanovich, Benny Eitan
  • Patent number: 10223113
    Abstract: A processor of an aspect includes a decode unit to decode an instruction indicating a first source packed data operand including at least four data elements, a source mask including at least four mask elements, and a destination storage location. An execution unit, in response to the instruction, stores a result packed data operand having a series of at least two unmasked result data elements. Each of the unmasked result data elements stores a value of a different one of at least two consecutive data elements of the first source packed data operand in a relative order. All masked result elements, which are between a nearest corresponding pair of unmasked result data elements, have a same value as an unmasked result data element of the corresponding pair, which is closest to a first end of the result packed data operand. The masked result data elements correspond to masked mask elements.
    Type: Grant
    Filed: March 27, 2014
    Date of Patent: March 5, 2019
    Assignee: Intel Corporation
    Inventor: Mikhail Plotnikov
  • Patent number: 10209989
    Abstract: A vector reduction instruction is executed by a processor to provide efficient reduction operations on an array of data elements. The processor includes vector registers. Each vector register is divided into a plurality of lanes, and each lane stores the same number of data elements. The processor also includes execution circuitry that receives the vector reduction instruction to reduce the array of data elements stored in a source operand into a result in a destination operand using a reduction operator. Each of the source operand and the destination operand is one of the vector registers. Responsive to the vector reduction instruction, the execution circuitry applies the reduction operator to two of the data elements in each lane, and shifts one or more remaining data elements when there is at least one of the data elements remaining in each lane.
    Type: Grant
    Filed: March 7, 2017
    Date of Patent: February 19, 2019
    Assignee: Intel Corporation
    Inventors: Paul Caprioli, Abhay S. Kanhere, Jeffrey J. Cook, Muawya M. Al-Otoom
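    Sketch: The per-lane behavior in this abstract (combine two elements of each lane with the reduction operator, then shift the remaining elements) can be modeled as a repeated step. In this Python sketch each lane is a list of its still-live elements, so the shift appears as the list getting shorter; a real register has fixed width, and which two elements are combined and what fills the vacated slot are assumptions here.

```python
from operator import add
from typing import Callable, List, Sequence


def reduce_step(lanes: Sequence[Sequence[int]],
                op: Callable[[int, int], int]) -> List[List[int]]:
    """One modeled execution: per lane, combine the first two elements
    and shift the remaining elements down by one position."""
    result = []
    for lane in lanes:
        if len(lane) < 2:
            result.append(list(lane))  # nothing left to combine
        else:
            result.append([op(lane[0], lane[1])] + list(lane[2:]))
    return result


def reduce_lanes(lanes: Sequence[Sequence[int]],
                 op: Callable[[int, int], int]) -> List[int]:
    """Repeat the step until each (non-empty) lane holds a single value."""
    current = [list(lane) for lane in lanes]
    while any(len(lane) > 1 for lane in current):
        current = reduce_step(current, op)
    return [lane[0] for lane in current]


if __name__ == "__main__":
    lanes = [[1, 2, 3, 4], [5, 6, 7, 8]]
    print(reduce_step(lanes, add))
    print(reduce_lanes(lanes, add))
```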
  • Patent number: 10114650
    Abstract: Techniques are disclosed relating to handling dependencies between instructions. In one embodiment, an apparatus includes decode circuitry and dependency circuitry. In this embodiment, the decode circuitry is configured to receive an instruction that specifies a destination location and determine a first storage region that includes the destination location. In this embodiment, the storage region is one of a plurality of different storage regions accessible by instructions processed by the apparatus. In this embodiment, the dependency circuitry is configured to stall the instruction until one or more older instructions that specify source locations in the first storage region have read their source locations. The disclosed techniques may be described as “pessimistic” dependency handling, which may, in some instances, maintain performance while limiting complexity, power consumption, and area of dependency logic.
    Type: Grant
    Filed: February 23, 2015
    Date of Patent: October 30, 2018
    Assignee: Apple Inc.
    Inventors: Robert D. Kenney, Liang-Kai Wang
  • Patent number: 10089076
    Abstract: A method of an aspect includes receiving a floating point scaling instruction. The floating point scaling instruction indicates a first source including one or more floating point data elements, a second source including one or more corresponding floating point data elements, and a destination. A result is stored in the destination in response to the floating point scaling instruction. The result includes one or more corresponding result floating point data elements each including a corresponding floating point data element of the second source multiplied by a base of the one or more floating point data elements of the first source raised to a power of an integer representative of the corresponding floating point data element of the first source. Other methods, apparatus, systems, and instructions are disclosed.
    Type: Grant
    Filed: March 15, 2018
    Date of Patent: October 2, 2018
    Assignee: Intel Corporation
    Inventors: Cristina S. Anderson, Amit Gradstein, Robert Valentine, Simon Rubanovich, Benny Eitan
  • Patent number: 10078593
    Abstract: A multi-core computer processor including a plurality of processor cores interconnected in a Network-on-Chip (NoC) architecture, a plurality of caches, each of the plurality of caches being associated with one and only one of the plurality of processor cores, and a plurality of memories, each of the plurality of memories being associated with a different set of at least one of the plurality of processor cores and each of the plurality of memories being configured to be visible in a global memory address space such that the plurality of memories are visible to two or more of the plurality of processor cores, wherein at least one of a number of the processor cores, a size of each of the plurality of caches, or a size of each of the plurality of memories is configured for performing a reverse-time-migration (RTM) computation.
    Type: Grant
    Filed: October 26, 2012
    Date of Patent: September 18, 2018
    Assignee: THE REGENTS OF THE UNIVERSITY OF CALIFORNIA
    Inventors: John Shalf, David Donofrio, Leonid Oliker, Jens Kruger, Samuel Williams
  • Patent number: 10061560
    Abstract: A method and system are disclosed for executing a machine instruction in a central processing unit. The method comprises the steps of obtaining a perform floating-point operation instruction; obtaining a test bit; and determining a value of the test bit. If the test bit has a first value, (a) a specified floating-point operation function is performed, and (b) a condition code is set to a value determined by said specified function. If the test bit has a second value, (c) a check is made to determine if said specified function is valid and installed on the machine, (d) if said specified function is valid and installed on the machine, the condition code is set to one code value, and (e) if said specified function is either not valid or not installed on the machine, the condition code is set to a second code value.
    Type: Grant
    Filed: April 25, 2016
    Date of Patent: August 28, 2018
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Michel H. T. Hack, Ronald M. Smith, Sr.
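    Sketch: The two paths selected by the test bit in this abstract can be written as a short Python function. The names, the representation of installed functions as a dictionary of callables, the choice of 0 as the "perform" value of the test bit, and the condition code values 0 and 3 are all assumptions for illustration; the abstract only says "a first value", "a second value", "one code value" and "a second code value".

```python
from typing import Callable, Dict, Optional, Tuple

# Each installed function returns (result, condition_code).
FpFunction = Callable[..., Tuple[float, int]]

CC_VALID = 0      # assumed code for "valid and installed"
CC_INVALID = 3    # assumed code for "not valid or not installed"


def perform_fp_operation(function_code: int,
                         test_bit: int,
                         operands: Tuple[float, ...],
                         installed: Dict[int, FpFunction]
                         ) -> Tuple[Optional[float], int]:
    """Model of the instruction: execute the function, or only query it."""
    if test_bit == 0:
        # First value: perform the function; it determines the condition code.
        return installed[function_code](*operands)
    # Second value: report availability without performing the operation.
    return None, (CC_VALID if function_code in installed else CC_INVALID)


if __name__ == "__main__":
    table = {1: lambda x: (x * 2.0, 0)}
    print(perform_fp_operation(1, 0, (1.5,), table))
    print(perform_fp_operation(2, 1, (), table))
```

    The query path lets software probe for a function before relying on it, without triggering an exception for an unsupported function code.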
  • Patent number: 9996346
    Abstract: A new zSeries floating-point unit has a fused multiply-add dataflow capable of supporting two architectures and fused MULTIPLY and ADD and Multiply and SUBTRACT in both RRF and RXF formats for the fused functions. Both binary and hexadecimal floating-point instructions are supported for a total of 6 formats. The floating-point unit is capable of performing a multiply-add instruction for hexadecimal or binary every cycle with a latency of 5 cycles. This supports two architectures with two internal formats with their own biases. This has eliminated format conversion cycles and has optimized the width of the dataflow. The unit is optimized for both hexadecimal and binary floating-point architecture supporting a multiply-add/subtract per cycle.
    Type: Grant
    Filed: June 28, 2017
    Date of Patent: June 12, 2018
    Assignee: International Business Machines Corporation
    Inventors: Eric M. Schwarz, Ronald M. Smith, Sr.
  • Patent number: 9921832
    Abstract: A vector reduction instruction with non-unit strided access pattern is received and executed by the execution circuitry of a processor. In response to the instruction, the execution circuitry performs an associative reduction operation on data elements of a first vector register. Based on values of the mask register and a current element position being processed, the execution circuitry sequentially sets one or more data elements of the first vector register to a result, which is generated by the associative reduction operation applied to both a previous data element of the first vector register and a data element of a third vector register. The previous data element is located more than one element position away from the current element position.
    Type: Grant
    Filed: December 28, 2012
    Date of Patent: March 20, 2018
    Assignee: Intel Corporation
    Inventors: Albert Hartono, Jayashankar Bharadwaj, Nalini Vasudevan, Sara S. Baghsorkhi, Victor W. Lee, Daehyun Kim
  • Patent number: 9804850
    Abstract: Instructions and logic provide SIMD permute controls with leading zero count functionality. Some embodiments include processors with a register with a plurality of data fields, each of the data fields to store a second plurality of bits. A destination register has corresponding data fields, each of these data fields to store a count of the number of most significant contiguous bits set to zero for corresponding data fields. Responsive to decoding a vector leading zero count instruction, execution units count the number of most significant contiguous bits set to zero for each of the data fields in the register, and store the counts in corresponding data fields of the first destination register. Vector leading zero count instructions can be used to generate permute controls and completion masks to be used along with the set of permute controls, to resolve dependencies in gather-modify-scatter SIMD operations.
    Type: Grant
    Filed: June 21, 2016
    Date of Patent: October 31, 2017
    Assignee: Intel Corporation
    Inventors: Christopher J. Hughes, Mikhail Plotnikov, Andrey Naraikin, Robert Valentine
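    Sketch: The per-field count described in this abstract (the number of most significant contiguous zero bits in each data field) is easy to model in Python. The function name and the default 32-bit field width are assumptions; the abstract does not fix a width.

```python
from typing import Sequence


def vector_leading_zero_count(elements: Sequence[int],
                              width: int = 32) -> list[int]:
    """Return, for each unsigned `width`-bit field, its leading zero count.

    The 32-bit default is an illustrative assumption.
    """
    counts = []
    for value in elements:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{value} does not fit in {width} unsigned bits")
        # bit_length() is the position of the highest set bit (0 for zero),
        # so the remaining high-order bits are all zero.
        counts.append(width - value.bit_length())
    return counts


if __name__ == "__main__":
    print(vector_leading_zero_count([0, 1, 0x80000000, 0x00FF0000]))
```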
  • Patent number: 9734224
    Abstract: A synchronization infrastructure that synchronizes data stored between components in a cloud infrastructure system is described. A first component in the cloud infrastructure system may store subscription information related to a subscription order which may in turn be utilized by a second component in the cloud infrastructure system to orchestrate the provisioning of services and resources for the order placed by the customer. The synchronization architecture utilizes transactionally consistent checkpoints that describe the state of the data stored in the components to synchronize the data between these components.
    Type: Grant
    Filed: March 20, 2015
    Date of Patent: August 15, 2017
    Assignee: Oracle International Corporation
    Inventors: Ramkrishna Chatterjee, Ramesh Vasudevan, Anjani Kalyan Prathipati, Gopalan Arun
  • Patent number: 9733935
    Abstract: A method of processing an instruction is described that includes fetching and decoding the instruction. The instruction has separate destination address, first operand source address and second operand source address components. The first operand source address identifies a location of a first mask pattern in mask register space. The second operand source address identifies a location of a second mask pattern in the mask register space. The method further includes fetching the first mask pattern from the mask register space; fetching the second mask pattern from the mask register space; merging the first and second mask patterns into a merged mask pattern; and, storing the merged mask pattern at a storage location identified by the destination address.
    Type: Grant
    Filed: December 23, 2011
    Date of Patent: August 15, 2017
    Assignee: Intel Corporation
    Inventors: Jesus Corbal, Andrew T. Forsyth, Roger Espasa, Manel Fernandez, Thomas D. Fletcher
  • Patent number: 9710270
    Abstract: A method for programmably controlling an exception includes performing, by a processor, a step of executing a control specification instruction for exception control specification that indicates whether an exception is enabled or not and setting a control specification value for the exception in a register and a step of executing a control execution instruction for exception control execution that indicates whether the exception is to be raised or not, determining whether the control specification value set in the register is a value for enabling the exception, and, when the control specification value is the value for enabling the exception, raising the exception. The method further includes performing a step of not raising the exception when the control specification value set in the register is not the value for enabling the exception.
    Type: Grant
    Filed: December 20, 2011
    Date of Patent: July 18, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Noriaki Asamoto
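    Sketch: The two instructions in this abstract (one that sets a control value in a register, one that raises the exception only if that value enables it) map naturally onto a small Python class. The class and method names and the use of `RuntimeError` are assumptions for illustration; a boolean attribute stands in for the register.

```python
class ExceptionControl:
    """Model of programmable exception control.

    Hypothetical names: `specify` plays the role of the control
    specification instruction, `execute` the control execution instruction.
    """

    def __init__(self) -> None:
        self.enabled = False  # stands in for the control value in the register

    def specify(self, enable: bool) -> None:
        """Set the control specification value."""
        self.enabled = bool(enable)

    def execute(self, message: str = "programmed exception") -> None:
        """Raise only when the stored control value enables the exception."""
        if self.enabled:
            raise RuntimeError(message)
        # Otherwise the instruction completes without raising.


if __name__ == "__main__":
    control = ExceptionControl()
    control.execute()          # disabled: no exception
    control.specify(True)
    try:
        control.execute()
    except RuntimeError as exc:
        print("raised:", exc)
```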
  • Patent number: 9672034
    Abstract: In-lane vector shuffle operations are described. In one embodiment a shuffle instruction specifies a field of per-lane control bits, a source operand and a destination operand, these operands having corresponding lanes, each lane divided into corresponding portions of multiple data elements. Sets of data elements are selected from corresponding portions of every lane of the source operand according to per-lane control bits. Elements of these sets are copied to specified fields in corresponding portions of every lane of the destination operand. Another embodiment of the shuffle instruction also specifies a second source operand, all operands having corresponding lanes divided into multiple data elements. A set selected according to per-lane control bits contains data elements from every lane portion of a first source operand and data elements from every corresponding lane portion of the second source operand. Set elements are copied to specified fields in every lane of the destination operand.
    Type: Grant
    Filed: March 15, 2013
    Date of Patent: June 6, 2017
    Assignee: Intel Corporation
    Inventors: Zeev Sperber, Robert Valentine, Benny Eitan, Doron Orenstein
  • Patent number: 9619233
    Abstract: A computer architecture allows for simplified exception handling by restarting the program after exceptions at the beginning of idempotent regions, the idempotent regions allowing re-execution without the need for restoring complex state information from checkpoints. Recovery from mis-speculation may be provided by a similar mechanism but using smaller idempotent regions reflecting a more frequent occurrence of mis-speculation. A compiler generating different idempotent regions for speculation and exception handling is also disclosed.
    Type: Grant
    Filed: February 19, 2016
    Date of Patent: April 11, 2017
    Assignee: Wisconsin Alumni Research Foundation
    Inventors: Jaikrishnan Menon, Marc Asher De Kruijf, Karthikeyan Sankaralingam
  • Patent number: 9613350
    Abstract: A payment reader includes a contactless interface for communicating with a contactless device. The payment reader has a processor that executes instructions stored in memory, and the instructions include instructions for a plurality of firmware modules including a message dispatcher module and a plurality of functional modules. The functional modules generate messages and the message dispatcher module stores the messages in a queued data structure such as a stack or a queue. The messages are provided to the functional modules from the queued data structure. Some of the messages are timed messages that are returned to the queued data structure.
    Type: Grant
    Filed: February 24, 2016
    Date of Patent: April 4, 2017
    Inventor: Kshitiz Vadera
  • Patent number: 9600280
    Abstract: A hazard check instruction has operands that specify addresses of vector elements to be read by first and second vector memory operations. The hazard check instruction outputs a dependency vector identifying, for each element position of the first vector corresponding to the first vector memory operation, which element position of the second vector that the element of the first vector depends on (if any). In an embodiment, at least one of the vector memory operations has addresses specified using a scalar address in the operands (and a vector attribute associated with the vector). In an embodiment, the operands may include predicates for one or both of the vector memory operations, indicating which vector elements are active. The dependency vector may be qualified by the predicates, indicating dependencies only for active elements.
    Type: Grant
    Filed: September 24, 2013
    Date of Patent: March 21, 2017
    Assignee: Apple Inc.
    Inventor: Jeffry E. Gonion
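    Illustrative sketch (not part of the patent record): a dependency vector of the kind described above could be computed as follows. The abstract does not say what counts as a dependency, so this example assumes that element i of the first operation depends on the closest lower-numbered element j of the second operation that uses the same address, and it reports `None` where no dependency exists. The name `hazard_check` and the predicate lists are hypothetical.

```python
def hazard_check(addrs_a, addrs_b, pred_a=None, pred_b=None):
    """Return, per element of the first operation, the element position of the
    second operation it depends on, or None.

    Assumption: only lower positions of the second vector are considered, and
    inactive (predicated-off) elements never create or receive a dependency.
    """
    pred_a = pred_a if pred_a is not None else [True] * len(addrs_a)
    pred_b = pred_b if pred_b is not None else [True] * len(addrs_b)
    deps = []
    for i, addr in enumerate(addrs_a):
        dep = None
        if pred_a[i]:
            # Search downwards so the closest earlier element wins.
            for j in range(min(i, len(addrs_b)) - 1, -1, -1):
                if pred_b[j] and addrs_b[j] == addr:
                    dep = j
                    break
        deps.append(dep)
    return deps


addrs_a = [100, 104, 108, 112]
addrs_b = [104, 108, 200, 108]
deps_all = hazard_check(addrs_a, addrs_b)
deps_masked = hazard_check(addrs_a, addrs_b, pred_b=[False, True, True, True])
```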
  • Patent number: 9588952
    Abstract: Reconstituting an attribute associated with data. Data in a tabular form may be received. The data is analyzed for a field that is likely to be determined by a formula. Responsive to identifying the field likely to be determined by the formula, an indication of the field and the formula with the data are stored in a repository. The indication of the field and the formula with the data from the repository may be retrieved to facilitate incorporating the data in an application with the formula for the field integrated into the application.
    Type: Grant
    Filed: June 22, 2015
    Date of Patent: March 7, 2017
    Assignee: International Business Machines Corporation
    Inventors: Al Chakra, Liam Harpur, John Rice
  • Patent number: 9588766
    Abstract: A vector reduction instruction is executed by a processor to provide efficient reduction operations on an array of data elements. The processor includes vector registers. Each vector register is divided into a plurality of lanes, and each lane stores the same number of data elements. The processor also includes execution circuitry that receives the vector reduction instruction to reduce the array of data elements stored in a source operand into a result in a destination operand using a reduction operator. Each of the source operand and the destination operand is one of the vector registers. Responsive to the vector reduction instruction, the execution circuitry applies the reduction operator to two of the data elements in each lane, and shifts one or more remaining data elements when there is at least one of the data elements remaining in each lane.
    Type: Grant
    Filed: September 28, 2012
    Date of Patent: March 7, 2017
    Assignee: Intel Corporation
    Inventors: Paul Caprioli, Abhay S. Kanhere, Jeffrey J. Cook, Muawya M. Al-Otoom
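    Illustrative sketch (not part of the patent record): one way to picture the per-lane reduction is a step that combines two elements in each lane and shifts the remaining elements, repeated until each lane holds one value. Which two elements are combined and the shift direction are assumptions here (the first two elements, shifting toward index 0), and the helper names are hypothetical.

```python
import operator


def reduce_step(lanes, op):
    """Apply op to the first two elements of each lane and shift the rest down."""
    out = []
    for lane in lanes:
        if len(lane) < 2:
            out.append(list(lane))          # nothing left to combine
            continue
        combined = op(lane[0], lane[1])
        out.append([combined] + list(lane[2:]))
    return out


def reduce_lanes(lanes, op):
    """Repeat reduce_step until every (non-empty) lane holds a single element."""
    while any(len(lane) > 1 for lane in lanes):
        lanes = reduce_step(lanes, op)
    return [lane[0] for lane in lanes]


lanes = [[1, 2, 3, 4], [5, 6, 7, 8]]
one_step = reduce_step(lanes, operator.add)
sums = reduce_lanes(lanes, operator.add)
```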
  • Patent number: 9569190
    Abstract: Compiling source code to reduce run-time execution of vector element reverse operations includes: identifying, by a compiler, a first loop nested within a second loop in a computer program; identifying, by the compiler, a vector element reverse operation within the first loop; and moving, by the compiler, the vector element reverse operation from the first loop to the second loop.
    Type: Grant
    Filed: August 4, 2015
    Date of Patent: February 14, 2017
    Assignee: International Business Machines Corporation
    Inventors: Michael Karl Gschwind, William J. Schmidt
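    Illustrative sketch (not part of the patent record): the effect of moving an element reverse from an inner loop to the enclosing loop can be shown on plain lists. The example assumes the inner loop only performs element-wise addition, which commutes with reversal; the abstract does not state the conditions under which a compiler may perform the move, and both function names are hypothetical.

```python
def reverse_in_inner_loop(blocks):
    results = []
    for block in blocks:                      # second (outer) loop
        acc = [0] * len(block[0])
        for vec in block:                     # first (inner) loop
            rev = vec[::-1]                   # one reverse per inner iteration
            acc = [a + r for a, r in zip(acc, rev)]
        results.append(acc)
    return results


def reverse_in_outer_loop(blocks):
    results = []
    for block in blocks:                      # second (outer) loop
        acc = [0] * len(block[0])
        for vec in block:                     # first (inner) loop, no reverse
            acc = [a + v for a, v in zip(acc, vec)]
        results.append(acc[::-1])             # one reverse per outer iteration
    return results


blocks = [[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [1, 1, 1]]]
slow = reverse_in_inner_loop(blocks)
fast = reverse_in_outer_loop(blocks)
```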
  • Patent number: 9569188
    Abstract: Compiling source code to reduce run-time execution of vector element reverse operations includes: identifying, by a compiler, a first loop nested within a second loop in a computer program; identifying, by the compiler, a vector element reverse operation within the first loop; and moving, by the compiler, the vector element reverse operation from the first loop to the second loop.
    Type: Grant
    Filed: July 25, 2016
    Date of Patent: February 14, 2017
    Assignee: International Business Machines Corporation
    Inventors: Michael Karl Gschwind, William J. Schmidt
  • Patent number: 9501277
    Abstract: A first operation of comparison of the first initial operand with the second initial operand uses at least one comparison operator in such a way as to obtain a first final result word. A second operation of comparison of the second initial operand with the first initial operand uses the at least one comparison operator in such a way as to obtain a second final result word. Another operation checks the values of the bits of the two final result words in relation to at least a part of r combinations of reference values taken from possible combinations of values of these two final result words. These reference combinations represent a valid result of comparison of the two operands including an equality, a relationship of inferiority and a relationship of superiority between the two operands.
    Type: Grant
    Filed: June 12, 2014
    Date of Patent: November 22, 2016
    Assignee: STMICROELECTRONICS (ROUSSET) SAS
    Inventors: Pierre Guillemin, Yannick Teglia
  • Patent number: 9495153
    Abstract: A computer processor includes a decoder for decoding machine instructions and an execution unit for executing those instructions. The decoder and the execution unit are capable of decoding and executing vector instructions that include one or more format conversion indicators. For instance, the processor may be capable of executing a vector-load-convert-and-write (VLoadConWr) instruction that provides for loading data from memory to a vector register. The VLoadConWr instruction may include a format conversion indicator to indicate that the data from memory should be converted from a first format to a second format before the data is loaded into the vector register. Other embodiments are described and claimed.
    Type: Grant
    Filed: March 15, 2013
    Date of Patent: November 15, 2016
    Assignee: Intel Corporation
    Inventors: Eric Sprangle, Robert D. Cavin, Anwar Rohillah, Douglas M. Carmean
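    Illustrative sketch (not part of the patent record): a load that converts data between formats on its way into a vector register can be modelled with Python's `struct` module. The choice of half-precision values in memory and single-precision values in the register is an assumption for this example, and `vload_convert` is a hypothetical name; the abstract only speaks of a first and a second format.

```python
import struct


def vload_convert(memory, offset, count, src_format="e"):
    """Load `count` little-endian values stored in `src_format` (default:
    IEEE 754 half precision) and return them as single-precision values."""
    size = struct.calcsize("<" + src_format)
    raw = memory[offset:offset + count * size]
    values = struct.unpack("<%d%s" % (count, src_format), raw)
    # Round-trip through the 'f' format so each element is a float32 value.
    return [struct.unpack("<f", struct.pack("<f", v))[0] for v in values]


memory = struct.pack("<4e", 1.0, 0.5, -2.0, 3.5)   # four half-precision values
register = vload_convert(memory, 0, 4)
```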
  • Patent number: 9495159
    Abstract: In response to detecting that one or more conditions are met, a checkpoint of a current state of a thread may be created. One or more incomplete instructions may be moved from a first level of a re-order buffer to a second level of the re-order buffer. Each incomplete instruction may be currently executing or awaiting execution.
    Type: Grant
    Filed: September 27, 2013
    Date of Patent: November 15, 2016
    Assignee: Intel Corporation
    Inventors: Mark J. Dechene, Srikanth T. Srinivasan, Matthew C. Merten, Tong Li, Christine E. Wang
  • Patent number: 9489344
    Abstract: A data processor of a processing system, such as a graphics processing system, converts an input data value into an output data value by approximating a function which maps input values to output values. The data processor approximates the function using first and second predetermined ranges of values which are quantized into plural corresponding pairs of range sections, a predetermined gradient for each pair of range sections, and predetermined section end values for each pair of range sections. By using these predetermined parameters, the approximation of the function can be implemented efficiently by the data processor of the processing system.
    Type: Grant
    Filed: June 27, 2013
    Date of Patent: November 8, 2016
    Assignee: ARM LIMITED
    Inventors: Jorn Nystad, Sean Tristram Ellis
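    Illustrative sketch (not part of the patent record): approximating a function from predetermined section ends and per-section gradients amounts to a table-driven piecewise-linear evaluation. The target function (x squared on [0, 1]), the four equal sections, and all names below are assumptions chosen for this example.

```python
from bisect import bisect_right

# Predetermined parameters (assumed): input section ends, matching output
# section ends, and one gradient per pair of sections.
IN_ENDS = [0.0, 0.25, 0.5, 0.75, 1.0]
OUT_ENDS = [x * x for x in IN_ENDS]
GRADIENTS = [
    (OUT_ENDS[i + 1] - OUT_ENDS[i]) / (IN_ENDS[i + 1] - IN_ENDS[i])
    for i in range(len(IN_ENDS) - 1)
]


def approx(x):
    """Map an input value to an output value using the stored section data."""
    if not IN_ENDS[0] <= x <= IN_ENDS[-1]:
        raise ValueError("input outside the predetermined range")
    # Find the section containing x; clamp so x == IN_ENDS[-1] uses the last one.
    i = min(bisect_right(IN_ENDS, x) - 1, len(GRADIENTS) - 1)
    return OUT_ENDS[i] + GRADIENTS[i] * (x - IN_ENDS[i])


estimate = approx(0.6)
```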
  • Patent number: 9448796
    Abstract: Restricted instructions are prohibited from execution within a transaction. There are classes of instructions that are restricted regardless of type of transaction: constrained or nonconstrained. There are instructions only restricted in constrained transactions, and there are instructions that are selectively restricted for given transactions based on controls specified on instructions used to initiate the transactions.
    Type: Grant
    Filed: June 15, 2012
    Date of Patent: September 20, 2016
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Dan F. Greiner, Christian Jacobi, Timothy J. Slegel
  • Patent number: 9448797
    Abstract: Restricted instructions are prohibited from execution within a transaction. There are classes of instructions that are restricted regardless of type of transaction: constrained or nonconstrained. There are instructions only restricted in constrained transactions, and there are instructions that are selectively restricted for given transactions based on controls specified on instructions used to initiate the transactions.
    Type: Grant
    Filed: March 4, 2013
    Date of Patent: September 20, 2016
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Dan F. Greiner, Christian Jacobi, Timothy J. Slegel
  • Patent number: 9355061
    Abstract: A data processing apparatus and method are provided for executing a vector scan instruction. The data processing apparatus comprises a vector register store configured to store vector operands, and processing circuitry configured to perform operations on vector operands retrieved from said vector register store. Further, control circuitry is configured to control the processing circuitry to perform the operations required by one or more instructions, said one or more instructions including a vector scan instruction specifying a vector operand comprising N vector elements and defining a scan operation to be performed on a sequence of vector elements within the vector operand.
    Type: Grant
    Filed: January 28, 2014
    Date of Patent: May 31, 2016
    Assignee: ARM Limited
    Inventors: Matthias Lothar Boettcher, Mbou Eyole-Monono, Giacomo Gabrielli
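    Illustrative sketch (not part of the patent record): a scan over a sequence of elements within a vector is commonly an inclusive prefix operation. The example below assumes an inclusive scan and assumes that elements outside the selected sequence pass through unchanged; `vector_scan`, `start`, and `length` are hypothetical names.

```python
from itertools import accumulate
import operator


def vector_scan(vec, op=operator.add, start=0, length=None):
    """Inclusive scan of vec[start:start+length]; other elements are unchanged."""
    if length is None:
        length = len(vec) - start
    end = start + length
    scanned = list(accumulate(vec[start:end], op))
    return vec[:start] + scanned + vec[end:]


prefix_sums = vector_scan([1, 2, 3, 4])
partial = vector_scan([1, 2, 3, 4, 5, 6], start=2, length=3)
running_max = vector_scan([3, 1, 4, 1, 5], op=max)
```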
  • Patent number: 9329868
    Abstract: Embodiments relate to reducing a number of read ports for register pairs. An aspect includes executing an instruction. The instruction identifies a pair of registers as containing a wide operand which spans the pair of registers. It is determined if a pairing indicator associated with the pair of registers has a first value or a second value. The first value indicates that the wide operand is stored in a wide register, and the second value indicates that the wide operand is not stored in the wide register. Based on the pairing indicator having the first value, the wide operand is read from the wide register. Based on the pairing indicator having the second value, the wide operand is read from the pair of registers. An operation is performed using the wide operand.
    Type: Grant
    Filed: October 22, 2014
    Date of Patent: May 3, 2016
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Jonathan D. Bradbury, Michael K. Gschwind
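    Illustrative sketch (not part of the patent record): the pairing indicator can be pictured as a flag that says whether a wide register currently mirrors a register pair. The 64-bit register width, the rule that a narrow write clears the indicator, and the class and method names are assumptions for this example. `read_wide` also returns the number of register reads used, to show where a read port is saved.

```python
MASK64 = (1 << 64) - 1


class RegisterFile:
    def __init__(self, count=16):
        self.regs = [0] * count   # architected 64-bit registers
        self.wide = {}            # pair base index -> 128-bit wide register
        self.paired = set()       # bases whose pairing indicator has the first value

    def write_wide(self, base, value):
        """Write a 128-bit operand to the pair (base, base + 1) and the wide register."""
        self.regs[base] = (value >> 64) & MASK64
        self.regs[base + 1] = value & MASK64
        self.wide[base] = value
        self.paired.add(base)

    def write_narrow(self, index, value):
        """Write one 64-bit register; assumed to clear indicators of pairs containing it."""
        self.regs[index] = value & MASK64
        self.paired.discard(index)
        self.paired.discard(index - 1)

    def read_wide(self, base):
        """Return (operand, number of register reads)."""
        if base in self.paired:
            return self.wide[base], 1                                  # wide register
        return (self.regs[base] << 64) | self.regs[base + 1], 2       # register pair


rf = RegisterFile()
value = (0xAAAA << 64) | 0xBBBB
rf.write_wide(2, value)
first_read = rf.read_wide(2)
rf.write_narrow(3, 7)
second_read = rf.read_wide(2)
```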
  • Patent number: 9323532
    Abstract: Embodiments relate to reducing a number of read ports for register pairs. An aspect includes executing an instruction. The instruction identifies a pair of registers as containing a wide operand which spans the pair of registers. The executing of the instruction includes determining whether a pairing indicator associated with the pair of registers has a first value, a second value or a third value. Based on the pairing indicator having the first value, the wide operand is read from the wide register. Based on the pairing indicator having the second value, the wide operand is read from the pair of registers. Based on the pairing indicator having the third value, the wide operand is speculatively read from a predetermined register. The predetermined register consists of the wide register or the pair of registers.
    Type: Grant
    Filed: July 18, 2012
    Date of Patent: April 26, 2016
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Jonathan D. Bradbury, Michael K. Gschwind
  • Patent number: 9323529
    Abstract: Embodiments relate to reducing a number of read ports for register pairs. An aspect includes executing an instruction. The instruction identifies a pair of registers as containing a wide operand which spans the pair of registers. It is determined if a pairing indicator associated with the pair of registers has a first value or a second value. The first value indicates that the wide operand is stored in a wide register, and the second value indicates that the wide operand is not stored in the wide register. Based on the pairing indicator having the first value, the wide operand is read from the wide register. Based on the pairing indicator having the second value, the wide operand is read from the pair of registers. An operation is performed using the wide operand.
    Type: Grant
    Filed: July 18, 2012
    Date of Patent: April 26, 2016
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Jonathan D. Bradbury, Michael K. Gschwind
  • Patent number: 9317251
    Abstract: A method for correcting a shift error in a fused multiply add operation. The method comprises adjusting a normalized floating-point number before performing a shift error correction to produce an adjusted normalized floating-point number, and correcting a shift error in the adjusted normalized floating-point number. Correcting the shift error comprises shifting a mantissa of the adjusted normalized floating-point number in one direction. A fused multiply add module comprises a normalizer module, a compensation logic, and a rounder. The normalizer module is operable to normalize a floating-point number to produce a normalized floating-point number. The floating-point number is normalized based upon an estimated quantity of leading zeros. The compensation logic is operable to manage a correction of a shift error in the normalized floating-point number. The rounder is operable to correct the shift error with a mantissa shift in only one direction.
    Type: Grant
    Filed: December 31, 2012
    Date of Patent: April 19, 2016
    Assignee: NVIDIA CORPORATION
    Inventors: Charles Tsen, Adam Dreyer
  • Patent number: 9298497
    Abstract: A computer architecture allows for simplified exception handling by restarting the program after exceptions at the beginning of idempotent regions, the idempotent regions allowing re-execution without the need for restoring complex state information from checkpoints. Recovery from mis-speculation may be provided by a similar mechanism but using smaller idempotent regions reflecting a more frequent occurrence of mis-speculation. A compiler generating different idempotent regions for speculation and exception handling is also disclosed.
    Type: Grant
    Filed: July 13, 2012
    Date of Patent: March 29, 2016
    Assignee: Wisconsin Alumni Research Foundation
    Inventors: Jaikrishnan Menon, Marc Asher De Kruijf, Karthikeyan Sankaralingam