Arithmetic Operation Instruction Processing Patents (Class 712/221)

Floating point or vector (Class 712/222)

Instruction for performing an overload check

Patent number: 10162640

Abstract: A processor is described having a functional unit within an instruction execution pipeline. The functional unit has circuitry to determine whether substantive data from a larger source data size will fit within a smaller data size that the substantive data is to flow to.

Type: Grant

Filed: August 16, 2016

Date of Patent: December 25, 2018

Assignee: Intel Corporation

Inventors: Martin G. Dixon, Baiju V. Patel, Rajeev Gopalakrishna
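
As a rough software analogue of the check described in the entry above (patent 10162640), the sketch below tests whether a value held in a wider field would survive narrowing to a smaller width. The function name `fits_in_signed` and the signed two's-complement interpretation are assumptions made for illustration; the abstract describes hardware circuitry and does not specify either.

```python
def fits_in_signed(value: int, dest_bits: int) -> bool:
    """Return True if `value` is representable in `dest_bits` two's-complement bits.

    Hypothetical helper: the abstract only says the data must "fit" in the
    smaller size; the signed interpretation is an assumption.
    """
    lowest = -(1 << (dest_bits - 1))
    highest = (1 << (dest_bits - 1)) - 1
    return lowest <= value <= highest


# A 32-bit source value that is about to flow into an 8-bit destination.
print(fits_in_signed(127, 8))   # fits
print(fits_in_signed(128, 8))   # does not fit
```

A check like this would typically run before a narrowing move, so that software can branch to a slow path instead of silently truncating.
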
Processing of multiple instruction streams in a parallel slice processor

Patent number: 10157064

Abstract: A method of managing instruction execution for multiple instruction streams using a processor core having multiple parallel instruction execution slices. An event is detected indicating that either resource requirement or resource availability for a subsequent instruction of an instruction stream will not be met by the instruction execution slice currently executing the instruction stream. In response to detecting the event, dispatch of at least a portion of the subsequent instruction is made to another instruction execution slice. The event may be a compiler-inserted directive, may be an event detected by logic in the processor core, or may be determined by a thread sequencer. The instruction execution slices may be dynamically reconfigured as between single-instruction-multiple-data (SIMD) instruction execution, ordinary instruction execution, and wide instruction execution.

Type: Grant

Filed: February 27, 2017

Date of Patent: December 18, 2018

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Lee Evan Eisen, Hung Qui Le, Jentje Leenstra, Jose Eduardo Moreira, Bruce Joseph Ronchetti, Brian William Thompto, Albert James Van Norstrand, Jr.
Systems, apparatuses, and methods for chained fused multiply add

Patent number: 10146535

Abstract: Embodiments of systems, apparatuses, and methods for chained fused multiply add. In some embodiments, an apparatus includes a decoder to decode a single instruction having an opcode, a destination field representing a destination operand, a first source field representing a plurality of packed data source operands of a first type that have packed data elements of a first size, a second source field representing a plurality of packed data source operands that have packed data elements of a second size, and a field for a memory location that stores a scalar value. A register file having a plurality of packed data registers includes registers for the plurality of packed data source operands that have packed data elements of a first size, the source operands that have packed data elements of a second size, and the destination operand.

Type: Grant

Filed: October 20, 2016

Date of Patent: December 4, 2018

Assignee: Intel Corporation

Inventors: Jesus Corbal, Robert Valentine, Roman S. Dubtsov, Nikita A. Shustrov, Mark J. Charney, Dennis R. Bradford, Milind B. Girkar, Edward T. Grochowski, Thomas D. Fletcher, Warren E. Ferguson
Fast transitions for massively parallel computing applications

Patent number: 10102032

Abstract: Embodiments relate to facilitating quick and graceful transitions for massively parallel computing applications. A computer-implemented method for facilitating termination of a plurality of threads of a process is provided. The method maintains information about open communications between one or more of the threads of the process and one or more of other processes. In response to receiving a command to terminate one or more of the threads of the process, the method completes the open communications on behalf of the threads after terminating the threads.

Type: Grant

Filed: May 29, 2014

Date of Patent: October 16, 2018

Assignee: Raytheon Company

Inventors: Benjamin M. Howe, Jacob L. Sanders
Pseudorandom communications routing

Patent number: 10091092

Abstract: This invention provides systems and methods to make communication networks more resilient, stealthier and robust. This invention discloses systems and methods wherein either a communications user equipment (UE) with multiple types of wireless links, potentially operating in different frequency bands, or an apparatus which performs communications routing functions, changes the communications routing in pseudo-random manner.

Type: Grant

Filed: February 3, 2017

Date of Patent: October 2, 2018

Assignee: The United States of America as represented by the Secretary of the Air Force

Inventor: Amjad Soomro
Processing denormal numbers in FMA hardware

Patent number: 10078512

Abstract: A microprocessor includes FMA execution logic that determines whether to accumulate an accumulator operand C to the partial products of multiplier and multiplicand operands A and B in the partial product adder or in a second accumulation stage. The logic calculates an exponent delta of Aexp+Bexp−Cexp and determines the number of leading zeroes in C, if C is denormal. The microprocessor accumulates C with the partial products of A and B when the accumulation of C to the product of A and B could result in mass cancellation, when ExpDelta is greater than or equal to −K (where K is related to a width of a datapath in the partial product adder), and when a C is denormal and its number of leading zeroes plus K exceeds −ExpDelta. The strategic use of resources in the partial product adder and second accumulation stage reduces latency.

Type: Grant

Filed: October 3, 2016

Date of Patent: September 18, 2018

Assignee: VIA ALLIANCE SEMICONDUCTOR CO., LTD.

Inventor: Thomas Elmer
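
One possible reading of the decision rule in the entry above (patent 10078512) is sketched below. It assumes the three listed conditions are alternatives, any one of which routes C into the partial product adder, and it takes the mass-cancellation test as a precomputed boolean because the abstract does not define it. All function and parameter names are hypothetical.

```python
def exp_delta(a_exp: int, b_exp: int, c_exp: int) -> int:
    """Exponent delta as given in the abstract: Aexp + Bexp - Cexp."""
    return a_exp + b_exp - c_exp


def accumulate_in_partial_product_adder(
    a_exp: int,
    b_exp: int,
    c_exp: int,
    k: int,
    c_is_denormal: bool = False,
    c_leading_zeros: int = 0,
    may_mass_cancel: bool = False,
) -> bool:
    """Return True if C would join the partial products, False if it would
    wait for the second accumulation stage (assumed OR of the conditions)."""
    delta = exp_delta(a_exp, b_exp, c_exp)
    if may_mass_cancel:
        return True
    if delta >= -k:
        return True
    if c_is_denormal and c_leading_zeros + k > -delta:
        return True
    return False
```

With `k = 4`, exponents `(0, 0, 10)` give a delta of -10, so a normal C would be deferred, while a denormal C with seven leading zeroes would be accumulated early under this reading.
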
Programmable logic device, method for verifying error of programmable logic device, and method for forming circuit of programmable logic device

Patent number: 10067742

Abstract: Arithmetic operation circuits and a verification circuit are formed by loading configuration information into a configuration memory in an FPGA. Arithmetic operation circuits have the same arithmetic operation function, but are different from each other in combination of the circuit blocks. The arithmetic operation circuits are formed by combining the circuit blocks to make the maximum use of the DSP block, while the arithmetic operation circuit is formed by combining the circuit blocks other than DSP block. The arithmetic operation circuits each are configured to use a block RAM as the data hold memory, while the arithmetic operation circuit is configured to use a distributed RAM as the data hold memory. Each of the arithmetic operation circuits receives the input data, and outputs arithmetic operation result data (V1 to V3). A verification circuit compares the arithmetic operation result data to verify whether errors occur.

Type: Grant

Filed: February 24, 2016

Date of Patent: September 4, 2018

Assignee: Control System Laboratory Ltd.

Inventor: Kenichi Morimoto
Adjusting target values of resistive memory devices

Patent number: 10037804

Abstract: Examples disclosed herein relate to programming a first conductance of a first resistive memory device based on a first target value. The first conductance of the first resistive memory device is measured to determine a deviation of the first resistive memory device from the first target value. A second target value of a second resistive memory device is adjusted based on the deviation, and a second conductance of the second resistive memory device is programmed based on the adjusted second target value.

Type: Grant

Filed: January 27, 2017

Date of Patent: July 31, 2018

Assignee: Hewlett Packard Enterprise Development LP

Inventors: Brent Buchanan, Le Zheng, John Paul Strachan
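
The program, measure, and adjust sequence in the entry above (patent 10037804) can be sketched with a toy device model. The `SimulatedArray` class, the integer conductance units, and the choice to subtract the deviation from the second target are all assumptions; the abstract only says the second target is adjusted "based on the deviation".

```python
class SimulatedArray:
    """Toy stand-in for resistive devices with a fixed write error per device."""

    def __init__(self, write_errors):
        self.write_errors = list(write_errors)
        self.conductance = [0] * len(self.write_errors)

    def program(self, index, target):
        # A real device lands near, not exactly on, the target value.
        self.conductance[index] = target + self.write_errors[index]

    def measure(self, index):
        return self.conductance[index]


def program_pair(array, first_target, second_target):
    """Program device 0, measure its deviation, then compensate on device 1."""
    array.program(0, first_target)
    deviation = array.measure(0) - first_target
    adjusted_second_target = second_target - deviation  # assumed compensation rule
    array.program(1, adjusted_second_target)
    return deviation, adjusted_second_target


array = SimulatedArray(write_errors=[3, 0])
print(program_pair(array, first_target=100, second_target=50))
```

Under this compensation rule the sum of the two programmed conductances matches the sum of the two original targets, which is the property one would want if the devices feed a shared current sum.
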
Suspending and resuming continuous queries over data streams

Patent number: 9910896

Abstract: In an embodiment, a method comprises processing an input data stream as the data stream is streamed and producing a derived stream therefrom; storing the input data stream in an input archive; suspending processing of the input data stream; subsequent to suspending processing, resuming processing of the input data stream, wherein resuming comprises: storing newly received data in the input data stream in a buffer, as the input data stream is streamed; determining a first timestamp; determining a second timestamp; searching the input archive to find a data item that matches the first timestamp of the last processed data item; processing data in the input archive having timestamps that are greater than the first timestamp until arriving at data with a third timestamp that is greater than the second timestamp; processing the input data stream from the buffer; continuing processing the input data stream as the input stream is streamed.

Type: Grant

Filed: March 15, 2013

Date of Patent: March 6, 2018

Assignee: CISCO TECHNOLOGY, INC.

Inventors: Sailesh Krishnamurthy, Chris Metz, Rex E. Fernando, Jisu Bhattacharya
Functional unit for instruction execution pipeline capable of shifting different chunks of a packed data operand by different amounts

Patent number: 9851972

Abstract: A method is described that includes fetching an instruction. The method further includes decoding the instruction. The instruction specifies an operation, a first operand and a second operand. The method further includes fetching the first and second operands of the instruction. The first and second operands are each composed of a plurality of larger chunks having constituent elements. The method further includes performing the operation specified by the instruction including generating a resultant composed of a plurality of larger chunks having constituent elements. The generating of the resultant includes selecting for each element in the resultant a contiguous group of bits from a same positioned chunk of the first operand as the chunk of the element in the resultant, the contiguous group of bits being identified by a same positioned element of the second operand as the element in the resultant.

Type: Grant

Filed: January 23, 2017

Date of Patent: December 26, 2017

Assignee: Intel Corporation

Inventors: Tal Uliel, Robert Valentine
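
The per-element bit selection in the entry above (patent 9851972) can be illustrated as follows. The sketch assumes 64-bit chunks, 8-bit elements, a start offset taken from the low six bits of each control element, and wrap-around at the chunk boundary; none of these sizes or conventions is stated in the abstract, and the names are hypothetical.

```python
CHUNK_BITS = 64          # assumed chunk width
ELEM_BITS = 8            # assumed element width
ELEMS_PER_CHUNK = CHUNK_BITS // ELEM_BITS
CHUNK_MASK = (1 << CHUNK_BITS) - 1


def select_bits(data_chunks, control_chunks):
    """For each result element, pick 8 contiguous bits from the same-positioned
    data chunk, starting at the offset named by the same-positioned control
    element."""
    result = []
    for data, control in zip(data_chunks, control_chunks):
        out = 0
        for i in range(ELEMS_PER_CHUNK):
            offset = (control >> (i * ELEM_BITS)) & 0x3F
            # Rotate right so the selected group lands in the low 8 bits.
            rotated = ((data >> offset) | (data << (CHUNK_BITS - offset))) & CHUNK_MASK
            out |= (rotated & 0xFF) << (i * ELEM_BITS)
        result.append(out)
    return result


data = 0x0807060504030201
identity = sum((8 * i) << (8 * i) for i in range(8))   # offsets 0, 8, ..., 56
print(hex(select_bits([data], [identity])[0]))
```

Because every element carries its own offset, one instruction of this shape can apply a different shift to each element of a chunk.
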
Method, apparatus and system for data stream processing with a programmable accelerator

Patent number: 9830154

Abstract: Techniques and mechanisms for programming an accelerator device to enable performance of a data processing algorithm. In an embodiment, an accelerator of a computer platform is programmed based on programming information received from a host processor of the computer platform. In another embodiment, programming of the accelerator is to enable data driven execution of an instruction by a data stream processing engine of the accelerator.

Type: Grant

Filed: December 29, 2011

Date of Patent: November 28, 2017

Assignee: Intel Corporation

Inventor: Vladimir Ivanov
Processing of multiple instruction streams in a parallel slice processor

Patent number: 9690586

Abstract: A method of managing instruction execution for multiple instruction streams using a processor core having multiple parallel instruction execution slices provides instruction processing flexibility. An event is detected indicating that either resource requirement or resource availability for a subsequent instruction of an instruction stream will not be met by the instruction execution slice currently executing the instruction stream. In response to detecting the event, dispatch of at least a portion of the subsequent instruction is made to another instruction execution slice. The event may be a compiler-inserted directive, may be an event detected by logic in the processor core, or may be determined by a thread sequencer. The execution slices may be dynamically reconfigured as between single-instruction-multiple-data (SIMD) instruction execution, ordinary instruction execution, and wide instruction execution.

Type: Grant

Filed: June 12, 2014

Date of Patent: June 27, 2017

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Lee Evan Eisen, Hung Qui Le, Jentje Leenstra, Jose Eduardo Moreira, Bruce Joseph Ronchetti, Brian William Thompto, Albert James Van Norstrand, Jr.
Processing of multiple instruction streams in a parallel slice processor

Patent number: 9672043

Abstract: Techniques for managing instruction execution for multiple instruction streams using a processor core having multiple parallel instruction execution slices provide flexibility in execution of program instructions by a processor core. An event is detected indicating that either resource requirement or resource availability will not be met by the execution slice currently executing the instruction stream. In response to detecting the event, dispatch of at least a portion of the subsequent instruction is made to another instruction execution slice. The event may be a compiler-inserted directive, may be an event detected by logic in the processor core, or may be determined by a thread sequencer. The instruction execution slices may be dynamically reconfigured as between single-instruction-multiple-data (SIMD) instruction execution, ordinary instruction execution, and wide instruction execution.

Type: Grant

Filed: May 12, 2014

Date of Patent: June 6, 2017

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Lee Evan Eisen, Hung Qui Le, Jentje Leenstra, Jose Eduardo Moreira, Bruce Joseph Ronchetti, Brian William Thompto, Albert James Van Norstrand, Jr.
Coalescing adjacent gather/scatter operations

Patent number: 9658856

Abstract: According to one embodiment, a processor includes an instruction decoder to decode a first instruction to gather data elements from memory, the first instruction having a first operand specifying a first storage location and a second operand specifying a first memory address storing a plurality of data elements. The processor further includes an execution unit coupled to the instruction decoder, in response to the first instruction, to read contiguous a first and a second of the data elements from a memory location based on the first memory address indicated by the second operand, and to store the first data element in a first entry of the first storage location and a second data element in a second entry of a second storage location corresponding to the first entry of the first storage location.

Type: Grant

Filed: December 21, 2015

Date of Patent: May 23, 2017

Assignee: Intel Corporation

Inventors: Andrew T. Forsyth, Brian J. Hickmann, Jonathan C. Hall, Christopher J. Hughes
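
The coalescing idea shared by this family of entries can be pictured as one contiguous read that feeds the same lane of two destinations. In the sketch below a Python list stands in for memory and two lists stand in for the destination storage locations; the function name and the index-based addressing are assumptions, not details from the abstract.

```python
def coalesced_gather_pairs(memory, indices):
    """For each index, read two adjacent elements in one access and place
    them in the same lane of two separate destinations."""
    first_dest, second_dest = [], []
    for index in indices:
        first, second = memory[index:index + 2]   # one contiguous read
        first_dest.append(first)
        second_dest.append(second)
    return first_dest, second_dest


# Interleaved (x, y) pairs in memory, gathered in an arbitrary order.
memory = [10, 11, 20, 21, 30, 31]
print(coalesced_gather_pairs(memory, [4, 0, 2]))
```

The alternative would be two separate gathers, one per destination, each touching the same memory locations again; combining them is where the saving would come from.
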
Coalescing adjacent gather/scatter operations

Patent number: 9645826

Abstract: According to one embodiment, a processor includes an instruction decoder to decode a first instruction to gather data elements from memory, the first instruction having a first operand specifying a first storage location and a second operand specifying a first memory address storing a plurality of data elements. The processor further includes an execution unit coupled to the instruction decoder, in response to the first instruction, to read contiguous a first and a second of the data elements from a memory location based on the first memory address indicated by the second operand, and to store the first data element in a first entry of the first storage location and a second data element in a second entry of a second storage location corresponding to the first entry of the first storage location.

Type: Grant

Filed: December 18, 2015

Date of Patent: May 9, 2017

Assignee: Intel Corporation

Inventors: Andrew T. Forsyth, Brian J. Hickmann, Jonathan C. Hall, Christopher J. Hughes
Coalescing adjacent gather/scatter operations

Patent number: 9632792

Abstract: According to one embodiment, a processor includes an instruction decoder to decode a first instruction to gather data elements from memory, the first instruction having a first operand specifying a first storage location and a second operand specifying a first memory address storing a plurality of data elements. The processor further includes an execution unit coupled to the instruction decoder, in response to the first instruction, to read contiguous a first and a second of the data elements from a memory location based on the first memory address indicated by the second operand, and to store the first data element in a first entry of the first storage location and a second data element in a second entry of a second storage location corresponding to the first entry of the first storage location.

Type: Grant

Filed: December 21, 2015

Date of Patent: April 25, 2017

Assignee: Intel Corporation

Inventors: Andrew T. Forsyth, Brian J. Hickmann, Jonathan C. Hall, Christopher J. Hughes
Coalescing adjacent gather/scatter operations

Patent number: 9626193

Abstract: According to one embodiment, a processor includes an instruction decoder to decode a first instruction to gather data elements from memory, the first instruction having a first operand specifying a first storage location and a second operand specifying a first memory address storing a plurality of data elements. The processor further includes an execution unit coupled to the instruction decoder, in response to the first instruction, to read contiguous a first and a second of the data elements from a memory location based on the first memory address indicated by the second operand, and to store the first data element in a first entry of the first storage location and a second data element in a second entry of a second storage location corresponding to the first entry of the first storage location.

Type: Grant

Filed: December 21, 2015

Date of Patent: April 18, 2017

Assignee: Intel Corporation

Inventors: Andrew T. Forsyth, Brian J. Hickmann, Jonathan C. Hall, Christopher J. Hughes
Coalescing adjacent gather/scatter operations

Patent number: 9626192

Abstract: According to one embodiment, a processor includes an instruction decoder to decode a first instruction to gather data elements from memory, the first instruction having a first operand specifying a first storage location and a second operand specifying a first memory address storing a plurality of data elements. The processor further includes an execution unit coupled to the instruction decoder, in response to the first instruction, to read contiguous a first and a second of the data elements from a memory location based on the first memory address indicated by the second operand, and to store the first data element in a first entry of the first storage location and a second data element in a second entry of a second storage location corresponding to the first entry of the first storage location.

Type: Grant

Filed: December 18, 2015

Date of Patent: April 18, 2017

Assignee: Intel Corporation

Inventors: Andrew T. Forsyth, Brian J. Hickmann, Jonathan C. Hall, Christopher J. Hughes
Coalescing adjacent gather/scatter operations

Patent number: 9612842

Abstract: According to one embodiment, a processor includes an instruction decoder to decode a first instruction to gather data elements from memory, the first instruction having a first operand specifying a first storage location and a second operand specifying a first memory address storing a plurality of data elements. The processor further includes an execution unit coupled to the instruction decoder, in response to the first instruction, to read contiguous a first and a second of the data elements from a memory location based on the first memory address indicated by the second operand, and to store the first data element in a first entry of the first storage location and a second data element in a second entry of a second storage location corresponding to the first entry of the first storage location.

Type: Grant

Filed: December 21, 2015

Date of Patent: April 4, 2017

Assignee: Intel Corporation

Inventors: Andrew T. Forsyth, Brian J. Hickmann, Jonathan C. Hall, Christopher J. Hughes
Coalescing adjacent gather/scatter operations

Patent number: 9575765

Abstract: According to one embodiment, a processor includes an instruction decoder to decode a first instruction to gather data elements from memory, the first instruction having a first operand specifying a first storage location and a second operand specifying a first memory address storing a plurality of data elements. The processor further includes an execution unit coupled to the instruction decoder, in response to the first instruction, to read contiguous a first and a second of the data elements from a memory location based on the first memory address indicated by the second operand, and to store the first data element in a first entry of the first storage location and a second data element in a second entry of a second storage location corresponding to the first entry of the first storage location.

Type: Grant

Filed: December 18, 2015

Date of Patent: February 21, 2017

Assignee: Intel Corporation

Inventors: Andrew T. Forsyth, Brian J. Hickmann, Jonathan C. Hall, Christopher J. Hughes
Coalescing adjacent gather/scatter operations

Patent number: 9563429

Abstract: According to one embodiment, a processor includes an instruction decoder to decode a first instruction to gather data elements from memory, the first instruction having a first operand specifying a first storage location and a second operand specifying a first memory address storing a plurality of data elements. The processor further includes an execution unit coupled to the instruction decoder, in response to the first instruction, to read contiguous a first and a second of the data elements from a memory location based on the first memory address indicated by the second operand, and to store the first data element in a first entry of the first storage location and a second data element in a second entry of a second storage location corresponding to the first entry of the first storage location.

Type: Grant

Filed: December 18, 2015

Date of Patent: February 7, 2017

Assignee: Intel Corporation

Inventors: Andrew T. Forsyth, Brian J. Hickmann, Jonathan C. Hall, Christopher J. Hughes
Functional unit for instruction execution pipeline capable of shifting different chunks of a packed data operand by different amounts

Patent number: 9552209

Abstract: A method is described that includes fetching an instruction. The method further includes decoding the instruction. The instruction specifies an operation, a first operand and a second operand. The method further includes fetching the first and second operands of the instruction. The first and second operands are each composed of a plurality of larger chunks having constituent elements. The method further includes performing the operation specified by the instruction including generating a resultant composed of a plurality of larger chunks having constituent elements. The generating of the resultant includes selecting for each element in the resultant a contiguous group of bits from a same positioned chunk of the first operand as the chunk of the element in the resultant, the contiguous group of bits being identified by a same positioned element of the second operand as the element in the resultant.

Type: Grant

Filed: December 27, 2013

Date of Patent: January 24, 2017

Assignee: Intel Corporation

Inventors: Tal Uliel, Robert Valentine
Processor register error correction management

Patent number: 9529653

Abstract: Processor register protection management is disclosed. In embodiments, a method of processor register protection management can include determining a sensitive logical register for executable code generated by a compiler, generating an error-correction table identifying the sensitive logical register, and storing the error-correction table in a memory accessible by a processor. The processor can be configured to generate a duplicate register of the sensitive logical register identified by the error-correction table.

Type: Grant

Filed: October 9, 2014

Date of Patent: December 27, 2016

Assignee: International Business Machines Corporation

Inventors: Pradip Bose, Chen-Yong Cher, Meeta S. Gupta
Apparatus and method for performing permute operations

Patent number: 9513918

Abstract: An apparatus and method are described for permuting data elements with masking. For example, a method according to one embodiment includes the following operations: reading values from a mask data structure to determine whether masking is implemented for each data element of a destination operand; if masking is not implemented for a particular data element, then selecting data elements from a first source operand and a second source operand based on index values stored in the destination operand to be copied to data element positions within the destination operand, wherein any one of the data elements from either the first source operand or the second source operand may be copied to any one of the data element positions within the destination operand; and if masking is implemented for a particular data element of the destination operand, then performing a designated masking operation with respect to that particular data element.

Type: Grant

Filed: December 22, 2011

Date of Patent: December 6, 2016

Assignee: INTEL CORPORATION

Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Mostafa Hagog, Jesus Corbal, Bret L Toll, Mark J Charney, Tal Uliel, Zeev Sperber, Amit Gradstein
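
A minimal sketch of the two-source permute with masking described in the entry above (patent 9513918) follows. It keeps the abstract's polarity, where a true mask entry means masking applies to that element, and it assumes the "designated masking operation" is either zeroing the element or leaving the old destination value in place. Those two options, and all names, are assumptions.

```python
def permute2_masked(dest, src1, src2, masked, zeroing=False):
    """Permute elements of src1 and src2 into dest, using the index values
    already held in dest; masked elements get the masking operation instead."""
    table = list(src1) + list(src2)   # any source element can reach any position
    out = []
    for position, index in enumerate(dest):
        if masked[position]:
            out.append(0 if zeroing else index)   # assumed zeroing or merging
        else:
            out.append(table[index % len(table)])
    return out


src1 = [10, 11, 12, 13]
src2 = [20, 21, 22, 23]
dest = [7, 6, 5, 2]                 # index values, overwritten by the result
masked = [False, True, False, False]
print(permute2_masked(dest, src1, src2, masked))
```

Indices 0 to 3 select from the first source and 4 to 7 from the second in this layout, which is one simple way to let a single index address both sources.
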
Instruction merging optimization

Patent number: 9513916

Abstract: A computer-implemented method includes determining that two or more instructions of an instruction stream are eligible for optimization, where the two or more instructions include a memory load instruction and a data processing instruction to process data based on the memory load instruction. The method includes merging, by a processor, the two or more instructions into a single optimized internal instruction and executing the single optimized internal instruction to perform a memory load function and a data processing function corresponding to the memory load instruction and the data processing instruction.

Type: Grant

Filed: March 8, 2013

Date of Patent: December 6, 2016

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Michael K. Gschwind, Valentina Salapura
Data transmission circuit

Patent number: 9514790

Abstract: A data transmission circuit may include data line groups and pass sections arranged among the data line groups to allow the data line groups to form one line. The data transmission circuit may include an input/output unit configured to be coupled to the data line groups and to process write data to be transmitted to the data line groups or read data transmitted from the data line groups. The data transmission circuit may include a pass control unit configured to selectively enable the pass sections in response to an address for specifying a target data line group of the data line groups.

Type: Grant

Filed: May 13, 2015

Date of Patent: December 6, 2016

Assignee: SK HYNIX INC.

Inventor: Sang Oh Lim
Instruction merging optimization

Patent number: 9513915

Abstract: A computer system for optimizing instructions includes a processor including an instruction execution unit configured to execute instructions and an instruction optimization unit configured to optimize instructions and memory to store machine instructions to be executed by the instruction execution unit. The computer system is configured to perform a method including analyzing machine instructions from among a stream of instructions to be executed by the instruction execution unit, the machine instructions including a memory load instruction and a data processing instruction to perform a data processing function based on the memory load instruction, identifying the machine instructions as being eligible for optimization, merging the machine instructions into a single optimized internal instruction, and executing the single optimized internal instruction to perform a memory load function and a data processing function corresponding to the memory load instruction and the data processing instruction.

Type: Grant

Filed: March 28, 2012

Date of Patent: December 6, 2016

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Michael K. Gschwind, Valentina Salapura
Apparatus and method for performing a permute operation

Patent number: 9495162

Abstract: An apparatus and method are described for permuting data elements with masking. For example, a method according to one embodiment includes the following operations: reading values from a mask data structure to determine whether masking is implemented for each data element of a destination operand; if masking is not implemented for a particular data element, then selecting data elements from the destination operand and a second source operand based on index values stored in a first source operand to be copied to data element positions within the destination operand, wherein any one of the data elements from either the destination operand or the second source operand may be copied to any one of the data element positions within the destination operand; if masking is implemented for a particular data element of the destination operand, then performing a designated masking operation with respect to that particular data element.

Type: Grant

Filed: December 23, 2011

Date of Patent: November 15, 2016

Assignee: INTEL CORPORATION

Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Mostafa Hagog, Jesus Corbal, Tal Uliel, Zeev Sperber, Amit Gradstein
Processor execution unit with configurable SIMD functional blocks for complex number operations

Patent number: 9465611

Abstract: Methods and systems for executing SIMD instructions that efficiently implement new SIMD instructions and conventional existing SIMD MAC-type instructions, while avoiding replication of functions in order to keep the logic circuit size as low as can reasonably be achieved. An instruction unit executes Single Instruction Multiple Data instructions, including instructions acting on operands representing complex numbers. The instruction unit includes functional blocks that are commonly utilized to execute a plurality of the instructions, wherein the plurality of instructions utilize various individual functional blocks in various combinations with one another. The plurality of instructions is optionally executed in a pipeline fashion.

Type: Grant

Filed: October 4, 2004

Date of Patent: October 11, 2016

Assignee: Broadcom Corporation

Inventors: Mark Taunton, Andrew Jon Dawson
Compiler-controlled region scheduling for SIMD execution of threads

Patent number: 9424038

Abstract: A compiler-controlled technique for scheduling threads to execute different regions of a program. A compiler analyzes program code to determine a control flow graph for the program code. The control flow graph contains regions and directed edges between regions. The regions have associated execution priorities. The directed edges indicate the direction of program control flow. Each region has a thread frontier which contains one or more regions. The compiler inserts one or more update predicate mask variable instructions at the end of a region. The compiler also inserts one or more conditional branch instructions at the end of the region. The conditional branch instructions are arranged in order of execution priority of the regions in the thread frontier of the region, to enforce execution priority of the regions at runtime.

Type: Grant

Filed: December 10, 2012

Date of Patent: August 23, 2016

Assignee: NVIDIA Corporation

Inventors: Gregory Diamos, Mojtaba Mehrara
Apparatus and method for an instruction that determines whether a value is within a range

Patent number: 9411586

Abstract: A method is described that includes performing the following with a single instruction: receiving a first input operand V; receiving a second input operand S; calculating V−S; determining if V−S is positive or negative; and, providing as a resultant: V if V−S is negative; V−S if V−S is positive.

Type: Grant

Filed: December 23, 2011

Date of Patent: August 9, 2016

Assignee: Intel Corporation

Inventors: Thomas R. Craver, Elmoustapha Ould-Ahmed-Vall
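
The single-instruction behaviour described in the abstract for patent 9411586 can be sketched in Python as a conditional subtract: form the difference of V and S, then return V when that difference is negative and the difference otherwise. The function name is hypothetical, and treating a zero difference as "positive" is an assumption, since the abstract does not cover that case.

```python
def conditional_subtract(v: int, s: int) -> int:
    """Return v if v - s is negative, otherwise v - s."""
    diff = v - s
    # Assumption: a zero difference takes the "positive" branch.
    return v if diff < 0 else diff


if __name__ == "__main__":
    print(conditional_subtract(3, 5))  # difference is negative, so V is returned
    print(conditional_subtract(7, 5))  # difference is positive, so V - S is returned
```

For a non-negative V already known to be below 2*S, one such step acts like a single-step reduction of V into the range below S.
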
Add-compare-select instruction

Patent number: 9389854

Abstract: An apparatus includes memory storing an instruction that identifies a first register, a second register, and a third register. Upon execution of the instruction by a processor, a vector addition operation is performed by the processor to add first values from the first register to second values from the second register. A vector subtraction operation is also performed upon execution of the instruction to subtract the second value from third values from the third register. A vector compare operation is also performed upon execution of the instruction to compare results of the vector addition operation to results of the vector subtraction operation.

Type: Grant

Filed: March 15, 2013

Date of Patent: July 12, 2016

Assignee: Qualcomm Incorporated

Inventor: Nico De Laurentiis
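
As a rough illustration of the abstract for patent 9389854, the sketch below models the three registers as Python lists and performs the element-wise add, subtract, and compare in one call. The function name is hypothetical. The final selection of the larger value per lane is an assumption taken from the title ("add-compare-select"); the abstract itself only states that the two results are compared, and the tie-breaking rule here is likewise assumed.

```python
def add_compare_select(first, second, third):
    """Element-wise add, subtract, compare, then pick the larger result per lane."""
    sums = [a + b for a, b in zip(first, second)]    # first + second
    diffs = [c - b for c, b in zip(third, second)]   # third - second
    # Assumption: ties favour the addition result.
    mask = [s >= d for s, d in zip(sums, diffs)]
    selected = [s if m else d for s, d, m in zip(sums, diffs, mask)]
    return selected, mask


if __name__ == "__main__":
    print(add_compare_select([1, 5], [2, 1], [10, 3]))
```
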
Systems, apparatuses, and methods for mapping a source operand to a different range

Patent number: 9389861

Abstract: Embodiments of systems, apparatuses, and methods for performing a range mapping instruction in a computer processor are described. In some embodiments, the execution of a range mapping instruction maps a data element having a source data range to a destination data element having a destination data range and stores the destination data element.

Type: Grant

Filed: December 22, 2011

Date of Patent: July 12, 2016

Assignee: Intel Corporation

Inventors: Elmoustapha Ould-Ahmed-Vall, Thomas Ray Carver
Instruction merging optimization

Patent number: 9298464

Abstract: A computer-implemented method includes determining that two or more instructions of an instruction stream are eligible for optimization. Eligibility is based on a first instruction specifying a first target register and a second instruction specifying the first target register as a source register and a target register. The method includes merging the two or more machine instructions into a single optimized internal instruction that is configured to perform first and second functions of two or more machine instructions employing operands specified by the two or more machine instructions. The single optimized internal instruction specifies the first target register only as a single target register and the single optimized internal instruction specifies the first and second functions to be performed. The method includes executing the single optimized internal instruction to perform the first and second functions of the two or more instructions.

Type: Grant

Filed: March 8, 2013

Date of Patent: March 29, 2016

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Michael K. Gschwind, Valentina Salapura
Instruction merging optimization

Patent number: 9292291

Abstract: A computer system for optimizing instructions is configured to identify two or more machine instructions as being eligible for optimization, to merge the two or more machine instructions into a single optimized internal instruction that is configured to perform functions of the two or more machine instructions, and to execute the single optimized internal instruction to perform the functions of the two or more machine instructions. Being eligible includes determining that the two or more machine instructions include a first instruction specifying a first target register and a second instruction specifying the first target register as a source register and a target register. The second instruction is a next sequential instruction of the first instruction in program order, wherein the first instruction specifies a first function to be performed, and the second instruction specifies a second function to be performed.

Type: Grant

Filed: March 28, 2012

Date of Patent: March 22, 2016

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Michael K. Gschwind, Valentina Salapura
Compiler-controlled region scheduling for SIMD execution of threads

Patent number: 9274792

Abstract: A compiler-controlled technique for scheduling threads to execute different regions of a program. A compiler analyzes program code to determine a control flow graph for the program code. The control flow graph contains regions and directed edges between regions. The regions have associated execution priorities. The directed edges indicate the direction of program control flow. Each region has a thread frontier which contains one or more regions. The compiler inserts one or more update predicate mask variable instructions at the end of a region. The compiler also inserts one or more conditional branch instructions at the end of the region. The conditional branch instructions are arranged in order of execution priority of the regions in the thread frontier of the region, to enforce execution priority of the regions at runtime.

Type: Grant

Filed: December 10, 2012

Date of Patent: March 1, 2016

Assignee: NVIDIA Corporation

Inventors: Gregory Diamos, Mojtaba Mehrara
Image processor that reduces processing load of a software processing unit

Patent number: 9241164

Abstract: In a high-speed mode, a software processing unit notifies a hardware processing unit of settings information about output pictures before the hardware processing unit starts to encode an input picture, and the hardware processing unit performs continuous encoding for the output pictures, based on the settings information notified of by the software processing unit, without a notification signifying a completion for every picture, and upon completion of encoding for all of a specified number of the output pictures, sends an interrupt notification signifying a completion of encoding to the software processing unit.

Type: Grant

Filed: May 22, 2014

Date of Patent: January 19, 2016

Assignee: MegaChips Corporation

Inventor: Kazuhiro Saito
Fusing conditional write instructions having opposite conditions in instruction processing circuits, and related processor systems, methods, and computer-readable media

Patent number: 9195466

Abstract: Fusing conditional write instructions having opposite conditions in instruction processing circuits and related processor systems, methods, and computer-readable media are disclosed. In one embodiment, a first conditional write instruction writing a first value to a target register based on evaluating a first condition is detected by an instruction processing circuit. The circuit also detects a second conditional write instruction writing a second value to the target register based on evaluating a second condition that is a logical opposite of the first condition. Either the first condition or the second condition is selected as a fused instruction condition, and corresponding values are selected as if-true and if-false values. A fused instruction is generated for selectively writing the if-true value to the target register if the fused instruction condition evaluates to true, and selectively writing the if-false value to the target register if the fused instruction condition evaluates to false.

Type: Grant

Filed: November 14, 2012

Date of Patent: November 24, 2015

Assignee: QUALCOMM Incorporated

Inventors: Melinda J. Brown, James Norris Dieffenderfer, Michael Scott McIlvaine, Brian Michael Stempel, Rodney Wayne Smith, Jeffery M. Schottmiller, Andrew S. Irwin, Michael William Morrow
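
The fusion described in the abstract for patent 9195466 can be sketched as follows: two conditional writes to the same target register, guarded by opposite conditions, behave like one select between an if-true and an if-false value. Both function names are hypothetical, and a plain Python integer stands in for the target register.

```python
def unfused(cond: bool, target: int, first_value: int, second_value: int) -> int:
    """Two separate conditional writes with logically opposite conditions."""
    if cond:
        target = first_value       # first conditional write
    if not cond:
        target = second_value      # second conditional write, opposite condition
    return target


def fused(cond: bool, if_true: int, if_false: int) -> int:
    """Single fused write: select one of the two values on one condition."""
    return if_true if cond else if_false


if __name__ == "__main__":
    for cond in (True, False):
        print(cond, unfused(cond, 0, 1, 2), fused(cond, 1, 2))
```

Because exactly one of the two opposite conditions holds, the previous contents of the target register never survive, which is why the fused form does not need it as an input.
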
Processor, system, and method for efficient, high-throughput processing of two-dimensional, interrelated data sets

Patent number: 9183614

Abstract: Systems, processors and methods are disclosed for organizing processing datapaths to perform operations in parallel while executing a single program. Each datapath executes the same sequence of instructions, using a novel instruction sequencing method. Each datapath is implemented through a processor having a data memory partitioned into identical regions. A master processor fetches instructions and conveys them to the datapath processors. All processors are connected serially by an instruction pipeline, such that instructions are executed in parallel datapaths, with execution in each datapath offset in time by one clock cycle from execution in adjacent datapaths. The system includes an interconnection network that enables full sharing of data in both horizontal and vertical dimensions, with the effect of coupling any datapath to the memory of any other datapath without adding processing cycles in common usage.

Type: Grant

Filed: September 4, 2012

Date of Patent: November 10, 2015

Assignee: Mireplica Technology, LLC

Inventor: William M. Johnson
Trusted device having virtualized registers

Patent number: 9171161

Abstract: A trusted device having virtualized registers provides an extensible amount of storage for hash values and other information stored within a trusted device. The trusted device includes a buffer to which registers are virtualized to and from external storage, by encrypting the register values using a private device key. The registers may be platform control registers (PCRs) or other storage of the trusted device, which may be a trusted platform module (TPM). The registers are accessed in accordance with a register number. When the externally stored values are retrieved, they are decrypted and placed in the buffer. The buffer may implement a cache mechanism, such as a most recently used algorithm, so that encryption/decryption and fetch overhead is reduced. A register shadowing technique may be employed at boot time, to ensure that the trusted device is not compromised by tampering with the externally stored virtualized registers.

Type: Grant

Filed: November 9, 2006

Date of Patent: October 27, 2015

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Arun P. Anbalagan, Pruthvi P. Nataraj, Bipin Tomar
Rotate instructions that complete execution without reading carry flag

Patent number: 9164762

Abstract: A method of one aspect may include receiving a rotate instruction. The rotate instruction may indicate a source operand and a rotate amount. A result may be stored in a destination operand indicated by the rotate instruction. The result may have the source operand rotated by the rotate amount. Execution of the rotate instruction may complete without reading a carry flag.

Type: Grant

Filed: July 22, 2013

Date of Patent: October 20, 2015

Assignee: Intel Corporation

Inventors: Vinodh Gopal, James D. Gulilford, Gilbert M. Wolrich, Waidi K. Feghali, Erdinc Ozturk, Martin G. Dixon, Sean P. Mirkes, Bret L. Toll, Maxim Loktyukhin, Mark C. Davis, Alexandre J. Farcy
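
A minimal sketch of the idea in the abstract for patent 9164762: the rotated result is a pure function of the source operand and the rotate amount, with no flag input. The function name, the rotate direction (right), and the 32-bit default width are assumptions for illustration; the abstract does not specify them.

```python
def rotate_right(value: int, amount: int, width: int = 32) -> int:
    """Rotate `value` right by `amount` bits within `width` bits; no carry input."""
    mask = (1 << width) - 1
    amount %= width
    value &= mask
    # Bits shifted out on the right re-enter on the left.
    return ((value >> amount) | (value << (width - amount))) & mask


if __name__ == "__main__":
    print(hex(rotate_right(0x80000001, 1)))
```
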
Early execution of conditional branch instruction with pc operand at which point target is fetched

Patent number: 9135006

Abstract: In accordance with the teachings described herein, systems and methods are provided for advanced execution of branch instructions in a microprocessor pipeline. In one embodiment, a branch instruction of an assembly language program code is executed that includes (i) a condition operand, (ii) a branch destination operand, and (iii) a program count operand. It is determined whether a current program count matches a stored program count operand. After determining that a condition was met when the branch instruction was executed, and in response to determining that the current program count matches the stored program count operand, a destination instruction specified by the stored branch destination operand is fetched.

Type: Grant

Filed: September 5, 2012

Date of Patent: September 15, 2015

Assignee: MARVELL INTERNATIONAL LTD.

Inventors: Li Sha, Ching-Han Tsai, Chi-Kuang Chen, Tzun-Wei Lee
Generating constant for microinstructions from modified immediate field during instruction translation

Patent number: 9128701

Abstract: An ISA-defined instruction includes an immediate field having first and second portions specifying first and second values, which instructs the microprocessor to perform an operation using a constant value as one of its source operands. The constant value is the first value rotated/shifted by a number of bits based on the second value. An instruction translator translates the instruction into one or more microinstructions. An execution pipeline executes the microinstructions generated by the instruction translator. The instruction translator, rather than the execution pipeline, generates the constant value for the execution pipeline as a source operand of at least one of the microinstructions for execution by the execution pipeline.

Type: Grant

Filed: March 9, 2012

Date of Patent: September 8, 2015

Assignee: VIA TECHNOLOGIES, INC.

Inventors: G. Glenn Henry, Terry Parks, Rodney E. Hooker
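
To make the constant generation in the abstract for patent 9128701 concrete, the sketch below expands an immediate field into a 32-bit constant. The field layout is an assumption, not taken from the abstract: a 12-bit field whose low 8 bits are the first value and whose high 4 bits are the second value, with the first value rotated right by twice the second value. The function name is hypothetical.

```python
def expand_immediate(imm12: int) -> int:
    """Expand an assumed 12-bit immediate into a 32-bit constant."""
    first = imm12 & 0xFF             # assumed: low 8 bits hold the first value
    second = (imm12 >> 8) & 0xF      # assumed: high 4 bits hold the second value
    amount = 2 * second              # assumed: rotate amount is twice the second value
    return ((first >> amount) | (first << (32 - amount))) & 0xFFFFFFFF


if __name__ == "__main__":
    print(hex(expand_immediate(0x4FF)))
```

In the arrangement the abstract describes, this expansion would happen once in the instruction translator, so the microinstruction already carries the finished constant when it reaches the execution pipeline.
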
Microarchitecture for floating point fused multiply-add with exponent scaling

Patent number: 9110713

Abstract: Systems and methods for implementing a floating point fused multiply and accumulate with scaling (FMASc) operation. A floating point unit receives input multiplier, multiplicand, addend, and scaling factor operands. A multiplier block is configured to multiply mantissas of the multiplier and multiplicand to generate an intermediate product. Alignment logic is configured to pre-align the addend with the intermediate product based on the scaling factor and exponents of the addend, multiplier, and multiplicand, and accumulation logic is configured to add or subtract a mantissa of the pre-aligned addend with the intermediate product to obtain a result of the floating point unit. Normalization and rounding are performed on the result, avoiding rounding during intermediate stages.

Type: Grant

Filed: August 30, 2012

Date of Patent: August 18, 2015

Assignee: QUALCOMM Incorporated

Inventor: Liang-Kai Wang
ARITHMETIC PROCESSING DEVICE, INFORMATION PROCESSING DEVICE, AND A METHOD OF CONTROLLING THE INFORMATION PROCESSING DEVICE

Publication number: 20150149746

Abstract: An arithmetic processing device promotes transmission efficiency between a processor and a memory. The arithmetic processing device has an arithmetic processing unit which issues an instruction accompanied by data to be sent to the memory; a judgment unit which judges whether a redundancy degree of the data accompanying the instruction is more than a predetermined value; a compression unit which, when the redundancy degree of the data is more than the predetermined value, judges whether to compress the data based on a waiting time and a compression time, and compresses the data when it judges that compression is to be performed; and an instruction arbitration unit which transfers the instruction accompanied by the compressed data to the memory when the compression unit performs the compression, and transfers the instruction accompanied by the non-compressed data to the memory when the compression unit does not perform the compression.

Type: Application

Filed: November 10, 2014

Publication date: May 28, 2015

Inventors: Makoto SUGA, AKIO TOKOYODA, Koji HOSOE, Masatoshi Aihara, Yuta Toyoda
Storing in other queue when reservation station instruction queue reserved for immediate source operand instruction execution unit is full

Patent number: 9043581

Abstract: A processing apparatus includes an execution unit which performs computation on two operand inputs each being selectable between read data from a register and an immediate value. The processing apparatus also includes another execution unit which performs computation on two operand inputs, one of which is selectable between read data from a register and an immediate value, and the other of which is an immediate value. A control unit determines, based on a received instruction specifying a computation on two operands, whether each of the two operands specifies read data from a register or an immediate value. Depending on the determination result, the control unit causes one of the execution units to execute the computation specified by the received instruction.

Type: Grant

Filed: November 14, 2011

Date of Patent: May 26, 2015

Assignee: FUJITSU LIMITED

Inventor: Masaki Ukai
SINGLE INSTRUCTION MULTIPLE DATA ADD PROCESSORS, METHODS, SYSTEMS, AND INSTRUCTIONS

Publication number: 20150134936

Abstract: New instruction definitions for a packet add (PADD) operation and for a single instruction multiple add (SMAD) operation are disclosed. In addition, a new dedicated PADD logic device that performs the PADD operation in about one to two processor clock cycles is disclosed. Also, a new dedicated SMAD logic device that performs a single instruction multiple data add (SMAD) operation in about one to two clock cycles is disclosed.

Type: Application

Filed: January 20, 2015

Publication date: May 14, 2015

Inventors: Corey GEE, Bapiraju VINNAKOTA, Saleem MOHAMMADALI, Carl A. ALBEROLA
ARITHMETIC DEVICE

Publication number: 20150121042

Abstract: According to an embodiment, an arithmetic device includes an arithmetic processing unit, an address generating unit, and a control unit. The arithmetic processing unit performs a plurality of arithmetic processing operations used in an encryption method. Based on an upper bit of the address of the first piece of data and based on an offset which is a value corresponding to a counter value and which is based on the address of the first piece of data, the address generating unit generates addresses of the memory device. The control unit controls the arithmetic processing unit in such a way that the arithmetic processing is done in a sequence determined in the encryption method, and specifies an update of the counter value at a timing of modifying the type of data and at a timing of modifying data.

Type: Application

Filed: January 5, 2015

Publication date: April 30, 2015

Applicant: Kabushiki Kaisha Toshiba

Inventor: Hideo SHIMIZU
COMPUTER AND METHODS FOR SOLVING MATH FUNCTIONS

Publication number: 20150121043

Abstract: Computers and methods for performing mathematical functions are disclosed. An embodiment of a computer includes an operations level and a driver level. The operations level performs mathematical operations. The driver level includes a first lookup table and a second lookup table, wherein the first lookup table includes first data for calculating at least one mathematical function using a first level of accuracy. The second lookup table includes second data for calculating the at least one mathematical function using a second level of accuracy, wherein the first level of accuracy is greater than the second level of accuracy. A driver executes either the first data or the second data depending on a selected level of accuracy.

Type: Application

Filed: October 30, 2013

Publication date: April 30, 2015

Applicant: Texas Instruments Incorporated

Inventors: Kyong Ho Lee, Seok-Jun Lee, Manish Goel
Vector math instruction execution by DSP processor approximating division and complex number magnitude

Patent number: 9015452

Abstract: A digital signal processor (DSP) includes an instruction fetch unit, an instruction decode unit, a register set and a plurality of work units in communication with the instruction decode unit. A first embodiment calculates two divisions on packed numerators and packed denominators. The DSP work units calculate indexes into a 1/d look-up table and make a final sign correction. A second embodiment calculates an approximation of a vector magnitude of a complex number x+jy. The approximation is based upon √(x²+y²) ≈ α*max(|x|, |y|) + β*min(|x|, |y|). The DSP work units calculate the absolute values, find the maxima and minima, and form the packed results of two vector magnitude calculations.

Type: Grant

Filed: February 18, 2010

Date of Patent: April 21, 2015

Assignee: Texas Instruments Incorporated

Inventor: Udayan Dasgupta
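
The magnitude approximation in the abstract for patent 9015452 is a weighted sum of the larger and the smaller of |x| and |y|. The sketch below shows that form in Python. The two weights, called `alpha` and `beta` here, are assumptions: the abstract does not give values, so the defaults use a commonly cited pair that keeps the relative error under about 4 percent. The function name is hypothetical.

```python
import math

# Assumed coefficients (not from the patent): a commonly cited pair that
# minimises the worst-case relative error of the approximation.
ALPHA = 2 * math.cos(math.pi / 8) / (1 + math.cos(math.pi / 8))
BETA = 2 * math.sin(math.pi / 8) / (1 + math.cos(math.pi / 8))


def approx_magnitude(x: float, y: float,
                     alpha: float = ALPHA, beta: float = BETA) -> float:
    """Approximate sqrt(x*x + y*y) without a square root."""
    ax, ay = abs(x), abs(y)
    return alpha * max(ax, ay) + beta * min(ax, ay)


if __name__ == "__main__":
    print(approx_magnitude(3.0, 4.0), math.hypot(3.0, 4.0))
```

The appeal for a DSP is that the work reduces to absolute values, a max/min pair, and two multiply-adds, which matches the steps the abstract assigns to the work units.
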
