Patents by Inventor Amit Gradstein

Amit Gradstein has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

APPARATUS AND METHOD FOR PERFORMING A PERMUTE OPERATION

Publication number: 20150026440

Abstract: An apparatus and method are described for permuting data elements with masking. For example, a method according to one embodiment includes the following operations: reading values from a mask data structure to determine whether masking is implemented for each data element of a destination operand; if masking is not implemented for a particular data element, then selecting data elements from the destination operand and a second source operand based on index values stored in a first source operand to be copied to data element positions within the destination operand, wherein any one of the data elements from either the destination operand or the second source operand may be copied to any one of the data element positions within the destination operand; if masking is implemented for a particular data element of the destination operand, then performing a designated masking operation with respect to that particular data element.
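
The selection step in this abstract can be illustrated with a minimal sketch. It is not taken from the application: the function name `masked_permute`, the list-based operands, the index wrap-around, and the choice of "zero or keep the old value" as the designated masking operation are all assumptions made for illustration.

```python
def masked_permute(dest, index_src, src2, masked, zeroing=False):
    """Illustrative two-source permute with per-element masking.

    dest, src2 : lists of equal length n (the two data sources)
    index_src  : list of n indices into the 2n-entry table dest + src2
    masked     : list of n booleans; True means masking applies to that slot
    zeroing    : assumed masking operation (True = write 0, False = keep dest)
    """
    n = len(dest)
    table = list(dest) + list(src2)  # any of the 2n elements may be selected
    result = []
    for i in range(n):
        if masked[i]:
            # Assumed masking operation: zero the slot or keep its old value.
            result.append(0 if zeroing else dest[i])
        else:
            # Assumption: out-of-range indices wrap around the table.
            result.append(table[index_src[i] % (2 * n)])
    return result


example = masked_permute(
    dest=[10, 11, 12, 13],
    index_src=[7, 0, 4, 3],
    src2=[20, 21, 22, 23],
    masked=[False, False, True, False],
)
```

In this example, slots 0, 1, and 3 are filled from the combined table and slot 2 keeps its previous value because masking applies to it.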

Type: Application

Filed: December 23, 2011

Publication date: January 22, 2015

Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Mostafa Hagog, Jesus Corbal, Tal Uliel, Zeev Sperber, Amit Gradstein

FUSED MULTIPLY ADD OPERATIONS USING BIT MASKS

Publication number: 20140379773

Abstract: Systems and methods of performing fused multiply add (FMA) operations are provided. In one embodiment, the length of the adder used by the FMA operation is less than 3*N, where N is the number of bits in the mantissa term of a floating point number. A mask may be used to perform the addition portion of the FMA operation using the adder. A second mask may be used to denormalize the result of the addition portion of the FMA operation if an underflow occurs.

Type: Application

Filed: June 25, 2013

Publication date: December 25, 2014

Inventors: Simon Rubanovich, Thierry Pons, Amit Gradstein, Zeev Sperber

Multiply add functional unit capable of executing scale, round, GETEXP, round, GETMANT, reduce, range and class instructions

Patent number: 8914430

Abstract: A method is described that involves executing a first instruction with a functional unit. The first instruction is a multiply-add instruction. The method further includes executing a second instruction with the functional unit. The second instruction is a round instruction.

Type: Grant

Filed: September 24, 2010

Date of Patent: December 16, 2014

Assignee: Intel Corporation

Inventors: Amit Gradstein, Cristina S. Anderson, Zeev Sperber, Simon Rubanovich, Benny Eitan

SYSTEMS, APPARATUSES, AND METHODS FOR PERFORMING A HORIZONTAL PARTIAL SUM IN RESPONSE TO A SINGLE INSTRUCTION

Publication number: 20140365747

Abstract: Embodiments of systems, apparatuses, and methods for performing in a computer processor vector packed horizontal partial sum of packed data elements in response to a single vector packed horizontal sum instruction that includes a destination vector register operand, a source vector register operand, and an opcode are described.

Type: Application

Filed: December 23, 2011

Publication date: December 11, 2014

Inventors: Elmoustapha Ould-Ahmed-Vall, Moustapha Hagog, Robert Valentine, Amit Gradstein, Simon Rubanovich, Zeev Sperber, Boris Ginzburg, Ziv Aviv

SYSTEMS, APPARATUSES, AND METHODS FOR PERFORMING CONVERSION OF A MASK REGISTER INTO A VECTOR REGISTER

Publication number: 20140223138

Abstract: Embodiments of systems, apparatuses, and methods for performing in a computer processor conversion of a mask register into a vector register in response to a single vector packed convert a mask register to a vector register instruction that includes a destination vector register operand, a source writemask register operand, and an opcode are described.

Type: Application

Filed: December 23, 2011

Publication date: August 7, 2014

Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L. Toll, Mark J. Charney, Amit Gradstein, Zeev Sperber

MATH CIRCUIT FOR ESTIMATING A TRANSCENDENTAL FUNCTION

Publication number: 20140222883

Abstract: A math circuit for computing an estimate of a transcendental function is described. A lookup table storage circuit has stored therein several groups of binary values, where each group of values represents a respective coefficient of a first polynomial that estimates the function to a high precision. A computing circuit uses a portion of a binary value, that is also taken from one of the groups of values, to evaluate a second polynomial that estimates the function to a low precision. Other embodiments are also described and claimed.

Type: Application

Filed: December 21, 2011

Publication date: August 7, 2014

Inventors: Jose-Alejandro Pineiro, Simon Rubanovich, Benny Eitan, Amit Gradstein, Thomas D. Fletcher

APPARATUS AND METHOD FOR DOWN CONVERSION OF DATA TYPES

Publication number: 20140208080

Abstract: An apparatus and method are described for down-converting from a source operand to a destination operand with masking. For example, a method according to one embodiment includes the following operations: reading a source operand value to be down-converted from a first value to a down-converted value and stored in a destination location; reading each mask register bit stored in a mask register, the mask register bit(s) indicating whether to perform a masking operation or a conversion operation on the source operand value; if the mask register bit(s) indicates that a masking operation is to be performed, then performing a specified masking operation and storing the results of the masking operation in the destination location; and if the mask register bit(s) indicates that a masking operation is not to be performed, then down-converting the source operand value and storing the down-converted value in the specified destination location.
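
The per-element choice between converting and masking can be sketched as follows. This is an illustration only: the abstract does not say how the value is narrowed or what the masking operation is, so truncation to the low `out_bits` bits, the zero-or-keep masking behavior, and every name below are assumptions.

```python
def masked_down_convert(src, dest, do_mask, out_bits=8, zeroing=True):
    """Illustrative down-conversion with per-element masking.

    src      : source values (for example 32-bit unsigned integers)
    dest     : current destination values, used when a masked slot is kept
    do_mask  : booleans; True means perform the masking operation instead
    out_bits : assumed width of the down-converted value
    zeroing  : assumed masking operation (True = write 0, False = keep dest)
    """
    low_bits = (1 << out_bits) - 1
    result = []
    for s, d, m in zip(src, dest, do_mask):
        if m:
            result.append(0 if zeroing else d)
        else:
            # Assumption: down-conversion keeps only the low out_bits bits.
            result.append(s & low_bits)
    return result


converted = masked_down_convert(
    src=[0x12345678, 0x000000FF, 0x000001FF],
    dest=[1, 2, 3],
    do_mask=[False, True, False],
)
```

A real design might saturate rather than truncate; truncation is used here only because it is the simplest narrowing rule to show.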

Type: Application

Filed: December 23, 2011

Publication date: July 24, 2014

Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Tal Uliel, Jesus Corbal, Zeev Sperber, Amit Gradstein

SYSTEMS, APPARATUSES, AND METHODS FOR PERFORMING A BUTTERFLY HORIZONTAL AND CROSS ADD OR SUBTRACT IN RESPONSE TO A SINGLE INSTRUCTION

Publication number: 20140201502

Abstract: Embodiments of systems, apparatuses, and methods for performing in a computer processor vector packed butterfly horizontal cross add or subtract of packed data elements in response to a single vector packed butterfly horizontal cross add or subtract instruction that includes a destination vector register operand, a source vector register operand, an immediate, and an opcode are described.

Type: Application

Filed: December 23, 2011

Publication date: July 17, 2014

Inventors: Elmoustapha Ould-Ahmed-Vall, Mostafa Hagog, Robert Valentine, Amit Gradstein, Simon Rubanovich, Zeev Sperber

Leading Change Anticipator Logic

Publication number: 20140188967

Abstract: In one embodiment, a processor includes at least one floating point unit. The at least one floating point unit may include an adder, leading change anticipator (LCA) logic, and a shifter. The adder may be to add a first operand X and a second operand Y to obtain an output operand having a bit length n. The LCA logic may be to: for each bit position i from n-1 to 1, obtain a set of propagation values and a set of bit values based on the first operand X and the second operand Y; and generate a LCA mask based on the set of propagation values and the set of bit values. The shifter may be to normalize the output operand based on the LCA mask. Other embodiments are described and claimed.

Type: Application

Filed: December 28, 2012

Publication date: July 3, 2014

Inventors: Simon Rubanovich, Thierry Pons, Amit Gradstein, Zeev Sperber

Performing reciprocal instructions with high accuracy

Patent number: 8706789

Abstract: In one embodiment, the present invention includes a method for receiving a reciprocal instruction and an operand in a processor, accessing an entry of a lookup table based on a portion of the operand and the instruction, generating an encoder output based on a type of the reciprocal instruction and whether the reciprocal instruction is a legacy instruction, and selecting portions of the lookup table entry and input operand to be provided to a reciprocal logic unit based on the encoder output. Other embodiments are described and claimed.

Type: Grant

Filed: December 22, 2010

Date of Patent: April 22, 2014

Assignee: Intel Corporation

Inventors: Zeev Sperber, Cristina S. Anderson, Benny Eitan, Simon Rubanovich, Amit Gradstein

SYSTEMS, APPARATUSES, AND METHODS FOR PERFORMING AN ABSOLUTE DIFFERENCE CALCULATION BETWEEN CORRESPONDING PACKED DATA ELEMENTS OF TWO VECTOR REGISTERS

Publication number: 20140082333

Abstract: Embodiments of systems, apparatuses, and methods for performing in a computer processor absolute difference calculation in response to a single vector packed absolute difference instruction that includes a first and second source vector register operand, a destination vector register operand, and an opcode are described.

Type: Application

Filed: December 22, 2011

Publication date: March 20, 2014

Inventors: Elmoustapha Ould-Ahmed-Vall, Mostafa Hagog, Robert Valentine, Amit Gradstein, Simon Rubanovich, Zeev Sperber

APPARATUS AND METHOD FOR BROADCASTING FROM A GENERAL PURPOSE REGISTER TO A VECTOR REGISTER

Publication number: 20140059322

Abstract: An apparatus and method are described for broadcasting from a general purpose source register to a destination vector register. For example, a method according to one embodiment includes the following operations: selecting data element position N within the destination vector register to be updated; broadcasting a set of data from the general purpose source register to data element position N within the destination vector register if a mask indicator is set to a first indication; and either copying zeroes to data element position N within the destination vector register or maintaining existing values stored within data element position N within the destination vector register if the mask indicator is set to a second indication.
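
The three outcomes named in this abstract (broadcast, zero, or keep the existing value) can be shown in a few lines. This is a rough sketch rather than the claimed method: the name `masked_broadcast`, the boolean mask, and the `zeroing` flag that selects between the two masked behaviors are assumptions.

```python
def masked_broadcast(gpr_value, dest, mask, zeroing=False):
    """Illustrative broadcast of one scalar into every enabled vector slot.

    gpr_value : the scalar held in the general purpose source register
    dest      : current contents of the destination vector
    mask      : booleans; True stands for the "first indication" (broadcast)
    zeroing   : for slots where mask is False, write 0 (True) or keep (False)
    """
    return [
        gpr_value if enabled else (0 if zeroing else old)
        for old, enabled in zip(dest, mask)
    ]


merged = masked_broadcast(7, [1, 2, 3, 4], [True, False, True, False])
zeroed = masked_broadcast(7, [1, 2, 3, 4], [True, False, True, False], zeroing=True)
```

Here `merged` keeps the old values in the disabled slots, while `zeroed` clears them.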

Type: Application

Filed: December 23, 2011

Publication date: February 27, 2014

Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L. Toll, Mark J. Charney, Zeev Sperber, Amit Gradstein

PACKED ROTATE PROCESSORS, METHODS, SYSTEMS, AND INSTRUCTIONS

Publication number: 20140040604

Abstract: A method of an aspect includes receiving a masked packed rotate instruction. The instruction indicates a first source packed data including a plurality of packed data elements, a packed data operation mask having a plurality of mask elements, at least one rotation amount, and a destination storage location. A result packed data is stored in the destination storage location in response to the instruction. The result packed data includes result data elements that each correspond to a different one of the mask elements in a corresponding relative position. Result data elements that are not masked out by the corresponding mask element include one of the data elements of the first source packed data in a corresponding position that has been rotated. Result data elements that are masked out by the corresponding mask element include a masked out value. Other methods, apparatus, systems, and instructions are disclosed.
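
A masked packed rotate of this kind can be sketched in software as below. The sketch makes several assumptions that the abstract leaves open: elements are 32 bits wide, the rotation is to the left, each element has its own rotation amount, and the masked-out value is either zero or the previous destination value. All names are hypothetical.

```python
def rotate_left(value, amount, width=32):
    """Rotate an unsigned `width`-bit integer left by `amount` bits."""
    amount %= width
    limit = (1 << width) - 1
    value &= limit
    return ((value << amount) | (value >> (width - amount))) & limit


def masked_packed_rotate(src, amounts, active, dest=None, zeroing=True, width=32):
    """Illustrative masked packed rotate.

    src     : packed data elements to rotate
    amounts : rotation amount for each element
    active  : booleans; False means the element is masked out
    dest    : previous destination values, needed only when zeroing is False
    """
    result = []
    for i, (value, amount, is_active) in enumerate(zip(src, amounts, active)):
        if is_active:
            result.append(rotate_left(value, amount, width))
        else:
            # Assumed masked-out value: zero, or the old destination element.
            result.append(0 if zeroing else dest[i])
    return result


rotated = masked_packed_rotate(
    src=[0x80000001, 0x12345678, 0x0000FFFF],
    amounts=[1, 8, 16],
    active=[True, False, True],
)
```

The first element wraps its top bit around to bit 0, the second is masked out, and the third moves its low half into the high half.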

Type: Application

Filed: December 30, 2011

Publication date: February 6, 2014

Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal San Adrian, Suleyman Sair, Bret L. Toll, Zeev Sperber, Amit Gradstein, Asaf Rubenstein

SYSTEMS, APPARATUSES, AND METHODS FOR PERFORMING A DOUBLE BLOCKED SUM OF ABSOLUTE DIFFERENCES

Publication number: 20140019713

Abstract: Embodiments of systems, apparatuses, and methods for performing in a computer processor vector double block packed sum of absolute differences (SAD) in response to a single vector double block packed sum of absolute differences instruction that includes a destination vector register operand, first and second source operands, an immediate, and an opcode are described.

Type: Application

Filed: December 23, 2011

Publication date: January 16, 2014

Inventors: Elmoustapha Ould-Ahmed-Vall, Mostafa Hagog, Robert Valentine, Amit Gradstein, Simon Rubanovich, Zeev Sperber

SYSTEMS, APPARATUSES, AND METHODS FOR PERFORMING A HORIZONTAL ADD OR SUBTRACT IN RESPONSE TO A SINGLE INSTRUCTION

Publication number: 20140013075

Abstract: Embodiments of systems, apparatuses, and methods for performing in a computer processor vector packed horizontal add or subtract of packed data elements in response to a single vector packed horizontal add or subtract instruction that includes a destination vector register operand, a source vector register operand, and an opcode are described.

Type: Application

Filed: December 23, 2011

Publication date: January 9, 2014

Inventors: Mostafa Hagog, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Amit Gradstein, Simon Rubanovich, Zeev Sperber

VECTOR MULTIPLICATION WITH ACCUMULATION IN LARGE REGISTER SPACE

Publication number: 20140006755

Abstract: An apparatus is described having an instruction execution pipeline that has a vector functional unit to support a vector multiply add instruction. The vector multiply add instruction to multiply respective K bit elements of two vectors and accumulate a portion of each of their respective products with another respective input operand in an X bit accumulator, where X is greater than K.

Type: Application

Filed: June 29, 2012

Publication date: January 2, 2014

Inventors: Shay Gueron, Vlad Krasnov, Robert Valentine, Zeev Sperber, Amit Gradstein, Simon Rubanovich

Method and apparatus for optimizing advanced encryption standard (AES) encryption and decryption in parallel modes of operation

Patent number: 8600049

Abstract: The throughput of an encryption/decryption operation is increased in a system having a pipelined execution unit. Different independent encryptions (decryptions) of different data blocks may be performed in parallel by dispatching an AES round instruction in every cycle.

Type: Grant

Filed: May 10, 2012

Date of Patent: December 3, 2013

Assignee: Intel Corporation

Inventors: Shay Gueron, Amit Gradstein, Zeev Sperber

APPARATUS AND METHOD OF IMPROVED PERMUTE INSTRUCTIONS

Publication number: 20130290687

Abstract: An apparatus is described having instruction execution logic circuitry. The instruction execution logic circuitry has input vector element routing circuitry to perform the following for each of three different instructions: for each of a plurality of output vector element locations, route into an output vector element location an input vector element from one of a plurality of input vector element locations that are available to source the output vector element. The output vector element and each of the input vector element locations are one of three available bit widths for the three different instructions. The apparatus further includes masking layer circuitry coupled to the input vector element routing circuitry to mask a data structure created by the input vector routing element circuitry. The masking layer circuitry is designed to mask at three different levels of granularity that correspond to the three available bit widths.

Type: Application

Filed: December 23, 2011

Publication date: October 31, 2013

Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L. Toll, Mark J. Charney, Zeev Sperber, Amit Gradstein

FLOATING POINT ROUNDING PROCESSORS, METHODS, SYSTEMS, AND INSTRUCTIONS

Publication number: 20130290685

Abstract: A method of an aspect includes receiving a floating point rounding instruction. The floating point rounding instruction indicates a source of one or more floating point data elements, indicates a number of fraction bits after a radix point that each of the one or more floating point data elements are to be rounded to, and indicates a destination storage location. A result is stored in the destination storage location in response to the floating point rounding instruction. The result includes one or more rounded result floating point data elements. Each of the one or more rounded result floating point data elements includes one of the floating point data elements of the source, in a corresponding position, which has been rounded to the indicated number of fraction bits. Other methods, apparatus, systems, and instructions are disclosed.

Type: Application

Filed: December 22, 2011

Publication date: October 31, 2013

Inventors: Jesus Corbal San Adrian, Cristina S. Anderson, Robert Valentine, Bret Toll, Amit Gradstein, Simon Rubanovich, Benny Eitan

PROCESSORS, METHODS, SYSTEMS, AND INSTRUCTIONS TO GENERATE SEQUENCES OF INTEGERS IN NUMERICAL ORDER THAT DIFFER BY A CONSTANT STRIDE

Publication number: 20130283019

Abstract: A method of an aspect includes receiving an instruction indicating a destination storage location. A result is stored in the destination storage location in response to the instruction. The result includes a sequence of at least four non-negative integers in numerical order with all integers in consecutive positions differing by a constant stride of at least two. In an aspect, storing the result including the sequence of the at least four integers is performed without calculating the at least four integers using a result of a preceding instruction. Other methods, apparatus, systems, and instructions are disclosed.
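
The shape of the result described here can be illustrated with a short sketch. It models only the stored sequence, not the instruction encoding: the function name, the `start` parameter, and the ascending order are assumptions, while the limits of at least four elements and a stride of at least two come from the abstract.

```python
def stride_sequence(count, stride, start=0):
    """Illustrative constant-stride sequence of non-negative integers.

    The values depend only on the arguments, mirroring the abstract's point
    that the result does not rely on the result of a preceding instruction.
    """
    if count < 4:
        raise ValueError("the abstract describes at least four integers")
    if stride < 2:
        raise ValueError("the abstract describes a stride of at least two")
    if start < 0:
        raise ValueError("the integers must be non-negative")
    return [start + i * stride for i in range(count)]


even_indices = stride_sequence(count=8, stride=2)
every_fourth = stride_sequence(count=4, stride=4, start=1)
```

A sequence such as `even_indices` is the kind of index vector that could feed a permute or gather of every other element.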

Type: Application

Filed: December 22, 2011

Publication date: October 24, 2013

Inventors: Elmoustapha Ould-Ahmed-Vall, Seth Abraham, Robert Valentine, Zeev Sperber, Amit Gradstein
