Patents by Inventor Amit Gradstein

Amit Gradstein has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Apparatus and method for vector compression

Patent number: 10623015

Abstract: An apparatus and method are described for performing vector compression. For example, one embodiment of a processor comprises: vector compression logic to compress a source vector comprising a plurality of valid data elements and invalid data elements to generate a destination vector in which valid data elements are stored contiguously on one side of the destination vector, the vector compression logic to utilize a bit mask associated with the source vector and comprising a plurality of bits, each bit corresponding to one of the plurality of data elements of the source vector and indicating whether the data element comprises a valid data element or an invalid data element, the vector compression logic to utilize indices of the bit mask and associated bit values of the bit mask to generate a control vector; and shuffle logic to shuffle/permute the data elements of the source vector to the destination vector in accordance with the control vector.

Type: Grant

Filed: March 15, 2018

Date of Patent: April 14, 2020

Assignee: Intel Corporation

Inventors: Simon Rubanovich, David M. Russinoff, Amit Gradstein, John W. O'Leary, Zeev Sperber
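
The two stages named in the abstract of patent 10623015 above (building a control vector from the bit mask, then shuffling the source by it) can be sketched in Python. This is only an illustration: the function names, the use of `None` for unused control slots, packing toward the low-index side, and the zero fill are assumptions, not details taken from the patent.

```python
def build_control_vector(mask_bits):
    """For each destination slot, give the source index to read, or None.

    Indices of set mask bits come first, in their original order.
    (None for the unused slots is an assumption of this sketch.)
    """
    valid = [i for i, bit in enumerate(mask_bits) if bit]
    return valid + [None] * (len(mask_bits) - len(valid))


def shuffle(source, control, fill=0):
    """Permute source elements into the destination according to control."""
    return [fill if idx is None else source[idx] for idx in control]


def compress_vector(source, mask_bits, fill=0):
    """Pack the valid elements contiguously on one side of the result."""
    return shuffle(source, build_control_vector(mask_bits), fill)
```

For example, `compress_vector([10, 20, 30, 40], [1, 0, 1, 1])` returns `[10, 30, 40, 0]`.
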
Processors, methods, systems, and instructions to generate sequences of consecutive integers in numerical order

Patent number: 10565283

Abstract: A method of an aspect includes receiving an instruction indicating a destination storage location. A result is stored in the destination storage location in response to the instruction. The result includes a sequence of at least four consecutive non-negative integers in numerical order. In an aspect, the instruction does not indicate a source packed data operand having a plurality of packed data elements in an architecturally-visible storage location. Other methods, apparatus, systems, and instructions are disclosed.

Type: Grant

Filed: December 22, 2011

Date of Patent: February 18, 2020

Assignee: Intel Corporation

Inventors: Seth Abraham, Robert Valentine, Elmoustapha Ould-Ahmed-Vall, Zeev Sperber, Amit Gradstein
Systems, apparatuses, and methods for fused multiply add

Publication number: 20200026515

Abstract: In some embodiments, packed data elements of first and second packed data source operands are of a first size that differs from a second size of packed data elements of a third packed data operand. Execution circuitry executes a decoded single instruction to perform, for each packed data element position of a destination operand, a multiplication of M N-sized packed data elements from the first and second packed data sources that correspond to a packed data element position of the third packed data source, an addition of the results of these multiplications to a full-sized packed data element of a packed data element position of the third packed data source, and storage of the addition result in a packed data element position of the destination corresponding to the packed data element position of the third packed data source, wherein M is equal to the full-sized packed data element size divided by N.

Type: Application

Filed: October 20, 2016

Publication date: January 23, 2020

Inventors: Robert Valentine, Galina Ryvchin, Piotr Majcher, Mark J. Charney, Elmoustapha Ould-Ahmed-Vall, Jesus Corbal, Milind B. Girkar, Zeev Sperber, Simon Rubanovich, Amit Gradstein
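
The per-position arithmetic in the abstract of publication 20200026515 above can be sketched as follows. This is a rough model under stated assumptions: the function name and the parameter `m` are hypothetical, vectors are plain Python lists, and overflow or saturation behavior (which the abstract does not describe) is ignored because Python integers are unbounded.

```python
def mixed_size_fma(src1, src2, src3, m):
    """For each full-sized element of src3, add m narrow products to it.

    src1 and src2 hold the narrow (N-sized) elements, m per src3 position.
    m corresponds to M = full element size / N in the abstract.
    """
    if len(src1) != len(src2) or len(src1) != m * len(src3):
        raise ValueError("src1/src2 must hold m elements per src3 element")
    result = []
    for pos, accumulator in enumerate(src3):
        lo = pos * m
        products = [a * b for a, b in zip(src1[lo:lo + m], src2[lo:lo + m])]
        result.append(accumulator + sum(products))
    return result
```

With `m = 2`, `mixed_size_fma([1, 2, 3, 4], [5, 6, 7, 8], [100, 200], 2)` returns `[117, 253]`.
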
Efficient implementation of complex vector fused multiply add and complex vector multiply

Patent number: 10521226

Abstract: Disclosed embodiments relate to efficient complex vector multiplication. In one example, an apparatus includes execution circuitry, responsive to an instruction having fields to specify multiplier, multiplicand, and summand complex vectors, to perform two operations: first, to generate a double-even multiplicand by duplicating even elements of the specified multiplicand, and to generate a temporary vector using a fused multiply-add (FMA) circuit having A, B, and C inputs set to the specified multiplier, the double-even multiplicand, and the specified summand, respectively, and second, to generate a double-odd multiplicand by duplicating odd elements of the specified multiplicand, to generate a swapped multiplier by swapping even and odd elements of the specified multiplier, and to generate a result using a second FMA circuit having its even product negated, and having A, B, and C inputs set to the swapped multiplier, the double-odd multiplicand, and the temporary vector, respectively.

Type: Grant

Filed: March 30, 2018

Date of Patent: December 31, 2019

Assignee: Intel Corporation

Inventors: Raanan Sade, Thierry Pons, Amit Gradstein, Zeev Sperber, Mark J. Charney, Robert Valentine, Eyal Oz-Sinay
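
The two operations in the abstract of patent 10521226 above can be traced in Python. The sketch assumes complex vectors are stored interleaved, with the real part at even indices and the imaginary part at odd indices; that layout and the function name are assumptions, and ordinary Python floats do not model the single rounding of a hardware FMA.

```python
def complex_fma(multiplier, multiplicand, summand):
    """Compute multiplier * multiplicand + summand on interleaved vectors."""
    n = len(multiplier)
    # First operation: duplicate even elements, then A * B + C.
    double_even = [multiplicand[i - (i % 2)] for i in range(n)]
    temp = [a * b + c for a, b, c in zip(multiplier, double_even, summand)]
    # Second operation: duplicate odd elements, swap even/odd of multiplier.
    double_odd = [multiplicand[i - (i % 2) + 1] for i in range(n)]
    swapped = [multiplier[i ^ 1] for i in range(n)]
    result = []
    for i in range(n):
        product = swapped[i] * double_odd[i]
        if i % 2 == 0:
            product = -product  # the even product is negated
        result.append(product + temp[i])
    return result
```

For (1 + 2i)(3 + 4i) + (5 + 6i) = 0 + 16i, `complex_fma([1.0, 2.0], [3.0, 4.0], [5.0, 6.0])` returns `[0.0, 16.0]`.
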
Vector multiplication with accumulation in large register space

Patent number: 10514912

Abstract: An apparatus is described having an instruction execution pipeline that has a vector functional unit to support a vector multiply add instruction. The vector multiply add instruction to multiply respective K bit elements of two vectors and accumulate a portion of each of their respective products with another respective input operand in an X bit accumulator, where X is greater than K.

Type: Grant

Filed: September 17, 2018

Date of Patent: December 24, 2019

Assignee: Intel Corporation

Inventors: Shay Gueron, Vlad Krasnov, Robert Valentine, Zeev Sperber, Amit Gradstein, Simon Rubanovich
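
The abstract of patent 10514912 above describes multiplying K-bit elements and accumulating a portion of each product in a wider X-bit accumulator. A minimal Python sketch follows; the default widths (52 and 64), the `high` switch that selects which portion of the product is used, and the wraparound of the accumulator are assumptions made for illustration.

```python
def multiply_accumulate(a, b, acc, k_bits=52, x_bits=64, high=False):
    """Add the low (or high) k_bits of each product to an x_bits accumulator."""
    if x_bits <= k_bits:
        raise ValueError("the accumulator must be wider than the elements")
    elem_mask = (1 << k_bits) - 1
    acc_mask = (1 << x_bits) - 1
    result = []
    for x, y, c in zip(a, b, acc):
        product = (x & elem_mask) * (y & elem_mask)
        portion = (product >> k_bits) if high else (product & elem_mask)
        result.append((c + portion) & acc_mask)  # wraparound is an assumption
    return result
```

With 4-bit elements and an 8-bit accumulator, 15 * 15 = 225 = 0b11100001, so the low portion is 1 and the high portion is 14.
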
Multiply add functional unit capable of executing scale, round, getexp, round, getmant, reduce, range and class instructions

Publication number: 20190361676

Abstract: A method is described that involves executing a first instruction with a functional unit. The first instruction is a multiply-add instruction. The method further includes executing a second instruction with the functional unit. The second instruction is a round instruction.

Type: Application

Filed: June 10, 2019

Publication date: November 28, 2019

Inventors: Cristina S. Anderson, Zeev Sperber, Simon Rubanovich, Benny Eitan, Amit Gradstein
Apparatus and method for down conversion of data types

Patent number: 10474463

Abstract: An apparatus and method are described for down-converting from a source operand to a destination operand with masking. For example, a method according to one embodiment includes the following operations: reading a source operand value to be down-converted from a first value to a down-converted value and stored in a destination location; reading each mask register bit stored in a mask register, the mask register bit(s) indicating whether to perform a masking operation or a conversion operation on the source operand value; if the mask register bit(s) indicates that a masking operation is to be performed, then performing a specified masking operation and storing the results of the masking operation in the destination location; and if the mask register bit indicates that a masking operation is not to be performed, then down-converting the source operand value and storing the down-converted value in the specified destination location.

Type: Grant

Filed: December 23, 2011

Date of Patent: November 12, 2019

Assignee: Intel Corporation

Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Tal Uliel, Jesus Corbal, Zeev Sperber, Amit Gradstein
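
The per-element choice between masking and conversion in the abstract of patent 10474463 above can be sketched as follows. Several points are assumptions of this sketch rather than facts from the patent: a mask bit of 0 selects the masking operation, the masking operation is either zeroing or keeping the old destination value, and the down-conversion is truncation of an integer to 8 bits.

```python
def down_convert_masked(source, mask_bits, dest, zeroing=True):
    """Down-convert unmasked elements; apply the masking operation otherwise."""
    result = []
    for value, bit, old in zip(source, mask_bits, dest):
        if bit:
            result.append(value & 0xFF)  # down-convert by truncating to 8 bits
        else:
            result.append(0 if zeroing else old)  # zeroing or merging mask
    return result
```

For example, `down_convert_masked([0x1234, 0xABCD, 0x00FF], [1, 0, 1], [7, 7, 7])` returns `[0x34, 0, 0xFF]`, and with `zeroing=False` the middle element keeps its old value, 7.
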
Apparatus and method of improved permute instructions

Patent number: 10474459

Abstract: An apparatus is described having instruction execution logic circuitry. The instruction execution logic circuitry has input vector element routing circuitry to perform the following for each of three different instructions: for each of a plurality of output vector element locations, route into an output vector element location an input vector element from one of a plurality of input vector element locations that are available to source the output vector element. The output vector element and each of the input vector element locations are one of three available bit widths for the three different instructions. The apparatus further includes masking layer circuitry coupled to the input vector element routing circuitry to mask a data structure created by the input vector element routing circuitry. The masking layer circuitry is designed to mask at three different levels of granularity that correspond to the three available bit widths.

Type: Grant

Filed: November 9, 2017

Date of Patent: November 12, 2019

Assignee: Intel Corporation

Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L. Toll, Mark J. Charney, Zeev Sperber, Amit Gradstein
Apparatus and method of improved insert instructions

Patent number: 10459728

Abstract: An apparatus is described having instruction execution logic circuitry to execute first, second, third and fourth instructions. Both the first instruction and the second instruction insert a first group of input vector elements to one of multiple first non-overlapping sections of respective first and second resultant vectors. The first group has a first bit width. Each of the multiple first non-overlapping sections has a same bit width as the first group. Both the third instruction and the fourth instruction insert a second group of input vector elements to one of multiple second non-overlapping sections of respective third and fourth resultant vectors. The second group has a second bit width that is larger than said first bit width. Each of the multiple second non-overlapping sections has a same bit width as the second group.

Type: Grant

Filed: November 10, 2017

Date of Patent: October 29, 2019

Assignee: Intel Corporation

Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L. Toll, Mark J. Charney, Zeev Sperber, Amit Gradstein
Efficient implementation of complex vector fused multiply add and complex vector multiply

Publication number: 20190303142

Abstract: Disclosed embodiments relate to efficient complex vector multiplication. In one example, an apparatus includes execution circuitry, responsive to an instruction having fields to specify multiplier, multiplicand, and summand complex vectors, to perform two operations: first, to generate a double-even multiplicand by duplicating even elements of the specified multiplicand, and to generate a temporary vector using a fused multiply-add (FMA) circuit having A, B, and C inputs set to the specified multiplier, the double-even multiplicand, and the specified summand, respectively, and second, to generate a double-odd multiplicand by duplicating odd elements of the specified multiplicand, to generate a swapped multiplier by swapping even and odd elements of the specified multiplier, and to generate a result using a second FMA circuit having its even product negated, and having A, B, and C inputs set to the swapped multiplier, the double-odd multiplicand, and the temporary vector, respectively.

Type: Application

Filed: March 30, 2018

Publication date: October 3, 2019

Inventors: Raanan Sade, Thierry Pons, Amit Gradstein, Zeev Sperber, Mark J. Charney, Robert Valentine, Eyal Oz-Sinay
Processors, methods, systems, and instructions to generate sequences of integers in which integers in consecutive positions differ by a constant integer stride and where a smallest integer is offset from zero by an integer offset

Publication number: 20190286441

Abstract: A method of an aspect includes receiving an instruction. The instruction indicates an integer stride, indicates an integer offset, and indicates a destination storage location. A result is stored in the destination storage location in response to the instruction. The result includes a sequence of at least four integers in numerical order with a smallest one of the at least four integers differing from zero by the integer offset and with all integers of the sequence in consecutive positions differing by the integer stride. Other methods, apparatus, systems, and instructions are disclosed.

Type: Application

Filed: February 8, 2019

Publication date: September 19, 2019

Inventors: Seth Abraham, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Zeev Sperber, Amit Gradstein
Apparatus and method for down-converting and interleaving multiple floating point values

Publication number: 20190220278

Abstract: An apparatus and method for down-converting and interleaving data elements.

Type: Application

Filed: March 27, 2019

Publication date: July 18, 2019

Inventors: Menachem Adelman, Robert Valentine, Barukh Ziv, Amit Gradstein, Simon Rubanovitch, Alexander Heinecke, Evangelos Georganas
Linear memory address transformation and management

Patent number: 10324857

Abstract: A processing device includes a linear address transformation circuit to determine that a metadata value stored in a portion of a linear address falls within a pre-defined metadata range. The metadata value corresponds to a plurality of metadata bits. The linear address transformation circuit is to replace each of the plurality of metadata bits with a constant value.

Type: Grant

Filed: January 26, 2017

Date of Patent: June 18, 2019

Assignee: Intel Corporation

Inventors: Joseph Nuzman, Raanan Sade, Igor Yanover, Ron Gabor, Amit Gradstein
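
The range check and bit replacement in the abstract of patent 10324857 above can be sketched in Python. The bit positions (six metadata bits starting at bit 57 of a 64-bit address), the names, and the choice to leave out-of-range addresses unchanged are all assumptions of this sketch; the abstract specifies none of them.

```python
META_SHIFT = 57  # assumed position of the lowest metadata bit
META_BITS = 6    # assumed number of metadata bits
META_MASK = ((1 << META_BITS) - 1) << META_SHIFT


def transform_linear_address(addr, meta_lo, meta_hi, constant_bit=0):
    """Replace every metadata bit with constant_bit if the value is in range."""
    metadata = (addr & META_MASK) >> META_SHIFT
    if not (meta_lo <= metadata <= meta_hi):
        return addr  # out-of-range handling is an assumption
    cleared = addr & ~META_MASK
    return cleared | (META_MASK if constant_bit else 0)
```

For example, an address carrying metadata 0x15 over the offset 0x1000 is transformed to 0x1000 when the pre-defined range is 0x10 to 0x1F.
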
Packed rotate processors, methods, systems, and instructions

Patent number: 10324718

Abstract: A method of an aspect includes receiving a masked packed rotate instruction. The instruction indicates a first source packed data including a plurality of packed data elements, a packed data operation mask having a plurality of mask elements, at least one rotation amount, and a destination storage location. A result packed data is stored in the destination storage location in response to the instruction. The result packed data includes result data elements that each correspond to a different one of the mask elements in a corresponding relative position. Result data elements that are not masked out by the corresponding mask element include one of the data elements of the first source packed data in a corresponding position that has been rotated. Result data elements that are masked out by the corresponding mask element include a masked out value. Other methods, apparatus, systems, and instructions are disclosed.

Type: Grant

Filed: January 8, 2018

Date of Patent: June 18, 2019

Assignee: Intel Corporation

Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal San Andrian, Suleyman Sair, Bret L. Toll, Zeev Sperber, Amit Gradstein, Asaf Rubinstein
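
The masked packed rotate in the abstract of patent 10324718 above can be sketched as follows. The rotation direction (left), the single shared rotation amount, the 32-bit default element width, and the zero used as the masked-out value are assumptions chosen for illustration.

```python
def masked_rotate_left(source, mask_bits, amount, width=32, masked_value=0):
    """Rotate each unmasked element left; masked-out elements get masked_value."""
    lane_mask = (1 << width) - 1
    amount %= width
    result = []
    for value, bit in zip(source, mask_bits):
        if bit:
            value &= lane_mask
            rotated = ((value << amount) | (value >> (width - amount))) & lane_mask
            result.append(rotated)
        else:
            result.append(masked_value)
    return result
```

For example, `masked_rotate_left([0x80000001, 0x1], [1, 0], 1)` returns `[0x00000003, 0]`: the first element is rotated, and the second is masked out.
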
Multiply add functional unit capable of executing scale, round, getexp, round, getmant, reduce, range and class instructions

Patent number: 10318244

Abstract: A method is described that involves executing a first instruction with a functional unit. The first instruction is a multiply-add instruction. The method further includes executing a second instruction with the functional unit. The second instruction is a round instruction.

Type: Grant

Filed: March 27, 2017

Date of Patent: June 11, 2019

Assignee: Intel Corporation

Inventors: Cristina S. Anderson, Zeev Sperber, Simon Rubanovich, Benny Eitan, Amit Gradstein
Systems, apparatuses, and methods for performing a double blocked sum of absolute differences

Patent number: 10303471

Abstract: Embodiments of systems, apparatuses, and methods for performing in a computer processor vector double block packed sum of absolute differences (SAD) in response to a single vector double block packed sum of absolute differences instruction that includes a destination vector register operand, first and second source operands, an immediate, and an opcode are described.

Type: Grant

Filed: February 28, 2017

Date of Patent: May 28, 2019

Assignee: Intel Corporation

Inventors: Elmoustapha Ould-Ahmed-Vall, Mostafa Hagog, Robert Valentine, Amit Gradstein, Simon Rubanovich, Zeev Sperber
Floating point scaling processors, methods, systems, and instructions

Patent number: 10275216

Abstract: A method of an aspect includes receiving a floating point scaling instruction. The floating point scaling instruction indicates a first source including one or more floating point data elements, a second source including one or more corresponding floating point data elements, and a destination. A result is stored in the destination in response to the floating point scaling instruction. The result includes one or more corresponding result floating point data elements each including a corresponding floating point data element of the second source multiplied by a base of the one or more floating point data elements of the first source raised to a power of an integer representative of the corresponding floating point data element of the first source. Other methods, apparatus, systems, and instructions are disclosed.

Type: Grant

Filed: March 30, 2018

Date of Patent: April 30, 2019

Assignee: Intel Corporation

Inventors: Cristina S. Anderson, Amit Gradstein, Robert Valentine, Simon Rubanovich, Benny Eitan
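
The scaling described in the abstract of patent 10275216 above multiplies each element of the second source by the base raised to an integer derived from the matching element of the first source. A minimal Python sketch, assuming a base of 2 and assuming that the "integer representative" is the floor of the first-source element (both assumptions, as is the function name):

```python
import math


def scale_elements(first_source, second_source):
    """Return second * 2**floor(first) for each pair of elements."""
    return [
        math.ldexp(value, math.floor(exponent))
        for exponent, value in zip(first_source, second_source)
    ]
```

For example, `scale_elements([3.7, -1.0], [1.5, 8.0])` returns `[12.0, 4.0]`, since 1.5 * 2**3 = 12 and 8 * 2**-1 = 4.
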
Vector multiplication with accumulation in large register space

Publication number: 20190114169

Abstract: An apparatus is described having an instruction execution pipeline that has a vector functional unit to support a vector multiply add instruction. The vector multiply add instruction to multiply respective K bit elements of two vectors and accumulate a portion of each of their respective products with another respective input operand in an X bit accumulator, where X is greater than K.

Type: Application

Filed: September 17, 2018

Publication date: April 18, 2019

Inventors: Shay Gueron, Vlad Krasnov, Robert Valentine, Zeev Sperber, Amit Gradstein, Simon Rubanovich
Processors, methods, systems, and instructions to generate sequences of integers in which integers in consecutive positions differ by a constant integer stride and where a smallest integer is offset from zero by an integer offset

Publication number: 20190102187

Abstract: A method of an aspect includes receiving an instruction. The instruction indicates an integer stride, indicates an integer offset, and indicates a destination storage location. A result is stored in the destination storage location in response to the instruction. The result includes a sequence of at least four integers in numerical order with a smallest one of the at least four integers differing from zero by the integer offset and with all integers of the sequence in consecutive positions differing by the integer stride. Other methods, apparatus, systems, and instructions are disclosed.

Type: Application

Filed: September 30, 2017

Publication date: April 4, 2019

Applicant: Intel Corporation

Inventors: Seth Abraham, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Zeev Sperber, Amit Gradstein
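
The result described in the abstract of publication 20190102187 above is a sequence fixed entirely by an offset and a stride. A short Python sketch, in which the function name, the default length of eight elements, and the restriction to a positive stride are assumptions:

```python
def strided_sequence(offset, stride, num_elements=8):
    """Return offset, offset + stride, offset + 2*stride, ... in order."""
    if num_elements < 4:
        raise ValueError("the abstract describes at least four integers")
    if stride <= 0:
        raise ValueError("this sketch assumes a positive stride")
    return [offset + i * stride for i in range(num_elements)]
```

For example, `strided_sequence(3, 2, 4)` returns `[3, 5, 7, 9]`, and an offset of 0 with a stride of 1 gives consecutive integers starting at zero.
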
Systems and methods for performing instructions to convert to 16-bit floating-point format

Publication number: 20190079762

Abstract: Disclosed embodiments relate to systems and methods for performing instructions to convert to 16-bit floating-point format. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode and locations of a first source vector comprising N single-precision elements, and a destination vector comprising at least N 16-bit floating-point elements, the opcode to indicate execution circuitry is to convert each of the elements of the specified source vector to 16-bit floating-point, the conversion to include truncation and rounding, as necessary, and to store each converted element into a corresponding location of the specified destination vector, decode circuitry to decode the fetched instruction, and execution circuitry to respond to the decoded instruction as specified by the opcode.

Type: Application

Filed: November 9, 2018

Publication date: March 14, 2019

Inventors: Alexander F. Heinecke, Robert Valentine, Mark J. Charney, Raanan Sade, Menachem Adelman, Zeev Sperber, Amit Gradstein, Simon Rubanovich
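
The abstract of publication 20190079762 above describes converting single-precision elements to a 16-bit floating-point format with truncation and rounding, but does not name the format. The Python sketch below assumes a format that keeps the upper 16 bits of the single-precision encoding (as bfloat16 does) and assumes round-to-nearest-even; it does not special-case NaN inputs.

```python
import struct


def float32_to_16bit(value):
    """Return the 16-bit pattern for value, rounding to nearest even."""
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    # Bias so that dropping the low 16 bits rounds to nearest, ties to even.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding_bias) >> 16) & 0xFFFF


def convert_vector(source):
    """Convert each single-precision element to its 16-bit pattern."""
    return [float32_to_16bit(v) for v in source]
```

For example, `convert_vector([1.0, -2.0])` returns `[0x3F80, 0xC000]`.
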
