Patents by Inventor Zeev Sperber

Zeev Sperber has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Apparatus and method for vector compression

Patent number: 9929745

Abstract: An apparatus and method are described for performing vector compression. For example, one embodiment of a processor comprises: vector compression logic to compress a source vector comprising a plurality of valid data elements and invalid data elements to generate a destination vector in which valid data elements are stored contiguously on one side of the destination vector, the vector compression logic to utilize a bit mask associated with the source vector and comprising a plurality of bits, each bit corresponding to one of the plurality of data elements of the source vector and indicating whether the data element comprises a valid data element or an invalid data element, the vector compression logic to utilize indices of the bit mask and associated bit values of the bit mask to generate a control vector; and shuffle logic to shuffle/permute the data elements of the source vector to the destination vector in accordance with the control vector.

Type: Grant

Filed: September 26, 2014

Date of Patent: March 27, 2018

Assignee: Intel Corporation

Inventors: Simon Rubanovich, David M. Russinoff, Amit Gradstein, John W. O'Leary, Zeev Sperber
APPARATUS AND METHOD OF IMPROVED EXTRACT INSTRUCTIONS

Publication number: 20180081689

Abstract: An apparatus is described that includes instruction execution circuitry to execute first, second, third, and fourth instructions, the first and second instructions select a first group of input vector elements from one of multiple first non-overlapping sections of respective first and second input vectors. Each of the multiple first non-overlapping sections have a same bit width as the first group. Both the third and fourth instructions select a second group of input vector elements from one of multiple second non-overlapping sections of respective third and fourth input vectors. The second group has a second bit width that is larger than the first bit width. Each of multiple second non-overlapping sections have a same bit width as the second group. The apparatus includes masking layer circuitry to mask the first and second groups at a first granularity and second granularity.

Type: Application

Filed: November 10, 2017

Publication date: March 22, 2018

Inventors: ELMOUSTAPHA OULD-AHMED-VALL, ROBERT VALENTINE, JESUS CORBAL, BRET L. TOLL, MARK J. CHARNEY, ZEEV SPERBER, AMIT GRADSTEIN
APPARATUS AND METHOD OF IMPROVED INSERT INSTRUCTIONS

Publication number: 20180074825

Abstract: An apparatus is described having instruction execution logic circuitry to execute first, second, third and fourth instruction. Both the first instruction and the second instruction insert a first group of input vector elements to one of multiple first non overlapping sections of respective first and second resultant vectors. The first group has a first bit width. Each of the multiple first non overlapping sections have a same bit width as the first group. Both the third instruction and the fourth instruction insert a second group of input vector elements to one of multiple second non overlapping sections of respective third and fourth resultant vectors. The second group has a second bit width that is larger than said first bit width. Each of the multiple second non overlapping sections have a same bit width as the second group.

Type: Application

Filed: November 10, 2017

Publication date: March 15, 2018

Inventors: ELMOUSTAPHA OULD-AHMED-VALL, ROBERT VALENTINE, JESUS CORBAL, BRET L. TOLL, MARK J. CHARNEY, ZEEV SPERBER, AMIT GRADSTEIN
APPARATUS AND METHOD OF IMPROVED PERMUTE INSTRUCTIONS

Publication number: 20180074822

Abstract: An apparatus is described having instruction execution logic circuitry. The instruction execution logic circuitry has input vector element routing circuitry to perform the following for each of three different instructions: for each of a plurality of output vector element locations, route into an output vector element location an input vector element from one of a plurality of input vector element locations that are available to source the output vector element. The output vector element and each of the input vector element locations are one of three available bit widths for the three different instructions. The apparatus further includes masking layer circuitry coupled to the input vector element routing circuitry to mask a data structure created by the input vector routing element circuitry. The masking layer circuitry is designed to mask at three different levels of granularity that correspond to the three available bit widths.

Type: Application

Filed: November 9, 2017

Publication date: March 15, 2018

Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L. Toll, Mark J. Charney, Zeev Sperber, Amit Gradstein
APPARATUS AND METHOD OF IMPROVED PERMUTE INSTRUCTIONS

Publication number: 20180067742

Abstract: An apparatus is described having instruction execution logic circuitry. The instruction execution logic circuitry has input vector element routing circuitry to perform the following for each of three different instructions: for each of a plurality of output vector element locations, route into an output vector element location an input vector element from one of a plurality of input vector element locations that are available to source the output vector element. The output vector element and each of the input vector element locations are one of three available bit widths for the three different instructions. The apparatus further includes masking layer circuitry coupled to the input vector element routing circuitry to mask a data structure created by the input vector routing element circuitry. The masking layer circuitry is designed to mask at three different levels of granularity that correspond to the three available bit widths.

Type: Application

Filed: November 9, 2017

Publication date: March 8, 2018

Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L. Toll, Mark J. Charney, Zeev Sperber, Amit Gradstein
Packed data rearrangement control indexes generation processors, methods, systems and instructions

Patent number: 9904547

Abstract: A method of an aspect includes receiving a packed data rearrangement control indexes generation instruction. The packed data rearrangement control indexes generation instruction indicates a destination storage location. A result is stored in the destination storage location in response to the packed data rearrangement control indexes generation instruction. The result includes a sequence of at least four non-negative integers representing packed data rearrangement control indexes. In an aspect, values of the at least four non-negative integers are not calculated using a result of a preceding instruction. Other methods, apparatus, systems, and instructions are disclosed.

Type: Grant

Filed: December 22, 2011

Date of Patent: February 27, 2018

Assignee: Intel Corporation

Inventors: Elmoustapha Ould-Ahmed-Vall, Seth Abraham, Robert Valentine, Zeev Sperber, Amit Gradstein
Processors, methods, systems, and instructions to generate sequences of integers in which integers in consecutive positions differ by a constant integer stride and where a smallest integer is offset from zero by an integer offset

Patent number: 9898283

Abstract: A method of an aspect includes receiving an instruction. The instruction indicates an integer stride, indicates an integer offset, and indicates a destination storage location. A result is stored in the destination storage location in response to the instruction. The result includes a sequence of at least four integers in numerical order with a smallest one of the at least four integers differing from zero by the integer offset and with all integers of the sequence in consecutive positions differing by the integer stride. Other methods, apparatus, systems, and instructions are disclosed.

Type: Grant

Filed: December 22, 2011

Date of Patent: February 20, 2018

Assignee: Intel Corporation

Inventors: Seth Abraham, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Zeev Sperber, Amit Gradstein
Method and apparatus for performing logical compare operations

Patent number: 9898285

Abstract: A method and apparatus for including in a processor instructions for performing logical-comparison and branch support operations on packed or unpacked data. In one embodiment, instruction decode logic decodes instructions for an execution unit to operate on packed data elements including logical comparisons. A register file including 128-bit packed data registers stores packed single-precision floating point (SPFP) and packed integer data elements. The logical comparisons may include comparison of SPFP data elements and comparison of integer data elements and setting at least one bit to indicate the results. Based on these comparisons, branch support actions are taken. Such branch support actions may include setting the at least one bit, which in turn may be utilized by a branching unit in response to a branch instruction. Alternatively, the branch support actions may include branching to an indicated target code location.

Type: Grant

Filed: November 7, 2016

Date of Patent: February 20, 2018

Assignee: Intel Corporation

Inventors: Rajiv Kapoor, Ronen Zohar, Mark Buxton, Zeev Sperber, Koby Gottlieb
Packed rotate processors, methods, systems, and instructions

Patent number: 9864602

Abstract: A method of an aspect includes receiving a masked packed rotate instruction. The instruction indicates a first source packed data including a plurality of packed data elements, a packed data operation mask having a plurality of mask elements, at least one rotation amount, and a destination storage location. A result packed data is stored in the destination storage location in response to the instruction. The result packed data includes result data elements that each correspond to a different one of the mask elements in a corresponding relative position. Result data elements that are not masked out by the corresponding mask element include one of the data elements of the first source packed data in a corresponding position that has been rotated. Result data elements that are masked out by the corresponding mask element include a masked out value. Other methods, apparatus, systems, and instructions are disclosed.

Type: Grant

Filed: December 30, 2011

Date of Patent: January 9, 2018

Assignee: Intel Corporation

Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal San Andrian, Suleyman Sair, Bret L. Toll, Zeev Sperber, Amit Gradstein, Asaf Rubenstein
Performing Local Power Gating In A Processor

Publication number: 20170371397

Abstract: In an embodiment, the present invention includes an execution unit to execute instructions of a first type, a local power gate circuit coupled to the execution unit to power gate the execution unit while a second execution unit is to execute instructions of a second type, and a controller coupled to the local power gate circuit to cause it to power gate the execution unit when an instruction stream does not include the first type of instructions. Other embodiments are described and claimed.

Type: Application

Filed: July 12, 2017

Publication date: December 28, 2017

Inventors: Nadav Bonen, Ron Gabor, Zeev Sperber, Vjekoslav Svilan, David N. Mackintosh, Jose A. Baiocchi Paredes, Naveen Kumar, Shantanu Gupta
APPARATUS AND METHOD OF IMPROVED INSERT INSTRUCTIONS

Publication number: 20170357510

Abstract: An apparatus is described having instruction execution logic circuitry to execute first, second, third and fourth instruction. Both the first instruction and the second instruction insert a first group of input vector elements to one of multiple first non overlapping sections of respective first and second resultant vectors. The first group has a first bit width. Each of the multiple first non overlapping sections have a same bit width as the first group. Both the third instruction and the fourth instruction insert a second group of input vector elements to one of multiple second non overlapping sections of respective third and fourth resultant vectors. The second group has a second bit width that is larger than said first bit width. Each of the multiple second non overlapping sections have a same bit width as the second group.

Type: Application

Filed: August 3, 2017

Publication date: December 14, 2017

Inventors: ELMOUSTAPHA OULD-AHMED-VALL, ROBERT VALENTINE, JESUS CORBAL SAN ADRIAN, BRET L. TOLL, MARK J. CHARNEY, ZEEV SPERBER, AMIT GRADSTEIN
SCATTER USING INDEX ARRAY AND FINITE STATE MACHINE

Publication number: 20170351641

Abstract: Methods and apparatus are disclosed using an index array and finite state machine for scatter/gather operations. Embodiment of apparatus may comprise: decode logic to decode scatter/gather instructions and generate micro-operations. An index array holds a set of indices and a corresponding set of mask elements. A finite state machine facilitates the scatter operation. Address generation logic generates an address from an index of the set of indices for at least each of the corresponding mask elements having a first value. Storage is allocated in a buffer for each of the set of addresses being generated. Data elements corresponding to the set of addresses being generated are copied to the buffer. Addresses from the set are accessed to store data elements if a corresponding mask element has said first value and the mask element is changed to a second value responsive to completion of their respective stores.

Type: Application

Filed: April 18, 2017

Publication date: December 7, 2017

Inventors: ZEEV SPERBER, ROBERT VALENTINE, SHLOMO RAIKIN, STANISLAV SHWARTSMAN, GAL OFIR, IGOR YANOVER, GUY PATKIN, OFER LEVY
APPARATUS AND METHOD OF IMPROVED INSERT INSTRUCTIONS

Publication number: 20170329605

Abstract: An apparatus is described having instruction execution logic circuitry to execute first, second, third and fourth instruction. Both the first instruction and the second instruction insert a first group of input vector elements to one of multiple first non overlapping sections of respective first and second resultant vectors. The first group has a first bit width. Each of the multiple first non overlapping sections have a same bit width as the first group. Both the third instruction and the fourth instruction insert a second group of input vector elements to one of multiple second non overlapping sections of respective third and fourth resultant vectors. The second group has a second bit width that is larger than said first bit width. Each of the multiple second non overlapping sections have a same bit width as the second group.

Type: Application

Filed: August 3, 2017

Publication date: November 16, 2017

Inventors: ELMOUSTAPHA OULD-AHMED-VALL, ROBERT VALENTINE, JESUS CORBAL SAN ADRIAN, BRET L. TOLL, MARK J. CHARNEY, ZEEV SPERBER, AMIT GRADSTEIN
APPARATUS AND METHOD OF IMPROVED INSERT INSTRUCTIONS

Publication number: 20170300332

Abstract: An apparatus is described having instruction execution logic circuitry to execute first, second, third and fourth instruction. Both the first instruction and the second instruction insert a first group of input vector elements to one of multiple first non overlapping sections of respective first and second resultant vectors. The first group has a first bit width. Each of the multiple first non overlapping sections have a same bit width as the first group. Both the third instruction and the fourth instruction insert a second group of input vector elements to one of multiple second non overlapping sections of respective third and fourth resultant vectors. The second group has a second bit width that is larger than said first bit width. Each of the multiple second non overlapping sections have a same bit width as the second group.

Type: Application

Filed: March 31, 2017

Publication date: October 19, 2017

Inventors: ELMOUSTAPHA OULD-AHMED-VALL, ROBERT VALENTINE, JESUS CORBAL SAN ADRIAN, BRET L. TOLL, MARK J. CHARNEY, ZEEV SPERBER, AMIT GRADSTEIN
Performing local power gating in a processor

Patent number: 9772674

Abstract: In an embodiment, the present invention includes an execution unit to execute instructions of a first type, a local power gate circuit coupled to the execution unit to power gate the execution unit while a second execution unit is to execute instructions of a second type, and a controller coupled to the local power gate circuit to cause it to power gate the execution unit when an instruction stream does not include the first type of instructions. Other embodiments are described and claimed.

Type: Grant

Filed: December 7, 2015

Date of Patent: September 26, 2017

Assignee: Intel Corporation

Inventors: Nadav Bonen, Ron Gabor, Zeev Sperber, Vjekoslav Svilan, David N. Mackintosh, Jose A. Baiocchi Paredes, Naveen Kumar, Shantanu Gupta
IN-LANE VECTOR SHUFFLE INSTRUCTIONS

Publication number: 20170269934

Abstract: In-lane vector shuffle operations are described. In one embodiment a shuffle instruction specifies a field of per-lane control bits, a source operand and a destination operand, these operands having corresponding lanes, each lane divided into corresponding portions of multiple data elements. Sets of data elements are selected from corresponding portions of every lane of the source operand according to per-lane control bits. Elements of these sets are copied to specified fields in corresponding portions of every lane of the destination operand. Another embodiment of the shuffle instruction also specifies a second source operand, all operands having corresponding lanes divided into multiple data elements. A set selected according to per-lane control bits contains data elements from every lane portion of a first source operand and data elements from every corresponding lane portion of the second source operand. Set elements are copied to specified fields in every lane of the destination operand.

Type: Application

Filed: June 5, 2017

Publication date: September 21, 2017

Applicant: Intel Corporation

Inventors: Zeev Sperber, Robert Valentine, Benny Eitan, Doron Orenstein
APPARATUS AND METHOD OF IMPROVED PERMUTE INSTRUCTIONS

Publication number: 20170262282

Abstract: An apparatus is described having instruction execution logic circuitry. The instruction execution logic circuitry has input vector element routing circuitry to perform the following for each of three different instructions: for each of a plurality of output vector element locations, route into an output vector element location an input vector element from one of a plurality of input vector element locations that are available to source the output vector element. The output vector element and each of the input vector element locations are one of three available bit widths for the three different instructions. The apparatus further includes masking layer circuitry coupled to the input vector element routing circuitry to mask a data structure created by the input vector routing element circuitry. The masking layer circuitry is designed to mask at three different levels of granularity that correspond to the three available bit widths.

Type: Application

Filed: May 22, 2017

Publication date: September 14, 2017

Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L. Toll, Mark J. Charney, Zeev Sperber, Amit Gradstein
Gather using index array and finite state machine

Patent number: 9753889

Abstract: Methods and apparatus are disclosed for using an index array and finite state machine for scatter/gather operations. Embodiment of apparatus may comprise: decode logic to decode a scatter/gather instruction and generate a set of micro-operations, and an index array to hold a set of indices and a corresponding set of mask elements. A finite state machine facilitates the gather operation. Address generation logic generates an address from an index of the set of indices for at least each of the corresponding mask elements having a first value. An address is accessed to load a corresponding data element if the mask element had the first value. The data element is written at an in-register position in a destination vector register according to a respective in-register position the index. Values of corresponding mask elements are changed from the first value to a second value responsive to completion of their respective loads.

Type: Grant

Filed: October 12, 2015

Date of Patent: September 5, 2017

Assignee: Intel Corporation

Inventors: Zeev Sperber, Robert Valentine, Guy Patkin, Stanislav Shwartsman, Shlomo Raikin, Igor Yanover, Gal Ofir
APPARATUS AND METHOD OF IMPROVED EXTRACT INSTRUCTIONS

Publication number: 20170242704

Abstract: An apparatus is described that includes instruction execution circuitry to execute first, second, third, and fourth instructions, the first and second instructions select a first group of input vector elements from one of multiple first non-overlapping sections of respective first and second input vectors. Each of the multiple first non-overlapping sections have a same bit width as the first group. Both the third and fourth instructions select a second group of input vector elements from one of multiple second non-overlapping sections of respective third and fourth input vectors. The second group has a second bit width that is larger than the first bit width. Each of multiple second non-overlapping sections have a same bit width as the second group. The apparatus includes masking layer circuitry to mask the first and second groups at a first granularity a second granularity.

Type: Application

Filed: March 7, 2017

Publication date: August 24, 2017

Inventors: ELMOUSTAPHA OULD-AHMED-VALL, ROBERT VALENTINE, JESUS CORBAL, BRET L. TOLL, MARK J. CHARNEY, ZEEV SPERBER, AMIT GRADSTEIN
SYSTEMS, APPARATUSES, AND METHODS FOR PERFORMING A DOUBLE BLOCKED SUM OF ABSOLUTE DIFFERENCES

Publication number: 20170242694

Abstract: Embodiments of systems, apparatuses, and methods for performing in a computer processor vector double block packed sum of absolute differences (SAD) in response to a single vector double block packed sum of absolute differences instruction that includes a destination vector register operand, first and second source operands, an immediate, and an opcode are described.

Type: Application

Filed: February 28, 2017

Publication date: August 24, 2017

Inventors: ELMOUSTAPHA OULD-AHMED-VALL, MOSTAFA HAGOG, ROBERT VALENTINE, AMIT GRADSTEIN, SIMON RUBANOVICH, ZEEV SPERBER

prev … 9 10 11 12 13 14 15 16 17 … next