Patents by Inventor Zeev Sperber

Zeev Sperber has filed for patents to protect the following inventions. This listing includes pending patent applications as well as patents already granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10120684
    Abstract: Logic is provided to receive and execute a mask move instruction to transfer unmasked data elements of a vector data element including a plurality of packed data elements from a source location to a destination location, subject to mask information for the instruction. The logic is to execute a speculative full width operation, and if an exception occurs is to perform operations sequentially or one at a time. Other embodiments are described and claimed.
    Type: Grant
    Filed: March 11, 2013
    Date of Patent: November 6, 2018
    Assignee: Intel Corporation
    Inventors: Doron Orenstein, Zeev Sperber, Robert Valentine, Benny Eitan
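The entry above describes a masked move with a speculative full-width pass and a sequential per-element fallback on an exception. Below is a minimal Python sketch of those semantics over plain lists; the function name mask_move and the IndexError-based fault model are illustrative assumptions, not the patented hardware.

```python
def mask_move(src, dst, mask):
    """Copy src[i] into dst[i] only where mask[i] is set.

    First attempt a "full width" read of the whole vector; if that faults,
    fall back to handling the unmasked elements one at a time, mirroring the
    speculative-then-sequential behavior described in the abstract.
    """
    try:
        # Speculative full-width pass: read every element up front.
        snapshot = [src[i] for i in range(len(mask))]
    except IndexError:
        # Fallback: process elements sequentially, touching only unmasked ones.
        for i, m in enumerate(mask):
            if m:
                dst[i] = src[i]
        return dst

    for i, m in enumerate(mask):
        if m:
            dst[i] = snapshot[i]
    return dst


print(mask_move([1, 2, 3, 4], [0, 0, 0, 0], [1, 0, 1, 0]))  # [1, 0, 3, 0]
```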
  • Publication number: 20180309461
    Abstract: An apparatus and method are described for performing vector compression. For example, one embodiment of a processor comprises: vector compression logic to compress a source vector comprising a plurality of valid data elements and invalid data elements to generate a destination vector in which valid data elements are stored contiguously on one side of the destination vector, the vector compression logic to utilize a bit mask associated with the source vector and comprising a plurality of bits, each bit corresponding to one of the plurality of data elements of the source vector and indicating whether the data element comprises a valid data element or an invalid data element, the vector compression logic to utilize indices of the bit mask and associated bit values of the bit mask to generate a control vector; and shuffle logic to shuffle/permute the data elements of the source vector to the destination vector in accordance with the control vector.
    Type: Application
    Filed: March 15, 2018
    Publication date: October 25, 2018
    Inventors: Simon Rubanovich, David M. Russinoff, Amit Gradstein, John W. O'Leary, Zeev Sperber
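Publication 20180309461 (and the granted patent 9929745 later in this listing) compresses the valid elements of a source vector to one side of the destination via a mask-derived control vector. A rough Python model, assuming zero-fill for the invalid positions:

```python
def compress(src, mask):
    """Pack the elements of src whose mask bit is 1 contiguously at the low end."""
    # Control vector: the index of the i-th set mask bit tells where the
    # i-th valid element comes from in the source.
    control = [i for i, bit in enumerate(mask) if bit]
    # Shuffle/permute stage: gather by the control vector, pad with zeros.
    return [src[c] for c in control] + [0] * (len(src) - len(control))


print(compress([10, 11, 12, 13], [0, 1, 0, 1]))  # [11, 13, 0, 0]
```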
  • Patent number: 10095516
    Abstract: An apparatus is described having an instruction execution pipeline that has a vector functional unit to support a vector multiply add instruction. The vector multiply add instruction to multiply respective K bit elements of two vectors and accumulate a portion of each of their respective products with another respective input operand in an X bit accumulator, where X is greater than K.
    Type: Grant
    Filed: June 29, 2012
    Date of Patent: October 9, 2018
    Assignee: Intel Corporation
    Inventors: Shay Gueron, Vlad Krasnov, Robert Valentine, Zeev Sperber, Amit Gradstein, Simon Rubanovich
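A small Python model of the wide-accumulator multiply-add idea in patent 10095516, assuming for simplicity that the whole 2K-bit product (rather than a selected portion of it) is accumulated into each X-bit lane; the lane widths and wrap-around behavior shown are illustrative.

```python
def vec_mul_add(a, b, acc, x_bits=32):
    """Multiply narrow lanes of a and b and accumulate into wider x_bits lanes of acc."""
    wrap = (1 << x_bits) - 1
    out = []
    for ai, bi, ci in zip(a, b, acc):
        prod = ai * bi                  # product of two K-bit elements (up to 2K bits)
        out.append((ci + prod) & wrap)  # accumulate in the X-bit lane, where X > K
    return out


print(vec_mul_add([200, 100], [300, 50], [1, 2]))  # [60001, 5002]
```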
  • Patent number: 10095522
    Abstract: A processor includes a core, a memory subsystem, a predictor module, and a memory rename module. The predictor module may include a first logic to identify a dependency between a store instruction and a load instruction, and a second logic to assign a memory renaming (MRN) register to the store instruction and the load instruction based on the identified dependency. Further, the memory rename module may include a third logic to copy, based on the assigned MRN register, information in a first logical register associated with the store instruction directly to a second logical register associated with the load instruction.
    Type: Grant
    Filed: December 23, 2014
    Date of Patent: October 9, 2018
    Assignee: Intel Corporation
    Inventors: Kamil Garifullin, Stanislav Shwartsman, Lihu Rappoport, Zeev Sperber, Pavel I. Kryukov, Andrey Kluchnikov, Igor Yanover, George Leifman, Alex Gerber, Jared W. Stark
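The memory-renaming scheme in patent 10095522 forwards a store's source register directly to a dependent load's destination register. The toy Python model below uses a dictionary keyed by address as a stand-in for the predictor's MRN assignment; the register names and data structures are hypothetical.

```python
# Hypothetical register file and MRN table; the address-keyed dict stands in
# for the memory-renaming register assigned by the predictor module.
regs = {"r1": 42, "r2": 0}
mrn = {}


def store(addr, src_reg):
    # Predictor pairs this store with a later load to the same address and
    # records the stored register's value in the assigned MRN slot.
    mrn[addr] = regs[src_reg]


def load(addr, dst_reg):
    # Memory renaming: copy the value register-to-register, bypassing memory.
    if addr in mrn:
        regs[dst_reg] = mrn[addr]


store(0x100, "r1")
load(0x100, "r2")
print(regs["r2"])  # 42, forwarded without a memory round trip
```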
  • Publication number: 20180253308
    Abstract: A method of an aspect includes receiving a masked packed rotate instruction. The instruction indicates a first source packed data including a plurality of packed data elements, a packed data operation mask having a plurality of mask elements, at least one rotation amount, and a destination storage location. A result packed data is stored in the destination storage location in response to the instruction. The result packed data includes result data elements that each correspond to a different one of the mask elements in a corresponding relative position. Result data elements that are not masked out by the corresponding mask element include one of the data elements of the first source packed data in a corresponding position that has been rotated. Result data elements that are masked out by the corresponding mask element include a masked out value. Other methods, apparatus, systems, and instructions are disclosed.
    Type: Application
    Filed: January 8, 2018
    Publication date: September 6, 2018
    Applicant: Intel Corporation
    Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal San Andrian, Suleyman Sair, Bret L. Toll, Zeev Sperber, Amit Gradstein, Asaf Rubinstein
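A short Python sketch of the masked packed-rotate semantics from publication 20180253308, assuming 8-bit elements, a single shared rotation amount, and zero as the masked-out value (merge masking would instead keep the prior destination element).

```python
def masked_rotate(src, mask, rot, width=8):
    """Rotate each width-bit element left by rot bits; masked-out lanes become 0."""
    full = (1 << width) - 1
    out = []
    for elem, m in zip(src, mask):
        if m:
            out.append(((elem << rot) | (elem >> (width - rot))) & full)
        else:
            out.append(0)  # masked-out value (zero masking assumed here)
    return out


print(masked_rotate([0b10000001, 0xFF, 0x0F, 0x01], [1, 0, 1, 1], rot=1))
# [3, 0, 30, 2]
```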
  • Publication number: 20180181395
    Abstract: A method and apparatus for including in a processor instructions for performing logical-comparison and branch support operations on packed or unpacked data. In one embodiment, instruction decode logic decodes instructions for an execution unit to operate on packed data elements including logical comparisons. A register file including 128-bit packed data registers stores packed single-precision floating point (SPFP) and packed integer data elements. The logical comparisons may include comparison of SPFP data elements and comparison of integer data elements and setting at least one bit to indicate the results. Based on these comparisons, branch support actions are taken. Such branch support actions may include setting the at least one bit, which in turn may be utilized by a branching unit in response to a branch instruction. Alternatively, the branch support actions may include branching to an indicated target code location.
    Type: Application
    Filed: January 31, 2018
    Publication date: June 28, 2018
    Inventors: Rajiv Kapoor, Ronen Zohar, Mark Buxton, Zeev Sperber, Koby Gottlieb
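Publication 20180181395 couples packed comparisons to branch decisions through a flag bit. The Python fragment below models that flow for packed single-precision values; the function names and the all-equal condition are illustrative choices, not the instruction set's definition.

```python
def packed_compare_and_branch(a, b, taken, not_taken):
    """Compare packed elements, set a flag bit, and branch on it."""
    # Element-wise comparison over the packed lanes.
    all_equal = all(x == y for x, y in zip(a, b))
    # The "at least one bit" of the abstract is modeled as a single flag here.
    flag = 1 if all_equal else 0
    # Branch support: pick a target code location based on the flag.
    return taken() if flag else not_taken()


print(packed_compare_and_branch([1.0, 2.0], [1.0, 2.0],
                                lambda: "taken", lambda: "fall through"))
```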
  • Patent number: 9996320
    Abstract: An example processor includes a register and a fused multiply-add (FMA) low functional unit. The register stores first, second, and third floating point (FP) values. The FMA low functional unit receives a request to perform an FMA low operation: multiplies the first FP value with the second FP value to obtain a first product value; adds the first product value with the third FP value to generate a first result value; rounds the first result value to generate a first FMA value; multiplies the first FP value with the second FP value to obtain a second product value; adds the second product value with the third FP value to generate a second result value; and subtracts the FMA value from the second result value to obtain a third result value, which is then normalized and rounded to produce the FMA low result and sent to an application.
    Type: Grant
    Filed: December 23, 2015
    Date of Patent: June 12, 2018
    Assignee: Intel Corporation
    Inventors: Cristina S. Anderson, Marius A. Cornea-Hasegan, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Nikita Astafev, Mark J. Charney, Milind B. Girkar, Amit Gradstein, Simon Rubanovich, Zeev Sperber
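Patent 9996320 computes the low, rounded-off part of a fused multiply-add. The Python sketch below reproduces the idea in software by evaluating a*b + c exactly with rationals, rounding once to get the ordinary FMA result, and rounding the residual to get the FMA low value; the hardware uses the two datapath passes described in the abstract rather than exact arithmetic.

```python
from fractions import Fraction


def fma_low(a, b, c):
    """Return (fma_hi, fma_lo): the rounded FMA result and its rounding error."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)  # exact a*b + c
    fma_hi = float(exact)                            # rounded to nearest double
    fma_lo = float(exact - Fraction(fma_hi))         # residual, normalized and rounded
    return fma_hi, fma_lo


# fma_lo captures the product bits lost when the FMA result is rounded.
print(fma_low(0.1, 0.1, 0.0))
```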
  • Patent number: 9996127
    Abstract: A processor and method are described for performing proactive throttling of execution unit ports. For example, one embodiment of a processor core comprises: a plurality of execution unit ports within an execution stage of the processor core; a scheduler unit to schedule execution of a plurality of operations to the plurality of execution unit ports; and proactive throttling logic to limit acceleration of execution of the operations by the ports to an acceleration level which does not result in significant power supply droops.
    Type: Grant
    Filed: March 12, 2014
    Date of Patent: June 12, 2018
    Assignee: Intel Corporation
    Inventors: Omer Vikinski, Igor Yanover, Gavri Berger, Gabi Malka, Zeev Sperber
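Patent 9996127 limits how quickly execution-port activity may ramp up so that the power supply does not droop. The following Python sketch models that policy as a per-cycle cap on the increase in busy ports; the cycle granularity and the max_step value are assumptions for illustration.

```python
def throttle_schedule(requested_per_cycle, max_step=1):
    """Limit how fast the number of busy execution ports may ramp up per cycle."""
    allowed = []
    current = 0
    for want in requested_per_cycle:
        # Ramp up by at most max_step ports per cycle; ramp down freely.
        current = min(want, current + max_step) if want > current else want
        allowed.append(current)
    return allowed


print(throttle_schedule([0, 6, 6, 6, 1, 5]))  # [0, 1, 2, 3, 1, 2]
```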
  • Patent number: 9996319
    Abstract: An example processor includes a register and an ADD low functional unit. The register stores first, second, and third floating point (FP) values. The ADD low functional unit receives a request to perform an ADD low operation and, responsive to the request: adds the first FP value with the second FP value to obtain a first sum value; rounds the first sum value to generate an ADD value; adds the first FP value with the second FP value to obtain a second sum value; subtracts the ADD value from the second sum value to generate a difference value; normalizes the difference value to obtain a normalized difference value; rounds the normalized difference value to generate an ADD low value; and sends the ADD low value to an application.
    Type: Grant
    Filed: December 23, 2015
    Date of Patent: June 12, 2018
    Assignee: Intel Corporation
    Inventors: Cristina S. Anderson, Marius A. Cornea-Hasegan, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Nikita Astafev, Mark J. Charney, Milind B. Girkar, Amit Gradstein, Simon Rubanovich, Zeev Sperber
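Patent 9996319 is the addition counterpart of the FMA low patent above: it produces the rounding error of a floating-point sum. A compact software analogue, again using exact rationals in place of the two hardware adder passes:

```python
from fractions import Fraction


def add_low(a, b):
    """Return (add_hi, add_lo): the rounded sum of a and b and its rounding error."""
    exact = Fraction(a) + Fraction(b)
    add_hi = float(exact)                     # ADD value: sum rounded to a double
    add_lo = float(exact - Fraction(add_hi))  # difference, normalized and rounded
    return add_hi, add_lo


print(add_low(1e16, 1.0))  # (1e+16, 1.0): the +1 is entirely captured in add_lo
```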
  • Publication number: 20180129266
    Abstract: In an embodiment, the present invention includes an execution unit to execute instructions of a first type, a local power gate circuit coupled to the execution unit to power gate the execution unit while a second execution unit is to execute instructions of a second type, and a controller coupled to the local power gate circuit to cause it to power gate the execution unit when an instruction stream does not include the first type of instructions. Other embodiments are described and claimed.
    Type: Application
    Filed: December 21, 2017
    Publication date: May 10, 2018
    Inventors: Nadav Bonen, Ron Gabor, Zeev Sperber, Vjekoslav Svilan, David N. Mackintosh, Jose A. Baiocchi Paredes, Naveen Kumar, Shantanu Gupta
  • Publication number: 20180129265
    Abstract: In an embodiment, the present invention includes an execution unit to execute instructions of a first type, a local power gate circuit coupled to the execution unit to power gate the execution unit while a second execution unit is to execute instructions of a second type, and a controller coupled to the local power gate circuit to cause it to power gate the execution unit when an instruction stream does not include the first type of instructions. Other embodiments are described and claimed.
    Type: Application
    Filed: December 21, 2017
    Publication date: May 10, 2018
    Inventors: Nadav Bonen, Ron Gabor, Zeev Sperber, Vjekoslav Svilan, David N. Mackintosh, Jose A. Baiocchi Paredes, Naveen Kumar, Shantanu Gupta
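The two publications above (20180129266 and 20180129265) share an abstract about power-gating an execution unit while the instruction stream contains none of its instruction type. The toy Python controller below captures that decision rule; the class name and string-tagged instruction types are purely illustrative.

```python
class PowerGateController:
    """Toy controller: gate a unit off while no instruction of its type is pending."""

    def __init__(self, unit_type):
        self.unit_type = unit_type
        self.powered = True

    def observe(self, instruction_stream):
        # Gate the unit when the stream contains none of its instruction type;
        # power it back up as soon as such an instruction appears.
        self.powered = any(op == self.unit_type for op in instruction_stream)
        return self.powered


avx_gate = PowerGateController("avx")
print(avx_gate.observe(["int", "int", "load"]))   # False -> unit power-gated
print(avx_gate.observe(["int", "avx", "store"]))  # True  -> unit powered
```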
  • Publication number: 20180121198
    Abstract: In-lane vector shuffle operations are described. In one embodiment a shuffle instruction specifies a field of per-lane control bits, a source operand and a destination operand, these operands having corresponding lanes, each lane divided into corresponding portions of multiple data elements. Sets of data elements are selected from corresponding portions of every lane of the source operand according to per-lane control bits. Elements of these sets are copied to specified fields in corresponding portions of every lane of the destination operand. Another embodiment of the shuffle instruction also specifies a second source operand, all operands having corresponding lanes divided into multiple data elements. A set selected according to per-lane control bits contains data elements from every lane portion of a first source operand and data elements from every corresponding lane portion of the second source operand. Set elements are copied to specified fields in every lane of the destination operand.
    Type: Application
    Filed: November 2, 2017
    Publication date: May 3, 2018
    Applicant: Intel Corporation
    Inventors: Zeev Sperber, Robert Valentine, Benny Eitan, Doron Orenstein
  • Publication number: 20180113710
    Abstract: In-lane vector shuffle operations are described. In one embodiment a shuffle instruction specifies a field of per-lane control bits, a source operand and a destination operand, these operands having corresponding lanes, each lane divided into corresponding portions of multiple data elements. Sets of data elements are selected from corresponding portions of every lane of the source operand according to per-lane control bits. Elements of these sets are copied to specified fields in corresponding portions of every lane of the destination operand. Another embodiment of the shuffle instruction also specifies a second source operand, all operands having corresponding lanes divided into multiple data elements. A set selected according to per-lane control bits contains data elements from every lane portion of a first source operand and data elements from every corresponding lane portion of the second source operand. Set elements are copied to specified fields in every lane of the destination operand.
    Type: Application
    Filed: December 21, 2017
    Publication date: April 26, 2018
    Applicant: Intel Corporation
    Inventors: Zeev Sperber, Robert Valentine, Benny Eitan, Doron Orenstein
  • Publication number: 20180113712
    Abstract: In-lane vector shuffle operations are described. In one embodiment a shuffle instruction specifies a field of per-lane control bits, a source operand and a destination operand, these operands having corresponding lanes, each lane divided into corresponding portions of multiple data elements. Sets of data elements are selected from corresponding portions of every lane of the source operand according to per-lane control bits. Elements of these sets are copied to specified fields in corresponding portions of every lane of the destination operand. Another embodiment of the shuffle instruction also specifies a second source operand, all operands having corresponding lanes divided into multiple data elements. A set selected according to per-lane control bits contains data elements from every lane portion of a first source operand and data elements from every corresponding lane portion of the second source operand. Set elements are copied to specified fields in every lane of the destination operand.
    Type: Application
    Filed: December 21, 2017
    Publication date: April 26, 2018
    Applicant: Intel Corporation
    Inventors: Zeev Sperber, Robert Valentine, Benny Eitan, Doron Orenstein
  • Publication number: 20180113711
    Abstract: In-lane vector shuffle operations are described. In one embodiment a shuffle instruction specifies a field of per-lane control bits, a source operand and a destination operand, these operands having corresponding lanes, each lane divided into corresponding portions of multiple data elements. Sets of data elements are selected from corresponding portions of every lane of the source operand according to per-lane control bits. Elements of these sets are copied to specified fields in corresponding portions of every lane of the destination operand. Another embodiment of the shuffle instruction also specifies a second source operand, all operands having corresponding lanes divided into multiple data elements. A set selected according to per-lane control bits contains data elements from every lane portion of a first source operand and data elements from every corresponding lane portion of the second source operand. Set elements are copied to specified fields in every lane of the destination operand.
    Type: Application
    Filed: December 21, 2017
    Publication date: April 26, 2018
    Applicant: Intel Corporation
    Inventors: Zeev Sperber, Robert Valentine, Benny Eitan, Doron Orenstein
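The four in-lane shuffle publications above (20180121198, 20180113710, 20180113712, 20180113711) share one abstract. The Python sketch below models the single-source form: per-lane control indices select elements from within the same lane of the source, so no element crosses a lane boundary. The lane size and the reverse-order control vector are example choices; the two-source variant would draw elements from both operands in each lane.

```python
def in_lane_shuffle(src, control, lane_elems=4):
    """Permute elements within each lane of src according to per-lane control indices."""
    dst = []
    for lane_start in range(0, len(src), lane_elems):
        lane = src[lane_start:lane_start + lane_elems]
        # Every destination slot of this lane copies from the same lane of the source.
        dst.extend(lane[c] for c in control)
    return dst


# Two 4-element lanes; within each lane, reverse the element order.
print(in_lane_shuffle([0, 1, 2, 3, 10, 11, 12, 13], control=[3, 2, 1, 0]))
# [3, 2, 1, 0, 13, 12, 11, 10]
```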
  • Patent number: 9946540
    Abstract: An apparatus is described having instruction execution logic circuitry. The instruction execution logic circuitry has input vector element routing circuitry to perform the following for each of three different instructions: for each of a plurality of output vector element locations, route into an output vector element location an input vector element from one of a plurality of input vector element locations that are available to source the output vector element. The output vector element and each of the input vector element locations are one of three available bit widths for the three different instructions. The apparatus further includes masking layer circuitry coupled to the input vector element routing circuitry to mask a data structure created by the input vector routing element circuitry. The masking layer circuitry is designed to mask at three different levels of granularity that correspond to the three available bit widths.
    Type: Grant
    Filed: May 22, 2017
    Date of Patent: April 17, 2018
    Assignee: Intel Corporation
    Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L. Toll, Mark J. Charney, Zeev Sperber, Amit Gradstein
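Patent 9946540 pairs an element-routing stage with a masking layer that can operate at three element granularities. The Python model below treats each list entry as one element, so the granularity question reduces to how the data was sliced beforehand; the routes/mask parameter names and the zeroing default are assumptions.

```python
def route_and_mask(src, routes, mask, default=0):
    """Route source elements into output slots, then apply a per-element mask."""
    routed = [src[r] for r in routes]  # routing stage: each output slot picks a source slot
    return [v if m else default for v, m in zip(routed, mask)]  # masking layer


print(route_and_mask([5, 6, 7, 8], routes=[2, 2, 0, 1], mask=[1, 0, 1, 1]))
# [7, 0, 5, 6]
```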
  • Publication number: 20180088940
    Abstract: A processor for floating point underflow detection includes circuitry to decode a first instruction and a floating point unit. The decoded instruction, when executed by the processor, may be for performing a fused multiply-add (FMA) operation. The floating point unit includes circuitry to determine a non-normalized result of the first instruction based on a first input, a second input, and a third input. The floating point unit further includes circuitry to determine whether underflow exists in the non-normalized result based on a first exponent of the first input, a second exponent of the second input, and a third exponent of the third input.
    Type: Application
    Filed: September 29, 2016
    Publication date: March 29, 2018
    Inventors: Simon Rubanovich, Thierry Pons, Zeev Sperber, Amit Gradstein
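Publication 20180088940 checks for FMA underflow from the input exponents alone, before normalization. A rough Python approximation of such an exponent test for IEEE-754 doubles is shown below; the guard margin, the use of math.frexp, and treating the larger of the product and addend exponents as the pre-normalization result exponent are all simplifying assumptions.

```python
import math

EMIN = -1022  # smallest normal exponent for IEEE-754 binary64


def may_underflow(a, b, c, guard=1):
    """Exponent-only estimate of whether fused a*b + c may underflow."""
    ea = math.frexp(a)[1] - 1      # exponent of a with mantissa in [1, 2)
    eb = math.frexp(b)[1] - 1
    ec = math.frexp(c)[1] - 1
    result_exp = max(ea + eb, ec)  # estimate of the non-normalized result exponent
    return result_exp < EMIN + guard


print(may_underflow(1e-200, 1e-150, 1e-320))  # True: result lands in the subnormal range
print(may_underflow(2.0, 3.0, 1.0))           # False
```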
  • Publication number: 20180088943
    Abstract: A method of an aspect includes receiving an instruction. The instruction indicates an integer stride, indicates an integer offset, and indicates a destination storage location. A result is stored in the destination storage location in response to the instruction. The result includes a sequence of at least four integers in numerical order with a smallest one of the at least four integers differing from zero by the integer offset and with all integers of the sequence in consecutive positions differing by the integer stride. Other methods, apparatus, systems, and instructions are disclosed.
    Type: Application
    Filed: September 30, 2017
    Publication date: March 29, 2018
    Applicant: Intel Corporation
    Inventors: Seth Abraham, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Zeev Sperber, Amit Gradstein
  • Publication number: 20180088942
    Abstract: A method of an aspect includes receiving an instruction. The instruction indicates an integer stride, indicates an integer offset, and indicates a destination storage location. A result is stored in the destination storage location in response to the instruction. The result includes a sequence of at least four integers in numerical order with a smallest one of the at least four integers differing from zero by the integer offset and with all integers of the sequence in consecutive positions differing by the integer stride. Other methods, apparatus, systems, and instructions are disclosed.
    Type: Application
    Filed: September 30, 2017
    Publication date: March 29, 2018
    Applicant: Intel Corporation
    Inventors: Seth Abraham, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Zeev Sperber, Amit Gradstein
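Publications 20180088943 and 20180088942 above share an abstract about generating an arithmetic sequence of integers from a stride and an offset. The semantics are simple enough to state in two lines of Python; the parameter names and the default length of four are illustrative.

```python
def stride_sequence(stride, offset, n=4):
    """Produce n integers starting at `offset` and stepping by `stride`."""
    return [offset + i * stride for i in range(n)]


print(stride_sequence(stride=3, offset=2))       # [2, 5, 8, 11]
print(stride_sequence(stride=2, offset=0, n=8))  # [0, 2, 4, 6, 8, 10, 12, 14]
```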
  • Patent number: 9929745
    Abstract: An apparatus and method are described for performing vector compression. For example, one embodiment of a processor comprises: vector compression logic to compress a source vector comprising a plurality of valid data elements and invalid data elements to generate a destination vector in which valid data elements are stored contiguously on one side of the destination vector, the vector compression logic to utilize a bit mask associated with the source vector and comprising a plurality of bits, each bit corresponding to one of the plurality of data elements of the source vector and indicating whether the data element comprises a valid data element or an invalid data element, the vector compression logic to utilize indices of the bit mask and associated bit values of the bit mask to generate a control vector; and shuffle logic to shuffle/permute the data elements of the source vector to the destination vector in accordance with the control vector.
    Type: Grant
    Filed: September 26, 2014
    Date of Patent: March 27, 2018
    Assignee: Intel Corporation
    Inventors: Simon Rubanovich, David M. Russinoff, Amit Gradstein, John W. O'Leary, Zeev Sperber