Patents by Inventor Mikhail Plotnikov

Mikhail Plotnikov has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

APPARATUSES, METHODS, AND SYSTEMS FOR MULTIPLE SOURCE BLEND OPERATIONS

Publication number: 20180088945

Abstract: Systems, methods, and apparatuses relating to multiple source blend operations are described. In one embodiment, a processor is to execute an instruction to: receive a first input operand of a first input vector, a second input operand of a second input vector, and a third input operand of a third input vector, compare each element from the first input vector to each corresponding element of the second input vector to produce a first comparison vector, compare each element from the first input vector to each corresponding element of the third input vector to produce a second comparison vector, compare each element from the second input vector to each corresponding element of the third input vector to produce a third comparison vector, determine a middle value for each element position of the input vectors from the comparison vectors, and output the middle values into same element positions in an output vector.

Type: Application

Filed: September 23, 2016

Publication date: March 29, 2018

Inventors: MIKHAIL PLOTNIKOV, IGOR ERMOLAEV
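
Illustration (not part of the patent record): the abstract above describes deriving a per-position middle value from three pairwise comparison vectors. The following is a minimal scalar sketch of that idea in Python. The function name `middle_of_three` and the choice of `<=` as the comparison are assumptions made for illustration; the application does not prescribe them.

```python
def middle_of_three(a, b, c):
    """Return, per element position, the median of a[i], b[i], c[i]."""
    # Three element-wise comparison vectors, mirroring the abstract.
    ab = [x <= y for x, y in zip(a, b)]
    ac = [x <= y for x, y in zip(a, c)]
    bc = [x <= y for x, y in zip(b, c)]

    out = []
    for x, y, z, x_le_y, x_le_z, y_le_z in zip(a, b, c, ab, ac, bc):
        if x_le_y != x_le_z:
            # x is on different sides of y and z, so x is the middle.
            out.append(x)
        elif x_le_y == y_le_z:
            # x is the minimum or maximum and y sits between x and z.
            out.append(y)
        else:
            out.append(z)
    return out


print(middle_of_three([1, 5, 9, 4], [2, 4, 7, 4], [3, 6, 8, 1]))
```

The selection uses only the three comparison results, so no further comparisons on the data are needed once the comparison vectors exist.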
Instruction for implementing iterations having an iteration dependent condition with a vector loop

Patent number: 9921837

Abstract: A processor is described having an instruction execution pipeline. The instruction execution pipeline includes an instruction fetch stage to fetch an instruction. The instruction identifies an input vector operand whose input elements specify one or the other of two states. The instruction execution pipeline also includes an instruction decoder to decode the instruction. The instruction execution pipeline also includes a functional unit to execute the instruction and provide a resultant output vector. The functional unit includes logic circuitry to produce an element in a specific element position of the resultant output vector by performing an operation on a value derived from a base value using a stride in response to one but not the other of the two states being present in a corresponding element position of the input vector operand.

Type: Grant

Filed: July 19, 2016

Date of Patent: March 20, 2018

Assignee: INTEL CORPORATION

Inventor: Mikhail Plotnikov
Instruction set for eliminating misaligned memory accesses during processing of an array having misaligned data rows

Patent number: 9910670

Abstract: A processor is described having an instruction execution pipeline. The instruction execution pipeline includes an instruction fetch stage to fetch an instruction. The instruction format of the instruction specifies a first input vector, a second input vector and a third input operand. The instruction execution pipeline comprises an instruction decode stage to decode the instruction. The instruction execution pipeline includes a functional unit to execute the instruction. The functional unit includes a routing network to route a first contiguous group of elements from a first end of one of the input vectors to a second end of the instruction's resultant vector, and, route a second contiguous group of elements from a second end of the other of the input vectors to a first end of the instruction's resultant vector. The first and second ends are opposite vector ends. The first and second groups of contiguous elements are defined from the third input operand.

Type: Grant

Filed: July 9, 2014

Date of Patent: March 6, 2018

Assignee: INTEL CORPORATION

Inventors: Mikhail Plotnikov, Igor Ermolaev
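
Illustration (not part of the patent record): the routing described in the abstract above can be pictured as stitching the tail of one vector to the head of another. The sketch below assumes Python lists stand in for vector registers, index 0 is the "first end", and the third operand is a simple element count named `shift`; all of these are hypothetical simplifications.

```python
def align_rows(lo, hi, shift):
    """Combine the upper part of `lo` with the lower part of `hi`.

    Elements lo[shift:] move to the low end of the result and
    elements hi[:shift] move to the high end, so the result has
    the same length as the inputs.
    """
    if len(lo) != len(hi):
        raise ValueError("input vectors must have equal length")
    if not 0 <= shift <= len(lo):
        raise ValueError("shift out of range")
    return lo[shift:] + hi[:shift]


# A row that starts one element past a vector boundary.
print(align_rows([0, 1, 2, 3], [4, 5, 6, 7], 1))
```

With two aligned loads and one such combine step, a loop can read a misaligned row without issuing a misaligned memory access, which appears to be the motivation stated in the title.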
Methods, Apparatus, Instructions and Logic to Provide Permute Controls With Leading Zero Count Functionality

Publication number: 20180046462

Abstract: Instructions and logic provide SIMD permute controls with leading zero count functionality. Some embodiments include processors with a register with a plurality of data fields, each of the data fields to store a second plurality of bits. A destination register has corresponding data fields, each of these data fields to store a count of the number of most significant contiguous bits set to zero for corresponding data fields. Responsive to decoding a vector leading zero count instruction, execution units count the number of most significant contiguous bits set to zero for each of data fields in the register, and store the counts in corresponding data fields of the first destination register. Vector leading zero count instructions can be used to generate permute controls and completion masks to be used along with the set of permute controls, to resolve dependencies in gather-modify-scatter SIMD operations.

Type: Application

Filed: October 23, 2017

Publication date: February 15, 2018

Inventors: Christopher J. Hughes, Mikhail Plotnikov, Andrey Naraikin, Robert Valentine
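
Illustration (not part of the patent record): the per-field leading zero count described in the abstract above can be sketched in a few lines of Python. The function name `vector_lzcnt` and the default field width of 32 bits are assumptions; the sketch covers only the counting step, not the generation of permute controls or completion masks.

```python
def vector_lzcnt(values, width=32):
    """Count leading zero bits in each `width`-bit data field."""
    counts = []
    for v in values:
        if not 0 <= v < (1 << width):
            raise ValueError("value does not fit in the data field")
        # bit_length() is the position of the highest set bit plus one,
        # so the remaining high-order bits are all zero.
        counts.append(width - v.bit_length())
    return counts


print(vector_lzcnt([0, 1, 0x80000000, 0xFFFF]))
```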
LOOP VECTORIZATION METHODS AND APPARATUS

Publication number: 20180032342

Abstract: Loop vectorization methods and apparatus are disclosed. An example method includes generating a first control mask for a set of iterations of a loop by evaluating a condition of the loop, wherein generating the first control mask includes setting a bit of the control mask to a first value when the condition indicates that an operation of the loop is to be executed, and setting the bit of the first control mask to a second value when the condition indicates that the operation of the loop is to be bypassed. The example method also includes compressing indexes corresponding to the first set of iterations of the loop according to the first control mask.

Type: Application

Filed: August 18, 2017

Publication date: February 1, 2018

Inventors: Mikhail Plotnikov, Andrey Naraikin, Christopher J. Hughes
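
Illustration (not part of the patent record): the two steps named in the abstract above, building a control mask from the loop condition and compressing the iteration indexes under that mask, can be sketched as follows. The function name `compress_indexes` and the use of 1 and 0 as the two mask values are assumptions for illustration.

```python
def compress_indexes(data, condition):
    """Build a control mask and the compressed list of active indexes."""
    # Bit is 1 when the loop body is to be executed for that iteration,
    # 0 when it is to be bypassed.
    mask = [1 if condition(x) else 0 for x in data]
    # Compression packs the indexes of active iterations together.
    indexes = [i for i, bit in enumerate(mask) if bit]
    return mask, indexes


mask, indexes = compress_indexes([3, -1, 4, -5, 9], lambda x: x > 0)
print(mask, indexes)
```

Packing the active indexes lets later vector operations work on full vectors of useful iterations rather than on sparsely populated ones.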
APPARATUSES, METHODS, AND SYSTEMS FOR ELEMENT SORTING OF VECTORS

Publication number: 20180004513

Abstract: Systems, methods, and apparatuses relating to element sorting of vectors are described. In one embodiment, a processor includes a decoder to decode an instruction into a decoded instruction; and an execution unit to execute the decoded instruction to: provide storage for a comparison matrix to store a comparison value for each element of an input vector compared against the other elements of the input vector, perform a comparison operation on elements of the input vector corresponding to storage of comparison values above a main diagonal of the comparison matrix, perform a different operation on elements of the input vector corresponding to storage of comparison values below the main diagonal of the comparison matrix, and store results of the comparison operation and the different operation in the comparison matrix.

Type: Application

Filed: July 1, 2016

Publication date: January 4, 2018

Inventors: MIKHAIL PLOTNIKOV, IGOR ERMOLAEV
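
Illustration (not part of the patent record): one way a comparison matrix with different operations above and below the main diagonal can drive a sort is rank counting. The sketch below assumes a strict `<` above the diagonal and a non-strict `<=` below it, which gives equal elements distinct ranks. The abstract does not name the two operations, so this pairing and the function name `rank_sort` are assumptions.

```python
def rank_sort(vec):
    """Sort by counting, per element, how many elements precede it."""
    n = len(vec)
    # matrix[i][j] is 1 when element j must come before element i.
    matrix = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if j > i:
                # Above the main diagonal: strict comparison.
                matrix[i][j] = int(vec[j] < vec[i])
            elif j < i:
                # Below the main diagonal: non-strict comparison,
                # so ties are broken by original position.
                matrix[i][j] = int(vec[j] <= vec[i])

    # The row sum is the final position of element i.
    out = [None] * n
    for i, row in enumerate(matrix):
        out[sum(row)] = vec[i]
    return out


print(rank_sort([3, 1, 3, 2]))
```

Because every entry of the matrix is independent of the others, all comparisons could in principle be evaluated in parallel.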
Systems, Apparatuses, and Methods for Strided Load

Publication number: 20180004518

Abstract: Systems, methods, and apparatuses for strided loads are described. In an embodiment, an instruction to include at least an opcode, a field for at least two packed data source operands, a field for a packed data destination operand, and an immediate is designated as a strided load instruction. This instruction is executed to load packed data elements from the at least two packed data source operands using a stride and storing results of the strided loads in the packed data destination operand starting from a defined position determined in part from the immediate.

Type: Application

Filed: July 2, 2016

Publication date: January 4, 2018

Inventors: Mikhail Plotnikov, Elmoustapha Ould-Ahmed-Vall
Methods, apparatus, instructions and logic to provide permute controls with leading zero count functionality

Patent number: 9804850

Abstract: Instructions and logic provide SIMD permute controls with leading zero count functionality. Some embodiments include processors with a register with a plurality of data fields, each of the data fields to store a second plurality of bits. A destination register has corresponding data fields, each of these data fields to store a count of the number of most significant contiguous bits set to zero for corresponding data fields. Responsive to decoding a vector leading zero count instruction, execution units count the number of most significant contiguous bits set to zero for each of data fields in the register, and store the counts in corresponding data fields of the first destination register. Vector leading zero count instructions can be used to generate permute controls and completion masks to be used along with the set of permute controls, to resolve dependencies in gather-modify-scatter SIMD operations.

Type: Grant

Filed: June 21, 2016

Date of Patent: October 31, 2017

Assignee: Intel Corporation

Inventors: Christopher J. Hughes, Mikhail Plotnikov, Andrey Naraikin, Robert Valentine
System and method of loop vectorization by compressing indexes and data elements from iterations based on a control mask

Patent number: 9740493

Abstract: Loop vectorization methods and apparatus are disclosed. An example method includes generating a first control mask for a set of iterations of a loop by evaluating a condition of the loop, wherein generating the first control mask includes setting a bit of the control mask to a first value when the condition indicates that an operation of the loop is to be executed, and setting the bit of the first control mask to a second value when the condition indicates that the operation of the loop is to be bypassed. The example method also includes compressing indexes corresponding to the first set of iterations of the loop according to the first control mask.

Type: Grant

Filed: September 28, 2012

Date of Patent: August 22, 2017

Assignee: Intel Corporation

Inventors: Christopher J. Hughes, Mikhail Plotnikov, Andrey Naraikin
COLLAPSING OF MULTIPLE NESTED LOOPS, METHODS, AND INSTRUCTIONS

Publication number: 20170206087

Abstract: In an embodiment, the present invention is directed to a processor including a decode logic to receive a multi-dimensional loop counter update instruction and to decode the multi-dimensional loop counter update instruction into at least one decoded instruction, and an execution logic to execute the at least one decoded instruction to update at least one loop counter value of a first operand associated with the multi-dimensional loop counter update instruction by a first amount. Methods to collapse loops using such instructions are also disclosed. Other embodiments are described and claimed.

Type: Application

Filed: April 4, 2017

Publication date: July 20, 2017

Inventors: MIKHAIL PLOTNIKOV, ANDREY NARAIKIN, ELMOUSTAPHA OULD-AHMED-VALL
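
Illustration (not part of the patent record): updating a multi-dimensional loop counter by a given amount behaves like adding to a mixed-radix number. The sketch below assumes the counters are held in a list with the innermost loop first and that each loop runs from 0 up to an exclusive limit; the function name `update_counters` and these conventions are hypothetical.

```python
def update_counters(counters, limits, amount=1):
    """Advance a nest of loop counters by `amount` iterations.

    Returns the new counters and a flag that is True once the
    whole loop nest has wrapped around (all iterations done).
    """
    out = list(counters)
    carry = amount
    for d, limit in enumerate(limits):
        total = out[d] + carry
        out[d] = total % limit      # new value for this dimension
        carry = total // limit      # overflow into the next outer loop
    return out, carry > 0


# Collapsed form of: for j in range(2): for i in range(3): ...
counters, done = [0, 0], False
while not done:
    print(counters)
    counters, done = update_counters(counters, [3, 2])
```

A single flat loop driven by such an update visits the same iteration space as the original nested loops, which is what allows the nest to be collapsed.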
Collapsing of multiple nested loops, methods and instructions

Patent number: 9619229

Abstract: In an embodiment, the present invention is directed to a processor including a decode logic to receive a multi-dimensional loop counter update instruction and to decode the multi-dimensional loop counter update instruction into at least one decoded instruction, and an execution logic to execute the at least one decoded instruction to update at least one loop counter value of a first operand associated with the multi-dimensional loop counter update instruction by a first amount. Methods to collapse loops using such instructions are also disclosed. Other embodiments are described and claimed.

Type: Grant

Filed: December 27, 2012

Date of Patent: April 11, 2017

Assignee: Intel Corporation

Inventors: Mikhail Plotnikov, Andrey Naraikin, Elmoustapha Ould-Ahmed-Vall
INSTRUCTION FOR ELEMENT OFFSET CALCULATION IN A MULTI-DIMENSIONAL ARRAY

Publication number: 20170075691

Abstract: An apparatus is described having functional unit logic circuitry. The functional unit logic circuitry has a first register to store a first input vector operand having an element for each dimension of a multi-dimensional data structure. Each element of the first vector operand specifying the size of its respective dimension. The functional unit has a second register to store a second input vector operand specifying coordinates of a particular segment of the multi-dimensional structure. The functional unit also has logic circuitry to calculate an address offset for the particular segment relative to an address of an origin segment of the multi-dimensional structure.

Type: Application

Filed: November 29, 2016

Publication date: March 16, 2017

Inventors: MIKHAIL PLOTNIKOV, ANDREY NARAIKIN, ELMOUSTAPHA OULD-AHMED-VALL
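
Illustration (not part of the patent record): computing an element offset from a vector of dimension sizes and a vector of coordinates is the usual linearization of a multi-dimensional index. The sketch below assumes the first dimension varies fastest and that the offset is measured in elements, not bytes; the function name `element_offset` is hypothetical.

```python
def element_offset(sizes, coords):
    """Offset of `coords` from the origin of an array with `sizes`."""
    offset = 0
    stride = 1  # elements spanned by one step in the current dimension
    for size, coord in zip(sizes, coords):
        if not 0 <= coord < size:
            raise IndexError("coordinate outside its dimension")
        offset += coord * stride
        stride *= size
    return offset


# A 4 x 3 x 2 structure, coordinates (1, 2, 1): 1 + 2*4 + 1*12
print(element_offset([4, 3, 2], [1, 2, 1]))
```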
MULTI-ELEMENT INSTRUCTION WITH DIFFERENT READ AND WRITE MASKS

Publication number: 20170052783

Abstract: A method is described that includes reading a first read mask from a first register. The method also includes reading a first vector operand from a second register or memory location. The method also includes applying the read mask against the first vector operand to produce a set of elements for operation. The method also includes performing an operation of the set elements. The method also includes creating an output vector by producing multiple instances of the operation's result. The method also includes reading a first write mask from a third register, the first write mask being different than the first read mask. The method also includes applying the write mask against the output vector to create a resultant vector. The method also includes writing the resultant vector to a destination register.

Type: Application

Filed: November 8, 2016

Publication date: February 23, 2017

Inventors: MIKHAIL PLOTNIKOV, ANDREY NARAIKIN, ELMOUSTAPHA OULD-AHMED-VALL, ROBERT VALENTINE, BRET L. TOLL, JESUS CORBAL
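
Illustration (not part of the patent record): the method steps in the abstract above (apply a read mask, operate on the selected elements, replicate the result, apply a different write mask) can be sketched as follows. The sketch assumes the operation is a sum and that destination elements with a cleared write-mask bit keep their previous value; neither choice is stated in the abstract, and the function name `masked_reduce_broadcast` is hypothetical.

```python
def masked_reduce_broadcast(src, read_mask, write_mask, dest):
    """Sum the read-masked elements and write the sum where allowed."""
    # Read mask selects the set of elements for the operation.
    selected = [x for x, bit in zip(src, read_mask) if bit]
    result = sum(selected)  # assumed operation: addition
    # Output vector holds multiple instances of the single result.
    broadcast = [result] * len(dest)
    # Write mask decides which destination elements are updated.
    return [new if bit else old
            for new, bit, old in zip(broadcast, write_mask, dest)]


print(masked_reduce_broadcast([1, 2, 3, 4], [1, 0, 1, 1],
                              [0, 1, 1, 0], [9, 9, 9, 9]))
```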
PROCESSORS, METHODS, SYSTEMS, AND INSTRUCTIONS TO STORE CONSECUTIVE SOURCE ELEMENTS TO UNMASKED RESULT ELEMENTS WITH PROPAGATION TO MASKED RESULT ELEMENTS

Publication number: 20170017488

Abstract: A processor of an aspect includes a decode unit to decode an instruction indicating a first source packed data operand including at least four data elements, a source mask including at least four mask elements, and a destination storage location. An execution unit, in response to the instruction, stores a result packed data operand having a series of at least two unmasked result data elements. Each of the unmasked result data elements stores a value of a different one of at least two consecutive data elements of the first source packed data operand in a relative order. All masked result elements, which are between a nearest corresponding pair of unmasked result data elements, have a same value as an unmasked result data element of the corresponding pair, which is closest to a first end of the result packed data operand. The masked result data elements correspond to masked mask elements.

Type: Application

Filed: March 27, 2014

Publication date: January 19, 2017

Applicant: Intel Corporation

Inventor: Mikhail PLOTNIKOV
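
Illustration (not part of the patent record): the behaviour in the abstract above resembles an expand operation in which gaps are filled from the nearest earlier unmasked element. The sketch below assumes a mask bit of 1 means "unmasked", that index 0 is the "first end", and that positions before the first unmasked element receive a `fill` value; the abstract only specifies the elements between pairs of unmasked elements, so the handling of leading and trailing positions is an assumption, as is the function name `expand_with_propagation`.

```python
def expand_with_propagation(src, mask, fill=0):
    """Place consecutive source elements at unmasked positions and
    copy each one forward into the masked positions that follow it."""
    out = []
    next_src = 0   # index of the next consecutive source element
    last = fill    # most recent value written to an unmasked position
    for bit in mask:
        if bit:
            last = src[next_src]
            next_src += 1
        out.append(last)
    return out


print(expand_with_propagation([10, 20, 30, 40], [0, 1, 0, 0, 1, 1, 0]))
```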
PROCESSORS, METHODS, SYSTEMS, AND INSTRUCTIONS TO STORE SOURCE ELEMENTS TO CORRESPONDING UNMASKED RESULT ELEMENTS WITH PROPAGATION TO MASKED RESULT ELEMENTS

Publication number: 20170017487

Abstract: A processor of an aspect includes a decode unit to decode an instruction that indicates a first source packed data operand including a first plurality of data elements, a source mask including a plurality of mask elements, and a destination storage location. An execution unit, in response to the instruction, stores a result packed data operand. The result packed data operand has at least two unmasked result data elements corresponding to unmasked mask elements of the source mask. Each of the unmasked result data elements has a value of a corresponding data element of the first source packed data operand in a same relative position. All masked result data elements, between each nearest pair of unmasked result data elements, have a same value as an unmasked result data element of the pair closest to a first end of the result packed data operand.

Type: Application

Filed: March 28, 2014

Publication date: January 19, 2017

Applicant: Intel Corporation

Inventor: Mikhail Plotnikov
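
Illustration (not part of the patent record): in contrast to packing consecutive source elements, the abstract above keeps each unmasked element in its own position and copies it into the masked positions that follow. The sketch below assumes a mask bit of 1 means "unmasked", index 0 is the "first end", and positions before the first unmasked element receive a `fill` value; these conventions and the function name `select_with_propagation` are hypothetical.

```python
def select_with_propagation(src, mask, fill=0):
    """Keep source elements at unmasked positions and propagate each
    one into the masked positions up to the next unmasked element."""
    out = []
    last = fill  # most recent unmasked value seen so far
    for x, bit in zip(src, mask):
        if bit:
            last = x
        out.append(last)
    return out


print(select_with_propagation([10, 20, 30, 40, 50], [1, 0, 0, 1, 0]))
```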
Instruction for element offset calculation in a multi-dimensional array

Patent number: 9507593

Abstract: An apparatus is described having functional unit logic circuitry. The functional unit logic circuitry has a first register to store a first input vector operand having an element for each dimension of a multi-dimensional data structure. Each element of the first vector operand specifying the size of its respective dimension. The functional unit has a second register to store a second input vector operand specifying coordinates of a particular segment of the multi-dimensional structure. The functional unit also has logic circuitry to calculate an address offset for the particular segment relative to an address of an origin segment of the multi-dimensional structure.

Type: Grant

Filed: December 23, 2011

Date of Patent: November 29, 2016

Assignee: Intel Corporation

Inventors: Mikhail Plotnikov, Andrey Naraikin, Elmoustapha Ould-Ahmed-Vall
Read and Write Masks Update Instruction for Vectorization of Recursive Computations Over Independent Data

Publication number: 20160335086

Abstract: A processor executes a mask update instruction to perform updates to a first mask register and a second mask register. A register file within the processor includes the first mask register and the second mask register. The processor includes execution circuitry to execute the mask update instruction. In response to the mask update instruction, the execution circuitry is to invert a given number of mask bits in the first mask register, and also to invert the given number of mask bits in the second mask register.

Type: Application

Filed: July 25, 2016

Publication date: November 17, 2016

Inventors: Mikhail Plotnikov, Andrey Naraikin, Christopher J. Hughes
INSTRUCTION FOR IMPLEMENTING ITERATIONS HAVING AN ITERATION DEPENDENT CONDITION WITH A VECTOR LOOP

Publication number: 20160328235

Abstract: A processor is described having an instruction execution pipeline. The instruction execution pipeline includes an instruction fetch stage to fetch an instruction. The instruction identifies an input vector operand whose input elements specify one or the other of two states. The instruction execution pipeline also includes an instruction decoder to decode the instruction. The instruction execution pipeline also includes a functional unit to execute the instruction and provide a resultant output vector. The functional unit includes logic circuitry to produce an element in a specific element position of the resultant output vector by performing an operation on a value derived from a base value using a stride in response to one but not the other of the two states being present in a corresponding element position of the input vector operand.

Type: Application

Filed: July 19, 2016

Publication date: November 10, 2016

Inventor: MIKHAIL PLOTNIKOV
Multi-element instruction with different read and write masks

Patent number: 9489196

Abstract: A method is described that includes reading a first read mask from a first register. The method also includes reading a first vector operand from a second register or memory location. The method also includes applying the read mask against the first vector operand to produce a set of elements for operation. The method also includes performing an operation of the set elements. The method also includes creating an output vector by producing multiple instances of the operation's result. The method also includes reading a first write mask from a third register, the first write mask being different than the first read mask. The method also includes applying the write mask against the output vector to create a resultant vector. The method also includes writing the resultant vector to a destination register.

Type: Grant

Filed: December 23, 2011

Date of Patent: November 8, 2016

Assignee: Intel Corporation

Inventors: Mikhail Plotnikov, Andrey Naraikin, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Bret L. Toll, Jesus Corbal
Methods, Apparatus, Instructions and Logic to Provide Permute Controls With Leading Zero Count Functionality

Publication number: 20160299763

Abstract: Instructions and logic provide SIMD permute controls with leading zero count functionality. Some embodiments include processors with a register with a plurality of data fields, each of the data fields to store a second plurality of bits. A destination register has corresponding data fields, each of these data fields to store a count of the number of most significant contiguous bits set to zero for corresponding data fields. Responsive to decoding a vector leading zero count instruction, execution units count the number of most significant contiguous bits set to zero for each of data fields in the register, and store the counts in corresponding data fields of the first destination register. Vector leading zero count instructions can be used to generate permute controls and completion masks to be used along with the set of permute controls, to resolve dependencies in gather-modify-scatter SIMD operations.

Type: Application

Filed: June 21, 2016

Publication date: October 13, 2016

Inventors: Christopher J. Hughes, Mikhail Plotnikov, Andrey Naraikin, Robert Valentine
