Patents by Inventor Igor Ermolaev

Igor Ermolaev has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Instructions and support for calculating prefix sums

Patent number: 12517728

Abstract: Techniques for performing prefix sums in response to a single instruction are described. In some examples, the single instruction includes fields for an opcode, one or more fields to reference a first source operand, one or more fields to reference a second source operand, and one or more fields to reference a destination operand, wherein the opcode is to indicate that execution circuitry is, in response to a decoded instance of the single instruction, to at least: perform a prefix sum by, for each non-masked data element position of the second source operand, adding a data element of that data element position to each data element of preceding data element positions and adding at least one data element of a defined data element position of the first source operand, and store each prefix sum for each data element position of the second source operand into a corresponding data element position of the destination operand.

Type: Grant

Filed: June 17, 2022

Date of Patent: January 6, 2026

Assignee: Intel Corporation

Inventors: Menachem Adelman, Amit Gradstein, Regev Shemy, Chitra Natarajan, Igor Ermolaev
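The masked prefix sum described in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The function name, the choice of the last element of the first source as the "defined data element position", and the treatment of masked positions (they keep the destination's prior value and contribute nothing to the running sum) are all assumptions made for illustration.

```python
def masked_prefix_sum(src1, src2, mask, dest, carry_index=-1):
    """Sketch of a masked prefix sum with a carry-in element.

    Assumptions (not stated in the abstract):
    - the carry-in is src1[carry_index], defaulting to the last element;
    - masked positions keep the old destination value;
    - masked elements of src2 do not contribute to the running sum.
    """
    out = list(dest)              # start from the existing destination contents
    running = src1[carry_index]   # carry-in taken from the first source operand
    for i, (value, active) in enumerate(zip(src2, mask)):
        if active:
            running += value      # add this element to the sum of preceding ones
            out[i] = running      # store the prefix sum at the same position
    return out


print(masked_prefix_sum([0, 0, 0, 10], [1, 2, 3, 4], [1, 1, 0, 1], [0, 0, 99, 0]))
```

With a carry-in of 10 and the third position masked, the call above prints [11, 13, 99, 17].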
INSTRUCTIONS AND SUPPORT FOR HORIZONTAL REDUCTIONS

Publication number: 20240004662

Abstract: Techniques for performing horizontal reductions are described. In some examples, an instance of a horizontal instruction is to include at least one field for an opcode, one or more fields to reference a first source operand, and one or more fields to reference a destination operand, wherein the opcode is to indicate that execution circuitry is, in response to a decoded instance of the single instruction, to at least perform a horizontal reduction using at least one data element of a non-masked data element position of at least the first source operand and store a result of the horizontal reduction in the destination operand.

Type: Application

Filed: July 2, 2022

Publication date: January 4, 2024

Inventors: Menachem ADELMAN, Amit GRADSTEIN, Regev SHEMY, Chitra NATARAJAN, Leonardo BORGES, Chytra SHIVASWAMY, Igor ERMOLAEV, Michael ESPIG, Or BEIT AHARON, Jeff WIEDEMEIER
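As a rough illustration of a masked horizontal reduction, the Python sketch below folds only the non-masked elements of one source vector into a single result. The function name, the default operator, and the `identity` parameter are hypothetical; the abstract does not specify which reduction operations are supported or what happens when every element is masked.

```python
from functools import reduce
import operator


def horizontal_reduce(src, mask, op=operator.add, identity=0):
    """Reduce the non-masked elements of src to a single value.

    Assumption: when all elements are masked, the identity value is returned.
    """
    active = (value for value, bit in zip(src, mask) if bit)
    return reduce(op, active, identity)


print(horizontal_reduce([1, 2, 3, 4], [1, 0, 1, 1]))                # sum of 1, 3, 4
print(horizontal_reduce([1, 5, 3], [1, 1, 1], max, float("-inf")))  # maximum element
```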
INSTRUCTIONS AND SUPPORT FOR CALCULATING PREFIX SUMS

Publication number: 20230409333

Abstract: Techniques for performing prefix sums in response to a single instruction are described. In some examples, the single instruction includes fields for an opcode, one or more fields to reference a first source operand, one or more fields to reference a second source operand, and one or more fields to reference a destination operand, wherein the opcode is to indicate that execution circuitry is, in response to a decoded instance of the single instruction, to at least: perform a prefix sum by, for each non-masked data element position of the second source operand, adding a data element of that data element position to each data element of preceding data element positions and adding at least one data element of a defined data element position of the first source operand, and store each prefix sum for each data element position of the second source operand into a corresponding data element position of the destination operand.

Type: Application

Filed: June 17, 2022

Publication date: December 21, 2023

Inventors: Menachem ADELMAN, Amit GRADSTEIN, Regev SHEMY, Chitra NATARAJAN, Igor ERMOLAEV
Apparatuses, methods, and systems for element sorting of vectors

Patent number: 10929133

Abstract: Systems, methods, and apparatuses relating to element sorting of vectors are described. In one embodiment, a processor includes a decoder to decode an instruction into a decoded instruction; and an execution unit to execute the decoded instruction to: provide storage for a comparison matrix to store a comparison value for each element of an input vector compared against the other elements of the input vector, perform a comparison operation on elements of the input vector corresponding to storage of comparison values above a main diagonal of the comparison matrix, perform a different operation on elements of the input vector corresponding to storage of comparison values below the main diagonal of the comparison matrix, and store results of the comparison operation and the different operation in the comparison matrix.

Type: Grant

Filed: January 16, 2019

Date of Patent: February 23, 2021

Assignee: Intel Corporation

Inventors: Mikhail Plotnikov, Igor Ermolaev
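One way to picture the comparison matrix is the classic rank-based sort, sketched below in Python. The abstract does not say which operations are used above and below the main diagonal; this sketch assumes a strict "greater than" above the diagonal and "greater than or equal" below it, which gives every element a unique rank even when values repeat. All names are hypothetical.

```python
def comparison_matrix(x):
    """Build an n-by-n matrix of comparison values for the input vector x.

    Assumed operations: x[i] > x[j] above the main diagonal (j > i),
    x[i] >= x[j] below it (j < i). The diagonal itself stays 0.
    """
    n = len(x)
    matrix = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if j > i:
                matrix[i][j] = int(x[i] > x[j])
            elif j < i:
                matrix[i][j] = int(x[i] >= x[j])
    return matrix


def sort_by_rank(x):
    """Place each element at the index given by its row sum in the matrix."""
    out = [None] * len(x)
    for i, row in enumerate(comparison_matrix(x)):
        out[sum(row)] = x[i]  # row sum = number of elements that precede x[i]
    return out


print(sort_by_rank([3, 1, 2, 1]))
```

Using two different comparisons on the two sides of the diagonal is what keeps equal elements from colliding on the same output index in this sketch.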
Strideshift instruction for transposing bits inside vector register

Patent number: 10884750

Abstract: A processor includes a decode circuit to decode an instruction into a decoded instruction and an execution circuit to execute the decoded instruction to access a first bit of a first input vector located at a bit position indicated by an element of a second input vector, stride over bits of the first input vector using a stride to access bits of the first input vector that are located at a strided bit position with respect to the first bit of the first input vector, and store the first bit of the first input vector and the bits of the first input vector that are located at a strided bit position with respect to the first bit of the first input vector as consecutive bits in a destination vector.

Type: Grant

Filed: February 28, 2017

Date of Patent: January 5, 2021

Assignee: INTEL CORPORATION

Inventors: Mikhail Plotnikov, Igor Ermolaev
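The strided bit gather in this abstract can be sketched on a plain Python integer. This is an illustration only: the function name, the `count` parameter, and the convention that bit position 0 is the least significant bit are assumptions, and the sketch handles a single start position rather than one per element of a second input vector.

```python
def stride_gather_bits(value, start, stride, count):
    """Collect `count` bits of `value`, beginning at bit `start` and stepping
    by `stride`, and pack them as consecutive bits of the result.

    Assumption: bit 0 is the least significant bit.
    """
    result = 0
    for k in range(count):
        bit = (value >> (start + k * stride)) & 1  # read the strided bit
        result |= bit << k                         # store it at consecutive position k
    return result


# Bits 1, 3, 5 and 7 of 0b10101010 are all set, so the gathered value is 0b1111.
print(bin(stride_gather_bits(0b10101010, start=1, stride=2, count=4)))
```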
Methods and processors having instructions to determine middle, lowest, or highest values of corresponding elements of three vectors

Patent number: 10838720

Abstract: Systems, methods, and apparatuses relating to multiple source blend operations are described. In one embodiment, a processor is to execute an instruction to: receive a first input operand of a first input vector, a second input operand of a second input vector, and a third input operand of a third input vector, compare each element from the first input vector to each corresponding element of the second input vector to produce a first comparison vector, compare each element from the first input vector to each corresponding element of the third input vector to produce a second comparison vector, compare each element from the second input vector to each corresponding element of the third input vector to produce a third comparison vector, determine a middle value for each element position of the input vectors from the comparison vectors, and output the middle values into same element positions in an output vector.

Type: Grant

Filed: September 23, 2016

Date of Patent: November 17, 2020

Assignee: Intel Corporation

Inventors: Mikhail Plotnikov, Igor Ermolaev
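The selection of a middle value from three pairwise comparisons can be sketched in Python as follows. The function name and the specific selection logic are assumptions; the abstract only states that the middle value is determined from the three comparison vectors.

```python
def middle_of_three(a, b, c):
    """Element-wise middle value of three equal-length vectors,
    derived only from the three pairwise comparison results."""
    out = []
    for x, y, z in zip(a, b, c):
        x_gt_y = x > y   # first comparison vector
        x_gt_z = x > z   # second comparison vector
        y_gt_z = y > z   # third comparison vector
        if x_gt_y != x_gt_z:
            out.append(x)      # x is above exactly one of the others
        elif x_gt_y == y_gt_z:
            out.append(y)      # x is an extreme and y sits between x and z
        else:
            out.append(z)
    return out


print(middle_of_three([1, 5, 3], [2, 4, 3], [3, 6, 3]))
```

For the three input vectors above, the call prints [2, 5, 3].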
VECTOR COMPRESS2 AND EXPAND2 INSTRUCTIONS WITH TWO MEMORY LOCATIONS

Publication number: 20190347101

Abstract: Disclosed embodiments relate to vector compress2 and expand2 instructions with two memory locations. In one example, a system includes a memory and a processor that includes circuits to fetch, decode, and execute the instruction that includes an opcode, a first destination operand identifier, a second destination operand identifier, a source operand identifier, and a control mask, wherein, for each element of the source operand, the execution circuit is to generate a result by performing one of compression and expansion of the element; and, based on the value of a bit of the control mask corresponding to the element, store the result to a first location identified by the first destination operand identifier and increment the first destination operand identifier by a size of the result, and, otherwise, store the result to a second location identified by the second destination operand identifier and increment the second destination operand identifier by the size of the result.

Type: Application

Filed: April 6, 2017

Publication date: November 14, 2019

Applicant: Intel Corporation

Inventors: Mikhail PLOTNIKOV, Igor ERMOLAEV, Alexander BOBYR
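The two-destination behaviour of the compress variant can be approximated by a mask-driven partition, sketched below in Python. The function name is hypothetical, Python lists stand in for the two memory locations, and appending to a list stands in for storing a result and incrementing the destination identifier. The per-element compression or expansion step itself is omitted.

```python
def compress2(src, mask):
    """Split src into two packed outputs according to the control mask.

    Elements whose mask bit is set go to the first destination; all other
    elements go to the second. Each append models "store, then advance".
    """
    first, second = [], []
    for value, bit in zip(src, mask):
        (first if bit else second).append(value)
    return first, second


print(compress2([10, 11, 12, 13], [1, 0, 0, 1]))
```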
STRIDESHIFT INSTRUCTION FOR TRANSPOSING BITS INSIDE VECTOR REGISTER

Publication number: 20190347104

Abstract: A processor includes a decode circuit to decode an instruction into a decoded instruction and an execution circuit to execute the decoded instruction to access a first bit of a first input vector located at a bit position indicated by an element of a second input vector, stride over bits of the first input vector using a stride to access bits of the first input vector that are located at a strided bit position with respect to the first bit of the first input vector, and store the first bit of the first input vector and the bits of the first input vector that are located at a strided bit position with respect to the first bit of the first input vector as consecutive bits in a destination vector.

Type: Application

Filed: February 28, 2017

Publication date: November 14, 2019

Inventors: Mikhail PLOTNIKOV, Igor ERMOLAEV
APPARATUSES, METHODS, AND SYSTEMS FOR ELEMENT SORTING OF VECTORS

Publication number: 20190146792

Abstract: Systems, methods, and apparatuses relating to element sorting of vectors are described. In one embodiment, a processor includes a decoder to decode an instruction into a decoded instruction; and an execution unit to execute the decoded instruction to: provide storage for a comparison matrix to store a comparison value for each element of an input vector compared against the other elements of the input vector, perform a comparison operation on elements of the input vector corresponding to storage of comparison values above a main diagonal of the comparison matrix, perform a different operation on elements of the input vector corresponding to storage of comparison values below the main diagonal of the comparison matrix, and store results of the comparison operation and the different operation in the comparison matrix.

Type: Application

Filed: January 16, 2019

Publication date: May 16, 2019

Inventors: MIKHAIL PLOTNIKOV, IGOR ERMOLAEV
Apparatuses, methods, and systems for element sorting of vectors

Patent number: 10191744

Abstract: Systems, methods, and apparatuses relating to element sorting of vectors are described. In one embodiment, a processor includes a decoder to decode an instruction into a decoded instruction; and an execution unit to execute the decoded instruction to: provide storage for a comparison matrix to store a comparison value for each element of an input vector compared against the other elements of the input vector, perform a comparison operation on elements of the input vector corresponding to storage of comparison values above a main diagonal of the comparison matrix, perform a different operation on elements of the input vector corresponding to storage of comparison values below the main diagonal of the comparison matrix, and store results of the comparison operation and the different operation in the comparison matrix.

Type: Grant

Filed: July 1, 2016

Date of Patent: January 29, 2019

Assignee: Intel Corporation

Inventors: Mikhail Plotnikov, Igor Ermolaev
APPARATUSES, METHODS, AND SYSTEMS FOR MULTIPLE SOURCE BLEND OPERATIONS

Publication number: 20180088945

Abstract: Systems, methods, and apparatuses relating to multiple source blend operations are described. In one embodiment, a processor is to execute an instruction to: receive a first input operand of a first input vector, a second input operand of a second input vector, and a third input operand of a third input vector, compare each element from the first input vector to each corresponding element of the second input vector to produce a first comparison vector, compare each element from the first input vector to each corresponding element of the third input vector to produce a second comparison vector, compare each element from the second input vector to each corresponding element of the third input vector to produce a third comparison vector, determine a middle value for each element position of the input vectors from the comparison vectors, and output the middle values into same element positions in an output vector.

Type: Application

Filed: September 23, 2016

Publication date: March 29, 2018

Inventors: MIKHAIL PLOTNIKOV, IGOR ERMOLAEV
Instruction set for eliminating misaligned memory accesses during processing of an array having misaligned data rows

Patent number: 9910670

Abstract: A processor is described having an instruction execution pipeline. The instruction execution pipeline includes an instruction fetch stage to fetch an instruction. The instruction format of the instruction specifies a first input vector, a second input vector and a third input operand. The instruction execution pipeline comprises an instruction decode stage to decode the instruction. The instruction execution pipeline includes a functional unit to execute the instruction. The functional unit includes a routing network to route a first contiguous group of elements from a first end of one of the input vectors to a second end of the instruction's resultant vector, and, route a second contiguous group of elements from a second end of the other of the input vectors to a first end of the instruction's resultant vector. The first and second ends are opposite vector ends. The first and second groups of contiguous elements are defined from the third input operand.

Type: Grant

Filed: July 9, 2014

Date of Patent: March 6, 2018

Assignee: INTEL CORPORATION

Inventors: Mikhail Plotnikov, Igor Ermolaev
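The routing described in this abstract resembles selecting a window that spans two neighbouring vectors. The Python sketch below is an interpretation, not the claimed design: the function name, the use of a single `shift` value as the third operand, and which vector end counts as the "first" end are assumptions.

```python
def align_window(lo, hi, shift):
    """Combine the upper part of `lo` with the lower part of `hi`.

    The elements lo[shift:] move to the low end of the result and the
    elements hi[:shift] fill the high end, so the result has len(lo) elements.
    """
    if not 0 <= shift <= len(lo):
        raise ValueError("shift must be between 0 and the vector length")
    return lo[shift:] + hi[:shift]


# A row that starts one element into `lo` and spills into `hi`.
print(align_window([0, 1, 2, 3], [4, 5, 6, 7], shift=1))
```

Under these assumptions, a misaligned row can be rebuilt from two aligned loads without issuing a misaligned memory access.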
APPARATUSES, METHODS, AND SYSTEMS FOR ELEMENT SORTING OF VECTORS

Publication number: 20180004513

Abstract: Systems, methods, and apparatuses relating to element sorting of vectors are described. In one embodiment, a processor includes a decoder to decode an instruction into a decoded instruction; and an execution unit to execute the decoded instruction to: provide storage for a comparison matrix to store a comparison value for each element of an input vector compared against the other elements of the input vector, perform a comparison operation on elements of the input vector corresponding to storage of comparison values above a main diagonal of the comparison matrix, perform a different operation on elements of the input vector corresponding to storage of comparison values below the main diagonal of the comparison matrix, and store results of the comparison operation and the different operation in the comparison matrix.

Type: Application

Filed: July 1, 2016

Publication date: January 4, 2018

Inventors: MIKHAIL PLOTNIKOV, IGOR ERMOLAEV
APPARATUS AND METHOD FOR PERFORMING A SPIN-LOOP JUMP

Publication number: 20170329609

Abstract: An apparatus and method for performing a spin-loop jump are described. One embodiment of a processor comprises: jump-pause execution logic to execute a jump-pause instruction, the jump-pause instruction to specify a condition and identify a destination instruction; wherein responsive to the execution of the jump-pause instruction, the jump-pause execution logic is to provide a hint that a loop between the jump-pause instruction and the destination instruction comprises a spin-wait loop and to test the condition, the jump-pause execution logic to delay execution by a specified amount prior to jumping to the destination instruction if the condition is satisfied.

Type: Application

Filed: December 17, 2014

Publication date: November 16, 2017

Inventors: DMITRY SIVKOV, IGOR ERMOLAEV
Systems, Methods, and Apparatuses for Improving Vector Throughput

Publication number: 20170192789

Abstract: Detailed herein are systems, methods, and apparatuses for improving vector throughput. For example, an apparatus comprising a plurality of aliasable registers, wherein each of the plurality of aliasable registers is partitioned into a plurality of lanes and each lane is aliasable as a distinct register; and execution circuitry to execute instructions using data from the plurality of aliasable registers as input and output operands is described.

Type: Application

Filed: December 30, 2015

Publication date: July 6, 2017

Inventors: Rama Kishan V. Malladi, Elmoustapha Ould-Ahmed-Vall, Igor Ermolaev
Systems, apparatuses, and methods for performing a shuffle and operation (Shuffle-Op)

Patent number: 9684510

Abstract: Embodiments of systems, apparatuses, and methods for performing in a computer processor a data element shuffle and an operation on the shuffled data elements in response to a single data element shuffle and an operation instruction that includes a destination vector register operand, a first and second source vector register operands, an immediate value, and an opcode are described.

Type: Grant

Filed: December 28, 2015

Date of Patent: June 20, 2017

Assignee: Intel Corporation

Inventors: Igor Ermolaev, Elmoustapha Ould-Ahmed-Vall, Bret L. Toll, Jesus Corbal San Adrian, Andrey Naraikin
Vector indexed memory access plus arithmetic and/or logical operation processors, methods, systems, and instructions

Patent number: 9552205

Abstract: A processor including a decode unit to receive a vector indexed load plus arithmetic and/or logical (A/L) operation plus store instruction. The instruction is to indicate a source packed memory indices operand that is to have a plurality of packed memory indices. The instruction is also to indicate a source packed data operand that is to have a plurality of packed data elements. The processor also includes an execution unit coupled with the decode unit. The execution unit, in response to the instruction, is to load a plurality of data elements from memory locations corresponding to the plurality of packed memory indices, perform A/L operations on the plurality of packed data elements of the source packed data operand and the loaded plurality of data elements, and store a plurality of result data elements in the memory locations corresponding to the plurality of packed memory indices.

Type: Grant

Filed: September 27, 2013

Date of Patent: January 24, 2017

Assignee: Intel Corporation

Inventors: Igor Ermolaev, Bret L. Toll, Robert Valentine, Jesus Corbal San Adrian, Gautam B. Doshi, Rama Kishan V. Malladi, Prasenjit Chakraborty
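The load, operate, and store sequence in this abstract can be sketched in Python with a list standing in for memory. The function name and the default operator are hypothetical. The handling of duplicate indices (the last write wins) is an assumption, since the abstract does not address it.

```python
import operator


def indexed_load_op_store(memory, indices, data, op=operator.add):
    """Gather memory[indices], combine with `data` element-wise, scatter back.

    Assumption: if an index appears more than once, the last write wins.
    The list `memory` is updated in place and also returned.
    """
    loaded = [memory[i] for i in indices]               # vector indexed load
    results = [op(m, d) for m, d in zip(loaded, data)]  # arithmetic/logical step
    for i, r in zip(indices, results):                  # store to the same locations
        memory[i] = r
    return memory


print(indexed_load_op_store([10, 20, 30, 40], indices=[3, 0], data=[1, 2]))
```

With indices 3 and 0, only those two memory locations change, and the call prints [12, 20, 30, 41].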
SYSTEMS, APPARATUSES, AND METHODS FOR PERFORMING A SHUFFLE AND OPERATION (SHUFFLE-OP)

Publication number: 20160196139

Abstract: Embodiments of systems, apparatuses, and methods for performing in a computer processor a data element shuffle and an operation on the shuffled data elements in response to a single data element shuffle and an operation instruction that includes a destination vector register operand, a first and second source vector register operands, an immediate value, and an opcode are described.

Type: Application

Filed: December 28, 2015

Publication date: July 7, 2016

Inventors: Igor Ermolaev, Elmoustapha Ould-Ahmed-Vall, Bret L. Toll, Jesus Corbal San Adrian, Andrey Naraikin
INSTRUCTION SET FOR ELIMINATING MISALIGNED MEMORY ACCESSES DURING PROCESSING OF AN ARRAY HAVING MISALIGNED DATA ROWS

Publication number: 20160011870

Abstract: A processor is described having an instruction execution pipeline. The instruction execution pipeline includes an instruction fetch stage to fetch an instruction. The instruction format of the instruction specifies a first input vector, a second input vector and a third input operand. The instruction execution pipeline comprises an instruction decode stage to decode the instruction. The instruction execution pipeline includes a functional unit to execute the instruction. The functional unit includes a routing network to route a first contiguous group of elements from a first end of one of the input vectors to a second end of the instruction's resultant vector, and, route a second contiguous group of elements from a second end of the other of the input vectors to a first end of the instruction's resultant vector. The first and second ends are opposite vector ends. The first and second groups of contiguous elements are defined from the third input operand.

Type: Application

Filed: July 9, 2014

Publication date: January 14, 2016

Inventors: MIKHAIL PLOTNIKOV, IGOR ERMOLAEV
Systems, apparatuses, and methods for performing a shuffle and operation (shuffle-op)

Patent number: 9218182

Abstract: Embodiments of systems, apparatuses, and methods for performing in a computer processor a data element shuffle and an operation on the shuffled data elements in response to a single data element shuffle and an operation instruction that includes a destination vector register operand, a first and second source vector register operands, an immediate value, and an opcode are described.

Type: Grant

Filed: June 29, 2012

Date of Patent: December 22, 2015

Assignee: Intel Corporation

Inventors: Igor Ermolaev, Elmoustapha Ould-Ahmed-Vall, Bret Toll, Jesus Corbal, Andrey Naraikin
