Patents by Inventor Igor Ermolaev

Igor Ermolaev has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240004662
    Abstract: Techniques for performing horizontal reductions are described. In some examples, an instance of a horizontal instruction is to include at least one field for an opcode, one or more fields to reference a first source operand, and one or more fields to reference a destination operand, wherein the opcode is to indicate that execution circuitry is, in response to a decoded instance of the single instruction, to at least perform a horizontal reduction using at least one data element of a non-masked data element position of at least the first source operand and store a result of the horizontal reduction in the destination operand.
    Type: Application
    Filed: July 2, 2022
    Publication date: January 4, 2024
    Inventors: Menachem ADELMAN, Amit GRADSTEIN, Regev SHEMY, Chitra NATARAJAN, Leonardo BORGES, Chytra SHIVASWAMY, Igor ERMOLAEV, Michael ESPIG, Or BEIT AHARON, Jeff WIEDEMEIER
  • Publication number: 20230409333
    Abstract: Techniques for performing prefix sums in response to a single instruction are describe are described. In some examples, the single instruction includes fields for an opcode, one or fields to reference a first source operand, one or fields to reference a second source operand, one or fields to reference a destination operand, wherein the opcode is to indicate that execution circuitry is, in response to a decoded instance of the single instruction, to at least: perform a prefix sum by for each non-masked data element position of the second source operand adding a data element of that data element position to each data element of preceding data element positions and adding at least one data element of a defined data element position of the first source operand, and store each prefix sum for each data element position of the second source operand into a corresponding data element position of the destination operand.
    Type: Application
    Filed: June 17, 2022
    Publication date: December 21, 2023
    Inventors: Menachem ADELMAN, Amit GRADSTEIN, Regev SHEMY, Chitra NATARAJAN, Igor ERMOLAEV
  • Patent number: 10929133
    Abstract: Systems, methods, and apparatuses relating to element sorting of vectors are described. In one embodiment, a processor includes a decoder to decode an instruction into a decoded instruction; and an execution unit to execute the decoded instruction to: provide storage for a comparison matrix to store a comparison value for each element of an input vector compared against the other elements of the input vector, perform a comparison operation on elements of the input vector corresponding to storage of comparison values above a main diagonal of the comparison matrix, perform a different operation on elements of the input vector corresponding to storage of comparison values below the main diagonal of the comparison matrix, and store results of the comparison operation and the different operation in the comparison matrix.
    Type: Grant
    Filed: January 16, 2019
    Date of Patent: February 23, 2021
    Assignee: Intel Corporation
    Inventors: Mikhail Plotnikov, Igor Ermolaev
  • Patent number: 10884750
    Abstract: A processor includes a decode circuit to decode an instruction into a decoded instruction and an execution circuit to execute the decoded instruction to access a first bit of a first input vector located at a bit position indicated by an element of a second input vector, stride over bits of the first input vector using a stride to access bits of the first input vector that are located at a strided bit position with respect to the first bit of the first input vector, and store the first bit of the first input vector and the bits of the first input vector that are located at a strided bit position with respect to the first bit of the first input vector as consecutive bits in a destination vector.
    Type: Grant
    Filed: February 28, 2017
    Date of Patent: January 5, 2021
    Assignee: INTEL CORPORATION
    Inventors: Mikhail Plotnikov, Igor Ermolaev
  • Patent number: 10838720
    Abstract: Systems, methods, and apparatuses relating to multiple source blend operations are described. In one embodiment, a processor is to execute an instruction to: receive a first input operand of a first input vector, a second input operand of a second input vector, and a third input operand of a third input vector, compare each element from the first input vector to each corresponding element of the second input vector to produce a first comparison vector, compare each element from the first input vector to each corresponding element of the third input vector to produce a second comparison vector, compare each element from the second input vector to each corresponding element of the third input vector to produce a third comparison vector, determine a middle value for each element position of the input vectors from the comparison vectors, and output the middle values into same element positions in an output vector.
    Type: Grant
    Filed: September 23, 2016
    Date of Patent: November 17, 2020
    Assignee: Intel Corporation
    Inventors: Mikhail Plotnikov, Igor Ermolaev
  • Publication number: 20190347101
    Abstract: Disclosed embodiments relate to vector compress2 and expand2 instructions with two memory locations. In one example, a system includes a memory and a processor that includes circuits to fetch, decode, and execute the instruction that includes an opcode, a first destination operand identifier, a second operand identifier, a source operand identifier, and a control mask, wherein, for each element of the source operand, the execution circuit is to generate a result by performing one of compression and expansion of the element; and, based on the value of a bit of the control mask corresponding to the element, store the result to a first location identified by the first destination operand identifier and increment the first destination operand identifier by a size of the result, and, otherwise, store the result to a second location identified by the second destination operand identifier and increment the second destination operand identifier by the size of the result.
    Type: Application
    Filed: April 6, 2017
    Publication date: November 14, 2019
    Applicant: Intel Corporation
    Inventors: Mikhail PLOTNIKOV, Igor ERMOLAEV, Alexander BOBYR
  • Publication number: 20190347104
    Abstract: A processor includes a decode circuit to decode an instruction into a decoded instruction and an execution circuit to execute the decoded instruction to access a first bit of a first input vector located at a bit position indicated by an element of a second input vector, stride over bits of the first input vector using a stride to access bits of the first input vector that are located at a strided bit position with respect to the first bit of the first input vector, and store the first bit of the first input vector and the bits of the first input vector that are located at a strided bit position with respect to the first bit of the first input vector as consecutive bits in a destination vector.
    Type: Application
    Filed: February 28, 2017
    Publication date: November 14, 2019
    Inventors: Mikhail PLOTNIKOV, Igor ERMOLAEV
  • Publication number: 20190146792
    Abstract: Systems, methods, and apparatuses relating to element sorting of vectors are described. In one embodiment, a processor includes a decoder to decode an instruction into a decoded instruction; and an execution unit to execute the decoded instruction to: provide storage for a comparison matrix to store a comparison value for each element of an input vector compared against the other elements of the input vector, perform a comparison operation on elements of the input vector corresponding to storage of comparison values above a main diagonal of the comparison matrix, perform a different operation on elements of the input vector corresponding to storage of comparison values below the main diagonal of the comparison matrix, and store results of the comparison operation and the different operation in the comparison matrix.
    Type: Application
    Filed: January 16, 2019
    Publication date: May 16, 2019
    Inventors: MIKHAIL PLOTNIKOV, IGOR ERMOLAEV
  • Patent number: 10191744
    Abstract: Systems, methods, and apparatuses relating to element sorting of vectors are described. In one embodiment, a processor includes a decoder to decode an instruction into a decoded instruction; and an execution unit to execute the decoded instruction to: provide storage for a comparison matrix to store a comparison value for each element of an input vector compared against the other elements of the input vector, perform a comparison operation on elements of the input vector corresponding to storage of comparison values above a main diagonal of the comparison matrix, perform a different operation on elements of the input vector corresponding to storage of comparison values below the main diagonal of the comparison matrix, and store results of the comparison operation and the different operation in the comparison matrix.
    Type: Grant
    Filed: July 1, 2016
    Date of Patent: January 29, 2019
    Assignee: Intel Corporation
    Inventors: Mikhail Plotnikov, Igor Ermolaev
  • Publication number: 20180088945
    Abstract: Systems, methods, and apparatuses relating to multiple source blend operations are described. In one embodiment, a processor is to execute an instruction to: receive a first input operand of a first input vector, a second input operand of a second input vector, and a third input operand of a third input vector, compare each element from the first input vector to each corresponding element of the second input vector to produce a first comparison vector, compare each element from the first input vector to each corresponding element of the third input vector to produce a second comparison vector, compare each element from the second input vector to each corresponding element of the third input vector to produce a third comparison vector, determine a middle value for each element position of the input vectors from the comparison vectors, and output the middle values into same element positions in an output vector.
    Type: Application
    Filed: September 23, 2016
    Publication date: March 29, 2018
    Inventors: MIKHAIL PLOTNIKOV, IGOR ERMOLAEV
  • Patent number: 9910670
    Abstract: A processor is described having an instruction execution pipeline. The instruction execution pipeline includes an instruction fetch stage to fetch an instruction. The instruction format of the instruction specifies a first input vector, a second input vector and a third input operand. The instruction execution pipeline comprises an instruction decode stage to decode the instruction. The instruction execution pipeline includes a functional unit to execute the instruction. The functional unit includes a routing network to route a first contiguous group of elements from a first end of one of the input vectors to a second end of the instruction's resultant vector, and, route a second contiguous group of elements from a second end of the other of the input vectors to a first end of the instruction's resultant vector. The first and second ends are opposite vector ends. The first and second groups of contiguous elements are defined from the third input operand.
    Type: Grant
    Filed: July 9, 2014
    Date of Patent: March 6, 2018
    Assignee: INTEL CORPORATION
    Inventors: Mikhail Plotnikov, Igor Ermolaev
  • Publication number: 20180004513
    Abstract: Systems, methods, and apparatuses relating to element sorting of vectors are described. In one embodiment, a processor incudes a decoder to decode an instruction into a decoded instruction; and an execution unit to execute the decoded instruction to: provide storage for a comparison matrix to store a comparison value for each element of an input vector compared against the other elements of the input vector, perform a comparison operation on elements of the input vector corresponding to storage of comparison values above a main diagonal of the comparison matrix, perform a different operation on elements of the input vector corresponding to storage of comparison values below the main diagonal of the comparison matrix, and store results of the comparison operation and the different operation in the comparison matrix.
    Type: Application
    Filed: July 1, 2016
    Publication date: January 4, 2018
    Inventors: MIKHAIL PLOTNIKOV, IGOR ERMOLAEV
  • Publication number: 20170329609
    Abstract: An apparatus and method for performing a spin-loop jump. One embodiment of a processor comprises: jump-pause execution logic to execute a jump-pause instruction, the jump-pause instruction to specify a condition and identify a destination instruction; wherein responsive to the execution of the jump-pause instruction, the jump-pause execution logic is to provide a hint that a loop between the jump-pause instruction and the destination instruction comprises a spin-wait loop and to test the condition, the jump-pause execution logic to delay execution by a specified amount prior to jumping to the destination instruction if the condition is satisfied.
    Type: Application
    Filed: December 17, 2014
    Publication date: November 16, 2017
    Inventors: DMITRY SIVKOV, IGOR ERMOLAEV
  • Publication number: 20170192789
    Abstract: Detailed herein are systems, methods, and apparatuses for improving vector throughput. For example, an apparatus comprising a plurality of aliasable registers, wherein each of the plurality of aliasable registers is partitioned into a plurality of lanes and each lane is aliasable as a distinct register; and execution circuitry to execute instructions using data from the plurality of aliasable registers as input and output operands is described.
    Type: Application
    Filed: December 30, 2015
    Publication date: July 6, 2017
    Inventors: Rama Kishnan V. Malladi, Elmoustapha Ould-Ahmed-Vall, Igor Ermolaev
  • Patent number: 9684510
    Abstract: Embodiments of systems, apparatuses, and methods for performing in a computer processor a data element shuffle and an operation on the shuffled data elements in response to a single data element shuffle and an operation instruction that includes a destination vector register operand, a first and second source vector register operands, an immediate value, and an opcode are described.
    Type: Grant
    Filed: December 28, 2015
    Date of Patent: June 20, 2017
    Assignee: Intel Corporation
    Inventors: Igor Ermolaev, Elmoustapha Ould-Ahmed-Vall, Bret L. Toll, Jesus Corbal San Adrian, Andrey Naraikin
  • Patent number: 9552205
    Abstract: A processor including a decode unit to receive a vector indexed load plus arithmetic and/or logical (A/L) operation plus store instruction. The instruction is to indicate a source packed memory indices operand that is to have a plurality of packed memory indices. The instruction is also to indicate a source packed data operand that is to have a plurality of packed data elements. The processor also includes an execution unit coupled with the decode unit. The execution unit, in response to the instruction, is to load a plurality of data elements from memory locations corresponding to the plurality of packed memory indices, perform A/L operations on the plurality of packed data elements of the source packed data operand and the loaded plurality of data elements, and store a plurality of result data elements in the memory locations corresponding to the plurality of packed memory indices.
    Type: Grant
    Filed: September 27, 2013
    Date of Patent: January 24, 2017
    Assignee: Intel Corporation
    Inventors: Igor Ermolaev, Bret L. Toll, Robert Valentine, Jesus Corbal San Adrian, Gautam B. Doshi, Rama Kishan V. Malladi, Prasenjit Chakraborty
  • Publication number: 20160196139
    Abstract: Embodiments of systems, apparatuses, and methods for performing in a computer processor a data element shuffle and an operation on the shuffled data elements in response to a single data element shuffle and an operation instruction that includes a destination vector register operand, a first and second source vector register operands, an immediate value, and an opcode are described.
    Type: Application
    Filed: December 28, 2015
    Publication date: July 7, 2016
    Inventors: Igor Ermolaev, Elmoustapha Ould-Ahmed-Vall, Bret L. Toll, Jesus Corbal San Adrian, Andrey Naraikin
  • Publication number: 20160011870
    Abstract: A processor is described having an instruction execution pipeline. The instruction execution pipeline includes an instruction fetch stage to fetch an instruction. The instruction format of the instruction specifies a first input vector, a second input vector and a third input operand. The instruction execution pipeline comprises an instruction decode stage to decode the instruction. The instruction execution pipeline includes a functional unit to execute the instruction. The functional unit includes a routing network to route a first contiguous group of elements from a first end of one of the input vectors to a second end of the instruction's resultant vector, and, route a second contiguous group of elements from a second end of the other of the input vectors to a first end of the instruction's resultant vector. The first and second ends are opposite vector ends. The first and second groups of contiguous elements are defined from the third input operand.
    Type: Application
    Filed: July 9, 2014
    Publication date: January 14, 2016
    Inventors: MIKHAIL PLOTNIKOV, IGOR ERMOLAEV
  • Patent number: 9218182
    Abstract: Embodiments of systems, apparatuses, and methods for performing in a computer processor a data element shuffle and an operation on the shuffled data elements in response to a single data element shuffle and an operation instruction that includes a destination vector register operand, a first and second source vector register operands, an immediate value, and an opcode are described.
    Type: Grant
    Filed: June 29, 2012
    Date of Patent: December 22, 2015
    Assignee: Intel Corporation
    Inventors: Igor Ermolaev, Elmoustapha Ould-Ahmed-Vall, Bret Toll, Jesus Corbal, Andrey Naraikin
  • Patent number: 9122475
    Abstract: A mask generating instruction is executed by a processor to improve efficiency of vector operations on an array of data elements. The processor includes vector registers, one of which stores data elements of an array. The processor further includes execution circuitry to receive a mask generating instruction that specifies at least a first operand and a second operand. Responsive to the mask generating instruction, the execution circuitry is to shift bits of the first operand to the left by a number of times defined in the second operand, and pull in a bit of one from the right each time a most significant bit of the first operand is shifted out from the left to generate a result. Each bit in the result corresponds to one of the data elements of the array.
    Type: Grant
    Filed: September 28, 2012
    Date of Patent: September 1, 2015
    Assignee: Intel Corporation
    Inventors: Mikhail Plotnikov, Igor Ermolaev, Andrey Naraikin, Robert Valentine