Patents by Inventor Stanislav Shwartsman

Stanislav Shwartsman has filed for patents to protect the following inventions. This listing includes pending patent applications as well as patents already granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20210349720
    Abstract: Embodiments detailed herein relate to matrix operations. In particular, matrix (tile) multiply accumulate and negated matrix (tile) multiply accumulate are discussed. For example, some embodiments detail decode circuitry to decode an instruction having fields for an opcode, an identifier of a first source matrix operand, an identifier of a second source matrix operand, and an identifier of a source/destination matrix operand; and execution circuitry to execute the decoded instruction to multiply the identified first source matrix operand by the identified second source matrix operand, add the result of the multiplication to the identified source/destination matrix operand, store the result of the addition in the identified source/destination matrix operand, and zero the unconfigured columns of the identified source/destination matrix operand.
    Type: Application
    Filed: July 22, 2021
    Publication date: November 11, 2021
    Inventors: Robert VALENTINE, Zeev SPERBER, Mark J. CHARNEY, Bret L. TOLL, Rinat RAPPOPORT, Stanislav SHWARTSMAN, Dan BAUM, Igor YANOVER, Elmoustapha OULD-AHMED-VALL, Menachem ADELMAN, Jesus CORBAL, Yuri GEBIL, Simon RUBANOVICH
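
    A minimal Python sketch of the tile multiply-accumulate semantics summarized above (the same operation recurs in two later entries), assuming a tile is a small 2-D list and that "unconfigured columns" are the columns beyond the configured column count; negation variants, element types, and the instruction encoding are not modeled.
    ```python
    def tile_multiply_accumulate(a, b, c, configured_rows, configured_cols):
        """Behavioral sketch: c += a @ b over the configured region, then zero
        the columns of c beyond the configured column count."""
        k = len(b)                      # inner dimension shared by a and b
        for i in range(configured_rows):
            for j in range(configured_cols):
                acc = c[i][j]
                for x in range(k):
                    acc += a[i][x] * b[x][j]
                c[i][j] = acc
        # Zero the unconfigured columns of the source/destination tile.
        for i in range(len(c)):
            for j in range(configured_cols, len(c[i])):
                c[i][j] = 0
        return c

    a = [[1, 2], [3, 4]]
    b = [[5, 6, 0], [7, 8, 0]]
    c = [[0, 0, 9], [0, 0, 9]]          # last column is "unconfigured"
    print(tile_multiply_accumulate(a, b, c, configured_rows=2, configured_cols=2))
    # [[19, 22, 0], [43, 50, 0]]
    ```
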
  • Patent number: 11150979
    Abstract: A method for handling load faults in an out-of-order processor is described. The method includes detecting, by a memory ordering buffer of the out-of-order processor, a load fault corresponding to a load instruction that was executed out-of-order by the out-of-order processor; determining, by the memory ordering buffer, whether instant reclamation is available for resolving the load fault of the load instruction; and performing, in response to determining that instant reclamation is available for resolving the load fault of the load instruction, instant reclamation to re-fetch the load instruction for execution prior to attempting to retire the load instruction.
    Type: Grant
    Filed: August 13, 2019
    Date of Patent: October 19, 2021
    Assignee: Intel Corporation
    Inventors: Zeev Sperber, Stanislav Shwartsman, Jared W. Stark, IV, Lihu Rappoport, Igor Yanover, George Leifman
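
    A hedged sketch of the fault-handling decision described above; the abstract does not say when instant reclamation is available, so that is passed in as a flag, and the re-fetch and fall-back recovery paths are plain callbacks with invented names.
    ```python
    def handle_load_fault(load, instant_reclamation_available, refetch, flush_at_retire):
        """Resolve a load fault detected for an out-of-order load.

        load -- an opaque record for the faulting load instruction
        instant_reclamation_available -- whether the cheaper recovery path applies
        refetch -- callback that re-fetches the load for execution before retirement
        flush_at_retire -- fallback callback (e.g., a full flush at retirement)
        """
        if instant_reclamation_available:
            return refetch(load)          # recover before attempting to retire the load
        return flush_at_retire(load)      # otherwise take the costlier recovery path
    ```
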
  • Patent number: 11086623
    Abstract: Embodiments detailed herein relate to matrix operations. In particular, matrix (tile) multiply accumulate and negated matrix (tile) multiply accumulate are discussed. For example, some embodiments detail decode circuitry to decode an instruction having fields for an opcode, an identifier of a first source matrix operand, an identifier of a second source matrix operand, and an identifier of a source/destination matrix operand; and execution circuitry to execute the decoded instruction to multiply the identified first source matrix operand by the identified second source matrix operand, add the result of the multiplication to the identified source/destination matrix operand, store the result of the addition in the identified source/destination matrix operand, and zero the unconfigured columns of the identified source/destination matrix operand.
    Type: Grant
    Filed: July 1, 2017
    Date of Patent: August 10, 2021
    Assignee: Intel Corporation
    Inventors: Robert Valentine, Zeev Sperber, Mark J. Charney, Bret L. Toll, Rinat Rappoport, Stanislav Shwartsman, Dan Baum, Igor Yanover, Elmoustapha Ould-Ahmed-Vall, Menachem Adelman, Jesus Corbal, Yuri Gebil, Simon Rubanovich
  • Patent number: 11080194
    Abstract: System and method for prefetching pointer-referenced data. A method embodiment includes: tracking a plurality of load instructions which includes a first load instruction to access a first data that identifies a first memory location; detecting a second load instruction which accesses a second memory location for a second data, the second memory location matching the first memory location identified by the first data; responsive to the detecting, updating a list of pointer load instructions to include information identifying the first load instruction as a pointer load instruction; prefetching a third data for a third load instruction prior to executing the third load instruction; identifying the third load instruction as a pointer load instruction based on information from the list of pointer load instructions and responsively prefetching a fourth data from a fourth memory location, wherein the fourth memory location is identified by the third data.
    Type: Grant
    Filed: December 27, 2018
    Date of Patent: August 3, 2021
    Assignee: Intel Corporation
    Inventors: Sreenivas Subramoney, Stanislav Shwartsman, Anant Nori, Shankar Balachandran, Elad Shtiegmann, Vineeth Mekkat, Manjunath Shevgoor, Sourabh Alurkar
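
    An illustrative sketch of the pointer-load tracking described above: a load whose returned data later shows up as another load's address is recorded as a pointer load, and prefetched data for a known pointer load is itself chased. Table sizes, address translation, and timing are simplifications, and names such as pointer_load_pcs are invented for the sketch.
    ```python
    class PointerPrefetcher:
        def __init__(self):
            self.recent_loads = {}         # returned data value -> PC of the load that produced it
            self.pointer_load_pcs = set()  # PCs identified as pointer loads

        def observe_load(self, pc, address, data):
            """Called for each executed load with its PC, address, and returned data."""
            # If this load's address matches data returned by an earlier load, that
            # earlier load has been dereferenced: record it as a pointer load.
            producer_pc = self.recent_loads.get(address)
            if producer_pc is not None:
                self.pointer_load_pcs.add(producer_pc)
            self.recent_loads[data] = pc

        def prefetch(self, pc, prefetched_data, issue_prefetch):
            """After prefetching data for the load at `pc`, chase the pointer if
            this PC is a known pointer load."""
            if pc in self.pointer_load_pcs:
                issue_prefetch(prefetched_data)    # prefetch the pointed-to location

    pf = PointerPrefetcher()
    pf.observe_load(pc=0x400, address=0x1000, data=0x2000)  # load returns a pointer
    pf.observe_load(pc=0x404, address=0x2000, data=42)      # dereference marks 0x400 as a pointer load
    pf.prefetch(pc=0x400, prefetched_data=0x3000, issue_prefetch=lambda a: print(hex(a)))
    ```
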
  • Publication number: 20210096860
    Abstract: An apparatus and method for pairing store operations. For example, one embodiment of a processor comprises: a grouping eligibility checker to evaluate a plurality of store instructions based on a set of grouping rules to determine whether two or more of the plurality of store instructions are eligible for grouping; and a dispatcher to simultaneously dispatch a first group of store instructions of the plurality of store instructions determined to be eligible for grouping by the grouping eligibility checker.
    Type: Application
    Filed: March 28, 2020
    Publication date: April 1, 2021
    Inventors: Raanan SADE, Igor YANOVER, Stanislav SHWARTSMAN, Muhammad TAHER, David ZYSMAN, Liron ZUR, Yiftach GILAD
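
    A sketch of store grouping under an assumed eligibility rule, since the abstract does not spell out the grouping rules: here, consecutive stores that fall in the same 64-byte cache line are treated as eligible to dispatch together.
    ```python
    CACHE_LINE = 64  # assumed line size in bytes

    def group_stores(stores):
        """stores: list of (address, size) tuples in program order.
        Returns a list of groups; each group would be dispatched simultaneously."""
        groups = []
        current = []
        current_line = None
        for addr, size in stores:
            line = addr // CACHE_LINE
            if current and line == current_line:
                current.append((addr, size))       # eligible: same cache line
            else:
                if current:
                    groups.append(current)
                current = [(addr, size)]
                current_line = line
        if current:
            groups.append(current)
        return groups

    print(group_stores([(0x1000, 8), (0x1008, 8), (0x2000, 4)]))
    # [[(4096, 8), (4104, 8)], [(8192, 4)]]
    ```
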
  • Publication number: 20210034544
    Abstract: Systems, methods, and apparatuses relating to hardware for split data translation lookaside buffers. In one embodiment, a processor includes a decode circuit to decode instructions into decoded instructions, an execution circuit to execute the decoded instructions, and a memory circuit comprising a load data translation lookaside buffer circuit and a store data translation lookaside buffer circuit separate and distinct from the load data translation lookaside buffer circuit, wherein the memory circuit sends a memory access request of the instructions to the load data translation lookaside buffer circuit when the memory access request is a load data request and to the store data translation lookaside buffer circuit when the memory access request is a store data request to determine a physical address for a virtual address of the memory access request.
    Type: Application
    Filed: March 9, 2020
    Publication date: February 4, 2021
    Inventors: Stanislav Shwartsman, Igor Yanover, Assaf Zaltsman, Ron Rais
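
    A behavioral sketch of the split data TLB described above, with load requests looked up in one structure and store requests in a separate one; 4 KiB pages are assumed, and the page walker, permissions, and replacement policy are left out.
    ```python
    PAGE_SHIFT = 12  # assume 4 KiB pages

    class SplitDTLB:
        def __init__(self):
            self.load_dtlb = {}    # virtual page number -> physical page number
            self.store_dtlb = {}

        def translate(self, vaddr, is_store, walk_page_table):
            tlb = self.store_dtlb if is_store else self.load_dtlb
            vpn = vaddr >> PAGE_SHIFT
            ppn = tlb.get(vpn)
            if ppn is None:
                ppn = walk_page_table(vpn)     # miss: fill from the page walker
                tlb[vpn] = ppn
            return (ppn << PAGE_SHIFT) | (vaddr & ((1 << PAGE_SHIFT) - 1))

    tlb = SplitDTLB()
    print(hex(tlb.translate(0x1234, is_store=False, walk_page_table=lambda vpn: vpn + 0x100)))
    # 0x101234
    ```
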
  • Patent number: 10761844
    Abstract: Disclosed embodiments relate to predicting load data. In one example, a processor includes a pipeline having stages ordered as fetch, decode, allocate, write back, and commit; a training table to store an address, predicted data, a state, and a count of instances of unchanged return data; and tracking circuitry to determine, during one or more of the allocate and decode stages, whether a training table entry has a first state and matches a fetched first load instruction and, if so, to use the data predicted by the entry during the execute stage, the tracking circuitry further to update the training table during or after the write back stage to set the state of the first load instruction in the training table to the first state when the count reaches a first threshold.
    Type: Grant
    Filed: June 29, 2018
    Date of Patent: September 1, 2020
    Assignee: Intel Corporation
    Inventors: Manjunath Shevgoor, Mark J. Dechene, Stanislav Shwartsman, Pavel I. Kryukov
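
    A simplified model of the training table described above: an entry tracks the last returned data and a count of unchanged returns, and is promoted to a predicting state once the count reaches a threshold. The threshold value, state encoding, and table indexing are assumptions made for the sketch.
    ```python
    THRESHOLD = 4  # assumed number of unchanged returns before predicting

    class LoadValuePredictor:
        def __init__(self):
            self.table = {}   # load address -> {"data": ..., "count": ..., "predict": bool}

        def lookup(self, address):
            """Return predicted data if the entry is in the predicting state, else None."""
            entry = self.table.get(address)
            if entry and entry["predict"]:
                return entry["data"]
            return None

        def train(self, address, returned_data):
            """Update the table with the actual data returned by the load."""
            entry = self.table.setdefault(address,
                                          {"data": returned_data, "count": 0, "predict": False})
            if entry["data"] == returned_data:
                entry["count"] += 1
                if entry["count"] >= THRESHOLD:
                    entry["predict"] = True      # promote to the predicting state
            else:
                entry.update(data=returned_data, count=0, predict=False)

    lvp = LoadValuePredictor()
    for _ in range(4):
        lvp.train(0x1000, 7)
    print(lvp.lookup(0x1000))   # 7
    ```
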
  • Patent number: 10754655
    Abstract: A processing device includes a branch IP table and branch predication circuitry coupled to the branch IP table. The branch predication circuitry is to: determine a dynamic convergence point in a conditional branch of a set of instructions; store the dynamic convergence point in the branch IP table; fetch a first and a second speculative path of the conditional branch; while determining which of the first speculative path and the second speculative path is the taken path of the conditional branch and determining whether a dynamic convergence point is fetched corresponding to the stored dynamic convergence point, stall scheduling of instructions of the first speculative path and the second speculative path; and in response to determining that one of the first speculative path and the second speculative path is the taken path and that the fetched dynamic convergence point corresponds to the stored convergence point, resume scheduling of the instructions of the taken path.
    Type: Grant
    Filed: June 28, 2018
    Date of Patent: August 25, 2020
    Assignee: Intel Corporation
    Inventors: Adarsh Chauhan, Hong Wang, Jayesh Gaur, Zeev Sperber, Sumeet Bandishte, Lihu Rappoport, Stanislav Shwartsman, Kamil Garifullin, Sreenivas Subramoney, Adi Yoaz
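
    A coarse sketch of the dual-path flow described above; the fetch, scheduling, and branch-resolution machinery are stood in for by callbacks, and how the dynamic convergence point is discovered in the first place is not modeled.
    ```python
    class BranchIPTable:
        def __init__(self):
            self.convergence = {}          # branch IP -> stored dynamic convergence IP

    def dual_path_fetch(branch_ip, table, fetch_path, resolve_taken, scheduler):
        """Fetch both speculative paths of a conditional branch, stall their
        scheduling, and resume the taken path once the branch is resolved and the
        stored convergence point has been fetched."""
        taken_path = fetch_path(branch_ip, taken=True)        # lists of instruction IPs
        not_taken_path = fetch_path(branch_ip, taken=False)
        scheduler.stall(taken_path + not_taken_path)
        taken = resolve_taken(branch_ip)                      # True if the branch is taken
        convergence_fetched = table.convergence[branch_ip] in taken_path + not_taken_path
        if convergence_fetched:
            scheduler.resume(taken_path if taken else not_taken_path)
    ```
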
  • Publication number: 20200249949
    Abstract: Embodiments detailed herein relate to matrix operations. In particular, the loading of a matrix (tile) from memory. For example, support for a loading instruction is described in the form of decode circuitry to decode an instruction having fields for an opcode, a destination matrix operand identifier, and source memory information, and execution circuitry to execute the decoded instruction to load groups of strided data elements from memory into configured rows of the identified destination matrix operand.
    Type: Application
    Filed: July 1, 2017
    Publication date: August 6, 2020
    Applicant: Intel Corporation
    Inventors: Robert VALENTINE, Menachem ADELMAN, Milind B. GIRKAR, Zeev SPERBER, Mark J. CHARNEY, Bret L. TOLL, Rinat RAPPOPORT, Jesus Corbal, Stanislav SHWARTSMAN, Dan BAUM, Igor YANOVER, Alexander F. HEINECKE, Barukh ZIV, Elmoustapha OULD-AHMED-VALL, Yuri GEBIL
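
    A sketch of the strided tile load described above: each configured row of the destination tile is filled from memory starting at base + row * stride. Memory is modeled as a flat list of elements, so element size and fault behavior are ignored.
    ```python
    def tile_load(memory, base, stride, rows, cols):
        """Return a rows x cols tile loaded from `memory` (a list of elements)."""
        tile = []
        for r in range(rows):
            start = base + r * stride      # strided group of elements for this row
            tile.append(memory[start:start + cols])
        return tile

    mem = list(range(100))
    print(tile_load(mem, base=10, stride=20, rows=3, cols=4))
    # [[10, 11, 12, 13], [30, 31, 32, 33], [50, 51, 52, 53]]
    ```
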
  • Publication number: 20200249947
    Abstract: Embodiments detailed herein relate to matrix operations. In particular, embodiments of broadcasting elements are described. For example, some embodiments describe broadcasting a scalar, a row, or a column to all configured data element positions of a destination matrix (tile).
    Type: Application
    Filed: July 1, 2017
    Publication date: August 6, 2020
    Applicant: Intel Corporation
    Inventors: Robert VALENTINE, Zeev SPERBER, Mark J. CHARNEY, Bret L. TOLL, Jesus CORBAL, Alexander HEINECKE, Barukh ZIV, Dan BAUM, Elmoustapha OULD-AHMED-VALL, Stanislav SHWARTSMAN
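
    A sketch of the three broadcast forms described above: a scalar, a row, or a column broadcast to every configured data-element position of a destination tile.
    ```python
    def broadcast_scalar(value, rows, cols):
        """Fill a rows x cols tile with a single scalar value."""
        return [[value] * cols for _ in range(rows)]

    def broadcast_row(row, rows):
        """Repeat one source row into every configured row of the tile."""
        return [list(row) for _ in range(rows)]

    def broadcast_col(col, cols):
        """Repeat one source column across every configured column of the tile."""
        return [[v] * cols for v in col]

    print(broadcast_scalar(7, 2, 3))     # [[7, 7, 7], [7, 7, 7]]
    print(broadcast_row([1, 2, 3], 2))   # [[1, 2, 3], [1, 2, 3]]
    print(broadcast_col([4, 5], 3))      # [[4, 4, 4], [5, 5, 5]]
    ```
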
  • Publication number: 20200241873
    Abstract: Embodiments detailed herein relate to matrix operations. In particular, performing a matrix operation of zeroing a matrix in response to a single instruction. For example, a processor is detailed which includes decode circuitry to decode an instruction having fields for an opcode and a source/destination matrix operand identifier; and execution circuitry to execute the decoded instruction to zero each data element of the identified source/destination matrix.
    Type: Application
    Filed: July 1, 2017
    Publication date: July 30, 2020
    Applicant: Intel Corporation
    Inventors: Robert VALENTINE, Menachem ADELMAN, Zeev SPERBER, Mark J. CHARNEY, Bret L. TOLL, Jesus CORBAL, Alexander F. HEINECKE, Barukh ZIV, Elmoustapha OULD-AHMED-VALL, Stanislav SHWARTSMAN
  • Publication number: 20200233666
    Abstract: Embodiments detailed herein relate to matrix operations. In particular, the loading of a matrix (tile) from memory.
    Type: Application
    Filed: July 1, 2017
    Publication date: July 23, 2020
    Applicant: Intel Corporation
    Inventors: Robert VALENTINE, Menachem ADELMAN, Elmoustapha OULD-AHMED-VALL, Bret L. TOLL, Milind B. GIRKAR, Zeev SPERBER, Mark J. CHARNEY, Rinat RAPPOPORT, Jesus CORBAL, Stanislav SHWARTSMAN, Igor YANOVER, Alexander F. HEINECKE, Barukh ZIV, Dan BAUM, Yuri GEBIL
  • Publication number: 20200233667
    Abstract: Embodiments detailed herein relate to matrix operations. In particular, matrix (tile) multiply accumulate and negated matrix (tile) multiply accumulate are discussed. For example, some embodiments detail decode circuitry to decode an instruction having fields for an opcode, an identifier of a first source matrix operand, an identifier of a second source matrix operand, and an identifier of a source/destination matrix operand; and execution circuitry to execute the decoded instruction to multiply the identified first source matrix operand by the identified second source matrix operand, add the result of the multiplication to the identified source/destination matrix operand, store the result of the addition in the identified source/destination matrix operand, and zero the unconfigured columns of the identified source/destination matrix operand.
    Type: Application
    Filed: July 1, 2017
    Publication date: July 23, 2020
    Applicant: Intel Corporation
    Inventors: Robert VALENTINE, Zeev SPERBER, Mark J. CHARNEY, Bret L. TOLL, Rinat RAPPOPORT, Stanislav SHWARTSMAN, Dan BAUM, Igor YANOVER, Elmoustapha OULD-AHMED-VALL, Menachem ADELMAN, Jesus CORBAL, Yuri GEBIL, Simon RUBANOVICH
  • Patent number: 10719355
    Abstract: A processor including an execution unit, an instruction scheduler circuit to identify a first instruction of an instruction stream, identify a second instruction on which execution of the first instruction depends, and assign a first dispatch priority value to the first instruction and the second instruction, and a dispatch circuit to dispatch, based on the first dispatch priority value, the first instruction and the second instruction to an instruction execution circuit.
    Type: Grant
    Filed: February 7, 2018
    Date of Patent: July 21, 2020
    Assignee: Intel Corporation
    Inventors: Pooja Roy, Jayesh Gaur, Sreenivas Subramoney, Zeev Sperber, Alexandr Titov, Lihu Rappoport, Stanislav Shwartsman, Hong Wang, Adi Yoaz, Ronak Singhal, Robert S. Chappell
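
    A sketch of criticality-based dispatch as described above: the identified instruction and the instruction it depends on receive the same elevated dispatch priority, and ready instructions dispatch in priority order. How the first instruction is identified is not given in the abstract and is taken as an input here.
    ```python
    HIGH, NORMAL = 0, 1   # lower number dispatches first

    def assign_priorities(instructions, critical, depends_on):
        """instructions: list of instruction ids; critical: id of the identified
        instruction; depends_on: dict mapping an id to the producer it depends on."""
        priority = {i: NORMAL for i in instructions}
        priority[critical] = HIGH
        producer = depends_on.get(critical)
        if producer is not None:
            priority[producer] = HIGH      # boost the dependency as well
        return priority

    def dispatch_order(instructions, priority):
        return sorted(instructions, key=lambda i: priority[i])

    insns = ["i0", "i1", "i2", "i3"]
    prio = assign_priorities(insns, critical="i3", depends_on={"i3": "i1"})
    print(dispatch_order(insns, prio))   # ['i1', 'i3', 'i0', 'i2']
    ```
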
  • Publication number: 20200210339
    Abstract: System and method for prefetching pointer-referenced data. A method embodiment includes: tracking a plurality of load instructions which includes a first load instruction to access a first data that identifies a first memory location; detecting a second load instruction which accesses a second memory location for a second data, the second memory location matching the first memory location identified by the first data; responsive to the detecting, updating a list of pointer load instructions to include information identifying the first load instruction as a pointer load instruction; prefetching a third data for a third load instruction prior to executing the third load instruction; identifying the third load instruction as a pointer load instruction based on information from the list of pointer load instructions and responsively prefetching a fourth data from a fourth memory location, wherein the fourth memory location is identified by the third data.
    Type: Application
    Filed: December 27, 2018
    Publication date: July 2, 2020
    Inventors: Sreenivas Subramoney, Stanislav Shwartsman, Anant Nori, Shankar Balachandran, Elad Shtiegmann, Vineeth Mekkat, Manjunath Shevgoor, Sourabh Alurkar
  • Publication number: 20200125769
    Abstract: In one embodiment, a processor of a cryptographic computing system includes data cache units storing encrypted data and circuitry coupled to the data cache units. The circuitry accesses a sequence of cryptographic-based instructions to execute based on the encrypted data, decrypts the encrypted data based on a first pointer value, executes the cryptographic-based instruction using the decrypted data, encrypts a result of the execution of the cryptographic-based instruction based on a second pointer value, and stores the encrypted result in the data cache units. In some embodiments, the circuitry generates, for each cryptographic-based instruction, at least one encryption-based microoperation and at least one non-encryption-based microoperation. The circuitry also schedules the at least one encryption-based microoperation and the at least one non-encryption-based microoperation for execution based on timings of the encryption-based microoperation.
    Type: Application
    Filed: December 20, 2019
    Publication date: April 23, 2020
    Applicant: Intel Corporation
    Inventors: Michael E. Kounavis, Santosh Ghosh, Sergej Deutsch, Michael LeMay, David M. Durham, Stanislav Shwartsman
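
    A stand-in sketch of the decrypt / execute / re-encrypt flow described above. The real design would use a hardware-keyed, pointer-tweaked cipher; here a pointer-derived XOR keystream built from SHA-256 (operands up to 32 bytes) is used only to make the flow concrete, and the microoperation scheduling the abstract mentions is not modeled.
    ```python
    import hashlib

    def keystream(pointer, length):
        # Stand-in for a pointer-tweaked cipher: derive bytes from the pointer value.
        return hashlib.sha256(pointer.to_bytes(8, "little")).digest()[:length]

    def xor_bytes(data, ks):
        return bytes(a ^ b for a, b in zip(data, ks))

    def crypto_execute(cache, src_pointer, dst_pointer, operation):
        """Decrypt the operand found at src_pointer, run the operation on the
        plaintext, and store the re-encrypted result at dst_pointer."""
        ciphertext = cache[src_pointer]
        plaintext = xor_bytes(ciphertext, keystream(src_pointer, len(ciphertext)))
        result = operation(plaintext)
        cache[dst_pointer] = xor_bytes(result, keystream(dst_pointer, len(result)))

    cache = {0x10: xor_bytes(b"hello", keystream(0x10, 5))}    # pre-encrypted operand
    crypto_execute(cache, 0x10, 0x20, operation=lambda p: p.upper())
    print(xor_bytes(cache[0x20], keystream(0x20, 5)))          # b'HELLO'
    ```
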
  • Patent number: 10579530
    Abstract: In an embodiment, a processor includes a plurality of cores, with at least one core including prefetch logic. The prefetch logic comprises circuitry to: receive a prefetch request; compare the received prefetch request to a plurality of entries of a prefetch filter cache; and in response to a determination that the received prefetch request matches one of the plurality of entries of the prefetch filter cache, drop the received prefetch request. Other embodiments are described and claimed.
    Type: Grant
    Filed: May 31, 2016
    Date of Patent: March 3, 2020
    Assignee: Intel Corporation
    Inventors: Stanislav Shwartsman, Ron Rais
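
    A sketch of the prefetch filter described above: a prefetch request that matches an entry in the filter cache is dropped, otherwise it is issued and recorded. A bounded FIFO stands in for whatever organization and replacement policy the hardware uses.
    ```python
    from collections import OrderedDict

    class PrefetchFilter:
        def __init__(self, capacity=64):
            self.entries = OrderedDict()   # recently seen prefetch addresses
            self.capacity = capacity

        def request(self, address, issue_prefetch):
            if address in self.entries:
                return False               # matches the filter cache: drop the request
            issue_prefetch(address)
            self.entries[address] = True
            if len(self.entries) > self.capacity:
                self.entries.popitem(last=False)   # evict the oldest entry
            return True

    pf = PrefetchFilter()
    pf.request(0x1000, issue_prefetch=print)   # issued: prints 4096
    pf.request(0x1000, issue_prefetch=print)   # duplicate: dropped
    ```
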
  • Publication number: 20200065352
    Abstract: Embodiments detailed herein relate to matrix (tile) operations. For example, some embodiments discuss decode circuitry to decode an instruction having fields for an opcode and a memory address, and execution circuitry to execute the decoded instruction to set a tile configuration for the processor to utilize tiles in matrix operations based on a description retrieved from the memory address, wherein a tile is a set of 2-dimensional registers.
    Type: Application
    Filed: July 1, 2017
    Publication date: February 27, 2020
    Applicant: Intel Corporation
    Inventors: Robert VALENTINE, Mark J. CHARNEY, Elmoustapha OULD-AHMED-VALL, Dan BAUM, Zeev SPERBER, Jesus CORBAL, Bret L. TOLL, Raanan SADE, Igor YANOVER, Yuri GEBIL, Rinat RAPPOPORT, Stanislav SHWARTSMAN, Menachem ADELMAN, Simon RUBANOVICH
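
    A sketch of tile configuration as described above: a description read from a memory address sets, per tile register, the number of rows and columns to use in later matrix operations. The descriptor layout here is invented for the sketch; only the tile register names follow the usual tmm0, tmm1, ... convention.
    ```python
    def configure_tiles(descriptor):
        """descriptor: list of (rows, cols) pairs, one per tile register, as read
        from the memory address named by the instruction. Returns the tile config."""
        config = {}
        for tile_id, (rows, cols) in enumerate(descriptor):
            config[f"tmm{tile_id}"] = {"rows": rows, "cols": cols}
        return config

    print(configure_tiles([(16, 64), (16, 64), (8, 32)]))
    ```
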
  • Publication number: 20200004536
    Abstract: Disclosed embodiments relate to predicting load data. In one example, a processor includes a pipeline having stages ordered as fetch, decode, allocate, write back, and commit; a training table to store an address, predicted data, a state, and a count of instances of unchanged return data; and tracking circuitry to determine, during one or more of the allocate and decode stages, whether a training table entry has a first state and matches a fetched first load instruction and, if so, to use the data predicted by the entry during the execute stage, the tracking circuitry further to update the training table during or after the write back stage to set the state of the first load instruction in the training table to the first state when the count reaches a first threshold.
    Type: Application
    Filed: June 29, 2018
    Publication date: January 2, 2020
    Inventors: Manjunath SHEVGOOR, Mark J. DECHENE, Stanislav SHWARTSMAN, Pavel I. KRYUKOV
  • Publication number: 20200004542
    Abstract: A processing device includes a branch IP table and branch predication circuitry coupled to the branch IP table. The branch predication circuitry is to: determine a dynamic convergence point in a conditional branch of a set of instructions; store the dynamic convergence point in the branch IP table; fetch a first and a second speculative path of the conditional branch; while determining which of the first speculative path and the second speculative path is the taken path of the conditional branch and determining whether a dynamic convergence point is fetched corresponding to the stored dynamic convergence point, stall scheduling of instructions of the first speculative path and the second speculative path; and in response to determining that one of the first speculative path and the second speculative path is the taken path and that the fetched dynamic convergence point corresponds to the stored convergence point, resume scheduling of the instructions of the taken path.
    Type: Application
    Filed: June 28, 2018
    Publication date: January 2, 2020
    Inventors: Adarsh Chauhan, Jayesh Gaur, Zeev Sperber, Sumeet Bandishte, Lihu Rappoport, Stanislav Shwartsman, Kamil Garifullin, Sreenivas Subramoney, Adi Yoaz, Hong Wang