Patents by Inventor Stanislav Shwartsman

Stanislav Shwartsman has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

SYSTEMS, METHODS, AND APPARATUSES FOR TILE MATRIX MULTIPLICATION AND ACCUMULATION

Publication number: 20210349720

Abstract: Embodiments detailed herein relate to matrix operations. In particular, matrix (tile) multiply accumulate and negated matrix (tile) multiply accumulate are discussed. For example, in some embodiments decode circuitry to decode an instruction having fields for an opcode, an identifier for a first source matrix operand, an identifier of a second source matrix operand, and an identifier for a source/destination matrix operand; and execution circuitry to execute the decoded instruction to multiply the identified first source matrix operand by the identified second source matrix operand, add a result of the multiplication to the identified source/destination matrix operand, and store a result of the addition in the identified source/destination matrix operand and zero unconfigured columns of identified source/destination matrix operand are detailed.

Type: Application

Filed: July 22, 2021

Publication date: November 11, 2021

Inventors: Robert VALENTINE, Zeev SPERBER, Mark J. CHARNEY, Bret L. TOLL, Rinat RAPPOPORT, Stanislav SHWARTSMAN, Dan BAUM, Igor YANOVER, Elmoustapha OULD-AHMED-VALL, Menachem ADELMAN, Jesus CORBAL, Yuri GEBIL, Simon RUBANOVICH
Accelerating memory fault resolution by performing fast re-fetching

Patent number: 11150979

Abstract: A method for handling load faults in an out-of-order processor is described. The method includes detecting, by a memory ordering buffer of the out-of-order processor, a load fault corresponding to a load instruction that was executed out-of-order by the out-of-order processor; determining, by the memory ordering buffer, whether instant reclamation is available for resolving the load fault of the load instruction; and performing, in response to determining that instant reclamation is available for resolving the load fault of the load instruction, instant reclamation to re-fetch the load instruction for execution prior to attempting to retire the load instruction.

Type: Grant

Filed: August 13, 2019

Date of Patent: October 19, 2021

Assignee: Intel Corporation

Inventors: Zeev Sperber, Stanislav Shwartsman, Jared W. Stark, IV, Lihu Rappoport, Igor Yanover, George Leifman
Systems, methods, and apparatuses for tile matrix multiplication and accumulation

Patent number: 11086623

Abstract: Embodiments detailed herein relate to matrix operations. In particular, matrix (tile) multiply accumulate and negated matrix (tile) multiply accumulate are discussed. For example, in some embodiments decode circuitry to decode an instruction having fields for an opcode, an identifier for a first source matrix operand, an identifier of a second source matrix operand, and an identifier for a source/destination matrix operand; and execution circuitry to execute the decoded instruction to multiply the identified first source matrix operand by the identified second source matrix operand, add a result of the multiplication to the identified source/destination matrix operand, and store a result of the addition in the identified source/destination matrix operand and zero unconfigured columns of identified source/destination matrix operand are detailed.

Type: Grant

Filed: July 1, 2017

Date of Patent: August 10, 2021

Assignee: Intel Corporation

Inventors: Robert Valentine, Zeev Sperber, Mark J. Charney, Bret L. Toll, Rinat Rappoport, Stanislav Shwartsman, Dan Baum, Igor Yanover, Elmoustapha Ould-Ahmed-Vall, Menachem Adelman, Jesus Corbal, Yuri Gebil, Simon Rubanovich
System, method, and apparatus for enhanced pointer identification and prefetching

Patent number: 11080194

Abstract: System and method for prefetching pointer-referenced data. A method embodiment includes: tracking a plurality of load instructions which includes a first load instruction to access a first data that identifies a first memory location; detecting a second load instruction which accesses a second memory location for a second data, the second memory location matching the first memory location identified by the first data; responsive to the detecting, updating a list of pointer load instructions to include information identifying the first load instruction as a pointer load instruction; prefetching a third data for a third load instruction prior to executing the third load instruction; identifying the third load instruction as a pointer load instruction based on information from the list of pointer load instructions and responsively prefetching a fourth data from a fourth memory location, wherein the fourth memory location is identified by the third data.

Type: Grant

Filed: December 27, 2018

Date of Patent: August 3, 2021

Assignee: Intel Corporation

Inventors: Sreenivas Subramoney, Stanislav Shwartsman, Anant Nori, Shankar Balachandran, Elad Shtiegmann, Vineeth Mekkat, Manjunath Shevgoor, Sourabh Alurkar
Apparatus and Method for Store Pairing with Reduced Hardware Requirements

Publication number: 20210096860

Abstract: An apparatus and method for pairing store operations. For example, one embodiment of a processor comprises: a grouping eligibility checker to evaluate a plurality of store instructions based on a set of grouping rules to determine whether two or more of the plurality of store instructions are eligible for grouping; and a dispatcher to simultaneously dispatch a first group of store instructions of the plurality of store instructions determined to be eligible for grouping by the grouping eligibility checker.

Type: Application

Filed: March 28, 2020

Publication date: April 1, 2021

Inventors: Raanan SADE, Igor YANOVER, Stanislav SHWARTSMAN, Muhammad TAHER, David ZYSMAN, Liron ZUR, Yiftach GILAD
HARDWARE FOR SPLIT DATA TRANSLATION LOOKASIDE BUFFERS

Publication number: 20210034544

Abstract: Systems, methods, and apparatuses relating to hardware for split data translation lookaside buffers. In one embodiment, a processor includes a decode circuit to decode instructions into decoded instructions, an execution circuit to execute the decoded instructions, and a memory circuit comprising a load data translation lookaside buffer circuit and a store data translation lookaside buffer circuit separate and distinct from the load data translation lookaside buffer circuit, wherein the memory circuit sends a memory access request of the instructions to the load data translation lookaside buffer circuit when the memory access request is a load data request and to the store data translation lookaside buffer circuit when the memory access request is a store data request to determine a physical address for a virtual address of the memory access request.

Type: Application

Filed: March 9, 2020

Publication date: February 4, 2021

Inventors: Stanislav Shwartsman, Igor Yanover, Assaf Zaltsman, Ron Rais
Systems and methods to predict load data values

Patent number: 10761844

Abstract: Disclosed embodiments relate to predicting load data. In one example, a processor a pipeline having stages ordered as fetch, decode, allocate, write back, and commit, a training table to store an address, predicted data, a state, and a count of instances of unchanged return data, and tracking circuitry to determine, during one or more of the allocate and decode stages, whether a training table entry has a first state and matches a fetched first load instruction, and, if so, using the data predicted by the entry during the execute stage, the tracking circuitry further to update the training table during or after the write back stage to set the state of the first load instruction in the training table to the first state when the count reaches a first threshold.

Type: Grant

Filed: June 29, 2018

Date of Patent: September 1, 2020

Assignee: Intel Corporation

Inventors: Manjunath Shevgoor, Mark J. Dechene, Stanislav Shwartsman, Pavel I. Kryukov
Automatic predication of hard-to-predict convergent branches

Patent number: 10754655

Abstract: A processing device includes a branch IP table and branch predication circuitry coupled to the branch IP table. The branch predication circuitry to: determine a dynamic convergence point in a conditional branch of set of instructions; store the dynamic convergence point in the branch IP table; fetch a first and second speculative path of the conditional branch; while determining which of the first speculative path and the second speculative path is a taken path of the conditional branch and determining whether a dynamic convergence point is fetched corresponding to the stored dynamic convergence point, stall scheduling of instructions of the first speculative path and the second speculative path; and in response to determining that one of the first speculative path and the second speculative path is the taken path and the fetched dynamic convergence point corresponds to the stored convergence point, resume scheduling of the instructions of the taken path.

Type: Grant

Filed: June 28, 2018

Date of Patent: August 25, 2020

Assignee: Intel Corporation

Inventors: Adarsh Chauhan, Hong Wang, Jayesh Gaur, Zeev Sperber, Sumeet Bandishte, Lihu Rappoport, Stanislav Shwartsman, Kamil Garifullin, Sreenivas Subramoney, Adi Yoaz
SYSTEMS, METHODS, AND APPARATUSES FOR TILE LOAD

Publication number: 20200249949

Abstract: Embodiments detailed herein relate to matrix operations. In particular, the loading of a matrix (tile) from memory. For example, support for a loading instruction is described in the form of decode circuitry to decode an instruction having fields for an opcode, a destination matrix operand identifier, and source memory information, and execution circuitry to execute the decoded instruction to load groups of strided data elements from memory into configured rows of the identified destination matrix operand to memory.

Type: Application

Filed: July 1, 2017

Publication date: August 6, 2020

Applicant: Intel Corporation

Inventors: Robert VALENTINE, Menachem ADELMAN, Milind B. GIRKAR, Zeev SPERBER, Mark J. CHARNEY, Bret L. TOLL, Rinat RAPPOPORT, Jesus Corbal, Stanislav SHWARTSMAN, Dan BAUM, Igor YANOVER, Alexander F. HEINECKE, Barukh ZIV, Elmoustapha OULD-AHMED-VALL, Yuri GEBIL
SYSTEMS, METHODS, AND APPARATUSES FOR TILE BROADCAST

Publication number: 20200249947

Abstract: Embodiments detailed herein relate to matrix operations. In particular, embodiment of broadcasting elements are described. For example, some embodiments describe broadcasting a scalar to all configured data element positons of a destination matrix (tile). For example, some embodiments describe broadcasting a row to all configured data element positons of a destination matrix (tile). For example, some embodiments describe broadcasting a column to all configured data element positons of a destination matrix (tile).

Type: Application

Filed: July 1, 2017

Publication date: August 6, 2020

Applicant: Intel Corporation

Inventors: Robert VALENTINE, Zeev SPERBER, Mark J. CHARNEY, Bret L. TOLL, Jesus CORBAL, Alexander HEINECKE, Barukh ZIV, Dan BAUM, Elmoustapha OULD-AHMED-VALL, Stanislav SHWARTSMAN
SYSTEMS, METHODS, AND APPARATUSES FOR ZEROING A MATRIX

Publication number: 20200241873

Abstract: Embodiments detailed herein relate to matrix operations. In particular, performing a matrix operation of zeroing a matrix in response to a single instruction. For example, a processor detailed which includes decode circuitry to decode an instruction having fields for an opcode and a source/destination matrix operand identifier; and execution circuitry to execute the decoded instruction to zero each data element of the identified source/destination matrix.

Type: Application

Filed: July 1, 2017

Publication date: July 30, 2020

Applicant: Intel Corporation

Inventors: Robert VALENTINE, Menachem ADELMAN, Zeev SPERBER, Mark J. CHARNEY, Bret L. TOLL, Jesus CORBAL, Alexander F. HEINECKE, Barukh ZIV, Elmoustapha OULD-AHMED-VALL, Stanislav SHWARTSMAN
SYSTEMS, METHODS, AND APPARATUSES FOR TILE STORE

Publication number: 20200233666

Abstract: Embodiments detailed herein relate to matrix operations. In particular, the loading of a matrix (tile) from memory.

Type: Application

Filed: July 1, 2017

Publication date: July 23, 2020

Applicant: Intel Corporation

Inventors: Robert VALENTINE, Menachem ADELMAN, Elmoustapha OULD-AHMED-VALL, Bret L. TOLL, Milind B. GIRKAR, Zeev SPERBER, Mark J. CHARNEY, Rinat RAPPOPORT, Jesus CORBAL, Stanislav SHWARTSMAN, Igor YANOVER, Alexander F. HEINECKE, Barukh ZIV, Dan BAUM, Yuri GEBIL
SYSTEMS, METHODS, AND APPARATUSES FOR TILE MATRIX MULTIPLICATION AND ACCUMULATION

Publication number: 20200233667

Abstract: Embodiments detailed herein relate to matrix operations. In particular, matrix (tile) multiply accumulate and negated matrix (tile) multiply accumulate are discussed. For example, in some embodiments decode circuitry to decode an instruction having fields for an opcode, an identifier for a first source matrix operand, an identifier of a second source matrix operand, and an identifier for a source/destination matrix operand; and execution circuitry to execute the decoded instruction to multiply the identified first source matrix operand by the identified second source matrix operand, add a result of the multiplication to the identified source/destination matrix operand, and store a result of the addition in the identified source/destination matrix operand and zero unconfigured columns of identified source/destination matrix operand are detailed.

Type: Application

Filed: July 1, 2017

Publication date: July 23, 2020

Applicant: Intel Corporation

Inventors: Robert VALENTINE, Zeev SPERBER, Mark J. CHARNEY, Bret L. TOLL, Rinat RAPPOPORT, Stanislav SHWARTSMAN, Dan BAUM, Igor YANOVER, Elmoustapha OULD-AHMED-VALL, Menachem ADELMAN, Jesus CORBAL, Yuri GEBIL, Simon RUBANOVICH
Criticality based port scheduling

Patent number: 10719355

Abstract: A processor including an execution unit, an instruction scheduler circuit to identify a first instruction of an instruction stream, identify a second instruction on which execution of the first instruction depends, and assign a first dispatch priority value to the first instruction and the second instruction, and a dispatch circuit to dispatch, based on the first dispatch priority value, the first instruction and the second instruction to an instruction execution circuit.

Type: Grant

Filed: February 7, 2018

Date of Patent: July 21, 2020

Assignee: Intel Corporation

Inventors: Pooja Roy, Jayesh Gaur, Sreenivas Subramoney, Zeev Sperber, Alexandr Titov, Lihu Rappoport, Stanislav Shwartsman, Hong Wang, Adi Yoaz, Ronak Singhal, Robert S. Chappell
SYSTEM, METHOD, AND APPARATUS FOR ENHANCED POINTER IDENTIFICATION AND PREFETCHING

Publication number: 20200210339

Abstract: System and method for prefetching pointer-referenced data. A method embodiment includes: tracking a plurality of load instructions which includes a first load instruction to access a first data that identifies a first memory location; detecting a second load instruction which accesses a second memory location for a second data, the second memory location matching the first memory location identified by the first data; responsive to the detecting, updating a list of pointer load instructions to include information identifying the first load instruction as a pointer load instruction; prefetching a third data for a third load instruction prior to executing the third load instruction; identifying the third load instruction as a pointer load instruction based on information from the list of pointer load instructions and responsively prefetching a fourth data from a fourth memory location, wherein the fourth memory location is identified by the third data.

Type: Application

Filed: December 27, 2018

Publication date: July 2, 2020

Inventors: Sreenivas Subramoney, Stanislav Shwartsman, Anant Nori, Shankar Balachandran, Elad Shtiegmann, Vineeth Mekkat, Manjunath Shevgoor, Sourabh Alurkar
MICROPROCESSOR PIPELINE CIRCUITRY TO SUPPORT CRYPTOGRAPHIC COMPUTING

Publication number: 20200125769

Abstract: In one embodiment, a processor of a cryptographic computing system includes data cache units storing encrypted data and circuitry coupled to the data cache units. The circuitry accesses a sequence of cryptographic-based instructions to execute based on the encrypted data, decrypts the encrypted data based on a first pointer value, executes the cryptographic-based instruction using the decrypted data, encrypts a result of the execution of the cryptographic-based instruction based on a second pointer value, and stores the encrypted result in the data cache units. In some embodiments, the circuitry generates, for each cryptographic-based instruction, at least one encryption-based microoperation and at least one non-encryption-based microoperation. The circuitry also schedules the at least one encryption-based microoperation and the at least one non-encryption-based microoperation for execution based on timings of the encryption-based microoperation.

Type: Application

Filed: December 20, 2019

Publication date: April 23, 2020

Applicant: Intel Corporation

Inventors: Michael E. Kounavis, Santosh Ghosh, Sergej Deutsch, Michael LeMay, David M. Durham, Stanislav Shwartsman
Prefetch filter cache for a processor

Patent number: 10579530

Abstract: In an embodiment, a processor includes a plurality of cores, with at least one core including prefetch logic. The prefetch logic comprises circuitry to: receive a prefetch request; compare the received prefetch request to a plurality of entries of a prefetch filter cache; and in response to a determination that the received prefetch request matches one of the plurality of entries of the prefetch filter cache, drop the received prefetch request. Other embodiments are described and claimed.

Type: Grant

Filed: May 31, 2016

Date of Patent: March 3, 2020

Assignee: Intel Corporation

Inventors: Stanislav Shwartsman, Ron Rais
SYSTEMS, METHODS, AND APPARATUSES FOR MATRIX OPERATIONS

Publication number: 20200065352

Abstract: Embodiments detailed herein relate to matrix (tile) operations. For example, decode circuitry to decode an instruction having fields for an opcode and a memory address; and execution circuitry to execute the decoded instruction to set a tile configuration for the processor to utilize tiles in matrix operations based on a description retrieved from the memory address, wherein a tile a set of 2-dimensional registers are discussed.

Type: Application

Filed: July 1, 2017

Publication date: February 27, 2020

Applicant: Intel Corporation

Inventors: Robert VALENTINE, Mark J. CHARNEY, Elmoustapha OULD-AHMED-VALL, Dan BAUM, Zeev SPERBER, Jesus CORBAL, Bret L. TOLL, Raanan SADE, Igor YANOVER, Yuri GEBIL, Rinat RAPPOPORT, Stanislav SHWARTSMAN, Menachem ADELMAN, Simon RUBANOVICH
SYSTEMS AND METHODS TO PREDICT LOAD DATA VALUES

Publication number: 20200004536

Abstract: Disclosed embodiments relate to predicting load data. In one example, a processor a pipeline having stages ordered as fetch, decode, allocate, write back, and commit, a training table to store an address, predicted data, a state, and a count of instances of unchanged return data, and tracking circuitry to determine, during one or more of the allocate and decode stages, whether a training table entry has a first state and matches a fetched first load instruction, and, if so, using the data predicted by the entry during the execute stage, the tracking circuitry further to update the training table during or after the write back stage to set the state of the first load instruction in the training table to the first state when the count reaches a first threshold.

Type: Application

Filed: June 29, 2018

Publication date: January 2, 2020

Inventors: Manjunath SHEVGOOR, Mark J. DECHENE, Stanislav SHWARTSMAN, Pavel I. KRYUKOV
AUTOMATIC PREDICATION OF HARD-TO-PREDICT CONVERGENT BRANCHES

Publication number: 20200004542

Abstract: A processing device includes a branch IP table and branch predication circuitry coupled to the branch IP table. The branch predication circuitry to: determine a dynamic convergence point in a conditional branch of set of instructions; store the dynamic convergence point in the branch IP table; fetch a first and second speculative path of the conditional branch; while determining which of the first speculative path and the second speculative path is a taken path of the conditional branch and determining whether a dynamic convergence point is fetched corresponding to the stored dynamic convergence point, stall scheduling of instructions of the first speculative path and the second speculative path; and in response to determining that one of the first speculative path and the second speculative path is the taken path and the fetched dynamic convergence point corresponds to the stored convergence point, resume scheduling of the instructions of the taken path.

Type: Application

Filed: June 28, 2018

Publication date: January 2, 2020

Inventors: Adarsh Chauhan, Jayesh Gaur, Zeev Sperber, Sumeet Bandishte, Lihu Rappoport, Stanislav Shwartsman, Kamil Garifullin, Sreenivas Subramoney, Adi Yoaz, Hong Wang

prev 1 2 3 4 5 next