Patents by Inventor Milind Girkar

Milind Girkar has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

SYSTEMS, METHODS, AND APPARATUSES FOR MATRIX ADD, SUBTRACT, AND MULTIPLY

Publication number: 20240134644

Abstract: Embodiments detailed herein relate to matrix operations. In particular, support for matrix (tile) addition, subtraction, and multiplication is described. For example, circuitry to support instructions for element-by-element matrix (tile) addition, subtraction, and multiplication are detailed. In some embodiments, for matrix (tile) addition, decode circuitry is to decode an instruction having fields for an opcode, a first source matrix operand identifier, a second source matrix operand identifier, and a destination matrix operand identifier; and execution circuitry is to execute the decoded instruction to, for each data element position of the identified first source matrix operand: add a first data value at that data element position to a second data value at a corresponding data element position of the identified second source matrix operand, and store a result of the addition into a corresponding data element position of the identified destination matrix operand.
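The element-by-element tile addition described in this abstract can be sketched in software as follows. This is only an illustrative model, not the claimed circuitry: the function name `tile_add` and the list-of-rows tile representation are assumptions introduced here.

```python
def tile_add(src1, src2):
    """Element-by-element addition of two equally shaped tiles.

    Tiles are modelled as lists of rows (a hypothetical representation;
    the abstract only speaks of source and destination matrix operands).
    """
    same_rows = len(src1) == len(src2)
    if not same_rows or any(len(r1) != len(r2) for r1, r2 in zip(src1, src2)):
        raise ValueError("source tiles must have the same shape")
    # For each data element position, add the two source values and
    # store the sum at the corresponding destination position.
    return [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(src1, src2)]


dst = tile_add([[1, 2], [3, 4]], [[10, 20], [30, 40]])
print(dst)  # [[11, 22], [33, 44]]
```

Subtraction and element-wise multiplication would follow the same pattern with a different per-element operator.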

Type: Application

Filed: December 29, 2023

Publication date: April 25, 2024

Applicant: Intel Corporation

Inventors: Robert VALENTINE, Dan BAUM, Zeev SPERBER, Jesus CORBAL, Elmoustapha OULD-AHMED-VALL, Bret L. TOLL, Mark J. CHARNEY, Barukh ZIV, Alexander HEINECKE, Milind GIRKAR, Simon RUBANOVICH

SYSTEMS AND METHODS FOR EXECUTING A FUSED MULTIPLY-ADD INSTRUCTION FOR COMPLEX NUMBERS

Publication number: 20240126546

Abstract: Disclosed embodiments relate to executing a vector-complex fused multiply-add instruction. In one example, a method includes fetching an instruction, a format of the instruction including an opcode, a first source operand identifier, a second source operand identifier, and a destination operand identifier, wherein each of the identifiers identifies a location storing a packed data comprising at least one complex number, decoding the instruction, retrieving data associated with the first and second source operand identifiers, and executing the decoded instruction to, for each packed data element position of the identified first and second source operands, cross-multiply the real and imaginary components to generate four products: a product of real components, a product of imaginary components, and two mixed products, generate a complex result by using the four products according to the instruction, and store a result to the corresponding position of the identified destination operand.
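A rough software model of the four-product scheme in this abstract is given below. The abstract leaves open how the four products are combined ("according to the instruction"); the sketch assumes the ordinary complex product added to the existing destination value. The name `complex_fma` and the `(real, imag)` tuple layout are hypothetical, and the Python arithmetic rounds at each step, so it is not truly fused.

```python
def complex_fma(src1, src2, dst):
    """Per-element complex multiply-add over packed (real, imag) pairs.

    Assumption: result = dst + src1 * src2 for each element position.
    """
    out = []
    for (a_re, a_im), (b_re, b_im), (c_re, c_im) in zip(src1, src2, dst):
        rr = a_re * b_re  # product of real components
        ii = a_im * b_im  # product of imaginary components
        ri = a_re * b_im  # first mixed product
        ir = a_im * b_re  # second mixed product
        out.append((c_re + rr - ii, c_im + ri + ir))
    return out


# (1 + 2i) * (3 + 4i) = -5 + 10i, added to (10 + 20i)
print(complex_fma([(1.0, 2.0)], [(3.0, 4.0)], [(10.0, 20.0)]))  # [(5.0, 30.0)]
```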

Type: Application

Filed: December 28, 2023

Publication date: April 18, 2024

Inventors: Roman S. Dubtsov, Robert Valentine, Jesus Corbal, Milind Girkar, Elmoustapha Ould-Ahmed-Vall

HARDWARE ENHANCEMENTS FOR DOUBLE PRECISION SYSTOLIC SUPPORT

Publication number: 20240111826

Abstract: An apparatus to facilitate hardware enhancements for double precision systolic support is disclosed. The apparatus includes matrix acceleration hardware having double-precision (DP) matrix multiplication circuitry including multiplier circuits to multiply pairs of input source operands in a DP floating-point format; adders to receive multiplier outputs from the multiplier circuits and accumulate the multiplier outputs in a high precision intermediate format; an accumulator circuit to accumulate adder outputs from the adders with at least one of a third global source operand on a first pass of the DP matrix multiplication circuitry or an intermediate result from the first pass on a second pass of the DP matrix multiplication circuitry, wherein the accumulator circuit is to generate an accumulator output in the high precision intermediate format; and a down conversion and rounding circuit to down convert and round an output of the second pass as a final result in the DP floating-point format.
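The two-pass accumulate-then-round flow in this abstract can be illustrated with a small dot-product model. This is a sketch under stated assumptions: Python's exact `fractions.Fraction` stands in for the "high precision intermediate format" (the real format is finite and is not specified here), and the function names and the split of the inputs into two halves are hypothetical.

```python
from fractions import Fraction


def dp_pass(a, b, carry_in):
    """One pass: multiply operand pairs, sum the products, add carry_in.

    Nothing is rounded inside a pass; Fraction is an assumed stand-in
    for the high precision intermediate format.
    """
    products = [Fraction(x) * Fraction(y) for x, y in zip(a, b)]  # multipliers
    partial = sum(products, Fraction(0))                          # adders
    return partial + carry_in                                     # accumulator


def dp_dot_two_pass(a, b, c):
    """Accumulate c + dot(a, b) in two passes, rounding to double once."""
    half = len(a) // 2
    first = dp_pass(a[:half], b[:half], Fraction(c))  # pass 1 uses source c
    second = dp_pass(a[half:], b[half:], first)       # pass 2 uses pass-1 result
    return float(second)                              # down-convert and round


a = [1e16, 1.0, -1e16, 1.0]
b = [1.0, 1.0, 1.0, 1.0]

naive = 0.0
for x, y in zip(a, b):
    naive += x * y  # rounds to double after every step

print(naive)                       # 1.0
print(dp_dot_two_pass(a, b, 0.0))  # 2.0
```

Rounding only once at the end is what preserves the small terms that the step-by-step double-precision sum loses.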

Type: Application

Filed: September 30, 2022

Publication date: April 4, 2024

Applicant: Intel Corporation

Inventors: Jiasheng Chen, Kevin Hurd, Changwon Rhee, Jorge Parra, Fangwen Fu, Theo Drane, William Zorn, Peter Caday, Gregory Henry, Guei-Yuan Lueh, Farzad Chehrazi, Amit Karande, Turbo Majumder, Xinmin Tian, Milind Girkar, Hong Jiang

Protection of keys and sensitive data from attack within microprocessor architecture

Patent number: 11838418

Abstract: A processor core that includes a token generator circuit is to execute a first instruction in response to initialization of a software program that requests access to protected data output by a cryptographic operation. To execute the first instruction, the processor core is to: retrieve a key that is to be used by the cryptographic operation; trigger the token generator circuit to generate an authorization token; cryptographically encode the key and the authorization token within a key handle; store the key handle in memory; and embed the authorization token within a cryptographic instruction that is to perform the cryptographic operation. The cryptographic instruction may be associated with a first logical compartment of the software program that is authorized access to the protected data.

Type: Grant

Filed: August 20, 2020

Date of Patent: December 5, 2023

Assignee: Intel Corporation

Inventors: Milind Girkar, Jason W. Brandt, Michael LeMay

Systems, apparatuses, and methods for controllable sine and/or cosine operations

Patent number: 11579871

Abstract: Embodiments of systems, apparatuses, and methods for performing vector-packed controllable sine and/or cosine operations in a processor are described. For example, execution circuitry executes a decoded instruction to compute at least a real output value and an imaginary output value based on at least a cosine calculation and a sine calculation, the cosine and sine calculations each based on an index value from a packed data source operand, add the index value with an index increment value from the packed data source operand to create an updated index value, and store the real output value, the imaginary output value, and the updated index value to a packed data destination operand.
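One way to picture the operation in this abstract is the per-element step below. The abstract does not say how an index maps to an angle; the sketch assumes the index selects one of `table_size` equally spaced points on the unit circle. The names `sincos_step` and `table_size` are hypothetical.

```python
import math


def sincos_step(index, increment, table_size=1024):
    """Return (real, imag, updated_index) for one packed element.

    Assumption: angle = 2*pi * index / table_size.
    """
    angle = 2.0 * math.pi * (index % table_size) / table_size
    real = math.cos(angle)   # real output value from the cosine calculation
    imag = math.sin(angle)   # imaginary output value from the sine calculation
    return real, imag, index + increment


# A "packed" source is modelled as a list of (index, increment) pairs.
packed_src = [(0, 256), (256, 256)]
packed_dst = [sincos_step(i, inc) for i, inc in packed_src]
print(packed_dst[0])  # (1.0, 0.0, 256)
```

Feeding the updated index back in as the next source steps around the circle by a fixed increment, which is the usual pattern for generating a complex exponential sequence.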

Type: Grant

Filed: June 14, 2021

Date of Patent: February 14, 2023

Assignee: Intel Corporation

Inventors: Venkateswara R. Madduri, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Mark J. Charney, Carl Murray, Milind Girkar, Bret Toll

SYSTEMS, METHODS, AND APPARATUSES FOR MATRIX ADD, SUBTRACT, AND MULTIPLY

Publication number: 20220171623

Abstract: Embodiments detailed herein relate to matrix operations. In particular, support for matrix (tile) addition, subtraction, and multiplication is described. For example, circuitry to support instructions for element-by-element matrix (tile) addition, subtraction, and multiplication are detailed. In some embodiments, for matrix (tile) addition, decode circuitry is to decode an instruction having fields for an opcode, a first source matrix operand identifier, a second source matrix operand identifier, and a destination matrix operand identifier; and execution circuitry is to execute the decoded instruction to, for each data element position of the identified first source matrix operand: add a first data value at that data element position to a second data value at a corresponding data element position of the identified second source matrix operand, and store a result of the addition into a corresponding data element position of the identified destination matrix operand.

Type: Application

Filed: December 10, 2021

Publication date: June 2, 2022

Applicant: Intel Corporation

Inventors: Robert VALENTINE, Dan BAUM, Zeev SPERBER, Jesus CORBAL, Elmoustapha OULD-AHMED-VALL, Bret L. TOLL, Mark J. CHARNEY, Barukh ZIV, Alexander HEINECKE, Milind GIRKAR, Simon RUBANOVICH

SYSTEMS, APPARATUSES, AND METHODS FOR CONTROLLABLE SINE AND/OR COSINE OPERATIONS

Publication number: 20220035630

Abstract: Embodiments of systems, apparatuses, and methods for performing vector-packed controllable sine and/or cosine operations in a processor are described. For example, execution circuitry executes a decoded instruction to compute at least a real output value and an imaginary output value based on at least a cosine calculation and a sine calculation, the cosine and sine calculations each based on an index value from a packed data source operand, add the index value with an index increment value from the packed data source operand to create an updated index value, and store the real output value, the imaginary output value, and the updated index value to a packed data destination operand.

Type: Application

Filed: June 14, 2021

Publication date: February 3, 2022

Applicant: Intel Corporation

Inventors: Venkateswara R. MADDURI, Elmoustapha OULD-AHMED-VALL, Robert VALENTINE, Jesus CORBAL, Mark J. CHARNEY, Carl MURRAY, Milind GIRKAR, Bret TOLL

Systems, methods, and apparatuses for matrix add, subtract, and multiply

Patent number: 11200055

Abstract: Embodiments detailed herein relate to matrix operations. In particular, support for matrix (tile) addition, subtraction, and multiplication is described. For example, circuitry to support instructions for element-by-element matrix (tile) addition, subtraction, and multiplication are detailed. In some embodiments, for matrix (tile) addition, decode circuitry is to decode an instruction having fields for an opcode, a first source matrix operand identifier, a second source matrix operand identifier, and a destination matrix operand identifier; and execution circuitry is to execute the decoded instruction to, for each data element position of the identified first source matrix operand: add a first data value at that data element position to a second data value at a corresponding data element position of the identified second source matrix operand, and store a result of the addition into a corresponding data element position of the identified destination matrix operand.

Type: Grant

Filed: July 1, 2017

Date of Patent: December 14, 2021

Assignee: Intel Corporation

Inventors: Robert Valentine, Dan Baum, Zeev Sperber, Jesus Corbal, Elmoustapha Ould-Ahmed-Vall, Bret L. Toll, Mark J. Charney, Barukh Ziv, Alexander Heinecke, Milind Girkar, Simon Rubanovich

SYSTEMS AND METHODS FOR EXECUTING A FUSED MULTIPLY-ADD INSTRUCTION FOR COMPLEX NUMBERS

Publication number: 20210357217

Abstract: Disclosed embodiments relate to executing a vector-complex fused multiply-add instruction. In one example, a method includes fetching an instruction, a format of the instruction including an opcode, a first source operand identifier, a second source operand identifier, and a destination operand identifier, wherein each of the identifiers identifies a location storing a packed data comprising at least one complex number, decoding the instruction, retrieving data associated with the first and second source operand identifiers, and executing the decoded instruction to, for each packed data element position of the identified first and second source operands, cross-multiply the real and imaginary components to generate four products: a product of real components, a product of imaginary components, and two mixed products, generate a complex result by using the four products according to the instruction, and store a result to the corresponding position of the identified destination operand.

Type: Application

Filed: June 1, 2021

Publication date: November 18, 2021

Inventors: Roman S. Dubtsov, Robert Valentine, Jesus Corbal, Milind Girkar, Elmoustapha Ould-Ahmed-Vall

Systems, apparatuses, and methods for controllable sine and/or cosine operations

Patent number: 11036499

Abstract: Embodiments of systems, apparatuses, and methods for performing controllable sine and/or cosine operations in a processor are described. For example, execution circuitry executes a decoded instruction to compute at least a real output value and an imaginary output value based on at least a cosine calculation and a sine calculation, the cosine and sine calculations each based on an index value from a packed data source operand, add the index value with an index increment value from the packed data source operand to create an updated index value, and store the real output value, the imaginary output value, and the updated index value to a packed data destination operand.

Type: Grant

Filed: June 30, 2017

Date of Patent: June 15, 2021

Assignee: Intel Corporation

Inventors: Venkateswara R. Madduri, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Mark J. Charney, Carl Murray, Milind Girkar, Bret Toll

Systems and methods for executing a fused multiply-add instruction for complex numbers

Patent number: 11023231

Abstract: Disclosed embodiments relate to executing a vector-complex fused multiply-add instruction. In one example, a method includes fetching an instruction, a format of the instruction including an opcode, a first source operand identifier, a second source operand identifier, and a destination operand identifier, wherein each of the identifiers identifies a location storing a packed data comprising at least one complex number, decoding the instruction, retrieving data associated with the first and second source operand identifiers, and executing the decoded instruction to, for each packed data element position of the identified first and second source operands, cross-multiply the real and imaginary components to generate four products: a product of real components, a product of imaginary components, and two mixed products, generate a complex result by using the four products according to the instruction, and store a result to the corresponding position of the identified destination operand.

Type: Grant

Filed: October 1, 2016

Date of Patent: June 1, 2021

Assignee: Intel Corporation

Inventors: Roman S. Dubtsov, Robert Valentine, Jesus Corbal, Milind Girkar, Elmoustapha Ould-Ahmed-Vall

Programmable event driven yield mechanism which may activate other threads

Patent number: 10877910

Abstract: Method, apparatus, and program means for a programmable event driven yield mechanism that may activate other threads. In one embodiment, an apparatus includes execution resources to execute a plurality of instructions and a monitor to detect a condition indicating a low level of progress. The monitor can disrupt processing of a program by transferring to a handler in response to detecting the condition indicating a low level of progress. In another embodiment, thread switch logic may be coupled to a plurality of event monitors which monitor events within the multithreading execution logic. The thread switch logic switches threads based at least partially on a programmable condition of one or more of the performance monitors.
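As a loose software analogy for the monitor described in this abstract, the sketch below counts consecutive steps without progress and transfers to a handler once a programmable threshold is reached. The abstract concerns hardware event monitors; the class `ProgressMonitor`, its `threshold` parameter, and the notion of a "step" are all hypothetical stand-ins.

```python
class ProgressMonitor:
    """Calls a handler after `threshold` consecutive low-progress steps."""

    def __init__(self, threshold, handler):
        self.threshold = threshold  # programmable condition (assumed form)
        self.handler = handler      # where control transfers on a yield
        self.stalls = 0

    def record(self, made_progress):
        """Record one step; return True if the handler was invoked."""
        if made_progress:
            self.stalls = 0
            return False
        self.stalls += 1
        if self.stalls >= self.threshold:
            self.stalls = 0
            self.handler()          # disrupt the program: run the handler
            return True
        return False


log = []
monitor = ProgressMonitor(threshold=3, handler=lambda: log.append("yield"))
fired = [monitor.record(p) for p in [True, False, False, False, True]]
print(fired)  # [False, False, False, True, False]
print(log)    # ['yield']
```

In this analogy the handler is the point at which another thread could be activated.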

Type: Grant

Filed: March 31, 2017

Date of Patent: December 29, 2020

Assignee: Intel Corporation

Inventors: Hong Wang, Per Hammarlund, Xiang Zou, John P. Shen, Xinmin Tian, Milind Girkar, Perry H. Wang, Piyush N. Desai

PROTECTION OF KEYS AND SENSITIVE DATA FROM ATTACK WITHIN MICROPROCESSOR ARCHITECTURE

Publication number: 20200382303

Abstract: A processor core that includes a token generator circuit is to execute a first instruction in response to initialization of a software program that requests access to protected data output by a cryptographic operation. To execute the first instruction, the processor core is to: retrieve a key that is to be used by the cryptographic operation; trigger the token generator circuit to generate an authorization token; cryptographically encode the key and the authorization token within a key handle; store the key handle in memory; and embed the authorization token within a cryptographic instruction that is to perform the cryptographic operation. The cryptographic instruction may be associated with a first logical compartment of the software program that is authorized access to the protected data.

Type: Application

Filed: August 20, 2020

Publication date: December 3, 2020

Applicant: Intel Corporation

Inventors: Milind Girkar, Jason W. Brandt, Michael LeMay

Protection of keys and sensitive data from attack within microprocessor architecture

Patent number: 10785028

Abstract: A processor core that includes a token generator circuit is to execute a first instruction in response to initialization of a software program that requests access to protected data output by a cryptographic operation. To execute the first instruction, the processor core is to: retrieve a key that is to be used by the cryptographic operation; trigger the token generator circuit to generate an authorization token; cryptographically encode the key and the authorization token within a key handle; store the key handle in memory; and embed the authorization token within a cryptographic instruction that is to perform the cryptographic operation. The cryptographic instruction may be associated with a first logical compartment of the software program that is authorized access to the protected data.

Type: Grant

Filed: June 29, 2018

Date of Patent: September 22, 2020

Assignee: Intel Corporation

Inventors: Milind Girkar, Jason W. Brandt, Michael LeMay

SYSTEMS, APPARATUSES, AND METHODS FOR VECTOR-PACKED FRACTIONAL MULTIPLICATION OF SIGNED WORDS WITH ROUNDING, SATURATION, AND HIGH-RESULT SELECTION

Publication number: 20200073635

Abstract: Embodiments of systems, apparatuses, and methods for vector-packed fractional multiplication of signed words with rounding, saturation, and high-result selection in a processor are described. For example, execution circuitry executes a decoded instruction to perform a fractional multiplication operation for each of a plurality of pairs of packed data elements to yield a plurality of output values, round each of the plurality of output values, detect whether any of the plurality of output values reflect an overflow or underflow, for any of the plurality of output values that reflect an overflow or underflow, saturate the output value, and store the plurality of output values into a corresponding plurality of positions of the packed data destination operand.
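The multiply, round, saturate sequence in this abstract can be modelled as below. The abstract does not state the fixed-point format; the sketch assumes signed 16-bit words in Q15 (value = integer / 32768) and round-half-up, both of which are assumptions, as is the function name `frac_mul_q15`.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def frac_mul_q15(a, b):
    """Fractional multiply of two signed 16-bit Q15 values (assumed format)."""
    product = a * b                        # full-width Q30 product
    rounded = (product + (1 << 14)) >> 15  # round, then keep the high result
    # Saturate anything that no longer fits in a signed 16-bit word.
    return max(INT16_MIN, min(INT16_MAX, rounded))


def packed_frac_mul_q15(src1, src2):
    """Apply the operation to each pair of packed data elements."""
    return [frac_mul_q15(a, b) for a, b in zip(src1, src2)]


# 0.5 * 0.5, -0.5 * 0.5, and (-1.0) * (-1.0), which overflows and saturates
print(packed_frac_mul_q15([16384, -16384, -32768], [16384, 16384, -32768]))
# [8192, -8192, 32767]
```

Under these assumptions the only input pair that needs saturation is -1.0 times -1.0, whose exact result of +1.0 is not representable in Q15.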

Type: Application

Filed: June 29, 2017

Publication date: March 5, 2020

Inventors: Venkateswara R. MADDURI, Elmoustapha OULD-AHMED-VALL, Robert VALENTINE, Jesus CORBAL, Mark J. CHARNEY, Carl MURRAY, Milind GIRKAR, Bret TOLL

SYSTEMS, APPARATUSES, AND METHODS FOR CONTROLLABLE SINE AND/OR COSINE OPERATIONS

Publication number: 20200073658

Abstract: Embodiments of systems, apparatuses, and methods for performing controllable sine and/or cosine operations in a processor are described. For example, execution circuitry executes a decoded instruction to compute at least a real output value and an imaginary output value based on at least a cosine calculation and a sine calculation, the cosine and sine calculations each based on an index value from a packed data source operand, add the index value with an index increment value from the packed data source operand to create an updated index value, and store the real output value, the imaginary output value, and the updated index value to a packed data destination operand.

Type: Application

Filed: June 30, 2017

Publication date: March 5, 2020

Inventors: Venkateswara R. MADDURI, Elmoustapha OULD-AHMED-VALL, Robert VALENTINE, Jesus CORBAL, Mark J. CHARNEY, Carl MURRAY, Milind GIRKAR, Bret TOLL

PROTECTION OF KEYS AND SENSITIVE DATA FROM ATTACK WITHIN MICROPROCESSOR ARCHITECTURE

Publication number: 20200007332

Abstract: A processor core that includes a token generator circuit is to execute a first instruction in response to initialization of a software program that requests access to protected data output by a cryptographic operation. To execute the first instruction, the processor core is to: retrieve a key that is to be used by the cryptographic operation; trigger the token generator circuit to generate an authorization token; cryptographically encode the key and the authorization token within a key handle; store the key handle in memory; and embed the authorization token within a cryptographic instruction that is to perform the cryptographic operation. The cryptographic instruction may be associated with a first logical compartment of the software program that is authorized access to the protected data.

Type: Application

Filed: June 29, 2018

Publication date: January 2, 2020

Inventors: Milind Girkar, Jason W. Brandt, Michael LeMay

SYSTEMS, METHODS, AND APPARATUSES FOR MATRIX ADD, SUBTRACT, AND MULTIPLY

Publication number: 20190347310

Abstract: Embodiments detailed herein relate to matrix operations. In particular, support for matrix (tile) addition, subtraction, and multiplication is described. For example, circuitry to support instructions for element-by-element matrix (tile) addition, subtraction, and multiplication are detailed. In some embodiments, for matrix (tile) addition, decode circuitry is to decode an instruction having fields for an opcode, a first source matrix operand identifier, a second source matrix operand identifier, and a destination matrix operand identifier; and execution circuitry is to execute the decoded instruction to, for each data element position of the identified first source matrix operand: add a first data value at that data element position to a second data value at a corresponding data element position of the identified second source matrix operand, and store a result of the addition into a corresponding data element position of the identified destination matrix operand.

Type: Application

Filed: July 1, 2017

Publication date: November 14, 2019

Applicant: Intel Corporation

Inventors: Robert VALENTINE, Dan BAUM, Zeev SPERBER, Jesus CORBAL, Elmoustapha OULD-AHMED-VALL, Bret L. TOLL, Mark J. CHARNEY, Barukh ZIV, Alexander HEINECKE, Milind GIRKAR, Simon RUBANOVICH

SYSTEMS, METHODS, AND APPARATUSES FOR TILE TRANSPOSE

Publication number: 20190347100

Abstract: Embodiments detailed herein relate to matrix operations. In particular, support for a matrix transpose instruction is detailed. In some embodiments, decode circuitry to decode an instruction having fields for an opcode, a source matrix operand identifier, and a destination matrix operand identifier; and execution circuitry to execute the decoded instruction to transpose each row of elements of the identified source matrix operand into a corresponding column of the identified destination matrix operand are detailed.
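The row-to-column mapping in this abstract amounts to an ordinary matrix transpose, sketched here in software. The function name `tile_transpose` and the list-of-rows tile representation are assumptions made for illustration only.

```python
def tile_transpose(src):
    """Return a tile whose columns are the rows of the rectangular tile `src`."""
    if any(len(row) != len(src[0]) for row in src):
        raise ValueError("source tile must be rectangular")
    # Row i of the source becomes column i of the destination.
    return [list(col) for col in zip(*src)]


print(tile_transpose([[1, 2, 3], [4, 5, 6]]))  # [[1, 4], [2, 5], [3, 6]]
```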

Type: Application

Filed: July 1, 2017

Publication date: November 14, 2019

Applicant: Intel Corporation

Inventors: Robert VALENTINE, Dan BAUM, Zeev SPERBER, Jesus CORBAL, Elmoustapha OULD-AHMED-VALL, Bret L. TOLL, Mark J. CHARNEY, Barukh ZIV, Alexander HEINECKE, Milind GIRKAR, Menachem ADELMAN, Simon RUBANOVICH

Programmable event driven yield mechanism which may activate other threads

Patent number: 10459858

Abstract: Method, apparatus, and program means for a programmable event driven yield mechanism that may activate other threads. In one embodiment, an apparatus includes execution resources to execute a plurality of instructions and a monitor to detect a condition indicating a low level of progress. The monitor can disrupt processing of a program by transferring to a handler in response to detecting the condition indicating a low level of progress. In another embodiment, thread switch logic may be coupled to a plurality of event monitors which monitor events within the multithreading execution logic. The thread switch logic switches threads based at least partially on a programmable condition of one or more of the performance monitors.

Type: Grant

Filed: November 6, 2017

Date of Patent: October 29, 2019

Assignee: Intel Corporation

Inventors: Hong Wang, Per Hammarlund, Xiang Zou, John P. Shen, Xinmin Tian, Milind Girkar, Perry H. Wang, Piyush N. Desai
