Patents by Inventor Mrinmay DUTTA

Mrinmay DUTTA has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240126544
    Abstract: Disclosed embodiments relate to instructions for fused multiply-add (FMA) operations with variable-precision inputs. In one example, a processor to execute an asymmetric FMA instruction includes fetch circuitry to fetch an FMA instruction having fields to specify an opcode, a destination, and first and second source vectors having first and second widths, respectively, decode circuitry to decode the fetched FMA instruction, and a single instruction multiple data (SIMD) execution circuit to process as many elements of the second source vector as fit into an SIMD lane width by multiplying each element by a corresponding element of the first source vector, and accumulating a resulting product with previous contents of the destination, wherein the SIMD lane width is one of 16 bits, 32 bits, and 64 bits, the first width is one of 4 bits and 8 bits, and the second width is one of 1 bit, 2 bits, and 4 bits.
    Type: Application
    Filed: December 28, 2023
    Publication date: April 18, 2024
    Inventors: Dipankar DAS, Naveen K. MELLEMPUDI, Mrinmay DUTTA, Arun KUMAR, Dheevatsa MUDIGERE, Abhisek KUNDU
  • Patent number: 11900107
    Abstract: Disclosed embodiments relate to instructions for fused multiply-add (FMA) operations with variable-precision inputs. In one example, a processor to execute an asymmetric FMA instruction includes fetch circuitry to fetch an FMA instruction having fields to specify an opcode, a destination, and first and second source vectors having first and second widths, respectively, decode circuitry to decode the fetched FMA instruction, and a single instruction multiple data (SIMD) execution circuit to process as many elements of the second source vector as fit into an SIMD lane width by multiplying each element by a corresponding element of the first source vector, and accumulating a resulting product with previous contents of the destination, wherein the SIMD lane width is one of 16 bits, 32 bits, and 64 bits, the first width is one of 4 bits and 8 bits, and the second width is one of 1 bit, 2 bits, and 4 bits.
    Type: Grant
    Filed: March 25, 2022
    Date of Patent: February 13, 2024
    Assignee: Intel Corporation
    Inventors: Dipankar Das, Naveen K. Mellempudi, Mrinmay Dutta, Arun Kumar, Dheevatsa Mudigere, Abhisek Kundu
  • Publication number: 20230195417
    Abstract: One embodiment provides a processor comprising at least one of a first mask to receive a first input operand and a second input operand and to generate a selected portion of an AND of a sum of the first input operand and the second input operand using an AND chain of the first mask in parallel with generation of the sum by an adder; and a second mask to receive the first input operand and the second input operand and to generate the selected portion of an OR of the sum using an OR chain of the second mask in parallel with generation of the sum.
    Type: Application
    Filed: December 22, 2021
    Publication date: June 22, 2023
    Applicant: Intel Corporation
    Inventors: Mrinmay Dutta, Simon Rubanovich, Amit Gradstein, Zeev Sperber
  • Publication number: 20220214877
    Abstract: Disclosed embodiments relate to instructions for fused multiply-add (FMA) operations with variable-precision inputs. In one example, a processor to execute an asymmetric FMA instruction includes fetch circuitry to fetch an FMA instruction having fields to specify an opcode, a destination, and first and second source vectors having first and second widths, respectively, decode circuitry to decode the fetched FMA instruction, and a single instruction multiple data (SIMD) execution circuit to process as many elements of the second source vector as fit into an SIMD lane width by multiplying each element by a corresponding element of the first source vector, and accumulating a resulting product with previous contents of the destination, wherein the SIMD lane width is one of 16 bits, 32 bits, and 64 bits, the first width is one of 4 bits and 8 bits, and the second width is one of 1 bit, 2 bits, and 4 bits.
    Type: Application
    Filed: March 25, 2022
    Publication date: July 7, 2022
    Inventors: Dipankar DAS, Naveen K. MELLEMPUDI, Mrinmay DUTTA, Arun KUMAR, Dheevatsa MUDIGERE, Abhisek KUNDU
  • Patent number: 11321086
    Abstract: Disclosed embodiments relate to instructions for fused multiply-add (FMA) operations with variable-precision inputs. In one example, a processor to execute an asymmetric FMA instruction includes fetch circuitry to fetch an FMA instruction having fields to specify an opcode, a destination, and first and second source vectors having first and second widths, respectively, decode circuitry to decode the fetched FMA instruction, and a single instruction multiple data (SIMD) execution circuit to process as many elements of the second source vector as fit into an SIMD lane width by multiplying each element by a corresponding element of the first source vector, and accumulating a resulting product with previous contents of the destination, wherein the SIMD lane width is one of 16 bits, 32 bits, and 64 bits, the first width is one of 4 bits and 8 bits, and the second width is one of 1 bit, 2 bits, and 4 bits.
    Type: Grant
    Filed: January 6, 2020
    Date of Patent: May 3, 2022
    Assignee: Intel Corporation
    Inventors: Dipankar Das, Naveen K. Mellempudi, Mrinmay Dutta, Arun Kumar, Dheevatsa Mudigere, Abhisek Kundu
  • Publication number: 20220100517
    Abstract: Disclosed embodiments relate to systems and methods to performing instructions structured to compute a plurality of cryptic rounds of the block cipher. In one example, a processor includes fetch and decode circuitry to fetch and decode a single instruction comprising a first field to identify a destination of a first operand, a second field to identify a source of a second operand comprising an input state, a third field to identify a source of a third operand comprising a round key. The processor includes execution circuitry to execute the decoded instruction to compute a plurality of cryptic rounds of the block cipher by performing a round function on data elements of the second operand and the third operand to generate a word.
    Type: Application
    Filed: September 26, 2020
    Publication date: March 31, 2022
    Inventors: Ilya Albrekht, Wajdi Feghali, Regev Shemy, Or Beit Aharon, Mrinmay Dutta, Vinodh Gopal, Vikram B. Suresh
  • Patent number: 11175891
    Abstract: Disclosed embodiments relate to performing floating-point addition with selected rounding. In one example, a processor includes circuitry to decode and execute an instruction specifying locations of first and second floating-point (FP) sources, and an opcode indicating the processor is to: bring the FP sources into alignment by shifting a mantissa of the smaller source FP operand to the right by a difference between their exponents, generating rounding controls based on any bits that escape; simultaneously generate a sum of the FP sources and of the FP sources plus one, the sums having a fuzzy-Jbit format having an additional Jbit into which a carry-out, if any, select one of the sums based on the rounding controls, and generate a result comprising a mantissa-wide number of most-significant bits of the selected sum, starting with the most significant non-zero Jbit.
    Type: Grant
    Filed: March 30, 2019
    Date of Patent: November 16, 2021
    Assignee: Intel Corporation
    Inventors: Simon Rubanovich, Amit Gradstein, Zeev Sperber, Mrinmay Dutta
  • Publication number: 20200310756
    Abstract: Disclosed embodiments relate to performing floating-point addition with selected rounding. In one example, a processor includes circuitry to decode and execute an instruction specifying locations of first and second floating-point (FP) sources, and an opcode indicating the processor is to: bring the FP sources into alignment by shifting a mantissa of the smaller source FP operand to the right by a difference between their exponents, generating rounding controls based on any bits that escape; simultaneously generate a sum of the FP sources and of the FP sources plus one, the sums having a fuzzy-Jbit format having an additional Jbit into which a carry-out, if any, select one of the sums based on the rounding controls, and generate a result comprising a mantissa-wide number of most-significant bits of the selected sum, starting with the most significant non-zero Jbit.
    Type: Application
    Filed: March 30, 2019
    Publication date: October 1, 2020
    Applicant: Intel Corporation
    Inventors: Simon RUBANOVICH, Amit GRADSTEIN, Zeev SPERBER, Mrinmay DUTTA
  • Publication number: 20200257527
    Abstract: Disclosed embodiments relate to instructions for fused multiply-add (FMA) operations with variable-precision inputs. In one example, a processor to execute an asymmetric FMA instruction includes fetch circuitry to fetch an FMA instruction having fields to specify an opcode, a destination, and first and second source vectors having first and second widths, respectively, decode circuitry to decode the fetched FMA instruction, and a single instruction multiple data (SIMD) execution circuit to process as many elements of the second source vector as fit into an SIMD lane width by multiplying each element by a corresponding element of the first source vector, and accumulating a resulting product with previous contents of the destination, wherein the SIMD lane width is one of 16 bits, 32 bits, and 64 bits, the first width is one of 4 bits and 8 bits, and the second width is one of 1 bit, 2 bits, and 4 bits.
    Type: Application
    Filed: January 6, 2020
    Publication date: August 13, 2020
    Applicant: Intel Corporation
    Inventors: Dipankar DAS, Naveen K. MELLEMPUDI, Mrinmay DUTTA, Arun KUMAR, Dheevatsa MUDIGERE, Abhisek KUNDU
  • Patent number: 10528346
    Abstract: Disclosed embodiments relate to instructions for fused multiply-add (FMA) operations with variable-precision inputs. In one example, a processor to execute an asymmetric FMA instruction includes fetch circuitry to fetch an FMA instruction having fields to specify an opcode, a destination, and first and second source vectors having first and second widths, respectively, decode circuitry to decode the fetched FMA instruction, and a single instruction multiple data (SIMD) execution circuit to process as many elements of the second source vector as fit into an SIMD lane width by multiplying each element by a corresponding element of the first source vector, and accumulating a resulting product with previous contents of the destination, wherein the SIMD lane width is one of 16 bits, 32 bits, and 64 bits, the first width is one of 4 bits and 8 bits, and the second width is one of 1 bit, 2 bits, and 4 bits.
    Type: Grant
    Filed: March 29, 2018
    Date of Patent: January 7, 2020
    Assignee: Intel Corporation
    Inventors: Dipankar Das, Naveen K. Mellempudi, Mrinmay Dutta, Arun Kumar, Dheevatsa Mudigere, Abhisek Kundu
  • Publication number: 20190042242
    Abstract: Disclosed embodiments relate to instructions for fused multiply-add (FMA) operations with variable-precision inputs. In one example, a processor to execute an asymmetric FMA instruction includes fetch circuitry to fetch an FMA instruction having fields to specify an opcode, a destination, and first and second source vectors having first and second widths, respectively, decode circuitry to decode the fetched FMA instruction, and a single instruction multiple data (SIMD) execution circuit to process as many elements of the second source vector as fit into an SIMD lane width by multiplying each element by a corresponding element of the first source vector, and accumulating a resulting product with previous contents of the destination, wherein the SIMD lane width is one of 16 bits, 32 bits, and 64 bits, the first width is one of 4 bits and 8 bits, and the second width is one of 1 bit, 2 bits, and 4 bits.
    Type: Application
    Filed: March 29, 2018
    Publication date: February 7, 2019
    Inventors: Dipankar DAS, Naveen K. MELLEPUDI, Mrinmay DUTTA, Arun KUMAR, Dheevatsa MUDIGERE, Abhisek KUNDU