Patents by Inventor Sanu K. Mathew

Sanu K. Mathew has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

SIDE-CHANNEL RESISTANT BULK AES ENCRYPTION

Publication number: 20240007267

Abstract: In one example an apparatus comprises a first input node to receive a first plaintext input, a second input node to receive a second plaintext input, a third input node to receive a random mask and an advanced encryption standard (AES) circuitry configurable to operate in one of a first mode in which the random mask is added to the first plaintext input during one or more computations to convert the first plaintext input to a first ciphertext output, or a second mode in which the first plaintext input is converted to a first ciphertext output and the second plaintext input is converted to a second ciphertext output without using the random mask. Other examples may be described.

Type: Application

Filed: June 30, 2022

Publication date: January 4, 2024

Applicant: Intel Corporation

Inventors: Raghavan Kumar, Sanu K. Mathew
RECONFIGURABLE SIDE-CHANNEL RESISTANT DOUBLE-THROUGHPUT AES ACCELERATOR

Publication number: 20240007266

Abstract: In one example an apparatus comprises a first input node to receive a first plaintext input, a second input node to receive a random mask, an advanced encryption standard (AES) engine configurable to operate in one of a first mode in which the random mask is added to the first plaintext input during one or more computations performed by the AES engine, or second mode in which the random mask is not added to the first plaintext input during one or more computations performed by the AES engine. Other examples may be described.

Type: Application

Filed: June 30, 2022

Publication date: January 4, 2024

Applicant: Intel Corporation

Inventors: Raghavan Kumar, Vikram B. Suresh, Sanu K. Mathew
Instructions and logic to perform floating point and integer operations for machine learning

Patent number: 11720355

Abstract: One embodiment provides a graphics processor comprising a memory controller and a graphics processing resource coupled with the memory controller. The graphics processing resource includes circuitry configured to execute an instruction to perform a matrix operation on first input including weight data and second input including input activation data, generate intermediate data based on a result of the matrix operation, quantize the intermediate data to a floating-point format determined based on a statistical distribution of first output data, and output, as second output data, quantized intermediate data in a determined floating-point format.

Type: Grant

Filed: June 7, 2022

Date of Patent: August 8, 2023

Assignee: Intel Corporation

Inventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar
AUTOMATIC ON-DIE FREQUENCY TUNING USING TUNABLE REPLICA CIRCUITS

Publication number: 20230195200

Abstract: Embodiments herein relate to optimizing the operation of multiple integrated circuits (ICs) operating in parallel. In one aspect, the ICs are arranged in a voltage-stacked configuration, and an operating frequency of each IC is controlled using a tunable replica circuit to stabilize its voltage drop. The tunable replica circuit mimics a critical path on the IC. In another aspect, an IC is divided into top and bottom portions which are in respective voltage domains on a substrate. The substrate include a deep n-well region for the higher voltage domain. In another aspect, a physically unclonable function (PUF) is used to generate identifiers for each IC among a multiple ICs on a board. Entropy sources of the PUF generate bits of the identifiers. Unstable entropy sources are identified and their bits are masked out.

Type: Application

Filed: June 3, 2022

Publication date: June 22, 2023

Inventors: Vikram B. Suresh, Sanu K. Mathew, Christopher Schaef, Chandra S. Katta, Long Sheng, Chin S. Park, Srinivasan Rajagopalan, Raju Rakha
Methods and apparatus to parallelize data decompression

Patent number: 11595055

Abstract: Methods and apparatus to parallelize data decompression are disclosed. An example method selecting initial starting positions in a compressed data bitstream; adjusting a first one of the initial starting positions to determine a first adjusted starting position by decoding the bitstream starting at a training position in the bitstream, the decoding including traversing the bitstream from the training position as though first data located at the training position is a valid token; outputting first decoded data generated by decoding a first segment of the bitstream starting from the first adjusted starting position; and merging the first decoded data with second decoded data generated by decoding a second segment of the bitstream, the decoding of the second segment starting from a second position in the bitstream and being performed in parallel with the decoding of the first segment, and the second segment preceding the first segment in the bitstream.

Type: Grant

Filed: January 27, 2022

Date of Patent: February 28, 2023

Assignee: Intel Corporation

Inventors: Vinodh Gopal, James D. Guilford, Sudhir K. Satpathy, Sanu K. Mathew
INSTRUCTIONS AND LOGIC TO PERFORM FLOATING POINT AND INTEGER OPERATIONS FOR MACHINE LEARNING

Publication number: 20230046506

Abstract: One embodiment provides for a graphics processing unit to accelerate machine-learning operations, the graphics processing unit comprising a multiprocessor having a single instruction, multiple thread (SIMT) architecture, the multiprocessor to execute at least one single instruction; and a first compute unit included within the multiprocessor, the at least one single instruction to cause the first compute unit to perform a two-dimensional matrix multiply and accumulate operation, wherein to perform the two-dimensional matrix multiply and accumulate operation includes to compute an intermediate product of 16-bit operands and to compute a 32-bit sum based on the intermediate product.

Type: Application

Filed: October 17, 2022

Publication date: February 16, 2023

Applicant: Intel Corporation

Inventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar
INSTRUCTIONS AND LOGIC TO PERFORM FLOATING POINT AND INTEGER OPERATIONS FOR MACHINE LEARNING

Publication number: 20220357945

Abstract: One embodiment provides a graphics processor comprising a memory controller and a graphics processing resource coupled with the memory controller. The graphics processing resource includes circuitry configured to execute an instruction to perform a matrix operation on first input including weight data and second input including input activation data, generate intermediate data based on a result of the matrix operation, quantize the intermediate data to a floating-point format determined based on a statistical distribution of first output data, and output, as second output data, quantized intermediate data in a determined floating-point format.

Type: Application

Filed: June 7, 2022

Publication date: November 10, 2022

Applicant: Intel Corporation

Inventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar
Method and apparatus to provide memory based physically unclonable functions

Patent number: 11483167

Abstract: Physically unclonable functions response in memory cells is improved by transistor sizing, transistor threshold voltage (VT) and body bias in the memory cell to improve the reproducibility of the memory cell and multiple Sense Amplifiers (SA) per column to further enhance physically unclonable function entropy. A physically unclonable function exploits a large number of read-sequence-order combinations available in a physically unclonable function memory array to generate an exponentially large challenge-response pair space, without incurring the area and energy costs of hosting and operating an exponentially large memory array.

Type: Grant

Filed: June 20, 2019

Date of Patent: October 25, 2022

Assignee: Intel Corporation

Inventors: Vikram B. Suresh, Manoj Sachdev, Sanu K. Mathew, Sudhir K. Satpathy
METHODS AND APPARATUS TO PARALLELIZE DATA DECOMPRESSION

Publication number: 20220224353

Abstract: Methods and apparatus to parallelize data decompression are disclosed. An example method selecting initial starting positions in a compressed data bitstream; adjusting a first one of the initial starting positions to determine a first adjusted starting position by decoding the bitstream starting at a training position in the bitstream, the decoding including traversing the bitstream from the training position as though first data located at the training position is a valid token; outputting first decoded data generated by decoding a first segment of the bitstream starting from the first adjusted starting position; and merging the first decoded data with second decoded data generated by decoding a second segment of the bitstream, the decoding of the second segment starting from a second position in the bitstream and being performed in parallel with the decoding of the first segment, and the second segment preceding the first segment in the bitstream.

Type: Application

Filed: January 27, 2022

Publication date: July 14, 2022

Inventors: Vinodh Gopal, James D. Guilford, Sudhir K. Satpathy, Sanu K. Mathew
Instructions and logic to perform floating point and integer operations for machine learning

Patent number: 11360767

Abstract: A processing apparatus is provided comprising a multiprocessor having a multithreaded architecture. The multiprocessor can execute at least one single instruction to perform parallel mixed precision matrix operations. In one embodiment the apparatus includes a memory interface and an array of multiprocessors coupled to the memory interface. At least one multiprocessor in the array of multiprocessors is configured to execute a fused multiply-add instruction in parallel across multiple threads.

Type: Grant

Filed: July 6, 2021

Date of Patent: June 14, 2022

Assignee: Intel Corporation

Inventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar
Methods and apparatus to parallelize data decompression

Patent number: 11258459

Abstract: Methods and apparatus to parallelize data decompression are disclosed. An example method selecting initial starting positions in a compressed data bitstream; adjusting a first one of the initial starting positions to determine a first adjusted starting position by decoding the bitstream starting at a training position in the bitstream, the decoding including traversing the bitstream from the training position as though first data located at the training position is a valid token; outputting first decoded data generated by decoding a first segment of the bitstream starting from the first adjusted starting position; and merging the first decoded data with second decoded data generated by decoding a second segment of the bitstream, the decoding of the second segment starting from a second position in the bitstream and being performed in parallel with the decoding of the first segment, and the second segment preceding the first segment in the bitstream.

Type: Grant

Filed: August 18, 2020

Date of Patent: February 22, 2022

Assignee: INTEL CORPORATION

Inventors: Vinodh Gopal, James D. Guilford, Sudhir K. Satpathy, Sanu K. Mathew
INSTRUCTIONS AND LOGIC TO PERFORM FLOATING POINT AND INTEGER OPERATIONS FOR MACHINE LEARNING

Publication number: 20220019431

Abstract: A processing apparatus is provided comprising a multiprocessor having a multithreaded architecture. The multiprocessor can execute at least one single instruction to perform parallel mixed precision matrix operations. In one embodiment the apparatus includes a memory interface and an array of multiprocessors coupled to the memory interface. At least one multiprocessor in the array of multiprocessors is configured to execute a fused multiply-add instruction in parallel across multiple threads.

Type: Application

Filed: July 6, 2021

Publication date: January 20, 2022

Applicant: Intel Corporation

Inventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar
Instructions and logic to perform floating-point and integer operations for machine learning

Patent number: 11169799

Abstract: One embodiment provides for a graphics processing unit to accelerate machine-learning operations, the graphics processing unit comprising a multiprocessor having a single instruction, multiple thread (SIMT) architecture, the multiprocessor to execute at least one single instruction; and a first compute unit included within the multiprocessor, the at least one single instruction to cause the first compute unit to perform a two-dimensional matrix multiply and accumulate operation, wherein to perform the two-dimensional matrix multiply and accumulate operation includes to compute a 32-bit intermediate product of 16-bit operands and to compute a 32-bit sum based on the 32-bit intermediate product.

Type: Grant

Filed: June 5, 2019

Date of Patent: November 9, 2021

Assignee: Intel Corporation

Inventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar
Method and apparatus for energy efficient decompression using ordered tokens

Patent number: 11126663

Abstract: In one embodiment, an apparatus comprises a decompression engine to determine a plurality of tokens used to encode a block of data; populate a lookup table with at least two of the tokens in order of increasing token length; disable a first portion of the lookup table and enable a second portion of the lookup table based on a value of a payload of the block of data; and search for a match between a token and the payload in the second portion of the lookup table.

Type: Grant

Filed: May 25, 2017

Date of Patent: September 21, 2021

Assignee: Intel Corporation

Inventors: Sudhir K. Satpathy, Vikram B. Suresh, Sanu K. Mathew, Vinodh Gopal
Instructions and logic to perform floating point and integer operations for machine learning

Patent number: 11080046

Abstract: A processing apparatus is provided comprising a multiprocessor having a multithreaded architecture. The multiprocessor can execute at least one single instruction to perform parallel mixed precision matrix operations. In one embodiment the apparatus includes a memory interface and an array of multiprocessors coupled to the memory interface. At least one multiprocessor in the array of multiprocessors is configured to execute a fused multiply-add instruction in parallel across multiple threads.

Type: Grant

Filed: February 5, 2021

Date of Patent: August 3, 2021

Assignee: Intel Corporation

Inventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar
METHODS AND APPARATUS TO PARALLELIZE DATA DECOMPRESSION

Publication number: 20210211139

Abstract: Methods and apparatus to parallelize data decompression are disclosed. An example method selecting initial starting positions in a compressed data bitstream; adjusting a first one of the initial starting positions to determine a first adjusted starting position by decoding the bitstream starting at a training position in the bitstream, the decoding including traversing the bitstream from the training position as though first data located at the training position is a valid token; outputting first decoded data generated by decoding a first segment of the bitstream starting from the first adjusted starting position; and merging the first decoded data with second decoded data generated by decoding a second segment of the bitstream, the decoding of the second segment starting from a second position in the bitstream and being performed in parallel with the decoding of the first segment, and the second segment preceding the first segment in the bitstream.

Type: Application

Filed: August 18, 2020

Publication date: July 8, 2021

Inventors: Vinodh Gopal, James D. Guilford, Sudhir K. Satpathy, Sanu K. Mathew
INSTRUCTIONS AND LOGIC TO PERFORM FLOATING POINT AND INTEGER OPERATIONS FOR MACHINE LEARNING

Publication number: 20210182058

Abstract: A processing apparatus is provided comprising a multiprocessor having a multithreaded architecture. The multiprocessor can execute at least one single instruction to perform parallel mixed precision matrix operations. In one embodiment the apparatus includes a memory interface and an array of multiprocessors coupled to the memory interface. At least one multiprocessor in the array of multiprocessors is configured to execute a fused multiply-add instruction in parallel across multiple threads.

Type: Application

Filed: February 5, 2021

Publication date: June 17, 2021

Applicant: Intel Corporation

Inventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar
INSTRUCTIONS AND LOGIC TO PERFORM FLOATING POINT AND INTEGER OPERATIONS FOR MACHINE LEARNING

Publication number: 20210124579

Abstract: One embodiment provides for a graphics processing unit to accelerate machine-learning operations, the graphics processing unit comprising a multiprocessor having a single instruction, multiple thread (SIMT) architecture, the multiprocessor to execute at least one single instruction; and a first compute unit included within the multiprocessor, the at least one single instruction to cause the first compute unit to perform a two-dimensional matrix multiply and accumulate operation, wherein to perform the two-dimensional matrix multiply and accumulate operation includes to compute a 32-bit intermediate product of 16-bit operands and to compute a 32-bit sum based on the 32-bit intermediate product.

Type: Application

Filed: December 9, 2020

Publication date: April 29, 2021

Applicant: Intel Corporation

Inventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar
TECHNOLOGIES FOR MEMORY AND I/O EFFICIENT OPERATIONS ON HOMOMORPHICALLY ENCRYPTED DATA

Publication number: 20210119766

Abstract: Technologies for memory and I/O efficient operations on homomorphically encrypted data are disclosed. In the illustrative embodiment, a cloud compute device is to perform operations on homomorphically encrypted data. In order to reduce memory storage space and network and I/O bandwidth, ciphertext blocks can be manipulated as data structures, allowing operands for operations on a compute engine to be created on the fly as the compute engine is performing other operations, using orders of magnitude less storage space and bandwidth.

Type: Application

Filed: December 24, 2020

Publication date: April 22, 2021

Inventors: Vikram B. Suresh, Rosario Cammarota, Sanu K. Mathew, Zeshan A. Chishti, Raghavan Kumar, Rafael Misoczki
Power side-channel attack resistant advanced encryption standard accelerator processor

Patent number: 10985903

Abstract: A processing system includes a processing core and a hardware accelerator communicatively coupled to the processing core. The hardware accelerator includes a random number generator to generate a byte order indicator. The hardware accelerator also includes a first switching module communicatively coupled to the random value indicator generator. The switching module receives an byte sequence in an encryption round of the cryptographic operation and feeds a portion of the input byte sequence to one of a first substitute box (S-box) module or a second S-box module in view of a byte order indicator value generated by the random number generator.

Type: Grant

Filed: October 12, 2018

Date of Patent: April 20, 2021

Assignee: Intel Corporation

Inventors: Raghavan Kumar, Sanu K. Mathew, Sudhir K. Satpathy, Vikram B. Suresh

1 2 3 4 5 … next