Patents by Inventor Vinodh Gopal

Vinodh Gopal has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Circuitry and methods for low-latency page decompression and compression acceleration

Patent number: 12367164

Abstract: Systems, methods, and apparatuses for low-latency page decompression and compression acceleration are described. In one embodiment, a system on a chip (SoC) includes a hardware processor core, and an accelerator circuit coupled to the hardware processor core, the accelerator circuit comprising a decompressor circuit and a direct memory access circuit to: in response to a first descriptor sent from the hardware processor core, cause the decompressor circuit to decompress compressed data from the direct memory access circuit into decompressed data and store the decompressed data in a buffer in the accelerator circuit, and in response to a second descriptor sent from the hardware processor core separately from the first descriptor, cause the decompressed data to be written from the buffer to memory external to the accelerator circuit by the direct memory access circuit.

Type: Grant

Filed: December 24, 2020

Date of Patent: July 22, 2025

Assignee: Intel Corporation

Inventors: Vinodh Gopal, George Powley
SYSTEMS AND METHODS FOR PERFORMING MATRIX COMPRESS AND DECOMPRESS INSTRUCTIONS

Publication number: 20250199812

Abstract: Disclosed embodiments relate to matrix compress/decompress instructions. In one example, a processor includes fetch circuitry to fetch a compress instruction having a format with fields to specify an opcode and locations of decompressed source and compressed destination matrices, decode circuitry to decode the fetched compress instructions, and execution circuitry, responsive to the decoded compress instruction, to: generate a compressed result according to a compress algorithm by compressing the specified decompressed source matrix by either packing non-zero-valued elements together and storing the matrix position of each non-zero-valued element in a header, or using fewer bits to represent one or more elements and using the header to identify matrix elements being represented by fewer bits; and store the compressed result to the specified compressed destination matrix.

Type: Application

Filed: December 23, 2024

Publication date: June 19, 2025

Inventors: Dan BAUM, Michael ESPIG, James GUILFORD, Wajdi K. FEGHALI, Raanan SADE, Christopher J. HUGHES, Robert VALENTINE, Bret TOLL, Elmoustapha OULD-AHMED-VALL, Mark J. CHARNEY, Vinodh GOPAL, Ronen ZOHAR, Alexander F. HEINECKE
OS-TRANSPARENT MEMORY DECOMPRESSION WITH HARDWARE ACCELERATION

Publication number: 20250190360

Abstract: Methods and apparatus for Operating System (OS)-transparent memory decompression with hardware acceleration. A physical address space for system memory is partitioned into compressed and uncompressed partitions. A core issues a memory Read request and on-chip L1, L2, and a last level cache (LLC) are checked, with misses leading to page table lookups to determine where in system memory the requested data are stored. When stored in the compressed partition, a compressed page table is searched to find the location of the compressed form of the data on a memory device. The compressed data are read from the memory device, decompressed using hardware acceleration and returned to the requesting core without writing the data to the uncompressed partition. Under one approach, a compressed page containing the requested data is decompressed and written to the LLC. When data (e.g.

Type: Application

Filed: February 10, 2025

Publication date: June 12, 2025

Inventors: Stijn EYERMAN, Wim HEIRMAN, Vinodh GOPAL, Wajdi FEGHALI
Instructions and logic to provide SIMD SM4 cryptographic block cipher functionality

Patent number: 12323515

Abstract: Instructions and logic provide for a Single Instruction Multiple Data (SIMD) SM4 round slice operation. Embodiments of an instruction specify a first and a second source data operand set, and substitution function indicators, e.g. in an immediate operand. Embodiments of a processor may include encryption units, responsive to the first instruction, to: perform a slice of SM4-round exchanges on a portion of the first source data operand set with corresponding keys from the second source data operand set in response to a substitution function indicator that indicates a first substitution function, perform a slice of SM4 key generations using another portion of the first source data operand set with corresponding constants from the second source data operand set in response to a substitution function indicator that indicates a second substitution function, and store a set of result elements of the first instruction in a SIMD destination register.

Type: Grant

Filed: December 18, 2023

Date of Patent: June 3, 2025

Assignee: Intel Corporation

Inventors: Sean M. Gulley, Gilbert M. Wolrich, Vinodh Gopal, Kirk S. Yap, Wajdi K. Feghali
Instruction and Micro-Architecture Support for Decompression on Core

Publication number: 20250117329

Abstract: Methods and apparatus relating to an instruction and/or micro-architecture support for decompression on core are described. In an embodiment, decode circuitry decodes a decompression instruction into a first micro operation and a second micro operation. The first micro operation causes one or more load operations to fetch data into one or more cachelines of a cache of a processor core. Decompression Engine (DE) circuitry decompresses the fetched data from the one or more cachelines of the cache of the processor core in response to the second micro operation. Other embodiments are also disclosed and claimed.

Type: Application

Filed: November 14, 2024

Publication date: April 10, 2025

Applicant: Intel Corporation

Inventors: Jayesh Gaur, Adarsh Chauhan, Vinodh Gopal, Vedvyas Shanbhogue, Sreenivas Subramoney, Wajdi Feghali
Efficient encryption in VPN sessions

Patent number: 12255921

Abstract: Methods, apparatus, and software for efficient encryption in virtual private network (VPN) sessions. A VPN link and an auxiliary link (and associated sessions) are established between computing platforms to support end-to-end communication between respective applications running on the platforms. The VPN link may employ a conventional VPN protocol such as TLS or IPsec, while the auxiliary link comprises a NULL encryption VPN tunnel. To transfer data, a determination is made as to whether the data are encrypted or non-encrypted. Encrypted data are transferred over the auxiliary link to avoid re-encryption of the data. Non-encrypted data are transferred over the VPN link. TLS and IPsec VPN agents may be used to assist in setting up the VPN and auxiliary sessions. The techniques avoid double encryption of VPN traffic, while ensuring that various types of traffic transferred between platforms are encrypted.

Type: Grant

Filed: June 16, 2021

Date of Patent: March 18, 2025

Assignee: Intel Corporation

Inventors: Akhilesh S. Thyagaturu, Vinodh Gopal
Verifying compressed stream fused with copy or transform operations

Patent number: 12242851

Abstract: Methods and apparatus relating to verifying a compressed stream fused with copy or transform operation(s) are described. In an embodiment, compression logic circuitry compresses input data and stores the compressed data in a temporary buffer. The compression logic circuitry determines a first checksum value corresponding to the compressed data stored in the temporary buffer. Decompression logic circuitry performs a decompress-verify operation and a copy operation. The decompress-verify operation decompresses the compressed data stored in the temporary buffer to determine a second checksum value corresponding to the decompressed data from the temporary buffer. The copy operation transfers the compressed data from the temporary buffer to a destination buffer in response to a match between the first checksum value and the second checksum value. Other embodiments are also disclosed and claimed.

Type: Grant

Filed: September 9, 2021

Date of Patent: March 4, 2025

Assignee: Intel Corporation

Inventors: Vinodh Gopal, James D. Guilford, Daniel F. Cutter
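
The compress, decompress-verify, and copy flow described in this abstract can be sketched in software. This is only an illustration, not the patented circuitry: the function name `compress_verify_copy` is hypothetical, zlib and CRC-32 stand in for whatever compression format and checksum the hardware uses, and the first checksum is assumed to be computed over the original input that the compressed data represents.

```python
import zlib


def compress_verify_copy(input_data: bytes, destination: bytearray) -> bool:
    """Compress into a temporary buffer, verify it, then copy it out.

    Hypothetical software model: zlib and CRC-32 are stand-ins chosen for
    illustration, not details taken from the patent.
    """
    # Compression step: the compressed stream lands in a temporary buffer.
    temp_buffer = zlib.compress(input_data)
    # First checksum (assumed here to cover the data being compressed).
    first_checksum = zlib.crc32(input_data)

    # Decompress-verify step: decompress the temporary buffer and checksum
    # the result without keeping the decompressed bytes.
    second_checksum = zlib.crc32(zlib.decompress(temp_buffer))

    # Copy step: only a verified stream reaches the destination buffer.
    if first_checksum != second_checksum:
        return False
    destination[:] = temp_buffer
    return True
```

In this model the destination is left untouched when verification fails, which is the point of gating the copy on the checksum match.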
METHOD AND APPARATUS FOR DECOMPRESSION HARDWARE COPY ENGINE WITH EFFICIENT SEQUENCE OVERLAPPING COPY

Publication number: 20250004772

Abstract: Apparatus and method for a decompression hardware copy engine with efficient sequence overlapping copy. For example, one embodiment of an apparatus comprises: a plurality of processing cores, one or more of the plurality of processing cores to execute program code to produce a plurality of literals and sequences from a compressed data stream; and decompression acceleration circuitry to generate a decompressed data stream based on the plurality of literals and sequences, the decompression acceleration circuitry comprising: a sequence pre-processor circuit to process batches of sequences of the plurality of sequences and generate a plurality of copy instructions, the sequence pre-processor circuit to merge multiple copy operations corresponding to multiple sequences into a merged copy instruction; and a copy engine circuit to execute the copy instructions to produce the decompressed data stream.

Type: Application

Filed: June 30, 2023

Publication date: January 2, 2025

Inventors: Kamlesh PILLAI, Vinodh GOPAL, Gurpreet Singh KALSI, Sreenivas SUBRAMONEY, Wajdi K. FEGHALI
Instruction and micro-architecture support for decompression on core

Patent number: 12182018

Abstract: Methods and apparatus relating to an instruction and/or micro-architecture support for decompression on core are described. In an embodiment, decode circuitry decodes a decompression instruction into a first micro operation and a second micro operation. The first micro operation causes one or more load operations to fetch data into one or more cachelines of a cache of a processor core. Decompression Engine (DE) circuitry decompresses the fetched data from the one or more cachelines of the cache of the processor core in response to the second micro operation. Other embodiments are also disclosed and claimed.

Type: Grant

Filed: December 23, 2020

Date of Patent: December 31, 2024

Assignee: Intel Corporation

Inventors: Jayesh Gaur, Adarsh Chauhan, Vinodh Gopal, Vedvyas Shanbhogue, Sreenivas Subramoney, Wajdi Feghali
Process-based multi-key total memory encryption

Patent number: 12175274

Abstract: Systems, methods, and circuitries are disclosed for a per-process memory encryption system. At least one translation lookaside buffer (TLB) is configured to encode key identifiers for keys in one or more bits of either the virtual memory address or the physical address. A process state memory is configured to store a first process key table for a first process that maps key identifiers to unique keys and a second process key table that maps the key identifiers to different unique keys. An active process key table memory is configured to store an active key table. In response to a request for data corresponding to a virtual memory address, the at least one TLB is configured to provide a key identifier for the data to the active process key table to cause the active process key table to return the unique key mapped to the key identifier.

Type: Grant

Filed: August 26, 2022

Date of Patent: December 24, 2024

Assignee: Tahoe Research, Ltd.

Inventors: Wajdi Feghali, Vinodh Gopal, Kirk S. Yap, Sean Gulley, Raghunandan Makaram
Systems and methods for performing matrix compress and decompress instructions

Patent number: 12175246

Abstract: Disclosed embodiments relate to matrix compress/decompress instructions. In one example, a processor includes fetch circuitry to fetch a compress instruction having a format with fields to specify an opcode and locations of decompressed source and compressed destination matrices, decode circuitry to decode the fetched compress instructions, and execution circuitry, responsive to the decoded compress instruction, to: generate a compressed result according to a compress algorithm by compressing the specified decompressed source matrix by either packing non-zero-valued elements together and storing the matrix position of each non-zero-valued element in a header, or using fewer bits to represent one or more elements and using the header to identify matrix elements being represented by fewer bits; and store the compressed result to the specified compressed destination matrix.

Type: Grant

Filed: September 1, 2023

Date of Patent: December 24, 2024

Assignee: Intel Corporation

Inventors: Dan Baum, Michael Espig, James Guilford, Wajdi K. Feghali, Raanan Sade, Christopher J. Hughes, Robert Valentine, Bret Toll, Elmoustapha Ould-Ahmed-Vall, Mark J. Charney, Vinodh Gopal, Ronen Zohar, Alexander F. Heinecke
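
The first compression option named in this abstract, packing non-zero elements together and recording each element's matrix position in a header, can be illustrated with a short software sketch. The function names and the `(row, column)` header layout are assumptions made for illustration; the patent describes processor instructions, and its actual header encoding is not given in the abstract.

```python
from typing import List, Tuple

Matrix = List[List[int]]
Header = List[Tuple[int, int]]


def compress_sparse(matrix: Matrix) -> Tuple[Header, List[int]]:
    """Pack non-zero elements and record each one's (row, col) in a header."""
    header: Header = []
    packed: List[int] = []
    for r, row in enumerate(matrix):
        for c, value in enumerate(row):
            if value != 0:
                header.append((r, c))
                packed.append(value)
    return header, packed


def decompress_sparse(header: Header, packed: List[int],
                      rows: int, cols: int) -> Matrix:
    """Rebuild the dense matrix: zeros everywhere except header positions."""
    out = [[0] * cols for _ in range(rows)]
    for (r, c), value in zip(header, packed):
        out[r][c] = value
    return out
```

For a mostly-zero matrix the packed values plus header are smaller than the dense form; for a dense matrix the header is pure overhead, which is presumably why the abstract also offers the reduced-bit-width alternative.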
TECHNOLOGIES FOR PROCESSING PACKETS ON A NETWORK INTERFACE CONTROLLER WITH HIGH-BANDWIDTH MEMORY CHIPLETS

Publication number: 20240364643

Abstract: Techniques for processing packets on a network interface controller (NIC) with memory chiplets are disclosed. In an illustrative embodiment, a NIC includes a disaggregated memory with several high-bandwidth memory chiplets spread out in various locations on the NIC. The disaggregated nature of the memory can improve latency, throughput, and scalability as well as improve thermal performance by distributing heat generation to different locations on the NIC. In use, ports of the NIC can be configured to identify packets associated with certain flows and direct those packets to queues on the NIC. Direct memory access circuitry can copy the packets from queues on the NIC to queues on the system memory. This chain of copying packets from the port to the system memory creates a kind of virtual circuit, delivering packets directly to applications with low latency.

Type: Application

Filed: April 28, 2023

Publication date: October 31, 2024

Applicant: Intel Corporation

Inventors: Sandeep K. Jain, Akhilesh Thyagaturu, Joshua B. Fryman, Poornima Lalwaney, Vinodh Gopal, Brian Paul Johnson
Compressed cache memory with decompress on fault

Patent number: 12130738

Abstract: An embodiment of an integrated circuit may comprise, coupled to a core, a hardware decompression accelerator, a compressed cache, a processor and communicatively coupled to the hardware decompression accelerator and the compressed cache, and memory and communicatively coupled to the processor, wherein the memory stores microcode instructions which when executed by the processor causes the processor to store a first address to a decompression work descriptor, retrieve a second address where a compressed page is stored in the compressed cache from the decompression work descriptor at the first address in response to an indication of a page fault, and send instructions to the hardware decompression accelerator to decompress the compressed page at the second address. Other embodiments are disclosed and claimed.

Type: Grant

Filed: December 22, 2020

Date of Patent: October 29, 2024

Assignee: Intel Corporation

Inventors: Vedvyas Shanbhogue, Jayesh Gaur, Wajdi K. Feghali, Vinodh Gopal, Utkarsh Kakaiya
Circuitry and methods for low-latency efficient chained decryption and decompression acceleration

Patent number: 12118130

Abstract: Systems, methods, and apparatuses for low-latency page efficient chained decryption and decompression acceleration are described.

Type: Grant

Filed: March 27, 2021

Date of Patent: October 15, 2024

Assignee: Intel Corporation

Inventor: Vinodh Gopal
Error resilient cryptographic units and methods

Patent number: 12081649

Abstract: An apparatus of an aspect includes an encryption unit to receive unencrypted data. The encryption unit is to encrypt the unencrypted data to generate encrypted data. The apparatus also includes circuitry coupled with the encryption unit. The circuitry is to generate a first checksum for a copy of the unencrypted data, generate a second checksum for a copy of the encrypted data, and combine the first and second checksums to generate a first value.

Type: Grant

Filed: September 2, 2020

Date of Patent: September 3, 2024

Assignee: Intel Corporation

Inventors: Vinodh Gopal, Kirk Yap
Flexible compression header and code generation

Patent number: 12074618

Abstract: An embodiment of an integrated circuit may comprise a hardware compressor to compress data, the hardware compressor including circuitry to store input data in a history buffer, compute one or more code tables based on the input data, and compute a compression stream header based on the computed one or more code tables. Other embodiments are disclosed and claimed.

Type: Grant

Filed: December 21, 2020

Date of Patent: August 27, 2024

Assignee: Intel Corporation

Inventors: James Guilford, Vinodh Gopal, Daniel Cutter
Managing state in accelerators

Patent number: 12052370

Abstract: Methods and apparatus for managing state in accelerators. An accelerator performs processing operations on a data chunk relating to a job submitted to the accelerator. During or following processing the data chunk, the accelerator generates state information corresponding to its current state and stores the state information or, optionally, the accelerator state information is obtained and stored by privileged software. In connection with continued processing of the current data chunk or a next job and next data chunk, the accelerator accesses previously stored state information identified by the job and validates the state information was generated by itself, another accelerator, or privileged software. Valid state information is then reloaded to restore the state of the accelerator/process state, and processing continues. The chunk processing, accelerator state store, validation, and restore operations are repeated to process subsequent jobs.

Type: Grant

Filed: May 4, 2020

Date of Patent: July 30, 2024

Assignee: Intel Corporation

Inventor: Vinodh Gopal
Application programming interface for fine grained low latency decompression within processor core

Patent number: 12028094

Abstract: Methods and apparatus relating to an Application Programming Interface (API) for fine grained low latency decompression within a processor core are described. In an embodiment, a decompression Application Programming Interface (API) receives an input handle to a data object. The data object includes compressed data and metadata. Decompression Engine (DE) circuitry decompresses the compressed data to generate uncompressed data. The DE circuitry decompresses the compressed data in response to invocation of a decompression instruction by the decompression API. The metadata comprises a first operand to indicate a location of the compressed data, a second operand to indicate a size of the compressed data, a third operand to indicate a location to which decompressed data by the DE circuitry is to be stored, and a fourth operand to indicate a size of the decompressed data. Other embodiments are also disclosed and claimed.

Type: Grant

Filed: December 23, 2020

Date of Patent: July 2, 2024

Inventors: Jayesh Gaur, Adarsh Chauhan, Vinodh Gopal, Vedvyas Shanbhogue, Sreenivas Subramoney, Wajdi Feghali
Method and apparatus for efficient deflate decompression using content-addressable data structures

Patent number: 12021551

Abstract: Apparatus and method for efficient compression block decoding using content-addressable structure for header processing. For example, one embodiment of an apparatus comprises: a header parser to extract a sequence of tokens and corresponding length values from a header of a compression block, the tokens and corresponding length values associated with a type of compression used to compress a payload of the compression block; and a content-addressable data structure builder to construct a content-addressable data structure based on the tokens and length values, the content-addressable data structure builder to write an entry in the content-addressable data structure comprising a length value and a count value, the count value indicating a number of times the length value was previously written to an entry in the content-addressable data structure.

Type: Grant

Filed: December 23, 2020

Date of Patent: June 25, 2024

Assignee: Intel Corporation

Inventors: James Guilford, Vinodh Gopal, Daniel Cutter, Kirk Yap
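
The entry format described in this abstract, a length value paired with a count of how many times that length was previously written, can be modelled in a few lines. This is a sketch under assumptions: `build_entries` is a hypothetical name, a Python dictionary stands in for the content-addressable lookup, and the input is taken to be a plain list of code lengths already extracted by a header parser.

```python
from typing import Dict, List, Tuple


def build_entries(code_lengths: List[int]) -> List[Tuple[int, int]]:
    """Return one (length, count) entry per parsed length value.

    `count` is the number of earlier entries that carried the same length.
    The dict lookup models a content-addressable match on the length value.
    """
    seen: Dict[int, int] = {}
    entries: List[Tuple[int, int]] = []
    for length in code_lengths:
        count = seen.get(length, 0)   # how many times this length was written
        entries.append((length, count))
        seen[length] = count + 1
    return entries
```

One reason such a count is useful in DEFLATE-style formats: canonical Huffman codes of the same length are assigned consecutive values in symbol order, so a symbol's code can be derived from its length and its position among symbols of that length. Whether the patented structure uses the count this way is not stated in the abstract.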
Apparatus and method for low-latency decompression acceleration via a single job descriptor

Patent number: 11989582

Abstract: Apparatus and method for performing low-latency multi-job submission via a single job descriptor are described herein. An apparatus embodiment includes a plurality of descriptor queues to store job descriptors describing work to be performed and enqueue circuitry to receive a first job descriptor which includes a first field to store a Single Instruction Multiple Data (SIMD) width. If the SIMD width indicates that the first job descriptor is an SIMD job descriptor and open slots are available in the descriptor queues to store new job descriptors, then the enqueue circuitry is to generate a plurality of job descriptors based on fields of the first job descriptor and to store them in the open slots of the descriptor queues. The generated job descriptors are processed by processing pipelines to perform the work described. At least some of the generated job descriptors are processed concurrently or in parallel by different processing pipelines.

Type: Grant

Filed: September 26, 2020

Date of Patent: May 21, 2024

Assignee: Intel Corporation

Inventors: James Guilford, George Powley, Vinodh Gopal, Wajdi Feghali
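
The expansion of one SIMD job descriptor into several ordinary descriptors, as described in this abstract, can be sketched as follows. All field names (`src_offset`, `chunk_size`, `simd_width`) and the rule for deriving each generated descriptor are assumptions for illustration, and a single Python list stands in for the plurality of hardware descriptor queues.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class JobDescriptor:
    src_offset: int      # hypothetical field: where this job's input starts
    chunk_size: int      # hypothetical field: bytes handled per job
    simd_width: int = 1  # 1 means an ordinary, non-SIMD descriptor


def enqueue(descriptor: JobDescriptor, queue: List[JobDescriptor],
            capacity: int) -> bool:
    """Expand a descriptor into `simd_width` jobs if enough slots are open."""
    width = descriptor.simd_width
    if capacity - len(queue) < width:
        return False  # not enough open slots; nothing is enqueued
    for lane in range(width):
        # Each generated job is derived from the fields of the original.
        queue.append(replace(
            descriptor,
            src_offset=descriptor.src_offset + lane * descriptor.chunk_size,
            simd_width=1,
        ))
    return True
```

The generated descriptors are independent of one another, so separate pipelines could process them concurrently, which is the latency benefit the abstract points to.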
