Patents by Inventor Vinodh Gopal
Vinodh Gopal has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20250117329
Abstract: Methods and apparatus relating to an instruction and/or micro-architecture support for decompression on core are described. In an embodiment, decode circuitry decodes a decompression instruction into a first micro operation and a second micro operation. The first micro operation causes one or more load operations to fetch data into one or more cachelines of a cache of a processor core. Decompression Engine (DE) circuitry decompresses the fetched data from the one or more cachelines of the cache of the processor core in response to the second micro operation. Other embodiments are also disclosed and claimed.
Type: Application
Filed: November 14, 2024
Publication date: April 10, 2025
Applicant: Intel Corporation
Inventors: Jayesh Gaur, Adarsh Chauhan, Vinodh Gopal, Vedvyas Shanbhogue, Sreenivas Subramoney, Wajdi Feghali
-
Patent number: 12255921
Abstract: Methods, apparatus, and software for efficient encryption in virtual private network (VPN) sessions. A VPN link and an auxiliary link (and associated sessions) are established between computing platforms to support end-to-end communication between respective applications running on the platforms. The VPN link may employ a conventional VPN protocol such as TLS or IPsec, while the auxiliary link comprises a NULL encryption VPN tunnel. To transfer data, a determination is made as to whether the data are encrypted or non-encrypted. Encrypted data are transferred over the auxiliary link to avoid re-encryption of the data. Non-encrypted data are transferred over the VPN link. TLS and IPsec VPN agents may be used to assist in setting up the VPN and auxiliary sessions. The techniques avoid double encryption of VPN traffic, while ensuring that various types of traffic transferred between platforms are encrypted.
Type: Grant
Filed: June 16, 2021
Date of Patent: March 18, 2025
Assignee: Intel Corporation
Inventors: Akhilesh S. Thyagaturu, Vinodh Gopal
-
Patent number: 12242851
Abstract: Methods and apparatus relating to verifying a compressed stream fused with copy or transform operation(s) are described. In an embodiment, compression logic circuitry compresses input data and stores the compressed data in a temporary buffer. The compression logic circuitry determines a first checksum value corresponding to the compressed data stored in the temporary buffer. Decompression logic circuitry performs a decompress-verify operation and a copy operation. The decompress-verify operation decompresses the compressed data stored in the temporary buffer to determine a second checksum value corresponding to the decompressed data from the temporary buffer. The copy operation transfers the compressed data from the temporary buffer to a destination buffer in response to a match between the first checksum value and the second checksum value. Other embodiments are also disclosed and claimed.
Type: Grant
Filed: September 9, 2021
Date of Patent: March 4, 2025
Assignee: Intel Corporation
Inventors: Vinodh Gopal, James D. Guilford, Daniel F. Cutter
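The compress, decompress-verify, and copy flow in the abstract for patent 12242851 can be illustrated in software. The following is a minimal sketch only, not the patented circuitry: it assumes zlib as the compressor and CRC-32 as the checksum, and it assumes the first checksum is taken over the input data at compression time so that it is comparable with the checksum of the decompressed output. The function names are hypothetical.

```python
import zlib
from typing import Optional, Tuple


def compress_stage(data: bytes) -> Tuple[bytes, int]:
    """Compress into a temporary buffer and record the first checksum."""
    temp_buffer = zlib.compress(data)
    # Assumption: the first checksum covers the original input bytes.
    first_checksum = zlib.crc32(data)
    return temp_buffer, first_checksum


def verify_and_copy(temp_buffer: bytes, first_checksum: int) -> Optional[bytes]:
    """Decompress-verify, then copy the compressed bytes only on a match."""
    try:
        decompressed = zlib.decompress(temp_buffer)
    except zlib.error:
        return None  # stream is not even decodable, so nothing is copied
    second_checksum = zlib.crc32(decompressed)
    if second_checksum != first_checksum:
        return None  # mismatch: the destination buffer is left untouched
    return bytes(temp_buffer)  # the "copy" to the destination buffer
```

In this sketch a caller would treat a `None` result as a signal to retry compression or fall back to storing the data uncompressed; the abstract does not say what happens on a mismatch.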
-
METHOD AND APPARATUS FOR DECOMPRESSION HARDWARE COPY ENGINE WITH EFFICIENT SEQUENCE OVERLAPPING COPY
Publication number: 20250004772
Abstract: Apparatus and method for a decompression hardware copy engine with efficient sequence overlapping copy. For example, one embodiment of an apparatus comprises: a plurality of processing cores, one or more of the plurality of processing cores to execute program code to produce a plurality of literals and sequences from a compressed data stream; and decompression acceleration circuitry to generate a decompressed data stream based on the plurality of literals and sequences, the decompression acceleration circuitry comprising: a sequence pre-processor circuit to process batches of sequences of the plurality of sequences and generate a plurality of copy instructions, the sequence pre-processor circuit to merge multiple copy operations corresponding to multiple sequences into a merged copy instruction; and a copy engine circuit to execute the copy instructions to produce the decompressed data stream.
Type: Application
Filed: June 30, 2023
Publication date: January 2, 2025
Inventors: Kamlesh PILLAI, Vinodh GOPAL, Gurpreet Singh KALSI, Sreenivas SUBRAMONEY, Wajdi K. FEGHALI
-
Patent number: 12182018
Abstract: Methods and apparatus relating to an instruction and/or micro-architecture support for decompression on core are described. In an embodiment, decode circuitry decodes a decompression instruction into a first micro operation and a second micro operation. The first micro operation causes one or more load operations to fetch data into one or more cachelines of a cache of a processor core. Decompression Engine (DE) circuitry decompresses the fetched data from the one or more cachelines of the cache of the processor core in response to the second micro operation. Other embodiments are also disclosed and claimed.
Type: Grant
Filed: December 23, 2020
Date of Patent: December 31, 2024
Assignee: Intel Corporation
Inventors: Jayesh Gaur, Adarsh Chauhan, Vinodh Gopal, Vedvyas Shanbhogue, Sreenivas Subramoney, Wajdi Feghali
-
Patent number: 12175274
Abstract: Systems, methods, and circuitries are disclosed for a per-process memory encryption system. At least one translation lookaside buffer (TLB) is configured to encode key identifiers for keys in one or more bits of either the virtual memory address or the physical address. A process state memory is configured to store a first process key table for a first process that maps key identifiers to unique keys and a second process key table that maps the key identifiers to different unique keys. An active process key table memory is configured to store an active key table. In response to a request for data corresponding to a virtual memory address, the at least one TLB is configured to provide a key identifier for the data to the active process key table to cause the active process key table to return the unique key mapped to the key identifier.
Type: Grant
Filed: August 26, 2022
Date of Patent: December 24, 2024
Assignee: Tahoe Research, Ltd.
Inventors: Wajdi Feghali, Vinodh Gopal, Kirk S. Yap, Sean Gulley, Raghunandan Makaram
-
Patent number: 12175246
Abstract: Disclosed embodiments relate to matrix compress/decompress instructions. In one example, a processor includes fetch circuitry to fetch a compress instruction having a format with fields to specify an opcode and locations of decompressed source and compressed destination matrices, decode circuitry to decode the fetched compress instruction, and execution circuitry, responsive to the decoded compress instruction, to: generate a compressed result according to a compress algorithm by compressing the specified decompressed source matrix by either packing non-zero-valued elements together and storing the matrix position of each non-zero-valued element in a header, or using fewer bits to represent one or more elements and using the header to identify matrix elements being represented by fewer bits; and store the compressed result to the specified compressed destination matrix.
Type: Grant
Filed: September 1, 2023
Date of Patent: December 24, 2024
Assignee: Intel Corporation
Inventors: Dan Baum, Michael Espig, James Guilford, Wajdi K. Feghali, Raanan Sade, Christopher J. Hughes, Robert Valentine, Bret Toll, Elmoustapha Ould-Ahmed-Vall, Mark J. Charney, Vinodh Gopal, Ronen Zohar, Alexander F. Heinecke
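The first compression option in the abstract for patent 12175246 (packing non-zero elements together and recording each element's matrix position in a header) can be sketched in a few lines. This is an illustration under assumptions, not the instruction's actual encoding: the header layout (row count, column count, list of flattened positions) and the function names are hypothetical.

```python
from typing import List, Tuple

Header = Tuple[int, int, List[int]]  # (rows, cols, flattened positions)


def compress_matrix(matrix: List[List[int]]) -> Tuple[Header, List[int]]:
    """Pack non-zero elements together; record their positions in a header."""
    rows, cols = len(matrix), len(matrix[0])
    positions: List[int] = []
    packed: List[int] = []
    for r, row in enumerate(matrix):
        for c, value in enumerate(row):
            if value != 0:
                positions.append(r * cols + c)  # row-major position
                packed.append(value)
    return (rows, cols, positions), packed


def decompress_matrix(header: Header, packed: List[int]) -> List[List[int]]:
    """Rebuild the dense matrix by scattering packed values to their positions."""
    rows, cols, positions = header
    dense = [[0] * cols for _ in range(rows)]
    for pos, value in zip(positions, packed):
        dense[pos // cols][pos % cols] = value
    return dense
```

For a sparse input such as `[[0, 5, 0], [7, 0, 0]]`, only the two non-zero values are stored, plus two header positions.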
-
Publication number: 20240364643
Abstract: Techniques for processing packets on a network interface controller (NIC) with memory chiplets are disclosed. In an illustrative embodiment, a NIC includes a disaggregated memory with several high-bandwidth memory chiplets spread out in various locations on the NIC. The disaggregated nature of the memory can improve latency, throughput, and scalability as well as improve thermal performance by distributing heat generation to different locations on the NIC. In use, ports of the NIC can be configured to identify packets associated with certain flows and direct those packets to queues on the NIC. Direct memory access circuitry can copy the packets from queues on the NIC to queues on the system memory. This chain of copying packets from the port to the system memory creates a kind of virtual circuit, delivering packets directly to applications with low latency.
Type: Application
Filed: April 28, 2023
Publication date: October 31, 2024
Applicant: Intel Corporation
Inventors: Sandeep K. Jain, Akhilesh Thyagaturu, Joshua B. Fryman, Poornima Lalwaney, Vinodh Gopal, Brian Paul Johnson
-
Patent number: 12130738
Abstract: An embodiment of an integrated circuit may comprise, coupled to a core, a hardware decompression accelerator, a compressed cache, a processor communicatively coupled to the hardware decompression accelerator and the compressed cache, and memory communicatively coupled to the processor, wherein the memory stores microcode instructions which when executed by the processor cause the processor to store a first address to a decompression work descriptor, retrieve a second address where a compressed page is stored in the compressed cache from the decompression work descriptor at the first address in response to an indication of a page fault, and send instructions to the hardware decompression accelerator to decompress the compressed page at the second address. Other embodiments are disclosed and claimed.
Type: Grant
Filed: December 22, 2020
Date of Patent: October 29, 2024
Assignee: Intel Corporation
Inventors: Vedvyas Shanbhogue, Jayesh Gaur, Wajdi K. Feghali, Vinodh Gopal, Utkarsh Kakaiya
-
Patent number: 12118130
Abstract: Systems, methods, and apparatuses for low-latency page efficient chained decryption and decompression acceleration are described.
Type: Grant
Filed: March 27, 2021
Date of Patent: October 15, 2024
Assignee: Intel Corporation
Inventor: Vinodh Gopal
-
Patent number: 12081649
Abstract: An apparatus of an aspect includes an encryption unit to receive unencrypted data. The encryption unit is to encrypt the unencrypted data to generate encrypted data. The apparatus also includes circuitry coupled with the encryption unit. The circuitry is to generate a first checksum for a copy of the unencrypted data, generate a second checksum for a copy of the encrypted data, and combine the first and second checksums to generate a first value.
Type: Grant
Filed: September 2, 2020
Date of Patent: September 3, 2024
Assignee: Intel Corporation
Inventors: Vinodh Gopal, Kirk Yap
-
Patent number: 12074618
Abstract: An embodiment of an integrated circuit may comprise a hardware compressor to compress data, the hardware compressor including circuitry to store input data in a history buffer, compute one or more code tables based on the input data, and compute a compression stream header based on the computed one or more code tables. Other embodiments are disclosed and claimed.
Type: Grant
Filed: December 21, 2020
Date of Patent: August 27, 2024
Assignee: Intel Corporation
Inventors: James Guilford, Vinodh Gopal, Daniel Cutter
-
Patent number: 12052370
Abstract: Methods and apparatus for managing state in accelerators. An accelerator performs processing operations on a data chunk relating to a job submitted to the accelerator. During or following processing the data chunk, the accelerator generates state information corresponding to its current state and stores the state information or, optionally, the accelerator state information is obtained and stored by privileged software. In connection with continued processing of the current data chunk or a next job and next data chunk, the accelerator accesses previously stored state information identified by the job and validates that the state information was generated by itself, another accelerator, or privileged software. Valid state information is then reloaded to restore the state of the accelerator/process state, and processing continues. The chunk processing, accelerator state store, validation, and restore operations are repeated to process subsequent jobs.
Type: Grant
Filed: May 4, 2020
Date of Patent: July 30, 2024
Assignee: Intel Corporation
Inventor: Vinodh Gopal
-
Patent number: 12028094
Abstract: Methods and apparatus relating to an Application Programming Interface (API) for fine grained low latency decompression within a processor core are described. In an embodiment, a decompression Application Programming Interface (API) receives an input handle to a data object. The data object includes compressed data and metadata. Decompression Engine (DE) circuitry decompresses the compressed data to generate uncompressed data. The DE circuitry decompresses the compressed data in response to invocation of a decompression instruction by the decompression API. The metadata comprises a first operand to indicate a location of the compressed data, a second operand to indicate a size of the compressed data, a third operand to indicate a location to which decompressed data by the DE circuitry is to be stored, and a fourth operand to indicate a size of the decompressed data. Other embodiments are also disclosed and claimed.
Type: Grant
Filed: December 23, 2020
Date of Patent: July 2, 2024
Inventors: Jayesh Gaur, Adarsh Chauhan, Vinodh Gopal, Vedvyas Shanbhogue, Sreenivas Subramoney, Wajdi Feghali
-
Patent number: 12021551
Abstract: Apparatus and method for efficient compression block decoding using content-addressable structure for header processing. For example, one embodiment of an apparatus comprises: a header parser to extract a sequence of tokens and corresponding length values from a header of a compression block, the tokens and corresponding length values associated with a type of compression used to compress a payload of the compression block; and a content-addressable data structure builder to construct a content-addressable data structure based on the tokens and length values, the content-addressable data structure builder to write an entry in the content-addressable data structure comprising a length value and a count value, the count value indicating a number of times the length value was previously written to an entry in the content-addressable data structure.
Type: Grant
Filed: December 23, 2020
Date of Patent: June 25, 2024
Assignee: Intel Corporation
Inventors: James Guilford, Vinodh Gopal, Daniel Cutter, Kirk Yap
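The entry format described in the abstract for patent 12021551 (a length value plus a count of how many times that length was previously written) can be mimicked in software with a dictionary keyed by the (length, count) pair. This is a rough sketch under assumptions: the token and length inputs are made up, the function name is hypothetical, and a Python dictionary only approximates a hardware content-addressable structure.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def build_cam(header: Iterable[Tuple[str, int]]) -> Dict[Tuple[int, int], str]:
    """Map (length, prior-occurrence count) to the token that produced it."""
    seen = defaultdict(int)  # how many entries were already written per length
    cam: Dict[Tuple[int, int], str] = {}
    for token, length in header:
        count = seen[length]          # times this length was written before
        cam[(length, count)] = token  # entry content: length value and count value
        seen[length] += 1
    return cam


# Hypothetical header contents: (token, code length) pairs in header order.
example = [("a", 2), ("b", 3), ("c", 2), ("d", 3), ("e", 3)]
table = build_cam(example)
```

Looking up `table[(3, 2)]` then returns the third token that was assigned length 3, which is the kind of lookup by content that such a structure supports.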
-
Patent number: 11989582
Abstract: Apparatus and method for performing low-latency multi-job submission via a single job descriptor are described herein. An apparatus embodiment includes a plurality of descriptor queues to store job descriptors describing work to be performed and enqueue circuitry to receive a first job descriptor which includes a first field to store a Single Instruction Multiple Data (SIMD) width. If the SIMD width indicates that the first job descriptor is an SIMD job descriptor and open slots are available in the descriptor queues to store new job descriptors, then the enqueue circuitry is to generate a plurality of job descriptors based on fields of the first job descriptor and to store them in the open slots of the descriptor queues. The generated job descriptors are processed by processing pipelines to perform the work described. At least some of the generated job descriptors are processed concurrently or in parallel by different processing pipelines.
Type: Grant
Filed: September 26, 2020
Date of Patent: May 21, 2024
Assignee: Intel Corporation
Inventors: James Guilford, George Powley, Vinodh Gopal, Wajdi Feghali
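The enqueue behavior in the abstract for patent 11989582 can be modeled as expanding one descriptor into several. The sketch below is illustrative only. The descriptor fields other than the SIMD width, the rule that each generated descriptor advances the source and destination addresses by one job size, and the choice of the emptiest queue are all assumptions; the abstract says only that the generated descriptors are based on fields of the first descriptor.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class JobDescriptor:
    src: int             # hypothetical source address
    dst: int             # hypothetical destination address
    size: int            # hypothetical bytes per job
    simd_width: int = 1  # number of jobs described by this one descriptor


def enqueue(desc: JobDescriptor, queues: List[List[JobDescriptor]], capacity: int) -> bool:
    """Expand a SIMD descriptor into per-job descriptors if enough slots are open."""
    open_slots = sum(capacity - len(q) for q in queues)
    if open_slots < desc.simd_width:
        return False  # not enough open slots; the caller may retry later
    for lane in range(desc.simd_width):
        offset = lane * desc.size  # assumed per-lane address stride
        child = replace(desc, src=desc.src + offset, dst=desc.dst + offset, simd_width=1)
        min(queues, key=len).append(child)  # the emptiest queue always has a slot here
    return True
```

Spreading the generated descriptors across queues is what would let separate pipelines work on them in parallel, as the abstract describes.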
-
Publication number: 20240146521
Abstract: Instructions and logic provide for a Single Instruction Multiple Data (SIMD) SM4 round slice operation. Embodiments of an instruction specify a first and a second source data operand set, and substitution function indicators, e.g. in an immediate operand. Embodiments of a processor may include encryption units, responsive to the first instruction, to: perform a slice of SM4-round exchanges on a portion of the first source data operand set with corresponding keys from the second source data operand set in response to a substitution function indicator that indicates a first substitution function, perform a slice of SM4 key generations using another portion of the first source data operand set with corresponding constants from the second source data operand set in response to a substitution function indicator that indicates a second substitution function, and store a set of result elements of the first instruction in a SIMD destination register.
Type: Application
Filed: December 18, 2023
Publication date: May 2, 2024
Inventors: Sean M. Gulley, Gilbert M. Wolrich, Vinodh Gopal, Kirk S. Yap, Wajdi K. Feghali
-
Patent number: 11955995
Abstract: A lossless data compressor of an aspect includes a first lossless data compressor circuitry coupled to receive input data. The first lossless data compressor circuitry is to apply a first lossless data compression approach to compress the input data to generate intermediate compressed data. The apparatus also includes a second lossless data compressor circuitry coupled with the first lossless data compressor circuitry to receive the intermediate compressed data. The second lossless data compressor circuitry is to apply a second lossless data compression approach to compress at least some of the intermediate compressed data to generate compressed data. The second lossless data compression approach is different from the first lossless data compression approach. Lossless data decompressors are also disclosed, as are methods of lossless data compression and decompression.
Type: Grant
Filed: May 11, 2020
Date of Patent: April 9, 2024
Assignee: Intel Corporation
Inventors: James Guilford, Vinodh Gopal, Daniel Cutter, Kirk Yap, Wajdi Feghali, George Powley
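The two-stage arrangement in the abstract for patent 11955995 can be illustrated by chaining two different lossless methods. The sketch below is only an analogy: the abstract does not name either approach, so a simple run-length encoder is assumed for the first stage and zlib for the second, and the function names are hypothetical.

```python
import zlib


def rle_encode(data: bytes) -> bytes:
    """First stage (assumed): run-length encode as (run length, byte) pairs."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        out += bytes((run, data[i]))
        i += run
    return bytes(out)


def rle_decode(data: bytes) -> bytes:
    """Inverse of rle_encode."""
    out = bytearray()
    for i in range(0, len(data), 2):
        out += bytes([data[i + 1]]) * data[i]
    return bytes(out)


def two_stage_compress(data: bytes) -> bytes:
    intermediate = rle_encode(data)     # first lossless approach
    return zlib.compress(intermediate)  # second, different lossless approach


def two_stage_decompress(blob: bytes) -> bytes:
    return rle_decode(zlib.decompress(blob))  # undo the stages in reverse order
```

Decompression undoes the stages in reverse order, which is why both stages must be lossless for the round trip to be exact.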
-
Patent number: 11956156
Abstract: Methods and apparatus for dynamic offline end-to-end packet processing based on traffic class. An end-to-end connection is set up between an application on a client including a processor and host memory and an application on a remote server. An offline packet buffer is allocated in host memory. While the processor and/or a core on which the client application is executed is in a sleep state, the client is operated in an interrupt-less and polling-less mode as applied to a predetermined traffic class. Under the mode, a Network Interface Controller (NIC) at the client receives network traffic from the remote server and determines whether the network traffic is associated with the predetermined traffic class. When it is, the NIC writes packet data extracted from the network traffic to an offline packet buffer. Descriptors are generated and provided to the NIC to inform the NIC of the location and size of the offline packet buffer.
Type: Grant
Filed: September 10, 2020
Date of Patent: April 9, 2024
Assignee: Intel Corporation
Inventors: Akhilesh S. Thyagaturu, Vinodh Gopal
-
Publication number: 20240113863
Abstract: Methods and apparatus relating to an efficient implementation of ZUC authentication are described. In one embodiment, a processor computes a tag update, based at least in part on stored data, for an authentication operation. The tag update is computed by replacing a ‘for’ loop with a carry-less multiply operation. Other embodiments are also claimed and disclosed.
Type: Application
Filed: March 31, 2023
Publication date: April 4, 2024
Applicant: Intel Corporation
Inventors: Pablo De Lara Guarch, Tomasz Kantecki, Krystian Matusiewicz, Wajdi Feghali, Vinodh Gopal, James D. Guilford
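The idea of replacing a per-bit loop with a carry-less multiply, as mentioned in the abstract for publication 20240113863, can be checked numerically. The sketch below is an assumption-laden illustration, not the claimed implementation: it assumes a ZUC-style integrity tag in which each set message bit i (most significant bit first) XORs the 32-bit keystream window starting at bit i into the tag, and it uses a software `clmul` helper in place of a hardware carry-less multiply instruction such as PCLMULQDQ. All function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def tag_update_loop(msg_word: int, ks64: int) -> int:
    """Reference: one iteration per message bit (32 bits, MSB first)."""
    tag = 0
    for i in range(32):
        if (msg_word >> (31 - i)) & 1:
            tag ^= (ks64 >> (32 - i)) & MASK32  # keystream bits i .. i+31
    return tag


def clmul(a: int, b: int) -> int:
    """Carry-less (polynomial over GF(2)) multiplication of two integers."""
    result = 0
    while a:
        if a & 1:
            result ^= b
        a >>= 1
        b <<= 1
    return result


def bit_reverse32(x: int) -> int:
    """Reverse the bit order of a 32-bit word."""
    return int(f"{x & MASK32:032b}"[::-1], 2)


def tag_update_clmul(msg_word: int, ks64: int) -> int:
    """Same result from one carry-less multiply: take bits 32..63 of the product."""
    return (clmul(bit_reverse32(msg_word), ks64) >> 32) & MASK32
```

The two functions agree because the carry-less product XORs together one shifted copy of the keystream per set message bit, and bits 32 to 63 of each shifted copy are exactly the window that the loop selects.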