Patents by Inventor Vinodh Gopal

Vinodh Gopal has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10778425
    Abstract: Instructions and logic provide for a Single Instruction Multiple Data (SIMD) SM4 round slice operation. Embodiments of an instruction specify a first and a second source data operand set, and substitution function indicators, e.g. in an immediate operand. Embodiments of a processor may include encryption units, responsive to the first instruction, to: perform a slice of SM4-round exchanges on a portion of the first source data operand set with corresponding keys from the second source data operand set in response to a substitution function indicator that indicates a first substitution function, perform a slice of SM4 key generations using another portion of the first source data operand set with corresponding constants from the second source data operand set in response to a substitution function indicator that indicates a second substitution function, and store a set of result elements of the first instruction in a SIMD destination register.
    Type: Grant
    Filed: December 17, 2018
    Date of Patent: September 15, 2020
    Assignee: Intel Corporation
    Inventors: Sean M. Gulley, Gilbert M. Wolrich, Vinodh Gopal, Kirk S. Yap, Wajdi K. Feghali
  • Publication number: 20200280432
    Abstract: A method of an aspect includes receiving an instruction. The instruction indicates a first source of a first packed data including state data elements ai, bi, ei, and fi for a current round (i) of a secure hash algorithm 2 (SHA2) hash algorithm. The instruction indicates a second source of a second packed data. The first packed data has a width in bits that is less than a combined width in bits of eight state data elements ai, bi, ci, di, ei, fi, gi, hi of the SHA2 hash algorithm. The method also includes storing a result in a destination indicated by the instruction in response to the instruction. The result includes updated state data elements ai+, bi+, ei+, and fi+ that have been updated from the corresponding state data elements ai, bi, ei, and fi by at least one round of the SHA2 hash algorithm.
    Type: Application
    Filed: March 2, 2020
    Publication date: September 3, 2020
    Inventors: Gilbert M. Wolrich, Kirk S. Yap, Vinodh Gopal, James D. Guilford
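
The round update described in the abstract above can be illustrated in software. The following is a minimal sketch, assuming SHA-256 as the SHA-2 variant; the function names are hypothetical and nothing here reflects how the claimed instruction is actually encoded or executed. It shows that one round computes new a and e words and shifts the remaining six words along, so after two rounds the c, d, g, h words are simply the earlier a, b, e, f words, which may be why a result holding only the updated a, b, e, f elements is sufficient.

```python
import hashlib
import math

MASK = 0xFFFFFFFF  # SHA-256 works on 32-bit words


def rotr(x, n):
    """Rotate a 32-bit word right by n bits."""
    return ((x >> n) | (x << (32 - n))) & MASK


def first_primes(count):
    """Return the first `count` primes by trial division."""
    primes, n = [], 2
    while len(primes) < count:
        if all(n % p for p in primes):
            primes.append(n)
        n += 1
    return primes


def icbrt(n):
    """Integer cube root (floor) by binary search."""
    lo, hi = 0, 1 << ((n.bit_length() + 2) // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo


PRIMES = first_primes(64)
# Round constants: fractional bits of the cube roots of the first 64 primes.
K = [icbrt(p << 96) & MASK for p in PRIMES]
# Initial state: fractional bits of the square roots of the first 8 primes.
H0 = [math.isqrt(p << 64) & MASK for p in PRIMES[:8]]


def sha256_round(state, k, w):
    """Apply one SHA-256 round to the eight-word state (a..h)."""
    a, b, c, d, e, f, g, h = state
    s1 = rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25)
    ch = (e & f) ^ (~e & g)
    t1 = (h + s1 + ch + k + w) & MASK
    s0 = rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22)
    maj = (a & b) ^ (a & c) ^ (b & c)
    t2 = (s0 + maj) & MASK
    # Only a and e are newly computed; the other words shift by one position.
    return ((t1 + t2) & MASK, a, b, c, (d + t1) & MASK, e, f, g)


def sha256_single_block(msg):
    """Hash a message of at most 55 bytes (one padded block)."""
    if len(msg) > 55:
        raise ValueError("this sketch handles a single block only")
    block = msg + b"\x80" + b"\x00" * (55 - len(msg)) + (8 * len(msg)).to_bytes(8, "big")
    w = [int.from_bytes(block[i:i + 4], "big") for i in range(0, 64, 4)]
    for i in range(16, 64):
        s0 = rotr(w[i - 15], 7) ^ rotr(w[i - 15], 18) ^ (w[i - 15] >> 3)
        s1 = rotr(w[i - 2], 17) ^ rotr(w[i - 2], 19) ^ (w[i - 2] >> 10)
        w.append((w[i - 16] + s0 + w[i - 7] + s1) & MASK)
    state = tuple(H0)
    for i in range(64):
        state = sha256_round(state, K[i], w[i])
    return b"".join(((x + y) & MASK).to_bytes(4, "big") for x, y in zip(H0, state))


def check_against_hashlib(msg):
    """Compare the sketch with Python's built-in SHA-256."""
    return sha256_single_block(msg) == hashlib.sha256(msg).digest()
```

Calling `check_against_hashlib(b"abc")` compares the sketch with the standard library implementation for a short message.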
  • Patent number: 10763894
    Abstract: Methods and apparatus to parallelize data decompression are disclosed. An example method includes selecting initial starting positions in a compressed data bitstream; adjusting a first one of the initial starting positions to determine a first adjusted starting position by decoding the bitstream starting at a training position in the bitstream, the decoding including traversing the bitstream from the training position as though first data located at the training position is a valid token; outputting first decoded data generated by decoding a first segment of the bitstream starting from the first adjusted starting position; and merging the first decoded data with second decoded data generated by decoding a second segment of the bitstream, the decoding of the second segment starting from a second position in the bitstream and being performed in parallel with the decoding of the first segment, and the second segment preceding the first segment in the bitstream.
    Type: Grant
    Filed: May 3, 2019
    Date of Patent: September 1, 2020
    Assignee: Intel Corporation
    Inventors: Vinodh Gopal, James D. Guilford, Sudhir K. Satpathy, Sanu K. Mathew
  • Publication number: 20200266995
    Abstract: Methods and apparatus for managing state in accelerators. An accelerator performs processing operations on a data chunk relating to a job submitted to the accelerator. During or following processing the data chunk, the accelerator generates state information corresponding to its current state and stores the state information or, optionally, the accelerator state information is obtained and stored by privileged software. In connection with continued processing of the current data chunk or a next job and next data chunk, the accelerator accesses previously stored state information identified by the job and validates the state information was generated by itself, another accelerator, or privileged software. Valid state information is then reloaded to restore the state of the accelerator/process state, and processing continues. The chunk processing, accelerator state store, validation, and restore operations are repeated to process subsequent jobs.
    Type: Application
    Filed: May 4, 2020
    Publication date: August 20, 2020
    Inventor: Vinodh Gopal
  • Patent number: 10725779
    Abstract: A processor includes an instruction decoder to receive a first instruction to process a secure hash algorithm 2 (SHA-2) hash algorithm, the first instruction having a first operand associated with a first storage location to store a SHA-2 state and a second operand associated with a second storage location to store a plurality of messages and round constants. The processor further includes an execution unit coupled to the instruction decoder to perform one or more iterations of the SHA-2 hash algorithm on the SHA-2 state specified by the first operand and the plurality of messages and round constants specified by the second operand, in response to the first instruction.
    Type: Grant
    Filed: June 24, 2019
    Date of Patent: July 28, 2020
    Assignee: Intel Corporation
    Inventors: Kirk S. Yap, Gilbert M. Wolrich, James D. Guilford, Vinodh Gopal, Erdinc Ozturk, Sean M. Gulley, Wajdi K. Feghali, Martin G. Dixon
  • Patent number: 10719323
    Abstract: Disclosed embodiments relate to matrix compress/decompress instructions. In one example, a processor includes fetch circuitry to fetch a compress instruction having a format with fields to specify an opcode and locations of decompressed source and compressed destination matrices, decode circuitry to decode the fetched compress instructions, and execution circuitry, responsive to the decoded compress instruction, to: generate a compressed result according to a compress algorithm by compressing the specified decompressed source matrix by either packing non-zero-valued elements together and storing the matrix position of each non-zero-valued element in a header, or using fewer bits to represent one or more elements and using the header to identify matrix elements being represented by fewer bits; and store the compressed result to the specified compressed destination matrix.
    Type: Grant
    Filed: September 27, 2018
    Date of Patent: July 21, 2020
    Assignee: Intel Corporation
    Inventors: Dan Baum, Michael Espig, James Guilford, Wajdi K. Feghali, Raanan Sade, Christopher J. Hughes, Robert Valentine, Bret Toll, Elmoustapha Ould-Ahmed-Vall, Mark J. Charney, Vinodh Gopal, Ronen Zohar, Alexander F. Heinecke
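
The first compression option named in the abstract above (packing non-zero-valued elements together and recording each element's matrix position in a header) can be sketched in a few lines. This is an illustrative software model only: the dictionary layout and the function names `compress_matrix` and `decompress_matrix` are assumptions made for this example, not the format or the instructions described in the patent.

```python
def compress_matrix(matrix):
    """Pack non-zero elements together; the header stores each one's (row, col)."""
    rows = len(matrix)
    cols = len(matrix[0]) if rows else 0
    header, values = [], []
    for r, row in enumerate(matrix):
        for c, value in enumerate(row):
            if value != 0:
                header.append((r, c))   # position of the non-zero element
                values.append(value)    # the element itself, packed densely
    return {"shape": (rows, cols), "header": header, "values": values}


def decompress_matrix(packed):
    """Rebuild the full matrix, filling every unlisted position with zero."""
    rows, cols = packed["shape"]
    matrix = [[0] * cols for _ in range(rows)]
    for (r, c), value in zip(packed["header"], packed["values"]):
        matrix[r][c] = value
    return matrix
```

A round trip such as `decompress_matrix(compress_matrix(m))` should return a matrix equal to `m`; the saving comes from storing only the non-zero values plus their positions when the matrix is sparse.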
  • Patent number: 10705842
    Abstract: Methods and apparatuses relating to high-performance authenticated encryption are described.
    Type: Grant
    Filed: April 2, 2018
    Date of Patent: July 7, 2020
    Assignee: Intel Corporation
    Inventors: Vikram Suresh, Sanu Mathew, Sudhir Satpathy, Vinodh Gopal
  • Patent number: 10691458
    Abstract: A processor includes a plurality of registers, an instruction decoder to receive an instruction to process a KECCAK state cube of data representing a KECCAK state of a KECCAK hash algorithm, to partition the KECCAK state cube into a plurality of subcubes, and to store the subcubes in the plurality of registers, respectively, and an execution unit coupled to the instruction decoder to perform the KECCAK hash algorithm on the plurality of subcubes respectively stored in the plurality of registers in a vector manner.
    Type: Grant
    Filed: September 26, 2017
    Date of Patent: June 23, 2020
    Assignee: Intel Corporation
    Inventors: Kirk S. Yap, Gilbert Wolrich, James D. Guilford, Vinodh Gopal, Erdinc Ozturk, Sean M. Gulley, Wajdi K. Feghali, Martin G. Dixon
  • Patent number: 10691529
    Abstract: A processing device comprising compression circuitry to: determine a compression configuration to compress source data; generate a checksum of the source data in an uncompressed state; compress the source data into at least one block based on the compression configuration, wherein the at least one block comprises: a plurality of sub-blocks, wherein the plurality of sub-blocks includes a predetermined size; a block header corresponding to the plurality of sub-blocks; and decompression circuitry coupled to the compression circuitry, wherein the decompression circuitry is to: while not outputting a decompressed data stream of the source data: generate index information corresponding to the plurality of sub-blocks; in response to generating the index information, generate a checksum of the compressed source data associated with the plurality of sub-blocks; and determine whether the checksum of the source data in the uncompressed format matches the checksum of the compressed source data.
    Type: Grant
    Filed: June 20, 2018
    Date of Patent: June 23, 2020
    Assignee: Intel Corporation
    Inventors: Vinodh Gopal, James Guilford, Daniel Cutter, Kirk Yap
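
The verification idea in the abstract above (checksum the source before compression, then decompress without emitting the output and compare checksums) can be approximated with Python's standard `zlib` module. This is a rough software analogy under stated assumptions: zlib's DEFLATE stream and CRC-32 stand in for whatever format and checksum the patent uses, the sub-block index is not modeled, and the function names are hypothetical.

```python
import zlib


def compress_with_checksum(source):
    """Return the compressed bytes and a CRC-32 of the uncompressed source."""
    return zlib.compress(source), zlib.crc32(source)


def verify_without_output(compressed, expected_crc, chunk=4096):
    """Decompress in bounded chunks, keeping only a running CRC-32.

    The decompressed bytes are discarded as soon as they are checksummed,
    so no full decompressed stream is ever assembled or returned.
    """
    decompressor = zlib.decompressobj()
    crc = 0
    pending = compressed
    while pending:
        piece = decompressor.decompress(pending, chunk)  # at most `chunk` bytes out
        crc = zlib.crc32(piece, crc)
        pending = decompressor.unconsumed_tail           # input not yet consumed
    crc = zlib.crc32(decompressor.flush(), crc)          # any remaining output
    return crc == expected_crc
```

For example, `verify_without_output(*compress_with_checksum(data))` checks a freshly compressed buffer against the checksum taken before compression.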
  • Patent number: 10694217
    Abstract: A processing device includes compression circuitry to encode an input stream with an encoding that translates multiple symbols of fixed length into multiple codes of variable length between one and a maximum length, to generate a compressed stream. The compression circuitry is to: determine at least a first symbol of the multiple symbols having a first code that exceeds the maximum length; identify a short code of the multiple codes that is to be lengthened to provide an increased encoding capacity for the at least the first symbol; generate multiple code-length converted values including to increase the length of the short code to the maximum length and decrease, to the maximum length, a length of the first code of the at least the first symbol; and generate, with use of the set of code-length converted values, the compressed stream at the output terminal.
    Type: Grant
    Filed: September 21, 2018
    Date of Patent: June 23, 2020
    Assignee: Intel Corporation
    Inventors: Sudhir K. Satpathy, Vinodh Gopal, James D. Guilford, Sanu K. Mathew, Vikram B. Suresh
  • Patent number: 10686591
    Abstract: Instructions and logic provide SIMD secure hashing round slice functionality. Some embodiments include a processor comprising: a decode stage to decode an instruction for a SIMD secure hashing algorithm round slice, the instruction specifying a source data operand set, a message-plus-constant operand set, a round-slice portion of the secure hashing algorithm round, and a rotator set portion of rotate settings. Processor execution units are responsive to the decoded instruction to perform a secure hashing round-slice set of round iterations upon the source data operand set, applying the message-plus-constant operand set and the rotator set, and store a result of the instruction in a SIMD destination register. One embodiment of the instruction specifies a hash round type as one of four MD5 round types. Other embodiments may specify a hash round type by an immediate operand as one of three SHA-1 round types or as a SHA-2 round type.
    Type: Grant
    Filed: December 3, 2018
    Date of Patent: June 16, 2020
    Assignee: Intel Corporation
    Inventors: Gilbert M. Wolrich, Vinodh Gopal, Kirk S. Yap
  • Patent number: 10684855
    Abstract: Method and apparatus for performing a shift and XOR operation. In one embodiment, an apparatus includes execution resources to execute a first instruction. In response to the first instruction, said execution resources perform a shift and XOR on at least one value.
    Type: Grant
    Filed: August 25, 2017
    Date of Patent: June 16, 2020
    Assignee: Intel Corporation
    Inventors: Vinodh Gopal, James D. Guilford, Erdinc Ozturk, Wajdi K. Feghali, Gilbert M. Wolrich, Martin G. Dixon
  • Patent number: 10666288
    Abstract: Detailed herein are embodiments of systems, methods, and apparatuses for decompression using hardware and software. In hardware, an input buffer stores incoming input records from a compressed stream. A plurality of decoders decode at least one input record from the input buffer to output an intermediate record from the decoded data and a subset of the plurality of decoders to output a stream of literals. Finally, a reformat circuit formats an intermediate record into one of two types of tokens.
    Type: Grant
    Filed: November 20, 2018
    Date of Patent: May 26, 2020
    Assignee: Intel Corporation
    Inventors: Vinodh Gopal, James D. Guilford, Sean M. Gulley, Kirk S. Yap
  • Patent number: 10656947
    Abstract: Receiving an instruction indicating a source operand and a destination operand. Storing a result in the destination operand in response to the instruction. The result operand may have: (1) a first range of bits having a first end explicitly specified by the instruction in which each bit is identical in value to a bit of the source operand in a corresponding position; and (2) a second range of bits that all have a same value regardless of values of bits of the source operand in corresponding positions. Execution of the instruction may complete without moving the first range of the result relative to the bits of identical value in the corresponding positions of the source operand, regardless of the location of the first range of bits in the result. Execution units to execute such instructions, computer systems having processors to execute such instructions, and machine-readable medium storing such an instruction are also disclosed.
    Type: Grant
    Filed: March 14, 2018
    Date of Patent: May 19, 2020
    Assignee: Intel Corporation
    Inventors: Maxim Loktyukhin, Eric W. Mahurin, Bret L. Toll, Martin G. Dixon, Sean P. Mirkes, David L. Kreitzer, Elmoustapha Ould-Ahmed-Vall, Vinodh Gopal
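
The result described in the abstract above has one range of bits copied in place from the source and a second range forced to a single value. A minimal sketch of one such operation follows, assuming that the copied range is the low bits below an explicit index, that the fixed value is zero, and that operands are 64 bits wide. These choices and the name `zero_high_bits` are assumptions made for illustration, since the abstract does not fix them.

```python
def zero_high_bits(source, index, width=64):
    """Keep bits [0, index) of `source` in place; force bits [index, width) to 0.

    No shifting of the kept bits takes place: each surviving bit stays at the
    same position it had in the source operand.
    """
    if index < 0:
        raise ValueError("index must be non-negative")
    full_mask = (1 << width) - 1
    if index >= width:
        return source & full_mask          # nothing to clear within the width
    return source & ((1 << index) - 1)     # mask selects the low `index` bits
```

Because the kept range is only masked, never shifted, a caller can rely on bit positions in the result matching those in the source.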
  • Patent number: 10649774
    Abstract: A method in one aspect may include receiving a multiply instruction. The multiply instruction may indicate a first source operand and a second source operand. A product of the first and second source operands may be stored in one or more destination operands indicated by the multiply instruction. Execution of the multiply instruction may complete without writing a carry flag. Other methods are also disclosed, as are apparatus, systems, and instructions on machine-readable medium.
    Type: Grant
    Filed: December 27, 2017
    Date of Patent: May 12, 2020
    Assignee: Intel Corporation
    Inventors: Vinodh Gopal, James D. Guilford, Wajdi K. Feghali, Erdinc Ozturk, Gilbert M. Wolrich, Martin G. Dixon, Mark C. Davis, Sean P. Mirkes, Alexandre J. Farcy, Bret L. Toll, Maxim Loktyukhin
  • Patent number: 10635338
    Abstract: Technologies for high-ratio compression with heterogeneous history buffers include a computing device having an accelerator complex with a large history buffer and a small history buffer. The large history buffer has a larger size than the small history buffer. For example, the small history buffer may be 32 kilobytes and the large history buffer may be 64 kilobytes, 1 megabyte, or larger. The large history buffer is coupled to a large-buffer compare core that searches for matches in the large history buffer, finds a best match, and forwards the best match to a small-buffer compare core. The small-buffer compare core searches the small history buffer for matches, receives the match forwarded from the large-buffer compare core, and determines a best match from the matches in the small history buffer and the forwarded match. Other embodiments are described and claimed.
    Type: Grant
    Filed: September 29, 2017
    Date of Patent: April 28, 2020
    Assignee: Intel Corporation
    Inventors: Vinodh Gopal, James D. Guilford
  • Patent number: 10628068
    Abstract: Technologies for database acceleration include a computing device having a database accelerator. The database accelerator performs a decompress operation on one or more compressed elements of a compressed database to generate one or more decompressed elements. After decompression of the compressed elements, the database accelerator prepares the one or more decompressed elements to generate one or more prepared elements to be processed by an accelerated filter. The database accelerator then performs the accelerated filter on the one or more prepared elements to generate one or more output elements. Other embodiments are described and claimed.
    Type: Grant
    Filed: September 29, 2017
    Date of Patent: April 21, 2020
    Assignee: Intel Corporation
    Inventors: Vinodh Gopal, James D. Guilford, Kirk S. Yap, Simon N. Peffers, Daniel F. Cutter
  • Patent number: 10615963
    Abstract: A flexible AES instruction for a general-purpose processor is provided that performs AES encryption or decryption using n rounds, where n includes the standard AES set of rounds {10, 12, 14}. A parameter is provided to allow the type of AES round to be selected, that is, whether it is a “last round”. In addition to standard AES, the flexible AES instruction allows an AES-like cipher with 20 rounds to be specified or a “one round” pass.
    Type: Grant
    Filed: June 30, 2017
    Date of Patent: April 7, 2020
    Assignee: Intel Corporation
    Inventors: Shay Gueron, Wajdi K. Feghali, Vinodh Gopal
  • Publication number: 20200099958
    Abstract: A processing device includes compression circuitry to encode an input stream with an encoding that translates multiple symbols of fixed length into multiple codes of variable length between one and a maximum length, to generate a compressed stream. The compression circuitry is to: determine at least a first symbol of the multiple symbols having a first code that exceeds the maximum length; identify a short code of the multiple codes that is to be lengthened to provide an increased encoding capacity for the at least the first symbol; generate multiple code-length converted values including to increase the length of the short code to the maximum length and decrease, to the maximum length, a length of the first code of the at least the first symbol; and generate, with use of the set of code-length converted values, the compressed stream at the output terminal.
    Type: Application
    Filed: September 21, 2018
    Publication date: March 26, 2020
    Inventors: Sudhir K. Satpathy, Vinodh Gopal, James D. Guilford, Sanu K. Mathew, Vikram B. Suresh
  • Patent number: 10601583
    Abstract: A flexible AES instruction for a general-purpose processor is provided that performs AES encryption or decryption using n rounds, where n includes the standard AES set of rounds {10, 12, 14}. A parameter is provided to allow the type of AES round to be selected, that is, whether it is a “last round”. In addition to standard AES, the flexible AES instruction allows an AES-like cipher with 20 rounds to be specified or a “one round” pass.
    Type: Grant
    Filed: June 30, 2017
    Date of Patent: March 24, 2020
    Assignee: Intel Corporation
    Inventors: Shay Gueron, Wajdi K. Feghali, Vinodh Gopal