Patents by Inventor Kirk S. Yap

Kirk S. Yap has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 8838997
    Abstract: A processor includes a first execution unit to receive and execute a first instruction to process a first part of secure hash algorithm 256 (SHA256) message scheduling operations, the first instruction having a first operand associated with a first storage location to store a first set of message inputs and a second operand associated with a second storage location to store a second set of message inputs. The processor further includes a second execution unit to receive and execute a second instruction to process a second part of the SHA256 message scheduling operations, the second instruction having a third operand associated with a third storage location to store an intermediate result of the first part and a third set of message inputs and a fourth operand associated with a fourth storage location to store a fourth set of message inputs.
    Type: Grant
    Filed: September 28, 2012
    Date of Patent: September 16, 2014
    Assignee: Intel Corporation
    Inventors: Gilbert M. Wolrich, Kirk S. Yap, James D. Guilford, Vinodh Gopal, Sean M. Gulley
  • Publication number: 20140237218
    Abstract: A multiply-and-accumulate (MAC) instruction allows efficient execution of unsigned integer multiplications. The MAC instruction indicates a first vector register as a first operand, a second vector register as a second operand, and a third vector register as a destination. The first vector register stores a first factor, and the second vector register stores a partial sum. The MAC instruction is executed to multiply the first factor with an implicit second factor to generate a product, and to add the partial sum to the product to generate a result. The first factor, the implicit second factor and the partial sum have a same data width and the product has twice the data width. The most significant half of the result is stored in the third vector register, and the least significant half of the result is stored in the second vector register.
    Type: Application
    Filed: December 19, 2011
    Publication date: August 21, 2014
    Inventors: Vinodh Gopal, Gilbert M. Wolrich, Erdinc Ozturk, James D. Guilford, Kirk S. Yap, Sean M. Gulley, wajdi K. Feghali, Martin G. Dixon
  • Publication number: 20140205084
    Abstract: A method is described. The method includes executing one or more JH_SBOX_L instructions to perform S-Box mappings and a linear (L) transformation on a JH state and executing one or more JH_P instructions to perform a permutation function on the JH state once the S-Box mappings and the L transformation have been performed.
    Type: Application
    Filed: December 22, 2011
    Publication date: July 24, 2014
    Inventors: Gilbert M. Wolrich, Kirk S. Yap, Vinodh Gopal, James D. Guilford, Erdinc Ozturk, Sean M. Gulley, Wajdi K. Feghali, Martin G. Dixon
  • Publication number: 20140195782
    Abstract: A processor includes an instruction decoder to receive a first instruction to process a secure hash algorithm 2 (SHA-2) hash algorithm, the first instruction having a first operand associated with a first storage location to store a SHA-2 state and a second operand associated with a second storage location to store a plurality of messages and round constants. The processor further includes an execution unit coupled to the instruction decoder to perform one or more iterations of the SHA-2 hash algorithm on the SHA-2 state specified by the first operand and the plurality of messages and round constants specified by the second operand, in response to the first instruction.
    Type: Application
    Filed: March 30, 2012
    Publication date: July 10, 2014
    Inventors: Kirk S. Yap, Gilbert M. Wolrich, James D. Guilford, Vinodh Gopal, Erdinc Ozturk, Sean M. Gulley, Wajdi K. Feghali, Martin G. Dixon
  • Publication number: 20140195817
    Abstract: A method is described that includes performing the following within an instruction execution pipeline implemented on a semiconductor chip: summing three input vector operands through execution of a single instruction; and, not raising any arithmetic flags even though a result of the summing creates more bits than circuitry designed to transport the summation is able to transport.
    Type: Application
    Filed: December 23, 2011
    Publication date: July 10, 2014
    Applicant: INTEL CORPORATION
    Inventors: Wajdi K. Feghali, Vinodh Gopal, James D. Guilford, Erdinc Ozturk, Gilbert M. Wolrich, Kirk S. Yap, Sean M. Gulley, Martin G. Dixon
  • Publication number: 20140185793
    Abstract: A method of an aspect includes receiving an instruction. The instruction indicates a first source of a first packed data including state data elements ai, bi, ei, and fi for a current round (i) of a secure hash algorithm 2 (SHA2) hash algorithm. The instruction indicates a second source of a second packed data. The first packed data has a width in bits that is less than a combined width in bits of eight state data elements ai, bi, ci, di, ei, fi, gi, hi of the SHA2 hash algorithm. The method also includes storing a result in a destination indicated by the instruction in response to the instruction. The result includes updated state data elements ai+, bi+, ei+, and fi+ that have been updated from the corresponding state data elements ai, bi, ei, and fi by at least one round of the SHA2 hash algorithm.
    Type: Application
    Filed: December 28, 2012
    Publication date: July 3, 2014
    Inventors: GILBERT M. WOLRICH, Kirk S. Yap, Vinodh Gopal, James D. Guilford
  • Publication number: 20140189289
    Abstract: Vector instructions for performing SNOW 3G wireless security operations are received and executed by the execution circuitry of a processor. The execution circuitry receives a first operand of the first instruction specifying a first vector register that stores a current state of a finite state machine (FSM). The execution circuitry also receives a second operand of the first instruction specifying a second vector register that stores data elements of a liner feedback shift register (LFSR) that are needed for updating the FSM. The execution circuitry executes the first instruction to produce a updated state of the FSM and an output of the FSM in a destination operand of the first instruction.
    Type: Application
    Filed: December 28, 2012
    Publication date: July 3, 2014
    Inventors: Gilbert M. Wolrich, Vinodh Gopal, Erdinc Ozturk, Kirk S. Yap, Wajdi K. Feghali
  • Publication number: 20140189290
    Abstract: Vector instructions for performing ZUC stream cipher operations are received and executed by the execution circuitry of a processor. The execution circuitry receives a first vector instruction to perform an update to a liner feedback shift register (LFSR), and receives a second vector instruction to perform an update to a state of a finite state machine (FSM), where the FSM receives inputs from re-ordered bits of the LFSR. The execution circuitry executes the first vector instruction and the second vector instruction in a single-instruction multiple data (SIMD) pipeline.
    Type: Application
    Filed: December 28, 2012
    Publication date: July 3, 2014
    Inventors: Gilbert M. Wolrich, Vinodh Gopal, Kirk S. Yap, Wajdi K. Feghali
  • Publication number: 20140189368
    Abstract: Instructions and logic provide SIMD secure hashing round slice functionality. Some embodiments include a processor comprising: a decode stage to decode an instruction for a SIMD secure hashing algorithm round slice, the instruction specifying a source data operand set, a message-plus-constant operand set, a round-slice portion of the secure hashing algorithm round, and a rotator set portion of rotate settings. Processor execution units, are responsive to the decoded instruction, to perform a secure hashing round-slice set of round iterations upon the source data operand set, applying the message-plus-constant operand set and the rotator set, and store a result of the instruction in a SIMD destination register. One embodiment of the instruction specifies a hash round type as one of four MD5 round types. Other embodiments may specify a hash round type by an immediate operand as one of three SHA-1 round types or as a SHA-2 round type.
    Type: Application
    Filed: December 29, 2012
    Publication date: July 3, 2014
    Inventors: Gilbert M. Wolrich, Vinodh Gopal, Kirk S. Yap
  • Publication number: 20140189369
    Abstract: A method of an aspect includes receiving an instruction. The instruction indicates a first source of a first packed data including state data elements ai, bi, ei, and fi for a current round (i) of a secure hash algorithm 2 (SHA2) hash algorithm. The instruction indicates a second source of a second packed data. The first packed data has a width in bits that is less than a combined width in bits of eight state data elements ai, bi, ci, di, ei, fi, gi, hi of the SHA2 hash algorithm. The method also includes storing a result in a destination indicated by the instruction in response to the instruction. The result includes updated state data elements ai+, bi+, ei+, and fi+ that have been updated from the corresponding state data elements ai, bi, ei, and fi by at least one round of the SHA2 hash algorithm.
    Type: Application
    Filed: March 15, 2013
    Publication date: July 3, 2014
    Inventors: Gilbert M. Wolrich, Kirk S. Yap, Vinodh Gopal, James D. Guilford
  • Publication number: 20140164467
    Abstract: An apparatus is described that includes a semiconductor chip having an instruction execution pipeline having one or more execution units with respective logic circuitry to: a) execute a first instruction that multiplies a first input operand and a second input operand and presents a lower portion of the result, where, the first and second input operands are respective elements of first and second input vectors; b) execute a second instruction that multiplies a first input operand and a second input operand and presents an upper portion of the result, where, the first and second input operands are respective elements of first and second input vectors; and, c) execute an add instruction where a carry term of the add instruction's adding is recorded in a mask register.
    Type: Application
    Filed: December 23, 2011
    Publication date: June 12, 2014
    Inventors: Gilbert M. Wolrich, Kirk S. Yap, James D. Guilford, Erdinc Ozturk, Vinodh Gopal, Wajdi K. Feghali, Sean M. Gulley, Martin G. Dixon
  • Publication number: 20140122839
    Abstract: An apparatus is described that includes an execution unit within an instruction pipeline. The execution unit has multiple stages of a circuit that includes a) and b) as follows. a) a first logic circuitry section having multiple mix logic sections each having: i) a first input to receive a first quad word and a second input to receive a second quad word; ii) an adder having a pair of inputs that are respectively coupled to the first and second inputs; iii) a rotator having a respective input coupled to the second input; iv) an XOR gate having a first input coupled to an output of the adder and a second input coupled to an output of the rotator. b) permute logic circuitry having inputs coupled to the respective adder and XOR gate outputs of the multiple mix logic sections.
    Type: Application
    Filed: December 22, 2011
    Publication date: May 1, 2014
    Inventors: Gilbert M. Wolrich, Kirk S. Yap, James D. Guilford, Erdinc Ozturk, Vinodh Gopal, Wajdi K. Feghali, Sean M. Gulley, Martin G. Dixon
  • Publication number: 20140095891
    Abstract: According to one embodiment, a processor includes an instruction decoder to receive a first instruction to process a SHA1 hash algorithm, the first instruction having a first operand, a second operand, and a third operand, the first operand specifying a first storage location storing four SHA states, the second operand specifying a second storage location storing a plurality of SHA1 message inputs in combination with a fifth SHA1 state. The processor further includes an execution unit coupled to the instruction decoder, in response to the first instruction, to perform at least four rounds of the SHA1 round operations on the SHA1 states and the message inputs obtained from the first and second operands, using a combinational logic function specified in the third operand.
    Type: Application
    Filed: September 28, 2012
    Publication date: April 3, 2014
    Inventors: Gilbert M. Wolrich, Kirk S. Yap, Vinodh Gopal, Sean M. Gulley, James D. Guilford
  • Publication number: 20140093069
    Abstract: A processor includes a first execution unit to receive and execute a first instruction to process a first part of secure hash algorithm 256 (SHA256) message scheduling operations, the first instruction having a first operand associated with a first storage location to store a first set of message inputs and a second operand associated with a second storage location to store a second set of message inputs. The processor further includes a second execution unit to receive and execute a second instruction to process a second part of the SHA256 message scheduling operations, the second instruction having a third operand associated with a third storage location to store an intermediate result of the first part and a third set of message inputs and a fourth operand associated with a fourth storage location to store a fourth set of message inputs.
    Type: Application
    Filed: September 28, 2012
    Publication date: April 3, 2014
    Inventors: Gilbert M. Wolrich, Kirk S. Yap, James D. Guilford, Vinodh Gopal, Sean M. Gulley
  • Publication number: 20140093068
    Abstract: According to one embodiment, a processor includes an instruction decoder to receive a first instruction to perform first SKEIN256 MIX-PERMUTE operations, the first instruction having a first operand associated with a first storage location to store a plurality of odd words, a second operand associated with a second storage location to store a plurality of even words, and a third operand. The processor further includes a first execution unit coupled to the instruction decoder, in response to the first instruction, to perform multiple rounds of the first SKEIN256 MIX-PERMUTE operations based on the odd words and even words using a first rotate value obtained from a third storage location indicated by the third operand, and to store new odd words in the first storage location indicated by the first operand.
    Type: Application
    Filed: September 28, 2012
    Publication date: April 3, 2014
    Inventors: Gilbert M. Wolrich, Kirk S. Yap, Vinodh Gopal
  • Publication number: 20140095844
    Abstract: Disclosed herein are systems, apparatuses, and methods performing in a computer processor of performing a rotate and XOR in response to a single XOR and rotate instruction, wherein the rotate and XOR instruction includes a first and second source operand, a destination operand, and an immediate value.
    Type: Application
    Filed: September 28, 2012
    Publication date: April 3, 2014
    Inventors: Vinodh Gopal, Gilbert M. Wolrich, James D. Guilford, Kirk S. Yap
  • Publication number: 20140053000
    Abstract: A method is described.
    Type: Application
    Filed: December 22, 2011
    Publication date: February 20, 2014
    Applicant: INTEL CORPORATION
    Inventors: Kirk S. Yap, Gilbert M. Wolrich, Vinodh Gopal, James D. Guilford, Erdinc Ozturk, Sean M. Gulley, Wajdi K. Feghali, Martin G. Dixon
  • Publication number: 20140019694
    Abstract: Technologies for executing a serial data processing algorithm on a single variable-length data buffer includes padding data segments of the buffer, streaming the data segments into a data register and executing the serial data processing algorithm on each of the segments in parallel.
    Type: Application
    Filed: September 28, 2012
    Publication date: January 16, 2014
    Inventors: Sean M. Gulley, Wajdi K. Feghali, Vinodh Gopal, James D. Guilford, Gilbert Wolrich, Kirk S. Yap
  • Publication number: 20140019693
    Abstract: Technologies for executing a serial data processing algorithm on a single variable length data buffer includes streaming segments of the buffer into a data register, executing the algorithm on each of the segments in parallel, and combining the results of executing the algorithm on each of the segments to form the output of the serial data processing algorithm.
    Type: Application
    Filed: September 28, 2012
    Publication date: January 16, 2014
    Inventors: Sean M. Gulley, Wajdi K. Feghali, Vinodh Gopal, James D. Guilford, Gilbert M. Wolrich, Kirk S. Yap
  • Publication number: 20140016774
    Abstract: A method is described. The method includes executing an instruction to perform one or more Galois Field (GF) multiply by 2 operations on a state matrix and executing an instruction to combine results of the one or more GF multiply by 2 operations with exclusive or (XOR) functions to generate a result matrix.
    Type: Application
    Filed: December 22, 2011
    Publication date: January 16, 2014
    Inventors: Gilbert M. Wolrich, Kirk S. Yap, Vinodh Gopal, James D. Guilford, Erdinc Ozturk, Sean M. Gulley, Wajdi K. Feghali, Martin G. Dixon