Patents by Inventor Christopher H. Olson

Christopher H. Olson has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Instruction support for performing montgomery multiplication

Patent number: 8583902

Abstract: Techniques are disclosed relating to a processor including instruction support for performing a Montgomery multiplication. The processor may issue, for execution, programmer-selectable instructions from a defined instruction set architecture (ISA). The processor may include an instruction execution unit configured to receive instructions including a first instance of a Montgomery-multiply instruction defined within the ISA. The Montgomery-multiply instruction is executable by the processor to operate on at least operands A, B, and N residing in respective portions of a general-purpose register file of the processor, where at least one of operands A, B, and N spans at least two registers of the general-purpose register file. The instruction execution unit is configured to calculate P mod N in response to receiving the first instance of the Montgomery-multiply instruction, where P is the product of at least operand A, operand B, and R^-1.
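For orientation, the quantity named in the abstract, A * B * R^-1 mod N, can be sketched in software with the textbook Montgomery reduction. This is an illustrative sketch only, not the patented hardware: the function name, the choice R = 2^r_bits, and the requirement that N be odd with A, B < N < R are assumptions introduced here.

```python
def montgomery_multiply(a: int, b: int, n: int, r_bits: int) -> int:
    """Return a * b * R^-1 mod n, with R = 2**r_bits (assumed form of R)."""
    r = 1 << r_bits
    # Assumptions of this sketch: n is odd, n < R, and both operands are below n.
    assert n % 2 == 1 and n < r and 0 <= a < n and 0 <= b < n
    # n_prime satisfies n * n_prime == -1 (mod R); pow(n, -1, r) needs Python 3.8+.
    n_prime = (-pow(n, -1, r)) % r
    t = a * b
    # m is chosen so that t + m*n is an exact multiple of R.
    m = ((t & (r - 1)) * n_prime) & (r - 1)
    u = (t + m * n) >> r_bits
    # u lies in [0, 2n), so a single conditional subtraction finishes the reduction.
    return u - n if u >= n else u
```

For example, montgomery_multiply(5, 7, 97, 8) agrees with (5 * 7 * pow(256, -1, 97)) % 97. Arbitrary-precision Python integers stand in for operands that, per the abstract, span several general-purpose registers.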

Type: Grant

Filed: May 7, 2010

Date of Patent: November 12, 2013

Assignee: Oracle International Corporation

Inventors: Christopher H. Olson, Gregory F. Grohoski, Lawrence Spracklen, Nils Gura
Thread fairness on a multi-threaded processor with multi-cycle cryptographic operations

Patent number: 8560814

Abstract: Systems and methods for efficient execution of operations in a multi-threaded processor. Each thread may include a blocking instruction. A blocking instruction blocks other threads from utilizing hardware resources for an appreciable amount of time. One example of a blocking type instruction is a Montgomery multiplication cryptographic instruction. Each thread can operate in a thread-based mode that allows the insertion of stall cycles during the execution of blocking instructions, during which other threads may utilize the previously blocked hardware resources. At times when multiple threads are scheduled to execute blocking instructions, the thread-based mode may be changed to increase throughput for these multiple threads. For example, the mode may be changed to disallow the insertion of stall cycles. Therefore, the time for sequential operation of the blocking instructions corresponding to the multiple threads may be reduced.

Type: Grant

Filed: May 4, 2010

Date of Patent: October 15, 2013

Assignee: Oracle International Corporation

Inventors: Robert T. Golla, Christopher H. Olson, Gregory F. Grohoski
Processor and method providing instruction support for instructions that utilize multiple register windows

Patent number: 8555038

Abstract: A processor including instruction support for large-operand instructions that use multiple register windows may issue, for execution, programmer-selectable instructions from a defined instruction set architecture (ISA). The processor may also include an instruction execution unit that, during operation, receives instructions for execution from the instruction fetch unit and executes a large-operand instruction defined within the ISA, where execution of the large-operand instruction is dependent upon a plurality of registers arranged within a plurality of register windows. The processor may further include control circuitry (which may be included within the fetch unit, the execution unit, or elsewhere within the processor) that determines whether one or more of the register windows depended upon by the large-operand instruction are not present. In response to determining that one or more of these register windows are not present, the control circuitry causes them to be restored.

Type: Grant

Filed: May 28, 2010

Date of Patent: October 8, 2013

Assignee: Oracle International Corporation

Inventors: Christopher H. Olson, Paul J. Jordan, Jama I. Barreh
DIVISION UNIT WITH MULTIPLE DIVIDE ENGINES

Publication number: 20130179664

Abstract: Techniques are disclosed relating to integrated circuits that include hardware support for divide and/or square root operations. In one embodiment, an integrated circuit is disclosed that includes a division unit that, in turn, includes a normalization circuit and a plurality of divide engines. The normalization circuit is configured to normalize a set of operands. Each divide engine is configured to operate on a respective normalized set of operands received from the normalization circuit. In some embodiments, the integrated circuit includes a scheduler unit configured to select instructions for issuance to a plurality of execution units including the division unit. The scheduler unit is further configured to maintain a counter indicative of a number of instructions currently being operated on by the division unit, and to determine, based on the counter whether to schedule subsequent instructions for issuance to the division unit.

Type: Application

Filed: January 6, 2012

Publication date: July 11, 2013

Inventors: Christopher H. Olson, Jeffrey S. Brooks, Matthew B. Smittle
Register error correction of speculative data in an out-of-order processor

Patent number: 8468425

Abstract: In one embodiment, a processor comprises a first register file configured to store speculative register state, a second register file configured to store committed register state, a check circuit and a control unit. The first register file is protected by a first error protection scheme and the second register file is protected by a second error protection scheme. A check circuit is coupled to receive a value and corresponding one or more check bits read from the first register file to be committed to the second register file in response to the processor selecting a first instruction to be committed. The check circuit is configured to detect an error in the value responsive to the value and the check bits. Coupled to the check circuit, the control unit is configured to cause reexecution of the first instruction responsive to the error detected by the check circuit.

Type: Grant

Filed: November 14, 2011

Date of Patent: June 18, 2013

Assignee: Oracle International Corporation

Inventors: Paul J. Jordan, Christopher H. Olson
Accessing a multibank register file using a thread identifier

Patent number: 8458446

Abstract: A processor includes an instruction fetch unit configured to issue instructions for execution, where the instructions are selected from a number of threads, where each given instruction has a corresponding thread identifier, and where at least some of the instructions specify operand(s) via register identifiers. A register file stores operands usable by the instructions, and may include several banks, each corresponding to a register identifier and including several entries corresponding to the several threads, wherein the entries are configured to store data values. In response to receiving a request to read a particular register identifier for a given thread identifier, the register file may be configured to decode the given thread identifier to retrieve entries from the banks that correspond to the given thread identifier. The register file may further select, from among the retrieved entries, a data value corresponding to the particular register identifier to be output.
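The two-step read described in the abstract (decode the thread identifier first, then select by register identifier) can be modeled with a small data structure. The class and method names below are hypothetical, and the sketch models only the access order, not timing or circuit structure.

```python
class MultibankRegisterFile:
    """Toy model: one bank per register identifier, one entry per thread."""

    def __init__(self, num_registers: int, num_threads: int) -> None:
        # banks[reg_id][thread_id] holds the data value for that register and thread.
        self.banks = [[0] * num_threads for _ in range(num_registers)]

    def write(self, thread_id: int, reg_id: int, value: int) -> None:
        self.banks[reg_id][thread_id] = value

    def read(self, thread_id: int, reg_id: int) -> int:
        # Step 1: the thread identifier picks one entry out of every bank.
        retrieved = [bank[thread_id] for bank in self.banks]
        # Step 2: the register identifier selects among the retrieved entries.
        return retrieved[reg_id]
```

After rf = MultibankRegisterFile(4, 2) and rf.write(1, 2, 42), the call rf.read(1, 2) returns 42, while rf.read(0, 2) still returns 0 because thread 0 has its own entry in the same bank.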

Type: Grant

Filed: September 30, 2009

Date of Patent: June 4, 2013

Assignee: Oracle America, Inc.

Inventors: Christopher H. Olson, Xiang Shan Li, Robert T. Golla
STORING A TARGET ADDRESS OF A CONTROL TRANSFER INSTRUCTION IN AN INSTRUCTION FIELD

Publication number: 20130138888

Abstract: A control transfer instruction (CTI), such as a branch, jump, etc., may have an offset value for a control transfer that is to be performed. The offset value may be usable to compute a target address for the CTI (e.g., the address of a next instruction to be executed for a thread or instruction stream). The offset may be specified relative to a program counter. In response to detecting a specified offset value, the CTI may be modified to include at least a portion of a computed target address. Information indicating this modification has been performed may be stored, for example, in a pre-decode bit. In some cases, CTI modification may be performed only when a target address is a “near” target, rather than a “far” target. Modifying CTIs as described herein may eliminate redundant address calculations and produce a savings of power and/or time in some embodiments.

Type: Application

Filed: November 30, 2011

Publication date: May 30, 2013

Inventors: Jama I. Barreh, Manish K. Shah, Christopher H. Olson
Apparatus and method for implementing hardware support for denormalized operands for floating-point divide operations

Patent number: 8452831

Abstract: A floating-point circuit may include a floating-point operand normalization circuit configured to receive input floating-point operands of a given floating-point divide operation, the operands comprising a dividend and a divisor, as well as a divide engine coupled to the normalization circuit. In response to determining that one or more of the input floating-point operands is a denormal number, the operand normalization circuit may be further configured to normalize the one or more of the input floating-point operands and output a normalized dividend and normalized divisor to the divide engine, and dependent upon respective numbers of leading zeros of the dividend and divisor prior to normalization, generate a value indicative of a maximum possible number of digits of a quotient (NDQ). The divide engine may be configured to iteratively generate NDQ digits of a floating-point quotient from the normalized dividend and the normalized divisor provided by the floating-point operand normalization circuit.

Type: Grant

Filed: March 31, 2009

Date of Patent: May 28, 2013

Assignee: Oracle America, Inc.

Inventors: Christopher H. Olson, Jeffrey S. Brooks
Processor and method for implementing instruction support for multiplication of large operands

Patent number: 8438208

Abstract: A processor including instruction support for implementing large-operand multiplication may issue, for execution, programmer-selectable instructions from a defined instruction set architecture (ISA). The processor may include an instruction execution unit comprising a hardware multiplier datapath circuit, where the hardware multiplier datapath circuit is configured to multiply operands having a maximum number of bits M.

Type: Grant

Filed: June 19, 2009

Date of Patent: May 7, 2013

Assignee: Oracle America, Inc.

Inventors: Christopher H. Olson, Jeffrey S. Brooks, Robert T. Golla, Paul J. Jordan
Apparatus and method for implementing instruction support for performing a cyclic redundancy check (CRC)

Patent number: 8417961

Abstract: Techniques relating to a processor including instruction support for implementing a cyclic redundancy check (CRC) operation. The processor may issue, for execution, programmer-selectable instructions from a defined instruction set architecture (ISA). The processor may include a cryptographic unit configured to receive instructions that include a first instance of a cyclic redundancy check (CRC) instruction defined within the ISA, where the first instance of the CRC instruction is executable by the cryptographic unit to perform a first CRC operation on a set of data that produces a checksum value. In one embodiment, the cryptographic unit is configured to generate the checksum value using a generator polynomial of 0x11EDC6F41.
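The generator polynomial 0x11EDC6F41 is the one used by CRC-32C (Castagnoli). A bit-at-a-time software sketch follows. The abstract specifies only the polynomial, so the initial value, the final XOR, and the reflected bit order used here are assumptions taken from the common CRC-32C convention, not from the patent.

```python
GENERATOR = 0x11EDC6F41  # generator polynomial from the abstract, including the x^32 term


def reflect32(value: int) -> int:
    """Reverse the bit order of a 32-bit value."""
    return int(f"{value:032b}"[::-1], 2)


# Drop the implicit x^32 term and reflect, as the common CRC-32C convention does.
POLY_REFLECTED = reflect32(GENERATOR & 0xFFFFFFFF)


def crc32c(data: bytes) -> int:
    """Bitwise CRC-32C; init and final XOR of 0xFFFFFFFF are assumed conventions."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift out one bit; fold in the polynomial when that bit was set.
            crc = (crc >> 1) ^ POLY_REFLECTED if crc & 1 else crc >> 1
    return crc ^ 0xFFFFFFFF
```

Under these conventions, crc32c(b"123456789") yields 0xE3069283, the standard check value for CRC-32C. A hardware instruction would typically process many bits per cycle rather than one bit per loop iteration.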

Type: Grant

Filed: March 16, 2010

Date of Patent: April 9, 2013

Assignee: Oracle International Corporation

Inventors: Christopher H. Olson, Gregory F. Grohoski, Lawrence A. Spracklen
Execution unit for performing the data encryption standard

Patent number: 8358780

Abstract: Described is an execution unit for performing at least part of the Data Encryption Standard that includes a Left Half input; a Key input; and a Table input, as well as a first group of transistors configured to receive the Table input, perform a table look-up, and output data. The execution unit further includes a first exclusive-or operator having two inputs and an output that is configured to receive the Left Half input and the Key input. The execution unit also includes a second exclusive-or operator having two inputs and an output that is configured to receive the data output by the first group of transistors and to receive the output of the first exclusive-or operator. The execution unit also includes a third exclusive-or operator having two inputs and an output that is configured to receive the Left Half input and the data output by the first group of transistors.

Type: Grant

Filed: November 7, 2011

Date of Patent: January 22, 2013

Assignee: Oracle America, Inc.

Inventors: Leonard D. Rarick, Christopher H. Olson
Apparatus and method for local operand bypassing for cryptographic instructions

Patent number: 8356185

Abstract: A processor may include a hardware instruction fetch unit configured to issue instructions for execution, and a hardware functional unit configured to receive instructions for execution, where the instructions include cryptographic instruction(s) and non-cryptographic instruction(s). The functional unit may include a cryptographic execution pipeline configured to execute the cryptographic instructions with a corresponding cryptographic execution latency, and a non-cryptographic execution pipeline configured to execute the non-cryptographic instructions with a corresponding non-cryptographic execution latency that is longer than the cryptographic execution latency.

Type: Grant

Filed: October 8, 2009

Date of Patent: January 15, 2013

Assignee: Oracle America, Inc.

Inventors: Christopher H. Olson, Gregory F. Grohoski, Robert T. Golla
SUPPRESSION OF CONTROL TRANSFER INSTRUCTIONS ON INCORRECT SPECULATIVE EXECUTION PATHS

Publication number: 20120290820

Abstract: Techniques are disclosed relating to a processor that is configured to execute control transfer instructions (CTIs). In some embodiments, the processor includes a mechanism that suppresses results of mispredicted younger CTIs on a speculative execution path. This mechanism permits the branch predictor to maintain its fidelity, and eliminates spurious flushes of the pipeline. In one embodiment, a misprediction bit is used to indicate that a misprediction has occurred, and CTIs younger than the mispredicted CTI are suppressed. In some embodiments, the processor may be configured to execute instruction streams from multiple threads. Each thread may include a misprediction indication. CTIs in each thread may execute in program order with respect to other CTIs of the thread, while instructions other than CTIs may execute out of program order.

Type: Application

Filed: September 8, 2011

Publication date: November 15, 2012

Applicant: ORACLE INTERNATIONAL CORPORATION

Inventors: Christopher H. Olson, Manish K. Shah
BRANCH TARGET STORAGE AND RETRIEVAL IN AN OUT-OF-ORDER PROCESSOR

Publication number: 20120290817

Abstract: A processor configured to facilitate transfer and storage of predicted targets for control transfer instructions (CTIs). In certain embodiments, the processor may be multithreaded and support storage of predicted targets for multiple threads. In some embodiments, a CTI branch target may be stored by one element of a processor and a tag may indicate the location of the stored target. The tag may be associated with the CTI rather than associating the complete target address with the CTI. When the CTI reaches an execution stage of the processor, the tag may be used to retrieve the predicted target address. In some embodiments using a tag to retrieve a predicted target, CTI instructions from different processor threads may be interleaved without affecting retrieval of predicted targets.

Type: Application

Filed: September 8, 2011

Publication date: November 15, 2012

Applicant: ORACLE INTERNATIONAL CORPORATION

Inventors: Christopher H. Olson, Manish K. Shah
PIPELINED DIVIDE CIRCUIT FOR SMALL OPERAND SIZES

Publication number: 20120259907

Abstract: A pipelined circuit for performing a divide operation on small operand sizes. The circuit includes a plurality of stages connected together in a series to perform a subtractive divide algorithm based on iterative subtractions and shifts. Each stage computes two quotient bits and outputs a partial remainder value to the next stage in the series. The first and last stages utilize a radix-4 serial architecture with edge modifications to increase efficiency. The intermediate stages utilize a radix-4 parallel architecture. The divide architecture is pipelined such that input operands can be applied to the divider on each clock cycle.
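The idea of retiring two quotient bits per stage can be illustrated with a radix-4 restoring division on unsigned integers. This sketch is an assumption-laden simplification: the function names and the 16-bit default width are hypothetical, and it does not model the serial versus parallel stage variants or the edge modifications mentioned in the abstract.

```python
def radix4_stage(partial_remainder: int, next_two_bits: int, divisor: int):
    """One stage: shift in two dividend bits, return (quotient digit, new remainder)."""
    shifted = (partial_remainder << 2) | next_two_bits
    # Try the candidate digits 3, 2, 1; the digit is 0 if none of them fits.
    for digit in (3, 2, 1):
        if shifted >= digit * divisor:
            return digit, shifted - digit * divisor
    return 0, shifted


def divide(dividend: int, divisor: int, width: int = 16):
    """Unsigned divide built from width/2 chained stages, two quotient bits each."""
    assert width % 2 == 0 and divisor > 0 and 0 <= dividend < (1 << width)
    quotient, remainder = 0, 0
    for stage in range(width // 2):
        shift = width - 2 * (stage + 1)
        bits = (dividend >> shift) & 0b11  # the next two dividend bits, MSB first
        digit, remainder = radix4_stage(remainder, bits, divisor)
        quotient = (quotient << 2) | digit
    return quotient, remainder
```

Here divide(1000, 7) returns (142, 6). In a pipelined implementation each loop iteration would correspond to one physical stage, so a new pair of operands could enter the first stage on every clock cycle.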

Type: Application

Filed: April 7, 2011

Publication date: October 11, 2012

Inventors: Christopher H. Olson, Jeffrey S. Brooks
SYSTEM AND METHOD OF BYPASSING UNROUNDED RESULTS IN A MULTIPLY-ADD PIPELINE UNIT

Publication number: 20120233234

Abstract: A processing unit, system, and method for performing a multiply operation in a multiply-add pipeline. To reduce the pipeline latency, the unrounded result of a multiply-add operation is bypassed to the inputs of the multiply-add pipeline for use in a subsequent operation. If it is determined that rounding is required for the prior operation, then the rounding will occur during the subsequent operation. During the subsequent operation, a Booth encoder not utilized by the multiply operation will output a rounding correction factor as a selection input to a Booth multiplexer not utilized by the multiply operation. When the Booth multiplexer receives the rounding correction factor, the Booth multiplexer will output a rounding correction value to a carry save adder (CSA) tree, and the CSA tree will generate the correct sum from the rounding correction value and the other partial products.

Type: Application

Filed: March 8, 2011

Publication date: September 13, 2012

Inventors: Jeffrey S. Brooks, Christopher H. Olson
Processor Pipeline which Implements Fused and Unfused Multiply-Add Instructions

Publication number: 20120221614

Abstract: Implementing an unfused multiply-add instruction within a fused multiply-add pipeline. The system may include an aligner having an input for receiving an addition term, a multiplier tree having two inputs for receiving a first value and a second value for multiplication, and a first carry save adder (CSA), wherein the first CSA may receive partial products from the multiplier tree and an aligned addition term from the aligner. The system may include a fused/unfused multiply add (FUMA) block which may receive the first partial product, the second partial product, and the aligned addition term, wherein the first partial product and the second partial product are not truncated. The FUMA block may perform an unfused multiply add operation or a fused multiply add operation using the first partial product, the second partial product, and the aligned addition term, e.g., depending on an opcode or mode bit.

Type: Application

Filed: May 11, 2012

Publication date: August 30, 2012

Inventors: Jeffrey S. Brooks, Christopher H. Olson
INSTRUCTION SUPPORT FOR PERFORMING STREAM CIPHER

Publication number: 20120216020

Abstract: Techniques relating to a processor that provides instruction-level support for a stream cipher are disclosed. In one embodiment, the processor supports a first instruction executable to perform an alpha multiplication, an alpha division, and an exclusive-OR operation using a result of the alpha multiplication and a result of the alpha division. In one embodiment, the processor supports a second instruction executable to perform a modular addition of a value R1 and a value S, and to perform a first exclusive-OR operation on a result of the modular addition and a value R2. In one embodiment, the processor supports a third instruction executable to perform a substitution-box (S-Box) operation on a value R1 to produce a value R2', and to perform a modular addition using a value R2 to produce a value R1'.

Type: Application

Filed: February 21, 2011

Publication date: August 23, 2012

Inventors: Christopher H. Olson, Gregory F. Grohoski, Manish K. Shah
Processor which implements fused and unfused multiply-add instructions in a pipelined manner

Patent number: 8239440

Abstract: Implementing an unfused multiply-add instruction within a fused multiply-add pipeline. The system may include an aligner having an input for receiving an addition term, a multiplier tree having two inputs for receiving a first value and a second value for multiplication, and a first carry save adder (CSA), wherein the first CSA may receive partial products from the multiplier tree and an aligned addition term from the aligner. The system may include a fused/unfused multiply add (FUMA) block which may receive the first partial product, the second partial product, and the aligned addition term, wherein the first partial product and the second partial product are not truncated. The FUMA block may perform an unfused multiply add operation or a fused multiply add operation using the first partial product, the second partial product, and the aligned addition term, e.g., depending on an opcode or mode bit.
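The numerical difference between the two instruction flavors is the number of roundings: an unfused multiply-add rounds the product and then rounds the sum, while a fused multiply-add rounds once. The sketch below models that difference in software for finite double-precision inputs; the function names are hypothetical, and exact rational arithmetic stands in for the untruncated partial products described in the abstract.

```python
from fractions import Fraction


def unfused_multiply_add(a: float, b: float, c: float) -> float:
    """Two roundings: the product is rounded to double before the addition."""
    product = a * b
    return product + c


def fused_multiply_add(a: float, b: float, c: float) -> float:
    """One rounding: a*b + c is formed exactly, then rounded to double once."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)
```

With a = 1 + 2^-30, b = 1 - 2^-30, and c = -1, the exact product is 1 - 2^-60, which rounds to 1.0 in double precision. The unfused result is therefore 0.0, while the fused result is -2^-60.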

Type: Grant

Filed: March 28, 2008

Date of Patent: August 7, 2012

Assignee: Oracle America, Inc.

Inventors: Jeffrey S. Brooks, Christopher H. Olson
Handling multi-cycle integer operations for a multi-threaded processor

Patent number: 8195919

Abstract: Determining an effective address of a memory with a three-operand add operation in a single execution cycle of a multithreaded processor that can access both segmented memory and non-segmented memory. During that cycle, the processor determines whether a memory segment base is zero. If the segment base is zero, the processor can access a memory location at the effective address without adding the segment base. If the segment base is not zero, such as when executing legacy code, the processor consumes another cycle to add the segment base to the effective address. Similarly, the processor consumes another cycle if the effective address or the linear address is misaligned. An integer execution unit performs the three-operand add using a carry-save adder coupled to a carry look-ahead adder. If the segment base is not zero, the effective address is fed back through the integer execution unit to add the segment base.
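A carry-save adder reduces three operands to a sum word and a carry word without propagating carries, and a conventional adder then combines the two. The sketch below models that structure and the extra pass for a nonzero segment base. The 64-bit width, the function names, and the operand names (base, index, displacement) are assumptions for illustration, and the misalignment case is not modeled.

```python
MASK = (1 << 64) - 1  # assumed 64-bit datapath


def carry_save_add(a: int, b: int, c: int):
    """3:2 compression: return (sum word, carry word) with no carry propagation."""
    sum_word = a ^ b ^ c
    carry_word = ((a & b) | (a & c) | (b & c)) << 1
    return sum_word & MASK, carry_word & MASK


def three_operand_add(a: int, b: int, c: int) -> int:
    """Carry-save stage followed by one carry-propagating add (the look-ahead adder's role)."""
    sum_word, carry_word = carry_save_add(a, b, c)
    return (sum_word + carry_word) & MASK


def linear_address(base: int, index: int, displacement: int, segment_base: int):
    """Return (address, passes through the adder); a nonzero segment base costs a second pass."""
    effective = three_operand_add(base, index, displacement)
    if segment_base == 0:
        return effective, 1
    # Feed the effective address back through the same adder to add the segment base.
    return three_operand_add(effective, segment_base, 0), 2
```

For example, linear_address(0x1000, 0x20, 0x8, 0) returns (0x1028, 1), while the same operands with a segment base of 0x10000 return (0x11028, 2).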

Type: Grant

Filed: October 29, 2007

Date of Patent: June 5, 2012

Assignee: Oracle America, Inc.

Inventors: Christopher H. Olson, Robert T. Golla, Manish Shah, Jeffrey S. Brooks
