Patents by Inventor Brian Michael Stempel

Brian Michael Stempel has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Method and apparatus for managing cache partitioning using a dynamic boundary

Patent number: 7650466

Abstract: A method of managing cache partitions provides a first pointer for higher priority writes and a second pointer for lower priority writes, and uses the first pointer to delimit the lower priority writes. For example, locked writes have greater priority than unlocked writes, and a first pointer may be used for locked writes, and a second pointer may be used for unlocked writes. The first pointer is advanced responsive to making locked writes, and its advancement thus defines a locked region and an unlocked region. The second pointer is advanced responsive to making unlocked writes. The second pointer also is advanced (or retreated) as needed to prevent it from pointing to locations already traversed by the first pointer. Thus, the first pointer delimits the unlocked region and allows the locked region to grow at the expense of the unlocked region.

Type: Grant

Filed: September 21, 2005

Date of Patent: January 19, 2010

Assignee: QUALCOMM Incorporated

Inventors: Brian Michael Stempel, James Norris Dieffenderfer, Jeffrey Todd Bridges, Thomas Andrew Sartorius, Rodney Wayne Smith, Robert Douglas Clancy, Victor Roberts Augsburg
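The two-pointer scheme described in the abstract of patent 7650466 can be illustrated with a small model. This is only a sketch under stated assumptions: the class name, the fixed number of ways, and the wrap-around policy for unlocked writes are hypothetical and are not taken from the patent.

```python
class PartitionedSet:
    """Toy model of one cache set with `ways` slots (all names hypothetical)."""

    def __init__(self, ways):
        self.slots = [None] * ways
        self.lock_ptr = 0    # first pointer: next slot for a locked write
        self.unlock_ptr = 0  # second pointer: next slot for an unlocked write

    def write_locked(self, value):
        if self.lock_ptr >= len(self.slots):
            raise RuntimeError("no ways left to lock")
        self.slots[self.lock_ptr] = value
        self.lock_ptr += 1
        # Move the second pointer out of the region the first pointer has traversed.
        if self.unlock_ptr < self.lock_ptr:
            self.unlock_ptr = self.lock_ptr

    def write_unlocked(self, value):
        if self.lock_ptr >= len(self.slots):
            raise RuntimeError("every way is locked")
        self.slots[self.unlock_ptr] = value
        self.unlock_ptr += 1
        # Assumed policy: wrap back to the start of the unlocked region.
        if self.unlock_ptr >= len(self.slots):
            self.unlock_ptr = self.lock_ptr
```

In this model the locked region is always `slots[:lock_ptr]`, and each locked write shrinks the unlocked region by one slot.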
Methods and system for resolving simultaneous predicted branch instructions

Patent number: 7617387

Abstract: A method of resolving simultaneous branch predictions prior to validation of the predicted branch instruction is disclosed. The method includes processing two or more predicted branch instructions, with each predicted branch instruction having a predicted state and a corrected state. The method further includes selecting one of the corrected states. Should one of the predicted branch instructions be mispredicted, the selected corrected state is used to direct future instruction fetches.

Type: Grant

Filed: September 27, 2006

Date of Patent: November 10, 2009

Assignee: QUALCOMM Incorporated

Inventors: Rodney Wayne Smith, Brian Michael Stempel, James Norris Dieffenderfer, Thomas Andrew Sartorius
Predecode Repair Cache For Instructions That Cross An Instruction Cache Line

Publication number: 20090119485

Abstract: A predecode repair cache is described in a processor capable of fetching and executing variable length instructions having instructions of at least two lengths which may be mixed in a program. An instruction cache is operable to store in an instruction cache line instructions having at least a first length and a second length, the second length longer than the first length. A predecoder is operable to predecode instructions fetched from the instruction cache that have invalid predecode information to form repaired predecode information. A predecode repair cache is operable to store the repaired predecode information associated with instructions of the second length that span across two cache lines in the instruction cache. Methods for filling the predecode repair cache and for executing an instruction that spans across two cache lines are also described.

Type: Application

Filed: November 2, 2007

Publication date: May 7, 2009

Applicant: QUALCOMM INCORPORATED

Inventors: Rodney Wayne Smith, Brian Michael Stempel, David John Mandzak, James Norris Dieffenderfer
Link Stack Repair of Erroneous Speculative Update

Publication number: 20090094444

Abstract: Whenever a link address is written to the link stack, the prior value of the link stack entry is saved, and is restored to the link stack after a link stack push operation is speculatively executed following a mispredicted branch. This condition is detected by maintaining a count of the total number of uncommitted link stack write instructions in the pipeline, and a count of the number of uncommitted link stack write instructions ahead of each branch instruction. When a branch is evaluated and determined to have been mispredicted, the count associated with it is compared to the total count. A discrepancy indicates a link stack write instruction was speculatively issued into the pipeline after the mispredicted branch instruction, and pushed a link address onto the link stack. The prior link address is restored to the link stack from the link stack restore buffer.

Type: Application

Filed: October 5, 2007

Publication date: April 9, 2009

Applicant: QUALCOMM INCORPORATED

Inventors: James Norris Dieffenderfer, Brian Michael Stempel, Rodney Wayne Smith
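The counting scheme in the abstract of publication 20090094444 can be sketched roughly as follows. This is a simplified illustration, not the claimed design: the class and method names are hypothetical, the restore buffer holds a single entry, and the model assumes no link stack write commits between tagging a branch and resolving it.

```python
class LinkStack:
    """Toy link stack with a one-entry restore buffer (names hypothetical)."""

    def __init__(self, depth=4):
        self.entries = [0] * depth
        self.top = 0                 # index of the next entry to write
        self.restore = None          # (index, prior value) saved on each write
        self.uncommitted_writes = 0  # link stack writes still in the pipeline

    def push(self, link_addr):
        idx = self.top % len(self.entries)
        self.restore = (idx, self.entries[idx])  # save the prior value
        self.entries[idx] = link_addr
        self.top += 1
        self.uncommitted_writes += 1

    def commit_write(self):
        self.uncommitted_writes -= 1

    def tag_branch(self):
        # Count of uncommitted link stack writes ahead of this branch.
        return self.uncommitted_writes

    def resolve_branch(self, tag, mispredicted):
        # A discrepancy means a push was issued speculatively after the branch.
        if mispredicted and tag != self.uncommitted_writes:
            idx, prior = self.restore
            self.entries[idx] = prior
            self.top -= 1
            self.uncommitted_writes = tag
            return True
        return False
```

A branch is tagged when it enters the pipeline; if it later resolves as mispredicted and the counts differ, the overwritten entry is put back.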
Apparatus for generating return address predictions for implicit and explicit subroutine calls

Patent number: 7478228

Abstract: An apparatus for emulating the branch prediction behavior of an explicit subroutine call is disclosed. The apparatus includes a first input which is configured to receive an instruction address and a second input. The second input is configured to receive predecode information which describes the instruction address as being related to an implicit subroutine call to a subroutine. In response to the predecode information, the apparatus also includes an adder configured to add a constant to the instruction address defining a return address, causing the return address to be stored to an explicit subroutine resource, thus, facilitating subsequent branch prediction of a return call instruction.

Type: Grant

Filed: August 31, 2006

Date of Patent: January 13, 2009

Assignee: QUALCOMM Incorporated

Inventors: Brian Michael Stempel, James Norris Dieffenderfer, Thomas Andrew Sartorius, Rodney Wayne Smith
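As a rough illustration of the abstract of patent 7478228, the adder and the explicit subroutine resource can be modeled with a plain list acting as a link stack. The constant `RETURN_OFFSET`, the function names, and the use of a Python list are assumptions made for this sketch; the abstract does not fix the value of the constant.

```python
RETURN_OFFSET = 8  # assumed constant; the real value depends on the instruction sequence


def observe_instruction(link_stack, instr_addr, is_implicit_call):
    """Push a return address when predecode marks an implicit subroutine call."""
    if is_implicit_call:
        link_stack.append(instr_addr + RETURN_OFFSET)  # the adder in the abstract


def predict_return(link_stack):
    """Predict the target of a return by popping the most recent return address."""
    return link_stack.pop() if link_stack else None
```

With this in place, a later return instruction can be predicted the same way as a return from an explicit call.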
Methods and Apparatus for Emulating the Branch Prediction Behavior of an Explicit Subroutine Call

Publication number: 20080288753

Abstract: An apparatus for emulating the branch prediction behavior of an explicit subroutine call is disclosed. The apparatus includes a first input which is configured to receive an instruction address and a second input. The second input is configured to receive predecode information which describes the instruction address as being related to an implicit subroutine call to a subroutine. In response to the predecode information, the apparatus also includes an adder configured to add a constant to the instruction address defining a return address, causing the return address to be stored to an explicit subroutine resource, thus, facilitating subsequent branch prediction of a return call instruction.

Type: Application

Filed: July 31, 2008

Publication date: November 20, 2008

Applicant: QUALCOMM INCORPORATED

Inventors: Brian Michael Stempel, James Norris Dieffenderfer, Thomas Andrew Sartorius, Rodney Wayne Smith
System, Method and Software to Preload Instructions from a Variable-Length Instruction Set with Proper Pre-Decoding

Publication number: 20080250229

Abstract: In a processor executing instructions from a variable-length instruction set, a preload instruction is operative to retrieve from memory a data block corresponding to an instruction cache line, pre-decode instructions from a variable-length instruction set in the data block, and load the instructions and pre-decode information into the instruction cache. An instruction execution unit indicates to a pre-decoder the position within the data block of a first valid instruction. The pre-decoder successively determines the length of each instruction and hence the instruction boundaries. An instruction cache line offset indicator that identifies the position of the first valid instruction may be generated and provided to the pre-decoder in a variety of ways.

Type: Application

Filed: April 4, 2007

Publication date: October 9, 2008

Applicant: QUALCOMM INCORPORATED

Inventors: Brian Michael Stempel, Thomas Andrew Sartorius, Rodney Wayne Smith
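The boundary walk described in the abstract of publication 20080250229 might look like the sketch below. The encoding is invented for illustration only (a set top bit in the first byte marks a 4-byte instruction, otherwise the instruction is 2 bytes), and the function name is hypothetical.

```python
def instruction_boundaries(block, first_valid_offset):
    """Return the start offset of each instruction in a cache-line-sized block.

    Hypothetical encoding: top bit of the first byte set means 4 bytes, else 2.
    """
    offsets = []
    pos = first_valid_offset
    while pos < len(block):
        offsets.append(pos)
        pos += 4 if block[pos] & 0x80 else 2  # length fixes the next boundary
    return offsets
```

Starting from a different offset can yield a different set of boundaries, which is why the pre-decoder needs to be told where the first valid instruction begins.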
System, Method And Software To Preload Instructions From An Instruction Set Other Than One Currently Executing

Publication number: 20080229069

Abstract: An instruction preload instruction executed in a first processor instruction set operating mode is operative to correctly preload instructions in a different, second instruction set. The instructions are pre-decoded according to the second instruction set encoding in response to an instruction set preload indicator (ISPI). In various embodiments, the ISPI may be set prior to executing the preload instruction, or may comprise part of the preload instruction or the preload target address.

Type: Application

Filed: March 14, 2007

Publication date: September 18, 2008

Applicant: QUALCOMM INCORPORATED

Inventors: Thomas Andrew Sartorius, Brian Michael Stempel, Rodney Wayne Smith
Power saving methods and apparatus to selectively enable cache bits based on known processor state

Patent number: 7421568

Abstract: A processor capable of fetching and executing variable length instructions is described having instructions of at least two lengths. The processor operates in multiple modes. One of the modes restricts instructions that can be fetched and executed to the longer length instructions. An instruction cache is used for storing variable length instructions and their associated predecode bit fields in an instruction cache line and storing the instruction address and processor operating mode state information at the time of the fetch in a tag line. The processor operating mode state information indicates the program specified mode of operation of the processor. The processor fetches instructions from the instruction cache for execution.

Type: Grant

Filed: March 4, 2005

Date of Patent: September 2, 2008

Assignee: QUALCOMM Incorporated

Inventors: Brian Michael Stempel, James Norris Dieffenderfer, Jeffrey Todd Bridges, Rodney Wayne Smith, Thomas Andrew Sartorius
Pre-decode error handling via branch correction

Patent number: 7415638

Abstract: In a pipelined processor where instructions are pre-decoded prior to being stored in a cache, an incorrectly pre-decoded instruction is detected during execution in the pipeline. The corresponding instruction is invalidated in the cache, and the instruction is forced to evaluate as a branch instruction. In particular, the branch instruction is evaluated as “mispredicted not taken” with a branch target address of the incorrectly pre-decoded instruction's address. This, with the invalidated cache line, causes the incorrectly pre-decoded instruction to be re-fetched from memory with a precise address. The re-fetched instruction is then correctly pre-decoded, written to the cache, and executed.

Type: Grant

Filed: November 22, 2004

Date of Patent: August 19, 2008

Assignee: QUALCOMM Incorporated

Inventors: Rodney Wayne Smith, Brian Michael Stempel, James Norris Dieffenderfer, Jeffrey Todd Bridges, Thomas Andrew Sartorius
Translation lookaside buffer (TLB) suppression for intra-page program counter relative or absolute address branch instructions

Patent number: 7406613

Abstract: In a pipelined processor, a pre-decoder in advance of an instruction cache calculates the branch target address (BTA) of PC-relative and absolute address branch instructions. The pre-decoder compares the BTA with the branch instruction address (BIA) to determine whether the target and instruction are in the same memory page. A branch target same page (BTSP) bit indicating this is written to the cache and associated with the instruction. When the branch is executed and evaluated as taken, a TLB access to check permission attributes for the BTA is suppressed if the BTA is in the same page as the BIA, as indicated by the BTSP bit. This reduces power consumption as the TLB access is suppressed and the BTA/BIA comparison is only performed once, when the branch instruction is first fetched. Additionally, the pre-decoder removes the BTA/BIA comparison from the BTA generation and selection critical path.

Type: Grant

Filed: December 2, 2004

Date of Patent: July 29, 2008

Assignee: QUALCOMM Incorporated

Inventors: James Norris Dieffenderfer, Thomas Andrew Sartorius, Rodney Wayne Smith, Brian Michael Stempel
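The same-page check in the abstract of patent 7406613 reduces to comparing page numbers once, at pre-decode time. The sketch below assumes 4 KiB pages; `PAGE_SHIFT` and both function names are hypothetical.

```python
PAGE_SHIFT = 12  # assumed 4 KiB pages


def predecode_btsp(bia, bta):
    """Compute the branch target same page (BTSP) bit when the branch is first fetched."""
    return (bia >> PAGE_SHIFT) == (bta >> PAGE_SHIFT)


def needs_tlb_lookup(taken, btsp):
    """A TLB permission check for the target is needed only for a taken, cross-page branch."""
    return taken and not btsp
```

The BTSP bit is stored with the instruction in the cache, so later executions consult the bit instead of repeating the comparison.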
Handling cache miss in an instruction crossing a cache line boundary

Patent number: 7404042

Abstract: A fetch section of a processor comprises an instruction cache and a pipeline of several stages for obtaining instructions. Instructions may cross cache line boundaries. The pipeline stages process two addresses to recover a complete boundary crossing instruction. During such processing, if the second piece of the instruction is not in the cache, the fetch with regard to the first line is invalidated and recycled. On this first pass, processing of the address for the second part of the instruction is treated as a pre-fetch request to load instruction data to the cache from higher level memory, without passing any of that data to the later stages of the processor. When the first line address passes through the fetch stages again, the second line address follows in the normal order, and both pieces of the instruction can be fetched from the cache and combined in the normal manner.

Type: Grant

Filed: May 18, 2005

Date of Patent: July 22, 2008

Assignee: QUALCOMM Incorporated

Inventors: Brian Michael Stempel, Jeffrey Todd Bridges, Rodney Wayne Smith, Thomas Andrew Sartorius
System and method for using a working global history register

Publication number: 20080109644

Abstract: A method of processing branch history information is disclosed. The method retrieves branch instructions from an instruction cache and executes the branch instructions in a plurality of pipeline stages. The method verifies that a branch instruction has been identified. The method further receives branch history information during a first pipeline stage and loads the branch history information into a first register. The method further loads the branch history information into a second register during a second pipeline stage.

Type: Application

Filed: November 3, 2006

Publication date: May 8, 2008

Inventors: Brian Michael Stempel, James Norris Dieffenderfer, Thomas Andrew Sartorius, Rodney Wayne Smith
Effective Use of a BHT in Processor Having Variable Length Instruction Set Execution Modes

Publication number: 20080082807

Abstract: In a processor executing instructions in at least a first instruction set execution mode having a first minimum instruction length and a second instruction set execution mode having a smaller, second minimum instruction length, line and counter index addresses are formed that access every counter in a branch history table (BHT), and reduce the number of index address bits that are multiplexed based on the current instruction set execution mode. In one embodiment, counters within a BHT line are arranged and indexed in such a manner that half of the BHT can be powered down for each access in one instruction set execution mode.

Type: Application

Filed: September 29, 2006

Publication date: April 3, 2008

Inventors: Brian Michael Stempel, Rodney Wayne Smith
Methods and System for Resolving Simultaneous Predicted Branch Instructions

Publication number: 20080077781

Abstract: A method of resolving simultaneous branch predictions prior to validation of the predicted branch instruction is disclosed. The method includes processing two or more predicted branch instructions, with each predicted branch instruction having a predicted state and a corrected state. The method further includes selecting one of the corrected states. Should one of the predicted branch instructions be mispredicted, the selected corrected state is used to direct future instruction fetches.

Type: Application

Filed: September 27, 2006

Publication date: March 27, 2008

Inventors: Rodney Wayne Smith, Brian Michael Stempel, James Norris Dieffenderfer, Thomas Andrew Sartorius
Methods and Apparatus for Emulating the Branch Prediction Behavior of an Explicit Subroutine Call

Publication number: 20080059780

Abstract: An apparatus for emulating the branch prediction behavior of an explicit subroutine call is disclosed. The apparatus includes a first input which is configured to receive an instruction address and a second input. The second input is configured to receive predecode information which describes the instruction address as being related to an implicit subroutine call to a subroutine. In response to the predecode information, the apparatus also includes an adder configured to add a constant to the instruction address defining a return address, causing the return address to be stored to an explicit subroutine resource, thus, facilitating subsequent branch prediction of a return call instruction.

Type: Application

Filed: August 31, 2006

Publication date: March 6, 2008

Inventors: Brian Michael Stempel, James Norris Dieffenderfer, Thomas Andrew Sartorius, Rodney Wayne Smith
Debug Circuit Comparing Processor Instruction Set Operating Mode

Publication number: 20080040587

Abstract: A processor is operative to execute two or more instruction sets, each in a different instruction set operating mode. As each instruction is executed, a debug circuit compares the current instruction set operating mode to a target instruction set operating mode set by a programmer, and outputs an alert or indication if they match. The alert or indication may additionally be dependent upon the instruction address falling within a predetermined target address range. The alert or indication may comprise a breakpoint signal that halts execution and/or is output as an external signal of the processor. The instruction address at which the processor detects a match in the instruction set operating modes may additionally be output. Additionally or alternatively, the alert or indication may comprise starting or stopping a trace operation, causing an exception, or any other known debugger function.

Type: Application

Filed: August 9, 2006

Publication date: February 14, 2008

Inventors: Kevin Charles Burke, Brian Michael Stempel, Daren Streett, Kevin Allen Sapp, Leslie Mark DeBruyne, Nabil Amir Rizk, Thomas Andrew Sartorius, Rodney Wayne Smith
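The match condition in the abstract of publication 20080040587 can be expressed as a small predicate. This is a behavioral sketch only; the function name, the string mode labels, and the inclusive address range are assumptions.

```python
def debug_match(current_mode, instr_addr, target_mode, addr_range=None):
    """Return True when the debug circuit should raise its alert.

    addr_range is an optional inclusive (low, high) pair of instruction addresses.
    """
    if current_mode != target_mode:
        return False
    if addr_range is None:
        return True
    low, high = addr_range
    return low <= instr_addr <= high
```

A caller could use the result to trigger a breakpoint, start or stop a trace, or raise an exception, as the abstract lists.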
Associate Cached Branch Information with the Last Granularity of Branch instruction in Variable Length instruction Set

Publication number: 20080040576

Abstract: In a variable-length instruction set wherein the length of each instruction is a multiple of a minimum instruction length granularity, an indication of the last granularity (i.e., the end) of a taken branch instruction is stored in a branch target address cache (BTAC). If a branch instruction that later hits in the BTAC is predicted taken, previously fetched instructions are flushed from the pipeline beginning immediately past the indicated end of the branch instruction. This technique saves BTAC space by avoiding the need to store the length of the branch instruction in the BTAC, and improves performance by eliminating the necessity of calculating where to begin flushing (based on the length of the branch instruction).

Type: Application

Filed: August 9, 2006

Publication date: February 14, 2008

Inventors: Brian Michael Stempel, Rodney Wayne Smith
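The address arithmetic behind the abstract of publication 20080040576 can be sketched as follows, assuming a 2-byte minimum granularity. `GRANULE` and the function names are hypothetical.

```python
GRANULE = 2  # assumed minimum instruction length granularity, in bytes


def last_granule_addr(branch_addr, branch_len):
    """Address of the branch's final granule, recorded when it is first evaluated taken."""
    return branch_addr + branch_len - GRANULE


def flush_start(stored_last_granule):
    """On a later BTAC hit predicted taken, flushing begins right after the stored granule."""
    return stored_last_granule + GRANULE
```

Because the stored value already marks the end of the branch, the flush point is the same fixed step for a 2-byte or a 4-byte branch and no length field has to be kept in the BTAC.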
Method and Apparatus for Prefetching Non-Sequential Instruction Addresses

Publication number: 20080034187

Abstract: A processor performs a prefetch operation on non-sequential instruction addresses. If a first instruction address misses in an instruction cache and accesses a higher-order memory as part of a fetch operation, and a branch instruction associated with the first instruction address or an address following the first instruction address is detected and predicted taken, a prefetch operation is performed using a predicted branch target address, during the higher-order memory access. If the predicted branch target address hits in the instruction cache during the prefetch operation, associated instructions are not retrieved, to conserve power. If the predicted branch target address misses in the instruction cache during the prefetch operation, a higher-order memory access may be launched, using the predicted branch instruction address. In either case, the first instruction address is re-loaded into the fetch stage pipeline to await the return of instructions from its higher-order memory access.

Type: Application

Filed: August 2, 2006

Publication date: February 7, 2008

Inventors: Brian Michael Stempel, Thomas Andrew Sartorius, Rodney Wayne Smith
Sliding-Window, Block-Based Branch Target Address Cache

Publication number: 20070283134

Abstract: A sliding-window, block-based Branch Target Address Cache (BTAC) comprises a plurality of entries, each entry associated with a block of instructions containing at least one branch instruction having been evaluated taken, and having a tag associated with the address of the first instruction in the block. The blocks each correspond to a group of instructions fetched from memory, such as an I-cache. Where a branch instruction is included in two or more fetch groups, it is also included in two or more instruction blocks associated with BTAC entries. The sliding-window, block-based BTAC allows for storing the Branch Target Address (BTA) of two or more taken branch instructions that fall in the same instruction block, without providing for multiple BTA storage space in each BTAC entry, by storing BTAC entries associated with different instruction blocks, each containing at least one of the taken branch instructions.

Type: Application

Filed: June 5, 2006

Publication date: December 6, 2007

Inventors: Rodney Wayne Smith, James Norris Dieffenderfer, Thomas Andrew Sartorius, Brian Michael Stempel
