Patents by Inventor Brian Michael Stempel

Brian Michael Stempel has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 7650466
    Abstract: A method of managing cache partitions provides a first pointer for higher priority writes and a second pointer for lower priority writes, and uses the first pointer to delimit the lower priority writes. For example, locked writes have greater priority than unlocked writes, and a first pointer may be used for locked writes, and a second pointer may be used for unlocked writes. The first pointer is advanced responsive to making locked writes, and its advancement thus defines a locked region and an unlocked region. The second pointer is advanced responsive to making unlocked writes. The second pointer also is advanced (or retreated) as needed to prevent it from pointing to locations already traversed by the first pointer. Thus, the pointer delimits the unlocked region and allows the locked region to grow at the expense of the unlocked region.
    Type: Grant
    Filed: September 21, 2005
    Date of Patent: January 19, 2010
    Assignee: QUALCOMM Incorporated
    Inventors: Brian Michael Stempel, James Norris Dieffenderfer, Jeffrey Todd Bridges, Thomas Andrew Sartorius, Rodney Wayne Smith, Robert Douglas Clancy, Victor Roberts Augsburg
  • Patent number: 7617387
    Abstract: A method of resolving simultaneous branch predictions prior to validation of the predicted branch instruction is disclosed. The method includes processing two or more predicted branch instructions, with each predicted branch instruction having a predicted state and a corrected state. The method further includes selecting one of the corrected states. Should one of the predicted branch instructions be mispredicted, the selected corrected state is used to direct future instruction fetches.
    Type: Grant
    Filed: September 27, 2006
    Date of Patent: November 10, 2009
    Assignee: QUALCOMM Incorporated
    Inventors: Rodney Wayne Smith, Brian Michael Stempel, James Norris Dieffenderfer, Thomas Andrew Sartorius
  • Publication number: 20090119485
    Abstract: A predecode repair cache is described in a processor capable of fetching and executing variable length instructions having instructions of at least two lengths which may be mixed in a program. An instruction cache is operable to store in an instruction cache line instructions having at least a first length and a second length, the second length longer than the first length. A predecoder is operable to predecode instructions fetched from the instruction cache that have invalid predecode information to form repaired predecode information. A predecode repair cache is operable to store the repaired predecode information associated with instructions of the second length that span across two cache lines in the instruction cache. Methods for filling the predecode repair cache and for executing an instruction that spans across two cache lines are also described.
    Type: Application
    Filed: November 2, 2007
    Publication date: May 7, 2009
    Applicant: QUALCOMM INCORPORATED
    Inventors: Rodney Wayne Smith, Brian Michael Stempel, David John Mandzak, James Norris Dieffenderfer
  • Publication number: 20090094444
    Abstract: Whenever a link address is written to the link stack, the prior value of the link stack entry is saved, and is restored to the link stack after a link stack push operation is speculatively executed following a mispredicted branch. This condition is detected by maintaining a count of the total number of uncommitted link stack write instructions in the pipeline, and a count of the number of uncommitted link stack write instructions ahead of each branch instruction. When a branch is evaluated and determined to have been mispredicted, the count associated with it is compared to the total count. A discrepancy indicates a link stack write instruction was speculatively issued into the pipeline after the mispredicted branch instruction, and pushed a link address onto the link stack. The prior link address is restored to the link stack from the link stack restore buffer.
    Type: Application
    Filed: October 5, 2007
    Publication date: April 9, 2009
    Applicant: QUALCOMM INCORPORATED
    Inventors: James Norris Dieffenderfer, Brian Michael Stempel, Rodney Wayne Smith
  • Patent number: 7478228
    Abstract: An apparatus for emulating the branch prediction behavior of an explicit subroutine call is disclosed. The apparatus includes a first input which is configured to receive an instruction address and a second input. The second input is configured to receive predecode information which describes the instruction address as being related to an implicit subroutine call to a subroutine. In response to the predecode information, the apparatus also includes an adder configured to add a constant to the instruction address defining a return address, causing the return address to be stored to an explicit subroutine resource, thus, facilitating subsequent branch prediction of a return call instruction.
    Type: Grant
    Filed: August 31, 2006
    Date of Patent: January 13, 2009
    Assignee: QUALCOMM Incorporated
    Inventors: Brian Michael Stempel, James Norris Dieffenderfer, Thomas Andrew Sartorius, Rodney Wayne Smith
  • Publication number: 20080288753
    Abstract: An apparatus for emulating the branch prediction behavior of an explicit subroutine call is disclosed. The apparatus includes a first input which is configured to receive an instruction address and a second input. The second input is configured to receive predecode information which describes the instruction address as being related to an implicit subroutine call to a subroutine. In response to the predecode information, the apparatus also includes an adder configured to add a constant to the instruction address defining a return address, causing the return address to be stored to an explicit subroutine resource, thus, facilitating subsequent branch prediction of a return call instruction.
    Type: Application
    Filed: July 31, 2008
    Publication date: November 20, 2008
    Applicant: QUALCOMM INCORPORATED
    Inventors: Brian Michael Stempel, James Norris Dieffenderfer, Thomas Andrew Sartorius, Rodney Wayne Smith
  • Publication number: 20080250229
    Abstract: In a processor executing instructions from a variable-length instruction set, a preload instruction is operative to retrieve from memory a data block corresponding to an instruction cache line, pre-decode instructions from a variable-length instruction set in the data block, and load the instructions and pre-decode information into the instruction cache. An instruction execution unit indicates to a pre-decoder the position within the data block of a first valid instruction. The pre-decoder successively determines the length of each instruction and hence the instruction boundaries. An instruction cache line offset indicator that identifies the position of the first valid instruction may be generated and provided to the pre-decoder in a variety of ways.
    Type: Application
    Filed: April 4, 2007
    Publication date: October 9, 2008
    Applicant: QUALCOMM INCORPORATED
    Inventors: Brian Michael Stempel, Thomas Andrew Sartorius, Rodney Wayne Smith
  • Publication number: 20080229069
    Abstract: An instruction preload instruction executed in a first processor instruction set operating mode is operative to correctly preload instructions in a different, second instruction set. The instructions are pre-decoded according to the second instruction set encoding in response to an instruction set preload indicator (ISPI). In various embodiments, the ISPI may be set prior to executing the preload instruction, or may comprise part of the preload instruction or the preload target address.
    Type: Application
    Filed: March 14, 2007
    Publication date: September 18, 2008
    Applicant: QUALCOMM INCORPORATED
    Inventors: Thomas Andrew Sartorius, Brian Michael Stempel, Rodney Wayne Smith
  • Patent number: 7421568
    Abstract: A processor capable of fetching and executing variable length instructions is described having instructions of at least two lengths. The processor operates in multiple modes. One of the modes restricts instructions that can be fetched and executed to the longer length instructions. An instruction cache is used for storing variable length instructions and their associated predecode bit fields in an instruction cache line and storing the instruction address and processor operating mode state information at the time of the fetch in a tag line. The processor operating mode state information indicates the program specified mode of operation of the processor. The processor fetches instructions from the instruction cache for execution.
    Type: Grant
    Filed: March 4, 2005
    Date of Patent: September 2, 2008
    Assignee: QUALCOMM Incorporated
    Inventors: Brian Michael Stempel, James Norris Dieffenderfer, Jeffrey Todd Bridges, Rodney Wayne Smith, Thomas Andrew Sartorius
  • Patent number: 7415638
    Abstract: In a pipelined processor where instructions are pre-decoded prior to being stored in a cache, an incorrectly pre-decoded instruction is detected during execution in the pipeline. The corresponding instruction is invalidated in the cache, and the instruction is forced to evaluate as a branch instruction. In particular, the branch instruction is evaluated as “mispredicted not taken” with a branch target address of the incorrectly pre-decoded instruction's address. This, with the invalidated cache line, causes the incorrectly pre-decoded instruction to be re-fetched from memory with a precise address. The re-fetched instruction is then correctly pre-decoded, written to the cache, and executed.
    Type: Grant
    Filed: November 22, 2004
    Date of Patent: August 19, 2008
    Assignee: QUALCOMM Incorporated
    Inventors: Rodney Wayne Smith, Brian Michael Stempel, James Norris Dieffenderfer, Jeffrey Todd Bridges, Thomas Andrew Sartorius
  • Patent number: 7406613
    Abstract: In a pipelined processor, a pre-decoder in advance of an instruction cache calculates the branch target address (BTA) of PC-relative and absolute address branch instructions. The pre-decoder compares the BTA with the branch instruction address (BIA) to determine whether the target and instruction are in the same memory page. A branch target same page (BTSP) bit indicating this is written to the cache and associated with the instruction. When the branch is executed and evaluated as taken, a TLB access to check permission attributes for the BTA is suppressed if the BTA is in the same page as the BIA, as indicated by the BTSP bit. This reduces power consumption as the TLB access is suppressed and the BTA/BIA comparison is only performed once, when the branch instruction is first fetched. Additionally, the pre-decoder removes the BTA/BIA comparison from the BTA generation and selection critical path.
    Type: Grant
    Filed: December 2, 2004
    Date of Patent: July 29, 2008
    Assignee: QUALCOMM Incorporated
    Inventors: James Norris Dieffenderfer, Thomas Andrew Sartorius, Rodney Wayne Smith, Brian Michael Stempel
  • Patent number: 7404042
    Abstract: A fetch section of a processor comprises an instruction cache and a pipeline of several stages for obtaining instructions. Instructions may cross cache line boundaries. The pipeline stages process two addresses to recover a complete boundary crossing instruction. During such processing, if the second piece of the instruction is not in the cache, the fetch with regard to the first line is invalidated and recycled. On this first pass, processing of the address for the second part of the instruction is treated as a pre-fetch request to load instruction data to the cache from higher level memory, without passing any of that data to the later stages of the processor. When the first line address passes through the fetch stages again, the second line address follows in the normal order, and both pieces of the instruction are can be fetched from the cache and combined in the normal manner.
    Type: Grant
    Filed: May 18, 2005
    Date of Patent: July 22, 2008
    Assignee: QUALCOMM Incorporated
    Inventors: Brian Michael Stempel, Jeffrey Todd Bridges, Rodney Wayne Smith, Thomas Andrew Sartorius
  • Publication number: 20080109644
    Abstract: A method of processing branch history information is disclosed. The method retrieves branch instructions from an instruction cache and executes the branch instructions in a plurality of pipeline stages. The method verifies that a branch instruction has been identified. The method further receives branch history information during a first pipeline stage and loads the branch history information into a first register, wherein the first register. The method further loads the branch history information into the second register during the second pipeline stage.
    Type: Application
    Filed: November 3, 2006
    Publication date: May 8, 2008
    Inventors: Brian Michael Stempel, James Norris Dieffenderfer, Thomas Andrew Sartorius, Rodney Wayne Smith
  • Publication number: 20080082807
    Abstract: In a processor executing instructions in at least a first instruction set execution mode having a first minimum instruction length and a second instruction set execution mode having a smaller, second minimum instruction length, line and counter index addresses are formed that access every counter in a branch history table (BHT), and reduce the number of index address bits that are multiplexed based on the current instruction set execution mode. In one embodiment, counters within a BHT line are arranged and indexed in such a manner that half of the BHT can be powered down for each access in one instruction set execution mode.
    Type: Application
    Filed: September 29, 2006
    Publication date: April 3, 2008
    Inventors: Brian Michael Stempel, Rodney Wayne Smith
  • Publication number: 20080077781
    Abstract: A method of resolving simultaneous branch predictions prior to validation of the predicted branch instruction is disclosed. The method includes processing two or more predicted branch instructions, with each predicted branch instruction having a predicted state and a corrected state. The method further includes selecting one of the corrected states. Should one of the predicted branch instructions be mispredicted, the selected corrected state is used to direct future instruction fetches.
    Type: Application
    Filed: September 27, 2006
    Publication date: March 27, 2008
    Inventors: Rodney Wayne Smith, Brian Michael Stempel, James Norris Dieffenderfer, Thomas Andrew Sartorius
  • Publication number: 20080059780
    Abstract: An apparatus for emulating the branch prediction behavior of an explicit subroutine call is disclosed. The apparatus includes a first input which is configured to receive an instruction address and a second input. The second input is configured to receive predecode information which describes the instruction address as being related to an implicit subroutine call to a subroutine. In response to the predecode information, the apparatus also includes an adder configured to add a constant to the instruction address defining a return address, causing the return address to be stored to an explicit subroutine resource, thus, facilitating subsequent branch prediction of a return call instruction.
    Type: Application
    Filed: August 31, 2006
    Publication date: March 6, 2008
    Inventors: Brian Michael Stempel, James Norris Dieffenderfer, Thomas Andrew Sartorius, Rodney Wayne Smith
  • Publication number: 20080040587
    Abstract: A processor is operative to execute two or more instruction sets, each in a different instruction set operating mode. As each instruction is executed, debug circuit comparison the current instruction set operating mode to a target instruction set operating mode sent by a programmer, and outputs an alert or indication in they match. The alert or indication may additionally be dependent upon the instruction address following within a predetermined target address range. The alert or indication may comprise a breakpoint signal that halts execution and/or it is output as an external signal of the processor. The instruction address at which the processor detects a match in the instruction set operating modes may additionally be output. Additionally or alternatively, the alert or indication may comprise starting or stopping a trace operation, causing an exception, or any other known debugger function.
    Type: Application
    Filed: August 9, 2006
    Publication date: February 14, 2008
    Inventors: Kevin Charles Burke, Brian Michael Stempel, Daren Streett, Kevin Allen Sapp, Leslie Mark DeBruyne, Nabil Amir Rizk, Thomas Andrew Sartorius, Rodney Wayne Smith
  • Publication number: 20080040576
    Abstract: In a variable-length instruction set wherein the length of each instruction is a multiple of a minimum instruction length granularity, an indication of the last granularity (i.e., the end) of a taken branch instruction is a stored in a branch target address cache (BTAC). If a branch instruction that later hits in the BTAC is predicted taken, previously fetched instructions are flushed from the pipeline beginning immediately past the indicated end of the branch instruction. This technique saves BTAC space by avoiding to the need to store the length of the branch instruction in the BTAC, and improves performance by eliminating the necessity of calculating where to begin flushing (based on the length of the branch instruction).
    Type: Application
    Filed: August 9, 2006
    Publication date: February 14, 2008
    Inventors: Brian Michael Stempel, Rodney Wayne Smith
  • Publication number: 20080034187
    Abstract: A processor performs a prefetch operation on non-sequential instruction addresses. If a first instruction address misses in an instruction cache and accesses a higher-order memory as part of a fetch operation, and a branch instruction associated with the first instruction address or an address following the first instruction address is detected and predicted taken, a prefetch operation is performed using a predicted branch target address, during the higher-order memory access. If the predicted branch target address hits in the instruction cache during the prefetch operation, associated instructions are not retrieved, to conserve power. If the predicted branch target address misses in the instruction cache during the prefetch operation, a higher-order memory access may be launched, using the predicted branch instruction address. In either case, the first instruction address is re-loaded into the fetch stage pipeline to await the return of instructions from its higher-order memory access.
    Type: Application
    Filed: August 2, 2006
    Publication date: February 7, 2008
    Inventors: Brian Michael Stempel, Thomas Andrew Sartorius, Rodney Wayne Smith
  • Publication number: 20070283134
    Abstract: A sliding-window, block-based Branch Target Address Cache (BTAC) comprises a plurality of entries, each entry associated with a block of instructions containing at least one branch instruction having been evaluated taken, and having a tag associated with the address of the first instruction in the block. The blocks each correspond to a group of instructions fetched from memory, such as an I-cache. Where a branch instruction is included in two or more fetch groups, it is also included in two or more instruction blocks associated with BTAC entries. The sliding-window, block-based BTAC allows for storing the Branch Target Address (BTA) of two or more taken branch instructions that fall in the same instruction block, without providing for multiple BTA storage space in each BTAC entry, by storing BTAC entries associated with different instruction blocks, each containing at least one of the taken branch instructions.
    Type: Application
    Filed: June 5, 2006
    Publication date: December 6, 2007
    Inventors: Rodney Wayne Smith, James Norris Dieffenderfer, Thomas Andrew Sartorius, Brian Michael Stempel