Patents by Inventor Stephen R. Undy

Stephen R. Undy has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 7356674
    Abstract: A method of, and apparatus for, interfacing the hardware of a processor capable of processing instructions from more than one type of instruction set. More particularly, an engine responsible for fetching native instructions from a memory subsystem (such as an EM fetch engine) is interfaced with an engine that processes emulated instructions (such as an x86 engine). This is achieved using a handshake protocol, whereby the x86 engine sends an explicit fetch request signal to the EM fetch engine along with a fetch address. The EM fetch engine then accesses the memory subsystem and retrieves a line of instructions for subsequent decode and execution. The EM fetch engine sends this line of instructions to the x86 engine along with an explicit fetch complete signal. The EM fetch engine also includes a fetch address queue capable of holding the fetch addresses before they are processed by the EM fetch engine. The fetch requests are processed such that more than one fetch request may be pending at the same time.
    Type: Grant
    Filed: November 21, 2003
    Date of Patent: April 8, 2008
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Anuj Dua, James E. McCormick, Jr., Stephen R. Undy, Barry J. Arnold, Russell C. Brockmann, David Carl Kubicek, James Curtis Stout
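The handshake protocol described in the abstract above can be illustrated with a minimal Python sketch. The class name, signal names, and dictionary layout are hypothetical stand-ins for hardware signals; the point is only the explicit request/complete handshake and the fetch address queue that lets multiple requests be pending at once.

```python
from collections import deque

class EMFetchEngine:
    """Toy model of the native (EM) fetch engine with a fetch address queue."""

    def __init__(self, memory):
        self.memory = memory   # address -> line of instructions (toy memory subsystem)
        self.queue = deque()   # fetch address queue: holds pending fetch addresses

    def fetch_request(self, address):
        # Explicit fetch request from the x86 engine; more than one
        # request may be pending at the same time.
        self.queue.append(address)

    def service_one(self):
        # Access the memory subsystem, retrieve a line of instructions,
        # and return it with an explicit fetch-complete signal.
        address = self.queue.popleft()
        return {"fetch_complete": True, "address": address,
                "line": self.memory[address]}

memory = {0x1000: ["mov", "add"], 0x2000: ["cmp", "jne"]}
engine = EMFetchEngine(memory)
engine.fetch_request(0x1000)
engine.fetch_request(0x2000)   # two fetch requests pending simultaneously
first = engine.service_one()
second = engine.service_one()
```

Queuing the addresses, rather than blocking on each request, is what allows the x86 engine to issue a new fetch before the previous one completes.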
  • Patent number: 7343479
    Abstract: The present invention is a method for implementing two architectures on a single chip. The method uses a fetch engine to retrieve instructions. If the instructions are macroinstructions, then it decodes the macroinstructions into microinstructions, and then bundles those microinstructions using a bundler, within an emulation engine. The bundles are issued in parallel and dispatched to the execution engine and contain pre-decode bits so that the execution engine treats them as microinstructions. Before being transferred to the execution engine, the instructions may be held in a buffer. The method also selects between bundled microinstructions from the emulation engine and native microinstructions coming directly from the fetch engine, by using a multiplexer or other means. Both native microinstructions and bundled microinstructions may be held in the buffer. The method also sends additional information to the execution engine.
    Type: Grant
    Filed: June 25, 2003
    Date of Patent: March 11, 2008
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Patrick Knebel, Kevin David Safford, Donald Charles Soltis, Jr., Joel D. Lamb, Stephen R. Undy, Russell C. Brockmann
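The selection between bundled microinstructions and native microinstructions can be sketched as a simple software mux. The decode rule (one macroinstruction splitting into two microinstructions) and the field names are invented for illustration; only the decode-bundle-select flow comes from the abstract.

```python
def decode_macro(macro):
    # Hypothetical decode: split one macroinstruction into two microinstructions.
    return [f"{macro}.u0", f"{macro}.u1"]

def issue(instruction, is_macro):
    # Mux: select the emulation engine's bundled output (with pre-decode
    # bits set, so the execution engine treats it as microinstructions)
    # or the native path coming directly from the fetch engine.
    if is_macro:
        return {"micros": decode_macro(instruction), "predecode": True}
    return {"micros": [instruction], "predecode": False}
```

In hardware the selection is a multiplexer in front of the issue buffer; here the `if` plays that role.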
  • Patent number: 6944751
    Abstract: The invention provides a processor architecture that bypasses data hazards. The architecture has an array of pipelines and a register file. Each of the pipelines includes an array of execution units. The register file has a first section of n registers (e.g., 128 registers) and a second section of m registers (e.g., 16 registers). A write mux couples speculative data from the execution units to the second section of m registers and non-speculative data from a write-back stage of the execution units to the first section of n registers. A read mux couples the speculative data from the second section of m registers to the execution units to bypass data hazards within the execution units. The register file preferably includes column decode logic for each of the registers in the second section of m registers to architect speculative data without moving data. The decode logic first decodes, and then selects, an age of the producer of the speculative state; the newest producer enables the decode.
    Type: Grant
    Filed: February 11, 2002
    Date of Patent: September 13, 2005
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Eric S. Fetzer, Donald C. Soltis, Jr., Stephen R. Undy
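The two-section register file above can be modeled in a few lines of Python. The class and method names are hypothetical; the sketch shows only the core idea that a consumer reads the speculative section first (the read mux bypassing the hazard) and the architected section otherwise, with write-back moving a value from one to the other.

```python
class RegisterFile:
    """Toy model: n architected registers plus m speculative-result slots."""

    def __init__(self, n=128, m=16):
        self.arch = [0] * n      # first section: non-speculative (architected) data
        self.spec = [None] * m   # second section: speculative results

    def write_speculative(self, slot, value):
        self.spec[slot] = value             # write mux: speculative path

    def write_back(self, reg, slot):
        self.arch[reg] = self.spec[slot]    # write mux: non-speculative path
        self.spec[slot] = None

    def read(self, reg, slot=None):
        # Read mux: prefer speculative data, bypassing the data hazard.
        if slot is not None and self.spec[slot] is not None:
            return self.spec[slot]
        return self.arch[reg]

rf = RegisterFile()
rf.write_speculative(0, 42)    # a producer's result, not yet architected
value = rf.read(5, slot=0)     # consumer bypasses the hazard and sees 42
rf.write_back(5, 0)            # write-back stage architects the value
```

Note the hardware avoids physically moving data by re-decoding column selects; the copy in `write_back` is a software simplification.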
  • Patent number: 6883150
    Abstract: An automatic manufacturing test case generation mechanism receives a plurality of design verification test cases (DVTCs) from one or more sources and, based thereon, automatically generates manufacturing test cases (MTCs). Each of the manufacturing test cases (MTCs) generated by the mechanism includes at least a first design verification test case and a second design verification test case.
    Type: Grant
    Filed: March 14, 2003
    Date of Patent: April 19, 2005
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Donald C. Soltis, Jr., Robert E. Weidner, Stephen R. Undy, Kevin D. Safford
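The generation rule (each MTC built from at least two DVTCs) can be sketched as follows. The pairing strategy and data representation are assumptions for illustration; the patent's mechanism is not specified at this level of detail.

```python
def generate_mtcs(dvtcs):
    """Combine consecutive DVTCs pairwise; each MTC contains at least two.

    dvtcs: list of test cases, each a list of test steps (toy representation).
    """
    mtcs = []
    for i in range(0, len(dvtcs) - 1, 2):
        # Concatenate a first and a second DVTC into one manufacturing test case.
        mtcs.append(dvtcs[i] + dvtcs[i + 1])
    return mtcs
```

A real generator would also handle an odd leftover DVTC and ordering constraints between cases.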
  • Patent number: 6799263
    Abstract: A method for prefetching instructions into cache memory using a prefetch instruction. The prefetch instruction contains a target field, a count field, a cache level field, a flush field, and a trace field. The target field specifies the address at which prefetching begins. The count field specifies the number of instructions to prefetch. The flush field indicates whether earlier prefetches should be discarded and whether in-progress prefetches should be aborted. The level field specifies the level of the cache into which the instructions should be prefetched. The trace field establishes a trace vector that can be used to determine whether the prefetching operation specified by the instruction should be aborted. The prefetch instruction may be used in conjunction with a branch predict instruction to prefetch a branch of instructions that is not predicted.
    Type: Grant
    Filed: October 28, 1999
    Date of Patent: September 28, 2004
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Dale C. Morris, James R. Callister, Stephen R. Undy
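The five fields of the prefetch instruction map naturally onto a record type. The field widths, the 16-byte instruction-bundle stride, and the helper function are illustrative assumptions, not the actual encoding.

```python
from dataclasses import dataclass

@dataclass
class PrefetchInstruction:
    # Field names follow the abstract; types and widths are illustrative.
    target: int       # address at which prefetching begins
    count: int        # number of instructions to prefetch
    cache_level: int  # cache level to prefetch into
    flush: bool       # discard earlier / abort in-progress prefetches
    trace: int        # trace vector used to decide whether to abort

def addresses_to_prefetch(insn, bundle_size=16):
    # Assume one fetch per bundle_size bytes starting at the target.
    return [insn.target + i * bundle_size for i in range(insn.count)]
```

Pairing such an instruction with a branch predict instruction lets software warm the cache along the path the predictor would otherwise ignore.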
  • Publication number: 20040181763
    Abstract: An automatic manufacturing test case generation mechanism receives a plurality of design verification test cases (DVTCs) from one or more sources and, based thereon, automatically generates manufacturing test cases (MTCs). Each of the manufacturing test cases (MTCs) generated by the mechanism includes at least a first design verification test case and a second design verification test case.
    Type: Application
    Filed: March 14, 2003
    Publication date: September 16, 2004
    Inventors: Donald C. Soltis, Robert E. Weidner, Stephen R. Undy, Kevin D. Safford
  • Publication number: 20040107335
    Abstract: A method of, and apparatus for, interfacing the hardware of a processor capable of processing instructions from more than one type of instruction set. More particularly, an engine responsible for fetching native instructions from a memory subsystem (such as an EM fetch engine) is interfaced with an engine that processes emulated instructions (such as an x86 engine). This is achieved using a handshake protocol, whereby the x86 engine sends an explicit fetch request signal to the EM fetch engine along with a fetch address. The EM fetch engine then accesses the memory subsystem and retrieves a line of instructions for subsequent decode and execution. The EM fetch engine sends this line of instructions to the x86 engine along with an explicit fetch complete signal. The EM fetch engine also includes a fetch address queue capable of holding the fetch addresses before they are processed by the EM fetch engine. The fetch requests are processed such that more than one fetch request may be pending at the same time.
    Type: Application
    Filed: November 21, 2003
    Publication date: June 3, 2004
    Inventors: Anuj Dua, James E. McCormick, Stephen R. Undy, Barry J. Arnold, Russell C. Brockmann, David Carl Kubicek, James Curtis Stout
  • Publication number: 20040095965
    Abstract: Wires that carry bits of an instruction syllable of an instruction bundle are routed to first and second branch execution units. The wires are routed over the first branch execution unit. When the first branch execution unit is configured to calculate a branch target of a long IP-relative branch instruction occupying multiple syllables of an instruction bundle, the wires are coupled to the first branch execution unit. Otherwise, the wires are not coupled to the first branch execution unit.
    Type: Application
    Filed: October 21, 2003
    Publication date: May 20, 2004
    Inventors: James E. McCormick, Stephen R. Undy, Donald Charles Soltis
  • Patent number: 6721875
    Abstract: Disclosed is a computer architecture with single-syllable IP-relative branch instructions and long IP-relative branch instructions (IP = instruction pointer). The architecture fetches instructions in multi-syllable, bundle form. Single-syllable IP-relative branch instructions occupy a single syllable in an instruction bundle, and long IP-relative branch instructions occupy two syllables in an instruction bundle. The additional syllable of the long branch carries with it additional IP-relative offset bits, which, when merged with offset bits carried in a core branch syllable, provide a much greater offset than is carried by a single-syllable branch alone. Thus, the long branch provides for greater reach within an address space. Use of the long branch to patch IA-64 architecture instruction bundles is also disclosed. Such a patch provides the reach of an indirect branch with the overhead of a single-syllable IP-relative branch.
    Type: Grant
    Filed: February 22, 2000
    Date of Patent: April 13, 2004
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: James E. McCormick, Jr., Stephen R. Undy, Donald Charles Soltis, Jr.
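The offset-merging idea behind the long branch can be shown with simple bit arithmetic. The 21-bit core width below is an illustrative assumption, not the actual IA-64 encoding; the point is only that the extra syllable's bits are concatenated above the core syllable's bits to extend reach.

```python
def long_branch_offset(core_bits, extra_bits, core_width=21):
    """Merge the additional syllable's offset bits above the core
    branch syllable's offset bits (widths are illustrative)."""
    return (extra_bits << core_width) | core_bits

# With only the core syllable, the reach tops out at (1 << core_width) - 1;
# the merged offset reaches far beyond that.
short_max = long_branch_offset(0x1FFFFF, 0)
long_reach = long_branch_offset(0x1FFFFF, 0x3)
```

This is why a long-branch patch can reach arbitrary patch code without paying the cost of an indirect branch.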
  • Patent number: 6711671
    Abstract: An apparatus for and a method of ensuring that a non-speculative instruction is not fetched into an execution pipeline, where the non-speculative instruction, if fetched, may cause a cache miss that causes potentially catastrophic speculative processing, e.g., speculative transfer of data from an I/O device. When a non-speculative instruction is scheduled for a fetch into the pipeline, a translation lookaside buffer (TLB) miss is made to occur, e.g., by preventing the lowest level TLB from storing any page table entry (PTE) associated with any of the non-speculative instructions. The TLB miss prevents the occurrence of any cache miss, and causes a micro-fault to be injected into the pipeline. The micro-fault includes an address corresponding to the subject non-speculative instruction, and when it reaches the end of the pipeline, causes a redirect of instruction flow of the pipeline to the address, and thus the non-speculative instruction is fetched and executed in a non-speculative manner.
    Type: Grant
    Filed: February 18, 2000
    Date of Patent: March 23, 2004
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Stephen R. Undy, Donald Charles Soltis, Jr.
  • Publication number: 20040049667
    Abstract: Compiled and linked program code having instructions grouped into bundles, wherein the instructions of each bundle are sequentially ordered, is patched by forming a patch bundle and one or more patch code bundles. This is done by writing a long IP-relative branch instruction into multiple syllables of the patch bundle, with the long IP-relative branch instruction providing a means of branching to patch code. Instructions which are similarly located in a bundle to be patched, and which precede the long IP-relative branch instruction, are copied into syllables of the patch bundle. Other instructions of the bundle to be patched are copied into ones of the one or more patch code bundles. The bundle to be patched is overwritten with the patch bundle, and the one or more patch code bundles are written into the patch code.
    Type: Application
    Filed: September 4, 2003
    Publication date: March 11, 2004
    Inventors: James E. McCormick, Stephen R. Undy, Donald Charles Soltis
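The bundle-patching procedure in the abstract above can be sketched as list surgery. The three-syllable bundle shape, the slot index, and the syllable names are assumptions; the sketch shows only that preceding instructions are kept in the patch bundle, the long branch fills the remaining syllables, and displaced instructions move to patch code bundles.

```python
def make_patch(bundle, branch_slot, long_branch):
    """Build a patch bundle and patch-code bundles for one bundle.

    bundle:      list of syllables to be patched
    branch_slot: syllable index where the two-syllable long branch is written
    long_branch: the two syllables of the long IP-relative branch
    """
    # Instructions preceding the long branch are copied into the patch bundle.
    patched = bundle[:branch_slot] + long_branch
    # The displaced instructions are copied into patch code bundles.
    patch_code = [bundle[branch_slot:]]
    return patched, patch_code
```

A real patcher would also emit a branch at the end of the patch code returning to the instruction after the patched bundle.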
  • Publication number: 20040030865
    Abstract: The present invention is a method for implementing two architectures on a single chip. The method uses a fetch engine to retrieve instructions. If the instructions are macroinstructions, then it decodes the macroinstructions into microinstructions, and then bundles those microinstructions using a bundler, within an emulation engine. The bundles are issued in parallel and dispatched to the execution engine and contain pre-decode bits so that the execution engine treats them as microinstructions. Before being transferred to the execution engine, the instructions may be held in a buffer. The method also selects between bundled microinstructions from the emulation engine and native microinstructions coming directly from the fetch engine, by using a multiplexer or other means. Both native microinstructions and bundled microinstructions may be held in the buffer. The method also sends additional information to the execution engine.
    Type: Application
    Filed: June 25, 2003
    Publication date: February 12, 2004
    Inventors: Patrick Knebel, Kevin David Safford, Donald Charles Soltis, Joel D. Lamb, Stephen R. Undy, Russell C. Brockmann
  • Patent number: 6678817
    Abstract: A method of, and apparatus for, interfacing the hardware of a processor capable of processing instructions from more than one type of instruction set. More particularly, an engine responsible for fetching native instructions from a memory subsystem (such as an EM fetch engine) is interfaced with an engine that processes emulated instructions (such as an x86 engine). This is achieved using a handshake protocol, whereby the x86 engine sends an explicit fetch request signal to the EM fetch engine along with a fetch address. The EM fetch engine then accesses the memory subsystem and retrieves a line of instructions for subsequent decode and execution. The EM fetch engine sends this line of instructions to the x86 engine along with an explicit fetch complete signal. The EM fetch engine also includes a fetch address queue capable of holding the fetch addresses before they are processed by the EM fetch engine. The fetch requests are processed such that more than one fetch request may be pending at the same time.
    Type: Grant
    Filed: February 22, 2000
    Date of Patent: January 13, 2004
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Anuj Dua, James E. McCormick, Jr., Stephen R. Undy, Barry J. Arnold, Russell C. Brockmann, David Carl Kubicek, James Curtis Stout
  • Patent number: 6647487
    Abstract: An apparatus and methods for optimizing prefetch performance. Logical ones are shifted into the bits of a shift register from the left for each instruction address prefetched. As instruction addresses are fetched by the processor, logical zeros are shifted into the bit positions of the shift register from the right. Once initiated, prefetching continues until a logical one is stored in the n-th bit of the shift register. Detection of this logical one in the n-th bit causes prefetching to cease until a prefetched instruction address is removed from the prefetched instruction address register and a logical zero is shifted back into the n-th bit of the shift register. Thus, autonomous prefetch agents are prevented from prefetching too far ahead of the current instruction pointer, which would result in wasted memory bandwidth and the replacement of useful instructions in the instruction cache.
    Type: Grant
    Filed: February 18, 2000
    Date of Patent: November 11, 2003
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Stephen R. Undy, James E. McCormick, Jr.
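The shift-register throttle described above can be modeled directly. The register width and method names are illustrative; the mechanics (ones in from the left on prefetch, zeros in from the right on fetch, stop when the n-th bit holds a one) follow the abstract.

```python
class PrefetchThrottle:
    """n-bit shift register bounding how far prefetch runs ahead of fetch."""

    def __init__(self, n=4):
        self.n = n
        self.bits = [0] * n   # bits[0] is the leftmost bit, bits[-1] the n-th bit

    def on_prefetch(self):
        # A logical one is shifted in from the left for each address prefetched.
        self.bits = [1] + self.bits[:-1]

    def on_fetch(self):
        # A logical zero is shifted in from the right as addresses are fetched.
        self.bits = self.bits[1:] + [0]

    def may_prefetch(self):
        # Prefetching ceases once a logical one reaches the n-th bit,
        # i.e. n prefetches are outstanding.
        return self.bits[-1] == 0
```

The number of ones in the register equals the number of outstanding prefetches, so the register doubles as a saturating up/down counter.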
  • Patent number: 6618801
    Abstract: The present invention is a method for implementing two architectures on a single chip. The method uses a fetch engine to retrieve instructions. If the instructions are macroinstructions, then it decodes the macroinstructions into microinstructions, and then bundles those microinstructions using a bundler, within an emulation engine. The bundles are issued in parallel and dispatched to the execution engine and contain pre-decode bits so that the execution engine treats them as microinstructions. Before being transferred to the execution engine, the instructions may be held in a buffer. The method also selects between bundled microinstructions from the emulation engine and native microinstructions coming directly from the fetch engine, by using a multiplexer or other means. Both native microinstructions and bundled microinstructions may be held in the buffer. The method also sends additional information to the execution engine.
    Type: Grant
    Filed: February 2, 2000
    Date of Patent: September 9, 2003
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Patrick Knebel, Kevin David Safford, Donald Charles Soltis, Jr., Joel D. Lamb, Stephen R. Undy, Russell C. Brockmann
  • Publication number: 20030163672
    Abstract: The invention provides a processor architecture that bypasses data hazards. The architecture has an array of pipelines and a register file. Each of the pipelines includes an array of execution units. The register file has a first section of n registers (e.g., 128 registers) and a second section of m registers (e.g., 16 registers). A write mux couples speculative data from the execution units to the second section of m registers and non-speculative data from a write-back stage of the execution units to the first section of n registers. A read mux couples the speculative data from the second section of m registers to the execution units to bypass data hazards within the execution units. The register file preferably includes column decode logic for each of the registers in the second section of m registers to architect speculative data without moving data. The decode logic first decodes, and then selects, an age of the producer of the speculative state; the newest producer enables the decode.
    Type: Application
    Filed: February 11, 2002
    Publication date: August 28, 2003
    Inventors: Eric S. Fetzer, Donald C. Soltis, Stephen R. Undy
  • Patent number: 6516388
    Abstract: In a cache which writes new data over less recently used data, methods and apparatus which dispense with the convention of marking new cache data as most recently used. Instead, non-referenced data is marked as less recently used when it is written into a cache, and referenced data is marked as more recently used when it is written into a cache. Referenced data may correspond to fetch data, and non-referenced data may correspond to prefetch data. Upon fetch of a data value from the cache, its use status may be updated to more recently used. The methods and apparatus have the effect of preserving (n−1)/n of a cache's entries for the storage of fetch data, while limiting the storage of prefetch data to 1/n of a cache's entries. Pollution which results from unneeded prefetch data is therefore limited to 1/n of the cache.
    Type: Grant
    Filed: September 15, 2000
    Date of Patent: February 4, 2003
    Assignee: Hewlett-Packard Company
    Inventors: James E. McCormick, Jr., Stephen R. Undy
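The insertion policy above can be sketched with an ordered map standing in for one cache set. The class is a hypothetical software analogue: fetch (referenced) data is inserted at the most-recently-used end as usual, while prefetch (non-referenced) data is inserted at the least-recently-used end, so it is the first candidate for eviction.

```python
from collections import OrderedDict

class PrefetchAwareCache:
    """One cache set: first entry is least recently used (LRU)."""

    def __init__(self, capacity=4):
        self.capacity = capacity
        self.entries = OrderedDict()

    def insert(self, key, value, referenced):
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)   # evict the LRU entry
        self.entries[key] = value              # default insert is at the MRU end
        if not referenced:
            # Prefetch data: mark it less recently used on insertion.
            self.entries.move_to_end(key, last=False)

    def fetch(self, key):
        # A real fetch promotes the entry to most recently used.
        self.entries.move_to_end(key)
        return self.entries[key]
```

Because each prefetch lands at the LRU position, at most one unreferenced prefetch line survives per set, bounding pollution to 1/n of an n-way cache.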
  • Patent number: 6493792
    Abstract: A CAM providing for the identification of a plurality of multiple-bit tag values stored in the CAM, having logic circuitry for comparing each bit of an inputted test value to the corresponding bits of all stored tag values. A bit select is employed for generating a plurality of test bits for sequential input into the logic circuitry. The logic circuitry compares each of the test bits to the corresponding bit of each stored tag value and generates a "hit" signal if the selected bit is the same as the corresponding bit of the stored tag value. Storage means are employed for recording the results of the compare with each hit signal.
    Type: Grant
    Filed: January 31, 2000
    Date of Patent: December 10, 2002
    Assignee: Hewlett-Packard Company
    Inventors: Stephen R. Undy, Terry L. Lyon
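The bit-serial compare described above can be modeled in software. The function name and the bottom-up bit order are assumptions; the sketch shows the essential behavior that one test bit is broadcast per step and a tag remains a hit only if every bit so far has compared equal.

```python
def cam_match(tags, test_value, width):
    """Bit-serially compare a test value against all stored tag values.

    Returns one accumulated 'hit' flag per stored tag.
    """
    hits = [True] * len(tags)
    for bit in range(width):                 # bit select: one test bit per step
        test_bit = (test_value >> bit) & 1
        for i, tag in enumerate(tags):
            # The hit is recorded only while every selected bit matches.
            if ((tag >> bit) & 1) != test_bit:
                hits[i] = False
    return hits
```

Comparing all stored tags against each broadcast bit in parallel is what makes the hardware a content-addressable memory rather than a linear search.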
  • Patent number: 5961655
    Abstract: Disclosed herein are methods and apparatus which provide a processor with raw, uncorrected data. The uncorrected data (or pre-corrected data) is retrieved from memory and then "bypassed" to a processing unit before its error status is known. Concurrently, error correction hardware determines the data's error status. Since the correct/incorrect indication is the first result available from error correction hardware, this result may be used to gate the actions of a processing unit prior to its taking an irrevocable action with possibly incorrect data. If bypassed data is incorrect, processing unit control logic may flag it as such and read corrected data from the output of error correction hardware. If bypassed data is correct (as will usually be the case), bypassed data may be consumed by a processing unit in due course.
    Type: Grant
    Filed: December 5, 1996
    Date of Patent: October 5, 1999
    Assignee: Hewlett-Packard Company
    Inventors: Leith Johnson, Stephen R. Undy
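The bypass-then-check flow above can be illustrated with a toy parity check standing in for real ECC hardware (which would correct, not just detect, errors). Both function names and the even-parity scheme are assumptions; only the gating behavior follows the abstract.

```python
def parity_ok(word, stored_parity):
    # Toy stand-in for ECC hardware: even-parity compare, detection only.
    return bin(word).count("1") % 2 == stored_parity

def read_word(raw, stored_parity, corrected):
    """Bypass the raw word to the consumer; gate on the check result.

    In hardware the check runs concurrently with the bypass. Here the
    usual case consumes the raw value directly, and the rare error case
    is flagged and falls back to the corrector's output.
    """
    if parity_ok(raw, stored_parity):
        return raw        # bypassed data was correct: consume in due course
    return corrected      # flagged incorrect: read corrected data instead
```

The win is latency: since errors are rare, most reads never wait for the correction logic at all.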
  • Patent number: 5860097
    Abstract: An associative cache memory for a computer with improved cache hit times. All possible data items are presented to bus driver circuits, thereby deferring data selection as long as possible. Driving and multiplexing are combined. The output of tag comparison directly selects at most one set of driver circuits. As a result, the only processing time in series with tag comparison is driver circuit selection. Since the data selection delay in series with tag comparison delay is reduced, the time delay is reduced for a clock edge for data driving after tag comparison, thereby enabling a faster clock.
    Type: Grant
    Filed: September 23, 1996
    Date of Patent: January 12, 1999
    Assignee: Hewlett-Packard Company
    Inventors: David J. Johnson, Stephen R. Undy