Patents Examined by Tonia L. Meonske
  • Patent number: 7100025
    Abstract: An apparatus and method for performing single-instruction multiple-data instructions using a single multiply-accumulate unit while minimizing operational latency. The multiply-accumulate unit generates a first half and a second half of a data result. A register stores the first half of the data result. A miscellaneous-logic unit determines when to release the first half of the data result from the register to synchronize the first half and the second half of the data result.
    Type: Grant
    Filed: January 28, 2000
    Date of Patent: August 29, 2006
    Assignees: Hewlett-Packard Development Company, L.P., Intel Corporation
    Inventor: Thomas Justin Sullivan
  • Patent number: 7100027
    Abstract: Methods and systems for replaying arbitrary system executions are disclosed. A system includes a storage element, a memory hierarchy and a processor. The memory hierarchy is coupled to the storage element. The processor is coupled to the memory hierarchy. The processor executes instructions from the memory hierarchy. A replay handler is loaded into the memory hierarchy. The replay handler is executed for replaying at least one execution. In another embodiment, a method for replaying executions is disclosed. Normal execution of a processor is interrupted. A replay/restart kernel is loaded. At least one execution is replayed. Normal execution of the processor is resumed.
    Type: Grant
    Filed: December 13, 1999
    Date of Patent: August 29, 2006
    Assignee: Intel Corporation
    Inventor: Kiran A. Padwekar
  • Patent number: 7100023
    Abstract: A system and method for handling complex instructions includes generating a jump instruction from an address, which may be embedded in a computer instruction, and selecting the original instruction if it was not complex or the jump instruction if it was.
    Type: Grant
    Filed: August 23, 2001
    Date of Patent: August 29, 2006
    Assignee: Sony Computer Entertainment Inc.
    Inventor: Hidetaka Magoshi
  • Patent number: 7100026
    Abstract: A processor implements conditional vector operations in which, for example, an input vector containing multiple operands to be used in conditional operations is divided into two or more output vectors based on a condition vector. Each output vector can then be processed at full processor efficiency without cycles wasted due to branch latency. Data to be processed are divided into two groups based on whether or not they satisfy a given condition by, e.g., steering each to one of two index vectors. Once the data have been segregated in this way, subsequent processing can be performed without conditional operations, processor cycles wasted due to branch latency, incorrect speculation or execution of unnecessary instructions due to predication. Other examples of conditional operations include combining one or more input vectors into a single output vector based on a condition vector, conditional vector switching, conditional vector combining, and conditional vector load balancing.
    Type: Grant
    Filed: May 30, 2001
    Date of Patent: August 29, 2006
    Assignees: The Massachusetts Institute of Technology, The Board of Trustees of the Leland Stanford Junior University
    Inventors: William J. Dally, Scott Rixner, John D. Owens, Ujval J. Kapasi
  • Patent number: 7085916
    Abstract: For use in a processor having an external memory interface, an instruction prefetch mechanism, a method of prefetching instructions and a digital signal processor incorporating the mechanism or the method. In one embodiment, the mechanism includes: (1) a branch predictor that predicts whether a branch is to be taken, (2) prefetch circuitry, coupled to the branch predictor, that prefetches instructions associated with the branch via the external memory interface if the branch is taken and prefetches sequential instructions via the external memory interface if the branch is not taken and (3) a loop recognizer, coupled to the prefetch circuitry, that determines whether a loop is present in fetched instructions and reinstates a validity of instructions in the loop and prevents the prefetch circuitry from prefetching instructions outside of the loop until the loop completes execution.
    Type: Grant
    Filed: October 26, 2001
    Date of Patent: August 1, 2006
    Assignee: LSI Logic Corporation
    Inventor: Hung T. Nguyen
  • Patent number: 7085914
    Abstract: According to one aspect of the invention, there is provided a method for renaming memory references to stack locations in a computer processing system. The method includes the steps of detecting stack references that use architecturally defined stack access methods, and replacing the stack references with references to processor-internal registers. The architecturally defined stack access methods include memory accesses that use one of a stack pointer, a frame pointer, and an argument pointer. Moreover, the architecturally defined stack access methods include push, pop, and other stack manipulation operations.
    Type: Grant
    Filed: January 27, 2000
    Date of Patent: August 1, 2006
    Assignee: International Business Machines Corporation
    Inventor: Michael K. Gschwind
  • Patent number: 7082520
    Abstract: Improved branch prediction utilizes both a Branch Target Buffer (BTB) and a Multiple Target Table (MTT) to provide the capability to predict multiple targets for a single branch. An MTT, when used in conjunction with a BTB, allows branches that have changing targets to selectively choose the target based on the execution path that led to the given branch. The method predicts target addresses, chooses between the static and dynamic target address, and upon finding a hit, sends the target to the instruction cache so that a fetch can begin for the current target address; the target address is also sent back to the BTB to begin the search for the next branch given the current predicted target address. Upon resolving a branch, the dynamic target is placed in the MTT for future use.
    Type: Grant
    Filed: May 9, 2002
    Date of Patent: July 25, 2006
    Assignee: International Business Machines Corporation
    Inventors: James J. Bonanno, Brian R. Prasky
  • Patent number: 7082518
    Abstract: The present invention relates to a digital signal processing apparatus comprising a plurality of available hardware resource means and a first instruction set means having access to said available hardware resource means, so that at least a part of said hardware resource means execute operations under control of said first instruction set means, and further comprising a second instruction set means having access to only a predetermined limited subset of said plurality of available hardware resource means, so that at least a part of said predetermined limited subset of said hardware resource means execute operations under control of said second instruction set means.
    Type: Grant
    Filed: October 16, 2001
    Date of Patent: July 25, 2006
    Assignee: Koninklijke Philips Electronics N.V.
    Inventors: Jeroen Anton Johan Leijten, Marco Jan Gerrit Bekooij, Adrianus Josephus Bink, Johan Sebastiaan Henri Van Gageldonk, Jan Hoogerbrugge, Bart Mesman
  • Patent number: 7080237
    Abstract: A technique for flattening architectural register windows into flattened space depending on a current window pointer to a register window is provided. The technique involves converting an n-bit value of a particular register in a register window to an x-bit value dependent on the current window pointer, where x is greater than n, and where the x-bit value is used for register dependency checking among a plurality of instructions.
    Type: Grant
    Filed: May 24, 2002
    Date of Patent: July 18, 2006
    Assignee: Sun Microsystems, Inc.
    Inventors: Chandra M. R. Thimmannagari, Sorin Iacobovici, Rabin A. Sugumar, Robert Nuckolls
  • Patent number: 7065636
    Abstract: In one embodiment, a programmable processor is adapted to support hardware loops. The processor may include hardware such as a first set of registers, a second set of registers, a first pipeline, and a second pipeline. Furthermore, the processor may include a control unit adapted to efficiently implement the hardware when performing a hardware loop.
    Type: Grant
    Filed: December 20, 2000
    Date of Patent: June 20, 2006
    Assignees: Intel Corporation, Analog Devices, Inc.
    Inventors: Ryo Inoue, Ravi P. Singh, Charles P. Roth, Gregory A. Overkamp
  • Patent number: 7047400
    Abstract: An Instruction Pointer (IP) signal is received comprising an IP tag field and an IP set field. A plurality of entries corresponding to the IP set field are read, each of the entries comprising an entry tag, an entry bank, and entry data. Each entry tag and entry bank is then compared with the IP tag and each of the plurality of banks. In one embodiment, the IP tag is concatenated with a number representing one of the plurality of banks and compared to the entry tag and entry bank. Separate comparisons may then be performed for each of the other banks.
    Type: Grant
    Filed: May 20, 2004
    Date of Patent: May 16, 2006
    Assignee: Intel Corporation
    Inventor: Nicolas I. Kacevas
  • Patent number: 7047395
    Abstract: A distributed system is provided for apportioning an instruction stream into multiple segments for processing in multiple parallel processing units, and for merging the processed segments into a single processed instruction stream having the same sequential relative order as the original instruction stream. Tags may be attached to each segment after apportioning to indicate the order in which the various segments are to be merged. In one embodiment, the end of each segment includes a tag indicating the unit to which the next instruction in the original instruction sequence is directed.
    Type: Grant
    Filed: November 13, 2001
    Date of Patent: May 16, 2006
    Assignee: Intel Corporation
    Inventors: Roni Rosner, Micha G. Moffie, Abraham Mendelson
  • Patent number: 7047396
    Abstract: A method and system for fixed-length memory-to-memory processing of fixed-length instructions. Further, the present invention is a method and system for implementing a memory operand width independent of the ALU width. The arithmetic and register data are 32 bits, but the memory operand is variable in size. The size of the memory operand is specified by the instruction. Instructions in accordance with the present invention allow for multiple memory operands in a single fixed-length instruction. The instruction set is small and simple, so the implementation is lower cost than traditional processors. More addressing modes are provided, thus producing more efficient code. Semaphores are implemented using a single bit. Shift-and-merge instructions are used to access data across word boundaries.
    Type: Grant
    Filed: June 22, 2001
    Date of Patent: May 16, 2006
    Assignee: Ubicom, Inc.
    Inventors: David A. Fotland, Roger D. Arnold, Tibet Mimaroglu
  • Patent number: 7043627
    Abstract: To alleviate factors that limit the benefit of SIMD operation when speeding up a SIMD processor, such as in-register data alignment, a register file is divided into four banks and a single operand can designate a plurality of registers, so that four registers can be accessed simultaneously. This supplies a large amount of data to a data alignment operation pipe 211 and allows data alignment operations to be carried out at high speed. Further, by defining new data pack, data unpack, and data permutation instructions, the large volume of supplied data can be aligned efficiently. These characteristics also make it possible to define a multiply-accumulate instruction that maximizes SIMD parallelism.
    Type: Grant
    Filed: September 4, 2001
    Date of Patent: May 9, 2006
    Assignee: Hitachi, Ltd.
    Inventors: Takehiro Shimizu, Fumio Arakawa
  • Patent number: 7032101
    Abstract: An apparatus and method in a high performance processor for issuing instructions, comprising: a classification logic for sorting instructions into a number of priority categories, a plurality of instruction queues storing instructions of differing priorities, and an issue logic selecting from which queue to dispatch instructions for execution. This apparatus and method can be implemented in both in-order and out-of-order execution processor architectures. The invention also involves instruction cloning and the use of various predictive techniques.
    Type: Grant
    Filed: February 26, 2002
    Date of Patent: April 18, 2006
    Assignee: International Business Machines Corporation
    Inventors: Michael Karl Gschwind, Valentina Salapura
  • Patent number: 7024545
    Abstract: A processor is configured with a first level branch prediction cache configured to store branch prediction information corresponding to a group of instructions. In addition, a second level branch prediction cache is utilized to store branch prediction information which is evicted from the first level cache. The second level branch prediction cache is configured to store only a subset of the information which is evicted from the first level cache. Branch prediction information which is evicted from the first level cache and not stored in the second level cache is discarded. Upon a miss in the first level cache, a determination is made as to whether the second level cache contains branch prediction information corresponding to the miss. If corresponding branch prediction information is detected in the second level cache, the detected branch prediction information is used to rebuild complete branch prediction information.
    Type: Grant
    Filed: July 24, 2001
    Date of Patent: April 4, 2006
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Gerald D. Zuraski, Jr., James S. Roberts
  • Patent number: 7020766
    Abstract: A conjugate processor includes an instruction set architecture (ISA) visible portion having a main pipeline, and an h-flow portion having an h-flow pipeline. The binary executed on the conjugate processor includes an essential portion that is executed on the main pipeline and a non-essential portion that is executed on the h-flow pipeline. The non-essential portion includes hint calculus that is used to provide hints to the main pipeline. The conjugate processor also includes a conjugate mapping table that maps triggers to h-flow targets. Triggers can be instruction attributes, data attributes, state attributes or event attributes. When a trigger is satisfied, the h-flow code specified by the target is executed in the h-flow pipeline.
    Type: Grant
    Filed: May 30, 2000
    Date of Patent: March 28, 2006
    Assignee: Intel Corporation
    Inventors: Hong Wang, Ralph Kling, Yong-Fong Lee, David A. Berson, Michael A. Kozuch, Konrad Lai
  • Patent number: 6976151
    Abstract: In one embodiment, a processor receives coded instructions and converts the instructions to a second code prior to execution. The processor may be a digital signal processor. A decoder in the processor determines the destination of the instructions and performs decoding functions based on the destination.
    Type: Grant
    Filed: September 28, 2000
    Date of Patent: December 13, 2005
    Assignees: Intel Corporation, Analog Devices, Inc.
    Inventors: Gregory A. Overkamp, Charles P. Roth, Ravi P. Singh
  • Patent number: 6976152
    Abstract: An apparatus for a processor includes a first scoreboard, a second scoreboard, and a control circuit coupled to the first scoreboard and the second scoreboard. The control circuit is configured to update the first scoreboard to indicate that a write is pending for a first destination register of a first instruction in response to issuing the first instruction into a first pipeline. The control circuit is configured to update the second scoreboard to indicate that the write is pending for the first destination register in response to the first instruction passing a first stage of the pipeline. Replay may be signaled for a given instruction at the first stage. In response to a replay of a second instruction, the control circuit is configured to copy the contents of the second scoreboard to the first scoreboard. In various embodiments, additional scoreboards may be used for detecting different types of dependencies.
    Type: Grant
    Filed: February 4, 2002
    Date of Patent: December 13, 2005
    Assignee: Broadcom Corporation
    Inventors: Tse-Yu Yeh, David A. Kruckemyer, Randel P. Blake-Campos, Robert Rogenmoser, Robert Stepanian
  • Patent number: 6973559
    Abstract: A system and method for interconnecting a plurality of processing element nodes within a scalable multiprocessor system is provided. Each processing element node includes at least one processor and memory. A scalable interconnect network includes physical communication links interconnecting the processing element nodes in a cluster. A first set of routers in the scalable interconnect network route messages between the plurality of processing element nodes. One or more metarouters in the scalable interconnect network route messages between the first set of routers so that each one of the routers in a first cluster is connected to all other clusters through one or more metarouters.
    Type: Grant
    Filed: September 29, 1999
    Date of Patent: December 6, 2005
    Assignee: Silicon Graphics, Inc.
    Inventors: Martin M. Deneroff, Gregory M. Thorson, Randal S. Passint
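
Illustrative note on patent 7100026 (conditional vector operations): the abstract describes dividing data into two groups by steering each element to one of two index vectors according to a condition vector, so that later processing needs no per-element branch. The following is a minimal software sketch of that idea only, not the patented hardware mechanism. The function names `split_by_condition` and `process` and the sample data are hypothetical and do not come from the patent.

```python
def split_by_condition(cond):
    """Steer each element index into one of two index vectors.

    `cond` is the condition vector: one truth value per data element.
    """
    true_idx, false_idx = [], []
    for i, c in enumerate(cond):
        (true_idx if c else false_idx).append(i)
    return true_idx, false_idx


def process(data, cond, on_true, on_false):
    """Apply a different operation to each group after segregation.

    Once the indices are split, each loop body is unconditional.
    """
    true_idx, false_idx = split_by_condition(cond)
    out = [None] * len(data)
    for i in true_idx:      # group that satisfied the condition
        out[i] = on_true(data[i])
    for i in false_idx:     # group that did not
        out[i] = on_false(data[i])
    return out


# Hypothetical example: double non-negative values, negate the rest.
data = [3, -1, 4, -5]
cond = [x >= 0 for x in data]
result = process(data, cond, lambda x: x * 2, lambda x: -x)
print(result)
```

In this sketch the only data-dependent decision is made once, while building the index vectors; the two processing loops that follow contain no conditionals, which mirrors the abstract's point about avoiding cycles lost to branch latency.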