Architecture Based Instruction Processing Patents (Class 712/200)
  • Patent number: 7761688
    Abstract: An in-order issue in-order completion micro-controller comprises a pipeline core comprising in succession a fetch address stage, a program access stage, a decode stage, a first execution stage, a second execution stage, a memory access stage, and a write back stage. The various stages are provided a thread ID such that alternating stages use a first thread ID, and the other stages use a second thread ID. Each stage which requires access to thread ID specific context information uses the thread ID to specify this context information.
    Type: Grant
    Filed: September 6, 2007
    Date of Patent: July 20, 2010
    Assignee: Redpine Signals, Inc.
    Inventor: Heonchul Park
  • Patent number: 7761689
    Abstract: A digital processor is coupled to a processor programmer through a single programming connection (e.g., terminal, pin, etc.) coupled to the single conductor programming bus. The processor programmer comprises an instruction encoder/decoder, a Manchester encoder, a Manchester decoder, a bus receiver and a bus transmitter. The digital processor comprises an instruction encoder/decoder, a Manchester encoder, a Manchester decoder, a bus receiver, a bus transmitter, a central processing unit (CPU), and a program memory. The instruction encoder/decoder is coupled to the CPU and the program memory. The bus receivers and bus transmitters are coupled to the single conductor programming bus which is coupled to a connection, e.g., terminal, pin, ball, etc., on an integrated circuit package containing the digital processor. The instruction encoder/decoder is coupled to a programming console, e.g., a personal computer, workstation, etc.
    Type: Grant
    Filed: November 13, 2008
    Date of Patent: July 20, 2010
    Assignee: Microchip Technology Incorporated
    Inventor: Joseph Alan Thomsen
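
The Manchester encoder and decoder named in the abstract above can be illustrated with a minimal sketch. The abstract does not say which Manchester convention the single-conductor bus uses, so the IEEE 802.3 convention (0 as high-then-low, 1 as low-then-high) is an assumption here, and the function names are hypothetical.

```python
def byte_to_bits(value):
    """Split one byte into a list of bits, most significant bit first."""
    return [(value >> i) & 1 for i in range(7, -1, -1)]


def manchester_encode(bits):
    """Encode bits as half-bit line levels.

    Assumed convention (IEEE 802.3): 0 -> (1, 0), 1 -> (0, 1).
    Every bit therefore carries a mid-bit transition the receiver can clock on.
    """
    levels = []
    for bit in bits:
        levels.extend((0, 1) if bit else (1, 0))
    return levels


def manchester_decode(levels):
    """Recover bits from half-bit line levels, rejecting malformed input."""
    if len(levels) % 2:
        raise ValueError("odd number of half-bit levels")
    bits = []
    for first, second in zip(levels[0::2], levels[1::2]):
        if first == second:
            raise ValueError("missing mid-bit transition")
        bits.append(1 if (first, second) == (0, 1) else 0)
    return bits
```

Because clock and data share one signal, a scheme like this suits a single programming pin: the receiver needs no separate clock line.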
  • Patent number: 7757065
    Abstract: In a front-end system for a processor, a recording scheme for instruction segments stores the instructions in reverse program order. Instruction segments may be traces, extended blocks or basic blocks. By storing the instructions in reverse program order, the instruction segment is easily extended to include additional instructions. The instruction segments may be extended without having to re-index tag arrays, pointers that associate instruction segments with other instruction segments.
    Type: Grant
    Filed: November 9, 2000
    Date of Patent: July 13, 2010
    Assignee: Intel Corporation
    Inventors: Stephan J. Jourdan, Ronny Ronen, Lihu Rappoport
  • Publication number: 20100169610
    Abstract: The processor according to the present invention is a processor having a forwarding function and includes an attribute information holding unit that holds attribute information regarding inhibition of writing to a register and a register write inhibition circuit that holds, when forwarding is performed, the writing of the data forwarded according to attribute information. The attribute information holding unit holds the attribute information by relating the attribute information to at least one register. Alternatively, the attribute information holding unit is a part of plural pipeline buffers and passes the attribute information along with the data to be forwarded, to a pipeline buffer in a subsequent stage.
    Type: Application
    Filed: October 16, 2006
    Publication date: July 1, 2010
    Applicant: MATSUSHITA ELECTRIC INDUSTRIAL CO., LTD.
    Inventors: Shin-ichiro Fukai, Makoto Kawamura
  • Patent number: 7746347
    Abstract: Methods and systems for processing a geometry shader program developed in a high-level shading language are disclosed. Specifically, in one embodiment, after having received the geometry shader program configured to be executed by a first processing unit in a programmable execution environment, the high-level shading language instructions of the geometry shader program are converted into low-level programming language instructions. The low-level programming language instructions are then linked with the low-level programming language instructions of a domain-specific shader program, which is configured to be executed by a second processing unit also residing in the programmable execution environment. The linked instructions of the geometry shader program are directed to the first processing unit, and the linked instructions of the domain-specific shader program are directed to the second processing unit.
    Type: Grant
    Filed: November 30, 2006
    Date of Patent: June 29, 2010
    Assignee: NVIDIA Corporation
    Inventors: Patrick R. Brown, Barthold B. Lichtenbelt, Christopher T. Dodd, Mark J. Kilgard
  • Patent number: 7724030
    Abstract: In one embodiment, an integrated device is disclosed. For example, in one embodiment of the present invention, a device comprises a core module for providing one or more output signals. The device comprises an output logic module for receiving the one or more output signals and an input logic module, wherein the one or more output signals are received by the input logic module via one or more feedback paths, where the one or more output signals are forwarded back to the core module.
    Type: Grant
    Filed: December 3, 2007
    Date of Patent: May 25, 2010
    Assignee: XILINX, Inc.
    Inventors: Steven E. McNeil, Andrew W. Lai
  • Patent number: 7720203
    Abstract: Systems and methods for processing speech are provided. A system may include an acoustic model to transform speech input into one or more word strings. The system may also include a semantic model to convert each of the one or more word strings into a detected object and a detected action. The system may also include a synonym table to determine a preferred object based on the detected object and to determine a preferred action based on the detected action.
    Type: Grant
    Filed: June 1, 2007
    Date of Patent: May 18, 2010
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Robert R. Bushey, Benjamin A. Knott, John M. Martin, Sarah Korth
  • Publication number: 20100122065
    Abstract: A large-scale data processing system and method for processing data in a distributed and parallel processing environment. The system includes an application-independent framework for processing data having a plurality of application-independent map modules and reduce modules. These application-independent modules use application-independent operators to automatically handle parallelization of computations across the distributed and parallel processing environment when performing user-specified data processing operations. The system also includes a plurality of user-specified, application-specific operators, for use with the application-independent framework to perform a user-specified data processing operation on a user-specified set of input files. The application-specific operators include: a map operator and a reduce operator. The map operator is applied by the application-independent map modules to input data in the user-specified set of input files to produce intermediate data values.
    Type: Application
    Filed: January 12, 2010
    Publication date: May 13, 2010
    Inventors: Jeffrey Dean, Sanjay Ghemawat
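
The split between an application-independent framework and user-specified map and reduce operators, as described in the abstract above, can be sketched in a few lines. This is a single-process illustration only: it does not parallelize anything, and `run_map_reduce`, `word_count_map`, and `word_count_reduce` are hypothetical names, not taken from the application.

```python
from collections import defaultdict


def run_map_reduce(inputs, map_op, reduce_op):
    """Application-independent driver: knows nothing about the data.

    map_op(record) yields (key, value) pairs (the intermediate data values);
    reduce_op(key, values) merges all values that share a key.
    """
    intermediate = defaultdict(list)
    for record in inputs:
        for key, value in map_op(record):
            intermediate[key].append(value)
    return {key: reduce_op(key, values) for key, values in intermediate.items()}


def word_count_map(line):
    """Application-specific map operator: emit (word, 1) for each word."""
    for word in line.split():
        yield word.lower(), 1


def word_count_reduce(word, counts):
    """Application-specific reduce operator: total the counts for one word."""
    return sum(counts)


result = run_map_reduce(["the cat", "The dog"], word_count_map, word_count_reduce)
```

Only the two operators change from one job to the next; a distributed framework would swap the loop in `run_map_reduce` for work spread across many machines.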
  • Patent number: 7694109
    Abstract: When fetching an instruction from a plurality of memory banks, a first pipeline cycle corresponding to selection of a memory bank and a second pipeline cycle corresponding to instruction readout are generated to carry out a pipeline process. Only the selected memory bank can be precharged to allow reduction of power consumption. Since the first and second pipeline cycles are effected in parallel, the throughput of the instruction memory can be improved.
    Type: Grant
    Filed: December 4, 2007
    Date of Patent: April 6, 2010
    Assignee: Renesas Technology Corp.
    Inventors: Toyohiko Yoshida, Akira Yamada, Hisakazu Sato
  • Publication number: 20100082945
    Abstract: A multi-thread processor in accordance with an exemplary aspect of the present invention includes a plurality of hardware threads each of which generates an independent instruction flow, a thread scheduler that outputs a thread selection signal TSEL designating a hardware thread to be executed in a next execution cycle, a first selector that outputs an instruction generated by a hardware thread selected according to the thread selection signal, and an execution pipeline that executes an instruction output from the first selector, wherein the thread scheduler specifies execution of at least one hardware thread selected in a fixed manner in a predetermined first execution period, and specifies execution of an arbitrary hardware thread in a second execution period.
    Type: Application
    Filed: September 28, 2009
    Publication date: April 1, 2010
    Applicant: NEC Electronics Corporation
    Inventors: Koji Adachi, Kazunori Miyamoto
  • Publication number: 20100082944
    Abstract: In an exemplary aspect, the present invention provides a multi-thread processor including a plurality of hardware threads each of which generates an independent instruction flow, a thread scheduler that outputs a thread selection signal in accordance with a first or second schedule, the thread selection signal designating a hardware thread to be executed in a next execution cycle among the plurality of hardware threads, a first selector that selects one of the plurality of hardware threads according to the thread selection signal and outputs an instruction generated by the selected hardware thread, and an execution pipeline that executes an instruction output from the first selector, wherein when the multi-thread processor is in a first state, the thread scheduler selects the first schedule, and when the multi-thread processor is in a second state, the thread scheduler selects the second schedule.
    Type: Application
    Filed: September 23, 2009
    Publication date: April 1, 2010
    Applicant: NEC ELECTRONICS CORPORATION
    Inventors: Koji Adachi, Toshiyuki Matsunaga
  • Patent number: 7685407
    Abstract: The present invention is to provide a semiconductor device that can correctly switch endians on the outside even if the endian of a parallel interface is not recognized on the outside. The semiconductor device includes a switching circuit and a first register. The switching circuit switches between whether a parallel interface with the outside is to be used as a big endian or a little endian. A first register holds control data of the switching circuit. The switching circuit regards the parallel interface as the little endian when first predetermined control information, that is unchanged in the values of specific bit positions even if its high-order and low-order bit positions are transposed, is supplied to the first register, and regards the parallel interface as the big endian when second predetermined control information, that is unchanged in the values of specific bit positions even if its high-order and low-order bit positions are transposed, is supplied to the first register.
    Type: Grant
    Filed: May 31, 2006
    Date of Patent: March 23, 2010
    Assignee: Renesas Technology Corp.
    Inventors: Goro Sakamaki, Yuri Azuma
  • Patent number: 7657766
    Abstract: In some embodiments, an apparatus for an energy-efficient clustered micro-architecture is disclosed. In one embodiment, the micro-architecture computes an energy-delay² product for each active instruction scheduler and one or more associated functional blocks of a current architecture configuration over a predetermined period. Once the energy-delay² product is computed, the computed product is compared against an energy-delay² product calculated for a prior architecture configuration to determine an effectiveness (energy efficiency) of the current architecture configuration. Based on the effectiveness of the current architecture configuration, the number of active instruction schedulers and one or more associated functional blocks within the current architecture configuration is adjusted. In one embodiment, the number of active instruction schedulers and one or more associated functional blocks may be increased or decreased to improve power efficiency of the clustered micro-architecture.
    Type: Grant
    Filed: January 26, 2007
    Date of Patent: February 2, 2010
    Assignee: Intel Corporation
    Inventors: Jose Gonzalez, Antonio Gonzalez
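
The comparison described in the abstract above can be sketched as follows. The energy-delay-squared metric is energy multiplied by the square of the delay. The abstract does not specify how the number of active schedulers is adjusted, so the hill-climbing rule below is an assumption, as are the function names and the limits of 1 and 4 clusters.

```python
def energy_delay_squared(energy_joules, delay_seconds):
    """Energy-delay-squared product: lower values mean better efficiency."""
    return energy_joules * delay_seconds ** 2


def next_cluster_count(active, last_step, current_ed2, prior_ed2, lo=1, hi=4):
    """Pick the number of active scheduler clusters for the next period.

    Assumed policy (hill climbing): if the current configuration improved on
    the prior one, keep moving in the same direction; otherwise reverse.
    Returns (new_active, step_taken); the count is clamped to [lo, hi].
    """
    step = last_step if current_ed2 < prior_ed2 else -last_step
    new_active = max(lo, min(hi, active + step))
    return new_active, step
```

Squaring the delay weights performance more heavily than energy, so a configuration that saves a little power at a large cost in speed scores worse.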
  • Patent number: 7650434
    Abstract: A system and method for enabling high-speed, low-latency global tree network communications among processing nodes interconnected according to a tree network structure. The global tree network enables collective reduction operations to be performed during parallel algorithm operations executing in a computer structure having a plurality of the interconnected processing nodes. Router devices are included that interconnect the nodes of the tree via links to facilitate performance of low-latency global processing operations at nodes of the virtual tree and sub-tree structures. The global operations performed include one or more of: broadcast operations downstream from a root node to leaf nodes of a virtual tree, reduction operations upstream from leaf nodes to the root node in the virtual tree, and point-to-point message passing from any node to the root node.
    Type: Grant
    Filed: February 25, 2002
    Date of Patent: January 19, 2010
    Assignee: International Business Machines Corporation
    Inventors: Matthias A. Blumrich, Dong Chen, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Dirk Hoenicke, Burkhard D. Steinmacher-Burow, Todd E. Takken, Pavlos M. Vranas
  • Patent number: 7647476
    Abstract: In one embodiment, the present invention includes a processor having multiple processor cores to execute instructions, with each of the cores including dedicated digital interface circuitry. The processor further includes an analog interface coupled to the cores via the digital interface circuitry. The analog interface may be used to communicate traffic between a package including the cores and an interconnect such as a shared bus coupled thereto. Other embodiments are described and claimed.
    Type: Grant
    Filed: March 14, 2006
    Date of Patent: January 12, 2010
    Assignee: Intel Corporation
    Inventors: Christopher Mozak, Jeffrey D. Gilbert, Ganapati Srinivasa
  • Patent number: 7620798
    Abstract: A synchronization mechanism is used to synchronize events across multiple execution pipelines that process transaction streams. A common set of state configuration is included in each transaction stream to control processing of data that is distributed between the different transaction streams. Portions of the state configuration correspond to portions of the data. Execution of the transaction streams is synchronized to ensure that each portion of the data is processed using the state configuration that corresponds to that portion of the data. The synchronization mechanism may be used for multiple synchronizations and when the synchronization signals are pipelined to meet chip-level timing requirements.
    Type: Grant
    Filed: October 30, 2006
    Date of Patent: November 17, 2009
    Assignee: NVIDIA Corporation
    Inventors: Mark J. French, Steven E. Molnar
  • Patent number: 7620803
    Abstract: A data processing device is provided using pipeline architecture to reduce a time loss due to a branch without causing an increase in circuit scale. The data processing device uses pipeline control. The data processing device includes an instruction queue in which a plurality of instruction codes can be fetched, a fetch address operation circuit which calculates a fetch address, a fetch circuit which fetches an instruction code based on the fetch address, and a branch information setting circuit which decodes a branch setting instruction, stores a branch address in a branch address storage register, and stores a branch target address in a branch target address storage register. The fetch address operation circuit compares either a previous fetch address or an expected next fetch address with a value stored in the branch address storage register, and determines a next fetch address to be output, based on the comparison result.
    Type: Grant
    Filed: June 20, 2003
    Date of Patent: November 17, 2009
    Assignee: Seiko Epson Corporation
    Inventor: Makoto Kudo
  • Publication number: 20090271795
    Abstract: A method and apparatus for scheduling the processing of commands by a plurality of cryptographic algorithm cores in a network processor.
    Type: Application
    Filed: February 27, 2009
    Publication date: October 29, 2009
    Inventors: Jaroslaw J. Sydir, Chen-Chi Kuo, Kamal J. Koshy, Wajdi Feghali, Bradley A. Burres, Gilbert M. Wolrich
  • Patent number: 7610475
    Abstract: A processing system with reconfigurable instruction extensions includes a processor, programmable logic, a register file, and a load/store module. The processor executes a computer program comprising a set of computational instructions and at least one instruction extension. The programmable logic receives configuration information to configure the programmable logic for the instruction extension and executes the instruction extension. The register file is coupled to the programmable logic and stores data. The load/store module transfers the data directly between the register file and a system memory.
    Type: Grant
    Filed: August 15, 2005
    Date of Patent: October 27, 2009
    Assignee: Stretch, Inc.
    Inventors: Jeffrey Mark Arnold, Gareld Howard Banta, Scott Daniel Johnson, Albert R. Wang
  • Publication number: 20090265512
    Abstract: A shared memory network for communicating between processors using store and load instructions is described. A new processor architecture which may be used with the shared memory network is also described that uses arithmetic/logic instructions that do not specify any source operand addresses or target operand addresses. The source operands and target operands for arithmetic/logic execution units are provided by independent load instruction operations and independent store instruction operations.
    Type: Application
    Filed: June 3, 2009
    Publication date: October 22, 2009
    Inventor: Gerald George Pechanek
  • Patent number: 7590823
    Abstract: Method of informing a processor that a coprocessor instruction is not executable by a coprocessor is described. The coprocessor, instantiated in configurable logic, is configured to execute a subset of coprocessor instructions, excluding user-selected instructions not instantiated. The processor is coupled to the coprocessor via a controller. The coprocessor instruction is sent from the processor to the controller, which queries control logic to determine whether the coprocessor is configured to execute the coprocessor instruction. If a control bit is set to disable an instruction or group of instructions, the coprocessor instruction is not executable by the coprocessor.
    Type: Grant
    Filed: August 6, 2004
    Date of Patent: September 15, 2009
    Assignee: Xilinx, Inc.
    Inventors: Ahmad R. Ansari, Kathryn Story Purcell
  • Patent number: 7584343
    Abstract: An active memory device includes a command engine that receives high level tasks from a host and generates corresponding sets of either DCU commands to a DRAM control unit or ACU commands to a processing array control unit. The DCU commands include memory addresses, which are also generated by the command engine, and the ACU commands include instruction memory addresses corresponding to an address in an array control unit where processing array instructions are stored. The active memory device includes a vector processing and re-ordering system coupled to the array control unit and the memory device. The vector processing and re-ordering system re-orders data received from the memory device into a vector of contiguous data, processes the data in accordance with an instruction received from the array control unit to provide results data, and passes the results data to the memory device.
    Type: Grant
    Filed: October 17, 2006
    Date of Patent: September 1, 2009
    Assignee: Micron Technology, Inc.
    Inventor: Graham Kirsch
  • Patent number: 7584150
    Abstract: An information recording medium and an optical recording system allow target information (such as an ad) to be displayed without requiring changes in hardware or physical specifications. The recording medium comprises a recording-limited area in which recording is made possible by canceling the limit after an instruction is issued for displaying the target information.
    Type: Grant
    Filed: February 28, 2002
    Date of Patent: September 1, 2009
    Assignee: Hitachi, Ltd.
    Inventors: Akemi Hirotsune, Harukazu Miyamoto, Yoshiko Nishi
  • Publication number: 20090204790
    Abstract: Technologies are described herein for buffer management during real-time streaming. A video frame buffer stores video frames generated by a real-time streaming video capture device. New video frames received from the video capture device are stored in the video frame buffer prior to processing by a video processing pipeline that processes frames stored in the video frame buffer. A buffer manager determines whether a new video frame has been received from the video capture device and stored in the video frame buffer. When the buffer manager determines that a new video frame has arrived at the video frame buffer, it then determines whether the video processing pipeline has an unprocessed video frame. If the video processing pipeline has an unprocessed video frame, the buffer manager discards the new video frame stored in the video frame buffer or performs other processing on the new video frame.
    Type: Application
    Filed: February 7, 2008
    Publication date: August 13, 2009
    Applicant: MICROSOFT CORPORATION
    Inventor: Humayun Mukhtar Khan
  • Patent number: 7571300
    Abstract: A memory system includes a plurality of memory blocks, each having a dedicated local arithmetic logic unit (ALU). A data value having a plurality of bytes is stored such that each of the bytes is stored in a corresponding one of the memory blocks. In a read-modify-write operation, each byte of the data value is read from the corresponding memory block, and is provided to the corresponding ALU. Similarly, each byte of a modify data value is provided to a corresponding ALU on a memory data bus. Each ALU combines the read byte with the modify byte to create a write byte. Because the write bytes are all generated locally within the ALUs, long signal delay paths are avoided. Each ALU also generates two possible carry bits in parallel, and then uses the actual received carry bit to select from the two possible carry bits.
    Type: Grant
    Filed: January 8, 2007
    Date of Patent: August 4, 2009
    Assignee: Integrated Device Technologies, Inc.
    Inventor: Tak Kwong Wong
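
The per-byte carry selection described in the abstract above can be sketched as below. The abstract says only that each ALU "combines" the read byte with the modify byte; addition is assumed here, and the function names and the 4-byte width are hypothetical. The loop stands in for hardware in which every ALU computes both candidates at the same time.

```python
def byte_alu(read_byte, modify_byte):
    """Local ALU for one memory block.

    Computes both possible outcomes up front: index 0 assumes carry-in 0,
    index 1 assumes carry-in 1. Each outcome is (write_byte, carry_out).
    """
    candidates = []
    for carry_in in (0, 1):
        total = read_byte + modify_byte + carry_in
        candidates.append((total & 0xFF, total >> 8))
    return candidates


def read_modify_write_add(read_value, modify_value, num_bytes=4):
    """Add modify_value to read_value byte by byte, least significant first.

    The carry that actually arrives from the previous byte only selects
    between the two precomputed candidates; no addition waits on it.
    """
    carry = 0
    write_value = 0
    for i in range(num_bytes):
        read_byte = (read_value >> (8 * i)) & 0xFF
        modify_byte = (modify_value >> (8 * i)) & 0xFF
        write_byte, carry = byte_alu(read_byte, modify_byte)[carry]
        write_value |= write_byte << (8 * i)
    return write_value
```

The only thing that ripples between blocks is a single select bit, which is how a design like this keeps long signal paths out of the critical path.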
  • Publication number: 20090193222
    Abstract: In one embodiment of the present invention, a method includes switching between a first address space and a second address space, determining if the second address space exists in a list of address spaces, and maintaining entries of the first address space in a translation buffer after the switching. In such manner, overhead associated with such a context switch may be reduced.
    Type: Application
    Filed: March 30, 2009
    Publication date: July 30, 2009
    Inventors: Jason W. Brandt, Sanjoy K. Mondal, Richard Uhlig, Gilbert Neiger, Robert T. George
  • Publication number: 20090182979
    Abstract: In a logically partitioned host computer system comprising host processors (host CPUs), a facility and instruction for discovering topology of one or more guest processors (guest CPUs) of a guest configuration comprises a guest processor of the guest configuration fetching and executing a STORE SYSTEM INFORMATION instruction that obtains topology information of the computer configuration. The topology information comprises nesting information of processors of the configuration and the degree of dedication a host processor provides to a corresponding guest processor. The information is preferably stored in a single table in memory.
    Type: Application
    Filed: January 11, 2008
    Publication date: July 16, 2009
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Mark S. Farrell, Charles W. Gainey, JR., Jeffrey P. Kubala, Donald W. Schmidt
  • Patent number: 7562209
    Abstract: A platform may use heterogeneous instruction set architectures which may be called during run time. Using a system table, an operating system may be directed to the appropriate services for any of two or more instruction set architectures during run time.
    Type: Grant
    Filed: April 7, 2004
    Date of Patent: July 14, 2009
    Assignee: Marvell International, Ltd.
    Inventors: Vincent J. Zimmer, Michael A. Rothman
  • Publication number: 20090177866
    Abstract: A method of operating a computer system. A first processor sends a first unit of binary information to an input/output (I/O) unit. The I/O unit then conveys the first unit of binary information to a functional unit in the computer system. A system response from the functional unit is then received by the I/O unit, which forwards the system response to the first processor. The system response is also stored in a first buffer. After a predetermined delay time has elapsed, the system response is then forwarded to the second processor.
    Type: Application
    Filed: January 8, 2008
    Publication date: July 9, 2009
    Inventors: Michael L. Choate, Mark D. Nicol, Michael T. Clark, Scott A. White, Gregory A. Lewis, Todd Foster, Gerald D. Zuraski, JR.
  • Publication number: 20090144525
    Abstract: A multi-threading processor is provided. The multi-threading processor includes a first instruction fetch unit to receive a first thread and a second instruction fetch unit to receive a second thread. A multi-thread scheduler is coupled to the instruction fetch units and an execution unit. The multi-thread scheduler determines the width of the execution unit, and the execution unit executes the threads accordingly.
    Type: Application
    Filed: January 23, 2009
    Publication date: June 4, 2009
    Applicant: INTEL CORPORATION
    Inventors: Ken SHOEMAKER, Sailesh KOTTAPALLI, Kin-Kee SIT
  • Publication number: 20090144737
    Abstract: An apparatus and program product utilize a multithreaded processor having at least one hardware thread among a plurality of hardware threads that is capable of being selectively activated and deactivated responsive to a control circuit. The control circuit additionally provides the capability of controlling how an inactive thread can be activated after the thread has been deactivated, e.g., by enabling or disabling reactivation in response to an interrupt.
    Type: Application
    Filed: January 23, 2009
    Publication date: June 4, 2009
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: William Joseph Armstrong, Bruce G. Mealey, Naresh Nayar, Balaram Sinharoy
  • Patent number: 7543288
    Abstract: Techniques for implementing virtual machine instructions suitable for execution in virtual machines are disclosed. The inventive virtual machine instructions can effectively represent the complete set of operations performed by the conventional Java Bytecode instruction set. Moreover, the operations performed by conventional instructions can be performed by relatively fewer inventive virtual machine instructions. Thus, a more elegant, yet robust, virtual machine instruction set can be implemented. This, in turn, allows implementation of relatively simpler interpreters as well as allowing alternative uses of the limited 256 (2^8) Bytecode representation (e.g., a macro representing a set of commands). As a result, the performance of virtual machines, especially those operating in systems with limited resources, can be improved by using the inventive virtual machine instructions.
    Type: Grant
    Filed: March 27, 2001
    Date of Patent: June 2, 2009
    Assignee: Sun Microsystems, Inc.
    Inventors: Stepan Sokolov, David Wallman
  • Publication number: 20090138677
    Abstract: A process, apparatus, and system to execute a program in an array of processor nodes that include an agent node and an executor node. A virtual program of tokens of different types represents the program and is provided in a memory. The types include a run type that includes native code instructions of the executor node. A token is loaded from the memory and executed in the agent node based on its type. In particular, if the token is an optional stop type, execution ends, and if the token is a run type, the native code instructions in the token are sent to the executor node. The native code instructions are executed in the executor node as received from the agent node. Such loading and execution continues in this manner indefinitely or until a stop type token is executed.
    Type: Application
    Filed: October 20, 2008
    Publication date: May 28, 2009
    Applicant: VNS PORTFOLIO LLC
    Inventor: Charles W. Shattuck
  • Publication number: 20090125704
    Abstract: A design structure embodied in a machine readable medium used in a design process includes an apparatus for dynamically selecting compiled instructions for execution, the apparatus including an input for receiving static instructions for execution on a first execution unit and receiving dynamic instructions for execution on a second execution unit; and an instruction selection element adapted to evaluate throughput performance of the static instructions and dynamic instructions based on current states of the execution units and select the static instructions or the dynamic instructions for execution at runtime on the first execution unit or the second execution unit, respectively, based on the throughput performance of the instructions.
    Type: Application
    Filed: November 8, 2007
    Publication date: May 14, 2009
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Deanna J. Chou, Jesse E. Craig, John Sargis, JR., Daneyand J. Singley, Sebastian T. Ventrone
  • Patent number: 7533246
    Abstract: A method for automatically configuring a microprocessor architecture so that it is able to efficiently exploit instruction level parallelism in a particular application. Executable code for another microprocessor type is translated into the specialized instruction set of the configured microprocessor. The configured microprocessor may then be used as a coprocessor in a system containing another microprocessor running the original executable code.
    Type: Grant
    Filed: June 30, 2003
    Date of Patent: May 12, 2009
    Assignee: Critical Blue Ltd.
    Inventor: Richard Michael Taylor
  • Publication number: 20090113176
    Abstract: A programmable processor and method for improving the performance of processors by expanding at least two source operands, or a source and a result operand, to a width greater than the width of either the general purpose register or the data path width. The present invention provides operands which are substantially larger than the data path width of the processor by using the contents of a general purpose register to specify a memory address at which a plurality of data path widths of data can be read or written, as well as the size and shape of the operand. In addition, several instructions and apparatus for implementing these instructions are described which obtain performance advantages if the operands are not limited to the width and accessible number of general purpose registers.
    Type: Application
    Filed: October 31, 2007
    Publication date: April 30, 2009
    Applicant: MicroUnity Systems Engineering, Inc.
    Inventors: Craig Hansen, John Moussouris, Alexia Massalin
  • Publication number: 20090094439
    Abstract: A data processing apparatus and method employing multiple register sets is disclosed. The data processing apparatus has processing logic for performing data processing operations and a register bank for storing data associated with the processing logic. The register bank has at least one register group, each register group having a plurality of register sets. The processing logic has an operating state associated with each register group defining how that register group is used, a first operating state being a state in which each register set in the register group is used to support an independent execution thread of the processing logic, and a second operating state being a state in which the register sets of the register group are collectively used to support a single execution thread of the processing logic. Control logic is provided to control how the register sets of each register group are used dependent on the operating state associated with that register group.
    Type: Application
    Filed: May 11, 2005
    Publication date: April 9, 2009
    Inventors: David Hennah Mansell, Stuart David Biles, David Michael Gilday, Danel Kershaw
  • Patent number: 7500084
    Abstract: A new zSeries floating-point unit has a fused multiply-add dataflow capable of supporting two architectures and fused MULTIPLY AND ADD and MULTIPLY AND SUBTRACT in both RRF and RXF formats for the fused functions. Both binary and hexadecimal floating-point instructions are supported for a total of 6 formats. The floating-point unit is capable of performing a multiply-add instruction for hexadecimal or binary every cycle with a latency of 5 cycles. This supports two architectures with two internal formats with their own biases. This has eliminated format conversion cycles and has optimized the width of the dataflow. The unit is optimized for both hexadecimal and binary floating-point architecture supporting a multiply-add/subtract per cycle.
    Type: Grant
    Filed: April 18, 2006
    Date of Patent: March 3, 2009
    Assignee: International Business Machines Corporation
    Inventors: Eric M. Schwarz, Ronald M. Smith, Sr.
  • Patent number: 7487331
    Abstract: A digital processor may be coupled to a processor programmer through a single conductor programming bus. The digital processor and the processor programmer, each may have a single programming connection (e.g., terminal, pin, etc.) coupled to the single conductor programming bus. The processor programmer may comprise an instruction encoder/decoder, a Manchester encoder, a Manchester decoder, a bus receiver and a bus transmitter. The bus receiver and bus transmitter may be coupled to the single connection that may be coupled to the single conductor programming bus. The instruction encoder/decoder may be coupled to a programming console, e.g., a personal computer, workstation, etc. The digital processor may comprise an instruction encoder/decoder, a Manchester encoder, a Manchester decoder, a bus receiver, a bus transmitter, a central processing unit (CPU), and a program memory. The bus receiver and bus transmitter may be coupled to the single connection, e.g., terminal, pin, ball, etc.
    Type: Grant
    Filed: September 15, 2005
    Date of Patent: February 3, 2009
    Assignee: Microchip Technology Incorporated
    Inventor: Joseph Alan Thomsen
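The abstract above names Manchester encoders and decoders on both ends of the single-conductor bus but does not say which bit convention is used. The following is a minimal sketch of Manchester coding in general, assuming the IEEE 802.3 convention (a 1 is sent as a low-to-high transition, a 0 as high-to-low). The function names `manchester_encode` and `manchester_decode` are hypothetical and do not come from the patent.

```python
def manchester_encode(bits):
    """Encode data bits as half-bit line levels (assumed 802.3 convention)."""
    levels = []
    for bit in bits:
        # 1 -> low then high, 0 -> high then low: every bit has a mid-bit edge.
        levels.extend((0, 1) if bit else (1, 0))
    return levels


def manchester_decode(levels):
    """Recover data bits from half-bit line levels, rejecting invalid pairs."""
    if len(levels) % 2:
        raise ValueError("expected an even number of half-bit levels")
    bits = []
    for i in range(0, len(levels), 2):
        pair = (levels[i], levels[i + 1])
        if pair == (0, 1):
            bits.append(1)
        elif pair == (1, 0):
            bits.append(0)
        else:
            # No transition in the middle of the bit cell: not valid Manchester.
            raise ValueError("missing mid-bit transition")
    return bits
```

Because every bit cell contains a transition, the receiver can recover the clock from the data line itself, which is what makes a single conductor sufficient.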
  • Patent number: 7487332
    Abstract: A method and apparatus within a processing system is provided for associating shadow register sets with interrupt routines. The invention includes a vector generator that receives interrupts, and generates exception vectors to call interrupt routines that correspond to the interrupts. The vector generator considers the type of interrupt and the priority level of the interrupt when selecting the exception vector. Shadow set mapping logic is coupled to the vector generator. The mapping logic contains a number of fields that correspond to the different exception vectors that may be generated. The fields are programmable by kernel mode instructions, and contain data mapping each field to one of a number of shadow register sets. When an interrupt occurs, the vector generator generates a corresponding exception vector. In addition, the shadow set mapping logic looks at the field corresponding to the exception vector, and retrieves the data stored therein.
    Type: Grant
    Filed: January 30, 2007
    Date of Patent: February 3, 2009
    Assignee: Mips Technologies, Inc.
    Inventor: Michael G. Uhler
  • Publication number: 20090030668
    Abstract: Architecture for efficient translation and processing of PowerPC guest instructions on an x86 host machine. In an x86-based architecture, signed integer values are projected into the unsigned integer value space for processing by the host using the negation of the left-most (sign) bit. Compare operations are performed in the unsigned space and the compare results are written into the host flags register. Once the compare results are written into the host flags register, the flag values can be read out and used in a table lookup to retrieve the corresponding values for the guest register. The guest flag values are then passed into the guest flags register for processing by the guest application.
    Type: Application
    Filed: July 26, 2007
    Publication date: January 29, 2009
    Applicant: MICROSOFT CORPORATION
    Inventors: Darek Mihocka, Jens Troeger
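The sign-bit projection described in the abstract above can be illustrated in a few lines. This is a sketch only, assuming 32-bit two's-complement operands. The helper names (`as_bits`, `to_unsigned_space`, `signed_less_than`) are hypothetical, and the flag-table lookup step of the abstract is not modeled.

```python
MASK32 = 0xFFFFFFFF
SIGN32 = 0x80000000


def as_bits(value):
    """Encode a Python int as a 32-bit two's-complement bit pattern."""
    return value & MASK32


def to_unsigned_space(bits):
    """Flip the sign bit so that signed order becomes unsigned order."""
    return (bits ^ SIGN32) & MASK32


def signed_less_than(a_bits, b_bits):
    """Signed comparison carried out with an unsigned compare only."""
    return to_unsigned_space(a_bits) < to_unsigned_space(b_bits)
```

Flipping the sign bit maps the most negative value to 0 and the most positive value to 0xFFFFFFFF, so an ordinary unsigned compare yields the signed ordering.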
  • Patent number: 7478223
    Abstract: A device and method for parsing a data stream comprises a parser stack configured to store one or more parsing symbols, each parsing symbol representing a different state of data stream parsing, a table interface configured to retrieve one or more production rules from a production rule table according to the parsing symbols, and a state machine configured to control the parsing of a data stream according to the retrieved production rules.
    Type: Grant
    Filed: February 28, 2006
    Date of Patent: January 13, 2009
    Assignee: Gigafin Networks, Inc.
    Inventors: Somsubhra Sikdar, Kevin Jerome Rowett, Rajesh Nair, Komal Rathi
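The combination of a parser stack and a production rule table in the abstract above resembles a table-driven predictive parser. The sketch below shows that general technique in software, not the patented hardware. The toy grammar (balanced parentheses), the `RULES` table, and the `parse` function are assumptions made for illustration.

```python
# Hypothetical production rule table for the grammar S -> ( S ) S | empty,
# keyed by (symbol on top of the stack, next input token).
RULES = {
    ("S", "("): ["(", "S", ")", "S"],
    ("S", ")"): [],
    ("S", "$"): [],
}


def parse(tokens):
    """Return True if tokens form a balanced-parenthesis string."""
    stack = ["$", "S"]            # parser stack: end marker, then start symbol
    stream = list(tokens) + ["$"]
    pos = 0
    while stack:
        top = stack.pop()
        look = stream[pos]
        if top == "S":
            rule = RULES.get((top, look))
            if rule is None:
                return False      # no production rule for this state
            stack.extend(reversed(rule))  # leftmost symbol ends up on top
        elif top == look:
            pos += 1              # terminal matched: consume one token
        else:
            return False
    return pos == len(stream)
```

The loop plays the role of the state machine: it only pops a symbol, consults the table, and either expands a rule or consumes input.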
  • Patent number: 7464252
    Abstract: A programmable processor and system for improving the performance of processors by incorporating an execution unit operable to decode and execute single instructions specifying three registers each containing a plurality of data elements, the execution unit operable to multiply the first and second registers and add the third register to produce a catenated result containing a plurality of data elements. Additional instructions provide group floating-point subtract, add, multiply, set less, and set greater equal operations. The set less and set greater equal operations produce alternatively zero or an identity element for each element of a catenated result, the result facilitating alternative selection of individual data elements using bitwise Boolean operations and without requiring conditional branch operations.
    Type: Grant
    Filed: January 16, 2004
    Date of Patent: December 9, 2008
    Assignee: Microunity Systems Engineering, Inc.
    Inventors: Craig Hansen, John Moussouris
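The branch-free selection that the abstract above attributes to the set-less operation can be sketched as follows. This is an illustration under stated assumptions: elements are modeled as unsigned integers of a fixed `width` rather than floating-point values, the "identity element" is taken to be an all-ones mask, and the function names are hypothetical.

```python
def set_less(a, b, width=16):
    """Per element: all ones if a[i] < b[i], otherwise zero."""
    ones = (1 << width) - 1
    return [ones if x < y else 0 for x, y in zip(a, b)]


def select(mask, a, b, width=16):
    """Per element: pick a[i] where the mask is all ones, else b[i]."""
    ones = (1 << width) - 1
    return [(m & x) | ((m ^ ones) & y) for m, x, y in zip(mask, a, b)]


def elementwise_min(a, b, width=16):
    """Minimum of each pair using only a compare mask and bitwise operations."""
    return select(set_less(a, b, width), a, b, width)
```

The selection uses only AND, OR, and XOR on the mask, so no conditional branch is taken per data element.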
  • Patent number: 7454570
    Abstract: A multiprocessor data processing system (MDPS) with a weakly-ordered architecture providing processing logic for substantially eliminating issuing sync instructions after every store instruction of a well-behaved application. Instructions of a well-behaved application are translated and executed by a weakly-ordered processor. The processing logic includes a lock address tracking utility (LATU), which provides an algorithm and a table of lock addresses, within which each lock address is stored when the lock is acquired by the weakly-ordered processor. When a store instruction is encountered in the instruction stream, the LATU compares the target address of the store instruction against the table of lock addresses. If the target address matches one of the lock addresses, indicating that the store instruction is the corresponding unlock instruction (or lock release instruction), a sync instruction is issued ahead of the store operation.
    Type: Grant
    Filed: December 7, 2004
    Date of Patent: November 18, 2008
    Assignee: International Business Machines Corporation
    Inventors: Andrew Dunshea, Satya Prakash Sharma, Mysore Sathyanarayana Srinivas
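The lock-address check described in the abstract above can be modeled as a small pass over an instruction stream. This is a simplified sketch: the tuple encoding of instructions, the `insert_syncs` name, and the choice to drop an address from the table once its unlock store is seen are all assumptions, not details from the patent.

```python
def insert_syncs(ops):
    """Insert a sync before any store whose target is a tracked lock address.

    ops is a list of (kind, address) tuples with kind "lock" or "store".
    """
    lock_table = set()   # addresses of locks currently held
    out = []
    for kind, addr in ops:
        if kind == "lock":
            lock_table.add(addr)          # record the lock when it is acquired
        elif kind == "store" and addr in lock_table:
            out.append(("sync", None))    # store is the unlock: sync goes first
            lock_table.discard(addr)      # assumption: the lock is now released
        out.append((kind, addr))
    return out
```

Stores to ordinary addresses pass through without a sync, which is the saving the abstract describes for well-behaved applications.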
  • Patent number: 7450131
    Abstract: Embodiments include storing graphics instructions at addresses in a memory in an original order, and storing in the memory pointers associated with each instruction pointing to the addresses of the instructions in the original order. A first pointer associated with a first graphics instruction may then be moved from pointing to a first address of the first graphics instruction to point to a second address of a second graphics instruction. Likewise, a second pointer associated with the second graphics instruction may be moved from pointing to the second address to point to the first address by accessing the first pointer before moving the first pointer to determine that the second pointer is to point to the first address (e.g., the address the first instruction points to before being moved). Afterwards, the instructions may be re-ordered into an optimized order for compiling, by switching them to different addresses according to the pointers.
    Type: Grant
    Filed: September 30, 2005
    Date of Patent: November 11, 2008
    Assignee: Intel Corporation
    Inventors: Shankar N. Swamy, Oliver Heim
  • Publication number: 20080270497
    Abstract: A decimal floating point finite number in a decimal floating point format is composed from the number in a different format. A decimal floating point format includes fields to hold information relating to the sign, exponent and significand of the decimal floating point finite number. Other decimal floating point data, including infinities and NaNs (not a number), are also composed. Decimal floating point data are also decomposed from the decimal floating point format to a different format. For composition and decomposition, one or more instructions may be employed, including one or more convert instructions.
    Type: Application
    Filed: April 26, 2007
    Publication date: October 30, 2008
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Shawn D. Lundvall, Eric M. Schwarz, Ronald M. Smith, Phil C. Yeh
  • Patent number: 7444488
    Abstract: A method and a programmable unit for bit field shifting in a memory device in a programmable unit as a result of the execution of an instruction, in which a bit segment is shifted within a first memory unit to a second memory unit, are presented. The bit segment is read with a first bit length from a first bit field in the first memory unit starting at a first start point. The bit segment that has been read is stored in the first bit field in the second memory unit starting at a second start point. The first or the second start point is updated by a predetermined value and the updated start point is stored for subsequent method steps.
    Type: Grant
    Filed: September 30, 2005
    Date of Patent: October 28, 2008
    Assignee: Infineon Technologies
    Inventors: Xiaoning Nie, Thomas Wahl
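The read, store, and update steps in the abstract above map onto ordinary mask-and-shift arithmetic. The sketch below models the two memory units as Python integers, numbers bits from the least significant end, and updates the source start point. Those three choices, and the name `move_bit_segment`, are assumptions for illustration.

```python
def move_bit_segment(src, src_start, dst, dst_start, length, step):
    """Copy `length` bits from src into dst and advance the source start point.

    Returns the updated dst value and the new source start point.
    """
    mask = (1 << length) - 1
    segment = (src >> src_start) & mask            # read the bit segment
    dst = (dst & ~(mask << dst_start)) | (segment << dst_start)  # store it
    return dst, src_start + step                   # updated start point
```

Returning the new start point lets a caller invoke the function repeatedly to walk through consecutive fields of the source word.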
  • Patent number: 7437536
    Abstract: Methods and systems are provided whereby, in one aspect, pointers to address locations of instructions, static data and dynamically-created data are stored such that the instructions, static data and dynamically-created data can be moved to a different memory or processor without changing the values of the pointers.
    Type: Grant
    Filed: May 3, 2004
    Date of Patent: October 14, 2008
    Assignee: Sony Computer Entertainment Inc.
    Inventor: Tatsuya Iwamoto
  • Patent number: 7434028
    Abstract: According to some embodiments, a new value to be pushed onto a hardware stack having n entries is determined. Each entry in the stack may include a data portion and an associated counter. If the new value equals the data portion of the entry associated with a current top of stack pointer, the counter associated with that entry is incremented. If the new value does not equal the data portion associated with the current top of stack pointer, the new value is stored in the data portion of the next entry and the current top of stack pointer is advanced.
    Type: Grant
    Filed: December 15, 2004
    Date of Patent: October 7, 2008
    Assignee: Intel Corporation
    Inventors: Michael K. Dwyer, Hong Jiang, Thomas A. Piazza
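The push rule in the abstract above amounts to run-length compression of repeated values on the stack. The following software sketch illustrates it. The `CountingStack` class, the counter starting at 1, and the `pop` behavior are assumptions, since the abstract describes only the push. The fixed limit of n entries is not modeled.

```python
class CountingStack:
    """Stack whose entries pair a data value with a repeat counter."""

    def __init__(self):
        self.entries = []  # each entry is [data, counter]; last item is the top

    def push(self, value):
        if self.entries and self.entries[-1][0] == value:
            self.entries[-1][1] += 1          # same as top: bump the counter
        else:
            self.entries.append([value, 1])   # new value: use the next entry

    def pop(self):
        # Assumed inverse of push; raises IndexError on an empty stack.
        data, count = self.entries[-1]
        if count > 1:
            self.entries[-1][1] -= 1
        else:
            self.entries.pop()
        return data
```

Pushing the same value many times in a row consumes a single entry, which is how a small hardware stack can represent a deep logical stack.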
  • Patent number: RE40509
    Abstract: An improved manifold array (ManArray) architecture addresses the problem of configurable application-specific instruction set optimization and instruction memory reduction using an instruction abbreviation process thereby further optimizing the general ManArray architecture for application to high-volume and portable battery-powered type of products. In the ManArray abbreviation process a standard 32-bit ManArray instruction is reduced to a smaller length instruction format, such as 14-bits. An application is first programmed using the full ManArray instruction set using the native 32-bit instructions. After the application program is completed and verified, an instruction-abbreviation tool analyzes the 32-bit application program and generates the abbreviated program using the abbreviated instructions. This instruction abbreviation process allows different program-reduction optimizations tailored for each application program. This process develops an optimized instruction set for the intended application.
    Type: Grant
    Filed: May 18, 2004
    Date of Patent: September 16, 2008
    Assignee: Altera Corporation
    Inventors: Gerald George Pechanek, Charles W. Kurak, Jr., Larry D. Larsen