Array Processor Patents (Class 712/10)
  • Publication number: 20080307194
    Abstract: The present invention provides a system and method for extracting elements from distributed arrays on a parallel processing system. The system includes a module that populates a local array with elements from input, a module that submits a largest element value in the local array and a processor ID for a local processor, and a module that determines a globally largest element value from the largest element values submitted by each one of the plurality of processors. The system further includes a module that broadcasts a winning globally largest element value and winning processor ID to the plurality of processors, and a module that increments an element pointer to the next value in the local array if the winning processor ID equals the processor ID for the local processor.
    Type: Application
    Filed: June 6, 2007
    Publication date: December 11, 2008
    Applicant: International Business Machines Corporation
    Inventors: Charles J. Archer, Michael A. Blocksome, Joseph D. Ratterman, Brian Smith
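    Illustrative sketch (not part of the patent record): the extraction loop described in the abstract above can be simulated on a single machine. This sketch assumes each local array is already sorted in descending order, so that the element under the pointer is the local maximum. The function name `extract_descending` and the list-of-lists input are hypothetical, and the reduction and broadcast steps are modeled with a plain `max` call rather than real inter-processor communication.

```python
def extract_descending(local_arrays):
    """Simulate the extraction loop over per-processor arrays.

    Assumption: every local array is sorted in descending order.
    """
    pointers = [0] * len(local_arrays)  # one element pointer per simulated processor
    result = []
    while True:
        # Each processor that still has elements submits (largest value, processor ID).
        submissions = [
            (arr[ptr], pid)
            for pid, (arr, ptr) in enumerate(zip(local_arrays, pointers))
            if ptr < len(arr)
        ]
        if not submissions:
            break
        # Reduction: find the globally largest value. On ties, tuple comparison
        # picks the higher processor ID, which is an arbitrary choice in this sketch.
        value, winner = max(submissions)
        result.append(value)
        # "Broadcast": only the winning processor advances its element pointer.
        pointers[winner] += 1
    return result
```

    For example, `extract_descending([[9, 4, 1], [8, 7], [5]])` walks the three simulated processors and yields the values in globally descending order.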
  • Patent number: 7461236
    Abstract: An integrated circuit includes a plurality of tiles. Each tile comprises a processor; and a switch including switching circuitry to forward data over data paths from other tiles to the processor and to switches of other tiles according to a switch instruction indicating an input port to which each of multiple output ports of the switch is to be coupled. The switch is able to operate in a first mode in which successive input data arriving at the switch are forwarded according to a different switch instruction, and a second mode in which successive input data arriving at the switch are forwarded according to the same switch instruction.
    Type: Grant
    Filed: December 21, 2005
    Date of Patent: December 2, 2008
    Assignee: Tilera Corporation
    Inventor: David Wentzlaff
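    Illustrative sketch (not part of the patent record): the two switch modes in the abstract above can be modeled as two ways of pairing incoming data with switch instructions. Everything here is an assumption made for illustration: the function name `run_switch`, the dictionary encoding of a switch instruction (output port mapped to input port), the port names, and the wrap-around of the instruction stream in the first mode.

```python
from itertools import cycle, repeat


def run_switch(instructions, inputs_per_cycle, same_instruction_mode):
    """Forward data through a simulated switch.

    instructions: list of dicts mapping output port -> input port.
    inputs_per_cycle: list of dicts mapping input port -> data word.
    same_instruction_mode: False models the first mode (a different switch
        instruction for each successive input), True models the second mode
        (the same switch instruction reused for every input).
    """
    if same_instruction_mode:
        stream = repeat(instructions[0])
    else:
        # Assumption: the instruction stream wraps around when exhausted.
        stream = cycle(instructions)
    outputs = []
    for data, instr in zip(inputs_per_cycle, stream):
        # Couple each output port to the input port named by the instruction.
        outputs.append({out: data[inp] for out, inp in instr.items()})
    return outputs
```

    With two instructions that swap which input feeds each output, the first mode alternates the routing per data word, while the second mode keeps the initial routing throughout.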
  • Patent number: 7461234
    Abstract: A heterogeneous array includes clusters of processing elements. The clusters include a combination of ALUs and multiplexers linked by direct connections and various general-purpose routing networks. The multiplexers are controlled by the ALUs in the same cluster, or alternatively by ALUs in other clusters, via a special purpose routing network. Components of applications configured onto the array are selectively implemented in either multiplexers or ALUs, as determined by the relative efficiency of implementing the component in one or the other type of processing element, and by the relative availability of the processing element types. Multiplexer control signals are generated from combinations of ALU status signals, and optionally routed to control multiplexers in different clusters.
    Type: Grant
    Filed: May 16, 2005
    Date of Patent: December 2, 2008
    Assignee: Panasonic Corporation
    Inventors: Nicholas John Charles Ray, Andrea Olgiati, Anthony I. Stansfield, Alan D. Marshall
  • Patent number: 7461235
    Abstract: Provided is a parallel data path architecture for high energy efficiency. In this architecture, a plurality of parallel process units and a plurality of function units of the process units are controlled by instructions and processed in parallel to improve performance. Also, since only necessary process units and function units are enabled, power dissipation is reduced to enhance energy efficiency. Further, by use of a simple instruction format, hardware can be programmed as the parallel data path architecture for high energy efficiency, which satisfies both excellent performance and low power dissipation, thus elevating hardware flexibility.
    Type: Grant
    Filed: June 6, 2005
    Date of Patent: December 2, 2008
    Assignee: Electronics and Telecommunications Research Institute
    Inventors: Yil Suk Yang, Tae Moon Roh, Dae Woo Lee, Sang Heung Lee, Jong Dae Kim
  • Patent number: 7447872
    Abstract: An inter-chip communication (ICC) mechanism enables any processor in a pipelined arrayed processing engine to communicate directly with any other processor of the engine over a low-latency communication path. The ICC mechanism includes a unidirectional control plane path that is separate from a data plane path of the engine and that accommodates control information flow among the processors. The mechanism thus enables inter-processor communication without sending messages over the data plane communication path extending through processors of each pipeline.
    Type: Grant
    Filed: May 30, 2002
    Date of Patent: November 4, 2008
    Assignee: Cisco Technology, Inc.
    Inventors: Russell Schroter, John William Marshall, Kenneth H. Potter
  • Patent number: 7441100
    Abstract: A method for synchronizing a plurality of processors of a multi-processor computer system on a synchronization point is disclosed. The method includes triggering a first set of processors, using a lead processor of the plurality of processors when the lead processor encounters the synchronization point, to enter an exit holding loop. The first set of processors represents the plurality of processors except the lead processor. The triggering of the first set of processors is performed without accessing a shared memory area of the multi-processor system. There is also included triggering the plurality of processors, using a tail processor of the plurality of processors when the tail processor encounters the synchronization point, to leave the exit holding loop. The triggering of the plurality of processors is performed without accessing the shared memory area of the multi-processor system.
    Type: Grant
    Filed: February 27, 2004
    Date of Patent: October 21, 2008
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Chenghung Justin Chen, John W. Curry, Robert Seymour
  • Patent number: 7418536
    Abstract: A processor for use in a router, the processor having a systolic array pipeline for processing data packets to determine to which output port of the router the data packet should be routed. In one embodiment, the systolic array pipeline includes a plurality of programmable functional units and register files arranged sequentially as stages, for processing packet contexts (which contain the packet's destination address) to perform operations, under programmatic control, to determine the destination port of the router for the packet. A single stage of the systolic array may contain a register file and one or more functional units such as adders, shifters, logical units, etc., for performing, in one example, very long instruction word (VLIW) operations. The processor may also include a forwarding table memory, on-chip, for storing routing information, and a cross bar selectively connecting the stages of the systolic array with the forwarding table memory.
    Type: Grant
    Filed: January 4, 2006
    Date of Patent: August 26, 2008
    Assignee: Cisco Technology, Inc.
    Inventors: Arthur Tung-Tak Leung, Anthony Li, William Lynch, Sharad Mehrotra
  • Patent number: 7412586
    Abstract: The present invention provides a switch memory architecture (SMA) consisting of: (i) processing elements (PE), (ii) memory banks (MB), and (iii) interconnect switches (ISWITCH). The present invention allows for efficient, potentially unbounded data transfer between two adjacent processes by passing a memory handle and the status registers (memory control information) of the MB. This function may be performed by the ISWITCH.
    Type: Grant
    Filed: July 29, 2004
    Date of Patent: August 12, 2008
    Assignee: Colorado State University Research Foundation
    Inventors: Sanjay Rajopadhye, Lakshminarayanan Renganarayana, Gautam Gupta
  • Patent number: 7404066
    Abstract: A command engine for an active memory receives high level tasks from a host and generates corresponding sets of either DCU commands to a DRAM control unit or ACU commands to a processing array control unit. The DCU commands include memory addresses, which are also generated by the command engine, and the ACU commands include instruction memory addresses corresponding to an address in an array control unit where processing array instructions are stored.
    Type: Grant
    Filed: January 24, 2007
    Date of Patent: July 22, 2008
    Assignee: Micron Technology, Inc.
    Inventor: Graham Kirsch
  • Patent number: 7389403
    Abstract: An Adaptive Computing Ensemble (ACE) includes a plurality of flexible computation units as well as an execution controller to allocate the units to Computing Ensembles (CEs) and to assign threads to the CEs. The units may be any combination of ACE-enabled units, including instruction fetch and decode units, integer execution and pipeline control units, floating-point execution units, segmentation units, special-purpose units, reconfigurable units, and memory units. Some of the units may be replicated, e.g. there may be a plurality of integer execution and pipeline control units. Some of the units may be present in a plurality of implementations, varying by performance, power usage, or both. The execution controller dynamically alters the allocation of units to threads in response to changing performance and power consumption observed behaviors and requirements.
    Type: Grant
    Filed: March 29, 2006
    Date of Patent: June 17, 2008
    Assignee: Sun Microsystems, Inc.
    Inventors: Donald B. Alpert, John Gregory Favor, Peter N. Glaskowsky, Seungyoon Peter Song
  • Patent number: 7376811
    Abstract: A data processing system architecture is based upon a hardware engine that includes a plurality of functional units and data routing units that interconnect the functional units. The hardware engine performs operations and computations on data as the data traverses paths through the functional units under control of software. The functional units include logic resources, examples of which are flip-flops, latches, arithmetic logic units, random access memory, and the like. The routing units are responsive to the software control signals that are turned on or off to steer the data through these resources. Operations and computations are accomplished according to the steering of the data through the functional units that control the functions performed.
    Type: Grant
    Filed: November 6, 2001
    Date of Patent: May 20, 2008
    Assignee: NetXen, Inc.
    Inventor: Govind Kizhepat
  • Patent number: 7376765
    Abstract: A system including a storage processing device with an input/output module. The input/output module has port processors to receive and transmit network traffic. The input/output module also has a switch connecting the port processors. Each port processor categorizes the network traffic as fast path network traffic or control path network traffic. The switch routes fast path network traffic from an ingress port processor to a specified egress port processor. The storage processing device also includes a control module to process the control path network traffic received from the ingress port processor. The control module routes processed control path network traffic to the switch for routing to a defined egress port processor. The control module is connected to the input/output module. The input/output module and the control module are configured to interactively support data virtualization, data migration, data journaling, and snapshotting.
    Type: Grant
    Filed: October 28, 2003
    Date of Patent: May 20, 2008
    Assignee: Brocade Communications Systems, Inc.
    Inventors: Venkat Rangan, Edward D. McClanahan, Guruaj Pangal, Curt E. Beckmann
  • Patent number: 7373432
    Abstract: A programmable circuit receives configuration data from an external source, stores the firmware in a memory, and then downloads the firmware from the memory. Such a programmable circuit allows a system, such as a computing machine, to modify the programmable circuit's configuration, thus eliminating the need for manually reprogramming the configuration memory. For example, if the programmable circuit is an FPGA that is part of a pipeline accelerator, a processor coupled to the accelerator can modify the configuration of the FPGA. More specifically, the processor retrieves from a configuration registry firmware that represents the modified configuration, and sends the firmware to the FPGA, which then stores the firmware in a memory such as an electrically erasable and programmable read-only memory (EEPROM). Next, the FPGA downloads the firmware from the memory into its configuration registers, and thus reconfigures itself to have the modified configuration.
    Type: Grant
    Filed: October 9, 2003
    Date of Patent: May 13, 2008
    Assignee: Lockheed Martin
    Inventors: John W. Rapp, Larry Jackson, Mark Jones, Troy Cherasaro
  • Publication number: 20080104366
    Abstract: Disclosed herein is a semiconductor chip including at least two processing apparatuses which comply with the same interface specifications and which differ in internal structure, wherein at least one of the processing apparatuses is constituted functionally to replace at least one processing apparatus.
    Type: Application
    Filed: September 11, 2007
    Publication date: May 1, 2008
    Applicant: Sony Corporation
    Inventor: Mutsuhiro Ohmori
  • Patent number: 7355601
    Abstract: A CPU module includes a host element configured to perform a high-level host-related task, and one or more data-generating processing elements configured to perform a data-generating task associated with the high-level host-related task. Each data-generating processing element includes logic configured to receive input data, and logic configured to process the input data to produce output data.
    Type: Grant
    Filed: December 16, 2005
    Date of Patent: April 8, 2008
    Assignees: International Business Machines Corporation, Microsoft Corporation
    Inventors: Jeffrey A. Andrews, Nicholas R. Baker, J. Andrew Goossen, Russell D. Hoover, Eric O. Mejdrich, Sandra S. Woodward
  • Publication number: 20080010435
    Abstract: One embodiment of the present invention sets forth a memory module that includes at least one memory chip, and an intelligent chip coupled to the at least one memory chip and a memory controller, where the intelligent chip is configured to implement at least a part of a RAS feature. The disclosed architecture allows one or more RAS features to be implemented locally to the memory module using one or more intelligent register chips, one or more intelligent buffer chips, or some combination thereof. Such an approach not only increases the effectiveness of certain RAS features that were available in prior art systems, but also enables the implementation of certain RAS features that were not available in prior art systems.
    Type: Application
    Filed: June 14, 2007
    Publication date: January 10, 2008
    Inventors: Michael John Sebastian Smith, Suresh Natarajan Rajan
  • Patent number: 7318143
    Abstract: An information processor for executing a program comprising a plurality of separate program instructions is provided. The processor comprises processing logic operable to individually execute said separate program instructions of said program, an operand store operable to store operand values and an accelerator having a plurality of functional units. The accelerator executes a combined operation corresponding to a computational sub-graph of the separate program instructions by configuring individual ones of said plurality of functional units to perform particular processing operations associated with the combined operation. The accelerator executes the combined operation in dependence upon operand mapping data providing a mapping between operands of the combined operation and storage locations within said operand store and in dependence upon separately specified configuration data providing a mapping between the plurality of functional units and the particular processing operations.
    Type: Grant
    Filed: January 28, 2005
    Date of Patent: January 8, 2008
    Assignees: ARM Limited, University of Michigan
    Inventors: Stuart D. Biles, Krisztian Flautner, Scott Mahlke, Nathan Clark
  • Patent number: 7315933
    Abstract: The present invention is a re-configurable circuit capable of reducing latency by selecting a route for skipping the flip-flop (FF) of an operation unit and outputting data to a connection destination operation unit if an accumulated process time is below an operation cycle allocated to the operation unit. The operation unit comprises at least a selector, a flip-flop and an operator. In a program for generating configuration information for switching the configuration of the operation unit of the re-configurable circuit, the selector selects the use/non-use of the flip-flop based on the configuration information, and a selector switching condition is reflected in the configuration information for determining whether to take a route for transferring data input to the selector to the operator or a route for transferring the data to the operator while skipping the flip-flop.
    Type: Grant
    Filed: October 6, 2005
    Date of Patent: January 1, 2008
    Assignee: Fujitsu Limited
    Inventor: Seiichi Nishijima
  • Patent number: 7313788
    Abstract: A method for determining vectorization configurations in a computer processor architecture, the method including identifying a vectorizable loop in a computer program, identifying a memory access pattern of data required for implementing the loop in the architecture, computing a set of candidate configurations of resources required for vectorizing the data in the architecture, where the computing step includes configuring a vector pointer register of the architecture in support of either of reorder-on-read use and reorder-on-write use of a vector element file of the architecture, selecting one of the candidates in accordance with predefined selection criteria, and implementing the selected vectorization configuration in the architecture.
    Type: Grant
    Filed: October 29, 2003
    Date of Patent: December 25, 2007
    Assignee: International Business Machines Corporation
    Inventors: Shay Ben-David, Dorit Naishlos, Uzi Shvadron, Ayal Zaks
  • Publication number: 20070294507
    Abstract: A method and apparatus for improving the operation of a computer processor by utilizing an asymmetric clustered processor architecture are disclosed. The asymmetric clustered processor apparatus includes a narrow cluster, a wide cluster, a steering logic utilizing a cluster predictor for providing a decoded instruction to either the narrow cluster or the wide cluster; address registers which are not part of the ISA, and a translation look-aside buffer for translating the virtual address of a load/store instruction in parallel with an execute stage. The method includes the steps of: predictably steering the instruction to either a W-bit Wide integer cluster or an N-bit Narrow integer cluster, managing the Address register file, and processing any instruction in the Wide integer cluster but processing only N-bit instructions in the Narrow integer cluster.
    Type: Application
    Filed: June 16, 2006
    Publication date: December 20, 2007
    Applicants: The Regents of the University of California, Universitat Politecnica de Catalunya
    Inventors: Alexander V. Veidenbaum, Adrian Cristal Kestelman, Mateo Valero Cortes, Ruben Gonzalez Garcia
  • Patent number: 7299339
    Abstract: A field programmable gate array includes a virtual bus interface that receives a control word from a host processor over a standard I/O bus. A configurable very long instruction word (VLIW) controller receives the control word via virtual bus interface signals mapped from the virtual bus interface. A reconfigurable communication and control fabric controls the data paths and programming modes of single instruction-multiple data (SIMD) processing element cells. The configurable VLIW controller has an interface with the reconfigurable communication and control fabric. SIMD processing element cells are controlled by the configurable VLIW controller through the reconfigurable communication and control fabric via the interface.
    Type: Grant
    Filed: August 30, 2004
    Date of Patent: November 20, 2007
    Assignee: The Boeing Company
    Inventor: Tirumale K. Ramesh
  • Patent number: 7290157
    Abstract: A processor comprises a main controller (CTR11) and a plurality of processing units (1–9). Each processing unit (1–9) has a local controller (CTR1–CTR9) and at least one functional unit (FU1–FU9) controllable by the local controller (CTR1–CTR9). The local controller (CTR1–CTR9) of a processing unit (1–9) is coupled (15) to the main controller (CTR11). The processor further comprises an instruction set, having at least one instruction for increasing the activity of at least one processing unit (1–9). The main controller (CTR11) is arranged to process the at least one instruction for increasing the activity of at least one processing unit (1–9). One or more processing units (1–9) of the processor can be completely switched off, including the corresponding local controller (CTR1–CTR9), since the instructions for switching on a processing unit (1–9) are not processed by the corresponding local controller (CTR1–CTR9), but by the main controller (CTR11) itself.
    Type: Grant
    Filed: April 28, 2003
    Date of Patent: October 30, 2007
    Assignee: Koninklijke Philips Electronics N.V.
    Inventors: Bernardo De Oliveira Kastrup Pereira, Vishal Suhas Choudhary
  • Patent number: 7287146
    Abstract: An array-type computer processor stops, with a plurality of computer programs held, a state control unit and a data-path unit, upon input of event data for task switching. The array-type computer processor obtains the operation state of the state control unit and the processed data of the data-path unit when stopped, and temporarily holds them for each of a plurality of the computer programs. Upon completion of this, the array-type computer processor reads the operation state and processed data of any other computer program and sets them in the state control unit and data-path unit. Upon completion of this, the array-type computer processor outputs to the state control unit the event data for starting the operation. The state control unit then starts to sequentially transfer the operation state, thereby making it possible to perform the process operations according to a plurality of computer programs in a time-sharing manner.
    Type: Grant
    Filed: February 2, 2005
    Date of Patent: October 23, 2007
    Assignees: NEC Corporation, NEC Electronics Corporation
    Inventors: Takeshi Inuo, Nobuki Kajihara, Takao Toi, Tooru Awashima, Hirokazu Kami, Taro Fujii, Kenichiro Anjo, Kouichiro Furuta, Masato Motomura
  • Patent number: 7275145
    Abstract: According to some embodiments, a processing element includes (i) a next neighbor register to receive information directly from a previous processing element in a series of processing elements, and (ii) a previous neighbor register to receive information directly from a next processing element in the series.
    Type: Grant
    Filed: December 24, 2003
    Date of Patent: September 25, 2007
    Assignee: Intel Corporation
    Inventors: Sridhar Lakshmanamurthy, Prashant Chandra, Wilson Y. Liao, Jeen-Yuan Miin, Pun Yim, Chen-Chi Kuo, Jaroslaw J. Sydir
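    Illustrative sketch (not part of the patent record): the register arrangement in the abstract above can be modeled as a series of elements that each hold two registers written directly by their neighbors. The class name `ProcessingElement`, the helper names `send_forward` and `send_backward`, and the use of integers as register contents are all hypothetical choices for this sketch.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ProcessingElement:
    # Receives information directly from the previous element in the series.
    next_neighbor: Optional[int] = None
    # Receives information directly from the next element in the series.
    previous_neighbor: Optional[int] = None


def send_forward(pes: List[ProcessingElement], i: int, value: int) -> None:
    """Element i writes into the next-neighbor register of element i + 1."""
    if not 0 <= i < len(pes) - 1:
        raise IndexError("element has no next neighbor in the series")
    pes[i + 1].next_neighbor = value


def send_backward(pes: List[ProcessingElement], i: int, value: int) -> None:
    """Element i writes into the previous-neighbor register of element i - 1."""
    if not 0 < i < len(pes):
        raise IndexError("element has no previous neighbor in the series")
    pes[i - 1].previous_neighbor = value
```

    In a three-element series, the middle element can thus receive one value from each side without any shared memory between the elements.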
  • Patent number: 7274390
    Abstract: The invention relates to a device for parallel processing data and to a camera system comprising such a device. The camera system (1) comprises a sensor matrix (2), a data converter (3), a DSP (4), a central controller (5), a data buffer (7) and a processor matrix (11) consisting of processors (12). The sensor matrix (2) converts incident electromagnetic radiation into pixel signals. The data converter (3) converts the pixel signals into data. The arrows (6) and (8) diagrammatically indicate the transport of pixel signals and data. The data buffer (7) is physically divided into a part (7A) and a part (7B) and functionally divided into an I/O register (9) and a memory bank (10). The central controller (5) co-ordinates the different tasks. The processors (12) and the data buffer (7) have data ports (13) and further data ports (14) with inputs and outputs which are mutually connected in an electrically conducting manner using the connections (15).
    Type: Grant
    Filed: May 10, 2002
    Date of Patent: September 25, 2007
    Assignee: Koninklijke Philips Electronics, N.V.
    Inventors: Leonardus Hendricus Maria Sevat, Cornelis Niessen
  • Patent number: 7237045
    Abstract: A system including a storage processing device with an input/output module. The input/output module has port processors to receive and transmit network traffic. The input/output module also has a switch connecting the port processors. Each port processor categorizes the network traffic as fast path network traffic or control path network traffic. The switch routes fast path network traffic from an ingress port processor to a specified egress port processor. The storage processing device also includes a control module to process the control path network traffic received from the ingress port processor. The control module routes processed control path network traffic to the switch for routing to a defined egress port processor. The control module is connected to the input/output module. The input/output module and the control module are configured to interactively support data virtualization, data migration, data journaling, and snapshotting.
    Type: Grant
    Filed: October 28, 2003
    Date of Patent: June 26, 2007
    Assignee: Brocade Communications Systems, Inc.
    Inventors: Curt E. Beckmann, Edward D. McClanahan, Guruaj Pangal
  • Patent number: 7237086
    Abstract: A customization program for use in customizing a baseboard management controller used for monitoring operation of various computer system components is disclosed. A user interacts with the customization program to customize the baseboard management controller based on a configuration of components specified for the baseboard of the computer system. The customization program provides a user interface having a repository of icons and a design page. The icons represent various components that may be connected, either directly or indirectly, to the baseboard. The design page is used for constructing a model representing the specified configuration of components. As a user drags icons onto the design page, the model is updated to reflect selection of the components corresponding to these icons. Further, the customization program creates a configuration file that identifies and describes each of the selected components.
    Type: Grant
    Filed: November 26, 2003
    Date of Patent: June 26, 2007
    Assignee: American Megatrends, Inc.
    Inventors: Govind A. Kothandapani, Bakka Ravinder Reddy
  • Patent number: 7231464
    Abstract: The present invention relates to a global management system for a multimodule, multiprocessor machine (PK). The system is characterized in that it comprises an independent module (SM) dedicated to the global management of a plurality of first modules (M1 through Mn), the independent module (SM) being connected to a management tool (BUMP) for each of the first modules (M1 through Mn) by a first specific link supporting a given communication protocol that makes it possible to manage each of the first modules at the startup of the machine, during the running of the machine, and after the machine stops running, the independent module (SM) being connected to each of the first modules via a second link, and the independent module also being globally connected to the multimodule machine (PK) via a physical link of a local area network (LAN) linked to at least two of the first modules (M2 and M3).
    Type: Grant
    Filed: September 15, 2000
    Date of Patent: June 12, 2007
    Assignee: Bull, SA
    Inventors: Christian Caudrelier, Lorenzo Olivares, Tony Reix
  • Patent number: 7202873
    Abstract: An image selecting apparatus includes a designation receiving portion which receives designation of a desired specific scene, an input receiving portion which receives input of image data representing an object image, a characteristic value deriving portion which derives from the image data input into the input receiving portion a characteristic value for use in distinguishment of the specific scene referring to reference data in which the kind of a characteristic value and distinguishing condition corresponding to the characteristic value are defined in advance by the scenes which can be designated as the specific scene, and a distinguishing portion which determines whether the image data represents an image which is of the specific scene input into the designation receiving portion on the basis of the characteristic value derived by the characteristic value deriving portion referring to the corresponding distinguishing condition defined in the reference data.
    Type: Grant
    Filed: September 27, 2004
    Date of Patent: April 10, 2007
    Assignee: Fujifilm Corporation
    Inventor: Sadato Akahori
  • Patent number: 7197623
    Abstract: A protocol processor is intended to be associated with at least one main processor of a system with a view to the execution of tasks to which the main processor is not suited. The protocol processor comprises a program part (30) including an incrementation register (31), a program memory (33) connected to the incrementation register (31) in order to receive addresses thereof, a decoding part (35) intended to receive instructions from the program memory (33) of the program part (30) with a view to executing an instruction in two cycles, and a data part (36) for executing the instruction.
    Type: Grant
    Filed: June 28, 2000
    Date of Patent: March 27, 2007
    Assignee: Texas Instruments Incorporated
    Inventors: Gerard Chauvel, Francis Aussedat, Pierre Calippe
  • Patent number: 7194598
    Abstract: The present invention provides an adaptive computing engine (ACE) that includes processing nodes having different capabilities such as arithmetic nodes, bit-manipulation nodes, finite state machine nodes, input/output nodes and a programmable scalar node (PSN). In accordance with one embodiment of the present invention, a common architecture is adaptable to function either as a kernel node, or k-node, or as a general-purpose RISC node. The k-node acts as a system controller responsible for adapting other nodes to perform selected functions. As a RISC node, the PSN is configured to perform computationally intensive applications such as signal processing. The present invention further provides an interconnection scheme so that a plurality of ACE devices operate under the control of a single k-node.
    Type: Grant
    Filed: January 26, 2004
    Date of Patent: March 20, 2007
    Assignee: NVIDIA Corporation
    Inventor: Rojit Jacob
  • Patent number: 7191310
    Abstract: A parallel processor includes a global processor which interprets a program and controls the entirety of the parallel processor. A processor-element block includes a plurality of processor elements each comprising a register file and an operation array for processing a plurality of sets of data. The global processor outputs a control signal to the plurality of processor elements, and, thereby, sets processor-element numbers corresponding to the plurality of processor elements as input values of the operation arrays, respectively.
    Type: Grant
    Filed: January 16, 2001
    Date of Patent: March 13, 2007
    Assignee: Ricoh Company, Ltd.
    Inventors: Shinichi Yamaura, Kazuhiko Hara, Takao Katayama, Kazuhiko Iwanaga, Hiroshi Takafuji
  • Patent number: 7181593
    Abstract: A command engine for an active memory receives high level tasks from a host and generates corresponding sets of either DCU commands to a DRAM control unit or ACU commands to a processing array control unit. The DCU commands include memory addresses, which are also generated by the command engine, and the ACU commands include instruction memory addresses corresponding to an address in an array control unit where processing array instructions are stored.
    Type: Grant
    Filed: July 28, 2003
    Date of Patent: February 20, 2007
    Assignee: Micron Technology, Inc.
    Inventor: Graham Kirsch
  • Patent number: 7174014
    Abstract: The present invention provides permutation instructions usable in a programmable processor for solving permutation problems in cryptography, multimedia and other applications. PPERM and PPERM3R instructions are defined to perform permutations by a sequence of instructions with each sequence specifying the position in the source for each bit in the destination. In the PPERM instruction, bits in the destination register that change are updated and bits in the destination register that do not change are set to zero. In the PPERM3R instruction, bits in the destination register that change are updated and bits in the destination register that do not change are copied from the intermediate result of previous PPERM3R instructions. Both PPERM and PPERM3R instructions can individually do permutation with bit repetition. Both PPERM and PPERM3R instructions can individually do permutation of bits stored in more than one register. In an alternate embodiment, a GRP instruction is defined to perform permutations.
    Type: Grant
    Filed: May 7, 2001
    Date of Patent: February 6, 2007
    Assignee: Teleputers, LLC
    Inventors: Ruby B. Lee, Zhijie Shi
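
The difference between the two instructions described in the abstract above can be illustrated with a small software model. This is only a sketch: the abstract gives no instruction encoding, so the function names `pperm` and `pperm3r`, the parameters `positions`, `dest_offset`, and `width`, and the idea of writing one contiguous group of destination bits per step are all assumptions made for illustration.

```python
def pperm(src, positions, dest_offset, width=64):
    """Model of a PPERM-style step (hypothetical encoding).

    Destination bit (dest_offset + i) receives source bit positions[i].
    Every destination bit not written by this step is zero.
    """
    out = 0
    for i, p in enumerate(positions):
        bit = (src >> p) & 1
        out |= bit << (dest_offset + i)
    return out & ((1 << width) - 1)


def pperm3r(src, positions, dest_offset, prev, width=64):
    """Model of a PPERM3R-style step (hypothetical encoding).

    Same as pperm for the written bits, but bits that are not written
    are copied from `prev`, the intermediate result of earlier steps.
    """
    full = (1 << width) - 1
    written = ((1 << len(positions)) - 1) << dest_offset
    kept = prev & ~written & full
    return kept | pperm(src, positions, dest_offset, width)


def reverse_byte(src):
    """Reverse an 8-bit value in two steps of four bits each."""
    low = pperm(src, [7, 6, 5, 4], 0, width=8)
    return pperm3r(src, [3, 2, 1, 0], 4, low, width=8)
```

Because `positions` may name the same source bit more than once, the same model also covers permutation with bit repetition, for example `pperm(src, [0, 0, 0, 0], 0, width=8)`.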
  • Patent number: 7167976
    Abstract: The present invention describes a method and system for an interface for integrating reconfigurable processors into a general purpose computing system. In particular, the system resides in a computer system containing standard instruction processors, as well as reconfigurable processors. The interface includes a command processor, a command list memory, various registers, a direct memory access engine, a translation look-aside buffer, a dedicated section of common memory, and a dedicated memory. The interface is controlled via commands from a command list that is created during compilation of a user application, or various direct commands.
    Type: Grant
    Filed: May 31, 2005
    Date of Patent: January 23, 2007
    Assignee: SRC Computers, Inc.
    Inventor: Daniel Poznanovic
  • Patent number: 7155602
    Abstract: The present invention describes a method and system for an interface for integrating reconfigurable processors into a general purpose computing system. In particular, the system resides in a computer system containing standard instruction processors, as well as reconfigurable processors. The interface includes a command processor, a command list memory, various registers, a direct memory access engine, a translation look-aside buffer, a dedicated section of common memory, and a dedicated memory. The interface is controlled via commands from a command list that is created during compilation of a user application, or various direct commands.
    Type: Grant
    Filed: December 5, 2001
    Date of Patent: December 26, 2006
    Assignee: SRC Computers, Inc.
    Inventor: Daniel Poznanovic
  • Patent number: 7149875
    Abstract: An active memory device includes a command engine that receives high level tasks from a host and generates corresponding sets of either DCU commands to a DRAM control unit or ACU commands to a processing array control unit. The DCU commands include memory addresses, which are also generated by the command engine, and the ACU commands include instruction memory addresses corresponding to an address in an array control unit where processing array instructions are stored. The active memory device includes a vector processing and re-ordering system coupled to the array control unit and the memory device. The vector processing and re-ordering system re-orders data received from the memory device into a vector of contiguous data, processes the data in accordance with an instruction received from the array control unit to provide results data, and passes the results data to the memory device.
    Type: Grant
    Filed: July 28, 2003
    Date of Patent: December 12, 2006
    Assignee: Micron Technology, Inc.
    Inventor: Graham Kirsch
  • Patent number: 7120903
    Abstract: An object code for sequentially switching contexts of processing circuits arrayed in a matrix in a parallel operation apparatus is generated from a general source code descriptive of operation of the parallel operation apparatus. A Data Flow Graph (DFG) is generated from the source code descriptive of operation of the parallel operation apparatus according to limiting conditions, registered in advance, representing a physical structure, etc. of the parallel operation apparatus, and scheduled in a Control Data Flow Graph (CDFG). A Register Transfer Level (RTL) description is generated from the CDFG, converting a finite-state machine into an object code and converting a data path into a net list. An object code of the processing circuits is generated in each context from the net list.
    Type: Grant
    Filed: September 24, 2002
    Date of Patent: October 10, 2006
    Assignee: NEC Corporation
    Inventors: Takao Toi, Toru Awashima, Yoshiyuki Miyazawa, Noritsugu Nakamura, Taro Fujii, Koichiro Furuta, Masato Motomura
  • Patent number: 7107363
    Abstract: The present invention discloses, in one aspect, a microprocessor. In one embodiment, the microprocessor includes a processing element configured to process an application using a bandwidth. The microprocessor also includes an access shaper coupled to the processing element and configured to shape storage requests for the processing of the application. In this embodiment, the microprocessor further includes bandwidth management circuitry coupled to the access shaper and configured to track the bandwidth usage based on the requests. A method of coordinating bandwidth allocation and a processor assembly are also disclosed.
    Type: Grant
    Filed: June 19, 2003
    Date of Patent: September 12, 2006
    Assignee: International Business Machines Corporation
    Inventors: Jeffrey Douglas Brown, Michael Norman Day, Charles Ray Johns, James Allan Kahle, Takeshi Yamazaki
  • Patent number: 7107361
    Abstract: The present invention provides coupled-type computers wherein a computer can be coupled with computers of the same structure easily, and can be coupled with other computers of the same structure in high density. Computer components such as CPUs or memories are built in a holder made of polyhedron cube. A radio propagation bus space formed by a cavity is provided in the inside of the holder, and a plurality of radio-electric signal interconversion elements provided with a signal identification means facing the radio propagation bus space are disposed in the holder. These radio-electric signal interconversion elements are connected to the computer components in the holder. Holes communicating with the radio propagation bus space are bored on the surfaces of the outsides of the holders by means of the radio lines.
    Type: Grant
    Filed: December 26, 2001
    Date of Patent: September 12, 2006
    Inventor: Tsunemi Tokuhara
  • Patent number: 7092526
    Abstract: The method and system provides a set of permutation primitives for current and future 2-D multimedia programs which are based on decomposing images and objects into atomic units, then finding the permutations desired for the atomic units. The subword permutation instructions for these 2-D building blocks are also defined for larger subword sizes at successively higher hierarchical levels. The atomic unit can be a 2×2 matrix and four triangles contained within the 2×2 matrix. Each of the elements in the matrix can represent a subword of one or more bits. The permutations provide vertical, horizontal, diagonal, rotational, and other rearrangements of the elements in the atomic unit.
    Type: Grant
    Filed: May 7, 2001
    Date of Patent: August 15, 2006
    Assignee: Teleputers, LLC
    Inventor: Ruby B. Lee
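
The rearrangements of the 2×2 atomic unit named in the abstract above can be sketched in software as follows. The function names and the tuple layout `(a, b, c, d)` for the matrix rows `[a b]` and `[c d]` are assumptions made for illustration; the abstract does not specify how the instructions are encoded.

```python
# A 2x2 atomic unit is modelled as a tuple (a, b, c, d) standing for
#   [a b]
#   [c d]
# Each element can be a subword of any size, including a nested unit.

def swap_rows(u):
    """Vertical rearrangement: exchange the top and bottom rows."""
    a, b, c, d = u
    return (c, d, a, b)


def swap_columns(u):
    """Horizontal rearrangement: exchange the left and right columns."""
    a, b, c, d = u
    return (b, a, d, c)


def transpose(u):
    """Diagonal rearrangement: reflect across the main diagonal."""
    a, b, c, d = u
    return (a, c, b, d)


def rotate_cw(u):
    """Rotational rearrangement: turn the unit 90 degrees clockwise."""
    a, b, c, d = u
    return (c, a, d, b)
```

Since the elements are opaque, the same functions can be applied one level up by passing four 2×2 units as the elements of a larger unit, which is one way to read the hierarchical levels the abstract mentions.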
  • Patent number: 7085935
    Abstract: A chipset is initialized in a secure environment for an isolated execution mode by an initialization storage. The secure environment has a plurality of executive entities and is associated with an isolated memory area accessible by at least one processor. The at least one processor has a plurality of threads and operates in one of a normal execution mode and the isolated execution mode. The executive entities include a processor executive (PE) handler. PE handler data corresponding to the PE handler are stored in a PE handler storage. The PE handler data include a PE handler image to be loaded into the isolated memory area after the chipset is initialized. The loaded PE handler image corresponds to the PE handler.
    Type: Grant
    Filed: September 22, 2000
    Date of Patent: August 1, 2006
    Assignee: Intel Corporation
    Inventors: Carl M. Ellison, Roger A. Golliver, Howard C. Herbert, Derrick C. Lin, Francis X. McKeen, Gilbert Neiger, Ken Reneris, James A. Sutton, Shreekant S. Thakkar, Millind Mittal
  • Patent number: 7082610
    Abstract: A method and apparatus for exception handling in a multi-processor environment are described. In an embodiment, a method for handling a number of exceptions within a processor in a multi-processing system includes receiving an exception within the processor, wherein each processor in the multi-processor system shares a same memory. The method also includes executing a number of instructions at an address within a common interrupt handling vector address space of the same memory. The number of instructions cause the processor to determine an identification of the processor based on a query that is internal to the processor. Additionally, the method includes modifying execution flow of the exception to execute an interrupt handler located within one of a number of different interrupt handling vector address spaces.
    Type: Grant
    Filed: June 2, 2001
    Date of Patent: July 25, 2006
    Assignee: Redback Networks, Inc.
    Inventor: Sanjay Lal
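
The dispatch flow in the abstract above (a shared entry point, a processor-internal ID query, then a jump into a per-processor handler space) can be modelled roughly as follows. Everything named here is hypothetical: the `Processor` class, the `read_id` method standing in for the internal query, and the dictionary standing in for the separate interrupt handling vector address spaces.

```python
class Processor:
    """Toy processor that only knows its own identification."""

    def __init__(self, cpu_id):
        self._cpu_id = cpu_id

    def read_id(self):
        # Stands in for a query internal to the processor,
        # such as reading an ID register.
        return self._cpu_id


def make_handler(cpu_id, exception):
    """Build a handler that reports which processor ran it."""
    return lambda: f"cpu{cpu_id}: handled {exception}"


# One handler table per processor, modelling the different
# interrupt handling vector address spaces.
VECTOR_SPACES = {
    cpu_id: {exc: make_handler(cpu_id, exc) for exc in ("timer", "fault")}
    for cpu_id in (0, 1)
}


def common_entry(processor, exception, vector_spaces=VECTOR_SPACES):
    """Code at the shared vector address: identify, then redirect."""
    cpu_id = processor.read_id()
    handler = vector_spaces[cpu_id][exception]
    return handler()
```

For example, `common_entry(Processor(1), "timer")` runs the handler registered for processor 1, even though both processors enter through the same function.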
  • Patent number: 7069372
    Abstract: A processor for use in a router, the processor having a systolic array pipeline for processing data packets to determine to which output port of the router the data packet should be routed. In one embodiment, the systolic array pipeline includes a plurality of programmable functional units and register files arranged sequentially as stages, for processing packet contexts (which contain the packet's destination address) to perform operations, under programmatic control, to determine the destination port of the router for the packet. A single stage of the systolic array may contain a register file and one or more functional units such as adders, shifters, logical units, etc., for performing, in one example, very long instruction word (VLIW) operations. The processor may also include a forwarding table memory, on-chip, for storing routing information, and a cross bar selectively connecting the stages of the systolic array with the forwarding table memory.
    Type: Grant
    Filed: June 20, 2002
    Date of Patent: June 27, 2006
    Assignee: CISCO Technology, Inc.
    Inventors: Arthur Leung, Jr., Anthony J. Li, William L. Lynch, Sharad Mehrotra
  • Patent number: 7053895
    Abstract: An image processing apparatus which processes input image data of Y lines, each consisting of X pixels, using an SIMD processor, comprises a calculation unit including N (X>N>1, Y>N>1) elemental processors capable of parallel-operating; an input unit for dividing and inputting the image data of one line with respect to every N pixels; a storage for storing the input N-pixel image data of the N lines; and an image processor for supplying, from among the stored N-pixel image data of the N lines, the N image data respectively to the N elemental processors, and causing the respective elemental processors to perform the same-kind calculations in parallel. Thus, the image processing apparatus for performing an image process such as error diffusion by using the SIMD processor without using any auxiliary processor for a sequential process can be provided.
    Type: Grant
    Filed: December 31, 2002
    Date of Patent: May 30, 2006
    Assignee: Canon Kabushiki Kaisha
    Inventors: Shigeo Yamagata, Hiroshi Tanioka, Manabu Takebayashi
  • Patent number: 7035991
    Abstract: A surface computer includes an address generator for generating an address for adjusting surface region data concerning at least a storage region and a concurrent computer, provided at a subsequent stage of the address generator, having a plurality of unit computers.
    Type: Grant
    Filed: October 2, 2003
    Date of Patent: April 25, 2006
    Assignee: Sony Computer Entertainment Inc.
    Inventor: Akio Ohba
  • Patent number: 7031994
    Abstract: Improved transposition of a matrix in a computer system may be accomplished while utilizing at most a single permutation vector. This greatly improves the speed and parallelability of the transpose operation. For a standard rectangular matrix having M rows and N columns and a size M×N, first n and q are determined, wherein N=n*q, and wherein M×q represents a block size and wherein N is evenly divisible by p. Then, the matrix is partitioned into n columns of size M×q. Then for each column n, elements are sequentially read within the column row-wise and sequentially written into a cache, then sequentially read from the cache and sequentially written row-wise back into the matrix in a memory in a column of size q×M. A permutation vector may then be applied to the matrix to arrive at the transpose. This method may be modified for special cases, such as square matrices, to further improve efficiency.
    Type: Grant
    Filed: August 13, 2002
    Date of Patent: April 18, 2006
    Assignee: Sun Microsystems, Inc.
    Inventors: Shandong Lao, Bradley Romain Lewis, Michael Lee Boucher
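
The column-panel partitioning in the abstract above can be illustrated with a simplified sketch. It is not the patented method: it writes into a separate output list instead of working in place, so it needs no permutation vector. The function name `transpose_by_panels` and the flat row-major list layout are assumptions made for illustration.

```python
def transpose_by_panels(a, M, N, q):
    """Transpose a flat row-major M x N matrix one column panel at a time.

    The matrix is split into N // q panels of size M x q. Each panel is
    read row-wise into a small buffer (standing in for the cache) and
    then written out as a q x M block of the N x M result.
    """
    if N % q != 0:
        raise ValueError("N must be evenly divisible by q")
    out = [None] * (M * N)
    for j in range(N // q):
        # Read panel j row-wise: M rows of q contiguous elements each.
        buf = [a[r * N + j * q + c] for r in range(M) for c in range(q)]
        # Write the buffer as rows j*q .. j*q+q-1 of the transpose.
        for c in range(q):
            for r in range(M):
                out[(j * q + c) * M + r] = buf[r * q + c]
    return out
```

With `M=2`, `N=4`, `q=2`, the input `[0, 1, 2, 3, 4, 5, 6, 7]` is processed as two 2×2 panels. The buffer keeps each panel's reads and writes within a small working set, which is the locality benefit the panel size M×q is meant to provide.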
  • Patent number: 7028145
    Abstract: Protocol processor intended to be associated with at least one main processor of a system with a view to the execution of tasks to which the main processor is not suited. The protocol processor comprises a program part (30) including an incrementation register (31), a program memory (33) connected to the incrementation register (31) in order to receive addresses thereof, a decoding part (35) intended to receive instructions from the program memory (33) of the program part (30) with a view to executing an instruction in two cycles, and a data part (36) for executing the instruction.
    Type: Grant
    Filed: July 10, 1997
    Date of Patent: April 11, 2006
    Inventors: Gerard Chauvel, Francis Aussedat, Pierre Calippe
  • Patent number: 7017158
    Abstract: The multi-processor system comprises a plurality of cell processors for performing data processing and a BCMC for broadcasting broadcast data, including data used in data processing, to the plurality of cell processors. Each of the plurality of cell processors sorts out only the data necessary for its own data processing from the broadcast data broadcast by the BCMC so as to perform data processing. The BCMC obtains results of data processing of all cell processors so that they can be supplied to all cell processors as broadcast data, thus making it possible to transmit and receive the results of data processing between the cell processors and perform high-speed data processing as an entire system.
    Type: Grant
    Filed: September 26, 2001
    Date of Patent: March 21, 2006
    Assignee: Sony Computer Entertainment Inc.
    Inventor: Nobuo Sasaki
  • Patent number: 7010667
    Abstract: An internal bus system for DFPs and units with two- or multi-dimensional programmable cell architectures, for managing large volumes of data with a high interconnection complexity. The bus system can transmit data between a plurality of function blocks, where multiple data packets can be on the bus at the same time. The bus system automatically recognizes the correct connection for various types of data or data transmitters and sets it up.
    Type: Grant
    Filed: April 5, 2002
    Date of Patent: March 7, 2006
    Assignee: PACT XPP Technologies AG
    Inventors: Martin Vorbach, Robert Münch