Array Processor Element Interconnection Patents (Class 712/11)
-
Patent number: 7263602
Abstract: A method of associating virtual stripes to physical stripes in a pipelined or ring structure comprises associating a first set of virtual stripes with at least two physical stripes and associating a second set of virtual stripes, disjoint from the first set, with at least two additional physical stripes. The present invention is also directed to a method of configuring a plurality of processing elements based on a less than global, but not purely local, association. The configuration method of the present invention may be implemented in a device arranged in stripes of processing elements. The method comprises configuring either of at least two physical stripes with a virtual stripe from a first set of virtual stripes and configuring either of at least two additional physical stripes with a virtual stripe from a second set of virtual stripes, said first and second virtual sets being disjoint.
Type: Grant
Filed: August 16, 2002
Date of Patent: August 28, 2007
Assignee: Carnegie Mellon University
Inventor: Herman Schmit
-
Patent number: 7243175
Abstract: A general bus system is provided which combines a number of internal lines and leads them as a bundle to the terminals. The bus system control is predefined and does not require any influence by the programmer. Any number of memories, peripherals or other units can be connected to the bus system (for cascading).
Type: Grant
Filed: March 2, 2004
Date of Patent: July 10, 2007
Assignee: Pact XPP Technologies AG
Inventors: Martin Vorbach, Robert Münch
-
Patent number: 7203816
Abstract: A multi-processor system apparatus allows a compiler to perform a static scheduling action easily and can conduct the transfer of data packets without collision in response to a common pattern of simultaneous access demands. Processor elements are interconnected by a multi-stage interconnection network having multiple stages. As each of the switching elements in the multi-stage interconnection network is preliminarily subjected to the static scheduling action of a compiler, the multi-stage interconnection network is emulated without producing collision of data. When the transfer of packets is carried out in one Clos network arrangement of the multi-stage interconnection network, the scheduling of switching elements SE0 to SE3 in the exchanger at Level 1 is determined so that a packet lost in the arbitration is transferred through the free port of any applicable one of the switching elements.
Type: Grant
Filed: March 1, 2002
Date of Patent: April 10, 2007
Assignee: Semiconductor Technology Academic Research Center
Inventors: Tomohiro Morimura, Hideharu Amano
-
Patent number: 7197624
Abstract: An array processor includes processing elements (00, 01, 02, 03, 10, 11, 12, 13, 20, 21, 22, 23, 30, 31, 32, 33) arranged in clusters (e.g., 44, 46, 48, 50) to form a rectangular array (40). Inter-cluster communication paths (88) are mutually exclusive. Due to the mutual exclusivity of the data paths, communications between the processing elements of each cluster may be combined in a single inter-cluster path, thus eliminating half the wiring required for the path. The length of the longest communication path is not directly determined by the overall dimension of the array, as in conventional torus arrays. Rather, the longest communications path is limited by the inter-cluster spacing. Transpose elements of an N×N torus may be combined in clusters and communicate with one another through intra-cluster communications paths. Transpose operation latency is eliminated in this approach. Each PE may have a single transmit port (35) and a single receive port (37).
Type: Grant
Filed: February 9, 2004
Date of Patent: March 27, 2007
Assignee: Altera Corporation
Inventors: Gerald George Pechanek, Charles W. Kurak, Jr.
-
Patent number: 7185174
Abstract: A switching element for switchably coupling a two-dimensional array of circuit elements comprises an input, an output, means for switchably coupling the input to the output; a first input/output port, a second input/output port, a third input/output port, and a fourth input/output port, each input/output port being switchably coupled to the input, the output, and each other, wherein the first and third input/output ports are spaced apart along a first axis, and the second and fourth input/output ports are spaced apart along a second axis, wherein the second axis traverses the first axis between the first and third input/output ports.
Type: Grant
Filed: March 4, 2002
Date of Patent: February 27, 2007
Assignee: Mtekvision Co., Ltd.
Inventors: Malcolm Stewart, Eric Giernalczyk, Richard Beriault
-
Patent number: 7185175
Abstract: Processing units (PUs) are coupled with a gated bi-directional bus structure that allows the PUs to be cascaded. Each PUn has communication logic and function logic. Each PUn is physically coupled to two other PUs, a PUp and a PUf. The communication logic receives Link Out data from a PUp and sends Link In data to a PUf. The communication logic has register bits for enabling and disabling the data transmission. The communication logic couples the Link Out data from a PUp to the function logic and couples Link In data to the PUp from the function logic in response to the register bits. The function logic receives output data from the PUn and Link In data from the communication logic and forms Link Out data which is coupled to the PUf. The function logic couples Link In data from the PUf to the PUn and to the communication logic.
Type: Grant
Filed: January 14, 2004
Date of Patent: February 27, 2007
Assignee: International Business Machines Corporation
Inventors: Kerry A. Kravec, Ali G. Saidi, Jan M. Slyfield, Pascal R. Tannhof
-
Patent number: 7181594
Abstract: A method of parallel hardware-based multithreaded processing is described. The method includes assigning tasks for packet processing to programming engines and establishing pipelines between programming stages, which correspond to the programming engines. The method also includes establishing contexts for the assigned tasks on the programming engines and using a software controlled cache such as a CAM to transfer data between next neighbor registers residing in the programming engines.
Type: Grant
Filed: January 25, 2002
Date of Patent: February 20, 2007
Assignee: Intel Corporation
Inventors: Hugh M. Wilkinson, III, Mark B. Rosenbluth, Matthew J. Adiletta, Debra Bernstein, Gilbert Wolrich
-
Patent number: 7176914
Abstract: A system and method are provided for directing the flow of data and instructions into at least one functional unit. In one embodiment of a system of components defining a plurality of nodes, a queue network manager (QNM) forming a part of each node, is provided. In this embodiment, the QNM comprises an interface to a network that supports intercommunication among the plurality of nodes, an interface configured to pass messages with a functional unit within the node, a random access memory (RAM) configured to store at least one of a message and a programmable instruction, and logic configured to control an operational aspect of a functional unit based on contents of the programmable instruction.
Type: Grant
Filed: May 16, 2002
Date of Patent: February 13, 2007
Assignee: Hewlett-Packard Development Company, L.P.
Inventor: Darel N. Emmot
-
Patent number: 7171499
Abstract: A processor surrogate (320/520) is adapted for use in a processing node (S1) of a multiprocessor data processing system (300/500) having a plurality of processing nodes (P0, S1) coupled together and to a plurality of input/output devices (330, 340, 350/530, 540, 550, 560) using corresponding communication links. The processor surrogate (320/520) includes a first port (372, 374/620, 622) comprising a first set of integrated circuit terminals adapted to be coupled to a first external communication link (370/590) for coupling to one (P0) of the plurality of processing nodes (310, 320/510, 520), a second port (382, 384/630, 632) comprising a second set of integrated circuit terminals adapted to be coupled to a second external communication link (380/592) for coupling to one (350/550) of the plurality of input/output devices (330, 340, 350/530, 540, 550, 560), and an interconnection circuit (390, 392/608, 612, 614) coupled between the first port (372, 374/620, 622) and the second port (382, 384/630, 632).
Type: Grant
Filed: October 10, 2003
Date of Patent: January 30, 2007
Assignee: Advanced Micro Devices, Inc.
Inventors: Brent Kelley, William C. Brantley
-
Patent number: 7098437
Abstract: A semiconductor integrated circuit device, having a plurality of processing elements accommodated on a single semiconductor chip, has a latch circuit and a selecting circuit. The latch circuit is provided at an output of each of the processing elements. The selecting circuit selects an input source from a group consisting of upper, lower, left, and right processing elements and a zero signal.
Type: Grant
Filed: July 23, 2002
Date of Patent: August 29, 2006
Assignee: Semiconductor Technology Academic Research Center
Inventors: Masatoshi Ishikawa, Idaku Ishii, Takashi Komuro, Shingo Kagami
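The selecting circuit in the abstract of 7098437 can be pictured with a small Python sketch. This is only an illustration under stated assumptions: the function name `select_input`, the source labels, and the choice to read an off-grid neighbor as zero are hypothetical and are not taken from the patent.

```python
# Offsets for the four neighbor sources named in the abstract.
OFFSETS = {"upper": (-1, 0), "lower": (1, 0), "left": (0, -1), "right": (0, 1)}


def select_input(latched, row, col, source):
    """Return the value a PE at (row, col) would read from the chosen source.

    `latched` holds the latched output of every PE as a 2-D list.
    `source` is one of "upper", "lower", "left", "right", or "zero".
    """
    if source == "zero":
        return 0
    d_row, d_col = OFFSETS[source]
    r, c = row + d_row, col + d_col
    if 0 <= r < len(latched) and 0 <= c < len(latched[0]):
        return latched[r][c]
    # Assumption: a neighbor outside the array reads as zero.
    return 0


latched_outputs = [[1, 2],
                   [3, 4]]
right_of_origin = select_input(latched_outputs, 0, 0, "right")
below_origin = select_input(latched_outputs, 0, 0, "lower")
```

Reading from latched outputs rather than live values means every PE sees its neighbors' results from the previous step, which is the usual reason to place a latch at each output.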
-
Patent number: 7069372
Abstract: A processor for use in a router, the processor having a systolic array pipeline for processing data packets to determine to which output port of the router the data packet should be routed. In one embodiment, the systolic array pipeline includes a plurality of programmable functional units and register files arranged sequentially as stages, for processing packet contexts (which contain the packet's destination address) to perform operations, under programmatic control, to determine the destination port of the router for the packet. A single stage of the systolic array may contain a register file and one or more functional units such as adders, shifters, logical units, etc., for performing, in one example, very long instruction word (VLIW) operations. The processor may also include a forwarding table memory, on-chip, for storing routing information, and a cross bar selectively connecting the stages of the systolic array with the forwarding table memory.
Type: Grant
Filed: June 20, 2002
Date of Patent: June 27, 2006
Assignee: CISCO Technology, Inc.
Inventors: Arthur Leung, Jr., Anthony J. Li, William L. Lynch, Sharad Mehrotra
-
Patent number: 7065672
Abstract: Apparatus and methods for fault-tolerant computing using an asynchronous switching fabric where at least one of a plurality of redundant data processing elements executing substantially identical instructions communicates transactions to at least one target device, such as input/output device, or another data processing element. The transactions are communicated through the asynchronous switching fabric wherein each of the data processing elements and the target device are connected to the asynchronous switching fabric through a respective channel adapter.
Type: Grant
Filed: March 28, 2001
Date of Patent: June 20, 2006
Assignee: Stratus Technologies Bermuda Ltd.
Inventors: Finbarr Denis Long, Joseph Ardini, Dana A. Kirkpatrick, Michael James O'Keeffe
-
Patent number: 7051185
Abstract: A computer system comprising a first block which includes multiple processing subsystems, a second block which includes multiple processing subsystems, a third block which includes multiple processing subsystems, a fourth block which includes multiple processing subsystems, a first communication and processing subsystem that interconnects subsystems of the first and second blocks, a second communication and processing subsystem that interconnects subsystems of the third and fourth blocks, a third communication and processing subsystem that interconnects subsystems of the first and fourth blocks; and a fourth communication and processing subsystem that interconnects subsystems of the second and third blocks, wherein respective subsystems include respective processing elements and a respective communication and processing unit interconnecting the respective processing elements.
Type: Grant
Filed: September 12, 2003
Date of Patent: May 23, 2006
Assignee: Star Bridge Systems, Inc.
Inventor: Kent L. Gilson
-
Patent number: 7043562
Abstract: Irregularities are provided in at least one dimension of a torus or mesh network for lower average path length and lower maximum channel load while increasing tolerance for omitted end-around connections. In preferred embodiments, all nodes supported on each backplane are connected in a single cycle which includes nodes on opposite sides of lower dimension tori. The cycles in adjacent backplanes hop different numbers of nodes.
Type: Grant
Filed: June 9, 2003
Date of Patent: May 9, 2006
Assignee: Avivi Systems, Inc.
Inventors: William J. Dally, William F. Mann, Philip P. Carvey
-
Patent number: 7035991
Abstract: A surface computer includes an address generator for generating an address for adjusting surface region data concerning at least a storage region and a concurrent computer, provided at a subsequent stage of the address generator, having a plurality of unit computers.
Type: Grant
Filed: October 2, 2003
Date of Patent: April 25, 2006
Assignee: Sony Computer Entertainment Inc.
Inventor: Akio Ohba
-
Patent number: 7017074
Abstract: A semiconductor device, such as a multiprocessor chip for a computer system, includes a total number of on-board components which is greater than the number of that component required by the system. The chip may be provided with multiple I/O controllers, e.g. more than one controller per I/O interface, and the I/O controllers can act as backups to one another, with failover logic controlling the backup process. In addition, the number of processors formed on the chip may be greater than the number required by the system, allowing multiple levels of redundancy and greater successful manufacturing yields.
Type: Grant
Filed: March 12, 2002
Date of Patent: March 21, 2006
Assignee: Sun Microsystems, Inc.
Inventor: Kenneth Okin
-
Patent number: 7009614
Abstract: A system is described that is broadly directed to a system of integrated circuit components. The system comprises a plurality of nodes that are interconnected by communication links. A random access memory (RAM) is connected to each node. At least one functional unit is integrated into each node, and each functional unit is configured to carry out a predetermined processing function. Finally, each RAM includes a coherency mechanism configured to permit only read access to the RAM by other nodes, the coherency mechanism further configured to permit write access to the RAM only by functional units that are local to the node.
Type: Grant
Filed: May 3, 2005
Date of Patent: March 7, 2006
Assignee: Hewlett-Packard Development Company, L.P.
Inventors: Darel N Emmot, Byron A Alcorn
-
Patent number: 7000022
Abstract: Frame-based streaming data flows through a graph of multiple interconnected processing modules. The modules have a set of performance parameters whose values specify the sensitivity of each module to the selection of certain resources of a system. A user specifies overall goals for an actual graph for processing a given type of data for a particular purpose. A flow manager constructs the graph as a sequence of module interconnections required for processing the data, in response to the parameter values of the individual modules in the graph in view of the goals for the overall graph as a whole, and divides it into pipes each having one or more modules and each assigned to a memory manager for handling data frames in the pipe.
Type: Grant
Filed: June 7, 2004
Date of Patent: February 14, 2006
Assignee: Microsoft Corporation
Inventors: Rafael S. Lisitsa, George H. J. Shaw, Dale A. Sather, Bryan A. Woodruff
-
Patent number: 6996504
Abstract: A scalable computer architecture capable of performing fully scalable simulations includes a plurality of processing elements (PEs) and a plurality of interconnections between the PEs. In this regard, the interconnections can interconnect each processing element to each neighboring processing element located adjacent the respective processing element, and further interconnect at least one processing element to at least one other processing element located remote from the respective at least one processing element. For example, the interconnections can interconnect the plurality of processing elements according to a fractal-type method or a quenched random method. Further, the plurality of interconnections can include at least one interconnection at each length scale of the plurality of processing elements.
Type: Grant
Filed: November 14, 2001
Date of Patent: February 7, 2006
Assignee: Mississippi State University
Inventors: Mark A. Novotny, Gyorgy Korniss
-
Patent number: 6993764
Abstract: A computer implemented method schedules processor jobs on a network of parallel machine processors or distributed system processors. Control information communications generated by each process performed by each processor during a defined time interval are accumulated in buffers, where adjacent time intervals are separated by strobe intervals for a global exchange of control information. A global exchange of the control information communications at the end of each defined time interval is performed during an intervening strobe interval so that each processor is informed by all of the other processors of the number of incoming jobs to be received by each processor in a subsequent time interval.
Type: Grant
Filed: June 28, 2001
Date of Patent: January 31, 2006
Assignee: The Regents of the University of California
Inventors: Fabrizio Petrini, Wu-chun Feng
-
Patent number: 6990566
Abstract: A method and an apparatus for configuration of multiple context processing elements (MCPEs) are described. The method and apparatus are capable of selectively transmitting data over a bidirectional shared bus network including a plurality of channels between pairs of MCPEs in the networked array. The method and apparatus then selectively transmit a sideband bit indicating a direction in which the data is transmitted in the shared bus network.
Type: Grant
Filed: April 20, 2004
Date of Patent: January 24, 2006
Assignee: Broadcom Corporation
Inventors: Ethan Mirsky, Robert French, Ian Eslick
-
Patent number: 6968442
Abstract: A parallel computer of this invention includes a plurality of memory elements and a plurality of processing elements and each of the processing elements is connected to logically adjacent memory elements. For example, the processing element which corresponds to a logical position (i, j) is connected to the memory elements which correspond to a plurality of logical positions (i, j), (i, j+1), (i+1, j) and (i+1, j+1). It is preferable if each of the memory elements can be accessed from the exterior. According to this invention, efficient memory access can be made and the parallel processing can be performed at high speed without increasing the hardware amount and making the control operation complicated. Further, the operation speed of the image processing can be enhanced by constructing an image memory by use of a plurality of memory elements and causing the processing element to effect the image processing in a distributed and cooperative manner.
Type: Grant
Filed: June 19, 2002
Date of Patent: November 22, 2005
Assignee: Kabushiki Kaisha Toshiba
Inventors: Kenichi Maeda, Nobuyuki Takeda, Yasukazu Okamoto
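The connection pattern given as an example in the abstract of 6968442 can be written down directly. The Python sketch below is illustrative only: the function names are hypothetical, and it assumes a plain grid with no wraparound, so an R by C array of processing elements touches (R+1) by (C+1) memory elements.

```python
def connected_memories(i, j):
    """Memory positions reachable from the PE at logical position (i, j)."""
    return [(i, j), (i, j + 1), (i + 1, j), (i + 1, j + 1)]


def sharing_pes(mi, mj, rows, cols):
    """PEs (within a rows x cols PE grid) that are connected to memory (mi, mj).

    Inverts connected_memories: memory (mi, mj) is seen by the PEs whose
    position is offset by 0 or -1 in each coordinate.
    """
    candidates = [(mi, mj), (mi, mj - 1), (mi - 1, mj), (mi - 1, mj - 1)]
    return [(i, j) for (i, j) in candidates if 0 <= i < rows and 0 <= j < cols]


# Every memory element touched by a 2 x 2 grid of PEs.
touched = {m for i in range(2) for j in range(2) for m in connected_memories(i, j)}
```

In this model an interior memory element is shared by four processing elements while a corner memory element is reached by only one, which is what lets adjacent PEs exchange data through memory they both see.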
-
Patent number: 6961782
Abstract: There is provided a method for routing packets on a linear array of N processors connected in a nearest neighbor configuration. The method includes the step of, for each end processor of the array, connecting unused outputs to corresponding unused inputs. For each axis required to directly route a packet from a source to a destination processor, the following steps are performed. It is determined whether a result of directly sending a packet from an initial processor to a target processor is less than or greater than N/2 moves, respectively. The initial processor is the source processor in the first axis, and the target processor is the destination processor in the last axis. The packet is directly sent from the initial processor to the target processor, when the result is less than N/2 moves. The packet is indirectly sent so as to wrap around each end processor, when the result is greater than N/2 moves.
Type: Grant
Filed: March 14, 2000
Date of Patent: November 1, 2005
Assignee: International Business Machines Corporation
Inventors: Monty M. Denneau, Peter H. Hochschild, Richard A. Swetz, Henry S. Warren, Jr.
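The per-axis decision in the abstract of 6961782 amounts to comparing the direct move count with N/2. The Python sketch below shows only that comparison. The name `choose_path` is hypothetical, and treating a distance of exactly N/2 as a direct send is an assumption, since the abstract covers only the less-than and greater-than cases.

```python
def choose_path(src, dst, n):
    """Decide how to send a packet along one axis of an n-processor array.

    Returns "direct" when the direct route takes fewer than n/2 moves and
    "wrap" when it takes more, in which case the packet is sent the other
    way so that it wraps around the end processors.
    """
    if not (0 <= src < n and 0 <= dst < n):
        raise ValueError("processor index out of range")
    direct_moves = abs(dst - src)
    if direct_moves > n / 2:
        return "wrap"
    # Assumption: a distance of exactly n/2 is sent directly.
    return "direct"


short_hop = choose_path(0, 1, 8)
long_hop = choose_path(0, 7, 8)
```

For a multi-axis route the same test would be applied once per axis, with the source as the initial processor on the first axis and the destination as the target on the last.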
-
Patent number: 6957318
Abstract: A method for controlling a processor array by a host computer involves creating a graph of a plurality of nodes using a data connection component, configuring a broadcast tree from a spanning tree of the graph, propagating a first command from the host computer to a member of the processor array using the broadcast tree, configuring a reply tree from a spanning tree of the graph, transmitting a response from the member of the processor array to the host computer using the reply tree, and configuring the data connection component to send at least one message selected from the first command and the response on at least one run mode communication path.
Type: Grant
Filed: March 28, 2002
Date of Patent: October 18, 2005
Assignee: Sun Microsystems, Inc.
Inventors: David R. Emberson, Jeffrey M. Broughton, James B. Burr, Derek E. Pappas
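One way to picture the broadcast and reply trees in the abstract of 6957318 is to derive both from a single spanning tree rooted at the host. The Python sketch below uses breadth-first search to build that tree. This is a choice made here for illustration, since the abstract does not say how the spanning tree is obtained, and all function and node names are hypothetical.

```python
from collections import deque


def spanning_tree(graph, root):
    """Breadth-first spanning tree of `graph`, returned as a child -> parent map."""
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    return parent


def broadcast_tree(parent):
    """Parent -> children map: a command fans out from the root along these edges."""
    children = {node: [] for node in parent}
    for node, up in parent.items():
        if up is not None:
            children[up].append(node)
    return children


def reply_path(parent, node):
    """Nodes a response passes through on its way from `node` back to the root."""
    path = [node]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path


graph = {
    "host": ["a", "b"],
    "a": ["host", "c"],
    "b": ["host", "c"],
    "c": ["a", "b"],
}
parent = spanning_tree(graph, "host")
fanout = broadcast_tree(parent)
route_back = reply_path(parent, "c")
```

Using the same tree in both directions keeps the sketch short. The abstract speaks of "a spanning tree" for each, so the two trees need not be identical.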
-
Patent number: 6944747
Abstract: A matrix data processor is implemented wherein data elements are stored in physical registers and mapped to logical registers. After being stored in the logical registers, the data elements are then treated as matrix elements. By using a series of variable matrix parameters to define the size and location of the various matrix source and destination elements, as well as the operation(s) to be performed on the matrices, the performance of digital signal processing operations can be significantly enhanced.
Type: Grant
Filed: December 9, 2002
Date of Patent: September 13, 2005
Assignee: GemTech Systems, LLC
Inventors: Gopalan N Nair, Gouri G. Nair
-
Patent number: 6940496
Abstract: A display module driving system wherein digital pixel data for an image to be displayed is provided to a plurality of column drivers on a row by row basis in serial format over a plurality of dedicated bus lines rather than a single parallel bus line. Digital pixel data for a complete image row is divided into segments, wherein the number of segments is equal to the number of column drivers. Each segment is then serialized and transmitted to a corresponding column driver such that the digital pixel data for an entire row is transferred to each of the plurality of column drivers at the same time. The column drivers receive the segments and rearrange the data into parallel. The pixels are then transferred to a digital to analog converter, preferably two pixels at a time, where each pixel is converted into analog red, green and blue signals.
Type: Grant
Filed: June 4, 1999
Date of Patent: September 6, 2005
Assignee: Silicon Image, Inc.
Inventor: Eun-Gu Kim
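The row segmentation in the abstract of 6940496 can be sketched in a few lines of Python. This is a simplified illustration, not the patented circuit: the function name `split_row` is hypothetical, pixels are modeled as plain integers, and the row length is assumed to divide evenly among the column drivers.

```python
def split_row(row, num_drivers):
    """Divide one image row into one contiguous segment per column driver."""
    if num_drivers <= 0:
        raise ValueError("need at least one column driver")
    if len(row) % num_drivers != 0:
        # Assumption: the row divides evenly among the drivers.
        raise ValueError("row length must be a multiple of num_drivers")
    size = len(row) // num_drivers
    return [row[k * size:(k + 1) * size] for k in range(num_drivers)]


row = list(range(8))           # eight pixels in one image row
segments = split_row(row, 4)   # four column drivers, two pixels each
```

Each segment would then travel over its own dedicated serial line, so all drivers receive their share of the row during the same interval instead of taking turns on one parallel bus.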
-
Patent number: 6933942
Abstract: In a display apparatus, a display instruction generating unit outputs a display instruction. A plurality of display processing units are arranged in parallel, and each of the plurality of display processing units generates display data in response to the display instruction from the display instruction generating unit. A display switching unit selects one of the plurality of display processing units and outputs the display data from the selected display processing unit to the display unit. Thus, a display unit displays the display data.
Type: Grant
Filed: July 16, 2002
Date of Patent: August 23, 2005
Assignee: NEC Corporation
Inventor: Junichi Tamai
-
Patent number: 6928535
Abstract: An image input section and a signal processing section are provided. The image input section includes a pixel array in which a plurality of pixels, each having a CMOS type photoelectric converting element for converting incident light to an electric signal, are arranged in a matrix, and a data read-out circuit having the same number of A/D converters as the number of pixels arranged in one row of the pixel array, serving to convert the analog signal produced by the pixels into a digital signal and to output the digital signal. The signal processing section includes a plurality of processors. Each of the processors includes a plurality of processing elements (PEs) provided in one-to-one correspondence with the A/D converters of the data read-out circuit. Moreover, the PEs provided in each processor have the same data processing function. Furthermore, the PEs in a processor carry out signal processing in parallel in response to an instruction.
Type: Grant
Filed: July 16, 2002
Date of Patent: August 9, 2005
Assignee: Kabushiki Kaisha Toshiba
Inventors: Hirofumi Yamashita, Charles G. Sodini
-
Patent number: 6920545
Abstract: A reconfigurable processor architecture. A reconfigurable processor is an array of a multiplicity of various functional elements, between which the interconnections may be programmably configured. The inventive processor is implemented on a single substrate as a network of clusters of elements. Each cluster includes a crossbar switching node to which a plurality of elements is connected via ports. Additional ports on the crossbar switching node connect to the switching nodes of nearest neighbor clusters. The crossbar switching nodes allow pathways to be programmably set between any of the ports, and any pathway may be set to be either registered or unregistered. The use of clusters of processing elements allows complete freedom of local connectivity for effective configuration of many different processing functions. Wide area interconnection is more restricted, but, since it is less used, does not significantly restrict configurability.
Type: Grant
Filed: January 17, 2002
Date of Patent: July 19, 2005
Assignee: Raytheon Company
Inventors: William D. Farwell, Kenneth E. Prager
-
Patent number: 6912608
Abstract: Techniques for a pipelined bus which provides a very high performance interface to computing elements, such as processing elements, host interfaces, memory controllers, and other application-specific coprocessors and external interface units. The pipelined bus is a robust interconnected bus employing a scalable, pipelined, multi-client topology, with a fully synchronous, packet-switched, split-transaction data transfer model. Multiple non-interfering transfers may occur concurrently since there is no single point of contention on the bus. An aggressive packet transfer model with local conflict resolution in each client and packet-level retries allows recovery from collisions and buffer backups. Clients are assigned unique IDs, based upon a mapping from the system address space allowing identification needed for quick routing of packets among clients.
Type: Grant
Filed: April 25, 2002
Date of Patent: June 28, 2005
Assignee: PTS Corporation
Inventors: Edward A. Wolff, David Baker, Bryan Garnett Cope, Edwin Franklin Barry
-
Patent number: 6912626
Abstract: A method and apparatus for connecting the processor array of an MPP array to a memory such that data conversion by software is not necessary, and the data can be directly stored in either a normal mode or vertical mode in the memory is disclosed. A connection circuit is provided in which multiple PEs share their connections to multiple data bits in the memory array. Each PE is associated with a plurality of memory buffer registers, which stores data read from (or to be written to) one or two memory data bits. In horizontal (normal) mode connection the memory bits are selected so that all the bits of a given byte are stored in the same PE, i.e., each set of buffer registers associated with a respective PE contains one byte as seen by an external device. In vertical (bit serial) mode, each set of buffer registers contains the successive bits at successive locations in the memory corresponding to that PE's position in the memory word.
Type: Grant
Filed: August 31, 2000
Date of Patent: June 28, 2005
Assignee: Micron Technology, Inc.
Inventor: Graham Kirsch
-
Patent number: 6901359
Abstract: A system and method for bulk transfer to and from the SRAMs in which a starting memory address is latched and is then incremented every clock cycle to generate a new memory address. The addresses are decoded and memory requests are pipelined to the SRAM memory, one every clock cycle. When the memory controller detects transfer of the boundary of a predetermined number of clock cycles or words (e.g. 64 words or four clock cycles) the burst mode of data transfer is stopped and the memory controller waits for a “done” signal before resuming another cycle of the burst transfer mode. The memory controller on detecting a request on this address boundary first does a memory refresh followed by a requested operation; e.g. a continuation of the transfer operation.
Type: Grant
Filed: September 6, 2000
Date of Patent: May 31, 2005
Assignee: Quickturn Design Systems, Inc.
Inventors: William F. Beausoleil, R. Bryan Cook, Tak-kwong Ng, Helmut Roth, Peter Tannenbaum, Lawrence A. Thomas, Norton J. Tomassetti
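The address generation in the abstract of 6901359 (latch a start address, increment once per clock, pause at a boundary) can be mimicked in software. The Python sketch below is a loose model under stated assumptions: the function name is hypothetical, one word is transferred per address, and the boundary is taken to be an address that is a multiple of `boundary` words, which the abstract does not spell out.

```python
def burst_addresses(start, count, boundary=64):
    """Group `count` consecutive addresses into bursts that stop at each boundary.

    Each inner list models one burst; between bursts the controller would
    wait for the "done" signal (and refresh) before resuming.
    """
    if boundary <= 0:
        raise ValueError("boundary must be positive")
    bursts, current = [], []
    for address in range(start, start + count):  # latched start, +1 per clock
        current.append(address)
        if (address + 1) % boundary == 0:        # next address crosses a boundary
            bursts.append(current)
            current = []
    if current:
        bursts.append(current)
    return bursts


bursts = burst_addresses(60, 10)
burst_lengths = [len(b) for b in bursts]
```

A transfer that starts mid-block therefore begins with a short burst up to the boundary and continues with aligned bursts afterward.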
-
Patent number: 6901491
Abstract: In one embodiment, a server is provided. The server includes multiple application processor chips. Each of the multiple application processor chips includes multiple processing cores. Multiple memories corresponding to the multiple processor chips are included. The multiple memories are configured such that one processor chip is associated with one memory. A plurality of fabric chips enabling each of the multiple application processor chips to access any of the multiple memories are included. The data associated with one of the multiple application processor chips is stored across each of the multiple memories. In one embodiment, the application processor chips include a remote direct memory access (RDMA) and striping engine. The RDMA and striping engine is configured to store data in a striped manner across the multiple memories. A method for allowing multiple processors to exchange information through horizontal scaling is also provided.
Type: Grant
Filed: October 16, 2002
Date of Patent: May 31, 2005
Assignee: Sun Microsystems, Inc.
Inventors: Leslie D. Kohn, Michael K. Wong
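Storing data "in a striped manner across the multiple memories", as the abstract of 6901491 puts it, can be illustrated with a round-robin split. The Python sketch below is an assumption-laden example: the function name, the fixed chunk size, and the round-robin order are all choices made here, since the abstract does not describe the striping policy.

```python
def stripe(data, num_memories, chunk_size):
    """Distribute `data` across `num_memories` buffers in fixed-size chunks."""
    if num_memories <= 0 or chunk_size <= 0:
        raise ValueError("num_memories and chunk_size must be positive")
    memories = [bytearray() for _ in range(num_memories)]
    for offset in range(0, len(data), chunk_size):
        target = (offset // chunk_size) % num_memories  # round-robin (assumed)
        memories[target] += data[offset:offset + chunk_size]
    return memories


memories = stripe(b"abcdefgh", num_memories=2, chunk_size=2)
```

Spreading consecutive chunks over different memories lets several of them serve one large transfer at once, which is the usual motivation for striping.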
-
Patent number: 6898657
Abstract: A multi-processor arrangement having an interprocessor communication path between each of every possible pair of processors, in addition to I/O paths to and from the arrangement, having signal processing functions configurably embedded in series with the communication paths and/or the I/O paths. Each processor is provided with a local memory which can be accessed by the local processor as well as by the other processors via the communications paths. This allows for efficient data movement from one processor's local memory to another processor's local memory, such as commonly done during signal processing corner turning operations. Configurable signal processing logic may be configured to host one or more signal processing functions which allow data to be autonomously accessed from the processor local memories, processed, and re-deposited in a local memory.
Type: Grant
Filed: December 16, 2002
Date of Patent: May 24, 2005
Assignee: Tera Force Technology Corp.
Inventor: Winthrop W. Smith
-
Patent number: 6892291
Abstract: An array processor includes processing elements arranged in clusters which are, in turn, combined in a rectangular array. Each cluster is formed of processing elements which preferably communicate with the processing elements of at least two other clusters. Additionally each inter-cluster communication path is mutually exclusive, that is, each path carries either north and west, south and east, north and east, or south and west communications. Due to the mutual exclusivity of the data paths, communications between the processing elements of each cluster may be combined in a single inter-cluster path. That is, communications from a cluster which communicates to the north and east with another cluster may be combined in one path, thus eliminating half the wiring required for the path. Additionally, the length of the longest communication path is not directly determined by the overall dimension of the array, as it is in conventional torus arrays.
Type: Grant
Filed: December 21, 2001
Date of Patent: May 10, 2005
Assignee: PTS Corporation
Inventors: Gerald G. Pechanek, Charles W. Kurak, Jr.
-
Patent number: 6883084Abstract: A reconfigurable data path processor comprises a plurality of independent processing elements. Each of the processing elements advantageously comprises an identical architecture. Each processing element comprises a plurality of data processing means for generating a potential output. Each processor is also capable of through-putting an input as a potential output with little or no processing. Each processing element comprises a conditional multiplexer having a first conditional multiplexer input, a second conditional multiplexer input and a conditional multiplexer output. A first potential output value is transmitted to the first conditional multiplexer input, and a second potential output value is transmitted to the second conditional multiplexer input. The conditional multiplexer couples either the first conditional multiplexer input or the second conditional multiplexer input to the conditional multiplexer output, according to an output control command.Type: GrantFiled: July 25, 2002Date of Patent: April 19, 2005Assignee: University of New MexicoInventor: Gregory Donohoe
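
The selection step described in this abstract can be illustrated with a minimal Python sketch. It is not the patented design: the function names, the choice of addition as the "data processing means", and the boolean form of the output control command are all assumptions made for illustration.

```python
def conditional_mux(first_input, second_input, select_first):
    """Couple one of two inputs to the output according to a control command.

    `select_first` is a hypothetical stand-in for the abstract's
    "output control command".
    """
    return first_input if select_first else second_input


def processing_element(a, b, select_first):
    """Toy processing element (hypothetical) with two potential outputs."""
    processed = a + b      # assumed example of a data processing result
    passed_through = a     # input passed through with no processing
    return conditional_mux(processed, passed_through, select_first)
```

Calling `processing_element(3, 4, True)` selects the processed value, while `processing_element(3, 4, False)` selects the passed-through input.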
-
Patent number: 6879341Abstract: A digital camera has a sensor for sensing an image, a processor for modifying the sensed image in accordance with instructions input into the camera and an output for outputting the modified image where the processor includes a series of processing elements arranged around a central crossbar switch. The processing elements include an Arithmetic Logic Unit (ALU) acting under the control of a writeable microcode store, an internal input and output FIFO for storing pixel data to be processed by the processing elements and the processor is interconnected to a read and write FIFO for reading and writing pixel data of images to the processor. Each of the processing elements can be arranged in a ring and each element is also separately connected to its nearest neighbors. The ALU receives a series of inputs interconnected via an internal crossbar switch to a series of core processing units within the ALU and includes a number of internal registers for the storage of temporary data.Type: GrantFiled: July 10, 1998Date of Patent: April 12, 2005Assignee: Silverbrook Research Pty LTDInventor: Kia Silverbrook
-
Patent number: 6874079Abstract: Aspects of a method and system for digital signal processing within an adaptive computing engine are described. These aspects include a mini-matrix, the mini-matrix comprising a set of composite blocks, each composite block capable of executing a predetermined set of instructions. A sequencer is included for controlling the set of composite blocks and directing instructions among the set of composite blocks based on a data-flow graph. Further, a data network is included and transmits data to and from the set of composite blocks and to the sequencer, while a status network routes status word data resulting from instruction execution in the set of composite blocks. With the present invention, an effective combination of hardware resources is provided in a manner that provides multi-bit digital signal processing capabilities for an embedded system environment, particularly in an implementation of an adaptive computing engine.Type: GrantFiled: July 25, 2001Date of Patent: March 29, 2005Assignee: Quicksilver TechnologyInventor: Eugene B. Hogenauer
-
Patent number: 6873287Abstract: The present invention relates to a method and an arrangement suitable for embedded signal processing, comprising a number of computational units (100), each computational unit comprising a number of processing elements (20) capable of working independently and transmitting data simultaneously. Said computational units are arranged in clusters, work independently, and transmit data simultaneously, and said processing elements (20) are globally and regularly inter-connected optically in a hypercube topology and transformed into a planar waveguide.Type: GrantFiled: November 1, 2001Date of Patent: March 29, 2005Assignee: Telefonaktiebolaget LM EricssonInventor: Håkan Forsberg
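
For readers unfamiliar with the hypercube topology named in this abstract, the following Python sketch shows the standard addressing rule: in a hypercube of dimension d, two nodes are linked exactly when their binary addresses differ in one bit. The function name and integer addressing are assumptions for illustration; the abstract does not say how the processing elements are numbered, and the optical planar-waveguide layout is not modelled.

```python
def hypercube_neighbors(node: int, dim: int) -> list[int]:
    """Return the addresses directly linked to `node` in a dim-dimensional hypercube."""
    if not 0 <= node < (1 << dim):
        raise ValueError("node address out of range for this dimension")
    # Flipping bit k of the address gives the neighbor along dimension k.
    return [node ^ (1 << k) for k in range(dim)]
```

Each node has exactly `dim` neighbors, so a 3-dimensional hypercube of 8 elements needs only 3 links per element.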
-
Patent number: 6847346Abstract: A transfer circuit 25 includes two sets of an input circuit 52A and an output circuit 53B, which allows bidirectional transfer. The input circuit 52A decomposes external input data signals DI11A and DI12A to signals on lines L11 to L14 in synchronism with a clock signal CLK in order to reduce the frequency thereof. The output circuit 53B composes the decomposed signals in synchronism with the clock signal CLK to regenerate the original signals and output them as external output data signals DO11B and DO12B. Signals on either the lines L11 to L14 or L21 to L24 are selected by a multiplexer 57 to provide to a main body circuit.Type: GrantFiled: October 24, 2002Date of Patent: January 25, 2005Assignee: Fujitsu LimitedInventors: Masao Kumagai, Shinya Udo
-
Publication number: 20040255096Abstract: A massively parallel data processing system consisting of an array of closely spaced cells where each cell has direct output means as well as means for processing, memory and input. The data processing system according to the present invention overcomes the von Neumann bottleneck of uniprocessor architectures, the I/O and memory bottlenecks that plague parallel processors, and the input bandwidth bottleneck of high-resolution displays.Type: ApplicationFiled: June 11, 2003Publication date: December 16, 2004Inventor: Richard S. Norman
-
Publication number: 20040250047Abstract: A system and method for using wider data paths within Processing Elements (PEs) of a Massively Parallel Array (MPP) to speed the computational performance of the PEs and the MPP array while still allowing for use of the simple 1-bit interconnection network to transfer data between PEs in the MPP is disclosed. A register having a data width equal to the data width of the PE for holding data for movement from one PE to another is provided in each PE. The register can be loaded in parallel within the PE, and operated as a shift register to transfer a full data width word from one PE to another PE using a 1-bit wide serial interconnection.Type: ApplicationFiled: June 9, 2004Publication date: December 9, 2004Inventor: Graham Kirsch
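
The parallel-load, serial-shift transfer described in this abstract can be sketched in Python as follows. This is a behavioural illustration only: the function name, the least-significant-bit-first order, and the one-bit-per-iteration timing are assumptions not stated in the abstract.

```python
def serial_transfer(word: int, width: int) -> int:
    """Move a `width`-bit word between two registers over a 1-bit link."""
    mask = (1 << width) - 1
    sender = word & mask   # register loaded in parallel inside the sending PE
    receiver = 0           # register in the receiving PE
    for _ in range(width):
        bit = sender & 1   # assumed order: least significant bit leaves first
        sender >>= 1
        # The receiver shifts right and takes the new bit at its top position.
        receiver = (receiver >> 1) | (bit << (width - 1))
    return receiver
```

After `width` shift steps the receiver holds the original word, which is why a wide PE data path can coexist with a 1-bit interconnection at the cost of one step per bit.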
-
Publication number: 20040250046Abstract: A system for processing applications includes processor nodes and links interconnecting the processor nodes. Each node includes a processing element, a software extensible device, and a communication interface. The processing element executes at least one of the applications. The software extensible device provides additional instructions to a set of standard instructions for the processing element. The communication interface communicates with other processor nodes.Type: ApplicationFiled: December 31, 2003Publication date: December 9, 2004Inventors: Ricardo E. Gonzalez, Albert R. Wang, Gareld Howard Banta
-
Patent number: 6826674Abstract: In the present invention, an input and/or output interface of at least one of a plurality of processing units forming a data processing system is designated independently of timing of execution of the processing unit, so as to allow the plurality of processing units to define various data paths at the program level.Type: GrantFiled: August 6, 2001Date of Patent: November 30, 2004Assignee: IP Flex, Inc.Inventor: Tomoyoshi Sato
-
Publication number: 20040215926Abstract: A processor book designed to support both commercial workloads and technical workloads based on a dynamic or static mechanism of reconfiguring the external wiring interconnect. The processor book is configured as a building block for commercial workload processing systems with external connector buses (ECBs). The processor book is also provided with routing logic to enable the ECBs to be utilized for either book-to-book routing or routing within the same processor book. A table-specific wiring scheme is provided for coupling the ECBs running off the chips of one MCM to the chips of the second MCM on the processor book so that the chips of the first MCM are connected directly to the chips of a second MCM that is logically furthest away and vice versa. Once the wiring of the ECBs is completed according to the wiring scheme, the operational and functional characteristics reflect those of a processor book configured for technical workloads.Type: ApplicationFiled: April 28, 2003Publication date: October 28, 2004Applicant: International Business Machines Corp.Inventors: Ravi Kumar Arimilli, Vicente Enrique Chung, Jody Bern Joyner, Jerry Don Lewis
-
Patent number: 6810434Abstract: An integrated circuit architecture for multimedia processing. A single integrated circuit (IC) operates as a system or subsystem, and is adaptable to processing a variety of multimedia algorithms, whether proprietary or open. Hard macros, either analog or digital, can be incorporated. The IC can also contain audio/video CODECs to suit different standards, as well as other peripheral devices which may be required for multimedia applications. An electronic component (e.g., integrated circuit) incorporating the technique is suitably included in a system or subsystem having electrical functionality, such as general purpose computers, telecommunications devices, and the like.Type: GrantFiled: August 14, 2001Date of Patent: October 26, 2004Assignee: Kawasaki Microelectronics, Inc.Inventors: Kumaraguru Muthujumaraswathy, Michael D. Rostoker
-
Patent number: 6807640Abstract: A programmable interface controller for transmitting data to an output device that is suitable in both fully synchronous systems and in systems that span clock domains. The illustrative embodiments comprise: receiving a plurality of field identifiers and an indication of an order by which each of the plurality of field identifiers is to be uniquely associated with each field in a sequence of fields; receiving a stream of data that comprises the sequence of fields and an indication of the boundary between successive fields in the sequence of fields; and processing each field in the stream of data in accordance with the field identifier uniquely associated with that field.Type: GrantFiled: May 8, 2001Date of Patent: October 19, 2004Assignee: Intersil Americas, Inc.Inventor: Michael Andrew Fischer
-
Patent number: 6791551Abstract: A system and method for synchronizing image display and buffer swapping in a multiple processor-multiple display environment. In a master-slave dichotomy, one processor or system is deemed the master and the others act as slaves. The master generates signals used to control vertical retrace and buffer swapping for itself and the slaves. In addition, a synchronization signal generator is provided to synchronize a timing signal between the master and slave systems.Type: GrantFiled: November 27, 2001Date of Patent: September 14, 2004Assignee: Silicon Graphics, Inc.Inventors: Shrijeet Mukherjee, Kanoj Sarcar, James Tornes
-
Publication number: 20040168040Abstract: An array processor includes processing elements (00, 01, 02, 03, 10, 11, 12, 13, 20, 21, 22, 23, 30, 31, 32, 33) arranged in clusters (e.g., 44, 46, 48, 50) to form a rectangular array (40). Inter-cluster communication paths (88) are mutually exclusive. Due to the mutual exclusivity of the data paths, communications between the processing elements of each cluster may be combined in a single inter-cluster path, thus eliminating half the wiring required for the path. The length of the longest communication path is not directly determined by the overall dimension of the array, as in conventional torus arrays. Rather, the longest communications path is limited by the inter-cluster spacing. Transpose elements of an N×N torus may be combined in clusters and communicate with one another through intra-cluster communications paths. Transpose operation latency is eliminated in this approach. Each PE may have a single transmit port (35) and a single receive port (37).Type: ApplicationFiled: February 9, 2004Publication date: August 26, 2004Applicant: PTS CorporationInventors: Gerald G. Pechanek, Charles W. Kurak
-
Publication number: 20040143724Abstract: The present invention provides an adaptive computing engine (ACE) that includes processing nodes having different capabilities such as arithmetic nodes, bit-manipulation nodes, finite state machine nodes, input/output nodes and a programmable scalar node (PSN). In accordance with one embodiment of the present invention, a common architecture is adaptable to function either as a kernel node, or k-node, or as a general-purpose RISC node. The k-node acts as a system controller responsible for adapting other nodes to perform selected functions. As a RISC node, the PSN is configured to perform computationally intensive applications such as signal processing.Type: ApplicationFiled: September 29, 2003Publication date: July 22, 2004Applicant: QuickSilver Technology, Inc.Inventors: Rojit Jacob, Dan MingLun Chuang