Distributed Processing System Patents (Class 712/28)
-
Publication number: 20100077472
Abstract: A secure communication interface for a secure multi-processor system is disclosed. The secure communication interface can include a secure controller that is operable to transfer data between a first memory that is directly accessible by a first (master) processor and a second memory that is directly accessible by a secure second (slave) processor in the multi-processor system. One or more control and status registers accessible by the processors facilitate secure data transfer between the first memory and a memory window defined in the second memory. One or more status and violation registers shared by the processors can be included in the secure communication interface for facilitating secure data transfer and for reporting security violations based on a rule set.
Type: Application
Filed: September 23, 2008
Publication date: March 25, 2010
Inventors: Majid Kaabouch, Eric Le Cocquen
-
Patent number: 7685370
Abstract: A data processing system can establish or maintain data coherency by issuing a data flush operation. An agent can initialize a first flush operation by writing to a flush register. The agent can determine that the flush operation is complete by reading a status indicator from a status register. Additional agents can independently issue flush operations during the pendency of the first flush operation. A second flush instruction and any additional flush instructions that issue during the pendency of the first flush operation set a flush pending indicator in a status register. Once the first flush operation completes, the host performs all pending flush operations in a single second flush operation. The status indicator does not indicate a completed flush operation for the first flush operation until all flush operations are complete. Multiple co-pending flush operations are collapsed into at most two flush operations.
Type: Grant
Filed: December 16, 2005
Date of Patent: March 23, 2010
Assignee: NVIDIA Corporation
Inventors: Samuel Hammond Duncan, Lincoln G. Garlick
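The collapsing behavior described in this abstract can be modeled in a few lines (an illustrative sketch, not the patented hardware; the `FlushCollapser` class and its method names are invented for the example):

```python
class FlushCollapser:
    """Model of collapsing co-pending flush requests into at most two flushes."""

    def __init__(self):
        self.in_progress = False  # a flush operation is currently running
        self.pending = False      # the 'flush pending' status-register bit
        self.flushes = 0          # flush operations actually performed

    def request(self):
        """An agent writes the flush register."""
        if self.in_progress:
            self.pending = True   # collapse into one follow-up flush
        else:
            self.in_progress = True
            self.flushes += 1     # the first flush begins immediately

    def complete(self):
        """Hardware signals that the running flush has finished."""
        if self.pending:
            self.pending = False
            self.flushes += 1     # a single second flush serves all requests
        else:
            self.in_progress = False

    def done(self):
        """Status read: complete only when no flush work remains."""
        return not self.in_progress and not self.pending
```

However many requests arrive while the first flush is pending, `flushes` never exceeds two per burst, matching the "at most two flush operations" behavior the abstract describes.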
-
Patent number: 7671863
Abstract: Architectures for graphic engine chips with minimum impact on other resources are disclosed. According to one embodiment, a graphic engine architecture includes a scheduler that is configured to schedule an execution time for each of the drawing instructions sent in groups from a processor. Each drawing instruction includes a piece of time information. The scheduler is provided to fetch the drawing instructions from a FIFO buffer that buffers the drawing instructions. Subsequently, the drawing instructions are successively executed according to their scheduling.
Type: Grant
Filed: March 6, 2006
Date of Patent: March 2, 2010
Assignee: Vimicro Corporation
Inventors: Chuanen Jin, Chunquan Dai
-
Publication number: 20100049941
Abstract: A method for performing a scatter-type data distribution among a cluster of computational devices. A number of nodes (equal to a value Cg, the number of tree generator channels) are initially generated, each connected to an initial generator, to create respective initial root nodes of an initial tree structure. Data is transmitted from the initial generator to each of the initial root nodes. Cg root nodes, each connected to a respective new generator, are generated to create respective roots of Cg newly generated tree structures. Each of the tree structures is expanded by generating Ct (the number of communication channels per node in each tree structure) new nodes connected to each node generated in each previous step. Data is then transmitted to each of the new nodes from an immediately preceding one of the nodes, and from each new generator to an associated root node.
Type: Application
Filed: October 19, 2009
Publication date: February 25, 2010
Inventor: Kevin D. Howard
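The growth of the tree this abstract describes can be sketched numerically (a hypothetical helper; `scatter_tree_levels` is an invented name, with Cg initial roots and Ct communication channels per node as the abstract defines them):

```python
def scatter_tree_levels(cg, ct, steps):
    """Return the number of nodes generated at each expansion level.

    cg: number of tree generator channels (initial root nodes).
    ct: communication channels per node (children spawned per step).
    """
    levels = [cg]                       # Cg root nodes, each fed by a generator
    for _ in range(steps):
        levels.append(levels[-1] * ct)  # every node spawns Ct new nodes
    return levels
```

With cg = 2 and ct = 3, two expansion steps yield 2, 6, and 18 nodes per level, so fan-out grows geometrically and the scatter reaches N devices in logarithmically many steps.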
-
Patent number: 7664970
Abstract: Embodiments of the invention relate to a method and apparatus for a zero voltage processor sleep state. A processor may include a dedicated cache memory. A voltage regulator may be coupled to the processor to provide an operating voltage to the processor. During a transition to a zero voltage power management state for the processor, the operational voltage applied to the processor by the voltage regulator may be reduced to approximately zero and the state variables associated with the processor may be saved to the dedicated cache memory.
Type: Grant
Filed: December 30, 2005
Date of Patent: February 16, 2010
Assignee: Intel Corporation
Inventors: Sanjeev Jahagirdar, George Varghese, John B. Conrad, Robert Milstrey, Stephen A. Fischer, Alon Navch, Shai Rotem
-
Publication number: 20100037035
Abstract: Methods, apparatus, and products are disclosed for generating an executable version of an application using a distributed compiler operating on a plurality of compute nodes that include: receiving, by each compute node, a portion of source code for an application; compiling, in parallel by each compute node, the portion of the source code received by that compute node into a portion of object code for the application; performing, in parallel by each compute node, inter-procedural analysis on the portion of the object code of the application for that compute node, including sharing results of the inter-procedural analysis among the compute nodes; optimizing, in parallel by each compute node, the portion of the object code of the application for that compute node using the shared results of the inter-procedural analysis; and generating the executable version of the application in dependence upon the optimized portions of the object code of the application.
Type: Application
Filed: August 11, 2008
Publication date: February 11, 2010
Applicant: International Business Machines Corporation
Inventors: Charles J. Archer, Michael A. Blocksome, Joseph D. Ratterman, Albert Sidelnik, Brian E. Smith
-
Patent number: 7659901
Abstract: Systems and methods that optimize GPU processing by front loading activities from a set time/binding time to creation time via enhancements to an API that configures the GPU. Such enhancements to the API include: implementing layering arrangements; employing state objects and view components for data objects; incorporating a pipeline stage linkage/signature; and employing a detection mechanism to mitigate error conditions. Such an arrangement enables front loading of the work and reduction of associated API calls.
Type: Grant
Filed: November 30, 2006
Date of Patent: February 9, 2010
Assignee: Microsoft Corporation
Inventors: Michael A. Toelle, Craig C. Peeper, Brian T. Klamik, Sam Glassenberg
-
Publication number: 20100011189
Abstract: An information processing device includes a plurality of processor cores each including a plurality of transistors, and at least one substrate bias circuit that supplies each of the plurality of transistors with a substrate bias voltage that is determined based on the number of the processor cores.
Type: Application
Filed: September 18, 2009
Publication date: January 14, 2010
Applicant: Fujitsu Limited
Inventor: Keisuke Muraya
-
Patent number: 7647593
Abstract: A CPU 111m segments jobs from each of the volume rendering processes on hand, prioritizes the processing sequence for each job, simultaneously transmits the job that has reached the head of the processing order to the computers (21 to 2k) on the accepting side and to other computers equivalent to itself, and executes the job for its own processing. Upon receiving the processing result from the computer that completes processing of the transmitted job earliest, the CPU 111m issues a halt command for the job to the other computers on the accepting side. At this time, if any requested job on hand remains uncompleted, the series of processing procedures starting with the simultaneous transmission is repeated.
Type: Grant
Filed: September 30, 2004
Date of Patent: January 12, 2010
Assignee: Ziosoft, Inc.
Inventor: Kazuhiko Matsumoto
-
Patent number: 7640443
Abstract: A storage system capable of minimizing performance deterioration, saving power consumption, and realizing high reliability is provided. A storage system according to the present invention includes a computer, a storage apparatus connected with the computer, and a storage management apparatus connected to the storage apparatus. The storage apparatus includes a hard disk unit, controls data write and read operations between the computer and the hard disk unit, and controls the on/off states of the hard disk unit's power supply on a group basis. The storage management apparatus collects running information about each computer and job information for the jobs each computer executes, determines the on/off times of the hard disk unit's power supply on the group basis, and records the collected information and the on/off times of the power supply on the group basis.
Type: Grant
Filed: September 15, 2008
Date of Patent: December 29, 2009
Assignee: Hitachi, Ltd.
Inventor: Kazuhisa Fujimoto
-
Patent number: 7630376
Abstract: Sequences of items may be maintained using ordered locks. These items may correspond to anything, but using ordered locks to maintain sequences of packets, especially for maintaining requisite packet orderings when distributing packets to be processed to different packet processing engines, may be particularly useful. For example, in response to a particular packet processing engine completing processing of a particular packet, a gather instruction is attached to the particular identifier of a particular ordered lock associated with the particular packet. If no longer needed for further processing, the packet processing engine is immediately released to be able to process another packet or perform another function. The gather instruction is typically performed in response to the particular ordered lock being acquired by the particular identifier, with the gather instruction causing the processed particular packet to be sent.
Type: Grant
Filed: April 3, 2008
Date of Patent: December 8, 2009
Assignee: Cisco Technology, Inc.
Inventors: John J. Williams, Jr., John Andrew Fingerhut, Doron Shoham, Shimon Listman
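The ordering discipline in this abstract can be sketched as follows (an illustrative model, not the patented implementation; the class and method names are invented): identifiers join the lock in packet-arrival order, engines attach a gather action when they finish, and gathers fire only when their identifier reaches the head of the sequence.

```python
from collections import deque

class OrderedLock:
    """Gather actions execute strictly in the order identifiers were added,
    even when processing engines finish out of order."""

    def __init__(self):
        self.queue = deque()   # identifiers in original packet order
        self.ready = {}        # identifier -> attached gather action
        self.sent = []         # packets sent so far, in order

    def add(self, ident):
        self.queue.append(ident)           # packet enters the sequence

    def attach_gather(self, ident, action):
        """Engine finished: attach the gather and release the engine."""
        self.ready[ident] = action
        self._drain()

    def _drain(self):
        # An identifier "acquires" the lock only at the head of the queue;
        # its gather then runs, sending the processed packet.
        while self.queue and self.queue[0] in self.ready:
            ident = self.queue.popleft()
            self.sent.append(self.ready.pop(ident)())
```

Note that the engine that finished packet 2 first is released immediately, even though packets 0 and 1 have not been sent yet; only the cheap gather action waits.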
-
Publication number: 20090300326
Abstract: A method (system and computer program product) performs facet classification synthesis to relate concepts represented by concept definitions defined in accordance with a faceted data set comprising facets, facet attributes, and facet attribute hierarchies. Dimensional concept relationships are expressed between the concept definitions. Two concept definitions are determined to be related in a particular dimensional concept relationship by examining whether at least one of explicit relationships and implicit relationships exist in the faceted data set between the respective facet attributes of the two concept definitions.
Type: Application
Filed: June 4, 2009
Publication date: December 3, 2009
Inventor: Peter Sweeney
-
Patent number: 7627738
Abstract: A data processing system includes a first processing node and a second processing node. The first processing node includes a plurality of first processing units coupled to each other for communication, and the second processing node includes a plurality of second processing units coupled to each other for communication. Each of the plurality of first processing units is coupled to a respective one of the plurality of second processing units in the second processing node by a respective one of a plurality of point-to-point links.
Type: Grant
Filed: December 19, 2007
Date of Patent: December 1, 2009
Assignee: International Business Machines Corporation
Inventors: Vicente E. Chung, Benjiman L. Goodman, Praveen S. Reddy, William J. Starke
-
Publication number: 20090292900
Abstract: Control messages are sent from a control processor to a plurality of attached processors via a control tree structure comprising the plurality of attached processors and branching from the control processor, such that two or more of the plurality of attached processor nodes are operable to send messages to other attached processor nodes in parallel.
Type: Application
Filed: May 21, 2008
Publication date: November 26, 2009
Applicant: Cray Inc.
Inventor: Michael Karo
-
Patent number: 7620776
Abstract: A method, apparatus, and computer program product are disclosed for reducing the number of unnecessarily broadcast remote requests to reduce the latency to access data from local nodes and to reduce global traffic in an SMP computer system. A modified invalid cache coherency protocol state is defined that predicts whether a memory access request to read or write data in a cache line can be satisfied within a local node. When a cache line is in the modified invalid state, the only valid copies of the data are predicted to be located in the local node. When a cache line is in the invalid state and not in the modified invalid state, a valid copy of the data is predicted to be located in one of the remote nodes. Memory access requests to read exclusive or write data in a cache line that is not currently in the modified invalid state are broadcast first to all nodes.
Type: Grant
Filed: December 12, 2007
Date of Patent: November 17, 2009
Assignee: International Business Machines Corporation
Inventors: Jason Frederick Cantin, Steven R. Kunkel
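The prediction this abstract describes reduces to a per-request scope choice (a minimal sketch under assumed state names, not the full coherence protocol):

```python
def first_request_scope(line_state):
    """Choose where to send a read-exclusive or write request first.

    'modified_invalid' predicts the only valid copies are within the
    local node, so the request is tried locally first; any other invalid
    state predicts a remote copy, so the request is broadcast to all nodes.
    """
    return "local" if line_state == "modified_invalid" else "global"
```

The payoff is that lines a local writer recently invalidated skip the global broadcast entirely, cutting latency and interconnect traffic.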
-
Patent number: 7614059
Abstract: A method is presented for a mobile agent object to discover services available in a host-computing environment. According to an embodiment of this method, the mobile agent object requests a service listing from the host environment. The host environment returns a service listing to the mobile agent object in response to the request for the service listing. The mobile agent object then determines if a particular service is within the returned service listing and requests the particular service if the particular service is determined by the mobile agent object to be within the returned service listing.
Type: Grant
Filed: July 11, 2003
Date of Patent: November 3, 2009
Assignee: Topia Technology
Inventor: Michael R. Manzano
-
Publication number: 20090259713
Abstract: A novel massively parallel supercomputer of hundreds-of-teraOPS scale includes node architectures based upon System-On-a-Chip technology, i.e., each processing node comprises a single Application Specific Integrated Circuit (ASIC). Within each ASIC node is a plurality of processing elements, each of which consists of a central processing unit (CPU) and a plurality of floating point processors to enable an optimal balance of computational performance, packaging density, low cost, and power and cooling requirements. The plurality of processors within a single node may be used individually or simultaneously to work on any combination of computation or communication as required by the particular algorithm being solved or executed at any point in time. The system-on-a-chip ASIC nodes are interconnected by multiple independent networks that optimally maximize packet communications throughput and minimize latency.
Type: Application
Filed: June 26, 2009
Publication date: October 15, 2009
Applicant: International Business Machines Corporation
Inventors: Matthias A. Blumrich, Dong Chen, George L. Chiu, Thomas M. Cipolla, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Gerard V. Kopcsay, Lawrence S. Mok, Todd E. Takken
-
Publication number: 20090259825
Abstract: A system has a first plurality of cores in a first coherency group. Each core transfers data in packets. The cores are directly coupled serially to form a serial path. The data packets are transferred along the serial path. The serial path is coupled at one end to a packet switch. The packet switch is coupled to a memory. The first plurality of cores and the packet switch are on an integrated circuit. The memory may or may not be on the integrated circuit. In another aspect a second plurality of cores in a second coherency group is coupled to the packet switch. The cores of the first and second pluralities may be reconfigured to form or become part of coherency groups different from the first and second coherency groups.
Type: Application
Filed: April 15, 2008
Publication date: October 15, 2009
Inventors: Perry H. Pelley, III, George P. Hoekstra, Lucio F.C. Pessoa
-
Publication number: 20090254718
Abstract: A digital signal processor (DSP) co-processor according to a clustered architecture with local memories. Each cluster in the architecture includes multiple sub-clusters, each sub-cluster capable of executing one or two instructions that may be specifically directed to a particular DSP operation. The sub-clusters in each cluster communicate with global memory resources by way of a crossbar switch in the cluster. One or more of the sub-clusters has a dedicated local memory that can be accessed in a random access manner, in a vector access manner, or in a streaming or stack manner. The local memory is arranged as a plurality of banks. In response to certain vector access instructions, the input data may be permuted among the banks prior to a write, or permuted after being read from the banks, according to a permutation pattern stored in a register.
Type: Application
Filed: March 6, 2009
Publication date: October 8, 2009
Applicant: Texas Instruments Incorporated
Inventors: Eric Biscondi, David J. Hoyle, Tod D. Wolf
-
Patent number: 7600095
Abstract: Executing a scatter operation on a parallel computer includes: configuring a send buffer on a logical root, the send buffer having positions, each position corresponding to a ranked node in an operational group of compute nodes and for storing contents scattered to that ranked node; and repeatedly for each position in the send buffer: broadcasting, by the logical root to each of the other compute nodes on a global combining network, the contents of the current position of the send buffer using a bitwise OR operation; determining, by each compute node, whether the current position in the send buffer corresponds with the rank of that compute node; if the current position corresponds with the rank, receiving the contents and storing the contents in a reception buffer of that compute node; and if the current position does not correspond with the rank, discarding the contents.
Type: Grant
Filed: April 19, 2007
Date of Patent: October 6, 2009
Assignee: International Business Machines Corporation
Inventors: Charles J. Archer, Joseph D. Ratterman
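The trick in this abstract is that a bitwise-OR combine in which only the root contributes non-zero data behaves exactly like a broadcast. A sketch, simulated on one machine with node 0 as the logical root (both the function name and the single-process simulation are assumptions for illustration):

```python
def scatter_via_or(send_buffer, num_nodes):
    """Scatter each send-buffer position using one OR-combine per position."""
    received = [None] * num_nodes
    for pos, contents in enumerate(send_buffer):
        # Every node contributes to the combining network; non-root nodes
        # contribute zero, so the OR result equals the root's contents,
        # i.e. the combine acts as a broadcast.
        combined = 0
        for rank in range(num_nodes):
            combined |= contents if rank == 0 else 0
        # Only the node whose rank matches the current position stores the
        # result in its reception buffer; all other ranks discard it.
        received[pos] = combined
    return received
```

After all positions have been broadcast, each rank holds exactly its own slice of the root's send buffer, which is the defining post-condition of a scatter.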
-
Publication number: 20090249029
Abstract: An overall processing time to rasterize, at the first device, the electronic document to be rendered is computed. Also, a rendering time to render, at the first device, the electronic document to be rendered is computed. When the overall processing time to rasterize at the first device is greater than the rendering time to render at the first device, the electronic document to be rendered is parsed into a first document and sub-documents. A productivity capacity of each node is determined, the productivity capacity being a measure of the processing power of the node and the communication cost of exchanging information between the first device and the node. A sub-document is rasterized at a node when the productivity capacity of the node reduces the processing time to rasterize the electronic document to be rendered to less than the computed overall processing time.
Type: Application
Filed: March 25, 2008
Publication date: October 1, 2009
Applicant: Xerox Corporation
Inventors: Hua Liu, Steven J. Harrington
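The node-selection rule can be sketched as a cost comparison (a hypothetical helper; the rate-plus-communication-cost model below is an assumption standing in for the patent's "productivity capacity"):

```python
def pick_node(local_time, sub_work, nodes):
    """Choose a node that rasterizes a sub-document faster than locally.

    nodes: list of (processing_rate, comm_cost) pairs. A node's total time
    is work/rate plus the cost of exchanging data with the first device.
    Returns the index of the best qualifying node, or None if rasterizing
    locally remains fastest.
    """
    best, best_time = None, float("inf")
    for i, (rate, comm_cost) in enumerate(nodes):
        node_time = sub_work / rate + comm_cost  # the productivity tradeoff
        if node_time < local_time and node_time < best_time:
            best, best_time = i, node_time
    return best
```

A fast node behind a slow link can therefore lose to a slower node nearby, which is the point of folding communication cost into the capacity measure.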
-
Patent number: 7596705
Abstract: An apparatus for controlling a power management mode of a multi-core processor in a computer system includes a monitoring unit configured for monitoring conditions relating to the power management mode of the multi-core processor. The apparatus also includes an automatic mode change unit operatively connected to the monitoring unit for receiving the monitored conditions. The automatic mode change unit is configured to set the power management mode of the multi-core processor to a single-core mode or a multi-core mode based on the monitored conditions.
Type: Grant
Filed: June 14, 2006
Date of Patent: September 29, 2009
Assignee: LG Electronics Inc.
Inventor: Seo Kwang Kim
-
Publication number: 20090240915
Abstract: Methods, systems, and products are disclosed for broadcasting collective operation contributions throughout a parallel computer. The parallel computer includes a plurality of compute nodes connected together through a data communications network. Each compute node has a plurality of processors for use in collective parallel operations on the parallel computer. Broadcasting collective operation contributions throughout a parallel computer according to embodiments of the present invention includes: transmitting, by each processor on each compute node, that processor's collective operation contribution to the other processors on that compute node using intra-node communications; and transmitting on a designated network link, by each processor on each compute node according to a serial processor transmission sequence, that processor's collective operation contribution to the other processors on the other compute nodes using inter-node communications.
Type: Application
Filed: March 24, 2008
Publication date: September 24, 2009
Applicant: International Business Machines Corporation
Inventor: Ahmad Faraj
-
Patent number: 7594233
Abstract: A method of and computer system for selecting a processor of a computer system on which to launch a processing thread is described. Each processor's load is compared with the volunteer load recorded in volunteer information. If the processor load is lower than the volunteer load, the volunteer information is updated with the compared processor's information. If the compared processor is the current volunteer and the compared processor load is higher than the volunteer load, the volunteer information is updated with the compared processor's information.
Type: Grant
Filed: June 28, 2002
Date of Patent: September 22, 2009
Assignee: Hewlett-Packard Development Company, L.P.
Inventors: Paul Gootherts, Douglas V. Larson
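The update rule reduces to two comparisons (a sketch with assumed `(processor_id, load)` tuples; the function name is invented):

```python
def update_volunteer(volunteer, candidate):
    """Apply the volunteer update rule for one compared processor.

    volunteer, candidate: (processor_id, load) pairs.
    """
    vol_id, vol_load = volunteer
    cand_id, cand_load = candidate
    if cand_load < vol_load:
        return candidate      # a lighter-loaded processor becomes volunteer
    if cand_id == vol_id and cand_load > vol_load:
        return candidate      # current volunteer got busier: refresh its load
    return volunteer          # otherwise the volunteer is unchanged
```

Scanning every processor through this rule keeps the volunteer pointing at a lightly loaded processor, which is then the natural place to launch the next thread.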
-
Patent number: 7594045
Abstract: A memory control apparatus and method for operating a plurality of digital signal processors (DSPs) using a single memory slot and buffer are provided. Exemplary embodiments provide at least one DSP for processing different signals, a flash memory that can record and reproduce a digital signal, a plurality of selection switches located on signal lines between the DSP and the flash memory for switching the signals, a three-state buffer that selectively outputs insert information of the memory to the DSPs according to a control signal, a control unit for providing the control signal for controlling switching of the signals, and a key input unit for determining input/output operation modes. The control unit records and reproduces the data in the flash memory according to the operation mode determined through the key input unit.
Type: Grant
Filed: June 12, 2006
Date of Patent: September 22, 2009
Assignee: Samsung Electronics Co., Ltd.
Inventor: Yong-Hyun Lee
-
Publication number: 20090228685
Abstract: Methods and systems are provided for partitioning data of a database or data store into several independent parts as part of a data mining process. The methods and systems use a mining application having content-based partitioning logic to partition the data. Once the data is partitioned, the partitioned data may be grouped and distributed to an associated processor for further processing. The mining application and content-based partitioning logic may be used in a computing system, including shared memory and distributed memory multi-processor computing systems. Other embodiments are described and claimed.
Type: Application
Filed: April 27, 2006
Publication date: September 10, 2009
Inventors: Hu Wei, Lai Chunrong
-
Patent number: 7584369
Abstract: The disclosed methodology and apparatus may control heat generation in a multi-core processor. In one embodiment, each processor core includes a temperature sensor that reports temperature information to a processor controller. If a particular processor core exceeds a predetermined temperature, the processor controller disables that processor core to allow it to cool. The processor controller re-enables the previously disabled processor core when it cools sufficiently to a normal operating temperature. The disclosed multi-core processor may avoid undesirable hot spots that impact processor life.
Type: Grant
Filed: July 26, 2006
Date of Patent: September 1, 2009
Assignee: International Business Machines Corporation
Inventors: Louis Bennie Capps, Jr., Warren D. Dyckman, Michael Jay Shapiro
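The control loop amounts to per-core hysteresis (a sketch; the function name and threshold values are assumptions, not figures from the patent):

```python
def regulate_cores(temps, enabled, hot_limit=95.0, cool_limit=70.0):
    """Return the next enabled/disabled state for each core.

    An enabled core stays on only while below hot_limit; a disabled core
    is re-enabled once it has cooled to cool_limit or below.
    """
    return [temp < hot_limit if on else temp <= cool_limit
            for temp, on in zip(temps, enabled)]
```

Using two thresholds rather than one prevents a core hovering near the limit from toggling on and off every control cycle.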
-
Publication number: 20090216996
Abstract: A system and methods comprising a plurality of leaf nodes in communication with one or more branch nodes, each node comprising a processor. Each leaf node is arranged to obtain data indicative of a restriction A|IS of a linear map from R^n to R^m represented by a first matrix, A, to a subspace IS of R^n and to carry out a calculation of data indicative of at least a leading part of the SVD of a matrix representation of the restriction A|IS. One or more of the plurality of leaf nodes or branch nodes is arranged to use results of the calculations to compute data indicative of a subspace OS of each node input subspace IS and to pass that data and a corresponding restriction A|OS of A to one of a plurality of the one or more branch nodes. Each of the one or more branch nodes is arranged to receive data indicative of node output spaces OS1, . . . , OSk and the corresponding restrictions A|OS1, . . . , A|OSk for k ≥ 2, to use this data to form a further node input space IS = OS1 + . . .
Type: Application
Filed: February 20, 2009
Publication date: August 27, 2009
Inventors: Daniel James Goodman, Raphael Andreas Hauser
-
Patent number: 7581081
Abstract: A system for processing applications includes processor nodes and links interconnecting the processor nodes. Each node includes a processing element, a software extensible device, and a communication interface. The processing element executes at least one of the applications. The software extensible device provides additional instructions to a set of standard instructions for the processing element. The communication interface communicates with other processor nodes.
Type: Grant
Filed: December 31, 2003
Date of Patent: August 25, 2009
Assignee: Stretch, Inc.
Inventors: Ricardo E. Gonzalez, Albert R. Wang, Gareld Howard Banta
-
Patent number: 7581079
Abstract: A shared memory network for communicating between processors using store and load instructions is described. A new processor architecture which may be used with the shared memory network is also described that uses arithmetic/logic instructions that do not specify any source operand addresses or target operand addresses. The source operands and target operands for arithmetic/logic execution units are provided by independent load instruction operations and independent store instruction operations.
Type: Grant
Filed: March 26, 2006
Date of Patent: August 25, 2009
Inventor: Gerald George Pechanek
-
Publication number: 20090210732
Abstract: This invention provides an information processing apparatus which includes a first storage unit and a second storage unit and implements a function of causing the first storage unit and the second storage unit to store data redundantly while maintaining a power saving mode even upon receiving an access request from an external apparatus in the power saving mode, and a method of controlling the same. To accomplish this, upon receiving an HDD access request in the power saving mode, the information processing apparatus operates after transiting to an HDD access mode in which only minimum necessary functions are activated without activating the main CPU. The contents of the HDD changed during the HDD access mode are stored as history information. Upon transiting from the power saving mode to the normal operating mode, the data in another HDD is updated in accordance with the history information, thereby implementing a mirroring function.
Type: Application
Filed: February 19, 2009
Publication date: August 20, 2009
Applicant: Canon Kabushiki Kaisha
Inventor: Takeshi Aoyagi
-
Publication number: 20090204789
Abstract: Methods, apparatus, and products for distributing parallel algorithms of a parallel application among compute nodes of an operational group in a parallel computer are disclosed that include establishing a hardware profile, the hardware profile describing thermal characteristics of each compute node in the operational group; establishing a hardware independent application profile, the application profile describing thermal characteristics of each parallel algorithm of the parallel application; and mapping, in dependence upon the hardware profile and application profile, each parallel algorithm of the parallel application to a compute node in the operational group.
Type: Application
Filed: February 11, 2008
Publication date: August 13, 2009
Applicant: International Business Machines Corporation
Inventors: Thomas M. Gooding, Brant L. Knudson, Cory Lappi, Ruth J. Poole, Andrew T. Tauferner
-
Patent number: 7552439
Abstract: A method includes receiving at least one process control value from a deterministic process control environment according to an execution cycle of the deterministic process control environment. The method also includes providing the at least one process control value to a non-deterministic process according to an execution cycle of the non-deterministic process. The execution cycle of the non-deterministic process does not correspond to the execution cycle of the deterministic process control environment.
Type: Grant
Filed: March 28, 2006
Date of Patent: June 23, 2009
Assignee: Honeywell International Inc.
Inventors: Gary L. Fox, Lawrence L. Martin, Robert J. McNulty
-
Publication number: 20090150650
Abstract: Techniques for grouping individual processors into assignment entities are discussed. Statically grouping processors may permit threads to be assigned on a group basis. In this manner, the burden of scheduling threads for processing may be minimized, while the processor within the assignment entity may be selected based on the physical locality of the individual processors within the group. The groupings may permit a system to scale to meet the processing demands of various applications.
Type: Application
Filed: December 7, 2007
Publication date: June 11, 2009
Applicant: Microsoft Corporation
Inventors: Arie Van der Hoeven, Ellsworth D. Walker, Forrest C. Foltz, Zhong Deng
-
Publication number: 20090150651
Abstract: Disclosed herein is a semiconductor chip including: a plurality of processing devices that can communicate with each other; wherein each of the processing devices includes an arithmetic unit, an individual memory connected to the arithmetic unit on a one-to-one basis, and a control unit configured to independently control turning on and off of operation of the arithmetic unit and the individual memory.
Type: Application
Filed: November 17, 2008
Publication date: June 11, 2009
Applicant: Sony Corporation
Inventor: Mutsuhiro Ohmori
-
Patent number: 7533382
Abstract: A hyperprocessor includes a control processor controlling tasks executed by a plurality of processor cores, each of which may include multiple execution units, or special hardware units. The control processor schedules tasks according to control threads for the tasks created during compilation and comprising a hardware context including register files, a program counter and status bits for the respective task. The tasks are dispatched to the processor cores or special hardware units for parallel, sequential, out-of-order or speculative execution. A universal register file contains data to be operated on by the task, and an interconnect couples at least the processor cores or special hardware units to each other and to the universal register file, allowing each node to communicate with any other node.
Type: Grant
Filed: October 30, 2002
Date of Patent: May 12, 2009
Assignee: STMicroelectronics, Inc.
Inventor: Faraydon O. Karim
-
Publication number: 20090113405Abstract: The architectures derived from the proposed template are integrated in a generic System on Chip (SoC) and consist of reconfigurable coprocessors for executing nested program loops whose bodies are expressions of operations performed in a functional unit array in parallel. The data arrays are accessed from one or more system inputs and from an embedded memory array in parallel. The processed data arrays are sent back to the memory array or to system outputs. The architectures enable the acceleration of nested loops compared to execution on a standard processor, where only one operation or datum access can be performed at a time. The invention can be used in a number of applications especially those which involve digital signal processing, such as multimedia and communications. The architectures are used preferably in conjunction with von Neumann processors which are better at implementing control flow.Type: ApplicationFiled: October 8, 2008Publication date: April 30, 2009Inventors: Jose Teixeira De Sousa, Victor Manuel Goncalves Martins, Nuno Calado Correia Lourenco, Alexandre Miguel Dias Santos, Nelson Goncalo Do Rosario Ribeiro
-
Publication number: 20090113171Abstract: In one embodiment, a computer system comprises at least a first computing cell and a second computing cell, each computing cell comprising at least one processor, at least one programmable trusted platform management device coupled to the processor via a hardware path which goes through at least one trusted platform management device controller which manages operations of the at least one programmable trusted platform device, and a routing device to couple the first and second computing cells.Type: ApplicationFiled: October 26, 2007Publication date: April 30, 2009Inventor: Russ W. Herrell
-
Publication number: 20090113211Abstract: A processing unit includes a processing core and a wireless module directly connected to the processing core, wherein the wireless module is for providing wireless communications to the processing core. A multi-processor system includes a first processing unit having a first processing core and a first wireless module directly connected to the first processing core, the first wireless module for providing wireless communications to the first processing core; a second processing unit having a second processing core and a second wireless module directly connected to the second processing core, the second wireless module for providing wireless communications to the second processing core; and a wireless link between the first and second wireless modules; wherein the first processing unit is for communicating with the second processing unit via the wireless link.Type: ApplicationFiled: October 31, 2007Publication date: April 30, 2009Inventors: Chun-Hung Liu, Jyh-Ming Lin, Min-Chih Hsuan
-
Publication number: 20090113172Abstract: A system and method for interconnecting a plurality of processing element nodes within a scalable multiprocessor system is provided. Each processing element node includes at least one processor and memory. A scalable interconnect network includes physical communication links interconnecting the processing element nodes in a cluster. A first set of routers in the scalable interconnect network route messages between the plurality of processing element nodes. One or more metarouters in the scalable interconnect network route messages between the first set of routers so that each one of the routers in a first cluster is connected to all other clusters through one or more metarouters.Type: ApplicationFiled: May 16, 2008Publication date: April 30, 2009Inventors: Martin M. Deneroff, Gregory M. Thorson, Randal S. Passint
-
Patent number: 7526456Abstract: A method of operating a Linear Complementarity Problem (LCP) solver is disclosed, where the LCP solver is characterized by multiple execution units operating in parallel to implement a competent computational method adapted to resolve physics-based LCPs in real-time.Type: GrantFiled: March 8, 2004Date of Patent: April 28, 2009Assignee: NVIDIA CorporationInventors: Lihua Zhang, Richard Tonge, Dilip Sequeira, Monier Maher
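The abstract does not name its "competent computational method," but projected Gauss-Seidel is one common iterative approach to physics-style LCPs and illustrates the per-row work that parallel execution units could divide. The sketch below is illustrative only; the function name and structure are mine, not the patent's.

```python
import numpy as np

def pgs_lcp(M, q, iters=100):
    """Projected Gauss-Seidel for the LCP: find z >= 0 with
    w = M @ z + q >= 0 and z . w = 0 (complementarity).

    Illustrative only; not the method claimed by the patent.
    """
    n = len(q)
    z = np.zeros(n)
    for _ in range(iters):
        for i in range(n):
            # Residual of row i excluding the diagonal term.
            r = q[i] + M[i] @ z - M[i, i] * z[i]
            # Exact row solve, projected onto z[i] >= 0.
            z[i] = max(0.0, -r / M[i, i])
    return z
```

Real-time physics engines typically run a fixed, small number of sweeps per frame rather than iterating to convergence.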
-
Publication number: 20090106529Abstract: A multiprocessor computer system comprises a flattened butterfly processor interconnect network, derived from a conventional butterfly network by flattening the routers in each row into a single router per row and eliminating the channels entirely local to that row.Type: ApplicationFiled: August 20, 2008Publication date: April 23, 2009Inventors: Dennis C. Abts, John Kim, William J. Dally
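As a rough illustration of the router savings described in the abstract, a sketch assuming the textbook k-ary n-fly butterfly construction (n stages of k**(n-1) radix-k routers); the function name and parameterization are mine:

```python
def butterfly_router_counts(k, n):
    """Router counts for a k-ary n-fly butterfly before and after flattening.

    Assumption: conventional network has n stages of k**(n-1) routers each;
    flattening merges the n routers in each row into a single router.
    """
    conventional = n * k ** (n - 1)  # n stages, k**(n-1) routers per stage
    flattened = k ** (n - 1)         # one (higher-radix) router per row
    return conventional, flattened
```

Flattening trades router count for router radix: each merged router must absorb the ports of the n routers it replaces.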
-
Publication number: 20090094437Abstract: The present invention provides a method and a device for controlling a multicore processor by selecting and operating the appropriate number of cores corresponding to an operation state of the processor. In a multicore processor having a plurality of cores, each independently performing calculation processes on one processor, an operating rate for the threads or tasks on each core is calculated by summing their operating times (or the number of operating events) within a predetermined time, and an overall operating rate is found by summing the per-core operating rates. The number of operating cores corresponding to the overall operating rate is determined by a previously set table. The selection has a hysteresis characteristic: the number of operating cores differs between times of increasing and decreasing overall operating rate. The cores to operate are then selected according to the previously set table.Type: ApplicationFiled: August 1, 2008Publication date: April 9, 2009Inventor: Masahiro Fukuda
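The table-driven selection with hysteresis can be sketched as follows; the threshold values are invented for illustration and are not taken from the publication:

```python
# Separate tables for rising and falling overall operating rate give the
# hysteresis: a core is only powered down once the rate falls well below
# the level that powered it up. Thresholds are illustrative assumptions.
RISING_TABLE  = [(0.75, 4), (0.50, 3), (0.25, 2), (0.0, 1)]
FALLING_TABLE = [(0.65, 4), (0.40, 3), (0.15, 2), (0.0, 1)]

def select_core_count(overall_rate, rate_is_rising):
    """Look up the number of operating cores from a preset table."""
    table = RISING_TABLE if rate_is_rising else FALLING_TABLE
    for threshold, cores in table:
        if overall_rate >= threshold:
            return cores
    return 1
```

At an overall rate of 0.7 this sketch runs 3 cores while load is rising but keeps 4 cores while load is falling, which prevents rapid on/off cycling around a single threshold.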
-
Publication number: 20090094594Abstract: Blade-based systems and methods are provided that support a plurality of application-specific functions associated with data processing, communication and/or storage. Exemplary embodiments include a chassis for receipt of a plurality of blades. The blades are programmed/loaded with application-specific software, e.g., wireless communication software, that facilitates data-related operations. The chassis may also contain cooling vents, power supply modules and/or circuitry, and a backplane for requisite communications. Additional structural features and components may include mounting brackets, cooling/exhaust fans and detachable front/rear faces to facilitate mounting and/or service of associated components.Type: ApplicationFiled: October 3, 2007Publication date: April 9, 2009Applicant: ORTRONICS, INC.Inventor: Anthony B. Walker
-
Patent number: 7516301Abstract: Heterogeneous processors can cooperate for distributed processing tasks in a multiprocessor computing system. Each processor is operable in a “compatible” mode, in which all processors within a family accept the same baseline command set and produce identical results upon executing any command in the baseline command set. The processors also have a “native” mode of operation in which the command set and/or results may differ in at least some respects from the baseline command set and results. Heterogeneous processors with a compatible mode defined by reference to the same baseline can be used cooperatively for distributed processing by configuring each processor to operate in the compatible mode.Type: GrantFiled: December 16, 2005Date of Patent: April 7, 2009Assignee: Nvidia CorporationInventors: Henry Packard Moreton, Abraham B. de Waal
-
Publication number: 20090083515Abstract: In an embodiment, the present invention discloses a flexible and reconfigurable architecture with efficient memory data management, efficient data transfer, and relief of data-transfer congestion in an integrated circuit. In an embodiment, the output of a first functional component is stored to an input memory of a next functional component, so that when the first functional component completes its processing, its output is immediately available as input to the next functional component. In an embodiment, the memory device further comprises a partition mechanism for simultaneously accepting output writes from the first functional component and input reads from the second functional component. In another embodiment, the integrated circuit comprises at least two functional components and at least two memory devices, together with a controller for switching the connections between the functional components and the memory devices.Type: ApplicationFiled: June 29, 2008Publication date: March 26, 2009Inventors: Hirak Mitra, Raj Kulkarni, Richard Wicks, Michael Moon
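The simultaneous write/read partitioning reads like classic double buffering; a minimal sketch of that pattern (the class name and API are mine, not the patent's):

```python
class PingPongMemory:
    """Two-bank memory: the producer writes one bank while the consumer
    reads the other; swap() flips the roles at a stage boundary."""
    def __init__(self, size):
        self.banks = [[0] * size, [0] * size]
        self.write_bank = 0          # bank currently owned by the producer
    def write(self, addr, value):    # producer side
        self.banks[self.write_bank][addr] = value
    def read(self, addr):            # consumer side sees the other bank
        return self.banks[1 - self.write_bank][addr]
    def swap(self):                  # first component done; hand off output
        self.write_bank = 1 - self.write_bank
```

Because reads and writes hit different banks, the producing and consuming functional components never contend for the same memory port within a stage.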
-
Publication number: 20090083516Abstract: In a media server for processing data packets, media server functions are implemented by a plurality of modules categorized by real-time response requirements.Type: ApplicationFiled: September 25, 2008Publication date: March 26, 2009Applicant: RADISYS CANADA, INC.Inventors: Adnan Saleem, Alvin Chubbs, Neil Gunn, James Davidson
-
Patent number: 7503046Abstract: A method of determining an interleave pattern for n lots of A and y lots of B, when n plus y equals a power of two such that the expression 2^z - n may be used to represent the value of y, includes generating a key comprising the reverse bit order of a serially indexed count from 0 to 2^z - 1. An interleave pattern can be generated from the key in which all values less than n are replaced by A and all other values are replaced by B. The key can be used to generate a table that contains all possible combinations of values of A and B. The table can then be stored such that an interleave pattern can be automatically selected based on either the number of lots of A or the number of lots of B.Type: GrantFiled: October 20, 2003Date of Patent: March 10, 2009Assignee: Micron Technology, Inc.Inventor: Mark Beaumont
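The bit-reversed-count construction in the abstract can be sketched directly; the function names are mine:

```python
def bit_reverse(value, bits):
    """Reverse the low `bits` bits of `value`."""
    out = 0
    for _ in range(bits):
        out = (out << 1) | (value & 1)
        value >>= 1
    return out

def interleave_pattern(n, z):
    """Interleave n lots of A with 2**z - n lots of B, per the abstract:
    bit-reverse a serial count 0..2**z - 1, then map key values < n to A
    and all other key values to B."""
    key = [bit_reverse(i, z) for i in range(2 ** z)]
    return ['A' if k < n else 'B' for k in key]
```

For example, interleave_pattern(3, 3) yields ['A', 'B', 'A', 'B', 'A', 'B', 'B', 'B']: bit reversal spreads the three A lots roughly evenly across the eight slots.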
-
Publication number: 20090063812Abstract: A processor includes a CPU capable of performing predetermined arithmetic processing, a memory accessible by the CPU, and a data transfer unit capable of controlling data transfer with the memory on behalf of the CPU. The data transfer unit is provided with a command chain unit for performing continuous data transfer by executing a preset command chain, and a retry controller for executing retry processing if a transfer error occurs during data transfer by the command chain unit. The data transfer unit reports any command involved in a transfer error to the CPU only after execution of the command chain completes, thereby reducing the number of interrupts for error processing and improving system performance.Type: ApplicationFiled: July 14, 2008Publication date: March 5, 2009Inventor: Takashi TODAKA
-
Publication number: 20090055625Abstract: Methods and systems for parallel computation of an algorithm using a plurality of nodes configured as a Howard Cascade. A home node of a Howard Cascade receives a request from a host system to compute an algorithm identified in the request. The request is distributed to the processing nodes of the Howard Cascade in time-sequence order so as to minimize the time required to expand the Howard Cascade. The participating nodes then perform their designated portions of the algorithm in parallel. Partial results from each node are agglomerated upstream to higher nodes of the structure and then returned to the host system. Each node includes a library of stored algorithms accompanied by data template information defining how the data used in the algorithm is partitioned among the participating nodes.Type: ApplicationFiled: August 25, 2008Publication date: February 26, 2009Inventors: Kevin David Howard, Glen Curtis Rea, Nick Wade Robertson, Silva Chang
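A Howard Cascade minimizes expansion time by having every node that already holds the request (including the home node) forward it to one new node per time step, so the number of recruited processing nodes grows as 2^t - 1. A sketch under that single-channel assumption:

```python
def cascade_nodes(time_steps):
    """Processing nodes recruited after `time_steps` distribution steps in a
    single-channel Howard Cascade (assumption: every informed node, plus the
    home node, forwards the request to one new node each step)."""
    nodes = 0
    for _ in range(time_steps):
        nodes = 2 * nodes + 1   # after step t there are 2**t - 1 nodes
    return nodes
```

This exponential fan-out is why distributing the request in strict time-sequence order matters: every already-informed node stays busy forwarding during every step of the expansion.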