Interface Patents (Class 712/29)
-
Patent number: 8072623
Abstract: An image processing apparatus is disclosed that includes an image processing unit section and an information processing unit section. The image processing unit section includes an image scanner that performs an image processing function and an SDK application that expands and controls the function of the image processing apparatus. The information processing unit section includes an operations panel that selectively performs operations between a basic application and the SDK application and an MFP service that transmits an instruction signal to the SDK application so as to control the image scanner in accordance with the operation on the operations panel. The information processing unit section confirms the corresponding relationship between the MFP service and the SDK application when the image processing apparatus performs a starting process and makes the SDK application correspond to the MFP service in accordance with the confirmation results.
Type: Grant
Filed: January 29, 2008
Date of Patent: December 6, 2011
Assignee: Ricoh Company, Ltd.
Inventor: Hideyoshi Ooshio
-
Publication number: 20110296138
Abstract: A method, system, and computer usable program product for fast remote communication and computation between processors are provided in the illustrative embodiments. A direct core to core communication unit (DCC) is configured to operate with a first processor, the first processor being a remote processor. A memory associated with the DCC receives a set of bytes, the set of bytes being sent from a second processor. An operation specified in the set of bytes is executed at the remote processor such that the operation is invoked without causing a software thread to execute.
Type: Application
Filed: May 27, 2010
Publication date: December 1, 2011
Applicant: International Business Machines Corporation
Inventors: JOHN BRUCE CARTER, ELMOOTAZBELLAH NABIL ELNOZAHY, AHMED GHEITH, ERIC VAN HANSBERGEN, KARTHICK RAJAMANI, WILLIAM EVAN SPEIGHT, LIXIN ZHANG
-
Publication number: 20110271078
Abstract: A processor structure of integrated circuit is provided. The processor structure comprises at least one processor capable of configuring an operation component and at least one processor capable of configuring a storage component. The processor capable of configuring an operation component or the processor capable of configuring a storage component cascades the processor capable of configuring an operation component and the processor capable of configuring a storage component. The processor capable of configuring an operation component includes a first arithmetic data control component and at least one operation component, and the first arithmetic data control component executes a configuration instruction to configure the operation function of the operation component.
Type: Application
Filed: December 15, 2008
Publication date: November 3, 2011
Applicant: PEKING UNIVERSITY SHENZHEN GRADUATE SCHOOL
Inventors: Peng Dai, Ziyi Hu, Xinan Wang, Xing Zhang
-
Patent number: 8035337
Abstract: A power hub apparatus and an associated method of formation, and a system. The power hub apparatus includes a central power hub that encompasses M+1 tiers sequenced in a vertical direction (M>1). The central power hub includes a central area to which N radial arms are connected. Each pair of adjacent radial arms defines a docking bay in each tier such that N docking bays are defined in each tier (N>3). Each docking bay in each tier is vertically aligned directly above a corresponding docking bay in a directly lower tier. Irregular shaped modules are latched in each docking bay in each tier. Each module provides a functionality for responding to an alert pertaining to an event. The central area includes rechargeable batteries that provide electrical power for the latched modules in each tier. The system comprises a micro grid apparatus covered by a solar power skin.
Type: Grant
Filed: March 11, 2011
Date of Patent: October 11, 2011
Assignee: International Business Machines Corporation
Inventor: Ian Edward Oakenfull
-
Publication number: 20110246747
Abstract: A reconfigurable circuit includes a data execution unit including a plurality of execution elements, each of which performs execution with respect to plural data upon the plural data being all in a valid state, and holds valid-state output data indicative of a result of the execution at an output node while all the plural data are in the valid state, a data selecting unit configured to connect between the execution elements in a reconfigurable manner, and a data input unit configured to supply input data to a series of execution elements to perform a series of executions, wherein a valid or invalid state of given data is specified by a valid signal accompanying and forming a pair with the given data, and the input data supplied from the data input unit to the data execution unit are fixed to valid-state constant data while the series of executions are performed.
Type: Application
Filed: March 28, 2011
Publication date: October 6, 2011
Applicant: FUJITSU SEMICONDUCTOR LIMITED
Inventors: Takashi HANAI, Kiyomitsu Katou, Takahiro Kubota, Junji Sahoda, Ichiro Kasama, Kyoji Sato, Shinichi Sutou
-
Publication number: 20110246746
Abstract: Various embodiments include apparatuses, stacked devices and methods of forming dice stacks on an interface die. In one such apparatus, a dice stack includes at least a first die and a second die, and conductive paths coupling the first die and the second die to the common control die. In some embodiments, the conductive paths may be arranged to connect with circuitry on alternating dice of the stack. In other embodiments, a plurality of dice stacks may be arranged on a single interface die, and some or all of the dice may have interleaving conductive paths.
Type: Application
Filed: March 30, 2010
Publication date: October 6, 2011
Inventors: Brent Keeth, Christopher K. Morzano
-
Publication number: 20110219208
Abstract: A Multi-Petascale Highly Efficient Parallel Supercomputer of 100 petaOPS-scale computing, at decreased cost, power and footprint, and that allows for a maximum packaging density of processing nodes from an interconnect point of view. The Supercomputer exploits technological advances in VLSI that enable a computing model where many processors can be integrated into a single Application Specific Integrated Circuit (ASIC).
Type: Application
Filed: January 10, 2011
Publication date: September 8, 2011
Applicant: International Business Machines Corporation
Inventors: Sameh Asaad, Ralph E. Bellofatto, Michael A. Blocksome, Matthias A. Blumrich, Peter Boyle, Jose R. Brunheroto, Dong Chen, Chen-Yong Cher, George L. Chiu, Norman Christ, Paul W. Coteus, Kristan D. Davis, Gabor J. Dozsa, Alexandre E. Eichenberger, Noel A. Eisley, Matthew R. Ellavsky, Kahn C. Evans, Bruce M. Fleischer, Thomas W. Fox, Alan Gara, Mark E. Giampapa, Thomas M. Gooding, Michael K. Gschwind, John A. Gunnels, Shawn A. Hall, Rudolf A. Haring, Philip Heidelberger, Todd A. Inglett, Brant L. Knudson, Gerard V. Kopcsay, Sameer Kumar, Amith R. Mamidala, James A. Marcella, Mark G. Megerian, Douglas R. Miller, Samuel J. Miller, Adam J. Muff, Michael B. Mundy, John K. O'Brien, Kathryn M. O'Brien, Martin Ohmacht, Jeffrey J. Parker, Ruth J. Poole, Joseph D. Ratterman, Valentina Salapura, David L. Satterfield, Robert M. Senger, Brian Smith, Burkhard Steinmacher-Burow, William M. Stockdell, Craig B. Stunkel, Krishnan Sugavanam, Yutaka Sugawara, Todd E. Takken, Barry M. Trager, James L. Van Oosten, Charles D. Wait, Robert E. Walkup, Alfred T. Watson, Robert W. Wisniewski, Peng Wu
-
Publication number: 20110213949
Abstract: Various methods and apparatus are described for communicating transactions between one or more initiator IP cores and one or more target IP cores coupled to an interconnect. Tag logic may be located within the interconnect, such as located in an agent, and configured to assign different interconnect tag identification numbers to two or more transactions from a same thread. The tag logic assigns different interconnect tag identification numbers to allow the two or more transactions from the same thread to be outstanding over the interconnect to two or more different target IP cores at the same time, allow the two or more transactions from the same thread to be processed in parallel over the interconnect, and potentially serviced out of issue order while being returned back to the multiple threaded initiator IP core realigned in expected execution order.
Type: Application
Filed: March 1, 2010
Publication date: September 1, 2011
Applicant: SONICS, INC.
Inventors: Doddaballapur N. Jayasimha, Luc Hoa Ton, Drew E. Wingard
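
The tagging scheme in this abstract can be illustrated with a small software model. This is only a sketch under assumptions, not the claimed hardware: the class name `TagLogic`, its methods, and the fixed pool of tag IDs are all hypothetical. It shows distinct tags letting several transactions from one thread be outstanding at once, with responses accepted in any order and handed back in issue order.

```python
from collections import deque


class TagLogic:
    """Hypothetical model: per-thread tag assignment with in-order return."""

    def __init__(self, num_tags):
        self.free_tags = deque(range(num_tags))  # unused interconnect tag IDs
        self.issue_order = deque()               # tags in the order they were issued
        self.completed = {}                      # tag -> response, may arrive out of order

    def issue(self):
        """Assign a distinct tag so several transactions can be outstanding at once."""
        if not self.free_tags:
            raise RuntimeError("no free interconnect tags")
        tag = self.free_tags.popleft()
        self.issue_order.append(tag)
        return tag

    def complete(self, tag, response):
        """Record a response from a target; targets may answer in any order."""
        self.completed[tag] = response

    def drain(self):
        """Return responses in issue order, stopping at the first one still pending."""
        ready = []
        while self.issue_order and self.issue_order[0] in self.completed:
            tag = self.issue_order.popleft()
            ready.append(self.completed.pop(tag))
            self.free_tags.append(tag)  # tag can be reused by a later transaction
        return ready
```

In this model a response that arrives early simply waits in `completed` until every earlier transaction from the thread has also finished, which is one simple way to realign out-of-order service with the expected execution order.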
-
Patent number: 8010593
Abstract: The present invention provides an adaptive integrated circuit. The various embodiments include a plurality of heterogeneous computational elements coupled to an interconnection network. The plurality of heterogeneous computational elements include corresponding computational elements having fixed and differing architectures, such as fixed architectures for different functions such as memory, addition, multiplication, complex multiplication, subtraction, configuration, reconfiguration, control, input, output, and field programmability. In response to configuration information, the interconnection network is operative in real time to configure and reconfigure the plurality of heterogeneous computational elements for a plurality of different functional modes, including linear algorithmic operations, non-linear algorithmic operations, finite state machine operations, memory operations, and bit-level manipulations.
Type: Grant
Filed: December 17, 2007
Date of Patent: August 30, 2011
Assignee: QST Holdings LLC
Inventors: Paul L. Master, Eugene Hogenauer, Walter James Scheuermann
-
Publication number: 20110197047
Abstract: A definition file included in the present invention includes a plurality of parallel descriptions that respectively define a plurality of parallel processes performed independently. The plurality of parallel descriptions include a first parallel description showing a first parallel process with a plurality of data inputs including at least one data input into which output data of another parallel process is inputted, wherein data with the same latency from input in a parallel processing system are inputted into the plurality of data inputs.
Type: Application
Filed: March 22, 2011
Publication date: August 11, 2011
Applicant: FUJI XEROX CO., LTD.
Inventor: Shimura Hiroshi
-
Publication number: 20110185152
Abstract: A reconfigurable circuit includes a plurality of processing elements and an input/output data interface unit, and the reconfigurable circuit is configured to control connections of the plurality of processing elements for each context. The input/output data interface unit is configured to hold operation input data which is input to the plurality of processing elements and operation output data which is output from the plurality of processing elements. The input/output data interface unit includes a plurality of ports, and a plurality of registers. The registers are configured to be connected to the plurality of ports, and to include m (m being an integer of 2 or more) number of banks in a depth direction.
Type: Application
Filed: December 20, 2010
Publication date: July 28, 2011
Applicant: FUJITSU SEMICONDUCTOR LIMITED
Inventors: Shinichi Sutou, Ichiro Kasama, Kyoji Sato, Takashi Hanai, Kiyomitsu Katou, Takahiro Kubota, Junji Sahoda
-
Patent number: 7984267
Abstract: Executing a service program for an accelerator application program in a hybrid computing environment that includes a host computer and an accelerator, the host computer and the accelerator adapted to one another for data communications by a system level message passing module; where the service program includes a host portion and an accelerator portion and executing a service program for an accelerator includes receiving, from the host portion, operating information for the accelerator portion; starting the accelerator portion on the accelerator; providing, to the accelerator portion, operating information for the accelerator application program; establishing direct data communications between the host portion and the accelerator portion; and, responsive to an instruction communicated directly from the host portion, executing the accelerator application program.
Type: Grant
Filed: September 4, 2008
Date of Patent: July 19, 2011
Assignee: International Business Machines Corporation
Inventors: Michael E. Aho, Ricardo M. Matinata, Amir F. Sanjar, Gordon G. Stewart, Cornell G. Wright, Jr.
-
Publication number: 20110173366
Abstract: A plurality of processing cores and a central storage unit having at least memory are connected in a daisy chain manner, forming a daisy chain ring layout on an integrated chip. At least one of the plurality of processing cores places trace data on the daisy chain connection for transmitting the trace data to the central storage unit, and the central storage unit detects the trace data and stores the trace data in the memory co-located with the central storage unit.
Type: Application
Filed: January 8, 2010
Publication date: July 14, 2011
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: David L. Satterfield, James C. Sexton
-
Publication number: 20110173416
Abstract: A data processing device in which parallel processing elements can efficiently perform processing is provided. A parallel processing module includes plural processing elements, banks A and B provided to correspond to the processing elements and used to store data to be used when the processing elements perform processing, and an I/O bank provided to correspond to the processing elements and used to transfer data to and from an external memory. A first selector circuit selectively couples bank B or the I/O bank to the processing elements. A second selector circuit selectively couples the external memory or the processing elements to the I/O bank. Thus, data can be transferred from the external memory to the I/O bank concurrently with the processing performed by the processing elements. The processing elements can therefore perform processing efficiently.
Type: Application
Filed: January 5, 2011
Publication date: July 14, 2011
Inventors: Hideyuki NODA, Takeaki Sugimura
-
Publication number: 20110173415
Abstract: According to one embodiment, each of routers includes: a cache mechanism that stores data transferred to the other routers or processor elements; and a unit that reads out, when an access generated from each of the processor elements is transferred thereto, if target data of the access is stored in the cache mechanism, the data from the cache mechanism and transmits the data to the processor element as a request source.
Type: Application
Filed: September 2, 2010
Publication date: July 14, 2011
Applicant: KABUSHIKI KAISHA TOSHIBA
Inventors: Jun Tanabe, Hiroyuki Usui
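
The router-side caching described in this abstract can be sketched in a few lines. The class `CachingRouter`, the `next_hop` callable, and the returned source label are illustrative assumptions rather than anything named in the filing; replacement policy and coherence are deliberately left out.

```python
class CachingRouter:
    """Hypothetical router that keeps a copy of data it has forwarded."""

    def __init__(self, next_hop):
        self.next_hop = next_hop  # callable(address) -> data held farther along the network
        self.cache = {}           # address -> data previously transferred through this router

    def read(self, address):
        """Serve the access from the local cache if possible, else forward it."""
        if address in self.cache:
            return self.cache[address], "cache"
        data = self.next_hop(address)
        self.cache[address] = data  # remember data that passed through in transit
        return data, "forwarded"
```

A second read of the same address from any processor element attached to this router is answered locally, so the request never has to travel on to the memory or to the other routers.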
-
Patent number: 7979645
Abstract: A memory mapping unit requests allocation of a remote memory to memory mapping units of other processor nodes via a second communication unit, and requests creation of a mapping connection to a memory-mapping managing unit of a first processor node via the second communication unit. The memory-mapping managing unit creates the mapping connection between a processor node and other processor nodes according to a connection creation request from the memory mapping unit, and then transmits a memory mapping instruction for instructing execution of a memory mapping to the memory mapping unit via a first communication unit of the first processor node.
Type: Grant
Filed: September 10, 2008
Date of Patent: July 12, 2011
Assignee: Ricoh Company, Limited
Inventor: Hiroomi Motohashi
-
Publication number: 20110161626
Abstract: Techniques for packet routing in an on-chip network are provided. In one embodiment, a method for routing packets in a multi-core processor including multiple cores connected by an on-chip network includes identifying ports that are incorrect while routing the packet. After receiving the packet at an input port, some of the ports are excluded from consideration while selecting the output port for the packet. The output port is selected from the remaining ports and the packet is routed to the selected output port.
Type: Application
Filed: December 28, 2009
Publication date: June 30, 2011
Applicant: EMPIRE TECHNOLOGY DEVELOPMENT LLC
Inventor: William H. Mangione-Smith
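
One way to picture the port-exclusion step is the sketch below. The abstract does not say how ports are judged incorrect or how the final port is chosen, so everything specific here is an assumption: a 2D mesh with `(x, y)` coordinates, "north" meaning increasing `y`, exclusion of the input port and of ports that lead away from the destination, and a least-loaded tie-break. The function name `select_output_port` is hypothetical.

```python
def select_output_port(current, destination, input_port, load):
    """Pick an output port after excluding ports that cannot be correct.

    current, destination: (x, y) mesh coordinates (assumed topology).
    input_port: port the packet arrived on.
    load: dict mapping port name -> current queue depth (assumed metric).
    """
    x, y = current
    dest_x, dest_y = destination
    candidates = {"east", "west", "north", "south"}

    candidates.discard(input_port)   # never send the packet back where it came from
    if dest_x <= x:
        candidates.discard("east")   # east would move away from the destination
    if dest_x >= x:
        candidates.discard("west")
    if dest_y <= y:
        candidates.discard("north")
    if dest_y >= y:
        candidates.discard("south")

    if not candidates:
        return None                  # already at the destination, or no productive port left
    # Choose the least-loaded remaining port; sorting makes ties deterministic.
    return min(sorted(candidates), key=lambda port: load[port])
```

Excluding ports first keeps the selection step small: only ports that can still make progress are compared against each other.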
-
Patent number: 7961226
Abstract: The present invention provides a digital imaging apparatus having an optical sensor, an analog-to-digital converter, a plurality of computational elements, and an interconnection network. The optical sensor converts an object image into a detected image, which is then converted to digital image information by the analog-to-digital converter. The plurality of computational elements includes a first computational element having a first fixed architecture and a second computational element having a second, different fixed architecture. The interconnection network is capable of providing a processed digital image from the digital image information by configuring and reconfiguring the plurality of computational elements for performance of a plurality of different imaging functions. The invention may be embodied, for example, as a digital camera, a scanner, a printer, or a dry copier.
Type: Grant
Filed: September 15, 2009
Date of Patent: June 14, 2011
Assignee: QST Holdings, Inc.
Inventors: Paul L. Master, John Watson
-
Publication number: 20110126133
Abstract: An interface for a multi-processor gateway apparatus and method for using the same. A user device communicates with a multi-processor gateway apparatus over a wired or wireless path. A first processor within the multi-processor gateway apparatus provides the user device a user interface. The user interface allows the user to select a function that is managed by one of the multiple processors. If the selected function is assigned to the first processor, the function is performed by the first processor. However, if the selected function is performed by one of the other processors, the first processor executes calls to an API layer associated with the processor assigned to perform the requested function. The requested function is performed by the processor to which it is assigned and the results reported to the first processor. The first processor then provides the results of the request to the user device via the path.
Type: Application
Filed: November 20, 2009
Publication date: May 26, 2011
Inventor: Jeffrey Paul Markley
-
Publication number: 20110107337
Abstract: A reconfigurable hierarchical computer architecture having N levels, where N is an integer value greater than one, wherein said N levels include a first level including a first computation block including a first data input, a first data output and a plurality of computing nodes interconnected by a first connecting mechanism, each computing node including an input port, a functional unit and an output port, the first connecting mechanism capable of connecting each output port to the input port of each other computing node; and a second level including a second computation block including a second data input, a second data output and a plurality of the first computation blocks interconnected by a second connecting means for selectively connecting the first data output of each of the first computation blocks and the second data input to each of the first data inputs and for selectively connecting each of the first data outputs to the second data output.
Type: Application
Filed: December 22, 2006
Publication date: May 5, 2011
Applicant: STMicroelectronics S. A.
Inventor: Joel Cambonie
-
Patent number: 7930519
Abstract: A processor unit and a coprocessor unit are disclosed. In one embodiment, the processor unit includes a functional unit that receives a set of instructions in an instruction stream and provides the set of instructions to the coprocessor unit. The coprocessor executes the instructions and initiates transmission of a set of execution results corresponding to the set of instructions to the processor unit's functional unit. The processor functional unit may be coupled to the coprocessor unit through a shared bus circuit implementing a packet-based protocol. The processor unit and the coprocessor unit may share a coherent view of system memory. In various embodiments, the functional unit may alter entries in a translation lookaside buffer (TLB) located in the coprocessor unit, resume and suspend a thread executing on the coprocessor unit, etc.
Type: Grant
Filed: December 17, 2008
Date of Patent: April 19, 2011
Assignee: Advanced Micro Devices, Inc.
Inventor: Michael Frank
-
Publication number: 20110072239
Abstract: Methods, procedures, apparatuses, computer programs, computer-accessible mediums, processing arrangements and systems generally related to data multi-casting in a distributed processor architecture are described. Various implementations may include identifying a plurality of target instructions that are configured to receive a first message from a source; providing target routing instructions to the first message for each of the target instructions including selected information commonly shared by the target instructions; and, when two of the identified target instructions are located in different directions from one another relative to a router, replicating the first message and routing the replicated messages to each of the identified target instructions in the different directions.
Type: Application
Filed: September 18, 2009
Publication date: March 24, 2011
Applicant: Board of Regents, University of Texas System
Inventors: Doug Burger, Stephen W. Keckler, Dong Li
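
The replication rule in this abstract (copy the message only where targets lie in different directions from a router) can be sketched as follows. The 2D grid coordinates, the X-before-Y direction rule, and the function names `direction` and `replicate` are assumptions made for illustration; the abstract itself speaks of target instructions and does not fix a topology.

```python
from collections import defaultdict


def direction(router, target):
    """Assumed X-before-Y rule for which way a target lies from a router."""
    (rx, ry), (tx, ty) = router, target
    if tx > rx:
        return "east"
    if tx < rx:
        return "west"
    if ty > ry:
        return "north"
    if ty < ry:
        return "south"
    return "local"


def replicate(router, message, targets):
    """Group targets by direction and emit one copy of the message per direction."""
    groups = defaultdict(list)
    for target in targets:
        groups[direction(router, target)].append(target)
    # Targets sharing a direction share a single copy; a split happens only
    # when the targets diverge at this router.
    return {d: (message, group) for d, group in groups.items()}
```

With targets at `(2, 1)`, `(3, 1)` and `(0, 1)` seen from a router at `(1, 1)`, two copies leave the router: one heading east that still carries both eastern targets, and one heading west.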
-
Patent number: 7904696
Abstract: In one embodiment, the present invention includes a method for communicating an assertion signal from a first instruction sequencer to a plurality of accelerators coupled to the first instruction sequencer via a dedicated interconnect, detecting the assertion signal in the accelerators and communicating a request for a lock on a second interconnect coupled to the first instruction sequencer and the accelerators, and registering an accelerator that achieves the lock by communication of a registration message for the accelerator to the first instruction sequencer via the second interconnect. Other embodiments are described and claimed.
Type: Grant
Filed: September 14, 2007
Date of Patent: March 8, 2011
Assignee: Intel Corporation
Inventors: Perry Wang, Jamison Collins, Hong Wang
-
Publication number: 20110055518
Abstract: The different advantageous embodiments provide a system for partitioning a data processing system comprising a number of cores and a partitioning process. The partitioning process is configured to assign a number of partitions to the number of cores. Each partition in the number of partitions is assigned to a separate number of cores from the number of cores.
Type: Application
Filed: August 27, 2009
Publication date: March 3, 2011
Applicant: The Boeing Company
Inventors: Jonathan N. Hotra, Kenn R. Luecke
-
Publication number: 20110055520
Abstract: Systems, methods and apparatus for a scalable quantum processor architecture. A quantum processor is locally programmable by providing a memory register with a signal embodying device control parameter(s), converting the signal to an analog signal, and administering the analog signal to one or more programmable devices.
Type: Application
Filed: November 11, 2010
Publication date: March 3, 2011
Inventors: Andrew J. Berkley, Paul I. Bunyk, Geordie Rose
-
Publication number: 20110055519
Abstract: A stream processing computer architecture includes creating a stream computer processing (SCP) system by forming a super node cluster of processors representing physical computation nodes (“nodes”), communicatively coupling the processors via a local interconnection means (“interconnect”), and communicatively coupling the cluster to an optical circuit switch (OCS), via optical external links (“links”). The OCS is communicatively coupled to another cluster of processors via the links. The method also includes generating a stream computation graph including kernels and data streams, and mapping the graph to the SCP system, which includes assigning the kernels to the clusters and respective nodes, assigning data stream traffic between the kernels to the interconnection when the data stream is between nodes in the same cluster, and assigning traffic between the kernels to the links when the data stream is between nodes in different clusters.
Type: Application
Filed: November 9, 2010
Publication date: March 3, 2011
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Eugen Schenfeld, Thomas B. Smith, III
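
The traffic-assignment rule at the end of this abstract reduces to a comparison of cluster IDs once kernels have been placed. The sketch below assumes a placement is already given as a dict from kernel name to `(cluster, node)`; the function name `map_streams`, the route labels, and the `"local"` case for two kernels on the same node are assumptions, since the abstract only covers streams between nodes.

```python
def map_streams(placement, streams):
    """Assign each data stream to the local interconnect or to the optical links.

    placement: dict kernel -> (cluster_id, node_id), assumed to be decided already.
    streams: iterable of (source_kernel, destination_kernel) pairs.
    """
    routes = {}
    for src, dst in streams:
        src_cluster, src_node = placement[src]
        dst_cluster, dst_node = placement[dst]
        if src_cluster != dst_cluster:
            routes[(src, dst)] = "optical_link"   # crosses clusters via the OCS
        elif src_node != dst_node:
            routes[(src, dst)] = "interconnect"   # stays inside one cluster
        else:
            routes[(src, dst)] = "local"          # same node: assumption, not in the abstract
    return routes
```

A placement that keeps heavily communicating kernels in the same cluster therefore keeps their traffic off the optical links, which is presumably the point of mapping the graph before running it.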
-
Publication number: 20110047353
Abstract: A device (1) including a reconfigurable section comprises a plurality of PEs (17) laid out having been divided into a plurality of segments and a command transmitting system (50) for transmitting commands to each PE (17). The command transmitting system (50) includes: a transmission command register (53) that is separately provided in each segment; a first level command transmitting matrix (51) for connecting the transmission command register (53) and PEs (17) in each segment with a delay of one clock; and a second level command transmitting matrix (52) for connecting the transmission command registers (53) of the plurality of segments and a command outputting unit (59) that outputs commands.
Type: Application
Filed: January 29, 2009
Publication date: February 24, 2011
Applicant: FUJI XEROX CO., LTD.
Inventor: Hiroyuki Matsuno
-
Publication number: 20110047352
Abstract: A data processing system includes at least a first through third processing nodes coupled by an interconnect fabric. The first processing node includes a master, a plurality of snoopers capable of participating in interconnect operations, and a node interface that receives a request of the master and transmits the request of the master to the second processing unit with a nodal scope of transmission limited to the second processing node. The second processing node includes a node interface having a directory. The node interface of the second processing node permits the request to proceed with the nodal scope of transmission if the directory does not indicate that a target memory block of the request is cached other than in the second processing node and prevents the request from succeeding if the directory indicates that the target memory block of the request is cached other than in the second processing node.
Type: Application
Filed: August 21, 2009
Publication date: February 24, 2011
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Paul A. Ganfield, Guy L. Guthrie, David J. Krolak, Michael S. Siegel, William J. Starke, Jeffrey A. Stuecheli, Derek E. Williams
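
The directory check performed by the second node's interface can be modelled as a set lookup. This is a minimal sketch with hypothetical names (`NodeInterface`, `note_remote_copy`, `allow_nodal_request`); it ignores how the directory is kept up to date and what the master does after a refused request.

```python
class NodeInterface:
    """Hypothetical node interface holding a directory of remotely cached blocks."""

    def __init__(self):
        # Directory: memory blocks known to be cached outside this node.
        self.cached_elsewhere = set()

    def note_remote_copy(self, block):
        """Record that some other node has cached this block."""
        self.cached_elsewhere.add(block)

    def allow_nodal_request(self, block):
        """Return True if a request limited to this node's scope may proceed.

        The request is refused when the directory indicates a copy outside
        this node, because a node-limited broadcast would not reach that copy.
        """
        return block not in self.cached_elsewhere
```

The benefit of such a check is that most requests can stay within one node's scope, and only blocks with a copy elsewhere need a wider transmission.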
-
Publication number: 20110041006
Abstract: A system and method for processing a distributed transaction for an application are disclosed. Conventionally transactions on critical data (e.g. financial information) are processed using a database architecture whereby a persistent database (typically a redundant disk array) comprises the master record. In cases where large amounts of data need to be accessed but absolute data integrity is less critical, for example search engines, processing is conducted on live in-memory data without all the data being backed up, which can be much faster but data can be lost when processors fail. There have been attempts to use data grid architectures with some backup to persistent stores for more important data but these have either introduced disk access bottlenecks or required manual intervention in the event of failure.
Type: Application
Filed: March 5, 2010
Publication date: February 17, 2011
Applicant: New Technology/enterprise Limited
Inventor: Matthew Fowler
-
Publication number: 20110035626
Abstract: On a typical motherboard the processor and memory are separated by a printed circuit data bus that traverses the motherboard. Throughput, or data transfer rate, on the data bus is much lower than the rate at which a modern processor can operate. The difference between the data bus throughput and the processor speed significantly limits the effective processing speed of the computer when the processor is required to process large amounts of data stored in the memory. The processor is forced to wait for data to be transferred to or from the memory, leaving the processor under-utilized. The delays are compounded in a distributed computing system including a number of computers operating in parallel. The present disclosure describes systems, method and apparatus that tend to alleviate delays so that memory access bottlenecks are not compounded within distributed computing systems.
Type: Application
Filed: August 6, 2010
Publication date: February 10, 2011
Applicant: ADVANCED PROCESSOR ARCHITECTURES, LLC
Inventors: Louis Edmund Chall, John Bradley Serson, Philip Arnold Roberts, Cecil Eugene Hutchins
-
Patent number: 7877574
Abstract: A first storing unit stores therein a chain indivisibility instruction. A detecting unit detects a change of first data that is distributed in a node computer. A first designating unit designates, when the detecting unit detects the change in the first data, an indivisibility instruction corresponding to the first data from which the change is detected, by referring to the first storing unit. A first executing unit executes the indivisibility instruction designated by the first designating unit.
Type: Grant
Filed: April 26, 2007
Date of Patent: January 25, 2011
Assignee: Fujitsu Limited
Inventor: Nobutaka Imamura
-
Publication number: 20110010525
Abstract: A system and method is shown for on-chip and chip-to-chip routing. The system and method includes a processor element residing on a processor die to process a data packet received at the processor die. The system and method also include a router residing on the processor die to route the data packet received at the processor die. Further, the system and method includes a switch core residing on the processor die to switch a communication channel along which the data packet is to be transmitted. Additionally, the system and method includes a switch core to identify a destination processing element and router (PE/R) module for a data packet, the switch core and the destination PE/R module residing on a common processor die. Moreover, the system and method includes a communication channel to operatively connect the switch core and the destination PE/R module on the common processor die.
Type: Application
Filed: July 10, 2009
Publication date: January 13, 2011
Inventors: Nathan Binkert, Moray McLaren
-
Publication number: 20110010526
Abstract: Nowadays, many architectures have processing units with different bandwidth requirements which are connected over a pipelined ring bus. The proposed invention can optimize the data transfer for the case where processing units with lower bandwidth requirements can be grouped and controlled together for a data transfer, so that the available bus bandwidth can be optimally utilized.
Type: Application
Filed: March 3, 2008
Publication date: January 13, 2011
Inventors: Hanno Lieske, Shorin Kyo
-
Publication number: 20110004880
Abstract: A system and method for managing data, such as in data warehousing, analysis, or similar applications, where dataflow graphs are expressed as reusable map components, at least some of which are selected from a library of components, and map components are assembled to create an integrated dataflow application. Composite map components encapsulate a dataflow pattern using other maps as subcomponents. Ports are used as link points to assemble map components and are hierarchical and composite allowing ports to contain other ports. The dataflow application may be executed in a parallel processing environment by recognizing the linked data processes within the map components and assigning threads to the linked data processes.
Type: Application
Filed: September 14, 2010
Publication date: January 6, 2011
Inventors: Larry Lee Schumacher, Agustin Gonzales-Tuchmann, Laurence Tobin Yogman, Paul C. Dingman
-
Publication number: 20100332795Abstract: A computer system includes a central processing unit, a random-access-memory interface, a random-access memory in which addresses are allocated in an address space of the random-access-memory interface and a reconfigurable arithmetic device whose arithmetic function is capable of being dynamically changed in accordance with configuration data. The reconfigurable arithmetic device includes input terminals, output terminals, a plurality of processor elements that perform individual arithmetic processes in synchronization with a clock, an inter-processor-element network which connects the input terminals and the output terminals to input ports and output ports of the plurality of processor elements, a random-access memory built into the reconfigurable arithmetic device and a control unit that sets the plurality of processor elements and the inter-processor-element network.Type: ApplicationFiled: June 7, 2010Publication date: December 30, 2010Applicant: FUJITSU SEMICONDUCTOR LIMITEDInventors: Hiroshi FURUKAWA, Ichiro Kasama
-
Publication number: 20100325387Abstract: An arithmetic processing apparatus includes: a plurality of processing units connected in series to each other, wherein each of the processing units includes a limitation information setting section in which limitation information, which indicates the amount of arithmetic processing that each of the processing units is to process for data of each arithmetic processing unit, is set; an arithmetic section which executes arithmetic processing on the data of each arithmetic processing unit, according to the limitation information set in the limitation information setting section, by the same program between the plurality of processing units; and a memory in which processing data subjected to the arithmetic processing by the arithmetic section is stored.Type: ApplicationFiled: May 25, 2010Publication date: December 23, 2010Applicant: Sony CorporationInventors: Kenji YAMANE, Tsuyoshi Kano, Masahiro Takahashi
-
Publication number: 20100325388Abstract: A multiprocessor system on a chip (MPSoC) implements parallel processing and includes a plurality of cores with inter-core communication. This communication is implemented by an on-chip switch fabric in communication with each core, or by shared memory in communication with each core. In another embodiment, a parallel processing system is implemented as a Howard Cascade and uses shared memory for implementing inter-chip communication.Type: ApplicationFiled: June 17, 2010Publication date: December 23, 2010Inventor: Kevin D. Howard
-
Patent number: 7856544Abstract: A method for implementing a stream processing computer architecture includes creating a stream computer processing (SCP) system by forming a super node cluster of processors representing physical computation nodes (“nodes”), communicatively coupling the processors via a local interconnection means (“interconnect”), and communicatively coupling the cluster to an optical circuit switch (OCS), via optical external links (“links”). The OCS is communicatively coupled to another cluster of processors via the links. The method also includes generating a stream computation graph including kernels and data streams, and mapping the graph to the SCP system, which includes assigning the kernels to the clusters and respective nodes, assigning data stream traffic between the kernels to the interconnection when the data stream is between nodes in the same cluster, and assigning traffic between the kernels to the links when the data stream is between nodes in different clusters.Type: GrantFiled: August 18, 2008Date of Patent: December 21, 2010Assignee: International Business Machines CorporationInventors: Eugen Schenfeld, Thomas B. Smith, III
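The mapping rule in this abstract (same-cluster streams use the local interconnect, cross-cluster streams use the optical links) can be sketched roughly as follows. This is an illustrative sketch only, not the patented method: the function name `assign_streams`, the dictionary inputs, and the string labels are all hypothetical, and it assumes kernels have already been placed on nodes.

```python
def assign_streams(kernel_to_node, node_to_cluster, streams):
    """Label each (producer, consumer) stream 'interconnect' or 'link'.

    kernel_to_node and node_to_cluster are hypothetical placement tables;
    streams is an iterable of (producer_kernel, consumer_kernel) pairs.
    """
    assignment = {}
    for producer, consumer in streams:
        src_cluster = node_to_cluster[kernel_to_node[producer]]
        dst_cluster = node_to_cluster[kernel_to_node[consumer]]
        # Same cluster: traffic stays on the local interconnect.
        # Different clusters: traffic goes over the optical external links.
        assignment[(producer, consumer)] = (
            "interconnect" if src_cluster == dst_cluster else "link"
        )
    return assignment


if __name__ == "__main__":
    placement = {"k1": "n1", "k2": "n2", "k3": "n3"}
    clusters = {"n1": "A", "n2": "A", "n3": "B"}
    print(assign_streams(placement, clusters, [("k1", "k2"), ("k2", "k3")]))
```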
-
Publication number: 20100318767Abstract: A multiplexing auxiliary processing element (PE) performs a process that includes the operations of receiving signals of a plurality of upstream processing elements (PEs) including a plurality of pairs of PEs arranged on the input side; supplying the signals from the upstream PEs to a multiplex PE that is multiplexed and used so that the signals are subjected to a predetermined process by the multiplex PE; receiving the processed signals subjected to the predetermined process by the multiplex PE and sequentially supplying the signals to a plurality of downstream PEs arranged on the output side; and performing operations of the upstream PEs synchronously with the supply of the processed signals to the corresponding downstream PEs on the basis of setting of the multiplexing auxiliary PE.Type: ApplicationFiled: June 1, 2010Publication date: December 16, 2010Applicant: FUJITSU SEMICONDUCTOR LIMITEDInventor: Tsuguchika TABARU
-
Publication number: 20100312990Abstract: Systems, methods of operating a memory device, and methods of arbitrating access to a memory array in a memory device having an internal processor are provided. In one or more embodiments, conflicts in accessing the memory array are reduced by interfacing an external processor, such as a memory controller, with the internal processor, which could be an embedded ALU, through a control interface. The external processor can control access to the memory array, and the internal processor can send signals to the external processor to request access to the memory array. The signals may also request a particular bank in the memory array. In different embodiments, the external processor and the internal processor communicate via the control interface or a standard memory interface to grant access to the memory array, or to a particular bank in the memory array, for example.Type: ApplicationFiled: June 4, 2009Publication date: December 9, 2010Applicant: MICRON TECHNOLOGY, INC.Inventor: Robert Walker
-
Publication number: 20100293312Abstract: Described embodiments provide a system having a plurality of processor cores and common memory in direct communication with the cores. A source processing core communicates with a task destination core by generating a task message for the task destination core. The task source core transmits the task message directly to a receiving processing core adjacent to the task source core. If the receiving processing core is not the task destination core, the receiving processing core passes the task message unchanged to a processing core adjacent the receiving processing core. If the receiving processing core is the task destination core, the task destination core processes the message.Type: ApplicationFiled: May 18, 2010Publication date: November 18, 2010Inventors: David P. Sonnier, William G. Burroughs, Narender R. Vangati, Deepak Mital, Robert J. Munoz
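The hop-by-hop forwarding described in this abstract can be illustrated with a short sketch. It assumes, purely for illustration, that the cores form a one-directional ring indexed 0 to `num_cores - 1`; the abstract only says each core passes the message to an adjacent core. The name `forward_task` and its parameters are hypothetical.

```python
def forward_task(message, source, destination, num_cores):
    """Pass `message` from core to adjacent core until `destination` gets it.

    Assumes a one-directional ring (an assumption, not stated in the
    abstract). Returns the cores that received the message, in order.
    """
    if not (0 <= source < num_cores and 0 <= destination < num_cores):
        raise ValueError("core index out of range")
    received_by = []
    current = source
    while True:
        current = (current + 1) % num_cores  # hand off to the adjacent core
        received_by.append(current)
        if current == destination:
            # The destination core processes the message; forwarding stops.
            return received_by
        # Any other core passes the message along unchanged.


if __name__ == "__main__":
    print(forward_task({"task": "checksum"}, source=3, destination=1, num_cores=4))
```

In this sketch no intermediate core inspects or alters the message body; only the destination check decides whether to process or forward.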
-
Publication number: 20100268914Abstract: A processing system comprising processors and the dynamically configurable communication elements coupled together in an interspersed arrangement. The processors each comprise at least one arithmetic logic unit, an instruction processing unit, and a plurality of processor ports. The dynamically configurable communication elements each comprise a plurality of communication ports, a first memory, and a routing engine. For each of the processors, the plurality of processor ports is configured for coupling to a first subset of the plurality of dynamically configurable communication elements. For each of the dynamically configurable communication elements, the plurality of communication ports comprises a first subset of communication ports configured for coupling to a subset of the plurality of processors and a second subset of communication ports configured for coupling to a second subset of the plurality of dynamically configurable communication elements.Type: ApplicationFiled: June 30, 2010Publication date: October 21, 2010Inventors: Michael B. Doerr, William H. Hallidy, David A. Gibson, Craig M. Chase
-
Publication number: 20100269027Abstract: A data processing system is programmed to provide a method for enabling user-level one-to-all message/messaging (OTAM) broadcast within a distributed parallel computing environment in which multiple threads of a single job execute on different processing nodes across a network. The method comprises: generating one or more messages for transmission to at least one other processing node accessible via a network, where the messages are generated by/for a first thread executing at the data processing system (first processing node) and the other processing node executes one or more second threads of a same parallel job as the first thread. An OTAM broadcast is transmitted via a host fabric interface (HFI) of the data processing system as a one-to-all broadcast on the network, whereby the messages are transmitted to a cluster of processing nodes across the network that execute threads of the same parallel job as the first thread.Type: ApplicationFiled: April 16, 2009Publication date: October 21, 2010Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Lakshminarayana B. Arimilli, Ravi K. Arimilli, Robert S. Blackmore
-
Publication number: 20100268913Abstract: A multi-core processor includes two or more cores; an external communication facility that is shared by the cores and is capable of communicating with one of the cores at a time; and an internal communication facility capable of communicating simultaneously with each one of the cores; wherein the multi-core processor is configured to: receive a first signal via the external communication facility; relay the first signal to one of the cores; handle the first signal by the one of the cores, thereby generating a second signal; transmit substantially at the same time the second signal to each one of the cores by the internal communication facility; start a task on each one of the cores in response to the receiving of the second signal.Type: ApplicationFiled: March 18, 2010Publication date: October 21, 2010Applicant: ASML NETHERLANDS B.V.Inventors: Robert Lambertus Cornelius Johannes VAN HALDER, Wilhelmus Theodorus Maria Alberts, Richard Nauber
-
Patent number: 7809926Abstract: A reconfigurable multiprocessor system including a number of processing units and components enabling executing sequential code collectively at processing units and enabling changing the architectural configuration of the processing units.Type: GrantFiled: November 3, 2006Date of Patent: October 5, 2010Assignee: Cornell Research Foundation, Inc.Inventors: Jose F. Martinez, Engin Ipek, Meyrem Kirman, Nevin Kirman
-
Publication number: 20100241826Abstract: A data processing apparatus can reduce an occupancy rate of a ring bus by suppressing occurrence of a stall packet, and can change a processing sequence. In the data processing apparatus, a buffer is provided in each communication unit connecting the ring bus and the associated processing unit. Transfer of data from the communication unit to the processing unit is controlled by an enable signal. Consequently, occurrence of a stall packet is suppressed. Accordingly, frequency of occurrence of a deadlock state is reduced by decreasing the occupancy rate of the ring bus.Type: ApplicationFiled: March 15, 2010Publication date: September 23, 2010Applicant: CANON KABUSHIKI KAISHAInventors: Yuji Hara, Hisashi Ishikawa, Akinobu Mori, Takeo Kimura, Hirowo Inoue
-
Publication number: 20100235609Abstract: In an information apparatus including a plurality of processing circuits connected to a ring bus, when processing speeds (throughput) of processing circuits are different or an amount of data in the processing circuit is increased or decreased, deadlock can occur or the throughput can be decreased in the ring bus. In order to solve this problem, a stall state of another processing unit is detected from a packet acquired from the ring bus, and a packet is restricted from being newly generated by the processing circuit or transmitted therefrom when the other processing unit is in the stall state.Type: ApplicationFiled: March 9, 2010Publication date: September 16, 2010Applicant: CANON KABUSHIKI KAISHAInventors: Hirowo Inoue, Hisashi Ishikawa
-
Publication number: 20100223219Abstract: A calculation processing apparatus for executing network calculations defined by hierarchically connecting a plurality of logical processing nodes that apply calculation processing to input data, sequentially designates a processing node which is to execute calculation processing based on sequence information that specifies an execution order of calculations of predetermined processing units to be executed by the plurality of processing nodes, so as to implement the network calculations, and executes the calculation processing of the designated processing node in the processing unit to obtain a calculation result. The calculation apparatus allocates partial areas of a memory to the plurality of processing nodes as ring buffers, and writes the calculation result in the memory while circulating a write destination of data to have a memory area corresponding to the amount of the calculation result of the processing unit as a unit.Type: ApplicationFiled: June 11, 2008Publication date: September 2, 2010Applicant: CANON KABUSHIKI KAISHAInventors: Masami Kato, Takahisa Yamamoto, Yoshinori Ito
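The ring-buffer allocation in this abstract, where the write destination circulates in units sized to one processing unit's result, might look like the sketch below. It is a simplified illustration under assumed names: `RingBuffer`, `base`, `num_units`, and `unit_size` are hypothetical, and shared memory is modelled as a plain Python list.

```python
class RingBuffer:
    """A partial area of shared memory used as a ring buffer by one node."""

    def __init__(self, memory, base, num_units, unit_size):
        self.memory = memory        # shared memory, modelled as a list
        self.base = base            # first index of this node's area
        self.num_units = num_units  # how many results fit in the area
        self.unit_size = unit_size  # size of one processing unit's result
        self.next_unit = 0          # slot that the next write will use

    def write(self, result):
        """Store one result and advance the write destination circularly."""
        if len(result) != self.unit_size:
            raise ValueError("result must be exactly one unit long")
        start = self.base + self.next_unit * self.unit_size
        self.memory[start:start + self.unit_size] = result
        # Wrap around so the oldest slot is overwritten next.
        self.next_unit = (self.next_unit + 1) % self.num_units
        return start


if __name__ == "__main__":
    memory = [0] * 8
    node_buffer = RingBuffer(memory, base=2, num_units=2, unit_size=3)
    for value in (1, 2, 3):
        node_buffer.write([value] * 3)
    print(memory)
```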
-
Publication number: 20100217955Abstract: The present disclosure relates to a system for routing data across a multicore processing network. The system includes a multicore processing array having a plurality of processing cores, a memory for storing data relating to an object being modeled, the data being associated with coordinate information relating to the object within a coordinate system, and a controller for routing the data from the memory to one or more of the plurality of processing cores of the multicore processing array based on the coordinate information associated with the data. The present disclosure also relates to a method for routing data across a multicore processing network and a computer accessible medium having stored thereon computer executable instructions for performing a procedure for routing data across a multicore processing network.Type: ApplicationFiled: February 25, 2009Publication date: August 26, 2010Inventors: Thomas Martin Conte, Andrew Wolfe
-
Patent number: 7783861Abstract: When an instruction code “MVLR” is sent from a control processor in a PE having a mask register MR in operation setting, when the direction register F is ON, if a counter and transfer result storing buffer T is ≥M, a value of T−M is stored in buffer T, and if T is less than M, content of a first transport register L of a PE whose PE number counted from the left inside a PE block is T, is selected by a first selector and stored in buffer T and the mask register is set to non-operation. When the direction register is OFF, if T is ≤−M, a value of T+M is stored in buffer T, and if T is greater than −M, content of R of a PE whose PE number is −T, counted from the right inside the PE block, is selected by a second selector and stored in buffer T, and MR is set to non-operation. Entire PEs transfer content of L and R to M-adjacent left and right PEs, and data transferred from M-adjacent right and M-adjacent left PEs are stored in L and R respectively.Type: GrantFiled: February 27, 2007Date of Patent: August 24, 2010Assignee: NEC CorporationInventor: Shorin Kyo