Operation Patents (Class 712/30)
  • Patent number: 8468534
    Abstract: Techniques are provided for dynamically re-ordering operation requests that have previously been submitted to a queue management unit. After the queue management unit has placed multiple requests in a queue to be executed in an order that is based on priorities that were assigned to the operations, the entity that requested the operations (the “requester”) sends one or more priority-change messages. The one or more priority-change messages include requests to perform operations that have already been queued. For at least one of the operations, the priority assigned to the operation in the subsequent request is different from the priority that was assigned to the same operation when that operation was initially queued for execution. Based on the change in priority, the operation whose priority has change is placed at a different location in the queue, relative to the other operations in the queue that were requested by the same requester.
    Type: Grant
    Filed: April 5, 2010
    Date of Patent: June 18, 2013
    Assignee: Apple Inc.
    Inventor: Brian R. Tunning
  • Publication number: 20130151814
    Abstract: A multi-core processor includes a monitored processor core whose process result is to be monitored; a monitoring processor core group including two or more monitoring processors which can perform a process for monitoring the monitored processor core; an evaluating part configured to evaluate a processing load of the monitoring processor core group; and a controlling part configured to make the monitoring processor core group perform the process for monitoring the monitored processor core in a distributed manner if the processing load of the monitoring processor core group evaluated by the evaluating part is low, and make the monitoring processor of the monitoring processor core group perform the process for monitoring the monitored processor core if the processing load of the monitoring processor core group evaluated by the evaluating part is high, the monitoring processor performing a process whose priority is relatively low.
    Type: Application
    Filed: December 13, 2011
    Publication date: June 13, 2013
    Applicant: Toyota Jidosha Kabushiki Kaisha
    Inventor: Koji Ueda
  • Patent number: 8464026
    Abstract: A CPU may select a variable from a variable set as a dependent variable. The variable set may be part of the data structure that includes a plurality of vector values, a vector value associated with a variable set of n number of variables, and each variable of the variable set having a variable value. The number of dependent variable steps for the dependent variable may be determined. The number of the vector values in a dependent variable step is determined as being number of independent variables. A function is mapped to a plurality of thread processors, and each thread processor is assigned for the function to be performed on each one of the independent variables for each of the dependent variable steps.
    Type: Grant
    Filed: February 17, 2010
    Date of Patent: June 11, 2013
    Assignee: International Business Machines Corporation
    Inventors: Rajesh Ramkrishna Bordawekar, Ravishankar Rao
  • Patent number: 8458244
    Abstract: A parallel computer including compute nodes, each including two reduction processing cores, a network write processing core, and a network read processing core, each processing core assigned an input buffer. Copying, in interleaved chunks by the reduction processing cores, contents of the reduction processing cores' input buffers to an interleaved buffer in shared memory; copying, by one of the reduction processing cores, contents of the network write processing core's input buffer to shared memory; copying, by another of the reduction processing cores, contents of the network read processing core's input buffer to shared memory; and locally reducing in parallel by the reduction processing cores: the contents of the reduction processing core's input buffer; every other interleaved chunk of the interleaved buffer; the copied contents of the network write processing core's input buffer; and the copied contents of the network read processing core's input buffer.
    Type: Grant
    Filed: August 15, 2012
    Date of Patent: June 4, 2013
    Assignee: International Business Machines Corporation
    Inventors: Michael A. Blocksome, Daniel A. Faraj
  • Publication number: 20130138920
    Abstract: An apparatus for packet processing is provided. The apparatus is to be implemented in a server and includes: a preprocessor and at least two processors which are respectively connected with the preprocessor. The preprocessor is to classify packets received externally from the server, and to distribute the classified packets to the respective processors, wherein packets in a same flow are distributed to a same processor. Each of the processors is to receive and process a packet distributed by the preprocessor.
    Type: Application
    Filed: August 11, 2011
    Publication date: May 30, 2013
    Applicant: Hangzhou H3C Technologies, Co., Ltd.
    Inventor: Changzhong Ge
  • Patent number: 8453152
    Abstract: A scheduler receives at least one flexible reservation request for scheduling in a computing environment comprising consumable resources. The flexible reservation request specifies a duration and at least one required resource. The consumable resources comprise at least one machine resource and at least one floating resource. The scheduler creates a flexible job for the at least one flexible reservation request and places the flexible job in a prioritized job queue for scheduling, wherein the flexible job is prioritizes relative to at least one regular job in the prioritized job queue. The scheduler adds a reservation set to a waiting state for the at least one flexible reservation request.
    Type: Grant
    Filed: February 1, 2011
    Date of Patent: May 28, 2013
    Assignee: International Business Machines Corporation
    Inventors: Alexander Druyan, Wei Li, Kailash N. Marthi, Yun T. Xiang, Linda C. Cham
  • Patent number: 8448174
    Abstract: An information processing device which has a plurality of process units for performing various kinds of processes includes a detecting unit that detects a processing loads of the process units; a determining unit that determines whether a total amount of the processing loads detected by the detecting unit is equal to or larger than a specific value; a designating unit that designates a process unit having a process state to be controlled, based on the processing loads of the process units detected by the detecting unit, when the determining unit determines that the total amount is equal to or larger than the specific value; a process identifying unit that identifies a process having an execution state to be controlled among processes being performed by the process unit designated by the designating unit; and a control unit that controls the execution state of the process identified by the process identifying unit.
    Type: Grant
    Filed: January 22, 2010
    Date of Patent: May 21, 2013
    Assignee: Fujitsu Limited
    Inventors: Ryo Miyamoto, Ryuichi Matsukura, Takashi Ohno
  • Patent number: 8443175
    Abstract: A microprocessor integrated circuit includes first and second processors, an internal memory accessible by the first and second processors, and a bus interface unit configured to interface to a bus external to the microprocessor for providing access to a memory external to the microprocessor. The bus interface unit, external bus, and external memory are accessible by the second processor but are inaccessible by the first processor. The first processor writes debug information to the internal memory. The first processor detects an event and provides a notification of the event to the second processor. The second processor, coupled to the bus interface unit, executes microcode in response to the event notification received from the first processor. The microcode reads the debug information from the internal memory and writes the debug information to the external memory via the bus interface unit and external bus for use in debugging the second processor.
    Type: Grant
    Filed: March 29, 2010
    Date of Patent: May 14, 2013
    Assignee: VIA Technologies, Inc.
    Inventors: G. Glenn Henry, Jui-Shuan Chen
  • Publication number: 20130117533
    Abstract: A coprocessor has: a processing unit for processing tasks in a data-processing system subject to at least one master processor; at least one storage module having memory areas, assignable in each case to the tasks, for storing data assigned to the tasks; and a buffer area for buffering instructions assigned to the tasks, the instructions including processing instructions, and upon retrieval of the processing instructions from the buffer area, the data stored in the storage module being processed on the basis of the processing instructions.
    Type: Application
    Filed: April 6, 2011
    Publication date: May 9, 2013
    Inventor: Jan Hayek
  • Patent number: 8438404
    Abstract: The disclosure is applied to a generic microprocessor architecture with a set (e.g., one or more) of controlling elements (e.g., MPEs) and a set of groups of sub-processing elements (e.g., SPEs). Under this arrangement, MPEs and SPEs are organized in a way that a smaller number MPEs control the behavior of a group of SPEs using program code embodied as a set of virtualized control threads. The arrangement also enables MPEs delegate functionality to one or more groups of SPEs such that those group(s) of SPEs will act as pseudo MPEs. The pseudo MPEs will utilize pseudo virtualized control threads to control the behavior of other groups of SPEs. In a typical embodiment, the apparatus includes a MCP coupled to a power supply coupled with cores to provide a supply voltage to each core (or core group) and controlling-digital elements and multiple instances of sub-processing elements.
    Type: Grant
    Filed: September 30, 2008
    Date of Patent: May 7, 2013
    Assignee: International Business Machines Corporation
    Inventors: Karl J. Duvalsaint, Harm P. Hofstee, Daeik Kim, Moon J. Kim
  • Publication number: 20130103928
    Abstract: With the progress toward multi-core processors, each core is can not readily ascertain the status of the other dies with respect to an idle or active status. A proposal for utilizing an interface to transmit core status among multiple cores in a multi-die microprocessor is discussed. Consequently, this facilitates thermal management by allowing an optimal setting for setting performance and frequency based on utilizing each core status.
    Type: Application
    Filed: December 11, 2012
    Publication date: April 25, 2013
    Inventors: Jose P. Allarey, Varghese George, Sanjeev Jahagirdar, Oren Lamdan, Nathan Ofer, Tomer Ziv
  • Publication number: 20130103927
    Abstract: A processor link that couples a first processor and a second processor is selected for validation and a plurality of communication parameter settings associated with the first and the second processors is identified. The first and the second processors are successively configured with each of the communication parameter settings. One or more test data pattern(s) are provided from the first processor to the second processor in accordance with the communication parameter setting. Performance measurements associated with the selected processor link and with the communication parameter setting are determined based, at least in part, on the test data pattern as received at the second processor. One of the communication parameter settings that is associated with the highest performance measurements is selected. The selected communication parameter setting is applied to the first and the second processors for subsequent communication between the first and the second processors via the processor link.
    Type: Application
    Filed: October 25, 2011
    Publication date: April 25, 2013
    Applicant: International Business Machines Corporation
    Inventors: Robert W. Berry, JR., Anand Haridass, Prasanna Jayaraman
  • Publication number: 20130097407
    Abstract: A method, system, and computer program product for maintaining reliability in a computer system. In an example embodiment, the method includes managing workloads on a first processor with a first processor architecture by an agent process executing on a second processor with a second processor architecture. The method proceeds by activating redundant computation on the second processor by the agent process. The method continues by performing a same computation from a workload of the workloads at least twice. Finally, the method includes comparing results of the same computation. In this embodiment the first processor is coupled the second processor by a network, and the first processor architecture and second processor architecture are different architectures.
    Type: Application
    Filed: December 8, 2012
    Publication date: April 18, 2013
    Applicant: International Business Machines Corporation
    Inventor: International Business Machines Corporation
  • Publication number: 20130097406
    Abstract: In some embodiments, a computer cluster system comprises a plurality of nodes and a software package comprising a user interface and a kernel for interpreting program code instructions. In certain embodiments, a cluster node module is configured to communicate with the kernel and other cluster node modules. The cluster node module can accept instructions from the user interface and can interpret at least some of the instructions such that several cluster node modules in communication with one another and with a kernel can act as a computer cluster.
    Type: Application
    Filed: March 16, 2012
    Publication date: April 18, 2013
    Inventors: Zvi Tannenbaum, Dean E. Dauger
  • Patent number: 8423749
    Abstract: A computer-implemented method, system and computer program product for controlling an algorithm that is performed on a unit of work in a subsequent software pipeline stage in a Network On a Chip (NOC) is presented. In one embodiment, the method executes a first operation in a first node of the NOC. The first node generates payload, and then loads that payload into a message. The message with the payload is transmitted to a nanokernel that controls a second node in the NOC. The nanokernel calls an algorithm that is needed by a second operation in a second node in the NOC, which uses the algorithm to execute the second operation.
    Type: Grant
    Filed: October 22, 2008
    Date of Patent: April 16, 2013
    Assignee: International Business Machines Corporation
    Inventors: Eric O. Mejdrich, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs
  • Publication number: 20130091341
    Abstract: A computation system for computing interactions in a multiple-body simulation includes an array of processing modules arranged into one or more serially interconnected processing groups of the processing modules. Each of the processing modules includes storage for data elements and includes circuitry for performing pairwise computations between data elements each associated with a spatial location. Each of the pairwise computations makes use of a data element from the storage of the processing module and a data element passing through the serially interconnected processing modules. Each of the processing modules includes circuitry for selecting the pairs of data elements according to separations between spatial locations associated with the data elements.
    Type: Application
    Filed: November 19, 2012
    Publication date: April 11, 2013
    Applicant: D.E. Shaw Research LLC
    Inventors: David E. Shaw, Martin M. Deneroff, Ron O. Dror, Richard H. Larson, John K. Salmon
  • Patent number: 8417919
    Abstract: A method of dynamic parallelization in a multi-processor identifies potentially independent computational operations, such as functions and methods, with a serializer that assigns a computational operation to a serialization set and a processor based on assessment of the data that the computational operation will be accessing upon execution.
    Type: Grant
    Filed: August 18, 2009
    Date of Patent: April 9, 2013
    Assignee: Wisconsin Alumni Research Foundation
    Inventors: Matthew Allen, Gurindar S. Sohi
  • Publication number: 20130086355
    Abstract: A method, an apparatus and an article of manufacture for generating a distributed data scalable adaptive map-reduce framework for at least one multi-core cluster. The method includes partitioning a cluster into at least one computational group, determining at least one key-group leader within each computational group, performing a local combine operation at each computational group, performing a global combine operation at each of the at least one key-group leader within each computational group based on a result from the local combine operation, and performing a global map-reduce operation across the at least one key-group leader within each computational group.
    Type: Application
    Filed: September 30, 2011
    Publication date: April 4, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Ankur Narang, Jyothish Soman
  • Publication number: 20130086356
    Abstract: A method for generating a distributed data scalable adaptive map-reduce framework for at least one multi-core cluster. The method includes partitioning a cluster into at least one computational group, determining at least one key-group leader within each computational group, performing a local combine operation at each computational group, performing a global combine operation at each of the at least one key-group leader within each computational group based on a result from the local combine operation, and performing a global map-reduce operation across the at least one key-group leader within each computational group.
    Type: Application
    Filed: August 1, 2012
    Publication date: April 4, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Ankur Narang, Jyothish Soman
  • Publication number: 20130073832
    Abstract: A parallel computer that includes compute nodes having computer processors and a CAU (Collectives Acceleration Unit) that couples processors to one another for data communications. In embodiments of the present invention, deterministic reduction operation include: organizing processors of the parallel computer and a CAU into a branched tree topology, where the CAU is a root of the branched tree topology and the processors are children of the root CAU; establishing a receive buffer that includes receive elements associated with processors and configured to store the associated processor's contribution data; receiving, in any order from the processors, each processor's contribution data; tracking receipt of each processor's contribution data; and reducing, the contribution data in a predefined order, only after receipt of contribution data from all processors in the branched tree topology.
    Type: Application
    Filed: November 1, 2012
    Publication date: March 21, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: International Business Machines Corporation
  • Publication number: 20130067198
    Abstract: A parallel computer is provided that includes a collection of compute nodes organized as a tree, including: initiating a collective gather operation by a logical root of the collection of compute nodes, including adding result data of the logical root to a gather buffer; for each compute node in the collection of compute nodes, determining whether result data of the compute node is already written in the gather buffer; and if the result data of the compute node is already written in the gather buffer, incrementing a counter assigned to that result data already written in the gather buffer; and if the result data of the compute node is not already written in the gather buffer, writing the result data of the compute node as new result data in the gather buffer, incrementing a counter assigned to that new result data, and writing in the gather buffer a node ID.
    Type: Application
    Filed: November 1, 2012
    Publication date: March 14, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: International Business Machines Corporation
  • Publication number: 20130061078
    Abstract: A computing apparatus and corresponding method for operating are disclosed. The computing apparatus may comprise a set of interconnected central processing units (CPUs). Each CPU may embed an operating system including a kernel comprising a protocol stack. At least one of the CPUs may further embed executable instructions for allocating multiple strands among the rest of the CPUs. The protocol stack may comprise a Transmission Control Protocol/Internet Protocol (TCP/IP), a User Datagram Protocol/Internet Protocol (UDP/IP) stack, an Internet Control Message Protocol (ICMP) stack or any other suitable Internet protocol. The method for operating the computing apparatus may comprise receiving input/output (I/O) requests, generating multiple strands according to the I/O requests, and allocating the multiple strands to one or more CPUs.
    Type: Application
    Filed: December 21, 2011
    Publication date: March 7, 2013
    Inventor: Ian Henry Stuart Cullimore
  • Publication number: 20130060555
    Abstract: Methods and apparatus for controlling at least two processing cores in a multi-processor device or system include accessing an operating system run queue to generate virtual pulse trains for each core and correlating the virtual pulse trains to identify patterns of interdependence. The correlated information may be used to determine dynamic frequency/voltage control settings for the first and second processing cores to provide a performance level that accommodates interdependent processes, threads and processing cores.
    Type: Application
    Filed: February 27, 2012
    Publication date: March 7, 2013
    Applicant: QUALCOMM INCORPORATED
    Inventors: Steven S. Thomson, Edoardo Regini, Mriganka Mondal, Nishant Hariharan
  • Patent number: 8381216
    Abstract: Dynamically managing a thread pool associated with a plurality of sub-applications. A request for at least one of the sub-applications is received. A quantity of threads currently assigned to the at least one of the sub-applications is determined. The determined quantity of threads is compared to a predefined maximum thread threshold. A thread in the thread pool is assigned to handle the received request if the determined quantity of threads is not greater than the predefined maximum thread threshold. Embodiments enable control of the quantity of threads within the thread pool assigned to each of the sub-applications. Further embodiments manage the threads for the sub-applications based on latency of the sub-applications.
    Type: Grant
    Filed: March 5, 2010
    Date of Patent: February 19, 2013
    Assignee: Microsoft Corporation
    Inventor: Rohith Thammana Gowda
  • Publication number: 20130042088
    Abstract: Collective operation protocol selection in a parallel computer that includes compute nodes may be carried out by calling a collective operation with operating parameters; selecting a protocol for executing the operation and executing the operation with the selected protocol. Selecting a protocol includes: iteratively, until a prospective protocol meets predetermined performance criteria: providing, to a protocol performance function for the prospective protocol, the operating parameters; determining whether the prospective protocol meets predefined performance criteria by evaluating a predefined performance fit equation, calculating a measure of performance of the protocol for the operating parameters; determining that the prospective protocol meets predetermined performance criteria and selecting the protocol for executing the operation only if the calculated measure of performance is greater than a predefined minimum performance threshold.
    Type: Application
    Filed: August 9, 2011
    Publication date: February 14, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Charles J. Archer, Michael A. Blocksome, Joseph D. Ratterman, Brian E. Smith
  • Patent number: 8375197
    Abstract: Methods, apparatus, and products are disclosed for performing an allreduce operation on a plurality of compute nodes of a parallel computer, each node including at least two processing cores, that include: performing, for each node, a local reduction operation using allreduce contribution data for the cores of that node, yielding, for each node, a local reduction result for one or more representative cores for that node; establishing one or more logical rings among the nodes, each logical ring including only one of the representative cores from each node; performing, for each logical ring, a global allreduce operation using the local reduction result for the representative cores included in that logical ring, yielding a global allreduce result for each representative core included in that logical ring; and performing, for each node, a local broadcast operation using the global allreduce results for each representative core on that node.
    Type: Grant
    Filed: May 21, 2008
    Date of Patent: February 12, 2013
    Assignee: International Business Machines Corporation
    Inventor: Ahmad Faraj
  • Patent number: 8370605
    Abstract: A system includes first and second processors, first and second graphics processing units (GPUs), one or more peripheral devices, a switch matrix, and processor-readable memory. The switch matrix comprises programmable data paths between the processors, the GPUs, and the peripheral devices. Software encoded in the process-readable memory includes a first operating system (OS) executed by the first processor, a second OS executed by the second processor, a matrix scheduling engine, and a media interface switch (MIS) engine. The first OS boots faster than the second OS. The matrix scheduling engine runs on both OSs and configures the data paths in the switch matrix to couple the processors and the GPUs, and to couple the processors and the peripheral devices. The MIS engine runs on the operating systems, detects presence of the peripheral devices, and configures the data paths in the switch matrix to couple the processors and the peripheral devices.
    Type: Grant
    Filed: November 11, 2009
    Date of Patent: February 5, 2013
    Assignee: Sunman Engineering, Inc.
    Inventors: Allen Nejah, Gholam Reza Golshan, George W. Harvey
  • Publication number: 20130031335
    Abstract: Techniques are described for transmitting predicted output data on a processing element in a stream computing application instead of processing currently received input data. The stream computing application monitors the output of a processing element and determines whether its output is predictable, for example, if the previously transmitted output values are within a predefined range or if one or more input values correlate with the same one or more output values. The application may then generate a predicted output value to transmit from the processing element instead of transmitting a processed output value based on current input values. The predicted output value may be, for example, an average of the previously transmitted output values or a previously transmitted output value that was transmitted in response to a previously received input value that is similar to a currently received input value.
    Type: Application
    Filed: July 26, 2011
    Publication date: January 31, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: John M. Santosuosso, Brandon W. Schulz
  • Publication number: 20130031334
    Abstract: A mechanism is provided for automatically routing network interconnects in a data processing system. A processor in a node of a plurality of nodes receives network topology from neighboring nodes in the plurality of nodes within the data processing system. The processor constructs a system node map that identifies a physical connectivity between the node and the neighboring nodes. The processor programs a switch in the node with a connectivity map that indicates a set of point-to-point connections with the neighboring nodes. The set of point-to-point connections comprise locally-connected connections and pass-through connections.
    Type: Application
    Filed: July 25, 2011
    Publication date: January 31, 2013
    Applicant: International Business Machines Corporation
    Inventors: Wael R. El-Essawy, David A. Papa, Jarrod A. Roy
  • Publication number: 20130031336
    Abstract: An external intrinsic interface. A processor may include a core including a plurality of functional units, an intrinsic module located outside the core, and an interface module to perform relaying between the intrinsic module and a functional unit, among the plurality of functional units.
    Type: Application
    Filed: February 16, 2012
    Publication date: January 31, 2013
    Applicant: Samsung Electronics Co., Ltd.
    Inventors: Kwon Taek KWON, Seok Yoon Jung
  • Publication number: 20130024658
    Abstract: Technology to suppress the drop in SIMD processor efficiency that occurs when exchanging two-dimensional data in a plurality of rectangular regions, between an external section and a plurality of processor elements in an SIMD processor, so that one rectangular region corresponds to one processor element. In the SIMD processor, an address storage unit in a memory controller is capable of setting N number of addresses Ai (i=1 through N) in an external memory by utilizing a control processor. A parameter storage unit is capable of setting a first parameter OSV, a second parameter W, and a third parameter L by utilizing a control processor. A data transfer unit executes the transfer of data between an external memory, and the buffers in N number of processor elements contained in the applicable SIMD processor, based on the contents of the address storage unit and the parameter storage unit.
    Type: Application
    Filed: July 3, 2012
    Publication date: January 24, 2013
    Inventor: Shorin KYO
  • Publication number: 20130024659
    Abstract: In a logically partitioned host computer system comprising host processors (host CPUs) partitioned into a plurality of guest processors (guest CPUs) of a guest configuration, a perform topology function instruction is executed by a guest processor specifying a topology change of the guest configuration. The topology change preferably changes the polarization of guest CPUs, the polarization related to the amount of a host CPU resource is provided to a guest CPU.
    Type: Application
    Filed: September 27, 2012
    Publication date: January 24, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: International Business Machines Corporation
  • Publication number: 20130013891
    Abstract: A hierarchical barrier synchronization of cores and nodes on a multiprocessor system, in one aspect, may include providing by each of a plurality of threads on a chip, input bit signal to a respective bit in a register, in response to reaching a barrier; determining whether all of the plurality of threads reached the barrier by electrically tying bits of the register together and “AND”ing the input bit signals; determining whether only on-chip synchronization is needed or whether inter-node synchronization is needed; in response to determining that all of the plurality of threads on the chip reached the barrier, notifying the plurality of threads on the chip, if it is determined that only on-chip synchronization is needed; and after all of the plurality of threads on the chip reached the barrier, communicating the synchronization signal to outside of the chip, if it is determined that inter-node synchronization is needed.
    Type: Application
    Filed: September 13, 2012
    Publication date: January 10, 2013
    Applicant: International Business Machines Corporation
    Inventors: Valentina Salapura, Robert W. Wisniewski
  • Publication number: 20130013839
    Abstract: A portable handheld device including a CPU for processing a script; a multi-core processor for processing an image; an input buffer for receiving data for processing by the multi-core processor, the input buffer being provided under the control of the multi-core processor to send data thereto; and an output buffer for receiving data processed by the multi-core processor, the output buffer being provided under the control of the multi-core processor to receive data therefrom. The multi-core processor comprises a plurality of micro-coded processing units. The CPU is configured with authority to clear and query the input and output buffers.
    Type: Application
    Filed: September 15, 2012
    Publication date: January 10, 2013
    Inventor: Kia Silverbrook
  • Publication number: 20130007413
    Abstract: Methods and apparatus for accomplishing dynamic frequency/voltage control between at least two processor cores in a multi-processor device or system include receiving busy, idle and wait, time and/or frequency information from a first processor core and receiving busy, idle, wait, time and/or frequency information from a second processor core. The received busy, idle, wait, time and/or frequency information may be correlated to identify patterns of interdependence. The correlated information may be used to determine dynamic frequency/voltage control settings for the first and second processor cores to provide a performance level that accommodates interdependent processes, threads and processor cores. The correlation of received busy, idle, wait, time and/or frequency information may involve generating a consolidated busy/idle pulse train that can then be used to set the frequency or voltage of each processor core independently.
    Type: Application
    Filed: January 5, 2012
    Publication date: January 3, 2013
    Applicant: QUALCOMM INCORPORATED
    Inventors: Steven S. Thomson, Mriganka Mondal, Nishant Hariharan
  • Publication number: 20130007412
    Abstract: A method, system, and computer program product for maintaining reliability in a computer system. In an example embodiment, the method includes managing workloads on a first processor with a first processor architecture by an agent process executing on a second processor with a second processor architecture. The method proceeds by activating redundant computation on the second processor by the agent process. The method continues by performing a same computation from a workload of the workloads at least twice. Finally, the method includes comparing results of the same computation. In this embodiment the first processor is coupled the second processor by a network, and the first processor architecture and second processor architecture are different architectures.
    Type: Application
    Filed: June 28, 2011
    Publication date: January 3, 2013
    Applicant: International Business Machines Corporation
    Inventors: Rajaram B. Krishnamurthy, Carl J. Parris, Donald W. Schmidt, Benjamin P. Segal
  • Publication number: 20120331270
    Abstract: Compressing result data for a compute node in a parallel computer, the parallel computer including a collection of compute nodes organized as a tree, including: initiating a collective gather operation by a logical root of the collection of compute nodes, including adding result data of the logical root to a gather buffer; for each compute node in the collection of compute nodes, determining whether result data of the compute node is already written in the gather buffer; and if the result data of the compute node is already written in the gather buffer, incrementing a counter assigned to that result data already written in the gather buffer; and if the result data of the compute node is not already written in the gather buffer, writing the result data of the compute node as new result data in the gather buffer, incrementing a counter assigned to that new result data, and writing in the gather buffer a node ID.
    Type: Application
    Filed: June 22, 2011
    Publication date: December 27, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Charles J. Archer, James E. Carey, Matthew W. Markland, Philip J. Sanders
  • Publication number: 20120324166
    Abstract: A computer-implemented method for managing processing resources of a computerized system having at least a first processor and a second processor, each of the processors operatively interconnected to a memory storing a set of data to be processed by a processor, the method comprising: monitoring data accessed by the first processor while executing; and if the second processor is at a shorter distance than the first processor from the monitored data, instructing to interrupt execution at the first processor and resume the execution at the second processor.
    Type: Application
    Filed: August 30, 2012
    Publication date: December 20, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Hillery C Hunter, Ronald P. Luijten, Phillip Stanley-Marbell
  • Publication number: 20120317398
    Abstract: A method to reduce buffer capacity in a processor includes giving the data packets admittance to the processor through at least one interface, storing the data packets in at least one input buffer, and using a packet rate shaper outside of a processing pipeline to control flow of the data packets to the pipeline before the data packets enter the pipeline. First and second data packets are given admittance to the pipeline in dependence on cost information per packet that is dependent upon an expected time period of residence of the first data packet in the pipeline. Cost information dependent upon an expected time period of residence of the second data packet in the pipeline differs from said cost information dependent upon the expected time period of residence of the first data packet in the pipeline.
    Type: Application
    Filed: August 15, 2012
    Publication date: December 13, 2012
    Inventors: Thomas Bodén, Jakob Carlström
  • Publication number: 20120317399
    Abstract: A parallel computer including compute nodes, each including two reduction processing cores, a network write processing core, and a network read processing core, each processing core assigned an input buffer. Copying, in interleaved chunks by the reduction processing cores, contents of the reduction processing cores' input buffers to an interleaved buffer in shared memory; copying, by one of the reduction processing cores, contents of the network write processing core's input buffer to shared memory; copying, by another of the reduction processing cores, contents of the network read processing core's input buffer to shared memory; and locally reducing in parallel by the reduction processing cores: the contents of the reduction processing core's input buffer; every other interleaved chunk of the interleaved buffer; the copied contents of the network write processing core's input buffer; and the copied contents of the network read processing core's input buffer.
    Type: Application
    Filed: August 15, 2012
    Publication date: December 13, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Michael A. Blocksome, Daniel A. Faraj
  • Patent number: 8332460
    Abstract: A parallel computer including compute nodes, each including two reduction processing cores, a network write processing core, and a network read processing core, each processing core assigned an input buffer. Copying, in interleaved chunks by the reduction processing cores, contents of the reduction processing cores' input buffers to an interleaved buffer in shared memory; copying, by one of the reduction processing cores, contents of the network write processing core's input buffer to shared memory; copying, by another of the reduction processing cores, contents of the network read processing core's input buffer to shared memory; and locally reducing in parallel by the reduction processing cores: the contents of the reduction processing core's input buffer; every other interleaved chunk of the interleaved buffer; the copied contents of the network write processing core's input buffer; and the copied contents of the network read processing core's input buffer.
    Type: Grant
    Filed: April 14, 2010
    Date of Patent: December 11, 2012
    Assignee: International Business Machines Corporation
    Inventors: Michael A. Blocksome, Daniel A. Faraj
  • Publication number: 20120311300
    Abstract: Disclosed is a method of synchronizing a plurality of processors accesses to at least one shared resource. One of a plurality of processors requests an exclusive region lock for a shared resource using a logical block address (LBA) of a dummy target. The LBA is defined in a region map that associates LBAs to shared resources. The exclusive region lock request is inserted as a node in a region lock tree of the dummy target. Access to the shared resource is granted based on a determination whether there is an existing region lock in the region lock tree that is overlapps with the new exclusive region lock request.
    Type: Application
    Filed: June 1, 2011
    Publication date: December 6, 2012
    Inventors: Kapil Sundrani, Lakshmi Kanth Reddy Kakanuru
  • Publication number: 20120311301
    Abstract: In a method of synchronizing data processing of processor arrangement, responsive to reaching, during execution of a program, a barrier included in a program sequence, the processor arrangement halts the program execution until it is determined that all instructions preceding the barrier in the program sequence have been successfully scheduled for execution.
    Type: Application
    Filed: June 8, 2012
    Publication date: December 6, 2012
    Inventors: Martin VORBACH, Volker Baumgarte, Gerd Ehlers, Frank May, Armin Nückel
  • Publication number: 20120303933
    Abstract: The present invention relates to a processor which comprises processing elements that execute instructions in parallel and are connected together with point-to-point communication links called data communication links (DCL). The instructions use DCLs to communicate data between them. In order to realize those communications, they specify the DCLs from which they take their operands, and the DCLs to which they write their results. The DCLs allow the instructions to synchronize their executions and to explicitly manage the data they manipulate. Communications are explicit and are used to realize the storage of temporary variables, which is decoupled from the storage of long-living variables.
    Type: Application
    Filed: January 31, 2011
    Publication date: November 29, 2012
    Inventors: Philippe Manet, Bertrand Rousseau
  • Publication number: 20120303932
    Abstract: A processor includes a plurality of processing tiles, wherein each tile is configured at runtime to perform a configurable operation. A first subset of tiles are configured to perform in a pipeline a first plurality of configurable operations in parallel. A second subset of tiles are configured to perform a second plurality of configurable operations in parallel with the first plurality of configurable operations. The process also includes a multi-port memory access module operably connected to the plurality of tiles via a data bus configured to control access to a memory and to provide data to two or more processing tiles simultaneously. The processor also includes a controller operably connected to the plurality of tiles and the multi-port memory access module via a runtime bus. The processor configures the tiles and the multi-port memory access module to execute a computation.
    Type: Application
    Filed: May 24, 2012
    Publication date: November 29, 2012
    Inventors: Clément Farabet, Yann LeCun
  • Publication number: 20120297164
    Abstract: This invention describes an apparatus, computer architecture, method, operating system, compiler, and application program products for MPEs as well as virtualization in a symmetric MCP. The disclosure is applied to a generic microprocessor architecture with a set (e.g., one or more) of controlling elements (e.g., MPEs) and a set of groups of sub-processing elements (e.g., SPEs). Under this arrangement, MPEs and SPEs are organized in a way that a smaller number MPEs control the behavior of a group of SPEs. The apparatus enables virtualized control threads within MPEs to be assigned to different groups of SPEs for controlling the same. The apparatus further includes a MCP coupled to a power supply coupled with cores to provide a supply voltage to each core (or core group) and controlling-digital elements and multiple instances of sub-processing elements.
    Type: Application
    Filed: July 31, 2012
    Publication date: November 22, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Karl J. Duvalsaint, Harm P. Hofstee, Daeik Kim, Moon J. Kim
  • Publication number: 20120290815
    Abstract: A data processing apparatus causes multiple processors to carry out a first data process in parallel, and when storing the data processed in parallel in a storage unit, converts the addresses of the data into addresses in the storage unit based on the data cache size of the multiple processors and stores the data. The data stored in the storage unit is then read out, and a second data process is carried out on the read-out data.
    Type: Application
    Filed: April 10, 2012
    Publication date: November 15, 2012
    Applicant: CANON KABUSHIKI KAISHA
    Inventor: Hirokazu Takahashi
  • Patent number: 8312455
    Abstract: A method for optimizing execution of a single threaded program on a multi-core processor. The method includes dividing the single threaded program into a plurality of discretely executable components while compiling the single threaded program; identifying at least some of the plurality of discretely executable components for execution by an idle core within the multi-core processor; and enabling execution of the at least one of the plurality of discretely executable components on the idle core.
    Type: Grant
    Filed: December 19, 2007
    Date of Patent: November 13, 2012
    Assignee: International Business Machines Corporation
    Inventors: Robert H. Bell, Jr., Louis Bennie Capps, Jr., Michael A. Paolini, Michael Jay Shapiro
  • Publication number: 20120278589
    Abstract: The present invention provides a storage system in which each microprocessor is able to execute synchronous processing and asynchronous processing in accordance with the operating status of the storage system. Any one attribute, from among multiple attributes (operating modes) prepared beforehand, is set in each microprocessor in accordance with the operating status of the storage system. The attribute that is set in each microprocessor is regularly reviewed and changed.
    Type: Application
    Filed: June 17, 2010
    Publication date: November 1, 2012
    Applicant: HITACHI, LTD.
    Inventors: Tomohiro Yoshihara, Shintaro Kudo, Norio Shimozono
  • Publication number: 20120278590
    Abstract: A reconfigurable processor is provided. The reconfigurable processor includes a plurality of functional blocks configured to perform corresponding operations. The reconfigurable processor also includes one or more data inputs coupled to the plurality of functional blocks to provide one or more operands to the plurality of functional blocks, and one or more data outputs to provide at least one result outputted from the plurality of functional blocks.
    Type: Application
    Filed: January 7, 2011
    Publication date: November 1, 2012
    Applicant: SHANGHAI XIN HAO MICRO ELECTRONICS CO. LTD.
    Inventors: Kenneth Chenghao Lin, Zhongmin Zhang, Haoqi Ren