Operation Patents (Class 712/30)

Master/slave (Class 712/31)

Dynamic priority queuing

Patent number: 8468534

Abstract: Techniques are provided for dynamically re-ordering operation requests that have previously been submitted to a queue management unit. After the queue management unit has placed multiple requests in a queue to be executed in an order that is based on priorities that were assigned to the operations, the entity that requested the operations (the “requester”) sends one or more priority-change messages. The one or more priority-change messages include requests to perform operations that have already been queued. For at least one of the operations, the priority assigned to the operation in the subsequent request is different from the priority that was assigned to the same operation when that operation was initially queued for execution. Based on the change in priority, the operation whose priority has changed is placed at a different location in the queue, relative to the other operations in the queue that were requested by the same requester.

Type: Grant

Filed: April 5, 2010

Date of Patent: June 18, 2013

Assignee: Apple Inc.

Inventor: Brian R. Tunning
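
The re-ordering described in the abstract of patent 8468534 can be illustrated with a minimal sketch. It is not the patented implementation: the class name `ReorderableQueue`, the convention that a lower number means a higher priority, and the lazy-invalidation heap are all assumptions made for illustration.

```python
import heapq
import itertools


class ReorderableQueue:
    """Priority queue where re-submitting a queued operation changes its priority.

    Hypothetical sketch: lower numbers are served first (an assumption).
    """

    def __init__(self):
        self._heap = []                # entries: (priority, sequence, op_id)
        self._live = {}                # op_id -> (priority, sequence) of the valid entry
        self._seq = itertools.count()  # tie-breaker: earlier submissions first

    def submit(self, op_id, priority):
        # A repeat submission acts as a priority-change message: the older
        # heap entry for op_id becomes stale and is skipped in pop().
        entry = (priority, next(self._seq))
        self._live[op_id] = entry
        heapq.heappush(self._heap, (*entry, op_id))

    def pop(self):
        while self._heap:
            priority, seq, op_id = heapq.heappop(self._heap)
            if self._live.get(op_id) == (priority, seq):
                del self._live[op_id]
                return op_id
        raise IndexError("pop from empty queue")


q = ReorderableQueue()
q.submit("load-thumbnail", 5)
q.submit("load-page", 3)
q.submit("prefetch", 7)
q.submit("prefetch", 1)  # priority change for an already-queued operation
order = [q.pop(), q.pop(), q.pop()]
```

Stale entries stay in the heap until they surface, which keeps a priority change cheap at the cost of some extra memory.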
MULTI-CORE PROCESSOR

Publication number: 20130151814

Abstract: A multi-core processor includes a monitored processor core whose process result is to be monitored; a monitoring processor core group including two or more monitoring processors which can perform a process for monitoring the monitored processor core; an evaluating part configured to evaluate a processing load of the monitoring processor core group; and a controlling part configured to make the monitoring processor core group perform the process for monitoring the monitored processor core in a distributed manner if the processing load of the monitoring processor core group evaluated by the evaluating part is low, and make the monitoring processor of the monitoring processor core group perform the process for monitoring the monitored processor core if the processing load of the monitoring processor core group evaluated by the evaluating part is high, the monitoring processor performing a process whose priority is relatively low.

Type: Application

Filed: December 13, 2011

Publication date: June 13, 2013

Applicant: Toyota Jidosha Kabushiki Kaisha

Inventor: Koji Ueda
Method and apparatus for computing massive spatio-temporal correlations using a hybrid CPU-GPU approach

Patent number: 8464026

Abstract: A CPU may select a variable from a variable set as a dependent variable. The variable set may be part of the data structure that includes a plurality of vector values, a vector value associated with a variable set of n number of variables, and each variable of the variable set having a variable value. The number of dependent variable steps for the dependent variable may be determined. The number of the vector values in a dependent variable step is determined as being the number of independent variables. A function is mapped to a plurality of thread processors, and each thread processor is assigned for the function to be performed on each one of the independent variables for each of the dependent variable steps.

Type: Grant

Filed: February 17, 2010

Date of Patent: June 11, 2013

Assignee: International Business Machines Corporation

Inventors: Rajesh Ramkrishna Bordawekar, Ravishankar Rao
Performing a local reduction operation on a parallel computer

Patent number: 8458244

Abstract: A parallel computer including compute nodes, each including two reduction processing cores, a network write processing core, and a network read processing core, each processing core assigned an input buffer. Copying, in interleaved chunks by the reduction processing cores, contents of the reduction processing cores' input buffers to an interleaved buffer in shared memory; copying, by one of the reduction processing cores, contents of the network write processing core's input buffer to shared memory; copying, by another of the reduction processing cores, contents of the network read processing core's input buffer to shared memory; and locally reducing in parallel by the reduction processing cores: the contents of the reduction processing core's input buffer; every other interleaved chunk of the interleaved buffer; the copied contents of the network write processing core's input buffer; and the copied contents of the network read processing core's input buffer.

Type: Grant

Filed: August 15, 2012

Date of Patent: June 4, 2013

Assignee: International Business Machines Corporation

Inventors: Michael A. Blocksome, Daniel A. Faraj
METHOD AND APPARATUS FOR PACKET PROCESSING AND A PREPROCESSOR

Publication number: 20130138920

Abstract: An apparatus for packet processing is provided. The apparatus is to be implemented in a server and includes: a preprocessor and at least two processors which are respectively connected with the preprocessor. The preprocessor is to classify packets received externally from the server, and to distribute the classified packets to the respective processors, wherein packets in a same flow are distributed to a same processor. Each of the processors is to receive and process a packet distributed by the preprocessor.

Type: Application

Filed: August 11, 2011

Publication date: May 30, 2013

Applicant: Hangzhou H3C Technologies, Co., Ltd.

Inventor: Changzhong Ge
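
The rule in the abstract of publication 20130138920, that packets in the same flow go to the same processor, can be sketched as follows. The abstract does not say how flows are classified; hashing a 5-tuple with CRC32 is an assumption, and `pick_processor` and `distribute` are hypothetical names.

```python
import zlib


def pick_processor(flow, num_processors):
    """Map a flow identifier to a processor index (hypothetical helper)."""
    src_ip, dst_ip, src_port, dst_port, proto = flow
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    # crc32 is deterministic across runs, unlike Python's salted hash() for str.
    return zlib.crc32(key) % num_processors


def distribute(packets, num_processors):
    """Return one list of packets per processor, keeping each flow together."""
    queues = [[] for _ in range(num_processors)]
    for flow, payload in packets:
        queues[pick_processor(flow, num_processors)].append((flow, payload))
    return queues


flow_a = ("10.0.0.1", "10.0.0.9", 40000, 80, "tcp")
flow_b = ("10.0.0.2", "10.0.0.9", 40001, 443, "tcp")
packets = [(flow_a, b"p1"), (flow_b, b"p2"), (flow_a, b"p3")]
queues = distribute(packets, num_processors=4)
```

Because the index depends only on the flow identifier, packets of one flow also keep their arrival order within that processor's queue.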
Workflow control of reservations and regular jobs using a flexible job scheduler

Patent number: 8453152

Abstract: A scheduler receives at least one flexible reservation request for scheduling in a computing environment comprising consumable resources. The flexible reservation request specifies a duration and at least one required resource. The consumable resources comprise at least one machine resource and at least one floating resource. The scheduler creates a flexible job for the at least one flexible reservation request and places the flexible job in a prioritized job queue for scheduling, wherein the flexible job is prioritized relative to at least one regular job in the prioritized job queue. The scheduler adds a reservation set to a waiting state for the at least one flexible reservation request.

Type: Grant

Filed: February 1, 2011

Date of Patent: May 28, 2013

Assignee: International Business Machines Corporation

Inventors: Alexander Druyan, Wei Li, Kailash N. Marthi, Yun T. Xiang, Linda C. Cham
Information processing device, information processing method, and recording medium

Patent number: 8448174

Abstract: An information processing device which has a plurality of process units for performing various kinds of processes includes a detecting unit that detects processing loads of the process units; a determining unit that determines whether a total amount of the processing loads detected by the detecting unit is equal to or larger than a specific value; a designating unit that designates a process unit having a process state to be controlled, based on the processing loads of the process units detected by the detecting unit, when the determining unit determines that the total amount is equal to or larger than the specific value; a process identifying unit that identifies a process having an execution state to be controlled among processes being performed by the process unit designated by the designating unit; and a control unit that controls the execution state of the process identified by the process identifying unit.

Type: Grant

Filed: January 22, 2010

Date of Patent: May 21, 2013

Assignee: Fujitsu Limited

Inventors: Ryo Miyamoto, Ryuichi Matsukura, Takashi Ohno
Microprocessor with first processor for debugging second processor

Patent number: 8443175

Abstract: A microprocessor integrated circuit includes first and second processors, an internal memory accessible by the first and second processors, and a bus interface unit configured to interface to a bus external to the microprocessor for providing access to a memory external to the microprocessor. The bus interface unit, external bus, and external memory are accessible by the second processor but are inaccessible by the first processor. The first processor writes debug information to the internal memory. The first processor detects an event and provides a notification of the event to the second processor. The second processor, coupled to the bus interface unit, executes microcode in response to the event notification received from the first processor. The microcode reads the debug information from the internal memory and writes the debug information to the external memory via the bus interface unit and external bus for use in debugging the second processor.

Type: Grant

Filed: March 29, 2010

Date of Patent: May 14, 2013

Assignee: VIA Technologies, Inc.

Inventors: G. Glenn Henry, Jui-Shuan Chen
COPROCESSOR HAVING TASK SEQUENCE CONTROL

Publication number: 20130117533

Abstract: A coprocessor has: a processing unit for processing tasks in a data-processing system subject to at least one master processor; at least one storage module having memory areas, assignable in each case to the tasks, for storing data assigned to the tasks; and a buffer area for buffering instructions assigned to the tasks, the instructions including processing instructions, and upon retrieval of the processing instructions from the buffer area, the data stored in the storage module being processed on the basis of the processing instructions.

Type: Application

Filed: April 6, 2011

Publication date: May 9, 2013

Inventor: Jan Hayek
Main processing element for delegating virtualized control threads controlling clock speed and power consumption to groups of sub-processing elements in a system such that a group of sub-processing elements can be designated as pseudo main processing element

Patent number: 8438404

Abstract: The disclosure is applied to a generic microprocessor architecture with a set (e.g., one or more) of controlling elements (e.g., MPEs) and a set of groups of sub-processing elements (e.g., SPEs). Under this arrangement, MPEs and SPEs are organized in a way that a smaller number of MPEs control the behavior of a group of SPEs using program code embodied as a set of virtualized control threads. The arrangement also enables MPEs to delegate functionality to one or more groups of SPEs such that those group(s) of SPEs will act as pseudo MPEs. The pseudo MPEs will utilize pseudo virtualized control threads to control the behavior of other groups of SPEs. In a typical embodiment, the apparatus includes an MCP coupled to a power supply coupled with cores to provide a supply voltage to each core (or core group) and controlling-digital elements and multiple instances of sub-processing elements.

Type: Grant

Filed: September 30, 2008

Date of Patent: May 7, 2013

Assignee: International Business Machines Corporation

Inventors: Karl J. Duvalsaint, Harm P. Hofstee, Daeik Kim, Moon J. Kim
Method, Apparatus, And System For Optimizing Frequency And Performance In A Multidie Microprocessor

Publication number: 20130103928

Abstract: With the progress toward multi-core processors, each core cannot readily ascertain the status of the other dies with respect to an idle or active status. A proposal for utilizing an interface to transmit core status among multiple cores in a multi-die microprocessor is discussed. Consequently, this facilitates thermal management by allowing an optimal setting for setting performance and frequency based on utilizing each core status.

Type: Application

Filed: December 11, 2012

Publication date: April 25, 2013

Inventors: Jose P. Allarey, Varghese George, Sanjeev Jahagirdar, Oren Lamdan, Nathan Ofer, Tomer Ziv
CHARACTERIZATION AND VALIDATION OF PROCESSOR LINKS

Publication number: 20130103927

Abstract: A processor link that couples a first processor and a second processor is selected for validation and a plurality of communication parameter settings associated with the first and the second processors is identified. The first and the second processors are successively configured with each of the communication parameter settings. One or more test data pattern(s) are provided from the first processor to the second processor in accordance with the communication parameter setting. Performance measurements associated with the selected processor link and with the communication parameter setting are determined based, at least in part, on the test data pattern as received at the second processor. One of the communication parameter settings that is associated with the highest performance measurements is selected. The selected communication parameter setting is applied to the first and the second processors for subsequent communication between the first and the second processors via the processor link.

Type: Application

Filed: October 25, 2011

Publication date: April 25, 2013

Applicant: International Business Machines Corporation

Inventors: Robert W. Berry, JR., Anand Haridass, Prasanna Jayaraman
UNIFIED, WORKLOAD-OPTIMIZED, ADAPTIVE RAS FOR HYBRID SYSTEMS

Publication number: 20130097407

Abstract: A method, system, and computer program product for maintaining reliability in a computer system. In an example embodiment, the method includes managing workloads on a first processor with a first processor architecture by an agent process executing on a second processor with a second processor architecture. The method proceeds by activating redundant computation on the second processor by the agent process. The method continues by performing a same computation from a workload of the workloads at least twice. Finally, the method includes comparing results of the same computation. In this embodiment the first processor is coupled to the second processor by a network, and the first processor architecture and second processor architecture are different architectures.

Type: Application

Filed: December 8, 2012

Publication date: April 18, 2013

Applicant: International Business Machines Corporation

Inventor: International Business Machines Corporation
CLUSTER COMPUTING USING SPECIAL PURPOSE MICROPROCESSORS

Publication number: 20130097406

Abstract: In some embodiments, a computer cluster system comprises a plurality of nodes and a software package comprising a user interface and a kernel for interpreting program code instructions. In certain embodiments, a cluster node module is configured to communicate with the kernel and other cluster node modules. The cluster node module can accept instructions from the user interface and can interpret at least some of the instructions such that several cluster node modules in communication with one another and with a kernel can act as a computer cluster.

Type: Application

Filed: March 16, 2012

Publication date: April 18, 2013

Inventors: Zvi Tannenbaum, Dean E. Dauger
Sequential processing in network on chip nodes by threads generating message containing payload and pointer for nanokernel to access algorithm to be executed on payload in another node

Patent number: 8423749

Abstract: A computer-implemented method, system and computer program product for controlling an algorithm that is performed on a unit of work in a subsequent software pipeline stage in a Network On a Chip (NOC) is presented. In one embodiment, the method executes a first operation in a first node of the NOC. The first node generates payload, and then loads that payload into a message. The message with the payload is transmitted to a nanokernel that controls a second node in the NOC. The nanokernel calls an algorithm that is needed by a second operation in a second node in the NOC, which uses the algorithm to execute the second operation.

Type: Grant

Filed: October 22, 2008

Date of Patent: April 16, 2013

Assignee: International Business Machines Corporation

Inventors: Eric O. Mejdrich, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs
PARALLEL COMPUTER ARCHITECTURE FOR COMPUTATION OF PARTICLE INTERACTIONS

Publication number: 20130091341

Abstract: A computation system for computing interactions in a multiple-body simulation includes an array of processing modules arranged into one or more serially interconnected processing groups of the processing modules. Each of the processing modules includes storage for data elements and includes circuitry for performing pairwise computations between data elements each associated with a spatial location. Each of the pairwise computations makes use of a data element from the storage of the processing module and a data element passing through the serially interconnected processing modules. Each of the processing modules includes circuitry for selecting the pairs of data elements according to separations between spatial locations associated with the data elements.

Type: Application

Filed: November 19, 2012

Publication date: April 11, 2013

Applicant: D.E. Shaw Research LLC

Inventors: David E. Shaw, Martin M. Deneroff, Ron O. Dror, Richard H. Larson, John K. Salmon
Assigning different serialization identifier to operations on different data set for execution in respective processor in multi-processor system

Patent number: 8417919

Abstract: A method of dynamic parallelization in a multi-processor identifies potentially independent computational operations, such as functions and methods, with a serializer that assigns a computational operation to a serialization set and a processor based on assessment of the data that the computational operation will be accessing upon execution.

Type: Grant

Filed: August 18, 2009

Date of Patent: April 9, 2013

Assignee: Wisconsin Alumni Research Foundation

Inventors: Matthew Allen, Gurindar S. Sohi
Distributed Data Scalable Adaptive Map-Reduce Framework

Publication number: 20130086355

Abstract: A method, an apparatus and an article of manufacture for generating a distributed data scalable adaptive map-reduce framework for at least one multi-core cluster. The method includes partitioning a cluster into at least one computational group, determining at least one key-group leader within each computational group, performing a local combine operation at each computational group, performing a global combine operation at each of the at least one key-group leader within each computational group based on a result from the local combine operation, and performing a global map-reduce operation across the at least one key-group leader within each computational group.

Type: Application

Filed: September 30, 2011

Publication date: April 4, 2013

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Ankur Narang, Jyothish Soman
Distributed Data Scalable Adaptive Map-Reduce Framework

Publication number: 20130086356

Abstract: A method for generating a distributed data scalable adaptive map-reduce framework for at least one multi-core cluster. The method includes partitioning a cluster into at least one computational group, determining at least one key-group leader within each computational group, performing a local combine operation at each computational group, performing a global combine operation at each of the at least one key-group leader within each computational group based on a result from the local combine operation, and performing a global map-reduce operation across the at least one key-group leader within each computational group.

Type: Application

Filed: August 1, 2012

Publication date: April 4, 2013

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Ankur Narang, Jyothish Soman
PERFORMING A DETERMINISTIC REDUCTION OPERATION IN A PARALLEL COMPUTER

Publication number: 20130073832

Abstract: A parallel computer that includes compute nodes having computer processors and a CAU (Collectives Acceleration Unit) that couples processors to one another for data communications. In embodiments of the present invention, a deterministic reduction operation includes: organizing processors of the parallel computer and a CAU into a branched tree topology, where the CAU is a root of the branched tree topology and the processors are children of the root CAU; establishing a receive buffer that includes receive elements associated with processors and configured to store the associated processor's contribution data; receiving, in any order from the processors, each processor's contribution data; tracking receipt of each processor's contribution data; and reducing the contribution data in a predefined order, only after receipt of contribution data from all processors in the branched tree topology.

Type: Application

Filed: November 1, 2012

Publication date: March 21, 2013

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor: International Business Machines Corporation
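
The steps in the abstract of publication 20130073832 (receive in any order, track receipt, reduce in a predefined order once everything has arrived) can be sketched as below. This is an illustration only: the class `DeterministicReducer`, the use of rank order as the predefined order, and floating-point addition as the reduction are assumptions, and the CAU and tree topology are not modeled.

```python
class DeterministicReducer:
    """Hypothetical sketch: one receive element per processor, reduced in rank order."""

    def __init__(self, num_processors):
        self.slots = [None] * num_processors     # receive buffer
        self.arrived = [False] * num_processors  # receipt tracking

    def receive(self, rank, value):
        # Contributions may arrive in any order; each lands in its own slot.
        self.slots[rank] = value
        self.arrived[rank] = True

    def reduce(self):
        # Reduce only after every processor has contributed.
        if not all(self.arrived):
            return None
        total = 0.0
        for value in self.slots:  # fixed order, independent of arrival order
            total += value
        return total


def run(arrival_order, contributions):
    reducer = DeterministicReducer(len(contributions))
    for rank in arrival_order:
        reducer.receive(rank, contributions[rank])
    return reducer.reduce()


# Floating-point addition is not associative, so summing in arrival order
# could give different answers; summing in slot order gives the same one.
contributions = [1e16, 1.0, -1e16, 1.0]
result_a = run([0, 1, 2, 3], contributions)
result_b = run([3, 2, 1, 0], contributions)
```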
COMPRESSING RESULT DATA FOR A COMPUTE NODE IN A PARALLEL COMPUTER

Publication number: 20130067198

Abstract: A parallel computer is provided that includes a collection of compute nodes organized as a tree, including: initiating a collective gather operation by a logical root of the collection of compute nodes, including adding result data of the logical root to a gather buffer; for each compute node in the collection of compute nodes, determining whether result data of the compute node is already written in the gather buffer; and if the result data of the compute node is already written in the gather buffer, incrementing a counter assigned to that result data already written in the gather buffer; and if the result data of the compute node is not already written in the gather buffer, writing the result data of the compute node as new result data in the gather buffer, incrementing a counter assigned to that new result data, and writing in the gather buffer a node ID.

Type: Application

Filed: November 1, 2012

Publication date: March 14, 2013

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor: International Business Machines Corporation
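
The counting scheme in the abstract of publication 20130067198 can be sketched as follows. The dictionary layout and the name `compressed_gather` are assumptions, and the tree organization of the compute nodes is not modeled. Following the abstract's wording, a node ID is written only when result data is new to the buffer.

```python
def compressed_gather(node_results):
    """Hypothetical sketch of a gather that stores each distinct result once.

    node_results: iterable of (node_id, result_data) pairs, logical root first.
    result_data must be hashable in this sketch.
    """
    gather_buffer = {}
    for node_id, data in node_results:
        entry = gather_buffer.get(data)
        if entry is not None:
            # Result already written: only bump its counter.
            entry["count"] += 1
        else:
            # New result: write it, start its counter, and record a node ID.
            gather_buffer[data] = {"count": 1, "node_id": node_id}
    return gather_buffer


gathered = compressed_gather([(0, "ok"), (1, "ok"), (2, "fail"), (3, "ok")])
```

When most nodes report the same result, the buffer holds a few entries with counters instead of one entry per node.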
Massively Multicore Processor and Operating System to Manage Strands in Hardware

Publication number: 20130061078

Abstract: A computing apparatus and corresponding method for operating are disclosed. The computing apparatus may comprise a set of interconnected central processing units (CPUs). Each CPU may embed an operating system including a kernel comprising a protocol stack. At least one of the CPUs may further embed executable instructions for allocating multiple strands among the rest of the CPUs. The protocol stack may comprise a Transmission Control Protocol/Internet Protocol (TCP/IP), a User Datagram Protocol/Internet Protocol (UDP/IP) stack, an Internet Control Message Protocol (ICMP) stack or any other suitable Internet protocol. The method for operating the computing apparatus may comprise receiving input/output (I/O) requests, generating multiple strands according to the I/O requests, and allocating the multiple strands to one or more CPUs.

Type: Application

Filed: December 21, 2011

Publication date: March 7, 2013

Inventor: Ian Henry Stuart Cullimore
System and Apparatus Modeling Processor Workloads Using Virtual Pulse Chains

Publication number: 20130060555

Abstract: Methods and apparatus for controlling at least two processing cores in a multi-processor device or system include accessing an operating system run queue to generate virtual pulse trains for each core and correlating the virtual pulse trains to identify patterns of interdependence. The correlated information may be used to determine dynamic frequency/voltage control settings for the first and second processing cores to provide a performance level that accommodates interdependent processes, threads and processing cores.

Type: Application

Filed: February 27, 2012

Publication date: March 7, 2013

Applicant: QUALCOMM INCORPORATED

Inventors: Steven S. Thomson, Edoardo Regini, Mriganka Mondal, Nishant Hariharan
Dynamic thread pool management

Patent number: 8381216

Abstract: Dynamically managing a thread pool associated with a plurality of sub-applications. A request for at least one of the sub-applications is received. A quantity of threads currently assigned to the at least one of the sub-applications is determined. The determined quantity of threads is compared to a predefined maximum thread threshold. A thread in the thread pool is assigned to handle the received request if the determined quantity of threads is not greater than the predefined maximum thread threshold. Embodiments enable control of the quantity of threads within the thread pool assigned to each of the sub-applications. Further embodiments manage the threads for the sub-applications based on latency of the sub-applications.

Type: Grant

Filed: March 5, 2010

Date of Patent: February 19, 2013

Assignee: Microsoft Corporation

Inventor: Rohith Thammana Gowda
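
The admission check in the abstract of patent 8381216 can be sketched as a per-sub-application counter. The class `SubAppThreadLimiter` is hypothetical, no real threads are created, and the latency-based management mentioned in the abstract is omitted. The comparison follows the abstract's wording ("not greater than"); a stricter check would cap the count at exactly the threshold.

```python
class SubAppThreadLimiter:
    """Hypothetical sketch: tracks threads assigned to each sub-application."""

    def __init__(self, max_threads):
        self.max_threads = max_threads
        self.assigned = {}  # sub-application name -> threads currently assigned

    def try_assign(self, sub_app):
        current = self.assigned.get(sub_app, 0)
        # Per the abstract: assign if the quantity is not greater than the maximum.
        if current > self.max_threads:
            return False
        self.assigned[sub_app] = current + 1
        return True

    def release(self, sub_app):
        current = self.assigned.get(sub_app, 0)
        if current > 0:
            self.assigned[sub_app] = current - 1


limiter = SubAppThreadLimiter(max_threads=1)
decisions = [limiter.try_assign("search") for _ in range(3)]
other = limiter.try_assign("mail")  # a different sub-application is unaffected
```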
Collective Operation Protocol Selection In A Parallel Computer

Publication number: 20130042088

Abstract: Collective operation protocol selection in a parallel computer that includes compute nodes may be carried out by calling a collective operation with operating parameters; selecting a protocol for executing the operation and executing the operation with the selected protocol. Selecting a protocol includes: iteratively, until a prospective protocol meets predetermined performance criteria: providing, to a protocol performance function for the prospective protocol, the operating parameters; determining whether the prospective protocol meets predefined performance criteria by evaluating a predefined performance fit equation, calculating a measure of performance of the protocol for the operating parameters; determining that the prospective protocol meets predetermined performance criteria and selecting the protocol for executing the operation only if the calculated measure of performance is greater than a predefined minimum performance threshold.

Type: Application

Filed: August 9, 2011

Publication date: February 14, 2013

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Charles J. Archer, Michael A. Blocksome, Joseph D. Ratterman, Brian E. Smith
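
The selection loop in the abstract of publication 20130042088 can be sketched as trying candidate protocols in turn until one exceeds a minimum performance threshold. Everything concrete here is an assumption: the function `select_protocol`, the protocol names, and especially the two "fit" functions, which are made-up stand-ins for the predefined performance fit equations.

```python
def select_protocol(candidates, operating_params, min_performance):
    """Return (name, score) of the first protocol whose score exceeds the threshold."""
    for name, performance_fn in candidates:
        score = performance_fn(**operating_params)
        if score > min_performance:
            return name, score
    return None, None  # no candidate met the criteria


# Invented fit functions, for illustration only.
def ring_fit(message_bytes, num_nodes):
    return message_bytes / (num_nodes * 1000.0)


def tree_fit(message_bytes, num_nodes):
    return 2.0 - message_bytes / 1_000_000.0


candidates = [("ring", ring_fit), ("tree", tree_fit)]
small = select_protocol(candidates, {"message_bytes": 1024, "num_nodes": 64}, 1.0)
large = select_protocol(candidates, {"message_bytes": 512_000, "num_nodes": 64}, 1.0)
```

With these invented functions, the small message falls through to the second candidate while the large message is accepted by the first.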
Performing an allreduce operation on a plurality of compute nodes of a parallel computer

Patent number: 8375197

Abstract: Methods, apparatus, and products are disclosed for performing an allreduce operation on a plurality of compute nodes of a parallel computer, each node including at least two processing cores, that include: performing, for each node, a local reduction operation using allreduce contribution data for the cores of that node, yielding, for each node, a local reduction result for one or more representative cores for that node; establishing one or more logical rings among the nodes, each logical ring including only one of the representative cores from each node; performing, for each logical ring, a global allreduce operation using the local reduction result for the representative cores included in that logical ring, yielding a global allreduce result for each representative core included in that logical ring; and performing, for each node, a local broadcast operation using the global allreduce results for each representative core on that node.

Type: Grant

Filed: May 21, 2008

Date of Patent: February 12, 2013

Assignee: International Business Machines Corporation

Inventor: Ahmad Faraj
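
The three phases in the abstract of patent 8375197 (local reduction, ring allreduce among representative cores, local broadcast) can be simulated in a few lines. This sketch assumes a sum reduction, one representative core per node, and a single logical ring; the function name `allreduce` is hypothetical and nothing here runs in parallel.

```python
def allreduce(nodes):
    """Simulate a hierarchical allreduce (sum) over a list of per-node core values."""
    # Phase 1: each node reduces the contributions of its own cores.
    local = [sum(cores) for cores in nodes]
    n = len(local)

    # Phase 2: representatives form a ring. In each of n - 1 steps every
    # representative forwards the value it last received to its neighbor
    # and adds the incoming value to its running total.
    totals = list(local)
    in_flight = list(local)
    for _ in range(n - 1):
        in_flight = [in_flight[(i - 1) % n] for i in range(n)]
        totals = [t + v for t, v in zip(totals, in_flight)]

    # Phase 3: each representative broadcasts the global result to its cores.
    return [[totals[i]] * len(cores) for i, cores in enumerate(nodes)]


result = allreduce([[1, 2], [3, 4], [5, 6]])
```

After the ring phase every representative holds the global sum, so the final broadcast is purely local to each node.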
Computer architecture for a mobile communication platform

Patent number: 8370605

Abstract: A system includes first and second processors, first and second graphics processing units (GPUs), one or more peripheral devices, a switch matrix, and processor-readable memory. The switch matrix comprises programmable data paths between the processors, the GPUs, and the peripheral devices. Software encoded in the processor-readable memory includes a first operating system (OS) executed by the first processor, a second OS executed by the second processor, a matrix scheduling engine, and a media interface switch (MIS) engine. The first OS boots faster than the second OS. The matrix scheduling engine runs on both OSs and configures the data paths in the switch matrix to couple the processors and the GPUs, and to couple the processors and the peripheral devices. The MIS engine runs on the operating systems, detects presence of the peripheral devices, and configures the data paths in the switch matrix to couple the processors and the peripheral devices.

Type: Grant

Filed: November 11, 2009

Date of Patent: February 5, 2013

Assignee: Sunman Engineering, Inc.

Inventors: Allen Nejah, Gholam Reza Golshan, George W. Harvey
USING PREDICTIVE DETERMINISM WITHIN A STREAMING ENVIRONMENT

Publication number: 20130031335

Abstract: Techniques are described for transmitting predicted output data on a processing element in a stream computing application instead of processing currently received input data. The stream computing application monitors the output of a processing element and determines whether its output is predictable, for example, if the previously transmitted output values are within a predefined range or if one or more input values correlate with the same one or more output values. The application may then generate a predicted output value to transmit from the processing element instead of transmitting a processed output value based on current input values. The predicted output value may be, for example, an average of the previously transmitted output values or a previously transmitted output value that was transmitted in response to a previously received input value that is similar to a currently received input value.

Type: Application

Filed: July 26, 2011

Publication date: January 31, 2013

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: John M. Santosuosso, Brandon W. Schulz
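
One of the cases in the abstract of publication 20130031335, where recent outputs fall within a predefined range and their average is transmitted instead of a processed value, can be sketched as follows. The class `PredictiveElement`, the window size, and the spread test are assumptions. The sketch never re-validates a prediction once it starts predicting, which a practical system would likely need.

```python
from collections import deque


class PredictiveElement:
    """Hypothetical sketch of a processing element that may emit predicted output."""

    def __init__(self, process, window=4, max_spread=1.0):
        self.process = process               # the real (expensive) operator
        self.history = deque(maxlen=window)  # previously transmitted outputs
        self.max_spread = max_spread         # width of the "predictable" range

    def is_predictable(self):
        return (len(self.history) == self.history.maxlen
                and max(self.history) - min(self.history) <= self.max_spread)

    def emit(self, value):
        """Return (output, was_predicted)."""
        if self.is_predictable():
            # Skip processing and transmit the average of recent outputs.
            return sum(self.history) / len(self.history), True
        output = self.process(value)
        self.history.append(output)
        return output, False


calls = []


def double(x):
    calls.append(x)
    return 2 * x


element = PredictiveElement(double, window=3, max_spread=1.0)
outputs = [element.emit(v) for v in (5.0, 5.1, 5.2, 100.0)]
```

The fourth input is never processed: the element transmits the average of the three earlier outputs, trading accuracy for processing time.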
Automatically Routing Super-Compute Interconnects

Publication number: 20130031334

Abstract: A mechanism is provided for automatically routing network interconnects in a data processing system. A processor in a node of a plurality of nodes receives network topology from neighboring nodes in the plurality of nodes within the data processing system. The processor constructs a system node map that identifies a physical connectivity between the node and the neighboring nodes. The processor programs a switch in the node with a connectivity map that indicates a set of point-to-point connections with the neighboring nodes. The set of point-to-point connections comprises locally-connected connections and pass-through connections.

Type: Application

Filed: July 25, 2011

Publication date: January 31, 2013

Applicant: International Business Machines Corporation

Inventors: Wael R. El-Essawy, David A. Papa, Jarrod A. Roy
EXTERNAL INTRINSIC INTERFACE

Publication number: 20130031336

Abstract: An external intrinsic interface. A processor may include a core including a plurality of functional units, an intrinsic module located outside the core, and an interface module to perform relaying between the intrinsic module and a functional unit, among the plurality of functional units.

Type: Application

Filed: February 16, 2012

Publication date: January 31, 2013

Applicant: Samsung Electronics Co., Ltd.

Inventors: Kwon Taek KWON, Seok Yoon Jung
MEMORY CONTROLLER AND SIMD PROCESSOR

Publication number: 20130024658

Abstract: Technology to suppress the drop in SIMD processor efficiency that occurs when exchanging two-dimensional data in a plurality of rectangular regions, between an external section and a plurality of processor elements in an SIMD processor, so that one rectangular region corresponds to one processor element. In the SIMD processor, an address storage unit in a memory controller is capable of setting N number of addresses Ai (i=1 through N) in an external memory by utilizing a control processor. A parameter storage unit is capable of setting a first parameter OSV, a second parameter W, and a third parameter L by utilizing a control processor. A data transfer unit executes the transfer of data between an external memory, and the buffers in N number of processor elements contained in the applicable SIMD processor, based on the contents of the address storage unit and the parameter storage unit.

Type: Application

Filed: July 3, 2012

Publication date: January 24, 2013

Inventor: Shorin KYO
Executing An Instruction for Performing a Configuration Virtual Topology Change

Publication number: 20130024659

Abstract: In a logically partitioned host computer system comprising host processors (host CPUs) partitioned into a plurality of guest processors (guest CPUs) of a guest configuration, a perform topology function instruction is executed by a guest processor specifying a topology change of the guest configuration. The topology change preferably changes the polarization of guest CPUs, the polarization being related to the amount of a host CPU resource that is provided to a guest CPU.

Type: Application

Filed: September 27, 2012

Publication date: January 24, 2013

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor: International Business Machines Corporation
METHOD AND APPARATUS FOR A HIERARCHICAL SYNCHRONIZATION BARRIER IN A MULTI-NODE SYSTEM

Publication number: 20130013891

Abstract: A hierarchical barrier synchronization of cores and nodes on a multiprocessor system, in one aspect, may include providing by each of a plurality of threads on a chip, input bit signal to a respective bit in a register, in response to reaching a barrier; determining whether all of the plurality of threads reached the barrier by electrically tying bits of the register together and “AND”ing the input bit signals; determining whether only on-chip synchronization is needed or whether inter-node synchronization is needed; in response to determining that all of the plurality of threads on the chip reached the barrier, notifying the plurality of threads on the chip, if it is determined that only on-chip synchronization is needed; and after all of the plurality of threads on the chip reached the barrier, communicating the synchronization signal to outside of the chip, if it is determined that inter-node synchronization is needed.

Type: Application

Filed: September 13, 2012

Publication date: January 10, 2013

Applicant: International Business Machines Corporation

Inventors: Valentina Salapura, Robert W. Wisniewski
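
The on-chip stage of the barrier in the abstract above can be illustrated in software. This is only a minimal sketch: the abstract describes hardware that ties register bits together and ANDs them, whereas the class below models the same logic with an integer bitmask. The names `ChipBarrier`, `arrive`, and the returned status strings are hypothetical and do not come from the application.

```python
from dataclasses import dataclass


@dataclass
class ChipBarrier:
    """Software model of a per-chip barrier register (illustrative only)."""
    num_threads: int
    inter_node: bool = False  # assumed flag: is inter-node synchronization needed?
    register: int = 0         # one bit per thread, set when the thread arrives

    def arrive(self, thread_id: int) -> str:
        if not 0 <= thread_id < self.num_threads:
            raise ValueError("unknown thread id")
        # The arriving thread sets its own bit in the register.
        self.register |= 1 << thread_id
        # "AND" of all bits: the barrier is complete only if every bit is set.
        full_mask = (1 << self.num_threads) - 1
        if self.register & full_mask != full_mask:
            return "waiting"
        self.register = 0  # reset for the next barrier episode
        # Either notify local threads or pass the signal off the chip.
        return "signal_off_chip" if self.inter_node else "notify_threads"
```

With three threads and on-chip synchronization only, the first two arrivals report "waiting" and the last one reports "notify_threads".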
MULTI-CORE IMAGE PROCESSOR FOR PORTABLE DEVICE

Publication number: 20130013839

Abstract: A portable handheld device including a CPU for processing a script; a multi-core processor for processing an image; an input buffer for receiving data for processing by the multi-core processor, the input buffer being provided under the control of the multi-core processor to send data thereto; and an output buffer for receiving data processed by the multi-core processor, the output buffer being provided under the control of the multi-core processor to receive data therefrom. The multi-core processor comprises a plurality of micro-coded processing units. The CPU is configured with authority to clear and query the input and output buffers.

Type: Application

Filed: September 15, 2012

Publication date: January 10, 2013

Inventor: Kia Silverbrook
System and Apparatus For Consolidated Dynamic Frequency/Voltage Control

Publication number: 20130007413

Abstract: Methods and apparatus for accomplishing dynamic frequency/voltage control between at least two processor cores in a multi-processor device or system include receiving busy, idle, wait, time and/or frequency information from a first processor core and receiving busy, idle, wait, time and/or frequency information from a second processor core. The received busy, idle, wait, time and/or frequency information may be correlated to identify patterns of interdependence. The correlated information may be used to determine dynamic frequency/voltage control settings for the first and second processor cores to provide a performance level that accommodates interdependent processes, threads and processor cores. The correlation of received busy, idle, wait, time and/or frequency information may involve generating a consolidated busy/idle pulse train that can then be used to set the frequency or voltage of each processor core independently.

Type: Application

Filed: January 5, 2012

Publication date: January 3, 2013

Applicant: QUALCOMM INCORPORATED

Inventors: Steven S. Thomson, Mriganka Mondal, Nishant Hariharan
UNIFIED, WORKLOAD-OPTIMIZED, ADAPTIVE RAS FOR HYBRID SYSTEMS

Publication number: 20130007412

Abstract: A method, system, and computer program product for maintaining reliability in a computer system. In an example embodiment, the method includes managing workloads on a first processor with a first processor architecture by an agent process executing on a second processor with a second processor architecture. The method proceeds by activating redundant computation on the second processor by the agent process. The method continues by performing a same computation from a workload of the workloads at least twice. Finally, the method includes comparing results of the same computation. In this embodiment the first processor is coupled to the second processor by a network, and the first processor architecture and second processor architecture are different architectures.

Type: Application

Filed: June 28, 2011

Publication date: January 3, 2013

Applicant: International Business Machines Corporation

Inventors: Rajaram B. Krishnamurthy, Carl J. Parris, Donald W. Schmidt, Benjamin P. Segal
Compressing Result Data For A Compute Node In A Parallel Computer

Publication number: 20120331270

Abstract: Compressing result data for a compute node in a parallel computer, the parallel computer including a collection of compute nodes organized as a tree, including: initiating a collective gather operation by a logical root of the collection of compute nodes, including adding result data of the logical root to a gather buffer; for each compute node in the collection of compute nodes, determining whether result data of the compute node is already written in the gather buffer; and if the result data of the compute node is already written in the gather buffer, incrementing a counter assigned to that result data already written in the gather buffer; and if the result data of the compute node is not already written in the gather buffer, writing the result data of the compute node as new result data in the gather buffer, incrementing a counter assigned to that new result data, and writing in the gather buffer a node ID.

Type: Application

Filed: June 22, 2011

Publication date: December 27, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Charles J. Archer, James E. Carey, Matthew W. Markland, Philip J. Sanders
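
The gather-buffer bookkeeping in the abstract above can be sketched as follows. This is a simplified illustration, not the claimed method: it assumes result data is hashable, ignores the tree organization of the compute nodes, and records only the ID of the first node that contributed each distinct result. The function name `compress_gather` and the dictionary layout are hypothetical.

```python
def compress_gather(results):
    """Deduplicate gathered results, counting how often each value occurs.

    `results` is an iterable of (node_id, data) pairs, with the logical
    root's own result first. Returns a dict mapping each distinct result
    to its counter and the ID of the node that first wrote it.
    """
    gather_buffer = {}
    for node_id, data in results:
        entry = gather_buffer.get(data)
        if entry is not None:
            # Result already written: only increment its counter.
            entry["count"] += 1
        else:
            # New result: write it once, start its counter, record a node ID.
            gather_buffer[data] = {"count": 1, "node_id": node_id}
    return gather_buffer
```

For example, if nodes 0, 1, and 3 all report "ok" and node 2 reports "err", the buffer holds two entries instead of four.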
COMPUTER-IMPLEMENTED METHOD OF PROCESSING RESOURCE MANAGEMENT

Publication number: 20120324166

Abstract: A computer-implemented method for managing processing resources of a computerized system having at least a first processor and a second processor, each of the processors operatively interconnected to a memory storing a set of data to be processed by a processor, the method comprising: monitoring data accessed by the first processor while executing; and if the second processor is at a shorter distance than the first processor from the monitored data, instructing to interrupt execution at the first processor and resume the execution at the second processor.

Type: Application

Filed: August 30, 2012

Publication date: December 20, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Hillery C Hunter, Ronald P. Luijten, Phillip Stanley-Marbell
METHOD FOR REDUCING BUFFER CAPACITY IN A PIPELINE PROCESSOR

Publication number: 20120317398

Abstract: A method to reduce buffer capacity in a processor includes giving the data packets admittance to the processor through at least one interface, storing the data packets in at least one input buffer, and using a packet rate shaper outside of a processing pipeline to control flow of the data packets to the pipeline before the data packets enter the pipeline. First and second data packets are given admittance to the pipeline in dependence on cost information per packet that is dependent upon an expected time period of residence of the first data packet in the pipeline. Cost information dependent upon an expected time period of residence of the second data packet in the pipeline differs from said cost information dependent upon the expected time period of residence of the first data packet in the pipeline.

Type: Application

Filed: August 15, 2012

Publication date: December 13, 2012

Inventors: Thomas Bodén, Jakob Carlström
Performing A Local Reduction Operation On A Parallel Computer

Publication number: 20120317399

Abstract: A parallel computer including compute nodes, each including two reduction processing cores, a network write processing core, and a network read processing core, each processing core assigned an input buffer. Copying, in interleaved chunks by the reduction processing cores, contents of the reduction processing cores' input buffers to an interleaved buffer in shared memory; copying, by one of the reduction processing cores, contents of the network write processing core's input buffer to shared memory; copying, by another of the reduction processing cores, contents of the network read processing core's input buffer to shared memory; and locally reducing in parallel by the reduction processing cores: the contents of the reduction processing core's input buffer; every other interleaved chunk of the interleaved buffer; the copied contents of the network write processing core's input buffer; and the copied contents of the network read processing core's input buffer.

Type: Application

Filed: August 15, 2012

Publication date: December 13, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Michael A. Blocksome, Daniel A. Faraj
Performing a local reduction operation on a parallel computer

Patent number: 8332460

Abstract: A parallel computer including compute nodes, each including two reduction processing cores, a network write processing core, and a network read processing core, each processing core assigned an input buffer. Copying, in interleaved chunks by the reduction processing cores, contents of the reduction processing cores' input buffers to an interleaved buffer in shared memory; copying, by one of the reduction processing cores, contents of the network write processing core's input buffer to shared memory; copying, by another of the reduction processing cores, contents of the network read processing core's input buffer to shared memory; and locally reducing in parallel by the reduction processing cores: the contents of the reduction processing core's input buffer; every other interleaved chunk of the interleaved buffer; the copied contents of the network write processing core's input buffer; and the copied contents of the network read processing core's input buffer.

Type: Grant

Filed: April 14, 2010

Date of Patent: December 11, 2012

Assignee: International Business Machines Corporation

Inventors: Michael A. Blocksome, Daniel A. Faraj
MULTIPROCESSOR SYNCHRONIZATION USING REGION LOCKS

Publication number: 20120311300

Abstract: Disclosed is a method of synchronizing accesses by a plurality of processors to at least one shared resource. One of a plurality of processors requests an exclusive region lock for a shared resource using a logical block address (LBA) of a dummy target. The LBA is defined in a region map that associates LBAs to shared resources. The exclusive region lock request is inserted as a node in a region lock tree of the dummy target. Access to the shared resource is granted based on a determination whether there is an existing region lock in the region lock tree that overlaps with the new exclusive region lock request.

Type: Application

Filed: June 1, 2011

Publication date: December 6, 2012

Inventors: Kapil Sundrani, Lakshmi Kanth Reddy Kakanuru
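
The overlap test at the heart of the region-lock abstract above can be sketched in a few lines. This is an illustration under stated assumptions: a plain list stands in for the region lock tree, a conflicting request is simply refused rather than queued, and the names `RegionLockTable`, `request`, and `release` are hypothetical.

```python
class RegionLockTable:
    """Exclusive locks over LBA ranges of a dummy target (illustrative only)."""

    def __init__(self):
        # Each held lock is (start_lba, end_lba_inclusive, owner).
        self._locks = []

    def request(self, owner, start_lba, num_blocks):
        """Grant an exclusive lock unless it overlaps an existing one."""
        end_lba = start_lba + num_blocks - 1
        for held_start, held_end, _ in self._locks:
            # Two inclusive ranges overlap when each starts before the other ends.
            if start_lba <= held_end and held_start <= end_lba:
                return False
        self._locks.append((start_lba, end_lba, owner))
        return True

    def release(self, owner, start_lba):
        """Drop the lock that `owner` holds starting at `start_lba`."""
        self._locks = [
            lock for lock in self._locks
            if not (lock[2] == owner and lock[0] == start_lba)
        ]
```

A region map, as described in the abstract, would translate each shared resource into the `start_lba` and `num_blocks` passed to `request`; that mapping is left out here.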
PIPELINE CONFIGURATION PROTOCOL AND CONFIGURATION UNIT COMMUNICATION

Publication number: 20120311301

Abstract: In a method of synchronizing data processing of a processor arrangement, responsive to reaching, during execution of a program, a barrier included in a program sequence, the processor arrangement halts the program execution until it is determined that all instructions preceding the barrier in the program sequence have been successfully scheduled for execution.

Type: Application

Filed: June 8, 2012

Publication date: December 6, 2012

Inventors: Martin VORBACH, Volker Baumgarte, Gerd Ehlers, Frank May, Armin Nückel
TILE-BASED PROCESSOR ARCHITECTURE MODEL FOR HIGH-EFFICIENCY EMBEDDED HOMOGENEOUS MULTICORE PLATFORMS

Publication number: 20120303933

Abstract: The present invention relates to a processor which comprises processing elements that execute instructions in parallel and are connected together with point-to-point communication links called data communication links (DCL). The instructions use DCLs to communicate data between them. In order to realize those communications, they specify the DCLs from which they take their operands, and the DCLs to which they write their results. The DCLs allow the instructions to synchronize their executions and to explicitly manage the data they manipulate. Communications are explicit and are used to realize the storage of temporary variables, which is decoupled from the storage of long-living variables.

Type: Application

Filed: January 31, 2011

Publication date: November 29, 2012

Inventors: Philippe Manet, Bertrand Rousseau
RUNTIME RECONFIGURABLE DATAFLOW PROCESSOR

Publication number: 20120303932

Abstract: A processor includes a plurality of processing tiles, wherein each tile is configured at runtime to perform a configurable operation. A first subset of tiles are configured to perform in a pipeline a first plurality of configurable operations in parallel. A second subset of tiles are configured to perform a second plurality of configurable operations in parallel with the first plurality of configurable operations. The processor also includes a multi-port memory access module operably connected to the plurality of tiles via a data bus configured to control access to a memory and to provide data to two or more processing tiles simultaneously. The processor also includes a controller operably connected to the plurality of tiles and the multi-port memory access module via a runtime bus. The processor configures the tiles and the multi-port memory access module to execute a computation.

Type: Application

Filed: May 24, 2012

Publication date: November 29, 2012

Inventors: Clément Farabet, Yann LeCun
VIRTUALIZATION IN A MULTI-CORE PROCESSOR (MCP)

Publication number: 20120297164

Abstract: This invention describes an apparatus, computer architecture, method, operating system, compiler, and application program products for MPEs as well as virtualization in a symmetric MCP. The disclosure is applied to a generic microprocessor architecture with a set (e.g., one or more) of controlling elements (e.g., MPEs) and a set of groups of sub-processing elements (e.g., SPEs). Under this arrangement, MPEs and SPEs are organized in a way that a smaller number of MPEs control the behavior of a group of SPEs. The apparatus enables virtualized control threads within MPEs to be assigned to different groups of SPEs for controlling the same. The apparatus further includes a MCP coupled to a power supply coupled with cores to provide a supply voltage to each core (or core group) and controlling-digital elements and multiple instances of sub-processing elements.

Type: Application

Filed: July 31, 2012

Publication date: November 22, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Karl J. Duvalsaint, Harm P. Hofstee, Daeik Kim, Moon J. Kim
DATA PROCESSING APPARATUS AND DATA PROCESSING METHOD

Publication number: 20120290815

Abstract: A data processing apparatus causes multiple processors to carry out a first data process in parallel, and when storing the data processed in parallel in a storage unit, converts the addresses of the data into addresses in the storage unit based on the data cache size of the multiple processors and stores the data. The data stored in the storage unit is then read out, and a second data process is carried out on the read-out data.

Type: Application

Filed: April 10, 2012

Publication date: November 15, 2012

Applicant: CANON KABUSHIKI KAISHA

Inventor: Hirokazu Takahashi
Optimizing execution of single-threaded programs on a multiprocessor managed by compilation

Patent number: 8312455

Abstract: A method for optimizing execution of a single threaded program on a multi-core processor. The method includes dividing the single threaded program into a plurality of discretely executable components while compiling the single threaded program; identifying at least some of the plurality of discretely executable components for execution by an idle core within the multi-core processor; and enabling execution of the at least one of the plurality of discretely executable components on the idle core.

Type: Grant

Filed: December 19, 2007

Date of Patent: November 13, 2012

Assignee: International Business Machines Corporation

Inventors: Robert H. Bell, Jr., Louis Bennie Capps, Jr., Michael A. Paolini, Michael Jay Shapiro
STORAGE SYSTEM COMPRISING MULTIPLE MICROPROCESSORS AND METHOD FOR SHARING PROCESSING IN THIS STORAGE SYSTEM

Publication number: 20120278589

Abstract: The present invention provides a storage system in which each microprocessor is able to execute synchronous processing and asynchronous processing in accordance with the operating status of the storage system. Any one attribute, from among multiple attributes (operating modes) prepared beforehand, is set in each microprocessor in accordance with the operating status of the storage system. The attribute that is set in each microprocessor is regularly reviewed and changed.

Type: Application

Filed: June 17, 2010

Publication date: November 1, 2012

Applicant: HITACHI, LTD.

Inventors: Tomohiro Yoshihara, Shintaro Kudo, Norio Shimozono
RECONFIGURABLE PROCESSING SYSTEM AND METHOD

Publication number: 20120278590

Abstract: A reconfigurable processor is provided. The reconfigurable processor includes a plurality of functional blocks configured to perform corresponding operations. The reconfigurable processor also includes one or more data inputs coupled to the plurality of functional blocks to provide one or more operands to the plurality of functional blocks, and one or more data outputs to provide at least one result outputted from the plurality of functional blocks.

Type: Application

Filed: January 7, 2011

Publication date: November 1, 2012

Applicant: SHANGHAI XIN HAO MICRO ELECTRONICS CO. LTD.

Inventors: Kenneth Chenghao Lin, Zhongmin Zhang, Haoqi Ren
