Array Processor Patents (Class 712/10)

Array processor element interconnection (Class 712/11)

Array processor operation (Class 712/16)

Adjustment of threads for execution based on over-utilization of a domain in a multi-processor system by destroying parallizable group of threads in sub-domains

Patent number: 9043802

Abstract: Embodiments provide various techniques for dynamic adjustment of a number of threads for execution in any domain based on domain utilizations. In a multiprocessor system, the utilization for each domain is monitored. If a utilization of any of these domains changes, then the number of threads for each of the domains determined for execution may also be adjusted to adapt to the change.

Type: Grant

Filed: January 8, 2014

Date of Patent: May 26, 2015

Assignee: NetApp, Inc.

Inventors: Gokul Nadathur, Manpreet Singh, Grace Ho
High performance computing (HPC) node having a plurality of switch coupled processors

Patent number: 9037833

Abstract: A High Performance Computing (HPC) node comprises a motherboard, a switch comprising eight or more ports integrated on the motherboard, and at least two processors operable to execute an HPC job, with each processor communicably coupled to the integrated switch and integrated on the motherboard.

Type: Grant

Filed: December 12, 2012

Date of Patent: May 19, 2015

Assignee: RAYTHEON COMPANY

Inventors: James D. Ballew, Gary R. Early
Multiprocessor system, multiprocessor control method, and multiprocessor integrated circuit

Patent number: 9032407

Abstract: In a multiprocessor system, in general, a processor assigned with a larger amount of tasks is apt to perform a larger amount of communication with other processors assigned with tasks, than a processor assigned with a smaller amount of tasks. Thus in order for each processor to be able to perform the routing process efficiently, tasks are assigned such that, when there are a first processor and a second processor, the number of processors each assigned with one or more tasks and directly connected with the second processor being smaller than the number of processors each assigned with one or more tasks and directly connected with the first processor, the amount of tasks assigned to the first processor is equal to or larger than the amount of tasks assigned to the second processor.

Type: Grant

Filed: May 20, 2010

Date of Patent: May 12, 2015

Assignee: Panasonic Intellectual Property Corporation of America

Inventor: Masahiko Saito
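
Illustrative sketch for patent 9032407 above: the assignment condition in the abstract can be read as a pairwise check over processors. The Python below is only a hedged reading of that one sentence, not the patented method. The names `links`, `tasks`, `active_neighbors`, and `satisfies_assignment_rule` are hypothetical, and the topology is assumed to be an adjacency mapping.

```python
def active_neighbors(proc, links, tasks):
    """Count directly connected processors that hold at least one task."""
    return sum(1 for n in links[proc] if tasks.get(n, 0) > 0)


def satisfies_assignment_rule(links, tasks):
    """Check the pairwise condition from the abstract.

    links: dict mapping each processor to its directly connected processors
           (hypothetical representation of the interconnect).
    tasks: dict mapping each processor to its assigned task amount.
    """
    counts = {p: active_neighbors(p, links, tasks) for p in links}
    for first in links:
        for second in links:
            # If `second` has fewer task-holding neighbors than `first`,
            # then `first` must carry at least as many tasks as `second`.
            if counts[second] < counts[first] and tasks.get(first, 0) < tasks.get(second, 0):
                return False
    return True


# Three processors in a chain: 0 - 1 - 2
chain = {0: [1], 1: [0, 2], 2: [1]}
balanced = satisfies_assignment_rule(chain, {0: 1, 1: 3, 2: 1})
skewed = satisfies_assignment_rule(chain, {0: 3, 1: 1, 2: 1})
```

In the chain example the middle processor has the most task-holding neighbors, so loading it most heavily satisfies the check, while loading an end processor most heavily does not.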
Client-allocatable bandwidth pools

Patent number: 9032077

Abstract: Methods and apparatus for client-allocatable bandwidth pools are disclosed. A system includes a plurality of resources of a provider network and a resource manager. In response to a determination to accept a bandwidth pool creation request from a client for a resource group, where the resource group comprises a plurality of resources allocated to the client, the resource manager stores an indication of a total network traffic rate limit of the resource group. In response to a bandwidth allocation request from the client to allocate a specified portion of the total network traffic rate limit to a particular resource of the resource group, the resource manager initiates one or more configuration changes to allow network transmissions within one or more network links of the provider network accessible from the particular resource at a rate up to the specified portion.

Type: Grant

Filed: June 28, 2012

Date of Patent: May 12, 2015

Assignee: Amazon Technologies, Inc.

Inventors: Matthew D. Klein, Michael David Marr
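
Illustrative sketch for patent 9032077 above: the abstract describes a pool with a total traffic rate limit from which a client assigns portions to individual resources. The Python below is a minimal model of that bookkeeping only. The class name `BandwidthPool`, the Mbps unit, and the rule that allocations may not sum to more than the total are assumptions made for illustration; the abstract does not specify them, and the actual network configuration changes are not modeled.

```python
class BandwidthPool:
    """Hypothetical bookkeeping for a client-allocatable bandwidth pool."""

    def __init__(self, total_mbps, resources):
        self.total_mbps = total_mbps
        # Every resource in the group starts with no explicit allocation.
        self.allocations = {r: 0.0 for r in resources}

    def allocate(self, resource, mbps):
        """Assign `mbps` of the pool's total limit to one resource."""
        if resource not in self.allocations:
            raise KeyError(resource)
        # Assumption: portions given to all resources may not exceed the total.
        others = sum(v for r, v in self.allocations.items() if r != resource)
        if mbps < 0 or others + mbps > self.total_mbps:
            raise ValueError("allocation exceeds the pool's total rate limit")
        self.allocations[resource] = float(mbps)


pool = BandwidthPool(100, ["instance-a", "instance-b"])
pool.allocate("instance-a", 60)
pool.allocate("instance-b", 40)
```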
Active memory command engine and method

Patent number: 9032185

Abstract: A command engine for an active memory receives high level tasks from a host and generates corresponding sets of either DCU commands to a DRAM control unit or ACU commands to a processing array control unit. The DCU commands include memory addresses, which are also generated by the command engine, and the ACU commands include instruction memory addresses corresponding to an address in an array control unit where processing array instructions are stored.

Type: Grant

Filed: May 23, 2012

Date of Patent: May 12, 2015

Assignee: Micron Technology, Inc.

Inventor: Graham Kirsch
Sharing virtual functions in a shared virtual memory between heterogeneous processors of a computing platform

Patent number: 8997113

Abstract: A computing platform may include heterogeneous processors (e.g., a CPU and a GPU) to support sharing of virtual functions between such processors. In one embodiment, a CPU-side vtable pointer used to access a shared object from the CPU may be used to determine a GPU vtable if a GPU-side table exists. In another embodiment, a shared non-coherent region, which may not maintain data consistency, may be created within the shared virtual memory. The CPU-side and GPU-side data stored within the shared non-coherent region may have the same address as seen from the CPU and the GPU side. However, the contents of the CPU-side data may be different from that of the GPU-side data, as shared virtual memory may not maintain coherency during the run-time. In one embodiment, the vptr may be modified to point to the CPU vtable and GPU vtable stored in the shared virtual memory.

Type: Grant

Filed: September 24, 2010

Date of Patent: March 31, 2015

Assignee: Intel Corporation

Inventors: Shoumeng Yan, Sai Luo, Xiaocheng Zhou, Ying Gao, Hu Chen, Bratin Saha
Vector width-aware synchronization-elision for vector processors

Patent number: 8966461

Abstract: A medium, method, and apparatus are disclosed for eliding superfluous function invocations in a vector-processing environment. A compiler receives program code comprising a width-contingent invocation of a function. The compiler creates a width-specific executable version of the program code by determining a vector width of a target computer system and omitting the function from the width-specific executable if the vector width meets one or more criteria. For example, the compiler may omit the function call if the vector width is greater than a minimum size.

Type: Grant

Filed: September 29, 2011

Date of Patent: February 24, 2015

Assignee: Advanced Micro Devices, Inc.

Inventors: Benedict R. Gaster, Lee W. Howes, Mark D. Hummel
Message passing in a cluster-on-chip computing environment

Patent number: 8966222

Abstract: Technologies pertaining to cluster-on-chip computing environments are described herein. More particularly, mechanisms for supporting message passing in such environments are described herein, where cluster-on-chip computing environments do not support hardware cache coherency.

Type: Grant

Filed: December 15, 2010

Date of Patent: February 24, 2015

Assignee: Microsoft Corporation

Inventors: Alexey Pakhunov, Ajith Jayamohan, Suyash Sinha
Multi-chip initialization using a parallel firmware boot process

Patent number: 8954721

Abstract: Mechanisms, in a multi-chip data processing system, for performing a boot process for booting each of a plurality of processor chips of the multi-chip data processing system are provided. With these mechanisms, a multi-chip agnostic isolated boot phase operation is performed, in parallel, to perform an initial boot of each of the plurality of processor chips as if each of the processor chips were an only processor chip in the multi-chip data processing system. A multi-chip aware isolated boot phase operation of each of the processor chips is performed in parallel, where each of the processor chips has its own separately configured address space. In addition, a unified configuration phase operation is performed to select a master processor chip from the plurality of processor chips and configure other processor chips in the plurality of processor chips to operate as slave processor chips that are controlled by the master processor chip.

Type: Grant

Filed: December 8, 2011

Date of Patent: February 10, 2015

Assignee: International Business Machines Corporation

Inventors: Eberhard Amann, Frank Haverkamp, Thomas Huth, Jan Kunigk
Hum generation circuitry

Patent number: 8952727

Abstract: Systems and methods for clock generation and distribution are disclosed. Embodiments include arrangements of synchronization signals implemented using a mesh circuit. The mesh circuit is comprised of a plurality of null convention logic (NCL) gates organized into rings. Each ring shares at least one NCL gate with an adjacent ring. The rings are configured in such a way that each ring in the mesh operates synchronously with the other rings in the mesh.

Type: Grant

Filed: August 19, 2013

Date of Patent: February 10, 2015

Assignee: Wave Semiconductor, Inc.

Inventors: Scott E Johnston, Karl Michael Fant
Arithmetic node including general digital signal processing functions for an adaptive computing machine

Patent number: 8949576

Abstract: An apparatus for processing operations in an adaptive computing environment is provided. The adaptive computing environment including at least one processing node. A node includes a memory configured to receive and store data. The data is received from a programmable interconnection network and stored. The node also includes an execution unit configured to perform a signal processing operation. The operation is performed using data retrieved from the memory and an output result is generated. The output result may be used for further computations or sent directly to the programmable interconnection network for transfer to another processing node in the adaptive computing environment.

Type: Grant

Filed: February 13, 2003

Date of Patent: February 3, 2015

Assignee: NVIDIA Corporation

Inventor: Eugene B. Hogenauer
System structuring method in multiprocessor system and switching execution environment by separating from or rejoining the primary execution environment

Patent number: 8935510

Abstract: For flexibly setting up an execution environment according to contents of processing to be executed while taking stability or a security level into consideration, the multiple processor system includes the execution environment main control unit 10 which determines CPU assignment at the time of deciding CPU assignment, the execution environment sub control unit 20 which controls starting, stopping and switching of an execution environment according to an instruction from the execution environment main control unit 10 to synchronize with the execution environment main control unit 10, and the execution environment management unit 30 which receives input of management information or reference refusal information of shared resources for each CPU 4 or each execution environment 100 to separate the execution environment main control unit 10 from the execution environment sub control units 20a through 20n, or the execution environment sub control units 20a through 20n from each other.

Type: Grant

Filed: November 1, 2007

Date of Patent: January 13, 2015

Assignee: NEC Corporation

Inventors: Hiroaki Inoue, Junji Sakai, Tsuyoshi Abe, Masato Edahiro
Efficient complex multiplication and fast fourier transform (FFT) implementation on the ManArray architecture

Patent number: 8904152

Abstract: Efficient computation of complex multiplication results and very efficient fast Fourier transforms (FFTs) are provided. A parallel array VLIW digital signal processor is employed along with specialized complex multiplication instructions and communication operations between the processing elements which are overlapped with computation to provide very high performance operation. Successive iterations of a loop of tightly packed VLIWs are used allowing the complex multiplication pipeline hardware to be efficiently used. In addition, efficient techniques for supporting combined multiply accumulate operations are described.

Type: Grant

Filed: May 26, 2011

Date of Patent: December 2, 2014

Assignee: Altera Corporation

Inventors: Nikos P. Pitsianis, Gerald George Pechanek, Ricardo Rodriguez
MAC processor architecture

Patent number: 8897293

Abstract: In a media access control (MAC) processor, a programmable controller is configured to execute machine readable instructions for implementing MAC functions corresponding to data received by a communication device. A tightly coupled memory is associated with the programmable controller. A system memory is coupled to the programmable controller via a system bus, and a hardware processor is coupled to the system bus and the tightly coupled memory. The hardware processor is configured to implement MAC functions on data received in a communication frame, store, in the tightly coupled memory, processed data corresponding to data in the communication frame that indicates a structure of downlink data in the communication frame, and store, in the system memory, processed data corresponding to other data in the communication frame.

Type: Grant

Filed: May 7, 2012

Date of Patent: November 25, 2014

Assignee: Marvell International Ltd.

Inventors: Bhaskar Chowdhuri, Srikanth Shubhakoti, Vinod Ananth, Hongyu Xie, Shui Cheong Lee
Method for the interoperation of virtual organizations

Patent number: 8892624

Abstract: A cooperative data stream processing system is provided that utilizes a plurality of independent, autonomous and possibly heterogeneous sites in a cooperative arrangement to process user-defined job requests over dynamic, continuous streams of data. A method is provided to organize the distributed sites into a plurality of virtual organizations that can be further combined and virtualized into virtualized virtual organizations. These virtualized virtual organizations can also include additional distributed sites and existing virtualized virtual organizations, and all members of a given virtualized virtual organization can share data and processing resources in order to process jobs on either a task-based or goal-based allocation mechanism. The virtualized virtual organization is created dynamically using ad-hoc collaborations among the members and is arranged in either a federated or cooperative architecture. Collaboration between members is either tightly coupled or loosely coupled.

Type: Grant

Filed: May 11, 2007

Date of Patent: November 18, 2014

Assignee: International Business Machines Corporation

Inventors: Michael J. Branson, Frederick Douglis, Bradley W. Fawcett, Zhen Liu, William Waller, Fan Ye
Memory controller with inter-core interference detection

Patent number: 8880809

Abstract: Embodiments are described for a method for controlling access to memory in a processor-based system comprising monitoring a number of interference events, such as bank contentions, bus contentions, row-buffer conflicts, and increased write-to-read turnaround time caused by a first core in the processor-based system that causes a delay in access to the memory by a second core in the processor-based system; deriving a control signal based on the number of interference events; and transmitting the control signal to one or more resources of the processor-based system to reduce the number of interference events from an original number of interference events.

Type: Grant

Filed: October 29, 2012

Date of Patent: November 4, 2014

Assignee: Advanced Micro Devices Inc.

Inventors: Gabriel Loh, James O'Connor
Embedded memory and dedicated processor structure within an integrated circuit

Patent number: 8874837

Abstract: An integrated circuit can include a programmable circuitry operable according to a first clock frequency and a block random access memory. The block random access memory can include a random access memory (RAM) element having at least one data port and a memory processor coupled to the data port of the RAM element and to the programmable circuitry. The memory processor can be operable according to a second clock frequency that is higher than the first clock frequency. Further, the memory processor can be hardwired and dedicated to perform operations in the RAM element of the block random access memory.

Type: Grant

Filed: November 8, 2011

Date of Patent: October 28, 2014

Assignee: Xilinx, Inc.

Inventors: Christopher E. Neely, Gordon J. Brebner
Method and system for improved multi-cell support on a single modem board

Patent number: 8861434

Abstract: A system for providing multi-cell support within a single SMP partition in a telecommunications network is disclosed. The system typically includes a modem board and a multi-core processor having a plurality of processor cores, wherein the multi-core processor is configured to disable non-essential interrupts arriving on a plurality of data plane cores and route the non-essential interrupts to a plurality of control plane cores. Optionally, the multi-core processor may be configured so that all non-real-time threads and processes are bound to processor cores that are dedicated for all control plane activities, and processor cores that are dedicated for all data plane activities will not host or run any threads that are not directly needed for data path implementation or Layer 2 processing.

Type: Grant

Filed: November 29, 2010

Date of Patent: October 14, 2014

Assignee: Alcatel Lucent

Inventors: Mohammad R. Khawer, Mugur Abulius
Switches and a network of switches

Patent number: 8825986

Abstract: A switch includes at least one input configured to receive data and at least two outputs configured to send data to at least two further switches in a network via at least two output links. Each output link has a known hop value. The switch further includes a direction determinator that determines a routing direction for the data from information identifying a relative location of the switch in the network and information identifying a destination of said data. A distributor within the switch processes the routing direction and direction information about each output link in order to select one of said at least two outputs for outputting said data. The selection that is made prioritizes output links for selection which have relatively higher known hop values.

Type: Grant

Filed: October 13, 2011

Date of Patent: September 2, 2014

Assignee: STMicroelectronics (Grenoble 2) SAS

Inventors: Antonio-Marcello Coppola, Riccardo Locatelli, Jose Flich Cardo, Jose Cano Reyes, Jose Francisco Duato Marin
Asynchronous computer communication

Patent number: 8825924

Abstract: A computer array (10) has a plurality of computers (12). The computers (12) communicate with each other asynchronously, and the computers (12) themselves operate in a generally asynchronous manner internally. When one computer (12) attempts to communicate with another it goes to sleep until the other computer (12) is ready to complete the transaction, thereby saving power and reducing heat production. A plurality of read lines (18), write lines (20) and data lines (22) interconnect the computers (12). When one computer (12) sets a read line (18) high and the other computer sets a corresponding write line (20) then data is transferred on the data lines (22). When both the read line (18) and corresponding write line (20) go low this allows both communicating computers (12) to know that the communication is completed. An acknowledge line (72) goes high to restart the computers (12).

Type: Grant

Filed: March 4, 2011

Date of Patent: September 2, 2014

Assignee: Array Portfolio LLC

Inventor: Charles H. Moore
Managing shared resource in an operating system by distributing reference to object and setting protection levels

Patent number: 8799914

Abstract: Managing processes in a computing system comprising one or more cores includes generating an object in an operating system running on at least one core. A reference to the object is distributed to each of at least one and fewer than all of a plurality of processes to be executed on the at least one core. The operating system controls access to a resource such that processes to which the reference to the object was distributed have access to the resource and processes to which the reference to the object was not distributed do not have access to the resource.

Type: Grant

Filed: September 20, 2010

Date of Patent: August 5, 2014

Assignee: Tilera Corporation

Inventor: Christopher D. Metcalf
Intelligent control with hierarchical stacked neural networks

Patent number: 8788441

Abstract: An intelligent control system based on an explicit model of cognitive development (Table 1) performs high-level functions. It comprises up to O hierarchically stacked neural networks, Nm, . . . , Nm+(O-1), where m denotes the stage/order tasks performed in the first neural network, Nm, and O denotes the highest stage/order tasks performed in the highest-level neural network. The type of processing actions performed in a network, Nm, corresponds to the complexity for stage/order m. Thus N1 performs tasks at the level corresponding to stage/order 1. N5 processes information at the level corresponding to stage/order 5. Stacked neural networks begin and end at any stage/order, but information must be processed by each stage in ascending order sequence. Stages/orders cannot be skipped. Each neural network in a stack may use different architectures, interconnections, algorithms, and training methods, depending on the stage/order of the neural network and the type of intelligent control system implemented.

Type: Grant

Filed: November 3, 2009

Date of Patent: July 22, 2014

Inventors: Michael Lamport Commons, Mitzi Sturgeon White
Method and system for improved multi-cell support on a single modem board

Patent number: 8787255

Abstract: A system for providing multi-cell support within a single SMP partition in a telecommunications network is disclosed. The system typically includes a modem board and a multi-core processor having a plurality of processor cores, wherein the multi-core processor is configured to disable non-essential interrupts arriving on a plurality of data plane cores and route the non-essential interrupts to a plurality of control plane cores. Optionally, the multi-core processor may be configured so that all non-real-time threads and processes are bound to processor cores that are dedicated for all control plane activities, and processor cores that are dedicated for all data plane activities will not host or run any threads that are not directly needed for data path implementation or Layer 2 processing.

Type: Grant

Filed: November 29, 2010

Date of Patent: July 22, 2014

Assignee: Alcatel Lucent

Inventors: Mohammad R. Khawer, Mugur Abulius
System and method for remotely configuring semiconductor functional circuits

Patent number: 8768642

Abstract: The present systems and methods facilitate configuration of functional components included in a remotely located integrated circuit die. In one exemplary implementation, a die functional component reconfiguration request process is engaged in wherein a system requests a reconfiguration code from a remote centralized resource. A reconfiguration code production process is executed in which a request for a reconfiguration code and a permission indicator are received, the validity of the permission indicator is analyzed, and a reconfiguration code is provided if the permission indicator is valid. A die functional component configuration process is performed on the die when an appropriate reconfiguration code is received by the die. The functional component configuration process includes directing alteration of a functional component configuration. Workflow is diverted from disabled functional components to enabled functional components.

Type: Grant

Filed: December 18, 2003

Date of Patent: July 1, 2014

Assignee: Nvidia Corporation

Inventors: Michael B. Diamond, John S. Montrym, James M. Van Dyke, Michael B. Nagy, Sean J. Treichler
Transferring data in a parallel processing environment

Patent number: 8745604

Abstract: An integrated circuit includes a plurality of tiles. Each tile includes a processor, a switch including switching circuitry to forward data over data paths from other tiles to the processor and to switches of other tiles, and a switch memory that stores instruction streams that are able to operate independently for respective output ports of the switch.

Type: Grant

Filed: February 25, 2008

Date of Patent: June 3, 2014

Assignee: Massachusetts Institute of Technology

Inventor: Anant Agarwal
Systems, methods, and devices for configuring a device

Patent number: 8725961

Abstract: Disclosed are methods and devices, among which is a method for configuring an electronic device. In one embodiment, an electronic device may include one or more memory locations having stored values representative of the capabilities of the device. According to an example configuration method, a configuring system may access the device capabilities from the one or more memory locations and configure the device based on the accessed device capabilities.

Type: Grant

Filed: March 20, 2012

Date of Patent: May 13, 2014

Assignee: Micron Technology Inc.

Inventor: Harold B Noyes
Two way communication support for heterogenous processors of a computer platform

Patent number: 8719839

Abstract: A computer system may comprise a computer platform and input-output devices. The computer platform may include a plurality of heterogeneous processors comprising a central processing unit (CPU) and a graphics processing unit (GPU), for example. The GPU may be coupled to a GPU compiler and a GPU linker/loader and the CPU may be coupled to a CPU compiler and a CPU linker/loader. The user may create a shared object in an object oriented language and the shared object may include virtual functions. The shared object may be fine grain partitioned between the heterogeneous processors. The GPU compiler may allocate the shared object to the CPU and may create a first and a second enabling path to allow the GPU to invoke virtual functions of the shared object. Thus, the shared object that may include virtual functions may be shared seamlessly between the CPU and the GPU.

Type: Grant

Filed: October 30, 2009

Date of Patent: May 6, 2014

Assignee: Intel Corporation

Inventors: Shoumeng Yan, Xiaocheng Zhou, Ying Gao, Mohan Rajagopalan, Rajiv Deodhar, David Putzolu, Clark Nelson, Milind Girkar, Robert Geva, Tiger Chen, Sai Luo, Stephen Junkins, Bratin Saha, Ravi Narayanaswamy, Patrick Xi
Systems and methods for identifying missing signage

Patent number: 8676506

Abstract: Methods and systems for identifying missing signage are described herein. The method includes generating a route from an origin to a destination, the route having a plurality of maneuvers. The method further includes receiving missing signage information from a first device, the missing signage information relating to one or more maneuvers of the plurality of maneuvers, and providing the missing signage information and at least one of the one or more related maneuvers to a second device.

Type: Grant

Filed: November 15, 2011

Date of Patent: March 18, 2014

Assignee: Google Inc.

Inventor: Daniel M. LaLiberte
System, method, and computer program product for performing a scan operation on a sequence of single-bit values using a parallel processor architecture

Patent number: 8661226

Abstract: A system, method, and computer program product are provided for performing a scan operation on a sequence of single-bit values using a parallel processing architecture. In operation, a scan operation instruction is received. Additionally, in response to the scan operation instruction, a scan operation is performed on a sequence of single-bit values using a parallel processor architecture with a plurality of processing elements.

Type: Grant

Filed: November 15, 2007

Date of Patent: February 25, 2014

Assignee: NVIDIA Corporation

Inventors: Michael J. Garland, Samuli M. Laine, Timo O. Aila, David Patrick Luebke
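
Illustrative sketch for patent 8661226 above: a scan (prefix sum) over single-bit values can be computed by packing the bits into one word and letting each element count the set bits below its own position. The Python below shows that mask-and-popcount idea sequentially. It is a common way to scan 1-bit predicates and is offered as an assumption about the general approach, not as the claimed implementation; `exclusive_bit_scan` is a hypothetical name.

```python
from typing import List


def exclusive_bit_scan(bits: List[int]) -> List[int]:
    """Exclusive prefix sum over 0/1 values using mask-and-popcount."""
    # Pack the single-bit values into one integer, element i at bit position i.
    word = 0
    for i, b in enumerate(bits):
        if b not in (0, 1):
            raise ValueError("inputs must be single-bit values (0 or 1)")
        word |= b << i
    # Element i counts the set bits strictly below position i. On a parallel
    # machine each processing element would evaluate one entry of this list.
    return [bin(word & ((1 << i) - 1)).count("1") for i in range(len(bits))]


scan = exclusive_bit_scan([1, 0, 1, 1, 0, 1])
# scan == [0, 1, 1, 2, 3, 3]
```

Each output entry depends only on the packed word and the element's own index, which is what makes the per-element work independent.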
Variable clocked heterogeneous serial array processor

Patent number: 8656143

Abstract: A serial array processor may have an execution unit, which is comprised of a multiplicity of single bit arithmetic logic units (ALUs), and which may perform parallel operations on a subset of all the words in memory by serially accessing and processing them, one bit at a time, while an instruction unit of the processor is pre-fetching the next instruction, a word at a time, in a manner orthogonal to the execution unit.

Type: Grant

Filed: February 3, 2010

Date of Patent: February 18, 2014

Inventor: Laurence H. Cooke
Architecture and programming in a parallel processing environment with switch-interconnected processors

Patent number: 8656141

Abstract: An integrated circuit includes a plurality of tiles. Each tile includes a pipelined processor configured to process multiple streams of instructions for the processor; and a switch including switching circuitry to forward data over data paths from other tiles to one or more pipeline stages of the processor and to switches of other tiles. At least some of the data is forwarded based on one or more streams of instructions for the switch.

Type: Grant

Filed: December 13, 2005

Date of Patent: February 18, 2014

Assignee: Massachusetts Institute of Technology

Inventor: Anant Agarwal
Fencing direct memory access data transfers in a parallel active messaging interface of a parallel computer

Patent number: 8650338

Abstract: Fencing direct memory access (‘DMA’) data transfers in a parallel active messaging interface (‘PAMI’) of a parallel computer, the PAMI including data communications endpoints, each endpoint including specifications of a client, a context, and a task, the endpoints coupled for data communications through the PAMI and through DMA controllers operatively coupled to segments of shared random access memory through which the DMA controllers deliver data communications deterministically, including initiating execution through the PAMI of an ordered sequence of active DMA instructions for DMA data transfers between two endpoints, effecting deterministic DMA data transfers through a DMA controller and a segment of shared memory; and executing through the PAMI, with no FENCE accounting for DMA data transfers, an active FENCE instruction, the FENCE instruction completing execution only after completion of all DMA instructions initiated prior to execution of the FENCE instruction for DMA data transfers between the two en

Type: Grant

Filed: March 4, 2013

Date of Patent: February 11, 2014

Assignee: International Business Machines Corporation

Inventors: Michael A. Blocksome, Amith R. Mamidala
Method and system for packet processing

Patent number: 8639912

Abstract: A data processor and a method for processing data is disclosed. The processor has an input port for receiving packets of data to be processed. A master controller acts to analyze the packets and to provide a header including a list of processes to perform on the packet of data and an ordering thereof. The master controller is programmed with process related data relating to the overall processing function of the processor. The header is appended to the packet of data. The packet with the appended header information is stored within a buffer. A buffer controller acts to determine for each packet stored within the buffer based on the header within the packet a next processor to process the packet. The controller then provides the packet to the determined processor for processing. The processed packet is returned with some indication that the processing is done. For example, the process may be deleted from the list of processes.

Type: Grant

Filed: November 16, 2009

Date of Patent: January 28, 2014

Assignee: Mosaid Technologies Incorporated

Inventors: Arthur John Low, Stephen J. Davis
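
Illustrative sketch for patent 8639912 above: the abstract describes a master controller that attaches an ordered list of processes to each packet, and a buffer controller that reads the header to pick the next processor. The Python below models that flow in software under stated assumptions: packets are dictionaries, processors are plain functions, and the names `master_controller`, `run_buffer`, and `processors` are hypothetical.

```python
from collections import deque


def master_controller(payload, plan):
    """Attach a header listing the processes to perform, in order."""
    return {"header": list(plan), "payload": payload}


def run_buffer(buffer, processors):
    """Send each buffered packet to the processor named first in its header."""
    finished = []
    while buffer:
        packet = buffer.popleft()
        if not packet["header"]:
            # No processes left in the header: the packet is fully processed.
            finished.append(packet["payload"])
            continue
        # Removing the entry from the list is the indication that the step
        # is done, as in the example given in the abstract.
        step = packet["header"].pop(0)
        packet["payload"] = processors[step](packet["payload"])
        buffer.append(packet)  # return the packet to the buffer
    return finished


processors = {"strip": str.strip, "upper": str.upper}
buffer = deque([master_controller("  abc ", ["strip", "upper"])])
result = run_buffer(buffer, processors)
# result == ["ABC"]
```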
Packet draining from a scheduling hierarchy in a traffic manager of a network processor

Patent number: 8638805

Abstract: Described embodiments provide for restructuring a scheduling hierarchy of a network processor having a plurality of processing modules and a shared memory. The scheduling hierarchy schedules packets for transmission. The network processor generates tasks corresponding to each received packet associated with a data flow. A traffic manager receives tasks provided by one of the processing modules and determines a queue of the scheduling hierarchy corresponding to the task. The queue has a parent scheduler at each of one or more next levels of the scheduling hierarchy up to a root scheduler, forming a branch of the hierarchy. The traffic manager determines if the queue and one or more of the parent schedulers of the branch should be restructured. If so, the traffic manager drops subsequently received tasks for the branch, drains all tasks of the branch, and removes the corresponding nodes of the branch from the scheduling hierarchy.

Type: Grant

Filed: September 30, 2011

Date of Patent: January 28, 2014

Assignee: LSI Corporation

Inventors: Balakrishnan Sundararaman, Shashank Nemawarkar, David Sonnier, Shailendra Aulakh, Allen Vestal
Adjustment of threads for execution based on over-utilization of a domain in a multi-processor system by sub-dividing parallizable group of threads to sub-domains

Patent number: 8631415

Abstract: Embodiments provide various techniques for dynamic adjustment of a number of threads for execution in any domain based on domain utilizations. In a multiprocessor system, the utilization for each domain is monitored. If a utilization of any of these domains changes, then the number of threads for each of the domains determined for execution may also be adjusted to adapt to the change.

Type: Grant

Filed: August 25, 2009

Date of Patent: January 14, 2014

Assignee: NetApp, Inc.

Inventors: Gokul Nadathur, Manpreet Singh, Grace Ho
Methods and apparatus for providing bit-reversal and multicast functions utilizing DMA controller

Patent number: 8601176

Abstract: Techniques for providing improved data distribution to and collection from multiple memories are described. Such memories are often associated with and local to processing elements (PEs) within an array processor. Improved data transfer control within a data processing system provides support for radix 2, 4 and 8 fast Fourier transform (FFT) algorithms through data reordering or bit-reversed addressing across multiple PEs, carried out concurrently with FFT computation on a digital signal processor (DSP) array by a DMA unit. Parallel data distribution and collection through forms of multicast and packet-gather operations are also supported.

Type: Grant

Filed: July 10, 2012

Date of Patent: December 3, 2013

Assignee: Altera Corporation

Inventors: Edwin Franklin Barry, Nikos P. Pitsianis, Kevin Coopman
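
Illustrative sketch (not taken from the patent): the abstract for patent 8601176 refers to bit-reversed addressing for radix-2 FFT data reordering. The Python below shows only the index permutation itself, computed in software. It says nothing about how the DMA unit performs it, and the function names are hypothetical.

```python
def bit_reverse(index: int, bits: int) -> int:
    """Return `index` with its lowest `bits` bits in reversed order."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def bit_reversed_order(data: list) -> list:
    """Reorder a power-of-two-length list into bit-reversed index order."""
    n = len(data)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    bits = n.bit_length() - 1
    return [data[bit_reverse(i, bits)] for i in range(n)]


print(bit_reversed_order(list(range(8))))  # [0, 4, 2, 6, 1, 5, 3, 7]
```
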
Condensed router headers with low latency output port calculation

Patent number: 8572353

Abstract: Communicating among cores in a computing system comprising a plurality of cores, each core comprising a processor and a switch, includes: routing a packet from an origin core to a destination core over a route including multiple cores; and at each core in the route before the destination core, routing the packet to the next core in the route according to a respective symbol in a sequence of multiple symbols. The respective symbol has a first symbol value indicating a single likely direction and the respective symbol has a second symbol value indicating multiple less likely directions.

Type: Grant

Filed: September 20, 2010

Date of Patent: October 29, 2013

Assignee: Tilera Corporation

Inventors: Ian Rudolf Bratt, Carl G. Ramey, Matthew Mattina
Selectively isolating processor elements into subsets of processor elements

Patent number: 8532288

Abstract: A cryptographic engine for modulo N multiplication, which is structured as a plurality of almost identical, serially connected Processing Elements, is controlled so as to accept input in blocks that are smaller than the maximum capability of the engine in terms of bits multiplied at one time. The serially connected hardware is thus partitioned on the fly to process a variety of cryptographic key sizes while still maintaining all of the hardware in an active processing state.

Type: Grant

Filed: December 1, 2006

Date of Patent: September 10, 2013

Assignee: International Business Machines Corporation

Inventors: Camil Fayad, John K. Li, Siegfried K. H. Sutter, Phil C. Yeh
Fencing direct memory access data transfers in a parallel active messaging interface of a parallel computer

Patent number: 8527672

Abstract: Fencing direct memory access (‘DMA’) data transfers in a parallel active messaging interface (‘PAMI’) of a parallel computer, the PAMI including data communications endpoints, each endpoint including specifications of a client, a context, and a task, the endpoints coupled for data communications through the PAMI and through DMA controllers operatively coupled to segments of shared random access memory through which the DMA controllers deliver data communications deterministically, including initiating execution through the PAMI of an ordered sequence of active DMA instructions for DMA data transfers between two endpoints, effecting deterministic DMA data transfers through a DMA controller and a segment of shared memory; and executing through the PAMI, with no FENCE accounting for DMA data transfers, an active FENCE instruction, the FENCE instruction completing execution only after completion of all DMA instructions initiated prior to execution of the FENCE instruction for DMA data transfers between the two en

Type: Grant

Filed: November 5, 2010

Date of Patent: September 3, 2013

Assignee: International Business Machines Corporation

Inventors: Michael A. Blocksome, Amith R. Mamidala
Programmable and scalable microcontroller architecture

Patent number: 8521989

Abstract: A microcontroller includes a program memory, data memory, central processing unit, at least one register module, a memory management unit, and a transport network. Instructions are executed in one clock cycle via an instruction word. The instruction word indicates the source module from which data is to be retrieved and the destination module to which data is to be stored. The address/data capability of an instruction word may be extended via a prefix module. If an operation is performed on the data, the source module or the destination module may perform the operation during the same clock cycle in which the data is transferred.

Type: Grant

Filed: May 18, 2006

Date of Patent: August 27, 2013

Assignee: Maxim Integrated Products, Inc.

Inventors: Jeffrey Dean Owens, Edward Tang K. Ma, Don Loomis, Tom Chenot
Virtual architectures in a parallel processing environment

Patent number: 8516222

Abstract: An integrated circuit includes a plurality of processor cores. Processing instructions in the integrated circuit includes: managing a plurality of sets of processor cores, each set including one or more processor cores assigned to a function associated with executing instructions; and reconfiguring the number of processor cores assigned to at least one of the sets during execution based on characteristics associated with executing the instructions.

Type: Grant

Filed: December 9, 2011

Date of Patent: August 20, 2013

Assignee: Massachusetts Institute of Technology

Inventors: Anant Agarwal, David Wentzlaff
Integrated circuit with coupled processing cores

Patent number: 8516179

Abstract: A processing system on an integrated circuit includes a group of processing cores. A group of dedicated random access memories are severally coupled to one of the group of processing cores or shared among the group. A star bus couples the group of processing cores and random access memories. Additional layer(s) of star bus may couple many such clusters to each other and to an off-chip environment.

Type: Grant

Filed: November 30, 2004

Date of Patent: August 20, 2013

Assignee: Digital RNA, LLC

Inventor: Joel Henry Hinrichs
Method to dynamically distribute a multi-dimensional work set across a multi-core system

Patent number: 8516461

Abstract: A method provides efficient dispatch/completion of an N Dimensional (ND) Range command in a data processing system (DPS). The method comprises: a compiler generating one or more commands from received program instructions; ND Range work processing (WP) logic determining when a command generated by the compiler will be implemented over an ND configuration of operands, where N is greater than one (1); automatically decomposing the ND configuration of operands into a one (1) dimension (1D) work element comprising P sequentially ordered work items that each represent one of the operands; placing the 1D work element within a command queue of the DPS; enabling sequential dispatching of 1D work items in ordered sequence to one or more processing units; and generating an ND Range output by mapping the 1D work output result to an ND position corresponding to an original location of the operand represented by the 1D work item.

Type: Grant

Filed: September 15, 2012

Date of Patent: August 20, 2013

Assignee: International Business Machines Corporation

Inventors: Gregory Howard Bellows, Brian H. Horton, Joaquin Madruga, Barry L. Minor
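
Illustrative sketch (not taken from the patent): the abstract for patent 8516461 describes decomposing an ND configuration of operands into sequentially ordered 1D work items and later mapping each 1D result back to its ND position. The Python below shows one way to do such a mapping, assuming row-major ordering, which the abstract does not specify. The function names are hypothetical.

```python
from itertools import product


def flatten_nd_range(shape):
    """List every ND coordinate in row-major order.

    The position of a coordinate in the returned list is its 1D work-item index.
    """
    return list(product(*(range(extent) for extent in shape)))


def to_nd_position(work_item, shape):
    """Map a 1D work-item index back to its ND coordinate (row-major)."""
    coords = []
    for extent in reversed(shape):
        work_item, coord = divmod(work_item, extent)
        coords.append(coord)
    return tuple(reversed(coords))


shape = (2, 3)
work_items = flatten_nd_range(shape)
print(work_items[4], to_nd_position(4, shape))  # (1, 1) (1, 1)
```
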
Computing apparatus and method of handling interrupt

Patent number: 8495345

Abstract: A computing apparatus and method of handling an interrupt are provided. The computing apparatus includes a coarse-grained array, a host processor, and an interrupt supervisor. When an interrupt occurs in the coarse-grained array while performing a loop operation, the host processor processes the interrupt, and the interrupt supervisor may perform mode switching between the coarse-grained array and the host processor.

Type: Grant

Filed: December 16, 2009

Date of Patent: July 23, 2013

Assignee: Samsung Electronics Co., Ltd.

Inventors: Dong-hoon Yoo, Soo-jung Ryu, Yeon-gon Cho, Bernhard Egger, Il-hyun Park
Dynamically distribute a multi-dimensional work set across a multi-core system

Patent number: 8495604

Abstract: A system provides efficient dispatch/completion of an N Dimensional (ND) Range command in a data processing system (DPS). The system comprises: a compiler generating one or more commands from received program instructions; ND Range work processing (WP) logic determining when a command generated by the compiler will be implemented over an ND configuration of operands, where N is greater than one (1); automatically decomposing the ND configuration of operands into a one (1) dimension (1D) work element comprising P sequentially ordered work items that each represent one of the operands; placing the 1D work element within a command queue of the DPS; enabling sequential dispatching of 1D work items in ordered sequence to one or more processing units; and generating an ND Range output by mapping the 1D work output result to an ND position corresponding to an original location of the operand represented by the 1D work item.

Type: Grant

Filed: December 30, 2009

Date of Patent: July 23, 2013

Assignee: International Business Machines Corporation

Inventors: Gregory H. Bellows, Brian H. Horton, Joaquin Madruga, Barry L. Minor
Processor cluster architecture and associated parallel processing methods

Patent number: 8489857

Abstract: A parallel processing architecture comprising a cluster of embedded processors that share a common code distribution bus. Pages or blocks of code are concurrently loaded into respective program memories of some or all of these processors (typically all processors assigned to a particular task) over the code distribution bus, and are executed in parallel by these processors. A task control processor determines when all of the processors assigned to a particular task have finished executing the current code page, and then loads a new code page (e.g., the next sequential code page within a task) into the program memories of these processors for execution. The processors within the cluster preferably share a common memory (1 per cluster) that is used to receive data inputs from, and to provide data outputs to, a higher level processor. Multiple interconnected clusters may be integrated within a common integrated circuit device.

Type: Grant

Filed: November 5, 2010

Date of Patent: July 16, 2013

Assignee: Schism Electronics, L.L.C.

Inventors: Richard F. Hobson, Bill Ressl, Allan R. Dyck
Processing array data on SIMD multi-core processor architectures

Patent number: 8484276

Abstract: Techniques are disclosed for converting data into a format tailored for efficient multidimensional fast Fourier transforms (FFTs) on single instruction, multiple data (SIMD) multi-core processor architectures. The technique includes converting data from a multidimensional array stored in a conventional row-major order into SIMD format. Converted data in SIMD format consists of a sequence of blocks, where each block interleaves s rows such that SIMD vector processors may operate on s rows simultaneously. As a result, the converted data in SIMD format enables smaller-sized 1D FFTs to be optimized in SIMD multi-core processor architectures.

Type: Grant

Filed: March 18, 2009

Date of Patent: July 9, 2013

Assignee: International Business Machines Corporation

Inventors: David G. Carlson, Travis M. Drucker, Timothy J. Mullins, Jeffrey S. McAllister, Nelson Ramirez
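
Illustrative sketch (not taken from the patent): the abstract for patent 8484276 describes converting row-major data into a sequence of blocks, each interleaving s rows so that a SIMD unit can work on s rows at once. The Python below assumes one plausible layout, in which element j of each of the s rows is stored contiguously. The exact layout used by the patent is not given in the abstract, and the function name is hypothetical.

```python
def to_simd_format(rows, s):
    """Group `rows` into blocks of `s` rows and interleave each block.

    Within a block, the j-th elements of all s rows are stored next to each
    other, so one vector of width s holds the same column from s rows.
    """
    if s <= 0 or len(rows) % s:
        raise ValueError("row count must be a positive multiple of s")
    blocks = []
    for start in range(0, len(rows), s):
        group = rows[start:start + s]
        # zip(*group) walks the group column by column.
        blocks.append([value for column in zip(*group) for value in column])
    return blocks


rows = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
print(to_simd_format(rows, 2))  # [[1, 4, 2, 5, 3, 6], [7, 10, 8, 11, 9, 12]]
```
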
Signal processing apparatus with signal control units and processor units operating based on different threads

Patent number: 8464025

Abstract: A signal processing apparatus able to raise a processing capability in processing accompanying access to a storing means is provided. Stream control units (SCU) 203_0 to 203_3 access data at an external memory system or local memories 204_0 to 204_3 according to a thread under control from a host processor. Processor units (PU) arrays 202_0 to 202_3 perform image processing by a different thread from the thread of the SCUs 203_0 to 203_3.

Type: Grant

Filed: May 22, 2006

Date of Patent: June 11, 2013

Assignee: Sony Corporation

Inventors: Yuji Yamaguchi, Masatoshi Imai, Toshiharu Noda, Naosuke Asari, Tomoo Mitsunaga, Mitsuharu Ohki, Kazumasa Ito, Hidetoshi Nagano, Sumito Arakawa, Kei Ito
Communication method

Patent number: 8453003

Abstract: A communication method is provided to reduce an overhead of inter-processor synchronization for a communication phase in collective communication and to speed up the collective communication. Each of the processors in a parallel computer starts a previous process before a collective communication phase in which communications are performed at the same time among the processors through an inter-processor network. Each processor executes a synchronization command in advance at a time when a portion of the previous process for a predetermined time t is left. The inter-processor synchronization control section transmits a synchronization completion notice to each processor, if a synchronization condition is met. For the period, each processor executes the previous process in parallel. Then, the plurality of processors enter the collective communication phase.

Type: Grant

Filed: April 9, 2008

Date of Patent: May 28, 2013

Assignee: NEC Corporation

Inventor: Yasushi Kanoh
Method and system for implementing efficient locking to facilitate parallel processing of IC designs

Patent number: 8438512

Abstract: Disclosed is an improved method and system for implementing parallelism for execution of electronic design automation (EDA) tools, such as layout processing tools. Examples of EDA layout processing tools are placement and routing tools. Efficient locking mechanisms are described for facilitating parallel processing and to minimize blocking.

Type: Grant

Filed: August 30, 2011

Date of Patent: May 7, 2013

Assignee: Cadence Design Systems, Inc.

Inventors: David Cross, Eric Nequist
