Array Processor Element Interconnection Patents (Class 712/11)
  • Patent number: 10830743
    Abstract: A method for determining a source of pollutant emissions for a selected area includes: mapping a grid onto a representation of the selected area; collecting monitoring data for the selected area; from the monitoring data, assigning air pollution values and weather values to each cell in the grid using an interpolation method to estimate values for gap cells; re-sizing the grid to mitigate the influence of atmospheric turbulence on the air pollution values; and using weather factor separation to minimize the influence of weather on the air pollution values, resulting in air pollution values that reflect the net pollutant emissions for the selected area.
    Type: Grant
    Filed: May 4, 2017
    Date of Patent: November 10, 2020
    Assignee: International Business Machines Corporation
    Inventors: Long Sheng Bai, Liang Liu, Zhuo Liu, Jun Mei Qu, Wei Zhuang
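    Illustrative sketch: a minimal Python example of the gap-cell interpolation step described in the abstract above. The inverse-distance weighting and the dictionary grid representation are assumptions for illustration, not the method the patent specifies.

      # A gap cell (value None) is estimated from the measured cells by
      # inverse-distance weighting (an assumed interpolation method).
      def fill_gap_cells(grid, power=2):
          """grid maps (row, col) -> measured value, or None for a gap cell."""
          known = {rc: v for rc, v in grid.items() if v is not None}
          filled = dict(grid)
          for (r, c), v in grid.items():
              if v is not None:
                  continue
              num = den = 0.0
              for (kr, kc), kv in known.items():
                  d2 = (r - kr) ** 2 + (c - kc) ** 2
                  w = 1.0 / d2 ** (power / 2)
                  num += w * kv
                  den += w
              filled[(r, c)] = num / den if den else None
          return filled

      # Example: a 3x3 cell grid with one gap cell at (1, 1).
      grid = {(r, c): 10.0 + r + c for r in range(3) for c in range(3)}
      grid[(1, 1)] = None
      print(round(fill_gap_cells(grid)[(1, 1)], 2))   # 12.0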
  • Patent number: 10782974
    Abstract: A VLIW (Very Long Instruction Word) interface device includes a memory configured to store instructions and data, and a processor configured to process the instructions and the data, wherein the processor includes an instruction fetcher configured to output an instruction fetch request to load the instruction from the memory, a decoder configured to decode the instruction loaded on the instruction fetcher, an arithmetic logic unit (ALU) configured to perform an operation function if the decoded instruction is an operation instruction, a memory interface scheduler configured to schedule the instruction fetch request or a data fetch request that is input from the arithmetic logic unit, and a memory operator configured to perform a memory access operation in accordance with the scheduled instruction fetch request or data fetch request.
    Type: Grant
    Filed: November 23, 2016
    Date of Patent: September 22, 2020
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Young-chul Cho, Suk-jin Kim, Chul-soo Park, Dong-kwan Suh
  • Patent number: 10691542
    Abstract: According to an embodiment, a storage device includes a plurality of memory nodes and a control unit. Each of the memory nodes includes a storage unit including a plurality of storage areas having a predetermined size. The memory nodes are connected to each other in two or more different directions. The memory nodes constitute two or more groups each including two or more memory nodes. The control unit is configured to sequentially allocate data writing destinations in the storage units to the storage areas respectively included in the different groups.
    Type: Grant
    Filed: September 11, 2013
    Date of Patent: June 23, 2020
    Assignee: Toshiba Memory Corporation
    Inventors: Yuki Sasaki, Takahiro Kurita, Atsuhiro Kinoshita
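    Illustrative sketch: a Python example of sequentially allocating write destinations so that consecutive writes land in storage areas belonging to different memory-node groups, as described in the abstract above. The group layout and the round-robin order are assumptions for illustration.

      from itertools import cycle

      # Assumed layout: each group is a list of (node_id, area_id) storage areas.
      groups = [
          [("node0", 0), ("node0", 1), ("node1", 0)],
          [("node2", 0), ("node2", 1), ("node3", 0)],
      ]

      def allocation_order(groups):
          """Yield write destinations so successive writes go to different groups."""
          cursors = [iter(g) for g in groups]
          for i in cycle(range(len(groups))):
              try:
                  yield next(cursors[i])
              except StopIteration:
                  return  # simplified policy: stop once any group is exhausted

      for data_block, dest in zip(["a", "b", "c", "d"], allocation_order(groups)):
          print(data_block, "->", dest)   # a -> ('node0', 0), b -> ('node2', 0), ...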
  • Patent number: 10685699
    Abstract: Examples of the present disclosure provide apparatuses and methods related to performing a sort operation in a memory. An example apparatus might include a first group of memory cells coupled to a first sense line, a second group of memory cells coupled to a second sense line, and a controller configured to control sensing circuitry to sort a first element stored in the first group of memory cells and a second element stored in the second group of memory cells by performing an operation without transferring data via an input/output (I/O) line.
    Type: Grant
    Filed: December 3, 2018
    Date of Patent: June 16, 2020
    Assignee: Micron Technology, Inc.
    Inventor: Kyle B. Wheeler
  • Patent number: 10679319
    Abstract: A SIMD processing unit processes a plurality of tasks which each include up to a predetermined maximum number of work items. The work items of a task are arranged for executing a common sequence of instructions on respective data items. The data items are arranged into blocks, with some of the blocks including at least one invalid data item. Work items which relate to invalid data items are invalid work items. The SIMD processing unit comprises a group of processing lanes configured to execute instructions of work items of a particular task over a plurality of processing cycles. A control module assembles work items into the tasks based on the validity of the work items, so that invalid work items of the particular task are temporally aligned across the processing lanes. In this way the number of wasted processing slots due to invalid work items may be reduced.
    Type: Grant
    Filed: April 17, 2019
    Date of Patent: June 9, 2020
    Assignee: Imagination Technologies Limited
    Inventors: John Howson, Jonathan Redshaw, Yoong Chert Foo
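    Illustrative sketch: a Python example of assembling work items into a task so that invalid work items end up temporally aligned across the processing lanes, which is the goal stated in the abstract above. The packing heuristic (valid items first) is an assumption for illustration, not the patent's control-module algorithm.

      # A task is modelled as a list of cycles, each cycle holding one work
      # item per lane; an item is (item_id, is_valid).
      def assemble_task(work_items, lanes):
          valid = [w for w in work_items if w[1]]
          invalid = [w for w in work_items if not w[1]]
          ordered = valid + invalid
          return [ordered[i:i + lanes] for i in range(0, len(ordered), lanes)]

      def count_skippable_cycles(cycles):
          """Cycles in which every lane holds an invalid work item can be skipped."""
          return sum(1 for cyc in cycles if all(not v for _, v in cyc))

      items = [(i, i % 3 != 0) for i in range(16)]   # every third item is invalid
      cycles = assemble_task(items, lanes=4)
      print(count_skippable_cycles(cycles), "of", len(cycles), "cycles skippable")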
  • Patent number: 10656912
    Abstract: A processor or other device, such as a programmable and/or massively parallel processor or other device, includes processing elements designed to perform arithmetic operations (possibly but not necessarily including, for example, one or more of addition, multiplication, subtraction, and division) on numerical values of low precision but high dynamic range (“LPHDR arithmetic”). Such a processor or other device may, for example, be implemented on a single chip. Whether or not implemented on a single chip, the number of LPHDR arithmetic elements in the processor or other device in certain embodiments of the present invention significantly exceeds (e.g., by at least 20 more than three times) the number of arithmetic elements, if any, in the processor or other device which are designed to perform high dynamic range arithmetic of traditional precision (such as 32 bit or 64 bit floating point arithmetic).
    Type: Grant
    Filed: November 6, 2019
    Date of Patent: May 19, 2020
    Assignee: Singular Computing LLC
    Inventor: Joseph Bates
  • Patent number: 10650303
    Abstract: Methods, systems, and computer storage media for implementing neural networks in fixed point arithmetic computing systems. In one aspect, a method includes the actions of receiving a request to process a neural network using a processing system that performs neural network computations using fixed point arithmetic; for each node of each layer of the neural network, determining a respective scaling value for the node from the respective set of floating point weight values for the node; and converting each floating point weight value of the node into a corresponding fixed point weight value using the respective scaling value for the node to generate a set of fixed point weight values for the node; and providing the sets of fixed point weight values for the nodes to the processing system for use in processing inputs using the neural network.
    Type: Grant
    Filed: February 14, 2017
    Date of Patent: May 12, 2020
    Assignee: Google LLC
    Inventor: William John Gulland
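    Illustrative sketch: a Python example of the per-node conversion described in the abstract above, deriving a scaling value from a node's floating point weights and converting each weight to a fixed point value. The symmetric scaling to a signed 8-bit range is an assumption for illustration, not the patent's specific rule.

      def scaling_value(weights, num_bits=8):
          """Assumed rule: scale so the largest-magnitude weight fills the range."""
          max_mag = max(abs(w) for w in weights)
          max_int = 2 ** (num_bits - 1) - 1          # e.g. 127 for 8 bits
          return max_int / max_mag if max_mag else 1.0

      def to_fixed_point(weights, scale):
          return [int(round(w * scale)) for w in weights]

      # One node of a layer with its floating point weight values.
      node_weights = [0.75, -0.31, 0.02, -1.20]
      scale = scaling_value(node_weights)
      print(scale, to_fixed_point(node_weights, scale))   # ~105.83 [79, -33, 2, -127]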
  • Patent number: 10609188
    Abstract: An information processing apparatus includes a receiver to receive data packets, the data packets generated by dividing a message into division data and storing each of the division data into one of the data packets, wherein each of the data packets also includes a data value indicating the quantity of the division data and data indicating whether or not the data packet includes the final division data corresponding to the end of the message; a memory; and a processor to store, in the memory, the division data contained in each received data packet and, in a case where the final division data is received earlier than any of the other division data, to suppress the final division data from being stored in the memory until the number of data packets received by the receiver equals the data value indicating the quantity of the division data.
    Type: Grant
    Filed: April 3, 2018
    Date of Patent: March 31, 2020
    Assignee: FUJITSU LIMITED
    Inventors: Shinya Hiramoto, Yuji Kondo, Yuichiro Ajima
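    Illustrative sketch: a Python example of the receive-side behaviour described above; division data are stored as they arrive, but the final division data is held back until the expected number of data packets has been received. The in-memory reassembler class is an assumption for illustration.

      class Reassembler:
          """Stores arriving division data; defers the final piece until the
          count of received packets matches the quantity carried in each packet."""

          def __init__(self):
              self.memory = {}           # sequence number -> division data
              self.pending_final = None  # early final piece, held back
              self.received = 0

          def on_packet(self, seq, data, quantity, is_final):
              self.received += 1
              if is_final and self.received < quantity:
                  self.pending_final = (seq, data)   # suppress the early final piece
              else:
                  self.memory[seq] = data
              if self.pending_final and self.received == quantity:
                  fseq, fdata = self.pending_final
                  self.memory[fseq] = fdata
                  self.pending_final = None

      r = Reassembler()
      r.on_packet(2, b"end", quantity=3, is_final=True)   # final data arrives first
      r.on_packet(0, b"he", quantity=3, is_final=False)
      r.on_packet(1, b"llo ", quantity=3, is_final=False)
      print(b"".join(r.memory[i] for i in sorted(r.memory)))   # b'hello end'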
  • Patent number: 10558575
    Abstract: Systems, methods, and apparatuses relating to a configurable spatial accelerator are described. In one embodiment, a processor includes a core with a decoder to decode an instruction into a decoded instruction and an execution unit to execute the decoded instruction to perform a first operation; a plurality of processing elements; and an interconnect network between the plurality of processing elements to receive an input of a dataflow graph comprising a plurality of nodes, wherein the dataflow graph is to be overlaid into the interconnect network and the plurality of processing elements with each node represented as a dataflow operator in the plurality of processing elements, and the plurality of processing elements is to perform a second operation when an incoming operand set arrives at the plurality of processing elements.
    Type: Grant
    Filed: December 30, 2016
    Date of Patent: February 11, 2020
    Assignee: Intel Corporation
    Inventors: Kermin E. Fleming, Jr., Kent D. Glossop, Simon C. Steely, Jr., Jinjie Tang, Alan G. Gara
  • Patent number: 10534652
    Abstract: Given a current configuration of virtual node groups in a computing cluster and a new configuration indicating one or more changes to the virtual node groups, a cluster manager generates a reconfiguration plan to arrange virtual nodes into the desired virtual node groups of the new configuration while minimizing a number of virtual nodes to be moved between physical nodes in the computing cluster.
    Type: Grant
    Filed: June 29, 2017
    Date of Patent: January 14, 2020
    Assignee: Amazon Technologies, Inc.
    Inventors: Rajib Dugar, Ashish Manral, Ganesh Narayanan
  • Patent number: 10481965
    Abstract: Counting status circuits are electrically coupled to corresponding status elements. The status elements selectably store a bit status of a bit line coupled to a memory array. The bit status can indicate one of at least pass and fail. The counting status circuits are electrically coupled to each other in a sequential order. Control logic causes processing of the counting status circuits in the sequential order to determine a total of the memory elements that store the bit status. The total number of memory elements that store the bit status indicates the number of error bits or non-error bits, which can help determine whether there are too many errors to be fixed by error correction codes.
    Type: Grant
    Filed: December 22, 2016
    Date of Patent: November 19, 2019
    Assignee: MACRONIX INTERNATIONAL CO., LTD.
    Inventors: Yih-Shan Yang, Shou-Nan Hung, Chun-Hsiung Hung, Yao-Jen Kuo, Meng-Fan Chang
  • Patent number: 10474626
    Abstract: Configuring compute nodes in a parallel computer using remote direct memory access (‘RDMA’), the parallel computer comprising a plurality of compute nodes coupled for data communications via one or more data communications networks, including: initiating, by a source compute node of the parallel computer, an RDMA broadcast operation to broadcast binary configuration information to one or more target compute nodes in the parallel computer; preparing, by each target compute node, the target compute node for receipt of the binary configuration information from the source compute node; transmitting, by each target compute node, a ready message to the source compute node, the ready message indicating that the target compute node is ready to receive the binary configuration information from the source compute node; and performing, by the source compute node, an RDMA broadcast operation to write the binary configuration information into memory of each target compute node.
    Type: Grant
    Filed: December 10, 2012
    Date of Patent: November 12, 2019
    Assignee: International Business Machines Corporation
    Inventors: Michael E. Aho, John E. Attinella, Thomas M. Gooding, Michael B. Mundy
  • Patent number: 10474625
    Abstract: Configuring compute nodes in a parallel computer using remote direct memory access (‘RDMA’), the parallel computer comprising a plurality of compute nodes coupled for data communications via one or more data communications networks, including: initiating, by a source compute node of the parallel computer, an RDMA broadcast operation to broadcast binary configuration information to one or more target compute nodes in the parallel computer; preparing, by each target compute node, the target compute node for receipt of the binary configuration information from the source compute node; transmitting, by each target compute node, a ready message to the source compute node, the ready message indicating that the target compute node is ready to receive the binary configuration information from the source compute node; and performing, by the source compute node, an RDMA broadcast operation to write the binary configuration information into memory of each target compute node.
    Type: Grant
    Filed: January 17, 2012
    Date of Patent: November 12, 2019
    Assignee: International Business Machines Corporation
    Inventors: Michael E. Aho, John E. Attinella, Thomas M. Gooding, Michael B. Mundy
  • Patent number: 10445098
    Abstract: Methods and apparatuses relating to privileged configuration in spatial arrays are described. In one embodiment, a processor includes processing elements; an interconnect network between the processing elements; and a configuration controller coupled to a first subset and a second, different subset of the plurality of processing elements, the first subset having an output coupled to an input of the second, different subset, wherein the configuration controller is to configure the interconnect network between the first subset and the second, different subset of the plurality of processing elements to not allow communication on the interconnect network between the first subset and the second, different subset when a privilege bit is set to a first value and to allow communication on the interconnect network between the first subset and the second, different subset of the plurality of processing elements when the privilege bit is set to a second value.
    Type: Grant
    Filed: September 30, 2017
    Date of Patent: October 15, 2019
    Assignee: Intel Corporation
    Inventors: Kermin E. Fleming, Simon C. Steely, Kent D. Glossop
  • Patent number: 10417159
    Abstract: An interconnection system, apparatus and method is described for arranging elements in a network, which may be a data memory system, computing system or communications system where the data paths are arranged and operated so as to control the power consumption and data skew properties of the system. A configurable switching element may be used to form the interconnections at nodes, where a control signal and other information is used to manage the power status of other aspects of the configurable switching element. Time delay skew of data being transmitted between nodes of the network may be altered by exchanging the logical and physical line assignments of the data at one or more nodes of the network. A method of laying out an interconnecting motherboard is disclosed which reduces the complexity of the trace routing.
    Type: Grant
    Filed: April 17, 2006
    Date of Patent: September 17, 2019
    Assignee: VIOLIN SYSTEMS LLC
    Inventor: Jon C. R. Bennett
  • Patent number: 10409765
    Abstract: An array of ALUs and a controlling unit, the controlling unit providing the array sequentially ordered subapplications, wherein an ALU signals completion of execution of a subapplication to the controlling unit, which then provides the next sequential subapplication to the requesting ALU.
    Type: Grant
    Filed: June 21, 2017
    Date of Patent: September 10, 2019
    Assignee: PACT XPP SCHWEIZ AG
    Inventors: Martin Vorbach, Armin Nuckel
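    Illustrative sketch: a Python simulation of the dispatch scheme in the abstract above; a controlling unit hands out sequentially ordered subapplications, and whichever ALU signals completion receives the next one. The queue-based model is an assumption for illustration.

      import random
      from collections import deque

      def run(subapplications, num_alus):
          """Simulate ALUs requesting the next sequential subapplication on completion."""
          pending = deque(subapplications)
          busy = {}                       # alu_id -> subapplication being executed
          idle = list(range(num_alus))    # ALUs waiting for work
          order = []
          while pending or busy:
              while idle and pending:     # controlling unit hands out work in order
                  busy[idle.pop()] = pending.popleft()
              alu = random.choice(list(busy))   # some ALU signals completion
              order.append((alu, busy.pop(alu)))
              idle.append(alu)            # the requesting ALU will get the next one
          return order

      print(run([f"sub{i}" for i in range(6)], num_alus=3))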
  • Patent number: 10341561
    Abstract: In a distributed video encoding system, a video is encoded by splitting it into video segments and encoding the segments using multiple encoders. Prior to segmenting the video for distributed video encoding, image stabilization is performed on the video. For each frame in the video, a corresponding transform operation is first computed based on an estimated camera movement. Next, the video is segmented into multiple video segments along with the corresponding per-frame transform information for each segment. The video segments are then distributed to multiple processing nodes that perform the image stabilization of the corresponding video segment by applying the corresponding transform. The results from all the stabilized video segments are then stitched back together for further video encoding operations.
    Type: Grant
    Filed: September 11, 2015
    Date of Patent: July 2, 2019
    Assignee: Facebook, Inc.
    Inventors: Amit Puntambekar, Michael Hamilton Coward
  • Patent number: 10282348
    Abstract: An output buffer holds N words arranged as N/J mutually exclusive output buffer word groups (OBWG) of J words each. N processing units (PU) are arranged as N/J mutually exclusive PU groups each having an associated OBWG. Each PU has an accumulator, an arithmetic unit, and first and second multiplexed registers each having at least J+1 inputs and an output. A first input receives a memory operand and the other J inputs receive the J words of the associated OBWG. Each accumulator provides its output to a respective output buffer word. Each arithmetic unit performs an operation on the first and second multiplexed register outputs and the accumulator output to generate a result for accumulation into the accumulator. A mask input to the output buffer controls which words, if any, of the N words retain their current value or are updated with their respective accumulator output.
    Type: Grant
    Filed: April 5, 2016
    Date of Patent: May 7, 2019
    Assignee: VIA ALLIANCE SEMICONDUCTOR CO., LTD.
    Inventors: G. Glenn Henry, Terry Parks, Kyle T. O'Brien
  • Patent number: 10275393
    Abstract: A neural network unit configurable to first/second/third configurations has N narrow and N wide accumulators, multipliers and adders. Each multiplier performs a narrow/wide multiply on first and second narrow/wide inputs to generate a narrow/wide product. A first adder input receives a corresponding narrow/wide accumulator's output and a third input receives a widened corresponding narrow multiplier's narrow product in the third configuration. In the first configuration, each narrow/wide adder performs a narrow/wide addition on the first and second inputs to generate a narrow/wide sum for storage into the corresponding narrow/wide accumulator. In the second configuration, each wide adder performs a wide addition on the first and a second input to generate a wide sum for storage into the corresponding wide accumulator. In the third configuration, each wide adder performs a wide addition on the first, second and third inputs to generate a wide sum for storage into the corresponding wide accumulator.
    Type: Grant
    Filed: April 5, 2016
    Date of Patent: April 30, 2019
    Assignee: VIA ALLIANCE SEMICONDUCTOR CO., LTD.
    Inventors: G. Glenn Henry, Terry Parks
  • Patent number: 10268727
    Abstract: A technique of batching tuples can include determining a plurality of key-attributes for a plurality of tuples, creating a batch tuple, and calculating a hash value for the batch tuple.
    Type: Grant
    Filed: March 29, 2013
    Date of Patent: April 23, 2019
    Assignee: Hewlett Packard Enterprise Development LP
    Inventors: Matthias J. Sax, Maria Guadalupe Castellanos, Meichun Hsu
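    Illustrative sketch: a Python example of the three steps named in the abstract above (determine key-attributes, create a batch tuple, calculate its hash). The dictionary tuples, the key extraction, and the use of Python's built-in hash are assumptions for illustration.

      def batch_and_hash(tuples, key_attrs):
          """Determine key-attribute values, create one batch tuple from the
          individual tuples, and calculate a hash value for the batch tuple."""
          keys = tuple(tuple(t[k] for k in key_attrs) for t in tuples)
          batch_tuple = (keys, tuple(tuple(sorted(t.items())) for t in tuples))
          return batch_tuple, hash(batch_tuple)

      rows = [
          {"user": "a", "item": 1, "qty": 2},
          {"user": "a", "item": 5, "qty": 1},
      ]
      batch_tuple, h = batch_and_hash(rows, key_attrs=("user",))
      print(h)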
  • Patent number: 10229073
    Abstract: A system including at least one computation node including a memory, a processor reading/writing data in a work area of the memory and a DMA controller including a receiver receiving data from outside and writing it in a sharing area of the memory or a transmitter reading data in said sharing area and transmitting it outside. A write and read request mechanism is provided in order to cause, upon request of the processor, a data transfer, by the DMA controller, between the sharing area and the work area. The DMA controller includes an additional transmitting/receiving device designed for exchanging data between outside and the work area, without passing through the sharing area.
    Type: Grant
    Filed: March 2, 2017
    Date of Patent: March 12, 2019
    Assignee: Commissariat à l'énergie atomique et aux énergies alternatives
    Inventors: Thiago Raupp Da Rosa, Romain Lemaire, Fabien Clermidy
  • Patent number: 10216655
    Abstract: A memory interface apparatus is provided. The apparatus includes a central processing unit (CPU)-side protocol processor connected to a CPU through a parallel interface and a memory-side protocol processor connected to a memory through a parallel interface, and the CPU-side protocol processor and the memory-side protocol processor are connected through a serial link.
    Type: Grant
    Filed: June 27, 2016
    Date of Patent: February 26, 2019
    Assignee: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
    Inventors: Yong Seok Choi, Hyuk Je Kwon
  • Patent number: 10199364
    Abstract: A single multichip package is provided, comprising: a substrate having opposing upper and lower surfaces. A first die is mounted on the upper surface of the substrate and includes one or more non-volatile memory devices. A second die is mounted on the upper surface of the substrate, and includes at least one of: (a) a non-volatile memory controller that facilitates transfer of data to/from the one or more non-volatile memory devices, (b) a register clock driver for volatile memory devices, and/or (c) one or more multiplexer switches configured to switch between two or more of the volatile memory devices. A plurality of wire bonds connect the first and second dies. A plurality of solder balls are located on the lower surface of the substrate for mounting the single multichip package to a printed circuit board, the plurality of solder balls electrically coupled to the first die and the second die.
    Type: Grant
    Filed: May 19, 2017
    Date of Patent: February 5, 2019
    Assignee: SANMINA CORPORATION
    Inventors: Arvindhkumar Lalam, Alec C. Shen
  • Patent number: 10009135
    Abstract: According to some embodiments, a network architecture is disclosed. The network architecture includes a plurality of processing network nodes. The network architecture further includes at least one broadcasting medium to interconnect the plurality of processing network nodes where the broadcasting medium includes an integrated waveguide. The network architecture also includes a broadcast and weight protocol configured to perform wavelength division multiplexing such that multiple wavelengths coexist in the integrated waveguide available to all nodes of the plurality of processing network nodes.
    Type: Grant
    Filed: February 5, 2016
    Date of Patent: June 26, 2018
    Assignee: THE TRUSTEES OF PRINCETON UNIVERSITY
    Inventors: Alexander N. Tait, Mitchell A. Nahmias, Bhavin J. Shastri, Paul R. Prucnal
  • Patent number: 9992133
    Abstract: A switching device in a network system for transferring data includes one or more source line cards, one or more destination line cards and a switching fabric coupled to the source line cards and the destination line cards to enable data communication between any source line card and destination line card. Each source line card includes a request generator to generate a request signal to be transmitted in order to obtain an authorization to transmit data. Each destination line card includes a grant generator to generate and send back a grant signal to the source line card in response to the request signal received at the destination line card to authorize the source line card to transmit a data cell to the destination line card.
    Type: Grant
    Filed: October 21, 2016
    Date of Patent: June 5, 2018
    Assignee: Juniper Networks, Inc.
    Inventors: Pradeep S. Sindhu, Philippe Lacroute, Matthew A. Tucker, John D. Weisbloom, David B. Winters
  • Patent number: 9984026
    Abstract: Provided is a parallel computing system that has scalability and is capable of performing data transfer between desired PEs. Also provided is a computer system that utilizes the parallel computing system described above, and enables radiosity processing on small-scale mobile terminal devices. An HXNet is implemented in a VLSI, and data transfer between VLSIs is possible using additional BMs. Scalability is realized that enables selection of any number of VLSIs, and radiosity processing is enabled on small-scale mobile terminal devices.
    Type: Grant
    Filed: May 11, 2015
    Date of Patent: May 29, 2018
    Assignee: Nakaikegami Koubou Co., Ltd.
    Inventor: Ryuji Murakami
  • Patent number: 9971720
    Abstract: An island-based integrated circuit includes a configurable mesh data bus. The data bus includes four meshes. Each mesh includes, for each island, a crossbar switch and radiating half links. The half links of adjacent islands align to form links between crossbar switches. A link is implemented as two distributed credit FIFOs. In one direction, a link portion involves a FIFO associated with an output port of a first island, a first chain of registers, and a second FIFO associated with an input port of a second island. When a transaction value passes through the FIFO and through the crossbar switch of the second island, an arbiter in the crossbar switch returns a taken signal. The taken signal passes back through a second chain of registers to a credit count circuit in the first island. The credit count circuit maintains a credit count value for the distributed credit FIFO.
    Type: Grant
    Filed: May 29, 2015
    Date of Patent: May 15, 2018
    Assignee: Netronome Systems, Inc.
    Inventors: Gavin J. Stark, Steven W. Zagorianakos, Ronald N. Fortino
  • Patent number: 9875171
    Abstract: A technique for estimating a format of a log message (LM) according to the present invention includes creating a first directed graph structure by dividing a first LM by predetermined characters to define divided portions as nodes and arranging the nodes in order from the beginning of the first LM; creating a second directed graph structure by performing on a second LM the same processing as that performed on the first LM; comparing nodes in the first directed graph structure with nodes in the second directed graph structure to detect nodes other than nodes including a corresponding character string; adding to the first directed graph structure the node detected in the second directed graph structure among the detected nodes as a first branch node; and estimating the format, based on the first directed graph structure including the first branch node added thereto.
    Type: Grant
    Filed: August 27, 2015
    Date of Patent: January 23, 2018
    Assignee: International Business Machines Corporation
    Inventor: Masayoshi Mizutani
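    Illustrative sketch: a Python example of the comparison described above; each log message is divided on predetermined characters into ordered nodes, two messages are compared position by position, and mismatching positions become branch nodes in the estimated format. The delimiter set and the list-based graph are assumptions for illustration.

      import re

      DELIMITERS = r"[\s:=\[\]]+"      # assumed set of predetermined characters

      def to_nodes(message):
          """Divide a log message into ordered nodes (a linear directed graph)."""
          return [tok for tok in re.split(DELIMITERS, message) if tok]

      def merge(first_nodes, second_nodes):
          """Return a template where differing positions become branch nodes."""
          return [a if a == b else {a, b} for a, b in zip(first_nodes, second_nodes)]

      lm1 = "sshd[123]: Accepted password for alice"
      lm2 = "sshd[456]: Accepted password for bob"
      print(merge(to_nodes(lm1), to_nodes(lm2)))
      # e.g. ['sshd', {'123', '456'}, 'Accepted', 'password', 'for', {'alice', 'bob'}]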
  • Patent number: 9875045
    Abstract: A device for matching, in input data, a regular expression with back-references, represented by a finite-state machine (FSM). The device comprises a plurality of parallel processing elements (PPEs), an interconnection network for interconnecting the PPEs with each other, and a memory for receiving and storing input data. The PPEs process the input data stored in the memory, based on backtracking to process the back-references, and implement FA next state logic to generate new active FA configurations or mark themselves as available to receive active FA configurations. The interconnection network retrieves active FA configurations from the PPEs and allocates the active FA configurations to available PPEs. The PPEs are configured to match a regular expression in the input data.
    Type: Grant
    Filed: July 27, 2015
    Date of Patent: January 23, 2018
    Assignee: International Business Machines Corporation
    Inventors: Kubilay Atasu, Silvio Dragone
  • Patent number: 9778856
    Abstract: The subject disclosure is directed towards one or more parallel storage components for parallelizing block-level input/output associated with remote file data. Based upon a mapping scheme, the file data is partitioned into a plurality of blocks in which each may be equal in size. A translator component of the parallel storage may determine a mapping between the plurality of blocks and a plurality of storage nodes such that at least a portion of the plurality of blocks is accessible in parallel. Such a mapping, for example, may place each block in a different storage node allowing the plurality of blocks to be retrieved simultaneously and in its entirety.
    Type: Grant
    Filed: August 30, 2012
    Date of Patent: October 3, 2017
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Bin Fan, Asim Kadav, Edmund Bernard Nightingale, Jeremy E. Elson, Richard F. Rashid, James W. Mickens
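    Illustrative sketch: a Python example of the mapping idea above; file data is partitioned into equal-size blocks and consecutive blocks are mapped to different storage nodes so they can be retrieved in parallel. The modulo placement rule is an assumption for illustration.

      def partition(data, block_size):
          """Partition file data into blocks of equal size (the last may be shorter)."""
          return [data[i:i + block_size] for i in range(0, len(data), block_size)]

      def map_blocks(num_blocks, storage_nodes):
          """Place block i on node i mod N so adjacent blocks never share a node."""
          return {i: storage_nodes[i % len(storage_nodes)] for i in range(num_blocks)}

      blocks = partition(b"x" * 1000, block_size=256)
      placement = map_blocks(len(blocks), ["node-a", "node-b", "node-c", "node-d"])
      print(placement)   # {0: 'node-a', 1: 'node-b', 2: 'node-c', 3: 'node-d'}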
  • Patent number: 9780978
    Abstract: A system for an orthogonal frequency division multiplexed (OFDM) equalizer, said system comprising a program memory, a program sequencer and a processing unit connected to each other, wherein the processing unit comprises an input selection unit, an arithmetic logic unit (ALU), a coprocessor and an output selection unit; further wherein the program sequencer schedules the processing of one or more symbol-carrier pairs input to said OFDM equalizer using multiple threads; retrieves, for each of the one or more symbol-carrier pairs, multiple program instructions from said program memory; generates multiple expanded instructions corresponding to said retrieved multiple program instructions; and further wherein said ALU performs said processing of the one or more symbol-carrier pairs using the multiple threads across multiple pipeline stages, wherein said processing comprises said ALU executing arithmetic operations to process said expanded instructions using said multiple threads across the multiple pipeline stages.
    Type: Grant
    Filed: December 15, 2016
    Date of Patent: October 3, 2017
    Assignee: Redline Communications Inc.
    Inventor: Octavian Valeriu Sarca
  • Patent number: 9710469
    Abstract: Methods and systems for providing content are disclosed. An example method can comprise identifying a first plurality of data fragments of a media file. An example method can also comprise identifying a second plurality of data fragments of the media file. An example method can comprise generating a manifest file. The manifest file can comprise information for playback of the second plurality of data fragments on a device without access to the first plurality of data fragments.
    Type: Grant
    Filed: March 15, 2013
    Date of Patent: July 18, 2017
    Assignee: Comcast Cable Communications, LLC
    Inventor: Michael Chen
  • Patent number: 9690734
    Abstract: A plurality of data links interconnects a number (N) of nodes of a large-scale, parallel system with minimum data transfer latency. A maximum number (K) of the data links connect each node to the other nodes. The number (N) of the nodes is related to the maximum number (K) of the data links by the expression: N=2K. An average distance (A) of the shortest distances between all pairs of the nodes, and a diameter (D), which is the largest of the shortest distances, are minimized.
    Type: Grant
    Filed: September 8, 2015
    Date of Patent: June 27, 2017
    Inventor: Arjun Kapoor
  • Patent number: 9575756
    Abstract: Embodiments relate to vector processor predication in an active memory device. An aspect includes a system for vector processor predication in an active memory device. The system includes memory in the active memory device and a processing element in the active memory device. The processing element is configured to perform a method including decoding an instruction with a plurality of sub-instructions to execute in parallel. One or more mask bits are accessed from a vector mask register in the processing element. The one or more mask bits are applied by the processing element to predicate operation of a unit in the processing element associated with at least one of the sub-instructions.
    Type: Grant
    Filed: August 8, 2012
    Date of Patent: February 21, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Bruce M. Fleischer, Thomas W. Fox, Hans M. Jacobson, Ravi Nair
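    Illustrative sketch: a Python example of applying mask bits from a vector mask register to predicate the lanes of a parallel operation, as described in the abstract above. Representing the mask as a list of bits and the predicated unit as an element-wise add are assumptions for illustration.

      def predicated_add(dest, a, b, mask):
          """Write a[i] + b[i] into lane i only where the mask bit is set;
          masked-off lanes keep their previous value."""
          return [x + y if m else d for d, x, y, m in zip(dest, a, b, mask)]

      dest = [0, 0, 0, 0]
      a    = [1, 2, 3, 4]
      b    = [10, 20, 30, 40]
      mask = [1, 0, 1, 0]            # bits taken from the vector mask register
      print(predicated_add(dest, a, b, mask))   # [11, 0, 33, 0]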
  • Patent number: 9569211
    Abstract: Embodiments relate to vector processor predication in an active memory device. An aspect includes a method for vector processor predication in an active memory device that includes memory and a processing element. The method includes decoding, in the processing element, an instruction including a plurality of sub-instructions to execute in parallel. One or more mask bits are accessed from a vector mask register in the processing element. The one or more mask bits are applied by the processing element to predicate operation of a unit in the processing element associated with at least one of the sub-instructions.
    Type: Grant
    Filed: August 3, 2012
    Date of Patent: February 14, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Bruce M. Fleischer, Thomas W. Fox, Hans M. Jacobson, Ravi Nair
  • Patent number: 9531781
    Abstract: In a stream computing application, data may be transmitted between operators using tuples. However, the receiving operator may not evaluate these tuples as they arrive but instead wait to evaluate a group of tuples—i.e., a window. A window is typically triggered when a buffer associated with the receiving operator reaches a maximum window size or when a predetermined time period has expired. Additionally, a window may be triggered by monitoring a tuple rate—i.e., the rate at which the operator receives the tuples. If the tuple rate exceeds or falls below a threshold, a window may be triggered. Further, the number of exceptions, or the rate at which an operator throws exceptions, may be monitored. If either of these parameters satisfies a threshold, a window may be triggered, thereby instructing an operator to evaluate the tuples contained within the window.
    Type: Grant
    Filed: December 10, 2012
    Date of Patent: December 27, 2016
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Michael J. Branson, John M. Santosuosso, Brandon W. Schulz
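    Illustrative sketch: a Python example of the triggering conditions described above; a window is evaluated when the buffer reaches the maximum window size, when a time period expires, or when the tuple rate or the exception count crosses a threshold. The concrete thresholds and the clock bookkeeping are assumptions for illustration.

      import time

      class WindowTrigger:
          """Collects tuples and decides when to hand a window to the operator."""

          def __init__(self, max_size=100, max_age_s=5.0,
                       rate_threshold=1000.0, exception_threshold=10):
              self.max_size = max_size                 # maximum window size
              self.max_age_s = max_age_s               # time-based trigger
              self.rate_threshold = rate_threshold     # tuples per second
              self.exception_threshold = exception_threshold
              self.buffer, self.exceptions = [], 0
              self.window_start = time.monotonic()

          def tuple_rate(self):
              elapsed = time.monotonic() - self.window_start
              return len(self.buffer) / elapsed if elapsed >= 0.1 else 0.0

          def should_trigger(self):
              return (len(self.buffer) >= self.max_size
                      or time.monotonic() - self.window_start >= self.max_age_s
                      or self.tuple_rate() >= self.rate_threshold
                      or self.exceptions >= self.exception_threshold)

          def on_tuple(self, t):
              self.buffer.append(t)
              if self.should_trigger():
                  window, self.buffer = self.buffer, []
                  self.exceptions = 0
                  self.window_start = time.monotonic()
                  return window                        # window ready for evaluation
              return None

      w = WindowTrigger(max_size=3)
      print([w.on_tuple(i) for i in range(4)])         # [None, None, [0, 1, 2], None]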
  • Patent number: 9524178
    Abstract: Systems and methods for executing non-native instructions in a computing system having a processor configured to execute native instructions are provided. A dynamic translator uses instruction code translation in parallel with just-in-time (JIT) compilation to execute the non-native instructions. Non-native instructions may be interpreted to generate instruction codes, which may be stored in a shadow memory. During a subsequent scheduling of a non-native instruction for execution, the corresponding instruction code may be retrieved from the shadow memory and executed, thereby avoiding reinterpreting the non-native instruction. In addition, the JIT compiler may compile instruction codes to generate native instructions, which may be made available for execution, further speeding up the execution process.
    Type: Grant
    Filed: December 30, 2013
    Date of Patent: December 20, 2016
    Assignee: Unisys Corporation
    Inventors: Andrew T Jennings, Charles R Caldarale, Maurice Marks, Kevin Harris
  • Patent number: 9513963
    Abstract: A SIMD processing unit processes a plurality of tasks which each include up to a predetermined maximum number of work items. The work items of a task are arranged for executing a common sequence of instructions on respective data items. The data items are arranged into blocks, with some of the blocks including at least one invalid data item. Work items which relate to invalid data items are invalid work items. The SIMD processing unit comprises a group of processing lanes configured to execute instructions of work items of a particular task over a plurality of processing cycles. A control module assembles work items into the tasks based on the validity of the work items, so that invalid work items of the particular task are temporally aligned across the processing lanes. In this way the number of wasted processing slots due to invalid work items may be reduced.
    Type: Grant
    Filed: December 17, 2014
    Date of Patent: December 6, 2016
    Assignee: Imagination Technologies Limited
    Inventors: John Howson, Jonathan Redshaw, Yoong Chert Foo
  • Patent number: 9507394
    Abstract: A monolithically integrated circuit with one or more supply overrides without need of an override control pin to the IC is presented. The internal circuitry to control such an override is presented and various override conditions are also presented.
    Type: Grant
    Filed: March 29, 2013
    Date of Patent: November 29, 2016
    Assignee: Peregrine Semiconductor Corporation
    Inventor: Robert Mark Englekirk
  • Patent number: 9509780
    Abstract: A node includes a sending unit that sends a signal to another node; a receiving unit that receives a signal from another node; a determining unit that determines that synchronization has been established with the other node when the sending unit sends a signal to that node, determines that synchronization has been established with the other node when the receiving unit receives a signal from that node, and determines that synchronization has been established with a node that has already established synchronization with two other nodes with each of which synchronization has been established; and a selecting unit that selects, as the other node at the sending destination for the signal, an information processing apparatus for which the determining unit has not determined that synchronization has been established.
    Type: Grant
    Filed: July 28, 2014
    Date of Patent: November 29, 2016
    Assignee: FUJITSU LIMITED
    Inventors: Naoto Fukumoto, Akira Naruse, Kohta Nakashima
  • Patent number: 9465675
    Abstract: An arithmetic processing device executes a program, and gives first sequence information to a first start time when a first process included in the program starts a first interprocess communication. Then, the first start time and the first sequence information are written in a main storage device. When second sequence information given to a second start time when a second process starts a second interprocess communication is newer than the first sequence information, an operational circuit in a communication control device does not carry out an operation using the first start time. On the other hand, when the second sequence information corresponds to the first sequence information, the operational circuit carries out an operation using the first start time and the second start time and outputs the operation result.
    Type: Grant
    Filed: November 12, 2014
    Date of Patent: October 11, 2016
    Assignee: FUJITSU LIMITED
    Inventors: Hideki Miwa, Ikuo Miyoshi
  • Patent number: 9405724
    Abstract: A reconfigurable tree apparatus with a bypass mode and a method of using the reconfigurable tree apparatus are disclosed. The reconfigurable tree apparatus uses a short-circuit register to selectively designate participating agents for such operations as barriers, multicast, and reductions. The reconfigurable tree apparatus enables an agent to initiate a barrier, multicast, or reduction operation, leaving software to determine the participating agents for each operation. Although the reconfigurable tree apparatus is implemented using a small number of wires, multiple in-flight barrier, multicast, and reduction operations can take place. The method and apparatus have low complexity, easy reconfigurability, and provide the energy savings necessary for future exa-scale machines.
    Type: Grant
    Filed: June 28, 2013
    Date of Patent: August 2, 2016
    Assignee: INTEL CORPORATION
    Inventors: Jianping Xu, Asit K. Mishra, Joshua B. Fryman, David S. Dunning
  • Patent number: 9405647
    Abstract: Some implementations provide techniques and arrangements for detecting a register value having a life longer than a threshold period based, at least in part, on at least one code segment of a code being translated by a binary translator. For a register value detected as having a life longer than a threshold period, at least one instruction to cause an access of the detected register value during the life of the register value may be included in at least one translated code segment to be output by the binary translator.
    Type: Grant
    Filed: December 30, 2011
    Date of Patent: August 2, 2016
    Assignee: Intel Corporation
    Inventors: Xavier Vera, Javier Carretero Casado, Matteo Monchiero, Tanausu Ramirez, Enric Herrero
  • Patent number: 9390057
    Abstract: An array processor includes processing elements arranged in clusters to form a rectangular array. Inter-cluster communication paths are mutually exclusive. Due to the mutual exclusivity of the data paths, communications between the processing elements of each cluster may be combined in a single inter-cluster path, thus eliminating half the wiring required for the path. The length of the longest communication path is not directly determined by the overall dimension of the array, as in conventional torus arrays. Rather, the longest communications path is limited by the inter-cluster spacing. Transpose elements of an N×N torus may be combined in clusters and communicate with one another through intra-cluster communications paths. Transpose operation latency is eliminated in this approach. Each PE may have a single transmit port and a single receive port. Thus, the individual PEs are decoupled from the array topology.
    Type: Grant
    Filed: September 14, 2012
    Date of Patent: July 12, 2016
    Assignee: Altera Corporation
    Inventors: Gerald George Pechanek, Charles W. Kurak, Jr.
  • Patent number: 9368489
    Abstract: Embodiments of the invention relate to processor arrays, and in particular, a processor array with interconnect circuits for bonding semiconductor dies. One embodiment comprises multiple semiconductor dies and at least one interconnect circuit for exchanging signals between the dies. Each die comprises at least one processor core circuit. Each interconnect circuit corresponds to a die of the processor array. Each interconnect circuit comprises one or more attachment pads for interconnecting a corresponding die with another die, and at least one multiplexor structure configured for exchanging bus signals in a reversed order.
    Type: Grant
    Filed: February 28, 2013
    Date of Patent: June 14, 2016
    Assignee: International Business Machines Corporation
    Inventors: Rodrigo Alvarez-Icaza Rivera, John V. Arthur, John E. Barth, Andrew S. Cassidy, Subramanian S. Iyer, Bryan L. Jackson, Paul A. Merolla, Dharmendra S. Modha, Jun Sawada
  • Patent number: 9363137
    Abstract: Embodiments of the invention relate to fault recovery mechanisms for a three-dimensional (3-D) network on a processor array. One embodiment comprises a multidimensional switch network for a processor array. The switch network comprises multiple switches for routing packets between multiple core circuits of the processor array. The switches are organized into multiple planes. The switch network further comprises a redundant plane including multiple redundant switches. Multiple data paths interconnect the switches. The redundant plane is used to facilitate full operation of the processor array in the event of one or more component failures.
    Type: Grant
    Filed: August 6, 2015
    Date of Patent: June 7, 2016
    Assignee: International Business Machines Corporation
    Inventors: Rodrigo Alvarez-Icaza Rivera, John V. Arthur, John E. Barth, Jr., Andrew S. Cassidy, Subramanian Iyer, Paul A. Merolla, Dharmendra S. Modha
  • Patent number: 9330060
    Abstract: A method and device for encoding and decoding video image data. An MPEG decoding and encoding process using a data flow pipeline architecture implemented using complete dedicated logic is provided. A plurality of fixed-function data processors are interconnected with at least one pipelined data transmission line. At least one of the fixed-function processors performs a predefined encoding/decoding function upon receiving a set of predefined data from said transmission line. Stages of the pipeline are synchronized on data without requiring a central traffic controller. This architecture provides better performance with a smaller size, lower power consumption, and better usage of memory bandwidth.
    Type: Grant
    Filed: April 15, 2004
    Date of Patent: May 3, 2016
    Assignee: NVIDIA CORPORATION
    Inventor: Eric Kwong-Hang Tsang
  • Patent number: 9330230
    Abstract: Validating a cabling topology in a distributed computing system comprised of cabled nodes connected using data communications cables, each cabled node characterized by cabling dimensions, each cable corresponding to one of the cabling dimensions, includes: receiving a selection from a user of at least one cabled node for topology validation; identifying, for each cabling dimension for each selected cabled node, a shortest cabling path; determining, for each cabling dimension, whether the number of cabled nodes in the shortest cabling path for each selected cabled node match; and if, for each cabling dimension, the number of cabled nodes in the shortest cabling path for each selected cabled node match: selecting, for each cabling dimension, the number of cabled nodes in the shortest cabling path as a representative value for the cabling dimension, calculating a product of the representative values, and determining whether the product equals the number of selected cabled nodes.
    Type: Grant
    Filed: April 19, 2007
    Date of Patent: May 3, 2016
    Assignee: International Business Machines Corporation
    Inventors: Charles J. Archer, Mark G. Megerian
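    Illustrative sketch: a Python example of the validation check described above; for each cabling dimension the selected nodes must report the same shortest-cabling-path length, and the product of those per-dimension lengths must equal the number of selected nodes. The dictionary input format is an assumption for illustration.

      from math import prod

      def validate_topology(selected_nodes, shortest_path_counts):
          """shortest_path_counts[node][dim] = number of cabled nodes in the
          shortest cabling path for that node in that cabling dimension."""
          dims = shortest_path_counts[selected_nodes[0]].keys()
          representatives = {}
          for dim in dims:
              counts = {shortest_path_counts[n][dim] for n in selected_nodes}
              if len(counts) != 1:          # per-node counts must match
                  return False
              representatives[dim] = counts.pop()
          return prod(representatives.values()) == len(selected_nodes)

      # A 2 x 3 mesh of six cabled nodes: each node sees 2 nodes along
      # dimension 'x' and 3 along dimension 'y'.
      nodes = [f"n{i}" for i in range(6)]
      counts = {n: {"x": 2, "y": 3} for n in nodes}
      print(validate_topology(nodes, counts))           # True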
  • Patent number: 9323716
    Abstract: A reconfigurable hierarchical computer architecture having N levels, where N is an integer value greater than one, wherein said N levels include a first level including a first computation block including a first data input, a first data output and a plurality of computing nodes interconnected by a first connecting mechanism, each computing node including an input port, a functional unit and an output port, the first connecting mechanism capable of connecting each output port to the input port of each other computing node; and a second level including a second computation block including a second data input, a second data output and a plurality of the first computation blocks interconnected by a second connecting means for selectively connecting the first data output of each of the first computation blocks and the second data input to each of the first data inputs and for selectively connecting each of the first data outputs to the second data output.
    Type: Grant
    Filed: July 11, 2014
    Date of Patent: April 26, 2016
    Assignee: STMICROELECTRONICS SA
    Inventor: Joël Cambonie
  • Patent number: 9282037
    Abstract: A multiprocessor computer system comprises a dragonfly processor interconnect network that comprises a plurality of processor nodes and a plurality of routers. The routers are operable to route data by selecting from among a plurality of network paths from a target node to a destination node in the dragonfly network based on one or more routing tables.
    Type: Grant
    Filed: November 7, 2011
    Date of Patent: March 8, 2016
    Assignee: INTEL CORPORATION
    Inventors: Mike Parker, Steve Scott, Albert Cheng, Robert Alverson