Cube Or Hypercube Patents (Class 712/12)
  • Patent number: 11886934
    Abstract: A data processing system comprising a plurality of processing nodes, each comprising at least one memory configured to store an array of data items, wherein each of the plurality of processing nodes is configured to execute compute instructions during a compute phase and, following a precompiled synchronisation barrier, enter at least one exchange phase. During the at least one exchange phase, a series of collective operations is carried out. Each processing node is configured to perform a reduce-scatter collective in at least one first dimension. Using the results of the reduce-scatter collective, each processing node performs an allreduce in a second dimension. The processing nodes then perform an all-gather collective in the at least one first dimension using the results of the allreduce.
    Type: Grant
    Filed: July 14, 2020
    Date of Patent: January 30, 2024
    Assignee: GRAPHCORE LIMITED
    Inventors: Lorenzo Cevolani, Fabian Tschopp, Ola Torudbakken
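    Illustration: a minimal NumPy sketch (not Graphcore's implementation) of the three-phase collective sequence described above, simulating an R x C grid of nodes that each hold a local array; after reduce-scatter down the columns, allreduce across the rows, and all-gather back up the columns, every node holds the global sum.
```python
import numpy as np

R, C, L = 4, 3, 12                      # R x C grid of nodes, local arrays of length L (L % R == 0)
rng = np.random.default_rng(0)
data = rng.integers(0, 10, size=(R, C, L)).astype(float)   # data[r, c] = array held by node (r, c)
chunk = L // R

# Phase 1: reduce-scatter in the first dimension (down each column).
# Node (r, c) keeps only chunk r, summed over the nodes in its column.
col_sums = data.sum(axis=0)                                          # shape (C, L)
scattered = np.stack([[col_sums[c, r*chunk:(r+1)*chunk]
                       for c in range(C)] for r in range(R)])        # shape (R, C, chunk)

# Phase 2: allreduce in the second dimension (across each row).
reduced = scattered.sum(axis=1, keepdims=True).repeat(C, axis=1)     # shape (R, C, chunk)

# Phase 3: all-gather in the first dimension (back along each column).
gathered = np.concatenate([reduced[r] for r in range(R)], axis=-1)   # shape (C, L)
result = np.broadcast_to(gathered, (R, C, L))

assert np.allclose(result, data.sum(axis=(0, 1)))    # every node holds the global sum
```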
  • Patent number: 11853721
    Abstract: An interrupt-driven system verification method based on interrupt sequence diagrams includes the steps of: establishing an interrupt-driven system model based on an interrupt sequence diagram, dividing interaction fragments in the obtained interrupt sequence diagram into basic interaction fragments and composite interaction fragments and sequentially converting the basic interaction fragments and the composite interaction fragments into the corresponding automaton models, combining the automaton models into one automaton model, adding the constraints in the interrupt sequence diagram to the converted automaton model, adding the verification attribute information as a constraint to the converted automaton model, describing an automaton as an input format acceptable to the automaton verification tool, and verifying the model with the automaton verification tool.
    Type: Grant
    Filed: July 17, 2019
    Date of Patent: December 26, 2023
    Assignee: NANJING UNIVERSITY
    Inventors: Minxue Pan, Shouyu Chen, Tian Zhang, Linzhang Wang, Xuandong Li
  • Patent number: 11681907
    Abstract: A computation unit for performing a computation of a neural network layer is disclosed. A number of processing element (PE) units are arranged in an array. First input values are provided in parallel in an input dimension of the array during a first processing period, and second input values are provided in parallel in the input dimension during a second processing period. Computations are performed by the PE units based on stored weight values. A first adder coupled to the first set of PE units generates a first sum of results of the computations by the first set of PE units during the first processing period, and generates a second sum of results of the computations during the second processing period. A first accumulator coupled to the first adder stores the first sum, and further shifts the first sum to a second accumulator prior to storing the second sum.
    Type: Grant
    Filed: October 14, 2022
    Date of Patent: June 20, 2023
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Hamzah Abdelaziz, Joseph Hassoun, Ali Shafiee Ardestani
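    Illustration: a hypothetical sketch (names and sizes invented) of one column of the described array: each PE holds a stored weight, the adder reduces the per-PE products each processing period, and the first accumulator shifts its previous sum to a second accumulator before capturing the new one.
```python
import numpy as np

weights = np.array([2.0, -1.0, 0.5, 3.0])        # one stored weight per PE in the column
inputs_period1 = np.array([1.0, 1.0, 2.0, 0.0])  # values arriving along the input dimension, period 1
inputs_period2 = np.array([0.5, 2.0, 1.0, 1.0])  # period 2

acc1 = acc2 = 0.0
for x in (inputs_period1, inputs_period2):
    products = weights * x        # each PE multiplies its input by its stored weight
    total = products.sum()        # adder coupled to the column of PEs
    acc2 = acc1                   # first accumulator shifts its previous sum onward...
    acc1 = total                  # ...before storing the new sum

print(acc1, acc2)                 # acc1 = period-2 sum, acc2 = period-1 sum
```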
  • Patent number: 11675899
    Abstract: Aspects include circuitry that includes a first global generation counter (GGC) that is increased upon decoding of a branch instruction and a second GGC that is increased upon completion of the branch instruction. Upon a triggered rollback, the first GGC is reset. The circuitry also includes a generation tag memory, associated with a register that receives loads during side-channel attacks, which is set to the first GGC upon a first load, and a determination unit to determine, for a second load from an address depending on the register of the first load, a generation tag value associated with the register of the second load as a function of the first GGC, the second GGC, and the generation tag value associated with the register of the first load. A wait queue is configured to block the second load if the generation tag value is larger than the second GGC.
    Type: Grant
    Filed: December 15, 2020
    Date of Patent: June 13, 2023
    Assignee: International Business Machines Corporation
    Inventors: Christian Borntraeger, Jonathan D. Bradbury, Martin Recktenwald, Anthony Saporito
  • Patent number: 11579978
    Abstract: In one approach, filesets to be backed up are divided into partitions and snapshots are pulled for each partition. In one architecture, a data management and storage (DMS) cluster includes a plurality of peer DMS nodes and a distributed data store implemented across the peer DMS nodes. One of the peer DMS nodes receives fileset metadata for the fileset and defines a plurality of partitions for the fileset based on the fileset metadata. The peer DMS nodes operate autonomously to execute jobs to pull snapshots for each of the partitions and to store the snapshots of the partitions in the distributed data store.
    Type: Grant
    Filed: February 14, 2018
    Date of Patent: February 14, 2023
    Assignee: Rubrik, Inc.
    Inventors: Looi Chow Lee, Guilherme Vale Ferreira Menezes
  • Patent number: 11573902
    Abstract: A coherent data processing system includes a system fabric communicatively coupling a plurality of coherence participants and fabric control logic. The fabric control logic quantifies congestion on the system fabric based on coherence messages associated with commands issued on the system fabric. Based on the congestion on the system fabric, the fabric control logic determines a rate of request issuance applicable to a set of coherence participants among the plurality of coherence participants. The fabric control logic issues at least one rate command to set a rate of request issuance to the system fabric of the set of coherence participants.
    Type: Grant
    Filed: August 18, 2021
    Date of Patent: February 7, 2023
    Assignee: International Business Machines Corporation
    Inventors: Hugh Shen, Guy L. Guthrie, Jeffrey A. Stuecheli, Luke Murray, Alexander Michael Taft, Bernard C. Drerup, Derek E. Williams
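    Illustration: a toy sketch of the feedback loop outlined above; congestion is estimated from retry-type coherence responses and the permitted request-issuance rate is adjusted for a set of participants. The thresholds, step sizes, and response counts are invented for illustration.
```python
def adjust_rate(current_rate, retries, total_responses,
                max_rate=64, high_water=0.10, low_water=0.02):
    retry_ratio = retries / max(total_responses, 1)
    if retry_ratio > high_water:          # fabric looks congested: cut the issuance rate
        return max(1, current_rate // 2)
    if retry_ratio < low_water:           # fabric looks idle: creep the rate back up
        return min(max_rate, current_rate + 1)
    return current_rate                   # otherwise hold steady

rate = 32
for retries, total in [(200, 1000), (30, 1000), (5, 1000)]:
    rate = adjust_rate(rate, retries, total)
    print(rate)                            # 16, 16, 17
```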
  • Patent number: 11516087
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer-storage media, for connecting processors using twisted torus configurations. In some implementations, a cluster of processing nodes is coupled using a reconfigurable interconnect fabric. The system determines a number of processing nodes to allocate as a network within the cluster and a topology for the network. The system selects an interconnection scheme for the network, where the interconnection scheme is selected from a group that includes at least a torus interconnection scheme and a twisted torus interconnection scheme. The system allocates the determined number of processing nodes of the cluster in the determined topology, sets the reconfigurable interconnect fabric to provide the selected interconnection scheme for the processing nodes in the network, and provides access to the network for performing a computing task.
    Type: Grant
    Filed: December 11, 2020
    Date of Patent: November 29, 2022
    Assignee: Google LLC
    Inventor: Brian Patrick Towles
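    Illustration: a small sketch contrasting the two interconnection schemes named in the abstract for a 2D example; in a plain torus the wrap-around link in x returns to the same row, while in a twisted torus the wrap also shifts the other coordinate by a fixed twist. The twist value and grid size are illustrative.
```python
def torus_neighbors(x, y, X, Y):
    return {((x + 1) % X, y), ((x - 1) % X, y),
            (x, (y + 1) % Y), (x, (y - 1) % Y)}

def twisted_torus_neighbors(x, y, X, Y, twist=1):
    # Only the links that wrap around in x apply the twist to the y coordinate.
    right = ((x + 1) % X, (y + twist) % Y) if x == X - 1 else (x + 1, y)
    left  = ((x - 1) % X, (y - twist) % Y) if x == 0 else (x - 1, y)
    return {right, left, (x, (y + 1) % Y), (x, (y - 1) % Y)}

X, Y = 4, 4
print(torus_neighbors(3, 0, X, Y))           # wrap in x stays in row 0
print(twisted_torus_neighbors(3, 0, X, Y))   # wrap in x lands in row 1
```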
  • Patent number: 11507817
    Abstract: A computation unit for performing a computation of a neural network layer is disclosed. A number of processing element (PE) units are arranged in an array. First input values are provided in parallel in an input dimension of the array during a first processing period, and second input values are provided in parallel in the input dimension during a second processing period. Computations are performed by the PE units based on stored weight values. A first adder coupled to the first set of PE units generates a first sum of results of the computations by the first set of PE units during the first processing period, and generates a second sum of results of the computations during the second processing period. A first accumulator coupled to the first adder stores the first sum, and further shifts the first sum to a second accumulator prior to storing the second sum.
    Type: Grant
    Filed: June 12, 2020
    Date of Patent: November 22, 2022
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Hamzah Abdelaziz, Joseph Hassoun, Ali Shafiee Ardestani
  • Patent number: 11178072
    Abstract: There may be provided a non-uniform Benes network that may include a first Benes network portion that has a first number (k) of first inputs and k first outputs; a second Benes network portion that has a second number (j) of second inputs and j second outputs, wherein j is smaller than k; and a set of multiplexers that are coupled between a set of switches of an intermediate layer of the first Benes network portion and a first layer of the second Benes network portion.
    Type: Grant
    Filed: December 14, 2017
    Date of Patent: November 16, 2021
    Assignee: Mobileye Vision Technologies Ltd.
    Inventors: Daniel Srebnik, Emmanuel Sixou, Gil Israel Dogon, Dror Livne
  • Patent number: 11165659
    Abstract: The invention relates in particular to an administration server (S) of a supercomputer suitable for: first loading information on the environment of said supercomputer; receiving an administration task transmitted by an administration client (C1, C2); executing said administration task using said previously loaded information; and transmitting the results of the execution of said administration task to said administration client.
    Type: Grant
    Filed: September 13, 2016
    Date of Patent: November 2, 2021
    Assignee: BULL SAS
    Inventors: Pierre Vigneras, Sebastien Miquee
  • Patent number: 11032905
    Abstract: A control system for an unmanned vehicle (UV) comprises a housing defining an interior, a first circuit board disposed within the interior, and a second circuit board disposed within the interior. The first circuit board includes one or more processing circuits including a first processing system and a second processing system having heterogeneous field programmable architectures. The second circuit board includes a plurality of interface circuits associated with a plurality of vehicle devices of the UV. The second circuit board is in operative communication with the first circuit board and includes an input/output (I/O) interface between the plurality of interface circuits and the first and second processing systems.
    Type: Grant
    Filed: January 19, 2018
    Date of Patent: June 8, 2021
    Assignee: GE AVIATION SYSTEMS LLC
    Inventor: Stefano Angelo Mario Lassini
  • Patent number: 10915450
    Abstract: A data analysis system to analyze data. The data analysis system includes a data buffer configured to receive data to be analyzed. The data analysis system also includes a state machine lattice. The state machine lattice includes multiple data analysis elements and each data analysis element includes multiple memory cells configured to analyze at least a portion of the data and to output a result of the analysis. The data analysis system includes a buffer interface configured to receive the data from the data buffer and to provide the data to the state machine lattice.
    Type: Grant
    Filed: July 16, 2019
    Date of Patent: February 9, 2021
    Assignee: Micron Technology, Inc.
    Inventors: David R. Brown, Harold B Noyes, Inderjit Singh Bains
  • Patent number: 10853072
    Abstract: An arithmetic processing apparatus includes an instruction controller, a first level cache, and a second level cache. For a memory access instruction to be speculatively executed, i.e., executed while a branch destination of a branch instruction is undetermined, the instruction controller adds a valid speculation flag and an instruction identifier of the branch instruction to the memory access instruction and issues it to the first level cache. The first level cache controller interrupts execution of the memory access instruction when a virtual address of the memory access instruction hits in a TLB of the first level cache, the speculation flag of the memory access instruction is valid, and the entry having a virtual address matching that of the memory access instruction stores a speculative access prohibition flag prohibiting speculative access.
    Type: Grant
    Filed: May 28, 2019
    Date of Patent: December 1, 2020
    Assignee: FUJITSU LIMITED
    Inventor: Masaharu Maruyama
  • Patent number: 10831482
    Abstract: An arithmetic processing apparatus includes a decoder, a first cache memory, a second cache memory and a processor. The processor performs a cache hit determination on the first cache memory in response to a memory access instruction, and issues a data request to the second cache memory when the cache hit determination is a cache miss. When the memory access instruction is to be speculatively executed in a state where a branch destination of a branch instruction is not yet established, the decoder issues the memory access instruction with a valid prohibition flag and an instruction identifier. In a case where the cache hit determination is the cache miss and the prohibition flag is valid, the processor does not issue the data request to the second cache memory. In a case where the cache hit determination is a cache hit, the processor acquires data from the first cache memory.
    Type: Grant
    Filed: May 30, 2019
    Date of Patent: November 10, 2020
    Assignee: FUJITSU LIMITED
    Inventors: Yasunobu Akizuki, Toshio Yoshida
  • Patent number: 10789202
    Abstract: A method is described. The method includes configuring a first instance of object code to execute on a processor. The processor has multiple cores and an internal network. The internal network is configured in a first configuration that enables a first number of the cores to be communicatively coupled. The method also includes configuring a second instance of the object code to execute on a second instance of the processor. A respective internal network of the second instance of the processor is configured in a second configuration that enables a different number of cores to be communicatively coupled, wherein same-positioned cores on the processor and the second instance of the processor have the same network addresses for the first and second configurations. A processor having an internal network designed to enable the above method is also described.
    Type: Grant
    Filed: May 12, 2017
    Date of Patent: September 29, 2020
    Assignee: Google LLC
    Inventors: Jason Redgrave, Albert Meixner, Ji Kim, Ofer Shacham
  • Patent number: 10728179
    Abstract: Techniques are disclosed for pushing configuration changes of a distributed virtual switch from a management server to a plurality of host servers underlying the distributed virtual switch. The approach includes sending, in parallel, by the management server, a message to each of the plurality of host servers. The message specifies a final configuration state for one or more virtual ports emulated via virtualization layers of the host servers. The approach further includes determining, by each of the plurality of host servers, port state configuration changes to make to the virtual ports to achieve the final configuration state, and reconfiguring, by each of the plurality of host servers, their respective virtual ports, to match the final configuration state.
    Type: Grant
    Filed: December 16, 2015
    Date of Patent: July 28, 2020
    Assignee: VMware, Inc.
    Inventors: Mukesh Baphna, Chi-Hsiang Su, Piyush Kothari, Geetha Kakarlapudi
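    Illustration: a hedged sketch of the push model described above; the management server sends only the desired final port state, and each host computes the delta against its current virtual-port table to decide what to reconfigure. Field names are illustrative, not VMware's API.
```python
desired = {"port1": {"vlan": 10, "mtu": 9000},
           "port2": {"vlan": 20, "mtu": 1500}}          # final configuration state pushed to all hosts

def reconcile(current, desired):
    """Return, per port, only the settings that must change to reach the final state."""
    changes = {}
    for port, cfg in desired.items():
        cur = current.get(port, {})
        delta = {k: v for k, v in cfg.items() if cur.get(k) != v}
        if delta:
            changes[port] = delta
    return changes

host_state = {"port1": {"vlan": 10, "mtu": 1500},
              "port2": {"vlan": 20, "mtu": 1500}}       # this host's current virtual ports
print(reconcile(host_state, desired))                   # {'port1': {'mtu': 9000}}
```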
  • Patent number: 10678713
    Abstract: A low-latency, high-bandwidth, and highly scalable method delivers data from a source device to multiple communication devices on a communication network. Under this method, the communication devices (also called player nodes) provide download and upload bandwidth for each other. In this manner, the bandwidth requirement on the data source is significantly reduced, and the data delivery network scales without limit as the number of player nodes grows. In one embodiment, a computer network includes (a) a source server that provides a data stream for delivery in the computer network, (b) player nodes that exchange data with each other to obtain a complete copy of the data stream, the player nodes being capable of dynamically joining or exiting the computer network, and (c) a control server which maintains a topology graph representing connections between the source server and the player nodes, and the connections among the player nodes themselves.
    Type: Grant
    Filed: May 7, 2018
    Date of Patent: June 9, 2020
    Inventor: Wensheng Hua
  • Patent number: 10601944
    Abstract: An approach for cached content identification for adaptive data streaming. A first request is received, requesting a current segment from a sequence of segments from a data file of a streaming data session. A NewVideoFlag is determined as indicating that the sequence of segments associated with the first request is not currently being cached. The first request is forwarded to a content server, and a first response message is received. A SegmentID of the received content segment is determined as not matching that of cached content segments. The NewVideoFlag is set to indicate that the segments from the streaming data session file are currently being cached. A global cVideoFileID is generated identifying the streaming session data file being cached. The content segment is cached, and cache bookkeeping is updated to associate the segment with the SegmentID and the cVideoFileID. The first response message is provided to the client device.
    Type: Grant
    Filed: August 20, 2018
    Date of Patent: March 24, 2020
    Assignee: Hughes Network Systems, LLC
    Inventors: Chi-Jiun Su, Udaya Bhaskar
  • Patent number: 10311023
    Abstract: An apparatus includes a processor component to: transmit node device identifiers to multiple node devices to define an ordering thereamong and among subsets of multiple blocks of data distributed thereamong; receive sizes of the subsets from the multiple node devices; derive block exchanges among the multiple node devices based on the sizes and a minimum size imposed on data transmissions to storage device(s); and transmit a block exchange vector that describes the block exchanges to the multiple node devices, wherein: the subsets remain distributed among a reduced number of the multiple node devices following the block exchanges; all but at most one of the reduced number of node devices store an amount of the blocks of data exceeding the minimum size; and the block exchanges are all lower-order to higher-order node device transfers, or all higher-order to lower-order node device transfers.
    Type: Grant
    Filed: December 27, 2018
    Date of Patent: June 4, 2019
    Assignee: SAS INSTITUTE INC.
    Inventors: Brian Payton Bowman, Jeff Ira Cleveland, III
  • Patent number: 10241706
    Abstract: In a related-art semiconductor device, memory protection cannot be performed against access to a shared memory by a sub-arithmetic unit used by a program executed in a main-arithmetic unit. According to one embodiment, a semiconductor device includes a sub-arithmetic unit configured to execute part of a program executed by a main-arithmetic unit, and a shared memory shared by the main-arithmetic unit and the sub-arithmetic unit, in which the sub-arithmetic unit includes a memory protection unit configured to permit or prohibit access to the shared memory arising from a process executed by the sub-arithmetic unit, based on an access permission range address value provided from the main-arithmetic unit.
    Type: Grant
    Filed: April 28, 2017
    Date of Patent: March 26, 2019
    Assignee: RENESAS ELECTRONICS CORPORATION
    Inventors: Seiji Mochizuki, Katsushige Matsubara, Ren Imaoka, Hiroshi Ueda, Ryoji Hashimoto, Toshiyuki Kaya
  • Patent number: 10146613
    Abstract: A set of processors in a symmetric multiprocessor (SMP) system are deconfigured following a first failed processor to return the SMP system to a symmetric state. One or more deconfiguration options are identified, and a respective cost is calculated for each deconfiguration option. A deconfiguration option is selected and applied to the SMP system based on the respective costs of the one or more identified deconfiguration options.
    Type: Grant
    Filed: October 14, 2015
    Date of Patent: December 4, 2018
    Assignee: International Business Machines Corporation
    Inventors: Jayanth Othayoth, Venkatesh Sainath, Vishwanatha Subbanna, Dhruvaraj Subhashchandran
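    Illustration: a small sketch of the selection step; each candidate deconfiguration option (a set of processors to take offline to restore symmetry) is assigned a cost, and the cheapest option is applied. The options, cost model, and weights are invented.
```python
options = {
    "deconfigure_node_pair_A": {"processors": 4, "lost_memory_gb": 64},
    "deconfigure_node_pair_B": {"processors": 4, "lost_memory_gb": 32},
    "deconfigure_drawer":      {"processors": 8, "lost_memory_gb": 128},
}

def cost(option):
    # Weight lost compute more heavily than lost memory (purely illustrative weights).
    return 10 * option["processors"] + option["lost_memory_gb"]

best = min(options, key=lambda name: cost(options[name]))
print(best, cost(options[best]))   # deconfigure_node_pair_B 72
```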
  • Patent number: 10078517
    Abstract: A system pipelines signal flow graphs across a plurality of shared-memory processors organized in a 3D physical arrangement, with the memory overlaid on the processor nodes, which reduces storage of temporary variables. A group function is formed by two or more instructions that specify two or more parts of the group function. A first instruction specifies a first part and specifies control information for a second instruction adjacent to the first instruction or at a pre-specified location relative to the first instruction. The first instruction, when executed, transfers the control information to a pending register and produces a result which is transferred to an operand input associated with the second instruction. The second instruction specifies a second part of the group function and, when executed, transfers the control information from the pending register to a second execution unit to adjust the second execution unit's operation on the received operand.
    Type: Grant
    Filed: August 16, 2016
    Date of Patent: September 18, 2018
    Inventor: Gerald George Pechanek
  • Patent number: 10032723
    Abstract: Versioning circuitry for use with a semiconductor chip having multiple layers includes multiple status bits. The versioning circuitry includes, for each status bit, gate circuitry, first selector circuitry in a first layer, and second selector circuitry in a second layer. The gate circuitry generates a value for the status bit based at least on a first input and a second input. The first selector circuitry is coupled to the gate circuitry and is configured to select a value for the first input. The second selector circuitry is coupled to the gate circuitry and is configured to select a value for the second input. The gate circuitry generates a default value for the status bit when the first input and the second input each have a default value, and generates an opposite value for the status bit when either the first input or the second input has an opposite value.
    Type: Grant
    Filed: November 30, 2016
    Date of Patent: July 24, 2018
    Assignee: Intel Corporation
    Inventors: Karthik Tammanur Ranganathan, Jau Soon Chee, Himanshu Kukreja
  • Patent number: 10015056
    Abstract: System, method, and apparatus for improving the performance of collective operations in High Performance Computing (HPC). Compute nodes in a networked HPC environment form collective groups to perform collective operations. A spanning tree is formed including the compute nodes and switches and links used to interconnect the compute nodes, wherein the spanning tree is configured such that there is only a single route between any pair of nodes in the tree. The compute nodes implement processes for performing the collective operations, which includes exchanging messages between processes executing on other compute nodes, wherein the messages contain indicia identifying collective operations they belong to. Each switch is configured to implement message forwarding operations for its portion of the spanning tree. Each of the nodes in the spanning tree implements a ratcheted cyclical state machine that is used for synchronizing collective operations, along with status messages that are exchanged between nodes.
    Type: Grant
    Filed: July 12, 2016
    Date of Patent: July 3, 2018
    Assignee: Intel Corporation
    Inventors: Michael Heinz, Todd Rimmer, James Kunz, Mark Debbage
  • Patent number: 10001933
    Abstract: A host device can offload certain copy operations to an I/O adapter device coupled to the host device. The I/O adapter device can perform a copy operation to copy data from a source storage volume to a destination storage volume. The source storage volume and the destination storage volume can be local or remote to the I/O adapter device. The copy operations can be performed for replica creation, online migration or for copy-on-write snapshots.
    Type: Grant
    Filed: June 23, 2015
    Date of Patent: June 19, 2018
    Assignee: Amazon Technologies, Inc.
    Inventor: Robert Michael Johnson
  • Patent number: 9965434
    Abstract: Proposed is an action machine for processing packet data in a network processor. The action machine comprises: first and second data storage units adapted to store data for processing; and a processing unit adapted to process data from the first and second data storage units. The first storage unit is adapted to be accessed by the processing unit and a unit external to the action machine, and the second storage unit is adapted to only be accessed by the processing unit.
    Type: Grant
    Filed: September 23, 2015
    Date of Patent: May 8, 2018
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Fabrice Jean Verplanken, Francois Abel, Claude Basso, Damon Philipe
  • Patent number: 9956973
    Abstract: A system, method, and apparatus for generating vital messages on an on-board system of a vehicle is disclosed. The method includes generating a plurality of vital messages with each processor of a plurality of different processors of the on-board system based on train data available to each processor, transmitting the plurality of vital messages from the plurality of different processors to a separate processor, and generating, by the separate processor, a final vital message based on at least two vital messages of the plurality of vital messages. A system and an apparatus for implementing the aforementioned method includes appropriately communicatively connected hardware components.
    Type: Grant
    Filed: July 6, 2015
    Date of Patent: May 1, 2018
    Assignee: Westinghouse Air Brake Technologies Corporation
    Inventors: Kristofer M. Ruhland, Kendrick W. Gawne, James L. Fenske
  • Patent number: 9769112
    Abstract: A method of operating a hypercube network of processing devices includes determining that a plurality of the processing devices are storing data to be processed at a single processing device, obtaining the addresses of the plurality of processing devices storing the data to be processed, determining the most common number for each digit of the addresses of the plurality of processing devices storing the data to be processed, generating a new address comprising the determined most common number for each digit, and transferring the data to be processed to the processing device with the generated new address.
    Type: Grant
    Filed: September 25, 2012
    Date of Patent: September 19, 2017
    Assignee: International Business Machines Corporation
    Inventors: Graham A. Bent, Patrick Dantressangle, Paul D. Stone
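    Illustration: a direct sketch of the addressing step described above; with hypercube node addresses written as equal-length bit strings, taking the most common digit in each position yields the address of the node where the scattered data should be brought together (it minimises the total hop count to the holders).
```python
from collections import Counter

def rendezvous_address(addresses):
    """addresses: iterable of equal-length bit strings, e.g. '0110'."""
    columns = zip(*addresses)
    return "".join(Counter(col).most_common(1)[0][0] for col in columns)

holders = ["0110", "0100", "1100", "0101"]   # nodes currently storing pieces of the data
print(rendezvous_address(holders))           # '0100' -> 3 total hops from the four holders
```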
  • Patent number: 9760487
    Abstract: In one embodiment, a computer-implemented method includes encountering a store operation during a compile-time of a program, where the store operation is applicable to a memory line. It is determined, by a computer processor, that no cache coherence action is necessary for the store operation. A store-without-coherence-action instruction is generated for the store operation, responsive to determining that no cache coherence action is necessary. The store-without-coherence-action instruction specifies that the store operation is to be performed without a cache coherence action, and cache coherence is maintained upon execution of the store-without-coherence-action instruction.
    Type: Grant
    Filed: June 19, 2015
    Date of Patent: September 12, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Constantinos Evangelinos, Ravi Nair, Martin Ohmacht
  • Patent number: 9720832
    Abstract: In one embodiment, a computer-implemented method includes encountering a store operation during a compile-time of a program, where the store operation is applicable to a memory line. It is determined, by a computer processor, that no cache coherence action is necessary for the store operation. A store-without-coherence-action instruction is generated for the store operation, responsive to determining that no cache coherence action is necessary. The store-without-coherence-action instruction specifies that the store operation is to be performed without a cache coherence action, and cache coherence is maintained upon execution of the store-without-coherence-action instruction.
    Type: Grant
    Filed: March 27, 2015
    Date of Patent: August 1, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Constantinos Evangelinos, Ravi Nair, Martin Ohmacht
  • Patent number: 9692649
    Abstract: Embodiments of the present invention disclose a method, computer program product, and system for determining a recommendation relating to a configuration of a plurality of server nodes of a computing system. In one embodiment, in accordance with the present invention, the computer-implemented method includes the steps of, for each server node, storing a first performance parameter value, wherein each first performance parameter value is a benchmarked value that corresponds to a measured actual performance parameter of its associated server node, and applying a first configuration rule based, at least in part, on the first performance parameter values of the plurality of server nodes to obtain a first configuration recommendation. In another embodiment, the method further includes the step of presenting the first configuration recommendation to a human user.
    Type: Grant
    Filed: February 26, 2014
    Date of Patent: June 27, 2017
    Assignee: International Business Machines Corporation
    Inventors: Srihari V. Angaluri, Gary D. Cudak, Christopher J. Hardee, Bryan M. Reese, Junjiro Sumikawa
  • Patent number: 9639432
    Abstract: A first computing device is provided for rolling back a computing environment. The computing device includes processors configured to acquire a stream containing entries including snapshot entries, memory entries, and input/output entries, wherein each entry includes information and is associated with a timestamp. The processors are further configured to receive a snapshot entry associated with a first timestamp, revert to a memory state using information provided in at least one memory entry associated with a timestamp after the first timestamp, and re-execute a previously executed process, wherein the re-execution of the process is started using the first timestamp, information from the received snapshot entry, and information for input/output operations corresponding to the input/output entries associated with timestamps after the first timestamp.
    Type: Grant
    Filed: December 1, 2014
    Date of Patent: May 2, 2017
    Assignee: Citrix Systems, Inc.
    Inventors: Chris Wade, Stanislaw Skowronek
  • Patent number: 9336179
    Abstract: The present invention provides a computer subsystem and a computer system. The computer subsystem includes L composite nodes, each composite node includes M basic nodes, and each basic node includes N central processing units (CPUs) and one node controller (NC), where any two CPUs in each basic node are interconnected, each CPU in each basic node is connected to the NC in the basic node, the NC in each basic node has a routing function, any two NCs in the M basic nodes are interconnected, and a connection between the L composite nodes formed through a connection between NCs enables communication between any two NCs with at most three hops. The computer subsystem and the computer system according to embodiments of the present invention can reduce the variety and number of interconnection chips and simplify the interconnection structure of a system, thereby improving the reliability of the system.
    Type: Grant
    Filed: November 7, 2012
    Date of Patent: May 10, 2016
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Jiangen Liu, Chenghong He, Haibin Wang, Xinyu Hou
  • Patent number: 9307448
    Abstract: Systems and methods for distributed computing between communication devices. A femto node is treated as a trusted extension of a user equipment and performs processing tasks on behalf of the user equipment. The femto node is also treated as a trusted extension of network servers and performs services on behalf of the network servers. Tasks are thus distributed between the network servers, the femto node and one or more user equipments. The tasks include processing data, filtering incoming messages, and caching network service information.
    Type: Grant
    Filed: April 14, 2014
    Date of Patent: April 5, 2016
    Assignee: QUALCOMM Incorporated
    Inventors: Dilip Krishnaswamy, Subbarao V. Yallapragada, Sanjiv Nanda
  • Patent number: 9294419
    Abstract: Architectures, apparatus and systems employing scalable multi-layer 2D-mesh routers. A 2D router mesh comprises bi-directional pairs of linked paths coupled between pairs of IO interfaces and configured in a plurality of rows and columns forming a 2D mesh. Router nodes are located at the intersections of the rows and columns, and are configured to forward data units between IO inputs and outputs coupled to the mesh at its edges through use of shortest-path routes defined by agents at the IO interfaces. Multiple instances of the 2D meshes may be employed to support bandwidth scaling of the router architecture. One implementation of a multi-layer 2D mesh is built using a standard tile that is tessellated to form a 2D array of standard tiles, with each 2D mesh layer offset and overlaid relative to the other 2D mesh layers. IO interfaces are then coupled to the multi-layer 2D mesh via muxes/demuxes and/or crossbar interconnects.
    Type: Grant
    Filed: June 26, 2013
    Date of Patent: March 22, 2016
    Assignee: INTEL CORPORATION
    Inventors: William C. Hasenplaugh, Tryggve Fossum, Judson S. Leonard
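    Illustration: a hedged sketch of shortest-path route construction on a 2D mesh; dimension-ordered (X-then-Y) routing always yields a minimal route between two mesh coordinates, which is one simple way an agent at an IO interface could precompute the routes the abstract refers to.
```python
def xy_route(src, dst):
    """Return the hop-by-hop coordinates of an X-then-Y route from src to dst."""
    (x, y), (dx, dy) = src, dst
    hops = []
    while x != dx:                       # travel along the row first
        x += 1 if dx > x else -1
        hops.append((x, y))
    while y != dy:                       # then along the column
        y += 1 if dy > y else -1
        hops.append((x, y))
    return hops

print(xy_route((0, 0), (3, 2)))
# [(1, 0), (2, 0), (3, 0), (3, 1), (3, 2)] -- 5 hops, equal to the Manhattan distance
```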
  • Patent number: 9280382
    Abstract: A device receives a command to initiate parallel processing. The command includes an operator associated with an operation that is to be performed in connection with the parallel processing, and a reference to a multidimensional array to which the operator is to be applied. The operator is represented by a symbol, and the multidimensional array includes at least three dimensions. The command also includes an indication of one or more dimensions by which the multidimensional array is to be partitioned. The device partitions the multidimensional array, along the one or more dimensions, to divide the multidimensional array into multiple blocks, each of the multiple blocks representing a subset of the multidimensional array. The device controls application of the operator to the multiple blocks to cause the operator to be applied in parallel to at least two blocks of the multiple blocks.
    Type: Grant
    Filed: October 22, 2013
    Date of Patent: March 8, 2016
    Assignee: The MathWorks, Inc.
    Inventor: Halldor N Stefansson
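    Illustration: a rough Python analogue of the command described above; a three-dimensional array is partitioned along one chosen dimension into blocks and the operator is applied to the blocks in parallel. The operator, dimension choice, and worker count are illustrative.
```python
import numpy as np
from concurrent.futures import ProcessPoolExecutor

def apply_partitioned(array, axis, operator, workers=4):
    blocks = np.array_split(array, workers, axis=axis)     # partition along the chosen dimension
    with ProcessPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(operator, blocks))          # operator applied to blocks in parallel
    return np.concatenate(results, axis=axis)

def square(block):                                          # stand-in for the referenced operator
    return block ** 2

if __name__ == "__main__":
    a = np.arange(2 * 4 * 6).reshape(2, 4, 6)               # a multidimensional array (3 dimensions)
    out = apply_partitioned(a, axis=2, operator=square)
    assert np.array_equal(out, a ** 2)
```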
  • Publication number: 20150039856
    Abstract: Efficient computation of complex multiplication results and very efficient fast Fourier transforms (FFTs) are provided. A parallel array VLIW digital signal processor is employed along with specialized complex multiplication instructions and communication operations between the processing elements which are overlapped with computation to provide very high performance operation. Successive iterations of a loop of tightly packed VLIWs are used allowing the complex multiplication pipeline hardware to be efficiently used. In addition, efficient techniques for supporting combined multiply accumulate operations are described.
    Type: Application
    Filed: August 11, 2014
    Publication date: February 5, 2015
    Applicant: Altera Corporation
    Inventors: Nikos P. Pitsianis, Gerald George Pechanek, Ricardo Rodriguez
  • Patent number: 8904398
    Abstract: Mapping tasks to physical processors in a parallel computing system may include partitioning tasks in the parallel computing system into groups of tasks, the tasks being grouped according to their communication characteristics (e.g., pattern and frequency); mapping, by a processor, the groups of tasks to groups of physical processors, respectively; and fine-tuning, by the processor, the mapping within each of the groups.
    Type: Grant
    Filed: March 6, 2012
    Date of Patent: December 2, 2014
    Assignee: International Business Machines Corporation
    Inventors: I-Hsin Chung, David J. Klepacki, Che-Rung Lee, Hui-Fang Wen
  • Patent number: 8898648
    Abstract: A profiling tool identifies a code region with a false sharing potential. A static analysis tool classifies variables and arrays in the identified code region. A mapping detection library correlates memory access instructions in the identified code region with variables and arrays in the identified code region while a processor is running the identified code region. The mapping detection library identifies one or more instructions at risk, in the identified code region, which are subject to an analysis by a false sharing detection library. A false sharing detection library performs a run-time analysis of the one or more instructions at risk while the processor is re-running the identified code region. The false sharing detection library determines, based on the performed run-time analysis, whether two different portions of the cache memory line are accessed by the generated binary code.
    Type: Grant
    Filed: November 30, 2012
    Date of Patent: November 25, 2014
    Assignee: International Business Machines Corporation
    Inventors: I-Hsin Chung, Guojing Cong, Hiroki Murata, Yasushi Negishi, Hui-Fang Wen
  • Patent number: 8856495
    Abstract: A mechanism is provided for automatically routing network interconnects in a data processing system. A processor in a node of a plurality of nodes receives network topology from neighboring nodes in the plurality of nodes within the data processing system. The processor constructs a system node map that identifies a physical connectivity between the node and the neighboring nodes. The processor programs a switch in the node with a connectivity map that indicates a set of point-to-point connections with the neighboring nodes. The set of point-to-point connections comprise locally-connected connections and pass-through connections.
    Type: Grant
    Filed: July 25, 2011
    Date of Patent: October 7, 2014
    Assignee: International Business Machines Corporation
    Inventors: Wael R. El-Essawy, David A. Papa, Jarrod A. Roy
  • Patent number: 8856493
    Abstract: A method of rotating data in a plurality of processing elements comprises a plurality of shifting operations and a plurality of storing operations, with the shifting and storing operations coordinated to enable a three-shears operation to be performed on the data. The plurality of storing operations is responsive to the processing elements' positions.
    Type: Grant
    Filed: February 14, 2012
    Date of Patent: October 7, 2014
    Assignee: Micron Technology, Inc.
    Inventor: Mark Beaumont
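    Illustration: a compact sketch of rotation by three shears, the operation named in the abstract: a row shear, a column shear, and a second row shear, each realised here as per-row or per-column shifts (np.roll), which is the kind of coordinated shift-and-store pattern a grid of processing elements can perform. The image size and angle are illustrative.
```python
import numpy as np

def shear_rows(img, factor):
    out = np.zeros_like(img)
    cy = img.shape[0] // 2
    for r in range(img.shape[0]):                     # shift each row horizontally
        out[r] = np.roll(img[r], int(round(factor * (r - cy))))
    return out

def shear_cols(img, factor):
    out = np.zeros_like(img)
    cx = img.shape[1] // 2
    for c in range(img.shape[1]):                     # shift each column vertically
        out[:, c] = np.roll(img[:, c], int(round(factor * (c - cx))))
    return out

def rotate_three_shears(img, theta):
    a = -np.tan(theta / 2.0)                          # shear factors of the classic decomposition
    b = np.sin(theta)
    return shear_rows(shear_cols(shear_rows(img, a), b), a)

img = np.zeros((9, 9)); img[4, 2:7] = 1               # a horizontal bar...
print(rotate_three_shears(img, np.pi / 2))            # ...becomes a vertical bar after the three shears
```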
  • Patent number: 8850163
    Abstract: A mechanism is provided for automatically routing network interconnects in a data processing system. A processor in a node of a plurality of nodes receives network topology from neighboring nodes in the plurality of nodes within the data processing system. The processor constructs a system node map that identifies a physical connectivity between the node and the neighboring nodes. The processor programs a switch in the node with a connectivity map that indicates a set of point-to-point connections with the neighboring nodes. The set of point-to-point connections comprise locally-connected connections and pass-through connections.
    Type: Grant
    Filed: August 10, 2012
    Date of Patent: September 30, 2014
    Assignee: International Business Machines Corporation
    Inventors: Wael R. El-Essawy, David A. Papa, Jarrod A. Roy
  • Patent number: 8768819
    Abstract: Systems and methods for administering trade orders are described. An embodiment comprises receiving, from a first server operated by a first trader, a communication including a first trade order and one or more selection criteria, the first trade order including at least one of a specified instrument, a specified quantity, and a specified price; determining that a database of trade orders does not contain a trade order matching the first trade order; identifying a plurality of traders satisfying the selection criteria; sending, to a plurality of second servers, a query including at least one of the specified instrument, the specified quantity, and the specified price; receiving, from a one of the plurality of second servers operated on behalf of a second trader, a positive response to the query; and facilitating execution of a trade between the first trader and the second trader for the specified instrument at the specified price.
    Type: Grant
    Filed: November 14, 2008
    Date of Patent: July 1, 2014
    Assignee: CFPH, LLC
    Inventors: Howard W. Lutnick, Dean P. Alderucci, Mark Miller, Andrew Fishkind, Kevin Foley, Bill Rice, Brian L. Gay, Philip Marber, Charles Plott
  • Patent number: 8750285
    Abstract: Embodiments describe a system and/or method for efficient classification of network packets. According to an aspect a method includes describing a packet as a feature vector and mapping the feature vector to a feature space. The method can further include defining a feature prism, classifying the packet relative to the feature prism, and determining if the feature vector matches the feature prism. If the feature vector matches the feature prism the packet is passed to a data recipient, if not, the packet is blocked. Another embodiment is an apparatus that includes an identification component that defines at least one feature of a packet and a classification component that classifies the packet based at least in part upon the at least one defined feature.
    Type: Grant
    Filed: September 26, 2011
    Date of Patent: June 10, 2014
    Assignee: QUALCOMM Incorporated
    Inventors: Michael Paddon, Gregory Gordon Rose, Philip Michael Hawkes
  • Patent number: 8750857
    Abstract: Systems and methods for distributed computing between communication devices. A femto node is treated as a trusted extension of a user equipment and performs processing tasks on behalf of the user equipment. The femto node is also treated as a trusted extension of network servers and performs services on behalf of the network servers. Tasks are thus distributed between the network servers, the femto node and one or more user equipments. The tasks include processing data, filtering incoming messages, and caching network service information.
    Type: Grant
    Filed: December 2, 2010
    Date of Patent: June 10, 2014
    Assignee: QUALCOMM Incorporated
    Inventors: Dilip Krishnaswamy, Subbarao V. Yallapragada, Sanjiv Nanda, Soumya Das, Samir Salib Soliman, Peerapol Tinnakornsrisuphap, Vidya Narayanan
  • Patent number: 8638805
    Abstract: Described embodiments provide for restructuring a scheduling hierarchy of a network processor having a plurality of processing modules and a shared memory. The scheduling hierarchy schedules packets for transmission. The network processor generates tasks corresponding to each received packet associated with a data flow. A traffic manager receives tasks provided by one of the processing modules and determines a queue of the scheduling hierarchy corresponding to the task. The queue has a parent scheduler at each of one or more next levels of the scheduling hierarchy up to a root scheduler, forming a branch of the hierarchy. The traffic manager determines if the queue and one or more of the parent schedulers of the branch should be restructured. If so, the traffic manager drops subsequently received tasks for the branch, drains all tasks of the branch, and removes the corresponding nodes of the branch from the scheduling hierarchy.
    Type: Grant
    Filed: September 30, 2011
    Date of Patent: January 28, 2014
    Assignee: LSI Corporation
    Inventors: Balakrishnan Sundararaman, Shashank Nemawarkar, David Sonnier, Shailendra Aulakh, Allen Vestal
  • Publication number: 20140006852
    Abstract: A three-dimensional (3-D) processor system includes a first processor chip and a second processor chip in a stacked configuration. The first processor chip includes a first processor having a first set of state registers. The second processor chip includes a second processor having a second set of state registers that corresponds to the first set of state registers. The first and second processors are connected through vertical connections between the first and second processor chips. A mode control circuit operates the processor system in one of a plurality of operating modes. In one mode of operation, the first processor is active and the second processor is inactive, and the first processor operates at a speed greater than a maximum safe speed of the first processor, and the first processor uses the second set of state registers of the second processor to checkpoint a state of the first processor.
    Type: Application
    Filed: June 28, 2012
    Publication date: January 2, 2014
    Applicant: International Business Machines Corporation
    Inventors: Alper Buyuktosunoglu, Philip G. Emma, Allan M. Hartstein, Michael B. Healy, Krishnan K. Kailas
  • Publication number: 20130283006
    Abstract: Three-dimensional (3-D) processor structures are provided which are constructed by connecting processors in a stacked configuration. For example, a processor system includes a first processor chip comprising a first processor, and a second processor chip comprising a second processor. The first and second processor chips are connected in a stacked configuration with the first and second processors connected through vertical connections between the first and second processor chips. The processor system further includes a mode control circuit to selectively configure the first and second processors of the first and second processor chips to operate in one of a plurality of operating modes, wherein the processors can be selectively configured to operate independently, to aggregate resources, to share resources, and/or be combined to form a single processor image.
    Type: Application
    Filed: September 4, 2012
    Publication date: October 24, 2013
    Applicant: International Business Machines Corporation
    Inventors: Alper Buyuktosunoglu, Philip G. Emma, Allan M. Hartstein, Michael B. Healy, Krishnan Kunjunny Kailas
  • Publication number: 20130283005
    Abstract: Three-dimensional (3-D) processor devices are provided, which are constructed by connecting processors in a stacked configuration. For instance, a semiconductor device includes a first processor chip comprising one or more processors, a second processor chip comprising one or more processors, and a plurality of input/output ports. The first and second processor chips are connected in a stacked configuration and commonly share the plurality of input/output ports. Methods are also provided to selectively operate the semiconductor device in one of a plurality of operating modes to control power of the semiconductor device.
    Type: Application
    Filed: April 20, 2012
    Publication date: October 24, 2013
    Applicant: International Business Machines Corporation
    Inventor: Philip G. Emma
  • Patent number: 8532288
    Abstract: A cryptographic engine for modulo N multiplication, which is structured as a plurality of almost identical, serially connected Processing Elements, is controlled so as to accept input in blocks that are smaller than the maximum capability of the engine in terms of bits multiplied at one time. The serially connected hardware is thus partitioned on the fly to process a variety of cryptographic key sizes while still maintaining all of the hardware in an active processing state.
    Type: Grant
    Filed: December 1, 2006
    Date of Patent: September 10, 2013
    Assignee: International Business Machines Corporation
    Inventors: Camil Fayad, John K. Li, Siegfried K. H. Sutter, Phil C. Yeh
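    Illustration: a simplified sketch of accepting a modular multiplication in blocks smaller than the engine's full width; the multiplier is split into w-bit limbs that are fed through one at a time, with reduction mod N as the partial products accumulate. (Engines like the one described typically use Montgomery multiplication in hardware; plain accumulate-and-reduce is used here only to keep the example short.)
```python
def modmul_blocked(a, b, n, w=64):
    """Compute (a * b) mod n by consuming b in w-bit blocks."""
    mask = (1 << w) - 1
    acc, shift = 0, 0
    while b:
        limb = b & mask                           # next w-bit block of the multiplier
        acc = (acc + ((a * limb) << shift)) % n   # accumulate the partial product, reduce mod n
        b >>= w
        shift += w
    return acc

a = 0x1234_5678_9ABC_DEF0_1111
b = 0xFFEE_DDCC_BBAA_0099_2222
n = (1 << 89) - 1
assert modmul_blocked(a, b, n) == (a * b) % n
```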