For A Parallel Or Multiprocessor System Patents (Class 717/149)

Loop compiling (Class 717/150)

Dependency-based streamlined processing

Patent number: 10853079

Abstract: A method and computer program product for performing a plurality of processing operations. A plurality of processor nodes each include one or more operational instances. Each processor node includes criteria for generating its operational instances. The processor nodes are linked together in a directed acyclic processing graph in which dependent nodes use data from the operational instances of upstream nodes to perform a node-specific set of processing operations. Dependency relationships between the processor nodes are defined on an operational instance basis, where operational instances in dependent processor nodes identify data associated with, or generated by, specific upstream operational instances that is used to perform the node-specific set of operations for that dependent operational instance. The processing graph may also include connector nodes defining instance-level dependency relationships between processor nodes.
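
The instance-level dependencies described in this abstract can be illustrated with a minimal sketch. It assumes each operational instance is a plain Python callable that names the specific upstream instances it reads from; the instance ids (such as "load[0]") and the `run_instances` helper are hypothetical and not taken from the patent.

```python
from graphlib import TopologicalSorter

def run_instances(instances):
    """Run operational instances in dependency order.

    `instances` maps an instance id to (function, upstream ids).
    The ids and this layout are hypothetical, for illustration only.
    """
    graph = {name: set(upstream) for name, (_, upstream) in instances.items()}
    results = {}
    for name in TopologicalSorter(graph).static_order():
        func, upstream = instances[name]
        # Each instance consumes only the outputs of the specific
        # upstream instances it names, not the whole upstream node.
        results[name] = func(*(results[u] for u in upstream))
    return results

instances = {
    "load[0]": (lambda: 2, []),
    "load[1]": (lambda: 3, []),
    "square[0]": (lambda x: x * x, ["load[0]"]),
    "square[1]": (lambda x: x * x, ["load[1]"]),
    "total[0]": (lambda a, b: a + b, ["square[0]", "square[1]"]),
}
results = run_instances(instances)
```

Here "square[0]" depends only on "load[0]", so it could run as soon as that one instance finishes, without waiting for "load[1]".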

Type: Grant

Filed: September 19, 2019

Date of Patent: December 1, 2020

Assignee: Side Effects Software Inc.

Inventors: Ken Xu, Taylor James Petrick

Dataflow life cycles

Patent number: 10853131

Abstract: Systems and methods for implementing dataflow life cycles are described and include forming, by a first server computing system, a dataflow life cycle by associating a dataflow with a customized code; associating, by the first server computing system, the customized code of the dataflow life cycle with context information, the customized code including one or more of pre-processing customized code and post-processing customized code; scheduling, by the first server computing system, the dataflow of the dataflow life cycle to be executed by a second server computing system when the customized code includes the pre-processing customized code and when the pre-processing customized code is successfully executed by the first server computing system; and executing, by the first server computing system, the post-processing customized code when the customized code includes the post-processing customized code and when the dataflow of the dataflow life cycle is successfully executed by the second server computing system.

Type: Grant

Filed: November 20, 2017

Date of Patent: December 1, 2020

Assignee: salesforce.com, inc.

Inventors: Ruisheng Shi, Farid Nabavi, Alex Gitelman

Method for implementing processing elements in a chip card

Patent number: 10831691

Abstract: The present disclosure relates to a method for implementing processing elements in a chip card such that the processing elements can communicate data between each other in order to perform a computation task, wherein the data communication requires each processing element to have a respective number of connections to other processing elements. The method comprises: providing a complete graph with an even number of nodes that is higher than the maximum of the numbers of connections by one or two. If the number of processing elements is higher than the number of nodes of the graph, the graph may be duplicated and the duplicated graphs may be combined into a combined graph. A methodology for placing and connecting the processing elements may be determined in accordance with the structure of nodes of a resulting graph, the resulting graph being the complete graph or the combined graph.
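
The node-count rule in this abstract can be worked through in a few lines. This sketch assumes the required connections are given as a list of per-element counts; the function names are hypothetical, and the placement and graph-duplication steps are not modelled.

```python
from itertools import combinations

def complete_graph_size(connection_counts):
    """Smallest even node count exceeding the maximum connection count.

    The result is max + 1 when that is even, otherwise max + 2,
    i.e. higher than the maximum "by one or two".
    """
    nodes = max(connection_counts) + 1
    return nodes if nodes % 2 == 0 else nodes + 1

def complete_graph_edges(n):
    """All undirected edges of the complete graph on nodes 0..n-1."""
    return list(combinations(range(n), 2))

size = complete_graph_size([3, 5, 2])  # maximum is 5, so 6 nodes
edges = complete_graph_edges(size)     # 6 * 5 / 2 = 15 edges
```

With a maximum of 4 connections the same rule also gives 6 nodes, because 5 is odd and the count is rounded up to the next even number.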

Type: Grant

Filed: May 24, 2019

Date of Patent: November 10, 2020

Assignee: International Business Machines Corporation

Inventors: Martino Dazzi, Pier Andrea Francese, Abu Sebastian, Riduan Khaddam-Aljameh, Evangelos Stavros Eleftheriou

Service-level resiliency in virtualization environments

Patent number: 10785089

Abstract: A set of service-level reliability metrics and a method to allocate these metrics to the layers of the service delivery platform. These initial targets can be tuned during service design and delivery, and feed the vendor requirements process, forming the basis for measuring, tracking, and responding based on the service-level reliability metrics.

Type: Grant

Filed: May 7, 2018

Date of Patent: September 22, 2020

Assignee: AT&T Intellectual Property I, L.P.

Inventor: Paul Reeser

Multi-processor computer architecture incorporating distributed multi-ported common memory modules

Patent number: 10741226

Abstract: A multi-processor computer architecture incorporating distributed multi-ported common memory modules wherein each of the memory modules comprises a control block functioning as a cross-bar router in conjunction with one or more associated memory banks or other data storage devices. Each memory module has multiple I/O ports and the ability to relay requests to other memory modules if the desired memory location is not found on the first module. A computer system in accordance with the invention may comprise memory module cards along with processor cards interconnected using a baseboard or backplane having a toroidal interconnect architecture between the cards.

Type: Grant

Filed: May 28, 2013

Date of Patent: August 11, 2020

Assignee: FG SRC LLC

Inventors: Jon M. Huppenthal, Timothy J. Tewalt, Lee A. Burton, David E. Caliga

Compilation method

Patent number: 10691432

Abstract: A method for generating a program to run on multiple tiles. The method comprises: receiving an input graph comprising data nodes, compute vertices and edges; receiving an initial tile-mapping specifying which data nodes and vertices are allocated to which tile; and determining a subgraph of the input graph that meets one or more heuristic rules. The rules comprise: the subgraph comprises at least one data node, the subgraph spans no more than a threshold number of tiles in the initial tile-mapping, and the subgraph comprises at least a minimum number of edges outputting to one or more vertices on one or more other tiles. The method further comprises adapting the initial mapping to migrate the data nodes and any vertices of the determined subgraph to said one or more other tiles, and compiling the executable program from the graph with the vertices and data nodes allocated by the adapted mapping.

Type: Grant

Filed: February 15, 2019

Date of Patent: June 23, 2020

Assignee: Graphcore Limited

Inventors: Mark Lloyd Pupilli, David Lacey

System and method for encapsulating computer communications

Patent number: 10678616

Abstract: Implementations of the present disclosure are directed to a method, a system, and an article for binding computer languages. An example computer-implemented method includes: operating an application on at least one computer in a first computer language; operating a platform for the application on the at least one computer in a second computer language; binding the first computer language with the second computer language; and communicating between the application and the platform using the binding of the first computer language and the second computer language.

Type: Grant

Filed: November 9, 2018

Date of Patent: June 9, 2020

Assignee: MZ IP Holdings, LLC

Inventors: John O'Connor, Nathan Spencer, Garth Gillespie, Yan Zhang

Compiler optimizations for vector operations that are reformatting-resistant

Patent number: 10642586

Abstract: An optimizing compiler includes a vector optimization mechanism that optimizes vector operations that are reformatting-resistant, such as source instructions that do not have a corresponding reformatting operation, sink instructions that do not have a corresponding reformatting operation, a source instruction that is a scalar value, a sink instruction that may produce a scalar value, and an internal operation that depends on lanes being in a specified order. The ability to optimize vector instructions that are reformatting-resistant reduces the number of operations to improve the run-time performance of the code.

Type: Grant

Filed: December 8, 2018

Date of Patent: May 5, 2020

Assignee: International Business Machines Corporation

Inventors: Michael Karl Gschwind, William J. Schmidt

Method for intra-subgraph optimization in tuple graph programs

Patent number: 10599482

Abstract: A programming model generates a graph for a program, the graph including a plurality of nodes and edges, wherein each node of the graph represents an operation and edges between the nodes represent streams of data input to and output from the operations represented by the nodes. The model determines where in a distributed architecture to execute the operations represented by the nodes. Such determining may include determining which nodes have location restrictions, assigning locations to each node having a location restriction based on the restriction, and partitioning the graph into a plurality of subgraphs, the partitioning including assigning locations to nodes without location restrictions in accordance with a first set of constraints, wherein each node within a particular subgraph is assigned to the same location. Each of the subgraphs is executed at its assigned location in a respective single thread.

Type: Grant

Filed: August 24, 2017

Date of Patent: March 24, 2020

Assignee: Google LLC

Inventors: Gautham Thambidorai, Matthew Rosencrantz, Sanjay Ghemawat, Srdjan Petrovic, Ivan Posva

Multiprocessor programming toolkit for design reuse

Patent number: 10592233

Abstract: Techniques for specifying and implementing a software application targeted for execution on a multiprocessor array (MPA). The MPA may include a plurality of processing elements, supporting memory, and a high bandwidth interconnection network (IN), communicatively coupling the plurality of processing elements and supporting memory. In some embodiments, software code may include first program instructions executable to perform a function. In some embodiments, the software code may also include one or more language constructs that are configurable to specify one or more parameter inputs. In some embodiments, the one or more parameter inputs are configurable to specify a set of hardware resources usable to execute the software code. In some embodiments, the hardware resources include multiple processors and may include multiple supporting memories.

Type: Grant

Filed: January 16, 2018

Date of Patent: March 17, 2020

Assignee: COHERENT LOGIX, INCORPORATED

Inventors: Stephen E. Lim, Viet N. Ngo, Jeffrey M. Nicholson, John Mark Beardslee, Teng-I Wang, Zhong Qing Shang, Michael Lyle Purnell

Systems and methods for load-balancing by secondary processors in parallelized indexing

Patent number: 10572515

Abstract: The invention relates to electronic indexing, and more particularly, to the parallelization of indexing. Systems and methods of the invention index data archives by breaking a job into work items and sending the work items to multiple processors that can each determine whether to index data associated with the work item or to create a new work item and have a different processor index the data. This gives the system an internal load-balancing that results in indexing jobs during which no processor stands idle while another processor indexes data of unexpected complexity.
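
The index-or-delegate decision in this abstract can be sketched as a small single-threaded simulation. It assumes a work item is simply a list of documents, that a size threshold stands in for "unexpected complexity", and that workers are served round-robin; all names and the threshold are hypothetical, and a real system would run the workers concurrently.

```python
from collections import deque

def index_archive(job, max_items_per_worker=2, workers=3):
    """Simulate workers that either index a work item or split it.

    A worker that receives an item larger than the threshold indexes
    a manageable part and queues the remainder as a new work item
    for another worker.
    """
    queue = deque([job])
    indexed = {w: [] for w in range(workers)}
    turn = 0
    while queue:
        item = queue.popleft()
        worker = turn % workers  # round-robin assignment (assumption)
        turn += 1
        if len(item) > max_items_per_worker:
            # Too large: hand the rest on as a new work item.
            queue.append(item[max_items_per_worker:])
            item = item[:max_items_per_worker]
        indexed[worker].extend(item)
    return indexed

result = index_archive(list("abcdefg"))
```

In this run the seven documents end up spread across all three simulated workers, so none of them sits idle while another processes the whole job.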

Type: Grant

Filed: October 9, 2017

Date of Patent: February 25, 2020

Assignee: NUIX PTY LTD

Inventors: David Sitsky, Eddie Sheehy

In-memory shared data reuse replacement and caching

Patent number: 10372677

Abstract: A cache management system for managing a plurality of intermediate data includes a processor and a memory having stored thereon instructions that cause the processor to perform identifying a new intermediate data to be accessed, loading the intermediate data from the memory in response to identifying the new intermediate data as one of the plurality of intermediate data, in response to not identifying the new intermediate data as one of the plurality of intermediate data, selecting a set of victim intermediate data to evict from the memory based on a plurality of scores associated with respective ones of the plurality of intermediate data, the scores being based on a score table, evicting the set of victim intermediate data from the memory, updating the score table based on the set of victim intermediate data, and adding the new intermediate data to the plurality of intermediate data stored in the memory.
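
A toy version of the score-based eviction described in this abstract is sketched below. The abstract does not say how scores are computed, so this sketch assumes a simple hit count; the `ScoredCache` class and its method names are hypothetical.

```python
class ScoredCache:
    """Toy cache that evicts the lowest-scoring entries when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = {}    # key -> cached intermediate data
        self.scores = {}  # score table: key -> score (assumed: hit count)

    def access(self, key, compute):
        if key in self.data:
            self.scores[key] += 1  # hit: load from memory, raise the score
            return self.data[key]
        # Miss: evict the lowest-scoring victims until there is room.
        while len(self.data) >= self.capacity:
            victim = min(self.scores, key=self.scores.get)
            del self.data[victim]
            del self.scores[victim]  # keep the score table in sync
        self.data[key] = compute()
        self.scores[key] = 1
        return self.data[key]

cache = ScoredCache(capacity=2)
cache.access("a", lambda: 1)
cache.access("a", lambda: 1)  # second access raises the score of "a"
cache.access("b", lambda: 2)
cache.access("c", lambda: 3)  # cache is full: "b" has the lowest score
```

After the last call, "b" has been evicted while the more frequently used "a" survives alongside the new entry "c".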

Type: Grant

Filed: January 11, 2017

Date of Patent: August 6, 2019

Assignee: Samsung Electronics Co., Ltd.

Inventors: Zhengyu Yang, Jiayin Wang, Thomas David Evans

System and method for mobile augmented reality task scheduling

Patent number: 10360073

Abstract: A system for scheduling mobile augmented reality tasks performed on at least one mobile device and a workspace includes: a mobile device, comprising a central processing unit (CPU) and a graphics processing unit (GPU); a Network Profiling Component, configured to gather network related context data; a Device Profiling Component, configured to gather hardware related context data; an Application Profiling Component, configured to gather application related context data; and a Scheduling Component, configured to receive the network related context data, the hardware related context data, and the application related context data, and to schedule tasks between the CPU, the GPU and the workspace.

Type: Grant

Filed: December 16, 2014

Date of Patent: July 23, 2019

Assignee: DEUTSCHE TELEKOM AG

Inventors: Pan Hui, Junqiu Wei, Christoph Peylo

Generation of directed acyclic graphs from task routines

Patent number: 10331495

Abstract: An apparatus including a processor to: parse comments of multiple task routines to identify I/O parameters; generate a visualization of a DAG to include a visual representation of each task routine, wherein each representation includes a task graph object of the task routine, at least one input data graph object that represents an input to the task routine and that includes a visual indication of at least one characteristic of the input; and at least one output data graph object that represents an output of the task routine and that includes a visual indication of at least one characteristic of the output; in the I/O parameters, identify each dependency between an output of one task routine and an input of another; for each identified dependency, augment the visualization with a dependency marker that visually links the visual representations of each associated pair of task routines; and visually output the visualization.

Type: Grant

Filed: February 15, 2018

Date of Patent: June 25, 2019

Assignee: SAS INSTITUTE INC.

Inventors: Henry Gabriel Victor Bequet, Chaowang “Ricky” Zhang

Optimization of loops and data flow sections in multi-core processor environment

Patent number: 10331615

Abstract: The present invention relates to a method for compiling code for a multi-core processor, comprising: detecting and optimizing a loop, partitioning the loop into partitions executable and mappable on physical hardware with optimal instruction level parallelism, optimizing the loop iterations and/or loop counter for ideal mapping on hardware, chaining the loop partitions generating a list representing the execution sequence of the partitions.

Type: Grant

Filed: May 22, 2017

Date of Patent: June 25, 2019

Assignee: Hyperion Core, Inc.

Inventor: Martin Vorbach

Duplicate in-memory shared-intermediate data detection and reuse module in Spark framework

Patent number: 10311025

Abstract: A cache management system for managing a plurality of intermediate data includes a processor, and a memory having stored thereon the plurality of intermediate data and instructions that when executed by the processor, cause the processor to perform identifying a new intermediate data to be accessed, loading the intermediate data from the memory in response to identifying the new intermediate data as one of the plurality of intermediate data, and in response to not identifying the new intermediate data as one of the plurality of intermediate data identifying a reusable intermediate data having a longest duplicate generating logic chain that is at least in part the same as a generating logic chain of the new intermediate data, and generating the new intermediate data from the reusable intermediate data and a portion of the generating logic chain of the new intermediate data not in common with the reusable intermediate data.
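
The reuse step in this abstract can be sketched by modelling a generating logic chain as a tuple of operation names. This sketch assumes that the part "at least in part the same" is a shared prefix of the chain; the helper names and the example operations are hypothetical.

```python
def longest_reusable(cache, chain):
    """Return the longest cached chain that is a prefix of `chain`."""
    best = ()
    for cached in cache:
        if len(cached) > len(best) and chain[:len(cached)] == cached:
            best = cached
    return best

def get_intermediate(cache, chain, source, ops):
    """Fetch or build the data produced by applying `chain` to `source`."""
    if chain in cache:
        return cache[chain]
    prefix = longest_reusable(cache, chain)
    data = cache[prefix] if prefix else source
    for name in chain[len(prefix):]:  # run only the steps not in common
        data = ops[name](data)
    cache[chain] = data
    return data

ops = {
    "double": lambda xs: [2 * x for x in xs],
    "inc": lambda xs: [x + 1 for x in xs],
    "total": sum,
}
cache = {}
first = get_intermediate(cache, ("double", "inc"), [1, 2, 3], ops)
second = get_intermediate(cache, ("double", "inc", "total"), [1, 2, 3], ops)
```

The second request finds the cached result of ("double", "inc") and only applies the remaining "total" step to it.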

Type: Grant

Filed: January 11, 2017

Date of Patent: June 4, 2019

Assignee: Samsung Electronics Co., Ltd.

Inventors: Zhengyu Yang, Jiayin Wang, Thomas David Evans

System and method for distributed graphics processing unit (GPU) computation

Patent number: 10303522

Abstract: A system and method for distributed graphics processing unit (GPU) computation are disclosed. A particular embodiment includes: receiving a user task service request from a user node; querying resource availability from a plurality of slave nodes having a plurality of graphics processing units (GPUs) thereon; assigning the user task service request to a plurality of available GPUs based on the resource availability and resource requirements of the user task service request, the assigning including starting a service on a GPU using a distributed processing container and creating a corresponding uniform resource locator (URL); and retaining a list of URLs corresponding to the resources assigned to the user task service request.

Type: Grant

Filed: July 1, 2017

Date of Patent: May 28, 2019

Assignee: TUSIMPLE

Inventors: Kai Zhou, Siyuan Liu

System and method for virtual link trunking

Patent number: 10284457

Abstract: A method, an information handling system (IHS), and a virtual link trunking (VLT) system for determining VLT ports to block and unblock in an IHS. The method includes calculating a forwarding table index for a local switch of currently active and inactive switch peers for a VLT port. A pre-determined forwarding table is retrieved from a memory containing a plurality of port blocking and unblocking actions for the switch peers. Current port blocking and unblocking actions are identified in the pre-determined forwarding table corresponding to the forwarding table index. Changes are determined between the previous port blocking and unblocking actions and the current port blocking and unblocking actions. The input/output (I/O) ports are configured for the local switch based on the determined changes in the port blocking and unblocking actions.

Type: Grant

Filed: July 12, 2016

Date of Patent: May 7, 2019

Assignee: Dell Products, L.P.

Inventors: Senthil Nathan Muthukaruppan, Shankara Ramamurthy, Pugalendran Rajendran

Generation of directed acyclic graphs from task routines

Patent number: 10255115

Abstract: An apparatus including a processor to: parse comments of multiple task routines to identify I/O parameters; generate a visualization of a DAG to include a visual representation of each task routine, wherein each representation includes a task graph object of the task routine, at least one input data graph object that represents an input to the task routine and that includes a visual indication of at least one characteristic of the input; and at least one output data graph object that represents an output of the task routine and that includes a visual indication of at least one characteristic of the output; in the I/O parameters, identify each dependency between an output of one task routine and an input of another; for each identified dependency, augment the visualization with a dependency marker that visually links the visual representations of each associated pair of task routines; and visually output the visualization.

Type: Grant

Filed: February 15, 2018

Date of Patent: April 9, 2019

Assignee: SAS INSTITUTE INC.

Inventors: Henry Gabriel Victor Bequet, Chaowang “Ricky” Zhang

Parallelization compiling method, parallelization compiler, and vehicular device

Patent number: 10228923

Abstract: A parallelization compiling method for generating a segmented program from a sequential program, in which multiple macro tasks are included and at least two of the macro tasks have a data dependency relationship with one another, includes determining an existence of invalidation information for invalidating at least a part of the data dependency relationship between the at least two of the plurality of macro tasks before compiling the sequential program into the segmented program, and generating the segmented program by compiling the sequential program into the segmented program with reference to a determination result of the existence of the invalidation information. When the invalidation information is determined to exist, the at least a part of the data dependency relationship is invalidated before the compiling of the sequential program into the segmented program.

Type: Grant

Filed: March 29, 2016

Date of Patent: March 12, 2019

Assignees: DENSO CORPORATION, WASEDA UNIVERSITY

Inventors: Yoshihiro Yatou, Noriyuki Suzuki, Kenichi Mineda, Hironori Kasahara, Keiji Kimura, Hiroki Mikami, Dan Umeda

Generation of directed acyclic graphs from task routines

Patent number: 10210025

Abstract: An apparatus including a processor to: parse comments of multiple task routines to identify I/O parameters; generate a visualization of a DAG to include a visual representation of each task routine, wherein each representation includes a task graph object of the task routine, at least one input data graph object that represents an input to the task routine and that includes a visual indication of at least one characteristic of the input; and at least one output data graph object that represents an output of the task routine and that includes a visual indication of at least one characteristic of the output; in the I/O parameters, identify each dependency between an output of one task routine and an input of another; for each identified dependency, augment the visualization with a dependency marker that visually links the visual representations of each associated pair of task routines; and visually output the visualization.

Type: Grant

Filed: February 15, 2018

Date of Patent: February 19, 2019

Assignee: SAS INSTITUTE INC.

Inventors: Henry Gabriel Victor Bequet, Chaowang “Ricky” Zhang

Compiler optimizations for vector operations that are reformatting-resistant

Patent number: 10169012

Abstract: An optimizing compiler includes a vector optimization mechanism that optimizes vector operations that are reformatting-resistant, such as source instructions that do not have a corresponding reformatting operation, sink instructions that do not have a corresponding reformatting operation, a source instruction that is a scalar value, a sink instruction that may produce a scalar value, and an internal operation that depends on lanes being in a specified order. The ability to optimize vector instructions that are reformatting-resistant reduces the number of operations to improve the run-time performance of the code.

Type: Grant

Filed: November 1, 2017

Date of Patent: January 1, 2019

Assignee: International Business Machines Corporation

Inventors: Michael Karl Gschwind, William J. Schmidt

Method for automatic tissue segmentation of medical images

Patent number: 10157466

Abstract: A method to segment images that contain multiple objects in a nested structure including acquiring an image; defining the multiple objects by layers, each layer corresponding to one region, where a region contains an innermost object and all the objects nested within the innermost object; stacking the layers in an order of the nested structure of the multiple objects, the stack of layers having at least a top layer and a bottom layer; extending each layer with padded nodes; connecting the top layer to a sink and the bottom layer to a source, wherein each intermediate layer between the top layer and the bottom layer is connected only to the adjacent layer by undirected links; and measuring a boundary length for each layer.

Type: Grant

Filed: January 23, 2017

Date of Patent: December 18, 2018

Assignees: Riverside Research Institute, New York University

Inventors: Jen-wei Kuo, Jonathan Mamou, Xuan Zhao, Jeffrey A. Ketterling, Orlando Aristizabal, Daniel H. Turnbull, Yao Wang

Code refactoring mechanism for asynchronous code optimization using topological sorting

Patent number: 10157055

Abstract: Methods, systems, apparatuses, and computer program products are provided for transforming asynchronous code into more efficient, logically equivalent asynchronous code. Program code is converted into a first syntax tree. A dependency graph is generated from the first syntax tree with each node of the dependency graph corresponding to a code statement and having an assigned weight. Weighted topological sorting of the dependency graph is performed to generate a sorted dependency graph. A second syntax tree is generated from the sorted dependency graph. In another implementation, the program code is transformed into await-relaxed and/or loop-relaxed program code prior to being transformed into the first syntax tree.
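
One common way to realise a weighted topological sort is Kahn's algorithm with a priority queue, sketched below. The abstract does not say how weights are assigned or which direction is preferred, so this sketch assumes that heavier statements should be emitted as early as their dependencies allow; the statement names and weights are hypothetical.

```python
import heapq

def weighted_topological_sort(deps, weights):
    """Kahn's algorithm with a priority queue.

    `deps` maps each statement to the statements it depends on.
    Among statements whose dependencies are satisfied, the one with
    the highest weight is emitted first (ties broken by name).
    """
    indegree = {node: len(pre) for node, pre in deps.items()}
    dependents = {node: [] for node in deps}
    for node, pre in deps.items():
        for p in pre:
            dependents[p].append(node)
    ready = [(-weights[n], n) for n, d in indegree.items() if d == 0]
    heapq.heapify(ready)
    order = []
    while ready:
        _, node = heapq.heappop(ready)
        order.append(node)
        for nxt in dependents[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                heapq.heappush(ready, (-weights[nxt], nxt))
    if len(order) != len(deps):
        raise ValueError("dependency graph contains a cycle")
    return order

deps = {"fetch_a": [], "fetch_b": [], "log": [], "combine": ["fetch_a", "fetch_b"]}
weights = {"fetch_a": 5, "fetch_b": 5, "log": 1, "combine": 2}
order = weighted_topological_sort(deps, weights)
```

In this example the two heavy fetches are scheduled first, then "combine", and the lightweight "log" statement last, while every statement still follows the statements it depends on.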

Type: Grant

Filed: September 29, 2016

Date of Patent: December 18, 2018

Assignee: Microsoft Technology Licensing, LLC

Inventors: Gal Tamir, Elad Iwanir, Eli Koreh

Federated device support for generation of directed acyclic graphs

Patent number: 10157086

Abstract: An apparatus including a processor to: parse comments of multiple task routines to identify I/O parameters; for each task routine, generate a macro including its I/O parameters; transmit the macros to a requesting device to enable it to generate a visualization of a DAG to include visual representations of the task routines; wherein each representation includes a task graph object, an input data graph object representing an input and indicating a characteristic of the input, and an output data graph object representing an output and indicating a characteristic of the output; and wherein the requesting device is to: identify, in the I/O parameters, each dependency between an output and an input of a pair of task routines; augment the visualization, for each identified dependency, with a dependency marker that visually links the visual representations of the pair of task routines; and visually output the visualization.

Type: Grant

Filed: February 16, 2018

Date of Patent: December 18, 2018

Assignee: SAS INSTITUTE INC.

Inventors: Henry Gabriel Victor Bequet, Chaowang “Ricky” Zhang

Triggering answer boxes

Patent number: 10146849

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing search results. In one aspect, a method includes receiving a query. A plurality of search results responsive to the query are identified. The search results are analyzed to determine that at least a first search result is associated with a first answer box topic. The search results are provided along with an answer box precursor for the first answer box topic.

Type: Grant

Filed: October 25, 2017

Date of Patent: December 4, 2018

Assignee: Google LLC

Inventors: Tal Cohen, Ziv Bar-Yossef, Igor Tsvetkov, Adi Mano, Oren Naim, Nitsan Oz, Nir Andelman, Pravir Kumar Gupta

System and method for GPU maximum register count optimization applied to general matrix-matrix multiplication

Patent number: 10067910

Abstract: A method and system performing a general matrix-matrix multiplication (GEMM) operation using a kernel compiled with optimal maximum register count (MRC). During operation, the system may generate the kernel compiled with optimal MRC. This may involve determining a fastest compiled kernel among a set of compiled kernels by comparing the speeds of the compiled kernels. Each kernel may be compiled with a different MRC value between zero and a predetermined maximum number of registers per thread. The fastest compiled kernel is determined to be the kernel with optimal MRC. The system may receive data representing at least two matrices. The system may select the kernel compiled with optimal MRC, and perform the GEMM operation on the two matrices using the selected kernel. Some embodiments may also perform general matrix-vector multiplication (GEMV), sparse matrix-vector multiplication (SpMV), or k-means clustering operations using kernels compiled with optimal MRC.
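
The selection step in this abstract, picking the fastest kernel among variants compiled with different MRC values, reduces to an argmin over measured run times. The sketch below assumes a caller-supplied `measure` function that compiles and times a kernel for a given MRC value (for instance by passing the value to nvcc's `--maxrregcount` option, which is an assumption and not stated in the abstract); the function name and the timings shown are made up for illustration.

```python
def select_optimal_mrc(mrc_values, measure):
    """Return the MRC value with the lowest measured run time.

    `measure(mrc)` is a hypothetical callback that builds a kernel
    with that register limit and returns its run time in seconds.
    """
    timings = {mrc: measure(mrc) for mrc in mrc_values}
    best = min(timings, key=timings.get)
    return best, timings

# Made-up timings standing in for real benchmark runs.
fake_timings = {16: 4.1, 32: 2.7, 64: 3.0}
best_mrc, timings = select_optimal_mrc(fake_timings, fake_timings.get)
```

Injecting the measurement as a callback keeps the selection logic testable without a GPU; a real tuner would replace `fake_timings.get` with a function that compiles and benchmarks each kernel.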

Type: Grant

Filed: July 1, 2016

Date of Patent: September 4, 2018

Assignee: PALO ALTO RESEARCH CENTER INCORPORATED

Inventor: Rong Zhou

Multithreading and concurrency control for a rule-based transaction engine

Patent number: 10002161

Abstract: The subject matter disclosed herein provides methods and apparatus, including computer program products for rules-based processing. In one aspect there is provided a method. The method may include, for example, evaluating rules to determine whether to enable or disable one or more actions in a ready set of actions. Moreover, the method may include scheduling the ready set of actions, each of which is scheduled for execution and executed, the execution of each of the ready set of actions using a separate, concurrent thread, the concurrency of the actions controlled using a control mechanism. Related systems, apparatus, methods, and/or articles are also described.

Type: Grant

Filed: December 3, 2008

Date of Patent: June 19, 2018

Assignee: SAP SE

Inventors: Sören Balko, Matthias Miltz

Methods and systems for managing an instruction sequence with a divergent control flow in a SIMT architecture

Patent number: 9898288

Abstract: A computer-implemented method of executing an instruction sequence with a recursive function call of a plurality of threads within a thread group in a Single-Instruction-Multiple-Threads (SIMT) system is provided. Each thread is provided with a function call counter (FCC), an active mask, an execution mask and a per-thread program counter (PTPC). The instruction sequence with the recursive function call is executed by the threads in the thread group according to a program counter (PC) indicating a target. Upon executing the recursive function call, for each thread, the active mask is set according to the PTPC and the target indicated by the PC, the FCC is determined when entering or returning from the recursive function call, the execution mask is determined according to the FCC and the active mask. It is determined whether an execution result of the recursive function call takes effect according to the execution mask.

Type: Grant

Filed: December 29, 2015

Date of Patent: February 20, 2018

Assignee: MEDIATEK INC.

Inventors: Yan-Hong Lu, Jia-Yang Chang, Pao-Hung Kuo, Chia-Chi Chang, Pei-Kuei Tsung

Method for implementing a reduced size register view data structure in a microprocessor

Patent number: 9891924

Abstract: A method for implementing a reduced size register view data structure in a microprocessor. The method includes receiving an incoming instruction sequence using a global front end; grouping the instructions to form instruction blocks; using a plurality of multiplexers to access ports of a scheduling array to store the instruction blocks as a series of chunks.

Type: Grant

Filed: March 14, 2014

Date of Patent: February 13, 2018

Assignee: Intel Corporation

Inventor: Mohammad A. Abdallah

Compiler optimizations for vector operations that are reformatting-resistant

Patent number: 9886252

Abstract: An optimizing compiler includes a vector optimization mechanism that optimizes vector operations that are reformatting-resistant, such as source instructions that do not have a corresponding reformatting operation, sink instructions that do not have a corresponding reformatting operation, a source instruction that is a scalar value, a sink instruction that may produce a scalar value, and an internal operation that depends on lanes being in a specified order. The ability to optimize vector instructions that are reformatting-resistant reduces the number of operations to improve the run-time performance of the code.

Type: Grant

Filed: August 31, 2015

Date of Patent: February 6, 2018

Assignee: International Business Machines Corporation

Inventors: Michael Karl Gschwind, William J. Schmidt
Compiler optimizations for vector operations that are reformatting-resistant

Patent number: 9880821

Abstract: An optimizing compiler includes a vector optimization mechanism that optimizes vector operations that are reformatting-resistant, such as source instructions that do not have a corresponding reformatting operation, sink instructions that do not have a corresponding reformatting operation, a source instruction that is a scalar value, a sink instruction that may produce a scalar value, and an internal operation that depends on lanes being in a specified order. The ability to optimize vector instructions that are reformatting-resistant reduces the number of operations to improve the run-time performance of the code.

Type: Grant

Filed: August 17, 2015

Date of Patent: January 30, 2018

Assignee: International Business Machines Corporation

Inventors: Michael Karl Gschwind, William J. Schmidt
Range selection for data parallel programming environments

Patent number: 9823927

Abstract: According to some embodiments, the workgroup divisibility requirement may be dispensed with on a selective or permanent basis, i.e. in all cases, particular cases or at particular times and/or under particular conditions. An application programming interface implementation may be allowed to launch workgroups with non-uniform local sizes. Two different local sizes may be used in a case of a one-dimensional workload.
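
As a minimal sketch of what dropping the workgroup divisibility requirement could look like for a one-dimensional workload, the following splits a global range into workgroups of a requested local size plus, when the range is not evenly divisible, one smaller trailing workgroup. The function name `split_range` and the choice to put the remainder in the last workgroup are assumptions for illustration; the abstract only states that two different local sizes may be used.

```python
def split_range(global_size, local_size):
    """Return the list of workgroup sizes covering a 1-D range.

    Hypothetical helper: the remainder is placed in one trailing,
    smaller workgroup, so at most two distinct local sizes appear.
    """
    if global_size <= 0 or local_size <= 0:
        raise ValueError("sizes must be positive")
    full_groups, remainder = divmod(global_size, local_size)
    sizes = [local_size] * full_groups
    if remainder:
        # Non-uniform case: the last workgroup is smaller than the rest.
        sizes.append(remainder)
    return sizes


if __name__ == "__main__":
    print(split_range(10, 4))
    print(split_range(8, 4))
```

With a strict divisibility requirement, the first call (10 work items, local size 4) would be rejected or would need padding; here it simply yields a final workgroup of a different size.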

Type: Grant

Filed: November 30, 2012

Date of Patent: November 21, 2017

Assignee: Intel Corporation

Inventors: Aaron R. Kunze, Dillon Sharlet, Andrew E. Brownsword
Triggering answer boxes

Patent number: 9805110

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing search results. In one aspect, a method includes receiving a query. A plurality of search results responsive to the query are identified. The search results are analyzed to determine that at least a first search result is associated with a first answer box topic. The search results are provided along with an answer box precursor for the first answer box topic.

Type: Grant

Filed: May 2, 2016

Date of Patent: October 31, 2017

Assignee: Google Inc.

Inventors: Tal Cohen, Ziv Bar-Yossef, Igor Tsvetkov, Adi Mano, Oren Naim, Nitsan Oz, Nir Andelman, Pravir Kumar Gupta
Workload partitioning procedure for null message-based PDES

Patent number: 9798587

Abstract: An embodiment of the invention includes applying a first partition to a plurality of LPs, wherein a particular LP is assigned to a first set of LPs. A second partition is applied to the LPs, wherein the particular LP is assigned to an LP set different from the first set. For both the first and second partitions, lookahead values and transit times are determined for each of the LPs and related links. For the first partition, a first system progression rate is computed using a specified function with the lookahead values and transit times determined for the first partition. For the second partition, a second system progression rate is computed using the specified function with the lookahead values and transit times determined for the second partition. The first and second system progression rates are compared to determine which is the lowest.

Type: Grant

Filed: May 13, 2011

Date of Patent: October 24, 2017

Assignee: International Business Machines Corporation

Inventors: Cheng-Hong Li, Alfred J. Park, Eugen Schenfeld
Shader program profiler

Patent number: 9799087

Abstract: Systems, methods, and computer readable media to improve the development of image processing intensive programs are described. In general, techniques are disclosed to non-intrusively monitor the run-time performance of shader programs on a graphics processing unit (GPU); that is, to profile shader program execution. More particularly, GPU-based hardware threads may be configured to run in parallel to, and not interfere with, the execution environment of a GPU during shader program execution. When so configured, GPU hardware threads may be used to accurately characterize the run-time behavior of an application program's shader operations.

Type: Grant

Filed: September 9, 2013

Date of Patent: October 24, 2017

Assignee: Apple Inc.

Inventors: Sun Tjen Fam, Andrew Sowerby
Systems and methods for load-balancing by secondary processors in parallelized indexing

Patent number: 9785700

Abstract: The invention relates to electronic indexing, and more particularly, to the parallelization of indexing. Systems and methods of the invention index data archives by breaking a job into work items and sending the work items to multiple processors that can each determine whether to index data associated with the work item or to create a new work item and have a different processor index the data. This gives the system an internal load-balancing that results in indexing jobs during which no processor stands idle while another processor indexes data of unexpected complexity.
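
The work-item handoff described in this abstract can be sketched, under stated assumptions, as a single-threaded simulation. Here a worker indexes simple embedded data itself but turns anything with further nested content into a new work item for another worker. The dictionary layout, the function names, and the "has nested children" complexity test are all hypothetical; the abstract does not say how a processor makes this decision.

```python
from collections import deque


def process(item, queue, index):
    """Index one work item; hand off complex embedded data as new work items."""
    index.append(item["name"])
    for child in item.get("children", []):
        if child.get("children"):
            # Assumed criterion: nested content is "unexpectedly complex",
            # so create a new work item for a different worker.
            queue.append(child)
        else:
            # Simple data: index it in place.
            index.append(child["name"])


def run(root_items):
    """Drain the shared queue; each loop iteration stands in for any idle worker."""
    queue = deque(root_items)
    index = []
    while queue:
        process(queue.popleft(), queue, index)
    return index


if __name__ == "__main__":
    archive = {
        "name": "archive.zip",
        "children": [
            {"name": "a.txt"},
            {"name": "mail.pst", "children": [{"name": "m1"}]},
        ],
    }
    print(run([archive]))
```

In a real parallel setting the queue would be shared between processors, so whichever worker is idle picks up the handed-off item.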

Type: Grant

Filed: August 7, 2013

Date of Patent: October 10, 2017

Assignee: Nuix Pty Ltd

Inventors: David Sitsky, Eddie Sheehy
General purpose distributed data parallel computing using a high level language

Patent number: 9720743

Abstract: General-purpose distributed data-parallel computing using a high-level language is disclosed. Data parallel portions of a sequential program that is written by a developer in a high-level language are automatically translated into a distributed execution plan. The distributed execution plan is then executed on large compute clusters. Thus, the developer is allowed to write the program using familiar programming constructs in the high level language. Moreover, developers without experience with distributed compute systems are able to take advantage of such systems.

Type: Grant

Filed: July 23, 2015

Date of Patent: August 1, 2017

Assignee: Microsoft Technology Licensing, LLC

Inventors: Yuan Yu, Dennis Fetterly, Michael Isard, Ulfar Erlingsson, Mihai Budiu
Runtime handling of task dependencies using dependence graphs

Patent number: 9652286

Abstract: Embodiments include systems and methods for handling task dependencies in a runtime environment using dependence graphs. For example, a computer-implemented runtime engine includes runtime libraries configured to handle tasks and task dependencies. The task dependencies can be converted into data dependencies. At runtime, as the runtime engine encounters tasks and associated data dependencies, it can add those identified tasks as nodes of a dependence graph, and can add edges between the nodes that correspond to the data dependencies without deadlock. The runtime engine can schedule the tasks for execution according to a topological traversal of the dependence graph in a manner that preserves task dependencies substantially as defined by the source code.
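
A minimal sketch of the idea, assuming each task declares the data it reads and writes: tasks become nodes as they are encountered, edges follow the data dependencies, and a topological traversal (Kahn's algorithm here, chosen for illustration rather than taken from the patent) yields a valid execution order. The names `build_graph` and `topo_order` and the `(name, reads, writes)` tuple format are hypothetical.

```python
from collections import defaultdict, deque


def build_graph(tasks):
    """tasks: list of (name, reads, writes) in program order.

    Returns a dict mapping each task name to the set of its successors.
    Edges only point from earlier to later tasks, so the graph is acyclic.
    """
    edges = {name: set() for name, _, _ in tasks}
    last_writer = {}
    readers_since_write = defaultdict(list)
    for name, reads, writes in tasks:
        for d in reads:
            if d in last_writer:
                edges[last_writer[d]].add(name)   # read-after-write
        for d in writes:
            if d in last_writer:
                edges[last_writer[d]].add(name)   # write-after-write
            for reader in readers_since_write[d]:
                edges[reader].add(name)           # write-after-read
        for d in reads:
            readers_since_write[d].append(name)
        for d in writes:
            last_writer[d] = name
            readers_since_write[d] = []
    return edges


def topo_order(edges):
    """Kahn's algorithm: repeatedly emit tasks whose predecessors are done."""
    indegree = {n: 0 for n in edges}
    for successors in edges.values():
        for s in successors:
            indegree[s] += 1
    ready = deque(n for n in edges if indegree[n] == 0)
    order = []
    while ready:
        n = ready.popleft()
        order.append(n)
        for s in sorted(edges[n]):
            indegree[s] -= 1
            if indegree[s] == 0:
                ready.append(s)
    return order


if __name__ == "__main__":
    tasks = [
        ("A", set(), {"x"}),
        ("B", {"x"}, {"y"}),
        ("C", {"x"}, {"z"}),
        ("D", {"y", "z"}, set()),
    ]
    print(topo_order(build_graph(tasks)))
```

Tasks that sit in the ready queue at the same time (B and C in the example) have no dependency between them, so a runtime could execute them in parallel.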

Type: Grant

Filed: March 21, 2014

Date of Patent: May 16, 2017

Assignee: Oracle International Corporation

Inventor: Bin Fan
Collaborative method and system to balance workload distribution

Patent number: 9577869

Abstract: A method, system and program product for balanced workload distribution in a plurality of networked computing nodes. The networked computing nodes may be arranged as a connected graph defining at least one direct neighbor to each networked computing node. The method comprises determining a first workload indicator of the i-th computing node, at a first stage before a new task may be started by the i-th computing node, determining an estimated workload indicator of the i-th computing node, assuming that the new task is performed at a second stage on the i-th computing node, determining estimated workload indicators of each direct neighbor assuming that the new task is performed at the second stage, deciding whether to move the new task to another computing node, and moving the new task to one of the direct neighboring computing nodes of the i-th computing node such that workloads are balanced.
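
The decision step can be illustrated with a small sketch. It assumes the workload indicator is a single number per node and that the task is moved to the direct neighbor with the lowest estimated workload only when that estimate is lower than the node's own. Both the indicator and the decision rule are assumptions for illustration, since the abstract does not define them; `place_task` is a hypothetical name.

```python
def place_task(node, task_cost, load, neighbors):
    """Return the node that should run the new task.

    load:      dict node -> current workload indicator (first stage)
    neighbors: dict node -> list of direct neighbors in the connected graph
    """
    # Estimated workload if the task runs locally (second stage).
    est_self = load[node] + task_cost
    # Estimated workload of each direct neighbor if it took the task instead.
    best = min(neighbors.get(node, []),
               key=lambda n: load[n] + task_cost,
               default=None)
    if best is not None and load[best] + task_cost < est_self:
        return best   # moving the task balances the workloads better
    return node       # keep the task on the originating node


if __name__ == "__main__":
    load = {"a": 5, "b": 1, "c": 3}
    neighbors = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}
    print(place_task("a", 2, load, neighbors))
    print(place_task("b", 2, load, neighbors))
```

Because each node consults only its direct neighbors, no central scheduler is needed; the balancing emerges from local decisions.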

Type: Grant

Filed: August 28, 2013

Date of Patent: February 21, 2017

Assignee: International Business Machines Corporation

Inventors: Gianluca Della Corte, Alessandro Donatelli, Antonio M. Sgro
Device and method for the distributed execution of digital data processing operations

Patent number: 9569272

Abstract: A method and device for digital data processing based on a data flow processing model is suitable for the execution, in a distributed manner on multiple calculation nodes, of multiple data processing operations modelled by directed graphs, where two different processing operations include at least one common calculation node. The device includes an identification processor configured to, from a valued directed multi-graph made up of the union of several distinct processing graphs and divided into several valued directed sub-multi-graphs, called chunks, and whose input and output nodes are buffer memory nodes of the multi-graph, identify a coordination module for each chunk. Furthermore each identified coordination module is configured to synchronize portions of processing operations that are to be executed in the chunk with which the respective coordination module is associated, independently of portions of processing operations that are to be executed in other chunks.

Type: Grant

Filed: July 9, 2010

Date of Patent: February 14, 2017

Assignee: Commissariat a l'energie atomique et aux alternatives

Inventor: Yvain Thonnart
Computer-readable recording medium storing information processing program, information processing apparatus, and information processing method

Patent number: 9552197

Abstract: A non-transitory computer-readable recording medium stores therein a program for causing an information processing apparatus to execute a process including analyzing a source program with respect to the information processing apparatus that starts hardware prefetching upon detecting an access to a consecutive area on a main storage device and stops the hardware prefetching upon detecting an end of the access to the consecutive area, specifying an array structure in a loop process as a hardware prefetching target, and generating, from the source program, a machine language program in which the array structure is changed so that a second access occurring next to a first access to the array structure refers to an area being consecutive from the area being referred to by the first access.

Type: Grant

Filed: August 31, 2015

Date of Patent: January 24, 2017

Assignee: FUJITSU LIMITED

Inventor: Shigeru Kimura
Server, arithmatic processing method, and arithmatic processing system

Patent number: 9467532

Abstract: A computing method is provided which includes calling a general purpose graphics processing subroutine for execution of a target program by a client; sending a program code and resource data for execution of the target program to a server by the client; and executing the program code using a general purpose graphics processing unit by the server.

Type: Grant

Filed: January 22, 2013

Date of Patent: October 11, 2016

Assignee: INDUSTRY-ACADEMIC COOPERATION FOUNDATION, YONSEI UNIVERSITY

Inventors: Won Woo Ro, Keunsoo Kim, Seung Hun Kim
Optimizing compiler performance by object collocation

Patent number: 9448778

Abstract: A computer-implemented method, system, and computer program product for performing object collocation on a computer system are provided. The method includes analyzing a sequence of computer instructions for object allocations and uses of the allocated objects. The method further includes creating an allocation interference graph of object allocation nodes with edges indicating pairs of allocations to be omitted from collocation. The method also includes coloring the allocation interference graph such that adjacent nodes are assigned different colors, and creating an object allocation at a program point prior to allocations of a selected color from the allocation interference graph. The method additionally includes storing an address associated with the created object allocation in a collocation pointer, and replacing a use of each allocation of the selected color with a use of the collocation pointer to collocate multiple objects.
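
The coloring step can be sketched with a simple greedy algorithm. Greedy coloring is swapped in here for illustration; the abstract does not say which coloring method is used. Nodes stand for object allocations, an edge marks a pair that must not be collocated, and allocations that end up with the same color are candidates for sharing one collocated allocation. The name `color_allocations` is hypothetical.

```python
def color_allocations(nodes, interferes):
    """Greedy graph coloring of an allocation interference graph.

    nodes:      allocation identifiers, in the order they are visited
    interferes: pairs (a, b) of allocations that must not be collocated
    Returns a dict mapping each allocation to a color (a small integer).
    """
    adjacent = {n: set() for n in nodes}
    for a, b in interferes:
        adjacent[a].add(b)
        adjacent[b].add(a)
    color = {}
    for n in nodes:
        used = {color[m] for m in adjacent[n] if m in color}
        c = 0
        while c in used:   # pick the smallest color no neighbor has
            c += 1
        color[n] = c
    return color


if __name__ == "__main__":
    colors = color_allocations(["a", "b", "c"], [("a", "b")])
    groups = {}
    for alloc, c in colors.items():
        groups.setdefault(c, []).append(alloc)
    print(groups)
```

In the example, `a` and `c` share a color and could be served by a single allocation made earlier in the program, while `b` interferes with `a` and is kept separate.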

Type: Grant

Filed: July 30, 2014

Date of Patent: September 20, 2016

Assignee: International Business Machines Corporation

Inventors: Patrick Doyle, Pramod Ramarao, Vijay Sundaresan
Increasing performance at runtime from trace data

Patent number: 9436589

Abstract: An analysis system may perform network analysis on data gathered from an executing application. The analysis system may identify relationships between code elements and use tracer data to quantify and classify various code elements. In some cases, the analysis system may operate with only data gathered while tracing an application, while other cases may combine static analysis data with tracing data. The network analysis may identify groups of related code elements through cluster analysis, as well as identify bottlenecks from one to many and many to one relationships. The analysis system may generate visualizations showing the interconnections or relationships within the executing code, along with highlighted elements that may be limiting performance.

Type: Grant

Filed: March 29, 2013

Date of Patent: September 6, 2016

Assignee: Microsoft Technology Licensing, LLC

Inventors: Ying Li, Alexander G. Gounares, Charles D. Garrett, Russell S. Krajec
Adaptive lock for a computing system having multiple runtime environments and multiple processing units

Patent number: 9424103

Abstract: A method for operating a lock in a computing system having plural processing units and running under multiple runtime environments is provided. When a requester thread attempts to acquire the lock while the lock is held by a holder thread, determine whether the holder thread is suspendable or non-suspendable. If the holder thread is non-suspendable, put the requester thread in a spin state regardless of whether the requester thread is suspendable or non-suspendable; otherwise, determine whether the requester thread is suspendable or non-suspendable, unless the requester thread quits acquiring the lock. If the requester thread is non-suspendable, arrange the requester thread to attempt acquiring the lock again; otherwise, add the requester thread to a wait queue as an additional suspended thread. Suspended threads stored in the wait queue can be resumed later for lock acquisition. The method is applicable to computing systems with a multicore processor.
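
The contention-handling decision in this abstract reduces to a small decision table, sketched below. Only the branching follows the abstract; the `Action` enum and the function name `on_contention` are hypothetical, and the actual spinning, retrying, and queueing mechanics are not modeled.

```python
from enum import Enum, auto


class Action(Enum):
    SPIN = auto()      # busy-wait until the holder releases the lock
    RETRY = auto()     # attempt to acquire the lock again
    ENQUEUE = auto()   # suspend and join the wait queue


def on_contention(holder_suspendable, requester_suspendable):
    """Decide what a requester does when the lock is already held."""
    if not holder_suspendable:
        # A non-suspendable holder: spin regardless of the requester's kind.
        return Action.SPIN
    if not requester_suspendable:
        # The requester cannot be suspended, so it tries again.
        return Action.RETRY
    # Both are suspendable: park the requester in the wait queue.
    return Action.ENQUEUE


if __name__ == "__main__":
    for holder in (False, True):
        for requester in (False, True):
            print(holder, requester, on_contention(holder, requester).name)
```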

Type: Grant

Filed: September 30, 2014

Date of Patent: August 23, 2016

Assignee: Hong Kong Applied Science and Technology Research Institute Company Limited

Inventors: Yi Al, Lin Xu, Jianchao Lu, Shaohua Zhang
Agile communication operator

Patent number: 9395957

Abstract: A high level programming language provides an agile communication operator that generates a segmented computational space based on a resource map for distributing the computational space across compute nodes. The agile communication operator decomposes the computational space into segments, causes the segments to be assigned to compute nodes, and allows the user to centrally manage and automate movement of the segments between the compute nodes. The segment movement may be managed using either a full global-view representation or a local-global-view representation of the segments.

Type: Grant

Filed: December 22, 2010

Date of Patent: July 19, 2016

Assignee: Microsoft Technology Licensing, LLC

Inventor: Paul F. Ringseth
Apparatus and method for generating vector code

Patent number: 9367291

Abstract: An apparatus and method for generating vector code are provided. The apparatus and method generate vector code using scalar-type kernel code, without the user changing a code type or modifying the data layout, thereby enhancing the user's convenience of use and retaining the portability of OpenCL.

Type: Grant

Filed: March 28, 2014

Date of Patent: June 14, 2016

Assignees: Samsung Electronics Co., Ltd., Seoul National University R&DB Foundation

Inventors: Jin-Seok Lee, Seong-Gun Kim, Dong-Hoon Yoo, Seok-Joong Hwang, Jeongho Nah, Jaejin Lee, Jun Lee
Methods and apparatus to configure a process control system using an electronic description language script

Patent number: 9354629

Abstract: Example methods and apparatus to configure a process control system using an electronic description language (EDL) script are disclosed. A disclosed example method comprises loading a first script representative of a process plant, the first script comprising an interpretive system-level script structured in accordance with an electronic description language, and compiling the first script to form a second script, the second script structured in accordance with a vendor-specific configuration language associated with a particular process control system for the process plant.

Type: Grant

Filed: February 19, 2009

Date of Patent: May 31, 2016

Assignee: Fisher-Rosemount Systems, Inc.

Inventors: James Randall Balentine, Gary Keith Law, Mark Nixon
Data splitting for multi-instantiated objects

Patent number: 9311065

Abstract: Embodiments relate to data splitting for multi-instantiated objects. An aspect includes receiving a portion of source code for compilation having a dynamic object to split using object size array data splitting. Another aspect includes replacing all memory allocations for the dynamic object with a total size of an object size array and object field arrays including a predetermined padding. Another aspect includes inserting statements in the source code after the memory allocations to populate the object size array with a value of a number of elements of the object size array. Another aspect includes updating a stride for load and store operations using dynamic pointers. Yet another aspect includes modifying field references by adding a distance between the object size array and the object field array to respective address operations.

Type: Grant

Filed: June 19, 2014

Date of Patent: April 12, 2016

Assignee: International Business Machines Corporation

Inventors: Shimin Cui, Yan Zhang
