For A Parallel Or Multiprocessor System Patents (Class 717/149)
-
Patent number: 10853079
Abstract: A method and computer program product for performing a plurality of processing operations. A plurality of processor nodes each include one or more operational instances. Each processor node includes criteria for generating its operational instances. The processor nodes are linked together in a directed acyclic processing graph in which dependent nodes use data from the operational instances of upstream nodes to perform a node-specific set of processing operations. Dependency relationships between the processor nodes are defined on an operational instance basis, where operational instances in dependent processor nodes identify data associated with, or generated by, specific upstream operational instances that is used to perform the node-specific set of operations for that dependent operational instance. The processing graph may also include connector nodes defining instance-level dependency relationships between processor nodes.
Type: Grant
Filed: September 19, 2019
Date of Patent: December 1, 2020
Assignee: Side Effects Software Inc.
Inventors: Ken Xu, Taylor James Petrick
-
Patent number: 10853131
Abstract: Systems and methods for implementing dataflow life cycles are described and include forming, by a first server computing system, a dataflow life cycle by associating a dataflow with a customized code; associating, by the first server computing system, the customized code of the dataflow life cycle with context information, the customized code including one or more of pre-processing customized code and post-processing customized code; scheduling, by the first server computing system, the dataflow of the dataflow life cycle to be executed by a second server computing system when the customized code includes the pre-processing customized code and when the pre-processing customized code is successfully executed by the first server computing system; and executing, by the first server computing system, the post-processing customized code when the customized code includes the post-processing customized code and when the dataflow of the dataflow life cycle is successfully executed by the second server computing system.
Type: Grant
Filed: November 20, 2017
Date of Patent: December 1, 2020
Assignee: salesforce.com, inc.
Inventors: Ruisheng Shi, Farid Nabavi, Alex Gitelman
-
Patent number: 10831691
Abstract: The present disclosure relates to a method for implementing processing elements in a chip card such that the processing elements can communicate data between each other in order to perform a computation task, wherein the data communication requires each processing element to have a respective number of connections to other processing elements. The method comprises: providing a complete graph with an even number of nodes that is higher than the maximum of the numbers of connections by one or two. If the number of processing elements is higher than the number of nodes of the graph, the graph may be duplicated and the duplicated graphs may be combined into a combined graph. A methodology for placing and connecting the processing elements may be determined in accordance with the structure of nodes of a resulting graph, the resulting graph being the complete graph or the combined graph.
Type: Grant
Filed: May 24, 2019
Date of Patent: November 10, 2020
Assignee: International Business Machines Corporation
Inventors: Martino Dazzi, Pier Andrea Francese, Abu Sebastian, Riduan Khaddam-Aljameh, Evangelos Stavros Eleftheriou
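The graph-sizing rule in the abstract above can be illustrated with a short sketch. This is only a reading of the abstract, not the patented method: the function names and the example connection counts are hypothetical, and the duplication and combination of graphs is not modeled.

```python
from itertools import combinations


def complete_graph_size(connection_counts):
    """Return the even node count that exceeds the largest required
    connection count by one or two, as the abstract describes."""
    candidate = max(connection_counts) + 1
    # If "max + 1" is odd, "max + 2" is the even choice.
    return candidate if candidate % 2 == 0 else candidate + 1


def complete_graph_edges(node_count):
    """List every edge of the complete graph on node_count nodes."""
    return list(combinations(range(node_count), 2))


# Hypothetical per-element connection requirements.
counts = [3, 2, 4, 1]
size = complete_graph_size(counts)
edges = complete_graph_edges(size)
```

In a complete graph on `size` nodes every node has `size - 1` neighbours, so the chosen size always leaves each processing element with at least as many links as it requires.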
-
Patent number: 10785089
Abstract: A set of service-level reliability metrics and a method to allocate these metrics to the layers of the service delivery platform. These initial targets can be tuned during service design and delivery, and feed the vendor requirements process, forming the basis for measuring, tracking, and responding based on the service-level reliability metrics.
Type: Grant
Filed: May 7, 2018
Date of Patent: September 22, 2020
Assignee: AT&T Intellectual Property I, L.P.
Inventor: Paul Reeser
-
Patent number: 10741226
Abstract: A multi-processor computer architecture incorporating distributed multi-ported common memory modules wherein each of the memory modules comprises a control block functioning as a cross-bar router in conjunction with one or more associated memory banks or other data storage devices. Each memory module has multiple I/O ports and the ability to relay requests to other memory modules if the desired memory location is not found on the first module. A computer system in accordance with the invention may comprise memory module cards along with processor cards interconnected using a baseboard or backplane having a toroidal interconnect architecture between the cards.
Type: Grant
Filed: May 28, 2013
Date of Patent: August 11, 2020
Assignee: FG SRC LLC
Inventors: Jon M. Huppenthal, Timothy J. Tewalt, Lee A. Burton, David E. Caliga
-
Patent number: 10691432
Abstract: A method for generating a program to run on multiple tiles. The method comprises: receiving an input graph comprising data nodes, compute vertices and edges; receiving an initial tile-mapping specifying which data nodes and vertices are allocated to which tile; and determining a subgraph of the input graph that meets one or more heuristic rules. The rules comprise: the subgraph comprises at least one data node, the subgraph spans no more than a threshold number of tiles in the initial tile-mapping, and the subgraph comprises at least a minimum number of edges outputting to one or more vertices on one or more other tiles. The method further comprises adapting the initial mapping to migrate the data nodes and any vertices of the determined subgraph to said one or more other tiles, and compiling the executable program from the graph with the vertices and data nodes allocated by the adapted mapping.
Type: Grant
Filed: February 15, 2019
Date of Patent: June 23, 2020
Assignee: Graphcore Limited
Inventors: Mark Lloyd Pupilli, David Lacey
-
Patent number: 10678616
Abstract: Implementations of the present disclosure are directed to a method, a system, and an article for binding computer languages. An example computer-implemented method includes: operating an application on at least one computer in a first computer language; operating a platform for the application on the at least one computer in a second computer language; binding the first computer language with the second computer language; and communicating between the application and the platform using the binding of the first computer language and the second computer language.
Type: Grant
Filed: November 9, 2018
Date of Patent: June 9, 2020
Assignee: MZ IP Holdings, LLC
Inventors: John O'Connor, Nathan Spencer, Garth Gillespie, Yan Zhang
-
Patent number: 10642586
Abstract: An optimizing compiler includes a vector optimization mechanism that optimizes vector operations that are reformatting-resistant, such as source instructions that do not have a corresponding reformatting operation, sink instructions that do not have a corresponding reformatting operation, a source instruction that is a scalar value, a sink instruction that may produce a scalar value, and an internal operation that depends on lanes being in a specified order. The ability to optimize vector instructions that are reformatting-resistant reduces the number of operations to improve the run-time performance of the code.
Type: Grant
Filed: December 8, 2018
Date of Patent: May 5, 2020
Assignee: International Business Machines Corporation
Inventors: Michael Karl Gschwind, William J. Schmidt
-
Patent number: 10599482
Abstract: A programming model generates a graph for a program, the graph including a plurality of nodes and edges, wherein each node of the graph represents an operation and edges between the nodes represent streams of data input to and output from the operations represented by the nodes. The model determines where in a distributed architecture to execute the operations represented by the nodes. Such determining may include determining which nodes have location restrictions, assigning locations to each node having a location restriction based on the restriction, and partitioning the graph into a plurality of subgraphs, the partitioning including assigning locations to nodes without location restrictions in accordance with a first set of constraints, wherein each node within a particular subgraph is assigned to the same location. Each of the subgraphs is executed at its assigned location in a respective single thread.
Type: Grant
Filed: August 24, 2017
Date of Patent: March 24, 2020
Assignee: Google LLC
Inventors: Gautham Thambidorai, Matthew Rosencrantz, Sanjay Ghemawat, Srdjan Petrovic, Ivan Posva
-
Patent number: 10592233
Abstract: Techniques for specifying and implementing a software application targeted for execution on a multiprocessor array (MPA). The MPA may include a plurality of processing elements, supporting memory, and a high bandwidth interconnection network (IN), communicatively coupling the plurality of processing elements and supporting memory. In some embodiments, software code may include first program instructions executable to perform a function. In some embodiments, the software code may also include one or more language constructs that are configurable to specify one or more parameter inputs. In some embodiments, the one or more parameter inputs are configurable to specify a set of hardware resources usable to execute the software code. In some embodiments, the hardware resources include multiple processors and may include multiple supporting memories.
Type: Grant
Filed: January 16, 2018
Date of Patent: March 17, 2020
Assignee: COHERENT LOGIX, INCORPORATED
Inventors: Stephen E. Lim, Viet N. Ngo, Jeffrey M. Nicholson, John Mark Beardslee, Teng-I Wang, Zhong Qing Shang, Michael Lyle Purnell
-
Patent number: 10572515
Abstract: The invention relates to electronic indexing, and more particularly, to the parallelization of indexing. Systems and methods of the invention index data archives by breaking a job into work items and sending the work items to multiple processors that can each determine whether to index data associated with the work item or to create a new work item and have a different processor index the data. This gives the system an internal load-balancing that results in indexing jobs during which no processor stands idle while another processor indexes data of unexpected complexity.
Type: Grant
Filed: October 9, 2017
Date of Patent: February 25, 2020
Assignee: NUIX PTY LTD
Inventors: David Sitsky, Eddie Sheehy
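The index-or-split decision described in the abstract above can be sketched as a single-threaded simulation. Everything here is an assumption made for illustration: the archive is modeled as nested Python lists, `max_direct_size` is a hypothetical threshold, and a shared queue stands in for handing new work items to other processors.

```python
from collections import deque


def index_document(index, document):
    """Add each word of a document to a simple inverted index."""
    for word in document.split():
        index.setdefault(word.lower(), set()).add(document)


def index_archive(root, max_direct_size=2):
    """Process work items from a shared queue. Strings are documents,
    lists are containers. A small, flat container is indexed directly;
    anything else is split into new work items."""
    queue = deque([root])
    index = {}
    splits = 0
    while queue:
        item = queue.popleft()
        if isinstance(item, str):
            index_document(index, item)
        elif len(item) <= max_direct_size and all(isinstance(c, str) for c in item):
            for document in item:
                index_document(index, document)
        else:
            # Too complex to handle as one unit: create new work items
            # that any idle worker could pick up.
            splits += 1
            queue.extend(item)
    return index, splits


# Hypothetical archive: a document plus a nested container.
archive = ["alpha beta", ["beta gamma", ["delta", "alpha"]]]
index, splits = index_archive(archive)
```

In a real deployment the queue would be shared between processes or machines; the sketch only shows how splitting keeps unexpectedly complex items from being handled by a single worker.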
-
Patent number: 10372677
Abstract: A cache management system for managing a plurality of intermediate data includes a processor and a memory having stored thereon instructions that cause the processor to perform identifying a new intermediate data to be accessed, loading the intermediate data from the memory in response to identifying the new intermediate data as one of the plurality of intermediate data, in response to not identifying the new intermediate data as one of the plurality of intermediate data, selecting a set of victim intermediate data to evict from the memory based on a plurality of scores associated with respective ones of the plurality of intermediate data, the scores being based on a score table, evicting the set of victim intermediate data from the memory, updating the score table based on the set of victim intermediate data, and adding the new intermediate data to the plurality of intermediate data stored in the memory.
Type: Grant
Filed: January 11, 2017
Date of Patent: August 6, 2019
Assignee: Samsung Electronics Co., Ltd.
Inventors: Zhengyu Yang, Jiayin Wang, Thomas David Evans
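The hit and miss paths in the abstract above can be sketched as a small score-driven cache. The abstract does not say how scores are computed, so this sketch assumes a simple access count; the class name, the item-count capacity, and the `compute` callback are all hypothetical.

```python
class ScoredCache:
    """Cache that evicts the lowest-scoring entries on a miss.
    Assumption: the score is an access count, which the abstract
    does not specify."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.store = {}   # cached intermediate data
        self.scores = {}  # score table, keyed like the store

    def access(self, key, compute):
        if key in self.store:
            # Hit: load from memory and raise the entry's score.
            self.scores[key] += 1
            return self.store[key]
        # Miss: evict victims until there is room for the new entry.
        while len(self.store) >= self.capacity:
            victim = min(self.store, key=lambda k: self.scores[k])
            del self.store[victim]
            del self.scores[victim]  # keep the score table in sync
        value = compute()
        self.store[key] = value
        self.scores[key] = 1
        return value


cache = ScoredCache(capacity=2)
cache.access("a", lambda: 1)
cache.access("a", lambda: 1)  # second access raises the score of "a"
cache.access("b", lambda: 2)
cache.access("c", lambda: 3)  # evicts "b", the lowest-scoring entry
```

Swapping in a different scoring policy only requires changing how `scores` is updated; the eviction loop stays the same.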
-
Patent number: 10360073
Abstract: A system for scheduling mobile augmented reality tasks performed on at least one mobile device and a workspace includes: a mobile device, comprising a central processing unit (CPU) and a graphics processing unit (GPU); a Network Profiling Component, configured to gather network related context data; a Device Profiling Component, configured to gather hardware related context data; an Application Profiling Component, configured to gather application related context data; and a Scheduling Component, configured to receive the network related context data, the hardware related context data, and the application related context data, and to schedule tasks between the CPU, the GPU and the workspace.
Type: Grant
Filed: December 16, 2014
Date of Patent: July 23, 2019
Assignee: DEUTSCHE TELEKOM AG
Inventors: Pan Hui, Junqiu Wei, Christoph Peylo
-
Patent number: 10331495
Abstract: An apparatus including a processor to: parse comments of multiple task routines to identify I/O parameters; generate a visualization of a DAG to include a visual representation of each task routine, wherein each representation includes a task graph object of the task routine, at least one input data graph object that represents an input to the task routine and that includes a visual indication of at least one characteristic of the input; and at least one output data graph object that represents an output of the task routine and that includes a visual indication of at least one characteristic of the output; in the I/O parameters, identify each dependency between an output of one task routine and an input of another; for each identified dependency, augment the visualization with a dependency marker that visually links the visual representations of each associated pair of task routines; and visually output the visualization.
Type: Grant
Filed: February 15, 2018
Date of Patent: June 25, 2019
Assignee: SAS INSTITUTE INC.
Inventors: Henry Gabriel Victor Bequet, Chaowang “Ricky” Zhang
-
Patent number: 10331615
Abstract: The present invention relates to a method for compiling code for a multi-core processor, comprising: detecting and optimizing a loop, partitioning the loop into partitions executable and mappable on physical hardware with optimal instruction level parallelism, optimizing the loop iterations and/or loop counter for ideal mapping on hardware, chaining the loop partitions generating a list representing the execution sequence of the partitions.
Type: Grant
Filed: May 22, 2017
Date of Patent: June 25, 2019
Assignee: Hyperion Core, Inc.
Inventor: Martin Vorbach
-
Patent number: 10311025
Abstract: A cache management system for managing a plurality of intermediate data includes a processor, and a memory having stored thereon the plurality of intermediate data and instructions that when executed by the processor, cause the processor to perform identifying a new intermediate data to be accessed, loading the intermediate data from the memory in response to identifying the new intermediate data as one of the plurality of intermediate data, and in response to not identifying the new intermediate data as one of the plurality of intermediate data identifying a reusable intermediate data having a longest duplicate generating logic chain that is at least in part the same as a generating logic chain of the new intermediate data, and generating the new intermediate data from the reusable intermediate data and a portion of the generating logic chain of the new intermediate data not in common with the reusable intermediate data.
Type: Grant
Filed: January 11, 2017
Date of Patent: June 4, 2019
Assignee: Samsung Electronics Co., Ltd.
Inventors: Zhengyu Yang, Jiayin Wang, Thomas David Evans
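The reuse step in the abstract above can be sketched by modeling a generating logic chain as a tuple of operation names. This is an assumption: the sketch treats the "longest duplicate generating logic chain" as the longest cached prefix of the requested chain, and the operation registry `ops` and function names are hypothetical.

```python
def longest_reusable_prefix(cache, chain):
    """Return the longest prefix of chain that already has a cached result."""
    for end in range(len(chain), 0, -1):
        if chain[:end] in cache:
            return chain[:end]
    return ()


def materialize(cache, ops, source, chain):
    """Produce the data for chain, starting from the longest cached
    prefix and applying only the operations not covered by it.
    Returns the value and the number of operations actually run."""
    chain = tuple(chain)
    prefix = longest_reusable_prefix(cache, chain)
    value = cache[prefix] if prefix else source
    for i in range(len(prefix), len(chain)):
        value = ops[chain[i]](value)
        cache[chain[: i + 1]] = value  # remember each intermediate result
    return value, len(chain) - len(prefix)


# Hypothetical operations over a list of numbers.
ops = {
    "double": lambda xs: [x * 2 for x in xs],
    "inc": lambda xs: [x + 1 for x in xs],
    "total": sum,
}
cache = {}
first, first_steps = materialize(cache, ops, [1, 2, 3], ["double", "inc"])
second, second_steps = materialize(cache, ops, [1, 2, 3], ["double", "inc", "total"])
```

The second request shares its first two operations with the first, so only `total` runs. Caching every intermediate prefix is a design choice of this sketch, not something the abstract states.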
-
Patent number: 10303522
Abstract: A system and method for distributed graphics processing unit (GPU) computation are disclosed. A particular embodiment includes: receiving a user task service request from a user node; querying resource availability from a plurality of slave nodes having a plurality of graphics processing units (GPUs) thereon; assigning the user task service request to a plurality of available GPUs based on the resource availability and resource requirements of the user task service request, the assigning including starting a service on a GPU using a distributed processing container and creating a corresponding uniform resource locator (URL); and retaining a list of URLs corresponding to the resources assigned to the user task service request.
Type: Grant
Filed: July 1, 2017
Date of Patent: May 28, 2019
Assignee: TUSIMPLE
Inventors: Kai Zhou, Siyuan Liu
-
Patent number: 10284457
Abstract: A method, an information handling system (IHS), and a virtual link trunking (VLT) system for determining VLT ports to block and unblock in an IHS. The method includes calculating a forwarding table index for a local switch of currently active and inactive switch peers for a VLT port. A pre-determined forwarding table is retrieved from a memory containing a plurality of port blocking and unblocking actions for the switch peers. Current port blocking and unblocking actions are identified in the pre-determined forwarding table corresponding to the forwarding table index. Changes are determined between the previous port blocking and unblocking actions and the current port blocking and unblocking actions. The input/output (I/O) ports are configured for the local switch based on the determined changes in the port blocking and unblocking actions.
Type: Grant
Filed: July 12, 2016
Date of Patent: May 7, 2019
Assignee: Dell Products, L.P.
Inventors: Senthil Nathan Muthukaruppan, Shankara Ramamurthy, Pugalendran Rajendran
-
Patent number: 10255115
Abstract: An apparatus including a processor to: parse comments of multiple task routines to identify I/O parameters; generate a visualization of a DAG to include a visual representation of each task routine, wherein each representation includes a task graph object of the task routine, at least one input data graph object that represents an input to the task routine and that includes a visual indication of at least one characteristic of the input; and at least one output data graph object that represents an output of the task routine and that includes a visual indication of at least one characteristic of the output; in the I/O parameters, identify each dependency between an output of one task routine and an input of another; for each identified dependency, augment the visualization with a dependency marker that visually links the visual representations of each associated pair of task routines; and visually output the visualization.
Type: Grant
Filed: February 15, 2018
Date of Patent: April 9, 2019
Assignee: SAS INSTITUTE INC.
Inventors: Henry Gabriel Victor Bequet, Chaowang “Ricky” Zhang
-
Patent number: 10228923
Abstract: A parallelization compiling method for generating a segmented program from a sequential program, in which multiple macro tasks are included and at least two of the macro tasks have a data dependency relationship with one another, includes determining an existence of invalidation information for invalidating at least a part of the data dependency relationship between the at least two of the plurality of macro tasks before compiling the sequential program into the segmented program, and generating the segmented program by compiling the sequential program into the segmented program with reference to a determination result of the existence of the invalidation information. When the invalidation information is determined to exist, the at least a part of the data dependency relationship is invalidated before the compiling of the sequential program into the segmented program.
Type: Grant
Filed: March 29, 2016
Date of Patent: March 12, 2019
Assignees: DENSO CORPORATION, WASEDA UNIVERSITY
Inventors: Yoshihiro Yatou, Noriyuki Suzuki, Kenichi Mineda, Hironori Kasahara, Keiji Kimura, Hiroki Mikami, Dan Umeda
-
Patent number: 10210025
Abstract: An apparatus including a processor to: parse comments of multiple task routines to identify I/O parameters; generate a visualization of a DAG to include a visual representation of each task routine, wherein each representation includes a task graph object of the task routine, at least one input data graph object that represents an input to the task routine and that includes a visual indication of at least one characteristic of the input; and at least one output data graph object that represents an output of the task routine and that includes a visual indication of at least one characteristic of the output; in the I/O parameters, identify each dependency between an output of one task routine and an input of another; for each identified dependency, augment the visualization with a dependency marker that visually links the visual representations of each associated pair of task routines; and visually output the visualization.
Type: Grant
Filed: February 15, 2018
Date of Patent: February 19, 2019
Assignee: SAS INSTITUTE INC.
Inventors: Henry Gabriel Victor Bequet, Chaowang “Ricky” Zhang
-
Patent number: 10169012
Abstract: An optimizing compiler includes a vector optimization mechanism that optimizes vector operations that are reformatting-resistant, such as source instructions that do not have a corresponding reformatting operation, sink instructions that do not have a corresponding reformatting operation, a source instruction that is a scalar value, a sink instruction that may produce a scalar value, and an internal operation that depends on lanes being in a specified order. The ability to optimize vector instructions that are reformatting-resistant reduces the number of operations to improve the run-time performance of the code.
Type: Grant
Filed: November 1, 2017
Date of Patent: January 1, 2019
Assignee: International Business Machines Corporation
Inventors: Michael Karl Gschwind, William J. Schmidt
-
Patent number: 10157466
Abstract: A method to segment images that contain multiple objects in a nested structure including acquiring an image; defining the multiple objects by layers, each layer corresponding to one region, where a region contains an innermost object and all the objects nested within the innermost object; stacking the layers in an order of the nested structure of the multiple objects, the stack of layers having at least a top layer and a bottom layer; extending each layer with padded nodes; connecting the top layer to a sink and the bottom layer to a source, wherein each intermediate layer between the top layer and the bottom layer is connected only to the adjacent layer by undirected links; and measuring a boundary length for each layer.
Type: Grant
Filed: January 23, 2017
Date of Patent: December 18, 2018
Assignees: Riverside Research Institute, New York University
Inventors: Jen-wei Kuo, Jonathan Mamou, Xuan Zhao, Jeffrey A. Ketterling, Orlando Aristizabal, Daniel H. Turnbull, Yao Wang
-
Patent number: 10157055
Abstract: Methods, systems, apparatuses, and computer program products are provided for transforming asynchronous code into more efficient, logically equivalent asynchronous code. Program code is converted into a first syntax tree. A dependency graph is generated from the first syntax tree with each node of the dependency graph corresponding to a code statement and having an assigned weight. Weighted topological sorting of the dependency graph is performed to generate a sorted dependency graph. A second syntax tree is generated from the sorted dependency graph. In another implementation, the program code is transformed into await-relaxed and/or loop-relaxed program code prior to being transformed into the first syntax tree.
Type: Grant
Filed: September 29, 2016
Date of Patent: December 18, 2018
Assignee: Microsoft Technology Licensing, LLC
Inventors: Gal Tamir, Elad Iwanir, Eli Koreh
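The weighted topological sorting step in the abstract above can be sketched with a priority queue. The abstract does not say how weights are assigned or which direction they sort, so this sketch assumes that, among statements whose dependencies are satisfied, the heaviest is emitted first (for example, so that asynchronous calls start early). The statement labels and weights are hypothetical.

```python
import heapq


def weighted_toposort(deps, weights):
    """Order nodes so that every node follows its dependencies,
    preferring higher-weight nodes among those that are ready.
    deps maps each node to the set of nodes it depends on."""
    indegree = {node: len(needed) for node, needed in deps.items()}
    dependents = {node: [] for node in deps}
    for node, needed in deps.items():
        for dep in needed:
            dependents[dep].append(node)

    # heapq is a min-heap, so weights are negated to pop the heaviest first.
    ready = [(-weights[n], n) for n, degree in indegree.items() if degree == 0]
    heapq.heapify(ready)
    order = []
    while ready:
        _, node = heapq.heappop(ready)
        order.append(node)
        for child in dependents[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                heapq.heappush(ready, (-weights[child], child))
    if len(order) != len(deps):
        raise ValueError("dependency graph contains a cycle")
    return order


# Hypothetical statements: s1 and s3 start async work, s2 is local,
# and s4 awaits both results and uses the local value.
deps = {"s1": set(), "s2": set(), "s3": set(), "s4": {"s1", "s2", "s3"}}
weights = {"s1": 10, "s2": 1, "s3": 10, "s4": 1}
order = weighted_toposort(deps, weights)
```

With these weights both asynchronous starts move ahead of the local statement, while `s4` still follows everything it depends on.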
-
Patent number: 10157086
Abstract: An apparatus including a processor to: parse comments of multiple task routines to identify I/O parameters; for each task routine, generate a macro including its I/O parameters; transmit the macros to a requesting device to enable it to generate a visualization of a DAG to include visual representations of the task routines; wherein each representation includes a task graph object, an input data graph object representing an input and indicating a characteristic of the input, and an output data graph object representing an output and indicating a characteristic of the output; and wherein the requesting device is to: identify, in the I/O parameters, each dependency between an output and an input of a pair of task routines; augment the visualization, for each identified dependency, with a dependency marker that visually links the visual representations of the pair of task routines; and visually output the visualization.
Type: Grant
Filed: February 16, 2018
Date of Patent: December 18, 2018
Assignee: SAS INSTITUTE INC.
Inventors: Henry Gabriel Victor Bequet, Chaowang “Ricky” Zhang
-
Patent number: 10146849
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing search results. In one aspect, a method includes receiving a query. A plurality of search results responsive to the query are identified. The search results are analyzed to determine that at least a first search result is associated with a first answer box topic. The search results are provided along with an answer box precursor for the first answer box topic.
Type: Grant
Filed: October 25, 2017
Date of Patent: December 4, 2018
Assignee: Google LLC
Inventors: Tal Cohen, Ziv Bar-Yossef, Igor Tsvetkov, Adi Mano, Oren Naim, Nitsan Oz, Nir Andelman, Pravir Kumar Gupta
-
Patent number: 10067910
Abstract: A method and system for performing a general matrix-matrix multiplication (GEMM) operation using a kernel compiled with optimal maximum register count (MRC). During operation, the system may generate the kernel compiled with optimal MRC. This may involve determining a fastest compiled kernel among a set of compiled kernels by comparing the speeds of the compiled kernels. Each kernel may be compiled with a different MRC value between zero and a predetermined maximum number of registers per thread. The fastest compiled kernel is determined to be the kernel with optimal MRC. The system may receive data representing at least two matrices. The system may select the kernel compiled with optimal MRC, and perform the GEMM operation on the two matrices using the selected kernel. Some embodiments may also perform general matrix-vector multiplication (GEMV), sparse matrix-vector multiplication (SpMV), or k-means clustering operations using kernels compiled with optimal MRC.
Type: Grant
Filed: July 1, 2016
Date of Patent: September 4, 2018
Assignee: PALO ALTO RESEARCH CENTER INCORPORATED
Inventor: Rong Zhou
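The kernel-selection loop in the abstract above amounts to a small autotuning search, sketched here under stated assumptions. `compile_kernel` and `measure` are hypothetical callbacks standing in for a real GPU compiler invocation and a real benchmark run; the demo replaces both with a made-up cost model so the sketch runs without a GPU.

```python
def pick_optimal_mrc(mrc_values, compile_kernel, measure, repeats=3):
    """Compile one kernel per candidate MRC value, time each one,
    and return the (mrc, kernel) pair with the lowest measured time."""
    best_mrc, best_kernel, best_time = None, None, float("inf")
    for mrc in mrc_values:
        kernel = compile_kernel(mrc)
        # Take the best of several runs to reduce timing noise.
        elapsed = min(measure(kernel) for _ in range(repeats))
        if elapsed < best_time:
            best_mrc, best_kernel, best_time = mrc, kernel, elapsed
    return best_mrc, best_kernel


# Stand-ins for illustration only: a "kernel" is a dict, and the cost
# model pretends that 64 registers per thread is the sweet spot.
def fake_compile(mrc):
    return {"mrc": mrc}


def fake_measure(kernel):
    return 0.01 + abs(kernel["mrc"] - 64) * 0.001


best_mrc, best_kernel = pick_optimal_mrc(range(16, 129, 16), fake_compile, fake_measure)
```

Because the search is exhaustive over the candidate list, its cost grows linearly with the number of MRC values tried; the result can be reused for later GEMM calls on the same hardware.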
-
Patent number: 10002161
Abstract: The subject matter disclosed herein provides methods and apparatus, including computer program products for rules-based processing. In one aspect there is provided a method. The method may include, for example, evaluating rules to determine whether to enable or disable one or more actions in a ready set of actions. Moreover, the method may include scheduling the ready set of actions, each of which is scheduled for execution and executed, the execution of each of the ready set of actions using a separate, concurrent thread, the concurrency of the actions controlled using a control mechanism. Related systems, apparatus, methods, and/or articles are also described.
Type: Grant
Filed: December 3, 2008
Date of Patent: June 19, 2018
Assignee: SAP SE
Inventors: Sören Balko, Matthias Miltz
-
Patent number: 9898288
Abstract: A computer-implemented method of executing an instruction sequence with a recursive function call of a plurality of threads within a thread group in a Single-Instruction-Multiple-Threads (SIMT) system is provided. Each thread is provided with a function call counter (FCC), an active mask, an execution mask and a per-thread program counter (PTPC). The instruction sequence with the recursive function call is executed by the threads in the thread group according to a program counter (PC) indicating a target. Upon executing the recursive function call, for each thread, the active mask is set according to the PTPC and the target indicated by the PC, the FCC is determined when entering or returning from the recursive function call, the execution mask is determined according to the FCC and the active mask. It is determined whether an execution result of the recursive function call takes effect according to the execution mask.
Type: Grant
Filed: December 29, 2015
Date of Patent: February 20, 2018
Assignee: MEDIATEK INC.
Inventors: Yan-Hong Lu, Jia-Yang Chang, Pao-Hung Kuo, Chia-Chi Chang, Pei-Kuei Tsung
-
Patent number: 9891924
Abstract: A method for implementing a reduced size register view data structure in a microprocessor. The method includes receiving an incoming instruction sequence using a global front end; grouping the instructions to form instruction blocks; using a plurality of multiplexers to access ports of a scheduling array to store the instruction blocks as a series of chunks.
Type: Grant
Filed: March 14, 2014
Date of Patent: February 13, 2018
Assignee: Intel Corporation
Inventor: Mohammad A. Abdallah
-
Patent number: 9886252
Abstract: An optimizing compiler includes a vector optimization mechanism that optimizes vector operations that are reformatting-resistant, such as source instructions that do not have a corresponding reformatting operation, sink instructions that do not have a corresponding reformatting operation, a source instruction that is a scalar value, a sink instruction that may produce a scalar value, and an internal operation that depends on lanes being in a specified order. The ability to optimize vector instructions that are reformatting-resistant reduces the number of operations to improve the run-time performance of the code.
Type: Grant
Filed: August 31, 2015
Date of Patent: February 6, 2018
Assignee: International Business Machines Corporation
Inventors: Michael Karl Gschwind, William J. Schmidt
-
Patent number: 9880821
Abstract: An optimizing compiler includes a vector optimization mechanism that optimizes vector operations that are reformatting-resistant, such as source instructions that do not have a corresponding reformatting operation, sink instructions that do not have a corresponding reformatting operation, a source instruction that is a scalar value, a sink instruction that may produce a scalar value, and an internal operation that depends on lanes being in a specified order. The ability to optimize vector instructions that are reformatting-resistant reduces the number of operations to improve the run-time performance of the code.
Type: Grant
Filed: August 17, 2015
Date of Patent: January 30, 2018
Assignee: International Business Machines Corporation
Inventors: Michael Karl Gschwind, William J. Schmidt
-
Patent number: 9823927
Abstract: According to some embodiments, the workgroup divisibility requirement may be dispensed with on a selective or permanent basis, i.e. in all cases, particular cases or at particular times and/or under particular conditions. An application programming interface implementation may be allowed to launch workgroups with non-uniform local sizes. Two different local sizes may be used in a case of a one-dimensional workload.
Type: Grant
Filed: November 30, 2012
Date of Patent: November 21, 2017
Assignee: Intel Corporation
Inventors: Aaron R. Kunze, Dillon Sharlet, Andrew E. Brownsword
-
Patent number: 9805110
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing search results. In one aspect, a method includes receiving a query. A plurality of search results responsive to the query are identified. The search results are analyzed to determine that at least a first search result is associated with a first answer box topic. The search results are provided along with an answer box precursor for the first answer box topic.
Type: Grant
Filed: May 2, 2016
Date of Patent: October 31, 2017
Assignee: Google Inc.
Inventors: Tal Cohen, Ziv Bar-Yossef, Igor Tsvetkov, Adi Mano, Oren Naim, Nitsan Oz, Nir Andelman, Pravir Kumar Gupta
-
Patent number: 9798587
Abstract: An embodiment of the invention includes applying a first partition to a plurality of LPs, wherein a particular LP is assigned to a first set of LPs. A second partition is applied to the LPs, wherein the particular LP is assigned to an LP set different from the first set. For both the first and second partitions, lookahead values and transit times are determined for each of the LPs and related links. For the first partition, a first system progression rate is computed using a specified function with the lookahead values and transit times determined for the first partition. For the second partition, a second system progression rate is computed using the specified function with the lookahead values and transit times determined for the second partition. The first and second system progression rates are compared to determine which is the lowest.
Type: Grant
Filed: May 13, 2011
Date of Patent: October 24, 2017
Assignee: International Business Machines Corporation
Inventors: Cheng-Hong Li, Alfred J. Park, Eugen Schenfeld
-
Patent number: 9799087Abstract: Systems, methods, and computer readable media to improve the development of image processing intensive programs are described. In general, techniques are disclosed to non-intrusively monitor the run-time performance of shader programs on a graphics processing unit (GPU), that is, to profile shader program execution. More particularly, GPU-based hardware threads may be configured to run in parallel to, and not interfere with, the execution environment of a GPU during shader program execution. When so configured, GPU hardware threads may be used to accurately characterize the run-time behavior of an application program's shader operations.Type: GrantFiled: September 9, 2013Date of Patent: October 24, 2017Assignee: Apple Inc.Inventors: Sun Tjen Fam, Andrew Sowerby
-
Patent number: 9785700Abstract: The invention relates to electronic indexing, and more particularly, to the parallelization of indexing. Systems and methods of the invention index data archives by breaking a job into work items and sending the work items to multiple processors that can each determine whether to index data associated with the work item or to create a new work item and have a different processor index the data. This gives the system an internal load-balancing that results in indexing jobs during which no processor stands idle while another processor indexes data of unexpected complexity.Type: GrantFiled: August 7, 2013Date of Patent: October 10, 2017Assignee: Nuix Pty LtdInventors: David Sitsky, Eddie Sheehy
-
Patent number: 9720743Abstract: General-purpose distributed data-parallel computing using a high-level language is disclosed. Data parallel portions of a sequential program that is written by a developer in a high-level language are automatically translated into a distributed execution plan. The distributed execution plan is then executed on large compute clusters. Thus, the developer is allowed to write the program using familiar programming constructs in the high level language. Moreover, developers without experience with distributed compute systems are able to take advantage of such systems.Type: GrantFiled: July 23, 2015Date of Patent: August 1, 2017Assignee: Microsoft Technology Licensing, LLCInventors: Yuan Yu, Dennis Fetterly, Michael Isard, Ulfar Erlingsson, Mihai Budiu
-
Patent number: 9652286Abstract: Embodiments include systems and methods for handling task dependencies in a runtime environment using dependence graphs. For example, a computer-implemented runtime engine includes runtime libraries configured to handle tasks and task dependencies. The task dependencies can be converted into data dependencies. At runtime, as the runtime engine encounters tasks and associated data dependencies, it can add those identified tasks as nodes of a dependence graph, and can add edges between the nodes that correspond to the data dependencies without deadlock. The runtime engine can schedule the tasks for execution according to a topological traversal of the dependence graph in a manner that preserves task dependencies substantially as defined by the source code.Type: GrantFiled: March 21, 2014Date of Patent: May 16, 2017Assignee: Oracle International CorporationInventor: Bin Fan
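The idea of turning data dependencies into graph edges and scheduling by topological traversal can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented runtime engine: tasks are hypothetical (name, reads, writes) tuples in program order with unique names, only read-after-write dependencies are tracked, and Kahn's algorithm is used as one common way to do a topological traversal.

```python
from collections import deque


def build_dependence_graph(tasks):
    """Map each task name to the set of tasks that consume its outputs.

    `tasks` is a list of (name, reads, writes) in program order
    (a hypothetical format). Only read-after-write edges are added.
    """
    edges = {name: set() for name, _, _ in tasks}
    last_writer = {}
    for name, reads, writes in tasks:
        for item in reads:
            producer = last_writer.get(item)
            if producer is not None and producer != name:
                edges[producer].add(name)  # producer must run before name
        for item in writes:
            last_writer[item] = name
    return edges


def topological_schedule(edges):
    """Return one execution order that respects every edge (Kahn's algorithm)."""
    indegree = {node: 0 for node in edges}
    for successors in edges.values():
        for succ in successors:
            indegree[succ] += 1
    ready = deque(node for node in edges if indegree[node] == 0)
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for succ in sorted(edges[node]):
            indegree[succ] -= 1
            if indegree[succ] == 0:
                ready.append(succ)
    if len(order) != len(edges):
        raise ValueError("dependence graph contains a cycle")
    return order


tasks = [
    ("load", [], ["a"]),
    ("scale", ["a"], ["b"]),
    ("offset", ["a"], ["c"]),
    ("sum", ["b", "c"], ["d"]),
]
order = topological_schedule(build_dependence_graph(tasks))
print(order)
```

Because edges in this sketch only ever point from an earlier task to a later one in program order, the graph it builds is acyclic; the cycle check only matters for hand-built graphs.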
-
Patent number: 9577869Abstract: A method, system and program product for balanced workload distribution in a plurality of networked computing nodes. The networked computing nodes may be arranged as a connected graph defining at least one direct neighbor to each networked computing node. The method comprises determining a first workload indicator of the i-th computing node, at a first stage before a new task may be started by the i-th computing node, determining an estimated workload indicator of the i-th computing node, assuming that the new task is performed at a second stage on the i-th computing node, determining estimated workload indicators of each direct neighbor assuming that the new task is performed at the second stage, deciding whether to move the new task to another computing node, and moving the new task to one of the direct neighboring computing nodes of the i-th computing node such that workloads are balanced.Type: GrantFiled: August 28, 2013Date of Patent: February 21, 2017Assignee: International Business Machines CorporationInventors: Gianluca Della Corte, Alessandro Donatelli, Antonio M. Sgro
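One way to picture the placement decision is the sketch below. It rests on assumptions the abstract does not state: the workload indicator is a single number per node, the estimate after accepting a task is simply the current workload plus a task cost, and the task moves to the direct neighbor with the lowest estimate only if that estimate is lower than the local one. The name `place_new_task` and all parameters are hypothetical.

```python
def place_new_task(node, task_cost, workload, neighbors):
    """Choose where a new task arriving at `node` should run.

    workload:  dict mapping node id -> current workload indicator (assumed numeric)
    neighbors: dict mapping node id -> list of direct neighbor ids
    Returns the chosen node and updates `workload` in place.
    """
    # Estimated workload if the task stays on the node it arrived at.
    best_node = node
    best_estimate = workload[node] + task_cost
    # Estimated workload of each direct neighbor if it took the task instead.
    for neighbor in neighbors[node]:
        estimate = workload[neighbor] + task_cost
        if estimate < best_estimate:
            best_node, best_estimate = neighbor, estimate
    workload[best_node] = best_estimate
    return best_node


# Four nodes arranged in a ring; each node has two direct neighbors.
workload = {0: 5, 1: 2, 2: 7, 3: 4}
neighbors = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
chosen = place_new_task(0, 3, workload, neighbors)
print(chosen, workload)
```

Only direct neighbors are inspected, so each decision needs local information rather than a global view of the graph.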
-
Patent number: 9569272Abstract: A method and device for digital data processing based on a data flow processing model is suitable for the execution, in a distributed manner on multiple calculation nodes, of multiple data processing operations modelled by directed graphs, where two different processing operations include at least one common calculation node. The device includes an identification processor configured to, from a valued directed multi-graph made up of the union of several distinct processing graphs and divided into several valued directed sub-multi-graphs, called chunks, and whose input and output nodes are buffer memory nodes of the multi-graph, identify a coordination module for each chunk. Furthermore each identified coordination module is configured to synchronize portions of processing operations that are to be executed in the chunk with which the respective coordination module is associated, independently of portions of processing operations that are to be executed in other chunks.Type: GrantFiled: July 9, 2010Date of Patent: February 14, 2017Assignee: Commissariat a l'energie atomique et aux alternativesInventor: Yvain Thonnart
-
Patent number: 9552197Abstract: A non-transitory computer-readable recording medium stores therein a program for causing an information processing apparatus to execute a process including analyzing a source program with respect to the information processing apparatus that starts hardware prefetching upon detecting an access to a consecutive area on a main storage device and stops the hardware prefetching upon detecting an end of the access to the consecutive area, specifying an array structure in a loop process as a hardware prefetching target, and generating, from the source program, a machine language program in which the array structure is changed so that a second access occurring next to a first access to the array structure refers to an area being consecutive from the area being referred to by the first access.Type: GrantFiled: August 31, 2015Date of Patent: January 24, 2017Assignee: FUJITSU LIMITEDInventor: Shigeru Kimura
-
Patent number: 9467532Abstract: A computing method is provided which includes calling a general purpose graphics processing subroutine for execution of a target program by a client; sending a program code and resource data for execution of the target program to a server by the client; and executing the program code using a general purpose graphics processing unit by the server.Type: GrantFiled: January 22, 2013Date of Patent: October 11, 2016Assignee: INDUSTRY-ACADEMIC COOPERATION FOUNDATION, YONSEI UNIVERSITYInventors: Won Woo Ro, Keunsoo Kim, Seung Hun Kim
-
Patent number: 9448778Abstract: A computer-implemented method, system, and computer program product for performing object collocation on a computer system are provided. The method includes analyzing a sequence of computer instructions for object allocations and uses of the allocated objects. The method further includes creating an allocation interference graph of object allocation nodes with edges indicating pairs of allocations to be omitted from collocation. The method also includes coloring the allocation interference graph such that adjacent nodes are assigned different colors, and creating an object allocation at a program point prior to allocations of a selected color from the allocation interference graph. The method additionally includes storing an address associated with the created object allocation in a collocation pointer, and replacing a use of each allocation of the selected color with a use of the collocation pointer to collocate multiple objects.Type: GrantFiled: July 30, 2014Date of Patent: September 20, 2016Assignee: International Business Machines CorporationInventors: Patrick Doyle, Pramod Ramarao, Vijay Sundaresan
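The coloring step can be illustrated with a simple greedy coloring, sketched below. The abstract does not say which coloring algorithm is used, so greedy coloring is a stand-in chosen for brevity; the graph format (a dict of adjacency sets keyed by hypothetical allocation-site names) is likewise an assumption.

```python
def color_interference_graph(interference):
    """Assign each allocation node the smallest color not used by a neighbor.

    `interference` maps an allocation name to the set of allocations it must
    NOT be collocated with. Allocations that end up with the same color are
    candidates for sharing one collocated allocation.
    """
    colors = {}
    for node in interference:
        used = {colors[n] for n in interference[node] if n in colors}
        color = 0
        while color in used:
            color += 1
        colors[node] = color
    return colors


# "a" and "b" interfere, so they get different colors; "c" is unconstrained.
interference = {"a": {"b"}, "b": {"a"}, "c": set()}
colors = color_interference_graph(interference)
print(colors)  # {'a': 0, 'b': 1, 'c': 0}
```

In this example "a" and "c" share color 0, so under the scheme described in the abstract they would be the pair served by a single up-front allocation and a collocation pointer.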
-
Patent number: 9436589Abstract: An analysis system may perform network analysis on data gathered from an executing application. The analysis system may identify relationships between code elements and use tracer data to quantify and classify various code elements. In some cases, the analysis system may operate with only data gathered while tracing an application, while other cases may combine static analysis data with tracing data. The network analysis may identify groups of related code elements through cluster analysis, as well as identify bottlenecks from one to many and many to one relationships. The analysis system may generate visualizations showing the interconnections or relationships within the executing code, along with highlighted elements that may be limiting performance.Type: GrantFiled: March 29, 2013Date of Patent: September 6, 2016Assignee: Microsoft Technology Licensing, LLCInventors: Ying Li, Alexander G. Gounares, Charles D. Garrett, Russell S. Krajec
-
Patent number: 9424103Abstract: A method for operating a lock in a computing system having plural processing units and running under multiple runtime environments is provided. When a requester thread attempts to acquire the lock while the lock is held by a holder thread, determine whether the holder thread is suspendable or non-suspendable. If the holder thread is non-suspendable, put the requester thread in a spin state regardless of whether the requester thread is suspendable or non-suspendable; otherwise determine whether the requester thread is suspendable or non-suspendable unless the requester thread quits acquiring the lock. If the requester thread is non-suspendable, arrange the requester thread to attempt acquiring the lock again; otherwise add the requester thread to a wait queue as an additional suspended thread. Suspended threads stored in the wait queue may be resumed later for lock acquisition. The method is applicable for the computing system with a multicore processor.Type: GrantFiled: September 30, 2014Date of Patent: August 23, 2016Assignee: Hong Kong Applied Science and Technology Research Institute Company LimitedInventors: Yi Al, Lin Xu, Jianchao Lu, Shaohua Zhang
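The branching described in this abstract can be summarized as a small decision function. The sketch below models only the decision, not a working lock: the `Action` names and the `on_contention` function are hypothetical, and the case where the requester gives up on acquiring the lock is left out.

```python
from enum import Enum


class Action(Enum):
    SPIN = "spin"        # busy-wait until the lock is released
    RETRY = "retry"      # attempt to acquire the lock again
    ENQUEUE = "enqueue"  # suspend and join the wait queue


def on_contention(holder_suspendable: bool, requester_suspendable: bool) -> Action:
    """Decide what a requester does when the lock is already held."""
    if not holder_suspendable:
        # A non-suspendable holder: spin regardless of the requester's kind.
        return Action.SPIN
    if not requester_suspendable:
        # Suspendable holder, non-suspendable requester: try again.
        return Action.RETRY
    # Both suspendable: the requester can be parked and resumed later.
    return Action.ENQUEUE


for holder in (False, True):
    for requester in (False, True):
        print(holder, requester, on_contention(holder, requester).value)
```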
-
Patent number: 9395957Abstract: A high level programming language provides an agile communication operator that generates a segmented computational space based on a resource map for distributing the computational space across compute nodes. The agile communication operator decomposes the computational space into segments, causes the segments to be assigned to compute nodes, and allows the user to centrally manage and automate movement of the segments between the compute nodes. The segment movement may be managed using either a full global-view representation or a local-global-view representation of the segments.Type: GrantFiled: December 22, 2010Date of Patent: July 19, 2016Assignee: Microsoft Technology Licensing, LLCInventor: Paul F. Ringseth
-
Patent number: 9367291Abstract: An apparatus and method for generating vector code are provided. The apparatus and method generate vector code using scalar-type kernel code, without the user changing the code type or modifying the data layout, thereby improving ease of use and retaining the portability of OpenCL.Type: GrantFiled: March 28, 2014Date of Patent: June 14, 2016Assignees: Samsung Electronics Co., Ltd., Seoul National University R&DB FoundationInventors: Jin-Seok Lee, Seong-Gun Kim, Dong-Hoon Yoo, Seok-Joong Hwang, Jeongho Nah, Jaejin Lee, Jun Lee
-
Patent number: 9354629Abstract: Example methods and apparatus to configure a process control system using an electronic description language (EDL) script are disclosed. A disclosed example method comprises loading a first script representative of a process plant, the first script comprising an interpretive system-level script structured in accordance with an electronic description language, and compiling the first script to form a second script, the second script structured in accordance with a vendor-specific configuration language associated with a particular process control system for the process plant.Type: GrantFiled: February 19, 2009Date of Patent: May 31, 2016Assignee: Fisher-Rosemount Systems, Inc.Inventors: James Randall Balentine, Gary Keith Law, Mark Nixon
-
Patent number: 9311065Abstract: Embodiments relate to data splitting for multi-instantiated objects. An aspect includes receiving a portion of source code for compilation having a dynamic object to split using object size array data splitting. Another aspect includes replacing all memory allocations for the dynamic object with a total size of an object size array and object field arrays including a predetermined padding. Another aspect includes inserting statements in the source code after the memory allocations to populate the object size array with a value of a number of elements of the object size array. Another aspect includes updating a stride for load and store operations using dynamic pointers. Yet another aspect includes modifying field references by adding a distance between the object size array and the object field array to respective address operations.Type: GrantFiled: June 19, 2014Date of Patent: April 12, 2016Assignee: International Business Machines CorporationInventors: Shimin Cui, Yan Zhang