Loop Compiling Patents (Class 717/150)

Merging buffer access operations in a coarse-grained reconfigurable computing system

Patent number: 12254300

Abstract: A method for merging buffers and associated operations includes receiving a compute graph for a reconfigurable dataflow computing system and conducting a buffer allocation and merging process responsive to determining that a first operation specified by a first operation node is a memory indexing operation and that the first operation node is a producer for exactly one consuming node that specifies a second operation. The buffer allocation and merging process may include replacing the first operation node and the consuming node with a merged buffer node within the graph responsive to determining that the first operation and the second operation can be merged into a merged indexing operation and that the resource cost of the merged node is less than the sum of the resource costs of separate buffer nodes. A corresponding system and computer readable medium are also disclosed herein.
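
The merge conditions in the abstract above can be sketched roughly as follows. This is only an illustration under stated assumptions: indexing operations are modeled as simple slices, and `SliceNode`, `cost`, and `try_merge` are hypothetical names, with a made-up cost model in which a buffer costs as much as the slice it holds. None of this is taken from the patent itself.

```python
from dataclasses import dataclass


@dataclass
class SliceNode:
    """Hypothetical indexing node: selects `length` elements starting at `start`."""
    name: str
    start: int
    length: int


def cost(node):
    # Assumed cost model: one buffer sized to hold the slice.
    return node.length


def try_merge(producer, consumers):
    """Return a merged node, or None if the merge conditions do not hold."""
    # Condition 1: the producer feeds exactly one consuming node.
    if len(consumers) != 1:
        return None
    consumer = consumers[0]
    # Condition 2: the two operations can be merged (the consumer's slice
    # must fall inside the producer's slice).
    if consumer.start + consumer.length > producer.length:
        return None
    merged = SliceNode(
        name=f"{producer.name}+{consumer.name}",
        start=producer.start + consumer.start,  # compose the two offsets
        length=consumer.length,
    )
    # Condition 3: the merged buffer is cheaper than two separate buffers.
    if cost(merged) < cost(producer) + cost(consumer):
        return merged
    return None


merged = try_merge(SliceNode("a", 10, 8), [SliceNode("b", 2, 4)])
```

Here `merged` describes a single slice starting at offset 12 with 4 elements, replacing the two original nodes.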

Type: Grant

Filed: October 27, 2022

Date of Patent: March 18, 2025

Assignee: SambaNova Systems, Inc.

Inventors: David Alan Koeplinger, Adam Bordelon, Weihang Fan, Kevin Brown, Weiwei Chen

Flow control for reconfigurable processors

Patent number: 12236220

Abstract: The technology disclosed relates to storing a dataflow graph with a plurality of compute nodes that transmit data along data connections, and controlling data transmission between compute nodes in the plurality of compute nodes along the data connections by using control connections to control writing of data.

Type: Grant

Filed: June 7, 2023

Date of Patent: February 25, 2025

Assignee: SambaNova Systems, Inc.

Inventors: Weiwei Chen, Raghu Prabhakar, David Alan Koeplinger, Sitanshu Gupta, Ruddhi Chaphekar, Ajit Punj, Sumti Jairath

Programmable multi-level data access address generator

Patent number: 12072799

Abstract: A programmable address generator has an iteration variable generator for generation of an ordered set of iteration variables, which are re-ordered by an iteration variable selection fabric, which delivers the re-ordered iteration variables to one or more address generators. A configurator receives an instruction containing fields which provide configuration constants to the address generator, iteration variable selection fabric, and address generators. After configuration, the address generators provide addresses coupled to a memory. In one example of the invention, the address generators generate an input address, a coefficient address, and an output address for performing convolutional neural network inferences.
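
A software analogy may make the data path in the abstract above easier to follow. The sketch below assumes each address is an affine function (base plus strides) of the selected iteration variables; the function names and the 2x3 configuration are hypothetical and are not taken from the patent.

```python
from itertools import product


def iteration_variables(bounds):
    """Yield iteration-variable tuples in nested-loop order (last varies fastest)."""
    return product(*(range(b) for b in bounds))


def make_address_generator(base, strides, order):
    """Build an address function from configuration constants.

    `order` stands in for the selection fabric: it chooses which iteration
    variable feeds each stride. `base` and `strides` are assumed constants.
    """
    def address(ivars):
        selected = [ivars[i] for i in order]
        return base + sum(s * v for s, v in zip(strides, selected))
    return address


# Hypothetical configuration: a 2x3 loop nest, row-major input,
# and an output region written in transposed order.
bounds = (2, 3)
input_addr = make_address_generator(base=0, strides=(3, 1), order=(0, 1))
output_addr = make_address_generator(base=100, strides=(2, 1), order=(1, 0))

input_addresses = [input_addr(iv) for iv in iteration_variables(bounds)]
output_addresses = [output_addr(iv) for iv in iteration_variables(bounds)]
```

Changing only `order` and `strides` yields a different traversal of the same memory, which is roughly what reconfiguring the fabric would achieve.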

Type: Grant

Filed: March 14, 2023

Date of Patent: August 27, 2024

Assignee: Ceremorphic, Inc.

Inventors: Lizy Kurian John, Venkat Mattela, Heonchul Park

Offload server, offload control method, and offload program

Patent number: 12056475

Abstract: An offload server includes: an application code analysis section configured to analyze source code of an application; a data transfer designation section configured to designate, on the basis of a result of the code analysis, a data transfer to be performed collectively before GPU processing starts and after it finishes, the transfer covering those variables, among the variables that need to be transferred between a CPU and a GPU, that are neither mutually referenced nor mutually updated between CPU processing and the GPU processing and that are only returned to the CPU as a result of the GPU processing; and a parallel processing designation section configured to identify loop statements in the application and, for each identified loop statement, specify a statement that applies parallel processing by the GPU and perform compilation.

Type: Grant

Filed: February 4, 2020

Date of Patent: August 6, 2024

Assignee: Nippon Telegraph and Telephone Corporation

Inventor: Yoji Yamato

Microkernel-based software optimization of neural networks

Patent number: 11720351

Abstract: Disclosed are systems and methods related to providing for the optimized software implementations of artificial intelligence (“AI”) networks. The system receives operations (“ops”) consisting of a set of instructions to be performed within an AI network. The system then receives microkernels implementing one or more instructions to be performed within the AI network for a specific hardware component. Next, the system generates a kernel for each of the operations. Generating the kernel for each of the operations includes configuring input data to be received from the AI network; detecting a specific hardware component to be used; selecting one or more microkernels to be invoked by the kernel based on the detection of the specific hardware component; and configuring output data to be sent to the AI network as a result of the invocation of the microkernel(s).

Type: Grant

Filed: March 17, 2021

Date of Patent: August 8, 2023

Assignee: OnSpecta, Inc.

Inventors: Victor Jakubiuk, Sebastian Kaczor

Data plane program verification

Patent number: 11720373

Abstract: A method for verifying data plane programs is provided in some embodiments. Because the behavior of a data plane program (e.g., a program written in the P4 language) is determined in part by the control plane populating match-action tables with specific forwarding rules, in some embodiments, programmers are provided with a way to document assumptions about the control plane using annotations (e.g., in the form of “assertions” or “assumptions” about the state based on the unknown control plane contribution). In some embodiments, annotations are added automatically to verify common properties, including checking that every header read or written is valid, that every expression has a well-defined value, and that all standard metadata is manipulated correctly. The method in some embodiments translates programs from a first language (e.g., P4) to a second language (e.g., Guarded Command Language (GCL)) for verification by a satisfiability modulo theory (SMT) solver.

Type: Grant

Filed: November 29, 2021

Date of Patent: August 8, 2023

Assignee: Barefoot Networks, Inc.

Inventors: Jeongkeun Lee, Cole Nathan Schlesinger, John Nathan Foster, Han Wang, Robert Soule, William Hallahan, Steffen Julif Smolka, Mon Jed Liu

Compilation and execution of source code as services

Patent number: 11714616

Abstract: This document relates to compilation of source code into services. One example method involves receiving input source code, identifying data dependencies in the input source code, and identifying immutability points in the input source code based at least on the data dependencies. The example method also involves converting at least some of the input source code occurring after the immutability points to one or more service modules.

Type: Grant

Filed: June 28, 2019

Date of Patent: August 1, 2023

Assignee: Microsoft Technology Licensing, LLC

Inventors: Robert Lovejoy Goodwin, Janaina Barreiro Gambaro Bueno, Sitaramaswamy V. Lanka, Dragos Barac, Javier Garcia Flynn, Pedram Faghihi Rezaei, Karthik Pattabiraman

Processor in non-volatile storage memory

Patent number: 11705207

Abstract: In one example, a computing system includes a device, the device including: a non-volatile memory divided into a plurality of selectable locations, each bit in the non-volatile memory configured to have corresponding data independently altered, wherein the selectable locations are grouped into a plurality of data lines; and one or more processing units coupled to the non-volatile memory, each of the processing units associated with a data line of the plurality of data lines, and each of the processing units configured to compute, based on data in an associated data line of the plurality of data lines, corresponding results, wherein the non-volatile memory is configured to selectively write, based on the corresponding results, data in selectable locations of the associated data line reserved to store results of the computation from the process unit associated with the associated data line.

Type: Grant

Filed: November 24, 2020

Date of Patent: July 18, 2023

Assignee: WESTERN DIGITAL TECHNOLOGIES, INC.

Inventors: Luis Vitorio Cargnini, Viacheslav Anatolyevich Dubeyko

Fast compiling source code without dependencies

Patent number: 11669313

Abstract: Techniques for ultra-fast software compilation of source code are provided. A compiler receives software code and may divide it into code sections. A map of ordered nodes may be generated, such that each node in the map may include a code section and the order of the nodes indicates an execution order of the software code. Each code section may be compiled into an executable object in parallel and independently from other code sections. A binary executable may be generated by linking executable objects generated from the code sections. The methodology differs significantly from existing source code compilation techniques because conventional compilers build executables sequentially, whereas the embodiments divide the source code into multiple smaller code sections and compile them individually and in parallel. Compiling multiple code sections in parallel improves compilation time by an order of magnitude over conventional techniques.
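
The divide, compile-independently, and link flow in the abstract above might look like the following sketch. It makes several assumptions: sections are split on blank lines, Python's built-in `compile()` stands in for a real compiler back end, and "linking" is just restoring execution order. The helper names are hypothetical, and CPython threads illustrate the structure rather than a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_sections(source):
    """Split source on blank lines (a stand-in for real code sectioning)."""
    return [s for s in source.split("\n\n") if s.strip()]


def compile_section(indexed_section):
    """Compile one section independently of all the others."""
    index, text = indexed_section
    return index, compile(text, f"<section {index}>", "exec")


def build(source):
    # The ordered "map of nodes": position in the list is the execution order.
    sections = list(enumerate(split_into_sections(source)))
    with ThreadPoolExecutor() as pool:
        compiled = dict(pool.map(compile_section, sections))
    # "Link" step: put the compiled objects back in execution order.
    return [compiled[i] for i, _ in sections]


def run(objects):
    namespace = {}
    for obj in objects:
        exec(obj, namespace)
    return namespace


source = "a = 1\n\nb = 2\n\ntotal = a + b"
result = run(build(source))
```

Each section compiles without looking at the others; only the final execution depends on their order.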

Type: Grant

Filed: September 30, 2021

Date of Patent: June 6, 2023

Assignee: PayPal, Inc.

Inventor: Abraham Richard Hoffman

Data storage device executing runt write commands as free commands

Patent number: 11656797

Abstract: A data storage device is disclosed comprising a head actuated over a disk comprising a plurality of data tracks. A plurality of access commands including a plurality of write commands are stored in a command queue, and the access commands are sorted into an execution order. A first write command is selected from the command queue based on the execution order, and a first part of the first write command is executed leaving a runt write command. The runt write command is executed between two of the sorted access commands so that the runt write command does not affect the execution order.

Type: Grant

Filed: July 28, 2021

Date of Patent: May 23, 2023

Assignee: Western Digital Technologies, Inc.

Inventor: David R. Hall

Collapsing of multiple nested loops, methods, and instructions

Patent number: 11640298

Abstract: In an embodiment, the present invention is directed to a processor including a decode logic to receive a multi-dimensional loop counter update instruction and to decode the multi-dimensional loop counter update instruction into at least one decoded instruction, and an execution logic to execute the at least one decoded instruction to update at least one loop counter value of a first operand associated with the multi-dimensional loop counter update instruction by a first amount. Methods to collapse loops using such instructions are also disclosed. Other embodiments are described and claimed.
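
To illustrate what a multi-dimensional loop counter update does, here is a software sketch of the idea. It assumes the counters behave like a mixed-radix number that is advanced by some amount with carries; `update_counters` and `collapsed_loop` are hypothetical names, and the real instruction's operands and encoding are not modeled.

```python
def update_counters(counters, bounds, amount=1):
    """Advance a multi-dimensional loop counter by `amount`.

    The last dimension is the innermost loop. Returns the new counters and
    a flag that is True once the whole iteration space has been exhausted.
    """
    counters = list(counters)
    carry = amount
    for dim in reversed(range(len(bounds))):
        total = counters[dim] + carry
        counters[dim] = total % bounds[dim]
        carry = total // bounds[dim]
        if carry == 0:
            break
    return counters, carry > 0


def collapsed_loop(bounds):
    """Visit every point of a loop nest using one flat loop."""
    if any(b <= 0 for b in bounds):
        return []
    visited = []
    counters = [0] * len(bounds)
    done = False
    while not done:
        visited.append(tuple(counters))
        counters, done = update_counters(counters, bounds)
    return visited


points = collapsed_loop((2, 3))
```

The single `while` loop visits the same points, in the same order, as two nested `for` loops over `range(2)` and `range(3)`.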

Type: Grant

Filed: May 18, 2021

Date of Patent: May 2, 2023

Assignee: Intel Corporation

Inventors: Mikhail Plotnikov, Andrey Naraikin, Elmoustapha Ould-Ahmed-Vall

Systems and methods for scalable hierarchical polyhedral compilation

Patent number: 11537373

Abstract: A system for compiling programs for execution thereof using a hierarchical processing system having two or more levels of memory hierarchy can perform memory-level-specific optimizations, without exceeding a specified maximum compilation time. To this end, the compiler system employs a polyhedral model and limits the dimensions of a polyhedral program representation that is processed by the compiler at each level using a focalization operator that temporarily reduces one or more dimensions of the polyhedral representation. Semantic correctness is provided via a defocalization operator that can restore all polyhedral dimensions that had been temporarily removed.

Type: Grant

Filed: September 28, 2020

Date of Patent: December 27, 2022

Assignee: Qualcomm Technologies, Inc.

Inventors: Muthu Manikandan Baskaran, Benoit J. Meister, Benoit Pradelle

Hardware acceleration method, compiler, and device

Patent number: 11262992

Abstract: A hardware acceleration method includes: obtaining compilation policy information and a source code, where the compilation policy information indicates that a first code type matches a first processor and a second code type matches a second processor, analyzing a code segment in the source code according to the compilation policy information, determining a first code segment belonging to the first code type or a second code segment belonging to the second code type, compiling the first code segment into a first executable code, sending the first executable code to the first processor, compiling the second code segment into a second executable code, and sending the second executable code to the second processor.

Type: Grant

Filed: November 19, 2019

Date of Patent: March 1, 2022

Assignee: HUAWEI TECHNOLOGIES CO., LTD.

Inventors: Jian Chen, Hong Zhou, Xinyu Hu, Hongguang Guan, Xiaojun Zhang

Hardware acceleration method, compiler, and device

Patent number: 11200039

Abstract: A hardware acceleration method includes: obtaining compilation policy information and a source code, where the compilation policy information indicates that a first code type matches a first processor and a second code type matches a second processor, analyzing a code segment in the source code according to the compilation policy information, determining a first code segment belonging to the first code type or a second code segment belonging to the second code type, compiling the first code segment into a first executable code, sending the first executable code to the first processor, compiling the second code segment into a second executable code, and sending the second executable code to the second processor.

Type: Grant

Filed: November 19, 2019

Date of Patent: December 14, 2021

Assignee: HUAWEI TECHNOLOGIES CO., LTD.

Inventors: Jian Chen, Hong Zhou, Xinyu Hu, Hongguang Guan, Xiaojun Zhang

Instrumentation profiling for reconfigurable processors

Patent number: 11126574

Abstract: A data processing system comprises compile time logic, runtime logic, a control bus, and instrumentation units operatively coupled to processing units of an array. The compile time logic is configured to generate configuration files for a dataflow graph. The runtime logic is configured to execute the configuration files on the array, and to trigger start and stop events, as defined by the configuration files, in response to implementation of compute and memory operations of the dataflow graph on the array. A control bus is configured to form event routes in the array. The instrumentation units have inputs and outputs connected to the control bus and to the processing units. The instrumentation units are configured to consume the start events on the inputs and start counting clock cycles, consume the stop events on the inputs and stop counting the clock cycles, and report the counted clock cycles on the outputs.
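
The behavior of a single instrumentation unit can be approximated in software as shown below. This is a simplified model under assumptions of our own: one event at most per clock cycle, counting begins on the cycle that carries the start event, and it ends on the cycle that carries the stop event. The class name `InstrumentationCounter` is hypothetical.

```python
class InstrumentationCounter:
    """Toy model of a unit that counts clock cycles between two events."""

    def __init__(self):
        self.counting = False
        self.cycles = 0

    def tick(self, start=False, stop=False):
        """Advance one clock cycle, consuming any event seen on the inputs."""
        if start:
            self.counting = True
            self.cycles = 0
        if stop:
            self.counting = False
        if self.counting:
            self.cycles += 1

    def report(self):
        """Value the unit would place on its output."""
        return self.cycles


counter = InstrumentationCounter()
# Hypothetical event trace, one entry per clock cycle.
for event in [None, "start", None, None, "stop", None]:
    counter.tick(start=(event == "start"), stop=(event == "stop"))
measured = counter.report()
```

In this trace the start event arrives on cycle 1 and the stop event on cycle 4, so the unit reports 3 counted cycles.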

Type: Grant

Filed: February 12, 2021

Date of Patent: September 21, 2021

Assignee: SambaNova Systems, Inc.

Inventors: Raghu Prabhakar, Matthew Thomas Grimm, Sumti Jairath, Kin Hing Leung, Sitanshu Gupta, Yuan Lin, Luca Boasso

Data processing device, data processing method, and recording medium

Patent number: 11048511

Abstract: A data processing device according to the present invention includes: a loop counter group that includes a loop-control register set; a loop controller that controls the loop counter group, based on a value of the loop counter group, and generates a loop end signal; a controller that controls the loop counter group and the loop controller, based on an instruction word taken from an instruction memory and the loop end signal, and generates a calculator control signal and a program-counter control signal; a calculator that executes a calculation, based on the calculator control signal; and a program counter that performs a count operation in response to the program-counter control signal, and stores an address of the instruction memory storing an instruction word to be executed next.

Type: Grant

Filed: November 9, 2018

Date of Patent: June 29, 2021

Assignee: NEC CORPORATION

Inventor: Yuki Kobayashi

System and method for supporting large queries in a multidimensional database environment

Patent number: 10984020

Abstract: In accordance with an embodiment, the system provides support for large queries in a multidimensional database computing environment. The system uses a kernel-based data structure, referred to herein as an odometer retriever, or odometer, that manages pointers to data blocks, contains control information, or otherwise operates as an array of arrays of pointers to stored members. When used with a dynamic flow, the approach enables the system to be used, for example, to handle grid queries, Multidimensional Expressions (MDX) queries, or other types of queries in which the potential size of the query can be up to 2^64 bits.

Type: Grant

Filed: October 24, 2016

Date of Patent: April 20, 2021

Assignee: ORACLE INTERNATIONAL CORPORATION

Inventor: Alexey Roytman

Methods, systems and apparatus to improve FPGA pipeline emulation efficiency on CPUs

Patent number: 10909287

Abstract: Methods, apparatus, systems and articles of manufacture are disclosed to improve FPGA pipeline emulation efficiency on CPUs. An example disclosed apparatus includes a loop detector to identify a register shift loop in field programmable gate array (FPGA) code, an unroller to shift and store pipeline stages in the register shift loop to a temporary unroll array, an intermediate canceller to cancel out intermediate load and store values of the temporary unroll array to retain last shifted values of the pipeline stages, and a propagator to improve emulation efficiency of the FPGA code by generating a scalar loop of the retained last shifted values for a vectorization input.

Type: Grant

Filed: June 28, 2017

Date of Patent: February 2, 2021

Assignee: INTEL CORPORATION

Inventors: Xinmin Tian, Geoff Lowney

Performing a compiler optimization pass as a transaction

Patent number: 10891120

Abstract: Embodiments described herein provide a solution for optimizing a compiling of program code. A proposed state pointer, which corresponds to a current state pointer to a current state node that represents a section of the program code, is added in an intermediate language (IL) representation of the program code. When the optimizing compiler determines that an optimization should be made to a section of code, the current state node is copied to create a proposed state node, which is then referenced by the proposed state pointer. The proposed state node is edited to include the optimization while the current state node remains unchanged. The success of the optimization is evaluated, and an updated IL representation is generated in which any references to nodes that are no longer included in the flow of the former IL representation are removed.
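
The copy, edit, evaluate, and commit sequence described in the abstract above resembles the following sketch. It is an illustration only: `StateNode`, `Section`, and `run_pass_as_transaction` are hypothetical names, instructions are plain strings, and "success" is judged by a caller-supplied predicate rather than by anything the patent specifies.

```python
import copy


class StateNode:
    """Holds the instructions for one section of the program."""

    def __init__(self, instructions):
        self.instructions = list(instructions)


class Section:
    """IL entry with a current state pointer and a proposed state pointer."""

    def __init__(self, instructions):
        self.current = StateNode(instructions)
        self.proposed = None


def run_pass_as_transaction(section, optimize, is_improvement):
    """Apply `optimize` to a copy and commit it only if it is an improvement."""
    section.proposed = copy.deepcopy(section.current)  # current stays untouched
    optimize(section.proposed)
    committed = is_improvement(section.current, section.proposed)
    if committed:
        section.current = section.proposed  # commit: swap the pointers
    section.proposed = None                 # drop the stale reference either way
    return committed


def drop_nops(node):
    node.instructions = [i for i in node.instructions if i != "nop"]


def add_nop(node):
    node.instructions.append("nop")


def fewer_instructions(current, proposed):
    return len(proposed.instructions) < len(current.instructions)


section = Section(["load", "nop", "add", "nop", "store"])
first = run_pass_as_transaction(section, drop_nops, fewer_instructions)
second = run_pass_as_transaction(section, add_nop, fewer_instructions)
```

The first pass commits because it shortens the section; the second is rolled back simply by discarding the proposed node, leaving the current node as it was.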

Type: Grant

Filed: February 14, 2019

Date of Patent: January 12, 2021

Assignee: International Business Machines Corporation

Inventor: Irwin D'Souza

Collapsing of multiple nested loops, methods, and instructions

Patent number: 10877758

Abstract: In an embodiment, the present invention is directed to a processor including a decode logic to receive a multi-dimensional loop counter update instruction and to decode the multi-dimensional loop counter update instruction into at least one decoded instruction, and an execution logic to execute the at least one decoded instruction to update at least one loop counter value of a first operand associated with the multi-dimensional loop counter update instruction by a first amount. Methods to collapse loops using such instructions are also disclosed. Other embodiments are described and claimed.

Type: Grant

Filed: September 4, 2018

Date of Patent: December 29, 2020

Assignee: Intel Corporation

Inventors: Mikhail Plotnikov, Andrey Naraikin, Elmoustapha Ould-Ahmed-Vall

Control speculation in dataflow graphs

Patent number: 10860301

Abstract: Systems, apparatuses and methods may provide for technology that determines that a control loop is to be executed for an unspecified number of iterations and automatically forces the control loop to be executed for a fixed number of iterations in addition to the unspecified number of iterations, where execution of the control loop for the fixed number of iterations is conducted in parallel. In one example, the technology also removes one or more dataflow tokens associated with the execution of the control loop for the fixed number of iterations.

Type: Grant

Filed: June 28, 2019

Date of Patent: December 8, 2020

Assignee: Intel Corporation

Inventor: Kermin ChoFleming

Escape analysis supporting on-stack replacement

Patent number: 10782945

Abstract: An enhanced object allocation optimization selectively traverses an intermediate representation detecting on-stack replacement transitions, which when found are analyzed to determine whether a control flow-edge from a first block to a second block that is marked as an OSR resumption block exists. Responding to when the second block is marked, a pseudo call including arguments of all live local variables holding pointers to objects is inserted into the intermediate representation while optimization opportunities exist and executing a modified escape analysis on a modified intermediate representation examining each pseudo call as an escape point for all object references received by the pseudo call as arguments; ignoring uses of local variables dominated by these pseudo calls; and stack allocating objects to handle the non-local control flow due to on-stack replacement control flow using the pseudo call.

Type: Grant

Filed: April 4, 2019

Date of Patent: September 22, 2020

Assignee: International Business Machines Corporation

Inventors: Andrew James Craik, Vijay Sundaresan

Control device

Patent number: 10712716

Abstract: A control device according to the present invention includes a plurality of arithmetic units that operate in parallel. A sensor value of the control amount is input to the first arithmetic unit in a signal transmission sequence, and a correction amount for the manipulation amount is output from the last arithmetic unit in the signal transmission sequence. The first arithmetic unit has a controller that produces an output by processing the input sensor value, and the arithmetic units other than the first arithmetic unit each have a delay element that delays an input by a predetermined number of steps and a controller that produces an output by processing the delayed input.

Type: Grant

Filed: February 12, 2014

Date of Patent: July 14, 2020

Assignees: TOYOTA JIDOSHA KABUSHIKI KAISHA, NATIONAL UNIVERSITY CORPORATION NAGOYA UNIVERSITY

Inventors: Kota Sata, Junichi Kako, Satoru Watanabe, Yuta Suzuki, Masato Edahiro

Systems and methods for stencil amplification

Patent number: 10713022

Abstract: In a sequence of major computational steps or in an iterative computation, a stencil amplifier can increase the number of data elements accessed from one or more data structures in a single major step or iteration, thereby decreasing the total number of computations and/or communication operations in the overall sequence or the iterative computation. Stencil amplification, which can be optimized according to a specified parameter such as compile time, run time, code size, etc., can improve the performance of a computing system executing the sequence or the iterative computation in terms of run time, memory load, energy consumption, etc. The stencil amplifier typically determines boundaries, to avoid erroneously accessing data elements not present in the one or more data structures.

Type: Grant

Filed: October 29, 2015

Date of Patent: July 14, 2020

Assignee: Reservoir Labs, Inc.

Inventors: Muthu M. Baskaran, Thomas Henretty, Richard A. Lethin, Benoit J. Meister

System and method for supporting queries having sub-select constructs in a multidimensional database environment

Patent number: 10628451

Abstract: In accordance with an embodiment, described herein is a system and method for supporting queries having sub-select constructs in a multidimensional database computing environment. The system enables a sub-select construct to be provided as part of an input query, for example using a Multidimensional Expressions (MDX), or other type of query. The inner sub-select, specified by the input query, is not executed by the system before the main query, but is used to restrict the data space for execution of the main select. The approach to processing the sub-select enables support for security-sensitive or other types of aggregation use cases.

Type: Grant

Filed: October 24, 2016

Date of Patent: April 21, 2020

Assignee: ORACLE INTERNATIONAL CORPORATION

Inventors: Roman Reichman, Victor Belyaev, Kumar Ramaiyer, Inna Grois, Natasha Reichman

Hardware acceleration method, compiler, and device

Patent number: 10558443

Abstract: A hardware acceleration method, a compiler, and a device, to improve code execution efficiency and implement hardware acceleration. The method includes: obtaining, by a compiler, compilation policy information and source code, where the compilation policy information indicates that a first code type matches a first processor and a second code type matches a second processor; analyzing, by the compiler, a code segment in the source code according to the compilation policy information, and determining a first code segment belonging to the first code type or a second code segment belonging to the second code type; and compiling, by the compiler, the first code segment into first executable code, and sending the first executable code to the first processor; and compiling the second code segment into second executable code, and sending the second executable code to the second processor.

Type: Grant

Filed: December 28, 2017

Date of Patent: February 11, 2020

Assignee: HUAWEI TECHNOLOGIES CO., LTD.

Inventors: Jian Chen, Hong Zhou, Xinyu Hu, Hongguang Guan, Xiaojun Zhang

Information processing apparatus and conversion method

Patent number: 10496408

Abstract: An information processing apparatus sets, in a second program: a second array where an occurrence pattern indicating whether elements are subjected to computation is a repetition of a pattern for every power-of-two number of elements; a second mask array generated by adding masks indicating that corresponding elements are not subjected to the computation to a first mask array so that the second mask array includes as many masks as the number of elements included in a second pattern; and a second instruction string providing an instruction for the computation of elements corresponding to masks indicating that corresponding elements are subjected to the computation, among the elements set in the second array. Each mask in the second mask array to be applied to an element in the second array is specified by a bitwise logical AND using a value indicating the position of the element in the second array.
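
One way to read the abstract above: padding the mask pattern to a power-of-two length lets the mask for any element be found with a bitwise AND instead of a modulo. The sketch below illustrates that reading with a masked sum; the function names, the zero padding value, and the example data are assumptions, not details from the patent.

```python
def next_power_of_two(n):
    """Smallest power of two that is >= n (for n >= 1)."""
    return 1 << (n - 1).bit_length()


def pad_to_power_of_two(data, mask):
    """Re-lay out `data` and `mask` so the mask pattern repeats every 2**k elements."""
    period = len(mask)
    padded_period = next_power_of_two(period)
    # Added masks mark their elements as not subject to computation.
    padded_mask = list(mask) + [False] * (padded_period - period)
    padded_data = []
    for start in range(0, len(data), period):
        chunk = list(data[start:start + period])
        padded_data.extend(chunk + [0] * (padded_period - len(chunk)))
    return padded_data, padded_mask


def masked_sum(padded_data, padded_mask):
    """Sum the enabled elements, selecting each mask with a bitwise AND."""
    low_bits = len(padded_mask) - 1  # valid because the length is a power of two
    return sum(v for i, v in enumerate(padded_data) if padded_mask[i & low_bits])


data = [1, 2, 3, 4, 5, 6]
mask = [True, False, True]  # pattern of length 3, not a power of two
padded_data, padded_mask = pad_to_power_of_two(data, mask)
total = masked_sum(padded_data, padded_mask)
```

The result matches what `mask[i % 3]` would select on the original array, while the padded layout only needs `i & 3` to locate the mask.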

Type: Grant

Filed: March 28, 2017

Date of Patent: December 3, 2019

Assignee: FUJITSU LIMITED

Inventor: Masanori Yamanaka

Program development support system and program development support software

Patent number: 10310823

Abstract: The program development support system, which creates a program described in graph form for executing data processing on a target device, includes: a GUI part; a program-creating part; a process-executing function database; and a data-transfer function database. When a process included in the data processing can be executed by different kinds of processing units provided in the target device, process-executing functions for executing the process on the respective processing units are held in the process-executing function database, and the corresponding data-transfer functions are held in the data-transfer function database. The processing unit used to execute the process can be selected through the GUI part. The program-creating part reads the process-executing function and data-transfer functions corresponding to the selected processing unit and creates a program that makes the target device execute the intended data processing.

Type: Grant

Filed: August 26, 2016

Date of Patent: June 4, 2019

Assignee: Renesas Electronics Corporation

Inventor: Yuki Kobayashi

Performance metric of a system conveying web content

Patent number: 10230590

Abstract: Obtaining a performance metric in a system for conveying web content from a server node to a terminal node along one or more network nodes, involving an inspecting of a data flow transmitting said web content toward the terminal node for extracting web content records. The extracted web content records are correlated to at least one web session. The extracted web content records are associated to a performance of one or more of said nodes. The performance metric is calculated from the correlated and associated web content records for at least one web session and one or more of said nodes.

Type: Grant

Filed: December 3, 2013

Date of Patent: March 12, 2019

Assignee: Telefonaktiebolaget LM Ericsson (publ)

Inventors: Icaro L. J. Da Silva, Fredrik Kuivinen, Jing Fu

Parallel processing optimization method, and information processing device

Patent number: 10223168

Abstract: A specification unit specifies numbers of cores executing processing when a predetermined number of processings to be executed in parallel is allocated to cores by same amount by changing number of processings to be allocated within a range of numbers of cores capable of executing parallel processing. A determination unit determines number of cores with highest processing performance as the number of cores executing the parallel processing from among the specified numbers of cores.

Type: Grant

Filed: June 20, 2014

Date of Patent: March 5, 2019

Assignee: FUJITSU LIMITED

Inventors: Naoto Fukumoto, Kohta Nakashima

Systems and methods for footprint based scheduling

Patent number: 10095494

Abstract: A system can generate and impose constraints on a compiler/scheduler so as to specifically minimize the footprints of one or more program variables. The constraints can be based on scopes of the variables and/or on dependence distances between statements specifying operations that use the one or more program variables.

Type: Grant

Filed: August 28, 2015

Date of Patent: October 9, 2018

Assignee: Reservoir Labs, Inc.

Inventors: Benoit J. Meister, Muthu M. Baskaran, Richard A. Lethin

Systems and methods for efficient determination of task dependences after loop tiling

Patent number: 10095434

Abstract: A compilation system can compile a program to be executed using an event driven tasks (EDT) system that requires knowledge of dependencies between program statement instances, and generate the required dependencies efficiently when a tiling transformation is applied. To this end, the system may use pre-tiling dependencies and can derive post-tiling dependencies via an analysis of the tiling to be applied.

Type: Grant

Filed: January 4, 2016

Date of Patent: October 9, 2018

Assignee: RESERVOIR LABS, INC.

Inventors: Muthu M. Baskaran, Thomas Henretty, Ann Johnson, Athanasios Konstantinidis, M. H. Langston, Janice O. McMahon, Benoit J. Meister, Paul D. Mountcastle, Aale Naqvi, Benoit Pradelle, Tahina Ramananandro, Sanket Tavarageri, Richard A. Lethin

Program optimization method, program optimization program, and program optimization apparatus

Patent number: 9760352

Abstract: A program optimization method, executed by an arithmetic processing device, includes collecting profile information including a runtime analysis result by causing a computer to execute an original program to be optimized, calculating a calculation wait time based on the profile information, and generating a tuned-up program, when the calculation wait time is longer than a first threshold, by inserting an SIMD operation control line that performs an SIMD operation for an instruction in IF statement in the loop when an SIMD instruction ratio in the loop in the original program is lower than a second threshold.

Type: Grant

Filed: July 15, 2015

Date of Patent: September 12, 2017

Assignee: FUJITSU LIMITED

Inventors: Shusaku Nakashima, Toshiya Naito
Nested communication operator

Patent number: 9507568

Abstract: A high level programming language provides a nested communication operator that partitions a computational space. An indexable type with a rank and element type defines the computational space. The nested communication operator partitions a specified dimension of an index indexable type into segments specified by a segmentation vector and returns an output indexable type that represents the segments. By doing so, the nested communication operator allows data parallel algorithms to operate on the segments as individual units.

Type: Grant

Filed: December 9, 2010

Date of Patent: November 29, 2016

Assignee: Microsoft Technology Licensing, LLC

Inventor: Paul F. Ringseth
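
As an illustration of the partitioning described in the abstract above, the following is a minimal sketch, not the patented operator. It assumes a one-dimensional indexable type modeled as a Python list, and the name `partition_dimension` is hypothetical. Each entry of the segmentation vector is taken to be the length of one segment.

```python
from itertools import accumulate


def partition_dimension(data, segmentation):
    """Split a 1-D sequence into consecutive segments.

    `segmentation` stands in for the segmentation vector: entry k is the
    assumed length of segment k, and the lengths must cover the dimension.
    """
    if sum(segmentation) != len(data):
        raise ValueError("segment lengths must cover the dimension exactly")
    ends = list(accumulate(segmentation))   # exclusive end index of each segment
    starts = [0] + ends[:-1]                # start index of each segment
    return [data[s:e] for s, e in zip(starts, ends)]


# A data-parallel step can then treat each segment as an individual unit.
segments = partition_dimension(list(range(10)), [3, 3, 4])
partial_sums = [sum(seg) for seg in segments]
```

Here each segment could be handed to a separate worker; the per-segment sums are only a stand-in for whatever data-parallel algorithm operates on the segments.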
Message inlining

Patent number: 9430196

Abstract: In certain embodiments, source code that includes at least one request to send an asynchronous message between state machines is obtained at a computing system. The computing system determines whether the asynchronous message can be implemented as an inlined message. Executable machine code corresponding to the source code is compiled such that the asynchronous message is compiled as an inlined message.

Type: Grant

Filed: October 16, 2014

Date of Patent: August 30, 2016

Assignee: Cisco Technology, Inc.

Inventors: Swaha Das Miller, Jonathan Gregory Rossie, Jr., Jamie Taylor
Compiling multi-threaded applications for targeted criticalities

Patent number: 9430201

Abstract: Methods are disclosed of compiling a software application having multiple functions. At least one of the functions is identified as a targeted function having a significant contribution to performance of the software application. A code version of the targeted function is generated with one of multiple machine models corresponding to different target utilizations for a target architecture, specifically corresponding to the one with the greatest of the different target utilizations. The generated code version of the targeted function is matched with an application thread of the target architecture.

Type: Grant

Filed: August 21, 2014

Date of Patent: August 30, 2016

Assignee: Oracle International Corporation

Inventors: Spiros Kalogeropulos, Partha Tirumalai
Information processing apparatus and compilation method

Patent number: 9430203

Abstract: A storage unit stores source code including loop processing that is written with an array referenced by an index, a loop variable, and a parameter. A computing unit generates a conditional expression indicating that the index of the array satisfies a predetermined condition, using the loop variable and the parameter. The computing unit generates determination information on the parameter, by eliminating the loop variable from the conditional expression through formula manipulation. Then, the computing unit generates object code corresponding to the source code in accordance with the determination information.

Type: Grant

Filed: October 24, 2014

Date of Patent: August 30, 2016

Assignee: FUJITSU LIMITED

Inventors: Kuninori Ishii, Masanori Yamanaka, Masaki Arai
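
The elimination of the loop variable described in the abstract above can be sketched on a toy case. The example assumes a loop `for i in range(n)` that accesses `a[i + p]` on an array of length `m`; the access pattern, the names `p`, `n`, `m`, and both function names are assumptions for illustration and do not come from the patent. Requiring `0 <= i + p < m` for every `i` reduces, once `i` is eliminated, to a condition on the parameters alone.

```python
def in_bounds_for_all_iterations(p, n, m):
    """Loop-free condition: every access a[i + p], 0 <= i < n, is in 0..m-1.

    Derived by substituting the extreme values of i (0 and n - 1) into
    0 <= i + p < m; an empty loop (n <= 0) is trivially safe.
    """
    return n <= 0 or (0 <= p and p + n - 1 <= m - 1)


def in_bounds_brute_force(p, n, m):
    """Reference check that still iterates over the loop variable."""
    return all(0 <= i + p < m for i in range(n))


# The two predicates agree on a small grid of parameter values.
agree = all(
    in_bounds_for_all_iterations(p, n, m) == in_bounds_brute_force(p, n, m)
    for p in range(-3, 6)
    for n in range(0, 6)
    for m in range(0, 6)
)
```

A compiler holding such a parameter-only condition could, for instance, choose between a checked and an unchecked version of the loop once `p` is known; that use is an assumption here, not a claim about the patented method.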
Heterogeneous thread scheduling

Patent number: 9424092

Abstract: Heterogeneous thread scheduling techniques are described in which a processing workload is distributed to heterogeneous processing cores of a processing system. The heterogeneous thread scheduling may be implemented based upon a combination of periodic assessments of system-wide power management considerations used to control states of the processing cores and higher-frequency thread-by-thread placement decisions that are made in accordance with thread-specific policies. In one or more implementations, a system workload context is periodically analyzed for a processing system having heterogeneous cores including power efficient cores and performance oriented cores. Based on the periodic analysis, core states are set for some of the heterogeneous cores to control activation of the power efficient cores and performance oriented cores for thread scheduling.

Type: Grant

Filed: September 26, 2014

Date of Patent: August 23, 2016

Assignee: Microsoft Technology Licensing, LLC

Inventors: Neeraj Kumar Singh, Tristan A. Brown, Jeremiah S. Samli, Jason S. Wohlgemuth, Youssef Maged Barakat
Loop abstraction for model checking

Patent number: 9158506

Abstract: Loop abstraction includes determining an original loop within the source code. The original loop includes a control statement and a loop body such that the original loop causes the loop body to be repeatedly executed based on the control statement. Further, output variables in the original loop and a number of blocks associated with the original loop are identified. The number of blocks is indicative of a count of unconditionally executed statement sets in which at least one output variable is computed. An abstract loop corresponding to the original loop is generated by adding a modified expression for accelerated assignment for each output variable in a subset of the output variables, and replacing the control statement with a bounded control statement. The original loop is replaced with the abstract loop for generating an abstract source code for the model checking.

Type: Grant

Filed: February 27, 2015

Date of Patent: October 13, 2015

Assignee: Tata Consultancy Services Limited

Inventors: Priyanka Dilip Darke, Bharti Dewrao Chimdyalwar, Venkatesh R, Ulka Aniruddha Shrotri
Computer-readable recording medium, compiling method, and information processing apparatus

Patent number: 9141357

Abstract: A compiler determines executability of loop fusion, for each of a plurality of loops existing in a code to be processed, based on performance information of a system where the code to be processed is executed and based on operands and number of data transfers executed inside each of the loops. Then, the compiler executes fusion of loop processing in accordance with a determination result of executability of the loop fusion.

Type: Grant

Filed: April 17, 2014

Date of Patent: September 22, 2015

Assignee: FUJITSU LIMITED

Inventors: Tomoko Nikko, Shuichi Chiba
Method and apparatus for efficient execution of concurrent processes on a multithreaded message passing system

Patent number: 9122513

Abstract: A graph analytics appliance can be employed to extract data from a graph database in an efficient manner. The graph analytics appliance includes a router, a worklist scheduler, a processing unit, and an input/output unit. The router receives an abstraction program including a plurality of parallel algorithms for a query request from an abstraction program compiler residing on a computational node or the graph analytics appliance. The worklist scheduler generates a prioritized plurality of parallel threads for executing the query request from the plurality of parallel algorithms. The processing unit executes multiple threads selected from the prioritized plurality of parallel threads. The input/output unit communicates with a graph database.

Type: Grant

Filed: November 28, 2012

Date of Patent: September 1, 2015

Assignee: International Business Machines Corporation

Inventors: Arpith C. Jacob, Jude A. Rivers
Method and apparatus for efficient execution of concurrent processes on a multithreaded message passing system

Patent number: 9116738

Abstract: A graph analytics appliance can be employed to extract data from a graph database in an efficient manner. The graph analytics appliance includes a router, a worklist scheduler, a processing unit, and an input/output unit. The router receives an abstraction program including a plurality of parallel algorithms for a query request from an abstraction program compiler residing on a computational node or the graph analytics appliance. The worklist scheduler generates a prioritized plurality of parallel threads for executing the query request from the plurality of parallel algorithms. The processing unit executes multiple threads selected from the prioritized plurality of parallel threads. The input/output unit communicates with a graph database.

Type: Grant

Filed: November 13, 2012

Date of Patent: August 25, 2015

Assignee: International Business Machines Corporation

Inventors: Arpith C. Jacob, Jude A. Rivers
Optimization of loops and data flow sections in multi-core processor environment

Patent number: 9043769

Abstract: The present invention relates to a method for compiling code for a multi-core processor, comprising: detecting and optimizing a loop, partitioning the loop into partitions executable and mappable on physical hardware with optimal instruction level parallelism, optimizing the loop iterations and/or loop counter for ideal mapping on hardware, and chaining the loop partitions, generating a list representing the execution sequence of the partitions.

Type: Grant

Filed: December 28, 2010

Date of Patent: May 26, 2015

Assignee: Hyperion Core Inc.

Inventor: Martin Vorbach
Stream processor with compiled programs

Patent number: 9038041

Abstract: A stream processing platform provides fast execution of stream processing applications within a safe runtime environment. The platform includes a stream compiler that converts a representation of a stream processing application into executable program modules for a safe environment. The platform allows users to specify aspects of the program that contribute to generation of modules that execute as intended. A user may specify aspects to control a type of implementation for loops, order of execution for parallel paths, whether multiple instances of an operation can be performed in parallel or whether certain operations should be executed in separate threads. In addition, the stream compiler may generate executable modules in a way that causes a safe runtime environment to allocate memory or otherwise operate efficiently.

Type: Grant

Filed: December 22, 2006

Date of Patent: May 19, 2015

Assignee: TIBCO Software, Inc.

Inventors: Jonathan Salz, Richard S. Tibbetts
Staged loop instructions

Patent number: 9038042

Abstract: Loop instructions are analyzed and assigned stage numbers based on dependencies between them and machine resources available. The loop instructions are selectively executed based on their stage numbers, thereby eliminating the need for explicit loop set-up and tear-down instructions. On a Single Instruction, Multiple Data machine, the final instance of each instruction may be executed on a subset of the processing elements or vector elements, dependent on the number of iterations of the original loop.

Type: Grant

Filed: June 29, 2012

Date of Patent: May 19, 2015

Assignee: ANALOG DEVICES, INC.

Inventors: Michael G. Perkins, Andrew J. Higham
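
The selective execution by stage number described in the abstract above can be sketched as follows. This is a simplified model under stated assumptions, not the patented mechanism: stage numbers are supplied by hand rather than derived from dependencies and machine resources, and `run_staged_loop` is a hypothetical name. An instruction with stage `s` works on original iteration `t - s` during trip `t` and is skipped when that iteration does not exist, so no separate set-up or tear-down code is needed.

```python
def run_staged_loop(n_iterations, staged_ops):
    """Run a pipelined loop in which each op is gated by its stage number.

    staged_ops: list of (stage, name) pairs in issue order.
    Returns a trace of (trip, name, original_iteration) tuples.
    """
    n_stages = max(stage for stage, _ in staged_ops) + 1
    trace = []
    # The single loop runs long enough to fill and drain the pipeline.
    for trip in range(n_iterations + n_stages - 1):
        for stage, name in staged_ops:
            iteration = trip - stage
            # Execute only if this op's iteration exists in the original loop.
            if 0 <= iteration < n_iterations:
                trace.append((trip, name, iteration))
    return trace


trace = run_staged_loop(3, [(0, "load"), (1, "mul"), (2, "store")])
```

In this sketch the first trips execute only the early-stage instructions and the last trips only the late-stage ones, which is the role normally played by explicit prologue and epilogue code.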
INFORMATION PROCESSING APPARATUS AND COMPILATION METHOD

Publication number: 20150135171

Abstract: A storage unit stores source code including loop processing that is written with an array referenced by an index, a loop variable, and a parameter. A computing unit generates a conditional expression indicating that the index of the array satisfies a predetermined condition, using the loop variable and the parameter. The computing unit generates determination information on the parameter, by eliminating the loop variable from the conditional expression through formula manipulation. Then, the computing unit generates object code corresponding to the source code in accordance with the determination information.

Type: Application

Filed: October 24, 2014

Publication date: May 14, 2015

Inventors: Kuninori ISHII, Masanori YAMANAKA, Masaki ARAI
Crash notification between debuggers

Patent number: 9009671

Abstract: Crash notification between debuggers, including: initiating, by a first debugger, a first debug session of a first application; detecting, by the first debugger, an error condition in the first application; determining, by the first debugger, whether any variables utilized by the first application are related to variables utilized by a second application, wherein the second application is being debugged in a second debug session by a second debugger; and communicating, by the first debugger to a second debugger, information associated with the error condition in the first application.

Type: Grant

Filed: February 13, 2013

Date of Patent: April 14, 2015

Assignee: International Business Machines Corporation

Inventors: Cary L. Bates, Justin K. King, Lee Nee, Michelle A. Schlicht
Flexible and run-time-modifiable inclusion of functionality in computer code

Patent number: 8997042

Abstract: The current application is directed to flexible and run-time-modifiable implementation of crosscutting functionalities, including code instrumentation, error logging, and other such crosscutting functionalities. These crosscutting functionalities generally violate, or run counter to, modern code-development strategies and programming-language features that seek to partition logic into hierarchically organized compartments and modules with related functionalities, attribute values, and other common features. One feature of the methods and systems for implementing crosscutting functionalities to which the current application is directed is an intelligent switch that can be controlled, at run time, to alter invocation and behavior of crosscutting-functionality implementations, including data-collection instrumentation, error logging, and other crosscutting-functionality implementations.

Type: Grant

Filed: October 15, 2012

Date of Patent: March 31, 2015

Assignee: Pivotal Software, Inc.

Inventors: John Victor Kew, Jonathan Travis
AUTO MULTI-THREADING IN MACROSCALAR COMPILERS

Publication number: 20150058832

Abstract: Systems and methods for the parallelization of software applications are described. In some embodiments, a compiler may automatically identify within source code dependencies of a function called by another function. A persistent database may be generated to store identified dependencies. When calls to the function are encountered within the source code, the persistent database may be checked, and a parallelized implementation of the function may be employed dependent upon the dependency indicated in the persistent database.

Type: Application

Filed: November 4, 2014

Publication date: February 26, 2015

Inventor: Jeffry E. Gonion
Processors and compiling methods for processors

Patent number: 8966459

Abstract: A compiling method compiles an object program to be executed by a processor having a plurality of execution units operable in parallel. In the method a first availability chain is created from a producer instruction (p1), scheduled for execution by a first one of the execution units (20: AGU), to a first consumer instruction (c1), scheduled for execution by a second one of the execution units (22: EXU) and requiring a value produced by the said producer instruction. The first availability chain comprises at least one move instruction (mv1-mv3) for moving the required value from a first point (20: ARF) accessible by the first execution unit to a second point (22: DRF) accessible by the second execution unit.

Type: Grant

Filed: February 20, 2014

Date of Patent: February 24, 2015

Assignee: Altera Corporation

Inventors: Marcio Merino Fernandes, Raymond Malcolm Livesley
