Including Loop Patents (Class 717/160)

Including scheduling instructions (Class 717/161)

Compilation with caching of code analysis result

Patent number: 11941383

Abstract: Techniques to speed up code compilation may include caching code analysis results such that the analysis of subsequent code having a similar structure can be omitted. For example, a loop-nest construct in the code can be parsed, and an execution statement in the loop-nest construct can be analyzed by a compiler to generate an analysis result indicating a set of execution conditions for the execution statement. A lookup key can be generated from the control statements bounding the execution statement, and the analysis result can be stored with the lookup key in a cache entry of the cache. The execution statement is then modified according to the analysis result for optimization. Instead of having to analyze a subsequent execution statement bounded by the same control statements, the analysis result of the subsequent execution statement can be retrieved from the cache and be used to modify the subsequent execution statement.
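The caching idea in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the class name `AnalysisCache`, the function `toy_analysis`, and the representation of control statements as plain strings are all assumptions made for illustration. The key point shown is that the lookup key is built from the bounding control statements, so a second statement under the same controls reuses the stored result.

```python
from typing import Callable, Dict, FrozenSet, Iterable, Tuple

# Hypothetical lookup key: the control statements that bound a statement,
# written as plain strings and kept in nesting order.
ControlKey = Tuple[str, ...]


def toy_analysis(controls: ControlKey) -> FrozenSet[str]:
    """Stand-in for an expensive compiler analysis (an assumption).

    It simply reports every bounding control statement as an
    execution condition of the statement.
    """
    return frozenset(controls)


class AnalysisCache:
    """Caches analysis results keyed by the bounding control statements."""

    def __init__(self, analyze: Callable[[ControlKey], FrozenSet[str]]):
        self._analyze = analyze
        self._entries: Dict[ControlKey, FrozenSet[str]] = {}
        self.hits = 0
        self.misses = 0

    def conditions_for(self, controls: Iterable[str]) -> FrozenSet[str]:
        key = tuple(controls)
        if key in self._entries:
            # Same control statements seen before: skip the analysis.
            self.hits += 1
            return self._entries[key]
        self.misses += 1
        result = self._analyze(key)
        self._entries[key] = result
        return result
```

Two statements nested under the same `for` and `if` would trigger one analysis and one cache hit; a statement under different controls would be analyzed afresh.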

Type: Grant

Filed: March 8, 2022

Date of Patent: March 26, 2024

Assignee: Amazon Technologies, Inc.

Inventors: Hongbin Zheng, Pushkar Ratnalikar
Compiler-driven storage allocation of runtime values

Patent number: 11934876

Abstract: A compiler-implemented technique for performing a storage allocation is described. Computer code to be converted into machine instructions for execution on an integrated circuit device is received. Based on the computer code, a set of values that are to be stored on the integrated circuit device are determined. An interference graph that includes the set of values and a set of interferences is constructed. A number of possible placements and a number of blocked placements in a memory of the integrated circuit device are computed for each of the set of values. At least a portion of the set of values are assigned to a set of memory locations in the memory based on the numbers of possible placements and blocked placements, resulting in a set of memory location assignments.

Type: Grant

Filed: June 9, 2021

Date of Patent: March 19, 2024

Assignee: Amazon Technologies, Inc.

Inventor: Preston Pengra Briggs
Synchronization instruction insertion method and apparatus

Patent number: 11934832

Abstract: This application discloses example synchronization instruction insertion methods and example apparatuses. One example method includes obtaining a first program block comprising one or more statements, where each of the one or more statements includes one or more function instructions. A first function instruction and a second function instruction between which data dependency exists in the first program block can then be determined. A synchronization instruction pair between a first statement including the first function instruction and a second statement including the second function instruction can then be inserted.
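As a rough illustration of the method in this abstract, the sketch below scans a program block for the first pair of statements with a data dependency and places a synchronization pair between them. The `Stmt` type, the `set_flag` and `wait_flag` instruction names, and the read/write-set model of dependency are assumptions for illustration, not details taken from the patent.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Stmt:
    """Hypothetical statement: a name plus the buffers it reads and writes."""
    name: str
    reads: FrozenSet[str] = frozenset()
    writes: FrozenSet[str] = frozenset()


def insert_sync(block: List[Stmt]) -> List[Stmt]:
    """Insert one set/wait pair between the first dependent statement pair.

    A dependency is modelled (as an assumption) as a later statement
    reading a buffer that an earlier statement writes.
    """
    for i, producer in enumerate(block):
        for j in range(i + 1, len(block)):
            if producer.writes & block[j].reads:
                out = list(block)
                # Insert the wait first so the earlier index stays valid.
                out.insert(j, Stmt("wait_flag"))
                out.insert(i + 1, Stmt("set_flag"))
                return out
    return list(block)  # no dependency found, block unchanged
```

The producer signals right after it finishes and the consumer waits immediately before it starts, so independent statements in between are free to proceed.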

Type: Grant

Filed: December 21, 2021

Date of Patent: March 19, 2024

Assignee: Huawei Technologies Co., Ltd.

Inventors: Xiong Gao, Kun Zhang
Method and apparatus for real-time control loop application execution from a high-level description

Patent number: 11900076

Abstract: The present approach provides a method for safety-critical systems to reduce the required long V development and certification process, into a process that is up to 80% shorter, as well as safer. The present approach creates a pre-certified system, with both pre-certified hardware and pre-certified software. The pre-certified system may be configured to implement a safety-critical software compilation, that contains variables, operations, and template instantiations defining the safety-critical system. This approach eliminates the process below the high-level requirements for the safety-critical software through prior action. To support the configuration, the present approach implements three kinds of components: variables, operators, and templates that provide input, output and abstracted concepts. A configuration defines a set of variables, operations and template instantiations. A tool is used that takes high-level requirements written in a computer readable format into the configuration.

Type: Grant

Filed: April 21, 2021

Date of Patent: February 13, 2024

Assignee: SOLI BV

Inventor: Filip Leonard Etienne Verhaeghe
Storage cache management

Patent number: 11860780

Abstract: A method of cache management, the method comprising: identifying, among a plurality of storage items, storage items having an access count above a first threshold to generate a set of storage items; identifying, among the set of storage items, storage items having an updated access count above a second threshold to generate a subset of storage items, wherein, for each storage item, the updated access count is dependent upon a number of accesses subsequent to generating the set of storage items; and adding the storage items of the subset of storage items to a cache.
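The two-stage filter in this abstract can be sketched as follows. The function name `select_for_cache` and the use of two dictionaries of counts are assumptions for illustration; in particular, the updated access count is modelled here simply as the number of accesses observed after the candidate set was formed, which is only one reading of the abstract.

```python
from typing import Dict, Set


def select_for_cache(
    initial_counts: Dict[str, int],
    later_counts: Dict[str, int],
    first_threshold: int,
    second_threshold: int,
) -> Set[str]:
    """Return the storage items that pass both access-count filters."""
    # Stage 1: items whose access count is above the first threshold.
    candidates = {
        item for item, count in initial_counts.items()
        if count > first_threshold
    }
    # Stage 2: of those, items that kept being accessed afterwards.
    return {
        item for item in candidates
        if later_counts.get(item, 0) > second_threshold
    }
```

An item that was hot only briefly passes the first stage but fails the second, so it never occupies cache space.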

Type: Grant

Filed: January 28, 2022

Date of Patent: January 2, 2024

Assignee: PURE STORAGE, INC.

Inventors: Ethan Miller, John Colgrove
Method and apparatus for improved security in trigger action platforms

Patent number: 11856000

Abstract: An apparatus and method for improving the security of trigger action platforms of a type providing interoperability between computer services send the trigger service additional information about an interoperability rule for the computer services so that the trigger service may implement a minimizer reducing the data communicated when the interoperability is implemented. Implementation of the minimizer may be done in a way that is transparent to the trigger action platform eliminating the need for disruption of existing interoperability services.

Type: Grant

Filed: June 4, 2021

Date of Patent: December 26, 2023

Assignee: Wisconsin Alumni Research Foundation

Inventors: Yunang Chen, Mohannad Alhanahnah, Andrei Sabelfeld, Rahul Chatterjee, Earlence Fernandes
Systems and methods for unified computing on digital and quantum computers

Patent number: 11842177

Abstract: Computer systems and methods are provided for compiling a computer program to run on a quantum processor comprising a plurality of qubits, qudits or quantum continuous variables. A compiler obtains the program in a unified language, that is effectively a classical language, as opposed to a quantum language, and performs code refactoring on all or a portion of the program to form a refactored code and converts the refactored code into a first code. The compiler compiles the first code into a second code comprising a plurality of data elements in one or more quantum data structures. The compiler converts the second code to a third code expressed in a quantum gate-level language in accordance with an instruction set and gate locality constraints of the target quantum processor.

Type: Grant

Filed: June 3, 2021

Date of Patent: December 12, 2023

Assignee: HORIZON QUANTUM COMPUTING PTE. LTD.

Inventors: Joseph Francis Fitzsimons, Si-Hui Tan
Computer-readable recording medium storing program for converting first single instruction multiple data (SIMD) command using first mask register into second SIMD command using second mask register, command conversion method for converting first SIMD command using first mask register into second SIMD command using second mask register, and command conversion apparatus for converting first SIMD command using first mask register into second SIMD command using second mask register

Patent number: 11803384

Abstract: A recording medium stores a program for causing a computer to execute a process including: converting, in a first source code corresponding to a first-type processor, a first load command for a first mask register included in the first-type processor into a second load command for a second mask register included in a second-type processor; and converting, when a first SIMD command for performing an arithmetic operation using the first mask register exists after the first load command in the first source code and a state of a value of the first mask register does not coincide with a state of a value of the first mask register, the first SIMD command into a second SIMD command corresponding to the second-type processor and a change command for changing a state of a value of the second mask register to a state of a value of the second mask register.

Type: Grant

Filed: May 31, 2022

Date of Patent: October 31, 2023

Assignee: FUJITSU LIMITED

Inventors: Koji Kurihara, Kentaro Kawakami
Application migration using cost-aware code dependency graph

Patent number: 11714615

Abstract: Described are techniques for application migration. The techniques include migrating an application to a target cloud infrastructure and generating a cost-aware code dependency graph during execution of the application on the target cloud infrastructure. The techniques further include modifying the application by removing source code corresponding to unused nodes according to the cost-aware code dependency graph and replacing identified source code of a high-cost subgraph of the cost-aware code dependency graph with calls to a generated microservice configured to provide functionality similar to the identified source code. The techniques further include implementing the modified application on one or more virtual machines of the target cloud infrastructure.

Type: Grant

Filed: September 18, 2020

Date of Patent: August 1, 2023

Assignee: International Business Machines Corporation

Inventors: Bruno Silva, Marco Aurelio Stelmar Netto, Renato Luiz de Freitas Cunha, Nelson Mimura Gonzalez
Sparsity uniformity enforcement for multicore processor

Patent number: 11709662

Abstract: Methods and systems relating to the field of parallel computing are disclosed herein. The methods and systems disclosed include approaches for sparsity uniformity enforcement for a set of computational nodes which are used to execute a complex computation. A disclosed method includes determining a sparsity distribution in a set of operand data, and generating, using a compiler, a set of instructions for executing, using the set of operand data and a set of processing cores, a complex computation. Alternatively, the method includes altering the operand data. The method also includes distributing the set of operand data to the set of processing cores for use in executing the complex computation in accordance with the set of instructions. Either the altering is conducted to, or the compiler is programmed to, balance the sparsity distribution among the set of processing cores.

Type: Grant

Filed: November 5, 2021

Date of Patent: July 25, 2023

Assignee: Tenstorrent Inc.

Inventors: Ljubisa Bajic, Davor Capalija, Yu Ting Chen, Andrew Grebenisan, Hassan Farooq, Akhmed Rakhmati, Stephen Chin, Vladimir Blagojevic, Almeet Bhullar, Jasmina Vasiljevic
Compiler program, compiling method, information processing device

Patent number: 11693638

Abstract: A compiler program causes a computer to execute optimization processing for an optimization target program. The optimization target program includes a loop including a vector store instruction and a vector load instruction for an array variable. The optimization processing includes (1) unrolling the vector store instruction and the vector load instruction in the loop by an unrolling number of times to generate a plurality of unrolled vector store instructions and a plurality of unrolled vector load instructions, and (2) scheduling to move an unrolled vector load instruction among the plurality of unrolled vector load instructions, which is located after a first unrolled vector store instruction that is located at first among the plurality of unrolled vector load instructions, before the first unrolled vector store instruction.
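The unroll-then-schedule transformation described in this abstract can be sketched on a toy instruction list. The tuple encoding `(opcode, copy_index)` and the function names are assumptions for illustration, and the sketch assumes the hoisted loads do not depend on the stores they move across; a real compiler would have to establish that before reordering.

```python
from typing import List, Tuple

Instr = Tuple[str, int]  # (opcode, index of the unrolled copy)


def unroll(body: List[str], factor: int) -> List[Instr]:
    """Repeat the loop body `factor` times, tagging each copy."""
    return [(op, k) for k in range(factor) for op in body]


def hoist_loads(instrs: List[Instr]) -> List[Instr]:
    """Move every vector load that follows the first store ahead of it.

    Assumes (for this sketch only) that no hoisted load reads data
    written by a store it is moved across.
    """
    first_store = next(
        (i for i, (op, _) in enumerate(instrs) if op == "vstore"), None
    )
    if first_store is None:
        return list(instrs)
    head, tail = instrs[:first_store], instrs[first_store:]
    loads = [ins for ins in tail if ins[0] == "vload"]
    rest = [ins for ins in tail if ins[0] != "vload"]
    return head + loads + rest
```

With a body of one load and one store unrolled twice, the loads end up grouped ahead of the stores, which gives the hardware more independent memory operations to overlap.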

Type: Grant

Filed: March 17, 2021

Date of Patent: July 4, 2023

Assignee: FUJITSU LIMITED

Inventors: Kensuke Watanabe, Masatoshi Haraguchi, Shun Kamatsuka, Yasunobu Tanimura
Sparsity uniformity enforcement for multicore processor

Patent number: 11693639

Abstract: Methods and systems relating to the field of parallel computing are disclosed herein. The methods and systems disclosed include approaches for sparsity uniformity enforcement for a set of computational nodes which are used to execute a complex computation. A disclosed method includes determining a sparsity distribution in a set of operand data, and generating, using a compiler, a set of instructions for executing, using the set of operand data and a set of processing cores, a complex computation. Alternatively, the method includes altering the operand data. The method also includes distributing the set of operand data to the set of processing cores for use in executing the complex computation in accordance with the set of instructions. Either the altering is conducted to, or the compiler is programmed to, balance the sparsity distribution among the set of processing cores.

Type: Grant

Filed: November 5, 2021

Date of Patent: July 4, 2023

Assignee: Tenstorrent Inc.

Inventors: Ljubisa Bajic, Davor Capalija, Yu Ting Chen, Andrew Grebenisan, Hassan Farooq, Akhmed Rakhmati, Stephen Chin, Vladimir Blagojevic, Almeet Bhullar, Jasmina Vasiljevic
Static versioning in the polyhedral model

Patent number: 11693636

Abstract: An approach is presented to enhancing the optimization process in a polyhedral compiler by introducing compile-time versioning, i.e., the production of several versions of optimized code under varying assumptions on its run-time parameters. We illustrate this process by enabling versioning in the polyhedral processor placement pass. We propose an efficient code generation method and validate that versioning can be useful in a polyhedral compiler by performing benchmarking on a small set of deep learning layers defined for dynamically-sized tensors.

Type: Grant

Filed: November 15, 2021

Date of Patent: July 4, 2023

Assignee: Reservoir Labs Inc.

Inventors: Benoit J. Meister, Adithya Dattatri
Flexible optimized data handling in systems with multiple memories

Patent number: 11687369

Abstract: Methods and systems for optimizing an application for a computing system having multiple distinct memory locations that are interconnected by one or more communication channels include determining one or more data handling properties for a data region in an application. One or more data handling policies for the data region are determined based on the one or more data handling properties. Data setup costs are determined for a scope in the application that uses the data region in different memory locations based on the one or more data handling properties. The application is optimized in accordance with the one or more data handling policies and the data setup costs for the different memory locations.

Type: Grant

Filed: March 2, 2021

Date of Patent: June 27, 2023

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Tong Chen, John Kevin O'Brien, Daniel A. Prener, Zehra N. Sura
Loop execution control for a multi-threaded, self-scheduling reconfigurable computing fabric using a reenter queue

Patent number: 11675598

Abstract: Representative apparatus, method, and system embodiments are disclosed for configurable computing. A representative system includes an interconnection network; a processor; and a plurality of configurable circuit clusters. Each configurable circuit cluster includes a plurality of configurable circuits arranged in an array; a synchronous network coupled to each configurable circuit of the array; and an asynchronous packet network coupled to each configurable circuit of the array.

Type: Grant

Filed: March 15, 2022

Date of Patent: June 13, 2023

Assignee: Micron Technology, Inc.

Inventor: Tony M. Brewer
Loop thread order execution control of a multi-threaded, self-scheduling reconfigurable computing fabric

Patent number: 11675734

Abstract: Representative apparatus, method, and system embodiments are disclosed for configurable computing. A representative system includes an interconnection network; a processor; and a plurality of configurable circuit clusters. Each configurable circuit cluster includes a plurality of configurable circuits arranged in an array; a synchronous network coupled to each configurable circuit of the array; and an asynchronous packet network coupled to each configurable circuit of the array.

Type: Grant

Filed: March 4, 2022

Date of Patent: June 13, 2023

Assignee: Micron Technology, Inc.

Inventor: Tony M. Brewer
Method and apparatus for optimizing code for field programmable gate arrays

Patent number: 11656857

Abstract: A method for the generation of a hardware accelerator (20) is described. The method comprises inputting (110) a program (105) with a plurality of lines of code describing an algorithm to be implemented on the hardware accelerator (20) and generating (125) a dataflow graph in memory from the inputted program (105). The dataflow graph is optimized and an output program (140) created from the dataflow graph is output. The output program (140) is then provided to a high-level synthesis tool for generating the hardware accelerator (20).

Type: Grant

Filed: August 9, 2019

Date of Patent: May 23, 2023

Assignee: INESC TEC—Instituto de Engenharia de Sistemas

Inventors: Afonso Soares Canas Ferreira, João Manuel Paiva Cardoso
System to analyze and enhance software based on graph attention networks

Patent number: 11640295

Abstract: Systems, apparatuses and methods may provide for technology that generates a dependence graph based on a plurality of intermediate representation (IR) code instructions associated with a compiled program code, generates a set of graph embedding vectors based on the plurality of IR code instructions, and determines, via a neural network, one of an analysis of the compiled program code or an enhancement of the program code based on the dependence graph and the set of graph embedding vectors. The technology may provide a graph attention neural network that includes a recurrent block and at least one task-specific neural network layer, the recurrent block including a graph attention layer and a transition function. The technology may also apply dynamic per-position recurrence-halting to determine a number of recurring steps for each position in the recurrent block based on adaptive computation time.

Type: Grant

Filed: June 26, 2020

Date of Patent: May 2, 2023

Assignee: Intel Corporation

Inventors: Mariano Tepper, Bryn Keller, Mihai Capota, Vy Vo, Nesreen Ahmed, Theodore Willke
Dataflow optimization apparatus and method for low-power operation of multicore systems

Patent number: 11635997

Abstract: The present disclosure relates to a dataflow optimization method for low-power operation of a multicore system, the dataflow optimization method including: a step (a) of creating an FSM including a plurality of system states in consideration of dynamic factors that trigger a transition in system states for original dataflow; and a step (b) of optimizing the original dataflow through optimization of the created FSM.

Type: Grant

Filed: July 12, 2019

Date of Patent: April 25, 2023

Assignee: AJOU UNIVERSITY INDUSTRY-ACADEMIC COOPERATION FOUNDATION

Inventors: Hoeseok Yang, Hyeonseok Jung
Analysis for modeling data cache utilization

Patent number: 11630654

Abstract: Aspects include modeling data cache utilization for each loop in a loop nest; estimating total data cache lines fetched in one iteration of the loop; and determining the possibility of data cache reuse across loop iterations using data cache lines fetched and associativity constraints. Aspects also include estimating, for memory reference pairs, reuse by one reference of data cache line fetched by another; estimating total number of cache misses for all iterations of the loop; and estimating total number of cache misses of a reference for iterations of a next outer loop as equal to total cache misses for an entire inner loop. Aspects further include estimating memory cost of a loop unroll and jam transformation, without performing the transformation; and extending a data cache model to estimate best unroll-and-jam factors for the loop nest, capable of minimizing total cache misses incurred by the memory references in the loop body.

Type: Grant

Filed: August 19, 2021

Date of Patent: April 18, 2023

Assignee: International Business Machines Corporation

Inventors: Wai Hung Tsang, Prithayan Barua, Ettore Tiotto, Bardia Mahjour, Jun Shirako
Method and system for protecting data processed by data processing accelerators

Patent number: 11609766

Abstract: According to one embodiment, a data processing system performs a secure boot using a security module (e.g., a trusted platform module (TPM)) of a host system. The system verifies that an operating system (OS) and one or more drivers including an accelerator driver associated with a data processing (DP) accelerator is provided by a trusted source. The system launches the accelerator driver within the OS. The system generates a trusted execution environment (TEE) associated with one or more processors of the host system. The system launches an application and a runtime library within the TEE, where the application communicates with the DP accelerator via the runtime library and the accelerator driver.

Type: Grant

Filed: January 4, 2019

Date of Patent: March 21, 2023

Assignees: BAIDU USA LLC, BAIDU.COM TIMES TECHNOLOGY (BEIJING) CO., LTD., KUNLUNXIN TECHNOLOGY (BEIJING) COMPANY LIMITED

Inventors: Yueqiang Cheng, Yong Liu, Tao Wei, Jian Ouyang
Method and system for identification of redundant function-level slicing calls

Patent number: 11579851

Abstract: This disclosure relates generally to the field of source code processing, and, more particularly to a method and system for identification of redundant function-level slicing calls. The method disclosed generates program dependence graphs (PDGs) based on a slicing criteria and a function corresponding to the function-level slicing call. Further the method classifies the function-level slicing call into redundant or non-redundant by traversing the PDGs and checking if a predefined condition is satisfied or not. The function-level slicing calls are classified as redundant if the check is not satisfied and are classified as non-redundant if the check is satisfied. The disclosed method can be used in identifying redundant function-level slicing calls in applications such as automated false positive elimination (AFPE), automated test case generation and so on.

Type: Grant

Filed: September 20, 2021

Date of Patent: February 14, 2023

Assignee: Tata Consultancy Services Limited

Inventor: Tukaram Bhagwat Muske
Information processing apparatus, computer-readable recording medium storing compiling program, and compiling method

Patent number: 11579853

Abstract: An information processing apparatus includes a processor configured to: for each of a plurality of loops, acquire loop information including a number of variables, a number of registers, a number of memory commands for inputting and outputting a value of the variable between the register and a main storage device, and a number of arithmetic commands for the value of the variable stored in the register, which are used in the loop; calculate the number of variables, the number of registers, the number of memory commands, and the number of arithmetic commands, which correspond to a combination of the loops that are candidates for loop fusion, for each of the combinations of the loops; determine a combination to which the loop fusion is to be applied among the combinations which are calculated for each of the combinations; and execute the loop fusion on the determined combination.

Type: Grant

Filed: November 18, 2021

Date of Patent: February 14, 2023

Assignee: FUJITSU LIMITED

Inventor: Tomoko Nikko
Mapping convolution to a channel convolution engine

Patent number: 11537865

Abstract: A processor system comprises a first and second group of registers and a hardware channel convolution processor unit. The first group of registers is configured to store data elements of channels of a portion of a convolution data matrix. Each register stores at least one data element from each channel. The second group of registers is configured to store data elements of convolution weight matrices including a separate convolution weight matrix for each channel. Each register stores at least one data element from each convolution weight matrix. The hardware channel convolution processor unit is configured to multiply each data element in the first group of registers with a corresponding data element in the second group of registers and sum together the multiplication results for each specific channel to determine corresponding channel convolution result data elements in a corresponding channel convolution result matrix.

Type: Grant

Filed: February 18, 2020

Date of Patent: December 27, 2022

Assignee: Meta Platforms, Inc.

Inventors: Krishnakumar Narayanan Nair, Rakesh Komuravelli, Abdulkadir Utku Diril, Ehsan Khish Ardestani Zadeh, Yuchen Hao, Martin Schatz, Thomas Mark Ulrich, Olivia Wu, Anup Ramesh Kadkol, Amin Firoozshahian
Offloading server and offloading program

Patent number: 11403083

Abstract: An offloading server includes: a data transfer designation section configured to analyze reference relationships of variables used in loop statements in an application and designate, for data that can be transferred outside a loop, a data transfer using an explicit directive that explicitly specifies a data transfer outside the loop; a parallel processing designation section configured to identify loop statements in the application and specify a directive specifying application of parallel processing by an accelerator and perform compilation for each of the loop statements; and a parallel processing pattern creation section configured to exclude loop statements causing a compilation error from loop statements to be offloaded and create a plurality of parallel processing patterns each of which specifies whether to perform parallel processing for each of the loop statements not causing a compilation error.

Type: Grant

Filed: June 3, 2019

Date of Patent: August 2, 2022

Assignee: Nippon Telegraph and Telephone Corporation

Inventors: Yoji Yamato, Hirofumi Noguchi, Misao Kataoka, Takuma Isoda, Tatsuya Demizu
Method, a device, and a computer program product for determining a resource required for executing a code segment

Patent number: 11354159

Abstract: A method comprises: compiling the code segment with a compiler; and determining, based on an intermediate result of the compiling, a resource associated with a dedicated processing unit and for executing the code segment. As such, the resource required for executing a code segment may be determined quickly without actually executing the code segment and allocating or releasing the resource, which helps subsequent resource allocation and further brings about a better user experience.

Type: Grant

Filed: August 14, 2019

Date of Patent: June 7, 2022

Assignee: EMC IP Holding Company LLC

Inventors: Jinpeng Liu, Pengfei Wu, Junping Zhao, Kun Wang
Processing method, device, equipment and storage medium of loop instruction

Patent number: 11340903

Abstract: The present application discloses a processing method, device, equipment and storage medium of a loop instruction, and relates to the fields of voice and chips. A specific embodiment is: acquiring a computer program including a first loop body, where the first loop body is generated according to a second loop body in a software code to be compiled, the first loop body includes a plurality of first loop instructions, the plurality of first loop instructions can be identified by a hardware structure of a computer device; in the case that the first loop body is detected, determining loop parameters of the first loop body according to the plurality of first loop instructions; acquiring the plurality of first loop instructions according to the loop parameters of the first loop body; executing the plurality of first loop instructions.

Type: Grant

Filed: March 23, 2021

Date of Patent: May 24, 2022

Assignee: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventors: Junhui Wen, Chao Tian
Speculative caching in a content delivery network

Patent number: 11272028

Abstract: A server in a content delivery (CD) network that distributes content on behalf of one or more subscribers. Responsive to a request from a client for a particular resource, if the particular resource is already in a cache on the server, serving the particular resource to the client from the cache; otherwise if the particular resource is not already cached on the server, when a count value exceeds a first threshold value, obtaining, caching, and serving the particular resource. When the count value is less than a second threshold value, obtaining and serving the particular resource. When the count value is: (i) not less than the second threshold value, and (ii) not greater than the first threshold value, then obtaining the particular resource and selectively caching the particular resource; and serving the particular resource to the client.
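The three-way decision in this abstract, applied on a cache miss, can be sketched as a small function. The name `should_cache` and the caller-supplied `selective_policy` callback are assumptions for illustration; the abstract does not say how the selective caching choice is made in the middle band.

```python
from typing import Callable


def should_cache(
    count: int,
    first_threshold: int,
    second_threshold: int,
    selective_policy: Callable[[], bool] = lambda: False,
) -> bool:
    """Decide whether a resource fetched on a cache miss should be cached.

    The resource is served to the client in every case; only the caching
    decision differs.
    """
    if count > first_threshold:
        return True   # popular enough: obtain, cache, and serve
    if count < second_threshold:
        return False  # rarely requested: obtain and serve only
    # Between the thresholds: defer to a policy (hypothetical hook).
    return selective_policy()
```

A caller could pass a probabilistic or capacity-aware policy for the middle band without changing the two clear-cut cases.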

Type: Grant

Filed: April 29, 2020

Date of Patent: March 8, 2022

Assignee: Level 3 Communications, LLC

Inventors: Daniel Lee Jensen, William Crowder, Christopher Newton, William R. Power
Determining when to perform and performing runtime binary slimming

Patent number: 11221835

Abstract: One or more execution traces of an application are accessed. The one or more execution traces have been collected at a basic block level. Basic blocks in the one or more execution traces are scored. Scores for the basic blocks represent benefits of performing binary slimming at the corresponding basic blocks. Runtime binary slimming is performed of the application based on the scores of the basic blocks.

Type: Grant

Filed: February 10, 2020

Date of Patent: January 11, 2022

Assignee: International Business Machines Corporation

Inventors: Michael Vu Le, Ian Michael Molloy, Taemin Park
System and/or method for error compensation in mechanical transmissions

Patent number: 11148287

Abstract: The system can include a set of joints, a controller, and a model engine; and can optionally include a support structure and an end effector. Joints can include: a motor, a transmission mechanism, an input sensor, and an output sensor. The system can enable articulation of the plurality of joints.

Type: Grant

Filed: April 13, 2021

Date of Patent: October 19, 2021

Assignee: Orangewood Labs Inc.

Inventors: Abhinav Kumar, Aditya Bhatia, Akash Bansal, Anubhav Singh, Ashutosh Prakash, Aman Malhotra, Harshit Gaur, Prasang Srivasatava, Ashish Chauhan
Loop-oriented neural network compilation

Patent number: 11144291

Abstract: Methods of accelerating the execution of neural networks are disclosed. A description of a neural network may be received. A plurality of operators may be identified based on the description of the neural network. A plurality of symbolic models associated with the plurality of operators may be generated. For each symbolic model, a nested loop associated with an operator may be identified, a loop order may be defined, and a set of data dependencies may be defined. A set of inter-operator dependencies may be extracted based on the description of the neural network. The plurality of symbolic models and the set of inter-operator dependencies may be analyzed to identify a combinable pair of nested loops. The combinable pair of nested loops may be combined to form a combined nested loop.

Type: Grant

Filed: November 27, 2019

Date of Patent: October 12, 2021

Assignee: Amazon Technologies, Inc.

Inventors: Hongbin Zheng, Preston Pengra Briggs, Tobias Joseph Kastulus Edler von Koch, Taemin Kim, Randy Renfu Huang
Systems and methods for optimizing nested loop instructions in pipeline processing stages within a machine perception and dense algorithm integrated circuit

Patent number: 11061678

Abstract: In one embodiment, a method for improving a performance of an integrated circuit includes implementing one or more computing devices executing a compiler program that: (i) evaluates a target instruction set intended for execution by an integrated circuit; (ii) identifies one or more nested loop instructions within the target instruction set based on the evaluation; (iii) evaluates whether a most inner loop body within the one or more nested loop instructions comprises a candidate inner loop body that requires a loop optimization that mitigates an operational penalty to the integrated circuit based on one or more executional properties of the most inner loop instruction; and (iv) implements the loop optimization that modifies the target instruction set to include loop optimization instructions to control, at runtime, an execution and a termination of the most inner loop body thereby mitigating the operational penalty to the integrated circuit.

Type: Grant

Filed: December 18, 2020

Date of Patent: July 13, 2021

Assignee: quadric.io Inc

Inventors: Nigel Drego, Mrinalini Ravichandran, Jianman Chang, Daniel Firu, Veerbhan Kheterpal
Flexible optimized data handling in systems with multiple memories

Patent number: 10996989

Abstract: Methods and systems for optimizing an application for a computing system having multiple distinct memory locations that are interconnected by one or more communication channels include determining one or more data handling properties for a data region in an application. One or more data handling policies for the data region are determined based on the one or more data handling properties. Data setup costs are determined for a scope in the application that uses the data region in different memory locations based on the one or more data handling properties. The application is optimized in accordance with the one or more data handling policies and the data setup costs for the different memory locations.

Type: Grant

Filed: June 13, 2016

Date of Patent: May 4, 2021

Assignee: International Business Machines Corporation

Inventors: Tong Chen, John Kevin O'Brien, Daniel A. Prener, Zehra N. Sura
Statelessly populating data stream into successive files

Patent number: 10901944

Abstract: Storing an incoming data stream using successive files that are consecutively populated. The appropriate file to populate a given data stream portion into is determined by mapping the data stream offset to a file, and potentially also an address within that file. The successive files may be the same size, so that the file can be identified based on the data stream address (or offset) without the use of an index. Furthermore, the files may be easily named by having that size be some multiple of a binary power of bytes. That way, the files themselves can be automatically named and identified by using the more significant bit or bits of the data stream offset to uniquely identify the file and establish ordering of the files. Replication may occur from a primary to a secondary store by transmitting the offset, and the actual data to be stored.

Type: Grant

Filed: May 24, 2017

Date of Patent: January 26, 2021

Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventors: Rogerio Ramos, Fayssal Martani, Cristian Diaconu, Karthick Krishnamoorthy, Jacob R. Lorch
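
The index-free mapping described in the abstract of patent 10901944 above can be sketched as follows. This is only an illustration under stated assumptions: the 1 MiB file size, the `.dat` suffix, the hexadecimal naming, and the function name `locate` are all hypothetical and do not come from the patent.

```python
from typing import Tuple

FILE_SIZE_BITS = 20                 # assumption: each file holds 2**20 bytes (1 MiB)
FILE_SIZE = 1 << FILE_SIZE_BITS


def locate(stream_offset: int) -> Tuple[str, int]:
    """Map a data stream offset to (file name, offset within that file)."""
    if stream_offset < 0:
        raise ValueError("stream offset must be non-negative")
    file_index = stream_offset >> FILE_SIZE_BITS       # more significant bits pick the file
    offset_in_file = stream_offset & (FILE_SIZE - 1)   # remaining low bits address within it
    # Zero-padded hex names sort in the same order as the file indices.
    return f"{file_index:012x}.dat", offset_in_file
```

Because the file size is a power of two, the mapping is a shift and a mask, so any writer or replica can compute the target file from the offset alone without consulting shared state.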
Resilient programming frameworks for iterative computations

Patent number: 10831616

Abstract: An information processing system, computer readable storage medium, and method for supporting resilient execution of computer programs. A method provides a resilient store wherein information in the resilient store can be accessed in the event of a failure. The method periodically checkpoints application state in the resilient store. A resilient executor comprises software which executes applications by catching failures. The method uses the resilient executor to execute at least one application. In response to the resilient executor detecting a failure, the method restores application state information to the at least one application from a checkpoint stored in the resilient store, and the resilient executor resumes execution of the at least one application with the restored application state information.

Type: Grant

Filed: April 18, 2019

Date of Patent: November 10, 2020

Assignee: International Business Machines Corporation

Inventors: Arun Iyengar, Joshua J. Milthorpe
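
The checkpoint-and-restore loop described in the abstract of patent 10831616 above can be sketched roughly as follows. Everything here is a hypothetical illustration: `ResilientStore` is an in-memory stand-in for a store that would survive failures, failures are modeled as `RuntimeError`, and the state layout (a dict with an iteration counter `i`) is an assumption.

```python
import copy


class ResilientStore:
    """In-memory stand-in (assumption) for a store that survives failures."""

    def __init__(self):
        self._checkpoint = None

    def save(self, state):
        self._checkpoint = copy.deepcopy(state)

    def load(self):
        return copy.deepcopy(self._checkpoint)


def run_resilient(step, state, iterations, store, checkpoint_every=2):
    """Run `step` per iteration; on failure, restore the last checkpoint and resume."""
    store.save(state)                       # initial checkpoint
    while state["i"] < iterations:
        try:
            step(state)
            state["i"] += 1
            if state["i"] % checkpoint_every == 0:
                store.save(state)           # periodic checkpoint
        except RuntimeError:
            state = store.load()            # discard partial work, resume from checkpoint
    return state


def make_flaky_step(fail_at, failures=1):
    """Build a hypothetical step that fails `failures` times at iteration `fail_at`."""
    remaining = {"n": failures}

    def step(state):
        if state["i"] == fail_at and remaining["n"] > 0:
            remaining["n"] -= 1
            raise RuntimeError("simulated failure")
        state["total"] += state["i"]

    return step
```

Calling `run_resilient(make_flaky_step(3), {"i": 0, "total": 0}, 5, ResilientStore())` replays the iterations since the last checkpoint after the simulated failure and still finishes with the same result as a failure-free run.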
Resilient programming frameworks for iterative computations on computer systems

Patent number: 10831617

Abstract: An information processing system, computer readable storage medium, and method for supporting resilient execution of computer programs. A method provides a resilient store wherein information in the resilient store can be accessed in the event of a failure. The method periodically checkpoints application state in the resilient store. A resilient executor comprises software which executes applications by catching failures. The method uses the resilient executor to execute at least one application. In response to the resilient executor detecting a failure, the method restores application state information to the at least one application from a checkpoint stored in the resilient store, and the resilient executor resumes execution of the at least one application with the restored application state information.

Type: Grant

Filed: April 23, 2019

Date of Patent: November 10, 2020

Assignee: International Business Machines Corporation

Inventors: Arun Iyengar, Joshua J. Milthorpe
Method of estimating program speed-up in highly parallel architectures using static analysis

Patent number: 10754744

Abstract: The amount of speed-up that can be obtained by optimizing the program to run on a different architecture is determined by static measurements of the program. Multiple such static measurements are processed by a machine learning system after being discretized to alter their accuracy vs precision. Static analysis requires less analysis overhead and permits analysis of program portions to optimize allocation of porting resources on a large program.

Type: Grant

Filed: March 15, 2016

Date of Patent: August 25, 2020

Assignee: Wisconsin Alumni Research Foundation

Inventors: Karthikeyan Sankaralingam, Newsha Ardalani, Urmish Thakker
Compiler optimization of coroutines

Patent number: 10747511

Abstract: As a memory usage optimization, a compiler identifies coroutines whose activation frames can be allocated on a caller's stack instead of allocating the frame on the heap. For example, when the compiler determines that a coroutine C's life cannot extend beyond the life of the routine R that first calls the coroutine C, the compiler generates code to allocate the activation frame for C on the stack of R, instead of generating code to allocate C's frame from heap memory. In some cases, as another optimization, code for coroutine C is also inlined with code for the routine R that calls C. Coroutine activation frame content variations and layout variations are also described.

Type: Grant

Filed: June 26, 2015

Date of Patent: August 18, 2020

Assignee: Microsoft Technology Licensing, LLC

Inventors: James J. Radigan, Gor Nishanov
Recompiling GPU code based on spill/fill instructions and number of stall cycles

Patent number: 10698689

Abstract: An apparatus to facilitate register sharing is disclosed. The apparatus includes one or more processors to generate first machine code having a first General Purpose Register (GRF) per thread ratio, detect an occurrence of one or more spill/fill instructions in the first machine code, and generate second machine code having a second GRF per thread ratio upon a detection of one or more spill/fill instructions in the first machine code, wherein the second GRF per thread ratio is based on a disabling of a first of a plurality of hardware threads.

Type: Grant

Filed: September 1, 2018

Date of Patent: June 30, 2020

Assignee: Intel Corporation

Inventors: Pratik J. Ashar, Supratim Pal, Subramaniam Maiyuran, Wei-Yu Chen, Guei-Yuan Lueh
Method and system for self-optimizing path-based object allocation tracking

Patent number: 10691575

Abstract: A system and method for the efficient monitoring of memory allocations performed during code execution are presented. The proposed approach analyzes the code to build a control flow graph that describes all possible execution sequences of the code. Individual execution paths are identified by an analysis of the control path, and memory allocation counters representing the memory allocations of each execution path are placed in the code. In addition to data describing memory allocations, the memory allocation counters also provide execution frequency data for the execution paths. The execution frequency data is used to identify the path with the highest execution frequency. The position of the memory allocation counters is further adapted with the optimization goal that the path with the highest execution frequency triggers the least number of memory allocation counter increments.

Type: Grant

Filed: October 2, 2018

Date of Patent: June 23, 2020

Assignee: Dynatrace LLC

Inventors: Philipp Lengauer, Stefan Fitzek
Loop break

Patent number: 10628142

Abstract: In the described examples, a non-transitory machine-readable medium includes a compiler that detects a soft-break indicator in a loop included in source code and the compiler applies software pipelining to generate compiled code for the loop. The compiled code includes assembly instructions and the soft-break indicator enables the compiler to arrange the assembly instructions to complete in-flight iterations of the loop after execution of the soft-break.

Type: Grant

Filed: July 20, 2017

Date of Patent: April 21, 2020

Assignee: TEXAS INSTRUMENTS INCORPORATED

Inventor: Jesse Gregory Villarreal, Jr.
Compiler transformation with loop and data partitioning

Patent number: 10628141

Abstract: Logic may transform a target code to partition data automatically and/or autonomously based on a memory constraint associated with a resource such as a target device. Logic may identify a tag in the code to identify a task, wherein the task comprises at least one loop, the loop to process data elements in one or more arrays. Logic may automatically generate instructions to determine one or more partitions for the at least one loop to partition data elements, accessed by one or more memory access instructions for the one or more arrays within the at least one loop, based on a memory constraint, the memory constraint to identify an amount of memory available for allocation to process the task. Logic may determine one or more iteration space blocks for the parallel loops, determine memory windows for each block, copy data into and out of constrained memory, and transform array accesses.

Type: Grant

Filed: May 7, 2018

Date of Patent: April 21, 2020

Assignee: INTEL CORPORATION

Inventors: Rakesh Krishnaiyer, Konstantin Bobrovskii, Dmitry Budanov
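
The last sentence of the abstract of patent 10628141 above (iteration space blocks, memory windows, copy in and out, transformed array accesses) can be illustrated with a small hand-written sketch. The function name, the element-count form of the memory constraint, and the scaling loop are hypothetical; the patent describes a compiler generating such code automatically, which this sketch does not attempt.

```python
def scale_in_blocks(a, factor, memory_limit):
    """Compute out[i] = a[i] * factor using at most `memory_limit` elements at a time."""
    if memory_limit <= 0:
        raise ValueError("memory_limit must be positive")
    out = [0] * len(a)
    # Each iteration space block covers indices [start, end).
    for start in range(0, len(a), memory_limit):
        end = min(start + memory_limit, len(a))
        window = a[start:end]            # copy the memory window into constrained memory
        for k in range(len(window)):
            window[k] *= factor          # access a[i] becomes window[i - start]
        out[start:end] = window          # copy the results back out
    return out
```

For example, `scale_in_blocks([1, 2, 3, 4, 5], 3, 2)` processes the array in blocks of two, two, and one element.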
Sharing dynamic variables in a high availability environment

Patent number: 10592217

Abstract: Methods and systems are provided that utilize compiler technology in identifying changed critical variables in work assignment code that cause synchronization issues between a master system and another server. The identified changed critical variables are shared by the master server in a high availability environment. In general, the sharing of changed critical variables includes sending, via a master system, changed code or critical variables to a receiving system. The receiving system can implement the changed code or critical variables to maintain synchronization with the master system.

Type: Grant

Filed: October 10, 2013

Date of Patent: March 17, 2020

Assignee: AVAYA INC.

Inventor: Robert C. Steiner
Apparatus and method for processing sparse data

Patent number: 10437562

Abstract: An apparatus and method are described for designing an accelerator for processing sparse data. For example, one embodiment comprises a machine-readable medium having program code stored thereon which, when executed by a processor, causes the processor to perform the operations of: analyzing input graph program code and parameters associated with a target accelerator in view of an accelerator architecture template; responsively mapping the parameters onto the architecture template to implement customizations to the accelerator architecture template; and generating a hardware description representation of the target accelerator based on the determined mapping of the parameters to apply to the accelerator architecture template.

Type: Grant

Filed: December 30, 2016

Date of Patent: October 8, 2019

Assignee: Intel Corporation

Inventors: Eriko Nurvitadhi, Yu Wang, Deborah T. Marr
Agile communication operator

Patent number: 10423391

Abstract: A high level programming language provides an agile communication operator that generates a segmented computational space for distributing the computational space across compute nodes. The agile communication operator decomposes the computational space into segments, causes the segments to be assigned to compute nodes, and allows the user to centrally manage and automate movement of the segments between the compute nodes. The segment movement may be managed using either a full global-view representation or a local-global-view representation of the segments.

Type: Grant

Filed: July 19, 2016

Date of Patent: September 24, 2019

Assignee: Microsoft Technology Licensing, LLC

Inventor: Paul F. Ringseth
Methods and apparatus to map single static assignment instructions onto a data flow graph in a data flow architecture

Patent number: 10346144

Abstract: Methods, apparatus, systems and articles of manufacture to map a set of instructions onto a data flow graph are disclosed herein. An example apparatus includes a variable handler to modify a variable in the set of instructions. The variable is used multiple times in the set of instructions and the set of instructions are in a static single assignment form. The apparatus also includes a PHI handler to replace a PHI instruction contained in the set of instructions with a set of control data flow instructions and a data flow graph generator to map the set of instructions modified by the variable handler and the PHI handler onto a data flow graph without transforming the instructions out of the static single assignment form.

Type: Grant

Filed: September 29, 2017

Date of Patent: July 9, 2019

Assignee: INTEL CORPORATION

Inventors: Yongzhi Zhang, Kent D. Glossop
Performing a compiler optimization pass as a transaction

Patent number: 10289395

Abstract: Embodiments described herein provide a solution for optimizing a compiling of program code. A proposed state pointer, which corresponds to a current state pointer to a current state node that represents a section of the program code, is added in an intermediate language (IL) representation of the program code. When the optimizing compiler determines that an optimization should be made to a section of code, the current state node is copied to create a proposed state node, which is then referenced by the proposed state pointer. The proposed state node is edited to include the optimization while the current state node remains unchanged. The success of the optimization is evaluated, and an updated IL representation is generated in which any references to nodes that are no longer included in the flow of the former IL representation are removed.

Type: Grant

Filed: October 17, 2017

Date of Patent: May 14, 2019

Assignee: International Business Machines Corporation

Inventor: Irwin D'Souza
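
The copy, edit, evaluate, and commit sequence in the abstract of patent 10289395 above can be sketched as follows. The classes, the list-of-strings stand-in for an IL section, and the two example callbacks are hypothetical illustrations, not the patent's data structures.

```python
import copy


class StateNode:
    """Stand-in (assumption) for a state node holding one section of IL."""

    def __init__(self, ops):
        self.ops = list(ops)


class Section:
    def __init__(self, ops):
        self.current = StateNode(ops)   # current state pointer
        self.proposed = None            # proposed state pointer


def try_optimize(section, transform, is_improvement):
    """Apply `transform` to a copy; commit only if `is_improvement` accepts it."""
    section.proposed = copy.deepcopy(section.current)
    transform(section.proposed)                      # current node stays unchanged
    committed = is_improvement(section.current, section.proposed)
    if committed:
        section.current = section.proposed           # commit the transaction
    section.proposed = None                          # drop the proposal either way
    return committed


def drop_nops(node):
    """Example transform: remove no-op instructions."""
    node.ops = [op for op in node.ops if op != "nop"]


def fewer_ops(current, proposed):
    """Example success criterion: the proposal is shorter."""
    return len(proposed.ops) < len(current.ops)
```

Because the edit happens on a copy, a rejected optimization needs no undo logic: the proposed node is simply discarded.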
Systems and methods for approximation based optimization of data processors

Patent number: 10209971

Abstract: A compilation system can apply a smoothness constraint to the arguments of a compute-bound function invoked in a software program, to ensure that the value(s) of one or more function arguments are within specified respective threshold(s) from selected nominal value(s). If the constraint is satisfied, the function invocation is replaced with an approximation thereof. The smoothness constraint may be determined for a range of value(s) of function argument(s) so as to determine a neighborhood within which the function can be replaced with an approximation thereof. The replacement of the function with an approximation thereof can facilitate simultaneous optimization of computation accuracy, performance, and energy/power consumption.

Type: Grant

Filed: April 29, 2015

Date of Patent: February 19, 2019

Assignee: Reservoir Labs, Inc.

Inventors: Muthu M. Baskaran, Thomas Henretty, Ann Johnson, Athanasios Konstantinidis, M. H. Langston, Richard A. Lethin, Janice O. McMahon, Benoit J. Meister, Paul Mountcastle
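
The idea in the abstract of patent 10209971 above, replacing a function with an approximation when its argument lies within a threshold of a nominal value, can be illustrated with a runtime guard. This is a sketch under assumptions: the choice of `exp`, the nominal value, the threshold, and the second-order Taylor approximation are all hypothetical, and the patent describes a compilation system rather than a hand-written wrapper.

```python
import math

NOMINAL = 0.0       # assumed nominal argument value
THRESHOLD = 0.1     # assumed half-width of the neighborhood around NOMINAL
_EXP_NOMINAL = math.exp(NOMINAL)


def exp_approx(x):
    """Second-order Taylor expansion of exp around NOMINAL."""
    d = x - NOMINAL
    return _EXP_NOMINAL * (1.0 + d + 0.5 * d * d)


def guarded_exp(x):
    """Use the approximation inside the neighborhood, the exact function outside it."""
    if abs(x - NOMINAL) <= THRESHOLD:
        return exp_approx(x)
    return math.exp(x)
```

Widening `THRESHOLD` trades accuracy for more frequent use of the cheaper path, which is the kind of accuracy and performance trade-off the abstract refers to.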
System for improved parallelization of program code

Patent number: 10140097

Abstract: A system is provided in which a human annotation, undertaken for direct implementation of parallelization measures, is used for training an adaptive automatic classification method, which is then applied automatically to code blocks to be analyzed, wherein further suitable patterns obtained by human review from the automatically analyzed code blocks may then be used in turn for continuous improvement of the adaptive automatic classification method.

Type: Grant

Filed: April 2, 2015

Date of Patent: November 27, 2018

Assignee: SIEMENS AKTIENGESELLSCHAFT

Inventor: Wolfgang Mauerer
De-obfuscating scripted language for network intrusion detection using a regular expression signature

Patent number: 10089464

Abstract: A device receives data, identifies a context associated with the data, and identifies a script, within the data, associated with the context. The device parses the script to identify tokens, forms nodes based on the tokens, and assembles a syntax tree using the nodes. The device renames one or more identifiers associated with the nodes and generates a normalized text, associated with the script, based on the syntax tree after renaming the one or more identifiers. The device determines whether the normalized text matches a regular expression signature and processes the data based on determining whether the normalized text matches the regular expression signature. The device processes the data by a first process when the normalized text matches the regular expression signature or by a second process, different from the first process, when the normalized text does not match the regular expression signature.

Type: Grant

Filed: August 15, 2016

Date of Patent: October 2, 2018

Assignee: Juniper Networks, Inc.

Inventor: Ankur Tyagi
