Including Loop Patents (Class 717/160)

Including scheduling instructions (Class 717/161)

Padding and suppressing rows and columns of data

Patent number: 12367147

Abstract: A method is described herein. The method generally includes receiving stream parameters that define an array, wherein the stream parameters include a first null element count and a second null element count. The method generally includes forming a stream of vectors for the multidimensional array responsive to the stream parameters. The stream of vectors generally includes a vector of null elements at a beginning of the stream of vectors based on the first null element count. The stream of vectors generally includes a null element at a beginning of each vector of the stream of vectors based on the second null element count. The stream of vectors generally includes a set of data distributed across a subset of the stream of vectors. The method generally includes providing the stream of vectors.
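A minimal sketch of the padding idea in this abstract, not the patented implementation. The function name `padded_stream`, the use of 0 as the null element, the one-row-per-vector layout, and the trailing fill are all assumptions made for illustration.

```python
from typing import List

NULL = 0  # assumed null element; the abstract does not say what value is used


def padded_stream(rows: List[List[int]], vec_len: int,
                  null_vectors: int, null_elems: int) -> List[List[int]]:
    """Form a stream of fixed-length vectors with leading null padding.

    null_vectors: first null element count (whole null vectors at the start).
    null_elems:   second null element count (nulls at the start of each vector).
    """
    # Whole vectors of nulls at the beginning of the stream.
    stream = [[NULL] * vec_len for _ in range(null_vectors)]
    for row in rows:
        # Nulls at the beginning of each data-carrying vector.
        vec = [NULL] * null_elems + row[: vec_len - null_elems]
        # Assumption: any unused lanes at the end are also filled with nulls.
        vec += [NULL] * (vec_len - len(vec))
        stream.append(vec)
    return stream


print(padded_stream([[1, 2], [3, 4]], vec_len=4, null_vectors=1, null_elems=1))
# → [[0, 0, 0, 0], [0, 1, 2, 0], [0, 3, 4, 0]]
```

Here one null vector stands in for a padded row and one leading null per vector for a padded column, which is how the title's "rows and columns" is read in this sketch.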

Type: Grant

Filed: February 6, 2023

Date of Patent: July 22, 2025

Assignee: Texas Instruments Incorporated

Inventors: Asheesh Bhardwaj, Burton Adrik Copeland, Elliott Gurrola, Tim Anderson, William Leven
Application optimization method and apparatus supporting the same

Patent number: 12229539

Abstract: Provided is an application optimization method and an electronic device supporting the same. According to an example embodiment, the application optimization method may include: determining whether a condition set with respect to a duration of an idle state of the electronic device is satisfied, selecting an application for which application optimization is to be performed based on an application usage record of a user of the electronic device in response to the set condition being satisfied, and generating an optimized application by performing the application optimization in the background for the selected application.

Type: Grant

Filed: June 23, 2022

Date of Patent: February 18, 2025

Assignee: SAMSUNG ELECTRONICS CO., LTD.

Inventors: Byungsoo Kwon, Kiljae Kim, Daehyun Cho
Generating source code adapted for implementation on FPGAs

Patent number: 12223301

Abstract: A method for generating source code includes: transforming a block diagram into an intermediate representation, wherein transforming the block diagram into the intermediate representation comprises transforming at least two blocks, wherein at least one loop results from transforming an operation block; identifying at least one candidate loop in the intermediate representation, wherein a loop body of a candidate loop comprises at least one instruction that accesses the array variable; identifying at least one parallelizable loop from the at least one candidate loop; determining build options for the at least one parallelizable loop and the array variable; inserting build pragmas based on the determined build options in the intermediate representation; and translating the intermediate representation into the source code.

Type: Grant

Filed: March 17, 2023

Date of Patent: February 11, 2025

Assignee: DSPACE GMBH

Inventors: Joerg Niere, Kingshuk Karuri, Pubali Mazumder
Splitting vector processing loops with an unknown trip count

Patent number: 12182552

Abstract: A computer-based technique for processing an application includes determining that a loop of the application includes a reference to a data item of a vector data type. The trip count of the loop is determined to be unknown. The loop is split into a first loop and a second loop based on a splitting factor. The second loop is unrolled.
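A rough illustration of loop splitting followed by unrolling, written by hand in Python rather than produced by a compiler. The constant `FACTOR`, the function `scaled_sum`, and the choice to put the leftover iterations in the first loop are assumptions; the abstract does not say which of the two loops carries the remainder.

```python
FACTOR = 4  # hypothetical splitting factor


def scaled_sum(values, scale):
    """Sum values[i] * scale with the loop split by FACTOR."""
    n = len(values)            # trip count is only known at run time
    remainder = n % FACTOR
    total = 0
    # First loop: the n % FACTOR leftover iterations, executed one at a time.
    for i in range(remainder):
        total += values[i] * scale
    # Second loop: the trip count is now a multiple of FACTOR,
    # so the body can be unrolled FACTOR times without bounds checks.
    for i in range(remainder, n, FACTOR):
        total += values[i] * scale
        total += values[i + 1] * scale
        total += values[i + 2] * scale
        total += values[i + 3] * scale
    return total


print(scaled_sum(list(range(10)), 2))  # → 90
```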

Type: Grant

Filed: May 24, 2022

Date of Patent: December 31, 2024

Assignee: Xilinx, Inc.

Inventor: Ajit K. Agarwal
Loop unrolling processing apparatus, method, and program

Patent number: 12093691

Abstract: A generation unit generates arithmetic expressions. Here, N denotes the number of looping times of the loop processing. L denotes a designated lower limit of unroll stage number. M denotes a designated upper limit of the unroll stage number. Q denotes a quotient obtained by dividing N by L. R denotes a remainder obtained by dividing N by L. The arithmetic expressions include an arithmetic expression that represents executing loop processing whose number of looping times is a quotient obtained by dividing R by (M-L), with the unroll stage number M when R-Q*(M-L)>0 is not satisfied, and then executing, when a remainder obtained by dividing R by (M-L) is other than 0, processing of one loop with sum of the remainder and L as the unroll stage number, and then executing loop processing with the unroll stage number L.

Type: Grant

Filed: February 14, 2020

Date of Patent: September 17, 2024

Assignee: NEC CORPORATION

Inventor: Yoshiyuki Ohno
Sequence partition based schedule optimization

Patent number: 12079632

Abstract: Sequence partition based schedule optimization is performed by generating a sequence and a schedule based on the sequence, dividing the sequence into a plurality of sequence partitions based on the schedule and the data dependency graph, each sequence partition including a portion of the plurality of instructions and a portion of the plurality of buffers, performing, for each sequence partition, a plurality of partition optimizing iterations, and merging the plurality of sequence partitions to produce a merged schedule.

Type: Grant

Filed: December 16, 2022

Date of Patent: September 3, 2024

Assignee: EDGECORTIX INC.

Inventors: Jens Huthmann, Sakyasingha Dasgupta, Nikolay Nez
Reducing computation in neural networks using self-modifying code

Patent number: 12073199

Abstract: In various implementations, provided are systems and methods for reducing neural network processing. A compiler may generate instructions from source code for a neural network having a repeatable set of operations. The instructions may include a plurality of blocks. The compiler may add an overwrite instruction to the plurality of blocks that, when executed by one or more execution engines, triggers an overwrite action. The overwrite action causes the instructions of subsequent blocks to be overwritten with NOP instructions. The overwrite action is triggered only when a condition is satisfied.
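The overwrite mechanism can be mimicked with a toy interpreter. This is only a sketch under stated assumptions: the instruction names `OVERWRITE_IF` and `NOP`, the function `run_blocks`, and the form of the condition (a callable that sees the operations executed so far) are hypothetical and do not come from the patent.

```python
def run_blocks(blocks, condition):
    """Execute blocks in order; an overwrite instruction may NOP out later blocks."""
    program = [list(block) for block in blocks]  # mutable copy of the instructions
    executed = []
    for idx, block in enumerate(program):
        for instr in block:
            if instr == "NOP":
                continue
            if instr == "OVERWRITE_IF":
                # The overwrite action fires only when the condition is satisfied.
                if condition(executed):
                    for later in range(idx + 1, len(program)):
                        program[later] = ["NOP"] * len(program[later])
                continue
            executed.append(instr)
    return executed


blocks = [["layer0", "OVERWRITE_IF"], ["layer1", "OVERWRITE_IF"], ["layer2"]]
print(run_blocks(blocks, lambda done: len(done) >= 2))
# → ['layer0', 'layer1']
```

Once the condition holds, the remaining blocks still occupy their slots in the program but do no work, which is the computation saving the abstract describes.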

Type: Grant

Filed: June 6, 2019

Date of Patent: August 27, 2024

Assignee: Amazon Technologies, Inc.

Inventors: Vignesh Vivekraja, Randy Renfu Huang, Yu Zhou, Ron Diamant, Richard John Heaton
Data processing apparatus and method for generating prefetches based on a nested prefetch pattern

Patent number: 12045618

Abstract: The invention provides a data processing apparatus and a data processing method for generating prefetches of data for use during execution of instructions by processing circuitry. The prefetches that are generated are based on a nested prefetch pattern. The nested prefetch pattern comprises a first pattern and a second pattern. The first pattern is defined by a first address offset between sequentially accessed addresses and a first observed number of the sequentially accessed addresses separated by the first address offset. The second pattern is defined by a second address offset between sequential iterations of the first pattern and a second observed number of the sequential iterations of the first pattern separated by the second address offset.
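As an illustration, the two-level pattern can be expanded into a list of addresses. The function name and parameter names below are hypothetical, and the second offset is assumed to be measured between the starting addresses of consecutive iterations of the first pattern.

```python
def nested_prefetch_addresses(base, first_offset, first_count,
                              second_offset, second_count):
    """Expand a nested (first pattern inside second pattern) access pattern."""
    addresses = []
    for outer in range(second_count):
        # Assumed: each iteration of the first pattern starts second_offset
        # bytes after the start of the previous iteration.
        start = base + outer * second_offset
        for inner in range(first_count):
            addresses.append(start + inner * first_offset)
    return addresses


print([hex(a) for a in nested_prefetch_addresses(0x1000, 8, 3, 0x100, 2)])
# → ['0x1000', '0x1008', '0x1010', '0x1100', '0x1108', '0x1110']
```

This corresponds to walking three 8-byte elements in each of two rows that are 256 bytes apart, the kind of access a two-deep loop nest produces.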

Type: Grant

Filed: March 23, 2021

Date of Patent: July 23, 2024

Assignee: Arm Limited

Inventors: Natalya Bondarenko, Stefano Ghiggini, Geoffray Matthieu Lacourba, Cédric Denis Robert Airaud
Methods, systems, and apparatuses to optimize partial flag updating instructions via dynamic two-pass execution in a processor

Patent number: 12039329

Abstract: Systems, methods, and apparatuses relating to circuitry to implement dynamic two-pass execution of a partial flag updating instruction in a processor are described.

Type: Grant

Filed: December 24, 2020

Date of Patent: July 16, 2024

Assignee: Intel Corporation

Inventors: Wing Shek Wong, Vikash Agarwal, Charles Vitu, Mihir Shah
Methods and apparatus to perform automatic compiler optimization to enable streaming-store generation for unaligned contiguous write access

Patent number: 12032934

Abstract: Methods, apparatus, systems and articles of manufacture (e.g., computer readable storage media) to perform automatic compiler optimization to enable streaming-store generation for unaligned contiguous write access are disclosed. Example apparatus disclosed herein are to mark a store instruction in source program code as a transformation candidate when the store instruction is associated with a group of memory accesses that are unaligned with respect to a size of a cache line in a cache. Disclosed apparatus are also to transform the store instruction that is marked as the transformation candidate to form transformed program code when a non-temporal property is satisfied, the transformed program code to replace the store instruction with (i) a write to a buffer in the cache and (ii) a streaming-store instruction that is to write contents of the buffer to memory.

Type: Grant

Filed: September 23, 2021

Date of Patent: July 9, 2024

Assignee: Intel Corporation

Inventors: Charles Yount, Rakesh Krishnaiyer, Timothy Creech, Daniel Woodworth, Joshua Cranmer
Database creation using table type information

Patent number: 12032563

Abstract: A system, method, and computer-readable medium for providing feedback on database instructions, identifying, for example, existing patterns and providing suggested replacement instructions. This may have the effect of improving the efficiency of instructions used to create and/or manipulate databases. According to some aspects, these and other benefits may be achieved by parsing received instructions into an organizational structure, traversing the organizational structure for known patterns, and suggesting replacement patterns. In implementation, this may be effected by receiving one or more sets of known patterns and corresponding replacement patterns, parsing received instructions, comparing the known patterns with the parsed instructions, and providing suggested replacement patterns based on one or more known patterns matching the parsed instructions. A benefit may include reducing Cartesian products during the merging of tables.

Type: Grant

Filed: January 10, 2023

Date of Patent: July 9, 2024

Assignee: Capital One Services, LLC

Inventors: Dennis J. Mire, Puneet Goyal, Siddharth Gupta, Srinivas Kumar, Deepak Sundararaj, Oron Hazi
Inconsistency-based bug detection

Patent number: 12008364

Abstract: A system identifies a pattern in source code. The pattern is identified based, at least in part, on correlation between units of code, or on the derivation of a rule from a recurring sequence in the source code. The system identifies a portion of the source code that at least partially matches the pattern, and determines that this portion includes at least one deviation from the pattern. The system then generates an error message to describe the deviation.

Type: Grant

Filed: June 24, 2021

Date of Patent: June 11, 2024

Assignee: Amazon Technologies Inc.

Inventors: Hangqi Zhao, Qiang Zhou, Sengamedu Hanumantha Rao Srinivasan
Systems and methods for processing information associated with a unified computation engine involving dynamic mapping and/or other features

Patent number: 11983158

Abstract: Systems and methods related to dynamically orchestrating execution of a protocol by generating one or more executors and one or more functions using a network controller based on the protocol are disclosed. In one example embodiment, there is provided a method of receiving a user-defined logic to allow the network controller to generate a dataset including a plurality of accumulators, abstract one or more executors and functions, and generate a set of protocols at the one or more functions to activate a respective application stage to fulfill the service request based on the logic in near real-time. Moreover, methods herein may include determining a telemetry and/or insight generation based, at least in part, on the dataset.

Type: Grant

Filed: October 26, 2022

Date of Patent: May 14, 2024

Assignee: PAYPAL, INC.

Inventors: Manimuthu Ayyannan, Deepak Mohanakumar Chandramouli, Karteek Reddy Chada
Compilation with caching of code analysis result

Patent number: 11941383

Abstract: Techniques to speed up code compilation may include caching code analysis results such that the analysis of subsequent code having a similar structure can be omitted. For example, a loop-nest construct in the code can be parsed, and an execution statement in the loop-nest construct can be analyzed by a compiler to generate an analysis result indicating a set of execution conditions for the execution statement. A lookup key can be generated from the control statements bounding the execution statement, and the analysis result can be stored with the lookup key in a cache entry of the cache. The execution statement is then modified according to the analysis result for optimization. Instead of having to analyze a subsequent execution statement bounded by the same control statements, the analysis result of the subsequent execution statement can be retrieved from the cache and be used to modify the subsequent execution statement.
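The caching scheme amounts to memoizing an analysis on the enclosing control statements. The sketch below assumes control statements are plain strings and uses a stand-in analysis; the class `AnalysisCache`, its method names, and the `holds(...)` result format are hypothetical.

```python
from typing import Dict, List, Tuple


class AnalysisCache:
    """Cache analysis results keyed by the control statements around a statement."""

    def __init__(self) -> None:
        self._entries: Dict[Tuple[str, ...], List[str]] = {}
        self.analysis_runs = 0  # counts how often the costly analysis ran

    def _analyze(self, controls: List[str]) -> List[str]:
        # Stand-in for the real analysis: one execution condition per control.
        self.analysis_runs += 1
        return [f"holds({c})" for c in controls]

    def conditions_for(self, controls: List[str]) -> List[str]:
        key = tuple(controls)          # lookup key built from the control statements
        if key not in self._entries:   # miss: analyze once and store the result
            self._entries[key] = self._analyze(controls)
        return self._entries[key]      # hit: reuse the stored result


cache = AnalysisCache()
nest = ["for i in range(8)", "if i % 2 == 0"]
first = cache.conditions_for(nest)    # analyzed
second = cache.conditions_for(nest)   # served from the cache
print(cache.analysis_runs)  # → 1
```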

Type: Grant

Filed: March 8, 2022

Date of Patent: March 26, 2024

Assignee: Amazon Technologies, Inc.

Inventors: Hongbin Zheng, Pushkar Ratnalikar
Synchronization instruction insertion method and apparatus

Patent number: 11934832

Abstract: This application discloses example synchronization instruction insertion methods and example apparatuses. One example method includes obtaining a first program block comprising one or more statements, where each of the one or more statements includes one or more function instructions. A first function instruction and a second function instruction between which data dependency exists in the first program block can then be determined. A synchronization instruction pair between a first statement including the first function instruction and a second statement including the second function instruction can then be inserted.
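A simplified sketch of inserting a synchronization pair around a data dependency. Everything concrete here is an assumption: instructions are modeled as `(name, reads, writes)` tuples, the dependency test is read-after-write only, and the marker names `sync_set` and `sync_wait` are hypothetical.

```python
def insert_sync_pair(block):
    """block: list of statements; each statement is a list of
    (name, reads, writes) tuples. Returns a new block with one sync pair
    inserted around the first read-after-write dependency found."""
    for i, producer in enumerate(block):
        written = set().union(*(writes for _, _, writes in producer))
        for j in range(i + 1, len(block)):
            read = set().union(*(reads for _, reads, _ in block[j]))
            if written & read:
                set_stmt = [("sync_set", set(), set())]
                wait_stmt = [("sync_wait", set(), set())]
                # Signal after the producing statement, wait before the consumer.
                return (block[: i + 1] + [set_stmt]
                        + block[i + 1: j] + [wait_stmt] + block[j:])
    return list(block)  # no dependency found: nothing to insert


program = [
    [("load", set(), {"x"})],
    [("add", {"a"}, {"b"})],
    [("store", {"x"}, set())],
]
names = [instr[0] for stmt in insert_sync_pair(program) for instr in stmt]
print(names)  # → ['load', 'sync_set', 'add', 'sync_wait', 'store']
```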

Type: Grant

Filed: December 21, 2021

Date of Patent: March 19, 2024

Assignee: Huawei Technologies Co., Ltd.

Inventors: Xiong Gao, Kun Zhang
Compiler-driven storage allocation of runtime values

Patent number: 11934876

Abstract: A compiler-implemented technique for performing a storage allocation is described. Computer code to be converted into machine instructions for execution on an integrated circuit device is received. Based on the computer code, a set of values that are to be stored on the integrated circuit device are determined. An interference graph that includes the set of values and a set of interferences is constructed. A number of possible placements and a number of blocked placements in a memory of the integrated circuit device are computed for each of the set of values. At least a portion of the set of values are assigned to a set of memory locations in the memory based on the numbers of possible placements and blocked placements, resulting in a set of memory location assignments.
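One way to picture placement driven by an interference graph is a most-constrained-first heuristic. This is an assumed heuristic chosen for illustration, not the patent's algorithm: each value takes one slot, "possible placements" is the slot count, and "blocked placements" are slots already held by interfering values. The function `allocate` is hypothetical.

```python
def allocate(values, interferences, num_slots):
    """Assign each value a slot so that interfering values never share one."""
    neighbors = {v: set() for v in values}
    for a, b in interferences:
        neighbors[a].add(b)
        neighbors[b].add(a)

    assignment = {}
    unassigned = set(values)

    def blocked(v):
        # Slots already taken by values that interfere with v.
        return {assignment[n] for n in neighbors[v] if n in assignment}

    while unassigned:
        # Pick the value with the fewest remaining placements (possible - blocked).
        v = min(sorted(unassigned), key=lambda u: num_slots - len(blocked(u)))
        free = [s for s in range(num_slots) if s not in blocked(v)]
        if not free:
            raise ValueError(f"no placement left for {v!r}")
        assignment[v] = free[0]
        unassigned.remove(v)
    return assignment


print(allocate(["a", "b", "c"], [("a", "b"), ("b", "c")], num_slots=2))
# → {'a': 0, 'b': 1, 'c': 0}
```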

Type: Grant

Filed: June 9, 2021

Date of Patent: March 19, 2024

Assignee: Amazon Technologies, Inc.

Inventor: Preston Pengra Briggs
Method and apparatus for real-time control loop application execution from a high-level description

Patent number: 11900076

Abstract: The present approach provides a method for safety-critical systems to reduce the required long V development and certification process, into a process that is up to 80% shorter, as well as safer. The present approach creates a pre-certified system, with both pre-certified hardware and pre-certified software. The pre-certified system may be configured to implement a safety-critical software compilation, that contains variables, operations, and template instantiations defining the safety-critical system. This approach eliminates the process below the high-level requirements for the safety-critical software through prior action. To support the configuration, the present approach implements three kinds of components: variables, operators, and templates that provide input, output and abstracted concepts. A configuration defines a set of variables, operations and template instantiations. A tool is used that takes high-level requirements written in a computer readable format into the configuration.

Type: Grant

Filed: April 21, 2021

Date of Patent: February 13, 2024

Assignee: SOLI BV

Inventor: Filip Leonard Etienne Verhaeghe
Storage cache management

Patent number: 11860780

Abstract: A method of cache management, the method comprising: identifying, among a plurality of storage items, storage items having an access count above a first threshold to generate a set of storage items; identifying, among the set of storage items, storage items having an updated access count above a second threshold to generate a subset of storage items, wherein, for each storage item, the updated access count is dependent upon a number of accesses subsequent to generating the set of storage items; and adding the storage items of the subset of storage items to a cache.
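The two-stage filter in this abstract can be sketched as follows. The function name, the dictionary inputs, and the rule that the updated count is the earlier count plus later accesses are assumptions; the abstract only says the updated count depends on accesses made after the first set is generated.

```python
def select_for_cache(access_counts, later_accesses,
                     first_threshold, second_threshold):
    """Return the items admitted to the cache after two threshold checks."""
    # Stage 1: items whose access count exceeds the first threshold.
    candidates = [item for item, count in access_counts.items()
                  if count > first_threshold]
    # Stage 2: of those, items whose updated count exceeds the second threshold.
    # Assumed update rule: earlier count + accesses seen after stage 1.
    admitted = [item for item in candidates
                if access_counts[item] + later_accesses.get(item, 0)
                > second_threshold]
    return sorted(admitted)


counts = {"a": 5, "b": 1, "c": 6}
later = {"a": 4, "b": 20, "c": 0}
print(select_for_cache(counts, later, first_threshold=3, second_threshold=8))
# → ['a']
```

Item `b` is heavily accessed later but never enters the candidate set, and `c` qualifies at first but is not accessed again, so only `a` is cached.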

Type: Grant

Filed: January 28, 2022

Date of Patent: January 2, 2024

Assignee: PURE STORAGE, INC.

Inventors: Ethan Miller, John Colgrove
Method and apparatus for improved security in trigger action platforms

Patent number: 11856000

Abstract: An apparatus and method for improving the security of trigger action platforms of a type providing interoperability between computer services send the trigger service additional information about an interoperability rule for the computer services so that the trigger service may implement a minimizer reducing the data communicated when the interoperability is implemented. Implementation of the minimizer may be done in a way that is transparent to the trigger action platform eliminating the need for disruption of existing interoperability services.

Type: Grant

Filed: June 4, 2021

Date of Patent: December 26, 2023

Assignee: Wisconsin Alumni Research Foundation

Inventors: Yunang Chen, Mohannad Alhanahnah, Andrei Sabelfeld, Rahul Chatterjee, Earlence Fernandes
Systems and methods for unified computing on digital and quantum computers

Patent number: 11842177

Abstract: Computer systems and methods are provided for compiling a computer program to run on a quantum processor comprising a plurality of qubits, qudits or quantum continuous variables. A compiler obtains the program in a unified language, that is effectively a classical language, as opposed to a quantum language, and performs code refactoring on all or a portion of the program to form a refactored code and converts the refactored code into a first code. The compiler compiles the first code into a second code comprising a plurality of data elements in one or more quantum data structures. The compiler converts the second code to a third code expressed in a quantum gate-level language in accordance with an instruction set and gate locality constraints of the target quantum processor.

Type: Grant

Filed: June 3, 2021

Date of Patent: December 12, 2023

Assignee: HORIZON QUANTUM COMPUTING PTE. LTD.

Inventors: Joseph Francis Fitzsimons, Si-Hui Tan
Computer-readable recording medium storing program for converting first single instruction multiple data (SIMD) command using first mask register into second SIMD command using second mask register, command conversion method for converting first SIMD command using first mask register into second SIMD command using second mask register, and command conversion apparatus for converting first SIMD command using first mask register into second SIMD command using second mask register

Patent number: 11803384

Abstract: A recording medium stores a program for causing a computer to execute a process including: converting, in a first source code corresponding to a first-type processor, a first load command for a first mask register included in the first-type processor into a second load command for a second mask register included in a second-type processor; and converting, when a first SIMD command for performing an arithmetic operation using the first mask register exists after the first load command in the first source code and a state of a value of the first mask register does not coincide with a state of a value of the first mask register, the first SIMD command into a second SIMD command corresponding to the second-type processor and a change command for changing a state of a value of the second mask register to a state of a value of the second mask register.

Type: Grant

Filed: May 31, 2022

Date of Patent: October 31, 2023

Assignee: FUJITSU LIMITED

Inventors: Koji Kurihara, Kentaro Kawakami
Application migration using cost-aware code dependency graph

Patent number: 11714615

Abstract: Described are techniques for application migration. The techniques include migrating an application to a target cloud infrastructure and generating a cost-aware code dependency graph during execution of the application on the target cloud infrastructure. The techniques further include modifying the application by removing source code corresponding to unused nodes according to the cost-aware code dependency graph and replacing identified source code of a high-cost subgraph of the cost-aware code dependency graph with calls to a generated microservice configured to provide functionality similar to the identified source code. The techniques further include implementing the modified application on one or more virtual machines of the target cloud infrastructure.

Type: Grant

Filed: September 18, 2020

Date of Patent: August 1, 2023

Assignee: International Business Machines Corporation

Inventors: Bruno Silva, Marco Aurelio Stelmar Netto, Renato Luiz de Freitas Cunha, Nelson Mimura Gonzalez
Sparsity uniformity enforcement for multicore processor

Patent number: 11709662

Abstract: Methods and systems relating to the field of parallel computing are disclosed herein. The methods and systems disclosed include approaches for sparsity uniformity enforcement for a set of computational nodes which are used to execute a complex computation. A disclosed method includes determining a sparsity distribution in a set of operand data, and generating, using a compiler, a set of instructions for executing, using the set of operand data and a set of processing cores, a complex computation. Alternatively, the method includes altering the operand data. The method also includes distributing the set of operand data to the set of processing cores for use in executing the complex computation in accordance with the set of instructions. Either the altering is conducted to, or the compiler is programmed to, balance the sparsity distribution among the set of processing cores.

Type: Grant

Filed: November 5, 2021

Date of Patent: July 25, 2023

Assignee: Tenstorrent Inc.

Inventors: Ljubisa Bajic, Davor Capalija, Yu Ting Chen, Andrew Grebenisan, Hassan Farooq, Akhmed Rakhmati, Stephen Chin, Vladimir Blagojevic, Almeet Bhullar, Jasmina Vasiljevic
Compiler program, compiling method, information processing device

Patent number: 11693638

Abstract: A compiler program causes a computer to execute optimization processing for an optimization target program. The optimization target program includes a loop including a vector store instruction and a vector load instruction for an array variable. The optimization processing includes (1) unrolling the vector store instruction and the vector load instruction in the loop by an unrolling number of times to generate a plurality of unrolled vector store instructions and a plurality of unrolled vector load instructions, and (2) scheduling to move an unrolled vector load instruction among the plurality of unrolled vector load instructions, which is located after a first unrolled vector store instruction that is located at first among the plurality of unrolled vector load instructions, before the first unrolled vector store instruction.

Type: Grant

Filed: March 17, 2021

Date of Patent: July 4, 2023

Assignee: FUJITSU LIMITED

Inventors: Kensuke Watanabe, Masatoshi Haraguchi, Shun Kamatsuka, Yasunobu Tanimura
Sparsity uniformity enforcement for multicore processor

Patent number: 11693639

Abstract: Methods and systems relating to the field of parallel computing are disclosed herein. The methods and systems disclosed include approaches for sparsity uniformity enforcement for a set of computational nodes which are used to execute a complex computation. A disclosed method includes determining a sparsity distribution in a set of operand data, and generating, using a compiler, a set of instructions for executing, using the set of operand data and a set of processing cores, a complex computation. Alternatively, the method includes altering the operand data. The method also includes distributing the set of operand data to the set of processing cores for use in executing the complex computation in accordance with the set of instructions. Either the altering is conducted to, or the compiler is programmed to, balance the sparsity distribution among the set of processing cores.

Type: Grant

Filed: November 5, 2021

Date of Patent: July 4, 2023

Assignee: Tenstorrent Inc.

Inventors: Ljubisa Bajic, Davor Capalija, Yu Ting Chen, Andrew Grebenisan, Hassan Farooq, Akhmed Rakhmati, Stephen Chin, Vladimir Blagojevic, Almeet Bhullar, Jasmina Vasiljevic
Static versioning in the polyhedral model

Patent number: 11693636

Abstract: An approach is presented to enhancing the optimization process in a polyhedral compiler by introducing compile-time versioning, i.e., the production of several versions of optimized code under varying assumptions on its run-time parameters. We illustrate this process by enabling versioning in the polyhedral processor placement pass. We propose an efficient code generation method and validate that versioning can be useful in a polyhedral compiler by performing benchmarking on a small set of deep learning layers defined for dynamically-sized tensors.

Type: Grant

Filed: November 15, 2021

Date of Patent: July 4, 2023

Assignee: Reservoir Labs Inc.

Inventors: Benoit J. Meister, Adithya Dattatri
Flexible optimized data handling in systems with multiple memories

Patent number: 11687369

Abstract: Methods and systems for optimizing an application for a computing system having multiple distinct memory locations that are interconnected by one or more communication channels include determining one or more data handling properties for a data region in an application. One or more data handling policies for the data region are determined based on the one or more data handling properties. Data setup costs are determined for a scope in the application that uses the data region in different memory locations based on the one or more data handling properties. The application is optimized in accordance with the one or more data handling policies and the data setup costs for the different memory locations.

Type: Grant

Filed: March 2, 2021

Date of Patent: June 27, 2023

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Tong Chen, John Kevin O'Brien, Daniel A. Prener, Zehra N. Sura
Loop thread order execution control of a multi-threaded, self-scheduling reconfigurable computing fabric

Patent number: 11675734

Abstract: Representative apparatus, method, and system embodiments are disclosed for configurable computing. A representative system includes an interconnection network; a processor; and a plurality of configurable circuit clusters. Each configurable circuit cluster includes a plurality of configurable circuits arranged in an array; a synchronous network coupled to each configurable circuit of the array; and an asynchronous packet network coupled to each configurable circuit of the array.

Type: Grant

Filed: March 4, 2022

Date of Patent: June 13, 2023

Assignee: Micron Technology, Inc.

Inventor: Tony M. Brewer
Loop execution control for a multi-threaded, self-scheduling reconfigurable computing fabric using a reenter queue

Patent number: 11675598

Abstract: Representative apparatus, method, and system embodiments are disclosed for configurable computing. A representative system includes an interconnection network; a processor; and a plurality of configurable circuit clusters. Each configurable circuit cluster includes a plurality of configurable circuits arranged in an array; a synchronous network coupled to each configurable circuit of the array; and an asynchronous packet network coupled to each configurable circuit of the array.

Type: Grant

Filed: March 15, 2022

Date of Patent: June 13, 2023

Assignee: Micron Technology, Inc.

Inventor: Tony M. Brewer
Method and apparatus for optimizing code for field programmable gate arrays

Patent number: 11656857

Abstract: A method for the generation of a hardware accelerator (20) is described. The method comprises inputting (110) a program (105) with a plurality of lines of code describing an algorithm to be implemented on the hardware accelerator (20) and generating (125) a dataflow graph in memory from the inputted program (105). The dataflow graph is optimized and an output program (140) created from the dataflow graph is output. The output program (140) is then provided to a high-level synthesis tool for generating the hardware accelerator (20).

Type: Grant

Filed: August 9, 2019

Date of Patent: May 23, 2023

Assignee: INESC TEC—Instituto de Engenharia de Sistemas

Inventors: Afonso Soares Canas Ferreira, João Manuel Paiva Cardoso
System to analyze and enhance software based on graph attention networks

Patent number: 11640295

Abstract: Systems, apparatuses and methods may provide for technology that generates a dependence graph based on a plurality of intermediate representation (IR) code instructions associated with a compiled program code, generates a set of graph embedding vectors based on the plurality of IR code instructions, and determines, via a neural network, one of an analysis of the compiled program code or an enhancement of the program code based on the dependence graph and the set of graph embedding vectors. The technology may provide a graph attention neural network that includes a recurrent block and at least one task-specific neural network layer, the recurrent block including a graph attention layer and a transition function. The technology may also apply dynamic per-position recurrence-halting to determine a number of recurring steps for each position in the recurrent block based on adaptive computation time.

Type: Grant

Filed: June 26, 2020

Date of Patent: May 2, 2023

Assignee: Intel Corporation

Inventors: Mariano Tepper, Bryn Keller, Mihai Capota, Vy Vo, Nesreen Ahmed, Theodore Willke
Dataflow optimization apparatus and method for low-power operation of multicore systems

Patent number: 11635997

Abstract: The present disclosure relates to a dataflow optimization method for low-power operation of a multicore system, the dataflow optimization method including: a step (a) of creating an FSM including a plurality of system states in consideration of dynamic factors that trigger a transition in system states for original dataflow; and a step (b) of optimizing the original dataflow through optimization of the created FSM.

Type: Grant

Filed: July 12, 2019

Date of Patent: April 25, 2023

Assignee: AJOU UNIVERSITY INDUSTRY-ACADEMIC COOPERATION FOUNDATION

Inventors: Hoeseok Yang, Hyeonseok Jung
Analysis for modeling data cache utilization

Patent number: 11630654

Abstract: Aspects include modeling data cache utilization for each loop in a loop nest; estimating total data cache lines fetched in one iteration of the loop; and determining the possibility of data cache reuse across loop iterations using data cache lines fetched and associativity constraints. Aspects also include estimating, for memory reference pairs, reuse by one reference of data cache line fetched by another; estimating total number of cache misses for all iterations of the loop; and estimating total number of cache misses of a reference for iterations of a next outer loop as equal to total cache misses for an entire inner loop. Aspects further include estimating memory cost of a loop unroll and jam transformation, without performing the transformation; and extending a data cache model to estimate best unroll-and-jam factors for the loop nest, capable of minimizing total cache misses incurred by the memory references in the loop body.

Type: Grant

Filed: August 19, 2021

Date of Patent: April 18, 2023

Assignee: International Business Machines Corporation

Inventors: Wai Hung Tsang, Prithayan Barua, Ettore Tiotto, Bardia Mahjour, Jun Shirako
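
For readers unfamiliar with the unroll-and-jam transformation named in the abstract above, the following is a minimal sketch of what the transformation itself does to a loop nest. It does not implement the patent's cache model or its cost estimation. The function names, the matrix-vector workload, and the fixed unroll factor of 2 are assumptions chosen purely for illustration.

```python
def matvec(a, x):
    """Reference loop nest: y[i] = sum over j of a[i][j] * x[j]."""
    n, m = len(a), len(x)
    y = [0] * n
    for i in range(n):
        for j in range(m):
            y[i] += a[i][j] * x[j]
    return y


def matvec_unroll_and_jam(a, x):
    """Same computation with the outer loop unrolled by 2 and the two
    resulting inner loops fused ("jammed") into a single inner loop."""
    n, m = len(a), len(x)
    y = [0] * n
    i = 0
    while i + 1 < n:
        for j in range(m):
            xj = x[j]                    # one load of x[j] now serves two rows
            y[i] += a[i][j] * xj
            y[i + 1] += a[i + 1][j] * xj
        i += 2
    if i < n:                            # remainder row when n is odd
        for j in range(m):
            y[i] += a[i][j] * x[j]
    return y
```

Both functions return the same result. The jammed version reuses each `x[j]` across two rows per inner iteration, which is the kind of reuse a data cache model would need to weigh when choosing an unroll-and-jam factor.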
Method and system for protecting data processed by data processing accelerators

Patent number: 11609766

Abstract: According to one embodiment, a data processing system performs a secure boot using a security module (e.g., a trusted platform module (TPM)) of a host system. The system verifies that an operating system (OS) and one or more drivers including an accelerator driver associated with a data processing (DP) accelerator is provided by a trusted source. The system launches the accelerator driver within the OS. The system generates a trusted execution environment (TEE) associated with one or more processors of the host system. The system launches an application and a runtime library within the TEE, where the application communicates with the DP accelerator via the runtime library and the accelerator driver.

Type: Grant

Filed: January 4, 2019

Date of Patent: March 21, 2023

Assignees: BAIDU USA LLC, BAIDU.COM TIMES TECHNOLOGY (BEIJING) CO., LTD., KUNLUNXIN TECHNOLOGY (BEIJING) COMPANY LIMITED

Inventors: Yueqiang Cheng, Yong Liu, Tao Wei, Jian Ouyang
Method and system for identification of redundant function-level slicing calls

Patent number: 11579851

Abstract: This disclosure relates generally to the field of source code processing, and, more particularly to a method and system for identification of redundant function-level slicing calls. The method disclosed generates program dependence graphs (PDGs) based on a slicing criteria and a function corresponding to the function-level slicing call. Further the method classifies the function-level slicing call into redundant or non-redundant by traversing the PDGs and checking if a predefined condition is satisfied or not. The function-level slicing calls are classified as redundant if the check is not satisfied and are classified as non-redundant if the check is satisfied. The disclosed method can be used in identifying redundant function-level slicing calls in applications such as automated false positive elimination (AFPE), automated test case generation and so on.

Type: Grant

Filed: September 20, 2021

Date of Patent: February 14, 2023

Assignee: Tata Consultancy Services Limited

Inventor: Tukaram Bhagwat Muske
Information processing apparatus, computer-readable recording medium storing compiling program, and compiling method

Patent number: 11579853

Abstract: An information processing apparatus includes a processor configured to: for each of a plurality of loops, acquire loop information including a number of variables, a number of registers, a number of memory commands for inputting and outputting a value of the variable between the register and a main storage device, and a number of arithmetic commands for the value of the variable stored in the register, which are used in the loop; calculate the number of variables, the number of registers, the number of memory commands, and the number of arithmetic commands, which correspond to a combination of the loops that are candidates for loop fusion, for each of the combinations of the loops; determine a combination to which the loop fusion is to be applied among the combinations which are calculated for each of the combinations; and execute the loop fusion on the determined combination.

Type: Grant

Filed: November 18, 2021

Date of Patent: February 14, 2023

Assignee: FUJITSU LIMITED

Inventor: Tomoko Nikko
Mapping convolution to a channel convolution engine

Patent number: 11537865

Abstract: A processor system comprises a first and second group of registers and a hardware channel convolution processor unit. The first group of registers is configured to store data elements of channels of a portion of a convolution data matrix. Each register stores at least one data element from each channel. The second group of registers is configured to store data elements of convolution weight matrices including a separate convolution weight matrix for each channel. Each register stores at least one data element from each convolution weight matrix. The hardware channel convolution processor unit is configured to multiply each data element in the first group of registers with a corresponding data element in the second group of registers and sum together the multiplication results for each specific channel to determine corresponding channel convolution result data elements in a corresponding channel convolution result matrix.

Type: Grant

Filed: February 18, 2020

Date of Patent: December 27, 2022

Assignee: Meta Platforms, Inc.

Inventors: Krishnakumar Narayanan Nair, Rakesh Komuravelli, Abdulkadir Utku Diril, Ehsan Khish Ardestani Zadeh, Yuchen Hao, Martin Schatz, Thomas Mark Ulrich, Olivia Wu, Anup Ramesh Kadkol, Amin Firoozshahian
Offloading server and offloading program

Patent number: 11403083

Abstract: An offloading server includes: a data transfer designation section configured to analyze reference relationships of variables used in loop statements in an application and designate, for data that can be transferred outside a loop, a data transfer using an explicit directive that explicitly specifies a data transfer outside the loop; a parallel processing designation section configured to identify loop statements in the application and specify a directive specifying application of parallel processing by an accelerator and perform compilation for each of the loop statements; and a parallel processing pattern creation section configured to exclude loop statements causing a compilation error from loop statements to be offloaded and create a plurality of parallel processing patterns each of which specifies whether to perform parallel processing for each of the loop statements not causing a compilation error.

Type: Grant

Filed: June 3, 2019

Date of Patent: August 2, 2022

Assignee: Nippon Telegraph and Telephone Corporation

Inventors: Yoji Yamato, Hirofumi Noguchi, Misao Kataoka, Takuma Isoda, Tatsuya Demizu
Method, a device, and a computer program product for determining a resource required for executing a code segment

Patent number: 11354159

Abstract: A method comprises: compiling the code segment with a compiler; and determining, based on an intermediate result of the compiling, a resource associated with a dedicated processing unit and for executing the code segment. As such, the resource required for executing a code segment may be determined quickly without actually executing the code segment and allocating or releasing the resource, which helps subsequent resource allocation and further brings about a better user experience.

Type: Grant

Filed: August 14, 2019

Date of Patent: June 7, 2022

Assignee: EMC IP Holding Company LLC

Inventors: Jinpeng Liu, Pengfei Wu, Junping Zhao, Kun Wang
Processing method, device, equipment and storage medium of loop instruction

Patent number: 11340903

Abstract: The present application discloses a processing method, device, equipment and storage medium of a loop instruction, and relates to the fields of voice and chips. A specific embodiment is: acquiring a computer program including a first loop body, where the first loop body is generated according to a second loop body in a software code to be compiled, the first loop body includes a plurality of first loop instructions, the plurality of first loop instructions can be identified by a hardware structure of a computer device; in the case that the first loop body is detected, determining loop parameters of the first loop body according to the plurality of first loop instructions; acquiring the plurality of first loop instructions according to the loop parameters of the first loop body; executing the plurality of first loop instructions.

Type: Grant

Filed: March 23, 2021

Date of Patent: May 24, 2022

Assignee: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventors: Junhui Wen, Chao Tian
Speculative caching in a content delivery network

Patent number: 11272028

Abstract: A server in a content delivery (CD) network that distributes content on behalf of one or more subscribers. Responsive to a request from a client for a particular resource, if the particular resource is already in a cache on the server, serving the particular resource to the client from the cache; otherwise if the particular resource is not already cached on the server, when a count value exceeds a first threshold value, obtaining, caching, and serving the particular resource. When the count value is less than a second threshold value, obtaining and serving the particular resource. When the count value is: (i) not less than the second threshold value, and (ii) not greater than the first threshold value, then obtaining the particular resource and selectively caching the particular resource; and serving the particular resource to the client.

Type: Grant

Filed: April 29, 2020

Date of Patent: March 8, 2022

Assignee: Level 3 Communications, LLC

Inventors: Daniel Lee Jensen, William Crowder, Christopher Newton, William R. Power
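
The three-way threshold decision described in the abstract above can be sketched as follows. This is only an illustration of the stated conditions, not the patented implementation. The function name `decide`, the parameter names, the returned action strings, and the `selective` callback (a stand-in for the unspecified "selectively caching" policy) are all assumptions, as is the premise that the second threshold does not exceed the first.

```python
from typing import Callable


def decide(count: int,
           first_threshold: int,
           second_threshold: int,
           selective: Callable[[int], bool]) -> str:
    """Choose an action for a requested resource that is not yet cached.

    Assumes second_threshold <= first_threshold (not stated in the abstract).
    """
    if count > first_threshold:
        return "obtain, cache, serve"
    if count < second_threshold:
        return "obtain, serve"
    # Here second_threshold <= count <= first_threshold:
    # caching is left to a (hypothetical) selective policy.
    if selective(count):
        return "obtain, cache, serve"
    return "obtain, serve"
```

For example, with `first_threshold=5` and `second_threshold=2`, a count of 10 always caches, a count of 1 never caches, and a count of 3 defers to the `selective` callback.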
Determining when to perform and performing runtime binary slimming

Patent number: 11221835

Abstract: One or more execution traces of an application are accessed. The one or more execution traces have been collected at a basic block level. Basic blocks in the one or more execution traces are scored. Scores for the basic blocks represent benefits of performing binary slimming at the corresponding basic blocks. Runtime binary slimming is performed of the application based on the scores of the basic blocks.

Type: Grant

Filed: February 10, 2020

Date of Patent: January 11, 2022

Assignee: International Business Machines Corporation

Inventors: Michael Vu Le, Ian Michael Molloy, Taemin Park
System and/or method for error compensation in mechanical transmissions

Patent number: 11148287

Abstract: The system can include a set of joints, a controller, and a model engine; and can optionally include a support structure and an end effector. Joints can include: a motor, a transmission mechanism, an input sensor, and an output sensor. The system can enable articulation of the plurality of joints.

Type: Grant

Filed: April 13, 2021

Date of Patent: October 19, 2021

Assignee: Orangewood Labs Inc.

Inventors: Abhinav Kumar, Aditya Bhatia, Akash Bansal, Anubhav Singh, Ashutosh Prakash, Aman Malhotra, Harshit Gaur, Prasang Srivasatava, Ashish Chauhan
Loop-oriented neural network compilation

Patent number: 11144291

Abstract: Methods of accelerating the execution of neural networks are disclosed. A description of a neural network may be received. A plurality of operators may be identified based on the description of the neural network. A plurality of symbolic models associated with the plurality of operators may be generated. For each symbolic model, a nested loop associated with an operator may be identified, a loop order may be defined, and a set of data dependencies may be defined. A set of inter-operator dependencies may be extracted based on the description of the neural network. The plurality of symbolic models and the set of inter-operator dependencies may be analyzed to identify a combinable pair of nested loops. The combinable pair of nested loops may be combined to form a combined nested loop.

Type: Grant

Filed: November 27, 2019

Date of Patent: October 12, 2021

Assignee: Amazon Technologies, Inc.

Inventors: Hongbin Zheng, Preston Pengra Briggs, Tobias Joseph Kastulus Edler von Koch, Taemin Kim, Randy Renfu Huang
Systems and methods for optimizing nested loop instructions in pipeline processing stages within a machine perception and dense algorithm integrated circuit

Patent number: 11061678

Abstract: In one embodiment, a method for improving a performance of an integrated circuit includes implementing one or more computing devices executing a compiler program that: (i) evaluates a target instruction set intended for execution by an integrated circuit; (ii) identifies one or more nested loop instructions within the target instruction set based on the evaluation; (iii) evaluates whether a most inner loop body within the one or more nested loop instructions comprises a candidate inner loop body that requires a loop optimization that mitigates an operational penalty to the integrated circuit based on one or more executional properties of the most inner loop instruction; and (iv) implements the loop optimization that modifies the target instruction set to include loop optimization instructions to control, at runtime, an execution and a termination of the most inner loop body thereby mitigating the operational penalty to the integrated circuit.

Type: Grant

Filed: December 18, 2020

Date of Patent: July 13, 2021

Assignee: quadric.io Inc

Inventors: Nigel Drego, Mrinalini Ravichandran, Jianman Chang, Daniel Firu, Veerbhan Kheterpal
Flexible optimized data handling in systems with multiple memories

Patent number: 10996989

Abstract: Methods and systems for optimizing an application for a computing system having multiple distinct memory locations that are interconnected by one or more communication channels include determining one or more data handling properties for a data region in an application. One or more data handling policies for the data region are determined based on the one or more data handling properties. Data setup costs are determined for a scope in the application that uses the data region in different memory locations based on the one or more data handling properties. The application is optimized in accordance with the one or more data handling policies and the data setup costs for the different memory locations.

Type: Grant

Filed: June 13, 2016

Date of Patent: May 4, 2021

Assignee: International Business Machines Corporation

Inventors: Tong Chen, John Kevin O'Brien, Daniel A. Prener, Zehra N. Sura
Statelessly populating data stream into successive files

Patent number: 10901944

Abstract: Storing an incoming data stream using successive files that are consecutively populated. The appropriate file to populate a given data stream portion into is determined by mapping the data stream offset to a file, and potentially also an address within that file. The successive files may be the same size, so that the file can be identified based on the data stream address (or offset) without the use of an index. Furthermore, the files may be easily named by having that size be some multiple of a binary power of bytes. That way, the files themselves can be automatically named and identified by using the more significant bit or bits of the data stream offset to uniquely identify the file and establish ordering of the files. Replication may occur from a primary to a secondary store by transmitting the offset, and the actual data to be stored.

Type: Grant

Filed: May 24, 2017

Date of Patent: January 26, 2021

Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventors: Rogerio Ramos, Fayssal Martani, Cristian Diaconu, Karthick Krishnamoorthy, Jacob R. Lorch
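
The index-free mapping from a stream offset to a file, as described in the abstract above, can be illustrated with a short sketch. It assumes, for illustration only, that every file holds exactly 2**20 bytes and that files are named by the zero-padded hexadecimal value of the offset's more significant bits. The names `FILE_SIZE_BITS` and `locate` are hypothetical and do not come from the patent.

```python
from typing import Tuple

FILE_SIZE_BITS = 20  # assumed: each file holds 2**20 bytes (1 MiB)


def locate(offset: int, file_size_bits: int = FILE_SIZE_BITS) -> Tuple[str, int]:
    """Map a data stream offset to (file name, address within that file)."""
    file_index = offset >> file_size_bits            # more significant bits pick the file
    address = offset & ((1 << file_size_bits) - 1)   # remaining low bits locate the byte
    # Zero-padded hex keeps lexicographic name order equal to stream order.
    return f"{file_index:016x}", address
```

Because the file name is computed directly from the offset, no lookup table is needed: a writer or a replica that receives an offset and the data can derive the target file and the position within it independently.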
Resilient programming frameworks for iterative computations

Patent number: 10831616

Abstract: An information processing system, computer readable storage medium, and method for supporting resilient execution of computer programs. A method provides a resilient store wherein information in the resilient store can be accessed in the event of a failure. The method periodically checkpoints application state in the resilient store. A resilient executor comprises software which executes applications by catching failures. The method uses the resilient executor to execute at least one application. In response to the resilient executor detecting a failure, restoring application state information to the at least one application from a checkpoint stored in the resilient store, the resilient executor resuming execution of the at least one application with the restored application state information.

Type: Grant

Filed: April 18, 2019

Date of Patent: November 10, 2020

Assignee: International Business Machines Corporation

Inventors: Arun Iyengar, Joshua J. Milthorpe
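
The checkpoint-and-restore loop described in the abstract above can be sketched in a few lines. This is a simplified, single-process illustration rather than the patented framework. The class `ResilientStore`, the function `run_resilient`, the in-memory storage, the dictionary-shaped state with an `"i"` step counter, and the use of `RuntimeError` to signal a failure are all assumptions.

```python
import copy


class ResilientStore:
    """Toy stand-in for a resilient store: keeps the latest checkpoint in memory."""

    def __init__(self):
        self._checkpoint = None

    def save(self, state):
        self._checkpoint = copy.deepcopy(state)

    def load(self):
        return copy.deepcopy(self._checkpoint)


def run_resilient(step, state, total_steps, store, checkpoint_every=2):
    """Run step(state) until state["i"] reaches total_steps, catching failures.

    On a failure, the state is restored from the last checkpoint and
    execution resumes from there.
    """
    store.save(state)                      # initial checkpoint
    while state["i"] < total_steps:
        try:
            state = step(state)
        except RuntimeError:
            state = store.load()           # restore, then resume
            continue
        if state["i"] % checkpoint_every == 0:
            store.save(state)              # periodic checkpoint
    return state
```

A real resilient store would have to survive the failure of the process that wrote to it. The in-memory version here only demonstrates the control flow of catching a failure, restoring state, and resuming.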
Resilient programming frameworks for iterative computations on computer systems

Patent number: 10831617

Abstract: An information processing system, computer readable storage medium, and method for supporting resilient execution of computer programs. A method provides a resilient store wherein information in the resilient store can be accessed in the event of a failure. The method periodically checkpoints application state in the resilient store. A resilient executor comprises software which executes applications by catching failures. The method uses the resilient executor to execute at least one application. In response to the resilient executor detecting a failure, restoring application state information to the at least one application from a checkpoint stored in the resilient store, the resilient executor resuming execution of the at least one application with the restored application state information.

Type: Grant

Filed: April 23, 2019

Date of Patent: November 10, 2020

Assignee: International Business Machines Corporation

Inventors: Arun Iyengar, Joshua J. Milthorpe
Method of estimating program speed-up in highly parallel architectures using static analysis

Patent number: 10754744

Abstract: The amount of speed-up that can be obtained by optimizing the program to run on a different architecture is determined by static measurements of the program. Multiple such static measurements are processed by a machine learning system after being discretized to alter their accuracy vs precision. Static analysis requires less analysis overhead and permits analysis of program portions to optimize allocation of porting resources on a large program.

Type: Grant

Filed: March 15, 2016

Date of Patent: August 25, 2020

Assignee: Wisconsin Alumni Research Foundation

Inventors: Karthikeyan Sankaralingam, Newsha Ardalani, Urmish Thakker
