Including Loop Patents (Class 717/160)
  • Patent number: 12229539
    Abstract: Provided is an application optimization method and an electronic device supporting the same. According to an example embodiment, the application optimization method may include: determining whether a condition set with respect to a duration of an idle state of the electronic device is satisfied, selecting an application for which application optimization is to be performed based on an application usage record of a user of the electronic device in response to the set condition being satisfied, and generating an optimized application by performing the application optimization in the background for the selected application.
    Type: Grant
    Filed: June 23, 2022
    Date of Patent: February 18, 2025
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Byungsoo Kwon, Kiljae Kim, Daehyun Cho
  • Patent number: 12223301
    Abstract: A method for generating source code includes: transforming a block diagram into an intermediate representation, wherein transforming the block diagram into the intermediate representation comprises transforming at least two blocks, wherein at least one loop results from transforming an operation block; identifying at least one candidate loop in the intermediate representation, wherein a loop body of a candidate loop comprises at least one instruction that accesses the array variable; identifying at least one parallelizable loop from the at least one candidate loop; determining build options for the at least one parallelizable loop and the array variable; inserting build pragmas based on the determined build options in the intermediate representation; and translating the intermediate representation into the source code.
    Type: Grant
    Filed: March 17, 2023
    Date of Patent: February 11, 2025
    Assignee: DSPACE GMBH
    Inventors: Joerg Niere, Kingshuk Karuri, Pubali Mazumder
  • Patent number: 12182552
    Abstract: A computer-based technique for processing an application includes determining that a loop of the application includes a reference to a data item of a vector data type. A trip count of the loop is determined to have an unknown trip count. The loop is split into a first loop and a second loop based on a splitting factor. The second loop is unrolled.
    Type: Grant
    Filed: May 24, 2022
    Date of Patent: December 31, 2024
    Assignee: Xilinx, Inc.
    Inventor: Ajit K. Agarwal
  • Patent number: 12093691
    Abstract: A generation unit generates arithmetic expressions. Here, N denotes the number of looping times of the loop processing. L denotes a designated lower limit of unroll stage number. M denotes a designated upper limit of the unroll stage number. Q denotes a quotient obtained by dividing N by L. R denotes a remainder obtained by dividing N by L. The arithmetic expressions include an arithmetic expression that represents executing loop processing whose number of looping times is a quotient obtained by dividing R by (M?L), with the unroll stage number M when R?Q*(M?L)>0 is not satisfied, and then executing, when a remainder obtained by dividing R by (M?L) is other than 0, processing of one loop with sum of the remainder and L as the unroll stage number, and then executing loop processing with the unroll stage number L.
    Type: Grant
    Filed: February 14, 2020
    Date of Patent: September 17, 2024
    Assignee: NEC CORPORATION
    Inventor: Yoshiyuki Ohno
  • Patent number: 12079632
    Abstract: Sequence partition based schedule optimization is performed by generating a sequence and a schedule based on the sequence, dividing the sequence into a plurality of sequence partitions based on the schedule and the data dependency graph, each sequence partition including a portion of the plurality of instructions and a portion of the plurality of buffers, performing, for each sequence partition, a plurality of partition optimizing iterations, and merging the plurality of sequence partitions to produce a merged schedule.
    Type: Grant
    Filed: December 16, 2022
    Date of Patent: September 3, 2024
    Assignee: EDGECORTIX INC.
    Inventors: Jens Huthmann, Sakyasingha Dasgupta, Nikolay Nez
  • Patent number: 12073199
    Abstract: In various implementations, provided are systems and methods for reducing neural network processing. A compiler may generate instructions from source code for a neural network having a repeatable set of operations. The instructions may include a plurality of blocks. The compiler may add an overwrite instruction to the plurality of blocks that, when executed by one or more execution engines, triggers an overwrite action. The overwrite action causes the instructions of subsequent blocks to be overwritten with NOP instructions. The overwrite action is triggered only when a condition is satisfied.
    Type: Grant
    Filed: June 6, 2019
    Date of Patent: August 27, 2024
    Assignee: Amazon Technologies, Inc.
    Inventors: Vignesh Vivekraja, Randy Renfu Huang, Yu Zhou, Ron Diamant, Richard John Heaton
  • Patent number: 12045618
    Abstract: The invention provides a data processing apparatus and a data processing method for generating prefetches of data for use during execution of instructions by processing circuitry. The prefetches that are generated are based on a nested prefetch pattern. The nested prefetch pattern comprises a first pattern and a second pattern. The first pattern is defined by a first address offset between sequentially accessed addresses and a first observed number of the sequentially accessed addresses separated by the first address offset. The second pattern is defined by a second address offset between sequential iterations of the first pattern and a second observed number of the sequential iterations of the first pattern separated by the second address offset.
    Type: Grant
    Filed: March 23, 2021
    Date of Patent: July 23, 2024
    Assignee: Arm Limited
    Inventors: Natalya Bondarenko, Stefano Ghiggini, Geoffray Matthieu Lacourba, Cédric Denis Robert Airaud
  • Patent number: 12039329
    Abstract: Systems, methods, and apparatuses relating to circuitry to implement dynamic two-pass execution of a partial flag updating instruction in a processor are described.
    Type: Grant
    Filed: December 24, 2020
    Date of Patent: July 16, 2024
    Assignee: Intel Corporation
    Inventors: Wing Shek Wong, Vikash Agarwal, Charles Vitu, Mihir Shah
  • Patent number: 12032934
    Abstract: Methods, apparatus, systems and articles of manufacture (e.g., computer readable storage media) to perform automatic compiler optimization to enable streaming-store generation for unaligned contiguous write access are disclosed. Example apparatus disclosed herein are to mark a store instruction in source program code as a transformation candidate when the store instruction is associated with a group of memory accesses that are unaligned with respect to a size of a cache line in a cache. Disclosed apparatus are also to transform the store instruction that is marked as the transformation candidate to form transformed program code when a non-temporal property is satisfied, the transformed program code to replace the store instruction with (i) a write to a buffer in the cache and (ii) a streaming-store instruction that is to write contents of the buffer to memory.
    Type: Grant
    Filed: September 23, 2021
    Date of Patent: July 9, 2024
    Assignee: Intel Corporation
    Inventors: Charles Yount, Rakesh Krishnaiyer, Timothy Creech, Daniel Woodworth, Joshua Cranmer
  • Patent number: 12032563
    Abstract: A system, method, and computer-readable medium for proving feedback on database instructions, identifying, for example, existing patterns and providing suggested replacement instructions. This may have the effect of improving the efficiency of instructions used to create and/or manipulate databases. According to some aspects, these and other benefits may be achieved by parsing received instructions into an organizational structure, traversing the organizational structure for known patterns, and suggesting replacement patterns. In implementation, this may be effected by receiving one or more sets of known patterns and corresponding replacement patterns, parsing received instructions, comparing the known patterns with the parsed instructions, and providing suggested replacement patterns based on one or more known patterns matching the parsed instructions. A benefit of may include reducing Cartesian products during the merging of tables.
    Type: Grant
    Filed: January 10, 2023
    Date of Patent: July 9, 2024
    Assignee: Capital One Services, LLC
    Inventors: Dennis J. Mire, Puneet Goyal, Siddharth Gupta, Srinivas Kumar, Deepak Sundararaj, Oron Hazi
  • Patent number: 12008364
    Abstract: A system identifies a pattern in source code. The pattern is identified based, at least in part, on correlation between units of code, or on the derivation of a rule from a recurring sequence in the source code. The system identifies a portion of the source code that at least partially matches the pattern, and determines that this portion includes at least one deviation from the pattern. The system then generates an error message to describe the deviation.
    Type: Grant
    Filed: June 24, 2021
    Date of Patent: June 11, 2024
    Assignee: Amazon Technologies Inc.
    Inventors: Hangqi Zhao, Qiang Zhou, Sengamedu Hanumantha Rao Srinivasan
  • Patent number: 11983158
    Abstract: Systems and methods related to dynamically orchestrating execution of a protocol by generating one or more executors and one or more functions using a network controller based on the protocol are disclosed. In one example embodiment, there is provided a method of receiving a user-defined logic to allow the network controller to generate a dataset including a plurality of accumulators, abstract one or more executors and functions, and generate a set of protocols at the one or more functions to activate a respective application stage to fulfill the service request based on the logic in near real-time. Moreover, methods herein may include determining a telemetry and/or insight generation based, at least in part, on the dataset.
    Type: Grant
    Filed: October 26, 2022
    Date of Patent: May 14, 2024
    Assignee: PAYPAL, INC.
    Inventors: Manimuthu Ayyannan, Deepak Mohanakumar Chandramouli, Karteek Reddy Chada
  • Patent number: 11941383
    Abstract: Techniques to speed up code compilation may include caching code analysis results such that the analysis of subsequent code having a similar structured can be omitted. For example, a loop-nest construct in the code can be parsed, and an execution statement in the loop-nest construct can be analyzed by a compiler to generate an analysis result indicating a set of execution conditions for the execution statement. A lookup key can be generated from the control statements bounding the execution statement, and the analysis result can be stored with the lookup key in a cache entry of the cache. The execution statement is then modified according to the analysis result for optimization. Instead of having to analyze a subsequent execution statement bounded by the same control statements, the analysis result of the subsequent execution statement can be retrieved from the cache and be used to modify the subsequent execution statement.
    Type: Grant
    Filed: March 8, 2022
    Date of Patent: March 26, 2024
    Assignee: Amazon Technologies, Inc.
    Inventors: Hongbin Zheng, Pushkar Ratnalikar
  • Patent number: 11934876
    Abstract: A compiler-implemented technique for performing a storage allocation is described. Computer code to be converted into machine instructions for execution on an integrated circuit device is received. Based on the computer code, a set of values that are to be stored on the integrated circuit device are determined. An interference graph that includes the set of values and a set of interferences is constructed. A number of possible placements and a number of blocked placements in a memory of the integrated circuit device are computed for each of the set of values. At least a portion of the set of values are assigned to a set of memory locations in the memory based on the numbers of possible placements and blocked placements, resulting in a set of memory location assignments.
    Type: Grant
    Filed: June 9, 2021
    Date of Patent: March 19, 2024
    Assignee: Amazon Technologies, Inc.
    Inventor: Preston Pengra Briggs
  • Patent number: 11934832
    Abstract: This application discloses example synchronization instruction insertion methods and example apparatuses. One example method includes obtaining a first program block comprising one or more statements, where each of the one or more statements includes one or more function instructions. A first function instruction and a second function instruction between which data dependency exists in the first program block can then be determined. A synchronization instruction pair between a first statement including the first function instruction and a second statement including the second function instruction can then be inserted.
    Type: Grant
    Filed: December 21, 2021
    Date of Patent: March 19, 2024
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Xiong Gao, Kun Zhang
  • Patent number: 11900076
    Abstract: The present approach provides a method for safety-critical systems to reduce the required long V development and certification process, into a process that is up to 80% shorter, as well as safer. The present approach creates a pre-certified system, with both pre-certified hardware and pre-certified software. The pre-certified system may be configured to implement a safety-critical software compilation, that contains variables, operations, and template instantiations defining the safety-critical system. This approach eliminates the process below the high-level requirements for the safety-critical software through prior action. To support the configuration, the present approach implements three kinds of components: variables, operators, and templates that provide input, output and abstracted concepts. A configuration defines a set of variables, operations and template instantiations. A tool is used that takes high-level requirements written in a computer readable format into the configuration.
    Type: Grant
    Filed: April 21, 2021
    Date of Patent: February 13, 2024
    Assignee: SOLI BV
    Inventor: Filip Leonard Etienne Verhaeghe
  • Patent number: 11860780
    Abstract: A method of cache management, the method comprising: identifying, among a plurality of storage items, storage items having an access count above a first threshold to generate a set of storage items; identifying, among the set of storage items, storage items having an updated access count above a second threshold to generate a subset of storage items, wherein, for each storage item, the updated access count is dependent upon a number of accesses subsequent to generating the set of storage items; and adding the storage items of the subset of storage items to a cache.
    Type: Grant
    Filed: January 28, 2022
    Date of Patent: January 2, 2024
    Assignee: PURE STORAGE, INC.
    Inventors: Ethan Miller, John Colgrove
  • Patent number: 11856000
    Abstract: An apparatus and method for improving the security of trigger action platforms of a type providing interoperability between computer services send the trigger service additional information about an interoperability rule for the computer services so that the trigger service may implement a minimizer reducing the data communicated when the interoperability is implemented. Implementation of the minimizer may be done in a way that is transparent to the trigger action platform eliminating the need for disruption of existing interoperability services.
    Type: Grant
    Filed: June 4, 2021
    Date of Patent: December 26, 2023
    Assignee: Wisconsin Alumni Research Foundation
    Inventors: Yunang Chen, Mohannad Alhanahnah, Andrei Sabelfeld, Rahul Chatterjee, Earlence Fernandes
  • Patent number: 11842177
    Abstract: Computer systems and methods are provided for compiling a computer program to run on a quantum processor comprising a plurality of qubits, qudits or quantum continuous variables. A compiler obtains the program in a unified language, that is effectively a classical language, as opposed to a quantum language, and performs code refactoring on all or a portion of the program to form a refactored code and converts the refactored code into a first code. The compiler compiles the first code into a second code comprising a plurality of data elements in one or more quantum data structures. The compiler converts the second code to a third code expressed in a quantum gate-level language in accordance with an instruction set and gate locality constraints of the target quantum processor.
    Type: Grant
    Filed: June 3, 2021
    Date of Patent: December 12, 2023
    Assignee: HORIZON QUANTUM COMPUTING PTE. LTD.
    Inventors: Joseph Francis Fitzsimons, Si-Hui Tan
  • Patent number: 11803384
    Abstract: A recording medium stores a program for causing a computer to execute a process including: converting, in a first source code corresponding to a first-type processor, a first load command for a first mask register included in the first-type processor into a second load command for a second mask register included in a second-type processor; and converting, when a first SIMD command for performing an arithmetic operation using the first mask register exists after the first load command in the first source code and a state of a value of the first mask register does not coincide with a state of a value of the first mask register, the first SIMD command into a second SIMD command corresponding to the second-type processor and a change command for changing a state of a value of the second mask register to a state of a value of the second mask register.
    Type: Grant
    Filed: May 31, 2022
    Date of Patent: October 31, 2023
    Assignee: FUJITSU LIMITED
    Inventors: Koji Kurihara, Kentaro Kawakami
  • Patent number: 11714615
    Abstract: Described are techniques for application migration. The techniques include migrating an application to a target cloud infrastructure and generating a cost-aware code dependency graph during execution of the application on the target cloud infrastructure. The techniques further include modifying the application by removing source code corresponding to unused nodes according to the cost-aware code dependency graph and replacing identified source code of a high-cost subgraph of the cost-aware code dependency graph with calls to a generated microservice configured to provide functionality similar to the identified source code. The techniques further include implementing the modified application on one or more virtual machines of the target cloud infrastructure.
    Type: Grant
    Filed: September 18, 2020
    Date of Patent: August 1, 2023
    Assignee: International Business Machines Corporation
    Inventors: Bruno Silva, Marco Aurelio Stelmar Netto, Renato Luiz de Freitas Cunha, Nelson Mimura Gonzalez
  • Patent number: 11709662
    Abstract: Methods and systems relating to the field of parallel computing are disclosed herein. The methods and systems disclosed include approaches for sparsity uniformity enforcement for a set of computational nodes which are used to execute a complex computation. A disclosed method includes determining a sparsity distribution in a set of operand data, and generating, using a compiler, a set of instructions for executing, using the set of operand data and a set of processing cores, a complex computation. Alternatively, the method includes altering the operand data. The method also includes distributing the set of operand data to the set of processing cores for use in executing the complex computation in accordance with the set of instructions. Either the altering is conducted to, or the compiler is programmed to, balance the sparsity distribution among the set of processing cores.
    Type: Grant
    Filed: November 5, 2021
    Date of Patent: July 25, 2023
    Assignee: Tenstorrent Inc.
    Inventors: Ljubisa Bajic, Davor Capalija, Yu Ting Chen, Andrew Grebenisan, Hassan Farooq, Akhmed Rakhmati, Stephen Chin, Vladimir Blagojevic, Almeet Bhullar, Jasmina Vasiljevic
  • Patent number: 11693636
    Abstract: An approach is presented to enhancing the optimization process in a polyhedral compiler by introducing compile-time versioning, i.e., the production of several versions of optimized code under varying assumptions on its run-time parameters. We illustrate this process by enabling versioning in the polyhedral processor placement pass. We propose an efficient code generation method and validate that versioning can be useful in a polyhedral compiler by performing benchmarking on a small set of deep learning layers defined for dynamically-sized tensors.
    Type: Grant
    Filed: November 15, 2021
    Date of Patent: July 4, 2023
    Assignee: Reservoir Labs Inc.
    Inventors: Benoit J. Meister, Adithya Dattatri
  • Patent number: 11693639
    Abstract: Methods and systems relating to the field of parallel computing are disclosed herein. The methods and systems disclosed include approaches for sparsity uniformity enforcement for a set of computational nodes which are used to execute a complex computation. A disclosed method includes determining a sparsity distribution in a set of operand data, and generating, using a compiler, a set of instructions for executing, using the set of operand data and a set of processing cores, a complex computation. Alternatively, the method includes altering the operand data. The method also includes distributing the set of operand data to the set of processing cores for use in executing the complex computation in accordance with the set of instructions. Either the altering is conducted to, or the compiler is programmed to, balance the sparsity distribution among the set of processing cores.
    Type: Grant
    Filed: November 5, 2021
    Date of Patent: July 4, 2023
    Assignee: Tenstorrent Inc.
    Inventors: Ljubisa Bajic, Davor Capalija, Yu Ting Chen, Andrew Grebenisan, Hassan Farooq, Akhmed Rakhmati, Stephen Chin, Vladimir Blagojevic, Almeet Bhullar, Jasmina Vasiljevic
  • Patent number: 11693638
    Abstract: A compiler program causes a computer to execute optimization processing for an optimization target program. The optimization target program includes a loop including a vector store instruction and a vector load instruction for an array variable. The optimization processing includes (1) unrolling the vector store instruction and the vector load instruction in the loop by an unrolling number of times to generate a plurality of unrolled vector store instructions and a plurality of unrolled vector load instructions, and (2) scheduling to move an unrolled vector load instruction among the plurality of unrolled vector load instructions, which is located after a first unrolled vector store instruction that is located at first among the plurality of unrolled vector load instructions, before the first unrolled vector store instruction.
    Type: Grant
    Filed: March 17, 2021
    Date of Patent: July 4, 2023
    Assignee: FUJITSU LIMITED
    Inventors: Kensuke Watanabe, Masatoshi Haraguchi, Shun Kamatsuka, Yasunobu Tanimura
  • Patent number: 11687369
    Abstract: Methods and systems for optimizing an application for a computing system having multiple distinct memory locations that are interconnected by one or more communication channels include determining one or more data handling properties for a data region in an application. One or more data handling policies for the data region are determined based on the one or more data handling properties. Data setup costs are determined for a scope in the application that uses the data region in different memory locations based on the one or more data handling properties. The application is optimized in accordance with the one or more data handling policies and the data setup costs for the different memory locations.
    Type: Grant
    Filed: March 2, 2021
    Date of Patent: June 27, 2023
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Tong Chen, John Kevin O'Brien, Daniel A. Prener, Zehra N. Sura
  • Patent number: 11675734
    Abstract: Representative apparatus, method, and system embodiments are disclosed for configurable computing. A representative system includes an interconnection network; a processor; and a plurality of configurable circuit clusters. Each configurable circuit cluster includes a plurality of configurable circuits arranged in an array; a synchronous network coupled to each configurable circuit of the array; and an asynchronous packet network coupled to each configurable circuit of the array.
    Type: Grant
    Filed: March 4, 2022
    Date of Patent: June 13, 2023
    Assignee: Micron Technology, Inc.
    Inventor: Tony M. Brewer
  • Patent number: 11675598
    Abstract: Representative apparatus, method, and system embodiments are disclosed for configurable computing. A representative system includes an interconnection network; a processor; and a plurality of configurable circuit clusters. Each configurable circuit cluster includes a plurality of configurable circuits arranged in an array; a synchronous network coupled to each configurable circuit of the array; and an asynchronous packet network coupled to each configurable circuit of the array.
    Type: Grant
    Filed: March 15, 2022
    Date of Patent: June 13, 2023
    Assignee: Micron Technology, Inc.
    Inventor: Tony M. Brewer
  • Patent number: 11656857
    Abstract: A method for the generation of a hardware accelerator (20) is described. The method comprises inputting (110) a program (105) with a plurality of lines of code describing an algorithm to be implemented on the hardware accelerator (20) and generating (125) a dataflow graph in memory from the inputted program (105). The dataflow graph is optimized and an output program (140) created from the dataflow graph is output. The output program (140) is then provided to a high-level synthesis tool for generating the hardware accelerator (20).
    Type: Grant
    Filed: August 9, 2019
    Date of Patent: May 23, 2023
    Assignee: INESC TEC—Instituto de Engenharia de Sistemas
    Inventors: Afonso Soares Canas Ferreira, João Manuel Paiva Cardoso
  • Patent number: 11640295
    Abstract: Systems, apparatuses and methods may provide for technology that generates a dependence graph based on a plurality of intermediate representation (IR) code instructions associated with a compiled program code, generates a set of graph embedding vectors based on the plurality of IR code instructions, and determines, via a neural network, one of an analysis of the compiled program code or an enhancement of the program code based on the dependence graph and the set of graph embedding vectors. The technology may provide a graph attention neural network that includes a recurrent block and at least one task-specific neural network layer, the recurrent block including a graph attention layer and a transition function. The technology may also apply dynamic per-position recurrence-halting to determine a number of recurring steps for each position in the recurrent block based on adaptive computation time.
    Type: Grant
    Filed: June 26, 2020
    Date of Patent: May 2, 2023
    Assignee: Intel Corporation
    Inventors: Mariano Tepper, Bryn Keller, Mihai Capota, Vy Vo, Nesreen Ahmed, Theodore Willke
  • Patent number: 11635997
    Abstract: The present disclosure relates to a dataflow optimization method for low-power operation of a multicore system, the dataflow optimization method including: a step (a) of creating an FSM including a plurality of system states in consideration of dynamic factors that trigger a transition in system states for original dataflow; and a step (b) of optimizing the original dataflow through optimization of the created FSM.
    Type: Grant
    Filed: July 12, 2019
    Date of Patent: April 25, 2023
    Assignee: AJOU UNIVERSITY INDUSTRY-ACADEMIC COOPERATION FOUNDATION
    Inventors: Hoeseok Yang, Hyeonseok Jung
  • Patent number: 11630654
    Abstract: Aspects include modeling data cache utilization for each loop in a loop nest; estimating total data cache lines fetched in one iteration of the loop; and determining the possibility of data cache reuse across loop iterations using data cache lines fetched and associativity constraints. Aspects also include estimating, for memory reference pairs, reuse by one reference of data cache line fetched by another; estimating total number of cache misses for all iterations of the loop; and estimating total number of cache misses of a reference for iterations of a next outer loop as equal to total cache misses for an entire inner loop. Aspects further include estimating memory cost of a loop unroll and jam transformation, without performing the transformation; and extending a data cache model to estimate best unroll-and-jam factors for the loop nest, capable of minimizing total cache misses incurred by the memory references in the loop body.
    Type: Grant
    Filed: August 19, 2021
    Date of Patent: April 18, 2023
    Assignee: International Business Machines Corporation
    Inventors: Wai Hung Tsang, Prithayan Barua, Ettore Tiotto, Bardia Mahjour, Jun Shirako
  • Patent number: 11609766
    Abstract: According to one embodiment, a data processing system performs a secure boot using a security module (e.g., a trusted platform module (TPM)) of a host system. The system verifies that an operating system (OS) and one or more drivers including an accelerator driver associated with a data processing (DP) accelerator is provided by a trusted source. The system launches the accelerator driver within the OS. The system generates a trusted execution environment (TEE) associated with one or more processors of the host system. The system launches an application and a runtime library within the TEE, where the application communicates with the DP accelerator via the runtime library and the accelerator driver.
    Type: Grant
    Filed: January 4, 2019
    Date of Patent: March 21, 2023
    Assignees: BAIDU USA LLC, BAIDU.COM TIMES TECHNOLOGY (BEIJING) CO., LTD., KUNLUNXIN TECHNOLOGY (BEIJING) COMPANY LIMITED
    Inventors: Yueqiang Cheng, Yong Liu, Tao Wei, Jian Ouyang
  • Patent number: 11579853
    Abstract: An information processing apparatus includes a processor configured to: for each of a plurality of loops, acquire loop information including a number of variables, a number of registers, a number of memory commands for inputting and outputting a value of the variable between the register and a main storage device, and a number of arithmetic commands for the value of the variable stored in the register, which are used in the loop; calculate the number of variables, the number of registers, the number of memory commands, and the number of arithmetic commands, which correspond to a combination of the loops that are candidates for loop fusion, for each of the combinations of the loops; determine a combination to which the loop fusion is to be applied among the combinations which are calculated for each of the combinations; and execute the loop fusion on the determined combination.
    Type: Grant
    Filed: November 18, 2021
    Date of Patent: February 14, 2023
    Assignee: FUJITSU LIMITED
    Inventor: Tomoko Nikko
  • Patent number: 11579851
    Abstract: This disclosure relates generally to the field of source code processing, and, more particularly to a method and system for identification of redundant function-level slicing calls. The method disclosed generates program dependence graphs (PDGs) based on a slicing criteria and a function corresponding to the function-level slicing call. Further the method classifies the function-level slicing call into redundant or non-redundant by traversing the PDGs and checking if a predefined condition is satisfied or not. The function-level slicing calls are classified as redundant if the check is not satisfied and are classified as non-redundant if the check is satisfied. The disclosed method can be used in identifying redundant function-level slicing calls in applications such as automated false positive elimination (AFPE), automated test case generation and so on.
    Type: Grant
    Filed: September 20, 2021
    Date of Patent: February 14, 2023
    Assignee: Tata Consultancy Services Limited
    Inventor: Tukaram Bhagwat Muske
  • Patent number: 11537865
    Abstract: A processor system comprises a first and second group of registers and a hardware channel convolution processor unit. The first group of registers is configured to store data elements of channels of a portion of a convolution data matrix. Each register stores at least one data element from each channel. The second group of registers is configured to store data elements of convolution weight matrices including a separate convolution weight matrix for each channel. Each register stores at least one data element from each convolution weight matrix. The hardware channel convolution processor unit is configured to multiply each data element in the first group of registers with a corresponding data element in the second group of registers and sum together the multiplication results for each specific channel to determine corresponding channel convolution result data elements in a corresponding channel convolution result matrix.
    Type: Grant
    Filed: February 18, 2020
    Date of Patent: December 27, 2022
    Assignee: Meta Platforms, Inc.
    Inventors: Krishnakumar Narayanan Nair, Rakesh Komuravelli, Abdulkadir Utku Diril, Ehsan Khish Ardestani Zadeh, Yuchen Hao, Martin Schatz, Thomas Mark Ulrich, Olivia Wu, Anup Ramesh Kadkol, Amin Firoozshahian
  • Patent number: 11403083
    Abstract: An offloading server includes: a data transfer designation section configured to analyze reference relationships of variables used in loop statements in an application and designate, for data that can be transferred outside a loop, a data transfer using an explicit directive that explicitly specifies a data transfer outside the loop; a parallel processing designation section configured to identify loop statements in the application and specify a directive specifying application of parallel processing by an accelerator and perform compilation for each of the loop statements; and a parallel processing pattern creation section configured to exclude loop statements causing a compilation error from loop statements to be offloaded and create a plurality of parallel processing patterns each of which specifies whether to perform parallel processing for each of the loop statements not causing a compilation error.
    Type: Grant
    Filed: June 3, 2019
    Date of Patent: August 2, 2022
    Assignee: Nippon Telegraph and Telephone Corporation
    Inventors: Yoji Yamato, Hirofumi Noguchi, Misao Kataoka, Takuma Isoda, Tatsuya Demizu
  • Patent number: 11354159
    Abstract: A method comprises: compiling the code segment with a compiler; and determining, based on an intermediate result of the compiling, a resource associated with a dedicated processing unit and for executing the code segment. As such, the resource required for executing a code segment may be determined quickly without actually executing the code segment and allocating or releasing the resource, which helps subsequent resource allocation and further brings about a better user experience.
    Type: Grant
    Filed: August 14, 2019
    Date of Patent: June 7, 2022
    Assignee: EMC IP Holding Company LLC
    Inventors: Jinpeng Liu, Pengfei Wu, Junping Zhao, Kun Wang
  • Patent number: 11340903
    Abstract: The present application discloses a processing method, device, equipment and storage medium of a loop instruction, and relates to the fields of voice and chips. A specific embodiment is: acquiring a computer program including a first loop body, where the first loop body is generated according to a second loop body in a software code to be compiled, the first loop body includes a plurality of first loop instructions, the plurality of first loop instructions can be identified by a hardware structure of a computer device; in the case that the first loop body is detected, determining loop parameters of the first loop body according to the plurality of first loop instructions; acquiring the plurality of first loop instructions according to the loop parameters of the first loop body; executing the plurality of first loop instructions.
    Type: Grant
    Filed: March 23, 2021
    Date of Patent: May 24, 2022
    Assignee: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
    Inventors: Junhui Wen, Chao Tian
  • Patent number: 11272028
    Abstract: A server in a content delivery (CD) network that distributes content on behalf of one or more subscribers. Responsive to a request from a client for a particular resource, if the particular resource is already in a cache on the server, serving the particular to the client from the cache; otherwise if the particular resource is not already cached on the server, when a count value exceeds a first threshold value, obtaining, caching, and serving the particular resource. When the count value is less than a second threshold value, obtaining and serving the particular resource. When the count value is: (i) not less than the second threshold value, and (ii) not greater than the first threshold value, then obtaining the particular resource and selectively caching the particular resource; and serving the particular resource to the client.
    Type: Grant
    Filed: April 29, 2020
    Date of Patent: March 8, 2022
    Assignee: Level 3 Communications, LLC
    Inventors: Daniel Lee Jensen, William Crowder, Christopher Newton, William R. Power
  • Patent number: 11221835
    Abstract: One or more execution traces of an application are accessed. The one or more execution traces have been collected at a basic block level. Basic blocks in the one or more execution traces are scored. Scores for the basic blocks represent benefits of performing binary slimming at the corresponding basic blocks. Runtime binary slimming is performed of the application based on the scores of the basic blocks.
    Type: Grant
    Filed: February 10, 2020
    Date of Patent: January 11, 2022
    Assignee: International Business Machines Corporation
    Inventors: Michael Vu Le, Ian Michael Molloy, Taemin Park
  • Patent number: 11148287
    Abstract: The system can include a set of joints, a controller, and a model engine; and can optionally include a support structure and an end effector. Joints can include: a motor, a transmission mechanism, an input sensor, and an output sensor. The system can enable articulation of the plurality of joints.
    Type: Grant
    Filed: April 13, 2021
    Date of Patent: October 19, 2021
    Assignee: Orangewood Labs Inc.
    Inventors: Abhinav Kumar, Aditya Bhatia, Akash Bansal, Anubhav Singh, Ashutosh Prakash, Aman Malhotra, Harshit Gaur, Prasang Srivasatava, Ashish Chauhan
  • Patent number: 11144291
    Abstract: Methods of accelerating the execution of neural networks are disclosed. A description of a neural network may be received. A plurality of operators may be identified based on the description of the neural network. A plurality of symbolic models associated with the plurality of operators may be generated. For each symbolic model, a nested loop associated with an operator may be identified, a loop order may be defined, and a set of data dependencies may be defined. A set of inter-operator dependencies may be extracted based on the description of the neural network. The plurality of symbolic models and the set of inter-operator dependencies may be analyzed to identify a combinable pair of nested loops. The combinable pair of nested loops may be combined to form a combined nested loop.
    Type: Grant
    Filed: November 27, 2019
    Date of Patent: October 12, 2021
    Assignee: Amazon Technologies, Inc.
    Inventors: Hongbin Zheng, Preston Pengra Briggs, Tobias Joseph Kastulus Edler von Koch, Taemin Kim, Randy Renfu Huang
  • Patent number: 11061678
    Abstract: In one embodiment, a method for improving a performance of an integrated circuit includes implementing one or more computing devices executing a compiler program that: (i) evaluates a target instruction set intended for execution by an integrated circuit; (ii) identifies one or more nested loop instructions within the target instruction set based on the evaluation; (iii) evaluates whether a most inner loop body within the one or more nested loop instructions comprises a candidate inner loop body that requires a loop optimization that mitigates an operational penalty to the integrated circuit based on one or more executional properties of the most inner loop instruction; and (iv) implements the loop optimization that modifies the target instruction set to include loop optimization instructions to control, at runtime, an execution and a termination of the most inner loop body thereby mitigating the operational penalty to the integrated circuit.
    Type: Grant
    Filed: December 18, 2020
    Date of Patent: July 13, 2021
    Assignee: qaudric.io Inc
    Inventors: Nigel Drego, Mrinalini Ravichandran, Jianman Chang, Daniel Firu, Veerbhan Kheterpal
  • Patent number: 10996989
    Abstract: Methods and systems for optimizing an application for a computing system having multiple distinct memory locations that are interconnected by one or more communication channels include determining one or more data handling properties for a data region in an application. One or more data handling policies for the data region are determined based on the one or more data handling properties. Data setup costs are determined for a scope in the application that uses the data region in different memory locations based on the one or more data handling properties. The application is optimized in accordance with the one or more data handling policies and the data setup costs for the different memory locations.
    Type: Grant
    Filed: June 13, 2016
    Date of Patent: May 4, 2021
    Assignee: International Business Machines Corporation
    Inventors: Tong Chen, John Kevin O'Brien, Daniel A. Prener, Zehra N. Sura
  • Patent number: 10901944
    Abstract: Storing an incoming data stream using successive files that are consecutively populated. The appropriate file to populate a given data stream portion into is determined by mapping the data stream offset to a file, and potentially also an address within that file. The successive files may be the same size, so that the file can be identified based on the data stream address (or offset) without the use of an index. Furthermore, the files may be easily named by having that size be some multiple of a binary power of bytes. That way, the files themselves can be automatically and named and identified by using the more significant bit or bits of the data stream offset to uniquely identify the file and establish ordering of the files. Replication may occur from a primary to a secondary store by transmitting the offset, and the actual data to be stored.
    Type: Grant
    Filed: May 24, 2017
    Date of Patent: January 26, 2021
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Rogerio Ramos, Fayssal Martani, Cristian Diaconu, Karthick Krishnamoorthy, Jacob R. Lorch
  • Patent number: 10831616
    Abstract: An information processing system, computer readable storage medium, and method for supporting resilient execution of computer programs. A method provides a resilient store wherein information in the resilient store can be accessed in the event of a failure. The method periodically checkpoints application state in the resilient store. A resilient executor comprises software which executes applications by catching failures. The method uses the resilient executor to execute at least one application. In response to the resilient executor detecting a failure, restoring application state information to the at least one application from a checkpoint stored in the resilient store, the resilient executor resuming execution of the at least one application with the restored application state information.
    Type: Grant
    Filed: April 18, 2019
    Date of Patent: November 10, 2020
    Assignee: International Business Machines Corporation
    Inventors: Arun Iyengar, Joshua J. Milthorpe
  • Patent number: 10831617
    Abstract: An information processing system, computer readable storage medium, and method for supporting resilient execution of computer programs. A method provides a resilient store wherein information in the resilient store can be accessed in the event of a failure. The method periodically checkpoints application state in the resilient store. A resilient executor comprises software which executes applications by catching failures. The method uses the resilient executor to execute at least one application. In response to the resilient executor detecting a failure, restoring application state information to the at least one application from a checkpoint stored in the resilient store, the resilient executor resuming execution of the at least one application with the restored application state information.
    Type: Grant
    Filed: April 23, 2019
    Date of Patent: November 10, 2020
    Assignee: International Business Machines Corporation
    Inventors: Arun Iyengar, Joshua J. Milthorpe
  • Patent number: 10754744
    Abstract: The amount of speed-up that can be obtained by optimizing the program to run on a different architecture is determined by static measurements of the program. Multiple such static measurements are processed by a machine learning system after being discretized to alter their accuracy vs precision. Static analysis requires less analysis overhead and permits analysis of program portions to optimize allocation of porting resources on a large program.
    Type: Grant
    Filed: March 15, 2016
    Date of Patent: August 25, 2020
    Assignee: Wisconsin Alumni Research Foundation
    Inventors: Karthikeyan Sankaralingam, Newsha Ardalani, Urmish Thakker
  • Patent number: 10747511
    Abstract: As a memory usage optimization, a compiler identifies coroutines whose activation frames can be allocated on a caller's stack instead of allocating the frame on the heap. For example, when the compiler determines that a coroutine C's life cannot extend beyond the life of the routine R that first calls the coroutine C, the compiler generates code to allocate the activation frame for C on the stack of R, instead of generating code to allocate C's frame from heap memory. In some cases, as another optimization, code for coroutine C is also inlined with code for the routine R that calls C. Coroutine activation frame content variations and layout variations are also described.
    Type: Grant
    Filed: June 26, 2015
    Date of Patent: August 18, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: James J. Radigan, Gor Nishanov