Including Loop Patents (Class 717/160)
-
Patent number: 10089464
Abstract: A device receives data, identifies a context associated with the data, and identifies a script, within the data, associated with the context. The device parses the script to identify tokens, forms nodes based on the tokens, and assembles a syntax tree using the nodes. The device renames one or more identifiers associated with the nodes and generates a normalized text, associated with the script, based on the syntax tree after renaming the one or more identifiers. The device determines whether the normalized text matches a regular expression signature and processes the data based on determining whether the normalized text matches the regular expression signature. The device processes the data by a first process when the normalized text matches the regular expression signature or by a second process, different from the first process, when the normalized text does not match the regular expression signature.
Type: Grant
Filed: August 15, 2016
Date of Patent: October 2, 2018
Assignee: Juniper Networks, Inc.
Inventor: Ankur Tyagi
-
Patent number: 10089009
Abstract: A computer-implemented method for layered storage of enterprise data comprises receiving from one or more virtual machines data blocks; time-based grouping the data blocks into data containers; dividing each data container into X fixed length mega-blocks; for each data container applying erasure encoding to the X fixed length mega-blocks to thereby generate Y fixed length mega-blocks with redundant data, Y being larger than X; and distributed storing the Y fixed length mega-blocks across one or multiple backend storage systems.
Type: Grant
Filed: January 17, 2017
Date of Patent: October 2, 2018
Assignee: INURON
Inventor: Kurt Glazemakers
-
Patent number: 10089088
Abstract: A computer configured to perform compiling, including a memory configured to store a source program and a processor, the processor is configured to execute a method which includes: compiling the source program, wherein a number of cycles desired for executing each function included in the source program and information indicating a call relationship between a task and a function called by the task are generated, and performing link processing, wherein a number of cycles desired for executing each task is generated based on the number of cycles desired for executing each function and the call relationship.
Type: Grant
Filed: May 26, 2016
Date of Patent: October 2, 2018
Assignee: FUJITSU LIMITED
Inventor: Kuninori Ishii
-
Patent number: 10074151
Abstract: Described herein are technologies to facilitate real-time computer vision applications, especially those with autonomous or semi-autonomous locomotive robots (e.g., drones or self-driving cars). More particularly, the technologies described herein facilitate, for example, real-time motion estimation using dense optical flow. The technologies accelerate dense optical flow (DOF) processing of images by using the parallel processing techniques of a single-instruction, multiple data (SIMD) computing system.
Type: Grant
Filed: September 30, 2015
Date of Patent: September 11, 2018
Assignee: Intel Corporation
Inventor: Avigdor Eldar
-
Patent number: 10039036
Abstract: The invention provides a system and method for repairing corrupt security information. At a serving node in a telecommunications network, security capabilities of a terminal are received when the terminal registers with the serving node. The received security capabilities are stored. A path switch request message is received from a target base station following an X2 handover request sent from a source base station to the target base station for handover of the terminal, the path switch request including the security capabilities of the terminal. The serving node determines whether the security capabilities of the terminal stored in the storage medium should be sent to the target base station. If so, the serving node sends the stored security capabilities of the terminal to the target base station for use in reselecting security algorithms to be used in communications between the target base station and terminal following the handover.
Type: Grant
Filed: May 19, 2017
Date of Patent: July 31, 2018
Assignee: TELEFONAKTIEBOLAGET LM ERICSSON (PUBL)
Inventor: Karl Norrman
-
Patent number: 9898268
Abstract: A method and system for enhanced local commoning optimization of compilation of a program. Commoning of volatiles within an extended block for a particular memory model associated with a particular programming language is performed, using a two pass approach. Within a first pass, a determination is made as to where in the program to evaluate volatile expressions that can be commoned. In a second pass, all remaining expressions that are not volatile expressions are commoned.
Type: Grant
Filed: July 20, 2016
Date of Patent: February 20, 2018
Assignee: International Business Machines Corporation
Inventors: Andrew J. Craik, Patrick R. Doyle, Vijay Sundaresan
-
Patent number: 9892661
Abstract: A method for digital immunity includes identifying a call graph of an executable entity, and mapping nodes of the call graph to a cipher table of obscured information, such that each node is based on invariants in the executable entity. A cipher table maintains associations between the invariants and the obscured information. Construction of an obscured information item, such as an executable set of instructions or a program, involves extracting, from the cipher table, ordered portions of the obscured information, in which the ordered portions have a sequence based on the ordering of the invariants, and ensuring that the obscured information matches a predetermined ordering corresponding to acceptable operation, such as by execution of the instructions represented by the obscured information, or steganographic target program (to distinguish from the executable entity being evaluated). The unmodified nature of the executable entity is assured by successful execution of the steganographic target program.
Type: Grant
Filed: February 1, 2017
Date of Patent: February 13, 2018
Assignee: DIGITAL IMMUNITY LLC
Inventors: Thomas H. Probert, Henry R. Tumblin
-
Patent number: 9753727
Abstract: Generally, this disclosure provides technologies for generating and executing partially vectorized code that may include backward dependencies within a loop body of the code to be vectorized. The method may include identifying backward dependencies within a loop body of the code; selecting one or more ranges of iterations within the loop body, wherein the selected ranges exclude the identified backward dependencies; and vectorizing the selected ranges. The system may include a vector processor configured to provide predicated vector instruction execution, loop iteration range enabling, and dynamic loop dependence checking.
Type: Grant
Filed: October 25, 2012
Date of Patent: September 5, 2017
Assignee: INTEL CORPORATION
Inventors: Tin-Fook Ngai, Chunxiao Lin, Yingzhe Shen, Chao Zhang
-
Patent number: 9696995
Abstract: Mechanisms for extracting data dependencies during runtime are provided. With these mechanisms, a portion of code having a loop is executed. A first parallel execution group is generated for the loop, the group comprising a subset of iterations of the loop less than a total number of iterations of the loop. The first parallel execution group is executed by executing each iteration in parallel. Store data for iterations are stored in corresponding store caches of the processor. Dependency checking logic of the processor determines, for each iteration, whether the iteration has a data dependence. Only the store data for stores where there was no data dependence determined are committed to memory.
Type: Grant
Filed: December 30, 2009
Date of Patent: July 4, 2017
Assignee: International Business Machines Corporation
Inventors: Alexandre E. Eichenberger, Brian K. Flachs, Charles R. Johns, Mark R. Nutter
-
Patent number: 9696996
Abstract: Mechanisms for extracting data dependencies during runtime are provided. With these mechanisms, a portion of code having a loop is executed. A first parallel execution group is generated for the loop, the group comprising a subset of iterations of the loop less than a total number of iterations of the loop. The first parallel execution group is executed by executing each iteration in parallel. Store data for iterations are stored in corresponding store caches of the processor. Dependency checking logic of the processor determines, for each iteration, whether the iteration has a data dependence. Only the store data for stores where there was no data dependence determined are committed to memory.
Type: Grant
Filed: March 30, 2012
Date of Patent: July 4, 2017
Assignee: International Business Machines Corporation
Inventors: Alexandre E. Eichenberger, Brian K. Flachs, Charles R. Johns, Mark R. Nutter
-
Patent number: 9619290
Abstract: A method of balancing execution rates for a plurality of parallel program loops being executed concurrently by a processor may include estimating a completion time for each program loop of the plurality of program loops, determining a difference between the estimated completion time of a first program loop of the plurality of program loops and the estimated completion time of a second program loop of the plurality of program loops, and decreasing the difference by adjusting an execution rate of the first program loop.
Type: Grant
Filed: March 6, 2015
Date of Patent: April 11, 2017
Assignee: Advanced Micro Devices, Inc.
Inventors: Peter Bailey, Indrani Paul, Manish Arora
-
Patent number: 9600254
Abstract: A method for reducing loop branches comprises analyzing an intermediate code to identify a candidate loop; analyzing the candidate loop to identify a candidate conditional statement containing at least one mutable operand; and determining if the computation in the candidate conditional statement is monotonic. The method further comprises calculating initial and final values of the mutable operand and generating a first version of the candidate loop which does not contain the candidate conditional statement and which is configured to be executed if the initial and final values of the mutable operand satisfy a range check. The method also comprises generating a second version of the candidate loop which contains the candidate conditional statement and which is configured to be executed if at least one of the initial and final values of the mutable operand does not satisfy the range check.
Type: Grant
Filed: November 30, 2015
Date of Patent: March 21, 2017
Assignee: International Business Machines Corporation
Inventors: Yaoqing Gao, Archana Ravindar
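The two-version idea in this abstract can be illustrated with a small hand-written sketch. It is not taken from the patent: the function name `sum_below_limit`, the condition `i < limit`, and the summing loop body are all assumptions chosen for illustration. Because the loop index only increases, checking the condition at its first and last values is enough to know it holds for every iteration.

```python
def sum_below_limit(data, start, stop, limit):
    """Sum data[i] for i in range(start, stop), skipping any i >= limit.

    Hypothetical example of loop versioning: the conditional `i < limit`
    is monotonic in i, so a range check on the first and last index
    decides which loop version runs.
    """
    if start >= stop:
        return 0  # empty loop, nothing to version

    first, last = start, stop - 1
    total = 0
    if first < limit and last < limit:
        # Version 1: range check passed, so the conditional is removed.
        for i in range(start, stop):
            total += data[i]
    else:
        # Version 2: range check failed, so the conditional is kept.
        for i in range(start, stop):
            if i < limit:
                total += data[i]
    return total
```

Both versions return the same result for any input; only the number of branches executed inside the loop differs.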
-
Patent number: 9602289
Abstract: A method for digital immunity includes identifying a call graph of an executable entity, and mapping nodes of the call graph to a cipher table of obscured information, such that each node is based on invariants in the executable entity. A cipher table maintains associations between the invariants and the obscured information. Construction of an obscured information item, such as an executable set of instructions or a program, involves extracting, from the cipher table, ordered portions of the obscured information, in which the ordered portions have a sequence based on the ordering of the invariants, and ensuring that the obscured information matches a predetermined ordering corresponding to acceptable operation, such as by execution of the instructions represented by the obscured information, or steganographic target program (to distinguish from the executable entity being evaluated). The unmodified nature of the executable entity is assured by successful execution of the steganographic target program.
Type: Grant
Filed: November 25, 2015
Date of Patent: March 21, 2017
Assignee: DIGITAL IMMUNITY LLC
Inventor: Thomas H Probert
-
Patent number: 9588747
Abstract: Methods and apparatuses of converting a program, which may enhance an execution speed of a computer program, are provided. The method may include receiving a program, detecting at least one loop statement including at least one branch statement within the program, determining whether the loop statement may be split into one or more sub-loop statements which perform the same function as a function of the loop statement and from which the branch statement has been removed, splitting the loop statement into the sub-loop statements and removing the branch statement included in the loop statement if it is determined that the loop statement may be split as a result of the determination, and outputting a result of removing the branch statement.
Type: Grant
Filed: March 11, 2014
Date of Patent: March 7, 2017
Assignee: Samsung Electronics Co., Ltd.
Inventors: Sang-oak Woo, Seok-yoon Jung, Si-hwa Lee, Igor M. Laevskiy, Oleg V. Talalov, Vladislav Y. Aranov
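As a rough illustration of splitting a loop to remove a branch, the sketch below shows a loop whose branch depends only on the index, followed by an equivalent pair of sub-loops with no branch. The function names, the `pivot` parameter, and the arithmetic in the loop bodies are assumptions made for the example and do not come from the patent.

```python
def with_branch(values, pivot):
    """Original form: the branch is evaluated on every iteration."""
    out = []
    for i, v in enumerate(values):
        if i < pivot:
            out.append(v * 2)
        else:
            out.append(v + 1)
    return out


def split_loops(values, pivot):
    """Split form: two sub-loops, each with the branch removed."""
    n = len(values)
    p = max(0, min(pivot, n))  # clamp the split point into [0, n]
    out = []
    for i in range(0, p):      # iterations where `i < pivot` held
        out.append(values[i] * 2)
    for i in range(p, n):      # iterations where it did not
        out.append(values[i] + 1)
    return out
```

The split is valid here because the branch condition partitions the iteration space into two contiguous ranges; a condition that depends on the data rather than the index would not split this simply.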
-
Patent number: 9582332
Abstract: In accordance with some embodiments, a public infrastructure as a service (IaaS) user can provide a file, to a cloud service provider, with information about the specific instructions and opcodes that may be used in an application run on the cloud service provider's system. This information may be developed at compile time by the user before the user deploys the workload onto the public IaaS cloud. Thus the user has complete control over the information that is provided.
Type: Grant
Filed: August 31, 2012
Date of Patent: February 28, 2017
Assignee: Intel Corporation
Inventor: Radhakrishna Hiremane
-
Patent number: 9569185
Abstract: A method for compiling code includes receiving a code section representation including a guard at a location, placing, at the guard, a triggering condition, and relocating the guard from the location to a second location in the code section representation. The method also includes transforming the guard into a control-split instruction. The control-split instruction includes a deoptimization branch and a continue execution branch. The method further includes placing, at the guard, a deoptimization instruction, and selecting a symbolic frame state linked to a side-effecting instruction. The side-effecting instruction is the last side-effecting instruction before the control-split instruction. The method also includes linking the deoptimization instruction with the symbolic frame state based on the symbolic frame state being linked to the side-effecting instruction, unlinking the symbolic frame state from the side-effecting instruction, and storing the code section representation.
Type: Grant
Filed: February 7, 2014
Date of Patent: February 14, 2017
Assignee: Oracle International Corporation
Inventors: Thomas Wuerthinger, Gilles Marie Duboscq
-
Patent number: 9552195
Abstract: Disclosed here are methods, systems, paradigms and structures for incrementally compiling scripts at runtime to generate executable code. The incremental compilation generates executable code corresponding to basic blocks of a script in various phases and at various scopes. In a first phase, an executable code for a basic block of the script is generated for a set of types of variables of the basic block. The generated executable block is stored and executed for subsequent requests. In a second phase, a set of executable blocks whose profiling information, such as frequency of (a) execution, (b) transition between two executable blocks, or (c) execution of a particular path, satisfies an optimization criterion is identified. The identified set of executable blocks are combined to generate an executable control region, which is more optimal than the executable blocks generated in the first phase. The executable control region is executed for subsequent requests.
Type: Grant
Filed: March 8, 2013
Date of Patent: January 24, 2017
Assignee: Facebook, Inc.
Inventors: Ali-Reza Adl-Tabatabai, Guilherme de Lima Ottoni
-
Patent number: 9424011
Abstract: A computer-implemented method, carried out by one or more processors, for recursive expression reduction. In an embodiment, the method comprises the steps of identifying a candidate loop, where the candidate loop includes at least one or more reduction variables and reduction operations; altering grouping of loop invariants and loop variants within the candidate loop; and performing recursive expression simplification for an inner loop, wherein the inner loop is located within the candidate loop.
Type: Grant
Filed: April 1, 2014
Date of Patent: August 23, 2016
Assignee: International Business Machines Corporation
Inventors: Shimin Cui, Yaoqing Gao
-
Patent number: 9405516
Abstract: A computer-implemented method, carried out by one or more processors, for recursive expression reduction. In an embodiment, the method comprises the steps of identifying a candidate loop, where the candidate loop includes at least one or more reduction variables and reduction operations; altering grouping of loop invariants and loop variants within the candidate loop; and performing recursive expression simplification for an inner loop, wherein the inner loop is located within the candidate loop.
Type: Grant
Filed: January 23, 2015
Date of Patent: August 2, 2016
Assignee: International Business Machines Corporation
Inventors: Shimin Cui, Yaoqing Gao
-
Patent number: 9361079
Abstract: A technique is disclosed for executing a compiled parallel application on a general purpose processor. The compiled parallel application comprises parallel thread execution code, which includes single-instruction multiple-data (SIMD) constructs, as well as references to intrinsic functions conventionally available in a graphics processing unit. The parallel thread execution code is transformed into an intermediate representation, which includes vector instruction constructs. The SIMD constructs are mapped to vector instructions available within the intermediate representation. Intrinsic functions are mapped to corresponding emulated runtime implementations. The technique advantageously enables parallel applications compiled for execution on a graphics processing unit to be executed on a general purpose central processing unit configured to support vector instructions.
Type: Grant
Filed: January 30, 2012
Date of Patent: June 7, 2016
Assignee: NVIDIA Corporation
Inventors: Vinod Grover, Andrew Kerr, Sean Lee
-
Patent number: 9348587
Abstract: The present invention relates to a processor having a trace cache and a plurality of ALUs arranged in a matrix, comprising an analyzer unit located between the trace cache and the ALUs, wherein the analyzer unit analyzes the code in the trace cache, detects loops, transforms the code, and issues to the ALUs sections of the code combined to blocks for joint execution for a plurality of clock cycles.
Type: Grant
Filed: July 8, 2011
Date of Patent: May 24, 2016
Assignee: Hyperion Core, Inc.
Inventor: Martin Vorbach
-
Patent number: 9342284
Abstract: Mechanisms for reducing memory access violations are disclosed. Sets of instructions may be identified and the identified sets of instructions may be re-translated or optimized to generate other sets of instructions. Execution of the other sets of instructions is analyzed to determine whether additional memory access violations occur. When additional memory access violations occur, further sets of instructions may be generated or re-translation/optimization of instructions may be disabled.
Type: Grant
Filed: September 27, 2013
Date of Patent: May 17, 2016
Assignee: Intel Corporation
Inventors: Wessam M. Hassanein, Abhay S. Kanhere, Paul Caprioli
-
Patent number: 9280339
Abstract: This disclosure describes systems, methods, and computer-readable media related to online advertisement campaign recommendations. An archive file may be received from a server. The archive file may include one or more compiled code files and a manifest file. The archive file may be unpackaged. The one or more compiled code files may be optimized based at least in part on the manifest file. The optimizing the one or more compiled code files may include identifying a first sequence of bytes and a second sequence of bytes from one or more sources; formatting the second sequence of bytes based at least in part on one or more rules; searching the one or more compiled code files to identify one or more sequences of bytes matching the first sequence of bytes; and replacing the identified one or more sequences of bytes with the formatted second sequence of bytes. The optimized compiled code files may be stored.
Type: Grant
Filed: December 12, 2013
Date of Patent: March 8, 2016
Assignee: Amazon Technologies, Inc.
Inventors: Andrew Eugene Prunicki, Jianming Wu
-
Patent number: 9275246
Abstract: A system and method for static detection and categorization of information-flow downgraders includes transforming a program stored in a memory device by statically analyzing program variables to yield a single assignment to each variable in an instruction set. The instruction set is translated to production rules with string operations. A context-free grammar is generated from the production rules to identify a finite set of strings. An information-flow downgrader function is identified by checking the finite set of strings against one or more function specifications.
Type: Grant
Filed: October 8, 2009
Date of Patent: March 1, 2016
Assignee: International Business Machines Corporation
Inventors: Yinnon Haviv, Roee Hay, Marco Pistoia, Guy Podjarny, Adi Sharabani, Takaaki Tateishi, Omer Tripp, Omri Weisman
-
Patent number: 9262166
Abstract: Various embodiments are directed to a heterogeneous processor architecture comprised of a CPU and a GPU on the same processor die. The heterogeneous processor architecture may optimize source code in a GPU compiler using vector strip mining to reduce instructions of arbitrary vector lengths into GPU supported vector lengths and loop peeling. It may be first determined that the source code is eligible for optimization if more than one machine code instruction of compiled source code under-utilizes GPU instruction bandwidth limitations. The initial vector strip mining results may be discarded and the first iteration of the inner loop body may be peeled out of the loop. The type of operands in the source code may be lowered and the peeled out inner loop body of source code may be vector strip mined again to obtain optimized source code.
Type: Grant
Filed: November 30, 2011
Date of Patent: February 16, 2016
Assignee: INTEL CORPORATION
Inventors: Xiaozhu Kang, Biju George, Ken Lueh
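Strip mining on its own, without the peeling and operand-lowering steps the abstract goes on to describe, can be sketched as follows. The helper names `strip_mine` and `add_vectors` and the default width of 4 are assumptions for illustration; each strip stands in for one operation at a hardware-supported vector length.

```python
def strip_mine(length, width):
    """Yield (start, stop) strips covering range(length), each at most `width` long."""
    for start in range(0, length, width):
        yield start, min(start + width, length)


def add_vectors(a, b, width=4):
    """Element-wise add of two equal-length lists, processed one strip at a time."""
    out = [0] * len(a)
    for start, stop in strip_mine(len(a), width):
        # One strip models a single fixed-width vector instruction;
        # the last strip may be shorter than `width`.
        out[start:stop] = [x + y for x, y in zip(a[start:stop], b[start:stop])]
    return out
```

A vector of arbitrary length is thus reduced to a sequence of operations no wider than the supported length, with a shorter remainder strip at the end when the length is not a multiple of the width.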
-
Patent number: 9256411
Abstract: An optimizing compiler includes a strength reduction mechanism that optimizes a computer program that includes conditional operations by analyzing the instructions in the computer program in a single pass, determining whether instruction substitution is profitable for original instructions in the code, and performing instruction substitution for one or more original instructions for which instruction substitution is deemed profitable, including conditional operations. The substituted instructions result in strength reduction in the computer program.
Type: Grant
Filed: February 18, 2013
Date of Patent: February 9, 2016
Assignee: International Business Machines Corporation
Inventor: William J. Schmidt
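For readers unfamiliar with the term, strength reduction in general replaces a costly operation with a cheaper equivalent. The sketch below shows the textbook loop case, a multiplication by the loop index replaced with a running addition. It does not attempt the single-pass profitability analysis or the handling of conditional operations that the abstract describes, and the function names are assumptions for illustration.

```python
def addresses_multiply(base, stride, n):
    """Original form: one multiplication per iteration."""
    return [base + i * stride for i in range(n)]


def addresses_reduced(base, stride, n):
    """Strength-reduced form: the multiplication becomes a running addition."""
    out = []
    addr = base
    for _ in range(n):
        out.append(addr)
        addr += stride  # substituted instruction: add instead of multiply
    return out
```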
-
Patent number: 9250879
Abstract: An optimizing compiler includes a strength reduction mechanism that optimizes a computer program that includes conditional operations by analyzing the instructions in the computer program in a single pass, determining whether instruction substitution is profitable for original instructions in the code, and performing instruction substitution for one or more original instructions for which instruction substitution is deemed profitable, including conditional operations. The substituted instructions result in strength reduction in the computer program.
Type: Grant
Filed: February 14, 2013
Date of Patent: February 2, 2016
Assignee: International Business Machines Corporation
Inventor: William J. Schmidt
-
Patent number: 9250895
Abstract: According to one exemplary embodiment, a method for establishing subsystem boundaries is provided. The method may include receiving an input program having a plurality of subroutines and at least one inter-subroutine call. The method may include generating a graph having a plurality of nodes and at least one edge, wherein the at least one edge includes a first end connected to a first node and a second end connected to a second node. The method may include assigning an edge weight to the at least one edge wherein the edge weight is based on a number of second ends received by the second node. The method may include determining, based on the assigned edge weight, a distance value between each pair of nodes. The method may include generating a grouping of nodes based on the determined distance value between each pair of nodes.
Type: Grant
Filed: June 24, 2014
Date of Patent: February 2, 2016
Assignee: International Business Machines Corporation
Inventor: Michael T. Strosaker
-
Patent number: 9244677
Abstract: Loop vectorization methods and apparatus are disclosed. An example method includes setting a dynamic adjustment value of a vectorization loop; executing the vectorization loop to vectorize a loop by grouping iterations of the loop into one or more vectors; identifying a dependency between iterations of the loop; and setting the dynamic adjustment value based on the identified dependency.
Type: Grant
Filed: September 28, 2012
Date of Patent: January 26, 2016
Assignee: Intel Corporation
Inventors: Nalini Vasudevan, Jayashankar Bharadwaj, Christopher J. Hughes, Milind B. Girkar, Mark J Charney, Robert Valentine, Victor W. Lee, Daehyun Kim, Albert Hartono, Sara S. Baghsorkhi
-
Patent number: 9195444
Abstract: In a compiler apparatus, a memory unit stores a first code including a loop having a first arithmetic expression including a first variable that refers to a result of K iterations previous calculation. A transformation unit develops the first arithmetic expression into a second arithmetic expression not including the first variable, using a second variable that refers to a result of K+1 iterations or more previous calculation, compares an execution time for executing the loop on the basis of the first arithmetic expression with an execution time for executing the loop in which the calculations of Jth and J+Kth iterations of the loop are executed in parallel on the basis of the second arithmetic expression, and decides based on the comparison result whether to transform the first code into a second code including a parallel processing instruction for executing the Jth and J+Kth iterations in parallel.
Type: Grant
Filed: March 2, 2015
Date of Patent: November 24, 2015
Assignee: FUJITSU LIMITED
Inventor: Masatoshi Haraguchi
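The expansion step can be made concrete with an assumed linear recurrence x[i] = a*x[i-K] + b, which is not taken from the patent. Substituting the recurrence into itself gives x[i] = a*a*x[i-2K] + a*b + b, which no longer refers to the value from K iterations back, so iterations J and J+K no longer depend on each other. The sketch below only checks that both forms compute the same sequence; it does not model the execution-time comparison or the parallel instruction.

```python
def recurrence_original(seed, a, b, n):
    """x[i] = a * x[i-K] + b, with K = len(seed) seed values given."""
    k = len(seed)
    x = list(seed)
    for i in range(k, n):
        x.append(a * x[i - k] + b)
    return x


def recurrence_expanded(seed, a, b, n):
    """Same sequence, but the main loop refers only to x[i-2K]."""
    k = len(seed)
    x = list(seed)
    # Prologue: the first K computed values still need the original form,
    # because x[i-2K] does not exist yet.
    for i in range(k, min(2 * k, n)):
        x.append(a * x[i - k] + b)
    # Main loop: iterations i and i+K are now independent of each other.
    for i in range(2 * k, n):
        x.append(a * a * x[i - 2 * k] + a * b + b)
    return x
```

The expanded form does more arithmetic per iteration, which is why a compiler would weigh it against the gain from running the two iterations in parallel before transforming the code.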
-
Patent number: 9176717
Abstract: An embodiment of the invention provides a method for exploiting stateless and stateful data parallelism in a streaming application, wherein a compiler determines whether an operator of the streaming application is safe to parallelize based on a definition of the operator and an instance of the definition. The operator is not safe to parallelize when the operator has selectivity greater than 1, wherein the selectivity is the number of output tuples generated for each input tuple. Parallel regions are formed within the streaming application with the compiler when the operator is safe to parallelize. Synchronization strategies for the parallel regions are determined with the compiler, wherein the synchronization strategies are determined based on the definition of the operator and the instance of the definition. The synchronization strategies of the parallel regions are enforced with a runtime system.
Type: Grant
Filed: October 12, 2012
Date of Patent: November 3, 2015
Assignee: International Business Machines Corporation
Inventors: Bugra Gedik, Martin J. Hirzel, Scott A. Schneider, Kun-Lung Wu
-
Patent number: 9170794
Abstract: An embodiment of the invention provides a method for exploiting stateless and stateful data parallelism in a streaming application, wherein a compiler determines whether an operator of the streaming application is safe to parallelize based on a definition of the operator and an instance of the definition. The operator is not safe to parallelize when the operator has selectivity greater than 1, wherein the selectivity is the number of output tuples generated for each input tuple. Parallel regions are formed within the streaming application with the compiler when the operator is safe to parallelize. Synchronization strategies for the parallel regions are determined with the compiler, wherein the synchronization strategies are determined based on the definition of the operator and the instance of the definition. The synchronization strategies of the parallel regions are enforced with a runtime system.
Type: Grant
Filed: August 28, 2012
Date of Patent: October 27, 2015
Assignee: International Business Machines Corporation
Inventors: Bugra Gedik, Martin J. Hirzel, Scott A. Schneider, Kun-Lung Wu
-
Patent number: 9170792
Abstract: In an embodiment, a system includes a processor including at least one core to execute operations of a loop that includes S stages. The system also includes stage insertion means for adding a delay stage to the loop to increase a lifetime of a corresponding register associated with a first variable of the loop and to delay storage of contents of the register. The system also includes a dynamic random access memory (DRAM). Other embodiments are described and claimed.
Type: Grant
Filed: May 30, 2013
Date of Patent: October 27, 2015
Assignee: Intel Corporation
Inventors: Hyunchul Park, Hongbo Rong, Youfeng Wu
-
Patent number: 9152531
Abstract: The invention is directed to instrumenting object code of an application and/or an operating system on a target machine so that execution trace data can be generated, collected, and subsequently analyzed for various purposes, such as debugging and performance. Automatic instrumentation may be performed on an application's object code before, during or after linking. A target machine's operating system's object code can be manually or automatically instrumented. By identifying address space switches and thread switches in the operating system's object code, instrumented code can be inserted at locations that enable the execution trace data to be generated. The instrumentation of the operating system and application can enable visibility of total system behavior by enabling generation of trace information sufficient to reconstruct address space switches and context switches.
Type: Grant
Filed: February 18, 2005
Date of Patent: October 6, 2015
Assignee: GREEN HILLS SOFTWARE, INC.
Inventors: Daniel M. Hecht, Michael Lindahl, David Kleidermacher, Gregory E. Davis, Neil C. Puthuff
-
Patent number: 9146717
Abstract: Techniques for optimizing code include methods, systems, and computer program products that implement operations including: identifying a decision table having values arranged in one or more cells in a row and column format, the values defining business rules; evaluating the decision table to generate one or more temporary tables, at least one temporary table including the values associated with particular positions of a string variable of undefined length; evaluating the one or more temporary tables to set the positions of the string variable based on comparisons of the values with inputs; and generating a portion of code defining the business rules based on the evaluation of the one or more temporary tables.
Type: Grant
Filed: March 12, 2012
Date of Patent: September 29, 2015
Assignee: SAP SE
Inventor: Carsten Ziegler
-
Patent number: 9118625
Abstract: Provided are an anti-malware (AM) system, a method of processing data in the AM system, and a computing device including the AM system. The AM system includes a hardware-based AV engine configured to perform hash matching on data for AV scanning of the data, and an AV function module configured to determine whether or not the data includes a virus pattern on the basis of a result of the hash matching.
Type: Grant
Filed: November 18, 2013
Date of Patent: August 25, 2015
Assignee: SAMSUNG SDS CO., LTD.
Inventor: In Seon Yoo
-
Patent number: 9052984Abstract: A system, method and program product for optimizing compiled Java code to reduce file size. A system is provided that includes: a first optimization that removes unnecessary exception declarations in the compiled Java code; a second optimization that converts checked exception declarations to unchecked exception declarations in the compiled Java code; and a third optimization that removes exception lists in the compiled Java code.Type: GrantFiled: February 26, 2009Date of Patent: June 9, 2015Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Sean C. Foley, Berthold M. Lebert
-
Patent number: 9038045Abstract: Control flow information and data flow information associated with a program containing a upc_forall loop are built. A shared reference map data structure using the control flow information and the data flow information is created. All local shared accesses are hashed to facilitate a constant access stride after being rewritten. All local shared references in a hash entry having a longest list are privatized. The upc_forall loop is rewritten into a for loop. Responsive to a determination that an unprocessed upc_forall loop does not exist, dead store elimination is run. The control flow information and the data flow information associated with the program containing the for loop is rebuilt.Type: GrantFiled: November 15, 2011Date of Patent: May 19, 2015Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Yaoqing Gao, Liangxiao Hu, Raul Esteban Silvera, Ettore Tiotto
-
Patent number: 9021233Abstract: A vector data access unit includes data access ordering circuitry, for issuing data access requests indicated by elements of an earlier and a later vector instruction, one being a write instruction. An element indicating the next data access for each of the instructions is determined. The next data accesses for the earlier and the later instructions may be reordered. The next data access of the earlier instruction is selected if the position of the earlier instruction's next data element is less than or equal to the position of the later instruction's next data element minus a predetermined value. The next data access of the later instruction may be selected if the position of the earlier instruction's next data element is higher than the position of the later instruction's next data element minus a predetermined value. Thus data accesses from earlier and later instructions are partially interleaved.Type: GrantFiled: September 28, 2011Date of Patent: April 28, 2015Assignee: ARM LimitedInventor: Alastair David Reid
-
Patent number: 9015690Abstract: A system and method for optimization of code with non-adjacent loops. A compiler builds a node tree, which is not a control flow graph, that represents parent-child relationships of nodes of a computer program. Each node represents a control flow statement or a straight-line block of statements of the computer program. If a non-adjacent loop pair of nodes satisfy predetermined conditions, the compiler may perform legal code transformations on the computer program and corresponding node transformations on the node tree. These transformations may make adjacent this pair of loop nodes. The compiler may be configured to perform legal code transformations, such as head and tail duplication, code motion, and if-merging, in order to make adjacent these two loop nodes. Then loop fusion may be performed on this loop pair in order to increase instruction level parallelism (ILP) within an optimized version of the original source code.Type: GrantFiled: August 22, 2009Date of Patent: April 21, 2015Assignee: Advanced Micro Devices, Inc.Inventors: Mei Ye, Dinesh Suresh, Dz-ching Ju, Michael Lai
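The source-level effect of the transformation described in this abstract can be illustrated with a minimal sketch. It is not the patented compiler: it models no node tree and performs no legality analysis, and the function names `separate_loops` and `fused_loop` and the intervening `label` statement are hypothetical. The first function has two loops over the same range separated by an unrelated statement. The second shows the result after that statement is moved out of the way (code motion) and the two loops are fused. Fusion is legal here only because iteration `i` of the second loop reads nothing but the value written by iteration `i` of the first.

```python
def separate_loops(a):
    """Two loops over the same range, kept apart by an unrelated statement."""
    n = len(a)
    doubled = [0] * n
    for i in range(n):
        doubled[i] = a[i] * 2
    label = "done"  # intervening statement; touches neither loop's data
    shifted = [0] * n
    for i in range(n):
        shifted[i] = doubled[i] + 1
    return doubled, shifted, label


def fused_loop(a):
    """Same computation after code motion and loop fusion."""
    n = len(a)
    label = "done"  # code motion: hoisted so the two loops become adjacent
    doubled = [0] * n
    shifted = [0] * n
    for i in range(n):
        # Fused body: both statements run in one pass over the index range.
        doubled[i] = a[i] * 2
        shifted[i] = doubled[i] + 1
    return doubled, shifted, label
```

Both functions return the same values for any input list. The fused version exposes two independent statements per iteration, which is the kind of opportunity for instruction level parallelism the abstract refers to.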
-
Publication number: 20150106798Abstract: Methods and systems are provided that utilize compiler technology in identifying changed critical variables in work assignment code that cause synchronization issues between a master system and another server. The identified changed critical variables are shared by the master server in a high availability environment. In general, the sharing of changed critical variables includes sending, via a master system, changed code or critical variables to a receiving system. The receiving system can implement the changed code or critical variables to maintain synchronization with the master system.Type: ApplicationFiled: October 10, 2013Publication date: April 16, 2015Applicant: Avaya Inc.Inventor: Robert C. Steiner
-
Publication number: 20150095897Abstract: Methods and apparatuses of converting a program, which may enhance an execution speed of a computer program, are provided. The method may include receiving a program, detecting at least one loop statement including at least one branch statement within the program, determining whether the loop statement may be split into one or more sub-loop statements which perform the same function as a function of the loop statement and from which the branch statement has been removed, splitting the loop statement into the sub-loop statements and removing the branch statement included in the loop statement if it is determined that the loop statement may be split, and outputting a result of removing the branch statement.Type: ApplicationFiled: March 11, 2014Publication date: April 2, 2015Applicant: Samsung Electronics Co., Ltd.Inventors: Sang-oak WOO, Seok-yoon Jung, Si-hwa Lee, Igor M. Laevskiy, Oleg V. Talalov, Vladislav Y. Aranov
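A minimal sketch of the kind of splitting this abstract describes, under the assumption that the branch condition depends only on the loop index (`i < split`). The names `loop_with_branch` and `split_sub_loops` are hypothetical and the example is hand-written, not output of the claimed method. The single loop with a branch is replaced by two sub-loops, each covering the index range on which the branch always goes the same way, so neither sub-loop needs the test.

```python
def loop_with_branch(data, split):
    """Original form: one loop whose body branches on the index."""
    out = []
    for i, x in enumerate(data):
        if i < split:
            out.append(x * 2)
        else:
            out.append(x + 1)
    return out


def split_sub_loops(data, split):
    """Split form: two branch-free sub-loops with the same overall result."""
    # Clamp so the two index ranges partition 0..len(data) for any split value.
    split = max(0, min(split, len(data)))
    out = []
    for i in range(0, split):           # range where the branch was always taken
        out.append(data[i] * 2)
    for i in range(split, len(data)):   # range where it was never taken
        out.append(data[i] + 1)
    return out
```

The clamp handles split points below zero or past the end of the data, where one of the sub-loops is simply empty.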
-
Patent number: 8997073Abstract: A computer implemented method entails identifying code regions in an application from which offloadable tasks can be generated by a compiler for a heterogeneous computing system with processor and accelerator memory, including adding relaxed semantics to a directive based language in the heterogeneous computing system to allow suggesting, rather than specifying, a parallel code region as an offloadable candidate, and identifying one or more offloadable tasks in a neighborhood of the code region marked by the directive.Type: GrantFiled: April 25, 2014Date of Patent: March 31, 2015Assignee: NEC Laboratories America, Inc.Inventors: Nishkam Ravi, Yi Yang, Srimat Chakradhar
-
Publication number: 20150089485Abstract: In a system for automatic generation of event-driven, tuple-space based programs from a sequential specification, a hierarchical mapping solution can target different runtimes relying on event-driven tasks (EDTs). The solution uses loop types to encode short, transitive relations among EDTs that can be evaluated efficiently at runtime. Specifically, permutable loops translate immediately into conservative point-to-point synchronizations of distance one. A runtime-agnostic layer can be used to target the transformed code to different runtimes.Type: ApplicationFiled: September 22, 2014Publication date: March 26, 2015Inventors: Muthu M. Baskaran, Thomas Henretty, M. H. Langston, Richard A. Lethin, Benoit J. Meister, Nicolas T. Vasilache, David E. Wohlford
-
Patent number: 8990791Abstract: Partitioned global address space (PGAS) programming language source code is retrieved by an executed PGAS compiler. At least one shared memory array access indexed by an affine expression that includes a distinct thread identifier that is constant and different for each of a group of program execution threads targeted to execute the PGAS source code is identified within the PGAS source code. It is determined whether the at least one shared memory array access results in a local shared memory access by all of the group of program execution threads for all references to the at least one shared memory array access during execution of a compiled executable of the PGAS source code. A direct memory access executable code is generated for each shared memory array access determined to result in the local shared memory access by all of the group of program execution threads.Type: GrantFiled: July 29, 2011Date of Patent: March 24, 2015Assignee: International Business Machines CorporationInventors: Salem Derisavi, Ettore Tiotto
-
Patent number: 8984499Abstract: According to one embodiment, a code optimizer is configured to receive first code having a program loop implemented with scalar instructions to store values of a first array to a second array based on values of a third array and to generate second code representing the program loop using at least one vector instruction. The second code includes a shuffle instruction to shuffle elements of the first array based on the third array using a shuffle table in a vector manner, a blend instruction to blend the shuffled elements of the first array using a blend table in a vector manner, and a store instruction to store the blended elements of the first array in the second array.Type: GrantFiled: December 15, 2011Date of Patent: March 17, 2015Assignee: Intel CorporationInventors: Tal Uliel, Elmoustapha Ould-Ahmedvall, Bret T. Toll
-
Publication number: 20150067662Abstract: A computer system for generating an optimized program code from a program code having a loop with an exit branch, wherein the computer system comprises a processing unit, wherein the processing unit is arranged to convert an exit instruction of the exit branch into a predicated exit instruction, wherein the processing unit is arranged to determine common dependencies within the loop, wherein the processing unit is arranged to generate modified dependencies by adding additional dependencies to the common dependencies, and wherein the processing unit is arranged to apply an algorithm that uses software pipelining for generating an optimized program code for the loop based on the modified dependencies.Type: ApplicationFiled: April 20, 2012Publication date: March 5, 2015Applicant: Freescale Semiconductor, Inc.Inventor: Rene Catalin Palalau
-
Patent number: 8966459Abstract: A compiling method compiles an object program to be executed by a processor having a plurality of execution units operable in parallel. In the method a first availability chain is created from a producer instruction (p1), scheduled for execution by a first one of the execution units (20: AGU), to a first consumer instruction (c1), scheduled for execution by a second one of the execution units (22: EXU) and requiring a value produced by the said producer instruction. The first availability chain comprises at least one move instruction (mv1-mv3) for moving the required value from a first point (20: ARF) accessible by the first execution unit to a second point (22: DRF) accessible by the second execution unit.Type: GrantFiled: February 20, 2014Date of Patent: February 24, 2015Assignee: Altera CorporationInventors: Marcio Merino Fernandes, Raymond Malcolm Livesley
-
Patent number: 8966461Abstract: A medium, method, and apparatus are disclosed for eliding superfluous function invocations in a vector-processing environment. A compiler receives program code comprising a width-contingent invocation of a function. The compiler creates a width-specific executable version of the program code by determining a vector width of a target computer system and omitting the function from the width-specific executable if the vector width meets one or more criteria. For example, the compiler may omit the function call if the vector width is greater than a minimum size.Type: GrantFiled: September 29, 2011Date of Patent: February 24, 2015Assignee: Advanced Micro Devices, Inc.Inventors: Benedict R. Gaster, Lee W. Howes, Mark D. Hummel
-
Patent number: RE46277Abstract: An operation method has processing for applying a same type of operation in parallel to N M-bit operands to obtain N M-bit operation results executed on a computer. Here, N is an integer equal to or greater than 2 and M is an integer equal to or greater than 1. The operation method includes: an operation step of applying the type of operation to an N*M-bit provisional operand that is formed by concatenating the N M-bit operands, to obtain one N*M-bit provisional operation result, and generating correction information based on an effect had, by applying the operation, on each M bits of the provisional operation result from a bit that neighbors the M bits; and a correction step of correcting the provisional operation result in M-bit units with use of the correction information, to obtain the N M-bit operation results.Type: GrantFiled: June 24, 2009Date of Patent: January 17, 2017Assignee: SOCIONEXT INC.Inventor: Masato Suzuki
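The two steps in this abstract can be sketched for one concrete operation, addition, using Python integers to stand in for the wide register. This is an illustrative reading, not the patented implementation, and the function name `packed_add` and its parameters are hypothetical. The operation step adds the concatenated operands as one N*M-bit number, which lets carries leak from each lane into the lane above it. The correction information is the set of carries that entered the lowest bit of each lane, recovered from the identity sum bit = a bit XOR b bit XOR carry-in. The correction step then removes that carry from each lane separately.

```python
def packed_add(a_lanes, b_lanes, m):
    """Add N pairs of M-bit values with one wide addition plus a per-lane fix-up."""
    if len(a_lanes) != len(b_lanes):
        raise ValueError("operand lists must have the same length")
    n = len(a_lanes)
    lane_mask = (1 << m) - 1

    # Concatenate the N M-bit operands into single N*M-bit operands (lane 0 lowest).
    a = sum((x & lane_mask) << (i * m) for i, x in enumerate(a_lanes))
    b = sum((y & lane_mask) << (i * m) for i, y in enumerate(b_lanes))

    # Operation step: one wide addition gives the provisional result.
    provisional = (a + b) & ((1 << (n * m)) - 1)

    # Correction information: bit k is set exactly when a carry entered bit k.
    carry_bits = a ^ b ^ provisional

    # Correction step: undo the carry that leaked into each lane from its neighbour.
    results = []
    for i in range(n):
        lane = (provisional >> (i * m)) & lane_mask
        carry_in = (carry_bits >> (i * m)) & 1
        results.append((lane - carry_in) & lane_mask)
    return results
```

For example, `packed_add([200, 100, 255], [100, 50, 1], 8)` returns `[44, 150, 0]`, the same as adding each pair modulo 256: the carry out of the first lane (200 + 100 = 300) is removed from the second lane during correction.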