Code Restructuring Patents (Class 717/159)
-
Patent number: 8392900
Abstract: Systems and methods according to the present invention provide techniques which modify programs having barrier statements. Dependence relations between statements, and enforcement associations between the barrier statements and the dependence relations, in the program are identified. The dependence relations are classified as being either enforceable by point-to-point synchronization or not enforceable by point-to-point synchronization. A subset of the barrier statements, which will enforce those dependence relations that are unenforceable by point-to-point synchronization, is determined. The other barrier statements are replaced with a point-to-point synchronization routine.
Type: Grant
Filed: March 17, 2005
Date of Patent: March 5, 2013
Assignee: Hewlett-Packard Development Company, L.P.
Inventors: Jean-Francois Collard, Robert Schreiber
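To make the idea of point-to-point synchronization concrete, here is a minimal sketch in Python. It is an illustration only, not the patented method: the function name run_pipeline and the use of threading.Event as the synchronization routine are assumptions. Instead of a barrier that stops every thread, each dependence (consumer step i needs producer slot i) gets its own flag.

```python
import threading

def run_pipeline(n_items=4):
    """Producer/consumer pair in which the consumer depends only on slot i."""
    data = [None] * n_items
    ready = [threading.Event() for _ in range(n_items)]  # one flag per dependence
    results = []

    def producer():
        for i in range(n_items):
            data[i] = i * i
            ready[i].set()    # "post": signals only the reader that depends on slot i

    def consumer():
        for i in range(n_items):
            ready[i].wait()   # "wait": blocks on this single dependence, not on a barrier
            results.append(data[i] + 1)

    threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

With a barrier, both threads would have to arrive at the same point before either continued. Here the consumer can start on slot 0 while the producer is still filling later slots.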
-
Patent number: 8387035
Abstract: A scheduling algorithm is provided for selecting the placement of instructions with internal slack into a schedule of instructions within a loop. The algorithm achieves this by pinning nodes with internal slack to corresponding nodes on the critical path of the code that have similar properties in terms of the data dependency graph, such as earliest time and latest time. The effect is that nodes with internal slack are more often optimally placed in the schedule, reducing the need for rotating registers or register copy instructions. The benefit of the present invention can primarily be seen when performing instruction scheduling or software pipelining on loop code, but can also apply to other forms of instruction scheduling when greater control of placement of nodes with internal slack is desired.
Type: Grant
Filed: January 13, 2009
Date of Patent: February 26, 2013
Assignee: International Business Machines Corporation
Inventor: Allan Russell Martin
-
Patent number: 8387065
Abstract: A method and a data processing system by which population count (popcount) operations are efficiently performed without incurring the latency and loss of critical processing cycles and bandwidth of real time processing. The method comprises: identifying data to be stored to memory for which a popcount may need to be determined; speculatively performing a popcount operation on the data as a background process of the processor while the data is being stored to memory; storing the data to a first memory location; and storing a value of the popcount generated by the popcount operation within a second memory location. The method further comprises: determining a size of data; determining a granular level at which the popcount operation on the data will be performed; and reserving a size of said second memory location that is sufficiently large to hold the value of the popcount.
Type: Grant
Filed: April 16, 2009
Date of Patent: February 26, 2013
Assignee: International Business Machines Corporation
Inventors: Ravi K. Arimilli, Ronald N. Kalla, Balaram Sinharoy
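The core idea, computing the popcount when the data is stored rather than when the count is requested, can be sketched in software as follows. This is an assumed illustration: the class PopcountStore and its two dictionaries are hypothetical stand-ins for the first and second memory locations, and the background, speculative hardware execution described in the abstract is not modeled.

```python
class PopcountStore:
    """Keeps each stored value together with a popcount computed at store time."""

    def __init__(self):
        self.data = {}       # stand-in for the "first memory location": address -> value
        self.popcounts = {}  # stand-in for the "second memory location": address -> popcount

    def store(self, addr, value):
        self.data[addr] = value
        # Count the set bits once, while storing, so later readers pay nothing.
        self.popcounts[addr] = bin(value).count("1")

    def popcount(self, addr):
        return self.popcounts[addr]  # lookup only; no bits are counted here
```

A later call such as store.popcount(addr) is then a plain lookup, which is the latency saving the abstract is after.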
-
Patent number: 8375376
Abstract: A description processing device has: a receiving unit which receives a behavior level description; a label-name generating unit which generates a label name; a label disposing unit which disposes a top label statement; an extracting unit which extracts an extracted label statement; a variable-name generating unit which generates a variable name; a replacing unit which replaces a statement immediately below the top label statement to the extracted label statement by a column of a conditional executable statement and an operation/assignment statement and replaces a jump statement for jumping to the extracted label statement by a column of an operation/assignment statement and a jump statement for jumping to the top label; a control unit which repeats the extraction, the generation of a new variable name, and the replacement; an inserting unit which inserts an operation/assignment statement; and an output unit which outputs the behavior level description.
Type: Grant
Filed: March 27, 2009
Date of Patent: February 12, 2013
Assignee: NEC Corporation
Inventor: Kazutoshi Wakabayashi
-
Patent number: 8375043
Abstract: An XQuery access API is described, for providing access to XML data from a data source, using the XQuery language. A requestor can request, from a server, performance of an operation on XML data, wherein request messages and response messages conform to the Simple Object Access Protocol (SOAP). Request and response messages can be transmitted using Hypertext Transfer Protocol (HTTP) or Hypertext Transfer Protocol over Secure Socket Layer (HTTPS). The format of the request and response messages is specified in a definition of a Web service, where the definition conforms to the Web Service Description Language (WSDL).
Type: Grant
Filed: January 19, 2011
Date of Patent: February 12, 2013
Assignee: Oracle International Corporation
Inventors: Muralidhar Krishnaprasad, Zhen Hua Liu, Karuna Muthiah, Ying Lu, James W. Warner, Rohan Angrish, Vikas Arora, Anand Manikutty
-
Patent number: 8375373
Abstract: In a change-resilient intermediate language code, registers have been allocated but symbolic references and pseudo instructions still use unbound items. Pseudo instructions having a specific location within generated intermediate language code request insertion of machine instruction(s) at the location to perform specified operations. Specified operations may include, for example, operations to perform or facilitate garbage collection, memory allocation, exception handling, various kinds of method calls and execution engine service calls, managed object field access, heap management, generic code, static variable storage access, address mode modification, and/or symbolic reference to types. A binder may transform the intermediate language code into executable code. Little or no register allocation is needed during binding, but unbound items such as offsets, sizes, slots, and the like are determined and specified to produce executable code.
Type: Grant
Filed: April 19, 2010
Date of Patent: February 12, 2013
Assignee: Microsoft Corporation
Inventor: Peter Franz Valentin Sollich
-
Patent number: 8370817
Abstract: A mechanism is provided for optimizing scalar code executed on a single instruction multiple data (SIMD) engine by aligning the slots of SIMD registers. With the mechanism, a compiler is provided that parses source code and, for each statement in the program, generates an expression tree. The compiler inspects all storage inputs to scalar operations in the expression tree to determine their alignment in the SIMD registers. This alignment is propagated up the expression tree from the leaves. When the alignments of two operands in the expression tree are the same, the resulting alignment is the shared value. When the alignments of two operands in the expression tree are different, one operand is shifted. For shifted operands, a shift operation is inserted in the expression tree. The executable code is then generated for the expression tree and shifts are inserted where indicated.
Type: Grant
Filed: May 27, 2008
Date of Patent: February 5, 2013
Assignee: International Business Machines Corporation
Inventors: Alexandre E. Eichenberger, John Kevin Patrick O'Brien
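The bottom-up alignment propagation described in this abstract can be sketched as a small tree rewrite. The node classes Leaf, Op, and Shift and the function align are hypothetical names chosen for illustration. The policy of always shifting the right operand is also an assumption, since the abstract says only that "one operand is shifted".

```python
from dataclasses import dataclass

@dataclass
class Leaf:
    name: str
    slot: int        # SIMD register slot the scalar occupies

@dataclass
class Op:
    op: str
    left: object
    right: object

@dataclass
class Shift:
    operand: object
    to_slot: int     # slot the operand is moved to

def align(node):
    """Return (rewritten_node, slot); input trees contain only Leaf and Op nodes."""
    if isinstance(node, Leaf):
        return node, node.slot
    left, lslot = align(node.left)
    right, rslot = align(node.right)
    if lslot != rslot:
        # Alignments differ: insert a shift (assumed policy: move the right operand).
        right = Shift(right, lslot)
    # Same alignment, or made the same by the shift: the result inherits it.
    return Op(node.op, left, right), lslot
```

For a + (b * c) with a in slot 0 and b, c both in slot 2, the inner multiply needs no shift, and a single Shift node is inserted around it before the add.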
-
Patent number: 8370823
Abstract: Device, system, and method of computer program optimization. For example, an apparatus to analyze a plurality of versions of computer program includes: a code analyzer to determine one or more code differences between first and second versions of the computer program, based on at least one optimization log associated with at least one of the first and second versions of the computer program.
Type: Grant
Filed: August 27, 2007
Date of Patent: February 5, 2013
Assignee: International Business Machines Corporation
Inventors: Guy Bashkansky, Gad Haber, Marcel Zalmanovici
-
Patent number: 8359587
Abstract: A compilation method and mechanism for parallelizing program code. A method for compilation includes analyzing source code and identifying candidate code for parallelization. The method includes parallelizing the candidate code, in response to determining that its profitability meets a predetermined criterion; and generating object code corresponding to the source code. The generated object code includes both a non-parallelized version of the candidate code and a parallelized version of the candidate code. During execution of the object code, a dynamic selection between execution of the non-parallelized version of the candidate code and the parallelized version of the candidate code is made. Execution may change from the parallelized version of the candidate code to the non-parallelized version in response to determining that a transaction failure count meets a predetermined threshold.
Type: Grant
Filed: May 1, 2008
Date of Patent: January 22, 2013
Assignee: Oracle America, Inc.
Inventors: Yonghong Song, Spiros Kalogeropulos, Partha P. Tirumalai
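The run-time selection between the two generated versions might look roughly like the sketch below. Everything here is assumed for illustration: the class DualVersionRegion, the use of RuntimeError to represent a failed transaction, and the default threshold of 3 do not come from the patent.

```python
class DualVersionRegion:
    """Holds a serial and a parallel version of one code region and picks between them."""

    def __init__(self, serial, parallel, max_failures=3):
        self.serial = serial
        self.parallel = parallel
        self.max_failures = max_failures  # hypothetical failure-count threshold
        self.failures = 0

    def run(self, data):
        if self.failures >= self.max_failures:
            return self.serial(data)      # threshold met: stay on the serial version
        try:
            return self.parallel(data)
        except RuntimeError:              # stand-in for a failed transaction
            self.failures += 1
            return self.serial(data)      # redo the work with the serial version
```

Once the failure count reaches the threshold, the parallel version is no longer attempted, so a region that keeps failing stops paying the retry cost.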
-
Patent number: 8359588
Abstract: A method of reducing inter-task latency for software comprising a sequence of instructions including a synchronous remote procedure call to be executed on a multiprocessor system comprising a calling processor and at least one remote engine. The method comprises the steps of: inputting the software; inputting a runtime resource description describing a runtime environment of the multiprocessor system; identifying the synchronous remote procedure call in the sequence of instructions; replacing the synchronous remote procedure call in the sequence of instructions with an initiation instruction and a wait instruction to generate a substitute sequence of instructions; reordering the substitute sequence of instructions with reference to the runtime resource description and the dependencies to generate a reordered sequence of instructions; and outputting the reordered sequence of instructions.
Type: Grant
Filed: November 25, 2009
Date of Patent: January 22, 2013
Assignee: Arm Limited
Inventor: Alastair David Reid
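The effect of splitting a synchronous remote call into an initiation and a wait, then moving independent work between the two, can be shown with a hand-written before/after pair. This is a sketch under assumptions: remote_square, local_work, and the use of a concurrent.futures executor as the "remote engine" are illustrative choices, and the reordering is done by hand rather than by a tool driven by a runtime resource description.

```python
from concurrent.futures import ThreadPoolExecutor

def remote_square(x):
    """Hypothetical work carried out on the remote engine."""
    return x * x

def local_work(y):
    """Work on the calling processor that does not depend on the remote result."""
    return y + 1

def original(x, y):
    r = remote_square(x)   # synchronous call: the caller idles until it returns
    l = local_work(y)
    return r + l

def transformed(x, y, engine):
    pending = engine.submit(remote_square, x)  # initiation instruction
    l = local_work(y)                          # independent work moved between the two halves
    r = pending.result()                       # wait instruction
    return r + l
```

Called as transformed(3, 4, ThreadPoolExecutor(max_workers=1)), the second version returns the same value as original(3, 4), but the local work overlaps with the remote call.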
-
Patent number: 8359586
Abstract: In an embodiment, a code generator receives input code having a plurality of functional elements, such as blocks, nodes, statements, commands, etc. The input code processes a data set, such as an image file. The code generator further receives one or more criteria for the generated code. The functional elements of the input code are provided with one or more parameters regarding the block sizes that the respective functional elements can process, such as an available block size and a preferred block size. The code generator queries the functional elements of the input code to obtain their available and preferred block sizes, and builds an intermediate representation (IR) of the input code. The code generator re-organizes and modifies the IR so that it achieves the one or more criteria. Output code that meets the one or more criteria is generated from the reorganized and modified IR.
Type: Grant
Filed: August 20, 2007
Date of Patent: January 22, 2013
Assignee: The MathWorks, Inc.
Inventors: Donald P. Orofino, II, Witold R. Jachimczyk
-
Publication number: 20120331453
Abstract: A method for vectorization of a block of code is provided. The method comprises receiving a first block of code as input; and converting the first block of code into at least a second block of code and a third block of code. The first block of code accesses a first set of memory addresses that are potentially misaligned. The second block of code performs conditional leaping address incrementation to selectively access a first subset of the first set of memory addresses. The third block of code accesses a second subset of the first set of memory addresses starting from an aligned memory address, simultaneously accessing multiple memory addresses at a time. No memory address belongs to both the first subset and the second subset of memory addresses.
Type: Application
Filed: September 7, 2012
Publication date: December 27, 2012
Applicant: INTERNATIONAL BUSINESS MACHINES
Inventors: Dorit Nuzman, Ira Rosen, Ayal Zaks
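As background for this abstract, the sketch below shows ordinary loop peeling, the textbook way of splitting a potentially misaligned loop into a scalar part and an aligned, multi-element part whose address sets do not overlap. It is not the "conditional leaping address incrementation" the application claims. The function name, the width of 4, and the scalar epilogue are assumptions added for illustration.

```python
def sum_with_peeling(mem, start, count, width=4):
    """Sum mem[start:start+count], splitting the loop at the first aligned index."""
    end = start + count
    # Round start up to the next multiple of width, but never past the end.
    first_aligned = min(end, -(-start // width) * width)
    total = 0
    # Scalar prologue: the elements that precede the first aligned index.
    for i in range(start, first_aligned):
        total += mem[i]
    # Vector body: aligned chunks of `width` elements handled together.
    i = first_aligned
    while i + width <= end:
        total += sum(mem[i:i + width])   # stands in for one SIMD load and add
        i += width
    # Scalar epilogue for leftovers (an addition; the abstract does not mention it).
    for j in range(i, end):
        total += mem[j]
    return total
```

Each index is visited by exactly one of the three loops, which mirrors the abstract's requirement that no address belongs to both subsets.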
-
Patent number: 8341614
Abstract: Methods, software media, compilers and programming techniques are described for creating copyable stack-based closures, such as a block, for languages which allocate automatic or local variables on a stack memory structure. In one exemplary method, a data structure of the block is first written to the stack memory structure, and this may be the automatic default operation, at run-time, for the block; then, a block copy instruction, added explicitly (in one embodiment) by a programmer during creation of the block, is executed to copy the block to a heap memory structure. The block includes a function pointer that references a function which uses data in the block.
Type: Grant
Filed: September 30, 2008
Date of Patent: December 25, 2012
Assignee: Apple Inc.
Inventors: Gerald Blaine Garst, Jr., William Bumgarner, Fariborz Jahanian, Christopher Arthur Lattner
-
Patent number: 8340131
Abstract: A system (and a method) are disclosed for reliably disseminating a state of a node in a large network consisting of nodes with constrained resources. The system comprises a process embodied by a state machine comprised of an advertise state, a request state, and a share state. The system processes input events, mutates its internal state, and outputs side effects. The outputs from one node in the network may become input events to one or more other nodes in the network. Viral dissemination is an emergent behavior across the nodes in a network that all independently and continuously perform these input processings, state mutations, and output side effects.
Type: Grant
Filed: May 1, 2009
Date of Patent: December 25, 2012
Assignee: Sentilla Corporation, Inc.
Inventors: Courtney Sharp, Jason Ostrander, Joseph Polastre
-
Publication number: 20120324431
Abstract: The present invention extends to methods, systems, and computer program products for transforming source code to await execution of asynchronous operations. Embodiments of the invention simplify authoring and use of asynchronous methods, by generating statements that use well-defined awaitable objects to await completion of asynchronous operations. For example, a computer system can transform a statement that requests to await the completion of an asynchronous operation into a plurality of statements that use a predefined pattern of members of an awaitable object corresponding to the asynchronous operation. The pattern can include one or more members configured to return a completion status of the asynchronous operation, one or more members configured to resume execution of the asynchronous method at a resumption point when the asynchronous operation completes, and one or more members configured to retrieve completion results.
Type: Application
Filed: June 16, 2011
Publication date: December 20, 2012
Applicant: Microsoft Corporation
Inventors: Stephen Harris Toub, Mads Torgersen, Lucian Jules Wischik, Anders Hejlsberg, Niklas Gustafsson, Dmitry Lomov, Matthew J. Warren
-
Publication number: 20120317558
Abstract: The present invention extends to methods, systems, and computer program products for binding executable code at runtime. Embodiments of the invention include late binding of specified aspects of code to improve execution performance. A runtime dynamically binds lower level code based on runtime information to optimize execution of a higher level algorithm. Aspects of a higher level algorithm having a requisite (e.g., higher) impact on execution performance can be targeted for late binding. Improved performance can be achieved with minimal runtime costs using late binding for aspects having the requisite execution performance impact.
Type: Application
Filed: June 10, 2011
Publication date: December 13, 2012
Applicant: Microsoft Corporation
Inventors: Amit Kumar Agarwal, Weirong Zhu, Yosseff Levanoni
-
Publication number: 20120311553
Abstract: A method of compiling source code into object code for a multi-threaded runtime environment is disclosed. Source code is compiled into object code using a compilation engine. Marshalling attributes associated with method code intended for executing in a secondary thread are identified. The marshalling attributes and the method code are rewritten as marshaled method code for executing the method code in the secondary thread according to the identified marshalling attributes.
Type: Application
Filed: June 1, 2012
Publication date: December 6, 2012
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventor: PAUL A. CROCKETT
-
Patent number: 8327343
Abstract: Methods, systems and apparatus for optimizing a source code are provided. Dependencies for each header file and source code file in the source code are identified for all possible compilation contexts. Certain dependencies can be classified into complete types and incomplete types or named references. Further, these incomplete type dependencies are removed by adding forward declarations where appropriate.
Type: Grant
Filed: October 20, 2005
Date of Patent: December 4, 2012
Assignee: Cisco Technology, Inc.
Inventors: Marcos S. Klein, Richard Brian Livingston, Vinod Pandarinathan, Venkata Rajasekharu Athreyapurapu
-
Patent number: 8316360
Abstract: Methods and apparatus to optimize the parallel execution of software processes are disclosed. An example method includes receiving a first software process that processes a set of data, locating a first primitive in the first software process, and decomposing the first primitive into a first set of one or more sub-primitives. The example methods and apparatus additionally perform static fusion and dynamic fusion to optimize software processes for execution in parallel processing systems.
Type: Grant
Filed: September 29, 2006
Date of Patent: November 20, 2012
Assignee: Intel Corporation
Inventors: Byoungro So, Anwar M. Ghuloum, Youfeng Wu
-
Patent number: 8316357
Abstract: The efficient use of type descriptors with frozen objects. A frozen object might actually include several type descriptors, a primary type descriptor that is canonical according to a set of canonicalization rules, and an auxiliary type descriptor that is not identical to the primary type descriptor. The auxiliary type descriptor may be used to access the canonical type descriptor. When performing an operation, if the auxiliary type descriptor can be used to perform the operation, then that auxiliary type descriptor may be used. If the canonical type descriptor is to be used to perform the operation, the auxiliary type descriptor is used to gain access to the canonical primary type descriptor. The primary type descriptor is then used to perform the operation.
Type: Grant
Filed: September 3, 2008
Date of Patent: November 20, 2012
Assignee: Microsoft Corporation
Inventors: Scott D. Mosier, Peter F. Sollich, Frank V. Peschel-Gallee, Patrick H. Dussud, Simon J. Hall, Rudi Martin, Michael M. Magruder, Andrew Pardoe, Madhusudhan Talluri
-
Patent number: 8307354
Abstract: A program generation apparatus generates an obfuscated program difficult to analyze from outside and a program execution apparatus executes the program. The program generation apparatus includes an acquisition unit that acquires a 1st program including one or more instructions, the 1st program causing a process by executing the instructions in a predetermined order to obtain a result; a generation unit that generates a 2nd program based on the 1st program; and an output unit that outputs the 2nd program. The 2nd program causes a process that is different from the process caused by the 1st program and varies according to current information determined at execution of the 2nd program in order to obtain a result identical to the result of the 1st program.
Type: Grant
Filed: June 24, 2005
Date of Patent: November 6, 2012
Assignee: Panasonic Corporation
Inventors: Tomoyuki Haga, Yukie Shoda, Taichi Sato, Teruto Hirota
-
Patent number: 8296750
Abstract: A method and apparatus for optimizing a target program including a pattern of instructions to be replaced. The method is performed by execution of program code by a processor of an information processing apparatus that includes an output device and a computer readable storage medium storing the program code. At least one transformation is performed on the target program to generate a transformed target subprogram in which dependencies among the instructions included in the target subprogram are matched with dependencies in the pattern to be replaced. The transformed target subprogram is replaced, with a post-replacement instruction stream determined to correspond to the pattern to be replaced, to generate a replaced target subprogram. An optimized target program that includes the replaced target subprogram is outputted to the output device. The at least one transformation includes a first transformation, a loop transformation, or both the first transformation and the loop transformation.
Type: Grant
Filed: September 12, 2007
Date of Patent: October 23, 2012
Assignee: International Business Machines Corporation
Inventor: Motohiro Kawahito
-
Patent number: 8291393
Abstract: A computer implemented method for performing inlining in a just-in-time compiler. Compilation of a first code of a program is begun. The first code is one of an interruptible code and a non-interruptible code. A try region is established around a second code of the program to form a wrapped second code. The try region is a boundary between interruptible and non-interruptible code such that a third code that modifies an observable state of the program cannot be moved across the boundary. The second code is, relative to the first code, the other of the interruptible code and the non-interruptible code. The wrapped second code is inlined with the first code during compilation. Compilation of the first code is completed to form a resultant code. The resultant code is stored.
Type: Grant
Filed: August 20, 2007
Date of Patent: October 16, 2012
Assignee: International Business Machines Corporation
Inventors: Patrick G. Gallop, Derek Bruce Inglis, Mark Graham Stoodley
-
Patent number: 8276134
Abstract: An improved system and computer programming product for acquisition and release of locks within a software program is disclosed. In an exemplary embodiment, a lock within a loop is transformed by relocating acquisition and release instructions from within the loop to positions outside the loop. This may significantly decrease unnecessary lock acquisition and release during execution of the software program. In order to avoid contention problems which may arise from acquiring and keeping a lock on an object over a relatively long period of time, a contention test may be inserted into the loop. Such a contention test may temporarily release the lock if another thread in the software program requires access to the locked object.
Type: Grant
Filed: June 9, 2008
Date of Patent: September 25, 2012
Assignee: International Business Machines Corporation
Inventors: Nikola Grcevski, Kevin Alexander Stoodley, Mark Graham Stoodley, Vijay Sundaresan
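A hand-written version of the transformed loop might look like the following sketch. The function locked_loop, the contended callback, and the check_every interval are all hypothetical. Python's threading.Lock does not report whether other threads are waiting, so the contention test is passed in as a callable.

```python
import threading

def locked_loop(lock, items, work, contended, check_every=2):
    """Run work(item) for every item under one lock acquisition hoisted out of the loop.

    Returns the number of times the lock was acquired.
    """
    acquisitions = 1
    lock.acquire()                  # acquisition relocated to before the loop
    try:
        for n, item in enumerate(items, 1):
            work(item)
            # Contention test inserted into the loop body.
            if n % check_every == 0 and contended():
                lock.release()      # temporarily give waiting threads a chance
                lock.acquire()
                acquisitions += 1
    finally:
        lock.release()              # release relocated to after the loop
    return acquisitions
```

Without contention, six iterations cost one acquire/release pair instead of six. With constant contention the lock is yielded every check_every iterations.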
-
Patent number: 8271889
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for automatically updating user interfaces for a mobile device. In one aspect, a first set of instructions from an automatically synchronizing data store are received. The first set of instructions are executed to provide an interface between a user and an operating system on the mobile device. The data store is automatically updated with a second set of instructions. The second set of instructions are executed to provide a modification to the interface between the user and the operating system on the mobile device.
Type: Grant
Filed: November 26, 2007
Date of Patent: September 18, 2012
Assignee: Adobe Systems Incorporated
Inventors: Joerg Beckert, GuiQin Zhang, Srini Attaluri, Rupen Chanda, Anssi Kesti-Helia, Antti Piira
-
Patent number: 8266610
Abstract: A method, apparatus, and computer instructions for scheduling instructions for execution. Identify a series of instructions in a loop, wherein the series of instructions has a cyclic data dependency. Determine whether the series of instructions is a uniform series of instructions. Schedule execution of the uniform series of instructions within the loop to optimize execution of the loop in response to the identified series of instructions being the uniform series of instructions.
Type: Grant
Filed: September 19, 2008
Date of Patent: September 11, 2012
Assignee: International Business Machines Corporation
Inventor: Allan Russell Martin
-
Patent number: 8266609
Abstract: A software transactional memory system is described which utilizes decomposed software transactional memory instructions as well as runtime optimizations to achieve efficient performance. The decomposed instructions allow a compiler with knowledge of the instruction semantics to perform optimizations unavailable on traditional software transactional memory systems. Additionally, high-level software transactional memory optimizations are performed such as code movement around procedure calls, addition of operations to provide strong atomicity, removal of unnecessary read-to-update upgrades, and removal of operations for newly-allocated objects. During execution, multi-use header words for objects are extended to provide for per-object housekeeping, as well as fast snapshots which illustrate changes to objects. Additionally, entries to software transactional memory logs are filtered using an associative table during execution, preventing needless writes to the logs.
Type: Grant
Filed: March 23, 2006
Date of Patent: September 11, 2012
Assignee: Microsoft Corporation
Inventor: Timothy Lawrence Harris
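The log-filtering step in the last sentence of this abstract can be illustrated with a small undo log that skips duplicate entries. This is an assumed simplification: the class FilteredLog is hypothetical, and an unbounded Python set stands in for the associative table, whose actual organization the abstract does not describe.

```python
class FilteredLog:
    """Undo log that records the old value of each (object, field) at most once."""

    def __init__(self):
        self.entries = []   # the transaction's log, in write order
        self.seen = set()   # stand-in for the associative table: keys already logged

    def log_write(self, obj_id, field, old_value):
        """Record old_value before a write; return True if a log entry was added."""
        key = (obj_id, field)
        if key in self.seen:
            # The first entry already holds the value to restore on abort,
            # so a second entry for the same location would be a needless write.
            return False
        self.seen.add(key)
        self.entries.append((obj_id, field, old_value))
        return True
```

A transaction that updates the same field in a tight loop therefore produces one log entry rather than one per iteration.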
-
Patent number: 8266605
Abstract: Described is a method and system for optimizing a code layout for execution on a processor including internal and/or external cache memory. The method and system includes executing a program having a first layout, generating at least one memory access parameter for the program, the memory access parameter being based on a cache memory of a computing system on which the program is designed to run and constructing a second layout for the program as a function of the at least one memory access parameter.
Type: Grant
Filed: February 22, 2006
Date of Patent: September 11, 2012
Assignee: Wind River Systems, Inc.
Inventors: Roger Wiles, Maarten Koning
-
Patent number: 8261249
Abstract: Embodiments of the invention provide a method for deploying and running an application on a massively parallel computer system, while minimizing the costs associated with latency, bandwidth, and limited memory resources. The executable code of a program may be divided into multiple code fragments and distributed to different compute nodes of a parallel computing system. During program execution, one compute node may fetch code fragments from other compute nodes as necessary.
Type: Grant
Filed: January 8, 2008
Date of Patent: September 4, 2012
Assignee: International Business Machines Corporation
Inventors: Charles Jens Archer, Thomas Michael Gooding, Ruth Janine Poole, Albert Sidelnik
-
Patent number: 8261250
Abstract: A single-chip multiprocessor system and operation method of this system based on a static macro-scheduling of parallel streams for multiprocessor parallel execution. The single-chip multiprocessor system has buses for direct exchange between the processor register files and access to their store addresses and data. Each explicit parallelism architecture processor of this system has an interprocessor interface providing the synchronization signals exchange, data exchange at the register file level and access to store addresses and data of other processors. The single-chip multiprocessor system uses ILP to increase the performance. Synchronization of the streams parallel execution is ensured using special operations setting a sequence of streams and stream fragments execution prescribed by the program algorithm.
Type: Grant
Filed: January 10, 2011
Date of Patent: September 4, 2012
Assignee: Elbrus International
Inventors: Boris A. Babaian, Yuli Kh. Sakhin, Vladimir Yu. Volkonskiy, Sergey A. Rozhkov, Vladimir V. Tikhorsky, Feodor A. Gruzdov, Leonid N. Nazarov, Mikhail L. Chudakov
-
Patent number: 8255887
Abstract: A memory management mechanism requires data structures to be explicitly deallocated in the programming code, but deallocation does not immediately make the memory available for reuse. Before a deallocated memory region can be reused, memory is scanned for pointers to the deallocated region, and any such pointer is set to null. The deallocated memory is then available for reuse. Preferably, deallocated memory regions are accumulated, and an asynchronous memory cleaning process periodically scans memory to nullify the pointers. In order to prevent previously scanned memory becoming contaminated with a dangling pointer before the scan is finished, any write to a pointer is checked to verify that the applicable target address has not been deallocated.
Type: Grant
Filed: November 29, 2006
Date of Patent: August 28, 2012
Assignee: International Business Machines Corporation
Inventor: Timothy Hume Heil
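The deallocate, scan, then reuse sequence can be modeled with a toy heap. The class Heap and its fields are hypothetical, the cleaning pass runs synchronously rather than as an asynchronous process, and storing null when a pointer write targets a deallocated region is an assumption: the abstract says only that such writes are checked.

```python
class Heap:
    """Toy model: explicit deallocation, with reuse deferred until pointers are nullified."""

    def __init__(self):
        self.cells = {}        # address -> value; a pointer is stored as a target address
        self.pointers = set()  # addresses of cells that hold pointers
        self.pending = set()   # deallocated regions not yet safe to reuse
        self.free = set()      # cleaned regions available for reuse

    def write_pointer(self, addr, target):
        # Write check: do not let a pointer to a deallocated region slip in
        # behind the scan (assumed policy: store null instead).
        self.cells[addr] = None if target in self.pending else target
        self.pointers.add(addr)

    def deallocate(self, addr):
        self.pending.add(addr)  # explicit free, but the memory is not reusable yet

    def clean(self):
        # Scan for pointers into deallocated regions and set them to null.
        for addr in self.pointers:
            if self.cells.get(addr) in self.pending:
                self.cells[addr] = None
        self.free |= self.pending  # only now may the regions be reused
        self.pending.clear()
```

After clean() returns, no cell holds the address of a freed region, so reusing that region cannot turn a stale pointer into a live one.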
-
Patent number: 8255892
Abstract: Disclosed is a method for updating program code stored in a memory, which memory comprises a plurality of memory sectors. The method comprises transforming an updated input code into an updated program code version to be stored in a memory, which memory has stored thereon a current program code version occupying a first set of the memory sectors of the memory, wherein the updated program code version occupies a second set of memory sectors when stored in the memory. The transforming step further comprises receiving a representation of the current program code version; and performing at least one optimization step adapted to decrease the number of memory sectors of the second set of memory sectors occupied by the updated code version that are different from the corresponding memory sectors of the first set of memory sectors occupied by the current program code version.
Type: Grant
Filed: January 7, 2005
Date of Patent: August 28, 2012
Assignee: Telefonaktiebolaget L M Ericsson (Publ)
Inventor: Johan Eker
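The quantity this method tries to reduce, the number of sectors that differ between the current and updated images, is easy to measure. The helper below is a hypothetical illustration of that metric only. It does not implement the optimization step itself, and the byte strings in the example are made up.

```python
def changed_sectors(old, new, sector_size):
    """Return the indices of sectors whose contents differ between two images."""
    n = max(len(old), len(new))
    return [
        start // sector_size
        for start in range(0, n, sector_size)
        if old[start:start + sector_size] != new[start:start + sector_size]
    ]

current = b"AAAABBBBCCCC"
# Layout 1: new code inserted in the middle shifts everything after it.
shifted = b"AAAAXBBBBCCCC"
# Layout 2: the same new code placed at the end leaves existing sectors untouched.
appended = b"AAAABBBBCCCCX"
```

With a sector size of 4, changed_sectors(current, shifted, 4) reports three rewritten sectors and changed_sectors(current, appended, 4) reports one, which is the kind of difference a layout-aware update step would exploit.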
-
Patent number: 8250554
Abstract: Systems and methods for dynamically generating computer executable technical support procedures, as well as updating/augmenting such executable procedures, by tracking and processing sequences of actions (execution traces) that are taken by experts (or users) when performing a procedure or when executing an executable procedure.
Type: Grant
Filed: August 8, 2007
Date of Patent: August 21, 2012
Assignee: International Business Machines Corporation
Inventors: Lawrence Bergman, Vittorio Castelli, Tessa Lau, Daniel Oblinger
-
Patent number: 8250556
Abstract: A system comprises a plurality of computation units interconnected by an interconnection network. A method for configuring the system comprises receiving an initial partitioning of instructions into initial subsets corresponding to different portions of a program; forming a refined partitioning of the instructions into refined subsets each including one or more of the initial subsets, including determining whether to combine a first subset and a second subset to form a third subset according to a comparison of a communication cost between the first subset and second subset and a load cost of the third subset that is based at least in part on a number of instructions issued per cycle by a computation unit; and assigning each refined subset of instructions to one of the computation units for execution on the assigned computation unit.
Type: Grant
Filed: February 7, 2008
Date of Patent: August 21, 2012
Assignee: Tilera Corporation
Inventors: Walter Lee, Robert A. Gottlieb, Vineet Soni, Anant Agarwal, Richard Schooler
-
Patent number: 8250555
Abstract: A system comprises a plurality of computation units interconnected by an interconnection network.
Type: Grant
Filed: February 7, 2008
Date of Patent: August 21, 2012
Assignee: Tilera Corporation
Inventors: Walter Lee, Robert A. Gottlieb, Vineet Soni, Anant Agarwal, Richard Schooler
-
Patent number: 8234636Abstract: A modification to source code is applied in an automated manner to improve program performance while maintaining the meaning of an associated program. Source code is rewritten to improve the operation of the associated program. Prior to applying the source code optimization to the program, confirmation of approval by the programmer must be maintained. In one embodiment, the programmer is presented with numerical data pertaining to an improvement ratio associated with application of the source code optimization.Type: GrantFiled: September 12, 2006Date of Patent: July 31, 2012Assignee: International Business Machines CorporationInventors: Motohiro Kawahito, Hideaki Komatsu, Toshio Nakatani
-
Patent number: 8234631Abstract: The present invention provides a method and system for tracing and monitoring distributed transactions spanning multiple threads or processes, running on multiple host systems, connected by a computer network. The correlation of distributed transactions is based on information that may uniquely identify execution paths within a virtual machine, in addition to information that may uniquely identify the virtual machine that processes the execution path. The correlation information is transferred from a monitored thread to threads that are activated by the monitored thread and allows parent-child relations between different threads to be reconstructed. Participating threads may run in different processes, in different virtual machines or on different host systems.Type: GrantFiled: August 14, 2008Date of Patent: July 31, 2012Assignee: dynaTrace Software GmbHInventors: Bernd Greifeneder, Markus Pfleger
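The parent-child reconstruction can be sketched as follows, assuming each traced execution path is identified by a `(vm_id, local_path_id)` pair and reports the identifier of the path that activated it. The event format and the function name `build_path_tree` are hypothetical, not taken from the patent.

```python
from collections import defaultdict


def build_path_tree(events):
    """Group execution paths under the path that activated them.

    Each event is (path_id, parent_path_id); a root path has parent None.
    A path_id is a (vm_id, local_path_id) pair, so paths stay distinct across VMs.
    """
    children = defaultdict(list)
    for path_id, parent_id in events:
        children[parent_id].append(path_id)
    return dict(children)


# Hypothetical trace: a request enters vm1, spawns a local worker and a remote call on vm2.
events = [
    (("vm1", 1), None),
    (("vm1", 2), ("vm1", 1)),
    (("vm2", 1), ("vm1", 1)),
]
tree = build_path_tree(events)
print(tree[None])        # root paths
print(tree[("vm1", 1)])  # paths activated by the root
```

Including the virtual machine identifier in the key is what keeps `("vm1", 1)` and `("vm2", 1)` from being confused even though their local path numbers match.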
-
Patent number: 8230395Abstract: Methods and systems are provided for automatically generating code from a graphical model representing a design to be implemented on components of a target computational hardware device. During the automatic code generating process, a memory mapping is automatically determined and generated to provide an optimization of execution of the program on the target device. The optimized memory mapping is incorporated into building the program executable from the automatically generated code of the graphical model.Type: GrantFiled: July 21, 2009Date of Patent: July 24, 2012Assignee: The MathWorks, Inc.Inventors: David Koh, Zijad Galijasevic
-
Publication number: 20120185836Abstract: In one embodiment, a method includes identifying a byte swap operation, building a domain including the byte swap operation and other expressions, identifying domain entries and domain exits associated with the domain, determining that a benefit will be obtained by performing a swap of the domain, and responsive to the determination performing the swap of the domain, and storing the swapped domain in a storage medium. Other embodiments are described and claimed.Type: ApplicationFiled: June 25, 2009Publication date: July 19, 2012Inventor: Mikhail Yurievich Loenko
-
Patent number: 8225295Abstract: We show that register allocation can be viewed as solving a collection of puzzles. We model the register file as a puzzle board and the program variables as puzzle pieces. We model pre-coloring by letting some of the puzzle pieces be already immovably placed on the puzzle board, and we model register aliasing by letting pieces have a plurality of widths. For a wide variety of computer architectures, we can solve the puzzles in polynomial time. Puzzle solving is independent of spilling, that is, puzzle solving can be combined with a wide variety of approaches to spilling.Type: GrantFiled: September 20, 2008Date of Patent: July 17, 2012Inventors: Jens Palsberg, Fernando M. Q. Pereira
-
Publication number: 20120117551Abstract: Source code is generated that includes one or more iterator-based expressions such as declarative queries. The source code is translated into an intermediate language that classifies operators making up the iterator-based expressions into classes based on whether the operators are aggregating, element-wise, or sink operators. The intermediate language, including the identified classes, is processed using an automaton to replace the iterator-based expressions with one or more equivalent non-iterator-based expressions. Where an iterator-based expression is nested, the nested expression is processed using an equivalent number of nested automatons. The resulting optimized source code may be compiled and executed using fewer virtual function calls than the equivalent non-optimized source code.Type: ApplicationFiled: November 10, 2010Publication date: May 10, 2012Applicant: Microsoft CorporationInventors: Michael Isard, Yuan Yu, Derek Gordon Murray
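The effect of the rewrite can be illustrated by hand. The sketch below shows an iterator-based expression and an equivalent non-iterator-based loop of the kind such a transformation might produce; it does not model the intermediate language or the automaton, and both function names are made up for this example.

```python
def iterator_version(values):
    """Iterator-based: a filter and a map (element-wise) feeding a sum (aggregating)."""
    return sum(x * x for x in values if x % 2 == 0)


def fused_loop_version(values):
    """Equivalent plain loop: no intermediate iterator objects are created."""
    total = 0
    for x in values:
        if x % 2 == 0:      # former filter stage
            total += x * x  # former map stage folded into the aggregation
    return total


data = list(range(10))
print(iterator_version(data) == fused_loop_version(data))
```

In a language where each iterator stage is reached through a virtual call per element, the fused form avoids those calls, which is the saving the abstract refers to.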
-
Publication number: 20120117552Abstract: Methods to improve optimization of compilation are presented. In one embodiment, a method includes identifying one or more optimization speculations with respect to a code region and speculatively performing transformation on an intermediate representation of the code region in accordance with an optimization speculation. The method includes generating an advice message corresponding to the optimization speculation and displaying the advice message if the optimization speculation results in an improved compilation result.Type: ApplicationFiled: November 9, 2010Publication date: May 10, 2012Inventors: Rakesh Krishnaiyer, Hideki Saito Ido, Ernesto Su, John L. Ng, Jin Lin, Xinmin Tian, Robert Y. Geva
-
Patent number: 8176470Abstract: A method, system and computer program product provide an implementation of software. A control flow of a software component is constructed based on a specification model. In various embodiments, the specification model comprises at least one input and at least one requirement referencing the at least one input. At least a partial implementation of the software component is generated based on the control flow and the at least one input and the at least one requirement of the specification model. In some embodiments, the specification model further comprises at least one output, and the at least a partial implementation of the software component is also based on the at least one output.Type: GrantFiled: October 13, 2006Date of Patent: May 8, 2012Assignee: International Business Machines CorporationInventors: Martin Klumpp, Jacques Joseph Labrie, Mary Ann Roth
-
Patent number: 8171464Abstract: An approach is provided for vectorizing misaligned references in compiled code for SIMD architectures that support only aligned loads and stores. In this framework, a loop is first simdized as if the memory unit imposes no alignment constraints. The compiler then inserts data reorganization operations to satisfy the actual alignment requirements of the hardware. Finally, the code generation algorithm generates SIMD codes based on the data reorganization graph, addressing realistic issues such as runtime alignments, unknown loop bounds, residual iteration counts, and multiple statements with arbitrary alignment combinations. Loop peeling is used to reduce the computational overhead associated with misaligned data. A loop prologue and epilogue are peeled from individual iterations in the simdized loop, and vector-splicing instructions are applied to the peeled iterations, while the steady-state loop body incurs no additional computational overhead.Type: GrantFiled: May 16, 2008Date of Patent: May 1, 2012Assignee: International Business Machines CorporationInventors: Alexandre E. Eichenberger, Kai-Ting Amy Wang, Peng Wu
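The split into prologue, steady-state body, and epilogue can be sketched as an index computation. This is a simplified illustration: it only finds the aligned range that a vector loop could cover, and it does not model the vector-splicing instructions the abstract applies to the peeled iterations. The function name `peel` and its parameters are assumptions.

```python
def peel(start: int, stop: int, vector_width: int):
    """Split [start, stop) into (prologue, aligned body, epilogue) index ranges.

    The body begins and ends on multiples of vector_width, so every vector
    access inside it is aligned; the peeled ranges hold the leftover elements.
    """
    # Round start up to the next aligned index, without running past stop.
    body_start = min(stop, -(-start // vector_width) * vector_width)
    # Round stop down to an aligned index, without moving before body_start.
    body_stop = max(body_start, (stop // vector_width) * vector_width)
    return (start, body_start), (body_start, body_stop), (body_stop, stop)


# Elements 3..17 with 4-element vectors.
prologue, body, epilogue = peel(3, 18, 4)
print(prologue, body, epilogue)
```

When the range is too short to contain an aligned vector, the body comes out empty and all iterations fall into the prologue.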
-
Patent number: 8166468Abstract: A computer implemented method, computer program product, and data processing system for reducing the number of inner classes in a compiled computer program written in an object-oriented programming language. An outer class of the compiled computer program is received, wherein the outer class contains an inner class, wherein the outer class comprises instructions to create an instance of an inner class. The instance is to be used as one of a callback, a listener command, a set of instructions by which an object instance of the inner class transfers information to the corresponding containing instance of the outer class, and combinations thereof. A transformation of the outer class is performed by moving methods of the inner class, as well as their contained instructions, into the outer class. The behavior of the compiled computer program remains unchanged.Type: GrantFiled: November 30, 2007Date of Patent: April 24, 2012Assignee: International Business Machines CorporationInventors: Sean Christopher Foley, Berthold Martin Lebert
-
Publication number: 20120096444Abstract: The various embodiments of the invention relate generally to computer software, computer program architecture, software development, and computer programming languages, and more specifically, to techniques for analyzing control flow in COBOL-sourced programs to facilitate optimized conversions to object-oriented program structures. For example, a compiler can include a global optimizer configured to analyze execution flow for a range of blocks of source code in the memory to determine flow-affected code. Also, the compiler can include a native code generator configured to generate native code based on representations of the native code as functions of the source code. The native code is configured to execute on a virtual machine.Type: ApplicationFiled: September 19, 2011Publication date: April 19, 2012Applicant: Micro Focus (US), Inc.Inventors: Jeremy Wright, Robert Sales
-
Publication number: 20120096448Abstract: A method to selectively remove memoizing functions from computer program code is disclosed herein. In one embodiment, such a method includes locating a memoizing function call in program code. The method then replaces the memoizing function call with a simple object allocation. Using escape analysis, the method determines whether the replacement is legal. If the replacement is not legal, the method removes the simple object allocation and reinserts the original memoizing function call in its place. If the replacement is legal, the method retains the simple object allocation in the program code. If desired, certain compiler optimizations, such as stack allocation and scalarization, may then be performed on the simple object allocation. A corresponding computer program product and apparatus are also disclosed herein.Type: ApplicationFiled: October 13, 2010Publication date: April 19, 2012Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventor: Patrick R. Doyle
-
Patent number: 8161467Abstract: A compiler includes a register allocator for allocating registers for instructions in a program to be compiled, and a code generator for generating object code based on the register allocation results performed by the register allocator. The register allocator allocates logical registers for instructions in the program to be compiled. The register allocation further allocates, to physical registers, the logical registers that are allocated to the instructions of the program, so that the physical registers that are live at a procedure call in the program to be compiled are allocated from the bottom of the register stack.Type: GrantFiled: October 31, 2007Date of Patent: April 17, 2012Assignee: International Business Machines CorporationInventors: Akira Koseki, Mikio Takeuchi, Hideaki Komatsu
-
Patent number: 8161470Abstract: Automated injection of Java bytecode instructions for Java load time optimization via runtime checking with upcasts. Exemplary embodiments include a method including generating a stack for each of a plurality of bytecodes, generating a subclass configured to keep a history of instructions that have modified the stack, statically scanning a plurality of Java classes associated with the plurality of bytecodes to locate class file configurations and bytecode patterns that cause loading of additional classes to complete a verification of each of the classes in the plurality of Java classes, rewriting the bytecodes to delay the loading of the additional classes until required at a runtime, recording modifications that have been made to the stack by the instructions, and applying the modifications to each of the bytecodes in the plurality of bytecodes.Type: GrantFiled: August 31, 2007Date of Patent: April 17, 2012Assignee: International Business Machines CorporationInventors: T. Mark W. Bottomley, Nicholas J. Doyle, Aleksandr V. Kennberg, Orlando E. Marquez, Amey A. Shirodkar
-
Publication number: 20120079469Abstract: Systems and methods for the vectorization of software applications are described. In some embodiments, source code dependencies can be expressed in ways that can extend a compiler's ability to vectorize otherwise scalar functions. For example, when compiling a called function, a compiler may identify dependencies of the called function on variables other than parameters passed to the called function. The compiler may record these dependencies, e.g., in a dependency file. Later, when compiling a calling function that calls the called function, the same (or another) compiler may reference the previously-identified dependencies and use them to determine whether and how to vectorize the calling function. In particular, these techniques may facilitate the vectorization of non-leaf loops. Because non-leaf loops are relatively common, the techniques described herein can increase the amount of vectorization that can be applied to many applications.Type: ApplicationFiled: September 23, 2010Publication date: March 29, 2012Inventor: Jeffry E. Gonion
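A rough sketch of the dependency-file idea follows. The record format (JSON with `reads` and `writes` lists) and the deliberately conservative vectorization rule are assumptions chosen for illustration; the abstract does not specify either, and the function names are hypothetical.

```python
import json


def record_dependencies(function_name, reads, writes):
    """Produce a dependency record for a called function's non-parameter variables."""
    return json.dumps({
        "function": function_name,
        "reads": sorted(reads),
        "writes": sorted(writes),
    })


def call_is_vectorizable(dependency_record, loop_written_vars):
    """Assumed conservative rule for a loop that calls the recorded function.

    The call is accepted only if the callee writes no non-parameter state and
    reads nothing that the calling loop itself writes.
    """
    info = json.loads(dependency_record)
    if info["writes"]:
        return False
    return not (set(info["reads"]) & set(loop_written_vars))


# Compiling the called function first: it only reads a global named "gain".
scale_record = record_dependencies("scale", reads={"gain"}, writes=set())

# Later, compiling a caller whose loop writes only "out".
print(call_is_vectorizable(scale_record, {"out"}))
```

Without such a record the compiler of the calling function would have to treat the call as opaque, which is why loops containing calls (non-leaf loops) are otherwise hard to vectorize.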