For A Parallel Or Multiprocessor System Patents (Class 717/149)

Loop compiling (Class 717/150)

Program execution optimization using uniform variable identification

Patent number: 9286196

Abstract: A method, apparatus and computer program, each for optimizing execution of a computer program is disclosed in which a topology-based control flow analysis of basic blocks of the computer program is performed and a data flow analysis of the instructions within the basic blocks is performed to determine if each instruction of said computer program is uniform or non-uniform (variant or invariant). Subsequently, when the computer program is executed, storage of a copy of a variable dependent on a uniform instruction is suppressed.
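
The uniform/non-uniform classification in this abstract can be illustrated with a minimal sketch. It assumes straight-line code inside a single basic block and ignores the control flow analysis the abstract also mentions. The instruction format, the function name `classify_uniform`, and the `thread_id` source are hypothetical and do not come from the patent.

```python
def classify_uniform(instructions, varying_sources):
    """Mark each destination as uniform (True) or non-uniform (False).

    instructions: list of (dest, operands) pairs in program order.
    varying_sources: operand names assumed to differ between threads.
    Operands never defined in the block are treated as uniform constants.
    """
    uniform = {}
    for dest, operands in instructions:
        # A result is uniform only if every operand it reads is uniform.
        uniform[dest] = all(
            op not in varying_sources and uniform.get(op, True)
            for op in operands
        )
    return uniform


program = [
    ("a", ["const_4"]),      # depends only on a constant
    ("b", ["thread_id"]),    # depends on a per-thread value
    ("c", ["a", "const_2"]),
    ("d", ["b", "c"]),       # tainted by b
]
result = classify_uniform(program, varying_sources={"thread_id"})
```

In this sketch, values marked uniform (`a` and `c`) are the candidates for which a per-thread copy could be skipped.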

Type: Grant

Filed: January 8, 2015

Date of Patent: March 15, 2016

Assignee: ARM Limited

Inventors: Jiangning Liu, Zhenqiang Chen
Hybrid parallelization strategies for machine learning programs on top of MapReduce

Patent number: 9286044

Abstract: Hybrid parallelization strategies for machine learning programs on top of MapReduce are provided. In one embodiment, a method of and computer program product for parallel execution of machine learning programs are provided. Program code is received. The program code contains at least one parallel for statement having a plurality of iterations. A parallel execution plan is determined for the program code. According to the parallel execution plan, the plurality of iterations is partitioned into a plurality of tasks. Each task comprises at least one iteration. The iterations of each task are independent.
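
As a rough illustration of partitioning the iterations of a parallel for statement into tasks, the sketch below splits an iteration range into contiguous chunks so that every task holds at least one iteration. The function name and the even-split policy are assumptions for illustration; the abstract does not say how the execution plan chooses task sizes.

```python
def partition_iterations(num_iterations, num_tasks):
    """Split range(num_iterations) into contiguous, non-empty tasks.

    Assumes num_iterations >= 1 and num_tasks >= 1. The number of tasks
    is capped so that no task ends up empty.
    """
    num_tasks = min(num_tasks, num_iterations)
    base, extra = divmod(num_iterations, num_tasks)
    tasks, start = [], 0
    for t in range(num_tasks):
        # The first `extra` tasks take one additional iteration.
        size = base + (1 if t < extra else 0)
        tasks.append(range(start, start + size))
        start += size
    return tasks


tasks = partition_iterations(10, 3)
```

Because the iterations are independent, each resulting range could be handed to a separate worker without coordination.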

Type: Grant

Filed: June 27, 2014

Date of Patent: March 15, 2016

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Matthias Boehm, Douglas Burdick, Berthold Reinwald, Prithviraj Sen, Shirish Tatikonda, Yuanyuan Tian, Shivakumar Vaithyanathan
Apparatus and method for executing code

Patent number: 9280330

Abstract: An apparatus and method for executing code are provided. The apparatus includes a memory manager that allocates a stack in memory to store processed data that needs to be retained; a loop generator that divides program code programmed to be processed in parallel into regions based on a barrier function, transforms a region that includes the processed data that needs to be retained in the stack into a first coalescing loop, and transforms a region that uses the processed data stored in the stack into a second coalescing loop such that the transformed program code may be serially processed; and a loop changer that reverses a processing order of the second coalescing loop in comparison to a processing order of the first coalescing loop.
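
The two coalescing loops and the reversed processing order can be sketched as follows. This is an illustrative serialization of a hypothetical kernel, not the patented implementation: the kernel body, the function name, and the use of a Python list as the stack are all assumptions.

```python
def run_serialized(num_items):
    """Serially emulate work-items that are separated by one barrier."""
    stack = []                    # holds values that must outlive region 1
    shared = [0] * num_items

    # Region before the barrier: first coalescing loop, ascending order.
    for wid in range(num_items):
        private = wid * wid       # per-work-item value needed after the barrier
        shared[wid] = private
        stack.append(private)

    # Barrier point: every work-item has finished region 1.
    total = sum(shared)

    # Region after the barrier: second coalescing loop, descending order,
    # so each pop returns the value pushed by the same work-item.
    out = [0] * num_items
    for wid in reversed(range(num_items)):
        private = stack.pop()
        out[wid] = total - private
    return out


output = run_serialized(4)
```

Reversing the second loop is what lets a plain last-in, first-out stack stand in for per-work-item storage.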

Type: Grant

Filed: March 31, 2014

Date of Patent: March 8, 2016

Assignee: SAMSUNG ELECTRONICS CO., LTD.

Inventors: Jin-Seok Lee, Seong-Gun Kim, Dong-Hoon Yoo, Seok-Joong Hwang
Distributed computations of graphical programs having a pattern

Patent number: 9262141

Abstract: In one embodiment, a computer-implemented method for concurrently processing at least a portion of a graphical model is provided. The method may include obtaining the graphical model; recognizing a pattern in the graphical model, the pattern suitable for concurrent processing; and employing concurrent processing using a multi-thread, multi-core, or multi-processor computing device when executing the pattern in the graphical model.

Type: Grant

Filed: September 10, 2007

Date of Patent: February 16, 2016

Assignee: The MathWorks, Inc.

Inventors: Donald Paul Orofino, II, Ramamurthy Mani, Michael James Longfritz
Programming a multi-processor system

Patent number: 9250867

Abstract: A computer-implemented method for creating a program for a multi-processor system comprising a plurality of interspersed processors and memories. A user may specify or create source code using a programming language. The source code specifies a plurality of tasks and communication of data among the plurality of tasks. However, the source code may not (and preferably is not required to) 1) explicitly specify which physical processor will execute each task and 2) explicitly specify which communication mechanism to use among the plurality of tasks. The method then creates machine language instructions based on the source code, wherein the machine language instructions are designed to execute on the plurality of processors. Creation of the machine language instructions comprises assigning tasks for execution on respective processors and selecting communication mechanisms between the processors based on location of the respective processors and required data communication to satisfy system requirements.

Type: Grant

Filed: May 22, 2014

Date of Patent: February 2, 2016

Assignee: Coherent Logix, Incorporated

Inventors: John Mark Beardslee, Michael B. Doerr, Tommy K. Eng
High-performance virtual machine networking

Patent number: 9213570

Abstract: A method for conveying a data packet received from a network to a virtual machine instantiated on a computer system coupled to the network, and a medium and system for carrying out the method, is described. In the method, a guest receive pointer queue of a component executing in the virtual machine is inspected in order to identify a location in a guest receive packet data buffer that is available to receive packet data. Data from the data packet received from the network is copied into the guest receive packet data buffer at the identified location, and a standard receive interrupt is raised in the virtual machine.

Type: Grant

Filed: January 29, 2015

Date of Patent: December 15, 2015

Assignee: VMware, Inc.

Inventor: Michael Nelson
Auto-cloudifying applications via runtime modifications

Patent number: 9207946

Abstract: An approach is provided in which a distributed runtime environment executes a software application that includes isolated runtime constructs corresponding to an isolated runtime environment. During the execution, the distributed runtime environment identifies isolated runtime constructs included in the software application and selects distributed runtime constructs corresponding to the isolated runtime constructs. In turn, the distributed runtime environment executes the distributed runtime constructs in lieu of executing the isolated runtime constructs.

Type: Grant

Filed: August 27, 2013

Date of Patent: December 8, 2015

Assignee: International Business Machines Corporation

Inventor: Douglas Davis
Distributed data-parallel execution engines for user-defined serial problems using branch-and-bound algorithm

Patent number: 9170846

Abstract: A distributed data-parallel execution (DDPE) system splits a computational problem into a plurality of sub-problems using a branch-and-bound algorithm, designates a synchronous stop time for a “plurality of processors” (for example, a cluster) for each round of execution, processes the search tree by recursively using a branch-and-bound algorithm in multiple rounds (without inter-processor communications), determines if further processing is required based on the processing round state data, and terminates processing on the processors when processing is completed.

Type: Grant

Filed: March 29, 2011

Date of Patent: October 27, 2015

Inventors: Daniel Delling, Mihai Budiu, Renato F. Werneck
High performance AVC encoder on a multi-core platform

Patent number: 9148669

Abstract: A method and system for encoding a digital video signal using a plurality of parallel processors. A digital picture is received that is composed of one or more GOPs. The CPU then determines the number of GOPs that need to be encoded and divides them into groups. The number of GOPs in a group may equal the number of parallel processors in the multi-core platform available to encode. The CPU transfers in a single batch to the multi-core platform, a frame of equal rank from each GOP contained in the first group. The multi-core platform encodes the frames in parallel, rearranges the encoded byte stream chunk into normal display order sequence and stores the encoded byte stream. The process may repeat until all the GOPs in the first group have been encoded. Upon completion the multi-core platform outputs the encoded byte stream in normal display order sequence.

Type: Grant

Filed: March 10, 2011

Date of Patent: September 29, 2015

Assignee: Sony Corporation

Inventors: Jonathan Huang, Tsaifa Yu
Systems and methods for executing an application on a mobile device

Patent number: 9146732

Abstract: The invention provides a method and system to execute applications on a mobile device. The applications may be compiled on a remote server and sent to the mobile device before execution. The applications may be updated by the remote server without interaction by the mobile device user.

Type: Grant

Filed: May 9, 2014

Date of Patent: September 29, 2015

Assignee: MFOUNDRY, INC.

Inventor: Rodney Aiglstorfer
Targeted cloud-based debugging

Patent number: 9146834

Abstract: Various arrangements for debugging code are presented. A computer system, such as a web server, may compile code into compiled code. The code may contain one or more subsections, including a first taskflow. A selection of the first taskflow may be received from a remote, developer computer system via a network. The selection of the first taskflow may indicate that the first taskflow is to be debugged. Execution of the first taskflow of the compiled code may occur by the computer system. While the computer system is executing the first taskflow of the compiled code, debugging functionality of the first taskflow may be provided to the developer computer system.

Type: Grant

Filed: May 1, 2014

Date of Patent: September 29, 2015

Assignee: Oracle International Corporation

Inventor: John Smiljanic
Methods and systems for optimizing the performance of software applications at runtime

Patent number: 9128747

Abstract: Systems and methods for optimizing the performance of software applications are described. Embodiments include computer implemented steps for identifying at least two constituent software components for parallel execution, executing the identified software components, profiling the performance of the one or more software components at an execution time, creating an optimization model with the set of data gathered from profiling the execution of the one or more software components, and marking at least two software components for execution in parallel in a subsequent execution on the basis of the optimization model. In additional embodiments, the optimization model may be reconfigured on the basis of a cost-benefit analysis of parallelization, and the software components involved marked for sequential execution if the resource overhead associated with parallelization exceeds the corresponding resource or throughput benefit.

Type: Grant

Filed: June 25, 2012

Date of Patent: September 8, 2015

Assignee: Infosys Limited

Inventor: Prasanna Rajaraman
Computer method and system for storing and presenting sequence data

Patent number: 9110872

Abstract: Genetic sequence data occurring in genome sequences is represented for efficient access of the sequence information in a defined storage scheme. A described replet-sequence matrix data structure allows the compression and efficient access of sequence information. The data structure allows the dynamic change of ontology: the replet-information table can evolve by adding, updating, removing replets, and the set of replets present in the table represent the ontology at the moment. The data structure enables the sequence information to be processed in parallel, and also enables multiple views of the sequence data to exist along with replet specific information.

Type: Grant

Filed: October 31, 2003

Date of Patent: August 18, 2015

Assignee: International Business Machines Corporation

Inventor: Jagir Razak Jainul Abdeen Hussan
Protection of user data in hosted application environments

Patent number: 9098709

Abstract: A method of converting an original application into a cloud-hosted application includes splitting the original application into a plurality of application components along security relevant boundaries, mapping the application components to hosting infrastructure boundaries, and using a mechanism to enforce a privacy policy of a user. The mapping may include assigning each application component to a distinct virtual machine, which acts as a container for its assigned component.

Type: Grant

Filed: November 13, 2012

Date of Patent: August 4, 2015

Assignee: International Business Machines Corporation

Inventors: Mihai Christodorescu, Dimitrios Pendarakis, Kapil K. Singh
Process mapping in parallel computing

Patent number: 9063826

Abstract: A method of mapping processes to processors in a parallel computing environment where a parallel application is to be run on a cluster of nodes wherein at least one of the nodes has multiple processors sharing a common memory, the method comprising using compiler based communication analysis to map Message Passing Interface processes to processors on the nodes, whereby at least some more heavily communicating processes are mapped to processors within nodes. Other methods, apparatus, and computer readable media are also provided.
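
One way to picture mapping heavily communicating processes onto the same node is a greedy placement over a table of pairwise traffic. The sketch below assumes such a table is already available (in the patent it would come from compiler-based communication analysis); the function name, the greedy policy, and the traffic figures are hypothetical.

```python
def map_processes(traffic, num_nodes, cores_per_node):
    """Greedily co-locate the heaviest-communicating process pairs.

    traffic: dict mapping (p, q) process pairs to communication volume.
    Assumes num_nodes * cores_per_node is enough for all processes.
    Returns a dict mapping process id to node index.
    """
    procs = sorted({p for pair in traffic for p in pair})
    free = [cores_per_node] * num_nodes
    placement = {}

    def place(proc, node):
        placement[proc] = node
        free[node] -= 1

    # Visit pairs from heaviest to lightest traffic.
    for (p, q), _ in sorted(traffic.items(), key=lambda kv: -kv[1]):
        if p in placement and q in placement:
            continue
        if p not in placement and q not in placement:
            node = next((n for n in range(num_nodes) if free[n] >= 2), None)
            if node is not None:
                place(p, node)
                place(q, node)
        else:
            placed, other = (p, q) if p in placement else (q, p)
            node = placement[placed]
            if free[node] >= 1:
                place(other, node)

    # Anything still unplaced goes to the first node with a free core.
    for proc in procs:
        if proc not in placement:
            place(proc, next(n for n in range(num_nodes) if free[n] >= 1))
    return placement


placement = map_processes(
    {(0, 1): 10, (2, 3): 900, (0, 2): 5, (1, 3): 800},
    num_nodes=2,
    cores_per_node=2,
)
```

Here processes 2 and 3, which exchange the most data, end up sharing a node and can communicate through common memory instead of the network.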

Type: Grant

Filed: November 28, 2011

Date of Patent: June 23, 2015

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Dibyendu Das, Nagarajan Kathiresan, Rajan Ravindran, Bhaskaran Venkatsubramaniam
Optimization of loops and data flow sections in multi-core processor environment

Patent number: 9043769

Abstract: The present invention relates to a method for compiling code for a multi-core processor, comprising: detecting and optimizing a loop, partitioning the loop into partitions executable and mappable on physical hardware with optimal instruction level parallelism, optimizing the loop iterations and/or loop counter for ideal mapping on hardware, and chaining the loop partitions, generating a list representing the execution sequence of the partitions.

Type: Grant

Filed: December 28, 2010

Date of Patent: May 26, 2015

Assignee: Hyperion Core Inc.

Inventor: Martin Vorbach
Program module applicability analyzer for software development and testing for multi-processor environments

Patent number: 9043770

Abstract: In one embodiment, a machine-implemented method programs a heterogeneous multi-processor computer system to run a plurality of program modules, wherein each program module is to be run on one of the processors. The system includes a plurality of processors of two or more different processor types. According to the recited method, machine-implemented offline processing is performed using a plurality of SIET tools of a scheduling information extracting toolkit (SIET) and a plurality of SBT tools of a schedule building toolkit (SBT). A program module applicability analyzer (PMAA) determines whether a first processor of a first processor type is capable of running a first program module without compiling the first program module. Machine-implemented online processing is performed using realtime data to test the scheduling software and the selected schedule solution.

Type: Grant

Filed: January 23, 2013

Date of Patent: May 26, 2015

Assignee: LSI Corporation

Inventors: Pavel Aleksandrovich Aliseychik, Petrus Sebastiaan Adrianus Daniel Evers, Denis Vasilevich Parfenov, Alexander Nikolaevich Filippov, Denis Vladimirovich Zaytsev
Method for partitioning programs between a general purpose core and one or more accelerators

Patent number: 9038040

Abstract: Partitioning programs between a general purpose core and one or more accelerators is provided. A compiler front end is provided for converting a program source code in a corresponding high level programming language into an intermediate code representation. This intermediate code representation is provided to an interprocedural optimizer which determines which core processor or accelerator each portion of the program should execute on and partitions the program into sub-programs based on this set of decisions. The interprocedural optimizer may further add instructions to the partitions to coordinate and synchronize the sub-programs as required. Each sub-program is compiled on an appropriate compiler backend for the instruction set architecture of the particular core processor or accelerator selected to execute the sub-program. The compiled sub-programs are then linked to thereby generate an executable program.

Type: Grant

Filed: January 25, 2006

Date of Patent: May 19, 2015

Assignee: International Business Machines Corporation

Inventors: John Kevin Patrick O'Brien, Kathryn M. O'Brien, Daniel A. Prener
Image sensing and printing device

Patent number: 9036162

Abstract: An image sensing and printing digital camera device includes a housing defining a slot for receiving a printed instruction card having printed thereon an array of dots representing a programming script, the housing further storing therein a roll of print media; an area image sensor for sensing an image and generating pixel data representing the image; a linear image sensor for scanning the array of dots on the card and converting the array of dots into a data signal; a microcontroller provided in the housing, the microcontroller for decoding the data signal into the programming script and applying the programming script on the pixel data; and a printing mechanism for printing the pixel data, having applied thereto the programming script, on the roll of print media. The microcontroller integrates on a single chip a VLIW processor, a printhead interface, and an output buffer effecting communication between the VLIW processor and the printhead interface.

Type: Grant

Filed: July 3, 2012

Date of Patent: May 19, 2015

Assignee: Google Inc.

Inventor: Kia Silverbrook
Streamlining hardware initialization code

Patent number: 9021426

Abstract: According to one embodiment of the present disclosure, hardware initialization code and error action information are retrieved from separate storage areas. The hardware initialization code includes code that initializes a device, and also includes placeholders corresponding to actions that are performed when the device fails initialization. Likewise, the error action information describes the actions that are performed when the device fails initialization. The error action information is converted into macros that include lines of code. As such, the error action placeholders are matched to the macros and, in turn, each of the error action placeholders is replaced with the lines of code corresponding to the matched macros.

Type: Grant

Filed: December 4, 2012

Date of Patent: April 28, 2015

Assignee: International Business Machines Corporation

Inventors: Daniel M. Crowell, John Farrugia, Michael J. Jones, David Dean Sanner
SOURCE-TO-SOURCE TRANSFORMATIONS FOR GRAPH PROCESSING ON MANY-CORE PLATFORMS

Publication number: 20150113514

Abstract: Methods are provided for source-to-source transformations for graph processing on many-core platforms. A method includes receiving a graph application including one graph, expressed by a graph application programming interface configured for defining and manipulating graphs. The method further includes transforming, by a source-to-source compiler, the graph application into a plurality of parallel code variants. Each of the plurality of parallel code variants is specifically configured for parallel execution by a target one of a plurality of different many-core processors. The method also includes selecting and tuning, by a runtime component, a particular one of the parallel code variants for the parallel execution responsive to graph application characteristics, graph data, and an underlying code execution platform of the plurality of different many-core processors.

Type: Application

Filed: October 9, 2014

Publication date: April 23, 2015

Inventors: Srimat Chakradhar, Michela Becchi, Da Li
Method and apparatus for transforming program code

Patent number: 9015683

Abstract: Provided is a method of transforming program code written such that a plurality of work-items are allocated respectively to and concurrently executed on a plurality of processing elements included in a computing unit. A program code translator may identify, in the program code, two or more code regions, which are to be enclosed by work-item coalescing loops (WCLs), based on a synchronization barrier function contained in the program code, such that the work-items are serially executable on a smaller number of processing elements than a number of the processing elements, and may enclose the identified code regions with the WCLs, respectively.

Type: Grant

Filed: December 23, 2010

Date of Patent: April 21, 2015

Assignees: Samsung Electronics Co., Ltd., SNU R&DB Foundation

Inventors: Seung-Mo Cho, Jong-Deok Choi, Jaejin Lee
Vectorization of scalar functions including vectorization annotations and vectorized function signatures matching

Patent number: 9015688

Abstract: Methods and apparatuses associated with vectorization of scalar callee functions are disclosed herein. In various embodiments, compiling a first program may include generating one or more vectorized versions of a scalar callee function of the first program, based at least in part on vectorization annotations of the first program. Additionally, compiling may include generating one or more vectorized function signatures respectively associated with the one or more vectorized versions of the scalar callee function. The one or more vectorized function signatures may enable an appropriate vectorized version of the scalar callee function to be matched and invoked for a generic call from a caller function of a second program to a vectorized version of the scalar callee function.

Type: Grant

Filed: April 1, 2011

Date of Patent: April 21, 2015

Assignee: Intel Corporation

Inventors: Xinmin Tian, Sergey Stanislavoich Kozhukhov, Sergey Victorovich Preis, Robert Yehuda Geva, Konstantin Anatolyevich Pyjov, Hideki Sato, Milind Baburao Girkar, Aleksei Gurievich Kasov, Nikolay Vladimirovich Panchenko
Deterministic sharing of data among concurrent tasks using pre-defined deterministic conflict resolution policies

Patent number: 9009726

Abstract: A “Concurrent Sharing Model” provides a programming model based on revisions and isolation types for concurrent revisions of states, data, or variables shared between two or more concurrent tasks or programs. This model enables revisions of shared states, data, or variables to maintain determinacy despite nondeterministic scheduling between concurrent tasks or programs. More specifically, the Concurrent Sharing Model provides various techniques wherein shared states, data, or variables are conceptually replicated on forks, and only copied or written if necessary, then deterministically merged on joins such that concurrent tasks or programs can work with independent local copies of the shared states, data, or variables while ensuring automated conflict resolution.
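
The fork, isolate, and join cycle described here can be sketched in a few lines. This is a simplified illustration rather than the model itself: the class and method names are hypothetical, the fork copies state eagerly instead of only when necessary, and "the joined task's writes win" is just one assumed conflict resolution policy.

```python
class Revision:
    """An isolated view of shared state with deterministic merge on join."""

    def __init__(self, snapshot):
        self._snapshot = snapshot   # state as seen at fork time
        self._writes = {}           # keys written in this revision

    def read(self, key):
        return self._writes.get(key, self._snapshot.get(key))

    def write(self, key, value):
        self._writes[key] = value

    def fork(self):
        # The child sees the parent's state as of the fork (eager copy here).
        return Revision({**self._snapshot, **self._writes})

    def join(self, child):
        # Assumed policy: values written by the joined child override
        # the parent's, regardless of how the tasks were scheduled.
        self._writes.update(child._writes)


main = Revision({"x": 0, "y": 0})
child = main.fork()
main.write("x", 1)
seen_by_child = child.read("x")   # child is isolated from the parent's write
child.write("x", 2)
child.write("y", 5)
main.join(child)
```

Because the merge depends only on which revision wrote which key, the result after the join is the same for every interleaving of the two tasks.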

Type: Grant

Filed: December 10, 2010

Date of Patent: April 14, 2015

Assignee: Microsoft Technology Licensing, LLC

Inventors: Sebastian Burckhardt, Daniel Johannes Pieter Leijen, Alexandro Baldassin
Programming in a multiprocessor environment

Patent number: 9009660

Abstract: Programming in a multiprocessor environment includes accepting a program specification that defines a plurality of processing modules and one or more channels for sending data between ports of the modules, mapping each of the processing modules to run on a set of one or more processing engines of a network of interconnected processing engines, and for at least some of the channels, assigning one or more elements of one or more processing engines in the network to the channel for sending data between respective processing modules.

Type: Grant

Filed: November 29, 2006

Date of Patent: April 14, 2015

Assignee: Tilera Corporation

Inventors: Patrick Robert Griffin, Walter Lee, Anant Agarwal, David Wentzlaff
IRREDUCIBLE MODULES

Publication number: 20150100948

Abstract: An approach to generating irreducible modules. The approach includes a method that includes receiving, by at least one computing device, data associated with a specification. The method includes defining, by the at least one computing device, a pattern on the received data. The pattern reduces a set of rules into a single condition. The method includes generating, by the at least one computing device, an irreducible module based on the pattern. The irreducible module has one output dependent variable and is associated with a data flow application.

Type: Application

Filed: October 8, 2013

Publication date: April 9, 2015

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor: William J. Lewis
PROCESSING METHOD

Publication number: 20150100949

Abstract: A method for processing computer program code to enable different parts of the computer program code to be executed by different processing elements of a plurality of communicating processing elements. The method comprises identifying at least one first part of the computer program code, which is to be executed by a particular one of said processing elements. The method further comprises identifying at least one further part of the computer code which is related to the at least one first part of the computer code. The at least one first part of the computer program code and the at least one further part of the computer program code are caused to be executed by the particular one of said processing elements.

Type: Application

Filed: December 15, 2014

Publication date: April 9, 2015

Applicant: CODEPLAY SOFTWARE LIMITED

Inventors: Jens-Uwe Dolinsky, Andrew Richards, Colin Riley
Analytic engine to parallelize serial code

Patent number: 9003383

Abstract: The subject system provides the ability to parallelize pre-existing serial code by importing and encapsulating all of the serial code into an object orientated flowchart language utilizing an analytic engine so that the imported code can be efficiently executed taking advantage of the partially ordered transitive flowchart system. The importation examines the serial code to ascertain what elements may be processed under an atomic time to instantiate them as either Action or Test objects, whereas statements which require more than atomic time are instantiated as Task object, with the Action, Test and Task objects being processable by separate processors to establish parallel processing, or by the multitasking afforded by the partially ordered transitive flowchart system.

Type: Grant

Filed: July 5, 2012

Date of Patent: April 7, 2015

Assignee: You Know Solutions, LLC

Inventors: Ronald J. Lavallee, Thomas C. Peacock
Optimized division of work among processors in a heterogeneous processing system

Patent number: 8997071

Abstract: A compiler implemented by a computer performs optimized division of work across heterogeneous processors. The compiler divides source code into code sections and characterizes each of the code sections based on pre-defined criteria. Each of the code sections is characterized as at least one of: allocate to a main processor, allocate to a processing element, allocate to one of a parameterized main processor and a parameterized processing element, and indeterminate. The compiler analyzes side-effects and costs of executing the code sections on allocated processors, and transforms the code sections based on results of the analyzing. The transforming includes re-characterizing the code sections for alternate execution in a runtime environment.

Type: Grant

Filed: September 10, 2012

Date of Patent: March 31, 2015

Assignee: International Business Machines Corporation

Inventors: Tong Chen, John K. P. O'Brien, Zehra N. Sura
Intraprocedural privatization for shared array references within partitioned global address space (PGAS) languages

Patent number: 8990791

Abstract: Partitioned global address space (PGAS) programming language source code is retrieved by an executed PGAS compiler. At least one shared memory array access indexed by an affine expression that includes a distinct thread identifier that is constant and different for each of a group of program execution threads targeted to execute the PGAS source code is identified within the PGAS source code. It is determined whether the at least one shared memory array access results in a local shared memory access by all of the group of program execution threads for all references to the at least one shared memory array access during execution of a compiled executable of the PGAS source code. A direct memory access executable code is generated for each shared memory array access determined to result in the local shared memory access by all of the group of program execution threads.

Type: Grant

Filed: July 29, 2011

Date of Patent: March 24, 2015

Assignee: International Business Machines Corporation

Inventors: Salem Derisavi, Ettore Tiotto
Method of converting program code of program running in multi-thread to program code causing less lock collisions, computer program and computer system for the same

Patent number: 8972959

Abstract: A method of converting a program code of a program running in multi-thread to a program code which causes fewer lock collisions. The method includes reading the program code into a memory and searching the program code for a first conditional statement making a branch to a path, which is in a synchronized block and has no side effect on the synchronized block; duplicating the path having no side effect to which the branch is made by the searched first conditional statement into the outside of the synchronized block; and adding a second conditional statement into the program code in response to the duplication, wherein the second conditional statement is a conditional statement making a branch to the duplicated path having no side effect. Also provided is a system and an article of manufacture which causes a computer to carry out the steps of the above method.

Type: Grant

Filed: April 27, 2010

Date of Patent: March 3, 2015

Assignee: International Business Machines Corporation

Inventor: Kazuaki Ishizaki
AUTO MULTI-THREADING IN MACROSCALAR COMPILERS

Publication number: 20150058832

Abstract: Systems and methods for the parallelization of software applications are described. In some embodiments, a compiler may automatically identify within source code dependencies of a function called by another function. A persistent database may be generated to store identified dependencies. When calls to the function are encountered within the source code, the persistent database may be checked, and a parallelized implementation of the function may be employed dependent upon the dependency indicated in the persistent database.

Type: Application

Filed: November 4, 2014

Publication date: February 26, 2015

Inventor: Jeffry E. Gonion
Processors and compiling methods for processors

Patent number: 8966459

Abstract: A compiling method compiles an object program to be executed by a processor having a plurality of execution units operable in parallel. In the method a first availability chain is created from a producer instruction (p1), scheduled for execution by a first one of the execution units (20: AGU), to a first consumer instruction (c1), scheduled for execution by a second one of the execution units (22: EXU) and requiring a value produced by the said producer instruction. The first availability chain comprises at least one move instruction (mv1-mv3) for moving the required value from a first point (20: ARF) accessible by the first execution unit to a second point (22: DRF) accessible by the second execution unit.

Type: Grant

Filed: February 20, 2014

Date of Patent: February 24, 2015

Assignee: Altera Corporation

Inventors: Marcio Merino Fernandes, Raymond Malcolm Livesley
Vector width-aware synchronization-elision for vector processors

Patent number: 8966461

Abstract: A medium, method, and apparatus are disclosed for eliding superfluous function invocations in a vector-processing environment. A compiler receives program code comprising a width-contingent invocation of a function. The compiler creates a width-specific executable version of the program code by determining a vector width of a target computer system and omitting the function from the width-specific executable if the vector width meets one or more criteria. For example, the compiler may omit the function call if the vector width is greater than a minimum size.

Type: Grant

Filed: September 29, 2011

Date of Patent: February 24, 2015

Assignee: Advanced Micro Devices, Inc.

Inventors: Benedict R. Gaster, Lee W. Howes, Mark D. Hummel
Parallel processing of data

Patent number: 8959499

Abstract: A data parallel pipeline may specify multiple parallel data objects that contain multiple elements and multiple parallel operations that operate on the parallel data objects. Based on the data parallel pipeline, a dataflow graph of deferred parallel data objects and deferred parallel operations corresponding to the data parallel pipeline may be generated and one or more graph transformations may be applied to the dataflow graph to generate a revised dataflow graph that includes one or more of the deferred parallel data objects and deferred, combined parallel data operations. The deferred, combined parallel operations may be executed to produce materialized parallel data objects corresponding to the deferred parallel data objects.

Type: Grant

Filed: September 20, 2013

Date of Patent: February 17, 2015

Assignee: Google Inc.

Inventors: Craig D. Chambers, Ashish Raniwala, Frances J. Perry, Stephen R. Adams, Robert R. Henry, Robert Bradshaw, Nathan Weizenbaum
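
The deferred-evaluation idea in the abstract above can be sketched in a few lines. This is only an illustration under simplifying assumptions: the class `Deferred` and the helper `_fuse` are hypothetical names, the pipeline is a linear chain rather than a general dataflow graph, and the fused operation runs serially instead of on parallel workers.

```python
class Deferred:
    """Hypothetical deferred collection: records operations, runs nothing."""

    def __init__(self, source, ops=()):
        self._source = source
        self._ops = tuple(ops)  # recorded ("map" | "filter", function) pairs

    def map(self, fn):
        return Deferred(self._source, self._ops + (("map", fn),))

    def filter(self, predicate):
        return Deferred(self._source, self._ops + (("filter", predicate),))

    def materialize(self):
        # Transformation step: combine the recorded chain into one
        # operation, then execute it in a single pass over the source.
        fused = _fuse(self._ops)
        result = []
        for element in self._source:
            keep, value = fused(element)
            if keep:
                result.append(value)
        return result


def _fuse(ops):
    """Combine a chain of element-wise operations into one function."""

    def fused(value):
        for kind, fn in ops:
            if kind == "map":
                value = fn(value)
            elif not fn(value):  # "filter": drop the element early
                return False, None
        return True, value

    return fused
```

For example, `Deferred(range(6)).map(lambda x: x * x).filter(lambda x: x % 2 == 0).map(lambda x: x + 1).materialize()` builds three deferred operations and executes them as one combined pass.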
Parallelization method, system and program

Patent number: 8959498

Abstract: A parallelization method, system and program. A program expressed by a block diagram or the like is divided into strands, and calculation time is balanced among the strands. The functional blocks are divided into strands, and the strand with the maximum calculation time in the strand set is found. One or more movable blocks in that strand are then found. Next, the calculation time of each strand is obtained for the case where a movable block is moved to the strand in the input or output direction, according to the block's property, and the block is moved to the strand that most reduces the calculation time of the strand that had the maximum calculation time before the move. This process repeats until the calculation time is no longer reduced. The strands are then transformed into source code, which is compiled and assigned to separate cores or processors for execution.

Type: Grant

Filed: February 22, 2011

Date of Patent: February 17, 2015

Assignee: International Business Machines Corporation

Inventors: Hideaki Komatsu, Takeo Yoshizawa
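
The balancing loop described in the abstract above can be approximated with a greedy sketch. Several things here are assumptions rather than details from the patent: strands are modeled as an ordered pipeline of `(name, cost)` blocks, only the first or last block of the slowest strand is treated as movable, and the function name `balance_strands` is hypothetical.

```python
def balance_strands(strands):
    """Greedily move boundary blocks out of the slowest strand.

    `strands` is a list of lists of (name, cost) pairs in pipeline order.
    A balanced copy is returned; the input is left unchanged.
    """
    strands = [list(s) for s in strands]
    if not strands:
        return strands

    def cost(strand):
        return sum(c for _, c in strand)

    while True:
        # Find the strand with the maximum calculation time.
        worst_idx = max(range(len(strands)), key=lambda k: cost(strands[k]))
        worst = cost(strands[worst_idx])
        source = strands[worst_idx]

        best = None  # (resulting peak, neighbour index, take first block?)
        if len(source) > 1:
            moves = []
            if worst_idx > 0:
                moves.append((worst_idx - 1, True))   # input direction
            if worst_idx + 1 < len(strands):
                moves.append((worst_idx + 1, False))  # output direction
            for neighbour, take_first in moves:
                block_cost = source[0][1] if take_first else source[-1][1]
                peak = max(worst - block_cost,
                           cost(strands[neighbour]) + block_cost)
                if peak < worst and (best is None or peak < best[0]):
                    best = (peak, neighbour, take_first)

        if best is None:  # no move reduces the time any further
            return strands

        _, neighbour, take_first = best
        if take_first:
            strands[neighbour].append(source.pop(0))
        else:
            strands[neighbour].insert(0, source.pop())
```

Because blocks only cross adjacent strand boundaries, the original block order along the pipeline is preserved.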
System and method for dynamically spawning thread blocks within multi-threaded processing systems

Patent number: 8959497

Abstract: One embodiment of the present invention sets forth a technique for partitioning a predecessor thread program into sub-programs and dynamically spawning a thread grid of the sub-programs based on the outcome of a conditional statement in the predecessor thread program. The programming instructions for the predecessor thread program are analyzed to assess the benefit of partitioning the thread program at a conditional statement into sub-programs. If the predecessor thread program is partitioned, then each branch of the conditional statement may be used to form a separate sub-program. Predicate tables are populated at the predecessor thread program run-time to establish which possible instances of the thread sub-programs should be spawned in subsequent execution phases.

Type: Grant

Filed: August 29, 2008

Date of Patent: February 17, 2015

Assignee: NVIDIA Corporation

Inventors: John A. Stratton, David Luebke
Method and apparatus and record carrier

Patent number: 8954941

Abstract: Method of generating respective instruction compaction schemes for subsets of instructions to be processed by a programmable processor, comprising the steps of a) receiving at least one input code sample representative for software to be executed on the programmable processor, the input code comprising a plurality of instructions defining a first set of instructions (S1), b) initializing a set of removed instructions as empty (S3), c) determining the most compact representation of the first set of instructions (S4) d) comparing the size of said most compact representation with a threshold value (S5), e) carrying out steps e1 to e3 if the size is larger than said threshold value, e1) determining which instruction of the first set of instructions has a highest coding cost (S6), e2) removing said instruction having the highest coding cost from the first set of instructions and (S7), e3) adding said instruction to the set of removed instructions (S8), f) repeating steps b-f, wherein the first set of instructions

Type: Grant

Filed: September 3, 2010

Date of Patent: February 10, 2015

Assignee: Intel Corporation

Inventors: Hendrik Tjeerd Joannes Zwartenkot, Alexander Augusteijn, Yuanging Guo, Jürgen Von Oerthel, Jeroen Anton Johan Leijten, Erwan Yann Maurice Le Thenaff
Method and system for parallelization of sequential computer program codes

Patent number: 8949786

Abstract: A method and system for parallelization of sequential computer program code are described. In one embodiment, an automatic parallelization system includes a syntactic analyzer to analyze the structure of the sequential computer program code and identify the positions at which to insert SPI into the sequential computer program code; a profiler for profiling the sequential computer program code by preparing a call graph to determine the dependency of each line of the sequential computer program code and the time required for the execution of each function of the sequential computer program code; an analyzer to determine parallelizability of the sequential computer program code from the information obtained by analyzing and profiling the sequential computer program code; and a code generator to insert SPI into the sequential computer program code upon determination of parallelizability to obtain parallel computer program code, which is then output to a parallel computing environment for execution. A corresponding method is also described.

Type: Grant

Filed: December 1, 2009

Date of Patent: February 3, 2015

Assignee: KPIT Technologies Limited

Inventors: Vinay G. Vaidya, Ranadive Priti, Sah Sudhakar
Mechanism for increasing parallelization in computer programs with read-after-write dependencies associated with prefix operations

Patent number: 8949852

Abstract: Some embodiments provide a system that increases parallelization in a computer program. During operation, the system obtains a binary associative operator and an ordered set of elements associated with a prefix operation in the computer program. Next, the system divides the elements into multiple sets of contiguous iterations based on a number of processors used to execute the computer program. The system then performs, in parallel on the processors, a set of local reductions on the contiguous iterations using the binary associative operator. Afterwards, the system calculates a set of boundary prefixes between the contiguous iterations using the local reductions. Finally, the system applies, in parallel on the processors, the boundary prefixes to the contiguous iterations using the binary associative operator to obtain a set of prefixes for the prefix operation.

Type: Grant

Filed: June 29, 2009

Date of Patent: February 3, 2015

Assignee: Oracle America, Inc.

Inventor: Robert E. Cypher
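
The three phases named in the abstract above (local reductions, boundary prefixes, applying the boundaries) can be sketched as follows. This is an illustrative sketch, not the patented system: the function name `parallel_prefix` and the `workers` parameter are assumptions, and a thread pool stands in for separate processors (in CPython, threads show the structure but will not speed up pure-Python operators).

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
from itertools import accumulate


def parallel_prefix(elements, op, workers=4):
    """Inclusive prefix operation using an associative binary operator."""
    elements = list(elements)
    if not elements:
        return []

    # Divide the elements into at most `workers` contiguous chunks.
    size = -(-len(elements) // workers)  # ceiling division
    chunks = [elements[i:i + size] for i in range(0, len(elements), size)]

    def scan_with_carry(pair):
        carry, chunk = pair
        out = []
        for x in chunk:
            carry = op(carry, x)  # carry stays on the left: order matters
            out.append(carry)
        return out

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Phase 1: local reduction of every chunk, in parallel.
        totals = list(pool.map(lambda c: reduce(op, c), chunks))

        # Phase 2: boundary prefixes, i.e. the combined value of all
        # elements that precede each chunk after the first (short, serial).
        boundaries = list(accumulate(totals[:-1], op))

        # Phase 3: apply each boundary prefix to its chunk, in parallel.
        first = pool.submit(lambda: list(accumulate(chunks[0], op)))
        rest = list(pool.map(scan_with_carry, zip(boundaries, chunks[1:])))

    result = first.result()
    for part in rest:
        result.extend(part)
    return result
```

The operator only needs to be associative, not commutative, because every combination keeps earlier elements on the left; string concatenation works as well as addition.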
Systems and methods for compiler-based full-function vectorization

Patent number: 8949808

Abstract: Systems and methods for the vectorization of software applications are described. In some embodiments, a compiler may automatically generate both scalar and vector versions of a function from a single source code description. A vector interface may be exposed in a persistent dependency database that is associated with the function. This may allow a compiler to make vector function calls from within vectorized loops, rather than making multiple serialized scalar function calls from within a vectorized loop. This may in turn facilitate the vectorization of hierarchical code, which may improve application performance when vector execution resources are available.

Type: Grant

Filed: September 23, 2010

Date of Patent: February 3, 2015

Assignee: Apple Inc.

Inventor: Jeffry E. Gonion
Compiling code for parallel processing architectures based on control flow

Patent number: 8949806

Abstract: A system comprises a plurality of computation units interconnected by an interconnection network.

Type: Grant

Filed: August 17, 2012

Date of Patent: February 3, 2015

Assignee: Tilera Corporation

Inventors: Walter Lee, Robert A. Gottlieb, Vineet Soni, Anant Agarwal, Richard Schooler
Saving and loading graphical processing unit (GPU) arrays providing high computational capabilities in a computing environment

Patent number: 8949807

Abstract: A device receives, via a technical computing environment, a program that includes a parallel construct and a command to be executed by graphical processing units, and analyzes the program. The device also creates, based on the parallel construct and the analysis, one or more instances of the command to be executed in parallel by the graphical processing units, and transforms, via the technical computing environment, the one or more command instances into one or more command instances that are executable by the graphical processing units. The device further allocates the one or more transformed command instances to the graphical processing units for parallel execution, and receives, from the graphical processing units, one or more results associated with parallel execution of the one or more transformed command instances by the graphical processing units.

Type: Grant

Filed: September 30, 2013

Date of Patent: February 3, 2015

Assignee: The MathWorks, Inc.

Inventors: Halldor N. Stefansson, Edric Ellis
Processing method

Patent number: 8949805

Abstract: A method for processing computer program code to enable different parts of the computer program code to be executed by different processing elements of a plurality of communicating processing elements. The method comprises identifying at least one first part of the computer program code, which is to be executed by a particular one of said processing elements. The method further comprises identifying at least one further part of the computer code which is related to the at least one first part of the computer code. The at least one first part of the computer program code and the at least one further part of the computer program code are caused to be executed by the particular one of said processing elements.

Type: Grant

Filed: June 11, 2010

Date of Patent: February 3, 2015

Assignee: Codeplay Software Limited

Inventors: Jens-Uwe Dolinsky, Andrew Richards, Colin Riley
Graphical processing unit (GPU) arrays providing high computational capabilities in a computing environment

Patent number: 8935682

Abstract: A device initiates a technical computing environment (TCE), and receives, via the TCE, a program command that permits the TCE to access a graphical processing unit that is remote to the device, where the program command permits the TCE to seamlessly transfer data to the remote GPU. The device transforms, via the TCE, the program command into a program command that is executable by the remote GPU, and provides the transformed program command to the remote GPU for execution. The device also receives, from the remote GPU, one or more results associated with execution of the transformed program command by the remote GPU, and utilizes the one or more results via the TCE.

Type: Grant

Filed: September 6, 2013

Date of Patent: January 13, 2015

Assignee: The MathWorks, Inc.

Inventors: Halldor N. Stefansson, Edric Ellis, Jocelyn Luke Martin
Source code protection

Patent number: 8935681

Abstract: A method comprising encrypting an original plain text file and making it available to a user as a protected file, and issuing to said user a user program and a user license to enable said user to decrypt the protected file and view an image of the original file while preventing the image of the original file from being copied to any file, other than as a further protected file. The image is preferably stored in a memory not backed up to the computer swap file. Preferably, the user program comprises an editor program and the user saves editorial changes to the original image in an encrypted difference file, separate from the original file. Both files are then used to re-create the edited image using the editor program and user license. The user program may comprise any computer tool including compilers.

Type: Grant

Filed: September 29, 2005

Date of Patent: January 13, 2015

Assignees: MStar Semiconductor, Inc., MStar Software R&D (Shenzhen) Ltd., MStar France SAS, MStar Semiconductor, Inc.

Inventor: John David Mersh
System, methods and apparatus for program optimization for multi-threaded processor architectures

Patent number: 8930926

Abstract: Methods, apparatus and computer software product for source code optimization are provided. In an exemplary embodiment, a first custom computing apparatus is used to optimize the execution of source code on a second computing apparatus. In this embodiment, the first custom computing apparatus contains a memory, a storage medium and at least one processor with at least one multi-stage execution unit. The second computing apparatus contains at least two multi-stage execution units that allow for parallel execution of tasks. The first custom computing apparatus optimizes the code for parallelism, locality of operations and contiguity of memory accesses on the second computing apparatus.

Type: Grant

Filed: April 16, 2010

Date of Patent: January 6, 2015

Assignee: Reservoir Labs, Inc.

Inventors: Cedric Bastoul, Richard A. Lethin, Allen K. Leung, Benoit J. Meister, Peter Szilagyi, Nicolas T. Vasilache, David E. Wohlford
Modelling serialized object streams

Patent number: 8930888

Abstract: Modelling a serialized object stream can include receiving a stream of bytes corresponding to the serialized form of a first object, creating an empty initial model for containing a generic object and a generic class, and, upon detection of a class from the stream, constructing a corresponding generic class object in the model using a processor. Upon detection of a new object from the stream, a corresponding generic object in the model can be constructed. Further objects and classes in the model that are associated with the generic objects and classes can be referenced.

Type: Grant

Filed: June 28, 2012

Date of Patent: January 6, 2015

Assignee: International Business Machines Corporation

Inventor: Julien Canches
Methods and system for executing a program in multiple execution environments

Patent number: 8924929

Abstract: A system and methods are disclosed for executing a technical computing program in parallel in multiple execution environments. A program is invoked for execution in a first execution environment and from the invocation the program is executed in the first execution environment and one or more additional execution environments to provide for parallel execution of the program. New constructs in a technical computing programming language are disclosed for parallel programming of a technical computing program for execution in multiple execution environments. Also disclosed are a system and method for changing the mode of operation of an execution environment from a sequential mode to a parallel mode and vice versa.

Type: Grant

Filed: August 28, 2009

Date of Patent: December 30, 2014

Assignee: The MathWorks, Inc.

Inventor: Cleve Moler
Systems and methods for automatically optimizing high performance computing programming languages

Patent number: 8924946

Abstract: Systems and methods are provided for replacing inferior code segments with optimal code segments, including for programming languages that use the Message Passing Interface (MPI). For example, at the compiler level, point-to-point code segments may be identified and replaced with all-to-all code segments. The programming code may be written in X10, Chapel, or other programming languages that support parallel for loops.

Type: Grant

Filed: November 24, 2010

Date of Patent: December 30, 2014

Assignee: International Business Machines Corporation

Inventors: Ganesh Bikshandi, Krishna Nandivada Venkata, Igor Peshansky, Vijay Anand Saraswat
