For A Parallel Or Multiprocessor System Patents (Class 717/149)
  • Publication number: 20140380288
    Abstract: Apparatus, systems, and methods for a compiler are described. One such compiler generates machine code corresponding to a set of elements including a general purpose element and a special purpose element. The compiler identifies a portion in an arrangement of relationally connected operators that corresponds to a special purpose element. The compiler also determines whether the portion meets a condition to be mapped to the special purpose element. The compiler also converts the arrangement into an automaton comprising a plurality of states, wherein the portion is converted using a special purpose state that corresponds to the special purpose element if the portion meets the condition. The compiler also converts the automaton into machine code. Additional apparatus, systems, and methods are disclosed.
    Type: Application
    Filed: September 5, 2014
    Publication date: December 25, 2014
    Inventors: Junjuan Xu, Paul Glendenning
  • Patent number: 8918770
    Abstract: A system and method for compiling includes, for a parallelizable code portion of an application stored on a computer readable storage medium, determining one or more variables that are to be transferred to and/or from a coprocessor if the parallelizable code portion were to be offloaded. A start location and an end location are determined for at least one of the one or more variables as a size in memory. The parallelizable code portion is transformed by inserting an offload construct around the parallelizable code portion and passing the one or more variables and the size as arguments of the offload construct such that the parallelizable code portion is offloaded to a coprocessor at runtime.
    Type: Grant
    Filed: August 24, 2012
    Date of Patent: December 23, 2014
    Assignee: NEC Laboratories America, Inc.
    Inventors: Nishkam Ravi, Tao Bao, Ozcan Ozturk, Srimat Chakradhar
  • Patent number: 8914781
    Abstract: Described is predicting cache locality in a multicore/multithreaded processing environment including when threads share cache data in a non-uniform interleaving manner. Thread execution traces are analyzed to compute a set of per-thread parameters that can then be used to predict cache miss rates for other cache sizes. In one aspect, a model is based upon a probability that the cache reuse distance will increase because of accesses by other threads, and another probability that the reuse distance will decrease because of intercept accesses by other threads to shared data blocks. Estimates of the number of shared data blocks, possibly shared data blocks and private data blocks are used in the computations.
    Type: Grant
    Filed: October 24, 2008
    Date of Patent: December 16, 2014
    Assignee: Microsoft Corporation
    Inventors: Trishul A. Chilimbi, Chen Ding
  • Publication number: 20140359589
    Abstract: System and method for configuring a system of heterogeneous hardware components, including at least one: programmable hardware element (PHE), digital signal processor (DSP) core, and programmable communication element (PCE). A program, e.g., a graphical program (GP), which includes floating point math functionality and which is targeted for distributed deployment on the system is created. Respective portions of the program for deployment to respective ones of the hardware components are automatically determined. Program code implementing communication functionality between the at least one PHE and the at least one DSP core and targeted for deployment to the at least one PCE is automatically generated. At least one hardware configuration program (HCP) is generated from the program and the code, including compiling the respective portions of the program and the program code for deployment to respective hardware components. The HCP is deployable to the system for concurrent execution of the program.
    Type: Application
    Filed: October 25, 2013
    Publication date: December 4, 2014
    Applicant: NATIONAL INSTRUMENTS CORPORATION
    Inventors: Jeffrey L. Kodosky, Hugo A. Andrade, Brian Keith Odom, Cary Paul Butler, Brian C. MacCleery, James C. Nagle, J. Marcus Monroe, Alexandre M. Barp
  • Publication number: 20140359590
    Abstract: System and method for configuring a system of heterogeneous hardware components, including at least one: programmable hardware element (PHE), digital signal processor (DSP) core, and programmable communication element (PCE). A program, e.g., a graphical program (GP), which includes floating point math functionality and which is targeted for distributed deployment on the system is created. Respective portions of the program for deployment to respective ones of the hardware components are automatically determined. Program code implementing communication functionality between the at least one PHE and the at least one DSP core and targeted for deployment to the at least one PCE is automatically generated. At least one hardware configuration program (HCP) is generated from the program and the code, including compiling the respective portions of the program and the program code for deployment to respective hardware components. The HCP is deployable to the system for concurrent execution of the program.
    Type: Application
    Filed: October 25, 2013
    Publication date: December 4, 2014
    Applicant: NATIONAL INSTRUMENTS CORPORATION
    Inventors: Jeffrey L. Kodosky, Hugo A. Andrade, Brian Keith Odom, Cary Paul Butler, Brian C. MacCleery, James C. Nagle, J. Marcus Monroe, Alexandre M. Barp
  • Patent number: 8904369
    Abstract: A method for automated process distribution includes selecting a process definition; identifying a first process portion and at least one second process portion in the process definition; generating a first further process definition for the first process portion; generating a second further process definition for each the second process portion; generating a corresponding service definition for each the second further process definition. In the method, generating the first further process definition includes generating a process definition element configured to invoke at least one service of the service definitions, and generating the second further process definition includes generating a process definition element configured to offer a service of the service definition corresponding to that second further process definition.
    Type: Grant
    Filed: June 6, 2007
    Date of Patent: December 2, 2014
    Assignee: International Business Machines Corporation
    Inventor: Dieter Roller
  • Patent number: 8898648
    Abstract: A profiling tool identifies a code region with a false sharing potential. A static analysis tool classifies variables and arrays in the identified code region. A mapping detection library correlates memory access instructions in the identified code region with variables and arrays in the identified code region while a processor is running the identified code region. The mapping detection library identifies one or more instructions at risk, in the identified code region, which are subject to an analysis by a false sharing detection library. A false sharing detection library performs a run-time analysis of the one or more instructions at risk while the processor is re-running the identified code region. The false sharing detection library determines, based on the performed run-time analysis, whether two different portions of the cache memory line are accessed by the generated binary code.
    Type: Grant
    Filed: November 30, 2012
    Date of Patent: November 25, 2014
    Assignee: International Business Machines Corporation
    Inventors: I-Hsin Chung, Guojing Cong, Hiroki Murata, Yasushi Negishi, Hui-Fang Wen
  • Patent number: 8898652
    Abstract: Various technologies and techniques are disclosed for providing a hardware accelerated software transactional memory application. The software transactional memory application has access to metadata in a cache of a central processing unit that can be used to improve the operation of the STM system. For example, open read barrier filtering is provided that uses an opened-for-read bit that is contained in the metadata to avoid redundant open read processing. Similarly, redundant read log validation can be avoided using the metadata. For example, upon entering commit processing for a particular transaction, a get-evictions instruction in an instruction set architecture of the central processing unit is invoked. A retry operation can be optimized using the metadata. The particular transaction is aborted at a current point and put to sleep. The corresponding cache line metadata in the metadata are marked appropriately to efficiently detect a write by another CPU.
    Type: Grant
    Filed: June 8, 2007
    Date of Patent: November 25, 2014
    Assignee: Microsoft Corporation
    Inventors: Jan Gray, Timothy L. Harris, James Larus, Burton Smith
  • Patent number: 8892738
    Abstract: A technique for generating component usage statistics involves associating components with blocks of a stream-enabled application. When the streaming application is executed, block requests may be logged by Block ID in a log. The frequency of component use may be estimated by analyzing the block request log with the block associations.
    Type: Grant
    Filed: April 4, 2008
    Date of Patent: November 18, 2014
    Assignee: Numecent Holdings, Inc.
    Inventors: Jeffrey de Vries, Arthur Shingen Hitomi
  • Patent number: 8881124
    Abstract: According to the conventional loop parallelization method, when a loop in which a value of a loop-carried dependency variable can be calculated in all of the iterations without sequentially executing the loop from the start, it is determined that DOALL parallelization is not applicable due to the loop-carried dependency variable. Accordingly, the loop is sequentially executed or parallelized by using DOACROSS parallelization that executes a loop including a loop-carried dependency variable. That is, there is a problem that an expression including a loop-carried dependency cannot be parallelized and efficiently processed with use of a multi-processor. By generating initial value calculating codes, the loop-carried dependency in a source code prior to parallelization can be solved, and by dividing a loop included in the source code into subloops that can be executed in parallel, the multi-processor can efficiently process the source code.
    Type: Grant
    Filed: December 13, 2011
    Date of Patent: November 4, 2014
    Assignee: Panasonic Corporation
    Inventor: Daisuke Baba
  • Publication number: 20140325494
    Abstract: A device including a data analysis element including a plurality of memory cells. The memory cells analyze at least a portion of a data stream and output a result of the analysis. The device also includes a detection cell. The detection cell includes an AND gate. The AND gate receives result of the analysis as a first input. The detection cell also includes a D flip-flop including an output coupled to a second input of the AND gate.
    Type: Application
    Filed: July 11, 2014
    Publication date: October 30, 2014
    Inventors: David R. Brown, Harold B Noyes
  • Patent number: 8869126
    Abstract: A method and apparatus is disclosed for compilation of an original Cobol program with support for improved performance by increased parallelism during execution using multiple threads of processing. The approach includes a two stage compilation process, the first compilation/translation step by a first specialized compiler/translator that takes as input a Cobol source program that includes parallelization directives, and produces as output an intermediate computer program in a second computer programming language, the intermediate program including parallelization directives in the second computer programming language. The intermediate program is then compiled utilizing a selected second compiler that provides support for parallelism described in the second programming language. The approach optionally allows for use of pragmas serving as parallelization directives to the compiler in the original Cobol program or in the intermediate program.
    Type: Grant
    Filed: December 28, 2012
    Date of Patent: October 21, 2014
    Assignee: Bull HN Information Systems Inc.
    Inventors: Cynthia S. Guenthner, Russell W. Guenthner, John Edward Heath, Albert Henry John Wigchert, F. Michel Brown, Nicholas John Colasacco, Clinton B. Eckard
  • Patent number: 8869125
    Abstract: The invention relates to a system and method for demarcating information related to one or more blocks in an application source code. This invention provides a means to annotate block information in the source code. It parses the application source code to generate an abstract syntax tree and instruments the source code to capture information related to the one or more blocks generated at the time of dynamic analysis of the application. The information related to the one or more blocks are stored in Hash Map and based on this information the abstract syntax tree is modified to add the information related to the one or more blocks and inserting this information in the application source code.
    Type: Grant
    Filed: December 13, 2012
    Date of Patent: October 21, 2014
    Assignee: Infosys Limited
    Inventors: Murali Krishna Emani, Sudeep Mallick, Balkrishna Prasad
  • Patent number: 8869121
    Abstract: Data processing using multidimensional fields is described along with methods for advantageously using high-level language codes.
    Type: Grant
    Filed: July 7, 2011
    Date of Patent: October 21, 2014
    Assignee: Pact XPP Technologies AG
    Inventors: Martin Vorbach, Frank May, Armin Nückel
  • Patent number: 8869104
    Abstract: A system and method for managing several versions of a device with embedded object code by using an editor to scan the object code, find a signature, change one or more parameters within the object code, and replace the object code. The device may be shipped to a customer in a standard configuration and the object code may be changed by the customer using the editor.
    Type: Grant
    Filed: June 30, 2004
    Date of Patent: October 21, 2014
    Assignee: LSI Corporation
    Inventor: Roy Wade
  • Patent number: 8863105
    Abstract: An automatic control system capable of executing a control program in parallel is described. The system includes more than one unit controller, each executing in parallel at least a part of the program to be executed by the automatic control system; a compiler, connected to one of the unit controllers, for converting the program to be executed by the automatic control system into tasks executed in parallel by the unit controllers; an interconnection network, for connecting the unit controllers, such that information on one of the unit controllers is transferred to another one via the interconnection network.
    Type: Grant
    Filed: November 28, 2008
    Date of Patent: October 14, 2014
    Assignee: Siemens Aktiengesellschaft
    Inventors: Ming Jie, Fei Long, Li Pan, Detlef Pauly
  • Patent number: 8863104
    Abstract: Systems and methods for parallelizing applications that operate on irregular data structures. In an embodiment, the methods and systems enable programmers to use set iterators to express algorithms containing amorphous data parallelism. Parallelization can be achieved by speculatively executing multiple iterations of the iterator in parallel. Conflicts between speculatively executing iterations can be detected and handled using information in class libraries.
    Type: Grant
    Filed: June 10, 2009
    Date of Patent: October 14, 2014
    Assignee: Board of Regents, The University of Texas System
    Inventors: Keshav Kumar Pingali, Milind Vidyadhar Kulkarni
  • Patent number: 8850409
    Abstract: A method is provided for translating sets of constraint declarations to imperative code sequences based on defining an instantiatable object per set, inserting calls to a notification callback mechanism on state modification and defining calls in the constraint context as imperative code sequences that, in response to these callbacks, take actions to maintain these constraints.
    Type: Grant
    Filed: May 21, 2008
    Date of Patent: September 30, 2014
    Assignee: OptumSoft, Inc.
    Inventor: David R. Cheriton
  • Patent number: 8850410
    Abstract: A system and method for improving software maintainability, performance, and/or security by associating a unique marker to each software code-block; the system comprising of a plurality of processors, a plurality of code-blocks, and a marker associated with each code-block. The system may also include a special hardware register (code-block marker hardware register) in each processor for identifying the markers of the code-blocks executed by the processor, without changing any of the plurality of code-blocks.
    Type: Grant
    Filed: January 29, 2010
    Date of Patent: September 30, 2014
    Assignee: International Business Machines Corporation
    Inventors: Ramanjaneya S. Burugula, Joefon Jann, Pratap C. Pattnaik
  • Patent number: 8843910
    Abstract: A facility for identifying functionally distinct memory access reorderings for a multithreaded program is described. The facility monitors execution of the program to detect, for each of one or more memory locations, an order in which the memory location was accessed by the threads of the program, each access being at least one of a read access and a write access. Among a number of possible memory access reorderings of a read access by a reading thread to a location and a write access by a writing thread to the same location where the write access preceded the read access, the facility identifies as functionally distinct memory access reorderings those possible memory access reorderings where the reading thread could have become newly aware of changed state of the writing thread as a result of the indicated read access.
    Type: Grant
    Filed: March 14, 2011
    Date of Patent: September 23, 2014
    Assignee: F5 Networks, Inc.
    Inventors: Andrew M. Schwerin, Peter J. Godman, Kaya Bekiroglu
  • Patent number: 8843909
    Abstract: A method for transforming a procedural program having procedural language code into an object-oriented distributed software program is provided. A procedural program is transformed into intermediate client-server code. The intermediate client-server code is partitioned into an N-tier application program.
    Type: Grant
    Filed: July 8, 2004
    Date of Patent: September 23, 2014
    Assignee: CA, Inc.
    Inventors: David L. Tondreau, Jr., John P. Mahony
  • Patent number: 8838626
    Abstract: Embodiments of techniques and systems for parallel XML parsing are described. An event-level XML parser may include a lightweight events partitioning stage, parallel events parsing stages, and a post-processing stage. The events partition may pick out event boundaries using single-instruction, multiple-data instructions to find occurrences of the “<” character, marking event boundaries. Subsequent checking may be performed to help identify other event boundaries, as well as non-boundary instances of the “<” character. During events parsing, unresolved items, such as namespace resolution or matching of start and end elements, may be recorded in structure metadata. This structure metadata may be used during the subsequent post-processing to perform a check of the XML data. If the XML data is well-formed, individual sub-event streams formed by the events parsing processes may be assembled into a flat result event stream structure. Other embodiments may be described and claimed.
    Type: Grant
    Filed: December 17, 2009
    Date of Patent: September 16, 2014
    Assignee: Intel Corporation
    Inventors: Zhiqiang Yu, Yuejian Fang, Lei Zhai, Yun Wang, Zhonghai Wu, Mo Dai
  • Patent number: 8839213
    Abstract: A compiler is provided that determines when the use of software transactional memory (STM) primitives may be optimized with respect to a set of collectively dominating STM primitives. The compiler analysis coordinates the use of variables containing possible shadow copy pointers to allow the analysis to be performed for both direct write and buffered write STM systems. The coordination of the variables containing the possible shadow copy pointers ensures that the results of STM primitives are properly reused. The compiler analysis identifies memory accesses where STM primitives may be eliminated, combined, or substituted for lower overhead STM primitives.
    Type: Grant
    Filed: June 27, 2008
    Date of Patent: September 16, 2014
    Assignee: Microsoft Corporation
    Inventors: David L. Detlefs, Michael M. Magruder, Yosseff Levanoni, Vinod K. Grover
  • Patent number: 8839212
    Abstract: A code generator and multi-core framework are executable in a computer system to implement methods as disclosed herein, including a method for the code generator to automatically generate multi-threaded source code from functional specifications, and for the multi-core framework, which is a run time component, to generate multi-threaded task object code from the multi-threaded source code and to execute the multi-threaded task object code on respective processor cores. The methods provide transparency to the programmer, and during execution, provide automatic identification of processing parallelisms. The methods implement Consume-Simplify-Produce and Normalize-Transpose-Distribute operations to reduce complex expression sets in a functional specification to simplified expression sets operable in parallel processing environments through the generated multi-threaded task object code.
    Type: Grant
    Filed: September 30, 2013
    Date of Patent: September 16, 2014
    Assignee: Texas Tech University System
    Inventors: Daniel E. Cooke, J. Nelson Rushton, Brad Nemanich
  • Patent number: 8839214
    Abstract: A high level programming language provides an extensible set of transformations for use on indexable types in a data parallel processing environment. A compiler for the language implements each transformation as a map from indexable types to allow each transformation to be applied to other transformations. At compile time, the compiler identifies sequences of the transformations on each indexable type in data parallel source code and generates data parallel executable code to implement the sequences as a combined operation at runtime using the transformation maps. The compiler also incorporates optimizations that are based on the sequences of transformations into the data parallel executable code.
    Type: Grant
    Filed: June 30, 2010
    Date of Patent: September 16, 2014
    Assignee: Microsoft Corporation
    Inventors: Paul F. Ringseth, Weirong Zhu, Rick Molloy, Charles D. Callahan, II, Yosseff Levanoni, Lingli Zhang
  • Publication number: 20140258995
    Abstract: A compiler and language using the comma as a parallelism operator may ensure that variables on the left hand side of a line of code are only used once, and that the variables on the left hand side of the line of code are not being used as function arguments. Commas may be replaced with semi-colons.
    Type: Application
    Filed: March 5, 2014
    Publication date: September 11, 2014
    Inventor: Steven Mark CASSELMAN
  • Patent number: 8826234
    Abstract: A relational model may be used to encode primitives for each of a plurality of threads in a multi-core processor. The primitives may include tasks and parameters, such as buffers. The relationships may be linked to particular tasks. The tasks with the coding, which indicates the relationships, may then be used upon user selection to display a visualization of the functional relationships between tasks.
    Type: Grant
    Filed: December 23, 2009
    Date of Patent: September 2, 2014
    Assignee: Intel Corporation
    Inventors: Christopher J. Cormack, Nathaniel Duca, Jason Plumb
  • Patent number: 8826258
    Abstract: A method of generating a computer program, the method comprising: independently compiling a plurality of source code modules to generate a plurality of respective object modules comprising a plurality of respective threads explicitly designated by a user to be executed in parallel; in each of the object modules, inserting at least one symbol indicative of a property of the object module's thread potentially conflicting with a corresponding property of a thread of another of said object module as a result of parallel execution of those threads; executing a linker to perform a linking process on said object modules, the linking process comprising: assessing the symbols in conjunction with one another to determine whether a conflict exists between the threads of two or more of the respective object modules; and linking the object modules to generate a computer program in which said threads are executable in parallel, wherein the linking is performed in dependence on said assessment.
    Type: Grant
    Filed: May 11, 2009
    Date of Patent: September 2, 2014
    Assignee: Xmos Limited
    Inventors: Martin Young, Richard Osborne, Douglas Watt
  • Patent number: 8826053
    Abstract: An engine processor program, stored in a non-volatile storage region 37 of a storage section 35 connected to a host processor 31 of a host section 30, for execution in an engine processor 41 of an engine section 40, is transmitted from the host section 30 to the engine section 40. The engine processor program received by the engine section 40 is stored in a volatile storage section 42 connected to an engine processor 51. Then, the host section 30 notifies an execution instruction for a specified program, among the engine processor programs stored in the storage section 42, to the engine section 40 and causes execution on the engine processor 41. As a result, even in a structure provided, the engine section 40 does not need a large capacity non-volatile storage region, thereby configuring a compact mobile communication terminal.
    Type: Grant
    Filed: December 4, 2006
    Date of Patent: September 2, 2014
    Assignee: Vodafone Group PLC
    Inventors: Masahiko Kuwabara, Kazuo Aoki, Toshiro Matsumura
  • Patent number: 8813044
    Abstract: A method, system, and article of manufacture are disclosed for transforming a definition of a process for delivering a service. This service process definition is comprised of computer readable code. The method comprises the steps of expressing a given set of assumptions in a computer readable code; and transforming said process definition by using a processing unit to apply said assumptions to said process definition to change the configuration of the process definition. The process definition may be transformed by using factors relating to the specific context in or for which the process definition is executed. Also, the process definition may be transformed by identifying, in a flow diagram for the service process definition, flows to which the assumptions apply, and applying program rewriting techniques to those identified flows.
    Type: Grant
    Filed: September 6, 2012
    Date of Patent: August 19, 2014
    Assignee: International Business Machines Corporation
    Inventors: David F. Bantz, Steven J. Mastrianni, James R. Moulic, Dennis G. Shea
  • Patent number: 8813052
    Abstract: Various technologies and techniques are disclosed for providing a bounded transactional memory application that accesses cache metadata in a cache of a central processing unit. When performing a transactional read from the bounded transactional memory application, a cache line metadata transaction-read bit is set. When performing a transactional write from the bounded transactional memory application, a cache line metadata transaction-write bit is set and a conditional store is performed. At commit time, if any lines marked with the transaction-read bit or the transaction-write bit were evicted or invalidated, all speculatively written lines are discarded. The application can also interrogate a cache line metadata eviction summary to determine whether a transaction is doomed and then take an appropriate action.
    Type: Grant
    Filed: June 8, 2007
    Date of Patent: August 19, 2014
    Assignee: Microsoft Corporation
    Inventors: Jan Gray, Timothy L. Harris, James Larus, Burton Smith
  • Patent number: 8813053
    Abstract: Systems and methods for parallel incomplete LU (ILU) factorization in distributed sparse linear systems, which order nodes underlying the equations in the system(s) by dividing nodes into interior nodes and boundary nodes and assigning no more than three codes to distinguish the boundary nodes. Each code determines an ordering of the nodes, which in turn determines the order in which the equations will be factored and the solution performed.
    Type: Grant
    Filed: September 25, 2012
    Date of Patent: August 19, 2014
    Assignee: Landmark Graphics Corporation
    Inventors: Qinghua Wang, James William Watts, III
  • Publication number: 20140229926
    Abstract: Apparatus, systems, and methods for a compiler are disclosed. One such compiler parses a human readable expression into a syntax tree and converts the syntax tree into an automaton having in-transitions and out-transitions. Converting can include unrolling the quantification as a function of in-degree limitations wherein in-degree limitations includes a limit on the number of transitions into a state of the automaton. The compiler can also convert the automaton into an image for programming a parallel machine, and publishes the image. Additional apparatus, systems, and methods are disclosed.
    Type: Application
    Filed: April 14, 2014
    Publication date: August 14, 2014
    Applicant: Micron Technology, Inc.
    Inventors: Junjuan Xu, Paul Glendenning
  • Patent number: 8806466
    Abstract: A program generation apparatus references a source program including a loop for executing a block N times (N?2) and having such dependence that a variable defined in a statement in the block pertaining to ith execution (1?i<N) is referenced by a statement in the block pertaining to jth execution (i<j?N), calculates equivalent representations of variables in the block pertaining to the ith execution and the block pertaining to any other execution than the ith execution, specifies, with respect to each representation of a target variable causing the dependence, a representation of a variable not causing the dependence that is equivalent to the representation of the target variable, and generates a program being for executing the block M times (M?N) and including a statement including the specified representation in place of each representation of the target variable.
    Type: Grant
    Filed: July 4, 2011
    Date of Patent: August 12, 2014
    Assignee: Panasonic Corporation
    Inventors: Akira Tanaka, Hiroyuki Morishita, Akihiko Inoue
  • Patent number: 8806458
    Abstract: Intermediate representation (IR) code is received as compiled from a shader in the form of shader language source code. The input IR code is first analyzed during an analysis pass, during which operations, scopes, parts of scopes, and if-statement scopes are annotated for predication, mask usage, and branch protection and predication. This analysis outputs vectorization information that is then used by various sets of vectorization transformation rules to vectorize the input IR code, thus producing vectorized output IR code.
    Type: Grant
    Filed: February 16, 2012
    Date of Patent: August 12, 2014
    Assignee: Microsoft Corporation
    Inventors: Andy Glaister, Blaise Pascal Tine, Blake Pelton, Derek Sessions, Mikhail Lyapunov, Yuri Dotsenko
  • Publication number: 20140223419
    Abstract: According to one embodiment, a compiler applicable to a parallel computer including processors, wherein a source program is input to the compiler and a local code for each of the processors are generated, the compiler includes a generation module and an object code generation module. The generation module is configured to analyze the input source program, extract a data transfer point from a procedure described in the source program, and generate a call processing for data copy. The object code generation module is configured to generate an object code including the call processing.
    Type: Application
    Filed: August 30, 2013
    Publication date: August 7, 2014
    Applicant: Kabushiki Kaisha Toshiba
    Inventor: Ryuji SAKAI
  • Patent number: 8799880
    Abstract: A method of identifying and extracting functional parallelism from a PLC program has been developed that results in the ability of the extracted program fragments to be executed in parallel across a plurality of separate resources, and a compiler configured to perform the functional parallelism (i.e., identification and extraction processes) and perform the scheduling of the separate fragments within a given set of resources. The inventive functional parallelism creates a larger number of separable elements than was possible with prior dataflow analysis methodologies.
    Type: Grant
    Filed: March 15, 2012
    Date of Patent: August 5, 2014
    Assignee: Siemens Aktiengesellschaft
    Inventors: Arquimedes Martinez Canedo, Mohammad Abdullah Al Faruque, Mitchell Packer, Richard Freitag
  • Patent number: 8799881
    Abstract: According to one embodiment, a parallelizing unit divides a loop into first and second processes based on a program to be converted and division information. The first and second processes respectively have termination control information, loop control information, and change information. The parallelizing unit inserts into the first process a determination process determining whether the second process is terminated at execution of an (n?1)th iteration of the second process when the second process is subsequent to the first process or determining whether the second process is terminated at execution of an nth iteration of the second process when the second process precedes the first process. The parallelizing unit inserts into the second process a control process controlling execution of the second process based on the result of determination notified by the determination process.
    Type: Grant
    Filed: July 12, 2011
    Date of Patent: August 5, 2014
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Nobuaki Tojo, Hidenori Matsuzaki
  • Patent number: 8789031
    Abstract: In one embodiment, the present invention includes a software-controlled method of forming instruction strands. The software may include instructions to obtain code of a superblock including a plurality of basic blocks, build a dependency directed acyclic graph (DAG) for the code, sort nodes coupled by edges of the dependency DAG into a topological order, form strands from the nodes based on hardware constraints, rule constraints, and scheduling constraints, and generate executable code for the strands and store the executable code in a storage. Other embodiments are described and claimed.
    Type: Grant
    Filed: September 18, 2007
    Date of Patent: July 22, 2014
    Assignee: Intel Corporation
    Inventors: Wei Liu, Lixin Su, Youfeng Wu, Herbert Hum
  • Patent number: 8789063
    Abstract: Systems and methods establish communication and control between various heterogeneous processors in a computing system so that an operating system can run an application across multiple heterogeneous processors. With a single set of development tools, software developers can create applications that will flexibly run on one CPU or on combinations of central, auxiliary, and peripheral processors. In a computing system, application-only processors can be assigned a lean subordinate kernel to manage local resources. An application binary interface (ABI) shim is loaded with application binary images to direct kernel ABI calls to a local subordinate kernel or to the main OS kernel depending on which kernel manifestation is controlling requested resources.
    Type: Grant
    Filed: March 30, 2007
    Date of Patent: July 22, 2014
    Assignee: Microsoft Corporation
    Inventors: Orion Hodson, Haryadi Gunawi, Galen C. Hunt
  • Patent number: 8782624
    Abstract: A device including a data analysis element including a plurality of memory cells. The memory cells analyze at least a portion of a data stream and output a result of the analysis. The device also includes a detection cell. The detection cell includes an AND gate. The AND gate receives result of the analysis as a first input. The detection cell also includes a D-flip flop including an output coupled to a second input of the AND gate.
    Type: Grant
    Filed: December 15, 2011
    Date of Patent: July 15, 2014
    Assignee: Micron Technology, Inc.
    Inventors: David R. Brown, Harold B Noyes
  • Patent number: 8782625
    Abstract: Concepts and technologies are described herein for determining memory safety of floating-point computations. The concepts and technologies described herein analyze code to determine if any floating-point computations exist in the code, and if so, if the floating-point computations are memory safe. The analysis can include identifying floating-point instructions and conditional statements in the code. The code can be symbolically executed, and behavior of the floating-point instructions and the conditional statements can be monitored to determine if a floating point calculation is ever involved in computation of any memory address during the execution of the code.
    Type: Grant
    Filed: June 17, 2010
    Date of Patent: July 15, 2014
    Assignee: Microsoft Corporation
    Inventors: Patrice Godefroid, Johannes Kinder
  • Publication number: 20140196017
    Abstract: A method of enabling compiler assisted parallelization of one or more stream processing operators in a stream processing application, which consists of a data flow graph with operators as vertices connected by streams. The method includes specifying a parallelized version of one or more of the operators, with a parameterized degree of parallelism, in the stream application, evaluating whether or not to use the parallelized operator, deciding the degree of parallelism of the parallelized operator, if there is a need for a parallelized operator.
    Type: Application
    Filed: June 18, 2012
    Publication date: July 10, 2014
    Applicant: International Business Machines Corporation
    Inventors: Nagui Halim, Vibhore Kumar, Kung-Lung Wu, Sai Wu
  • Patent number: 8775510
    Abstract: The invention provides, in one aspect, an improved system for data access comprising a file server that is coupled to a client device or application executing thereon via one or more networks. The server comprises static storage that is organized in one or more directories, each containing, zero, one or more files. The server also comprises a file system operable, in cooperation with a file system on the client device, to provide authorized applications executing on the client device access to those directories and/or files. Fast file server (FFS) software or other functionality executing on or in connection with the server responds to requests received from the client by transferring requested data to the client device over multiple network pathways. That data can comprise, for example, directory trees, files (or portions thereof), and so forth.
    Type: Grant
    Filed: January 31, 2013
    Date of Patent: July 8, 2014
    Assignee: PME IP Australia Pty Ltd
    Inventors: Malte Westerhoff, Detlev Stalling
  • Patent number: 8776030
    Abstract: One embodiment of the present invention sets forth a technique for translating application programs written using a parallel programming model for execution on multi-core graphics processing unit (GPU) for execution by general purpose central processing unit (CPU). Portions of the application program that rely on specific features of the multi-core GPU are converted by a translator for execution by a general purpose CPU. The application program is partitioned into regions of synchronization independent instructions. The instructions are classified as convergent or divergent and divergent memory references that are shared between regions are replicated. Thread loops are inserted to ensure correct sharing of memory between various threads during execution by the general purpose CPU.
    Type: Grant
    Filed: March 31, 2009
    Date of Patent: July 8, 2014
    Assignee: NVIDIA Corporation
    Inventors: Vinod Grover, Bastiaan Joannes Matheus Aarts, Michael Murphy
  • Patent number: 8769485
    Abstract: A stream processing platform that provides fast execution of stream processing applications within a safe runtime environment. The platform includes a stream compiler that converts a representation of a stream processing application into executable program modules for a safe environment. The platform allows users to specify aspects of the program that contribute to generation of modules that execute as intended. A user may specify aspects to control a type of implementation for loops, order of execution for parallel paths, whether multiple instances of an operation can be performed in parallel or whether certain operations should be executed in separate threads. In addition, the stream compiler may generate executable modules in a way that cause a safe runtime environment to allocate memory or otherwise operate efficiently.
    Type: Grant
    Filed: December 22, 2006
    Date of Patent: July 1, 2014
    Assignee: TIBCO Software, Inc.
    Inventors: Jonathan Salz, Richard S. Tibbetts
  • Patent number: 8769510
    Abstract: A device receives program code, and receives size/type information associated with inputs to the program code. The device determines, prior to execution of the program code and based on the input size/type information, a portion of the program code that is executable by a graphical processing unit (GPU), and determines, prior to execution of the program code and based on the input size/type information, a portion of the program code that is executable by a central processing unit (CPU). The device compiles the GPU-executable portion of the program code to create a compiled GPU-executable portion of the program code, and compiles the CPU-executable portion of the program code to create a compiled CPU-executable portion of the program code. The device provides, to the GPU for execution, the compiled GPU-executable portion of the program code, and provides, to the CPU for execution, the compiled CPU-executable portion of the program code.
    Type: Grant
    Filed: September 30, 2010
    Date of Patent: July 1, 2014
    Assignee: The MathWorks, Inc.
    Inventors: Jocelyn Luke Martin, Joseph F. Hicklin
  • Patent number: 8768678
    Abstract: One or more embodiments provide a load balancing solution for improving the runtime performance of parallel HDL simulators. During compilation each process is analyzed to determine a simulation cost based on complexity of the HDL processes. During simulation, processes to be executed in the same simulation cycle are scheduled using the simulation costs computed at compile-time in order to reduce the delay incurred during simulation.
    Type: Grant
    Filed: September 26, 2011
    Date of Patent: July 1, 2014
    Assignee: Xilinx, Inc.
    Inventors: Valeria Mihalache, Christopher H. Kingsley, Jimmy Z. Wang, Kumar Deepak
  • Patent number: 8769507
    Abstract: A method, system, and article of manufacture are disclosed for transforming a definition of a process for delivering a service on a specified computing device. This service process definition is comprised of computer readable code. The method comprises the steps of expressing a given set of assumptions in a computer readable code; and transforming the definition by using a processing unit to apply the assumptions to the definition of the process to change the way in which the process operates. The definition of the process may be transformed by using factors relating to the specific context in or for which the definition is executed. Also, the definition may be transformed by identifying, in a flow diagram for the process, flows to which the assumptions apply, and applying program rewriting techniques to those identified flows.
    Type: Grant
    Filed: May 14, 2009
    Date of Patent: July 1, 2014
    Assignee: International Business Machines Corporation
    Inventors: David F. Bantz, Steven J. Mastrianni, James R. Moulic, Dennis G. Shea
  • Patent number: 8756590
    Abstract: A compile environment is provided in a computer system that allows programmers to program both CPUs and data parallel devices (e.g., GPUs) using a high level general purpose programming language that has data parallel (DP) extensions. A compilation process translates modular DP code written in the general purpose language into DP device source code in a high level DP device programming language using a set of binding descriptors for the DP device source code. A binder generates a single, self-contained DP device source code unit from the set of binding descriptors. A DP device compiler generates a DP device executable for execution on one or more data parallel devices from the DP device source code unit.
    Type: Grant
    Filed: June 22, 2010
    Date of Patent: June 17, 2014
    Assignee: Microsoft Corporation
    Inventors: Weirong Zhu, Lingli Zhang, Sukhdeep S. Sodhi, Yosseff Levanoni