Processing Control Patents (Class 712/220)
  • Publication number: 20130185543
    Abstract: The invention provides an embedded processor architecture comprising a plurality of virtual processing units that each execute processes or threads (collectively, “threads”). One or more execution units, which are shared by the processing units, execute instructions from the threads. An event delivery mechanism delivers events—such as, by way of non-limiting example, hardware interrupts, software-initiated signaling events (“software events”) and memory events—to respective threads without execution of instructions. Each event can, per aspects of the invention, be processed by the respective thread without execution of instructions outside that thread. The threads need not be constrained to execute on the same respective processing units during the lives of those threads—though, in some embodiments, they can be so constrained. The execution units execute instructions from the threads without needing to know what threads those instructions are from.
    Type: Application
    Filed: September 13, 2012
    Publication date: July 18, 2013
    Applicants: SHARP KABUSHIKI KAISHA (A/K/A SHARP CORPORATION)
    Inventors: Steven Frank, Shigeki Imai
  • Patent number: 8490085
    Abstract: A method for running different computer programs P on a processor in non-privileged mode while, in a nominal mode, using privileged instructions. The method includes running a hypervisor program in the privileged mode of the processor, the hypervisor program providing the computer programs P with services substantially equivalent to those available when running in privileged mode. Source code of the computer programs P is modified beforehand to replace the privileged instructions with calls to services supplied by the hypervisor program. The hypervisor program creates at least two privileged submodes organized into a hierarchy within the non-privileged mode, and the processor includes only two operating modes.
    Type: Grant
    Filed: September 2, 2005
    Date of Patent: July 16, 2013
    Assignee: VMware, Inc.
    Inventor: Fabrice Devaux
  • Patent number: 8489864
    Abstract: Performing non-transactional escape actions within a hardware based transactional memory system. A method includes at a hardware thread on a processor beginning a hardware based transaction for the thread. Without committing or aborting the transaction, the method further includes suspending the hardware based transaction and performing one or more operations for the thread, non-transactionally and not affected by: transaction monitoring and buffering for the transaction, an abort for the transaction, or a commit for the transaction. After performing one or more operations for the thread, non-transactionally, the method further includes resuming the transaction and performing additional operations transactionally. After performing the additional operations, the method further includes either committing or aborting the transaction.
    Type: Grant
    Filed: June 26, 2009
    Date of Patent: July 16, 2013
    Assignee: Microsoft Corporation
    Inventors: Gad Sheaffer, Jan Gray, Martin Taillefer, Ali-Reza Adl-Tabatabai, Bratin Saha, Vadim Bassin, Robert Y. Geva, David Callahan
  • Patent number: 8489863
    Abstract: An information handling system includes a processor with an instruction issue queue (IQ) that may perform age tracking operations. The issue queue IQ maintains or stores instructions that may issue out-of-order in an internal data store (IDS). The IDS organizes instructions in a queue position (QPOS) addressing arrangement. An age matrix of the IQ maintains a record of relative instruction aging for those instructions within the IDS. The age matrix updates latches or other memory cell data to reflect the changes in IDS instruction ages during a dispatch operation into the IQ. During dispatch of one or more instructions, the age matrix may update only those latches that require data change to reflect changing IDS instruction ages. The age matrix employs row and column data and clock controls to individually update those latches requiring update.
    Type: Grant
    Filed: April 19, 2012
    Date of Patent: July 16, 2013
    Assignee: International Business Machines Corporation
    Inventors: James Wilson Bishop, Mary Douglass Brown, Jeffrey Carl Brownscheidle, Robert Allen Cordes, Maureen Anne Delaney, Jafar Nahidi, Dung Quoc Nguyen, Joel Abraham Silberman
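The age-matrix bookkeeping described in the entry above can be illustrated in software. The C++ sketch below is a minimal analogy rather than the patented latch-level circuit: it keeps an N×N "older-than" matrix, dispatching into a queue position touches only that position's row and column, and the oldest ready entry is the one that no other valid, ready entry is older than. All names and sizes are illustrative assumptions.

```cpp
#include <bitset>
#include <cstddef>
#include <iostream>

// Minimal sketch of an issue-queue age matrix (not the patented circuit):
// age_[i][j] == true means the instruction in queue position i is older
// than the one in position j. Dispatching into position p only touches
// row p and column p, mirroring the "update only the latches that change"
// idea from the abstract.
template <std::size_t N>
class AgeMatrix {
public:
    void dispatch(std::size_t p) {
        for (std::size_t j = 0; j < N; ++j) {
            age_[p][j] = false;                       // new entry is younger than everything
            if (valid_[j]) age_[j][p] = true;         // every valid entry is older than it
        }
        valid_[p] = true;
    }
    void retire(std::size_t p) {
        valid_[p] = false;
        for (std::size_t j = 0; j < N; ++j) { age_[p][j] = false; age_[j][p] = false; }
    }
    // Oldest ready entry: valid, ready, and no other valid+ready entry is older.
    int pick_oldest(const std::bitset<N>& ready) const {
        for (std::size_t i = 0; i < N; ++i) {
            if (!valid_[i] || !ready[i]) continue;
            bool oldest = true;
            for (std::size_t j = 0; j < N; ++j)
                if (valid_[j] && ready[j] && age_[j][i]) { oldest = false; break; }
            if (oldest) return static_cast<int>(i);
        }
        return -1;
    }
private:
    bool age_[N][N] = {};
    std::bitset<N> valid_;
};

int main() {
    AgeMatrix<4> m;
    m.dispatch(2); m.dispatch(0); m.dispatch(3);   // program order: positions 2, 0, 3
    std::bitset<4> ready("1101");                  // positions 0, 2, 3 are ready
    std::cout << "issue position " << m.pick_oldest(ready) << "\n";  // prints 2
}
```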
  • Patent number: 8489862
    Abstract: An object of the invention is to reduce the electric power consumption resulting from temporarily activating a processor requiring a large electric power consumption, out of a plurality of processors. A multiprocessor system (1) includes: a first processor (141) which executes a first instruction code; a second processor (151) which executes a second instruction code, a hypervisor (130) which converts the second instruction code into an instruction code executable by the first processor (141); and a power control circuit (170) which controls the operation of at least one of the first processor (141) and the second processor (151). When the operation of the second processor (151) is suppressed by the power control circuit (170), the hypervisor (130) converts the second instruction code into the instruction code executable by the first processor (141), and the first processor (141) executes the converted instruction code.
    Type: Grant
    Filed: June 5, 2008
    Date of Patent: July 16, 2013
    Assignee: Panasonic Corporation
    Inventors: Masahiko Saito, Masashige Mizuyama
  • Patent number: 8489865
    Abstract: A command chain system includes a plurality of processing elements, a memory, and a chain engine. The chain engine is in communication with the memory and accesses instructions in the memory. The chain engine accesses a subroutine stored in the memory. The chain engine sends a command to specialized hardware. The chain engine performs an action determined by one or more of the operation-code portion, the skip portion, and the loop-count portion of the instruction.
    Type: Grant
    Filed: April 15, 2010
    Date of Patent: July 16, 2013
    Assignee: Lockheed Martin Corporation
    Inventors: Joshua W. Rensch, Marlon O. Gunderson, James V. Hedin
  • Publication number: 20130173888
    Abstract: A programmable processor and method for improving the performance of processors by expanding at least two source operands, or a source and a result operand, to a width greater than the width of either the general purpose register or the data path width. The present invention provides operands which are substantially larger than the data path width of the processor by using the contents of a general purpose register to specify a memory address at which a plurality of data path widths of data can be read or written, as well as the size and shape of the operand. In addition, several instructions and apparatus for implementing these instructions are described which obtain performance advantages if the operands are not limited to the width and accessible number of general purpose registers.
    Type: Application
    Filed: August 22, 2012
    Publication date: July 4, 2013
    Applicant: MicroUnity Systems Engineering, Inc.
    Inventors: Craig Hansen, John Moussouris, Alexia Massalin
  • Publication number: 20130173889
    Abstract: A parallel processing system for computing particle interactions includes a plurality of computation nodes arranged according to a geometric partitioning of a simulation volume. Each computation node has storage for particle data. This particle data is associated with particles in a region of the geometrically partitioned simulation volume. The parallel processing system also includes a communication system having links interconnecting the computation nodes. Each of the computation nodes includes a processor subsystem. These processor subsystems cooperate to coordinate computation of the particle interactions in a distributed manner.
    Type: Application
    Filed: February 1, 2013
    Publication date: July 4, 2013
    Inventor: D.E. Shaw Research LLC
  • Publication number: 20130166887
    Abstract: According to one embodiment, a data processing apparatus includes a processor and a memory. The processor includes core blocks. The memory stores a command queue and task management structure data. The command queue stores a series of kernel functions. The task management structure data defines an order of execution of kernel functions by associating a return value of a previous kernel function with an argument of a subsequent kernel function. Core blocks of the processor are capable of executing different kernel functions.
    Type: Application
    Filed: August 16, 2012
    Publication date: June 27, 2013
    Inventor: Ryuji Sakai
  • Publication number: 20130166889
    Abstract: A method and apparatus are described for generating flags in response to processing data during an execution pipeline cycle of a processor. The processor may include a multiplexer configured to generate valid bits for received data according to a designated data size, and a logic unit configured to control the generation of flags based on a shift or rotate operation command, the designated data size and information indicating how many bytes and bits to rotate or shift the data by. A carry flag may be used to extend the amount of bits supported by shift and rotate operations. A sign flag may be used to indicate whether a result is a positive or negative number. An overflow flag may be used to indicate that a data overflow exists, whereby there are not a sufficient number of bits to store the data.
    Type: Application
    Filed: December 22, 2011
    Publication date: June 27, 2013
    Applicant: ADVANCED MICRO DEVICES, INC.
    Inventors: Srikanth Arekapudi, Saurabh Gupta
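As a rough illustration of flag generation for a designated data size, the C++ sketch below derives carry, sign, overflow, and zero flags for a logical left shift. The flag definitions follow common x86-style conventions and are an assumption made for illustration, not taken from the application text.

```cpp
#include <cstdint>
#include <cstdio>

// Hedged sketch: flag generation for a logical left shift on an 8/16/32-bit
// "designated data size". The carry/sign/overflow definitions below follow
// common x86-style conventions and are illustrative assumptions, not the
// circuit described in the application.
struct Flags { bool carry, sign, overflow, zero; };

Flags shift_left_flags(uint32_t value, unsigned count, unsigned size_bits) {
    const uint64_t mask = (1ULL << size_bits) - 1;
    uint64_t v = value & mask;
    Flags f{};
    if (count == 0 || count > size_bits) return f;          // simplified handling
    uint64_t shifted = v << count;
    f.carry    = (shifted >> size_bits) & 1;                 // last bit shifted out
    uint64_t r = shifted & mask;
    f.sign     = (r >> (size_bits - 1)) & 1;                 // MSB of the result
    f.zero     = (r == 0);
    // Overflow (defined for count == 1): MSB changed, i.e. carry != new sign.
    f.overflow = (count == 1) && (f.carry != f.sign);
    return f;
}

int main() {
    Flags f = shift_left_flags(0x80u, 1, 8);                 // 8-bit shift of 0x80
    std::printf("C=%d S=%d O=%d Z=%d\n", f.carry, f.sign, f.overflow, f.zero);
}
```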
  • Publication number: 20130166888
    Abstract: Techniques are described for predictively starting a processing element. Embodiments receive streaming data to be processed by a plurality of processing elements. An operator graph of the plurality of processing elements that defines at least one execution path is established. Embodiments determine a historical startup time for a first processing element in the operator graph, where, once started, the first processing element begins normal operations once the first processing element has received a requisite amount of data from one or more upstream processing elements. Additionally, embodiments determine an amount of time the first processing element takes to receive the requisite amount of data from the one or more upstream processing elements. The first processing element is then predictively started at a first startup time based on the determined historical startup time and the determined amount of time historically taken to receive the requisite amount of data.
    Type: Application
    Filed: December 10, 2012
    Publication date: June 27, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: International Business Machines Corporation
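The scheduling arithmetic implied by the abstract above reduces to starting the processing element a lead time before its output is needed, where the lead time is the historical startup time plus the time to receive the requisite data. A hedged sketch in C++, with all names and numbers hypothetical:

```cpp
#include <chrono>
#include <iostream>

// Hedged sketch of the scheduling arithmetic only (names are hypothetical):
// start the element early enough that startup and data delivery both finish
// by the moment its output is needed.
using ms = std::chrono::milliseconds;

ms predicted_start_offset(ms historical_startup, ms time_to_receive_data,
                          ms deadline_from_now) {
    ms lead = historical_startup + time_to_receive_data;
    return (deadline_from_now > lead) ? deadline_from_now - lead : ms{0};
}

int main() {
    // Startup historically takes 400 ms, upstream data takes 250 ms to arrive,
    // and the element's output is needed 1 s from now -> start it in 350 ms.
    std::cout << predicted_start_offset(ms{400}, ms{250}, ms{1000}).count()
              << " ms\n";
}
```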
  • Patent number: 8473963
    Abstract: In one embodiment, the present invention includes a method of assigning a location within a shared variable for each of multiple threads and writing a value to a corresponding location to indicate that the corresponding thread has reached a barrier. In such manner, when all the threads have reached the barrier, synchronization is established. In some embodiments, the shared variable may be stored in a cache accessible by the multiple threads. Other embodiments are described and claimed.
    Type: Grant
    Filed: March 23, 2011
    Date of Patent: June 25, 2013
    Assignee: Intel Corporation
    Inventors: Sailesh Kottapalli, John H. Crawford
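The barrier idea described in the entry above, each thread writing its own location within a shared variable, can be sketched in C++ with one byte per thread packed into a shared atomic word. This is an illustrative single-use spin barrier, not the patented mechanism.

```cpp
#include <atomic>
#include <cstdint>
#include <cstdio>
#include <thread>
#include <vector>

// Minimal sketch of the idea in the abstract: each thread owns one location
// (here one byte of a shared word) and writes it when it reaches the barrier;
// synchronization is established once every location has been written.
constexpr int kThreads = 4;
std::atomic<uint32_t> barrier_word{0};   // one byte per thread

void arrive_and_wait(int tid) {
    barrier_word.fetch_or(1u << (8 * tid), std::memory_order_acq_rel);
    const uint32_t all = 0x01010101u;    // every thread's byte set
    while (barrier_word.load(std::memory_order_acquire) != all) {
        // spin; a real implementation would pause or yield here
    }
}

int main() {
    std::vector<std::thread> ts;
    for (int t = 0; t < kThreads; ++t)
        ts.emplace_back([t] {
            std::printf("thread %d working\n", t);
            arrive_and_wait(t);
            std::printf("thread %d past barrier\n", t);
        });
    for (auto& th : ts) th.join();
}
```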
  • Patent number: 8473720
    Abstract: A method for providing generic formatted data to at least one digital data processor, configured to translate generic formatted data into specific formatted data. The generic formatted data includes data relative to logical blocks, at least one of the logical blocks corresponding to an object to be processed directly or indirectly according to specific formatted data by at least one processing platform with processor(s) and memory(ies), located upstream from the processor or integrated into the processor, the object being made up of elementary information of same type, all information being represented by at least one numerical value.
    Type: Grant
    Filed: December 19, 2006
    Date of Patent: June 25, 2013
    Assignee: DXO Labs
    Inventor: Bruno Liege
  • Publication number: 20130159677
    Abstract: Generating instructions, in particular for mailbox verification in a simulation environment. A sequence of instructions is received, as well as selection data representative of a plurality of commands including a special command. One of the plurality of commands is repeatedly selected and an instruction is output based on the selected command. The outputting of an instruction includes outputting the next instruction in the sequence of instructions if the selected command is the special command, and outputting an instruction associated with the command if the selected command is not the special command.
    Type: Application
    Filed: November 1, 2012
    Publication date: June 20, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: INTERNATIONAL BUSINESS MACHINES CORPORATION
  • Publication number: 20130159673
    Abstract: A method is provided that includes determining a number of outstanding out-of-order instructions in an instruction stream. The method includes determining a number of available hardware resources for executing out-of-order instructions and inserting fencing instructions into the instruction stream if the number of outstanding out-of-order instructions exceeds the determined number of available hardware resources. A second method is provided for compiling source code that includes determining a speculative region. The second method includes generating machine-level instructions and inserting fencing instructions into the machine-level instructions in response to determining the speculative region. A processing device is provided that includes cache memory and a processing unit to execute processing device instructions in an instruction stream.
    Type: Application
    Filed: December 15, 2011
    Publication date: June 20, 2013
    Inventors: Martin T. Pohlack, Michael Hohmuth, Stephan Diestelhorst, David Christie, Luke Yen
  • Publication number: 20130159679
    Abstract: In one embodiment, the present invention includes a method for receiving a data access instruction and obtaining an index into a data access hint register (DAHR) register file of a processor from the data access instruction, reading hint information from a register of the DAHR register file accessed using the index, and performing the data access instruction using the hint information. Other embodiments are described and claimed.
    Type: Application
    Filed: December 20, 2011
    Publication date: June 20, 2013
    Inventors: James E. McCormick, JR., Dale Morris
  • Publication number: 20130159678
    Abstract: A code section of a computer program to be executed by a computing device includes memory barrier instructions. Where the code section satisfies a threshold, the code section is modified by enclosing the code section within a transaction that employs hardware transactional memory of the computing device, and removing the memory barrier instructions from the code section. Execution of the code section as has been enclosed within the transaction can be monitored to yield monitoring results. Where the monitoring results satisfy an abort threshold corresponding to excessive aborting of the execution of the code section as has been enclosed within the transaction, the code section is split into code sub-sections, and each code sub-section is enclosed within a separate transaction that employs the hardware transactional memory. Splitting the code section into sub-sections and enclosing each code sub-section within a separate transaction can decrease the occurrence of the code section aborting during execution.
    Type: Application
    Filed: December 15, 2011
    Publication date: June 20, 2013
    Inventors: Toshihiko Koju, Takuya Nakaike, Ali Ijaz Sheikh, Harold Wade Cain, III, Maged M. Michael
  • Publication number: 20130159683
    Abstract: A particular method includes receiving, at a processor, an instruction and an address of the instruction. The method also includes preventing execution of the instruction based at least in part on determining that the address is within a range of addresses.
    Type: Application
    Filed: December 16, 2011
    Publication date: June 20, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Mark J. Hickey, Adam J. Muff, Matthew R. Tubbs, Charles D. Wait
  • Patent number: 8468547
    Abstract: Systems and methods for synchronizing thread wavefronts and associated events are disclosed. According to an embodiment, a method for synchronizing one or more thread wavefronts and associated events includes inserting a first event associated with a first data output from a first thread wavefront into an event synchronizer. The event synchronizer is configured to release the first event before releasing events inserted subsequent to the first event. The method further includes releasing the first event from the event synchronizer after the first data is stored in the memory. Corresponding system and computer readable medium embodiments are also disclosed.
    Type: Grant
    Filed: November 23, 2010
    Date of Patent: June 18, 2013
    Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: Laurent LeFebvre, Michael Mantor, Deborah Lynne Szasz
  • Patent number: 8468323
    Abstract: A computer array (10) has a plurality of computers (12). The computers (12) communicate with each other asynchronously, and the computers (12) themselves operate in a generally asynchronous manner internally. When one computer (12) attempts to communicate with another it goes to sleep until the other computer (12) is ready to complete the transaction, thereby saving power and reducing heat production. The sleeping computer (12) can be awaiting data or instructions (12). In the case of instructions, the sleeping computer (12) can be waiting to store the instructions or to immediately execute the instructions. In the latter case, the instructions are placed in an instruction register (30a) when they are received and are executed therefrom, without first placing the instructions into memory. The instructions can include a micro-loop (100) which is capable of performing a series of operations repeatedly.
    Type: Grant
    Filed: March 21, 2011
    Date of Patent: June 18, 2013
    Assignee: ARRAY Portfolio LLC
    Inventors: Charles H. Moore, Jeffrey Arthur Fox, John W. Rible
  • Patent number: 8464028
    Abstract: In one embodiment, a processor comprises a redirect unit configured to detect a match of an instruction pointer (IP) in an IP redirect table, the IP corresponding to a guest instruction that the processor has intercepted, wherein the guest is executed under control of a virtual machine monitor (VMM), and wherein the redirect unit is configured to redirect instruction fetching by the processor to a routine identified in the IP redirect table instead of exiting to the VMM in response to the intercept of the guest instruction.
    Type: Grant
    Filed: January 22, 2009
    Date of Patent: June 11, 2013
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Benjamin C. Serebrin, Anton Chernoff
  • Patent number: 8464030
    Abstract: A method, information processing system, and computer program product crack and/or shorten computer executable instructions. At least one instruction is received. The at least one instruction is analyzed. An instruction type associated with the at least one instruction is identified. At least one of a base field, an index field, one or more operands, and a mask field of the instruction are analyzed. At least one of the following is then performed: the at least one instruction is organized into a set of units of operation; and the at least one instruction is shortened. The set of units of operation is then executed.
    Type: Grant
    Filed: April 9, 2010
    Date of Patent: June 11, 2013
    Assignee: International Business Machines Corporation
    Inventors: Fadi Busaba, Brian Curran, Lee Eisen, Bruce Giamei, David Hutton
  • Publication number: 20130145135
    Abstract: A method utilizes information provided by performance monitoring hardware to dynamically adjust the number of levels of speculative branch predictions allowed (typically 3 or 4 per thread) for a processor core. The information includes cycles-per-instruction (CPI) for the processor core and the number of memory accesses per unit time. If the CPI is below a CPI threshold and the number of memory accesses (NMA) per unit time is above a prescribed threshold, the number of levels of speculative branch predictions is reduced per thread for the processor core. Likewise, the number of levels of speculative branch predictions could be increased, from a low level to the maximum allowed, if the CPI threshold is exceeded or the number of memory accesses per unit time is below the prescribed threshold.
    Type: Application
    Filed: December 1, 2011
    Publication date: June 6, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Robert H. Bell, JR., Wen-Tzer T. Chen
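The adjustment policy in the abstract above can be reduced to a small decision function: lower the per-thread speculation depth when CPI is below its threshold and memory accesses per unit time are above theirs, and otherwise raise it back toward the maximum. The thresholds and the 1–4 depth range in this C++ sketch are illustrative assumptions, not values from the application.

```cpp
#include <algorithm>
#include <iostream>

// Hedged sketch of the adjustment policy described in the abstract; the
// threshold values and the 1..4 depth range are illustrative assumptions.
struct CoreSample { double cpi; double mem_accesses_per_ms; };

int adjust_speculation_depth(int current_depth, const CoreSample& s,
                             double cpi_threshold = 1.5,
                             double nma_threshold = 200.0) {
    if (s.cpi < cpi_threshold && s.mem_accesses_per_ms > nma_threshold)
        return std::max(1, current_depth - 1);   // reduce levels per thread
    if (s.cpi > cpi_threshold || s.mem_accesses_per_ms < nma_threshold)
        return std::min(4, current_depth + 1);   // restore toward the maximum
    return current_depth;
}

int main() {
    int depth = 4;
    depth = adjust_speculation_depth(depth, {0.9, 350.0});   // busy memory system
    std::cout << "allowed speculation depth: " << depth << "\n";   // prints 3
}
```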
  • Publication number: 20130145132
    Abstract: A system, for use with a compiler architecture framework, includes performing a statically speculative compilation process to extract and use speculative static information, encoding the speculative static information in an instruction set architecture of a processor, and executing a compiled computer program using the speculative static information, wherein executing supports static speculation driven mechanisms and controls.
    Type: Application
    Filed: November 6, 2012
    Publication date: June 6, 2013
    Applicant: BlueRISC Inc.
    Inventor: BlueRISC Inc.
  • Patent number: 8458443
    Abstract: A processor may include a plurality of processing units for processing instructions, where each processing unit is associated with a discrete instruction queue. Data is read from a data queue selected by each instruction, and a sequencer manages distribution of instructions to the plurality of discrete instruction queues.
    Type: Grant
    Filed: September 8, 2009
    Date of Patent: June 4, 2013
    Assignee: SMSC Holdings S.A.R.L.
    Inventors: Matthias Tramm, Manfred Stadler, Christian Hitz
  • Publication number: 20130138926
    Abstract: An indirect branch instruction takes an address register as an argument in order to provide indirect function call capability for single-instruction multiple-thread (SIMT) processor architectures. The indirect branch instruction is used to implement indirect function calls, virtual function calls, and switch statements to improve processing performance compared with using sequential chains of tests and branches.
    Type: Application
    Filed: November 12, 2012
    Publication date: May 30, 2013
    Applicant: NVIDIA CORPORATION
    Inventor: NVIDIA CORPORATION
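At the source level, the benefit of an indirect branch is that a switch statement or virtual call compiles to a single jump through a register-held address instead of a sequential chain of compare-and-branch tests. The plain C++ sketch below shows the dispatch-table shape of that transformation; on a SIMT machine each lane would load its own target address before the indirect branch instruction the abstract describes.

```cpp
#include <array>
#include <cstdio>

// Hedged illustration: a table of handlers dispatched through one indirect
// call, standing in for the compare-and-branch chain the abstract says the
// indirect branch instruction replaces. Handler names are hypothetical.
using Handler = void (*)(int);

void op_add(int v)  { std::printf("add %d\n", v); }
void op_mul(int v)  { std::printf("mul %d\n", v); }
void op_halt(int v) { std::printf("halt %d\n", v); }

int main() {
    const std::array<Handler, 3> table = {op_add, op_mul, op_halt};
    int program[] = {0, 1, 1, 2};
    for (int opcode : program)
        table[opcode](opcode);   // one indirect call replaces a test chain
}
```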
  • Publication number: 20130138925
    Abstract: A method and circuit arrangement speculatively preprocess data stored in a register file during otherwise unused cycles in an execution unit, e.g., to prenormalize denormal floating point values stored in a floating point register file, to decompress compressed values stored in a register file, to decrypt encrypted values stored in a register file, or to otherwise preprocess data that is stored in an unprocessed form in a register file.
    Type: Application
    Filed: November 30, 2011
    Publication date: May 30, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Mark J. Hickey, Adam J. Muff, Matthew R. Tubbs, Charles D. Wait
  • Publication number: 20130138927
    Abstract: A data processing apparatus has an instruction memory system arranged to output an instruction word addressed by an instruction address. An instruction execution unit, processes a plurality of instructions from the instruction word in parallel. A detection unit, detects in which of a plurality of ranges the instruction address lies. The detection unit is coupled to the instruction execution unit and/or the instruction memory system, to control a way in which the instruction execution unit parallelizes processing of the instructions from the instruction word, dependent on a detected range. In an embodiment the instruction execution unit and/or the instruction memory system adjusts a width of the instruction word that determines a number of instructions from the instruction word that is processed in parallel, dependent on the detected range.
    Type: Application
    Filed: January 28, 2013
    Publication date: May 30, 2013
    Applicant: Nytell Software LLC
    Inventors: Ramanathan Sethuraman, Balakrishnan Srinivasan, Carlos Alba Pinto, Harm Peters, Rafael Peset LLopis
  • Patent number: 8452947
    Abstract: A hardware wake-and-go mechanism is provided for a data processing system. The wake-and-go mechanism looks ahead in the instruction stream of a thread for programming idioms that indicate that the thread is waiting for an event. The wake-and-go mechanism updates a wake-and-go array with a target address associated with the event for each recognized programming idiom. When the thread reaches a programming idiom, the thread goes to sleep until the event occurs. The wake-and-go array may be a content addressable memory (CAM). When a transaction appears on the symmetric multiprocessing (SMP) fabric that modifies the value at a target address in the CAM, the CAM returns a list of storage addresses at which the target address is stored. The wake-and-go mechanism associates these storage addresses with the threads waiting for an event at the target addresses, and may wake the one or more threads waiting for the event.
    Type: Grant
    Filed: February 1, 2008
    Date of Patent: May 28, 2013
    Assignee: International Business Machines Corporation
    Inventors: Ravi K. Arimilli, Satya P. Sharma, Randal C. Swanberg
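A software analogy for the wake-and-go array is a map from each watched target address to the threads sleeping on it; when a store to that address is observed, the lookup yields the threads to wake. The patent describes a hardware CAM snooping the SMP fabric, so the C++ sketch below only illustrates the lookup-and-wake bookkeeping, with hypothetical names.

```cpp
#include <cstdint>
#include <iostream>
#include <unordered_map>
#include <vector>

// Hedged software analogy for the wake-and-go array: a map from a watched
// target address to the identifiers of threads sleeping on it.
class WakeAndGo {
public:
    void sleep_on(uint64_t target_addr, int thread_id) {
        waiters_[target_addr].push_back(thread_id);
    }
    // Called when a store to target_addr is observed on the fabric.
    std::vector<int> wake(uint64_t target_addr) {
        auto it = waiters_.find(target_addr);
        if (it == waiters_.end()) return {};
        std::vector<int> woken = std::move(it->second);
        waiters_.erase(it);
        return woken;
    }
private:
    std::unordered_map<uint64_t, std::vector<int>> waiters_;
};

int main() {
    WakeAndGo wg;
    wg.sleep_on(0x1000, 7);            // thread 7 waits on a flag at 0x1000
    wg.sleep_on(0x1000, 9);
    for (int t : wg.wake(0x1000))      // a store to 0x1000 appears on the bus
        std::cout << "wake thread " << t << "\n";
}
```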
  • Patent number: 8452945
    Abstract: A data processor includes an instruction decoder, an execution unit, a general-purpose register file, and an index-register file. The instruction set for the data processor includes indirect-indexing instructions to facilitate table lookups. When executing such an instruction, the execution unit reads an index stored at an index-register location specified by the instruction. The index refers to a general-purpose register location, which is then read and copied to a general-purpose register location as specified by the instruction. The disclosed execution unit includes four functional units, each with two read ports and a write port so that eight table lookups can be performed in parallel.
    Type: Grant
    Filed: September 17, 2002
    Date of Patent: May 28, 2013
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventor: Dale Morris
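The indirect-indexing semantics described above can be paraphrased as dest ← GPR[IDX[i]]: the instruction names an index register, whose value selects a general-purpose register, whose contents are copied to the named destination. A minimal C++ sketch with illustrative register-file sizes:

```cpp
#include <array>
#include <cstdint>
#include <iostream>

// Hedged sketch of the indirect-indexing semantics in the abstract; the
// register-file sizes are illustrative assumptions.
struct RegisterFiles {
    std::array<uint64_t, 32> gpr{};   // general-purpose register file
    std::array<uint8_t, 8>   idx{};   // index-register file
};

// dest <- GPR[ IDX[index_reg] ]
void indirect_index(RegisterFiles& rf, unsigned index_reg, unsigned dest_reg) {
    uint8_t selected = rf.idx[index_reg] % rf.gpr.size();
    rf.gpr[dest_reg] = rf.gpr[selected];
}

int main() {
    RegisterFiles rf;
    rf.gpr[5] = 0xCAFE;     // table entry held in r5
    rf.idx[0] = 5;          // index register 0 points at r5
    indirect_index(rf, /*index_reg=*/0, /*dest_reg=*/1);
    std::cout << std::hex << rf.gpr[1] << "\n";   // prints cafe
}
```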
  • Publication number: 20130132709
    Abstract: A method and system for processing instruction information. Each instruction information character string of a sequence of instruction information character strings is sequentially extracted and processed. Each instruction information character string pertains to an associated target object wrapped in a target object storage unit by an associated operation target model. It is independently ascertained, for each instruction information character string, whether to generate a code line by determining whether a requirement is satisfied: the code line is generated and stored in a code buffer if the requirement is satisfied, and is not generated otherwise. The requirement relates to whether the instruction information character string being processed comprises a naming instruction or a generation instruction.
    Type: Application
    Filed: October 11, 2012
    Publication date: May 23, 2013
    Applicant: International Business Machines Corporation
    Inventor: International Business Machines Corporation
  • Patent number: 8447955
    Abstract: A multiprocessor data processing system (MDPS) with a weakly-ordered architecture provides processing logic for substantially eliminating the need to issue sync instructions after every store instruction of a well-behaved application. Instructions of a well-behaved application are translated and executed by a weakly-ordered processor. The processing logic includes a lock address tracking utility (LATU), which provides an algorithm and a table of lock addresses, within which each lock address is stored when the lock is acquired by the weakly-ordered processor. When a store instruction is encountered in the instruction stream, the LATU compares the target address of the store instruction against the table of lock addresses. If the target address matches one of the lock addresses, indicating that the store instruction is the corresponding unlock instruction (or lock release instruction), a sync instruction is issued ahead of the store operation.
    Type: Grant
    Filed: October 28, 2008
    Date of Patent: May 21, 2013
    Assignee: International Business Machines Corporation
    Inventors: Andrew Dunshea, Satya Prakash Sharma, Mysore Sathyanarayana Srinivas
  • Publication number: 20130124826
    Abstract: A technique for optimizing program instruction execution throughput in a central processing unit core (CPU). The CPU implements a simultaneous multithreading (SMT) operational mode wherein program instructions associated with at least two software threads are executed in parallel as hardware threads while sharing one or more hardware resources used by the CPU, such as cache memory, translation lookaside buffers, functional execution units, etc. As part of the SMT mode, the CPU implements an autothread (AT) operational mode. During the AT operational mode, a determination is made whether there is a resource conflict between the hardware threads that undermines instruction execution throughput. If a resource conflict is detected, the CPU adjusts the relative instruction execution rates of the hardware threads based on relative priorities of the software threads.
    Type: Application
    Filed: November 11, 2011
    Publication date: May 16, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Amit Merchant, Dipankar Sarma, Vaidyanathan Srinivasan
  • Publication number: 20130117543
    Abstract: A method and apparatus for processing multi-cycle instructions include picking a multi-cycle instruction and directing the picked multi-cycle instruction to a pipeline. The pipeline includes a pipeline control configured to detect a latency and a repeat rate of the picked multi-cycle instruction and to count clock cycles based on the detected latency and the detected repeat rate. The method and apparatus further include detecting the repeat rate and the latency of the picked multi-cycle instruction, and counting clock cycles based on the detected repeat rate and the latency of the picked multi-cycle instruction.
    Type: Application
    Filed: November 4, 2011
    Publication date: May 9, 2013
    Applicant: ADVANCED MICRO DEVICES, INC.
    Inventors: Ganesh Venkataramanan, Michael G. Butler
  • Publication number: 20130117542
    Abstract: A method, information processing system, and computer program product record an execution of a program instruction. A determination is made that a thread has entered a program unit. Another determination is made that the thread is associated with at least one attribute that matches a set of thread recording criteria. An instruction recording mechanism for the thread is dynamically activated in response to the at least one attribute of the thread matching the set of thread recording criteria.
    Type: Application
    Filed: November 4, 2011
    Publication date: May 9, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Christopher D. FILACHEK, Mei Hui WANG, Joshua B. WISNIEWSKI
  • Publication number: 20130117545
    Abstract: A computer employs a set of General Purpose Registers (GPRs). Each GPR comprises a plurality of portions. Programs such as an Operating System and Applications operating in a Large GPR mode access the full GPR; however, programs such as Applications operating in Small GPR mode only have access to a portion at a time. Instruction Opcodes, in Small GPR mode, may determine which portion is accessed.
    Type: Application
    Filed: December 26, 2012
    Publication date: May 9, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: International Business Machines Corporation
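A rough model of the Large/Small GPR modes described above is a 64-bit register made of two 32-bit portions: large-mode code reads and writes the whole register, while small-mode code touches one portion selected by the opcode. The widths and the selection flag in this C++ sketch are illustrative assumptions.

```cpp
#include <cstdint>
#include <iostream>

// Hedged sketch of the register model in the abstract: a 64-bit GPR made of
// two 32-bit portions. The exact widths and the portion-selection mechanism
// are illustrative assumptions, not taken from the application.
struct GPR {
    uint64_t full = 0;

    uint64_t read_large() const { return full; }           // Large GPR mode

    uint32_t read_small(bool high_portion) const {          // Small GPR mode
        return high_portion ? static_cast<uint32_t>(full >> 32)
                            : static_cast<uint32_t>(full);
    }
    void write_small(bool high_portion, uint32_t v) {
        if (high_portion) full = (full & 0x00000000FFFFFFFFULL) | (uint64_t(v) << 32);
        else              full = (full & 0xFFFFFFFF00000000ULL) | v;
    }
};

int main() {
    GPR r;
    r.write_small(/*high=*/false, 0x1111);   // small-mode program writes the low half
    r.write_small(/*high=*/true,  0x2222);   // opcode selects the high half
    std::cout << std::hex << r.read_large() << "\n";   // prints 222200001111
}
```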
  • Publication number: 20130117544
    Abstract: Disclosed are a method and system for optimized, dynamic data-dependent program execution. The disclosed system comprises a statistics computer which computes statistics of the incoming data at the current time instant, where the said statistics include the probability distribution of the incoming data, the probability distribution over program modules induced by the incoming data, the probability distribution induced over program outputs by the incoming data, and the time-complexity of each program module for the incoming data, wherein the said statistics are computed as a function of current and past data and previously computed statistics; a plurality of alternative execution path orders designed prior to run-time by the use of an appropriate source code; a source code selector which selects one of the execution path orders as a function of the statistics computed by the statistics computer; and a complexity measurement which measures the time-complexity of the currently selected execution path order.
    Type: Application
    Filed: December 21, 2012
    Publication date: May 9, 2013
    Applicant: International Business Machines Corporation
    Inventor: International Business Machines Corporation
  • Patent number: 8438568
    Abstract: In an embodiment, if a self thread has more than one conflict, a transaction of the self thread is aborted and restarted. If the self thread has only one conflict and an enemy thread of the self thread has more than one conflict, the transaction of the self thread is committed. If the self thread only conflicts with the enemy thread and the enemy thread only conflicts with the self thread and the self thread has a key that has a higher priority than a key of the enemy thread, the transaction of the self thread is committed. If the self thread only conflicts with the enemy thread, the enemy thread only conflicts with the self thread, and the self thread has a key that has a lower priority than the key of the enemy thread, the transaction of the self thread is aborted.
    Type: Grant
    Filed: February 24, 2010
    Date of Patent: May 7, 2013
    Assignee: International Business Machines Corporation
    Inventors: Mark E. Giampapa, Thomas M. Gooding, Raul E. Silvera, Kai-Ting Amy Wang, Peng Wu, Xiaotong Zhuang
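The arbitration rules quoted in the abstract above reduce to a small decision function over the conflict counts of the two threads and the relative priority of their keys. In the C++ sketch below a larger key value is taken to mean higher priority, which is an assumption made only for illustration.

```cpp
#include <iostream>

// Hedged sketch of the commit/abort rules in the abstract, reduced to a pure
// decision function. "Key" priority is modeled as an integer where a larger
// value means higher priority (an illustrative assumption).
enum class Decision { Abort, Commit };

Decision resolve(int self_conflicts, int enemy_conflicts,
                 bool mutually_exclusive_conflict,
                 int self_key, int enemy_key) {
    if (self_conflicts > 1) return Decision::Abort;         // abort and restart self
    if (enemy_conflicts > 1) return Decision::Commit;       // self has one conflict, enemy has more
    if (mutually_exclusive_conflict)                        // each conflicts only with the other
        return (self_key > enemy_key) ? Decision::Commit : Decision::Abort;
    return Decision::Commit;
}

int main() {
    auto d = resolve(/*self*/1, /*enemy*/1, /*1-vs-1*/true, /*keys*/42, 17);
    std::cout << (d == Decision::Commit ? "commit\n" : "abort\n");   // prints commit
}
```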
  • Patent number: 8438369
    Abstract: A method and apparatus for providing fairness in a multi-processing element environment is herein described. Mask elements are utilized to associate portions of a reservation station with each processing element, while still allowing common access to another portion of reservation station entries. Additionally, bias logic biases selection of processing elements in a pipeline away from a processing element associated with a blocking stall to provide fair utilization of the pipeline.
    Type: Grant
    Filed: November 8, 2010
    Date of Patent: May 7, 2013
    Assignee: Intel Corporation
    Inventors: Morris Marden, Matthew Merten, Alexandre Farcy, Avinash Sodani, James Hadley, Ilhyun Kim
  • Patent number: 8438368
    Abstract: Optimizing processing of a document sequentially processed by a plurality of image processing apparatuses that refer to an instruction document indicating the processing to be performed by each of the plurality of image processing apparatuses and respective security measures performed by each of the plurality of image processing apparatuses.
    Type: Grant
    Filed: April 1, 2009
    Date of Patent: May 7, 2013
    Assignee: Fuji Xerox Co., Ltd.
    Inventor: Tetsuo Numata
  • Patent number: 8433884
    Abstract: A multiprocessor executes a plurality of threads without decreasing execution efficiency. The multiprocessor includes a first processor allocating a different register file to each of a predetermined number of threads to be executed from among plural threads, and executing the predetermined number of threads in parallel; and a second processor performing processing according to a processing request made by the first processor. The first processor has areas allocated to the plurality of threads in one-to-one correspondence, makes the processing request to the second processor according to an instruction included in one of the predetermined number of threads, upon receiving a request for writing a value resulting from the processing from the second processor, judges whether the one thread is being executed, and when judging negatively, performs control such that the obtained value is written into one of the areas allocated to the one thread.
    Type: Grant
    Filed: June 16, 2009
    Date of Patent: April 30, 2013
    Assignee: Panasonic Corporation
    Inventor: Hiroyuki Morishita
  • Publication number: 20130103931
    Abstract: Disclosed are machine processors and methods performed thereby. The processor has access to processing units for performing data processing and to libraries. Functions in the libraries are implementable to perform parallel processing and graphics processing. The processor may be configured to acquire (e.g., to download from a web server) a download script, possibly with extensions specifying bindings to library functions. Running the script may cause the processor to create, for each processing unit, contexts in which functions may be run, and to run, on the processing units and within a respective context, a portion of the download script. Running the script may also cause the processor to create, for a processing unit, a memory object, transfer data into that memory object, and transfer data back to the processor in such a way that a memory address of the data in the memory object is not returned to the processor.
    Type: Application
    Filed: October 10, 2012
    Publication date: April 25, 2013
    Applicant: MOTOROLA MOBILITY LLC
    Inventor: MOTOROLA MOBILITY LLC
  • Patent number: 8429656
    Abstract: Methods and apparatuses are presented for graphics operations with thread count throttling, involving operating a processor to carry out multiple threads of execution, wherein the processor comprises at least one execution unit capable of supporting up to a maximum number of threads, obtaining a defined memory allocation size for allocating, in at least one memory device, a thread-specific memory space for the multiple threads, obtaining a per thread memory requirement corresponding to the thread-specific memory space, determining a thread count limit based on the defined memory allocation size and the per thread memory requirement, and sending a command to the processor to cause the processor to limit the number of threads carried out by the at least one execution unit to a reduced number of threads, the reduced number of threads being less than the maximum number of threads.
    Type: Grant
    Filed: November 2, 2006
    Date of Patent: April 23, 2013
    Assignee: NVIDIA Corporation
    Inventors: Jerome F. Duluk, Jr., Bryon S. Nordquist
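The throttling computation described above is essentially integer division: the thread count limit is how many per-thread memory footprints fit within the defined allocation, capped at the execution unit's maximum thread count. A short C++ sketch with hypothetical sizes:

```cpp
#include <algorithm>
#include <cstdint>
#include <iostream>

// Hedged sketch of the throttling arithmetic in the abstract: the thread
// count limit is the number of per-thread footprints that fit in the defined
// allocation, capped at the unit's maximum. All sizes are hypothetical.
uint32_t thread_count_limit(uint64_t allocation_bytes,
                            uint64_t per_thread_bytes,
                            uint32_t max_threads) {
    if (per_thread_bytes == 0) return max_threads;
    uint64_t fit = allocation_bytes / per_thread_bytes;
    return static_cast<uint32_t>(std::min<uint64_t>(fit, max_threads));
}

int main() {
    // 1 MiB scratch allocation, 24 KiB needed per thread, unit supports 96 threads.
    std::cout << thread_count_limit(1ull << 20, 24ull << 10, 96) << " threads\n"; // 42
}
```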
  • Patent number: 8429386
    Abstract: Various techniques for dynamically allocating instruction tags and using those tags are disclosed. These techniques may apply to processors supporting out-of-order execution and to architectures that supports multiple threads. A group of instructions may be assigned a tag value from a pool of available tag values. A tag value may be usable to determine the program order of a group of instructions relative to other instructions in a thread. After the group of instructions has been (or is about to be) committed, the tag value may be freed so that it can be re-used on a second group of instructions. Tag values are dynamically allocated between threads; accordingly, a particular tag value or range of tag values is not dedicated to a particular thread.
    Type: Grant
    Filed: June 30, 2009
    Date of Patent: April 23, 2013
    Assignee: Oracle America, Inc.
    Inventors: Paul J. Jordan, Robert T. Golla, Jama I. Barreh
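Dynamic tag allocation as described above can be pictured as a shared pool of tag values: any thread's instruction group draws a tag at dispatch and returns it once the group commits, with no range reserved per thread. The C++ sketch below is a software analogy with an illustrative pool size and locking scheme, not the hardware mechanism.

```cpp
#include <cstdint>
#include <iostream>
#include <mutex>
#include <optional>
#include <vector>

// Hedged sketch: a pool of tag values handed out to instruction groups from
// any thread and returned when the group commits. Pool size and locking are
// illustrative assumptions.
class TagPool {
public:
    explicit TagPool(uint16_t count) {
        for (uint16_t t = 0; t < count; ++t) free_.push_back(t);
    }
    std::optional<uint16_t> allocate() {          // any thread may call
        std::lock_guard<std::mutex> g(m_);
        if (free_.empty()) return std::nullopt;   // dispatch stalls until a tag frees
        uint16_t t = free_.back();
        free_.pop_back();
        return t;
    }
    void release(uint16_t tag) {                  // group committed: tag reusable
        std::lock_guard<std::mutex> g(m_);
        free_.push_back(tag);
    }
private:
    std::mutex m_;
    std::vector<uint16_t> free_;
};

int main() {
    TagPool pool(4);
    auto a = pool.allocate();       // group from thread 0
    auto b = pool.allocate();       // group from thread 1 - no per-thread reservation
    std::cout << "tags " << *a << " and " << *b << "\n";
    pool.release(*a);               // group committed; tag returns to the pool
}
```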
  • Publication number: 20130097410
    Abstract: Disclosed are machine processors and methods performed thereby. The processor has access to processing units for performing data processing and to libraries. Functions in the libraries are implementable to perform parallel processing and graphics processing. The processor may be configured to acquire (e.g., to download from a web server) a download script, possibly with extensions specifying bindings to library functions. Running the script may cause the processor to create, for each processing unit, contexts in which functions may be run, and to run, on the processing units and within a respective context, a portion of the download script. Running the script may also cause the processor to create, for a processing unit, a memory object, transfer data into that memory object, and transfer data back to the processor in such a way that a memory address of the data in the memory object is not returned to the processor.
    Type: Application
    Filed: October 10, 2012
    Publication date: April 18, 2013
    Applicant: MOTOROLA MOBILITY LLC
    Inventor: MOTOROLA MOBILITY LLC
  • Patent number: 8423750
    Abstract: Mechanisms are provided for offloading a workload from a main thread to an assist thread. The mechanisms receive, in a fetch unit of a processor of the data processing system, a branch-to-assist-thread instruction of a main thread. The branch-to-assist-thread instruction informs hardware of the processor to look for an already spawned idle thread to be used as an assist thread. Hardware implemented pervasive thread control logic determines if one or more already spawned idle threads are available for use as an assist thread. The hardware implemented pervasive thread control logic selects an idle thread from the one or more already spawned idle threads if it is determined that one or more already spawned idle threads are available for use as an assist thread, to thereby provide the assist thread. In addition, the hardware implemented pervasive thread control logic offloads a portion of a workload of the main thread to the assist thread.
    Type: Grant
    Filed: May 12, 2010
    Date of Patent: April 16, 2013
    Assignee: International Business Machines Corporation
    Inventors: Ronald P. Hall, Hung Q. Le, Raul E. Silvera, Balaram Sinharoy
  • Patent number: 8423605
    Abstract: Provided is a parallel distributed processing method executed by a computer system comprising a parallel-distributed-processing control server, a plurality of extraction processing servers, and a plurality of aggregation processing servers. The managed data includes at least a first and a second data item, the plurality of data items each including a value. The method includes a step of extracting data from one of the plurality of chunks according to a value in the second data item, to thereby group the data, a step of merging groups having the same value in the second data item based on an order of a value in the first data item of data contained in a group among groups, and a step of processing data in a group obtained through the merging by focusing on the order of the value in the first data item.
    Type: Grant
    Filed: March 17, 2010
    Date of Patent: April 16, 2013
    Assignee: Hitachi, Ltd.
    Inventor: Ryo Kawai
  • Patent number: 8417920
    Abstract: Circuitry for receiving transaction requests from a plurality of masters and the masters themselves are disclosed. The circuitry comprises: an input port for receiving said transaction requests, at least one of said transaction requests received comprising an indicator indicating if said transaction is a speculative transaction; an output port for outputting a response to said master said transaction request was received from; and transaction control circuitry; wherein said transaction control circuitry is responsive to a speculative transaction request to determine a state of at least a portion of a data processing apparatus said circuitry is operating within and in response to said state being a predetermined state said transaction control circuitry generates a transaction cancel indicator and outputs said transaction cancel indicator as said response, said transaction cancel indicator indicating to said master that said speculative transaction will not be performed.
    Type: Grant
    Filed: December 21, 2007
    Date of Patent: April 9, 2013
    Assignee: ARM Limited
    Inventors: Ashley Miles Stevens, Daren Croxford
  • Patent number: 8417735
    Abstract: One embodiment of the present invention sets forth a technique for performing a parallel scan operation with high computational efficiency in a single-instruction multiple-data (SIMD) environment. Each participating thread initially writes an extended region of a data array to initialize the region with an identity value. For example, a value of zero is used as the identity value for addition. The initialized region of the data array includes an initialized entry for every possible out of bounds index that may be computed in the normal course of the parallel scan operation. During the parallel scan operation each thread computes data array indices according to any technically appropriate technique. When a participating thread computes an index that would conventionally be out of bounds, the thread is able to retrieve an identity value from the initialized region of the data array rather than perform a bounds check that returns the identity value.
    Type: Grant
    Filed: December 12, 2007
    Date of Patent: April 9, 2013
    Assignee: Nvidia Corporation
    Inventor: John A. Stratton
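The padding trick described above extends the array with identity values (0 for addition) so that the below-range indices a Hillis-Steele style scan computes read the identity instead of triggering a bounds check. The serial C++ sketch below mimics that layout; the actual technique targets a SIMD kernel with one thread per element.

```cpp
#include <algorithm>
#include <iostream>
#include <vector>

// Hedged sketch (serial C++, not the SIMD kernel) of the padding trick in the
// abstract: region [0, pad) holds the identity value 0, so an index that lands
// below the first data element simply reads 0 instead of needing a bounds check.
std::vector<int> inclusive_scan_padded(const std::vector<int>& in) {
    const size_t n = in.size();
    size_t pad = 1;
    while (pad < n) pad <<= 1;                    // at least as large as any offset used
    std::vector<int> a(pad + n, 0);               // [0, pad) is the identity region
    std::copy(in.begin(), in.end(), a.begin() + pad);

    for (size_t offset = 1; offset < n; offset <<= 1) {
        std::vector<int> next = a;
        for (size_t i = pad; i < pad + n; ++i)
            next[i] = a[i] + a[i - offset];       // i - offset may land in the pad
        a.swap(next);
    }
    return {a.begin() + pad, a.end()};
}

int main() {
    for (int v : inclusive_scan_padded({3, 1, 4, 1, 5}))
        std::cout << v << " ";                    // prints 3 4 8 9 14
    std::cout << "\n";
}
```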
  • Patent number: 8416430
    Abstract: Disclosed is an information processing apparatus in which various kinds of information are processed in either a real-time processing mode or a non-real-time processing mode. The apparatus includes an operation display section to accept an inputted instruction, an image processing section to apply processing to image information, and a processor provided with a plurality of identical cores. The process that does not require real-time processing, which is related to the operation display section, is fixed onto one of the cores so that that core is in charge of controlling it, while the process that requires real-time processing, which is related to the image processing section, is fixed onto another of the cores so that that core is in charge of controlling it.
    Type: Grant
    Filed: March 18, 2010
    Date of Patent: April 9, 2013
    Assignee: Konica Minolta Business Technologies, Inc.
    Inventor: Tadashi Suzue