Abstract: The invention provides an embedded processor architecture comprising a plurality of virtual processing units that each execute processes or threads (collectively, “threads”). One or more execution units, which are shared by the processing units, execute instructions from the threads. An event delivery mechanism delivers events—such as, by way of non-limiting example, hardware interrupts, software-initiated signaling events (“software events”) and memory events—to respective threads without execution of instructions. Each event can, per aspects of the invention, be processed by the respective thread without execution of instructions outside that thread. The threads need not be constrained to execute on the same respective processing units during the lives of those threads—though, in some embodiments, they can be so constrained. The execution units execute instructions from the threads without needing to know what threads those instructions are from.
Abstract: A method for running, on a processor in non-privileged mode, different computer programs P that, in a nominal mode, use privileged instructions. The method includes running a hypervisor program in the privileged mode of the processor, the hypervisor program providing the computer programs P with services substantially equivalent to those available when running in privileged mode. The source codes of the computer programs P are modified beforehand to replace the privileged instructions with calls for services supplied by the hypervisor program. The hypervisor program creates at least two privileged submodes organized into a hierarchy within the non-privileged mode, the processor itself including only two operating modes.
Abstract: Performing non-transactional escape actions within a hardware based transactional memory system. A method includes at a hardware thread on a processor beginning a hardware based transaction for the thread. Without committing or aborting the transaction, the method further includes suspending the hardware based transaction and performing one or more operations for the thread, non-transactionally and not affected by: transaction monitoring and buffering for the transaction, an abort for the transaction, or a commit for the transaction. After performing one or more operations for the thread, non-transactionally, the method further includes resuming the transaction and performing additional operations transactionally. After performing the additional operations, the method further includes either committing or aborting the transaction.
Type:
Grant
Filed:
June 26, 2009
Date of Patent:
July 16, 2013
Assignee:
Microsoft Corporation
Inventors:
Gad Sheaffer, Jan Gray, Martin Taillefer, Ali-Reza Adl-Tabatabai, Bratin Saha, Vadim Bassin, Robert Y. Geva, David Callahan
Abstract: An information handling system includes a processor with an instruction issue queue (IQ) that may perform age tracking operations. The issue queue IQ maintains or stores instructions that may issue out-of-order in an internal data store (IDS). The IDS organizes instructions in a queue position (QPOS) addressing arrangement. An age matrix of the IQ maintains a record of relative instruction aging for those instructions within the IDS. The age matrix updates latches or other memory cell data to reflect the changes in IDS instruction ages during a dispatch operation into the IQ. During dispatch of one or more instructions, the age matrix may update only those latches that require data change to reflect changing IDS instruction ages. The age matrix employs row and column data and clock controls to individually update those latches requiring update.
Type:
Grant
Filed:
April 19, 2012
Date of Patent:
July 16, 2013
Assignee:
International Business Machines Corporation
Inventors:
James Wilson Bishop, Mary Douglass Brown, Jeffrey Carl Brownscheidle, Robert Allen Cordes, Maureen Anne Delaney, Jafar Nahidi, Dung Quoc Nguyen, Joel Abraham Silberman
Abstract: An object of the invention is to reduce the electric power consumption resulting from temporarily activating a processor requiring a large electric power consumption, out of a plurality of processors. A multiprocessor system (1) includes: a first processor (141) which executes a first instruction code; a second processor (151) which executes a second instruction code, a hypervisor (130) which converts the second instruction code into an instruction code executable by the first processor (141); and a power control circuit (170) which controls the operation of at least one of the first processor (141) and the second processor (151). When the operation of the second processor (151) is suppressed by the power control circuit (170), the hypervisor (130) converts the second instruction code into the instruction code executable by the first processor (141), and the first processor (141) executes the converted instruction code.
Abstract: A command chain system includes a plurality of processing elements, a memory, and a chain engine. The chain engine is in communication with the memory and accesses instructions in the memory. The chain engine accesses a subroutine stored in the memory and sends a command to specialized hardware. The chain engine performs an action determined by one or more of the operation-code portion, the skip portion, and the loop-count portion of the instruction.
Type:
Grant
Filed:
April 15, 2010
Date of Patent:
July 16, 2013
Assignee:
Lockheed Martin Corporation
Inventors:
Joshua W. Rensch, Marlon O. Gunderson, James V. Hedin
Abstract: A programmable processor and method for improving the performance of processors by expanding at least two source operands, or a source and a result operand, to a width greater than the width of either the general purpose register or the data path width. The present invention provides operands which are substantially larger than the data path width of the processor by using the contents of a general purpose register to specify a memory address at which a plurality of data path widths of data can be read or written, as well as the size and shape of the operand. In addition, several instructions and apparatus for implementing these instructions are described which obtain performance advantages if the operands are not limited to the width and accessible number of general purpose registers.
Type:
Application
Filed:
August 22, 2012
Publication date:
July 4, 2013
Applicant:
MicroUnity Systems Engineering, Inc.
Inventors:
Craig Hansen, John Moussouris, Alexia Massalin
Abstract: A parallel processing system for computing particle interactions includes a plurality of computation nodes arranged according to a geometric partitioning of a simulation volume. Each computation node has storage for particle data. This particle data is associated with particles in a region of the geometrically partitioned simulation volume. The parallel processing system also includes a communication system having links interconnecting the computation nodes. Each of the computation nodes includes a processor subsystem. These processor subsystems cooperate to coordinate computation of the particle interactions in a distributed manner.
Abstract: According to one embodiment, a data processing apparatus includes a processor and a memory. The processor includes core blocks. The memory stores a command queue and task management structure data. The command queue stores a series of kernel functions. The task management structure data defines an order of execution of kernel functions by associating a return value of a previous kernel function with an argument of a subsequent kernel function. Core blocks of the processor are capable of executing different kernel functions.
Abstract: A method and apparatus are described for generating flags in response to processing data during an execution pipeline cycle of a processor. The processor may include a multiplexer configured to generate valid bits for received data according to a designated data size, and a logic unit configured to control the generation of flags based on a shift or rotate operation command, the designated data size and information indicating how many bytes and bits to rotate or shift the data by. A carry flag may be used to extend the amount of bits supported by shift and rotate operations. A sign flag may be used to indicate whether a result is a positive or negative number. An overflow flag may be used to indicate that a data overflow exists, whereby there are not a sufficient number of bits to store the data.
Abstract: Techniques are described for predictively starting a processing element. Embodiments receive streaming data to be processed by a plurality of processing elements. An operator graph of the plurality of processing elements that defines at least one execution path is established. Embodiments determine a historical startup time for a first processing element in the operator graph, where, once started, the first processing element begins normal operations once the first processing element has received a requisite amount of data from one or more upstream processing elements. Additionally, embodiments determine an amount of time the first processing element takes to receive the requisite amount of data from the one or more upstream processing elements. The first processing element is then predictively started at a first startup time based on the determined historical startup time and the determined amount of time historically taken to receive the requisite amount of data.
Type:
Application
Filed:
December 10, 2012
Publication date:
June 27, 2013
Applicant:
INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventor:
International Business Machines Corporation
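The timing computation described in the abstract above can be sketched as a small helper. This is an illustrative assumption: the patent only says the start time is "based on" the historical startup time and the data-receipt time, so subtracting both from the moment the element must be operational is one plausible combination, and all names here are hypothetical:

```python
def predictive_start_time(needed_at, historical_startup, data_receive_time):
    """Return when to launch a processing element so that it is fully
    operational at time needed_at, given its historical startup latency and
    the time it historically takes to receive the requisite upstream data
    (all in seconds; the additive model is an assumption, not the patent's)."""
    return needed_at - (historical_startup + data_receive_time)
```

For example, an element that historically needs 5 seconds to start and 3 seconds to fill with upstream data would be launched 8 seconds before it must be operational.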
Abstract: In one embodiment, the present invention includes a method of assigning a location within a shared variable for each of multiple threads and writing a value to a corresponding location to indicate that the corresponding thread has reached a barrier. In such manner, when all the threads have reached the barrier, synchronization is established. In some embodiments, the shared variable may be stored in a cache accessible by the multiple threads. Other embodiments are described and claimed.
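The per-thread-slot barrier described above can be sketched in a few lines of Python. This is a minimal illustration, not the patented cache-resident implementation: the slot layout (one list element per thread) and the spin-wait are assumptions:

```python
import threading

NUM_THREADS = 4
shared = [0] * NUM_THREADS           # one location per thread within the shared variable
passed = []

def worker(tid):
    # ... thread-local work happens here ...
    shared[tid] = 1                  # write to own slot: "I have reached the barrier"
    while not all(shared):           # wait until every thread has signalled
        pass
    passed.append(tid)               # synchronization established for all threads

threads = [threading.Thread(target=worker, args=(i,)) for i in range(NUM_THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because each thread writes only its own slot, no locking is needed on the signalling path; synchronization is established exactly when every slot has been set.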
Abstract: A method for providing generic formatted data to at least one digital data processor, configured to translate generic formatted data into specific formatted data. The generic formatted data includes data relative to logical blocks, at least one of the logical blocks corresponding to an object to be processed directly or indirectly according to specific formatted data by at least one processing platform with processor(s) and memory(ies), located upstream from the processor or integrated into the processor, the object being made up of elementary information of same type, all information being represented by at least one numerical value.
Abstract: Generating instructions, in particular for mailbox verification in a simulation environment. A sequence of instructions is received, as well as selection data representative of a plurality of commands including a special command. Repeatedly selecting one of the plurality of commands and outputting an instruction based on the selected command. The outputting of an instruction includes outputting a next instruction in the sequence of instructions if the selected command is the special command, and outputting an instruction associated with the command if the selected command is not the special command.
Type:
Application
Filed:
November 1, 2012
Publication date:
June 20, 2013
Applicant:
INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventor:
INTERNATIONAL BUSINESS MACHINES CORPORATION
Abstract: A method is provided that includes determining a number of outstanding out-of-order instructions in an instruction stream. The method includes determining a number of available hardware resources for executing out-of-order instructions and inserting fencing instructions into the instruction stream if the number of outstanding out-of-order instructions exceeds the determined number of available hardware resources. A second method is provided for compiling source code that includes determining a speculative region. The second method includes generating machine-level instructions and inserting fencing instructions into the machine-level instructions in response to determining the speculative region. A processing device is provided that includes cache memory and a processing unit to execute processing device instructions in an instruction stream.
Type:
Application
Filed:
December 15, 2011
Publication date:
June 20, 2013
Inventors:
Martin T. Pohlack, Michael Hohmuth, Stephan Diestelhorst, David Christie, Luke Yen
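The first method above — inserting fences when outstanding out-of-order instructions would exceed the available hardware resources — can be sketched as a simple rewriting pass over an instruction stream. The counting scheme and the "fence" token are illustrative assumptions:

```python
def insert_fences(instructions, available_resources):
    """Insert a fence whenever the count of outstanding out-of-order
    instructions would exceed available_resources. The fence is assumed to
    drain the outstanding window, resetting the count to zero."""
    out, outstanding = [], 0
    for instr in instructions:
        if outstanding >= available_resources:
            out.append("fence")
            outstanding = 0
        out.append(instr)
        outstanding += 1
    return out
```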
Abstract: In one embodiment, the present invention includes a method for receiving a data access instruction and obtaining an index into a data access hint register (DAHR) register file of a processor from the data access instruction, reading hint information from a register of the DAHR register file accessed using the index, and performing the data access instruction using the hint information. Other embodiments are described and claimed.
Abstract: A code section of a computer program to be executed by a computing device includes memory barrier instructions. Where the code section satisfies a threshold, the code section is modified, by enclosing the code section within a transaction that employs hardware transactional memory of the computing device, and removing the memory barrier instructions from the code section. Execution of the code section as has been enclosed within the transaction can be monitored to yield monitoring results. Where the monitoring results satisfy an abort threshold corresponding to excessive aborting of the execution of the code section as has been enclosed within the transaction, the code section is split into code sub-sections, and each code sub-section enclosed within a separate transaction that employs the hardware transactional memory. Splitting the code section into sub-sections and enclosing each code sub-section within a separate transaction can decrease occurrence of the code section aborting during execution.
Type:
Application
Filed:
December 15, 2011
Publication date:
June 20, 2013
Inventors:
Toshihiko Koju, Takuya Nakaike, Ali Ijaz Sheikh, Harold Wade Cain, III, Maged M. Michael
Abstract: A particular method includes receiving, at a processor, an instruction and an address of the instruction. The method also includes preventing execution of the instruction based at least in part on determining that the address is within a range of addresses.
Type:
Application
Filed:
December 16, 2011
Publication date:
June 20, 2013
Applicant:
INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors:
Mark J. Hickey, Adam J. Muff, Matthew R. Tubbs, Charles D. Wait
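The address-range check in the abstract above reduces to a small predicate. A minimal sketch, assuming blocked ranges are given as inclusive (lo, hi) pairs — the representation is not specified by the patent:

```python
def should_execute(instruction_addr, blocked_ranges):
    """Return False (prevent execution) when the instruction's address
    falls within any blocked range; True otherwise."""
    return not any(lo <= instruction_addr <= hi for lo, hi in blocked_ranges)
```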
Abstract: Systems and methods for synchronizing thread wavefronts and associated events are disclosed. According to an embodiment, a method for synchronizing one or more thread wavefronts and associated events includes inserting a first event associated with a first data output from a first thread wavefront into an event synchronizer. The event synchronizer is configured to release the first event before releasing events inserted subsequent to the first event. The method further includes releasing the first event from the event synchronizer after the first data is stored in the memory. Corresponding system and computer readable medium embodiments are also disclosed.
Type:
Grant
Filed:
November 23, 2010
Date of Patent:
June 18, 2013
Assignees:
Advanced Micro Devices, Inc., ATI Technologies ULC
Inventors:
Laurent LeFebvre, Michael Mantor, Deborah Lynne Szasz
Abstract: A computer array (10) has a plurality of computers (12). The computers (12) communicate with each other asynchronously, and the computers (12) themselves operate in a generally asynchronous manner internally. When one computer (12) attempts to communicate with another it goes to sleep until the other computer (12) is ready to complete the transaction, thereby saving power and reducing heat production. The sleeping computer (12) can be awaiting data or instructions. In the case of instructions, the sleeping computer (12) can be waiting to store the instructions or to immediately execute the instructions. In the latter case, the instructions are placed in an instruction register (30a) when they are received and executed therefrom, without first placing the instructions into memory. The instructions can include a micro-loop (100) which is capable of performing a series of operations repeatedly.
Type:
Grant
Filed:
March 21, 2011
Date of Patent:
June 18, 2013
Assignee:
ARRAY Portfolio LLC
Inventors:
Charles H. Moore, Jeffrey Arthur Fox, John W. Rible
Abstract: In one embodiment, a processor comprises a redirect unit configured to detect a match of an instruction pointer (IP) in an IP redirect table, the IP corresponding to a guest instruction that the processor has intercepted, wherein the guest is executed under control of a virtual machine monitor (VMM), and wherein the redirect unit is configured to redirect instruction fetching by the processor to a routine identified in the IP redirect table instead of exiting to the VMM in response to the intercept of the guest instruction.
Abstract: A method, information processing system, and computer program product crack and/or shorten computer executable instructions. At least one instruction is received. The at least one instruction is analyzed. An instruction type associated with the at least one instruction is identified. At least one of a base field, an index field, one or more operands, and a mask field of the instruction are analyzed. At least one of the following is then performed: the at least one instruction is organized into a set of units of operation; and the at least one instruction is shortened. The set of units of operation is then executed.
Type:
Grant
Filed:
April 9, 2010
Date of Patent:
June 11, 2013
Assignee:
International Business Machines Corporation
Inventors:
Fadi Busaba, Brian Curran, Lee Eisen, Bruce Giamei, David Hutton
Abstract: A method utilizes information provided by performance monitoring hardware to dynamically adjust the number of levels of speculative branch predictions allowed (typically 3 or 4 per thread) for a processor core. The information includes cycles-per-instruction (CPI) for the processor core and the number of memory accesses per unit time. If the CPI is below a CPI threshold and the number of memory accesses (NMA) per unit time is above a prescribed threshold, the number of levels of speculative branch predictions is reduced per thread for the processor core. Likewise, the number of levels of speculative branch predictions can be increased, from a low level to the maximum allowed, if the CPI threshold is exceeded or the number of memory accesses per unit time is below the prescribed threshold.
Type:
Application
Filed:
December 1, 2011
Publication date:
June 6, 2013
Applicant:
INTERNATIONAL BUSINESS MACHINES CORPORATION
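The adjustment policy described above can be sketched as a pure function. The threshold values and the cap of four levels are illustrative assumptions, the latter taken from the "typically 3 or 4 per thread" remark:

```python
def adjust_speculation_levels(current_levels, cpi, nma,
                              cpi_threshold=2.0, nma_threshold=1000,
                              min_levels=1, max_levels=4):
    """Return the new number of speculative branch-prediction levels per
    thread, given the core's cycles-per-instruction (cpi) and number of
    memory accesses per unit time (nma). Thresholds are illustrative."""
    if cpi < cpi_threshold and nma > nma_threshold:
        # Core is already efficient but memory-bound: rein in speculation.
        return max(min_levels, current_levels - 1)
    if cpi >= cpi_threshold or nma < nma_threshold:
        # Stalls dominate, or memory traffic is light: allow the maximum.
        return max_levels
    return current_levels
```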
Abstract: A system, for use with a compiler architecture framework, includes performing a statically speculative compilation process to extract and use speculative static information, encoding the speculative static information in an instruction set architecture of a processor, and executing a compiled computer program using the speculative static information, wherein executing supports static speculation driven mechanisms and controls.
Abstract: A processor may include a plurality of processing units for processing instructions, where each processing unit is associated with a discrete instruction queue. Data is read from a data queue selected by each instruction, and a sequencer manages distribution of instructions to the plurality of discrete instruction queues.
Type:
Grant
Filed:
September 8, 2009
Date of Patent:
June 4, 2013
Assignee:
SMSC Holdings S.A.R.L.
Inventors:
Matthias Tramm, Manfred Stadler, Christian Hitz
Abstract: An indirect branch instruction takes an address register as an argument in order to provide indirect function call capability for single-instruction multiple-thread (SIMT) processor architectures. The indirect branch instruction is used to implement indirect function calls, virtual function calls, and switch statements to improve processing performance compared with using sequential chains of tests and branches.
Abstract: A method and circuit arrangement speculatively preprocess data stored in a register file during otherwise unused cycles in an execution unit, e.g., to prenormalize denormal floating point values stored in a floating point register file, to decompress compressed values stored in a register file, to decrypt encrypted values stored in a register file, or to otherwise preprocess data that is stored in an unprocessed form in a register file.
Type:
Application
Filed:
November 30, 2011
Publication date:
May 30, 2013
Applicant:
INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors:
Mark J. Hickey, Adam J. Muff, Matthew R. Tubbs, Charles D. Wait
Abstract: A data processing apparatus has an instruction memory system arranged to output an instruction word addressed by an instruction address. An instruction execution unit processes a plurality of instructions from the instruction word in parallel. A detection unit detects in which of a plurality of ranges the instruction address lies. The detection unit is coupled to the instruction execution unit and/or the instruction memory system, to control a way in which the instruction execution unit parallelizes processing of the instructions from the instruction word, dependent on a detected range. In an embodiment the instruction execution unit and/or the instruction memory system adjusts a width of the instruction word that determines a number of instructions from the instruction word that is processed in parallel, dependent on the detected range.
Type:
Application
Filed:
January 28, 2013
Publication date:
May 30, 2013
Applicant:
Nytell Software LLC
Inventors:
Ramanathan Sethuraman, Balakrishnan Srinivasan, Carlos Alba Pinto, Harm Peters, Rafael Peset LLopis
Abstract: A hardware wake-and-go mechanism is provided for a data processing system. The wake-and-go mechanism looks ahead in the instruction stream of a thread for programming idioms that indicate that the thread is waiting for an event. The wake-and-go mechanism updates a wake-and-go array with a target address associated with the event for each recognized programming idiom. When the thread reaches a programming idiom, the thread goes to sleep until the event occurs. The wake-and-go array may be a content addressable memory (CAM). When a transaction appears on the symmetric multiprocessing (SMP) fabric that modifies the value at a target address in the CAM, the CAM returns a list of storage addresses at which the target address is stored. The wake-and-go mechanism associates these storage addresses with the threads waiting for an event at the target addresses, and may wake the one or more threads waiting for the event.
Type:
Grant
Filed:
February 1, 2008
Date of Patent:
May 28, 2013
Assignee:
International Business Machines Corporation
Inventors:
Ravi K. Arimilli, Satya P. Sharma, Randal C. Swanberg
Abstract: A data processor includes an instruction decoder, an execution unit, a general-purpose register file, and an index-register file. The instruction set for the data processor includes indirect-indexing instructions to facilitate table lookups. When executing such an instruction, the execution unit reads an index stored at an index-register location specified by the instruction. The index refers to a general-purpose register location, which is then read and copied to a general-purpose register location as specified by the instruction. The disclosed execution unit includes four functional units, each with two read ports and a write port so that eight table lookups can be performed in parallel.
Type:
Grant
Filed:
September 17, 2002
Date of Patent:
May 28, 2013
Assignee:
Hewlett-Packard Development Company, L.P.
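The indirect-indexing lookup described above can be modelled with the two register files as plain lists. A minimal sketch; the register-file sizes, function name, and calling convention are assumptions:

```python
def indirect_index_lookup(index_regs, gprs, idx_reg_num, dest_reg_num):
    """Model one indirect-indexing instruction: read an index from the
    index-register file, use it to select a general-purpose register, and
    copy that register's value into the destination GPR specified by the
    instruction. Returns the value copied."""
    index = index_regs[idx_reg_num]      # index held in the index-register file
    gprs[dest_reg_num] = gprs[index]     # copy the indirectly addressed GPR
    return gprs[dest_reg_num]
```

With four such functional units, each having two read ports and a write port, eight of these lookups could proceed in parallel, as the abstract notes.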
Abstract: A method and system for processing instruction information. Each instruction information character string of a sequence of instruction information character strings is sequentially extracted and processed. Each instruction information character string pertains to an associated target object wrapped in a target object storage unit by an associated operation target model. It is independently ascertained for each instruction information character string whether to generate a code line for each instruction information character string, by: determining whether a requirement is satisfied and generating the code line and storing the code line in a code buffer if the requirement has been determined to be satisfied and not generating the code line if the requirement has been determined to not be satisfied. The requirement relates to whether the instruction information character string being processed comprises a naming instruction or a generation instruction.
Type:
Application
Filed:
October 11, 2012
Publication date:
May 23, 2013
Applicant:
International Business Machines Corporation
Inventor:
International Business Machines Corporation
Abstract: A multiprocessor data processing system (MDPS) with a weakly-ordered architecture providing processing logic for substantially eliminating issuing sync instructions after every store instruction of a well-behaved application. Instructions of a well-behaved application are translated and executed by a weakly-ordered processor. The processing logic includes a lock address tracking utility (LATU), which provides an algorithm and a table of lock addresses, within which each lock address is stored when the lock is acquired by the weakly-ordered processor. When a store instruction is encountered in the instruction stream, the LATU compares the target address of the store instruction against the table of lock addresses. If the target address matches one of the lock addresses, indicating that the store instruction is the corresponding unlock instruction (or lock release instruction), a sync instruction is issued ahead of the store operation.
Type:
Grant
Filed:
October 28, 2008
Date of Patent:
May 21, 2013
Assignee:
International Business Machines Corporation
Inventors:
Andrew Dunshea, Satya Prakash Sharma, Mysore Sathyanarayana Srinivas
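The lock-address tracking behaviour above can be sketched as a small class: lock addresses are recorded on acquisition, and a sync is emitted ahead of a store only when the store targets a tracked lock address, i.e., when the store is the lock release. Class and method names are hypothetical:

```python
class LockAddressTracker:
    """Minimal sketch of the lock address tracking utility (LATU)."""

    def __init__(self):
        self.lock_addresses = set()      # table of acquired lock addresses

    def on_lock_acquire(self, addr):
        """Record the lock address when the lock is acquired."""
        self.lock_addresses.add(addr)

    def on_store(self, target_addr):
        """Return the instruction sequence to issue for this store: a sync
        precedes the store only when it targets a tracked lock address."""
        if target_addr in self.lock_addresses:
            self.lock_addresses.discard(target_addr)   # lock is being released
            return ["sync", f"store {target_addr:#x}"]
        return [f"store {target_addr:#x}"]
```

Ordinary stores thus proceed without a sync, which is the point of the scheme: the barrier cost is paid only at lock releases.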
Abstract: A technique for optimizing program instruction execution throughput in a central processing unit core (CPU). The CPU implements a simultaneous multithreading (SMT) operational mode wherein program instructions associated with at least two software threads are executed in parallel as hardware threads while sharing one or more hardware resources used by the CPU, such as cache memory, translation lookaside buffers, functional execution units, etc. As part of the SMT mode, the CPU implements an autothread (AT) operational mode. During the AT operational mode, a determination is made whether there is a resource conflict between the hardware threads that undermines instruction execution throughput. If a resource conflict is detected, the CPU adjusts the relative instruction execution rates of the hardware threads based on relative priorities of the software threads.
Type:
Application
Filed:
November 11, 2011
Publication date:
May 16, 2013
Applicant:
INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors:
Amit Merchant, Dipankar Sarma, Vaidyanathan Srinivasan
Abstract: A method and apparatus for processing multi-cycle instructions include picking a multi-cycle instruction and directing the picked multi-cycle instruction to a pipeline. The pipeline includes a pipeline control configured to detect a latency and a repeat rate of the picked multi-cycle instruction and to count clock cycles based on the detected latency and the detected repeat rate. The method and apparatus further include detecting the repeat rate and the latency of the picked multi-cycle instruction, and counting clock cycles based on the detected repeat rate and the latency of the picked multi-cycle instruction.
Type:
Application
Filed:
November 4, 2011
Publication date:
May 9, 2013
Applicant:
ADVANCED MICRO DEVICES, INC.
Inventors:
Ganesh Venkataramanan, Michael G. Butler
Abstract: A method, information processing system, and computer program product record an execution of a program instruction. A determination is made that a thread has entered a program unit. Another determination is made that the thread is associated with at least one attribute that matches a set of thread recording criteria. An instruction recording mechanism for the thread is dynamically activated in response to the at least one attribute of the thread matching the set of thread recording criteria.
Type:
Application
Filed:
November 4, 2011
Publication date:
May 9, 2013
Applicant:
INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors:
Christopher D. FILACHEK, Mei Hui WANG, Joshua B. WISNIEWSKI
Abstract: A computer employs a set of General Purpose Registers (GPRs). Each GPR comprises a plurality of portions. Programs such as an Operating System and Applications operating in a Large GPR mode access the full GPR; however, programs such as Applications operating in Small GPR mode only have access to a portion at a time. Instruction Opcodes, in Small GPR mode, may determine which portion is accessed.
Type:
Application
Filed:
December 26, 2012
Publication date:
May 9, 2013
Applicant:
INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventor:
International Business Machines Corporation
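The Large/Small GPR mode access rule can be sketched as follows; modelling the opcode's portion selection as a high_half flag and fixing the widths at 64/32 bits are assumptions, as the patent leaves both open:

```python
HALF_MASK = (1 << 32) - 1

def read_gpr(gprs, reg, large_mode, high_half=False):
    """Read GPR number reg from the register file gprs. In Large GPR mode
    the full 64-bit value is returned; in Small GPR mode only a 32-bit
    portion is visible, selected here by high_half (standing in for the
    opcode-driven portion selection the abstract describes)."""
    value = gprs[reg]
    if large_mode:
        return value
    return (value >> 32) & HALF_MASK if high_half else value & HALF_MASK
```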
Abstract: Disclosed are a method and system for optimized, dynamic data-dependent program execution. The disclosed system comprises a statistics computer which computes statistics of the incoming data at the current time instant, where the said statistics include the probability distribution of the incoming data, the probability distribution over program modules induced by the incoming data, the probability distribution induced over program outputs by the incoming data, and the time-complexity of each program module for the incoming data, wherein the said statistics are computed as a function of current and past data and previously computed statistics; a plurality of alternative execution path orders designed prior to run-time by the use of an appropriate source code; a source code selector which selects one of the execution path orders as a function of the statistics computed by the statistics computer; and a complexity measurement which measures the time-complexity of the currently selected execution path order.
Type:
Application
Filed:
December 21, 2012
Publication date:
May 9, 2013
Applicant:
International Business Machines Corporation
Inventor:
International Business Machines Corporation
Abstract: In an embodiment, if a self thread has more than one conflict, a transaction of the self thread is aborted and restarted. If the self thread has only one conflict and an enemy thread of the self thread has more than one conflict, the transaction of the self thread is committed. If the self thread only conflicts with the enemy thread and the enemy thread only conflicts with the self thread and the self thread has a key that has a higher priority than a key of the enemy thread, the transaction of the self thread is committed. If the self thread only conflicts with the enemy thread, the enemy thread only conflicts with the self thread, and the self thread has a key that has a lower priority than the key of the enemy thread, the transaction of the self thread is aborted.
Type:
Grant
Filed:
February 24, 2010
Date of Patent:
May 7, 2013
Assignee:
International Business Machines Corporation
Inventors:
Mark E. Giampapa, Thomas M. Gooding, Raul E. Silvera, Kai-Ting Amy Wang, Peng Wu, Xiaotong Zhuang
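The four commit/abort rules in the abstract above translate directly into a decision function. A sketch, assuming conflicts are given as sets of opposing threads and that a numerically larger key means higher priority — the patent only requires that keys be comparable:

```python
def resolve_conflict(self_conflicts, enemy_conflicts,
                     self_priority, enemy_priority):
    """Decide the fate of the self thread's transaction, following the
    abstract's rules in order."""
    if len(self_conflicts) > 1:
        # Rule 1: self has more than one conflict.
        return "abort-and-restart"
    if len(enemy_conflicts) > 1:
        # Rule 2: self has one conflict, but the enemy has several.
        return "commit"
    # Rules 3 and 4: a one-to-one conflict; the priority keys break the tie.
    if self_priority > enemy_priority:
        return "commit"
    return "abort"
```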
Abstract: A method and apparatus for providing fairness in a multi-processing element environment is herein described. Mask elements are utilized to associate portions of a reservation station with each processing element, while still allowing common access to another portion of reservation station entries. Additionally, bias logic biases selection of processing elements in a pipeline away from a processing element associated with a blocking stall to provide fair utilization of the pipeline.
Type:
Grant
Filed:
November 8, 2010
Date of Patent:
May 7, 2013
Assignee:
Intel Corporation
Inventors:
Morris Marden, Matthew Merten, Alexandre Farcy, Avinash Sodani, James Hadley, Ilhyun Kim
Abstract: Optimizing processing of a document sequentially processed by a plurality of image processing apparatuses that refer to an instruction document indicating the processing to be performed, and the respective security measures to be taken, by each of the plurality of image processing apparatuses.
Abstract: A multiprocessor executes a plurality of threads without decreasing execution efficiency. The multiprocessor includes a first processor allocating a different register file to each of a predetermined number of threads to be executed from among plural threads, and executing the predetermined number of threads in parallel; and a second processor performing processing according to a processing request made by the first processor. The first processor has areas allocated to the plurality of threads in one-to-one correspondence, and makes the processing request to the second processor according to an instruction included in one of the predetermined number of threads. Upon receiving, from the second processor, a request for writing a value resulting from the processing, the first processor judges whether the one thread is being executed and, when it is not, performs control such that the obtained value is written into the area allocated to the one thread.
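The write-back handling in the abstract above can be sketched as follows. This is a software analogy under stated assumptions: the class and method names are hypothetical, and direct delivery to the register file when the thread is still running is implied rather than spelled out in the abstract:

```python
# Hypothetical sketch: the first processor parks a returned value in the
# requesting thread's save area only when that thread is no longer executing.

class FirstProcessor:
    def __init__(self, n_threads):
        self.running = set()              # threads currently holding a register file
        self.areas = [None] * n_threads   # one save area per thread (one-to-one)

    def write_back(self, thread_id, value):
        if thread_id in self.running:
            return "write to register file"  # thread active: deliver directly
        self.areas[thread_id] = value        # thread swapped out: park the value
        return "write to save area"

p = FirstProcessor(4)
p.running.add(0)
print(p.write_back(0, 10))  # → write to register file
print(p.write_back(2, 20))  # → write to save area
```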
Abstract: Disclosed are machine processors and methods performed thereby. The processor has access to processing units for performing data processing and to libraries. Functions in the libraries are implementable to perform parallel processing and graphics processing. The processor may be configured to acquire (e.g., to download from a web server) a download script, possibly with extensions specifying bindings to library functions. Running the script may cause the processor to create, for each processing unit, contexts in which functions may be run, and to run, on the processing units and within a respective context, a portion of the download script. Running the script may also cause the processor to create, for a processing unit, a memory object, transfer data into that memory object, and transfer data back to the processor in such a way that a memory address of the data in the memory object is not returned to the processor.
Abstract: Methods and apparatuses are presented for graphics operations with thread count throttling, involving operating a processor to carry out multiple threads of execution, wherein the processor comprises at least one execution unit capable of supporting up to a maximum number of threads; obtaining a defined memory allocation size for allocating, in at least one memory device, a thread-specific memory space for the multiple threads; obtaining a per-thread memory requirement corresponding to the thread-specific memory space; determining a thread count limit based on the defined memory allocation size and the per-thread memory requirement; and sending a command to the processor to cause the processor to limit the number of threads carried out by the at least one execution unit to a reduced number of threads, the reduced number being less than the maximum number of threads.
Type:
Grant
Filed:
November 2, 2006
Date of Patent:
April 23, 2013
Assignee:
NVIDIA Corporation
Inventors:
Jerome F. Duluk, Jr., Bryon S. Nordquist
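The thread-count-limit determination in the entry above reduces to simple arithmetic. A minimal sketch, with assumed names and example figures not taken from the patent:

```python
# Sketch of the thread-count-limit computation: cap concurrency so that the
# per-thread memory spaces fit inside the defined allocation.

def thread_count_limit(memory_allocation_size, per_thread_requirement, max_threads):
    """Limit concurrent threads so thread-specific memory fits the allocation."""
    limit = memory_allocation_size // per_thread_requirement
    return min(limit, max_threads)

# Example: 64 KiB allocation, 3 KiB per thread, hardware maximum of 32 threads.
print(thread_count_limit(64 * 1024, 3 * 1024, 32))  # → 21
```

Here the reduced thread count (21) is below the hardware maximum (32), which is exactly the throttling case the abstract describes.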
Abstract: Various techniques for dynamically allocating instruction tags and using those tags are disclosed. These techniques may apply to processors supporting out-of-order execution and to architectures that support multiple threads. A group of instructions may be assigned a tag value from a pool of available tag values. A tag value may be usable to determine the program order of a group of instructions relative to other instructions in a thread. After the group of instructions has been (or is about to be) committed, the tag value may be freed so that it can be re-used on a second group of instructions. Tag values are dynamically allocated between threads; accordingly, a particular tag value or range of tag values is not dedicated to a particular thread.
Type:
Grant
Filed:
June 30, 2009
Date of Patent:
April 23, 2013
Assignee:
Oracle America, Inc.
Inventors:
Paul J. Jordan, Robert T. Golla, Jama I. Barreh
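The shared-pool tag allocation in the entry above can be illustrated with a short sketch. The class and method names are assumed for illustration; the key property shown is the one the abstract states, that tags are not dedicated to a thread and are freed for re-use after commit:

```python
# Hypothetical sketch of dynamic tag allocation from a pool shared by threads.

class TagPool:
    def __init__(self, n_tags):
        self.free = list(range(n_tags))  # pool shared by all threads

    def allocate(self, thread_id):
        # Any thread may take any available tag value.
        return self.free.pop(0) if self.free else None

    def release(self, tag):
        # Freed at (or near) commit so another group can re-use it.
        self.free.append(tag)

pool = TagPool(4)
t0 = pool.allocate(thread_id=0)    # → 0
t1 = pool.allocate(thread_id=1)    # → 1
pool.release(t0)                   # tag 0 returns to the shared pool
print(pool.allocate(thread_id=1))  # → 2
```

Note that tag 0, originally used by thread 0, is now available to thread 1 or any other thread, since no tag range is dedicated to a particular thread.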
Abstract: Mechanisms are provided for offloading a workload from a main thread to an assist thread. The mechanisms receive, in a fetch unit of a processor of the data processing system, a branch-to-assist-thread instruction of a main thread. The branch-to-assist-thread instruction informs hardware of the processor to look for an already spawned idle thread to be used as an assist thread. Hardware implemented pervasive thread control logic determines if one or more already spawned idle threads are available for use as an assist thread. The hardware implemented pervasive thread control logic selects an idle thread from the one or more already spawned idle threads if it is determined that one or more already spawned idle threads are available for use as an assist thread, to thereby provide the assist thread. In addition, the hardware implemented pervasive thread control logic offloads a portion of a workload of the main thread to the assist thread.
Type:
Grant
Filed:
May 12, 2010
Date of Patent:
April 16, 2013
Assignee:
International Business Machines Corporation
Inventors:
Ronald P. Hall, Hung Q. Le, Raul E. Silvera, Balaram Sinharoy
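The hardware flow in the entry above (find an already spawned idle thread, then offload part of the main thread's workload to it) has a natural software analogy. This sketch uses Python threads and queues purely as an illustration; the names are assumptions and the patent's mechanism is hardware thread control logic, not a software pool:

```python
# Software analogy of assist-thread offload: idle workers are spawned ahead
# of time; the main thread selects one and hands it a portion of its work.

import queue
import threading

def assist_worker(inbox):
    while True:
        work = inbox.get()
        if work is None:       # shutdown sentinel
            break
        work()                 # run the offloaded portion of the workload

# Spawn an idle thread ahead of time, as the branch-to-assist model assumes.
idle_pool = queue.Queue()
inbox = queue.Queue()
t = threading.Thread(target=assist_worker, args=(inbox,))
t.start()
idle_pool.put(inbox)

results = []
box = idle_pool.get_nowait()          # select an available idle thread
box.put(lambda: results.append(42))   # offload a portion of the workload
box.put(None)
t.join()
print(results)  # → [42]
```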
Abstract: Provided is a parallel distributed processing method executed by a computer system comprising a parallel-distributed-processing control server, a plurality of extraction processing servers, and a plurality of aggregation processing servers. The managed data includes a plurality of data items, including at least a first and a second data item, each of which includes a value. The method includes a step of extracting data from one of the plurality of chunks according to a value in the second data item, to thereby group the data; a step of merging groups having the same value in the second data item based on an order of a value in the first data item of data contained in a group among the groups; and a step of processing data in a group obtained through the merging by focusing on the order of the value in the first data item.
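The grouping and ordered-merge steps above can be sketched in miniature. The record layout and names here are assumptions for illustration; a record is reduced to a `(first item, second item)` pair:

```python
# Illustrative sketch: group records by the second data item, then order each
# group by the value of the first data item, as the merging step requires.

records = [(3, "a"), (1, "b"), (2, "a"), (4, "b")]  # (first item, second item)

# Extract: group data according to the value in the second data item.
groups = {}
for first, second in records:
    groups.setdefault(second, []).append((first, second))

# Merge: within each group, order by the value in the first data item.
merged = {k: sorted(v) for k, v in groups.items()}
print(merged["a"])  # → [(2, 'a'), (3, 'a')]
```

In the distributed setting the extraction and aggregation would run on separate servers over chunks of the managed data; this sketch only shows the data movement on one node.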
Abstract: Circuitry for receiving transaction requests from a plurality of masters and the masters themselves are disclosed. The circuitry comprises: an input port for receiving said transaction requests, at least one of said transaction requests received comprising an indicator indicating if said transaction is a speculative transaction; an output port for outputting a response to said master said transaction request was received from; and transaction control circuitry; wherein said transaction control circuitry is responsive to a speculative transaction request to determine a state of at least a portion of a data processing apparatus said circuitry is operating within and in response to said state being a predetermined state said transaction control circuitry generates a transaction cancel indicator and outputs said transaction cancel indicator as said response, said transaction cancel indicator indicating to said master that said speculative transaction will not be performed.
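The decision the transaction control circuitry makes above is compact: cancel a speculative request when the apparatus is in a predetermined state, otherwise perform it. A minimal sketch, with the state names and function signature assumed for illustration:

```python
# Hypothetical sketch of the interconnect behavior: a speculative transaction
# is answered with a cancel indicator when the system is in a given state.

def handle_request(is_speculative, system_state, busy_states=("congested",)):
    """Return the response the transaction control circuitry outputs."""
    if is_speculative and system_state in busy_states:
        return "cancel"    # master is told the speculative work will not run
    return "perform"

print(handle_request(True, "congested"))   # → cancel
print(handle_request(True, "idle"))        # → perform
print(handle_request(False, "congested"))  # → perform
```

The point of the cancel indicator is that the master, knowing its transaction was speculative, can simply drop it rather than retry.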
Abstract: One embodiment of the present invention sets forth a technique for performing a parallel scan operation with high computational efficiency in a single-instruction multiple-data (SIMD) environment. Each participating thread initially writes an extended region of a data array to initialize the region with an identity value. For example, a value of zero is used as the identity value for addition. The initialized region of the data array includes an initialized entry for every possible out of bounds index that may be computed in the normal course of the parallel scan operation. During the parallel scan operation each thread computes data array indices according to any technically appropriate technique. When a participating thread computes an index that would conventionally be out of bounds, the thread is able to retrieve an identity value from the initialized region of the data array rather than perform a bounds check that returns the identity value.
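The identity-padding trick above can be shown with a tiny sequential analogy. The constants and names here are assumptions; in the SIMD setting each index computation below would be done by one thread in parallel:

```python
# Sketch of the identity-padded scan: out-of-bounds indices land in a region
# pre-initialized with the identity value, so no bounds check is needed.

IDENTITY = 0    # identity value for addition
HALO = 4        # extended region sized for any possible out-of-bounds index

data = [1, 2, 3, 4]
padded = [IDENTITY] * HALO + data   # threads index padded[i + HALO]

def load(i):
    # An index that would conventionally be out of bounds (i < 0) retrieves
    # the identity value from the initialized region instead.
    return padded[i + HALO]

# One step of an inclusive scan with offset 2 (Hillis–Steele style):
offset = 2
step = [load(i - offset) + load(i) for i in range(len(data))]
print(step)  # → [1, 2, 4, 6]
```

For `i < offset` the left operand is the identity value 0, so the sum is just `data[i]`, which is exactly the behavior a bounds check returning the identity would produce, without the branch.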
Abstract: Disclosed is an information processing apparatus in which various kinds of information are processed in either a real-time processing mode or a non-real-time processing mode. The apparatus includes an operation display section to accept an inputted instruction, an image processing section to apply processing to image information, and a processor provided with a plurality of identical cores. A process that does not require real-time processing, related to the operation display section, is fixed onto one of the cores, so that that core is in charge of controlling it; a process that requires real-time processing, related to the image processing section, is fixed onto another of the cores, so that that core is in charge of controlling it.
Type:
Grant
Filed:
March 18, 2010
Date of Patent:
April 9, 2013
Assignee:
Konica Minolta Business Technologies, Inc.