Processing Control Patents (Class 712/220)

Arithmetic operation instruction processing (Class 712/221)

Floating point or vector (Class 712/222)

Logic operation instruction processing (Class 712/223)

Masking (Class 712/224)

Processing control for data transfer (Class 712/225)

Instruction modification based on condition (Class 712/226)

Specialized instruction processing in support of testing, debugging, emulation (Class 712/227)

Context preserving (e.g., context swapping, checkpointing, register windowing) (Class 712/228)

Mode switch or change (Class 712/229)

Generating next microinstruction address (Class 712/230)

Detecting end or completion of microprogram (Class 712/231)

Hardwired controller (Class 712/232)

Branching (e.g., delayed branch, loop control, branch predict, interrupt) (Class 712/233)

Processing sequence control (i.e., microsequencing) (Class 712/245)

Computer Instructions for Activating and Deactivating Operands

Publication number: 20130086363

Abstract: An instruction set architecture (ISA) includes instructions for selectively indicating last-use architected operands having values that will not be accessed again, wherein architected operands are made active or inactive after an instruction specified last-use by an instruction, wherein the architected operands are made active by performing a write operation to an inactive operand, wherein the activation/deactivation may be performed by the instruction having the last-use of the operand or another (prefix) instruction.

Type: Application

Filed: October 3, 2011

Publication date: April 4, 2013

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Michael K. Gschwind, Valentina Salapura
Tracking operand liveliness information in a computer system and performance function based on the liveliness information

Publication number: 20130086367

Abstract: Operand liveness state information is maintained during context switches for current architected operands of executing programs the current operand state information indicating whether corresponding current operands are any one of enabled or disabled for use by a first program module, the first program module comprising machine instructions of an instruction set architecture (ISA) for disabling current architected operands, wherein a current operand is accessed by a machine instruction of said first program module, the accessing comprising using the current operand state information to determine whether a previously stored current operand value is accessible by the first program module.

Type: Application

Filed: October 3, 2011

Publication date: April 4, 2013

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Michael K. Gschwind, Valentina Salapura
Managing a Register Cache Based on an Architected Computer Instruction Set Having Operand Last-User Information

Publication number: 20130086364

Abstract: A multi-level register hierarchy is disclosed comprising a first level pool of registers for caching registers of a second level pool of registers in a system wherein programs can dynamically release and re-enable architected registers such that released architected registers need not be maintained by the processor, the processor accessing operands from the first level pool of registers, wherein a last-use instruction is identified as having a last use of an architected register before being released, the last-use architected register being released causes the multi-level register hierarchy to discard any correspondence of an entry to said last use architected register.

Type: Application

Filed: October 3, 2011

Publication date: April 4, 2013

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Michael K. Gschwind, Valentina Salapura
Exploiting an Architected List-Use Operand Indication in a Computer System Operand Resource Pool

Publication number: 20130086365

Abstract: A pool of available physical registers are provided for architected registers, wherein operations are performed that activate and deactivate selected architected registers, such that the deactivated selected architected registers need not retain values, and physical registers can be deallocated to the pool, wherein deallocation of physical registers is performed after a last-use by a designated last-use instruction, wherein the last-use information is provided either by the last-use instruction or a prefix instruction, wherein reads to deallocated architecture registers return an architected default value.

Type: Application

Filed: October 3, 2011

Publication date: April 4, 2013

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Michael K. Gschwind, Valentina Salapura
FINE-GRAINED INSTRUCTION ENABLEMENT AT SUB-FUNCTION GRANULARITY

Publication number: 20130080745

Abstract: Fine-grained enablement at sub-function granularity. An instruction encapsulates different sub-functions of a function, in which the sub-functions use different sets of registers of a composite register file, and therefore, different sets of functional units. At least one operand of the instruction specifies which set of registers, and therefore, which set of functional units, is to be used in performing the sub-function. The instruction can perform various functions (e.g., move, load, etc.) and a sub-function of the function specifies the type of function (e.g., move-floating point; move-vector; etc.).

Type: Application

Filed: November 20, 2012

Publication date: March 28, 2013

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor: INTERNATIONAL BUSINESS MACHINES CORP
ABSTRACTING COMPUTATIONAL INSTRUCTIONS TO IMPROVE PERFORMANCE

Publication number: 20130080744

Abstract: Methods and systems for executing a code stream of non-native binary code on a computing system are disclosed. One method includes parsing the code stream to detect a plurality of elements including one or more branch destinations, and traversing the code stream to detect a plurality of non-native operators. The method also includes executing a pattern matching algorithm against the plurality of non-native operators to find combinations of two or more non-native operators that do not span across a detected branch destination and that correspond to one or more target operators executable by the computing system. The method further includes generating a second code stream executable on the computing system including the one or more target operators.

Type: Application

Filed: September 27, 2011

Publication date: March 28, 2013

Inventor: Andrew Ward Beale
Providing A Dedicated Communication Path Separate From A Second Path To Enable Communication Between Complaint Sequencers Of A Processor Using An Assertion Signal

Publication number: 20130080746

Abstract: In one embodiment, the present invention includes a method for communicating an assertion signal from a first instruction sequencer to a plurality of accelerators coupled to the first instruction sequencer, detecting the assertion signal in the accelerators and communicating a request for a lock, and registering an accelerator that achieves the lock by communication of a registration message for the accelerator to the first instruction sequencer. Other embodiments are described and claimed.

Type: Application

Filed: November 20, 2012

Publication date: March 28, 2013

Inventors: Perry Wang, Jamison Collins, Hong Wang
Coexistence of advanced hardware synchronization and global locks

Patent number: 8407455

Abstract: A computer-implemented method and article of manufacture is disclosed for enabling computer programs utilizing hardware transactional memory to safely interact with code utilizing traditional locks. A thread executing on a processor of a plurality of processors in a shared-memory system may initiate transactional execution of a section of code, which includes a plurality of access operations to the shared-memory, including one or more to locations protected by a lock. Before executing any operations accessing the location associated with the lock, the thread reads the value of the lock as part of the transaction, and only proceeds if the lock is not held. If the lock is acquired by another thread during transactional execution, the processor detects this acquisition, aborts the transaction, and attempts to re-execute it.

Type: Grant

Filed: July 28, 2009

Date of Patent: March 26, 2013

Assignee: Advanced Micro Devices, Inc.

Inventors: David S. Christie, Michael P. Hohmuth, Stephan Diestelhorst
Method and apparatus for enabling resource allocation identification at the instruction level in a processor system

Patent number: 8407451

Abstract: An information handling system includes a processor with multiple hardware units that generate program application load, store, and I/O interface requests to system busses within the information handling system. The processor includes a resource allocation identifier (RAID) that links the processor hardware unit initiating a system bus request with a specific resource allocation group. The resource allocation group assigns a specific bandwidth allocation rate to the initiating processor. When a load, store, or I/O interface bus request reaches the I/O bus for execution, the resource allocation manager restricts the amount of bandwidth associated with each I/O request by assigning discrete amounts of bandwidth to each successive I/O requester. Successive stages of the instruction pipeline in the hardware unit contain the resource allocation identifiers (RAID) linked to the specific load, store, or I/O instruction.

Type: Grant

Filed: February 6, 2007

Date of Patent: March 26, 2013

Assignee: International Business Machines Corporation

Inventors: Gavin Balfour Meil, Steven Leonard Roberts, Christopher John Spandikow
FINE-GRAINED INSTRUCTION ENABLEMENT AT SUB-FUNCTION GRANULARITY

Publication number: 20130073836

Abstract: Fine-grained enablement at sub-function granularity. An instruction encapsulates different sub-functions of a function, in which the sub-functions use different sets of registers of a composite register file, and therefore, different sets of functional units. At least one operand of the instruction specifies which set of registers, and therefore, which set of functional units, is to be used in performing the sub-function. The instruction can perform various functions (e.g., move, load, etc.) and a sub-function of the function specifies the type of function (e.g., move-floating point; move-vector; etc.).

Type: Application

Filed: September 16, 2011

Publication date: March 21, 2013

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Michael K. Gschwind, Brett Olsson, Valentina Salapura
Managing multiple threads in a single pipeline

Patent number: 8402253

Abstract: In one embodiment, the present invention includes a method for determining if an instruction of a first thread dispatched from a first queue associated with the first thread is stalled in a pipestage of a pipeline, and if so, dispatching an instruction of a second thread from a second queue associated with the second thread to the pipeline if the second thread is not stalled. Other embodiments are described and claimed.

Type: Grant

Filed: September 29, 2006

Date of Patent: March 19, 2013

Assignee: Intel Corporation

Inventors: Matthew Merten, Avinash Sodani, James Hadley, Alexandre Farcy, Iredamola Olopade
CONDITIONAL NON-BRANCH INSTRUCTION PREDICTION

Publication number: 20130067202

Abstract: A microprocessor processes conditional non-branch instructions that specify a condition and instruct the microprocessor to perform an operation if the condition is satisfied and otherwise to not perform the operation. A predictor provides a prediction about a conditional non-branch instruction. An instruction translator translates the conditional non-branch instruction into a no-operation microinstruction when the prediction predicts the condition will not be satisfied, and into a set of one or more microinstructions to unconditionally perform the operation when the prediction predicts the condition will be satisfied. An execution pipeline executes the no-operation microinstruction or the set of microinstructions. The predictor translates into a second set of one or more microinstructions to conditionally perform the operation when the prediction does not make a prediction.

Type: Application

Filed: March 6, 2012

Publication date: March 14, 2013

Applicant: VIA TECHNOLOGIES, INC.

Inventors: G. Glenn Henry, Terry Parks, Rodney E. Hooker
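
The three translation outcomes described in the abstract above can be illustrated with a minimal sketch. The names `Prediction` and `translate`, and the string form of the microinstructions, are hypothetical and chosen only for illustration; the abstract does not specify an encoding, and recovery from a wrong prediction is not modeled here.

```python
from enum import Enum


class Prediction(Enum):
    """Possible predictor outputs (hypothetical names)."""
    SATISFIED = 1       # condition predicted to hold
    NOT_SATISFIED = 2   # condition predicted not to hold
    NONE = 3            # predictor declines to predict


def translate(operation, condition, prediction):
    """Return the microinstruction list for one conditional non-branch instruction."""
    if prediction is Prediction.NOT_SATISFIED:
        # Predicted not satisfied: emit a single no-operation microinstruction.
        return ["nop"]
    if prediction is Prediction.SATISFIED:
        # Predicted satisfied: perform the operation unconditionally.
        return [operation]
    # No prediction: fall back to a microinstruction that checks the condition.
    return [f"{operation} if {condition}"]
```

For example, `translate("add r1, r2", "eq", Prediction.NONE)` yields a single conditional microinstruction, while the two predicted cases avoid evaluating the condition in the emitted microinstructions.
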
System and Method of Increasing Data Processing on a Diagnostic Tool

Publication number: 20130067131

Abstract: A method of processing J1850 requests using a scan tool having multiple processor systems is provided. The scan tool includes a first processor that processes data according to scan tool functions to assist with diagnosing and repairing a vehicle. A second processor receives data transmitted to the first processor and stores the data in a buffer. The second processor determines whether the data is complete to enable the first processor to make a determination regarding the data.

Type: Application

Filed: August 21, 2012

Publication date: March 14, 2013

Applicant: Service Solutions U.S. LLC

Inventor: David Vossen
TECHNIQUES FOR HANDLING DIVERGENT THREADS IN A MULTI-THREADED PROCESSING SYSTEM

Publication number: 20130061027

Abstract: This disclosure describes techniques for handling divergent thread conditions in a multi-threaded processing system. In some examples, a control flow unit may obtain a control flow instruction identified by a program counter value stored in a program counter register. The control flow instruction may include a target value indicative of a target program counter value for the control flow instruction. The control flow unit may select one of the target program counter value and a minimum resume counter value as a value to load into the program counter register. The minimum resume counter value may be indicative of a smallest resume counter value from a set of one or more resume counter values associated with one or more inactive threads. Each of the one or more resume counter values may be indicative of a program counter value at which a respective inactive thread should be activated.

Type: Application

Filed: September 7, 2011

Publication date: March 7, 2013

Applicant: QUALCOMM Incorporated

Inventors: Lin Chen, David Rigel Garcia Garcia, Andrew E. Gruber, Guofang Jiao
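
As a rough illustration of the selection step in the abstract above, the sketch below picks the next program counter value from the branch target and the resume counters of inactive threads. The abstract does not state the selection rule; choosing the smaller of the two values is an assumption made here, and the function names and the dictionary representation of resume counters are hypothetical.

```python
def next_program_counter(target_pc, resume_counters):
    """Choose the value to load into the program counter register.

    resume_counters maps an inactive thread id to the program counter
    value at which that thread should be reactivated.
    """
    if not resume_counters:
        # No inactive threads: simply follow the control flow target.
        return target_pc
    min_resume = min(resume_counters.values())
    # Assumption: take the smaller value so no inactive thread is skipped.
    return min(target_pc, min_resume)


def threads_to_activate(pc, resume_counters):
    """Return the ids of inactive threads whose resume counter equals pc."""
    return sorted(t for t, resume in resume_counters.items() if resume == pc)
```

With a target of 40 and inactive threads waiting at 24 and 32, this sketch would load 24 first, so the thread waiting there is reactivated before execution moves past its resume point.
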
CONFIGURABLE MASS DATA PORTIONING FOR PARALLEL PROCESSING

Publication number: 20130061026

Abstract: A configurable mass data portioning for parallel processing is described herein. One or more operation attributes are selected to participate in parallelization criteria. The values of the selected operation attributes for a number of operations are submitted to a specified algorithm to provide parallelization values corresponding to the operations. The parallelization values are applied to group the operations in comparable portions for parallel execution without conflicts.

Type: Application

Filed: September 5, 2011

Publication date: March 7, 2013

Inventors: ARTUR KAUFMANN, GEORG LANG
Register mapping in emulation of a target system on a host system

Patent number: 8392171

Abstract: Methods and systems for register mapping in emulation of a target system on a host system are disclosed. Statistics for use of a set of registers of a target system processor are determined. Based on the statistics a first subset of the target system registers, including one or more most commonly used registers is determined. The registers in the first subset are directly mapped to a first group of registers of a host system processor. A second subset of the set of target system registers is dynamically mapped to a second group of registers of the host system processor.

Type: Grant

Filed: August 12, 2010

Date of Patent: March 5, 2013

Assignee: Sony Computer Entertainment Inc.

Inventors: Stewart Sargaison, Victor Suba
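
The statistics-driven split described in the abstract above might look like the following minimal sketch. It assumes the usage statistics come from a simple trace of register names; `plan_register_mapping`, the trace format, and the register names are hypothetical and not taken from the patent.

```python
from collections import Counter


def plan_register_mapping(access_trace, host_direct_regs):
    """Split target registers into directly mapped and dynamically mapped sets.

    access_trace: iterable of target register names, one per observed use.
    host_direct_regs: host registers reserved for direct mapping.
    """
    usage = Counter(access_trace)
    # Rank target registers from most to least frequently used.
    ranked = [reg for reg, _count in usage.most_common()]
    hot = ranked[:len(host_direct_regs)]
    # The most commonly used target registers get a fixed host register.
    direct = dict(zip(hot, host_direct_regs))
    # The remaining registers would be mapped on demand at run time.
    dynamic = ranked[len(host_direct_regs):]
    return direct, dynamic
```

Only registers that appear in the trace are classified here; a fuller model would also place never-observed target registers in the dynamically mapped group.
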
Information processing device for causing a processor to context switch between threads including storing contexts based on next thread start position

Patent number: 8392932

Abstract: An information processing device for causing a processor to execute a plurality of threads by switching between them. Each thread performs a process in correspondence with an obtainment of an event. The information processing device, when causing a second thread to transit from a non-execution state to an execution state to replace a first thread, detects whether or not, in the first thread having transited to the non-execution state, a next start position of a process belongs to an already processed part, detects whether or not a start position of a process in the second thread in the execution state belongs to the processed part; and determines whether or not to set a context for execution of the second thread into the processor in accordance with detection results of the first and second detection units, and performs processing in accordance with the determination.

Type: Grant

Filed: May 13, 2009

Date of Patent: March 5, 2013

Assignee: Panasonic Corporation

Inventor: Takuji Kawamoto
Fast REP STOS using grabline operations

Patent number: 8392693

Abstract: A microprocessor includes a cache memory and a grabline instruction. The grabline instruction specifies a memory address that implicates a cache line of the memory. The grabline instruction instructs the microprocessor to initiate a zero-beat read-invalidate transaction on the bus to obtain ownership of the cache line. The microprocessor foregoes initiating the transaction on the bus when executing the grabline instruction if the microprocessor determines that a store to the cache line would cause an exception.

Type: Grant

Filed: May 17, 2010

Date of Patent: March 5, 2013

Assignee: VIA Technologies, Inc.

Inventors: G. Glenn Henry, Colin Eddy, Rodney E. Hooker
TRACKING A PROGRAMS CALLING CONTEXT USING A HYBRID CODE SIGNATURE

Publication number: 20130054942

Abstract: A method for a hybrid code signature including executing, via a processor, an application, the executing comprising executing a root instruction of the application; profiling, via the processor, the executing of the application, the profiling comprising storing a reference signature; determining, via the processor, a working signature of instructions executed subsequent to the executing of the root instruction, the determining comprising implementing a hashing function of the instructions in response to storing the reference signature; tracking the updating of the working signature by storing a value in a counter; and updating continuously, via the processor, the working signature with the hashing function while at least the working signature does not match the reference signature.

Type: Application

Filed: August 22, 2011

Publication date: February 28, 2013

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor: Mauricio J. Serrano
Method and system for enhancing computer processing performance

Patent number: 8387053

Abstract: A method of performing operations in a computer system, computer system, and related method of compilation, are disclosed. In one embodiment, the method of performing includes providing compiled code having at least one thread, where each of the at least one thread includes a respective plurality of blocks and each respective block includes a respective pre-fetch component and a respective execute component. The method also includes performing a first pre-fetch component from a first block of a first thread of the at least one thread, performing a first additional component after the first pre-fetch component has been performed, and performing a first execute component from the first block of the first thread. The first execute component is performed after the first additional component has been performed, and the first additional component is from either a second thread or another block of the first thread that is not the first block.

Type: Grant

Filed: January 25, 2007

Date of Patent: February 26, 2013

Assignee: Hewlett-Packard Development Company, L.P.

Inventors: Blaine D. Gaither, Verna Knapp, Jerome Huck, Benjamin D. Osecky
MULTI-THREADED DFA ARCHITECTURE

Publication number: 20130046954

Abstract: Disclosed is an architecture, system and method for performing multi-thread DFA descents on a single input stream. An executer performs DFA transitions from a plurality of threads each starting at a different point in an input stream. A plurality of executers may operate in parallel to each other and a plurality of thread contexts operate concurrently within each executer to maintain the context of each thread which is state transitioning. A scheduler in each executer arbitrates instructions for the thread into an at least one pipeline where the instructions are executed. Tokens may be output from each of the plurality of executers to a token processor which sorts and filters the tokens into dispatch order.

Type: Application

Filed: January 18, 2012

Publication date: February 21, 2013

Inventors: Michael Ruehle, Umesh Ramkrishnarao Kasture, Vinay Janardan Naik, Nayan Amrutlal Suthar, Robert J. McMillen
Processor including age tracking of issue queue instructions

Patent number: 8380964

Abstract: An information handling system includes a processor with an instruction issue queue (IQ) that may perform age tracking operations. The issue queue IQ maintains or stores instructions that may issue out-of-order in an internal data store IDS. The IDS organizes instructions in a queue position (QPOS) addressing arrangement. An age matrix of the IQ maintains a record of relative instruction aging for those instructions within the IDS. The age matrix updates latches or other memory cell data to reflect the changes in IDS instruction ages during a dispatch operation into the IQ. During dispatch of one or more instructions, the age matrix may update only those latches that require data change to reflect changing IDS instruction ages. The age matrix employs row and column data and clock controls to individually update those latches requiring update.

Type: Grant

Filed: April 3, 2009

Date of Patent: February 19, 2013

Assignee: International Business Machines Corporation

Inventors: James Wilson Bishop, Mary Douglass Brown, Jeffrey Carl Brownscheidle, Robert Allen Cordes, Maureen Anne Delaney, Jafar Nahidi, Dung Quoc Nguyen, Joel Abraham Silberman
Channel-based runtime engine for stream processing

Patent number: 8380965

Abstract: An apparatus to facilitate design of a stream processing flow that satisfies an objective, wherein the flow includes at least three processing groups, wherein a first processing group includes a data source and an operator, a second processing group includes a data source and an operator and a third processing group includes a join operator at its input and another operator, wherein data inside each group is organized by channels and each channel is a sequence of data, wherein an operator producing a data channel does not generate new data for the channel until old data of the channel is received by all other operators in the same group, and wherein data that flows from the first and second groups to the third group is done asynchronously and is stored in a queue if not ready for processing by an operator of the third group.

Type: Grant

Filed: June 16, 2009

Date of Patent: February 19, 2013

Assignee: International Business Machines Corporation

Inventors: Eric Bouillet, Hanhua Feng, Zhen Liu, Anton V. Riabov
BIT Splitting Instruction

Publication number: 20130042091

Abstract: An instruction specifies a source value and an offset value. Upon execution of the instruction, a first result of the instruction and a second result of the instruction are generated. The first result is a first portion of the source value and the second result is a second portion of the source value.

Type: Application

Filed: August 12, 2011

Publication date: February 14, 2013

Applicant: QUALCOMM INCORPORATED

Inventors: Mao Zeng, Lucian Codrescu, Erich James Plondke
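
One plausible reading of the abstract above is that the offset marks the bit position at which the source value is divided. The sketch below assumes a 32-bit source and assumes the first result is the bits below the offset and the second result is the bits at or above it; the abstract fixes neither the width nor which portion is which, and `bit_split` is a hypothetical name.

```python
WIDTH = 32  # assumed operand width in bits


def bit_split(source, offset):
    """Split source at bit position offset into (low portion, high portion)."""
    if not 0 <= offset <= WIDTH:
        raise ValueError("offset must be between 0 and WIDTH")
    source &= (1 << WIDTH) - 1           # keep only WIDTH bits
    low = source & ((1 << offset) - 1)   # bits below the offset
    high = source >> offset              # bits at or above the offset
    return low, high
```

Under these assumptions, `bit_split(0xABCD1234, 16)` produces both halves of the word from a single operation, where a conventional instruction set would need a mask and a separate shift.
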
GPU ASSISTED GARBAGE COLLECTION

Publication number: 20130036295

Abstract: A system and method for efficient garbage collection. A general-purpose central processing unit (CPU) sends a garbage collection request and a first log to a special processing unit (SPU). The first log includes an address and a data size of each allocated data object stored in a heap in memory corresponding to the CPU. The SPU has a single instruction multiple data (SIMD) parallel architecture and may be a graphics processing unit (GPU). The SPU efficiently performs operations of a garbage collection algorithm due to its architecture on a local representation of the data objects stored in the memory. The SPU records a list of changes it performs to remove dead data objects and compact live data objects. This list is subsequently sent to the CPU, which performs the included operations.

Type: Application

Filed: September 24, 2012

Publication date: February 7, 2013

Applicant: Advanced Micro Devices, Inc.

Inventor: Advanced Micro Devices, Inc.
Switching data pointers based on context

Patent number: 8370606

Abstract: Apparatus and methods for quickly switching active context between data pointer registers are disclosed. The apparatus can include a first register operable for storing a first data pointer and a second register operable for storing a second data pointer. A configuration register can provide a first signal specifying either the first or the second data pointer as an active data pointer. An instruction decoder can receive a data pointer instruction and output a second signal. The first and second signals can be independent from one another. Decoding logic coupled to the logic devices can output one of the first or second data pointers as the active data pointer in response to the first and second signals.

Type: Grant

Filed: March 16, 2007

Date of Patent: February 5, 2013

Assignee: Atmel Corporation

Inventors: Benjamin Francis Froemming, Emil Lambrache
Copy-propagate, propagate-post, and propagate-prior instructions for processing vectors

Patent number: 8370608

Abstract: The described embodiments provide a processor for generating a result vector with copied or propagated values from an input vector. During operation, the processor receives at least one input vector and a control vector. Using these vectors, the processor generates the result vector, which can contain copied propagated values from the input vector(s), depending on the value of the control vector. In addition, a predicate vector can be used to control the values that are written to the result vector.

Type: Grant

Filed: June 30, 2009

Date of Patent: February 5, 2013

Assignee: Apple Inc.

Inventors: Jeffry E. Gonion, Keith E. Diefendorff
Cache rollback acceleration via a bank based versioning cache ciruit

Patent number: 8370576

Abstract: An embodiment of the present invention includes a circuit for tracking memory operations with trace-based execution. Each trace includes a sequence of operations that includes zero or more of the memory operations. At least some of the active memory operations access the memory in an execution order that is different from the program order. The circuit includes a first memory that caches data accessed by the memory operations. This memory is partitioned into N banks. Checkpoint entries, which are stored in a second memory also partitioned into N banks, are associated with each trace. Each entry refers to a checkpoint location in the first memory. A sub-circuit receives rollback requests and responds by overwriting checkpoint locations. Each of the N memory units consisting of a bank in the first memory and the corresponding bank in the second memory may be rolled back independently and concurrently with other memory units.

Type: Grant

Filed: February 13, 2008

Date of Patent: February 5, 2013

Assignee: Oracle America, Inc.

Inventors: John Gregory Favor, Paul G. Chan, Graham Ricketson Murphy, Joseph Byron Rowlands
Power efficient system for recovering an architecture register mapping table

Patent number: 8370607

Abstract: A system for recovering an architecture register mapping table (ARMT). The system includes a first number of collection circuits and decode circuits, a second number of selection circuits, and an enable circuit. Information related to the mapping between each physical register and an appropriate architecture register is obtained from a physical register mapping table (PRMT) by one and only one collection circuit during only one of a fourth number of instruction cycles. Each decode circuit has its input coupled to the output of one different collection circuit and is capable of converting its input into a third number bit wide binary string selection code at its output. Each selection circuit is configured to receive from each selection code a bit from a bit position associated with that selection circuit. The enable circuit is configured to appropriately enable mapping of information from the PRMT to the ARMT.

Type: Grant

Filed: December 23, 2009

Date of Patent: February 5, 2013

Assignee: STMicroelectronics (Beijing) R&D Co. Ltd.

Inventors: Hong-Xia Sun, Kai-Feng Wang, Peng-Fei Zhu, Yong-Qiang Chris Wu
Dynamically monitoring and rebalancing resource allocation of monitored processes based on execution rates of measuring processes at multiple priority levels

Patent number: 8365177

Abstract: Measuring processes are started at a plurality of priority levels. A different one of the measuring processes is started for each of the priority levels. Subsequently, for each of the measuring processes, it is determined whether each measuring process is scheduled for executing at a respective target rate. In response to determining that a particular measuring process of the measuring processes is not scheduled for executing at a particular target rate, resource allocation to at least one monitored process running at a particular level of the priority levels is adjusted. The at least one monitored process is not any of the measuring processes.

Type: Grant

Filed: January 20, 2009

Date of Patent: January 29, 2013

Assignee: Oracle International Corporation

Inventors: Wilson Chan, Angelo Pruscino, Ahmed S. Abbas, Tak Fung Wang
RELAXATION OF SYNCHRONIZATION FOR ITERATIVE CONVERGENT COMPUTATIONS

Publication number: 20130024662

Abstract: Systems and methods are disclosed that allow atomic updates to global data to be at least partially eliminated to reduce synchronization overhead in parallel computing. A compiler analyzes the data to be processed to selectively permit unsynchronized data transfer for at least one type of data. A programmer may provide a hint to expressly identify the type of data that are candidates for unsynchronized data transfer. In one embodiment, the synchronization overhead is reducible by generating an application program that selectively substitutes codes for unsynchronized data transfer for a subset of codes for synchronized data transfer. In another embodiment, the synchronization overhead is reducible by employing a combination of software and hardware by using relaxation data registers and decoders that collectively convert a subset of commands for synchronized data transfer into commands for unsynchronized data transfer.

Type: Application

Filed: July 18, 2011

Publication date: January 24, 2013

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Lakshminarayanan Renganarayana, Vijayalakshmi Srinivasan
Semiconductor device and data processing method performed by semiconductor device to perform a repeat operation within a reconfigurable pipeline

Patent number: 8359457

Abstract: The semiconductor device includes a controller and a plurality of dynamically reconfigurable circuits connected to one another in series below the controller to perform operations in the manner of a pipeline. The controller inputs data and reconfiguration information to the first one of the dynamically reconfigurable circuits. Each of the dynamically reconfigurable circuits includes a processing unit that performs a data computation, an updating unit that updates the reconfiguration information, and a repetition controlling unit that determines whether to repeat the computation and controls the data and the reconfiguration information.

Type: Grant

Filed: February 17, 2009

Date of Patent: January 22, 2013

Assignee: Kabushiki Kaisha Toshiba

Inventors: Takashi Yoshikawa, Shigehiro Asano
Using hardware support to reduce synchronization costs in multithreaded applications

Patent number: 8359459

Abstract: A processor configured to synchronize threads in multithreaded applications. The processor includes first and second registers. The processor stores a first bitmask in the first register and a second bitmask in the second register. For each bitmask, each bit corresponds with one of multiple threads. A given bit in the first bitmask indicates the corresponding thread has been assigned to execute a portion of a unit of work. A corresponding bit in the second bitmask indicates the corresponding thread has completed execution of its assigned portion of the unit of work. The processor receives updates to the second bitmask in the second register and provides an indication that the unit of work has been completed in response to detecting that for each bit in the first bitmask that corresponds to a thread that is assigned work, a corresponding bit in the second bitmask indicates its corresponding thread has completed its assigned work.

Type: Grant

Filed: May 27, 2008

Date of Patent: January 22, 2013

Assignee: Oracle America, Inc.

Inventor: Darryl J. Gove
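
The completion check described in the abstract above reduces to a comparison of two bitmasks. The following is a minimal software sketch of that comparison, not the patented hardware; the function names `work_complete` and `mark_done` are hypothetical and chosen only for illustration.

```python
def work_complete(assigned_mask: int, completed_mask: int) -> bool:
    """True when every thread flagged in assigned_mask is also flagged in completed_mask."""
    # Bits that are assigned but not yet completed; zero means the unit of work is done.
    return (assigned_mask & ~completed_mask) == 0


def mark_done(completed_mask: int, thread_id: int) -> int:
    """Return completed_mask with the bit for thread_id set (hypothetical helper)."""
    return completed_mask | (1 << thread_id)


assigned = 0b1011            # threads 0, 1 and 3 were given a portion of the work
completed = 0
for tid in (0, 3):
    completed = mark_done(completed, tid)
print(work_complete(assigned, completed))   # False: thread 1 has not finished
completed = mark_done(completed, 1)
print(work_complete(assigned, completed))   # True
```

Bits set in the second mask for threads that were never assigned work do not affect the result, which matches the abstract's wording that only assigned threads are checked.
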
Minimizing code duplication in an unbounded transactional memory system by using mode agnostic transactional read and write barriers

Patent number: 8356166

Abstract: Minimizing code duplication in an unbounded transactional memory system. A computing apparatus including one or more processors in which it is possible to use a set of common mode-agnostic TM barrier sequences that runs on legacy ISA and extended ISA processors, and that employs hardware filter indicators (when available) to filter redundant applications of TM barriers, and that enables a compiled binary representation of the subject code to run correctly in any of the currently implemented set of transactional memory execution modes, including running the code outside of a transaction, and that enables the same compiled binary to continue to work with future TM implementations which may introduce as yet unknown future TM execution modes.

Type: Grant

Filed: June 26, 2009

Date of Patent: January 15, 2013

Assignee: Microsoft Corporation

Inventors: Ali-Reza Adl-Tabatabai, Bratin Saha, Gad Sheaffer, Vadim Bassin, Robert Y. Geva, Martin Taillefer, Darek Mihocka, Burton Jordan Smith, Jan Gray
Execution unit with data dependent conditional write instructions

Patent number: 8356162

Abstract: An execution unit supports data dependent conditional write instructions that write data to a target only when a particular condition is met. In one implementation, a data dependent conditional write instruction identifies a condition as well as data to be tested against that condition. The data is tested against that condition, and the result of the test is used to selectively enable or disable a write to a target associated with the data dependent conditional write instruction. Then, a write is attempted while the write to the target is enabled or disabled such that the write will update the contents of the target only when the write is selectively enabled as a result of the test. By doing so, dependencies are typically avoided, as is use of an architected condition register that might otherwise introduce branch prediction mispredict penalties, enabling improved performance with z-buffer test and similar types of algorithms.

Type: Grant

Filed: March 18, 2008

Date of Patent: January 15, 2013

Assignee: International Business Machines Corporation

Inventors: Adam James Muff, Matthew Ray Tubbs
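
As a rough software analogy for the data dependent conditional write in the abstract above, the sketch below tests a value against a condition and uses the result as a write enable for the target. The names `conditional_write` and `z_test_less`, and the choice of a "less than" depth test, are assumptions made for illustration; the patent describes an execution unit, not this code.

```python
def z_test_less(new_depth: float, stored_depth: float) -> bool:
    """Assumed condition: the new fragment is closer than the stored one."""
    return new_depth < stored_depth


def conditional_write(target: list, index: int, value: float, condition) -> bool:
    """Attempt a write to target[index]; it takes effect only if the test enables it."""
    enable = condition(value, target[index])
    # The write is always "attempted"; the enable decides whether the target changes.
    target[index] = value if enable else target[index]
    return enable


z_buffer = [5.0, 5.0]
conditional_write(z_buffer, 0, 3.0, z_test_less)   # closer fragment: write enabled
conditional_write(z_buffer, 1, 7.0, z_test_less)   # farther fragment: write disabled
print(z_buffer)   # [3.0, 5.0]
```
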
SIMD microprocessor and method for controlling variable sized image data processing

Patent number: 8356163

Abstract: A disclosed SIMD microprocessor includes plural processor elements each having n arithmetic circuits and n registers configured to temporarily store data pieces to be input to the arithmetic circuits, n being a natural number equal to or greater than 2; and a control circuit configured to determine an arrangement order of the processor elements and an arrangement order of the arithmetic circuits in the processor elements and determine whether to use the n arithmetic circuits as a single arithmetic circuit or as n arithmetic circuits. Each processor element further includes n shifter pairs each including a PE shifter and a bit shifter; and n shift data selection circuits configured to select arbitrary data pieces from the data pieces in the shifter pairs, perform bit extension on the data pieces, and transfer the data pieces to the arithmetic circuits.

Type: Grant

Filed: June 24, 2008

Date of Patent: January 15, 2013

Assignee: Ricoh Company, Ltd.

Inventor: Toshiki Yamanaka
Shift-in-right instructions for processing vectors

Patent number: 8356164

Abstract: The described embodiments provide a processor for generating a result vector with shifted values from an input vector. During operation, the processor receives an input vector and a control vector. Using these vectors, the processor generates the result vector, which can contain shifted values or propagated values from the input vector, depending on the value of the control vector. In addition, a predicate vector can be used to control the values that are written to the result vector.

Type: Grant

Filed: June 30, 2009

Date of Patent: January 15, 2013

Assignee: Apple Inc.

Inventors: Jeffry E. Gonion, Keith E. Diefendorff
MULTI-THREAD PROCESSOR AND ITS HARDWARE THREAD SCHEDULING METHOD

Publication number: 20130013900

Abstract: A multi-thread processor includes a plurality of hardware threads each of which generates an independent instruction flow, a first thread scheduler that outputs a first thread selection signal, the first thread selection signal designating a hardware thread to be executed in a next execution cycle among the plurality of hardware threads according to a priority rank, the priority rank being established in advance for each of the plurality of hardware threads, a first selector that selects one of the plurality of hardware threads according to the first thread selection signal and outputs an instruction generated by the selected hardware thread, and an execution pipeline that executes an instruction output from the first selector. Whenever the hardware thread is executed in the execution pipeline, the first scheduler updates the priority rank for the executed hardware thread and outputs the first thread selection signal in accordance with the updated priority rank.

Type: Application

Filed: September 14, 2012

Publication date: January 10, 2013

Applicant: Renesas Electronics Corporation

Inventors: Koji Adachi, Teppei Oomoto
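
The scheduling loop in the abstract above (select by priority rank, execute, update the rank of the executed thread) can be sketched as follows. The abstract does not state how the rank is updated, so this sketch assumes the executed thread drops to the lowest rank; that policy, the initial ranking by thread id, and the class name `PriorityScheduler` are all illustrative assumptions.

```python
class PriorityScheduler:
    """Toy model of a thread scheduler driven by a priority ranking."""

    def __init__(self, num_threads: int):
        # order[0] holds the highest-ranked hardware thread (assumed initial ranking).
        self.order = list(range(num_threads))

    def select(self) -> int:
        """Thread selection signal: the thread to execute in the next cycle."""
        return self.order[0]

    def executed(self, thread_id: int) -> None:
        """Assumed update policy: the executed thread moves to the lowest rank."""
        self.order.remove(thread_id)
        self.order.append(thread_id)

    def run(self, cycles: int) -> list:
        trace = []
        for _ in range(cycles):
            tid = self.select()
            trace.append(tid)
            self.executed(tid)
        return trace


print(PriorityScheduler(3).run(5))   # [0, 1, 2, 0, 1]
```

Under this assumed policy the scheduler degenerates to round-robin; a different update rule would give a different interleaving.
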
Using Hardware Transaction Primitives for Implementing Non-Transactional Escape Actions Inside Transactions

Publication number: 20130013899

Abstract: Mechanisms are provided for performing escape actions within transactions. These mechanisms execute a transaction comprising a transactional section and an escape action. The transactional section is comprised of one or more instructions that are to be executed in an atomic manner as part of the transaction. The escape action is comprised of one or more instructions to be executed in a non-transactional manner. These mechanisms further populate at least one actions list data structure, associated with a thread of the data processing system that is executing the transaction, with one or more actions associated with the escape action. Moreover, these mechanisms execute one or more actions in the actions list data structure based upon whether the transaction commits successfully or is aborted.

Type: Application

Filed: July 6, 2011

Publication date: January 10, 2013

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Christopher M. Barton, Harold W. Cain, III, Bradly G. Frey, Hung Q. Le, Maged M. Michael, Raul E. Silvera, Derek E. Williams, Michael Wong, Peng Wu
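
One way to picture the actions list in the abstract above is a per-transaction record of deferred callbacks, where the escape action itself runs immediately and the recorded actions run once the outcome is known. The sketch below is a plain software model under that reading; the class `Transaction` and its methods `escape` and `finish` are hypothetical names, and no hardware transactional memory is involved.

```python
class Transaction:
    """Toy model: escape actions run at once, follow-up actions run at commit or abort."""

    def __init__(self):
        self.on_commit = []   # actions list for a successful commit
        self.on_abort = []    # actions list for an abort

    def escape(self, action, on_commit=None, on_abort=None):
        """Run `action` non-transactionally and record its follow-up actions."""
        result = action()
        if on_commit is not None:
            self.on_commit.append(on_commit)
        if on_abort is not None:
            self.on_abort.append(on_abort)
        return result

    def finish(self, committed: bool) -> None:
        """Execute the recorded actions that match the transaction outcome."""
        for act in (self.on_commit if committed else self.on_abort):
            act()
        self.on_commit.clear()
        self.on_abort.clear()


log = []
tx = Transaction()
tx.escape(lambda: log.append("alloc"), on_abort=lambda: log.append("free"))
tx.finish(committed=False)
print(log)   # ['alloc', 'free']
```
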
Managing Multiple Threads In A Single Pipeline

Publication number: 20130013898

Abstract: In one embodiment, the present invention includes a method for determining if an instruction of a first thread dispatched from a first queue associated with the first thread is stalled in a pipestage of a pipeline, and if so, dispatching an instruction of a second thread from a second queue associated with the second thread to the pipeline if the second thread is not stalled. Other embodiments are described and claimed.

Type: Application

Filed: September 13, 2012

Publication date: January 10, 2013

Inventors: Matthew Merten, Avinash Sodani, James Hadley, Alexander Farcy, Iredamola Olopade
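
The dispatch rule in the abstract above (fall back to the second thread's queue when the first thread is stalled) can be modeled in a few lines. This is a sketch assuming exactly two threads and per-thread stall flags supplied by the caller; the function name `dispatch` and its parameters are illustrative, not taken from the application.

```python
from collections import deque


def dispatch(queues, stalled, preferred=0):
    """Pick the next instruction for the pipeline.

    queues:  two deques of pending instructions, one per thread
    stalled: two booleans, True if that thread is stalled in a pipestage
    Returns (thread_id, instruction), or None if nothing can be dispatched.
    """
    other = 1 - preferred
    for tid in (preferred, other):
        if not stalled[tid] and queues[tid]:
            return tid, queues[tid].popleft()
    return None


queues = [deque(["a0", "a1"]), deque(["b0"])]
print(dispatch(queues, stalled=[True, False]))   # (1, 'b0'): thread 0 is stalled
print(dispatch(queues, stalled=[False, False]))  # (0, 'a0')
print(dispatch(queues, stalled=[True, True]))    # None
```
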
MULTI-CORE IMAGE PROCESSOR FOR PORTABLE DEVICE

Publication number: 20130013839

Abstract: A portable handheld device including a CPU for processing a script; a multi-core processor for processing an image; an input buffer for receiving data for processing by the multi-core processor, the input buffer being provided under the control of the multi-core processor to send data thereto; and an output buffer for receiving data processed by the multi-core processor, the output buffer being provided under the control of the multi-core processor to receive data therefrom. The multi-core processor comprises a plurality of micro-coded processing units. The CPU is configured with authority to clear and query the input and output buffers.

Type: Application

Filed: September 15, 2012

Publication date: January 10, 2013

Inventor: Kia Silverbrook
Coordinating chores in a multiprocessing environment using a compiler generated exception table

Patent number: 8352711

Abstract: The coordination and execution of chores in a multiprocessing environment. The coordination of chores is accomplished utilizing a compiler generated correlation that relates blocks of code that execute chores and blocks of code in which the chore can be realized. By tracking the execution of the program and using the compiler-generated correlation, chores can be identified for the currently executing code.

Type: Grant

Filed: January 22, 2008

Date of Patent: January 8, 2013

Assignee: Microsoft Corporation

Inventors: Eric Dean Tribble, Mark Ronald Plesko, Christopher Wellington Brumme
CHECKING THE INTEGRITY OF A PROGRAM EXECUTED BY AN ELECTRONIC CIRCUIT

Publication number: 20130007420

Abstract: A method for checking the integrity of a program executed by an electronic circuit and including at least one conditional jump, wherein: a first value is updated for any instruction which does not correspond to a jump instruction; a second value is updated with the first value for each conditional jump instruction; and the second value is compared with a third value, calculated according to the performed conditional jumps.

Type: Application

Filed: June 15, 2012

Publication date: January 3, 2013

Applicant: Proton World International N.V.

Inventors: Gilles Van Assche, Ronny Vankeer
COMPUTER IMPLEMENTED METHOD OF ELECTING K EXTREME ENTRIES FROM A LIST USING SEPARATE SECTION COMPARISONS

Publication number: 20130007419

Abstract: A computer implemented method selects K extreme elements of a list of N elements by partitioning each of the N elements into a plurality of sections. For each section from a most significant section to a least significant section the method selects a threshold selection determining at least K extreme entries from the list. This iteratively compares a corresponding section to a section threshold, counts a number of sections which are more extreme than the section threshold, increasing (or decreasing) the section threshold if the count is greater than K and decreasing the section threshold if the count is less than K. After finding the section threshold for the corresponding section the method forms a combined threshold by concatenation of said section thresholds in order from a most significant section to a least significant section, compares each of the N elements to the combined threshold, and selects at least K elements from the set of N elements more extreme than the combined threshold.

Type: Application

Filed: April 12, 2012

Publication date: January 3, 2013

Applicant: Texas Instruments Incorporated

Inventors: Constantin Bajenaru, Michael Livshitz, Mingjian Yan, Jing Jiang
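
The section-by-section thresholding in the abstract above resembles a radix-style selection. The sketch below is a simplified variant, not the claimed procedure: it treats "extreme" as "largest", assumes non-negative integers that fit in `total_bits`, assumes `total_bits` is a multiple of `section_bits` and `k` is at most the list length, and scans section values linearly instead of iteratively raising and lowering a section threshold. All names are hypothetical.

```python
def kth_largest_threshold(values, k, section_bits=4, total_bits=16):
    """Build, from the most significant section down, the largest threshold T
    such that at least k values are >= T."""
    threshold = 0
    for shift in range(total_bits - section_bits, -1, -section_bits):
        # Try section values from high to low; keep the first that still leaves >= k values.
        for section in range((1 << section_bits) - 1, -1, -1):
            candidate = threshold | (section << shift)
            if sum(v >= candidate for v in values) >= k:
                threshold = candidate
                break
    return threshold


def select_k_largest(values, k, section_bits=4, total_bits=16):
    """Return at least k elements: every value at or above the combined threshold."""
    t = kth_largest_threshold(values, k, section_bits, total_bits)
    return [v for v in values if v >= t]


data = [5, 200, 17, 4096, 300, 17]
print(kth_largest_threshold(data, 3))   # 200
print(select_k_largest(data, 3))        # [200, 4096, 300]
print(select_k_largest(data, 4))        # [200, 17, 4096, 300, 17] (ties give more than k)
```

The combined threshold is the concatenation of the per-section choices, and ties at the threshold are why the result can contain more than K elements.
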
SYSTEM AND METHOD FOR POWER OPTIMIZATION

Publication number: 20120331319

Abstract: A technique for reducing the power consumption required to execute processing operations. A processing complex, such as a CPU or a GPU, includes a first set of cores comprising one or more fast cores and a second set of cores comprising one or more slow cores. A processing mode of the processing complex can switch between a first mode of operation and a second mode of operation based on one or more of the workload characteristics, performance characteristics of the first and second sets of cores, power characteristics of the first and second sets of cores, and operating conditions of the processing complex. A controller causes the processing operations to be executed by either the first set of cores or the second set of cores to achieve the lowest total power consumption.

Type: Application

Filed: September 5, 2012

Publication date: December 27, 2012

Inventors: John George Mathieson, Phil Carmack, Brian Smith
SYSTEM AND METHOD FOR POWER OPTIMIZATION

Publication number: 20120331275

Abstract: A technique for reducing the power consumption required to execute processing operations. A processing complex, such as a CPU or a GPU, includes a first set of cores comprising one or more fast cores and a second set of cores comprising one or more slow cores. A processing mode of the processing complex can switch between a first mode of operation and a second mode of operation based on one or more of the workload characteristics, performance characteristics of the first and second sets of cores, power characteristics of the first and second sets of cores, and operating conditions of the processing complex. A controller causes the processing operations to be executed by either the first set of cores or the second set of cores to achieve the lowest total power consumption.

Type: Application

Filed: September 5, 2012

Publication date: December 27, 2012

Inventors: John George Mathieson, Phil Carmack, Brian Smith
Dynamic allocation of resources in a threaded, heterogeneous processor

Patent number: 8335911

Abstract: Systems and methods for efficient dynamic utilization of shared resources in a processor. A processor comprises a front end pipeline, an execution pipeline, and a commit pipeline, wherein each pipeline comprises a shared resource with entries configured to be allocated for use in each clock cycle by each of a plurality of threads supported by the processor. To avoid starvation of any active thread, the processor further comprises circuitry configured to ensure each active thread is able to allocate at least a predetermined quota of entries of each shared resource. Each pipe stage of a total pipeline for the processor may include at least one dynamically allocated shared resource configured not to starve any active thread. Dynamic allocation of shared resources between a plurality of threads may yield higher performance over static allocation. In addition, dynamic allocation may require relatively little overhead for activation/deactivation of threads.

Type: Grant

Filed: September 30, 2009

Date of Patent: December 18, 2012

Assignee: Oracle America, Inc.

Inventors: Robert T. Golla, Gregory F. Grohoski
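
The anti-starvation guarantee in the abstract above can be illustrated with a small allocator that refuses a grant whenever it would eat into entries other active threads still need to reach their quota. This is a software sketch under stated assumptions: a single shared resource, one uniform quota, and a capacity of at least quota times the number of active threads at activation time. The class `SharedResource` and its methods are hypothetical.

```python
class SharedResource:
    """Toy model of a dynamically shared structure with a per-thread minimum quota."""

    def __init__(self, capacity: int, quota: int):
        self.capacity = capacity
        self.quota = quota
        self.used = {}            # active thread id -> entries currently held

    def activate(self, tid: int) -> None:
        self.used.setdefault(tid, 0)

    def deactivate(self, tid: int) -> None:
        self.used.pop(tid, None)  # entries of a deactivated thread become free

    def try_allocate(self, tid: int) -> bool:
        free = self.capacity - sum(self.used.values())
        # Entries that must stay available so every other active thread can reach its quota.
        reserved = sum(max(0, self.quota - n)
                       for t, n in self.used.items() if t != tid)
        if free - 1 >= reserved:
            self.used[tid] += 1
            return True
        return False


res = SharedResource(capacity=8, quota=2)
res.activate(0)
res.activate(1)
grants_t0 = sum(res.try_allocate(0) for _ in range(10))
grants_t1 = sum(res.try_allocate(1) for _ in range(3))
print(grants_t0, grants_t1)   # 6 2
```

Thread 0 can grow well past its quota while thread 1 is idle, yet two entries always remain available for thread 1, which is the dynamic-versus-static trade-off the abstract points to.
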
Primitives to enhance thread-level speculation

Patent number: 8332619

Abstract: A processor may include an address monitor table and an atomic update table to support speculative threading. The processor may also include one or more registers to maintain state associated with execution of speculative threads. The processor may support one or more of the following primitives: an instruction to write to a register of the state, an instruction to trigger the committing of buffered memory updates, an instruction to read a status register of the state, and/or an instruction to clear one of the state bits associated with trap/exception/interrupt handling. Other embodiments are also described and claimed.

Type: Grant

Filed: December 8, 2011

Date of Patent: December 11, 2012

Assignee: Intel Corporation

Inventors: Quinn A. Jacobson, Hong Wang, John P. Shen, Gautham N. Chinya, Per Hammarlund, Xiang Zou, Bryant Bigbee, Shivnandan D. Kaushik
Out-of-order X86 microprocessor with fast shift-by-zero handling

Patent number: 8332618

Abstract: An out-of-order execution microprocessor includes a register alias table configured to generate a first indicator that indicates whether an instruction is dependent upon a condition code result of a shift instruction. The microprocessor also includes a first execution unit configured to execute the shift instruction and to generate a second indicator that indicates whether a shift amount of the shift instruction is zero. The microprocessor also includes a second execution unit configured to receive the first and second indicators and to generate a replay signal to cause the instruction to be replayed if the first indicator indicates the instruction is dependent upon the condition code result of the shift instruction and a second indicator indicates the shift amount of the shift instruction is zero.

Type: Grant

Filed: December 9, 2009

Date of Patent: December 11, 2012

Assignee: VIA Technologies, Inc.

Inventors: Gerard M. Col, Matthew Daniel Day, Terry Parks, Bryan Wayne Pogor
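
The replay condition in the abstract above is the conjunction of the two indicators. The sketch below models only that decision; it additionally assumes a 32-bit operand, for which x86 masks the shift count to its low five bits, and it relies on the x86 rule that a shift with a (masked) count of zero leaves the flags unchanged. The function name `needs_replay` is hypothetical.

```python
def needs_replay(depends_on_shift_flags: bool, shift_count: int) -> bool:
    """Decide whether a dependent instruction must be replayed.

    depends_on_shift_flags: first indicator, the instruction consumes the
                            condition code result of the shift
    shift_count:            raw shift count operand of the shift instruction
    """
    masked = shift_count & 0x1F          # assumption: 32-bit operand size
    shift_amount_is_zero = masked == 0   # second indicator
    return depends_on_shift_flags and shift_amount_is_zero


print(needs_replay(True, 0))    # True: the shift did not update the flags
print(needs_replay(True, 3))    # False
print(needs_replay(False, 0))   # False: no dependency on the shift's flags
```
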
Method for switching a selected task to be executed according with an output from task selecting circuit

Patent number: 8327379

Abstract: A task processor includes a CPU, a save circuit, and a task control circuit. The CPU is provided with a processing register and an execution control circuit operative to load data from a memory into the processing register and execute a task in accordance with the data in the processing register. The task control circuit is provided with a task selecting circuit and state storage units respectively associated with tasks. When executing a predetermined system call instruction, the execution control circuit notifies the task control circuit of that execution. Upon being notified of the execution of the system call instruction, the task control circuit switches the task to be executed next in accordance with an output from the task selecting circuit. The task selecting circuit selects a task in accordance with an output from the state storage units.

Type: Grant

Filed: August 24, 2006

Date of Patent: December 4, 2012

Assignee: Kernelon Silicon Inc.

Inventor: Naotaka Maruyama
