Processing Control Patents (Class 712/220)

Arithmetic operation instruction processing (Class 712/221)

Floating point or vector (Class 712/222)

Logic operation instruction processing (Class 712/223)

Masking (Class 712/224)

Processing control for data transfer (Class 712/225)

Instruction modification based on condition (Class 712/226)

Specialized instruction processing in support of testing, debugging, emulation (Class 712/227)

Context preserving (e.g., context swapping, checkpointing, register windowing) (Class 712/228)

Mode switch or change (Class 712/229)

Generating next microinstruction address (Class 712/230)

Detecting end or completion of microprogram (Class 712/231)

Hardwired controller (Class 712/232)

Branching (e.g., delayed branch, loop control, branch predict, interrupt) (Class 712/233)

Processing sequence control (i.e., microsequencing) (Class 712/245)

GENERATION-BASED MEMORY SYNCHRONIZATION IN A MULTIPROCESSOR SYSTEM WITH WEAKLY CONSISTENT MEMORY ACCESSES

Publication number: 20110119470

Abstract: In a multiprocessor system, a central memory synchronization module coordinates memory synchronization requests responsive to memory access requests in flight, a generation counter, and a reclaim pointer. The central module communicates via point-to-point communication. The module includes a global OR reduce tree for each memory access requesting device, for detecting memory access requests in flight. An interface unit is implemented associated with each processor requesting synchronization. The interface unit includes multiple generation completion detectors. The generation count and reclaim pointer do not pass one another.

Type: Application

Filed: June 8, 2010

Publication date: May 19, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor: Martin Ohmacht
Efficient operating system interposition mechanism

Patent number: 7945915

Abstract: Methods and systems for efficiently interpreting operating system service requests on the same register or vector of a processor or CPU where the operating system service requests are initiated from native and non-native applications are provided. More particularly, a switching layer can enable processing of the operating system service requests by routing control of a particular request to an appropriate kernel subsystem or module based on the type of operating system service being requested and the type of application initiating the request. Additionally, the performance impact of the switching layer for native applications is overcome by dynamically reprogramming the processor or CPU on every change of active process so that only foreign applications are subject to the processing requirements of the switching layer.

Type: Grant

Filed: December 12, 2006

Date of Patent: May 17, 2011

Assignee: Oracle America, Inc.

Inventor: Nils A. Nieuwejaar
Computer for executing two instruction sets and adds a macroinstruction end marker for performing iterations after loop termination

Patent number: 7941647

Abstract: A computer. A processor pipeline alternately executes instructions coded for first and second different computer architectures or coded to implement first and second different processing conventions. A memory stores instructions for execution by the processor pipeline, the memory being divided into pages for management by a virtual memory manager, a single address space of the memory having first and second pages. A memory unit fetches instructions from the memory for execution by the pipeline, and fetches stored indicator elements associated with respective memory pages of the single address space from which the instructions are to be fetched. Each indicator element is designed to store an indication of which of two different computer architectures and/or execution conventions under which instruction data of the associated page are to be executed by the processor pipeline.

Type: Grant

Filed: October 31, 2007

Date of Patent: May 10, 2011

Assignee: ATI Technologies ULC

Inventors: John S. Yates, Jr., David L. Reese, Korbin S. Van Dyke, T. R. Ramesh, Paul H. Hohensee
SIMD processor executing min/max instructions

Patent number: 7941649

Abstract: A SIMD processor responds to a single min/max instruction to find the minimum or maximum valued data unit in an array of data units. The determined minimum/maximum value and an associated index value thereto may be output. Alternatively, the value of a data unit in another array may be output at a corresponding location. A further single instruction executable by the SIMD processor, may be applied to results obtained using such a single min/max instruction, to allow such instructions to operate on two dimensional arrays.

Type: Grant

Filed: September 5, 2008

Date of Patent: May 10, 2011

Assignee: Broadcom Corporation

Inventors: Richard J. Selvaggi, Larry A. Pearlstein
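
The min/max behavior described in the abstract of patent 7941649 can be illustrated in scalar form. The following is only a minimal sketch of the two output modes (value plus index, or the element of a second array at the same position); the function names and the first-occurrence tie rule are assumptions made for illustration, not details taken from the patent.

```python
from typing import Sequence, Tuple


def min_with_index(values: Sequence[int]) -> Tuple[int, int]:
    """Return (minimum value, index of that value)."""
    if not values:
        raise ValueError("values must be non-empty")
    best_index = 0
    for i in range(1, len(values)):
        # Strict comparison keeps the first occurrence on ties (assumed rule).
        if values[i] < values[best_index]:
            best_index = i
    return values[best_index], best_index


def min_select(values: Sequence[int], other: Sequence[int]) -> int:
    """Return the element of `other` located where `values` has its minimum."""
    _, index = min_with_index(values)
    return other[index]
```

A SIMD implementation would compare all lanes in one instruction; the loop here only models the result, not the parallel hardware.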
Isochronous pipelined processor with deterministic control

Patent number: 7941645

Abstract: An isochronous processor includes a state register, a functional unit, a control module, and an activation unit. The state register includes an arm buffer and an active buffer. The functional unit performs a transformation operation on the data stream in response to an active value of the control parameter obtained from the active buffer. The control module updates the arm value of the control parameter in the arm buffer in response to control instructions. The activation unit detects a load event propagating with the data stream and transfers the parameter value from the arm buffer to the active buffer in response to the load event. During this transfer, the control module is inhibited from updating the arm buffer.

Type: Grant

Filed: July 28, 2004

Date of Patent: May 10, 2011

Assignee: NVIDIA Corporation

Inventors: Duncan A. Riach, Leslie E. Neft, Michael A. Ogrinc, Wayne Douglas Young
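
The arm/active buffering in the abstract of patent 7941645 resembles double buffering of a control parameter. The sketch below is a sequential model under stated assumptions: the item kinds `"ctrl"`, `"load"`, and `"data"`, the class `StateRegister`, and the function `run_stream` are hypothetical names chosen for illustration.

```python
class StateRegister:
    """Double-buffered control parameter: 'arm' is staged, 'active' is in use."""

    def __init__(self, initial):
        self.arm = initial
        self.active = initial


def run_stream(stream, register, transform):
    """Process (kind, value) items in order and return the transformed data."""
    outputs = []
    for kind, value in stream:
        if kind == "ctrl":
            # Control instruction: update only the staged (arm) value.
            register.arm = value
        elif kind == "load":
            # Load event travelling with the data: arm -> active.
            register.active = register.arm
        elif kind == "data":
            # The functional unit always uses the active value.
            outputs.append(transform(value, register.active))
        else:
            raise ValueError(f"unknown item kind: {kind}")
    return outputs
```

Because this model handles one item at a time, a control update can never overlap the arm-to-active transfer; real hardware needs the explicit inhibit that the abstract mentions.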
Methods and apparatus for dynamic instruction controlled reconfigurable register file

Patent number: 7941648

Abstract: A scalable reconfigurable register file (SRRF) containing multiple register files, read and write multiplexer complexes, and a control unit operating in response to instructions is described. Multiple address configurations of the register files are supported by each instruction and different configurations are operable simultaneously during a single instruction execution. For example, with separate files of the size 32×32, supported configurations of 128×32 bits, 64×64 bits and 32×128 bits can be in operation each cycle. Single width, double width, quad width operands are optimally supported without increasing the register file size and without increasing the number of register file read or write ports.

Type: Grant

Filed: June 3, 2008

Date of Patent: May 10, 2011

Assignee: Altera Corporation

Inventors: Gerald George Pechanek, Edward A. Wolff
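
The configuration sizes quoted in the abstract of patent 7941648 are consistent with a fixed pool of storage viewed at different widths. As a worked check, the sketch below assumes four 32×32 files (a count inferred from 128×32 = 4 × 32×32, not stated in the abstract) and enumerates the ways of ganging files side by side; the function name `configurations` is hypothetical.

```python
def configurations(files=4, entries=32, width_bits=32):
    """List (entries, width in bits) views obtained by ganging whole files."""
    result = []
    for gang in range(1, files + 1):
        # Only gang sizes that divide the file count use every file.
        if files % gang == 0:
            result.append((files // gang * entries, gang * width_bits))
    return result
```

With the default arguments this yields 128×32, 64×64, and 32×128, each totalling 4096 bits, which matches the abstract's statement that wider operands are supported without increasing the register file size.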
Completion continue on thread switch based on instruction progress metric mechanism for a microprocessor

Patent number: 7941646

Abstract: A thread switch mechanism and technique for a microprocessor is disclosed wherein a processing of a first thread is completed, and a continuation of a second thread is initiated during completion of the first thread. In one form, the technique includes processing a first thread at a pipeline of a processing device, and initiating processing of a second thread at a front end of the pipeline in response to an occurrence of a context switch event. The technique can also include initiating an instruction progress metric in response to the context switch event. The technique can further include enabling completion of processing of instructions of the first thread that are at a back end of the pipeline at the occurrence of the context switch event until an expiry of the instruction progress metric.

Type: Grant

Filed: December 31, 2007

Date of Patent: May 10, 2011

Assignee: Freescale Semiconductor, Inc.

Inventors: David C. Holloway, Michael D. Snyder, Suresh Venkumahanti
SINGLE-CHIP MULTIPROCESSOR WITH CLOCK CYCLE-PRECISE PROGRAM SCHEDULING OF PARALLEL EXECUTION

Publication number: 20110107067

Abstract: A single-chip multiprocessor system and operation method of this system based on a static macro-scheduling of parallel streams for multiprocessor parallel execution. The single-chip multiprocessor system has buses for direct exchange between the processor register files and access to their store addresses and data. Each explicit parallelism architecture processor of this system has an interprocessor interface providing the synchronization signals exchange, data exchange at the register file level and access to store addresses and data of other processors. The single-chip multiprocessor system uses ILP to increase the performance. Synchronization of the streams parallel execution is ensured using special operations setting a sequence of streams and stream fragments execution prescribed by the program algorithm.

Type: Application

Filed: January 10, 2011

Publication date: May 5, 2011

Applicant: Elbrus International

Inventors: Boris A. Babaian, Yuli Kh. Sakhin, Vladimir Yu. Volkonskiy, Sergey A. Rozhkov, Vladimir V. Tikhorsky, Feodor A. Gruzdov, Leonid N. Nazarov, Mikhail L. Chudakov
Performance of first and second macros while data is moving through hardware pipeline

Publication number: 20110107061

Abstract: A hardware pipeline has a number of rows including a first row, a last row, and an intermediate row between the first row and the last row. Each row stores a number of bytes of data as the data moves through the pipeline on a row-by-row basis from the first row towards the last row. A mechanism performs a first macro on the data beginning at the first row. The mechanism performs a second macro different than the first macro on the data beginning at the intermediate row where the first macro has been completely performed when the data has reached the intermediate row. The first and second macros each include a number of modifications of the data as the data moves through the pipeline to effect a complete transformation of the data. The complete transformation of the first macro is different than the complete transformation of the second data.

Type: Application

Filed: October 30, 2009

Publication date: May 5, 2011

Inventor: David A. Warren
SYSTEM AND A METHOD FOR PROVIDING NONDETERMINISTIC DATA

Publication number: 20110106870

Abstract: A system and method for providing non-deterministic data for processes executed by non-synchronized processor elements of a fault resilient system is discussed. The steps of the method comprise receiving a request for getting non-deterministic data from a requesting processor element; assigning non-deterministic data generated by an entropy source to the request; and supplying the non-deterministic data assigned to the request, to the requesting processor element.

Type: Application

Filed: October 28, 2010

Publication date: May 5, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Silvio Dragone, Tamas Visegrady, Vincenzo Condorelli
CASCADED ACCELERATOR FUNCTIONS

Publication number: 20110107066

Abstract: Accelerator functions are cascaded, such that a result of one accelerator function is directly forwarded to another accelerator function, bypassing the processor requesting the functions to be performed. The cascading may be provided during compilation of a program specifying the functions to be performed, but can be dynamically reversed during runtime of the program.

Type: Application

Filed: October 30, 2009

Publication date: May 5, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Rajaram B. Krishnamurthy, Thomas A. Gregg
Extended Cache Capacity

Publication number: 20110107031

Abstract: A method, programmed medium and system are provided for enabling a core's cache capacity to be increased by using the caches of the disabled or non-enabled cores on the same chip. Caches of disabled or non-enabled cores on a chip are made accessible to store cachelines for those chip cores that have been enabled, thereby extending cache capacity of enabled cores.

Type: Application

Filed: November 5, 2009

Publication date: May 5, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Vaijayanthimala K. Anand, Diane Garza Flemming, William A. Maron, Mysore Sathyanarayana Srinivas
INTERCONNECT CONTROLLER FOR A DATA PROCESSING DEVICE AND METHOD THEREFOR

Publication number: 20110107065

Abstract: A data processing device includes an interconnect controller operable to manage the communication of information between modules of the data processing device via an interconnect. In response to a transaction request the interconnect controller selects a tag value from a set of available tag values, assigns the tag to the transaction and reserves the tag value so that it is unavailable for assignment to other transactions. If an expected response to the transaction request is not received within a designated amount of time, the transaction enters a timed-out state and the interconnect controller locks the tag value, so that it remains unavailable for assignment to other transactions until an unlock event, such as a request from software.

Type: Application

Filed: October 29, 2009

Publication date: May 5, 2011

Applicant: FREESCALE SEMICONDUCTOR, INC.

Inventors: Gus P. Ikonomopoulos, Thang Q. Nguyen, Jose M. Nunez, Kun Xu
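
The tag life cycle in the abstract of publication 20110107065 (available, reserved, locked after a timeout, released by an unlock event) can be sketched as a small state machine. This is a minimal illustration only; the class `TagPool`, its method names, and the lowest-tag-first selection policy are assumptions, not details from the application.

```python
class TagPool:
    """Tracks which transaction tags are free, in flight, or locked."""

    def __init__(self, num_tags):
        self.free = set(range(num_tags))
        self.in_flight = set()
        self.locked = set()

    def allocate(self):
        """Reserve a tag for a new transaction, or return None if none is free."""
        if not self.free:
            return None
        tag = min(self.free)  # assumed policy: lowest free tag first
        self.free.remove(tag)
        self.in_flight.add(tag)
        return tag

    def complete(self, tag):
        """Expected response arrived: the tag becomes available again."""
        self.in_flight.remove(tag)
        self.free.add(tag)

    def time_out(self, tag):
        """No response in time: lock the tag so it cannot be reassigned."""
        self.in_flight.remove(tag)
        self.locked.add(tag)

    def unlock(self, tag):
        """Unlock event (for example, a software request) releases the tag."""
        self.locked.remove(tag)
        self.free.add(tag)
```

Keeping a timed-out tag locked means a late response carrying that tag cannot be mistaken for the response to a newer transaction.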
Methods for scalably exploiting parallelism in a parallel processing system

Patent number: 7937567

Abstract: Parallelism in a parallel processing subsystem is exploited in a scalable manner. A problem to be solved can be hierarchically decomposed into at least two levels of sub-problems. Individual threads of program execution are defined to solve the lowest-level sub-problems. The threads are grouped into one or more thread arrays, each of which solves a higher-level sub-problem. The thread arrays are executable by processing cores, each of which can execute at least one thread array at a time. Thread arrays can be grouped into grids of independent thread arrays, which solve still higher-level sub-problems or an entire problem. Thread arrays within a grid, or entire grids, can be distributed across all of the available processing cores as available in a particular system implementation.

Type: Grant

Filed: November 1, 2006

Date of Patent: May 3, 2011

Assignee: Nvidia Corporation

Inventors: John R. Nickolls, Stephen D. Lew
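
The hierarchical decomposition in the abstract of patent 7937567 (threads, thread arrays, grids) can be illustrated with a toy problem. The sketch below computes a sum of squares and runs sequentially; the function names and the choice of problem are assumptions for illustration, and nothing here models how the patent schedules work onto processing cores.

```python
def thread(x):
    """Lowest-level sub-problem: one thread handles one element."""
    return x * x


def thread_array(chunk):
    """Higher-level sub-problem: a thread array combines its threads' results."""
    return sum(thread(x) for x in chunk)


def grid(data, array_size):
    """Entire problem: a grid of independent thread arrays over the data."""
    chunks = [data[i:i + array_size] for i in range(0, len(data), array_size)]
    return sum(thread_array(chunk) for chunk in chunks)
```

Since the thread arrays do not depend on one another, the list of chunks could be handed to however many cores a given system provides, which is the scalability point the abstract makes.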
Method and apparatus for monitoring inputs to an asynchronous, homogenous, reconfigurable computer array

Patent number: 7934075

Abstract: A computer array (10) has a plurality of computers (12). The computers (12) communicate with each other asynchronously and operate in a generally asynchronous manner internally. When one computer (12) attempts to communicate with another it goes to sleep until the other computer (12) is ready to complete the transaction, thereby saving power and reducing heat production. The instructions executed by the computers (12) can include a micro-loop (100) which is capable of performing a series of operations repeatedly. In one application, the sleeping computer (12) is awakened by an input such that it commences an action that would otherwise have required an interrupt of an otherwise active computer. For example, one computer (12f) can be used to monitor an input/output port of the computer array (10).

Type: Grant

Filed: May 26, 2006

Date of Patent: April 26, 2011

Assignee: VNS Portfolio LLC

Inventors: Charles H. Moore, Jeffrey Arthur Fox, John W. Rible
Processor for executing switch and translate instructions requiring wide operands

Patent number: 7932911

Abstract: A programmable processor and method for improving the performance of processors by expanding at least two source operands, or a source and a result operand, to a width greater than the width of either the general purpose register or the data path width. The present invention provides operands which are substantially larger than the data path width of the processor by using the contents of a general purpose register to specify a memory address at which a plurality of data path widths of data can be read or written, as well as the size and shape of the operand. In addition, several instructions and apparatus for implementing these instructions are described which obtain performance advantages if the operands are not limited to the width and accessible number of general purpose registers.

Type: Grant

Filed: October 31, 2007

Date of Patent: April 26, 2011

Assignee: MicroUnity Systems Engineering, Inc.

Inventors: Craig Hansen, John Moussouris, Alexia Massalin
Reshuffled communications processes in pipelined asynchronous circuits

Patent number: 7934031

Abstract: An asynchronous logic family of circuits which communicate on delay-insensitive flow-controlled channels with 4-phase handshakes and 1 of N encoding, compute output data directly from input data using domino logic, and use the state-holding ability of the domino logic to implement pipelining without additional latches.

Type: Grant

Filed: May 11, 2006

Date of Patent: April 26, 2011

Assignee: California Institute of Technology

Inventors: Andrew M. Lines, Alain J. Martin, Uri Cummings
Processor and its instruction issue method

Patent number: 7934079

Abstract: An instruction issue method for use in a pipelined processor, comprising the steps of: decoding an instruction to be processed to get a type of the instruction; computing the number of cycles to be occupied at execution stage for the instruction, according to the type of the instruction; marking a target operand of the instruction as acquirable in a predefined cycle before the instruction enters write-back stage, according to the number of cycles, so that subsequent instructions taking the target operand as their source operands perform subsequent operations according to the case that the target operand is acquirable.

Type: Grant

Filed: January 10, 2006

Date of Patent: April 26, 2011

Assignee: NXP B.V.

Inventor: Xia Zhu
Aggressive store merging in a processor that supports checkpointing

Patent number: 7934080

Abstract: Embodiments of the present invention provide a processor that merges stores in an N-entry first-in-first-out (FIFO) store queue. In these embodiments, the processor starts by executing instructions before a checkpoint is generated. When executing instructions before the checkpoint is generated, the processor is configured to perform limited or no merging of stores into existing entries in the store queue. Then, upon detecting a predetermined condition, the processor is configured to generate a checkpoint. After generating the checkpoint, the processor is configured to continue to execute instructions. When executing instructions after the checkpoint is generated, the processor is configured to freely merge subsequent stores into post-checkpoint entries in the store queue.

Type: Grant

Filed: May 28, 2008

Date of Patent: April 26, 2011

Assignee: Oracle America, Inc.

Inventors: Paul Caprioli, Martin Karlsson, Gideon N. Levinsky, Khondakar A. Mujtaba, Shailender Chaudhry, Murali K. Inaganti
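
The merging policy in the abstract of patent 7934080 can be sketched as a bounded FIFO whose behavior changes once a checkpoint exists. In this minimal illustration the class `StoreQueue` and its methods are hypothetical, and "limited or no merging" before the checkpoint is modelled as no merging at all.

```python
class StoreQueue:
    """N-entry FIFO store queue with checkpoint-dependent merging."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = []          # each entry: [address, value, post_checkpoint]
        self.checkpointed = False

    def checkpoint(self):
        """Record that a checkpoint has been generated."""
        self.checkpointed = True

    def store(self, address, value):
        """Add a store; return False if the queue is full and nothing merged."""
        if self.checkpointed:
            # After the checkpoint, merge freely into post-checkpoint entries.
            for entry in self.entries:
                if entry[2] and entry[0] == address:
                    entry[1] = value
                    return True
        if len(self.entries) >= self.capacity:
            return False
        # Before the checkpoint (assumed: no merging), always take a new entry.
        self.entries.append([address, value, self.checkpointed])
        return True
```

In this model, two pre-checkpoint stores to the same address occupy two entries, while repeated post-checkpoint stores to that address share one.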
System and software for performing matrix multiply extract operations

Patent number: 7932910

Abstract: A programmable processor and method for improving the performance of processors by expanding at least two source operands, or a source and a result operand, to a width greater than the width of either the general purpose register or the data path width. The present invention provides operands which are substantially larger than the data path width of the processor by using the contents of a general purpose register to specify a memory address at which a plurality of data path widths of data can be read or written, as well as the size and shape of the operand. In addition, several instructions and apparatus for implementing these instructions are described which obtain performance advantages if the operands are not limited to the width and accessible number of general purpose registers.

Type: Grant

Filed: August 20, 2007

Date of Patent: April 26, 2011

Assignee: MicroUnity Systems Engineering, Inc.

Inventors: Craig Hansen, John Moussouris, Alexia Massalin
Program flow control

Publication number: 20110093683

Abstract: A data processing apparatus includes a data engine 6 having an instruction decoder 18 for generating one or more control signals 24 for controlling processing circuitry 20 to perform data processing operations specified by the program instructions decoded. The instruction decoder 18 is responsive to a marker instruction to read a programmable flow control value from a flow control register 38. The programmable flow control value specifies the action to be taken upon completion of execution of a current sequence of program instructions. The action taken may be jumping to a target program instruction at the start of a target sequence of program instructions or entry into an idle state awaiting a new processing task to be initiated.

Type: Application

Filed: September 7, 2010

Publication date: April 21, 2011

Applicant: ARM LIMITED

Inventors: Merlijn Aurich, Jef Verdonck
DATA PROCESSING CIRCUIT

Publication number: 20110093685

Abstract: A data processing circuit is disclosed in the present invention. The data processing circuit includes a decoder and a number of N-stage circuits. The circuits receive input data from at least a memory and separate the input data into N stages. The circuits process and store the N input data simultaneously to decrease the time of data processing in the data processing circuit.

Type: Application

Filed: September 8, 2010

Publication date: April 21, 2011

Inventors: Chien-Chou CHEN, Ming-Sung Huang, Wen Min Lu
Reducing multiplexer circuitry associated with a processor

Patent number: 7930521

Abstract: Methods and apparatus are provided for reducing the amount of resources allocated for handling multiplexing in a processor. Characteristics associated with processing blocks are analyzed. Operand restrictions and register groups can be configured to allow the use of more resource efficient multiplexing circuitry in a processor.

Type: Grant

Filed: June 11, 2008

Date of Patent: April 19, 2011

Assignee: Altera Corporation

Inventor: Paul Metzgen
Method for speculative execution of instructions and a device having speculative execution capabilities

Patent number: 7930522

Abstract: A method for speculative execution of instructions, the method includes: decoding a compare instruction; speculatively executing, in a continuous manner, conditional instructions that are conditioned by a condition that is related to a resolution of the compare instruction and are decoded during a speculation window that starts at the decoding of the compare instruction and ends when the compare instruction is resolved; and stalling an execution of a non-conditional instruction that is dependent upon an outcome of at least one of the conditional instructions, until the speculation window ends.

Type: Grant

Filed: August 19, 2008

Date of Patent: April 19, 2011

Assignee: Freescale Semiconductor, Inc.

Inventors: Guy Shumeli, Itzhak Barak, Uri Dayan, Amir Paran, Idan Rozenberg, Doron Schupper
PROVIDING PIPELINE STATE THROUGH CONSTANT BUFFERS

Publication number: 20110087864

Abstract: One embodiment of the present invention sets forth a technique for providing state information to one or more shader engines within a processing pipeline. State information received from an application accessing the processing pipeline is stored in constant buffer memory accessible to each of the shader engines. The shader engines can then retrieve the state information during execution.

Type: Application

Filed: October 6, 2010

Publication date: April 14, 2011

Inventors: Jerome F. DULUK, JR., Jesse David Hall
DATA PROCESSING APPARATUS HAVING A PARALLEL PROCESSING CIRCUIT INCLUDING A PLURALITY OF PROCESSING MODULES, AND METHOD FOR CONTROLLING THE SAME

Publication number: 20110087863

Abstract: In an apparatus which includes a plurality of processing modules connected via a ring-shape bus, if a plurality of pieces of pipeline processing to be processed in a different order is allocated to a plurality of processing modules, the transfer efficiency may decrease when an amount of data transferred from one of the processing modules to a post-stage module exceeds a processing capacity of the post-stage module. Accordingly, a module positioned on the preceding side in the pipeline processing controls a transmission interval of processed data so that the post-stage module can receive the data processed by the preceding module.

Type: Application

Filed: October 4, 2010

Publication date: April 14, 2011

Applicant: CANON KABUSHIKI KAISHA

Inventors: Hiroyasu Watanabe, Hirowo Inoue, Hisashi Ishikawa
Compiling method, apparatus, and program

Patent number: 7925471

Abstract: Brings the response time of a Web server and the like closer to a targeted value. A controller controlling the average response time elapsed between reception by information processing apparatus of a processing request and response of information processing apparatus to the processing request. The controller including: a section for obtaining a response time goal which is a target value of the average response time; a section for calculating a predicted response time which is a predicted value of the average response time at the time point when a predetermined reference period has elapsed from setting an operation mode in the information processing apparatus, the operation mode being any of a plurality of operation modes which provide different throughputs; and a section for setting the operation mode in the information processing apparatus if predicted response time calculated by the predicted response time calculating section is less than goal.

Type: Grant

Filed: August 12, 2008

Date of Patent: April 12, 2011

Assignee: International Business Machines Corporation

Inventors: Takuya Nakaike, Hideaki Komatsu
Instruction-level multithreading according to a predetermined fixed schedule in an embedded processor using zero-time context switching

Patent number: 7925869

Abstract: A system and method for enabling multithreading in an embedded processor, invoking zero-time context switching in a multithreading environment, scheduling multiple threads to permit numerous hard-real time and non-real time priority levels, fetching data and instructions from multiple memory blocks in a multithreading environment, and enabling a particular thread to modify the multiple states of the multiple threads in the processor core.

Type: Grant

Filed: December 21, 2000

Date of Patent: April 12, 2011

Assignee: Ubicom, Inc.

Inventors: Nicholas J Kelsey, Christopher J Waters, Tibet Mimaroglu, David A Fotland
Operand and result forwarding between differently sized operands in a superscalar processor

Patent number: 7921279

Abstract: Result and operand forwarding is provided between differently sized operands in a superscalar processor by grouping a first set of instructions for operand forwarding, and grouping a second set of instructions for result forwarding, the first set of instructions comprising a first source instruction having a first operand and a first dependent instruction having a second operand, the first dependent instruction depending from the first source instruction; the second set of instructions comprising a second source instruction having a third operand and a second dependent instruction having a fourth operand, the second dependent instruction depending from the second source instruction, performing operand forwarding by forwarding the first operand, either whole or in part, as it is being read to the first dependent instruction prior to execution; performing result forwarding by forwarding a result of the second source instruction, either whole or in part, to the second dependent instruction, after execution; wher

Type: Grant

Filed: March 19, 2008

Date of Patent: April 5, 2011

Assignee: International Business Machines Corporation

Inventors: David S. Hutton, Fadi Y. Busaba, Bruce C. Giamei, Christopher A. Krygowski, Edward T. Malley, Jeffrey S. Plate, John G. Rell, Jr., Chung-Lung Kevin Shum, Timothy J. Slegel
DYNAMIC SELECTION OF EXECUTION STAGE

Publication number: 20110078486

Abstract: Methods and apparatus relating to dynamic selection of execution stage are described. In some embodiments, logic may determine whether to execute an instruction at one of a plurality of stages in a processor. In some embodiments, the plurality of stages are to at least correspond to an address generation stage or an execution stage of the instruction. Other embodiments are also described and claimed.

Type: Application

Filed: September 30, 2009

Publication date: March 31, 2011

Inventors: Deepak Limaye, Kulin N. Kothari, James D. Allen, James E. Phillips
Support for Non-Local Returns in Parallel Thread SIMD Engine

Publication number: 20110078418

Abstract: One embodiment of the present invention sets forth a method for executing a non-local return instruction in a parallel thread processor. The method comprises the steps of receiving, within the thread group, a first long jump instruction and, in response, popping a first token from the execution stack. The method also comprises determining whether the first token is a first long jump token that was pushed onto the execution stack when a first push instruction associated with the first long jump instruction was executed, and when the first token is the first long jump token, jumping to the second instruction based on the address specified by the first long jump token, or, when the first token is not the first long jump token, disabling the active thread until the first long jump token is popped from the execution stack.

Type: Application

Filed: September 13, 2010

Publication date: March 31, 2011

Inventors: Guillermo Juan Rozas, Brett W. Coon
SET PROGRAM PARAMETER INSTRUCTION

Publication number: 20110078419

Abstract: A measurement sampling facility takes snapshots of the central processing unit (CPU) on which it is executing at specified sampling intervals to collect data relating to tasks executing on the CPU. The collected data is stored in a buffer, and at selected times, an interrupt is provided to remove data from the buffer to enable reuse thereof. The interrupt is not taken after each sample, but in sufficient time to remove the data and minimize data loss.

Type: Application

Filed: December 7, 2010

Publication date: March 31, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Jane H. Bartik, Lisa Cranton Heller, Damian L. Osisek, Donald W. Schmidt, Patrick M. West, JR., Phil C. Yeh
Techniques to manage critical region interrupts

Patent number: 7917910

Abstract: Briefly, techniques to manage interrupts and swaps of threads operating in a critical region. In an embodiment, a thread is to be interrupted during a first critical region with an interrupt routine. The thread may be set to restart at a beginning of the first critical region in response to an indication that the thread is working in a critical region. Other embodiments are also claimed and disclosed.

Type: Grant

Filed: March 26, 2004

Date of Patent: March 29, 2011

Assignee: Intel Corporation

Inventor: Joseph S. Cavallo
Variable length command pull with contiguous sequential layout

Patent number: 7917659

Abstract: The invention relates to a method for computer signal processing data and command transfer over an interface and more particularly to a communication between peripheral firmware and a host processor or Basic Input/Output System (BIOS) on a Peripheral Component Interconnect (PCI) bus. In one embodiment, a device and method for reducing the load on the PCI Bus is described. In yet another embodiment, a device and method is described for constructing a variable length command block comprising message frames and aligning all message frames for a particular command block that are contiguous in memory.

Type: Grant

Filed: March 1, 2005

Date of Patent: March 29, 2011

Assignee: LSI Corporation

Inventors: Parag Maharana, Basavaraj Hallyal, Senthil Murugan Thangaraj, Gurpreet Singh Anand
FAST APPLICATION PROGRAMMABLE TIMERS

Publication number: 20110072247

Abstract: Methods, systems, and computer program products for implementing fast application programmable timers are provided. A computer program product includes a tangible storage medium readable by a processing circuit and storing instructions for execution by the processing circuit for performing a method. The method includes receiving a request to set a user accessible timer, the request received from an application thread. The user accessible timer is set in response to receiving the request, the setting including initializing a counter. The counter is decremented until an interrupt threshold has been reached. An interrupt signal is transmitted to the application thread in response to detecting that the interrupt threshold has been reached.
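
The set, decrement, and interrupt sequence in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions: the class name `UserTimer`, the `tick` method, and the callback standing in for the interrupt signal are all hypothetical and are not taken from the application, which describes a hardware-assisted facility rather than Python objects.

```python
class UserTimer:
    """Toy model of a user-accessible timer (hypothetical names throughout)."""

    def __init__(self, on_interrupt, threshold=0):
        self.on_interrupt = on_interrupt  # stands in for the interrupt signal
        self.threshold = threshold        # assumed interrupt threshold
        self.counter = 0
        self.armed = False

    def set(self, count):
        """Handle a request from an application thread: initialize the counter."""
        self.counter = count
        self.armed = True

    def tick(self):
        """Decrement once; deliver the interrupt when the threshold is reached."""
        if not self.armed:
            return
        self.counter -= 1
        if self.counter <= self.threshold:
            self.armed = False            # fire only once per set() request
            self.on_interrupt()


fired = []
timer = UserTimer(on_interrupt=lambda: fired.append("interrupt"))
timer.set(3)
for _ in range(5):
    timer.tick()
```

In this model the callback runs exactly once, on the third tick, and later ticks are ignored until `set` is called again.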

Type: Application

Filed: September 21, 2009

Publication date: March 24, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Hubertus Franke, James Xenidis, Terry L. Nelms, II, Hollis R. Blanchard
NODE CONTROL DEVICE INTERPOSED BETWEEN PROCESSOR NODE AND IO NODE IN INFORMATION PROCESSING SYSTEM

Publication number: 20110072246

Abstract: A node control device is interposed between processor nodes and IO nodes in an information processing system, wherein each IO node subordinates at least one IO device. The node control device includes a register storing a base address of a mapping destination of an IO space, a table describing a plurality of entries retaining a plurality of IO space numbers and address ranges, and an IO space access detection circuit. The table stores an identification flag as to whether or not IO spaces are each mapped onto a memory space. The IO space access detection circuit decodes a command code and an address of an FRTT signal output from a processor node, thus detecting a target IO space and detecting whether the processor node is accessing an IO space mapped onto the memory space or another IO space.

Type: Application

Filed: September 20, 2010

Publication date: March 24, 2011

Inventor: YOSHIHISA YAMADA
HARDWARE FOR PARALLEL COMMAND LIST GENERATION

Publication number: 20110072245

Abstract: A method for providing state inheritance across command lists in a multi-threaded processing environment. The method includes receiving an application program that includes a plurality of parallel threads; generating a command list for each thread of the plurality of parallel threads; causing a first command list associated with a first thread of the plurality of parallel threads to be executed by a processing unit; and causing a second command list associated with a second thread of the plurality of parallel threads to be executed by the processing unit, where the second command list inherits from the first command list state associated with the processing unit.

Type: Application

Filed: August 25, 2010

Publication date: March 24, 2011

Inventors: Jerome F. DULUK, JR., Jesse David Hall, Henry Packard Moreton, Patrick R. Brown
Hardware For Parallel Command List Generation

Publication number: 20110072211

Abstract: A method for providing state inheritance across command lists in a multi-threaded processing environment. The method includes receiving an application program that includes a plurality of parallel threads; generating a command list for each thread of the plurality of parallel threads; causing a first command list associated with a first thread of the plurality of parallel threads to be executed by a processing unit; and causing a second command list associated with a second thread of the plurality of parallel threads to be executed by the processing unit, where the second command list inherits from the first command list state associated with the processing unit.

Type: Application

Filed: August 9, 2010

Publication date: March 24, 2011

Inventors: Jerome F. DULUK, JR., Jesse David Hall, Henry Packard Moreton, Patrick R. Brown
Method and system for overlapping execution of instructions through non-uniform execution pipelines in an in-order processor

Patent number: 7913067

Abstract: A system and method for overlapping execution (OE) of instructions through non-uniform execution pipelines in an in-order processor are provided. The system includes a first execution unit to perform instruction execution in a first execution pipeline. The system also includes a second execution unit to perform instruction execution in a second execution pipeline, where the second execution pipeline includes a greater number of stages than the first execution pipeline. The system further includes an instruction dispatch unit (IDU), the IDU including OE registers and logic for dispatching an OE-capable instruction to the first execution unit such that the instruction completes execution prior to completing execution of a previously dispatched instruction to the second execution unit. The system additionally includes a latch to hold a result of the execution of the OE-capable instruction until after the second execution unit completes the execution of the previously dispatched instruction.

Type: Grant

Filed: February 20, 2008

Date of Patent: March 22, 2011

Assignee: International Business Machines Corporation

Inventors: David S. Hutton, Khary J. Alexander, Fadi Y. Busaba, Bruce C. Giamei, John G. Rell, Jr., Eric M. Schwarz, Chung-Lung Kevin Shum
MAPPING OF COMPUTER THREADS ONTO HETEROGENEOUS RESOURCES

Publication number: 20110066828

Abstract: Techniques are generally described for mapping a thread onto heterogeneous processor cores. Example techniques may include associating the thread with one or more predefined execution characteristic(s), assigning the thread to one or more heterogeneous processor core(s) based on the one or more predefined execution characteristic(s), and/or executing the thread by the respective assigned heterogeneous processor core(s).

Type: Application

Filed: September 11, 2009

Publication date: March 17, 2011

Inventors: Andrew Wolfe, Thomas M. Conte
Functional-level instruction-set computer architecture for processing application-layer content-service requests such as file-access requests

Patent number: 7908464

Abstract: A functional-level instruction-set computing (FLIC) architecture executes higher-level functional instructions such as lookups and bit-compares of variable-length operands. Each FLIC processing-engine slice has specialized processing units including a lookup unit that searches for a matching entry in a lookup cache. Variable-length operands are stored in execution buffers. The operand length and location in the execution buffer are stored in fixed-length general-purpose registers (GPRs) that also store fixed-length operands. A copy/move unit moves data between input and output buffers and one or more FLIC processing-engine slices. Multiple contexts can each have a set of GPRs and execution buffers. An expansion buffer in a FLIC slice can be allocated to a context to expand that context's execution buffer for storing longer operands.

Type: Grant

Filed: July 31, 2007

Date of Patent: March 15, 2011

Assignee: Alacritech, Inc.

Inventors: Millind Mittal, Mehul Kharidia, Tarun Kumar Tripathy, J. Sukarno Mertoguno
PARALLEL PIPELINED VECTOR REDUCTION IN A DATA PROCESSING SYSTEM

Publication number: 20110060891

Abstract: A parallel processing data processing system builds at least one data structure indicating a communication schedule for a plurality of processes each having a respective one of a plurality of equal length vectors formed of multiple equal size chunks. The data processing system, based upon the at least one data structure, communicates chunks of the plurality of vectors among the plurality of processes and performs partial reduction operations on chunks in accordance with the communication schedule. The data processing system then stores a result vector representing reduction of the plurality of vectors.

Type: Application

Filed: September 4, 2009

Publication date: March 10, 2011

Applicant: International Business Machines Corporation

Inventor: BIN JIA
OPTIMIZING MEMORY COPY ROUTINE SELECTION FOR MESSAGE PASSING IN A MULTICORE ARCHITECTURE

Publication number: 20110055487

Abstract: In one embodiment, the present invention includes a method to obtain topology information regarding a system including at least one multicore processor, provide the topology information to a plurality of parallel processes, generate a topological map based on the topology information, access the topological map to determine a topological relationship between a sender process and a receiver process, and select a given memory copy routine to pass a message from the sender process to the receiver process based at least in part on the topological relationship. Other embodiments are described and claimed.

Type: Application

Filed: March 31, 2008

Publication date: March 3, 2011

Inventors: Sergey I. Sapronov, Alexey V. Bayduraev, Alexander V. Supalov, Vladimir D. Truschin, Igor Ermolaev, Dmitry Mishura
Handling data cache misses out-of-order for asynchronous pipelines

Patent number: 7900024

Abstract: Mechanisms for handling data cache misses out-of-order for asynchronous pipelines are provided. The mechanisms associate load tag (LTAG) identifiers with the load instructions and use them to track the load instruction across multiple pipelines as an index into a load table data structure of a load target buffer. The load table is used to manage cache “hits” and “misses” and to aid in the recycling of data from the L2 cache. With cache misses, the LTAG indexed load table permits load data to recycle from the L2 cache in any order. When the load instruction issues and sees its corresponding entry in the load table marked as a “miss,” the effects of issuance of the load instruction are canceled and the load instruction is stored in the load table for future reissuing to the instruction pipeline when the required data is recycled.

Type: Grant

Filed: October 17, 2008

Date of Patent: March 1, 2011

Assignee: International Business Machines Corporation

Inventors: Christopher M. Abernathy, Jeffrey P. Bradford, Ronald P. Hall, Timothy H. Heil, David Shippy
Methods and Apparatus to Predict Non-Execution of Conditional Non-branching Instructions

Publication number: 20110047357

Abstract: Efficient techniques are described for not executing an issued conditional non-branch instruction. A conditional non-branch instruction is identified as being eligible for a prediction, the prediction indicating that the eligible conditional non-branch (ECNB) instruction would not execute. The ECNB instruction executes as a no operation (NOP) instruction in response to the prediction that the ECNB instruction would not execute. A source operand required for the ECNB instruction to execute is not fetched in response to the prediction to not execute.

Type: Application

Filed: August 19, 2009

Publication date: February 24, 2011

Applicant: QUALCOMM INCORPORATED

Inventors: Brian M. Stempel, James N. Dieffenderfer, Thomas A. Sartorius, David J. Mandzak, Rodney W. Smith
Reconfigurable integrated circuit

Patent number: 7895416

Abstract: A reconfigurable integrated circuit is provided wherein the available hardware resources can be optimised for a particular application. Dynamically reconfiguring (in both real-time and non real-time) the available resources and sharing a plurality of processing elements with a plurality of controller elements achieve this. In a preferred embodiment the integrated circuit includes a plurality of processing blocks, which interface to a reconfigurable interconnection means. A processing block has two forms, namely a shared resource block and a dedicated resource block. Each processing block consists of one or a plurality of controller elements and a plurality of processing elements. The controller element and processing element generally comprise diverse rigid coarse and fine grained circuits and are interconnected through dedicated and reconfigurable interconnect. The processing blocks can be configured as a hierarchy of blocks and/or a fractal architecture.

Type: Grant

Filed: June 24, 2009

Date of Patent: February 22, 2011

Assignee: Akya (Holdings) Limited

Inventors: Graeme Roy Smith, Dyson Wilkes
Select-and-insert instruction within data processing systems

Patent number: 7895417

Abstract: A data processing system 2 is provided including an instruction decoder 34 responsive to program instructions within an instruction register 32 to generate control signals for controlling data processing circuitry 36. The instructions supported include an address calculation instruction which splits an input address value at a position dependent upon a size value into a first portion and second portion, adds a non-zero offset value to the first portion, sets the second portion to a value and then concatenates the result of the processing on the first portion and the second portion to form the output address value. Another type of instruction supported is a select-and-insert instruction. This instruction takes a first input value and shifts it by N bit positions to form a shifted value, selects N bits from within a second input value in dependence upon the first input value and then concatenates the shifted value with the N bits to form an output value.
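
The two instructions described in this abstract are bit manipulations, so a short software model may help. The sketch below is illustrative only: the function names, the 32-bit width, and the `select_position` callback are assumptions. In particular, the abstract says the N bits are selected "in dependence upon the first input value" without giving the rule, so the rule is left to the caller rather than guessed.

```python
WIDTH = 32                     # assumed register width
WORD_MASK = (1 << WIDTH) - 1


def address_calc(addr, size, offset, low_value=0):
    """Split addr at bit `size`, add a non-zero offset to the upper portion,
    set the lower portion to low_value, then concatenate the two portions."""
    if offset == 0:
        raise ValueError("the abstract specifies a non-zero offset")
    upper = (addr >> size) + offset
    lower = low_value & ((1 << size) - 1)
    return ((upper << size) | lower) & WORD_MASK


def select_and_insert(first, second, n, select_position):
    """Shift `first` left by n bits, pick n bits from `second`, concatenate.

    select_position(first) returns the bit position in `second` to read from;
    this callback is a hypothetical stand-in for the unspecified selection rule.
    """
    shifted = (first << n) & WORD_MASK
    selected = (second >> select_position(first)) & ((1 << n) - 1)
    return shifted | selected


# Example with an assumed rule: the low two bits of `first` pick a 4-bit group.
result = select_and_insert(0b101, 0xABCD, 4, lambda f: (f & 0b11) * 4)
```

With these inputs, `address_calc(0x12FF, 8, 1, 0x0A)` yields `0x130A`, and `result` is `0x5C`: the shifted value `0x50` joined with the group `0xC` read from bit 4 of `0xABCD`.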

Type: Grant

Filed: April 30, 2010

Date of Patent: February 22, 2011

Assignee: ARM Limited

Inventors: Dominic Hugo Symes, Daniel Kershaw, Mladen Wilder
Method for extracting fields from packets having fields spread over more than one register

Patent number: 7895423

Abstract: Systems and methods that allow for extracting a field from data stored in a pair of registers using two instructions. A first instruction extracts any part of the field from a first register designated as a first source register, and a second instruction extracts any part of the field from a second general register designated as a second source register. The second instruction inserts any extracted field parts in a result register.
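
The two-instruction extraction described in this abstract can be modeled as two small functions. This is a sketch under assumptions, not the patented encoding: the 32-bit register width, the function names, and the convention that bit positions count from the least significant bit of the low register are all hypothetical.

```python
REG_BITS = 32  # assumed register width


def extract_low_part(lo_reg, pos, size):
    """First step: return (bits, count) for the part of the field in lo_reg."""
    if pos >= REG_BITS:
        return 0, 0                        # field lies entirely in the high register
    count = min(size, REG_BITS - pos)
    return (lo_reg >> pos) & ((1 << count) - 1), count


def extract_high_part_and_insert(hi_reg, pos, size, result, low_count):
    """Second step: take the remaining bits from hi_reg and insert them
    above the bits already placed in the result."""
    remaining = size - low_count
    if remaining == 0:
        return result                      # field lies entirely in the low register
    start = max(pos - REG_BITS, 0)
    bits = (hi_reg >> start) & ((1 << remaining) - 1)
    return result | (bits << low_count)


def extract_field(lo_reg, hi_reg, pos, size):
    """Run both steps in order, as the two instructions would."""
    result, low_count = extract_low_part(lo_reg, pos, size)
    return extract_high_part_and_insert(hi_reg, pos, size, result, low_count)


# An 8-bit field starting at bit 28 straddles the two registers.
field = extract_field(0xF0000000, 0x0000000A, pos=28, size=8)
```

Here `field` is `0xAF`: the low nibble `0xF` comes from the top of the low register and the high nibble `0xA` from the bottom of the high register. The same pair of steps also handles fields that sit wholly in one register.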

Type: Grant

Filed: August 19, 2009

Date of Patent: February 22, 2011

Assignee: MIPS Technologies, Inc.

Inventors: Sol Katzman, Robert Gelinas, W. Patrick Hays
Single-chip multiprocessor with clock cycle-precise program scheduling of parallel execution

Patent number: 7895587

Abstract: A single-chip multiprocessor system and operation method of this system based on a static macro-scheduling of parallel streams for multiprocessor parallel execution. The single-chip multiprocessor system has buses for direct exchange between the processor register files and access to their store addresses and data. Each explicit parallelism architecture processor of this system has an interprocessor interface providing the synchronization signals exchange, data exchange at the register file level and access to store addresses and data of other processors. The single-chip multiprocessor system uses ILP to increase the performance. Synchronization of the streams parallel execution is ensured using special operations setting a sequence of streams and stream fragments execution prescribed by the program algorithm.

Type: Grant

Filed: September 8, 2006

Date of Patent: February 22, 2011

Assignee: Elbrus International

Inventors: Boris A. Babaian, Yuli Kh. Sakhin, Vladimir Yu. Volkonskiy, Sergey A. Rozhkov, Vladimir V. Tikhorsky, Feodor A. Gruzdov, Leonid N. Nazarov, Mikhail L. Chudakov
Cache sharing based thread control

Patent number: 7895415

Abstract: Apparatus and computing systems associated with cache sharing based thread control are described. One embodiment includes a memory to store a thread control instruction and a processor to execute the thread control instruction. The processor is coupled to the memory. The processor includes a first unit to dynamically determine a cache sharing behavior between threads in a multi-threaded computing system and a second unit to dynamically control the composition of a set of threads in the multi-threaded computing system. The composition of the set of threads is based, at least in part, on thread affinity as exhibited by cache-sharing behavior. The thread control instruction controls the operation of the first unit and the second unit.

Type: Grant

Filed: February 14, 2007

Date of Patent: February 22, 2011

Assignee: Intel Corporation

Inventors: Antonio Gonzalez, Josep M. Codina, Pedro Lopez, Fernando Latorre, Jose-Alejandro Pineiro, Enric Gibert, Jaume Abella, Jaideep Moses, Donald Newell, Ravishankar Iyer, Ramesh G. Illikkal, Srihari Makineni
