Processing Control Patents (Class 712/220)

Arithmetic operation instruction processing (Class 712/221)

Floating point or vector (Class 712/222)

Logic operation instruction processing (Class 712/223)

Masking (Class 712/224)

Processing control for data transfer (Class 712/225)

Instruction modification based on condition (Class 712/226)

Specialized instruction processing in support of testing, debugging, emulation (Class 712/227)

Context preserving (e.g., context swapping, checkpointing, register windowing) (Class 712/228)

Mode switch or change (Class 712/229)

Generating next microinstruction address (Class 712/230)

Detecting end or completion of microprogram (Class 712/231)

Hardwired controller (Class 712/232)

Branching (e.g., delayed branch, loop control, branch predict, interrupt) (Class 712/233)

Processing sequence control (i.e., microsequencing) (Class 712/245)

Apparatus and method for executing fast bit scan forward/reverse (BSR/BSF) instructions

Patent number: 8327119

Abstract: An apparatus executes a bit scan instruction that specifies an N-byte input operand. A first encoder forward bit scan encodes each input byte to generate N first bit vectors. A zero detector zero-detects each input byte to generate a second bit vector. A second encoder forward bit scan encodes the second bit vector to generate a third bit vector. An N:1 multiplexor, controlled by the third bit vector, selects one of the N first bit vectors to output a fourth bit vector. The apparatus concatenates the third and fourth bit vectors into a fifth bit vector that indicates the bit index of the least significant set bit of the input operand. A third encoder forward bit scan encodes a bit-reversed version of each input byte to generate N sixth bit vectors. A fourth encoder forward bit scan encodes a bit-reversed version of the second bit vector to generate a seventh bit vector. A second N:1 multiplexor, controlled by the seventh bit vector, selects one of the N sixth bit vectors to output an eighth bit vector.
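
The forward-scan half of this abstract can be sketched in software. The following is an illustrative model only, not the patented circuit: the function names, the 4-byte default width, and the choice to return None for a zero operand are assumptions made for the example.

```python
def bsf_byte(b):
    """Index of the least significant set bit in one byte (0 if the byte is zero)."""
    for i in range(8):
        if (b >> i) & 1:
            return i
    return 0  # a zero byte is never selected by the multiplexor below


def bit_scan_forward(value, n_bytes=4):
    """Hierarchical bit scan forward over an n_bytes-wide operand."""
    value &= (1 << (8 * n_bytes)) - 1
    if value == 0:
        return None  # assumption: the result for a zero operand is left undefined
    data = [(value >> (8 * k)) & 0xFF for k in range(n_bytes)]
    per_byte = [bsf_byte(b) for b in data]        # the N "first bit vectors"
    nonzero = [1 if b else 0 for b in data]       # zero-detect result, one bit per byte
    byte_index = nonzero.index(1)                 # forward scan of the per-byte flags
    bit_in_byte = per_byte[byte_index]            # N:1 multiplexor selection
    return (byte_index << 3) | bit_in_byte        # concatenate byte index and bit index
```

Scanning each byte independently and then selecting one result keeps every encoder narrow, which is the point of the two-level structure the abstract describes.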

Type: Grant

Filed: October 21, 2009

Date of Patent: December 4, 2012

Assignee: VIA Technologies, Inc.

Inventor: Bryan Wayne Pogor
System and method for analyzing streams and counting stream items on multi-core processors

Patent number: 8321579

Abstract: Systems and methods for parallel stream item counting are disclosed. A data stream is partitioned into portions and the portions are assigned to a plurality of processing cores. A sequential kernel is executed at each processing core to compute a local count for items in an assigned portion of the data stream for that processing core. The counts are aggregated for all the processing cores to determine a final count for the items in the data stream. A frequency-aware counting method (FCM) for data streams includes dynamically capturing relative frequency phases of items from a data stream and placing the items in a sketch structure using a plurality of hash functions where a number of hash functions is based on the frequency phase of the item. A zero-frequency table is provided to reduce errors due to absent items.
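
The partition, local count, and aggregate steps in this abstract can be sketched as follows. This is a minimal illustration using exact counts; it does not model the frequency-aware sketch structure or the zero-frequency table, and the round-robin partitioning and the function names are assumptions made for the example.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partition(stream, n_parts):
    """Split the stream into n_parts portions (round-robin, an assumed policy)."""
    return [stream[i::n_parts] for i in range(n_parts)]


def local_count(portion):
    """Sequential kernel: count the items in one assigned portion."""
    return Counter(portion)


def parallel_count(stream, n_cores=4):
    """Count items per portion in parallel, then aggregate the local counts."""
    portions = partition(stream, n_cores)
    with ThreadPoolExecutor(max_workers=n_cores) as pool:
        local_counts = list(pool.map(local_count, portions))
    total = Counter()
    for counts in local_counts:
        total.update(counts)  # adds counts rather than replacing them
    return total
```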

Type: Grant

Filed: July 26, 2007

Date of Patent: November 27, 2012

Assignee: International Business Machines Corporation

Inventors: Charu Chandra Aggarwal, Rajesh Bordawekar, Dina Thomas, Philip Shilung Yu
METHODS FOR GENERATING CODE FOR AN ARCHITECTURE ENCODING AN EXTENDED REGISTER SPECIFICATION

Publication number: 20120297171

Abstract: There are provided methods and computer program products for generating code for an architecture encoding an extended register specification. A method for generating code for a fixed-width instruction set includes identifying a non-contiguous register specifier. The method further includes generating a fixed-width instruction word that includes the non-contiguous register specifier.

Type: Application

Filed: July 26, 2012

Publication date: November 22, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Michael Karl Gschwind, Robert Kevin Montoye, Brett Olsson, John-David Wellman
DECENTRALIZED ALLOCATION OF RESOURCES AND INTERCONNECT STRUCTURES TO SUPPORT THE EXECUTION OF INSTRUCTION SEQUENCES BY A PLURALITY OF ENGINES

Publication number: 20120297170

Abstract: A method for decentralized resource allocation in an integrated circuit. The method includes receiving a plurality of requests from a plurality of resource consumers of a plurality of partitionable engines to access a plurality of resources, wherein the resources are spread across the plurality of engines and are accessed via a global interconnect structure. At each resource, a number of requests for access to said each resource are added. At said each resource, the number of requests are compared against a threshold limiter. At said each resource, a subsequent request that is received that exceeds the threshold limiter is canceled. Subsequently, requests that are not canceled within a current clock cycle are implemented.
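
The per-resource counting and threshold check can be modeled for a single clock cycle as below. This is a hedged sketch: the function name, the tuple format for requests, and the use of one shared threshold for every resource are assumptions, not details from the application.

```python
def arbitrate(requests, threshold):
    """Model one clock cycle of per-resource request limiting.

    requests: list of (consumer, resource) pairs in arrival order.
    Returns (granted, canceled) lists of the same pairs.
    """
    counts = {}  # running request count kept at each resource
    granted, canceled = [], []
    for consumer, resource in requests:
        counts[resource] = counts.get(resource, 0) + 1
        if counts[resource] > threshold:
            canceled.append((consumer, resource))  # exceeds the threshold limiter
        else:
            granted.append((consumer, resource))   # implemented this cycle
    return granted, canceled
```

Because each resource only inspects its own count, no central arbiter is needed, which matches the decentralized framing of the abstract.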

Type: Application

Filed: May 18, 2012

Publication date: November 22, 2012

Applicant: SOFT MACHINES, INC.

Inventor: Mohammad Abdallah
AUTOMATIC KERNEL MIGRATION FOR HETEROGENEOUS CORES

Publication number: 20120297163

Abstract: A system and method for automatically migrating the execution of work units between multiple heterogeneous cores. A computing system includes a first processor core with a single instruction multiple data micro-architecture and a second processor core with a general-purpose micro-architecture. A compiler predicts that execution of a function call in a program migrates at a given location to a different processor core. The compiler creates a data structure to support moving live values associated with the execution of the function call at the given location. An operating system (OS) scheduler schedules at least code before the given location in program order to the first processor core. In response to receiving an indication that a condition for migration is satisfied, the OS scheduler moves the live values to a location indicated by the data structure for access by the second processor core and schedules code after the given location to the second processor core.

Type: Application

Filed: May 16, 2011

Publication date: November 22, 2012

Inventors: Mauricio Breternitz, Patryk Kaminski, Keith Lowery, Anton Chernoff, Dz-Ching Ju
Look-ahead wake-and-go engine with speculative execution

Patent number: 8316218

Abstract: A wake-and-go mechanism is provided for a microprocessor. The wake-and-go mechanism looks ahead in the instruction stream of a thread for programming idioms that indicate that the thread is waiting for an event. If a look-ahead polling operation succeeds, the look-ahead wake-and-go engine may record an instruction address for the corresponding idiom so that the wake-and-go mechanism may have the thread perform speculative execution at a time when the thread is waiting for an event. During execution, when the wake-and-go mechanism recognizes a programming idiom, the wake-and-go mechanism may store the thread state in the thread state storage. Instead of putting the thread to sleep, the wake-and-go mechanism may perform speculative execution.

Type: Grant

Filed: February 1, 2008

Date of Patent: November 20, 2012

Assignee: International Business Machines Corporation

Inventors: Ravi K. Arimilli, Satya P. Sharma, Randal C. Swanberg
Synchronizing commands and dependencies in an asynchronous command queue

Patent number: 8316219

Abstract: Provided are techniques for the managing of command queue dependencies and command queue synchronization. Incoming commands are actively tracked through their dependency relationships. Command dependencies may be tracked across multiple lists, including a submission list and a completion list. Each command on the submission list is prepared for processing and ultimately submitted to command processing logic. Command completion processing is performed on each command on the completion list, including but not limited to removing dependencies from pending commands and possibly queuing pending commands for submission to the command processing logic. Also provided as features of a command queue are a standby barrier, an active barrier and a marker. Standby and active barriers are employed to synchronize and track commands through the command queue. Markers are employed to track commands through the command queue.

Type: Grant

Filed: August 31, 2009

Date of Patent: November 20, 2012

Assignee: International Business Machines Corporation

Inventors: Gregory H. Bellows, Joaquin Madruga, Ross A. Mikosh, Brian D. Watt
Optimizing execution of single-threaded programs on a multiprocessor managed by compilation

Patent number: 8312455

Abstract: A method for optimizing execution of a single threaded program on a multi-core processor. The method includes dividing the single threaded program into a plurality of discretely executable components while compiling the single threaded program; identifying at least some of the plurality of discretely executable components for execution by an idle core within the multi-core processor; and enabling execution of the at least one of the plurality of discretely executable components on the idle core.

Type: Grant

Filed: December 19, 2007

Date of Patent: November 13, 2012

Assignee: International Business Machines Corporation

Inventors: Robert H. Bell, Jr., Louis Bennie Capps, Jr., Michael A. Paolini, Michael Jay Shapiro
Content receiving apparatus and method, storage medium, and server

Patent number: 8312252

Abstract: A content receiver is compatible with a plurality of rights management and protection methods (RMP) devised for each content distribution system. Only the format which specifies the specification of the RMP formed of information such as content billing, security, and copyright protection, is standardized. Each content provider inputs encrypted content and rights processing information to content in a form conforming to the standardized specification. For content users, by merely being provided with functions corresponding to each RMP method in advance, even if the content is based on any RMP method, the content can be decrypted and used in the same content receiver.

Type: Grant

Filed: April 26, 2005

Date of Patent: November 13, 2012

Assignee: Sony Corporation

Inventor: Tadashi Ezaki
Scalable bus-based on-chip interconnection networks

Patent number: 8307116

Abstract: The present disclosure generally relates to systems for routing data across a multinodal network. Example systems include a multinodal array having a plurality of nodes and a plurality of physical communication channels connecting the nodes. At least one of the physical communication channels may be configured to route data from a first node to two or more other destination nodes of the plurality of nodes. The present disclosure also generally relates to methods for routing data across a multinodal network and computer accessible mediums having stored thereon computer executable instructions for performing techniques for routing data across a multinodal network.

Type: Grant

Filed: June 19, 2009

Date of Patent: November 6, 2012

Assignee: Board of Regents of the University of Texas System

Inventors: Stephen W. Keckler, Boris Grot
System to profile and optimize user software in a managed run-time environment

Patent number: 8301868

Abstract: Method, apparatus, and system for monitoring performance within a processing resource, which may be used to modify user-level software. Some embodiments of the invention pertain to an architecture to allow a user to improve software running on a processing resource on a per-thread basis in real-time and without incurring significant processing overhead.

Type: Grant

Filed: September 23, 2005

Date of Patent: October 30, 2012

Assignee: Intel Corporation

Inventors: Chris J. Newburn, Robert Knight, Robert Geva, Dion Rodgers, Xiang Zou, Hong Wang, Bryant E. Bigbee, Ittai Anati
Hierarchical register file with operand capture ports

Patent number: 8296550

Abstract: A hierarchical register file included in a hierarchical microprocessor that includes a plurality of execution clusters. An embodiment of the hierarchical register file includes a first-level register file including a plurality of mappable registers, where the first-level register file is configured to allocate the mappable registers to store execution results of instructions executed by the execution clusters and provide secondary register storage for each of the execution clusters. The hierarchical register file also includes a plurality of second-level register files operatively coupled with the first-level register file, where the plurality of second-level register files are configured to store instruction operands and provide the instruction operands to respective execution units of the execution clusters for use in executing associated instructions.

Type: Grant

Filed: October 31, 2007

Date of Patent: October 23, 2012

Assignee: The Invention Science Fund I, LLC

Inventor: Andrew Forsyth Glew
ALLOCATION OF COUNTERS FROM A POOL OF COUNTERS TO TRACK MAPPINGS OF LOGICAL REGISTERS TO PHYSICAL REGISTERS FOR MAPPER BASED INSTRUCTION EXECUTIONS

Publication number: 20120265969

Abstract: A computer system assigns a particular counter from among a plurality of counters currently in a counter free pool to count a number of mappings of logical registers from among a plurality of logical registers to a particular physical register from among a plurality of physical registers, responsive to an execution of an instruction by a mapper unit mapping at least one logical register from among the plurality of logical registers to the particular physical register, wherein the number of the plurality of counters is less than a number of the plurality of physical registers. The computer system, responsive to the counted number of mappings of logical registers to the particular physical register decremented to less than a minimum value, returns the particular counter to the counter free pool.
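
The counter-pool idea in this abstract can be sketched as a small bookkeeping class. The class name, method names, the default minimum of 1, and the error raised when the pool is empty are all assumptions made for illustration; the application does not specify them here.

```python
class CounterPool:
    """Shared pool of counters tracking logical-to-physical register mappings."""

    def __init__(self, n_counters, minimum=1):
        self.free = list(range(n_counters))  # counter free pool
        self.assigned = {}                   # physical register -> counter id
        self.counts = {}                     # counter id -> number of mappings
        self.minimum = minimum

    def map(self, phys_reg):
        """Record one more logical register mapped to phys_reg."""
        if phys_reg not in self.assigned:
            if not self.free:
                raise RuntimeError("no free counter (stall policy is an assumption)")
            counter = self.free.pop()
            self.assigned[phys_reg] = counter
            self.counts[counter] = 0
        self.counts[self.assigned[phys_reg]] += 1

    def unmap(self, phys_reg):
        """Remove one mapping; return the counter to the pool below the minimum."""
        counter = self.assigned[phys_reg]
        self.counts[counter] -= 1
        if self.counts[counter] < self.minimum:
            del self.assigned[phys_reg]
            del self.counts[counter]
            self.free.append(counter)
```

The pool can hold fewer counters than there are physical registers because only registers with tracked mappings hold a counter at any time.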

Type: Application

Filed: April 18, 2012

Publication date: October 18, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: GREGORY W. ALEXANDER, BRIAN D. BARRICK, JOHN W. WARD, III
SYSTEM AND METHOD OF INDIRECT REGISTER ACCESS

Publication number: 20120265970

Abstract: Systems and methods are provided for managing access to registers. A system may include a set of direct registers and a set of indirect registers. The indirect registers may be accessed through the direct registers, and the direct registers may provide various features to provide faster access to the indirect registers. One of the direct registers may indicate access modes for accessing the indirect registers. The access modes may include auto-increment, auto-decrement, auto-reset, and no change modes. Based on the access mode, the currently accessed address may be automatically modified after accessing the indirect register at the address.
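
The four access modes named in the abstract can be modeled as follows. This is a sketch under stated assumptions: the class and attribute names are hypothetical, addresses wrap modulo the register count, and auto-reset is taken to mean "return the address to 0 after each access", which the abstract does not define.

```python
NO_CHANGE, AUTO_INCREMENT, AUTO_DECREMENT, AUTO_RESET = range(4)


class IndirectRegisterFile:
    """Indirect registers reached through direct address, mode, and data registers."""

    def __init__(self, size):
        self.indirect = [0] * size
        self.address = 0        # direct register: currently accessed address
        self.mode = NO_CHANGE   # direct register: access mode

    def _advance(self):
        """Modify the address after an access, according to the access mode."""
        if self.mode == AUTO_INCREMENT:
            self.address = (self.address + 1) % len(self.indirect)
        elif self.mode == AUTO_DECREMENT:
            self.address = (self.address - 1) % len(self.indirect)
        elif self.mode == AUTO_RESET:
            self.address = 0    # assumed meaning of auto-reset

    def write_data(self, value):
        self.indirect[self.address] = value
        self._advance()

    def read_data(self):
        value = self.indirect[self.address]
        self._advance()
        return value
```

With auto-increment selected, a block of indirect registers can be filled with back-to-back data writes and no intervening address writes, which is the faster access the abstract refers to.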

Type: Application

Filed: June 22, 2012

Publication date: October 18, 2012

Applicant: Micron Technology, Inc.

Inventors: Harold B Noyes, Mark Jurenka, Gavin Huggins
Apparatus and method for regulating bursty data in a signal processing pipeline

Patent number: 8291198

Abstract: Apparatus and method for regulating data in a signal processing pipeline are disclosed. For example, an apparatus is disclosed that includes a first element operable to determine a time interval between a first plurality of data samples input to the signal processing pipeline, and calculate a sample spacing count value associated with the time interval, a second element coupled to the first element, the second element operable to hold the sample spacing count value until the time interval between the first plurality of data samples is changed, a third element coupled to the second element and the signal processing pipeline, the third element operable to output a control signal to the signal processing pipeline, and responsive to the control signal, the signal processing pipeline operable to output a second plurality of data samples.

Type: Grant

Filed: September 11, 2006

Date of Patent: October 16, 2012

Assignee: Samsung Electronics Co., Ltd.

Inventors: Jordan Charles Mott, William Milton Hurley
Forward-pass dead instruction identification and removal at run-time

Patent number: 8291196

Abstract: Apparatuses and methods for dead instruction identification are disclosed. In one embodiment, an apparatus includes an instruction buffer and a dead instruction identifier. The instruction buffer is to store an instruction stream having a single entry point and a single exit point. The dead instruction identifier is to identify dead instructions based on a forward pass through the instruction stream.
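
One way a single forward pass can find dead instructions is to flag a register write that is overwritten before any read. The sketch below illustrates that idea only and is not claimed to be the patented identifier; the (dest, sources) tuple format is an assumption, instructions are assumed to have no side effects, and values still unread at the exit point are conservatively treated as live.

```python
def find_dead(instrs):
    """Return indices of instructions whose result is overwritten before use.

    instrs: list of (dest, sources) tuples for a single-entry, single-exit
    stream; dest may be None for instructions that write no register.
    """
    pending = {}  # register -> index of the latest write not yet read
    dead = set()
    for i, (dest, sources) in enumerate(instrs):
        for src in sources:
            pending.pop(src, None)        # the value was consumed, so its writer is live
        if dest is not None:
            if dest in pending:
                dead.add(pending[dest])   # overwritten with no intervening read
            pending[dest] = i
    return sorted(dead)
```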

Type: Grant

Filed: December 29, 2005

Date of Patent: October 16, 2012

Assignee: Intel Corporation

Inventors: Stephan J. Jourdan, Matthew C. Merten, Alexandre J. Farcy
Block driven computation with an address generation accelerator

Patent number: 8285971

Abstract: A processor includes at least one execution unit that executes instructions, at least one register file, coupled to the at least one execution unit, that buffers operands for access by the at least one execution unit, an instruction sequencing unit that fetches instructions for execution by the at least one execution unit, and an address generation accelerator. The address generation accelerator, responsive to an initiation signal received from the instruction sequencing unit, computes and outputs first and second effective addresses of operands of an operation.

Type: Grant

Filed: December 16, 2008

Date of Patent: October 9, 2012

Assignee: International Business Machines Corporation

Inventors: Ravi K. Arimilli, Balaram Sinharoy
Specifying an addressing relationship in an operand data structure

Patent number: 8281106

Abstract: A processor includes at least one execution unit that executes instructions, at least one register file, coupled to the at least one execution unit, that buffers operands for access by the at least one execution unit, and an instruction sequencing unit that fetches instructions for execution by the execution unit. The processor further includes an operand data structure and an address generation accelerator. The operand data structure specifies a first relationship between addresses of sequential accesses within a first address region and a second relationship between addresses of sequential accesses within a second address region. The address generation accelerator computes a first address of a first memory access in the first address region by reference to the first relationship and a second address of a second memory access in the second address region by reference to the second relationship.

Type: Grant

Filed: December 16, 2008

Date of Patent: October 2, 2012

Assignee: International Business Machines Corporation

Inventors: Ravi K. Arimilli, Balaram Sinharoy
PROCESSING LONG-LATENCY INSTRUCTIONS IN A PIPELINED PROCESSOR

Publication number: 20120246451

Abstract: There is provided a method and processor for processing a thread. The thread comprises a plurality of sequential instructions, the plurality of sequential instructions comprising some short-latency instructions and some long-latency instructions and at least one hazard instruction, the hazard instruction requiring one or more preceding instructions to be processed before the hazard instruction is processed. The method comprises the steps of: a) before processing each long-latency instruction, incrementing by one, a counter associated with the thread; b) after each long-latency instruction has been processed, decrementing by one, the counter associated with the thread; c) before processing each hazard instruction, checking the value of the counter associated with the thread, and i) if the counter value is zero, processing the hazard instruction, or ii) if the counter value is non-zero, pausing processing of the hazard instruction until a later time.
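
Steps a) to c) of this abstract amount to a per-thread counter that gates hazard instructions. A minimal sketch follows; the class name, the string instruction kinds, and the boolean return convention are assumptions made for the example.

```python
class HazardGate:
    """Per-thread counter of outstanding long-latency instructions."""

    def __init__(self):
        self.outstanding = 0

    def issue(self, kind):
        """Return True if the instruction may be processed now, False if paused."""
        if kind == "long":
            self.outstanding += 1          # step a: increment before processing
            return True
        if kind == "hazard":
            return self.outstanding == 0   # step c: process only when the counter is zero
        return True                        # short-latency instructions are not gated

    def complete_long(self):
        """Step b: decrement after a long-latency instruction has been processed."""
        if self.outstanding == 0:
            raise RuntimeError("no long-latency instruction outstanding")
        self.outstanding -= 1
```

A paused hazard instruction is simply retried later; the counter needs no record of which long-latency instructions are in flight, only how many.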

Type: Application

Filed: June 3, 2012

Publication date: September 27, 2012

Applicant: Imagination Technologies, Ltd.

Inventors: Morrie Berglas, Yoong Chert Foo
Hierarchical instruction scheduler facilitating instruction replay

Patent number: 8275976

Abstract: A hierarchical instruction scheduler included in a hierarchical microprocessor comprising a plurality of execution clusters. In one embodiment, a hierarchical instruction scheduler comprises a first-level instruction scheduler configured to receive instructions for execution; store first operand status information for respective operands of the instructions; and dispatch the instructions to respective execution clusters based on the instructions' respective first operand status information.

Type: Grant

Filed: October 31, 2007

Date of Patent: September 25, 2012

Assignee: The Invention Science Fund I, LLC

Inventor: Andrew Forsyth Glew
SEAMLESS INTERFACE FOR MULTI-THREADED CORE ACCELERATORS

Publication number: 20120239904

Abstract: A method, system and computer program product are disclosed for interfacing between a multi-threaded processing core and an accelerator. In one embodiment, the method comprises copying from the processing core to the hardware accelerator memory address translations for each of multiple threads operating on the processing core, and simultaneously storing on the hardware accelerator one or more of the memory address translations for each of the threads. Whenever any one of the multiple threads operating on the processing core instructs the hardware accelerator to perform a specified operation, the hardware accelerator has stored thereon one or more of the memory address translations for the any one of the threads. This facilitates starting that specified operation without memory translation faults. In an embodiment, the copying includes, each time one of the memory address translations is updated on the processing core, copying the updated one of the memory address translations to the hardware accelerator.

Type: Application

Filed: March 15, 2011

Publication date: September 20, 2012

Applicant: International Business Machines Corporation

Inventors: Kattamuri Ekanadham, Hung Q. Le, Jose E. Moreira, Pratap C. Pattnaik
SYSTEMS AND METHODS FOR VOTING AMONG PARALLEL THREADS

Publication number: 20120239909

Abstract: One embodiment of the present invention sets forth a technique for efficiently performing voting operations within a multi-threaded parallel-processing system. A group of related parallel program threads executes within a processor core together in parallel. A new instruction, called a “vote” instruction, is introduced that enables a parallel program thread to post an individual vote within the context of the group of related threads and to receive the result of the vote. In this fashion, the vote instruction advantageously reduces overhead associated with inter-thread communication, thereby improving overall system performance.
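
The semantics of a group vote can be illustrated in a few lines. This sketch assumes two voting modes, "all" and "any", and models the group as a list of per-thread predicates; the abstract itself does not enumerate the modes, so both the modes and the function name are assumptions.

```python
def vote(predicates, mode="all"):
    """Combine one predicate per thread and return the result to every thread."""
    if mode == "all":
        result = all(predicates)
    elif mode == "any":
        result = any(predicates)
    else:
        raise ValueError("unknown vote mode: " + mode)
    # Every thread in the group receives the same outcome.
    return [result] * len(predicates)
```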

Type: Application

Filed: May 31, 2012

Publication date: September 20, 2012

Inventors: John R. Nickolls, Lars Nyland, Peter C. Mills, Jeremy Sugerman, Timothy Foley, Brian Fahs, Michael Garland, David P. Luebke
DUAL THREAD PROCESSOR

Publication number: 20120239908

Abstract: Pipeline processor architectures, processors, and methods are provided. A described processor includes thread allocation counters for corresponding processor threads. For example, a first counter is configured to store a first processor time allocation that controls first periods of processor time for a first processor thread, the first processor thread retaining control of the processor during each of the first periods of processor time. The processor causes data associated with the first processor thread to pass through the processor's pipeline during the first periods of processor time. A second counter is similarly configured. The processor can be configured to receive an input defining processor time to be allocated to one or more processor threads and to use the input to change one or more of the counters such that subsequent periods of processor times for the one or more processor threads are affected.

Type: Application

Filed: May 31, 2012

Publication date: September 20, 2012

Inventors: Hong-Yi Chen, Sehat Sutardja
Identifying Initial Don't Care Memory Elements for Simulation

Publication number: 20120239368

Abstract: In an embodiment, the design of a digital circuit may be analyzed to identify which uninitialized memory elements, such as flops, have initial don't care values. The analysis may include determining that each possible initial value (e.g. zero and one) of the flops does not impact the outputs of circuitry to which the uninitialized flops are connected. For example, a model may be generated that includes two instances of the uninitialized flops and corresponding logic circuitry. The inputs of the two instances may be connected together, and the uninitialized flops may be initialized to zero in one instance and one in the other instance. If the outputs of the two instances are equal for any input stimulus, the initial value of the uninitialized flops may be don't cares. The flops may be safely initialized to a known value for simulation.
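
The two-instance comparison can be sketched by brute-force simulation of a tiny circuit with one flop. This is an illustration under strong assumptions: the step function signature is hypothetical, only a single flop is modeled, and checking every stimulus up to a fixed number of cycles is a bounded check rather than the proof a formal tool would provide.

```python
from itertools import product


def is_initial_dont_care(step, n_inputs, n_cycles=4):
    """Compare two circuit instances whose single flop starts at 0 and at 1.

    step(state, inputs) -> (next_state, output) models the flop and its logic.
    Returns True if the outputs match for every stimulus up to n_cycles long.
    """
    input_vectors = list(product((0, 1), repeat=n_inputs))
    for stimulus in product(input_vectors, repeat=n_cycles):
        state0, state1 = 0, 1              # the two initial values under test
        for inputs in stimulus:            # both instances share the same inputs
            state0, out0 = step(state0, inputs)
            state1, out1 = step(state1, inputs)
            if out0 != out1:
                return False               # the initial value is observable
    return True


def unobserved(state, inputs):
    """Flop captures the input, but the output never depends on the flop."""
    return inputs[0], inputs[0]


def observed(state, inputs):
    """Flop captures the input, and the output exposes the stored value."""
    return inputs[0], state
```

Here is_initial_dont_care(unobserved, 1) holds, so that flop could be initialized to either value for simulation, while observed fails on the first cycle.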

Type: Application

Filed: March 17, 2011

Publication date: September 20, 2012

Inventor: Nimrod Agmon
Processor architecture for executing wide transform slice instructions

Patent number: 8269784

Abstract: A programmable processor and method for improving the performance of processors by expanding at least two source operands, or a source and a result operand, to a width greater than the width of either the general purpose register or the data path width. The present invention provides operands which are substantially larger than the data path width of the processor by using the contents of a general purpose register to specify a memory address at which a plurality of data path widths of data can be read or written, as well as the size and shape of the operand. In addition, several instructions and apparatus for implementing these instructions are described which obtain performance advantages if the operands are not limited to the width and accessible number of general purpose registers.

Type: Grant

Filed: January 19, 2012

Date of Patent: September 18, 2012

Assignee: MicroUnity Systems Engineering, Inc.

Inventors: Craig Hansen, John Moussouris, Alexia Massalin
Multi-Thread Processors and Methods for Instruction Execution and Synchronization Therein and Computer Program Products Thereof

Publication number: 20120233445

Abstract: Methods for instruction execution and synchronization in a multi-thread processor are provided, wherein in the multi-thread processor, multiple threads are running and each of the threads can simultaneously execute a same instruction sequence. A source code or an object code is received and then compiled to generate the instruction sequence. Instructions for all of function calls within the instruction sequence are sorted according to a calling order. Each thread is provided a counter value pointing to one of the instructions in the instruction sequence. A main counter value is determined according to the counter values of the threads such that all of the threads simultaneously execute an instruction of the instruction sequence that the main counter value points to.

Type: Application

Filed: March 8, 2011

Publication date: September 13, 2012

Applicant: VIA TECHNOLOGIES, INC.

Inventor: Yangang Zhang
Hierarchical store buffer having segmented partitions

Patent number: 8266412

Abstract: A hierarchical store buffer included in a hierarchical microprocessor includes a plurality of execution clusters. An embodiment of a hierarchical store buffer includes a first-level store buffer configured to receive data values to be written to a memory subsystem from the plurality of execution clusters and store the received data values prior to writing the data values to the memory subsystem and a plurality of second-level store buffers each operatively coupled with the first-level store buffer, each second-level store buffer being included in a respective execution cluster.

Type: Grant

Filed: October 31, 2007

Date of Patent: September 11, 2012

Assignee: The Invention Science Fund I, LLC

Inventor: Andrew Forsyth Glew
Hardware controller to choose selected hardware entity and to execute instructions in relation to selected hardware entity

Publication number: 20120226893

Abstract: A hardware controller includes a first hardware interface, a second hardware interface, first hardware logic, and second hardware logic. The first hardware interface is to couple the hardware controller to hardware entities of a hardware device in which the hardware controller is to be included. The second hardware interface is to couple the hardware controller to a memory to receive instructions. The first hardware logic is to choose a selected hardware entity from the hardware entities. The second hardware logic is to execute the instructions in relation to the selected hardware entity.

Type: Application

Filed: March 3, 2011

Publication date: September 6, 2012

Inventors: Mary T. Prenn, Bradley R. Larson, Russell Fredrickson
Software pipelining on a network on chip

Patent number: 8261025

Abstract: Memory sharing in a software pipeline on a network on chip (‘NOC’), the NOC including integrated processor (‘IP’) blocks, routers, memory communications controllers, and network interface controllers, with each IP block adapted to a router through a memory communications controller and a network interface controller, where each memory communications controller controlling communications between an IP block and memory, and each network interface controller controlling inter-IP block communications through routers, including segmenting a computer software application into stages of a software pipeline, the software pipeline comprising one or more paths of execution; allocating memory to be shared among at least two stages including creating a smart pointer, the smart pointer including data elements for determining when the shared memory can be deallocated; determining, in dependence upon the data elements for determining when the shared memory can be deallocated, that the shared memory can be deallocated; and d

Type: Grant

Filed: November 12, 2007

Date of Patent: September 4, 2012

Assignee: International Business Machines Corporation

Inventors: Eric O. Mejdrich, Paul E. Schardt, Robert A. Shearer
Optional function multi-function instruction in an emulated computing environment

Patent number: 8261048

Abstract: A method, system and program product for executing a multi-function instruction in an emulated computer system by specifying, via the multi-function instruction, either a capability query or execution of a selected function of one or more optional functions, wherein the selected function is an installed optional function, wherein the capability query determines which optional functions of the one or more optional functions are installed on the computer system.

Type: Grant

Filed: December 13, 2011

Date of Patent: September 4, 2012

Assignee: International Business Machines Corporation

Inventors: Shawn D. Lundvall, Ronald M. Smith, Sr., Phil Chi-Chung Yeh
Single-chip multiprocessor with clock cycle-precise program scheduling of parallel execution

Patent number: 8261250

Abstract: A single-chip multiprocessor system and operation method of this system based on a static macro-scheduling of parallel streams for multiprocessor parallel execution. The single-chip multiprocessor system has buses for direct exchange between the processor register files and access to their store addresses and data. Each explicit parallelism architecture processor of this system has an interprocessor interface providing the synchronization signals exchange, data exchange at the register file level and access to store addresses and data of other processors. The single-chip multiprocessor system uses ILP to increase the performance. Synchronization of the streams parallel execution is ensured using special operations setting a sequence of streams and stream fragments execution prescribed by the program algorithm.

Type: Grant

Filed: January 10, 2011

Date of Patent: September 4, 2012

Assignee: Elbrus International

Inventors: Boris A. Babaian, Yuli Kh. Sakhin, Vladimir Yu. Volkonskiy, Sergey A. Rozhkov, Vladimir V. Tikhorsky, Feodor A. Gruzdov, Leonid N. Nazarov, Mikhail L. Chudakov
Methods, apparatus and systems to improve security in computer systems

Patent number: 8261085

Abstract: According to some implementations methods, apparatus and systems are provided involving the use of processors having at least one core with a security component, the security component adapted to read and verify data within data blocks stored in a L1 instruction cache memory and to allow the execution of data block instructions in the core only upon the instructions being verified by the use of a cryptographic algorithm.

Type: Grant

Filed: September 26, 2011

Date of Patent: September 4, 2012

Assignee: Media Patents, S.L.

Inventor: Álvaro Fernández Gutiérrez
DIAGNOSE INSTRUCTION FOR SERIALIZING PROCESSING

Publication number: 20120216195

Abstract: A system serialization capability is provided to facilitate processing in those environments that allow multiple processors to update the same resources. The system serialization capability is used to facilitate processing in a multi-processing environment in which guests and hosts use locks to provide serialization. The system serialization capability includes a diagnose instruction which is issued after the host acquires a lock, eliminating the need for the guest to acquire the lock.

Type: Application

Filed: April 28, 2012

Publication date: August 23, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor: Lisa C. Heller
ASYNCHRONOUS ASSIST THREAD INITIATION

Publication number: 20120204011

Abstract: A method of data processing includes a processor of a data processing system executing a controlling thread of a program and detecting occurrence of a particular asynchronous event during execution of the controlling thread of the program. In response to occurrence of the particular asynchronous event during execution of the controlling thread of the program, the processor initiates execution of an assist thread of the program such that the processor simultaneously executes the assist thread and controlling thread of the program.

Type: Application

Filed: April 16, 2012

Publication date: August 9, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: GILES R. FRAZIER, VENKAT R. INDUKURU
Microprocessor for Executing Byte Compiled Java Code

Publication number: 20120204017

Abstract: A microprocessor architecture for executing byte compiled Java programs directly in hardware. The microprocessor targets the lower end of the embedded systems domain and features two orthogonal programming models, a Java model and a RISC model. The entities share a common data path and operate independently, although not in parallel. The microprocessor includes a combined register file in which the Java module sees the elements in the register file as a circular operand stack and the RISC module sees the elements as a conventional register file. The integrated microprocessor architecture facilitates access to hardware-near instructions and provides powerful interrupt and instruction trapping capabilities.

Type: Application

Filed: April 23, 2012

Publication date: August 9, 2012

Applicant: ATMEL CORPORATION

Inventor: Oyvind Strom
CONFIGURABLE PIPELINE BASED ON ERROR DETECTION MODE IN A DATA PROCESSING SYSTEM

Publication number: 20120204012

Abstract: A method includes providing a data processor having an instruction pipeline, where the instruction pipeline has a plurality of instruction pipeline stages, and where the plurality of instruction pipeline stages includes a first instruction pipeline stage and a second instruction pipeline stage. The method further includes providing a data processor instruction that causes the data processor to perform a first set of computational operations during execution of the data processor instruction, performing the first set of computational operations in the first instruction pipeline stage if the data processor instruction is being executed and a first mode has been selected, and performing the first set of computational operations in the second instruction pipeline stage if the data processor instruction is being executed and a second mode has been selected.

Type: Application

Filed: April 13, 2012

Publication date: August 9, 2012

Applicant: Rambus Inc.

Inventors: William C. Moyer, Jeffrey W. Scott
Reduction of memory latencies using fine grained parallelism and FIFO data structures

Patent number: 8239866

Abstract: Software rendering and fine grained parallelism are utilized to reduce/avoid memory latency in a multi-processor (MP) system. According to one embodiment, the management of the transfer of data from one processor to another in the MP environment is moved into a low overhead hardware system. The low overhead hardware system may be a FIFO (“First In First Out”) hardware control. Each FIFO may be real or virtual.

Type: Grant

Filed: April 24, 2009

Date of Patent: August 7, 2012

Assignee: Microsoft Corporation

Inventor: Susan Carrie
System and method for double-issue instructions using a dependency matrix

Patent number: 8239661

Abstract: A method for double-issue complex instructions receives a complex instruction comprising a first portion and a second portion. The method sets a single issue queue slot and allocates an execution unit for the complex instruction, and identifies dependencies in the first and second portions. The method sets a dependency matrix slot and a consumers table slot for the first and second portions. In the event the first portion dependencies have been satisfied, the method issues the first portion and then issues the second portion from the single issue queue slot. In the event the second portion dependencies have not been satisfied, the method cancels the second portion issue.

Type: Grant

Filed: August 28, 2008

Date of Patent: August 7, 2012

Assignee: International Business Machines Corporation

Inventors: Christopher M. Abernathy, Mary D. Brown, Todd A. Venton
Scalable packet processing systems and methods

Patent number: 8234653

Abstract: A data processing architecture includes multiple processors connected in series between a load balancer and reorder logic. The load balancer is configured to receive data and distribute the data across the processors. Appropriate ones of the processors are configured to process the data. The reorder logic is configured to receive the data processed by the processors, reorder the data, and output the reordered data.

Type: Grant

Filed: May 30, 2008

Date of Patent: July 31, 2012

Assignee: Juniper Networks, Inc.

Inventors: John C Carney, Michael E Lipman
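
The distribute, process, and reorder flow in the abstract above can be sketched roughly as follows. This is a simplified software model, not the patented hardware: the round-robin policy, the sequence-number tagging, and the names `distribute` and `reorder` are assumptions introduced for illustration.

```python
def distribute(items, n_processors):
    """Assumed round-robin load balancer that tags items with sequence numbers."""
    queues = [[] for _ in range(n_processors)]
    for seq, item in enumerate(items):
        queues[seq % n_processors].append((seq, item))
    return queues


def reorder(completed):
    """Release results in original order, buffering any that arrive early."""
    pending = {}   # sequence number -> result waiting for its turn
    next_seq = 0   # next sequence number allowed to leave
    out = []
    for seq, result in completed:
        pending[seq] = result
        while next_seq in pending:
            out.append(pending.pop(next_seq))
            next_seq += 1
    return out
```

Tagging each item before distribution lets the processors finish in any order while the output still matches the input order.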
PROCESSOR HAVING INCREASED PERFORMANCE AND ENERGY SAVING VIA INSTRUCTION PRE-COMPLETION

Publication number: 20120191954

Abstract: Methods and apparatuses are provided for achieving increased performance and energy saving via instruction pre-completion without having to schedule instruction execution in processor execution units. The apparatus comprises an operational unit for determining whether an instruction can be completed without scheduling use of an execution unit of the processor and units within the operational unit capable of employing alternate or equivalent processes or techniques to complete the instruction. In this way, the instruction is completed without scheduling use of the execution unit of the processor. The method comprises determining that an instruction can be completed without scheduling use of an execution unit of a processor and then pre-completing the instruction without use of one or more of the execution units.

Type: Application

Filed: January 20, 2011

Publication date: July 26, 2012

Applicant: ADVANCED MICRO DEVICES, INC.

Inventors: Jay FLEISCHMAN, Debjit DAS SARMA
Multithreaded processor architecture with operational latency hiding

Patent number: 8230423

Abstract: A method and processor architecture for achieving a high level of concurrency and latency hiding in an “infinite-thread processor architecture” with a limited number of hardware threads is disclosed. A preferred embodiment defines “fork” and “join” instructions for spawning new context-switched threads. Context switching is used to hide the latency of both memory-access operations (i.e., loads and stores) and arithmetic/logical operations. When an operation executing in a thread incurs a latency having the potential to delay the instruction pipeline, the latency is hidden by performing a context switch to a different thread. When the result of the operation becomes available, a context switch back to that thread is performed to allow the thread to continue.

Type: Grant

Filed: April 7, 2005

Date of Patent: July 24, 2012

Assignee: International Business Machines Corporation

Inventors: Matteo Frigo, Ahmed Gheith, Volker Strumpen
Migrating sleeping and waking threads between wake-and-go mechanisms in a multiple processor data processing system

Patent number: 8230201

Abstract: A wake-and-go mechanism is provided for a data processing system. The wake-and-go mechanism detects a thread running on a first processing unit within a plurality of processing units that is waiting for an event that modifies a data value associated with a target address. The wake-and-go mechanism creates a wake-and-go instance for the thread by populating a wake-and-go storage array with the target address. The operating system places the thread in a sleep state. Responsive to detecting the event that modifies the data value associated with the target address, the wake-and-go mechanism assigns the wake-and-go instance to a second processing unit within the plurality of processing units. The operating system on the second processing unit places the thread in a non-sleep state.

Type: Grant

Filed: April 16, 2009

Date of Patent: July 24, 2012

Assignee: International Business Machines Corporation

Inventors: Ravi K. Arimilli, Satya P. Sharma, Randal C. Swanberg
METHODS AND SYSTEMS FOR STORAGE OF BINARY INFORMATION THAT IS USABLE IN A MIXED COMPUTING ENVIRONMENT

Publication number: 20120185677

Abstract: A method of managing binary data across a mixed computing environment is provided. The method includes performing on one or more processors: receiving binary data; receiving binary coded data indicating a type of the binary data; formatting the binary data and the binary coded data according to a first format; and generating at least one of a message and a file based on the formatted data.

Type: Application

Filed: January 14, 2011

Publication date: July 19, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Harry J. Beatty, III, Peter C. Elmendorf, Charles Gates, Luo Chen
HARDWARE THREAD DISABLE WITH STATUS INDICATING SAFE SHARED RESOURCE CONDITION

Publication number: 20120185678

Abstract: A technique for indicating a safe shared resource condition with respect to a disabled thread provides a mechanism for providing a fast indication to other hardware threads that a temporarily disabled thread can no longer impact shared resources, such as shared special-purpose registers and translation look-aside buffers within the processor core. Signals from pipelines within the core indicate whether any of the instructions pending in the pipeline impact the shared resources and if not, then the thread disable status is presented to the other threads via a state change in a thread status register. Upon receiving an indication that a particular hardware thread is to be disabled, control logic halts the dispatch of instructions for the particular hardware thread, and then waits until any indication that a shared resource is impacted by an instruction has cleared. Then the control logic updates the thread status to indicate the thread is disabled.

Type: Application

Filed: March 30, 2012

Publication date: July 19, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Becky Bruce, Giles R. Frazier, Bradly G. Frey, Kumar K. Gala, Cathy May, Michael D. Snyder, Gary Whisenhunt, James Xenidis
Methods and systems for managing computations on a hybrid computing platform including a parallel accelerator

Patent number: 8225074

Abstract: In accordance with exemplary implementations, application computation operations and communications between operations on a host processing platform may be adapted to conform to the memory capacity of a parallel accelerator. Computation operations may be split and scheduled such that the computation operations fit within the memory capacity of the accelerator. Further, the operations may be automatically adapted without any modification to the code of an application. In addition, data transfers between a host processing platform and the parallel accelerator may be minimized in accordance with exemplary aspects of the present principles to improve processing performance.

Type: Grant

Filed: March 6, 2009

Date of Patent: July 17, 2012

Assignee: NEC Laboratories America, Inc.

Inventors: Srimat T. Chakradhar, Anand Raghunathan, Narayanan Sundaram
Processing data using continuous processing task and binary routine

Patent number: 8225320

Abstract: A computing method and system is presented that modifies a standard operating system utilizing two or more processing units to execute continuous processing tasks, such as processing or generating continuous audio, video or other types of data. One of the processors is tasked with running the operating system while each processing unit is dedicated towards running a single continuous processing task. Communication is provided between both processors enabling the continuous processing task to utilize the operating system without being affected by any operating system scheduling requirements.

Type: Grant

Filed: August 23, 2007

Date of Patent: July 17, 2012

Assignee: Advanced Simulation Technology, Inc.

Inventors: Manushantha (Manu) Sporny, Robert Kenneth Butterfield, Norton Kenneth James, Patrick Quinn Gaffney
Adapter allowing unaligned access to memory

Patent number: 8219785

Abstract: Methods and apparatus are provided for allowing a master component such as a processor on a programmable chip to access memory using unaligned addresses. An adapter connected to a master component determines if a master component memory access request is aligned. If the access request is aligned, the request is forwarded to memory and a response is provided to the master component. If the access request is unaligned, the adapter sends multiple access requests to memory and processes the responses in order to provide a correct response to the master component.

Type: Grant

Filed: September 25, 2006

Date of Patent: July 10, 2012

Assignee: Altera Corporation

Inventors: Timothy P. Allen, Jeffrey Orion Pritchard, Richard Noble Hill
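
The adapter behaviour in the abstract above (forward aligned requests, split unaligned ones) can be sketched in a few lines of Python. This is an illustrative model only: the 4-byte word size, the `Memory` class, and the function `adapter_read` are assumptions, not details from the patent.

```python
WORD = 4  # assumed memory word size in bytes


class Memory:
    """Toy memory that only accepts word-aligned reads."""

    def __init__(self, data):
        self.data = bytes(data)
        self.accesses = 0  # counts requests that reach memory

    def read_aligned(self, addr):
        assert addr % WORD == 0, "memory only accepts aligned addresses"
        self.accesses += 1
        return self.data[addr:addr + WORD]


def adapter_read(mem, addr):
    """Return WORD bytes starting at addr, aligned or not."""
    offset = addr % WORD
    if offset == 0:
        # Aligned request: forward it unchanged.
        return mem.read_aligned(addr)
    # Unaligned request: fetch the two words that straddle addr,
    # then extract the requested bytes from the combined response.
    base = addr - offset
    low = mem.read_aligned(base)
    high = mem.read_aligned(base + WORD)
    return (low + high)[offset:offset + WORD]
```

The master component sees a single response either way; only the number of memory accesses differs.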
Processing apparatus and method for performing computation

Publication number: 20120173853

Abstract: A processing apparatus includes an execution unit which performs computation on two operand inputs each being selectable between read data from a register and an immediate value. The processing apparatus also includes another execution unit which performs computation on two operand inputs, one of which is selectable between read data from a register and an immediate value, and the other of which is an immediate value. A control unit determines, based on a received instruction specifying a computation on two operands, whether each of the two operands specifies read data from a register or an immediate value. Depending on the determination result, the control unit causes one of the execution units to execute the computation specified by the received instruction.

Type: Application

Filed: November 14, 2011

Publication date: July 5, 2012

Applicant: FUJITSU LIMITED

Inventor: Masaki Ukai
PROCESSOR HAVING INCREASED EFFECTIVE PHYSICAL FILE SIZE VIA REGISTER MAPPING

Publication number: 20120173854

Abstract: Methods and apparatuses are provided for an efficient technique for processing registers having a known value while improving processor performance. The apparatus comprises a processor having a plurality of physical registers available for use in computations and a decoder for determining that a logical register contains a known value. A renaming unit maps the logical register containing the known value to an address outside an address range for the plurality of physical registers once the known value is determined. Thereafter, scheduling and execution units perform computations using the known value without storing the known value in one of the plurality of physical registers. The method comprises determining that a logical register of a processor has a known value and then mapping that logical register to a physical register address outside an expected range of physical register addresses, which indicates that the logical register represents the known value.

Type: Application

Filed: December 29, 2010

Publication date: July 5, 2012

Applicant: ADVANCED MICRO DEVICES, INC.

Inventors: Jay FLEISCHMAN, Debjit Das Sarma, Michael SEDMAK
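
The mapping idea in the abstract above can be modelled with a small Python sketch. It is a simplification under stated assumptions: the register count, the set of known values (0 and 1), the out-of-range addresses chosen for them, and the `Renamer` class are all hypothetical, and the sketch never returns registers to the free list.

```python
NUM_PHYSICAL = 8  # assumed number of physical registers (addresses 0..7)

# Hypothetical known values, each mapped to an address outside 0..7.
KNOWN_VALUES = {0: NUM_PHYSICAL, 1: NUM_PHYSICAL + 1}
ADDRESS_TO_VALUE = {addr: value for value, addr in KNOWN_VALUES.items()}


class Renamer:
    """Toy rename table mapping logical registers to physical addresses."""

    def __init__(self):
        self.free = list(range(NUM_PHYSICAL))
        self.table = {}                        # logical name -> address
        self.regfile = [None] * NUM_PHYSICAL   # physical register contents

    def write(self, logical, value):
        if value in KNOWN_VALUES:
            # Known value: map out of range, consuming no physical register.
            self.table[logical] = KNOWN_VALUES[value]
            return
        phys = self.free.pop(0)
        self.regfile[phys] = value
        self.table[logical] = phys

    def read(self, logical):
        addr = self.table[logical]
        if addr >= NUM_PHYSICAL:
            # Out-of-range address encodes the known value itself.
            return ADDRESS_TO_VALUE[addr]
        return self.regfile[addr]
```

Because known values never occupy a physical register, the same register file can back more logical state, which is the effect the title describes.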
Systems and methods for voting among parallel threads

Patent number: 8214625

Abstract: One embodiment of the present invention sets forth a technique for efficiently performing voting operations within a multi-threaded parallel-processing system. A group of related parallel program threads executes within a processor core together in parallel. A new instruction, called a “vote” instruction, is introduced that enables a parallel program thread to post an individual vote within the context of the group of related threads and to receive the result of the vote. In this fashion, the vote instruction advantageously reduces overhead associated with inter-thread communication, thereby improving overall system performance.

Type: Grant

Filed: November 26, 2008

Date of Patent: July 3, 2012

Assignee: NVIDIA Corporation

Inventors: John R. Nickolls, Lars Nyland, Peter C. Mills, Jeremy Sugerman, Timothy Foley, Brian Fahs, Michael Garland, David P. Luebke
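
The post-and-receive behaviour of the vote instruction described above can be sketched as a plain Python function. The abstract does not say how votes are combined, so the `"any"` and `"all"` modes, the function name `vote`, and the list-based model of a thread group are assumptions for illustration only.

```python
def vote(predicates, mode="any"):
    """Combine one boolean vote per thread and hand every thread the result.

    predicates: list with one boolean per thread in the group.
    mode: assumed combining rule, either "any" or "all".
    """
    if mode == "any":
        result = any(predicates)
    elif mode == "all":
        result = all(predicates)
    else:
        raise ValueError(f"unknown vote mode: {mode}")
    # Every thread in the group receives the same outcome.
    return [result] * len(predicates)
```

In this model a single call replaces the shared-memory exchange that threads would otherwise need to agree on a group-wide condition.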
