Processing Control Patents (Class 712/220)
  • Patent number: 8327119
    Abstract: An apparatus executes a bit scan instruction that specifies an N-byte input operand. A first encoder forward bit scan encodes each input byte to generate N first bit vectors. A zero detector zero-detects each input byte to generate a second bit vector. A second encoder forward bit scan encodes the second bit vector to generate a third bit vector. An N:1 multiplexor, controlled by the third bit vector, selects one of the N first bit vectors to output a fourth bit vector. The apparatus concatenates the third and fourth bit vectors into a fifth bit vector that indicates the bit index of the least significant set bit of the input operand. A third encoder forward bit scan encodes a bit-reversed version of each input byte to generate N sixth bit vectors. A fourth encoder forward bit scan encodes a bit-reversed version of the second bit vector to generate a seventh bit vector. A second N:1 multiplexor, controlled by the seventh bit vector, selects one of the N sixth bit vectors to output an eighth bit vector.
    Type: Grant
    Filed: October 21, 2009
    Date of Patent: December 4, 2012
    Assignee: VIA Technologies, Inc.
    Inventor: Bryan Wayne Pogor
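
The byte-wise forward scan described in the abstract above can be sketched in software. This is a minimal illustration, assuming a 4-byte operand with byte 0 as the least significant byte; the names `scan_byte` and `bit_scan_forward` are hypothetical and do not come from the patent, and the reverse-scan path is not modeled.

```python
def scan_byte(b):
    """Forward bit scan of one byte: index of its lowest set bit (0 if none)."""
    for i in range(8):
        if b & (1 << i):
            return i
    return 0


def bit_scan_forward(value, n_bytes=4):
    """Return the index of the least significant set bit, or None if no bit is set."""
    data = [(value >> (8 * i)) & 0xFF for i in range(n_bytes)]
    first = [scan_byte(b) for b in data]   # one scan result per byte ("first bit vectors")
    nonzero = [b != 0 for b in data]       # zero detection per byte ("second bit vector")
    if not any(nonzero):
        return None
    byte_index = nonzero.index(True)       # scan of the zero-detect vector ("third")
    bit_index = first[byte_index]          # N:1 multiplexor selection ("fourth")
    return (byte_index << 3) | bit_index   # concatenation ("fifth")
```

Because each byte is scanned independently, the per-byte results do not depend on one another; only the final selection depends on which byte is the first nonzero one.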
  • Patent number: 8321579
    Abstract: Systems and methods for parallel stream item counting are disclosed. A data stream is partitioned into portions and the portions are assigned to a plurality of processing cores. A sequential kernel is executed at each processing core to compute a local count for items in an assigned portion of the data stream for that processing core. The counts are aggregated for all the processing cores to determine a final count for the items in the data stream. A frequency-aware counting method (FCM) for data streams includes dynamically capturing relative frequency phases of items from a data stream and placing the items in a sketch structure using a plurality of hash functions where a number of hash functions is based on the frequency phase of the item. A zero-frequency table is provided to reduce errors due to absent items.
    Type: Grant
    Filed: July 26, 2007
    Date of Patent: November 27, 2012
    Assignee: International Business Machines Corporation
    Inventors: Charu Chandra Aggarwal, Rajesh Bordawekar, Dina Thomas, Philip Shilung Yu
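
The partition, local-count, and aggregate steps in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: the "cores" are simulated sequentially, exact counting with `collections.Counter` stands in for the sketch structure, and the function names are hypothetical. The frequency-aware hashing and the zero-frequency table are not modeled.

```python
from collections import Counter


def partition(stream, n_cores):
    """Split the stream into n_cores portions (round-robin; an assumed policy)."""
    return [stream[i::n_cores] for i in range(n_cores)]


def local_count(portion):
    """Sequential kernel: count the items in one assigned portion."""
    return Counter(portion)


def parallel_count(stream, n_cores=4):
    """Aggregate the per-core local counts into a final count."""
    total = Counter()
    for portion in partition(stream, n_cores):
        total.update(local_count(portion))
    return total
```

Since counting is additive, the aggregated result equals a single pass over the whole stream regardless of how the portions are assigned.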
  • Publication number: 20120297171
    Abstract: There are provided methods and computer program products for generating code for an architecture encoding an extended register specification. A method for generating code for a fixed-width instruction set includes identifying a non-contiguous register specifier. The method further includes generating a fixed-width instruction word that includes the non-contiguous register specifier.
    Type: Application
    Filed: July 26, 2012
    Publication date: November 22, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Michael Karl Gschwind, Robert Kevin Montoye, Brett Olsson, John-David Wellman
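
A non-contiguous register specifier can be illustrated with a small encoder and decoder. The field layout below is entirely hypothetical (a 32-bit word with a 6-bit opcode in bits 26-31, the low five register bits in bits 21-25, and the two extended register bits in bits 0-1); it is a sketch of the idea, not the encoding claimed in the application.

```python
def encode(opcode, reg):
    """Pack a 7-bit register number into two separate fields of a 32-bit word."""
    if not 0 <= opcode < 64 or not 0 <= reg < 128:
        raise ValueError("opcode or register out of range")
    low = reg & 0x1F          # low five bits of the specifier
    high = (reg >> 5) & 0x3   # two extension bits, stored away from the low bits
    return (opcode << 26) | (low << 21) | high


def decode_reg(word):
    """Reassemble the register number from its two non-contiguous fields."""
    return ((word & 0x3) << 5) | ((word >> 21) & 0x1F)
```

The instruction word keeps its fixed width; the extra register bits simply occupy a field that is not adjacent to the original register field.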
  • Publication number: 20120297170
    Abstract: A method for decentralized resource allocation in an integrated circuit. The method includes receiving a plurality of requests from a plurality of resource consumers of a plurality of partitionable engines to access a plurality of resources, wherein the resources are spread across the plurality of engines and are accessed via a global interconnect structure. At each resource, a number of requests for access to said each resource are added. At said each resource, the number of requests are compared against a threshold limiter. At said each resource, a subsequent request that is received that exceeds the threshold limiter is canceled. Subsequently, requests that are not canceled within a current clock cycle are implemented.
    Type: Application
    Filed: May 18, 2012
    Publication date: November 22, 2012
    Applicant: SOFT MACHINES, INC.
    Inventor: Mohammad Abdallah
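
The per-resource counting and threshold check in the abstract above can be sketched for a single clock cycle. This is a minimal model, assuming one shared threshold for all resources and requests given in arrival order; the function name `arbitrate` and the tuple format are hypothetical.

```python
def arbitrate(requests, threshold):
    """Grant or cancel requests for one clock cycle.

    requests: list of (consumer, resource) pairs in arrival order.
    Returns (granted, canceled) lists of the same pairs.
    """
    counts = {}                 # running request count kept at each resource
    granted, canceled = [], []
    for consumer, resource in requests:
        counts[resource] = counts.get(resource, 0) + 1
        if counts[resource] > threshold:
            canceled.append((consumer, resource))   # exceeds the limiter
        else:
            granted.append((consumer, resource))
    return granted, canceled
```

Each resource decides using only its own count, which is what makes the allocation decentralized in this sketch.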
  • Publication number: 20120297163
    Abstract: A system and method for automatically migrating the execution of work units between multiple heterogeneous cores. A computing system includes a first processor core with a single instruction multiple data micro-architecture and a second processor core with a general-purpose micro-architecture. A compiler predicts that execution of a function call in a program migrates at a given location to a different processor core. The compiler creates a data structure to support moving live values associated with the execution of the function call at the given location. An operating system (OS) scheduler schedules at least code before the given location in program order to the first processor core. In response to receiving an indication that a condition for migration is satisfied, the OS scheduler moves the live values to a location indicated by the data structure for access by the second processor core and schedules code after the given location to the second processor core.
    Type: Application
    Filed: May 16, 2011
    Publication date: November 22, 2012
    Inventors: Mauricio Breternitz, Patryk Kaminski, Keith Lowery, Anton Chernoff, Dz-Ching Ju
  • Patent number: 8316218
    Abstract: A wake-and-go mechanism is provided for a microprocessor. The wake-and-go mechanism looks ahead in the instruction stream of a thread for programming idioms that indicate that the thread is waiting for an event. If a look-ahead polling operation succeeds, the look-ahead wake-and-go engine may record an instruction address for the corresponding idiom so that the wake-and-go mechanism may have the thread perform speculative execution at a time when the thread is waiting for an event. During execution, when the wake-and-go mechanism recognizes a programming idiom, the wake-and-go mechanism may store the thread state in the thread state storage. Instead of putting the thread to sleep, the wake-and-go mechanism may perform speculative execution.
    Type: Grant
    Filed: February 1, 2008
    Date of Patent: November 20, 2012
    Assignee: International Business Machines Corporation
    Inventors: Ravi K. Arimilli, Satya P. Sharma, Randal C. Swanberg
  • Patent number: 8316219
    Abstract: Provided are techniques for the managing of command queue dependencies and command queue synchronization. Incoming commands are actively tracked through their dependency relationships. Command dependencies may be tracked across multiple lists, including a submission list and a completion list. Each command on the submission list is prepared for processing and ultimately submitted to command processing logic. Command completion processing is performed on each command on the completion list, including but not limited to removing dependencies from pending commands and possibly queuing pending commands for submission to the command processing logic. Also provided as features of a command queue are a standby barrier, an active barrier and a marker. Standby and active barriers are employed to synchronize and track commands through the command queue. Markers are employed to track commands through the command queue.
    Type: Grant
    Filed: August 31, 2009
    Date of Patent: November 20, 2012
    Assignee: International Business Machines Corporation
    Inventors: Gregory H. Bellows, Joaquin Madruga, Ross A. Mikosh, Brian D. Watt
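
The dependency tracking described in the abstract above can be sketched with a small queue model. This is an assumption-laden illustration: the class `CommandQueue` and its fields are hypothetical, commands are identified by name, and barriers and markers are not modeled.

```python
class CommandQueue:
    """Tracks pending commands and moves them to a submission list when ready."""

    def __init__(self):
        self.pending = {}      # command -> set of dependencies not yet completed
        self.dependents = {}   # command -> commands waiting on it
        self.submission = []   # commands ready for the processing logic, in order
        self.completed = set()

    def enqueue(self, name, deps=()):
        remaining = {d for d in deps if d not in self.completed}
        for d in remaining:
            self.dependents.setdefault(d, []).append(name)
        if remaining:
            self.pending[name] = remaining
        else:
            self.submission.append(name)

    def complete(self, name):
        """Completion processing: remove this dependency from pending commands."""
        self.completed.add(name)
        for waiter in self.dependents.pop(name, []):
            remaining = self.pending[waiter]
            remaining.discard(name)
            if not remaining:
                del self.pending[waiter]
                self.submission.append(waiter)
```

A command reaches the submission list only after every command it depends on has gone through completion processing.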
  • Patent number: 8312455
    Abstract: A method for optimizing execution of a single threaded program on a multi-core processor. The method includes dividing the single threaded program into a plurality of discretely executable components while compiling the single threaded program; identifying at least some of the plurality of discretely executable components for execution by an idle core within the multi-core processor; and enabling execution of the at least one of the plurality of discretely executable components on the idle core.
    Type: Grant
    Filed: December 19, 2007
    Date of Patent: November 13, 2012
    Assignee: International Business Machines Corporation
    Inventors: Robert H. Bell, Jr., Louis Bennie Capps, Jr., Michael A. Paolini, Michael Jay Shapiro
  • Patent number: 8312252
    Abstract: A content receiver is compatible with a plurality of rights management and protection methods (RMP) devised for each content distribution system. Only the format which specifies the specification of the RMP formed of information such as content billing, security, and copyright protection, is standardized. Each content provider inputs encrypted content and rights processing information to content in a form conforming to the standardized specification. For content users, by merely being provided with functions corresponding to each RMP method in advance, even if the content is based on any RMP method, the content can be decrypted and used in the same content receiver.
    Type: Grant
    Filed: April 26, 2005
    Date of Patent: November 13, 2012
    Assignee: Sony Corporation
    Inventor: Tadashi Ezaki
  • Patent number: 8307116
    Abstract: The present disclosure generally relates to systems for routing data across a multinodal network. Example systems include a multinodal array having a plurality of nodes and a plurality of physical communication channels connecting the nodes. At least one of the physical communication channels may be configured to route data from a first node to two or more other destination nodes of the plurality of nodes. The present disclosure also generally relates to methods for routing data across a multinodal network and computer accessible mediums having stored thereon computer executable instructions for performing techniques for routing data across a multinodal network.
    Type: Grant
    Filed: June 19, 2009
    Date of Patent: November 6, 2012
    Assignee: Board of Regents of the University of Texas System
    Inventors: Stephen W. Keckler, Boris Grot
  • Patent number: 8301868
    Abstract: Method, apparatus, and system for monitoring performance within a processing resource, which may be used to modify user-level software. Some embodiments of the invention pertain to an architecture to allow a user to improve software running on a processing resource on a per-thread basis in real-time and without incurring significant processing overhead.
    Type: Grant
    Filed: September 23, 2005
    Date of Patent: October 30, 2012
    Assignee: Intel Corporation
    Inventors: Chris J. Newburn, Robert Knight, Robert Geva, Dion Rodgers, Xiang Zou, Hong Wang, Bryant E. Bigbee, Ittai Anati
  • Patent number: 8296550
    Abstract: A hierarchical register file included in a hierarchical microprocessor that includes a plurality of execution clusters. An embodiment of the hierarchical register file includes a first-level register file including a plurality of mappable registers, where the first-level register file is configured to allocate the mappable registers to store execution results of instructions executed by the execution clusters and provide secondary register storage for each of the execution clusters. The hierarchical register file also includes a plurality of second-level register files operatively coupled with the first-level register file, where the plurality of second-level register files are configured to store instruction operands and provide the instruction operands to respective execution units of the execution clusters for use in executing associated instructions.
    Type: Grant
    Filed: October 31, 2007
    Date of Patent: October 23, 2012
    Assignee: The Invention Science Fund I, LLC
    Inventor: Andrew Forsyth Glew
  • Publication number: 20120265969
    Abstract: A computer system assigns a particular counter from among a plurality of counters currently in a counter free pool to count a number of mappings of logical registers from among a plurality of logical registers to a particular physical register from among a plurality of physical registers, responsive to an execution of an instruction by a mapper unit mapping at least one logical register from among the plurality of logical registers to the particular physical register, wherein the number of the plurality of counters is less than a number of the plurality of physical registers. The computer system, responsive to the counted number of mappings of logical registers to the particular physical register decremented to less than a minimum value, returns the particular counter to the counter free pool.
    Type: Application
    Filed: April 18, 2012
    Publication date: October 18, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: GREGORY W. ALEXANDER, BRIAN D. BARRICK, JOHN W. WARD, III
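
The counter free pool in the abstract above can be modeled in a few lines. This sketch assumes that a counter is taken from the pool on the first mapping to a physical register and returned once its count drops below a minimum of 1; the class `MappingCounters` and its method names are hypothetical.

```python
class MappingCounters:
    """A pool of counters smaller than the number of physical registers."""

    def __init__(self, n_counters, minimum=1):
        self.free = list(range(n_counters))  # counter free pool
        self.assigned = {}                   # physical register -> counter id
        self.counts = {}                     # counter id -> number of mappings
        self.minimum = minimum

    def map(self, phys):
        """A logical register is mapped to physical register `phys`."""
        if phys not in self.assigned:
            if not self.free:
                raise RuntimeError("no free counter available")
            counter = self.free.pop()
            self.assigned[phys] = counter
            self.counts[counter] = 0
        self.counts[self.assigned[phys]] += 1

    def unmap(self, phys):
        """A mapping to `phys` is released; return the counter if below minimum."""
        counter = self.assigned[phys]
        self.counts[counter] -= 1
        if self.counts[counter] < self.minimum:
            del self.assigned[phys]
            del self.counts[counter]
            self.free.append(counter)
```

Only physical registers that currently have mappings hold a counter, which is why fewer counters than registers can suffice in this model.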
  • Publication number: 20120265970
    Abstract: Systems and methods are provided for managing access to registers. A system may include a set of direct registers and a set of indirect registers. The indirect registers may be accessed through the direct registers, and the direct registers may provide various features to provide faster access to the indirect registers. One of the direct registers may indicate access modes for accessing the indirect registers. The access modes may include auto-increment, auto-decrement, auto-reset, and no change modes. Based on the access mode, the currently accessed address may be automatically modified after accessing the indirect register at the address.
    Type: Application
    Filed: June 22, 2012
    Publication date: October 18, 2012
    Applicant: Micron Technology, Inc.
    Inventors: Harold B Noyes, Mark Jurenka, Gavin Huggins
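
The access modes listed in the abstract above can be illustrated with a small register model. The details are assumptions made for the sketch: the address wraps modulo the number of indirect registers, auto-reset returns the address to 0, and the names `Mode` and `IndirectRegisters` are hypothetical.

```python
from enum import Enum


class Mode(Enum):
    AUTO_INCREMENT = 1
    AUTO_DECREMENT = 2
    AUTO_RESET = 3
    NO_CHANGE = 4


class IndirectRegisters:
    def __init__(self, size):
        self.indirect = [0] * size
        self.address = 0             # direct register: current indirect address
        self.mode = Mode.NO_CHANGE   # direct register: access mode

    def _update_address(self):
        """Modify the current address after an access, based on the mode."""
        if self.mode is Mode.AUTO_INCREMENT:
            self.address = (self.address + 1) % len(self.indirect)
        elif self.mode is Mode.AUTO_DECREMENT:
            self.address = (self.address - 1) % len(self.indirect)
        elif self.mode is Mode.AUTO_RESET:
            self.address = 0

    def read(self):
        value = self.indirect[self.address]
        self._update_address()
        return value

    def write(self, value):
        self.indirect[self.address] = value
        self._update_address()
```

With auto-increment selected, a run of consecutive indirect registers can be filled without rewriting the address register between accesses.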
  • Patent number: 8291198
    Abstract: Apparatus and method for regulating data in a signal processing pipeline are disclosed. For example, an apparatus is disclosed that includes a first element operable to determine a time interval between a first plurality of data samples input to the signal processing pipeline, and calculate a sample spacing count value associated with the time interval, a second element coupled to the first element, the second element operable to hold the sample spacing count value until the time interval between the first plurality of data samples is changed, a third element coupled to the second element and the signal processing pipeline, the third element operable to output a control signal to the signal processing pipeline, and responsive to the control signal, the signal processing pipeline operable to output a second plurality of data samples.
    Type: Grant
    Filed: September 11, 2006
    Date of Patent: October 16, 2012
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Jordan Charles Mott, William Milton Hurley
  • Patent number: 8291196
    Abstract: Apparatuses and methods for dead instruction identification are disclosed. In one embodiment, an apparatus includes an instruction buffer and a dead instruction identifier. The instruction buffer is to store an instruction stream having a single entry point and a single exit point. The dead instruction identifier is to identify dead instructions based on a forward pass through the instruction stream.
    Type: Grant
    Filed: December 29, 2005
    Date of Patent: October 16, 2012
    Assignee: Intel Corporation
    Inventors: Stephan J. Jourdan, Matthew C. Merten, Alexandre J. Farcy
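
One way to picture dead instruction identification with a single forward pass is the sketch below. It is an illustration under simplifying assumptions, not the patented apparatus: each instruction is a hypothetical `(dest, sources)` pair, an instruction counts as dead only if its result is overwritten before any read within the stream, side effects are ignored, and values still unread at the exit point are treated as live.

```python
def find_dead_instructions(stream):
    """Return indices of instructions whose results are overwritten unread.

    stream: list of (dest, sources) tuples; dest may be None.
    """
    unread_writer = {}   # register -> index of its most recent, still unread write
    dead = []
    for index, (dest, sources) in enumerate(stream):
        for reg in sources:
            unread_writer.pop(reg, None)       # the value was read, so its writer is live
        if dest is not None:
            if dest in unread_writer:
                dead.append(unread_writer[dest])   # overwritten before any read
            unread_writer[dest] = index
    return dead
```

The single entry and exit points matter here: with no branches into or out of the stream, an overwrite seen later in the pass is guaranteed to follow the earlier write.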
  • Patent number: 8285971
    Abstract: A processor includes at least one execution unit that executes instructions, at least one register file, coupled to the at least one execution unit, that buffers operands for access by the at least one execution unit, an instruction sequencing unit that fetches instructions for execution by the at least one execution unit, and an address generation accelerator. The address generation accelerator, responsive to an initiation signal received from the instruction sequencing unit, computes and outputs first and second effective addresses of operands of an operation.
    Type: Grant
    Filed: December 16, 2008
    Date of Patent: October 9, 2012
    Assignee: International Business Machines Corporation
    Inventors: Ravi K. Arimilli, Balaram Sinharoy
  • Patent number: 8281106
    Abstract: A processor includes at least one execution unit that executes instructions, at least one register file, coupled to the at least one execution unit, that buffers operands for access by the at least one execution unit, and an instruction sequencing unit that fetches instructions for execution by the execution unit. The processor further includes an operand data structure and an address generation accelerator. The operand data structure specifies a first relationship between addresses of sequential accesses within a first address region and a second relationship between addresses of sequential accesses within a second address region. The address generation accelerator computes a first address of a first memory access in the first address region by reference to the first relationship and a second address of a second memory access in the second address region by reference to the second relationship.
    Type: Grant
    Filed: December 16, 2008
    Date of Patent: October 2, 2012
    Assignee: International Business Machines Corporation
    Inventors: Ravi K. Arimilli, Balaram Sinharoy
  • Publication number: 20120246451
    Abstract: There is provided a method and processor for processing a thread. The thread comprises a plurality of sequential instructions, the plurality of sequential instructions comprising some short-latency instructions and some long-latency instructions and at least one hazard instruction, the hazard instruction requiring one or more preceding instructions to be processed before the hazard instruction is processed. The method comprises the steps of: a) before processing each long-latency instruction, incrementing by one, a counter associated with the thread; b) after each long-latency instruction has been processed, decrementing by one, the counter associated with the thread; c) before processing each hazard instruction, checking the value of the counter associated with the thread, and i) if the counter value is zero, processing the hazard instruction, or ii) if the counter value is non-zero, pausing processing of the hazard instruction until a later time.
    Type: Application
    Filed: June 3, 2012
    Publication date: September 27, 2012
    Applicant: Imagination Technologies, Ltd.
    Inventors: Morrie Berglas, Yoong Chert Foo
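
Steps a) to c) of the abstract above map directly onto a per-thread counter. The class below is a minimal sketch with hypothetical names; the instruction processing itself is left out.

```python
class ThreadHazardCounter:
    """Counts long-latency instructions that are still in flight for one thread."""

    def __init__(self):
        self.counter = 0

    def before_long_latency(self):
        self.counter += 1            # step a): increment before processing

    def after_long_latency(self):
        if self.counter == 0:
            raise RuntimeError("no long-latency instruction outstanding")
        self.counter -= 1            # step b): decrement after processing

    def may_process_hazard(self):
        """Step c): a hazard instruction proceeds only when the counter is zero."""
        return self.counter == 0
```

A scheduler built on this would retry `may_process_hazard()` at a later time whenever it returns `False`.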
  • Patent number: 8275976
    Abstract: A hierarchical instruction scheduler included in a hierarchical microprocessor comprising a plurality of execution clusters. In one embodiment, a hierarchical instruction scheduler comprises a first-level instruction scheduler configured to receive instructions for execution; store first operand status information for respective operands of the instructions; and dispatch the instructions to respective execution clusters based on the instructions' respective first operand status information.
    Type: Grant
    Filed: October 31, 2007
    Date of Patent: September 25, 2012
    Assignee: The Invention Science Fund I, LLC
    Inventor: Andrew Forsyth Glew
  • Publication number: 20120239904
    Abstract: A method, system and computer program product are disclosed for interfacing between a multi-threaded processing core and an accelerator. In one embodiment, the method comprises copying from the processing core to the hardware accelerator memory address translations for each of multiple threads operating on the processing core, and simultaneously storing on the hardware accelerator one or more of the memory address translations for each of the threads. Whenever any one of the multiple threads operating on the processing core instructs the hardware accelerator to perform a specified operation, the hardware accelerator has stored thereon one or more of the memory address translations for the any one of the threads. This facilitates starting that specified operation without memory translation faults. In an embodiment, the copying includes, each time one of the memory address translations is updated on the processing core, copying the updated one of the memory address translations to the hardware accelerator.
    Type: Application
    Filed: March 15, 2011
    Publication date: September 20, 2012
    Applicant: International Business Machines Corporation
    Inventors: Kattamuri Ekanadham, Hung Q. Le, Jose E. Moreira, Pratap C. Pattnaik
  • Publication number: 20120239909
    Abstract: One embodiment of the present invention sets forth a technique for efficiently performing voting operations within a multi-threaded parallel-processing system. A group of related parallel program threads executes within a processor core together in parallel. A new instruction, called a “vote” instruction, is introduced that enables a parallel program thread to post an individual vote within the context of the group of related threads and to receive the result of the vote. In this fashion, the vote instruction advantageously reduces overhead associated with inter-thread communication, thereby improving overall system performance.
    Type: Application
    Filed: May 31, 2012
    Publication date: September 20, 2012
    Inventors: John R. Nickolls, Lars Nyland, Peter C. Mills, Jeremy Sugerman, Timothy Foley, Brian Fahs, Michael Garland, David P. Luebke
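
The effect of a vote across a thread group can be sketched in software. The abstract does not list vote modes, so the "all" and "any" modes below are assumptions chosen for illustration, as is the function name `vote`.

```python
def vote(predicates, mode="all"):
    """Each thread posts one predicate; every thread receives the group result.

    predicates: one boolean per thread in the group.
    Returns a list with the same result for each thread.
    """
    if mode == "all":
        result = all(predicates)
    elif mode == "any":
        result = any(predicates)
    else:
        raise ValueError("unknown vote mode: " + mode)
    return [result] * len(predicates)
```

In hardware the point is that this reduction happens in one instruction, without threads exchanging votes through memory.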
  • Publication number: 20120239908
    Abstract: Pipeline processor architectures, processors, and methods are provided. A described processor includes thread allocation counters for corresponding processor threads. For example, a first counter is configured to store a first processor time allocation that controls first periods of processor time for a first processor thread, the first processor thread retaining control of the processor during each of the first periods of processor time. The processor causes data associated with the first processor thread to pass through the processor's pipeline during the first periods of processor time. A second counter is similarly configured. The processor can be configured to receive an input defining processor time to be allocated to one or more processor threads and to use the input to change one or more of the counters such that subsequent periods of processor times for the one or more processor threads are affected.
    Type: Application
    Filed: May 31, 2012
    Publication date: September 20, 2012
    Inventors: Hong-Yi Chen, Sehat Sutardja
  • Publication number: 20120239368
    Abstract: In an embodiment, the design of a digital circuit may be analyzed to identify which uninitialized memory elements, such as flops, have initial don't care values. The analysis may include determining that each possible initial value (e.g. zero and one) of the flops does not impact the outputs of circuitry to which the uninitialized flops are connected. For example, a model may be generated that includes two instances of the uninitialized flops and corresponding logic circuitry. The inputs of the two instances may be connected together, and the uninitialized flops may be initialized to zero in one instance and one in the other instance. If the outputs of the two instances are equal for any input stimulus, the initial value of the uninitialized flops may be don't cares. The flops may be safely initialized to a known value for simulation.
    Type: Application
    Filed: March 17, 2011
    Publication date: September 20, 2012
    Inventor: Nimrod Agmon
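
The two-instance comparison in the abstract above can be imitated with a bounded, exhaustive simulation. This is a sketch only: it models a single flop, checks input sequences up to a fixed depth rather than proving equivalence for all stimulus, and the `step(state, inputs)` interface returning `(next_state, output)` is a hypothetical convention.

```python
from itertools import product


def initial_value_is_dont_care(step, n_inputs, depth):
    """Compare two instances of a circuit, one with the flop at 0 and one at 1.

    Both instances receive the same inputs; returns False as soon as their
    outputs differ for some input sequence of the given depth.
    """
    vectors = list(product((0, 1), repeat=n_inputs))
    for sequence in product(vectors, repeat=depth):
        state0, state1 = 0, 1
        for inputs in sequence:
            state0, out0 = step(state0, inputs)
            state1, out1 = step(state1, inputs)
            if out0 != out1:
                return False
    return True


def redundant_flop(state, inputs):
    """Output reduces to d for either flop value, so the flop never shows."""
    (d,) = inputs
    return d, (state & d) | ((1 - state) & d)


def exposed_flop(state, inputs):
    """Output is the flop value itself, so the initial value is visible."""
    (d,) = inputs
    return d, state
```

Here `initial_value_is_dont_care(redundant_flop, 1, 3)` holds while the same call with `exposed_flop` does not, since the first output of `exposed_flop` is the uninitialized value itself.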
  • Patent number: 8269784
    Abstract: A programmable processor and method for improving the performance of processors by expanding at least two source operands, or a source and a result operand, to a width greater than the width of either the general purpose register or the data path width. The present invention provides operands which are substantially larger than the data path width of the processor by using the contents of a general purpose register to specify a memory address at which a plurality of data path widths of data can be read or written, as well as the size and shape of the operand. In addition, several instructions and apparatus for implementing these instructions are described which obtain performance advantages if the operands are not limited to the width and accessible number of general purpose registers.
    Type: Grant
    Filed: January 19, 2012
    Date of Patent: September 18, 2012
    Assignee: MicroUnity Systems Engineering, Inc.
    Inventors: Craig Hansen, John Moussouris, Alexia Massalin
  • Publication number: 20120233445
    Abstract: Methods for instruction execution and synchronization in a multi-thread processor are provided, wherein in the multi-thread processor, multiple threads are running and each of the threads can simultaneously execute a same instruction sequence. A source code or an object code is received and then compiled to generate the instruction sequence. Instructions for all of function calls within the instruction sequence are sorted according to a calling order. Each thread is provided a counter value pointing to one of the instructions in the instruction sequence. A main counter value is determined according to the counter values of the threads such that all of the threads simultaneously execute an instruction of the instruction sequence that the main counter value points to.
    Type: Application
    Filed: March 8, 2011
    Publication date: September 13, 2012
    Applicant: VIA TECHNOLOGIES, INC.
    Inventor: Yangang Zhang
  • Patent number: 8266412
    Abstract: A hierarchical store buffer included in a hierarchical microprocessor includes a plurality of execution clusters. An embodiment of a hierarchical store buffer includes a first-level store buffer configured to receive data values to be written to a memory subsystem from the plurality of execution clusters and store the received data values prior to writing the data values to the memory subsystem and a plurality of second-level store buffers each operatively coupled with the first-level store buffer, each second-level store buffer being included in a respective execution cluster.
    Type: Grant
    Filed: October 31, 2007
    Date of Patent: September 11, 2012
    Assignee: The Invention Science Fund I, LLC
    Inventor: Andrew Forsyth Glew
  • Publication number: 20120226893
    Abstract: A hardware controller includes a first hardware interface, a second hardware interface, first hardware logic, and second hardware logic. The first hardware interface is to couple the hardware controller to hardware entities of a hardware device in which the hardware controller is to be included. The second hardware interface is to couple the hardware controller to a memory to receive instructions. The first hardware logic is to choose a selected hardware entity from the hardware entities. The second hardware logic is to execute the instructions in relation to the selected hardware entity.
    Type: Application
    Filed: March 3, 2011
    Publication date: September 6, 2012
    Inventors: Mary T. Prenn, Bradley R. Larson, Russell Fredrickson
  • Patent number: 8261025
    Abstract: Memory sharing in a software pipeline on a network on chip (‘NOC’), the NOC including integrated processor (‘IP’) blocks, routers, memory communications controllers, and network interface controllers, with each IP block adapted to a router through a memory communications controller and a network interface controller, where each memory communications controller controlling communications between an IP block and memory, and each network interface controller controlling inter-IP block communications through routers, including segmenting a computer software application into stages of a software pipeline, the software pipeline comprising one or more paths of execution; allocating memory to be shared among at least two stages including creating a smart pointer, the smart pointer including data elements for determining when the shared memory can be deallocated; determining, in dependence upon the data elements for determining when the shared memory can be deallocated, that the shared memory can be deallocated; and d
    Type: Grant
    Filed: November 12, 2007
    Date of Patent: September 4, 2012
    Assignee: International Business Machines Corporation
    Inventors: Eric O. Mejdrich, Paul E. Schardt, Robert A. Shearer
  • Patent number: 8261048
    Abstract: A method, system and program product for executing a multi-function instruction in an emulated computer system by specifying, via the multi-function instruction, either a capability query or execution of a selected function of one or more optional functions, wherein the selected function is an installed optional function, wherein the capability query determines which optional functions of the one or more optional functions are installed on the computer system.
    Type: Grant
    Filed: December 13, 2011
    Date of Patent: September 4, 2012
    Assignee: International Business Machines Corporation
    Inventors: Shawn D. Lundvall, Ronald M. Smith, Sr., Phil Chi-Chung Yeh
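
The query-or-execute behavior of a multi-function instruction can be sketched as a dispatcher. The conventions here are assumptions for illustration: function code 0 is taken to be the capability query, the query result is a bit mask with bit `n` set when function `n` is installed, and the class name is hypothetical.

```python
QUERY = 0  # assumed function code for the capability query


class MultiFunctionInstruction:
    def __init__(self, installed):
        """installed: dict mapping a nonzero function code to a callable."""
        self.installed = dict(installed)

    def execute(self, function_code, operand=None):
        if function_code == QUERY:
            mask = 0
            for code in self.installed:
                mask |= 1 << code      # report which optional functions exist
            return mask
        if function_code not in self.installed:
            raise ValueError("function %d is not installed" % function_code)
        return self.installed[function_code](operand)
```

A program would typically issue the query first and then select only function codes whose bits are set in the returned mask.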
  • Patent number: 8261250
    Abstract: A single-chip multiprocessor system and operation method of this system based on a static macro-scheduling of parallel streams for multiprocessor parallel execution. The single-chip multiprocessor system has buses for direct exchange between the processor register files and access to their store addresses and data. Each explicit parallelism architecture processor of this system has an interprocessor interface providing the synchronization signals exchange, data exchange at the register file level and access to store addresses and data of other processors. The single-chip multiprocessor system uses ILP to increase the performance. Synchronization of the streams parallel execution is ensured using special operations setting a sequence of streams and stream fragments execution prescribed by the program algorithm.
    Type: Grant
    Filed: January 10, 2011
    Date of Patent: September 4, 2012
    Assignee: Elbrus International
    Inventors: Boris A. Babaian, Yuli Kh. Sakhin, Vladimir Yu. Volkonskiy, Sergey A. Rozhkov, Vladimir V. Tikhorsky, Feodor A. Gruzdov, Leonid N. Nazarov, Mikhail L. Chudakov
  • Patent number: 8261085
    Abstract: According to some implementations methods, apparatus and systems are provided involving the use of processors having at least one core with a security component, the security component adapted to read and verify data within data blocks stored in a L1 instruction cache memory and to allow the execution of data block instructions in the core only upon the instructions being verified by the use of a cryptographic algorithm.
    Type: Grant
    Filed: September 26, 2011
    Date of Patent: September 4, 2012
    Assignee: Media Patents, S.L.
    Inventor: Álvaro Fernández Gutiérrez
  • Publication number: 20120216195
    Abstract: A system serialization capability is provided to facilitate processing in those environments that allow multiple processors to update the same resources. The system serialization capability is used to facilitate processing in a multi-processing environment in which guests and hosts use locks to provide serialization. The system serialization capability includes a diagnose instruction which is issued after the host acquires a lock, eliminating the need for the guest to acquire the lock.
    Type: Application
    Filed: April 28, 2012
    Publication date: August 23, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Lisa C. Heller
  • Publication number: 20120204011
    Abstract: A method of data processing includes a processor of a data processing system executing a controlling thread of a program and detecting occurrence of a particular asynchronous event during execution of the controlling thread of the program. In response to occurrence of the particular asynchronous event during execution of the controlling thread of the program, the processor initiates execution of an assist thread of the program such that the processor simultaneously executes the assist thread and controlling thread of the program.
    Type: Application
    Filed: April 16, 2012
    Publication date: August 9, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Giles R. Frazier, Venkat R. Indukuru
  • Publication number: 20120204017
    Abstract: A microprocessor architecture for executing byte compiled Java programs directly in hardware. The microprocessor targets the lower end of the embedded systems domain and features two orthogonal programming models, a Java model and a RISC model. The entities share a common data path and operate independently, although not in parallel. The microprocessor includes a combined register file in which the Java module sees the elements in the register file as a circular operand stack and the RISC module sees the elements as a conventional register file. The integrated microprocessor architecture facilitates access to hardware-near instructions and provides powerful interrupt and instruction trapping capabilities.
    Type: Application
    Filed: April 23, 2012
    Publication date: August 9, 2012
    Applicant: ATMEL CORPORATION
    Inventor: Oyvind Strom
  • Publication number: 20120204012
    Abstract: A method includes providing a data processor having an instruction pipeline, where the instruction pipeline has a plurality of instruction pipeline stages, and where the plurality of instruction pipeline stages includes a first instruction pipeline stage and a second instruction pipeline stage. The method further includes providing a data processor instruction that causes the data processor to perform a first set of computational operations during execution of the data processor instruction, performing the first set of computational operations in the first instruction pipeline stage if the data processor instruction is being executed and a first mode has been selected, and performing the first set of computational operations in the second instruction pipeline stage if the data processor instruction is being executed and a second mode has been selected.
    Type: Application
    Filed: April 13, 2012
    Publication date: August 9, 2012
    Applicant: Rambus Inc.
    Inventors: William C. Moyer, Jeffrey W. Scott
  • Patent number: 8239866
    Abstract: Software rendering and fine grained parallelism are utilized to reduce/avoid memory latency in a multi-processor (MP) system. According to one embodiment, the management of the transfer of data from one processor to another in the MP environment is moved into a low overhead hardware system. The low overhead hardware system may be a FIFO (“First In First Out”) hardware control. Each FIFO may be real or virtual.
    Type: Grant
    Filed: April 24, 2009
    Date of Patent: August 7, 2012
    Assignee: Microsoft Corporation
    Inventor: Susan Carrie
  • Patent number: 8239661
    Abstract: A method for double-issue complex instructions receives a complex instruction comprising a first portion and a second portion. The method sets a single issue queue slot and allocates an execution unit for the complex instruction, and identifies dependencies in the first and second portions. The method sets a dependency matrix slot and a consumers table slot for the first and second portions. In the event the first portion dependencies have been satisfied, the method issues the first portion and then issues the second portion from the single issue queue slot. In the event the second portion dependencies have not been satisfied, the method cancels the second portion issue.
    Type: Grant
    Filed: August 28, 2008
    Date of Patent: August 7, 2012
    Assignee: International Business Machines Corporation
    Inventors: Christopher M. Abernathy, Mary D. Brown, Todd A. Venton
  • Patent number: 8234653
    Abstract: A data processing architecture includes multiple processors connected in series between a load balancer and reorder logic. The load balancer is configured to receive data and distribute the data across the processors. Appropriate ones of the processors are configured to process the data. The reorder logic is configured to receive the data processed by the processors, reorder the data, and output the reordered data.
    Type: Grant
    Filed: May 30, 2008
    Date of Patent: July 31, 2012
    Assignee: Juniper Networks, Inc.
    Inventors: John C. Carney, Michael E. Lipman
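
The data flow in the abstract above (distribute, process, reorder) can be illustrated with a minimal sketch. It assumes a round-robin distribution policy and per-item sequence numbers, neither of which the abstract specifies; the names `load_balance`, `process`, and `reorder` are hypothetical.

```python
import random

def load_balance(items, num_processors):
    """Tag each item with a sequence number and assign it round-robin (assumed policy)."""
    queues = [[] for _ in range(num_processors)]
    for seq, item in enumerate(items):
        queues[seq % num_processors].append((seq, item))
    return queues

def process(queues, work, rng):
    """Apply `work` to every item, then shuffle to mimic out-of-order completion."""
    results = [(seq, work(item)) for queue in queues for seq, item in queue]
    rng.shuffle(results)
    return results

def reorder(results):
    """Release results strictly in sequence order, buffering early arrivals."""
    pending = {}
    next_seq = 0
    ordered = []
    for seq, value in results:
        pending[seq] = value
        # Drain every result that is now contiguous with what was already released.
        while next_seq in pending:
            ordered.append(pending.pop(next_seq))
            next_seq += 1
    return ordered
```

Whatever order the simulated processors finish in, `reorder` emits the results in the order the items originally arrived.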
  • Publication number: 20120191954
    Abstract: Methods and apparatuses are provided for achieving increased performance and energy saving via instruction pre-completion without having to schedule instruction execution in processor execution units. The apparatus comprises an operational unit for determining whether an instruction can be completed without scheduling use of an execution unit of the processor and units within the operational unit capable of employing alternate or equivalent processes or techniques to complete the instruction. In this way, the instruction is completed without scheduling use of the execution unit of the processor. The method comprises determining that an instruction can be completed without scheduling use of an execution unit of a processor and then pre-completing the instruction without use of one or more of the execution units.
    Type: Application
    Filed: January 20, 2011
    Publication date: July 26, 2012
    Applicant: ADVANCED MICRO DEVICES, INC.
    Inventors: Jay Fleischman, Debjit Das Sarma
  • Patent number: 8230423
    Abstract: A method and processor architecture for achieving a high level of concurrency and latency hiding in an “infinite-thread processor architecture” with a limited number of hardware threads is disclosed. A preferred embodiment defines “fork” and “join” instructions for spawning new context-switched threads. Context switching is used to hide the latency of both memory-access operations (i.e., loads and stores) and arithmetic/logical operations. When an operation executing in a thread incurs a latency having the potential to delay the instruction pipeline, the latency is hidden by performing a context switch to a different thread. When the result of the operation becomes available, a context switch back to that thread is performed to allow the thread to continue.
    Type: Grant
    Filed: April 7, 2005
    Date of Patent: July 24, 2012
    Assignee: International Business Machines Corporation
    Inventors: Matteo Frigo, Ahmed Gheith, Volker Strumpen
  • Patent number: 8230201
    Abstract: A wake-and-go mechanism is provided for a data processing system. The wake-and-go mechanism detects a thread running on a first processing unit within a plurality of processing units that is waiting for an event that modifies a data value associated with a target address. The wake-and-go mechanism creates a wake-and-go instance for the thread by populating a wake-and-go storage array with the target address. The operating system places the thread in a sleep state. Responsive to detecting the event that modifies the data value associated with the target address, the wake-and-go mechanism assigns the wake-and-go instance to a second processing unit within the plurality of processing units. The operating system on the second processing unit places the thread in a non-sleep state.
    Type: Grant
    Filed: April 16, 2009
    Date of Patent: July 24, 2012
    Assignee: International Business Machines Corporation
    Inventors: Ravi K. Arimilli, Satya P. Sharma, Randal C. Swanberg
  • Publication number: 20120185677
    Abstract: A method of managing binary data across a mixed computing environment is provided. The method includes performing on one or more processors: receiving binary data; receiving binary coded data indicating a type of the binary data; formatting the binary data and the binary coded data according to a first format; and generating at least one of a message and a file based on the formatted data.
    Type: Application
    Filed: January 14, 2011
    Publication date: July 19, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Harry J. Beatty, III, Peter C. Elmendorf, Charles Gates, Luo Chen
  • Publication number: 20120185678
    Abstract: A technique for indicating a safe shared resource condition with respect to a disabled thread provides a mechanism for providing a fast indication to other hardware threads that a temporarily disabled thread can no longer impact shared resources, such as shared special-purpose registers and translation look-aside buffers within the processor core. Signals from pipelines within the core indicate whether any of the instructions pending in the pipeline impact the shared resources and if not, then the thread disable status is presented to the other threads via a state change in a thread status register. Upon receiving an indication that a particular hardware thread is to be disabled, control logic halts the dispatch of instructions for the particular hardware thread, and then waits until any indication that a shared resource is impacted by an instruction has cleared. Then the control logic updates the thread status to indicate the thread is disabled.
    Type: Application
    Filed: March 30, 2012
    Publication date: July 19, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Becky Bruce, Giles R. Frazier, Bradly G. Frey, Kumar K. Gala, Cathy May, Michael D. Snyder, Gary Whisenhunt, James Xenidis
  • Patent number: 8225074
    Abstract: In accordance with exemplary implementations, application computation operations and communications between operations on a host processing platform may be adapted to conform to the memory capacity of a parallel accelerator. Computation operations may be split and scheduled such that the computation operations fit within the memory capacity of the accelerator. Further, the operations may be automatically adapted without any modification to the code of an application. In addition, data transfers between a host processing platform and the parallel accelerator may be minimized in accordance with exemplary aspects of the present principles to improve processing performance.
    Type: Grant
    Filed: March 6, 2009
    Date of Patent: July 17, 2012
    Assignee: NEC Laboratories America, Inc.
    Inventors: Srimat T. Chakradhar, Anand Raghunathan, Narayanan Sundaram
  • Patent number: 8225320
    Abstract: A computing method and system is presented that modifies a standard operating system utilizing two or more processing units to execute continuous processing tasks, such as processing or generating continuous audio, video or other types of data. One of the processors is tasked with running the operating system while each processing unit is dedicated towards running a single continuous processing task. Communication is provided between both processors enabling the continuous processing task to utilize the operating system without being affected by any operating system scheduling requirements.
    Type: Grant
    Filed: August 23, 2007
    Date of Patent: July 17, 2012
    Assignee: Advanced Simulation Technology, Inc.
    Inventors: Manushantha (Manu) Sporny, Robert Kenneth Butterfield, Norton Kenneth James, Patrick Quinn Gaffney
  • Patent number: 8219785
    Abstract: Methods and apparatus are provided for allowing a master component such as a processor on a programmable chip to access memory using unaligned addresses. An adapter connected to a master component determines if a master component memory access request is aligned. If the access request is aligned, the request is forwarded to memory and a response is provided to the master component. If the access request is unaligned, the adapter sends multiple access requests to memory and processes the responses in order to provide a correct response to the master component.
    Type: Grant
    Filed: September 25, 2006
    Date of Patent: July 10, 2012
    Assignee: Altera Corporation
    Inventors: Timothy P. Allen, Jeffrey Orion Pritchard, Richard Noble Hill
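
A minimal sketch of the adapter behavior described in the abstract above: an aligned request is forwarded unchanged, while an unaligned request is split into two aligned reads whose responses are combined. The 4-byte word size and the names `read_aligned` and `adapter_read` are assumptions for illustration, not details from the patent.

```python
WORD_BYTES = 4  # assumed word size; the abstract does not specify one

def read_aligned(memory, addr):
    """Memory-side read: only word-aligned addresses are accepted."""
    if addr % WORD_BYTES != 0:
        raise ValueError("memory accepts aligned addresses only")
    return bytes(memory[addr:addr + WORD_BYTES])

def adapter_read(memory, addr):
    """Return one word starting at addr, splitting unaligned requests."""
    offset = addr % WORD_BYTES
    if offset == 0:
        # Aligned request: forward it to memory unchanged.
        return read_aligned(memory, addr)
    # Unaligned request: read the two aligned words that straddle addr,
    # then slice the requested bytes out of the combined response.
    base = addr - offset
    low = read_aligned(memory, base)
    high = read_aligned(memory, base + WORD_BYTES)
    return (low + high)[offset:offset + WORD_BYTES]
```

The master component sees a single response in either case; only the adapter knows that two memory accesses were issued.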
  • Publication number: 20120173853
    Abstract: A processing apparatus includes an execution unit which performs computation on two operand inputs each being selectable between read data from a register and an immediate value. The processing apparatus also includes another execution unit which performs computation on two operand inputs, one of which is selectable between read data from a register and an immediate value, and the other of which is an immediate value. A control unit determines, based on a received instruction specifying a computation on two operands, whether each of the two operands specifies read data from a register or an immediate value. Depending on the determination result, the control unit causes one of the execution units to execute the computation specified by the received instruction.
    Type: Application
    Filed: November 14, 2011
    Publication date: July 5, 2012
    Applicant: FUJITSU LIMITED
    Inventor: Masaki Ukai
  • Publication number: 20120173854
    Abstract: Methods and apparatuses are provided for an efficient technique for processing registers having a known value while improving processor performance. The apparatus comprises a processor having a plurality of physical registers available for use in computations and a decoder for determining that a logical register contains a known value. A renaming unit maps the logical register containing the known value to an address outside an address range for the plurality of physical registers once the known value is determined. Thereafter, scheduling and execution units perform computations using the known value without storing the known value in one of the plurality of physical registers. The method comprises determining that a logical register of a processor has a known value and then mapping that logical register to a physical register address outside an expected range of physical register addresses, which indicates that the logical register represents the known value.
    Type: Application
    Filed: December 29, 2010
    Publication date: July 5, 2012
    Applicant: ADVANCED MICRO DEVICES, INC.
    Inventors: Jay Fleischman, Debjit Das Sarma, Michael Sedmak
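
The renaming idea in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the patented design: the known value is taken to be zero, the physical register file has eight entries, and `Renamer`, `KNOWN_ZERO`, and `NUM_PHYSICAL` are hypothetical names. In hardware the decoder would recognize the known value from the instruction; here a value check in `write` stands in for that step.

```python
NUM_PHYSICAL = 8            # assumed size of the physical register file
KNOWN_ZERO = NUM_PHYSICAL   # hypothetical out-of-range tag meaning "value is 0"

class Renamer:
    def __init__(self):
        self.physical = [0] * NUM_PHYSICAL
        self.free = list(range(NUM_PHYSICAL))
        self.rename_map = {}

    def write(self, logical, value):
        """Map a logical register, spending a physical register only if needed."""
        old = self.rename_map.get(logical)
        if old is not None and old < NUM_PHYSICAL:
            self.free.append(old)  # release the previous physical register
        if value == 0:
            # Known value: point the mapping outside the physical range.
            self.rename_map[logical] = KNOWN_ZERO
        else:
            phys = self.free.pop(0)
            self.physical[phys] = value
            self.rename_map[logical] = phys

    def read(self, logical):
        """Resolve a logical register; out-of-range tags yield the known value."""
        tag = self.rename_map[logical]
        if tag >= NUM_PHYSICAL:
            return 0
        return self.physical[tag]
```

A register holding the known value consumes no entry in the free list, which is the saving the abstract points to.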
  • Patent number: 8214625
    Abstract: One embodiment of the present invention sets forth a technique for efficiently performing voting operations within a multi-threaded parallel-processing system. A group of related parallel program threads executes within a processor core together in parallel. A new instruction, called a “vote” instruction, is introduced that enables a parallel program thread to post an individual vote within the context of the group of related threads and to receive the result of the vote. In this fashion, the vote instruction advantageously reduces overhead associated with inter-thread communication, thereby improving overall system performance.
    Type: Grant
    Filed: November 26, 2008
    Date of Patent: July 3, 2012
    Assignee: NVIDIA Corporation
    Inventors: John R. Nickolls, Lars Nyland, Peter C. Mills, Jeremy Sugerman, Timothy Foley, Brian Fahs, Michael Garland, David P. Luebke
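
As a rough software analogy for the vote instruction described in the abstract above, the sketch below combines one boolean vote per thread and returns the same result to every thread in the group. The `all` and `any` modes and the function name `vote` are assumptions; the abstract does not say which voting modes the instruction supports.

```python
def vote(predicates, mode="all"):
    """Combine per-thread votes; every thread in the group receives the same result."""
    if mode == "all":
        result = all(predicates)   # true only if every thread voted true
    elif mode == "any":
        result = any(predicates)   # true if at least one thread voted true
    else:
        raise ValueError("unsupported mode: " + mode)
    # One entry per thread, mirroring each thread receiving the outcome.
    return [result] * len(predicates)
```

A single call replaces the explicit exchange of votes between threads, which is the overhead the abstract says the instruction avoids.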