Superscalar Patents (Class 712/23)
  • Patent number: 9942272
    Abstract: Processing streaming data in accordance with policies that group data by source, enforce a maximum permissible late arrival value for streaming data, a maximum permissible early arrival for data and/or a maximum degree to which data can be out of order and still be compliant with the out of order policy is described. The correct starting point for reading a data stream so as to produce correct output from a given output start time can be enabled using the early arrival policy. Using combinations of policies, output can be generated promptly (with low latency). When input from a given source is not disrupted, output can be generated with low latency. Output can be generated even when the input stops by applying a late arrival policy.
    Type: Grant
    Filed: June 5, 2015
    Date of Patent: April 10, 2018
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC.
    Inventors: Zhong Chen, Lev Novik, Boris Shulman, Clemens A. Szyperski
  • Patent number: 9864398
    Abstract: A method for transmitting a plurality of data bits and a clock signal on a return to zero (RZ) signal includes: transmitting a first voltage that is greater than a first threshold, the first voltage being decodable to a first order of data bits; transmitting a second voltage that is between a second threshold and the first threshold, the second voltage being decodable to a second order of data bits; transmitting a third voltage that is between a third threshold and a fourth threshold, the third voltage being decodable to a third order of data bits; transmitting a fourth voltage that is greater in magnitude than the fourth threshold, the fourth voltage being decodable to a fourth order of data bits; and transitioning the clock signal in response to the RZ signal being between the second threshold and the third threshold.
    Type: Grant
    Filed: December 30, 2015
    Date of Patent: January 9, 2018
    Assignee: TEXAS INSTRUMENTS INCORPORATED
    Inventor: Robert Floyd Payne
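    Illustration: the four-level decoding described in this abstract can be sketched roughly as follows. The threshold voltages, the 2-bit patterns assigned to each level, and the function names are all assumptions made for the example; the abstract specifies none of them.

```python
# Hypothetical thresholds in volts, ordered T1 > T2 > T3 > T4 (assumed values).
T1, T2, T3, T4 = 0.6, 0.2, -0.2, -0.6


def decode_sample(v):
    """Classify one voltage sample as a 2-bit symbol or the return-to-zero band."""
    if v > T1:
        return "11"   # first voltage (bit pattern is an assumption)
    if T2 < v <= T1:
        return "10"   # second voltage
    if T4 <= v < T3:
        return "01"   # third voltage
    if v < T4:
        return "00"   # fourth voltage
    return "zero"     # between the third and second thresholds


def decode_stream(samples):
    """Recover data bits and count clock transitions from a sampled RZ signal."""
    bits, clock_edges = [], 0
    prev = "zero"
    for v in samples:
        cur = decode_sample(v)
        if cur == "zero" and prev != "zero":
            clock_edges += 1   # signal returned to the zero band: clock transitions
        elif cur != "zero" and prev == "zero":
            bits.append(cur)   # signal left the zero band: a new symbol arrives
        prev = cur
    return "".join(bits), clock_edges
```

    Each excursion away from the zero band carries two bits, and each return to the band marks a clock transition, so data and clock share one wire in this sketch.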
  • Patent number: 9851977
    Abstract: A method for executing instructions on a single-program, multiple-data processor system having a fixed number of execution lanes, including: scheduling a primary instruction for execution with a first wave of multiple data; assigning the first wave to a corresponding primary subset of the execution lanes; scheduling a secondary instruction having a second wave of multiple data, such that the second wave fits in lanes that are unused by the primary subset of lanes; assigning the second wave to a corresponding secondary subset of the lanes; fetching the primary and secondary instructions; configuring the execution lanes such that the primary subset is responsive to the primary instruction and the secondary subset is simultaneously responsive to the secondary instruction; and simultaneously executing the primary and secondary instructions in the execution lanes.
    Type: Grant
    Filed: December 6, 2012
    Date of Patent: December 26, 2017
    Assignee: KALRAY
    Inventors: Nicolas Brunie, Sylvain Collange
  • Patent number: 9798590
    Abstract: A method and apparatus for post-retire transaction access tracking is herein described. Load and store buffers are capable of storing senior entries. In the load buffer a first access is scheduled based on a load buffer entry. Tracking information associated with the load is stored in a filter field in the load buffer entry. Upon retirement, the load buffer entry is marked as a senior load entry. A scheduler schedules a post-retire access to update transaction tracking information, if the filter field does not represent that the tracking information has already been updated during a pendency of the transaction. Before evicting a line in a cache, the load buffer is snooped to ensure no load accessed the line to be evicted.
    Type: Grant
    Filed: September 7, 2006
    Date of Patent: October 24, 2017
    Assignee: Intel Corporation
    Inventors: Haitham Akkary, Ravi Rajwar, Srikanth T. Srinivasan
  • Patent number: 9766894
    Abstract: A chaining bit decoder of a computer processor receives an instruction stream. The chaining bit decoder selects a group of instructions from the instruction stream. The chaining bit decoder extracts a designated bit from each instruction of the instruction stream to produce a sequence of chaining bits. The chaining bit decoder decodes the sequence of chaining bits. The chaining bit decoder identifies zero or more instruction stream dependencies among the selected group of instructions in view of the decoded sequence of chaining bits. The chaining bit decoder outputs control signals to cause one or more pipeline stages of the processor to execute the selected group of instructions in view of the identified zero or more instruction stream dependencies among the group sequence of instructions.
    Type: Grant
    Filed: November 12, 2014
    Date of Patent: September 19, 2017
    Assignee: Optimum Semiconductor Technologies, Inc.
    Inventors: C. John Glossner, Gary J. Nacer, Murugappan Senthilvelan, Vitaly Kalashnikov, Arthur J. Hoane, Paul D'Arcy, Sabin D. Iancu, Shenghong Wang
  • Patent number: 9766895
    Abstract: A computing device determines that a current software thread of a plurality of software threads having an issuing sequence does not have a first instruction waiting to be issued to a hardware thread during a clock cycle. The computing device identifies one or more alternative software threads in the issuing sequence having instructions waiting to be issued. The computing device selects, during the clock cycle by the computing device, a second instruction from a second software thread among the one or more alternative software threads in view of determining that the second instruction has no dependencies with any other instructions among the instructions waiting to be issued. Dependencies are identified by the computing device in view of the values of a chaining bit extracted from each of the instructions waiting to be issued. The computing device issues the second instruction to the hardware thread.
    Type: Grant
    Filed: November 12, 2014
    Date of Patent: September 19, 2017
    Assignee: Optimum Semiconductor Technologies, Inc.
    Inventors: Shenghong Wang, C. John Glossner, Gary J. Nacer
  • Patent number: 9747224
    Abstract: Provided is a method of managing a register port, the method including performing scheduling on register ports that are used during a plurality of cycles to enable performing of a calculation; encoding data of the register ports according to results of the scheduling, the encoding of the data including, with respect to data of one of the register ports that does not have a schedule during one of the plurality of cycles, equally encoding the data of the one register port during the one cycle with data of an adjacent cycle of the one register port, the adjacent cycle being adjacent to the one cycle; and transmitting results of the encoding to a device that includes the register ports.
    Type: Grant
    Filed: March 11, 2015
    Date of Patent: August 29, 2017
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Tai-Song Jin, Jae-Un Park, Do-hyung Kim, Seung-won Lee
  • Patent number: 9740494
    Abstract: Instruction issue circuits are disclosed that are configured to issue multiple instructions within a superscalar pipeline of a microprocessor. The instruction issue circuit includes an instruction queue that stores instructions. A ready generation circuit is operably associated with the instruction queue and generates ready signals that indicate which instructions in the instruction queue are ready for execution. To simplify the instruction issue circuit, the instruction issue circuit has group blocks. Each group block receives a different group of the ready signals corresponding to a different group of the instructions. Each group block generates a group output indicating a group set within the corresponding group of the instructions that has a highest instruction execution priority and are ready for execution. By splitting the ready signals into groups, the groups of ready signals can be processed in parallel thereby reducing both the resulting delay and complexity of the instruction issue circuit.
    Type: Grant
    Filed: April 30, 2012
    Date of Patent: August 22, 2017
    Assignee: Arizona Board of Regents for and on behalf of Arizona State University
    Inventors: Lawrence T. Clark, Siddhesh Mhambrey, Satendra Kumar Maurya
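    Illustration: a minimal software sketch of the grouped selection idea in this abstract. It assumes that a lower queue index means higher priority and that each group block reports at most one instruction; the abstract does not state either, and the function names are hypothetical.

```python
def select_in_group(ready, start, size):
    """Return the index of the highest-priority ready entry in one group, or None.

    Assumption: the lowest index within the group has the highest priority.
    """
    for i in range(start, min(start + size, len(ready))):
        if ready[i]:
            return i
    return None


def issue_select(ready, group_size=4, width=2):
    """Pick up to `width` instructions to issue from per-group selections."""
    # Each group is examined independently; in hardware these blocks work in parallel.
    group_picks = [
        select_in_group(ready, start, group_size)
        for start in range(0, len(ready), group_size)
    ]
    # Combine the group outputs, again favouring lower-numbered groups.
    chosen = [pick for pick in group_picks if pick is not None]
    return chosen[:width]
```

    Because each group only scans its own slice of the ready signals, the longest scan is bounded by the group size rather than the full queue length, which is the delay reduction the abstract refers to.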
  • Patent number: 9720675
    Abstract: Illustrated is a system and method to receive an instruction to access a set of identified data structures, where each identified data structure is associated with a version data structure that includes annotations of particular dependencies amongst at least two members of the set of identified data structures. The system and method further comprise determining, based upon the dependencies, that a version mismatch exists between the at least two members of the set of identified data structures, the dependencies used to identify a most recent version and a locally cached version of the at least two members. The system and method further comprise delaying execution of the instruction until the version mismatch between the at least two members of the set of identified data structures is resolved through an upgrade of a version of one of the at least two members of the set of identified data structures.
    Type: Grant
    Filed: October 27, 2010
    Date of Patent: August 1, 2017
    Assignee: Hewlett Packard Enterprise Development LP
    Inventor: Antonio Lain
  • Patent number: 9652371
    Abstract: A circular queue implementing a scheme for prioritized reads is disclosed. In one embodiment, a circular queue (or buffer) includes a number of storage locations each configured to store a data value. A multiplexer tree is coupled between the storage locations and a read port. A priority circuit is configured to generate and provide selection signals to each multiplexer of the multiplexer tree, based on a priority scheme. Based on the states of the selection signals, one of the storage locations is coupled to the read port via the multiplexers of the multiplexer tree.
    Type: Grant
    Filed: February 18, 2015
    Date of Patent: May 16, 2017
    Assignee: Apple Inc.
    Inventors: Rajat Goel, Hari S. Kannan, Khurram Z. Malik
  • Patent number: 9645802
    Abstract: A device compiler and linker is configured to group instructions into different strands for execution by different threads based on the dependence of those instructions on other, long-latency instructions. A thread may execute a strand that includes long-latency instructions, and then hardware resources previously allocated for the execution of that thread may be de-allocated from the thread and re-allocated to another thread. The other thread may then execute another strand while the long-latency instructions are in flight. With this approach, the other thread is not required to wait for the long-latency instructions to complete before acquiring hardware resources and initiating execution of the other strand, thereby eliminating at least a portion of the time that the other thread would otherwise spend waiting.
    Type: Grant
    Filed: August 7, 2013
    Date of Patent: May 9, 2017
    Assignee: NVIDIA Corporation
    Inventors: Mojtaba Mehrara, Michael Garland, Gregory Diamos
  • Patent number: 9632977
    Abstract: A data processor includes a packet selector. The packet selector creates an ordered list of packets, each packet corresponding to a respective communication flow, determines whether each packet in the ordered list of packets is eligible for transfer to a prefetch unit based on whether a preceding packet in the same communication flow has been transferred to the prefetch unit, and sets a selection priority for each packet based on start time constraints for the respective communication flow, and based on a processing status of a preceding packet in the communication flow.
    Type: Grant
    Filed: March 13, 2013
    Date of Patent: April 25, 2017
    Assignee: NXP USA, Inc.
    Inventors: Timothy G. Boland, Anne C. Harris, Steven D. Millman
  • Patent number: 9626190
    Abstract: The present invention provides a method and apparatus for floating-point register caching. One embodiment of the method includes mapping a first set of architected registers defined by a first instruction set to a memory outside of a plurality of physical registers. The plurality of physical registers are configured to map to the first set, a second set of architected registers defined by a second instruction set, and a set of rename registers. This embodiment of the method also includes adding the physical registers corresponding to the first set of architected registers to the set of rename registers.
    Type: Grant
    Filed: October 7, 2010
    Date of Patent: April 18, 2017
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Jeff Rupley
  • Patent number: 9612835
    Abstract: A system and method for fencing memory accesses. Memory loads can be fenced, or all memory access can be fenced. The system receives a fencing instruction that separates memory access instructions into older accesses and newer accesses. A buffer within the memory ordering unit is allocated to the instruction. The access instructions newer than the fencing instruction are stalled. The older access instructions are gradually retired. When all older memory accesses are retired, the fencing instruction is dispatched from the buffer.
    Type: Grant
    Filed: September 14, 2012
    Date of Patent: April 4, 2017
    Assignee: Intel Corporation
    Inventors: Salvador Palanca, Stephen A. Fischer, Subramaniam Maiyuran, Shekoufeh Qawami
  • Patent number: 9582283
    Abstract: A microcontroller includes a program memory, data memory, central processing unit, at least one register module, a memory management unit, and a transport network. Instructions are executed in one clock cycle via an instruction word. The instruction word indicates the source module from which data is to be retrieved and the destination module to which data is to be stored. The address/data capability of an instruction word may be extended via a prefix module. If an operation is performed on the data, the source module or the destination module may perform the operation during the same clock cycle in which the data is transferred.
    Type: Grant
    Filed: May 18, 2009
    Date of Patent: February 28, 2017
    Assignee: Maxim Integrated Products, Inc.
    Inventors: Jeffrey D. Owens, Edward Tangkwai Ma, Donald W. Loomis, III, Thomas Augustus Chenot
  • Patent number: 9535701
    Abstract: A pipelined processor selects an instruction fetch mode from a number of fetch modes including an executed branch fetch mode, a predicted fetch mode, and a sequential fetch mode. Each branch instruction is associated with branch delay slots, the size of which can be greater than or equal to zero, and can vary from one branch instance to another. Branch prediction is used to fetch instructions, with the source of information for predictions deriving from a last instruction in the branch delay slots. When a prediction error occurs, the executed branch fetch mode uses an address from branch instruction evaluation to fetch a next instruction.
    Type: Grant
    Filed: January 29, 2014
    Date of Patent: January 3, 2017
    Assignee: TELEFONAKTIEBOLAGET LM ERICSSON (publ)
    Inventors: Erik Rijshouwer, Ricky Nas
  • Patent number: 9529596
    Abstract: In accordance with embodiments disclosed herein, there are provided methods, systems, and apparatuses for scheduling instructions in a multi-strand out-of-order processor. For example, an apparatus for scheduling instructions in a multi-strand out-of-order processor includes an out-of-order instruction fetch unit to retrieve a plurality of interdependent instructions for execution from a multi-strand representation of a sequential program listing; an instruction scheduling unit to schedule the execution of the plurality of interdependent instructions based at least in part on operand synchronization bits encoded within each of the plurality of interdependent instructions; and a plurality of execution units to execute at least a subset of the plurality of interdependent instructions in parallel.
    Type: Grant
    Filed: July 1, 2011
    Date of Patent: December 27, 2016
    Assignee: Intel Corporation
    Inventors: Boris A. Babayan, Vladimir M. Pentkovski, Alexander V. Butuzov, Sergey Y. Shishlov, Alexey Y. Sivtsov, Nikolay E. Kosarev
  • Patent number: 9513904
    Abstract: A computer processing system with a hierarchical memory system that associates a number of valid bits for each cache line of the hierarchical memory system. The valid bits are provided for each cache line stored in a respective cache and make explicit which bytes are semantically defined and which are not for the associated given cache line. Memory requests to the cache(s) of the hierarchical memory system can include an address specifying a requested cache line as well as a mask that includes a number of bits each corresponding to a different byte of the requested cache line. The values of the bits of the byte mask indicate which bytes of the requested cache line are to be returned from the hierarchical memory system. The memory request is processed by the top level cache of the hierarchical memory system, looking for one or more valid bytes of the requested cache line corresponding to the target address of the memory request.
    Type: Grant
    Filed: October 15, 2014
    Date of Patent: December 6, 2016
    Assignee: MILL COMPUTING, INC.
    Inventors: Roger Rawson Godard, Arthur David Kahlich
  • Patent number: 9489204
    Abstract: An example method of storing a partial target address in an instruction cache includes receiving a branch instruction. The method also includes predicting a direction of the branch instruction as being not taken. The method further includes calculating a destination address based on executing the branch instruction. The method also includes determining a partial target address using the destination address. The method further includes in response to the predicted direction of the branch instruction changing from not taken to taken, replacing an offset in an instruction cache with the partial target address.
    Type: Grant
    Filed: March 15, 2013
    Date of Patent: November 8, 2016
    Assignee: QUALCOMM Incorporated
    Inventors: Jiajin Tu, Suresh K. Venkumahanti, Brian R. Mestan
  • Patent number: 9471480
    Abstract: A data processing apparatus has a memory rename table for storing memory rename entries each identifying a mapping between a memory address of a location in memory and a mapped register of a plurality of registers. The mapped register is identified by a register number. In response to a store instruction, the store target memory address of the store instruction is mapped to a store destination register and so the data value is stored to the store destination register instead of memory. A memory rename entry is provided in the table to identify the mapping between the store target memory address and store destination target register. In response to a load instruction, if there is a hit in the memory rename table for the load target memory address then the loaded value can be read from the mapped register instead of memory.
    Type: Grant
    Filed: February 21, 2014
    Date of Patent: October 18, 2016
    Assignee: The Regents of the University of Michigan
    Inventors: Joseph Michael Pusdesris, Yiping Kang, Andrea Pellegrini, Benjamin Allen Vandersloot, Trevor Nigel Mudge
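    Illustration: the store and load behaviour described in this abstract can be modelled roughly as below. The class name, the free-register list, and the fallback to ordinary memory when no register is free are assumptions for the sketch and are not taken from the patent.

```python
class MemoryRenameTable:
    """Toy model: stores are redirected to registers, loads hit in the table."""

    def __init__(self, num_registers):
        self.registers = [0] * num_registers
        self.free = list(range(num_registers))  # unassigned register numbers
        self.table = {}                         # memory address -> register number
        self.memory = {}                        # backing memory (assumed fallback)

    def store(self, addr, value):
        reg = self.table.get(addr)
        if reg is None:
            if not self.free:
                # Assumption: with no free register, the store goes to memory.
                self.memory[addr] = value
                return
            reg = self.free.pop()
            self.table[addr] = reg              # new memory rename entry
        self.registers[reg] = value             # value lands in the register

    def load(self, addr):
        reg = self.table.get(addr)
        if reg is not None:
            return self.registers[reg]          # hit: read the mapped register
        return self.memory.get(addr, 0)         # miss: read memory
```

    A load that hits in the table never touches `memory`, which mirrors the abstract's point that the loaded value can come from the mapped register instead.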
  • Patent number: 9459871
    Abstract: A method, system, and computer program product for identifying loop information corresponding to a plurality of loop instructions. The loop instructions are stored into a queue. The loop instructions are replayed from the queue for execution. Loop iteration is counted based on the identified loop information. A determination is made of whether the last iteration of the loop is done. If the last iteration is not done, then embodiments continue replaying the loop instructions, until the last iteration is done.
    Type: Grant
    Filed: December 31, 2012
    Date of Patent: October 4, 2016
    Assignee: Intel Corporation
    Inventors: Masha Lipshits, Lihu Rappaport, Shantanu Gupta, Franck Sala, Naveen Kumar, Allan D. Knies
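    Illustration: a very small sketch of replaying a queued loop body until the last iteration is done. It assumes the identified loop information reduces to a known trip count and that `execute` is a caller-supplied callback; both are simplifications introduced here.

```python
def replay_loop(loop_body, trip_count, execute):
    """Replay queued loop instructions until the final iteration completes."""
    queue = list(loop_body)       # loop instructions captured once into a queue
    iteration = 0
    while iteration < trip_count:  # is the last iteration done yet?
        for instr in queue:
            execute(instr)         # replay from the queue, no refetch needed
        iteration += 1             # count the completed iteration
    return iteration
```

    For example, `replay_loop(["load", "add"], 3, print)` would print the two instruction names three times in order.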
  • Patent number: 9436464
    Abstract: In a multithread processor capable of executing a plurality of threads, in order to select a thread and instruction for increasing a throughput of the multithread processor, an instruction-issuance controlling device included in the multithread processor includes a resource management unit configured to manage stall information indicating whether or not each of threads in execution is in a stalled state; a thread selection unit configured to select a thread which is not in the stalled state among the threads in execution; and an instruction-issuance controlling unit configured to perform controlling so that simultaneously issuable instructions are issued from the selected thread.
    Type: Grant
    Filed: December 3, 2012
    Date of Patent: September 6, 2016
    Assignee: SOCIONECT INC.
    Inventor: Tomohiro Yamana
  • Patent number: 9430240
    Abstract: Embodiments relate to pre-computation slice (p-slice) merging for prefetching in a computer processor. An aspect includes determining a plurality of p-slices corresponding to a delinquent instruction. Another aspect includes selecting a first p-slice and a second p-slice of the plurality of p-slices. Another aspect includes traversing the first p-slice and the second p-slice to determine that divergent instructions exist between the first p-slice and the second p-slice. Another aspect includes, based on determining that divergent instructions exist between the first p-slice and the second p-slice, determining whether the first p-slice and the second p-slice converge after the divergent instructions. Another aspect includes, based on determining that the first p-slice and the second p-slice converge after the divergent instructions, merging the first p-slice and the second p-slice into a single merged p-slice.
    Type: Grant
    Filed: December 10, 2015
    Date of Patent: August 30, 2016
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Islam Atta, Ioana M. Baldini Soares, Kailash Gopalakrishnan, Vijayalakshmi Srinivasan
  • Patent number: 9430243
    Abstract: A system and method for efficiently reducing the latency of initializing registers. A register rename unit within a processor determines whether prior to an execution pipeline stage it is known a decoded given instruction writes a particular numerical value in a destination operand. An example is a move immediate instruction that writes a value of 0 in its destination operand. Other examples may also qualify. If the determination is made, a given physical register identifier is assigned to the destination operand, wherein the given physical register identifier is associated with the particular numerical value, but it is not associated with an actual physical register in a physical register file. The given instruction is marked to prevent it from proceeding to an execution pipeline stage. When the given physical register identifier is used to read the physical register file, no actual physical register is accessed.
    Type: Grant
    Filed: April 30, 2012
    Date of Patent: August 30, 2016
    Assignee: Apple Inc.
    Inventors: James B. Keller, John H. Mylius, Conrado Blasco-Allue, Gerard R. Williams, III
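    Illustration: one way to picture the rename-time handling of a move-immediate of zero. The sentinel identifier, class name, and method names are hypothetical, and only the zero case is modelled; the abstract notes that other values may also qualify.

```python
ZERO_PREG = -1  # hypothetical identifier that is not backed by a real register


class Renamer:
    """Toy rename unit that resolves 'move immediate 0' without executing it."""

    def __init__(self, num_physical):
        self.free = list(range(num_physical))
        self.regfile = [0] * num_physical
        self.map = {}  # architectural register name -> physical identifier

    def rename_move_immediate(self, dest, imm):
        if imm == 0:
            self.map[dest] = ZERO_PREG  # no physical register is consumed
            return None                 # marked as not needing execution
        preg = self.free.pop(0)
        self.map[dest] = preg
        return (preg, imm)              # micro-op still has to execute

    def read(self, arch_reg):
        preg = self.map[arch_reg]
        if preg == ZERO_PREG:
            return 0                    # value is implied by the identifier
        return self.regfile[preg]
```

    In this sketch a zeroing move leaves the free list untouched and returns no work for the execution stage, which corresponds to the latency saving the abstract describes.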
  • Patent number: 9430235
    Abstract: A method and information processing system manage load and store operations that can be executed out-of-order. At least one of a load instruction and a store instruction is executed. A determination is made that an operand store compare hazard has been encountered. An entry within an operand store compare hazard prediction table is created based on the determination. The entry includes at least an instruction address of the instruction that has been executed and a hazard indicating flag associated with the instruction. The hazard indicating flag indicates that the instruction has encountered the operand store compare hazard. When a load instruction is associated with the hazard indicating flag, the load instruction becomes dependent upon all store instructions associated with a substantially similar hazard indicating flag.
    Type: Grant
    Filed: July 29, 2013
    Date of Patent: August 30, 2016
    Assignee: International Business Machines Corporation
    Inventors: Gregory W. Alexander, Khary J. Alexander, Brian Curran, Jonathan T. Hsieh, Christian Jacobi, James R. Mitchell, Brian R. Prasky, Brian W. Thompto
  • Patent number: 9424365
    Abstract: Techniques and approaches are provided for creating indexes and column constraints on structured XML data that is stored in a relational database. Data Definition Language (DDL) Create Index and Create Constraint commands have extended syntax that allows the specification of a path-based expression instead of requiring a column and table name. A mapping created by the system when an XML Schema is registered stores the correspondence of XML data elements to automatically-created database tables and columns that are given names only useful for the internal system. When a user provides a path-based expression in a DDL when creating an index or constraint, the path-based expression is translated to the underlying database constructs using the mapping. Issues are addressed for handling path-based expressions that evaluate to more than one element. Additional index optimization is described using data type information available in the XML schema to select the optimal index type.
    Type: Grant
    Filed: October 30, 2009
    Date of Patent: August 23, 2016
    Assignee: Oracle International Corporation
    Inventors: Beda Christoph Hammerschmidt, Zhen Hua Liu, Thomas Baby
  • Patent number: 9354882
    Abstract: Example methods and apparatus to manage partial commit-checkpoints are disclosed. A disclosed example method includes identifying a commit instruction associated with a region of instructions executed by a processor, identifying candidate instructions from the region of instructions, and generating a processor partial commit-checkpoint to save a current state of the processor, the checkpoint based on calculated register values associated with live instructions, and including instruction reference addresses to link the candidate instructions.
    Type: Grant
    Filed: September 30, 2013
    Date of Patent: May 31, 2016
    Assignee: Intel Corporation
    Inventors: Edson Borin, Youfeng Wu
  • Patent number: 9354885
    Abstract: Processing of an instruction fetch from an instruction cache is provided, which includes: determining whether the next instruction fetch is in a same cache line of the instruction cache as a last instruction fetch; and based, at least in part, on determining that the next instruction fetch is in the same cache line, suppressing for the next instruction fetch one or more instruction cache-related directory accesses, and forcing for the next instruction an address match signal for the same cache line. The suppressing may include generating a known-to-hit signal where the next fetch is in the same cache line, and the last fetch is not a branch instruction, and issuing an instruction cache hit where a cache line segment of the same cache line having the next instruction has a valid validity bit, the valid validity bit having been retrieved and maintained based on a most-recent, instruction cache-directory-accessed fetch.
    Type: Grant
    Filed: January 8, 2016
    Date of Patent: May 31, 2016
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Michael K. Gschwind, Valentina Salapura
  • Patent number: 9342309
    Abstract: A method and circuit arrangement tightly couple together decode logic associated with multiple types of execution units and having varying priorities to enable instructions that are decoded as valid instructions for multiple types of execution units to be forwarded to a highest priority type of execution unit among the multiple types of execution units. Among other benefits, when an auxiliary execution unit is coupled to a general purpose processing core with the decode logic for the auxiliary execution unit tightly coupled with the decode logic for the general purpose processing core, the auxiliary execution unit may be used to effectively overlay new functionality for an existing instruction that is normally executed by the general purpose processing core, e.g., to patch a design flaw in the general purpose processing core or to provide improved performance for specialized applications.
    Type: Grant
    Filed: March 11, 2013
    Date of Patent: May 17, 2016
    Assignee: International Business Machines Corporation
    Inventors: Adam J. Muff, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs
  • Patent number: 9317294
    Abstract: A method and circuit arrangement utilize inactive non-pipelined operation resources in one processing core of a multi-core processing unit to execute non-pipelined instructions on behalf of another processing core in the same processing unit. Adjacent processing cores in a processing unit may be coupled together such that, for example, when one processing core's non-pipelined execution sequencer is busy, that processing core may issue into another processing core's non-pipelined execution sequencer if that other processing core's non-pipelined execution sequencer is idle, thereby providing intermittent concurrent execution of multiple non-pipelined instructions within each individual processing core.
    Type: Grant
    Filed: December 6, 2012
    Date of Patent: April 19, 2016
    Assignee: International Business Machines Corporation
    Inventors: Adam J. Muff, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs
  • Patent number: 9311087
    Abstract: A data processing apparatus supports speculative execution and the use of sticky bits. A different version of a sticky bit is associated with each segment of the speculative program flow. The segments of the program flow are separated by speculation nodes corresponding to program instructions which may be followed by a plurality of different alternative program instructions serving as the next program instruction. When a speculation node is resolved, then the segments separated by that speculation node are merged and the sticky bit values for those two segments are merged.
    Type: Grant
    Filed: December 21, 2012
    Date of Patent: April 12, 2016
    Assignee: ARM Limited
    Inventors: Luca Scalabrino, Cédric Denis Robert Airaud, Guillaume Schon, Frederic Jean Denis Arsanto
  • Patent number: 9304776
    Abstract: A computer system may recognize a busy-wait loop in program instructions at compile time and/or may recognize busy-wait looping behavior during execution of program instructions. The system may recognize that an exit condition for a busy-wait loop is specified by a conditional branch type instruction in the program instructions. In response to identifying the loop and the conditional branch type instruction that specifies its exit condition, the system may influence or override a prediction made by a dynamic branch predictor, resulting in a prediction that the exit condition will be met and that the loop will be exited regardless of any observed branch behavior for the conditional branch type instruction. The looping instructions may implement waiting for an inter-thread communication event to occur or for a lock to become available. When the exit condition is met, the loop may be exited without incurring a misprediction delay.
    Type: Grant
    Filed: January 31, 2012
    Date of Patent: April 5, 2016
    Assignee: Oracle International Corporation
    Inventors: David Dice, Mark S. Moir
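    Illustrative sketch for the abstract above: a toy predictor in which a branch identified as the exit condition of a busy-wait loop is always predicted to exit, regardless of its observed history. The class name, the 2-bit saturating counter, and the `mark_busy_wait_exit` hook are assumptions made for illustration; the abstract does not specify how the predictor is built or how the override is signalled.

```python
class OverridablePredictor:
    """2-bit saturating-counter predictor with a busy-wait override (hypothetical)."""

    def __init__(self):
        self.counters = {}            # branch address -> counter in 0..3
        self.busy_wait_exits = set()  # branches flagged as busy-wait loop exits

    def mark_busy_wait_exit(self, pc):
        # Called when a compiler hint or runtime detection identifies the branch.
        self.busy_wait_exits.add(pc)

    def predict_exit(self, pc):
        if pc in self.busy_wait_exits:
            return True               # override: ignore observed behavior
        return self.counters.get(pc, 1) >= 2

    def update(self, pc, exited):
        count = self.counters.get(pc, 1)
        self.counters[pc] = min(3, count + 1) if exited else max(0, count - 1)


predictor = OverridablePredictor()
SPIN_BRANCH = 0x40                    # hypothetical address of the loop-exit branch

for _ in range(10):                   # the loop spins ten times without exiting
    predictor.update(SPIN_BRANCH, exited=False)

history_only = predictor.predict_exit(SPIN_BRANCH)   # history says "keep looping"
predictor.mark_busy_wait_exit(SPIN_BRANCH)
overridden = predictor.predict_exit(SPIN_BRANCH)     # override says "exit"
```

    Mispredicting while the loop is still spinning costs little, because the thread is only waiting; predicting the exit correctly on the final iteration is what removes the misprediction delay.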
  • Patent number: 9304932
    Abstract: In a particular embodiment, an apparatus includes control logic configured to selectively set bits of a multi-bit way prediction mask based on a prediction mask value. The control logic is associated with an instruction cache including a data array. A subset of line drivers of the data array is enabled responsive to the multi-bit way prediction mask. The subset of line drivers includes multiple line drivers.
    Type: Grant
    Filed: December 20, 2012
    Date of Patent: April 5, 2016
    Assignee: QUALCOMM Incorporated
    Inventors: Peter G. Sassone, Suresh K. Venkumahanti, Lucian Codrescu
  • Patent number: 9292450
    Abstract: Methods and migration units for use in out-of-order processors for migrating data to register file caches associated with functional units of the processor to satisfy register read operations. The migration unit receives register read operations to be executed for a particular functional unit. The migration unit reviews entries in a register renaming table to determine if the particular functional unit has recently accessed the source register and thus is likely to comprise an entry for the source register in its register file cache. In particular, the register renaming table comprises entries for physical registers that indicate what functional units have accessed the physical register. If the particular functional unit has not accessed the particular physical register the migration unit migrates data to the register file cache associated with the particular functional unit.
    Type: Grant
    Filed: February 25, 2014
    Date of Patent: March 22, 2016
    Assignee: Imagination Technologies Limited
    Inventors: Hugh Jackson, Anand Khot
  • Patent number: 9286097
    Abstract: Apparatuses, methods and storage media associated with switching operating systems are disclosed herewith. In embodiments, an apparatus for computing may include one or more processors; and a virtual machine manager to be operated by the one or more processors to instantiate a first virtual machine with a first operating system in a background, and a second virtual machine with a second operating system in a foreground; wherein the virtual machine manager is further to place the first virtual machine, on instantiation, in background into a standby state. Other embodiments may be disclosed or claimed.
    Type: Grant
    Filed: November 7, 2013
    Date of Patent: March 15, 2016
    Assignee: Intel Corporation
    Inventors: Michael A. Rothman, Vincent J. Zimmer, Ping Wu, Zijan You
  • Patent number: 9280492
    Abstract: Embodiments of an invention for a load instruction for code conversion are disclosed. In one embodiment, a processor includes an instruction unit and an execution unit. The instruction unit is to receive an instruction having a source operand to indicate a source location and a destination operand to indicate a destination location. The execution unit is to execute the instruction. Execution of the instruction includes checking the access permissions of the source location and loading content from the source location into the destination location if the access permissions of the source location indicate that the content is executable.
    Type: Grant
    Filed: December 28, 2013
    Date of Patent: March 8, 2016
    Assignee: Intel Corporation
    Inventors: Paul Caprioli, Alexandre Farcy
  • Patent number: 9274793
    Abstract: A system for executing instructions using a plurality of memory fragments for a processor. The system includes a global front end scheduler for receiving an incoming instruction sequence, wherein the global front end scheduler partitions the incoming instruction sequence into a plurality of code blocks of instructions and generates a plurality of inheritance vectors describing interdependencies between instructions of the code blocks. The system further includes a plurality of virtual cores of the processor coupled to receive code blocks allocated by the global front end scheduler, wherein each virtual core comprises a respective subset of resources of a plurality of partitionable engines, wherein the code blocks are executed by using the partitionable engines in accordance with a virtual core mode and in accordance with the respective inheritance vectors. A plurality of memory fragments are coupled to the partitionable engines for providing data storage.
    Type: Grant
    Filed: March 23, 2012
    Date of Patent: March 1, 2016
    Assignee: SOFT MACHINES, INC.
    Inventor: Mohammad Abdallah
  • Patent number: 9250917
    Abstract: An approach is provided in which a distributed runtime environment executes a software application that includes isolated runtime constructs corresponding to an isolated runtime environment. During the execution, the distributed runtime environment identifies isolated runtime constructs included in the software application and selects distributed runtime constructs corresponding to the isolated runtime constructs. In turn, the distributed runtime environment executes the distributed runtime constructs in lieu of executing the isolated runtime constructs.
    Type: Grant
    Filed: September 22, 2014
    Date of Patent: February 2, 2016
    Assignee: International Business Machines Corporation
    Inventor: Douglas Davis
  • Patent number: 9135005
    Abstract: Store multiple instructions are managed based on previous execution history and their alignment. At least one store multiple instruction is detected. A flag is determined to be associated with the at least one store multiple instruction. The flag indicates that the at least one store multiple instruction has previously encountered an operand store compare hazard. The at least one store multiple instruction is organized into a set of unit of operations. The set of unit of operations is executed. The executing avoids the operand store compare hazard previously encountered by the at least one store multiple instruction.
    Type: Grant
    Filed: January 28, 2010
    Date of Patent: September 15, 2015
    Assignee: International Business Machines Corporation
    Inventors: Khary J. Alexander, Fadi Busaba, Brian Curran, Bruce Giamei, Christian Jacobi, James R. Mitchell
  • Patent number: 9128700
    Abstract: A technique for restoring a register renaming map is described. In one example, a restore table having a number of storage locations saves a copy of the register renaming map whenever a flow-risk instruction is passed to a re-order buffer. When all storage locations are full, further instructions still pass to the re-order buffer, but a copy of the map is not saved. A storage location subsequently becomes available when its associated flow-risk instruction is executed. A register renaming map state for an unrecorded flow-risk instruction passed to the re-order buffer while the storage locations were full is generated and stored in the available location. This is generated using the restore table entry for a previous flow-risk instruction and re-order buffer values for intervening instructions between the previous and unrecorded flow-risk instructions. The restore table can be used to restore the map if an unexpected change in instruction flow occurs.
    Type: Grant
    Filed: July 31, 2012
    Date of Patent: September 8, 2015
    Assignee: Imagination Technologies Limited
    Inventor: Hugh Jackson
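    Illustrative sketch for the abstract above: rebuilding the register renaming map for a flow-risk instruction that was not recorded because the restore table was full. The map is modelled as a dictionary from architectural to physical register names, and each re-order buffer entry as an (architectural register, new physical register) pair; both representations, and the name `rebuild_map`, are assumptions rather than the patent's data layout.

```python
def rebuild_map(prev_snapshot, rob_entries):
    """Replay intervening re-order buffer entries over an earlier saved map.

    prev_snapshot: map copy saved for the previous flow-risk instruction.
    rob_entries:   (arch_reg, phys_reg) pairs for the instructions between the
                   previous and the unrecorded flow-risk instruction, in order.
    """
    rebuilt = dict(prev_snapshot)        # leave the saved copy untouched
    for arch_reg, phys_reg in rob_entries:
        rebuilt[arch_reg] = phys_reg     # later renames overwrite earlier ones
    return rebuilt


snapshot_at_branch_a = {"r1": "p1", "r2": "p2"}
between = [("r1", "p7"), ("r3", "p8"), ("r1", "p9")]
map_at_branch_b = rebuild_map(snapshot_at_branch_a, between)
```

    The rebuilt map can then be written into the storage location that became free, so a later unexpected change in instruction flow at the second instruction can still be recovered from the table.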
  • Patent number: 9124233
    Abstract: An audio equalizer includes an equalization processor that operates in conjunction with a transformed-based audio decoder that generates a decoded audio signal from an encoded audio signal. The equalization processor receives an equalization input signal, generates a plurality of response coefficients in response to the equalization input and applies the response coefficients to partially decoded data of the transformed-based audio decoder.
    Type: Grant
    Filed: August 25, 2010
    Date of Patent: September 1, 2015
    Assignee: VIXS Systems, INC
    Inventor: Hong Zeng
  • Patent number: 9122424
    Abstract: Systems and methods are disclosed for managing data entry buffers in a data storage device. A memory of the data storage device includes one or more data input ports. The device further includes a controller configured to receive a data entry over one of the data input ports and store the data entry in a first data structure (e.g., a FIFO data structure). The data entry is stored in the first data structure among other data entries received over various data input ports. The controller stores a data entry corresponding to the data entry stored in the first data structure in a second data structure. Entries in the second data structure include a valid bit field and one or more condition fields. The controller indicates, using a valid bit field of the second data structure data entry, that the corresponding data entry stored in the first data structure is valid.
    Type: Grant
    Filed: September 5, 2013
    Date of Patent: September 1, 2015
    Assignee: Western Digital Technologies, Inc.
    Inventor: Jianxun Gao
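    Illustrative sketch for the abstract above: a FIFO of data entries paired with a second structure that holds a valid bit and condition fields for each entry. The class and method names are hypothetical, and the `invalidate` and `pop_valid` behaviour (skipping entries whose valid bit was cleared) is an assumption; the abstract states only that the controller marks the corresponding entry as valid.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class TrackingEntry:
    """Second-structure entry: a valid bit plus arbitrary condition fields."""
    valid: bool = False
    conditions: dict = field(default_factory=dict)


class EntryBuffer:
    def __init__(self):
        self.fifo = deque()       # first data structure: entries from all ports
        self.tracking = deque()   # second data structure: one entry per FIFO slot

    def receive(self, port, payload, **conditions):
        self.fifo.append((port, payload))
        self.tracking.append(TrackingEntry(valid=True, conditions=dict(conditions)))

    def invalidate(self, index):
        self.tracking[index].valid = False

    def pop_valid(self):
        # Drain in arrival order, discarding entries that are no longer valid.
        while self.fifo:
            entry = self.fifo.popleft()
            meta = self.tracking.popleft()
            if meta.valid:
                return entry
        return None


buf = EntryBuffer()
buf.receive(0, "a")
buf.receive(1, "b", flushed=False)
buf.receive(0, "c")
buf.invalidate(0)                 # entry "a" is no longer needed
first = buf.pop_valid()
second = buf.pop_valid()
third = buf.pop_valid()
```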
  • Patent number: 9116719
    Abstract: Described herein are technologies for optimizing computer code. A code generator can optimize a portion of original code to create optimized code. The code generator can create a partial commit point to indicate that execution of the optimized code produces an invalid architectural state. The code generator can create recovery information to recover a valid architectural state at a recovery point. The code generator can associate the partial commit point and recovery information with the optimized code.
    Type: Grant
    Filed: June 27, 2013
    Date of Patent: August 25, 2015
    Assignee: Intel Corporation
    Inventors: Raul Martinez, Enric Gibert Codina, Marc Lupon, Kyriakos A. Stavrou
  • Patent number: 9110802
    Abstract: A method of implementing a mask load or mask store instruction by a processor is provided. The method may include receiving the mask load or mask store instruction, a location of a memory operand and a location of corresponding mask bits associated with the memory operand, breaking the received memory operand into a plurality of sub-operands and executing the mask load or mask store instruction on each of the plurality of sub-operands using a fastpath operation or using microcode, wherein the respective mask load or mask store instruction loads or stores each of the plurality of sub-operands based upon the corresponding mask bits.
    Type: Grant
    Filed: November 5, 2010
    Date of Patent: August 18, 2015
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Kelvin Goveas, Edward McLellan, Steven Beigelmacher, David Kroesche, Michael Clark
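    Illustrative sketch for the abstract above: a masked load that splits the memory operand into sub-operands and loads only the elements whose mask bit is set. The function name, the list-based memory model, and the choice to leave unmasked destination elements unchanged are assumptions; the fastpath-versus-microcode selection described in the abstract is not modelled.

```python
def masked_load(memory, base, mask_bits, dest, chunk=4):
    """Load memory[base + i] into position i wherever mask_bits[i] is set.

    The operand is processed in sub-operands of `chunk` elements each.
    Assumption: elements with a clear mask bit keep their previous value.
    """
    result = list(dest)
    for start in range(0, len(mask_bits), chunk):        # one sub-operand per pass
        stop = min(start + chunk, len(mask_bits))
        for i in range(start, stop):
            if mask_bits[i]:
                result[i] = memory[base + i]
    return result


memory = list(range(100, 116))
mask = [1, 0, 1, 1, 0, 0, 1, 0]
loaded = masked_load(memory, base=2, mask_bits=mask, dest=[0] * 8)
```

    Because each sub-operand consults only its own slice of the mask, a sub-operand whose mask bits are all clear never touches memory in this model.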
  • Patent number: 9104532
    Abstract: Embodiments relate to sequential location accesses in an active memory device that includes memory and a processing element. An aspect includes a method for sequential location accesses that includes receiving from the memory a first group of data values associated with a queue entry at the processing element. A tag value associated with the queue entry and specifying a position from which to extract a first subset of the data values is read. The queue entry is populated with the first subset of the data values starting at the position specified by the tag value. The processing element determines whether a second subset of the data values in the first group of data values is associated with a subsequent queue entry, and populates a portion of the subsequent queue entry with the second subset of the data values.
    Type: Grant
    Filed: December 14, 2012
    Date of Patent: August 11, 2015
    Assignee: International Business Machines Corporation
    Inventors: Bruce M. Fleischer, Thomas W. Fox, Hans M. Jacobson, Ravi Nair
  • Patent number: 9081563
    Abstract: Apparatus and a method for causing scheduler software to produce code which executes more rapidly by ignoring some of the normal constraints placed on its scheduling operations and simply scheduling certain instructions to run as fast as possible, raising an exception if the scheduling violates a scheduling constraint, and determining steps to be taken for correctly executing each set of instructions about which an exception is raised.
    Type: Grant
    Filed: June 4, 2012
    Date of Patent: July 14, 2015
    Inventors: Guillermo J. Rozas, Godfrey P. D'Souza, Charles R. Price, Paul S. Serris
  • Patent number: 9063824
    Abstract: Embodiments of the present invention provide a method, system and computer program product for melding mediation and adaptation modules of a service component architecture (SCA) system. A method for melding mediation and adaptation modules of an SCA system can include selecting each of a mediation module and an adaptation module in an integrated development tool executing in memory by a processor of a computer and loading respectively different descriptor files for each of the mediation module and the adaptation module. The method further can include combining descriptors from the different descriptor files into a single descriptor file for a melded module. Finally, the method can include modifying names and wiring descriptors in the single descriptor file for the melded module to account for a combination of the mediation component and the adaptation component in the melded component.
    Type: Grant
    Filed: January 4, 2014
    Date of Patent: June 23, 2015
    Assignee: International Business Machines Corporation
    Inventors: Gregory A. Flurry, Christopher H. Gerken, Paul Verschueren
  • Patent number: 9009506
    Abstract: Embodiments of a processing architecture are described. The architecture includes a fetch unit for fetching instructions from a data bus. A scheduler receives data from the fetch unit, creates a schedule, and allocates the data and schedule to a plurality of computational units. The scheduler also modifies voltage and frequency settings of the processing architecture to optimize power consumption and throughput of the system. The computational units include control units and execute units. The control units receive and decode the instructions and send the decoded instructions to execute units. The execute units then execute the instructions according to relevant software.
    Type: Grant
    Filed: March 1, 2012
    Date of Patent: April 14, 2015
    Assignee: NXP B.V.
    Inventors: Hamed Fatemi, Ajay Kapoor, Jose Pineda de Gyvez
  • Publication number: 20150100759
    Abstract: A system and method for controlling operation of a pipeline. In one embodiment, a pipelined datapath includes a plurality of processing stages and a pipeline controller. Each of the processing stages is configured to further processing provided by a previous one of the processing stages. The pipeline controller is configured to control operation of the processing stages. The pipeline controller includes a pipelined finite state machine. The pipelined finite state machine includes a plurality of control stages. Each of the control stages is configured to control operation of a single one of the processing stages, and to receive a state value that defines a state of the control stage for controlling the single one of the processing stages from a previous control stage.
    Type: Application
    Filed: October 7, 2013
    Publication date: April 9, 2015
    Applicant: TEXAS INSTRUMENTS DEUTSCHLAND GMBH
    Inventors: Christian Wiencke, Marko Krüger, Markus Kösler
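    Illustrative sketch for the abstract above: a pipelined finite state machine in which each control stage takes its state value from the previous control stage on every clock, so the control state travels alongside the data it governs. The `step` function and the state names are hypothetical; the abstract does not name the states or fix the number of stages.

```python
def step(control_states, new_state):
    """Advance one clock cycle.

    Each control stage receives the state held by the stage before it, and the
    first stage receives `new_state`. Returns a new list, one item per stage.
    """
    return [new_state] + control_states[:-1]


states = ["IDLE", "IDLE", "IDLE"]     # three control stages, one per processing stage
states = step(states, "RUN")          # cycle 1: RUN enters the first stage
after_cycle_1 = list(states)
states = step(states, "FLUSH")        # cycle 2: RUN moves on, FLUSH enters
after_cycle_2 = list(states)
```

    Each control stage here depends only on its predecessor's previous state, so no single controller has to track every processing stage at once.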
  • Patent number: 8924661
    Abstract: A data storage system includes a plurality of non-volatile memory devices arranged in one or more sets, a main controller and one or more processors. The main controller is configured to accept commands from a host and to convert the commands into recipes. Each recipe includes a list of multiple memory operations to be performed sequentially in the non-volatile memory devices belonging to one of the sets. Each of the processors is associated with a respective set of the non-volatile memory devices, and is configured to receive one or more of the recipes from the main controller and to execute the memory operations specified in the received recipes in the non-volatile memory devices belonging to the respective set.
    Type: Grant
    Filed: January 17, 2010
    Date of Patent: December 30, 2014
    Assignee: Apple Inc.
    Inventors: Michael Shachar, Barak Rotbard, Oren Golov, Uri Perlmutter, Dotan Sokolov, Julian Vlaiko, Yair Schwartz