Patents Examined by Michael Metzger
  • Patent number: 9792125
    Abstract: A TRANSACTION BEGIN instruction begins execution of a transaction and includes a general register save mask having bits, that when set, indicate registers to be saved in the event the transaction is aborted. At the beginning of the transaction, contents of the registers are saved in memory not accessible to the program, and if the transaction is aborted, the saved contents are copied to the registers.
    Type: Grant
    Filed: May 20, 2016
    Date of Patent: October 17, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Dan F. Greiner, Christian Jacobi, Timothy J. Slegel
  • Patent number: 9785403
    Abstract: An engine architecture for processing finite automata includes a hyper non-deterministic automata (HNA) processor specialized for non-deterministic finite automata (NFA) processing. The HNA processor includes a plurality of super-clusters and an HNA scheduler. Each super-cluster includes a plurality of clusters. Each cluster of the plurality of clusters includes a plurality of HNA processing units (HPUs). A corresponding plurality of HPUs of a corresponding plurality of clusters of at least one selected super-cluster is available as a resource pool of HPUs to the HNA scheduler for assignment of at least one HNA instruction to enable acceleration of a match of at least one regular expression pattern in an input stream received from a network.
    Type: Grant
    Filed: July 8, 2014
    Date of Patent: October 10, 2017
    Assignee: Cavium, Inc.
    Inventors: Rajan Goyal, Satyanarayana Lakshmipathi Billa, Yossef Shanava, Gregg A. Bouchard, Timothy Toshio Nakada
  • Patent number: 9785410
    Abstract: A method for operating a control unit, the control unit including a software-controlled main processing unit, a strictly hardware-based model calculation unit for calculating an algorithm, for carrying out a Bayesian regression method, based on configuration data, and a memory unit, a model memory area being defined in the memory unit to which a configuration register block for providing the configuration data in the model calculation unit is assigned, a calculation start-configuration register being assigned the highest address in the configuration register block into which configuration data are written, the writing into of which starts the calculation in the model calculation unit, the configuration data being written in a memory area of the memory unit from the model memory area into the configuration register block with an incremental copying process, the addresses being copied in the incremental copying process in ascending order.
    Type: Grant
    Filed: July 1, 2014
    Date of Patent: October 10, 2017
    Assignee: ROBERT BOSCH GMBH
    Inventors: Heiner Markert, Wolfgang Fischer, Nico Bannow, Andre Guntoro, Michael Hanselmann
  • Patent number: 9747104
    Abstract: In one example, a method includes responsive to receiving, by a processing unit, one or more instructions requesting that a first value be moved from a first general purpose register (GPR) to a third GPR and that a second value be moved from a second GPR to a fourth GPR, copying, by an initial logic unit and during a first clock cycle, the first value to an initial pipeline register, copying, by the initial logic unit and during a second clock cycle, the second value to the initial pipeline register, copying, by a final logic unit and during a third clock cycle, the first value from a final pipeline register to the third GPR, and copying, by the final logic unit and during a fourth clock cycle, the second value from the final pipeline register to the fourth GPR.
    Type: Grant
    Filed: May 12, 2014
    Date of Patent: August 29, 2017
    Assignee: QUALCOMM Incorporated
    Inventors: Lin Chen, Yun Du, Sumesh Udayakumaran, Chihong Zhang, Andrew Evan Gruber
  • Patent number: 9740498
    Abstract: Disclosed are an opportunistic multi-thread method and processor, the method comprising the following steps: if a zeroth thread, a first thread, a second thread and a third thread all have instructions ready to be executed, then a zeroth clock period, a first clock period, a second clock period and a third clock period are respectively allocated to the zeroth thread, the first thread, the second thread and the third thread; if one of the threads cannot issue an instruction within a specified clock period because the instruction is not ready, and the previous thread still has an instruction ready to be executed after issuing certain instructions in the previous specified clock period, then the previous thread will take the specified clock period.
    Type: Grant
    Filed: November 15, 2012
    Date of Patent: August 22, 2017
    Assignee: Wuxi DSP Technologies Inc.
    Inventor: Shenghong Wang
  • Patent number: 9734126
    Abstract: A system and method for controlling post-silicon configurable instruction behavior are provided. For example, the method includes receiving data related to a compute circuit. The method also includes detecting a data pattern in the data. The method further includes determining that the data pattern is a special case that the compute circuit may handle improperly. The method also includes selecting a value from a post-silicon configurable data set based on the detected data. Further, the method includes changing a behavior of the compute circuit to produce a different output result based on the value selected from the post-silicon configurable data set.
    Type: Grant
    Filed: October 10, 2016
    Date of Patent: August 15, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: James R. Cuffney, Nicol Hofmann, Michael Klein, Petra Leber, Cédric Lichtenau, Silvia M. Mueller, Timothy J. Slegel
  • Patent number: 9703559
    Abstract: When the branch condition of a branch command for a loop process is satisfied and enters the loop mode, the relative branch address is saved in a branch relative address save circuit that points to the branch command for loop processing, and the loop state flag is set in a loop state save circuit. When the loop state flag is set, if the absolute value of the value outputted by a command code counter circuit matches the absolute value of the relative branch address outputted by the branch relative address save circuit, a program counter sum value switching circuit outputs the relative branch address to a program counter adder. If the absolute values do not match, the program counter sum value switching circuit outputs the value ‘1’ to the program counter adder. With this, the branch penalty during loop processing is eliminated even with little hardware.
    Type: Grant
    Filed: November 2, 2012
    Date of Patent: July 11, 2017
    Assignee: NEC CORPORATION
    Inventor: Hiroyuki Igura
  • Patent number: 9697005
    Abstract: In an example, there is disclosed a digital signal processor having a register containing a modular integer configured for use as a thread offset counter. In a multi-stage, pipelined loop, which may be implemented in microcode, the main body of the loop has only one repeating stage. On each stage, the operation executed by each thread of the single repeating stage is identified by the sum of a fixed integer and the thread offset counter. After each pass through the loop, the thread offset counter is incremented, thus maintaining pipelined operation of the single repeating stage.
    Type: Grant
    Filed: December 4, 2013
    Date of Patent: July 4, 2017
    Assignee: ANALOG DEVICES, INC.
    Inventor: Boris Lerner
  • Patent number: 9665377
    Abstract: A processing apparatus, comprising at least a first processing unit and a second processing unit, is proposed. The first processing unit comprises a set of first stateful elements, the second processing unit comprises a set of second stateful elements. A set of synchronization data lines may connect the first stateful elements to the second stateful elements in a pairwise manner. A control unit may control the first processing unit, the second processing unit and the synchronization data lines so as to copy the states of the first stateful elements in parallel via the synchronization data lines to the second stateful elements in response to a synchronization request. A method of synchronizing the processing units is also proposed.
    Type: Grant
    Filed: July 20, 2011
    Date of Patent: May 30, 2017
    Assignee: NXP USA, Inc.
    Inventors: Vladimir Litovtchenko, Harald Luepken, Markus Regner
  • Patent number: 9652242
    Abstract: An apparatus and method for calculating flag bits is disclosed. The flag bits may be used in a processor utilizing branch predication. More particularly, the apparatus and method may be used to calculate a predicate that can be used by a branch unit to evaluate whether a branch is to be taken. In one embodiment, the apparatus is coupled to receive a condition code associated with an instruction, and flag bits generated responsive to execution of the instruction. The condition code is indicative of a condition to be checked resulting from execution of the instruction. The apparatus may then provide an indication of whether the condition is true.
    Type: Grant
    Filed: May 2, 2012
    Date of Patent: May 16, 2017
    Assignee: Apple Inc.
    Inventors: Rajat Goel, Sandeep Gupta, Yamini Modukuru
  • Patent number: 9645819
    Abstract: A computer system, a computer processor and a method executable on a computer processor involve placing each sequence of a plurality of sequences of computer instructions being scheduled for execution in the processor into a separate queue. The head instruction from each queue is stored into a first storage unit prior to determining whether the head instruction is ready for scheduling. For each instruction in the first storage unit that is determined to be ready, the instruction is moved from the first storage unit to a second storage unit. During a first processor cycle, each instruction in the first storage unit that is determined to be not ready is retained in the first storage unit, and the determining of whether the instruction is ready is repeated during the next processor cycle. Scheduling logic performs scheduling of instructions contained in the second storage unit.
    Type: Grant
    Filed: June 15, 2012
    Date of Patent: May 9, 2017
    Assignee: Intel Corporation
    Inventors: Jayesh Iyer, Nikolay Kosarev, Sergey Shishlov, Alexey Sivtsov, Alexander Butuzov, Boris A. Babayan, Vladimir Penkovski
  • Patent number: 9639371
    Abstract: A system and method for efficiently processing instructions in hardware parallel execution lanes within a processor. In response to a given divergent point within an identified loop, a compiler generates code that, when executed, determines a size of a next very large instruction word (VLIW) to process and determines multiple pointer values to store in multiple corresponding PC registers in a target processor. The updated PC registers point to instructions intermingled from different basic blocks between the given divergence point and a corresponding convergence point. The target processor includes a single instruction multiple data (SIMD) micro-architecture. The assignment for a given lane is based on branch direction found at runtime for the given lane at the given divergent point. The processor includes a vector register for mapping PC registers to execution lanes.
    Type: Grant
    Filed: January 29, 2013
    Date of Patent: May 2, 2017
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Reza Yazdani
  • Patent number: 9639451
    Abstract: Debugger system, method and computer program product for debugging instructions.
    Type: Grant
    Filed: January 25, 2010
    Date of Patent: May 2, 2017
    Assignee: NXP USA, INC.
    Inventors: Constantin Tudor, Sorin Babeanu
  • Patent number: 9632787
    Abstract: Some methods, computer program products, and data processing nodes identify a data unit in a data memory that is to be operated upon by a processor circuit, and use a characteristic of the data unit to identify which instruction(s) within an instruction memory are to be executed by the processor circuit to perform an operation upon the data unit. The data memory may be local to the processor circuit, and the instruction memory may be remotely accessible to the processor circuit through a data network.
    Type: Grant
    Filed: October 23, 2012
    Date of Patent: April 25, 2017
    Assignee: CA, Inc.
    Inventor: Gregory Lewis Bodine
  • Patent number: 9632781
    Abstract: Techniques are provided for executing a vector alignment instruction. A scalar register file in a first processor is configured to share one or more register values with a second processor, the one or more register values accessed from the scalar register file according to an Rt address specified in a vector alignment instruction, wherein a start location is determined from one of the shared register values. An alignment circuit in the second processor is configured to align data identified between the start location within a beginning Vu register of a vector register file (VRF) and an end location of a last Vu register of the VRF according to the vector alignment instruction. A store circuit is configured to select the aligned data from the alignment circuit and store the aligned data in the vector register file according to an alignment store address specified by the vector alignment instruction.
    Type: Grant
    Filed: February 26, 2013
    Date of Patent: April 25, 2017
    Assignee: QUALCOMM Incorporated
    Inventors: Ajay A. Ingle, Marc M. Hoffman, Jose Fridman, Lucian Codrescu
  • Patent number: 9632980
    Abstract: An apparatus is described having instruction execution logic circuitry. The instruction execution logic circuitry has input vector element routing circuitry to perform the following for each of three different instructions: for each of a plurality of output vector element locations, route into an output vector element location an input vector element from one of a plurality of input vector element locations that are available to source the output vector element. The output vector element and each of the input vector element locations are one of three available bit widths for the three different instructions. The apparatus further includes masking layer circuitry coupled to the input vector element routing circuitry to mask a data structure created by the input vector element routing circuitry. The masking layer circuitry is designed to mask at three different levels of granularity that correspond to the three available bit widths.
    Type: Grant
    Filed: December 23, 2011
    Date of Patent: April 25, 2017
    Assignee: Intel Corporation
    Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Suleyman Sair
  • Patent number: 9632782
    Abstract: A processor includes an instruction decoder to receive a first instruction to process a secure hash algorithm 2 (SHA-2) hash algorithm, the first instruction having a first operand associated with a first storage location to store a SHA-2 state and a second operand associated with a second storage location to store a plurality of messages and round constants. The processor further includes an execution unit coupled to the instruction decoder to perform one or more iterations of the SHA-2 hash algorithm on the SHA-2 state specified by the first operand and the plurality of messages and round constants specified by the second operand, in response to the first instruction.
    Type: Grant
    Filed: March 30, 2012
    Date of Patent: April 25, 2017
    Assignee: Intel Corporation
    Inventors: Kirk S. Yap, Gilbert M. Wolrich, James D. Guilford, Vinodh Gopal, Erdinc Ozturk, Sean M. Gulley, Wajdi K. Feghali, Martin G. Dixon
  • Patent number: 9626189
    Abstract: Embodiments relate to reducing operand store compare penalties by detecting potential unit of operation (UOP) dependencies. An aspect includes a computer system for reducing operand store compare penalties. The system includes memory and a processor. The system performs a method including cracking an instruction into a plurality of UOPs, where each UOP includes instruction text and address determination fields. The method includes identifying a load UOP among the plurality of UOPs and comparing values of the address determination fields of the load UOP with values of address determination fields of one or more previously-decoded store UOPs. The method also includes forcing, prior to issuance of the instruction to an execution unit, a dependency between the load UOP and the one or more previously-decoded store UOPs based on the comparing.
    Type: Grant
    Filed: June 15, 2012
    Date of Patent: April 18, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Fadi Busaba, David Hutton, John G. Rell, Jr., Chung-Lung K. Shum
  • Patent number: 9626256
    Abstract: A method for diagnosing an aborted transaction from a plurality of transactions is executed by a processor core with a transactional memory that stores information corresponding to a plurality of transactions executed by the processor core, and with a transaction diagnostic register. The processor core retrieves context summary information from at least one register of the processor core. The processor core stores the context summary information of aborted transactions into the transactional memory or the transaction diagnostic register. The context summary information can be used for diagnosing the aborted transactions.
    Type: Grant
    Filed: February 28, 2013
    Date of Patent: April 18, 2017
    Assignee: International Business Machines Corporation
    Inventors: Harold W Cain, Bradly G Frey, Hung Q Le, Cathy May
  • Patent number: 9619345
    Abstract: A processor core includes a transactional memory that stores information corresponding to a plurality of transactions executed by the processor core, and a transaction diagnostic register. The processor core retrieves context summary information from at least one register of the processor core. The processor core stores the context summary information of aborted transactions into the transactional memory or the transaction diagnostic register. The context summary information can be used for diagnosing the aborted transactions.
    Type: Grant
    Filed: September 13, 2012
    Date of Patent: April 11, 2017
    Assignee: International Business Machines Corporation
    Inventors: Harold W Cain, Bradly G Frey, Hung Q Le, Cathy May
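
Illustration for patent 9792125 (TRANSACTION BEGIN register save mask). The abstract describes saving the registers selected by a mask when a transaction begins and restoring them if it aborts. The following is a minimal sketch of that idea, not the patented implementation. It assumes a toy machine with 16 registers and one mask bit per register; the abstract gives neither the register count nor the bit-to-register mapping. The names Cpu, transaction_begin, abort, and demo_abort are hypothetical.

```python
class Cpu:
    """Toy model: general registers plus a save area the program cannot read."""

    NUM_REGS = 16  # assumption: the abstract does not give a register count

    def __init__(self):
        self.regs = [0] * self.NUM_REGS
        self._saved = {}  # stands in for memory not accessible to the program

    def transaction_begin(self, save_mask):
        # Assumption: bit i of the mask (least significant bit = register 0)
        # selects register i for saving.
        self._saved = {
            i: self.regs[i]
            for i in range(self.NUM_REGS)
            if save_mask & (1 << i)
        }

    def abort(self):
        # On abort, copy the saved contents back into the registers.
        for i, value in self._saved.items():
            self.regs[i] = value
        self._saved = {}


def demo_abort():
    cpu = Cpu()
    cpu.regs[0], cpu.regs[1] = 10, 20
    cpu.transaction_begin(save_mask=0b01)  # save register 0 only
    cpu.regs[0], cpu.regs[1] = 111, 222    # work done inside the transaction
    cpu.abort()
    return cpu.regs[0], cpu.regs[1]
```

In this sketch, demo_abort() restores register 0 to its pre-transaction value while register 1, which was not selected by the mask, keeps the value written inside the transaction.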
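
Illustration for patent 9740498 (opportunistic multi-threading). The abstract describes allocating clock periods to threads in turn and letting the previous thread take a period that its owner cannot use. The sketch below models that rule under stated assumptions: each thread is reduced to a count of ready instructions, stalls are given as explicit (thread, cycle) pairs, and "previous thread" is read as the owner of the immediately preceding clock period. The function name schedule and its parameters are hypothetical.

```python
def schedule(pending, stalled, num_cycles, num_threads=4):
    """Return the thread that issues in each cycle (None for an idle slot).

    pending: ready-instruction count per thread (hypothetical model).
    stalled: set of (thread, cycle) pairs where that thread cannot issue.
    """
    pending = list(pending)  # copy so the caller's list is not modified
    issued = []
    for cycle in range(num_cycles):
        owner = cycle % num_threads            # period c belongs to thread c mod N
        previous = (owner - 1) % num_threads   # owner of the preceding period

        def can_issue(thread):
            return pending[thread] > 0 and (thread, cycle) not in stalled

        if can_issue(owner):
            chosen = owner
        elif can_issue(previous):
            chosen = previous                  # previous thread takes the period
        else:
            chosen = None                      # nobody can use the period
        if chosen is not None:
            pending[chosen] -= 1
        issued.append(chosen)
    return issued
```

For example, schedule([2, 1, 1, 1], {(1, 1)}, 4) models thread 1 being unable to issue in its own period while thread 0 still has an instruction ready, so thread 0 issues in periods 0 and 1.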
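
Illustration for patent 9652242 (calculating flag bits for predication). The abstract describes an apparatus that takes a condition code and the flag bits produced by an instruction and reports whether the condition is true. The sketch below shows one way such an evaluation could look. It assumes the common N/Z/C/V flag set and a handful of conventional signed-comparison condition codes; the abstract does not name a flag set or an encoding, so Flags, CONDITIONS, and condition_true are hypothetical.

```python
from collections import namedtuple

# Assumption: negative, zero, carry, and overflow flags, each 0 or 1.
Flags = namedtuple("Flags", "n z c v")

# Assumption: a small subset of conventional condition codes.
CONDITIONS = {
    "EQ": lambda f: f.z == 1,                 # equal
    "NE": lambda f: f.z == 0,                 # not equal
    "GE": lambda f: f.n == f.v,               # signed greater than or equal
    "LT": lambda f: f.n != f.v,               # signed less than
    "GT": lambda f: f.z == 0 and f.n == f.v,  # signed greater than
    "LE": lambda f: f.z == 1 or f.n != f.v,   # signed less than or equal
}


def condition_true(code, flags):
    """Return True if the condition named by `code` holds for `flags`."""
    return CONDITIONS[code](flags)
```

A branch unit in this model would take the branch when condition_true returns True, for instance for code "LT" with flags Flags(n=1, z=0, c=0, v=0).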