Patents Examined by Benjamin Geib
  • Patent number: 9690587
    Abstract: Embodiments relate to variable branch prediction. An aspect includes determining a branch selection of an execution unit of a processor and determining whether a present prediction state of the state machine correctly predicted the branch selection by the execution unit. The aspect includes determining whether a predetermined condition is met for performing an alternative state transition and, based on determining that the predetermined condition is met, changing the present prediction state of the branch prediction state machine from the one state to another state according to an alternative state transition process based on the branch selection of the execution unit and the determination whether the present prediction state of the state machine correctly predicted the branch selection by the execution unit.
    Type: Grant
    Filed: April 8, 2014
    Date of Patent: June 27, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Narasimha R. Adiga, James J. Bonanno, Ashutosh Misra, Anthony Saporito
  • Patent number: 9652235
    Abstract: A system for synchronizing parallel processing of a plurality of functional processing units (FPU), a first FPU and a first program counter to control timing of a first stream of program instructions issued to the first FPU by advancement of the first program counter; a second FPU and a second program counter to control timing of a second stream of program instructions issued to the second FPU by advancement of the second program counter, the first FPU is in communication with a second FPU to synchronize the issuance of a first stream of program instructions to the second stream of program instructions and the second FPU is in communication with the first FPU to synchronize the issuance of the second stream program instructions to the first stream of program instructions.
    Type: Grant
    Filed: November 24, 2015
    Date of Patent: May 16, 2017
    Assignee: International Business Machines Corporation
    Inventor: Changhoan Kim
  • Patent number: 9652244
    Abstract: A processing bypass directory system and method are disclosed. In one embodiment, a bypass directory tracking process includes setting bits in a bypass directory when a corresponding architectural register is written. The bits are selectively cleared in the bypass directory each cycle. The configuration of the bits is utilized to determine which stage of a bypass path processing information is at.
    Type: Grant
    Filed: June 25, 2012
    Date of Patent: May 16, 2017
    Assignee: Intellectual Ventures Holding 81 LLC
    Inventors: Alexander Klaiber, Guillermo Rozas
  • Patent number: 9632790
    Abstract: A processing device comprises select logic to schedule a plurality of instructions for execution. The select logic calculates a reconstructed program order (RPO) value for each of a plurality of instructions that are ready to be scheduled for execution. The select logic creates an ordered list of instructions based on the delayed RPO values, the delayed RPO values comprising the calculated RPO values from a previous execution cycle, and dispatches instructions for scheduling based on the ordered list.
    Type: Grant
    Filed: December 26, 2012
    Date of Patent: April 25, 2017
    Assignee: Intel Corporation
    Inventors: Jayesh Iyer, Nikolay Kosarev, Sergey Y. Shishlov, Alexey Y. Sivtsov, Yuriy V Baida, Alexander V Butuzov, Bob Babayan, Vladimir Pentkovski
  • Patent number: 9619231
    Abstract: A central processing unit (CPU) having an interrupt unit for interrupting execution of instructions, a plurality context defining register sets, wherein each set of registers having the same number of CPU registers, a switching unit for coupling a selected register set within the CPU, wherein the switching unit switches to a predetermined register set of the plurality of context defining register sets upon occurrence of an exception, and a control register configured to control selection of a register set of the plurality of context defining register initiated by an instruction and further operable to indicate a currently used context.
    Type: Grant
    Filed: March 7, 2014
    Date of Patent: April 11, 2017
    Assignee: MICROCHIP TECHNOLOGY INCORPORATED
    Inventors: Michael I. Catherwood, Bryan Kris, David Mickey, Joseph Kanellopoulos
  • Patent number: 9606807
    Abstract: Devices, systems, and methods of communicating information directly to a sequencer or a buffer in a memory device are provided. In some embodiments, instructions are sent directly from an external processor to a sequencer in the memory device, and the sequencer configures the instructions for an internal processor, such as one or more arithmetic logic units (ALUs) embedded on the memory device. Further, data to be operated on by the internal processor can be sent directly from the external processor to a buffer, and the sequencer can copy the data from the buffer to the internal processor. As power can be consumed each time a memory array is written to or read from, the direct communication of instructions and/or data can reduce the power consumed in writing to or reading from the memory array.
    Type: Grant
    Filed: June 4, 2009
    Date of Patent: March 28, 2017
    Assignee: Micron Technology, Inc.
    Inventor: Robert Walker
  • Patent number: 9606829
    Abstract: A technique for suspending transactional memory transactions without stack corruption. A first function that begins a transactional memory transaction is allocated a stack frame on a default program stack, then returns. Prior to suspending the transaction, or after suspending the transaction but prior to allocating any suspended mode stack frames, either of the following operations is performed: (1) switch from the default program stack to an alternative program stack, or (2) switch from a default region of the default program stack where the first function's stack frame was allocated to an alternative region of the default program stack. Prior to resuming the transaction, or after resuming the transaction but prior to allocating any transaction mode stack frames, either of the following operations is performed: (1) switch from the alternative program stack to the default program stack, or (2) switch from the alternative stack region to the default stack region.
    Type: Grant
    Filed: October 11, 2014
    Date of Patent: March 28, 2017
    Assignee: International Business Machines Corporation
    Inventor: Paul E. McKenney
  • Patent number: 9600287
    Abstract: An instruction stream includes a transactional code region. The transactional code region includes a latent modification instruction (LMI), a next sequential instruction (NSI) following the LMI, and a set of target instructions following the NSI in program order. Each target instruction has an associated function, and the LMI at least partially specifies a substitute function for the associated function. A processor executes the LMI, the NSI, and at least one of the target instructions, employing the substitute function at least partially specified by the LMI. The LMI, the NSI, and the target instructions may be executed by the processor in sequential program order or out of order.
    Type: Grant
    Filed: August 19, 2015
    Date of Patent: March 21, 2017
    Assignee: International Business Machines Corporation
    Inventors: Michael Karl Gschwind, Valentina Salapura, Chung-Lung K. Shum, Timothy J. Slegel
  • Patent number: 9600286
    Abstract: An instruction stream includes a transactional code region. The transactional code region includes a latent modification instruction (LMI), a next sequential instruction (NSI) following the LMI, and a set of target instructions following the NSI in program order. Each target instruction has an associated function, and the LMI at least partially specifies a substitute function for the associated function. A processor executes the LMI, the NSI, and at least one of the target instructions, employing the substitute function at least partially specified by the LMI. The LMI, the NSI, and the target instructions may be executed by the processor in sequential program order or out of order.
    Type: Grant
    Filed: June 30, 2014
    Date of Patent: March 21, 2017
    Assignee: International Business Machines Corporation
    Inventors: Michael Karl Gschwind, Valentina Salapura, Chung-Lung K. Shum, Timothy J. Slegel
  • Patent number: 9594724
    Abstract: An aspect includes accessing a vector register in a vector register file. The vector register file includes a plurality of vector registers and each vector register includes a plurality of elements. A read command is received at a read port of the vector register file. The read command specifies a vector register address. The vector register address is decoded by an address decoder to determine a selected vector register of the vector register file. An element address is determined for one of the plurality of elements associated with the selected vector register based on a read element counter of the selected vector register. A word is selected in a memory array of the selected vector register as read data based on the element address. The read data is output from the selected vector register based on the decoding of the vector register address by the address decoder.
    Type: Grant
    Filed: August 9, 2012
    Date of Patent: March 14, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Bruce M. Fleischer, Thomas W. Fox, Hans M. Jacobson, Ravi Nair
  • Patent number: 9594564
    Abstract: An arithmetic processing device includes: first prediction units which output branch prediction information of a fetched conditional branch instruction based on past branch history information of conditional branch instructions; a second prediction unit which stores a branch taken consecutive number of times and a branch not-taken consecutive number of times to a pattern information storage unit, and outputs branch prediction information of a fetched conditional branch instruction based on the past branch taken consecutive number of times or branch not-taken consecutive number of times stored; selecting units which selectively output the branch prediction information output from the first prediction units or the second prediction unit; and a selector which outputs a next instruction address of the conditional branch instruction or a branch target address of the conditional branch instruction to an instruction fetch unit in accordance with the branch prediction information output by the selecting units.
    Type: Grant
    Filed: June 30, 2014
    Date of Patent: March 14, 2017
    Assignee: FUJITSU LIMITED
    Inventors: Kotaro Kuwahara, Takashi Suzuki
  • Patent number: 9588769
    Abstract: A processor performs out-of-order execution of a first instruction and a second instruction after the first instruction in program order, the first instruction includes source and destination indicators, the source indicator specifies a source of data, the destination indicator specifies a destination of the data, the first instruction instructs the processor to move the data from the source to the destination, the second instruction specifies a source indicator that specifies a source of data. A rename unit updates the second instruction source indicator with the first instruction source indicator if there are no intervening instructions that write to the source or to the destination of the first instruction and the second instruction source indicator matches the first instruction destination indicator.
    Type: Grant
    Filed: June 25, 2014
    Date of Patent: March 7, 2017
    Assignee: VIA ALLIANCE SEMICONDUCTOR CO., LTD.
    Inventors: Gerard M. Col, Matthew Daniel Day
  • Patent number: 9582466
    Abstract: An aspect includes accessing a vector register in a vector register file. The vector register file includes a plurality of vector registers and each vector register includes a plurality of elements. A read command is received at a read port of the vector register file. The read command specifies a vector register address. The vector register address is decoded by an address decoder to determine a selected vector register of the vector register file. An element address is determined for one of the plurality of elements associated with the selected vector register based on a read element counter of the selected vector register. A word is selected in a memory array of the selected vector register as read data based on the element address. The read data is output from the selected vector register based on the decoding of the vector register address by the address decoder.
    Type: Grant
    Filed: August 13, 2012
    Date of Patent: February 28, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Bruce M. Fleischer, Thomas W. Fox, Hans M. Jacobson, Ravi Nair
  • Patent number: 9575755
    Abstract: Embodiments relate to vector processing in an active memory device. An aspect includes a method for vector processing in an active memory device that includes memory and a processing element. The method includes decoding, in the processing element, an instruction including a plurality of sub-instructions to execute in parallel. An iteration count to repeat execution of the sub-instructions in parallel is determined. Based on the iteration count, execution of the sub-instructions in parallel is repeated for multiple iterations by the processing element. Multiple locations in the memory are accessed in parallel based on the execution of the sub-instructions.
    Type: Grant
    Filed: August 3, 2012
    Date of Patent: February 21, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Bruce M. Fleischer, Thomas W. Fox, Hans M. Jacobson, Ravi Nair, Daniel A. Prener
  • Patent number: 9569215
    Abstract: A system for synchronizing parallel processing of a plurality of functional processing units (FPU), a first FPU and a first program counter to control timing of a first stream of program instructions issued to the first FPU by advancement of the first program counter; a second FPU and a second program counter to control timing of a second stream of program instructions issued to the second FPU by advancement of the second program counter, the first FPU is in communication with a second FPU to synchronize the issuance of a first stream of program instructions to the second stream of program instructions and the second FPU is in communication with the first FPU to synchronize the issuance of the second stream program instructions to the first stream of program instructions.
    Type: Grant
    Filed: August 15, 2016
    Date of Patent: February 14, 2017
    Assignee: International Business Machines Corporation
    Inventor: Changhoan Kim
  • Patent number: 9558151
    Abstract: Disclosed is a data processing device capable of efficiently performing an arithmetic process on variable-length data and an arithmetic process on fixed-length data. The data processing device includes first PEs of SIMD type, SRAMs provided respectively for the first PEs, and second PEs. The first PEs each perform an arithmetic operation on data stored in a corresponding one of the SRAMs. The second PEs each perform an arithmetic operation on data stored in corresponding ones of the SRAMs. Therefore, the SRAMs can be shared so as to efficiently perform the arithmetic process on variable-length data and the arithmetic process on fixed-length data.
    Type: Grant
    Filed: February 3, 2012
    Date of Patent: January 31, 2017
    Assignee: RENESAS ELECTRONICS CORPORATION
    Inventors: Kan Murata, Hideyuki Noda, Masaru Haraguchi
  • Patent number: 9519485
    Abstract: Embodiments relate to confidence threshold-based opposing path execution for branch prediction. An aspect includes determining a branch prediction for a first branch instruction that is encountered during execution of a first thread, wherein the branch prediction indicates a primary path and an opposing path for the first branch instruction. Another aspect includes executing the primary path by the first thread. Another aspect includes determining a confidence of the branch prediction and comparing the confidence of the branch prediction to a confidence threshold. Yet another aspect includes, based on the confidence of the branch prediction being less than the confidence threshold, starting a second thread that executes the opposing path of the first branch instruction, wherein the second thread is executed in parallel with the first thread.
    Type: Grant
    Filed: September 30, 2014
    Date of Patent: December 13, 2016
    Assignee: International Business Machines Corporation
    Inventors: Fadi Y. Busaba, Steven R. Carlough, Christopher A. Krygowski, Brian R. Prasky, Chung-Lung K. Shum
  • Patent number: 9519488
    Abstract: An external intrinsic interface. A processor may include a core including a plurality of functional units, an intrinsic module located outside the core, and an interface module to perform relaying between the intrinsic module and a functional unit, among the plurality of functional units.
    Type: Grant
    Filed: February 16, 2012
    Date of Patent: December 13, 2016
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Kwon Taek Kwon, Seok Yoon Jung
  • Patent number: 9513916
    Abstract: A computer-implemented method includes determining that two or more instructions of an instruction stream are eligible for optimization, where the two or more instructions include a memory load instruction and a data processing instruction to process data based on the memory load instruction. The method includes merging, by a processor, the two or more instructions into a single optimized internal instruction and executing the single optimized internal instruction to perform a memory load function and a data processing function corresponding to the memory load instruction and the data processing instruction.
    Type: Grant
    Filed: March 8, 2013
    Date of Patent: December 6, 2016
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Michael K. Gschwind, Valentina Salapura
  • Patent number: 9513915
    Abstract: A computer system for optimizing instructions includes a processor including an instruction execution unit configured to execute instructions and an instruction optimization unit configured to optimize instructions and memory to store machine instructions to be executed by the instruction execution unit. The computer system is configured to perform a method including analyzing machine instructions from among a stream of instructions to be executed by the instruction execution unit, the machine instructions including a memory load instruction and a data processing instruction to perform a data processing function based on the memory load instruction, identifying the machine instructions as being eligible for optimization, merging the machine instructions into a single optimized internal instruction, and executing the single optimized internal instruction to perform a memory load function and a data processing function corresponding to the memory load instruction and the data processing instruction.
    Type: Grant
    Filed: March 28, 2012
    Date of Patent: December 6, 2016
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Michael K. Gschwind, Valentina Salapura