Patents Examined by Kenneth Kim
  • Patent number: 9411587
    Abstract: A prefetch optimizer tool for an information handling system (IHS) may improve effective memory access time by controlling both hardware prefetch operations and software prefetch operations. The prefetch optimizer tool selectively disables prefetch instructions in an instruction sequence of interest within an application. The tool measures execution times of the instruction sequence of interest when different prefetch instructions are disabled. The tool may hold hardware prefetch depth constant while cycling through disabling different prefetch instructions and taking corresponding execution time measurements. Alternatively, for each disabled prefetch instruction in the instruction sequence of interest, the tool may cycle through different hardware prefetch depths and take corresponding execution time measurements at each hardware prefetch depth.
    Type: Grant
    Filed: December 11, 2013
    Date of Patent: August 9, 2016
    Assignee: International Business Machines Corporation
    Inventor: Randall Ray Heisch
  • Patent number: 9412718
    Abstract: Methods are provided to operate a processor device in one of multiple power operating modes. The processor device comprises first and second processor chips connected in a stacked configuration, and which respectively include first and second processors that operate as a single logical processor. A control system generates control signals and different sets of configuration parameters. A first control signal is generated to input a first set of configuration parameters to the single logical processor, which is utilized to operate the single logical processor in a first power operating mode wherein the first processor is turned on and the second processor is turned off. A second control signal is generated to input a second set of configuration parameters to the single logical processor, which is utilized to operate the single logical processor in a second power operating mode wherein both the first processor and the second processor are turned on.
    Type: Grant
    Filed: August 31, 2012
    Date of Patent: August 9, 2016
    Assignee: International Business Machines Corporation
    Inventor: Philip G. Emma
  • Patent number: 9389858
    Abstract: An apparatus includes an instruction decoder, first and second source registers and a circuit coupled to the decoder to receive packed data from the source registers and to unpack the packed data responsive to an unpack instruction received by the decoder. A first packed data element and a third packed data element are received from the first source register. A second packed data element and a fourth packed data element are received from the second source register. The circuit copies the packed data elements into a destination register resulting with the second packed data element adjacent to the first packed data element, the third packed data element adjacent to the second packed data element, and the fourth packed data element adjacent to the third packed data element.
    Type: Grant
    Filed: December 29, 2012
    Date of Patent: July 12, 2016
    Assignee: Intel Corporation
    Inventors: Alexander D. Peleg, Yaakov Yaari, Millind Mittal, Larry M. Mennemeier, Benny Eitan
  • Patent number: 9391047
    Abstract: Processor devices are provided which operate in one of multiple power operating modes. A processor device comprises first and second processor chips connected in a stacked configuration, and which respectively include first and second processors that operate as a single logical processor. A mode control circuit generates control signals and different sets of configuration parameters. A first control signal is generated to input a first set of configuration parameters to the single logical processor, which is utilized to operate the single logical processor in a first power operating mode wherein the first processor is turned on and the second processor is turned off. A second control signal is generated to input a second set of configuration parameters to the single logical processor, which is utilized to operate the single logical processor in a second power operating mode wherein both the first processor and the second processor are turned on.
    Type: Grant
    Filed: April 20, 2012
    Date of Patent: July 12, 2016
    Assignee: International Business Machines Corporation
    Inventor: Philip G. Emma
  • Patent number: 9374414
    Abstract: Embodiments of the invention provide a method, system and computer program product for embedding a global barrier and global interrupt network in a parallel computer system organized as a torus network. The computer system includes a multitude of nodes. In one embodiment, the method comprises taking inputs from a set of receivers of the nodes, dividing the inputs from the receivers into a plurality of classes, combining the inputs of each of the classes to obtain a result, and sending said result to a set of senders of the nodes. Embodiments of the invention provide a method, system and computer program product for embedding a collective network in a parallel computer system organized as a torus network. In one embodiment, the method comprises adding to a torus network a central collective logic to route messages among at least a group of nodes in a tree structure.
    Type: Grant
    Filed: August 26, 2013
    Date of Patent: June 21, 2016
    Assignee: International Business Machines Corporation
    Inventors: Dong Chen, Paul W. Coteus, Noel A. Eisley, Alan Gara, Philip Heidelberger, Robert M. Senger, Valentina Salapura, Burkhard Steinmacher-Burow, Yutaka Sugawara, Todd E. Takken
  • Patent number: 9367497
    Abstract: A method for dynamically reconfiguring one or more cores of a multi-core microprocessor comprising a plurality of cores and sideband communication wires, extrinsic to a system bus connected to a chipset, which facilitate non-system-bus inter-core communications. At least some of the cores are operable to be reconfigurably designated with or without master credentials for purposes of structuring sideband-based inter-core communications. The method includes determining an initial configuration of cores of the microprocessor, which configuration designates at least one core, but not all of the cores, as a master core, and reconfiguring the cores according to a modified configuration, which modified configuration removes a master designation from a core initially so designated, and assigns a master designation to a core not initially so designated. Each core is configured to conditionally drive a sideband communication wire to which it is connected based upon its designation, or lack thereof, as a master core.
    Type: Grant
    Filed: October 24, 2014
    Date of Patent: June 14, 2016
    Assignee: VIA TECHNOLOGIES, INC.
    Inventors: G. Glenn Henry, Stephan Gaskins
  • Patent number: 9361100
    Abstract: A processor includes a first register with first, second, third, and fourth data elements, a second register to hold fifth, sixth, seventh, and eighth data elements, and a third register. A decoder decodes a packed instruction to identify the first and second registers as source registers and the third register as a destination register, and decodes a pack instruction to identify a fourth and a fifth register each having 16-bit data elements. At least one functional unit, responsive to the packed instruction, stores a result in the third register including only half of all data elements of each of the first and second registers, including only corresponding data elements from corresponding positions in the first and second registers, and, responsive to the pack instruction, stores a result that is to include an 8-bit data element for each 16-bit data element in the fourth and fifth registers.
    Type: Grant
    Filed: December 29, 2012
    Date of Patent: June 7, 2016
    Assignee: Intel Corporation
    Inventors: Alexander D. Peleg, Yaakov Yaari, Millind Mittal, Larry M. Mennemeier, Benny Eitan
  • Patent number: 9354878
    Abstract: In one embodiment, a processor includes an execution unit and at least one last branch record (LBR) register to store address information of a branch taken during program execution. This register may further store a transaction indicator to indicate whether the branch was taken during a transactional memory (TM) transaction. This register may further store an abort indicator to indicate whether the branch was caused by a transaction abort. Other embodiments are described and claimed.
    Type: Grant
    Filed: June 14, 2013
    Date of Patent: May 31, 2016
    Assignee: Intel Corporation
    Inventors: Ravi Rajwar, Peter Lachner, Laura A. Knauth, Konrad K. Lai
  • Patent number: 9348791
    Abstract: Compute clusters to implement a high performance computer are provided. For example, a compute cluster includes an optical redistribution box, and a processor module comprising M processors. The optical redistribution box includes N optical input connectors, N optical global connectors, and internal optical connections configured to connect each optical input connector to every optical global connector such that each duplex pair of a given optical input bundle is connected to the optical global connectors. A first group of N processors (wherein N=M/2) of the processor module is optically connected to one of the optical input connectors via one of the optical input bundles, and a second group of N processors of the processor module is optically connected to one of the optical global connectors via one of the optical global bundles.
    Type: Grant
    Filed: October 9, 2014
    Date of Patent: May 24, 2016
    Assignee: International Business Machines Corporation
    Inventors: Evan G. Colgan, Monty M. Denneau, Daniel M. Kuchta
  • Patent number: 9323551
    Abstract: A technique of modifying a code sequence for a processor includes identifying a set of one or more target instructions in the code sequence. A replacement instruction is selected that includes a set of replacement instruction parts. A length of each of the replacement instruction parts corresponds to a minimum instruction length for an instruction set of the processor. The replacement instruction parts include a first instruction type and one or more second instruction types that are each configured as exception instructions if processed in isolation from the first instruction type. The replacement instruction is then substituted for the set of one or more target instructions in the code sequence for processing by the processor.
    Type: Grant
    Filed: January 6, 2012
    Date of Patent: April 26, 2016
    Assignee: International Business Machines Corporation
    Inventor: Neil A. Campbell
  • Patent number: 9317301
    Abstract: A microprocessor includes a plurality of registers that holds an architectural state of the microprocessor and an indicator that indicates a boot instruction set architecture (ISA) of the microprocessor as either the x86 ISA or the Advanced RISC Machines (ARM) ISA. The microprocessor also includes a hardware instruction translator that translates x86 ISA instructions and ARM ISA instructions into microinstructions. The hardware instruction translator translates, as instructions of the boot ISA, the initial ISA instructions that the microprocessor fetches from architectural memory space after receiving a reset signal. The microprocessor also includes an execution pipeline, coupled to the hardware instruction translator. The execution pipeline executes the microinstructions to generate results defined by the x86 ISA and ARM ISA instructions.
    Type: Grant
    Filed: October 28, 2014
    Date of Patent: April 19, 2016
    Assignee: VIA TECHNOLOGIES, INC.
    Inventors: G. Glenn Henry, Terry Parks, Rodney E. Hooker
  • Patent number: 9317297
    Abstract: Embodiments may provide a method for performing a replay of a previous execution of a program. The method includes generating an order of recorded chunks of instructions across a plurality of recorded threads based, at least in part, on log files generated from the previous execution of the program. The method includes initiating execution of the program, the executing program having a plurality of threads, each thread having a number of chunks of instructions. The method includes intercepting, by a virtual machine unit executing on a processor, an instruction of a chunk before the instruction is executed. The method includes determining, by a replay module executing on the processor, that the chunk is an active chunk if the chunk is currently in line for execution according to the order of recorded chunks, and responsive to a determination that the chunk is the active chunk, executing the instruction.
    Type: Grant
    Filed: September 27, 2012
    Date of Patent: April 19, 2016
    Assignee: Intel Corporation
    Inventors: Justin E. Gottschlich, Klaus Danne, Cristiano L. Pereira, Gilles A. Pokam, Rolf Kassa, Shiliang Hu, Tim Kranich
  • Patent number: 9317285
    Abstract: A system and method for efficiently reducing the power consumption of register file accesses. A processor is operable to execute instructions with two or more data types, each with an associated size and alignment. Data operands for a first data type use operand sizes equal to an entire width of a physical register within a physical register file. Data operands for a second data type use operand sizes less than an entire width of a physical register. Accesses of the physical register file for operands associated with a non-full-width data type do not access a full width of the physical registers. A given numerical value may be bypassed for the portion of the physical register that is not accessed.
    Type: Grant
    Filed: April 30, 2012
    Date of Patent: April 19, 2016
    Assignee: Apple Inc.
    Inventors: Sandeep Gupta, Conrado Blasco-Allue, John H. Mylius, Gerard R. Williams, III, James B. Keller
  • Patent number: 9304963
    Abstract: Systems and methods self-organize a multifunctional power and energy control and management system by integrating multiple backplane based modules through module descriptions, the module descriptions including control logic and parameters associated with the modules. Dynamic data table structures may be configured based on information provided with the module descriptions and provide for improved data accessing, storing, and updating.
    Type: Grant
    Filed: January 14, 2011
    Date of Patent: April 5, 2016
    Assignee: Rockwell Automation Technologies, Inc.
    Inventors: Xuyyan Xiao, Mark E. Delker, Benfeng Tang, Chao Chen, Steven A. Lombardi, David Berman
  • Patent number: 9304772
    Abstract: A system and method are provided for improving efficiency, power, and bandwidth consumption in parallel processing. Rather than requiring memory polling to ensure ordered execution of processes or threads in wavefronts, the techniques disclosed herein provide a system and method to allow any process or thread in a wavefront to run out of order as long as needed, but ensure ordered execution of multiple ordered instructions when needed. These operations are handled efficiently in hardware, but are flexible enough to be implemented in all manner of programming models.
    Type: Grant
    Filed: March 29, 2012
    Date of Patent: April 5, 2016
    Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: Laurent Lefebvre, Michael Mantor
  • Patent number: 9298672
    Abstract: Three-dimensional (3-D) processor structures are provided which are constructed by connecting processors in a stacked configuration. For example, a processor system includes a first processor chip comprising a first processor, and a second processor chip comprising a second processor. The first and second processor chips are connected in a stacked configuration with the first and second processors connected through vertical connections between the first and second processor chips. The processor system further includes a mode control circuit to selectively configure the first and second processors of the first and second processor chips to operate in one of a plurality of operating modes, wherein the processors can be selectively configured to operate independently, to aggregate resources, to share resources, and/or be combined to form a single processor image.
    Type: Grant
    Filed: September 4, 2012
    Date of Patent: March 29, 2016
    Assignee: International Business Machines Corporation
    Inventors: Alper Buyuktosunoglu, Philip G. Emma, Allan M. Hartstein, Michael B. Healy, Krishnan Kunjunny Kailas
  • Patent number: 9292294
    Abstract: Method and apparatus to efficiently detect violations of data dependency relationships. A memory address associated with a computer instruction may be obtained. A current state of the memory address may be identified. The current state may include whether the memory address is associated with a read or a store instruction, and whether the memory address is associated with a set or a check. A previously accumulated state associated with the memory address may be retrieved from a data structure. The previously accumulated state may include whether the memory address was previously associated with a read or a store instruction, and whether the memory address was previously associated with a set or a check. If a transition from the previously accumulated state to the current state is invalid, a failure condition may be signaled.
    Type: Grant
    Filed: September 27, 2012
    Date of Patent: March 22, 2016
    Assignee: Intel Corporation
    Inventors: Muawya M. Al-Otoom, Paul Caprioli, Ryan Carlson, Ho-Seop Kim, Omar Shaikh
  • Patent number: 9292296
    Abstract: Processing instruction grouping information is provided that includes: reading addresses of machine instructions grouped by a processor at runtime from a buffer to form an address file; analyzing the address file to obtain grouping information of the machine instructions; converting the machine instructions in the address file into readable instructions; and obtaining grouping information of the readable instructions based on the grouping information of the machine instructions and the readable instructions resulting from conversion. Status of grouping and processing performed on instructions by a processor at runtime can be acquired dynamically, such that processing capability of the processor can be better utilized.
    Type: Grant
    Filed: April 26, 2012
    Date of Patent: March 22, 2016
    Assignee: International Business Machines Corporation
    Inventors: Qin Yue Chen, Qi Liang, Hong Chang Lin, Feng Liu
  • Patent number: 9286145
    Abstract: Processing data communications events in a parallel active messaging interface (‘PAMI’) of a parallel computer that includes compute nodes that execute a parallel application, with the PAMI including data communications endpoints, and the endpoints are coupled for data communications through the PAMI and through other data communications resources, including determining by an advance function that there are no actionable data communications events pending for its context, placing by the advance function its thread of execution into a wait state, waiting for a subsequent data communications event for the context; responsive to occurrence of a subsequent data communications event for the context, awakening by the thread from the wait state; and processing by the advance function the subsequent data communications event now pending for the context.
    Type: Grant
    Filed: November 8, 2012
    Date of Patent: March 15, 2016
    Assignee: International Business Machines Corporation
    Inventors: Charles J. Archer, Michael A. Blocksome, Joseph D. Ratterman, Brian E. Smith
  • Patent number: 9280344
    Abstract: A processor includes a plurality of execution units. At least one of the execution units is configured to repeatedly execute a first instruction based on a first field of the first instruction indicating that the first instruction is to be iteratively executed.
    Type: Grant
    Filed: September 27, 2012
    Date of Patent: March 8, 2016
    Assignee: TEXAS INSTRUMENTS INCORPORATED
    Inventors: Horst Diewald, Johann Zipperer
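
The last abstract above (patent 9280344) describes an instruction whose own field indicates that it is to be executed iteratively. As a rough illustration only, and not the patented design, the idea can be sketched as a toy interpreter. The `Instruction` class, the `repeat` count field, the opcodes `add` and `shl`, and the `execute` function are all hypothetical names chosen for this sketch; the abstract does not say how the field is encoded.

```python
from dataclasses import dataclass


@dataclass
class Instruction:
    op: str          # hypothetical opcode: "add" or "shl"
    operand: int     # immediate value used by the opcode
    repeat: int = 1  # hypothetical field: how many times to execute this instruction


def execute(program, acc=0):
    """Run each instruction `repeat` times against a single accumulator."""
    for ins in program:
        # The execution unit re-issues the same instruction based on its own
        # field, so no separate loop or branch instruction is needed.
        for _ in range(ins.repeat):
            if ins.op == "add":
                acc += ins.operand
            elif ins.op == "shl":
                acc <<= ins.operand
            else:
                raise ValueError(f"unknown opcode: {ins.op}")
    return acc


if __name__ == "__main__":
    # One instruction with repeat=4 does the work of four separate adds.
    print(execute([Instruction("add", 3, repeat=4)]))
```

In this sketch the repetition lives in the instruction encoding itself, which is the property the abstract names; an actual processor would presumably decode the field in hardware rather than loop in software.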