Patents Examined by Corey S Faherty
  • Patent number: 10175986
    Abstract: A processor includes logic for stateless capture of data linear addresses (DLA) during precise event based sampling (PEBS) for an out-of-order execution engine. The engine may include a PEBS unit with logic to increment a counter each time an instance of a designated micro-op is retired from a reorder buffer, capture output DLA referenced by an instance of the micro-op that executes after the counter overflows, set a captured bit associated with a reorder buffer identifier for the instance of the micro-op, and store a PEBS record in a debug storage when the instance of the micro-op is retired from the reorder buffer. The designated micro-op references a DLA of a memory accessible to the processor.
    Type: Grant
    Filed: May 8, 2017
    Date of Patent: January 8, 2019
    Assignee: Intel Corporation
    Inventors: Roger Gramunt, Ramon Matas, Benjamin C. Chaffin, Neal S. Moyer, Rammohan Padmanabhan, Alexey P. Suprun, Matthew G. Smith
  • Patent number: 10157062
    Abstract: A method is described for operating a microprocessor, in which conversion software executed in the microprocessor carries out a binary translation, in the course of which a source instruction that is encoded according to a first instruction-set architecture is translated in a binary manner into a target instruction, which is encoded according to a second instruction-set architecture. The target instruction translated by the conversion software into the second instruction-set architecture is replicated, and in this replicated target instruction a memory area which is to be accessed in the course of the execution of the target instruction is replaced by a second memory area. The target instruction and the replicated target instruction are then executed by the microprocessor.
    Type: Grant
    Filed: February 28, 2017
    Date of Patent: December 18, 2018
    Assignee: ROBERT BOSCH GMBH
    Inventor: Jaroslaw Topp
  • Patent number: 10146542
    Abstract: Methods and apparatuses relating to converting encoding formats are described. In one embodiment, a hardware processor includes a decode circuit to decode an instruction comprising a state operand, a source vector operand, a destination vector operand, and a control operand, and an execution circuit to execute the instruction to convert elements from the source vector operand in a first encoding format to a second encoding format, store the elements in the second encoding format in the destination vector operand, store a total length of the elements in the second encoding format in the state operand, and set a stream completion indication in the control operand when the elements from the source vector operand are the last elements in a data stream.
    Type: Grant
    Filed: December 29, 2015
    Date of Patent: December 4, 2018
    Assignee: Intel Corporation
    Inventors: Yevgeny Y. Rouban, Daniil Y. Sokolov
  • Patent number: 10146736
    Abstract: A data processing system comprising: a processor comprising a plurality of cores, each core comprising a first processing pipeline and a second processing pipeline, the second processing pipeline having a different architecture to the first processing pipeline; a framework configured to manage the processing resources of the data processing system including the processor; and an interface configured to present to the framework each of the processing pipelines as a core.
    Type: Grant
    Filed: September 18, 2015
    Date of Patent: December 4, 2018
    Assignee: Imagination Technologies Limited
    Inventor: Jung-Wook Park
  • Patent number: 10146541
    Abstract: Method, apparatus, and program means for performing bitstream buffer manipulation with a SIMD merge instruction. The method of one embodiment comprises determining whether any unprocessed data bits for a partial variable length symbol exist in a first data block. A shift merge operation is performed to merge the unprocessed data bits from the first data block with a second data block. A merged data block is formed. A merged variable length symbol comprised of the unprocessed data bits and a plurality of data bits from the second data block is extracted from the merged data block.
    Type: Grant
    Filed: November 12, 2015
    Date of Patent: December 4, 2018
    Assignee: Intel Corporation
    Inventors: Julien Sebot, William W. Macy, Jr., Eric L. Debes, Huy V. Nguyen
  • Patent number: 10146535
    Abstract: Embodiments of systems, apparatuses, and methods for chained fused multiply add. In some embodiments, an apparatus includes a decoder to decode a single instruction having an opcode, a destination field representing a destination operand, a first source field representing a plurality of packed data source operands of a first type that have packed data elements of a first size, a second source field representing a plurality of packed data source operands that have packed data elements of a second size, and a field for a memory location that stores a scalar value. A register file having a plurality of packed data registers includes registers for the plurality of packed data source operands that have packed data elements of a first size, the source operands that have packed data elements of a second size, and the destination operand.
    Type: Grant
    Filed: October 20, 2016
    Date of Patent: December 4, 2018
    Assignee: Intel Corporation
    Inventors: Jesus Corbal, Robert Valentine, Roman S. Dubtsov, Nikita A. Shustrov, Mark J. Charney, Dennis R. Bradford, Milind B. Girkar, Edward T. Grochowski, Thomas D. Fletcher, Warren E. Ferguson
  • Patent number: 10140128
    Abstract: A parallelized multiple dispatch ordered queue including an ordered queue, qualify logic, ordered select logic, and dispatch logic. The ordered queue stores candidates in order from oldest to youngest into multiple entries. The ordered queue is divided into N groups in which an i'th group includes every i'th entry of every N entries of the ordered queue, wherein i is an integer less than or equal to N. The qualify logic determines whether any candidate is ready to be dispatched. The ordered select logic respectively determines the oldest candidate in each group that is ready to be dispatched. The dispatch logic dispatches the oldest ready candidates in parallel. The shift logic shifts the stored candidates in the ordered queue to fill any vacant entries between remaining ones of the stored candidates without changing an order of the remaining ones of the stored candidates in the ordered queue. The ordered queue may have any size or depth and N is any suitable integer determining the number of candidates (e.g.
    Type: Grant
    Filed: March 10, 2015
    Date of Patent: November 27, 2018
    Assignee: VIA ALLIANCE SEMICONDUCTOR CO., LTD.
    Inventors: Qianli Di, Jianbin Wang, Weili Li, Xiaoyuan Yu, Xin Yu Gao
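The grouping, selection, and shifting steps in the abstract of patent 10140128 can be sketched in software. The sketch below is one reading of the abstract, not the hardware design: it assumes index 0 holds the oldest candidate, that group g holds entries g, g+N, g+2N, and so on, and that vacant slots are marked with `None`. The function names are hypothetical.

```python
from typing import Callable, List, Optional, Tuple


def select_oldest_ready_per_group(
    queue: List[Optional[str]], is_ready: Callable[[str], bool], n_groups: int
) -> List[Optional[int]]:
    """For each group, return the index of its oldest ready candidate, or None."""
    picks: List[Optional[int]] = []
    for group in range(n_groups):
        pick = None
        # Group `group` is assumed to hold entries group, group+N, group+2N, ...
        for idx in range(group, len(queue), n_groups):
            entry = queue[idx]
            if entry is not None and is_ready(entry):
                pick = idx
                break  # lowest index in the group means oldest
        picks.append(pick)
    return picks


def dispatch_and_shift(
    queue: List[Optional[str]], is_ready: Callable[[str], bool], n_groups: int
) -> Tuple[List[str], List[Optional[str]]]:
    """Dispatch up to one candidate per group, then compact the queue in order."""
    picks = select_oldest_ready_per_group(queue, is_ready, n_groups)
    picked = {i for i in picks if i is not None}
    dispatched = [queue[i] for i in picks if i is not None]
    # Shift the remaining candidates toward the front without reordering them.
    remaining = [c for i, c in enumerate(queue) if c is not None and i not in picked]
    return dispatched, remaining + [None] * (len(queue) - len(remaining))


ready = {"b", "c", "e"}
dispatched, queue_after = dispatch_and_shift(
    ["a", "b", "c", "d", "e", "f"], lambda c: c in ready, n_groups=2
)
```

With N = 2, group 0 covers entries 0, 2, 4 and group 1 covers entries 1, 3, 5, so `c` and `b` are dispatched together while `e` waits for a later cycle because `c` is older within the same group.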
  • Patent number: 10127043
    Abstract: A method and system for implementing very long instruction words (VLIW), the system operable to: receive a first very long instruction word (VLIW) including a set of slot instructions corresponding to a set of functional units, where: each slot instruction includes an opcode identifying an operation to be performed by the set of functional units and value fields related to the operation, where a dedicated subset of the value fields include dedicated bits dedicated to the slot instruction and an allocable subset of the value fields include allocable bits allocable to other slot instructions; identify the opcodes of each slot instruction; determine, based on the opcodes, which allocable bits are allocated to which slot instructions; and instruct each functional unit to perform an operation identified by a corresponding slot instruction using the corresponding dedicated bits and any allocable bits determined to be allocated to the slot instruction.
    Type: Grant
    Filed: October 19, 2016
    Date of Patent: November 13, 2018
    Assignee: Rex Computing, Inc.
    Inventors: Paul Michael Sebexen, Thomas Rex Sohmers
  • Patent number: 10127059
    Abstract: A multi-tenant virtual machine infrastructure (MTVMI) allows multiple tenants to independently access and use a plurality of virtual computing resources via the Internet. Within the MTVMI, different tenants may define unique configurations of virtual computing resources and unique rules to govern the use of the virtual computing resources. The MTVMI may be configured to provide valuable services for tenants and users associated with the tenants.
    Type: Grant
    Filed: May 2, 2009
    Date of Patent: November 13, 2018
    Assignee: Skytap
    Inventors: Nicholas Luis Astete, Aaron Benjamin Brethorst, Joseph Michael Goldberg, Matthew Hanlon, Anthony A. Hutchinson, Jr., Gopalakrishnan Janakiraman, Alexander Kotelnikov, Petr Novodvorskiy, David William Richardson, Roxanne Camille Skelly, Nikolai Slioussar, Jonathan Weeks
  • Patent number: 10120685
    Abstract: An apparatus and method for supporting simultaneous multiple iterations (SMI) in a coarse grained reconfigurable architecture (CGRA). In support of SMI, the apparatus includes: hardware structures that connect all of multiple processing engines (PEs) to a load-store unit (LSU) configured to keep track of which compiled program code iterations have completed, which ones are in flight and which are yet to begin, and a control unit including hardware structures that are used to maintain synchronization and initiate and terminate loops within the PEs. SMI permits execution of the next instruction within any iteration (in flight). If instructions from multiple iterations are ready for execution (and are pre-decoded), then the hardware selects the lowest iteration number ready for execution. If in a particular clock cycle, a loop iteration with a lower iteration number is stalled (i.e.
    Type: Grant
    Filed: November 4, 2015
    Date of Patent: November 6, 2018
    Assignee: International Business Machines Corporation
    Inventors: Chia-yu Chen, Kailash Gopalakrishnan, Jinwook Oh, Sunil K. Shukla, Vijayalakshmi Srinivasan
  • Patent number: 10114639
    Abstract: An arithmetic device which controls a parallel arithmetic operation includes a global memory, a plurality of compute units, each of the compute units including a local memory and a plurality of processing elements, and each of the processing elements including a private memory and processing data blocks stored in the private memory, an attribute group holding unit which includes a specific attribute which includes a parameter indicative of a size of the data block, an arithmetic attribute which includes a parameter indicating whether the data block is a data relevant to processing, and indicating a transfer order when the data block is data relevant to processing, and a policy attribute which includes a parameter indicative of how to execute a transfer of the data block and how to execute processing of the data block.
    Type: Grant
    Filed: April 28, 2017
    Date of Patent: October 30, 2018
    Assignee: RENESAS ELECTRONICS CORPORATION
    Inventor: Shorin Kyo
  • Patent number: 10108417
    Abstract: Storing narrow produced values for instruction operands directly in a register map in an out-of-order processor (OoP) is provided. An OoP is provided that includes an instruction processing system. The instruction processing system includes a number of instruction processing stages configured to pipeline the processing and execution of instructions according to a dataflow execution. The instruction processing system also includes a register map table (RMT) configured to store address pointers mapping logical registers to physical registers in a physical register file (PRF) for storing produced data for use by consumer instructions without overwriting logical registers for later executed, out-of-order instructions. In certain aspects, the instruction processing system is configured to write back (i.e., store) narrow values produced by executed instructions directly into the RMT, as opposed to writing the narrow produced values into the PRF in a write back stage.
    Type: Grant
    Filed: September 21, 2015
    Date of Patent: October 23, 2018
    Assignee: QUALCOMM Incorporated
    Inventors: Anil Krishna, Rodney Wayne Smith, Sandeep Suresh Navada, Shivam Priyadarshi, Raguram Damodaran
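The idea in the abstract of patent 10108417, keeping narrow produced values in the register map table (RMT) instead of the physical register file (PRF), can be illustrated with a toy model. Everything here is an assumption made for illustration: the class name `RenameState`, the 8-bit narrowness threshold, the tagged-tuple entry format, and the immediate freeing of physical registers (a real out-of-order core would free them at retirement).

```python
class RenameState:
    """Toy RMT whose entries hold either an inline narrow value or a PRF pointer."""

    def __init__(self, num_physical: int, narrow_bits: int = 8) -> None:
        self.narrow_limit = 1 << narrow_bits   # hypothetical narrowness threshold
        self.prf = [0] * num_physical          # physical register file
        self.free = list(range(num_physical))  # free physical registers
        self.rmt = {}                          # logical reg -> ("inline", v) or ("ptr", i)

    def write_back(self, logical: str, value: int) -> None:
        old = self.rmt.get(logical)
        if old is not None and old[0] == "ptr":
            self.free.append(old[1])  # simplification: free the old mapping at once
        if 0 <= value < self.narrow_limit:
            # Narrow value: store it directly in the map entry, no PRF write.
            self.rmt[logical] = ("inline", value)
        else:
            phys = self.free.pop(0)
            self.prf[phys] = value
            self.rmt[logical] = ("ptr", phys)

    def read(self, logical: str) -> int:
        kind, payload = self.rmt[logical]
        return payload if kind == "inline" else self.prf[payload]


state = RenameState(num_physical=2)
state.write_back("r1", 7)        # narrow: kept inline in the RMT
state.write_back("r2", 100_000)  # wide: allocated a physical register
```

In this model a consumer of `r1` gets its operand from the map entry itself, and only `r2` occupies a physical register.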
  • Patent number: 10102017
    Abstract: A computing system in which a software component executing on a platform can reliably and efficiently obtain state information about a component supported by the platform through the use of a shared memory page. State information may be supplied by the platform, but any state translation information needed to map the state information as supplied to a format as used may be provided through the shared page. In a virtualized environment, the state translation information can be used to map the value of a virtual timer counter or other component from a value provided by a virtual processor to a normalized reference time that will yield the same result, regardless of whether the software component is migrated to or from another virtual processor. Use of a shared page avoids the inefficiency of an intercept into a virtualized environment or a system call in native mode operation.
    Type: Grant
    Filed: February 19, 2013
    Date of Patent: October 16, 2018
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Shuvabrata Ganguly, Jason S. Wohlgemuth, Allen Marshall
  • Patent number: 10095584
    Abstract: The amount of data to be backed up and recovered is reduced when supply of power to a semiconductor device is stopped and restarted. A backup need determination circuit provided in the semiconductor device reads the kind of instruction decoded by a decoder and determines whether data needs to be backed up from a volatile register to a nonvolatile register. With a structure according to one embodiment of the present invention, it is possible to select necessary data from data used for operation in a logic circuit before the power supply is stopped and after the power supply is restarted. Data that is necessary after the power supply is restarted can be backed up from the volatile register to the nonvolatile register before the power supply is stopped. Data that is unnecessary is not backed up from the volatile register to the nonvolatile register before the power supply is stopped.
    Type: Grant
    Filed: April 23, 2014
    Date of Patent: October 9, 2018
    Assignee: Semiconductor Energy Laboratory Co., Ltd.
    Inventor: Seiichi Yoneda
  • Patent number: 10061705
    Abstract: A technique for processing instructions includes examining instructions in an instruction stream of a processor to determine properties of the instructions. The properties indicate whether the instructions may belong in an instruction sequence subject to decode-time instruction optimization (DTIO). Whether the properties of multiple ones of the instructions are compatible for inclusion within an instruction sequence of a same group is determined. The instructions with compatible ones of the properties are grouped into a first instruction group. The instructions of the first instruction group are decoded subsequent to formation of the first instruction group. Whether the first instruction group actually includes a DTIO sequence is verified based on the decoding. Based on the verifying, DTIO is performed on the instructions of the first instruction group or is not performed on the instructions of the first instruction group.
    Type: Grant
    Filed: June 9, 2015
    Date of Patent: August 28, 2018
    Assignee: International Business Machines Corporation
    Inventors: Michael K. Gschwind, Valentina Salapura
  • Patent number: 10061587
    Abstract: A processor includes a front end, a decoder, an allocator, and a retirement unit. The decoder includes logic to identify an end-of-live-range (EOLR) indicator. The EOLR indicator specifies an architectural register and a location in code for which the architectural register is unused. The allocator includes logic to scan for a mapping of the architectural register to a physical register, based upon the EOLR indicator. The allocator also includes logic to generate a request to disassociate the architectural register from the physical register. The retirement unit includes logic to disassociate the architectural register from the physical register.
    Type: Grant
    Filed: September 25, 2014
    Date of Patent: August 28, 2018
    Assignee: Intel Corporation
    Inventors: David Pardo Keppel, Denis M. Khartikov, Fernando LaTorre, Marc Lupon, Grigorios Magklis, Naveen Neelakantam, Georgios Tournavitis, Polychronis Xekalakis
  • Patent number: 10061609
    Abstract: A method and system uses exceptions for code specialization in a system that supports transactions. The method and system includes inserting one or more branchless instructions into a sequence of computer instructions. The branchless instructions include one or more instructions that are executable if a commonly occurring condition is satisfied and include one or more instructions that are configured to raise an exception if the commonly occurring condition is not satisfied.
    Type: Grant
    Filed: October 31, 2016
    Date of Patent: August 28, 2018
    Assignee: Intel Corporation
    Inventors: Arvind Krishnaswamy, Daniel M. Lavery
  • Patent number: 10055226
    Abstract: A system and process for managing thread transitions includes determining that a transition is to be made regarding the relative use of two data register sets where the two data register sets are used by a processor as first-level registers for thread execution. Based on the transition determination, a determination is made whether to move thread data in at least one of the first-level registers to second-level registers. Responsive to determining to move the thread data, a portion of main memory or cache memory is assigned as the second-level registers where the second-level registers serve as registers of at least one of the two data register sets for executing a thread. The thread data from the at least one first-level register is moved to the second-level registers based on the move determination.
    Type: Grant
    Filed: July 2, 2017
    Date of Patent: August 21, 2018
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Christopher M. Abernathy, Mary D. Brown, Susan E. Eisen, James A. Kahle, Hung Q. Le, Dung Q. Nguyen
  • Patent number: 10042813
    Abstract: Methods and apparatus relating to improved SIMD (Single Instruction, Multiple Data) K-nearest-neighbors implementations are described. An embodiment provides a technique for improving SIMD implementations of the multidimensional K-Nearest-Neighbors (KNN) techniques. One embodiment replaces the non-SIMD friendly part of the KNN algorithm with a sequence of SIMD operations. For example, in order to avoid branches in the algorithm hotspot (e.g., the inner loop), SIMD operations may be used to update the list of nearest distances (and neighbors) after each iteration. Other embodiments are also disclosed and claimed.
    Type: Grant
    Filed: December 15, 2014
    Date of Patent: August 7, 2018
    Assignee: Intel Corporation
    Inventor: Amos Goldman
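The branch-avoiding list update mentioned in the abstract of patent 10042813 can be approximated in scalar code. The sketch below is not SIMD and is not the patented sequence; it only shows the general pattern of replacing a data-dependent branch with a comparison mask and a select. The function names and the use of squared Euclidean distance are assumptions.

```python
import math
from typing import List, Sequence, Tuple


def update_nearest(dists: List[float], ids: List[int], d: float, i: int) -> None:
    """Insert (d, i) into the ascending lists in place, dropping the largest entry.

    Each slot performs a compare-and-select: the comparison result is used as
    an index (a stand-in for a SIMD lane mask) rather than as a branch condition.
    """
    for pos in range(len(dists)):
        take = d < dists[pos]  # False/True doubles as index 0/1
        new_d, d = (dists[pos], d)[take], (d, dists[pos])[take]
        new_i, i = (ids[pos], i)[take], (i, ids[pos])[take]
        dists[pos], ids[pos] = new_d, new_i


def k_nearest(
    points: Sequence[Sequence[float]], query: Sequence[float], k: int
) -> Tuple[List[int], List[float]]:
    """Return indices and squared distances of the k points nearest to query."""
    dists = [math.inf] * k
    ids = [-1] * k
    for i, point in enumerate(points):
        d = sum((a - b) ** 2 for a, b in zip(point, query))
        update_nearest(dists, ids, d, i)
    return ids, dists


nearest_ids, nearest_dists = k_nearest(
    [(0, 0), (3, 4), (1, 1), (10, 10)], query=(0, 0), k=2
)
```

Every slot executes the same operations for every candidate, which is the property that lets a vector unit process the whole list of nearest distances without branching on the data.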
  • Patent number: 10019266
    Abstract: A method includes providing a data processor having an instruction pipeline, where the instruction pipeline has a plurality of instruction pipeline stages, and where the plurality of instruction pipeline stages includes a first instruction pipeline stage and a second instruction pipeline stage. The method further includes providing a data processor instruction that causes the data processor to perform a first set of computational operations during execution of the data processor instruction, performing the first set of computational operations in the first instruction pipeline stage if the data processor instruction is being executed and a first mode has been selected, and performing the first set of computational operations in the second instruction pipeline stage if the data processor instruction is being executed and a second mode has been selected.
    Type: Grant
    Filed: September 11, 2015
    Date of Patent: July 10, 2018
    Assignee: RAMBUS INC.
    Inventors: William C. Moyer, Jeffrey W. Scott