Patents Examined by Daniel H. Pan

Processing of multiple instruction streams in a parallel slice processor

Patent number: 10157064

Abstract: A method of managing instruction execution for multiple instruction streams using a processor core having multiple parallel instruction execution slices. An event is detected indicating that either resource requirement or resource availability for a subsequent instruction of an instruction stream will not be met by the instruction execution slice currently executing the instruction stream. In response to detecting the event, dispatch of at least a portion of the subsequent instruction is made to another instruction execution slice. The event may be a compiler-inserted directive, may be an event detected by logic in the processor core, or may be determined by a thread sequencer. The instruction execution slices may be dynamically reconfigured as between single-instruction-multiple-data (SIMD) instruction execution, ordinary instruction execution, and wide instruction execution.

Type: Grant

Filed: February 27, 2017

Date of Patent: December 18, 2018

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Lee Evan Eisen, Hung Qui Le, Jentje Leenstra, Jose Eduardo Moreira, Bruce Joseph Ronchetti, Brian William Thompto, Albert James Van Norstrand, Jr.
Suspendable load address tracking inside transactions

Patent number: 10146538

Abstract: Suspendable load address tracking inside transactions is disclosed. An example processing device of implementations of the disclosure includes a transactional memory (TM) read set tracking component circuitry to identify a suspend read tracking instruction within a transaction executed by the processing device, mark load instructions occurring in the transaction subsequent to the identified suspend read tracking instruction with a suspend attribute, wherein the addresses corresponding to the marked load instructions are excluded from a read set maintained for the transaction, identify a resume read tracking instruction within the transaction, and stop marking the load instructions occurring subsequent to the identified resume read tracking instruction with the suspend attribute.

Type: Grant

Filed: September 30, 2016

Date of Patent: December 4, 2018

Assignee: Intel Corporation

Inventors: Raanan Sade, Roman Dementiev, Ravi Rajwar, Ady Tal, Alex Gerber
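The read-set behavior described in the abstract for patent 10146538 can be illustrated with a minimal software sketch. The tuple encoding of instructions and the function name `build_read_set` are hypothetical and chosen only for illustration; the patent describes hardware circuitry, not this code.

```python
def build_read_set(instructions):
    """Return the load addresses tracked for one transaction.

    `instructions` is a list of tuples (hypothetical encoding):
    ("load", address), ("suspend",) or ("resume",).
    """
    read_set = set()
    suspended = False  # models the "suspend attribute" applied to later loads
    for instruction in instructions:
        op = instruction[0]
        if op == "suspend":
            suspended = True
        elif op == "resume":
            suspended = False
        elif op == "load" and not suspended:
            # Only loads without the suspend attribute enter the read set.
            read_set.add(instruction[1])
    return read_set
```

Loads issued between the suspend and resume markers are simply left out of the returned set, which is the effect the abstract attributes to the suspend attribute.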
Methods, systems and apparatus for supporting wide and efficient front-end operation with guest-architecture emulation

Patent number: 10140138

Abstract: Methods for supporting wide and efficient front-end operation with guest architecture emulation are disclosed. As a part of a method for supporting wide and efficient front-end operation, upon receiving a request to fetch a first far taken branch instruction, a cache line that includes the first far taken branch instruction, a next cache line and a cache line located at the target of the first far taken branch instruction is read. Based on information that is accessed from a data table, the cache line and either the next cache line or the cache line located at the target is fetched in a single cycle.

Type: Grant

Filed: March 17, 2014

Date of Patent: November 27, 2018

Assignee: Intel Corporation

Inventors: Mohammad Abdallah, Ankur Groen, Erika Gunadi, Mandeep Singh, Ravishankar Rao
Integrating sign extensions for loads

Patent number: 10126976

Abstract: An address and a data size are provided to a rotator. The rotator stores, based on the address and the data size, a data element in a location having a defined number of positions. The data element includes one or more data units and the one or more data units are aligned correctly in one or more positions of the location based on a predefined position in the location to receive a selected data unit of the one or more data units. The rotator replicates a value of a chosen data unit of the one or more data units to one or more other positions of the location.

Type: Grant

Filed: February 17, 2017

Date of Patent: November 13, 2018

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor: Michael K. Gschwind
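One possible reading of the abstract for patent 10126976 is a load that places a data element in a fixed-width location and fills the remaining positions with a replicated sign value. The sketch below assumes byte-sized data units, little-endian order, an 8-byte location, and replication of the sign of the most significant byte; none of these details are stated in the abstract, and the name `load_sign_extended` is hypothetical.

```python
def load_sign_extended(memory: bytes, address: int, size: int, width: int = 8) -> bytes:
    """Copy `size` bytes from `memory` into a `width`-byte location,
    filling the unused upper positions with the sign of the element.

    Assumptions (not from the abstract): little-endian layout, byte units.
    """
    data = memory[address:address + size]
    # The most significant byte of a little-endian element is the last one.
    fill = 0xFF if data[-1] & 0x80 else 0x00
    return data + bytes([fill]) * (width - size)
```

Interpreting the result with `int.from_bytes(result, "little", signed=True)` gives the same signed value as the original narrower element.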
Honoring hardware entitlement of a hardware thread

Patent number: 10114673

Abstract: A method for scheduling the execution of a computer instruction includes receiving an entitlement processor resource percentage for a logical partition on a computer system. The logical partition is associated with a hardware thread of a processor of the computer system. The entitlement processor resource percentage for the logical partition is stored in a register of the hardware thread associated with the logical partition. An instruction is received from the logical partition of the computer system and the processor dispatches the instruction based on the entitlement processor resource percentage stored in the register of the hardware thread associated with the logical partition.

Type: Grant

Filed: September 29, 2016

Date of Patent: October 30, 2018

Assignee: International Business Machines Corporation

Inventors: Nitin Gupta, Mehulkumar J. Patel, Deepak C. Shetty
Dependency-prediction of instructions

Patent number: 10108419

Abstract: Systems and methods for dependency-prediction include executing instructions in an instruction pipeline of a processor and detecting a conditionality-imposing control instruction, such as an If-Then (IT) instruction, which imposes dependent behavior on a conditionality block size of one or more dependent instructions. Prior to executing a first instruction, a dependency-prediction is made to determine if the first instruction is a dependent instruction of the conditionality-imposing control instruction, based on the conditionality block size and one or more parameters of the instruction pipeline. The first instruction is executed based on the dependency-prediction. When the first instruction is dependency-mispredicted, an associated dependency-misprediction penalty is mitigated. If the first instruction is a branch instruction, the mitigation involves training a branch prediction tracking mechanism to correctly dependency-predict future occurrences of the first instruction.

Type: Grant

Filed: September 26, 2014

Date of Patent: October 23, 2018

Assignee: QUALCOMM Incorporated

Inventors: Brian Michael Stempel, James Norris Dieffenderfer, Michael Scott McIlvaine, Melinda Joyce Brown
Instruction block address register

Patent number: 10095519

Abstract: Apparatus and methods are disclosed for controlling instruction flow in block-based processor architectures. In one example of the disclosed technology, an instruction block address register stores an index address to a memory storing a plurality of instructions for an instruction block, the indexed address being inaccessible when the processor is in one or more unprivileged operational modes, one or more execution units configured to execute instructions for the instruction block, and a control unit configured to fetch and decode two or more of the plurality of instructions from the memory based on the indexed address.

Type: Grant

Filed: March 3, 2016

Date of Patent: October 9, 2018

Assignee: Microsoft Technology Licensing, LLC

Inventors: Douglas C. Burger, Aaron L. Smith
Instruction and logic for register based hardware memory renaming

Patent number: 10095522

Abstract: A processor includes a core, a memory subsystem, a predictor module, and a memory rename module. The predictor module may include a first logic to identify a dependency between a store instruction and a load instruction, and a second logic to assign a memory renaming (MRN) register to the store instruction and the load instruction based on the identified dependency. Further, the memory rename module may include a third logic to copy, based on the assigned MRN register, information in a first logical register associated with the store instruction directly to a second logical register associated with the load instruction.

Type: Grant

Filed: December 23, 2014

Date of Patent: October 9, 2018

Assignee: Intel Corporation

Inventors: Kamil Garifullin, Stanislav Shwartsman, Lihu Rappoport, Zeev Sperber, Pavel I. Kryukov, Andrey Kluchnikov, Igor Yanover, George Leifman, Alex Gerber, Jared W. Stark
Dual data streams sharing dual level two cache access ports to maximize bandwidth utilization

Patent number: 10083035

Abstract: A streaming engine employed in a digital data processor specifies fixed first and second read only data streams. Corresponding stream address generators produce addresses of data elements of the two streams. Corresponding stream head registers store data elements next to be supplied to functional units for use as operands. The two streams share two memory ports. A toggling preference of stream to port ensures fair allocation. The arbiters permit one stream to borrow the other's interface when the other interface is idle. Thus one stream may issue two memory requests, one from each memory port, if the other stream is idle. This spreads the bandwidth demand for each stream across both interfaces, ensuring neither interface becomes a bottleneck.

Type: Grant

Filed: December 20, 2016

Date of Patent: September 25, 2018

Assignee: TEXAS INSTRUMENTS INCORPORATED

Inventors: Joseph Zbiciak, Timothy Anderson
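The port-sharing policy in the abstract for patent 10083035 (a toggling preference plus borrowing of an idle interface) can be sketched roughly as follows. The per-cycle toggle, the request counts, and the function name `arbitrate` are assumptions made for illustration; the abstract does not specify how the preference toggles.

```python
def arbitrate(pending, cycle):
    """Assign two memory ports to two streams for one cycle.

    `pending[s]` is the number of requests stream s wants to issue (0 to 2).
    Returns a tuple (owner_of_port_0, owner_of_port_1); an owner is a
    stream index, or None if the port stays idle.
    """
    first = cycle % 2       # stream preferred on port 0 this cycle (assumed toggle)
    second = 1 - first      # the other stream is preferred on port 1
    owners = [None, None]
    if pending[first] > 0:
        owners[0] = first
    if pending[second] > 0:
        owners[1] = second
    # Borrowing: a stream with a second request may use a port left idle.
    if owners[0] is None and pending[second] > 1:
        owners[0] = second
    if owners[1] is None and pending[first] > 1:
        owners[1] = first
    return tuple(owners)
```

When both streams are active each gets one port and the assignment alternates; when one stream is idle the other can issue on both ports in the same cycle.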
Streaming engine with error detection, correction and restart

Patent number: 10078551

Abstract: This invention is a streaming engine employed in a digital signal processor. A fixed data stream sequence including plural nested loops is specified by a control register. The streaming engine includes an address generator producing addresses of data elements and a stream head register storing data elements next to be supplied as operands. The streaming engine fetches stream data ahead of use by the central processing unit core in a stream buffer. Parity bits are formed upon storage of data in the stream buffer which are stored with the corresponding data. Upon transfer to the stream head register a second parity is calculated and compared with the stored parity. The streaming engine signals a parity fault if the parities do not match. The streaming engine preferably restarts fetching the data stream at the data element generating a parity fault.

Type: Grant

Filed: December 20, 2016

Date of Patent: September 18, 2018

Assignee: TEXAS INSTRUMENTS INCORPORATED

Inventors: Joseph Zbiciak, Timothy Anderson
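The store-then-verify parity scheme in the abstract for patent 10078551 can be modeled in a few lines. This sketch assumes a single even-parity bit per stored word; the abstract does not give the parity granularity, and the class and method names are hypothetical.

```python
class ParityFault(Exception):
    """Raised when stored and recomputed parity disagree."""


def parity(word: int) -> int:
    """Return 1 if `word` has an odd number of set bits, else 0."""
    return bin(word).count("1") & 1


class StreamBuffer:
    def __init__(self):
        self.entries = []  # each entry is [data word, stored parity bit]

    def store(self, word: int) -> int:
        """Store a fetched word with its parity; return its index."""
        self.entries.append([word, parity(word)])
        return len(self.entries) - 1

    def to_head_register(self, index: int) -> int:
        """Recompute parity on transfer and compare with the stored bit."""
        word, stored = self.entries[index]
        if parity(word) != stored:
            raise ParityFault(index)  # caller may refetch from this element
        return word
```

A caller that catches `ParityFault` could restart fetching at the reported index, mirroring the restart behavior the abstract describes as preferred.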
Streaming engine with cache-like stream data storage and lifetime tracking

Patent number: 10073696

Abstract: A streaming engine employed in a digital data processor specifies a fixed read only data stream defined by plural nested loops. An address generator produces addresses of data elements. A stream head register stores data elements next to be supplied to functional units for use as operands. The streaming engine fetches stream data ahead of use by the central processing unit core in a stream buffer constructed like a cache. The stream buffer cache includes plural cache lines, each of which includes tag bits, at least one valid bit and data bits. Cache lines are allocated to store newly fetched stream data. Cache lines are deallocated upon consumption of the data by a central processing unit core functional unit. Instructions preferably include operand fields with a first subset of codings corresponding to registers, a stream read only operand coding and a stream read and advance operand coding.

Type: Grant

Filed: December 20, 2016

Date of Patent: September 11, 2018

Assignee: TEXAS INSTRUMENTS INCORPORATED

Inventor: Joseph Zbiciak
Store nullification in the target field

Patent number: 10061584

Abstract: Apparatus and methods are disclosed for nullifying memory store instructions identified in a target field of a nullification instruction. In some examples of the disclosed technology, an apparatus can include memory and one or more block-based processor cores configured to fetch and execute a plurality of instruction blocks. One of the cores can include a control unit configured, based at least in part on receiving a nullification instruction, to obtain an instruction identification for a memory access instruction of a plurality of memory access instructions, based on a target field of the nullification instruction. The memory access instruction associated with the instruction identification is nullified. The memory access instruction is in a first instruction block of the plurality of instruction blocks. Based on the nullified memory access instruction, a subsequent memory access instruction from the first instruction block is executed.

Type: Grant

Filed: March 3, 2016

Date of Patent: August 28, 2018

Assignee: Microsoft Technology Licensing, LLC

Inventors: Douglas C. Burger, Aaron L. Smith
Streaming engine with deferred exception reporting

Patent number: 10061675

Abstract: This invention is a streaming engine employed in a digital signal processor. A fixed data stream sequence is specified by a control register. The streaming engine fetches stream data ahead of use by a central processing unit and stores it in a stream buffer. Upon occurrence of a fault reading data from memory, the streaming engine identifies the data element triggering the fault preferably storing this address in a fault address register. The streaming engine defers signaling the fault to the central processing unit until this data element is used as an operand. If the data element is never used by the central processing unit, the streaming engine never signals the fault. The streaming engine preferably stores data identifying the fault in a fault source register. The fault address register and the fault source register are preferably extended control registers accessible only via a debugger.

Type: Grant

Filed: December 20, 2016

Date of Patent: August 28, 2018

Assignee: TEXAS INSTRUMENTS INCORPORATED

Inventors: Joseph Zbiciak, Timothy D. Anderson, Duc Bui, Kai Chirca
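The deferred reporting described in the abstract for patent 10061675 can be sketched as a prefetching stream that records a fault at fetch time but raises it only when the faulting element is consumed. Memory is modeled as a dictionary in which a missing address stands for a faulting read; that model and all names here are assumptions for illustration.

```python
class StreamFault(Exception):
    """Signaled to the consumer when a faulted element is actually used."""


class DeferredFaultStream:
    def __init__(self, memory: dict, addresses: list):
        self.buffer = []            # (address, value, faulted) per element
        self.fault_address = None   # models the fault address register
        self.position = 0
        for address in addresses:   # fetch ahead of use
            if address in memory:
                self.buffer.append((address, memory[address], False))
            else:
                if self.fault_address is None:
                    self.fault_address = address
                self.buffer.append((address, None, True))

    def next_operand(self):
        """Supply the next element, signaling a deferred fault if needed."""
        address, value, faulted = self.buffer[self.position]
        if faulted:
            raise StreamFault(address)
        self.position += 1
        return value
```

If the consumer stops before reaching the faulted element, no exception is ever raised, matching the abstract's statement that an unused element never signals its fault.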
Instruction and logic for a vector format for processing computations

Patent number: 10061746

Abstract: A processor includes a front end to fetch an instruction. The instruction is to calculate a data point using inputs from a plurality of adjacent source data in a plurality of dimensions. The processor includes a decoder to decode the instruction. The processor also includes a core to, based on the decoded instruction, perform a plurality of tabular vector read operations to read the plurality of adjacent source data and perform a tabular vector calculation to execute the instruction. The tabular vector calculation is based upon results of performing the plurality of tabular vector read operations. The core is further to write results of the tabular vector calculation.

Type: Grant

Filed: September 26, 2014

Date of Patent: August 28, 2018

Assignee: Intel Corporation

Inventor: Charles R. Yount
Processing queue management

Patent number: 10042640

Abstract: A data processing system 2 includes multiple out-of-order issue queues 8, 10. A master serialization instruction MSI received by a first issue queue 8 is detected by slave generation circuitry 24 which generates a slave serialization instruction SSI added to a second issue queue 10. The master serialization instruction MSI manages serialization relative to the instructions within the first issue queue 8. The slave serialization instruction SSI manages serialization relative to the instructions within the second issue queue 10. The master serialization instruction MSI and the slave serialization instruction SSI are removed when both have met their serialization conditions and are respectively the oldest instructions within their issue queues.

Type: Grant

Filed: March 22, 2016

Date of Patent: August 7, 2018

Assignee: ARM Limited

Inventors: Luca Scalabrino, Frederic Jean Denis Arsanto, Thomas Gilles Tarridec, Cedric Denis Robert Airaud
Power management of branch predictors in a computer processor

Patent number: 10037207

Abstract: A computer processor includes a branch prediction unit that includes a local branch predictor and a global branch predictor. Managing power consumption in such a computer processor includes, for each of a plurality of branch instructions: performing, by the local branch predictor, a local branch prediction; performing, by each of the global branch predictors, a global branch prediction; determining to utilize the local branch prediction over the global branch predictions as a branch prediction for the branch instruction; incrementing a value of a counter; determining whether the value of the counter exceeds a predetermined threshold; and if the value of the counter exceeds the predetermined threshold, powering down at least one of the global branch predictors and configuring the branch prediction unit to bypass the powered down global branch predictor for branch predictions of subsequent branch instructions.

Type: Grant

Filed: July 27, 2016

Date of Patent: July 31, 2018

Assignee: International Business Machines Corporation

Inventors: David S. Levitan, Nicholas R. Orzol, Robert A. Philhower
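The counter-and-threshold policy in the abstract for patent 10037207 can be sketched as below. The sketch models a single global predictor and leaves the counter unchanged when the global prediction is chosen; the abstract does not say what happens in that case, so this is an assumption, as are the class and method names.

```python
class BranchPredictionUnit:
    def __init__(self, threshold: int):
        self.threshold = threshold
        self.counter = 0
        self.global_powered = True

    def record_choice(self, chose_local: bool) -> None:
        """Update power state after one branch prediction is selected."""
        if not chose_local:
            return  # assumption: no counter update when global is used
        self.counter += 1
        if self.counter > self.threshold:
            # Power down the global predictor; later branches bypass it.
            self.global_powered = False
```

With a threshold of 2, the global predictor stays powered through the first two local selections and is powered down on the third.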
Operation of a multi-slice processor with an expanded merge fetching queue

Patent number: 10037211

Abstract: Operation of a multi-slice processor that includes a plurality of execution slices and a plurality of load/store slices, where each load/store slice includes a load miss queue and a load reorder queue, includes: receiving, at a load reorder queue, a load instruction requesting data; responsive to the data not being stored in a data cache, determining whether a previous load instruction is pending a fetch of a cache line comprising the data; if the cache line does not comprise the data, allocating an entry for the load instruction in the load miss queue; and if the cache line does comprise the data: merging, in the load reorder queue, the load instruction with an entry for the previous load instruction.

Type: Grant

Filed: March 22, 2016

Date of Patent: July 31, 2018

Assignee: International Business Machines Corporation

Inventors: Kimberly M. Fernsler, David A. Hrusecky, Hung Q. Le, Elizabeth A. McGlone, Brian W. Thompto
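The merge decision in the abstract for patent 10037211 amounts to checking whether an earlier load is already fetching the cache line that a new missing load needs. A rough model follows; the 64-byte line size, the dictionary bookkeeping, and the names `LoadStoreSlice` and `handle_miss` are all hypothetical.

```python
LINE_SIZE = 64  # assumed cache line size in bytes


class LoadStoreSlice:
    def __init__(self):
        self.load_miss_queue = []   # cache lines with a fetch in flight
        self.merged = {}            # line -> load ids waiting on that fetch

    def handle_miss(self, load_id: int, address: int) -> str:
        """Handle a load that missed the data cache."""
        line = address // LINE_SIZE
        if line in self.merged:
            # A previous load is already fetching this line: merge with it.
            self.merged[line].append(load_id)
            return "merged"
        # No pending fetch covers this data: allocate a miss queue entry.
        self.load_miss_queue.append(line)
        self.merged[line] = [load_id]
        return "allocated"
```

Merging keeps the load miss queue from holding several entries for the same cache line.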
Multi-nullification

Patent number: 10031756

Abstract: Apparatus and methods are disclosed for nullifying memory store instructions and one or more registers identified in a target field of a nullification instruction. In some examples of the disclosed technology, an apparatus can include memory and one or more block-based processor cores configured to fetch and execute a plurality of instruction blocks. One of the cores can include a control unit configured, based at least in part on receiving a nullification instruction, to obtain an instruction identification for a memory access instruction of a plurality of memory access instructions and a register identification of at least one of a plurality of registers, based on first and second target fields of the nullification instruction. The at least one register and the memory access instruction associated with the instruction identification are nullified. Based on the nullified memory access instruction, a subsequent memory access instruction is executed.

Type: Grant

Filed: March 3, 2016

Date of Patent: July 24, 2018

Assignee: Microsoft Technology Licensing, LLC

Inventors: Douglas C. Burger, Aaron L. Smith
Data processing method, processor, and data processing device

Patent number: 10025752

Abstract: Disclosed are a data processing method, a processor, and a data processing device. The method comprises: an arbiter sends data D(a,1) to a first processing circuit; the first processing circuit processes the data D(a,1) to obtain data D(1,2), the first processing circuit being a processing circuit among m processing circuits; the first processing circuit sends the data D(1,2) to a second processing circuit; the second processing circuit to an mth processing circuit separately process the received data; and the arbiter receives data D(m,a) sent by the mth processing circuit. The processor comprises an arbiter and a first processing circuit to an (m+1)th processing circuit. Each processing circuit in the first processing circuit to the (m+1)th processing circuit can receive first data to be processed sent by the arbiter, and process the first data to be processed. The scheme is helpful to improve efficiency of data processing.

Type: Grant

Filed: October 28, 2016

Date of Patent: July 17, 2018

Assignee: Huawei Technologies Co., Ltd.

Inventors: Nan Li, Linchun Wang, Hongfei Chen
Reordered speculative instruction sequences with a disambiguation-free out of order load store queue

Patent number: 10019263

Abstract: In a processor, a disambiguation-free out of order load store queue method. The method includes implementing a memory resource that can be accessed by a plurality of asynchronous cores; implementing a store retirement buffer, wherein stores from a store queue have entries in the store retirement buffer in original program order; and implementing speculative execution, wherein results of speculative execution can be saved in the store retirement/reorder buffer as a speculative state. The method further includes, upon dispatch of a subsequent load from a load queue, searching the store retirement buffer for address matching; and, in cases where there are a plurality of address matches, locating a correct forwarding entry by scanning the store retirement buffer for a first match, and forwarding data from the first match to the subsequent load. Once speculative outcomes are known, the speculative state is retired to memory.

Type: Grant

Filed: December 12, 2014

Date of Patent: July 10, 2018

Assignee: Intel Corporation

Inventor: Mohammad A. Abdallah
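The forwarding search in the abstract for patent 10019263 can be sketched as a scan over a program-ordered list of stores. The abstract does not state the scan direction; this sketch assumes the scan starts at the youngest store so that the first match is the most recent write to the address. The function name and tuple layout are hypothetical.

```python
def forward_from_store_buffer(store_buffer, load_address):
    """Return data forwarded to a load, or None if no store matches.

    `store_buffer` holds (address, data) tuples in original program order,
    oldest first. Assumption: scanning begins at the youngest entry.
    """
    for address, data in reversed(store_buffer):
        if address == load_address:
            return data  # first match wins
    return None
```

Returning `None` stands for the case where the load must read its data from memory instead.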
