Dynamic Instruction Dependency Checking, Monitoring Or Conflict Resolution Patents (Class 712/216)

Scoreboarding, reservation station, or aliasing (Class 712/217)

Commitment control or register bypass (Class 712/218)

Reducing an impact of a stall or pipeline bubble (Class 712/219)

Instruction and logic for tracking fetch performance bottlenecks

Patent number: 10635442

Abstract: A processor includes a front end, an execution unit, a retirement stage, a counter, and a performance monitoring unit. The front end includes logic to receive an event instruction to enable supervision of a front end event that will delay execution of instructions. The execution unit includes logic to set a register with parameters for supervision of the front end event. The front end further includes logic to receive a candidate instruction and match the candidate instruction to the front end event. The counter includes logic to generate the front end event upon retirement of the candidate instruction.

Type: Grant

Filed: March 12, 2018

Date of Patent: April 28, 2020

Assignee: Intel Corporation

Inventor: Ahmad Yasin

Modulo hardware generator

Patent number: 10628125

Abstract: A method of generating a hardware design to calculate a modulo value for any input value in a target input range with respect to a constant value d using one or more range reduction stages. The hardware design is generated through an iterative process that selects the optimum component for mapping successively increasing input ranges to the target output range until a component is selected that maps the target input range to the target output range. Each iteration includes generating hardware design components for mapping the input range to the target output range using each of a plurality of modulo preserving range reduction methods, synthesizing the generated hardware design components, and selecting one of the generated hardware design components based on the results of the synthesis.

Type: Grant

Filed: January 18, 2019

Date of Patent: April 21, 2020

Assignee: Imagination Technologies Limited

Inventor: Samuel Lee
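
As a rough illustration of what one modulo-preserving range reduction stage from the abstract above can look like, the sketch below folds the high bits of the input back into the low bits, using the fact that 2^k is congruent to (2^k mod d) modulo d. The function names and the choice of this particular reduction are assumptions made for illustration. The abstract does not say which reduction methods are compared, and the synthesis-driven selection loop is not modeled here.

```python
def reduce_once(x: int, d: int, k: int) -> int:
    """One range reduction step that preserves the value of x modulo d.

    x is split at bit k into hi and lo, so x = hi * 2**k + lo.
    Replacing 2**k by (2**k mod d) keeps the result congruent to x mod d.
    """
    hi, lo = x >> k, x & ((1 << k) - 1)
    return hi * ((1 << k) % d) + lo


def modulo_by_constant(x: int, d: int, k: int) -> int:
    """Compute x mod d for non-negative x using repeated reduction stages."""
    if x < 0 or d <= 0:
        raise ValueError("x must be non-negative and d positive")
    if (1 << k) < d:
        # With 2**k < d the fold would not shrink x, so the loop could stall.
        raise ValueError("k must satisfy 2**k >= d")
    while x >= (1 << k):        # each stage maps a wider range to a narrower one
        x = reduce_once(x, d, k)
    while x >= d:               # final small correction by subtraction
        x -= d
    return x
```

In hardware, each pass of the first loop would correspond to a separate stage sized for its input range. The software loop only shows that the value modulo d is preserved at every step.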

Branch resolve pointer optimization

Patent number: 10628164

Abstract: A system and method for efficiently handling speculative execution. A load store unit (LSU) of a processor stores a commit candidate pointer, which points to a given store instruction buffered in the store queue. The given store instruction is an oldest store instruction not currently permitted to commit to the data cache. The LSU receives a first pointer from the mapping unit, which points to an oldest instruction of non-dispatched branches and unresolved system instructions. The LSU receives a second pointer from the execution unit, which points to an oldest unresolved, issued branch instruction. When the LSU determines the commit candidate pointer is older than each of the first pointer and the second pointer, the commit candidate pointer is updated to point to an oldest store instruction younger than the given store instruction stored in the store queue. The given store instruction is permitted to commit to the data cache.

Type: Grant

Filed: July 30, 2018

Date of Patent: April 21, 2020

Assignee: Apple Inc.

Inventors: Kulin N. Kothari, Mridul Agarwal, Aditya Kesiraju, Deepankar Duggal, Sean M. Reynolds
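
The pointer comparison in the abstract above can be sketched in a few lines. This is a minimal model, not the patented design: instruction age is represented by a sequence number where a smaller number means an older instruction, the store queue is a plain list, and the name `try_advance` is hypothetical.

```python
from typing import List, Optional, Tuple


def try_advance(store_seqs: List[int], candidate: int,
                first_ptr: int, second_ptr: int) -> Tuple[int, Optional[int]]:
    """One update step for the commit candidate pointer.

    store_seqs : sequence numbers of buffered stores, oldest first
                 (smaller number = older instruction; a modeling assumption).
    candidate  : index of the oldest store not yet permitted to commit.
    first_ptr  : oldest non-dispatched branch or unresolved system instruction.
    second_ptr : oldest unresolved, issued branch.

    Returns (new_candidate_index, sequence number permitted to commit or None).
    """
    if candidate >= len(store_seqs):
        return candidate, None            # no store is waiting
    seq = store_seqs[candidate]
    if seq < first_ptr and seq < second_ptr:
        # The store is older than both pointers, so it may commit and the
        # candidate pointer moves to the next younger store.
        return candidate + 1, seq
    return candidate, None
```

For example, with stores numbered 3, 8 and 15 and pointers at 10 and 12, two calls release stores 3 and 8, and store 15 stays blocked until the pointers move past it.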

Predicting a table of contents pointer value responsive to branching to a subroutine

Patent number: 10620955

Abstract: Predicting a Table of Contents (TOC) pointer value responsive to branching to a subroutine. A subroutine is called from a calling module executing on a processor. Based on calling the subroutine, a value of a pointer to a reference data structure, such as a TOC, is predicted. The predicting is performed prior to executing a sequence of one or more instructions in the subroutine to compute the value. The value that is predicted is used to access the reference data structure to obtain a variable value for a variable of the subroutine.

Type: Grant

Filed: September 19, 2017

Date of Patent: April 14, 2020

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Michael K. Gschwind, Valentina Salapura

Handling effective address synonyms in a load-store unit that operates without address translation

Patent number: 10606592

Abstract: Technical solutions are described for issuing, by a load-store unit (LSU), a plurality of instructions from an out-of-order (OoO) window. The issuing includes, in response to determining a first effective address being used by a first instruction, the first effective address corresponding to a first real address, creating an effective real table (ERT) entry in an ERT, the ERT entry mapping the first effective address to the first real address. Further, the execution includes, in response to determining an effective address synonym used by a second instruction, the effective address synonym being a second effective address that also corresponds to said first real address: creating a synonym detection table (SDT) entry in an SDT, wherein the SDT entry maps the second effective address to the ERT entry, and relaunching the second instruction by replacing the second effective address in the second instruction with the first effective address.

Type: Grant

Filed: November 29, 2017

Date of Patent: March 31, 2020

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Bryan Lloyd, Balaram Sinharoy
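
The interplay of the two tables in the abstract above can be modeled with dictionaries. This is a sketch under stated assumptions: the real address for each access is supplied by the caller (how the hardware learns it is outside this model), `ra_to_ea` is a helper index added for the sketch, and the class and method names are hypothetical.

```python
class SynonymTracker:
    """Toy model of an effective real table (ERT) and synonym detection table (SDT)."""

    def __init__(self):
        self.ert = {}       # effective address -> real address
        self.sdt = {}       # synonym effective address -> effective address owning the ERT entry
        self.ra_to_ea = {}  # helper index (assumption): real address -> owning effective address

    def resolve(self, ea: int, ra: int) -> int:
        """Return the effective address the instruction should be launched with."""
        if ea in self.ert:
            return ea                       # already the primary address
        if ea in self.sdt:
            return self.sdt[ea]             # known synonym: substitute the primary
        primary = self.ra_to_ea.get(ra)
        if primary is None:
            self.ert[ea] = ra               # first address seen for this real address
            self.ra_to_ea[ra] = ea
            return ea
        self.sdt[ea] = primary              # new synonym: record it, then relaunch
        return primary
```

A second effective address that maps to an already tracked real address is rewritten to the first one, which is the replacement step the abstract calls relaunching.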

Handling effective address synonyms in a load-store unit that operates without address translation

Patent number: 10606591

Abstract: Technical solutions are described for issuing, by a load-store unit (LSU), a plurality of instructions from an out-of-order (OoO) window. The issuing includes, in response to determining a first effective address being used by a first instruction, the first effective address corresponding to a first real address, creating an effective real table (ERT) entry in an ERT, the ERT entry mapping the first effective address to the first real address. Further, the execution includes, in response to determining an effective address synonym used by a second instruction, the effective address synonym being a second effective address that also corresponds to said first real address: creating a synonym detection table (SDT) entry in an SDT, wherein the SDT entry maps the second effective address to the ERT entry, and relaunching the second instruction by replacing the second effective address in the second instruction with the first effective address.

Type: Grant

Filed: October 6, 2017

Date of Patent: March 31, 2020

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Bryan Lloyd, Balaram Sinharoy

Instruction prefetch mechanism

Patent number: 10599571

Abstract: An apparatus to facilitate data prefetching is disclosed. The apparatus includes a cache, one or more execution units (EUs) to execute program code, prefetch logic to maintain tracking information of memory instructions in the program code that trigger a cache miss and compiler logic to receive the tracking information, insert one or more pre-fetch instructions in updated program code to prefetch data from a memory for execution of one or more of the memory instructions that triggered a cache miss and download the updated program code for execution by the one or more EUs.

Type: Grant

Filed: August 7, 2017

Date of Patent: March 24, 2020

Assignee: Intel Corporation

Inventors: Vasileios Porpodas, Guei-Yuan Lueh, Subramaniam Maiyuran, Wei-Yu Chen

Efficient store-forwarding with partitioned FIFO store-reorder queue in out-of-order processor

Patent number: 10579387

Abstract: Technical solutions are described for executing one or more out-of-order (OoO) instructions by a processing unit. The execution includes detecting, by a load-store unit (LSU), a load-hit-store (LHS) in an out-of-order execution of the instructions, the detecting based only on effective addresses. The detecting includes determining an effective address associated with an operand of a load instruction. The detecting further includes determining whether a store instruction entry using said effective address to store a data value is present in a store reorder queue, and indicating that an LHS has been detected based at least in part on determining that store instruction entry using said effective address is present in the store reorder queue. In response to detecting the LHS, a store forwarding is performed that includes forwarding data from the store instruction to the load instruction.

Type: Grant

Filed: October 6, 2017

Date of Patent: March 3, 2020

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Christopher Gonzalez, Bryan Lloyd, Balaram Sinharoy
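
A load-hit-store check that uses only effective addresses, followed by store forwarding, might look like the sketch below. It assumes the queue holds only stores older than the load, in program order, and that the youngest matching store supplies the data. The partitioned FIFO organization from the title is not modeled, and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class StoreEntry:
    ea: int      # effective address written by the store
    data: int    # value waiting to be written


def load_with_forwarding(load_ea: int,
                         store_queue: List[StoreEntry],
                         memory: Dict[int, int]) -> Tuple[int, bool]:
    """Return (value, forwarded) for a load at effective address load_ea.

    store_queue lists older, not yet committed stores, oldest first
    (an assumption of this sketch).
    """
    for entry in reversed(store_queue):      # youngest matching store wins
        if entry.ea == load_ea:
            return entry.data, True          # load-hit-store: forward the data
    return memory.get(load_ea, 0), False     # no hit: read from memory
```

Because the comparison uses effective addresses only, no address translation is needed before the hit is detected.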

Handling effective address synonyms in a load-store unit that operates without address translation

Patent number: 10572257

Abstract: Technical solutions are described for issuing, by a load-store unit (LSU), a plurality of instructions from an out-of-order (OoO) window. The issuing includes, in response to determining a first effective address (EA) being used by a first instruction, the first EA corresponding to a first real address (RA), creating a first effective real translation (ERT) table entry in an ERT table, the ERT entry mapping the first EA to the first RA. Further, in response to determining an EA synonym used by a second instruction, the execution includes replacing the first ERT entry with a second ERT entry, wherein the second ERT entry maps the second EA to the first RA, and creating an ERT eviction (ERTE) table entry in an ERTE table, wherein the ERTE entry maps the first RA to the first EA, so that the ERTE table entry maintains the relationship between the first EA and the first RA.

Type: Grant

Filed: November 29, 2017

Date of Patent: February 25, 2020

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Bryan Lloyd, Balaram Sinharoy

Speculatively performing memory move requests with respect to a barrier

Patent number: 10572179

Abstract: A lower level cache receives, from a processor core, a plurality of copy-type requests and a plurality of paste-type requests that together indicate a memory move to be performed, as well as a barrier request that requests ordering of memory access requests prior to and after the barrier request. The barrier request precedes a copy-type request and a paste-type request of the memory move in program order. Prior to completion of processing of the barrier request, the lower level cache allocates first and second state machines to service the copy-type and paste-type requests. The first state machine speculatively reads a data granule identified by a source real address of the copy-type request into a non-architected buffer. After processing of the barrier request is complete, the second state machine writes the data granule from the non-architected buffer to a storage location identified by a destination real address of the paste-type request.

Type: Grant

Filed: July 19, 2018

Date of Patent: February 25, 2020

Assignee: International Business Machines Corporation

Inventors: Guy L. Guthrie, Derek E. Williams

Handling effective address synonyms in a load-store unit that operates without address translation

Patent number: 10572256

Abstract: Technical solutions are described for issuing, by a load-store unit (LSU), a plurality of instructions from an out-of-order (OoO) window. The issuing includes, in response to determining a first effective address (EA) being used by a first instruction, the first EA corresponding to a first real address (RA), creating a first effective real translation (ERT) table entry in an ERT table, the ERT entry mapping the first EA to the first RA. Further, in response to determining an EA synonym used by a second instruction, the execution includes replacing the first ERT entry with a second ERT entry, wherein the second ERT entry maps the second EA to the first RA, and creating an ERT eviction (ERTE) table entry in an ERTE table, wherein the ERTE entry maps the first RA to the first EA, so that the ERTE table entry maintains the relationship between the first EA and the first RA.

Type: Grant

Filed: October 6, 2017

Date of Patent: February 25, 2020

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Bryan Lloyd, Balaram Sinharoy

Monitor support on accelerated processing device

Patent number: 10558418

Abstract: A technique for implementing synchronization monitors on an accelerated processing device (“APD”) is provided. Work on an APD includes workgroups that include one or more wavefronts. All wavefronts of a workgroup execute on a single compute unit. A monitor is a synchronization construct that allows workgroups to stall until a particular condition is met. Responsive to all wavefronts of a workgroup executing a wait instruction, the monitor coordinator records the workgroup in an “entry queue.” The workgroup begins saving its state to a general APD memory and, when such saving is complete, the monitor coordinator moves the workgroup to a “condition queue.” When the condition specified by the wait instruction is met, the monitor coordinator moves the workgroup to a “ready queue,” and, when sufficient resources are available on a compute unit, the APD schedules the ready workgroup for execution on a compute unit.

Type: Grant

Filed: July 27, 2017

Date of Patent: February 11, 2020

Assignee: Advanced Micro Devices, Inc.

Inventors: Alexandru Dutu, Bradford M. Beckmann
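
The three queues named in the abstract above form a small state machine, sketched below. The sketch assumes conditions can be represented as hashable tokens and that one free compute unit hosts one workgroup. State saving itself is reduced to a callback, and the class and method names are hypothetical.

```python
from collections import deque


class MonitorCoordinator:
    """Toy model of the entry, condition and ready queues."""

    def __init__(self):
        self.entry_queue = deque()      # (workgroup, condition): state save in progress
        self.condition_queue = deque()  # (workgroup, condition): saved, waiting on condition
        self.ready_queue = deque()      # workgroups whose condition has been met

    def all_wavefronts_waiting(self, workgroup, condition):
        """Every wavefront of the workgroup has executed the wait instruction."""
        self.entry_queue.append((workgroup, condition))

    def state_saved(self, workgroup):
        """The workgroup finished saving its state to memory."""
        for item in list(self.entry_queue):
            if item[0] == workgroup:
                self.entry_queue.remove(item)
                self.condition_queue.append(item)

    def condition_met(self, condition):
        """Move every workgroup waiting on this condition to the ready queue."""
        for item in list(self.condition_queue):
            if item[1] == condition:
                self.condition_queue.remove(item)
                self.ready_queue.append(item[0])

    def schedule(self, free_compute_units: int):
        """Return the workgroups scheduled onto the available compute units."""
        scheduled = []
        while self.ready_queue and free_compute_units > 0:
            scheduled.append(self.ready_queue.popleft())
            free_compute_units -= 1
        return scheduled
```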

Multi-level history buffer for transaction memory in a microprocessor

Patent number: 10545765

Abstract: Embodiments include systems, methods, and computer program products for using a multi-level history buffer (HB) for a speculative transaction. One method includes after dispatching a first instruction indicating start of the speculative transaction, marking one or more register file (RF) entries as pre-transaction memory (PTM), and after dispatching a second instruction targeting one of the marked RF entries, moving data from the marked RF entry to a first level HB entry and marking the first level HB entry as PTM. The method also includes upon detecting a write back to the first level HB entry, moving data from the first level HB entry to a second level HB entry and marking the second level HB entry as PTM. The method further includes upon determining that the second level HB entry has been completed, moving data from the second level HB entry to a third level HB entry.

Type: Grant

Filed: May 17, 2017

Date of Patent: January 28, 2020

Assignee: International Business Machines Corporation

Inventors: Brian D. Barrick, Steven J. Battle, Joshua W. Bowman, Hung Q. Le, Dung Q. Nguyen, David R. Terry, Albert J. Van Norstrand, Jr.

System and method for a synthetic trace model

Patent number: 10546075

Abstract: A system and method for a synthetic trace model includes providing a first system model, the first system model comprising a plurality of subsystem models, each of the plurality of subsystem models having a trace format, generating a first plurality of traces from an overall pool of trace instructions, each of the first plurality of traces generated for respective ones of the plurality of subsystem models, according to the trace format of the subsystem model, executing the traces on each of the subsystem models, and evaluating execution characteristics for each trace executed on the first system model.

Type: Grant

Filed: April 27, 2016

Date of Patent: January 28, 2020

Assignee: FUTUREWEI TECHNOLOGIES, INC.

Inventors: YwhPyng Harn, Fa Yin, Xiaotao Chen

Parallelization method, parallelization tool, and in-vehicle device

Patent number: 10540156

Abstract: A computer generates a parallel program, based on an analysis of a single program that includes a plurality of tasks written for a single-core microcomputer, by parallelizing parallelizable tasks for a multi-core processor having multiple cores. The computer includes a macro task (MT) group extractor that identifies a resource commonly accessed by the plurality of tasks, and extracts a plurality of MTs that access that commonly accessed resource. Then, the computer uses an allocation restriction determiner to allocate the extracted MTs to the same core in the multi-core processor. With the parallelization method described above, overhead in the execution time of the parallel program on the multi-core processor is reduced, and an in-vehicle device is enabled to execute each of the MTs in the program optimally.

Type: Grant

Filed: June 8, 2017

Date of Patent: January 21, 2020

Assignee: DENSO CORPORATION

Inventor: Kenichi Mineda

Load-hit-load detection in an out-of-order processor

Patent number: 10534616

Abstract: Technical solutions are described for executing one or more out-of-order instructions by a load-store unit (LSU) by detecting a load-hit-load (LHL) case based only on effective addresses (EA). An example method includes, in response to receiving a first load instruction, creating an entry in an LHL table. Further, in response to receiving a second load instruction in the load reorder queue, and in response to a predetermined number of bits from a second EA used by the second load instruction matching the predetermined number of bits from the first EA, the method includes comparing the first EA and the second EA. Further, a first thread identifier for the first load instruction is compared with a second thread identifier for the second load instruction. In response to the first EA matching the second EA, and the first thread identifier matching the second thread identifier, the method includes flushing the first load instruction.

Type: Grant

Filed: October 6, 2017

Date of Patent: January 14, 2020

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Christopher Gonzalez, Bryan Lloyd, Balaram Sinharoy

Handling move instructions via register renaming or writing to a different physical register using control flags

Patent number: 10528355

Abstract: An apparatus has processing circuitry, register rename circuitry and control circuitry which selects one of first and second move handling techniques for handling a move instruction specifying a source logical register and a destination logical register. In the first technique, the register rename circuitry maps the destination logical register of the move to the same physical register as the source logical register. In the second technique, the processing circuitry writes a data value read from a physical register corresponding to the source logical register to a different physical register corresponding to the destination logical register. The second technique is selected when the source logical register of the move instruction is the same as one of the source and destination logical registers of an earlier move instruction handled according to the first technique, and the register mapping used for that register when handling the earlier move instruction is still current.

Type: Grant

Filed: December 24, 2015

Date of Patent: January 7, 2020

Assignee: ARM Limited

Inventors: Chris Abernathy, Florent Begon
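
The selection rule in the abstract above can be illustrated with a toy rename table. This is a sketch, not ARM's design: the `shared` set stands in for the control flags mentioned in the title, physical registers are never returned to the free list, and all names are hypothetical.

```python
class Renamer:
    """Toy rename table that picks between the two move handling techniques."""

    def __init__(self, num_logical: int, num_physical: int):
        self.map = {r: r for r in range(num_logical)}    # logical -> physical
        self.free = list(range(num_logical, num_physical))
        self.values = [0] * num_physical
        # Logical registers whose current mapping took part in a
        # technique-1 move (stand-in for the control flags; an assumption).
        self.shared = set()

    def write(self, dst: int, value: int) -> None:
        """Ordinary write: dst gets a fresh physical register."""
        p = self.free.pop(0)
        self.map[dst] = p
        self.values[p] = value
        self.shared.discard(dst)         # the earlier mapping is no longer current

    def move(self, dst: int, src: int) -> str:
        if src in self.shared:
            # Technique 2: copy the value into a different physical register.
            p = self.free.pop(0)
            self.values[p] = self.values[self.map[src]]
            self.map[dst] = p
            self.shared.discard(dst)
            return "copy"
        # Technique 1: map dst to the same physical register as src.
        self.map[dst] = self.map[src]
        self.shared.update((src, dst))
        return "rename"
```

In this model at most two logical registers ever share a physical register, which keeps the bookkeeping to one flag per logical register.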

Context sensitive barriers with an implicit access ordering constraint for a victim context

Patent number: 10503512

Abstract: Apparatus for data processing and a method of data processing are provided, according to which the processing circuitry of the apparatus can access a memory system and execute data processing instructions in one context of multiple contexts which it supports. When the processing circuitry executes a barrier instruction, the resulting access ordering constraint may be limited to being enforced for accesses which have been initiated by the processing circuitry when operating in an identified context, which may for example be the context in which the barrier instruction has been executed. This provides a separation between the operation of the processing circuitry in its multiple possible contexts and in particular avoids delays in the completion of the access ordering constraint, for example relating to accesses to high latency regions of memory, from affecting the timing sensitivities of other contexts.

Type: Grant

Filed: November 3, 2015

Date of Patent: December 10, 2019

Assignee: ARM Limited

Inventors: Simon John Craske, Alexander Alfred Hornung, Max John Batley, Kauser Yakub Johar

Resource synchronization for graphics processing

Patent number: 10504270

Abstract: Techniques are disclosed relating to synchronizing access to pixel resources. Examples of pixel resources include color attachments, a stencil buffer, and a depth buffer. In some embodiments, hardware registers are used to track status of assigned pixel resources and pixel wait and pixel release instructions are used to synchronize access to the pixel resources. In some embodiments, other accesses to the pixel resources may occur out of program order. Relative to tracking and ordering pass groups, this weak ordering and explicit synchronization may improve performance and reduce power consumption. Disclosed techniques may also facilitate coordination between fragment rendering threads and auxiliary mid-render compute tasks.

Type: Grant

Filed: December 22, 2016

Date of Patent: December 10, 2019

Assignee: Apple Inc.

Inventors: Terence M. Potter, Richard W. Schreyer, James J. Ding, Alexander K. Kan, Michael Imbrogno

Operation unit, method and device capable of supporting operation data of different bit widths

Patent number: 10489704

Abstract: Aspects for supporting operation data of different bit widths in neural networks are described herein. The aspects may include a processing module that includes one or more processors. The processor may be capable of processing data of one or more respective bit-widths. Further, the aspects may include a determiner module configured to receive one or more instructions that include one or more operands and one or more width fields. The operands may correspond to one or more operand types and each of the width fields may indicate an operand bit-width of one operand type. The determiner module may be further configured to identify at least one operand bit-width that is greater than each of the bit-widths. In addition, the aspects may include a processor combiner configured to designate a combination of two or more of the processors to process the operands.

Type: Grant

Filed: February 5, 2019

Date of Patent: November 26, 2019

Assignee: CAMBRICON TECHNOLOGIES CORPORATION LIMITED

Inventors: Tianshi Chen, Qi Guo, Zidong Du

Reuse of a related thread's cache while recording a trace file of code execution

Patent number: 10489273

Abstract: Reusing a related thread's cache during tracing. An embodiment includes executing a first thread at a processing unit while recording a trace to a first buffer. During execution, a context switch from the first thread to a second thread at the same processing unit is detected. Based on the context switch, it is determined that the second thread is related to the first thread, and that it is being traced to a separate second buffer. Based on this determination, a cache of the first thread is reused. The reuse includes recording a first identifier in the first buffer, and recording a second identifier in the second buffer. The first and second identifiers provide a linkage between the first buffer and the second buffer. Execution of the second thread is then initiated, while recording a trace to the second buffer, and without invalidating logging state of a cache.

Type: Grant

Filed: May 24, 2017

Date of Patent: November 26, 2019

Assignee: Microsoft Technology Licensing, LLC

Inventor: Jordi Mola

Method and apparatus for nearest potential store tagging

Patent number: 10467010

Abstract: A method for performing memory disambiguation in an out-of-order microprocessor pipeline is disclosed. The method comprises storing a tag with a load operation, wherein the tag is an identification number representing a store instruction nearest to the load operation, wherein the store instruction is older with respect to the load operation and wherein the store has potential to result in a RAW violation in conjunction with the load operation. The method also comprises issuing the load operation from an instruction scheduling module. Further, the method comprises acquiring data for the load operation speculatively after the load operation has arrived at a load store queue module. Finally, the method comprises determining if an identification number associated with a last contiguous issued store with respect to the load operation is equal to or greater than the tag and gating a validation process for the load operation in response to the determination.

Type: Grant

Filed: March 13, 2014

Date of Patent: November 5, 2019

Assignee: Intel Corporation

Inventors: Mohammad A. Abdallah, Mandeep Singh
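
The comparison at the end of the abstract above can be written out as follows. The sketch assumes store identification numbers increase in program order starting from 0 and that the set of issued stores is available as a Python set. What the hardware does once the condition holds (the gating of validation) is left out, and the function names are hypothetical.

```python
from typing import Set


def last_contiguous_issued(issued_ids: Set[int], first_id: int = 0) -> int:
    """Largest id n such that every store from first_id through n has issued.

    Returns first_id - 1 when the oldest store has not issued yet.
    """
    n = first_id - 1
    while (n + 1) in issued_ids:
        n += 1
    return n


def tag_condition_met(load_tag: int, issued_ids: Set[int]) -> bool:
    """True when the last contiguous issued store id is equal to or greater
    than the load's nearest-potential-store tag."""
    return last_contiguous_issued(issued_ids) >= load_tag
```

With stores 0, 1, 2 and 4 issued, the last contiguous issued store is 2, so a load tagged 2 satisfies the condition while a load tagged 3 does not.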

Accelerating memory fault resolution by performing fast re-fetching

Patent number: 10402263

Abstract: A method for handling load faults in an out-of-order processor is described. The method includes detecting, by a memory ordering buffer of the out-of-order processor, a load fault corresponding to a load instruction that was executed out-of-order by the out-of-order processor; determining, by the memory ordering buffer, whether instant reclamation is available for resolving the load fault of the load instruction; and performing, in response to determining that instant reclamation is available for resolving the load fault of the load instruction, instant reclamation to re-fetch the load instruction for execution prior to attempting to retire the load instruction.

Type: Grant

Filed: December 4, 2017

Date of Patent: September 3, 2019

Assignee: Intel Corporation

Inventors: Zeev Sperber, Stanislav Shwartsman, Jared W. Stark, IV, Lihu Rappoport, Igor Yanover, George Leifman

Method and apparatus for detecting memory conflicts using distinguished memory addresses

Patent number: 10402201

Abstract: A method and apparatus for detecting potential memory conflicts in a parallel computing environment by executing two parallel program threads. The parallel program threads include special operands that are used by a processing core to identify memory addresses that have the potential for conflict. These memory addresses are combined into a composite access record for each thread. The composite access records are compared to each other in order to detect a potential memory conflict.

Type: Grant

Filed: March 9, 2017

Date of Patent: September 3, 2019

Inventors: Joel Kevin Jones, Ananth Jasty
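
One way to picture a composite access record is as a fixed-width bit signature per thread, in the style of a Bloom filter. That representation is an assumption of this sketch: the abstract above does not say how addresses are combined, and the hash, the 64-bit width and the function names are hypothetical.

```python
from typing import Iterable


def composite_record(addresses: Iterable[int], bits: int = 64) -> int:
    """Fold a thread's flagged addresses into one bit signature."""
    record = 0
    for addr in addresses:
        # Drop the low 3 bits (8-byte granularity) and pick one bit per address.
        record |= 1 << ((addr >> 3) % bits)
    return record


def potential_conflict(record_a: int, record_b: int) -> bool:
    """Two records sharing any bit may have touched the same address."""
    return (record_a & record_b) != 0
```

A signature like this can report a conflict that is not real, because different addresses can map to the same bit, but it never misses an address that both threads recorded. That matches the abstract's wording of detecting a potential conflict.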

Symmetric multiprocessing management

Patent number: 10375038

Abstract: Disclosed aspects relate to symmetric multiprocessing (SMP) management. A first SMP topology may be identified by a service processor firmware. The first SMP topology may indicate a first set of connection paths for a plurality of processor chips of a multi-node server. A second SMP topology may be identified by the service processor firmware. The second SMP topology may indicate a second set of connection paths for the plurality of processor chips of the multi-node server. The second SMP topology may differ from the first SMP topology. An error event related to the first SMP topology may be detected. A set of traffic may be routed using the second SMP topology. The set of traffic may be routed by the service processor firmware in response to detecting the error event related to the first SMP topology.

Type: Grant

Filed: November 30, 2016

Date of Patent: August 6, 2019

Assignee: International Business Machines Corporation

Inventors: Deepak Kodihalli, Venkatesh Sainath, Dhruvaraj Subhashchandran
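
The failover behavior in the abstract above reduces to holding two topologies and switching the active one when an error is reported. In the sketch below a topology is a dictionary from (source chip, destination chip) to a list of hops. That representation and all names are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Topology = Dict[Tuple[str, str], List[str]]


class SmpRouter:
    """Toy model of routing over a first topology with a second as fallback."""

    def __init__(self, first: Topology, second: Topology):
        self.topologies = {"first": first, "second": second}
        self.active = "first"

    def report_error(self, topology_name: str) -> None:
        """Switch to the second topology when the first reports an error."""
        if topology_name == "first":
            self.active = "second"

    def route(self, src: str, dst: str) -> List[str]:
        """Return the connection path for src -> dst in the active topology."""
        return self.topologies[self.active][(src, dst)]
```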

Efficient pointer load and format

Patent number: 10360030

Abstract: Embodiments of the present disclosure relate to processing a microprocessor instruction by receiving a microprocessor instruction for processing by a microprocessor, and processing the microprocessor instruction in a multi-cycle operation by acquiring a unit of data having a plurality of ordered bits, where the acquiring is performed by the microprocessor during a first clock cycle, and shifting the unit of data by a number of bits, where the shifting is performed by the microprocessor during a second clock cycle subsequent to the first clock cycle.

Type: Grant

Filed: December 20, 2017

Date of Patent: July 23, 2019

Assignee: International Business Machines Corporation

Inventors: Eyal Naor, Martin Recktenwald, Christian Zoellin, Aaron Tsai

Software scoreboard information and synchronization

Patent number: 10360654

Abstract: Embodiments described herein provide a graphics processor in which dependency tracking hardware is simplified via the use of compiler provided software scoreboard information. In one embodiment the shader compiler for shader programs is configured to encode software scoreboard information into each instruction. Dependencies can be evaluated by the shader compiler and provided as scoreboard information with each instruction. The hardware can then use the provided information when scheduling instructions. In one embodiment, a software scoreboard synchronization instruction is provided to facilitate software dependency handling within a shader program. Using software to facilitate software dependency handling and synchronization can simplify hardware design, reducing the area consumed by the hardware. In one embodiment, dependencies can be evaluated by the shader compiler instead of the GPU hardware.

Type: Grant

Filed: May 25, 2018

Date of Patent: July 23, 2019

Assignee: Intel Corporation

Inventors: Subramaniam Maiyuran, Supratim Pal, Jorge E. Parra, Chandra S. Gurram, Ashwin J. Shivani, Ashutosh Garg, Brent A. Schwartz, Jorge F. Garcia Pabon, Darin M. Starkey, Shubh B. Shah, Guei-Yuan Lueh, Kaiyu Chen, Konrad Trifunovic, Buqi Cheng, Weiyu Chen

Efficient pointer load and format

Patent number: 10353707

Abstract: Embodiments of the present disclosure relate to processing a microprocessor instruction by receiving a microprocessor instruction for processing by a microprocessor, and processing the microprocessor instruction in a multi-cycle operation by acquiring a unit of data having a plurality of ordered bits, where the acquiring is performed by the microprocessor during a first clock cycle, and shifting the unit of data by a number of bits, where the shifting is performed by the microprocessor during a second clock cycle subsequent to the first clock cycle.

Type: Grant

Filed: July 12, 2017

Date of Patent: July 16, 2019

Assignee: International Business Machines Corporation

Inventors: Eyal Naor, Martin Recktenwald, Christian Zoellin, Aaron Tsai

Reading a register pair by writing a wide register

Patent number: 10318299

Abstract: A read operation is initiated to obtain a wide input operand. Based on the initiating, a determination is made as to whether the wide input operand is available in a wide register or in two narrow registers. Based on determining the wide input operand is not available in the wide register, merging at least a portion of contents of the two narrow registers to obtain merged contents, writing the merged contents into the wide register, and continuing the read operation to obtain the wide input operand. Based on determining the wide input operand is available in the wide register, obtaining the wide input operand from the wide register.

Type: Grant

Filed: October 31, 2013

Date of Patent: June 11, 2019

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Jonathan D. Bradbury, Michael K. Gschwind
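
The read path in the abstract above amounts to a merge on first use. The sketch below assumes 64-bit narrow registers, a 128-bit wide register, and that pair p consists of narrow registers 2p (high half) and 2p+1 (low half). None of these details come from the abstract, and the names are hypothetical.

```python
class RegisterFile:
    """Toy model: a wide operand lives in one wide register or two narrow ones."""

    NARROW_BITS = 64  # assumed width of a narrow register

    def __init__(self):
        self.narrow = {}  # narrow register index -> value
        self.wide = {}    # pair index -> merged wide value

    def read_wide(self, pair: int) -> int:
        """Return the wide operand, merging the narrow halves if needed."""
        if pair not in self.wide:
            hi = self.narrow[2 * pair]
            lo = self.narrow[2 * pair + 1]
            # Merge the two halves and write the result into the wide register.
            self.wide[pair] = (hi << self.NARROW_BITS) | lo
        return self.wide[pair]
```

After the first read, later reads of the same pair are served from the wide register without repeating the merge.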

Parallelized execution of instruction sequences based on pre-monitoring

Patent number: 10296346

Abstract: A method which includes, in a processor that processes instructions of program code, processing one or more of the instructions in a first segment of the instructions by a first hardware thread. Upon detecting that an instruction defined as a parallelization point has been fetched for the first thread, a second hardware thread is invoked to process at least one of the instructions in a second segment of the instructions, at least partially in parallel with processing of the instructions of the first segment by the first hardware thread, in accordance with a specification of register access that is indicative of data dependencies between the first and second segments.

Type: Grant

Filed: March 31, 2015

Date of Patent: May 21, 2019

Assignee: CENTIPEDE SEMI LTD.

Inventors: Noam Mizrahi, Alberto Mandler, Shay Koren, Jonathan Friedmann
Parallelized execution of instruction sequences

Patent number: 10296350

Abstract: A method which includes, in a processor that processes instructions of program code, processing one or more of the instructions by a first hardware thread. Upon detecting that an instruction defined as a parallelization point has been fetched for the first thread, a second hardware thread is invoked to process at least one of the instructions at least partially in parallel with processing of the instructions by the first hardware thread.

Type: Grant

Filed: March 31, 2015

Date of Patent: May 21, 2019

Assignee: CENTIPEDE SEMI LTD.

Inventors: Noam Mizrahi, Alberto Mandler, Shay Koren, Jonathan Friedmann
Numerical controller for reducing consumed power in non-cutting state

Patent number: 10281901

Abstract: A numerical controller looks ahead a machining program to detect consecutive non-cutting blocks. The numerical controller calculates first consumed power needed during an execution duration of the non-cutting blocks to shift equipment to a power saving state, operate the equipment in the power saving state, and restore the equipment to a state before the shifting to the power saving state, and second consumed power needed during the execution duration of the non-cutting blocks to operate the equipment without shifting the equipment to the power saving state. When a result of the calculation indicates that the first consumed power is lower than the second consumed power, the numerical controller creates an equipment operation variation pattern according to which the equipment is to be shifted to the power saving state, operated in the power saving state, and then restored to the state before the shifting to the power saving state.

Type: Grant

Filed: April 21, 2017

Date of Patent: May 7, 2019

Assignee: Fanuc Corporation

Inventor: Takenori Ono
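
Illustrative sketch (not taken from the patent): the comparison described in the abstract above can be written as a small energy calculation. The function name, the parameter names, and the choice of watts, seconds, and joules as units are assumptions for this example, as is the rule that the equipment stays in its normal state when the blocks are too short to fit the shift and restore times.

```python
def plan_non_cutting(duration_s, normal_w, saving_w,
                     shift_s, shift_j, restore_s, restore_j):
    """Return "shift" if entering the power saving state costs less energy."""
    saving_time_s = duration_s - shift_s - restore_s
    if saving_time_s < 0:
        # Assumption: not enough time to shift and restore, so do not shift.
        return "stay"
    # First consumed power: shift, operate in the power saving state, restore.
    first_j = shift_j + saving_w * saving_time_s + restore_j
    # Second consumed power: operate normally for the whole duration.
    second_j = normal_w * duration_s
    return "shift" if first_j < second_j else "stay"
```

For a long run of non-cutting blocks the fixed shift and restore costs are amortized and the function returns "shift"; for a short run the fixed costs dominate and it returns "stay".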
Scheduling business process

Patent number: 10282707

Abstract: A system and method for scheduling a business process including tasks, comprises a calculation unit, a determination unit, and a decision unit. The calculation unit is configured to calculate an estimated processing time required to execute the tasks. The determination unit is configured to calculate an estimated end time of a route including the tasks on the basis of the estimated processing time and schedule of a user to execute the tasks, and determine whether to apply speculative execution to the business process on the basis of the estimated end time. The decision unit is configured to decide to speculatively execute a task out of the tasks in the business process. The decision is made with reference to a remaining period for executing the task. The remaining period is calculated on the basis of a predicted execution timing of each task and a deadline of the business process.

Type: Grant

Filed: July 2, 2015

Date of Patent: May 7, 2019

Assignee: International Business Machines Corporation

Inventors: Mari A. Fukuda, Ai Yoshino, Takuya Nakaike
Method to do control speculation on loads in a high performance strand-based loop accelerator

Patent number: 10241789

Abstract: An apparatus includes a binary translator to hoist a load instruction in a branch of a conditional statement above the conditional statement and insert a speculation control of load (SCL) instruction in a complementary branch of the conditional statement, where the SCL instruction provides an indication of a real program order (RPO) of the load instruction before the load instruction was hoisted. The apparatus further includes an execution circuit to execute the load instruction to perform a load and cause an entry for the load instruction to be inserted in an ordering buffer, and where the execution circuit is to execute the SCL instruction to locate the entry for the load instruction in the ordering buffer using the RPO of the load instruction provided by the SCL instruction and discard the entry for the load instruction from the ordering buffer.

Type: Grant

Filed: December 27, 2016

Date of Patent: March 26, 2019

Assignee: INTEL CORPORATION

Inventors: Alexander Y. Ostanevich, Sergey P. Scherbinin, Jayesh Iyer, Dmitry M. Maslennikov, Denis G. Motin, Alexander V. Ermolovich, Andrey Chudnovets, Sergey A. Rozhkov, Boris A. Babayan
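
Illustrative sketch (not taken from the patent): the interaction between a hoisted load and an SCL instruction can be modeled with an ordering buffer keyed by real program order (RPO). The class and method names, and the use of a dictionary as the buffer and as memory, are assumptions for this example.

```python
class OrderingBuffer:
    """Toy ordering buffer keyed by the RPO of each hoisted load."""

    def __init__(self):
        self.entries = {}  # RPO -> value loaded speculatively

    def execute_load(self, rpo, address, memory):
        # Perform the load and insert an entry for it.
        value = memory[address]
        self.entries[rpo] = value
        return value

    def execute_scl(self, rpo):
        # Executed on the complementary branch: locate the entry by RPO
        # and discard it. Returns True if an entry was found.
        if rpo in self.entries:
            del self.entries[rpo]
            return True
        return False
```

In this model, taking the branch that originally contained the load leaves the entry in place, while taking the complementary branch runs `execute_scl` and removes it.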
Out-of-order command execution with sliding windows to maintain completion statuses

Patent number: 10241799

Abstract: Techniques are described for reordering commands to improve the speed at which at least one command stream may execute. Prior to distributing commands in the at least one command stream to multiple pipelines, a multimedia processor analyzes any inter-pipeline dependencies and determines the current execution state of the pipelines. The processor may, based on this information, reorder the at least one command stream by prioritizing commands that lack any current dependencies and therefore may be executed immediately by the appropriate pipeline. Such out of order execution of commands in the at least one command stream may increase the throughput of the multimedia processor by increasing the rate at which the command stream is executed.

Type: Grant

Filed: July 16, 2010

Date of Patent: March 26, 2019

Assignee: QUALCOMM Incorporated

Inventors: Alexei V. Bourd, Guofang Jiao
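
Illustrative sketch (not taken from the patent): the reordering idea in the abstract above can be approximated by moving commands with no outstanding dependencies to the front of the stream. The tuple layout `(name, dependencies)` and the `completed` set are assumptions for this example, and the sliding windows named in the title are not modeled.

```python
def reorder(commands, completed):
    """Put commands whose dependencies are all completed first.

    commands:  list of (name, set of names this command depends on)
    completed: set of names that have already finished
    Relative order is preserved within the ready and blocked groups.
    """
    ready = [c for c in commands if c[1] <= completed]
    blocked = [c for c in commands if not c[1] <= completed]
    return ready + blocked
```

A command with an empty dependency set is always ready, since the empty set is a subset of any `completed` set.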
Execution of program region with transactional memory

Patent number: 10241700

Abstract: A method for executing a program region by a computer system with transactional memory support is disclosed. The computer system uses hierarchical locks for executing the program region. Determination is conducted whether a first condition related to a transaction abort is satisfied in beginning a transaction for the program region. If the first condition is satisfied, a bottom level lock corresponding to a bottom level resource among available resources is acquired to execute the program region in the transaction. If a second condition is determined to be satisfied, a next level lock corresponding to next level resource is acquired. If the acquired lock is a top level lock corresponding to a top level resource, the program region is executed without using the transaction.

Type: Grant

Filed: August 21, 2015

Date of Patent: March 26, 2019

Assignee: International Business Machines Corporation

Inventor: Takuya Nakaike
Reading a register pair by writing a wide register

Patent number: 10228946

Abstract: A read operation is initiated to obtain a wide input operand. Based on the initiating, a determination is made as to whether the wide input operand is available in a wide register or in two narrow registers. Based on determining the wide input operand is not available in the wide register, merging at least a portion of contents of the two narrow registers to obtain merged contents, writing the merged contents into the wide register, and continuing the read operation to obtain the wide input operand. Based on determining the wide input operand is available in the wide register, obtaining the wide input operand from the wide register.

Type: Grant

Filed: November 22, 2014

Date of Patent: March 12, 2019

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Jonathan D. Bradbury, Michael K. Gschwind
Apparatus and method for programmable load replay preclusion

Patent number: 10228944

Abstract: An apparatus including first and second reservation stations. The first reservation station dispatches a load micro instruction, and indicates on a hold bus if the load micro instruction is a specified load micro instruction directed to retrieve an operand from a prescribed resource other than on-core cache memory. The second reservation station is coupled to the hold bus, and dispatches one or more younger micro instructions therein that depend on the load micro instruction for execution after a number of clock cycles following dispatch of the first load micro instruction, and if it is indicated on the hold bus that the load micro instruction is the specified load micro instruction, the second reservation station is configured to stall dispatch of the one or more younger micro instructions until the load micro instruction has retrieved the operand.

Type: Grant

Filed: November 24, 2015

Date of Patent: March 12, 2019

Assignee: VIA ALLIANCE SEMICONDUCTOR CO., LTD.

Inventors: Gerard M. Col, Colin Eddy, G. Glenn Henry
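
Illustrative sketch (not taken from the patent): the effect of the hold bus on dependent micro instructions can be expressed as a choice of dispatch cycle. The function name, the cycle-count parameters, and the use of `max` to never dispatch earlier than the default latency are assumptions for this example.

```python
def dependent_dispatch_cycle(load_dispatch_cycle, default_latency,
                             hold_asserted, operand_ready_cycle):
    """Cycle at which the second reservation station dispatches dependents."""
    speculative_cycle = load_dispatch_cycle + default_latency
    if hold_asserted:
        # Specified load (operand comes from outside on-core cache):
        # stall until the operand has actually been retrieved.
        return max(speculative_cycle, operand_ready_cycle)
    # Ordinary load: dispatch after the fixed number of clock cycles.
    return speculative_cycle
```

Stalling in the held case avoids dispatching dependents that would otherwise have to be replayed when the operand is not ready.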
Computing dependent and conflicting changes of business process models

Patent number: 10223651

Abstract: Changing a business process model involves several aspects: (1) given a set of change operations, dependencies and conflicts are encoded in dependency and conflict matrices; (2) given a change sequence for a process model M, the change sequence is broken up into subsequences such that operations from different subsequences are independent; (3) given a change sequence for a process model V1 and another change sequence for a process model V2, conflicts between operations in the different change sequences are determined; (4) the process structure tree can be used to localize dependency computations, yielding a more efficient approach to determining dependencies; and (5) the process structure tree can be used to localize conflict computations, yielding a more efficient approach to determining conflicts.

Type: Grant

Filed: June 26, 2012

Date of Patent: March 5, 2019

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Jochen M. Kuester, Christian Gerth
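
Illustrative sketch (not taken from the patent): aspect (2) of the abstract above, breaking a change sequence into mutually independent subsequences, can be approximated by grouping operations that are linked through a dependency matrix. The function name, the boolean matrix layout, and the use of a union-find grouping are assumptions for this example.

```python
def split_independent(ops, dependent):
    """Group operations so that different groups share no dependencies.

    ops:       list of operation names in sequence order
    dependent: square matrix; dependent[i][j] is True if ops[i] and ops[j]
               depend on each other in either direction
    """
    n = len(ops)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the group representative (with path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if dependent[i][j] or dependent[j][i]:
                parent[find(i)] = find(j)

    groups = {}
    for i, op in enumerate(ops):
        groups.setdefault(find(i), []).append(op)
    return list(groups.values())
```

Operations inside each returned group keep their original relative order.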
Apparatus and method for programmable load replay preclusion

Patent number: 10209996

Abstract: An apparatus including first and second reservation stations. The first reservation station dispatches a load micro instruction, and indicates on a hold bus if the load micro instruction is a specified load micro instruction directed to retrieve an operand from a prescribed resource other than on-core cache memory. The second reservation station is coupled to the hold bus, and dispatches one or more younger micro instructions therein that depend on the load micro instruction for execution after a number of clock cycles following dispatch of the first load micro instruction, and if it is indicated on the hold bus that the load micro instruction is the specified load micro instruction, the second reservation station is configured to stall dispatch of the one or more younger micro instructions until the load micro instruction has retrieved the operand.

Type: Grant

Filed: December 14, 2014

Date of Patent: February 19, 2019

Assignee: VIA ALLIANCE SEMICONDUCTOR CO., LTD.

Inventors: Gerard M. Col, Colin Eddy, G. Glenn Henry
Processor with improved alias queue and store collision detection to reduce memory violations and load replays

Patent number: 10203957

Abstract: A register alias table for a processor including an alias queue, load and store comparators, and dependency logic. Each entry of the alias queue stores instruction pointers of a pair of colliding load and store instructions that caused a memory violation and a valid value. The store comparator compares the instruction pointer of a subsequent store instruction with those stored in the alias queue, and if a match occurs, indicates that a store index of the subsequent store instruction is valid. The load comparator determines whether the instruction pointer of a subsequent load instruction matches an instruction pointer stored in the alias queue. If so, dependency logic provides a store index, if valid, as dependency information for the subsequent load instruction.

Type: Grant

Filed: September 30, 2016

Date of Patent: February 12, 2019

Assignee: VIA ALLIANCE SEMICONDUCTOR CO., LTD.

Inventor: Xiaolong Fei
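
Illustrative sketch (not taken from the patent): the alias queue lookup in the abstract above can be modeled as a list of load/store instruction-pointer pairs. The class name, the method names, and the use of `None` to mean "store index not valid" are assumptions for this example.

```python
class AliasQueue:
    """Toy alias queue of colliding load/store instruction pointers."""

    def __init__(self):
        self.entries = []

    def record_violation(self, load_ip, store_ip):
        # Remember a load/store pair that caused a memory violation.
        self.entries.append(
            {"load_ip": load_ip, "store_ip": store_ip, "store_index": None}
        )

    def on_store(self, store_ip, store_index):
        # Store comparator: a matching store makes its store index valid.
        for entry in self.entries:
            if entry["store_ip"] == store_ip:
                entry["store_index"] = store_index

    def on_load(self, load_ip):
        # Load comparator: a matching load receives the valid store index
        # as dependency information, or None if there is none.
        for entry in self.entries:
            if entry["load_ip"] == load_ip and entry["store_index"] is not None:
                return entry["store_index"]
        return None
```

A load that receives a store index in this model would wait for that store instead of executing early and being replayed.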
Processor with efficient memory access

Patent number: 10185561

Abstract: A method includes, in a processor, processing program code that includes memory-access instructions, wherein at least some of the memory-access instructions include symbolic expressions that specify memory addresses in an external memory in terms of one or more register names. A relationship between the memory addresses accessed by two or more of the memory-access instructions is identified, based on respective formats of the memory addresses specified in the symbolic expressions. An outcome of at least one of the memory-access instructions is assigned to be served from an internal memory in the processor, based on the identified relationship.

Type: Grant

Filed: July 9, 2015

Date of Patent: January 22, 2019

Assignee: CENTIPEDE SEMI LTD.

Inventors: Noam Mizrahi, Jonathan Friedmann
Data processing systems

Patent number: 10176546

Abstract: A data processing system determines, for a stream of instructions to be executed, whether there are any instructions that can be re-ordered in the instruction stream (41), and assigns each such instruction to an instruction completion tracker and includes in the encoding for the instruction an indication of the instruction completion tracker it has been assigned to (42). For each instruction in the instruction stream, an indication of which instruction completion trackers, if any, the instruction depends on is also provided (43, 44). Then, when an instruction that is indicated as being dependent on an instruction completion tracker is to be executed, the status of the relevant instruction completion tracker is checked before executing the instruction.

Type: Grant

Filed: July 2, 2013

Date of Patent: January 8, 2019

Assignee: Arm Limited

Inventor: Jorn Nystad
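
Illustrative sketch (not taken from the patent): the tracker check in the abstract above can be modeled with one pending counter per instruction completion tracker. The class name, the method names, and the counter-based status are assumptions for this example.

```python
class CompletionTrackers:
    """Toy set of instruction completion trackers."""

    def __init__(self, count):
        self.pending = [0] * count  # outstanding instructions per tracker

    def issue(self, tracker):
        # A re-orderable instruction assigned to this tracker has started.
        self.pending[tracker] += 1

    def complete(self, tracker):
        # An instruction assigned to this tracker has finished.
        self.pending[tracker] -= 1

    def can_execute(self, depends_on):
        # A dependent instruction may run only when every tracker it
        # names has no outstanding instructions.
        return all(self.pending[t] == 0 for t in depends_on)
```

An instruction that names no trackers is never held back by this check.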
Load replay precluding mechanism

Patent number: 10146546

Abstract: An apparatus including first and second reservation stations. The first reservation station dispatches a load micro instruction, and indicates on a hold bus if the load micro instruction is a specified load micro instruction directed to retrieve an operand from a prescribed resource other than on-core cache memory. The second reservation station is coupled to the hold bus, and dispatches one or more younger micro instructions therein that depend on the load micro instruction for execution after a number of clock cycles following dispatch of the first load micro instruction, and if it is indicated on the hold bus that the load micro instruction is the specified load micro instruction, the second reservation station is configured to stall dispatch of the one or more younger micro instructions until the load micro instruction has retrieved the operand. The resources include an advanced programmable interrupt controller (APIC), configured to perform interrupt operations.

Type: Grant

Filed: December 14, 2014

Date of Patent: December 4, 2018

Assignee: VIA ALLIANCE SEMICONDUCTOR CO., LTD.

Inventors: Gerard M. Col, Colin Eddy, G. Glenn Henry
Load replay precluding mechanism

Patent number: 10146539

Abstract: An apparatus including first and second reservation stations. The first reservation station dispatches a load micro instruction, and indicates on a hold bus if the load micro instruction is a specified load micro instruction directed to retrieve an operand from a prescribed resource other than on-core cache memory. The second reservation station is coupled to the hold bus, and dispatches one or more younger micro instructions therein that depend on the load micro instruction for execution after a number of clock cycles following dispatch of the first load micro instruction, and if it is indicated on the hold bus that the load micro instruction is the specified load micro instruction, the second reservation station is configured to stall dispatch of the one or more younger micro instructions until the load micro instruction has retrieved the operand. The resources include an advanced programmable interrupt controller (APIC), configured to perform interrupt operations.

Type: Grant

Filed: November 24, 2015

Date of Patent: December 4, 2018

Assignee: VIA ALLIANCE SEMICONDUCTOR CO., LTD.

Inventors: Gerard M. Col, Colin Eddy, G. Glenn Henry
Apparatus and method to preclude load replays dependent on write combining memory space access in an out-of-order processor

Patent number: 10146540

Abstract: An apparatus including first and second reservation stations. The first reservation station dispatches a load micro instruction, and indicates on a hold bus if the load micro instruction is a specified load micro instruction directed to retrieve an operand from a prescribed resource other than on-core cache memory. The second reservation station is coupled to the hold bus, and dispatches one or more younger micro instructions therein that depend on the load micro instruction for execution after a number of clock cycles following dispatch of the first load micro instruction, and if it is indicated on the hold bus that the load micro instruction is the specified load micro instruction, the second reservation station is configured to stall dispatch of the one or more younger micro instructions until the load micro instruction has retrieved the operand.

Type: Grant

Filed: November 24, 2015

Date of Patent: December 4, 2018

Assignee: VIA ALLIANCE SEMICONDUCTOR CO., LTD.

Inventors: Gerard M. Col, Colin Eddy, G. Glenn Henry
Mechanism to preclude uncacheable-dependent load replays in out-of-order processor

Patent number: 10133579

Abstract: An apparatus including first and second reservation stations. The first reservation station dispatches a load micro instruction, and indicates on a hold bus if the load micro instruction is a specified load micro instruction directed to retrieve an operand from a prescribed resource other than on-core cache memory. The second reservation station is coupled to the hold bus, and dispatches one or more younger micro instructions therein that depend on the load micro instruction for execution after a number of clock cycles following dispatch of the first load micro instruction, and if it is indicated on the hold bus that the load micro instruction is the specified load micro instruction, the second reservation station is configured to stall dispatch of the one or more younger micro instructions until the load micro instruction has retrieved the operand. The resources include system memory, coupled to an out-of-order processor via a memory bus.

Type: Grant

Filed: December 14, 2014

Date of Patent: November 20, 2018

Assignee: VIA ALLIANCE SEMICONDUCTOR CO., LTD.

Inventors: Gerard M. Col, Colin Eddy, G. Glenn Henry
Apparatus and method to preclude load replays dependent on write combining memory space access in an out-of-order processor

Patent number: 10133580

Abstract: An apparatus including first and second reservation stations. The first reservation station dispatches a load micro instruction, and indicates on a hold bus if the load micro instruction is a specified load micro instruction directed to retrieve an operand from a prescribed resource other than on-core cache memory. The second reservation station is coupled to the hold bus, and dispatches one or more younger micro instructions therein that depend on the load micro instruction for execution after a number of clock cycles following dispatch of the first load micro instruction, and if it is indicated on the hold bus that the load micro instruction is the specified load micro instruction, the second reservation station is configured to stall dispatch of the one or more younger micro instructions until the load micro instruction has retrieved the operand.

Type: Grant

Filed: December 14, 2014

Date of Patent: November 20, 2018

Assignee: VIA ALLIANCE SEMICONDUCTOR CO., LTD.

Inventors: Gerard M. Col, Colin Eddy, G. Glenn Henry
Mechanism to preclude uncacheable-dependent load replays in out-of-order processor

Patent number: 10127046

Abstract: An apparatus including first and second reservation stations. The first reservation station dispatches a load micro instruction, and indicates on a hold bus if the load micro instruction is a specified load micro instruction directed to retrieve an operand from a prescribed resource other than on-core cache memory. The second reservation station is coupled to the hold bus, and dispatches one or more younger micro instructions therein that depend on the load micro instruction for execution after a number of clock cycles following dispatch of the first load micro instruction, and if it is indicated on the hold bus that the load micro instruction is the specified load micro instruction, the second reservation station is configured to stall dispatch of the one or more younger micro instructions until the load micro instruction has retrieved the operand. The resources include system memory, coupled to an out-of-order processor via a memory bus.

Type: Grant

Filed: November 24, 2015

Date of Patent: November 13, 2018

Assignee: VIA ALLIANCE SEMICONDUCTOR CO., LTD.

Inventors: Gerard M. Col, Colin Eddy, G. Glenn Henry
Programmable load replay precluding mechanism

Patent number: 10114794

Abstract: An apparatus including first and second reservation stations. The first reservation station dispatches a load micro instruction, and indicates on a hold bus if the load micro instruction is a specified load micro instruction directed to retrieve an operand from a prescribed resource other than on-core cache memory. The second reservation station is coupled to the hold bus, and dispatches one or more younger micro instructions therein that depend on the load micro instruction for execution after a number of clock cycles following dispatch of the first load micro instruction, and if it is indicated on the hold bus that the load micro instruction is the specified load micro instruction, the second reservation station is configured to stall dispatch of the one or more younger micro instructions until the load micro instruction has retrieved the operand.

Type: Grant

Filed: December 14, 2014

Date of Patent: October 30, 2018

Assignee: VIA ALLIANCE SEMICONDUCTOR CO., LTD.

Inventors: Gerard M. Col, Colin Eddy, G. Glenn Henry
