Dynamic Instruction Dependency Checking, Monitoring Or Conflict Resolution Patents (Class 712/216)
  • Patent number: 11402822
    Abstract: To provide a numerical controller that can detect a position in a machining program at which a speed control abnormality is likely to occur due to an insufficient number of look-ahead blocks used to determine an acceleration/deceleration operation, and start a look-ahead processing function from that position in parallel with looking ahead at the machining program from its start.
    Type: Grant
    Filed: October 25, 2019
    Date of Patent: August 2, 2022
    Assignee: FANUC CORPORATION
    Inventors: Daisuke Uenishi, Chikara Tango
  • Patent number: 11392410
    Abstract: Operand pool instruction reservation clustering in a scheduler circuit in a processor is disclosed. The scheduler circuit includes a plurality of operand pool reservation circuits, each having an assigned number of source operands for a stored instruction that must be ready before the instruction is issued. Instructions having the same number of source operands that are not yet ready for their issuance can be stored in an operand pool reservation circuit having the same assigned number of source operands. In this manner, the number of reservation entries and associated comparator circuits in the clustered scheduler circuit is distributed among the plurality of operand pool reservation circuits to avoid or reduce an increase in the number of scheduling path connections and complexity in each reservation circuit. This can avoid or reduce an increase in scheduling latency for a given number of reservation entries in the clustered scheduler circuit.
    Type: Grant
    Filed: April 8, 2020
    Date of Patent: July 19, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Shivam Priyadarshi, Yusuf Cagatay Tekmen, Rodney Wayne Smith, Vignyan Reddy Kothinti Naresh
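    The clustering idea in the entry above can be modeled in software: waiting instructions are grouped by how many of their source operands are still outstanding, so each pool only needs wakeup comparators for that many operands. A minimal Python sketch under that reading; the class and method names are illustrative, not taken from the patent:

```python
from collections import defaultdict

class OperandPoolScheduler:
    """Toy model: instructions wait in a pool keyed by how many of their
    source operands are not yet ready (a software stand-in for clustering
    reservation entries by assigned operand count)."""

    def __init__(self):
        self.ready_regs = set()          # registers whose values are available
        self.pools = defaultdict(list)   # not-ready-count -> waiting instructions

    def dispatch(self, name, sources):
        pending = [r for r in sources if r not in self.ready_regs]
        if not pending:
            print(f"issue {name} immediately")
        else:
            # store only the operands that still need wakeup comparators
            self.pools[len(pending)].append((name, set(pending)))

    def broadcast(self, reg):
        """A result becomes available; wake up dependent instructions."""
        self.ready_regs.add(reg)
        for count in sorted(self.pools):
            still_waiting = []
            for name, pending in self.pools[count]:
                pending.discard(reg)
                if pending:
                    still_waiting.append((name, pending))
                else:
                    print(f"issue {name} after wakeup on {reg}")
            self.pools[count] = still_waiting

sched = OperandPoolScheduler()
sched.dispatch("add r3, r1, r2", ["r1", "r2"])   # two operands outstanding
sched.dispatch("neg r4, r1", ["r1"])             # one operand outstanding
sched.broadcast("r1")
sched.broadcast("r2")
```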
  • Patent number: 11392380
    Abstract: Systems, methods, and apparatuses relating to circuitry to precisely monitor memory store accesses are described.
    Type: Grant
    Filed: December 28, 2019
    Date of Patent: July 19, 2022
    Assignee: Intel Corporation
    Inventors: Ahmad Yasin, Raanan Sade, Liron Zur, Igor Yanover, Joseph Nuzman
  • Patent number: 11372662
    Abstract: Disclosed herein are methods, systems, and processes to perform granular and selective agent-based throttling of command executions. A resource consumption threshold is allocated to an agent process that is configured to perform data collection tasks on a host computing device. A desired throttle level is generated for the agent process based on the resource consumption threshold allocated to the agent process, and execution of the agent process is controlled in polling intervals. For each polling interval, a current throttle level for the agent process is determined based on a run count and a skip count of the agent process; the agent process is suspended if the agent process is active and the current throttle level is greater than the desired throttle level, and the agent process is resumed if the agent process is idle and the current throttle level is not greater than the desired throttle level.
    Type: Grant
    Filed: June 9, 2021
    Date of Patent: June 28, 2022
    Assignee: Rapid7, Inc.
    Inventor: Shreyas Khare
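    The per-interval decision described above is essentially a comparison of an observed throttle level against a target. A rough Python sketch under assumed semantics (current throttle level approximated as the fraction of intervals the agent ran, derived from the run and skip counts); the function and variable names are invented for illustration:

```python
def throttle_agent(desired_throttle, activity, run=0, skip=0):
    """Toy polling loop: `activity` lists, per interval, whether the agent
    process is currently active. The current throttle level is approximated
    here as run / (run + skip); this is an assumption for the sketch."""
    for interval, active in enumerate(activity):
        total = run + skip
        current = (run / total) if total else 0.0
        if active and current > desired_throttle:
            action, skip = "suspend", skip + 1
        elif not active and current <= desired_throttle:
            action, run = "resume", run + 1
        else:
            action = "leave as-is"
            run, skip = (run + 1, skip) if active else (run, skip + 1)
        print(f"interval {interval}: current={current:.2f} -> {action}")

# Target: the agent should run in at most ~50% of polling intervals.
throttle_agent(0.5, [True, True, True, False, True, False, True])
```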
  • Patent number: 11366664
    Abstract: Systems, apparatuses and methods are disclosed for efficient management of registers in a graph stream processing (GSP) system. The GSP system includes a thread scheduler module operative to initiate a Single Instruction Multiple Data (SIMD) thread, the SIMD thread including a dispatch mask with an initial value; a thread arbiter module operative to select an instruction from the instructions and provide the instruction to each of one or more compute resources; and an instruction iterator module, associated with each of the one or more compute resources, operative to determine a data type of the instruction. The instruction iterator module iteratively executes the instruction based on the data type and the dispatch mask.
    Type: Grant
    Filed: December 8, 2019
    Date of Patent: June 21, 2022
    Assignee: Blaize, Inc.
    Inventors: Kamaraj Thangam, Srinivasulu Nagisetty, Venkata Divya Bharathi Palaparthy, Aswathy Asok, Satyaki Koneru
  • Patent number: 11366669
    Abstract: Data processing apparatus, data processing methods, a method and a computer program product are disclosed. The data processing apparatus includes a processor core operable to execute sequences of instructions of a plurality of program threads. The processor core has a plurality of pipeline stages, one of which is an instruction schedule stage having scheduling logic operable, in response to a thread pause instruction within a program thread, to prevent scheduling of instructions from that program thread following the thread pause instruction and instead to schedule instructions from another program thread for execution within the plurality of pipeline stages.
    Type: Grant
    Filed: November 30, 2016
    Date of Patent: June 21, 2022
    Assignee: Swarm64 AS
    Inventor: Eivind Liland
  • Patent number: 11360773
    Abstract: Reusing fetched, flushed instructions after an instruction pipeline flush in response to a hazard in a processor to reduce instruction re-fetching is disclosed. An instruction processing circuit is configured to detect fetched performance degrading instructions (PDIs) in a pre-execution stage in an instruction pipeline that may cause a precise interrupt that would cause flushing of the instruction pipeline. In response to detecting a PDI in an instruction pipeline, the instruction processing circuit is configured to capture the fetched PDI and/or its successor, younger fetched instructions that are processed in the instruction pipeline behind the PDI, in a pipeline refill circuit.
    Type: Grant
    Filed: June 22, 2020
    Date of Patent: June 14, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Rami Mohammad Al Sheikh, Michael Scott McIlvaine
  • Patent number: 11354129
    Abstract: A system for predicting latency of at least one variable-latency instruction, wherein a microprocessor includes at least one pipeline, the at least one pipeline having an instruction stream. The microprocessor is configured to issue at least one dependent instruction, execute the at least one pipeline to serve at least one variable-latency instruction, generate a result of the at least one variable-latency instruction, and serve the at least one dependent instruction by using the result of the at least one variable-latency instruction.
    Type: Grant
    Filed: October 9, 2015
    Date of Patent: June 7, 2022
    Assignee: SPREADTRUM HONG KONG LIMITED
    Inventor: Jeremy L. Branscome
  • Patent number: 11340787
    Abstract: The present disclosure includes apparatuses and methods related to a memory protocol. An example apparatus can perform operations on a number of block buffers of the memory device based on commands received from a host using a block configuration register, wherein the operations can read data from the number of block buffers and write data to the number of block buffers on the memory device.
    Type: Grant
    Filed: December 18, 2019
    Date of Patent: May 24, 2022
    Assignee: Micron Technology, Inc.
    Inventors: Robert M. Walker, James A. Hall, Jr.
  • Patent number: 11281466
    Abstract: A floating point unit includes a non-pickable scheduler queue (NSQ) that offers a load operation concurrently with a load store unit retrieving load data for an operand that is to be loaded by the load operation. The floating point unit also includes a renamer that renames architectural registers used by the load operation and allocates physical register numbers to the load operation in response to receiving the load operation from the NSQ. The floating point unit further includes a set of pickable scheduler queues that receive the load operation from the renamer and store the load operation prior to execution. A physical register file is implemented in the floating point unit and a free list is used to store physical register numbers of entries in the physical register file that are available for allocation.
    Type: Grant
    Filed: October 22, 2019
    Date of Patent: March 22, 2022
    Assignees: Advanced Micro Devices, Inc., ATI TECHNOLOGIES ULC
    Inventors: Arun A. Nair, Michael Estlick, Erik Swanson, Sneha V. Desai, Donglin Ji
  • Patent number: 11281187
    Abstract: To provide a numerical controller that can detect a position in a machining program at which a speed control abnormality is likely to occur due to an insufficient number of look-ahead blocks used to determine an acceleration/deceleration operation, and supplement the look-ahead blocks at that position in order to stabilize feed rate, cutting speed and other factors. A numerical controller includes a required look-ahead blocks setting unit that sets a required number of look-ahead blocks, which is the number of look-ahead blocks required to execute a machining program, and an operation limitation unit that compares the number of look-ahead blocks calculated by a look-ahead blocks calculation unit to the required number and, if the calculated number is less than the required number, limits execution of the machining program until the look-ahead blocks reach the required number.
    Type: Grant
    Filed: October 22, 2019
    Date of Patent: March 22, 2022
    Assignee: FANUC CORPORATION
    Inventors: Daisuke Uenishi, Chikara Tango
  • Patent number: 11275712
    Abstract: In an embodiment, a method for processing data in a single instruction multiple data (SIMD) computer architecture is provided. A processing element (PE) may determine, based on a masking instruction, a predication state indicative of one of a conditional predication mode and an absolute predication mode. The PE may receive a predicated instruction and determine, based on a value of a head bit of the bits of a predication mask and on the value indicative of the predication state, whether to commit a computation corresponding to execution of the predicated instruction. In another embodiment, a SIMD controller stores loops and sections of a program as a separate instruction stream record for generating the memory address of the next instruction. For data streams, the SIMD controller records information for each data memory access that references the same register files that are used by the instruction streams.
    Type: Grant
    Filed: August 20, 2020
    Date of Patent: March 15, 2022
    Assignee: NORTHROP GRUMMAN SYSTEMS CORPORATION
    Inventors: Paul Kenton Tschirhart, Brian Konigsburg
  • Patent number: 11249985
    Abstract: The present disclosure relates to systems, methods, and non-transitory computer readable storage media for implementing a scalable, secure, efficient, and adaptable distributed digital ledger transaction network. Indeed, the disclosed systems can reduce storage and processing requirements, improve security of implementing computing devices and underlying digital assets, accommodate a wide variety of different digital programs (or “smart contracts”), and scale to accommodate billions of users and associated digital transactions. For example, the disclosed systems can utilize a host of features that improve storage, account/address management, digital transaction execution, consensus, and synchronization processes. The disclosed systems can also utilize a new programming language that improves efficiency and security of the distributed digital ledger transaction network.
    Type: Grant
    Filed: June 15, 2019
    Date of Patent: February 15, 2022
    Assignee: Facebook, Inc.
    Inventors: Qinfan Wu, Benjamin D. Maurer
  • Patent number: 11243774
    Abstract: Methods, systems and computer program products for dynamically selecting an OSC hazard avoidance mechanism are provided. Aspects include receiving a load instruction that is associated with an operand store compare (OSC) prediction. The OSC prediction is stored in an entry of an OSC history table (OHT) and includes a multiple dependencies indicator (MDI). Responsive to determining the MDI is in a first state, aspects include applying a first OSC hazard avoidance mechanism in relation to the load instruction. Responsive to determining that the load instruction is dependent on more than one store instruction, aspects include placing the MDI in a second state. The MDI being in the second state provides an indication to apply a second OSC hazard avoidance mechanism in relation to the load instruction.
    Type: Grant
    Filed: March 20, 2019
    Date of Patent: February 8, 2022
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: James Raymond Cuffney, Adam Collura, James Bonanno, Edward Malley, Anthony Saporito, Jang-Soo Lee, Michael Cadigan, Jr., Jonathan Hsieh
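    A compact way to read the abstract above: the OSC history table entry carries a flag (the MDI) that selects between two avoidance strategies, and the flag is set once a load is observed to depend on more than one store. A hedged Python sketch; the two mechanism names and the table keying are invented for illustration:

```python
class OSCEntry:
    def __init__(self):
        self.multiple_dependencies = False   # the MDI bit, initially the first state

oht = {}   # toy OSC history table: load address -> OSCEntry

def handle_load(load_addr, num_dependent_stores):
    entry = oht.setdefault(load_addr, OSCEntry())
    # Pick the hazard avoidance mechanism from the current MDI state.
    mechanism = ("wait-for-all-older-stores"        # placeholder second mechanism
                 if entry.multiple_dependencies
                 else "forward-from-single-store")  # placeholder first mechanism
    print(f"load @{load_addr:#x}: applying {mechanism}")
    # If this execution shows more than one store dependency, flip the MDI.
    if num_dependent_stores > 1:
        entry.multiple_dependencies = True

handle_load(0x1000, num_dependent_stores=1)   # first mechanism
handle_load(0x1000, num_dependent_stores=2)   # still first; MDI flips here
handle_load(0x1000, num_dependent_stores=2)   # second mechanism from now on
```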
  • Patent number: 11210193
    Abstract: A method for improving performance of a system including a first processor and a second processor includes obtaining a code region specified to be executed on the second processor, the code region including a plurality of instructions, calculating a performance improvement of executing at least one of the plurality of instructions included in the code region on the second processor over executing the at least one instruction on the first processor, removing the at least one instruction from the code region in response to a condition including that the performance improvement does not exceed a first threshold, and repeating the calculating and the removing to produce a modified code region specified to be executed on the second processor.
    Type: Grant
    Filed: October 28, 2019
    Date of Patent: December 28, 2021
    Assignee: International Business Machines Corporation
    Inventor: Kazuaki Ishizaki
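    The selection loop described above is a greedy pruning procedure: keep removing instructions whose offload benefit does not exceed a threshold, then re-evaluate the remaining region. A minimal sketch; the per-instruction speedup model and the threshold value are made up for illustration:

```python
def prune_code_region(region, speedup, threshold=1.0):
    """`region` is a list of instruction names; `speedup[i]` is an assumed
    performance improvement of running instruction i on the second processor
    instead of the first (both invented for this sketch)."""
    changed = True
    while changed:
        changed = False
        for instr in list(region):
            if speedup[instr] <= threshold:   # not worth offloading
                region.remove(instr)
                changed = True
    return region

region = ["vec_mul", "vec_add", "scalar_cmp", "branch"]
speedup = {"vec_mul": 3.2, "vec_add": 2.5, "scalar_cmp": 0.8, "branch": 0.4}
print(prune_code_region(region, speedup))   # -> ['vec_mul', 'vec_add']
```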
  • Patent number: 11204768
    Abstract: Instruction length based parallel instruction demarcators and methods for parallel instruction demarcation are included, wherein an instruction sequence is received at an instruction buffer, the instruction sequence comprising a plurality of instruction syllables, and the instruction sequence is stored at the instruction buffer. A length of instructions and at least one boundary are determined using one or more logic blocks arranged in a sequence. Additionally, using a controlling logic block, the sequence is demarcated into individual instructions.
    Type: Grant
    Filed: August 12, 2020
    Date of Patent: December 21, 2021
    Inventor: Sitaram Yadavalli
  • Patent number: 11204773
    Abstract: A data processing apparatus is provided. It includes processing circuitry for speculatively executing a plurality of instructions. Storage circuitry stores a current state of the processing circuitry and a plurality of previous states of the processing circuitry. Execution of the plurality of instructions changes the current state of the processing circuitry. Flush circuitry replaces, in response to a misprediction, the current state of the processing circuitry with a replacement one of the plurality of previous states of the processing circuitry.
    Type: Grant
    Filed: September 7, 2018
    Date of Patent: December 21, 2021
    Assignee: Arm Limited
    Inventors: William Elton Burky, Glen Andrew Harris, Yasuo Ishii
  • Patent number: 11200064
    Abstract: Methods and parallel processing units for avoiding inter-pipeline data hazards wherein inter-pipeline data hazards are identified at compile time. For each identified inter-pipeline data hazard, the primary instruction and secondary instruction(s) thereof are identified as such and are linked by a counter which is used to track that inter-pipeline data hazard. Then, when a primary instruction is output by the instruction decoder for execution, the value of the counter associated therewith is adjusted (e.g. incremented) to indicate that there is a hazard related to the primary instruction, and when the primary instruction has been resolved by one of multiple parallel processing pipelines, the value of the counter associated therewith is adjusted (e.g. decremented) to indicate that the hazard related to the primary instruction has been resolved.
    Type: Grant
    Filed: October 14, 2020
    Date of Patent: December 14, 2021
    Assignee: Imagination Technologies Limited
    Inventors: Luca Iuliano, Simon Nield, Yoong-Chert Foo, Ollie Mower
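    The counter protocol in the entry above (increment when the primary instruction issues, decrement when it resolves, hold dependent secondary instructions while the counter is non-zero) can be sketched directly; the class and method names below are illustrative, not from the patent:

```python
class HazardCounters:
    """Toy model of compile-time-assigned hazard counters shared between
    parallel pipelines: primaries bump a counter, secondaries wait on it."""

    def __init__(self, n):
        self.counters = [0] * n

    def issue_primary(self, counter_id):
        self.counters[counter_id] += 1          # hazard now outstanding

    def resolve_primary(self, counter_id):
        self.counters[counter_id] -= 1          # hazard resolved

    def can_issue_secondary(self, counter_id):
        return self.counters[counter_id] == 0   # safe only when nothing pending

hz = HazardCounters(4)
hz.issue_primary(2)                  # e.g. a write linked to counter 2 at compile time
print(hz.can_issue_secondary(2))     # False: the dependent read must wait
hz.resolve_primary(2)
print(hz.can_issue_secondary(2))     # True: hazard cleared
```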
  • Patent number: 11188255
    Abstract: An integrated circuit may include a memory controller circuit for communicating with an off-chip memory device. The memory controller is operable in a read-write major mode that is capable of dynamically adapting to any memory traffic pattern, which results in improved memory scheduling efficiency across different user applications. The memory controller may include at least a write command queue, a read command queue, an arbiter, and a command scheduler. The command scheduler may monitor a write command count, a read command count, a write stall count, and a read stall count to determine whether to dynamically adjust a read burst threshold setting and a write burst threshold setting.
    Type: Grant
    Filed: March 28, 2018
    Date of Patent: November 30, 2021
    Assignee: Intel Corporation
    Inventors: Chee Hak Teh, Yu Ying Ong, Kevin Chao Ing Teoh
  • Patent number: 11188334
    Abstract: Obsoleting values stored in registers in a processor based on processing obsolescent register-encoded instructions is disclosed. The processor is configured to support execution of read and/or write instructions that include obsolescence encoding indicating that one or more of its source and/or target register operands are to be obsoleted by the processor. A register encoded as obsolescent means the data value stored in such register will not be used by subsequent instructions in an instruction stream, and thus does not need to be retained. Thus, such register can be set as being in an obsolescent state so that the data value stored in such register can be ignored to improve performance. As one example, data values for registers having an obsolescent state can be ignored and thus not stored in a saved context for a process being switched out, thus conserving memory and improving processing time for a process switch.
    Type: Grant
    Filed: December 2, 2019
    Date of Patent: November 30, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Thomas Andrew Sartorius, Thomas Philip Speier, Michael Scott McIlvaine, James Norris Dieffenderfer, Rodney Wayne Smith
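    One concrete payoff named in the abstract above is context-switch savings: registers marked obsolescent need not be written into the saved context. A small Python sketch of just that step; the register names and the way obsolescence is conveyed are invented for illustration:

```python
def save_context(registers, obsolescent):
    """Save only registers whose values may still be consumed.
    `registers` maps register name -> value; `obsolescent` is the set of
    registers that obsolescence-encoded instructions marked as dead."""
    return {reg: val for reg, val in registers.items() if reg not in obsolescent}

regs = {"r0": 7, "r1": 42, "r2": 13, "r3": 99}
# e.g. an instruction encoded "last use of r1 and r3" marks them obsolescent
dead = {"r1", "r3"}
print(save_context(regs, dead))   # only r0 and r2 are stored for the switched-out process
```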
  • Patent number: 11182207
    Abstract: Techniques are disclosed for reducing the latency between the completion of a producer task and the launch of a consumer task dependent on the producer task. Such latency exists when the information needed to launch the consumer task is unavailable when the producer task completes. Thus, various techniques are disclosed, where a task management unit initiates the retrieval of the information needed to launch the consumer task from memory in parallel with the producer task being launched. Because the retrieval of such information is initiated in parallel with the launch of the producer task, the information is often available when the producer task completes, thus allowing for the consumer task to be launched without delay. The disclosed techniques, therefore, enable the latency between completing the producer task and launching the consumer task to be reduced.
    Type: Grant
    Filed: June 24, 2019
    Date of Patent: November 23, 2021
    Assignee: NVIDIA CORPORATION
    Inventors: Gentaro Hirota, Brian Pharris, Jeff Tuckey, Robert Overman, Stephen Jones
  • Patent number: 11182168
    Abstract: A computer data processing system includes an instruction pipeline having a front end and a back end, a decoding and dispatch unit to dispatch a current instruction; and a pipeline by-pass unit to invoke an out-of-order pipeline by-pass operation. The pipeline by-pass unit by-passes a section of the instruction pipeline such that the current instruction architecturally completes before initiating instruction execution. The computer data processing system further includes a post-completion execution unit that executes the current instruction after the current instruction architecturally completes.
    Type: Grant
    Filed: December 21, 2020
    Date of Patent: November 23, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Avery Francois, Christian Jacobi, Gregory William Alexander
  • Patent number: 11163675
    Abstract: Mutation testing can indicate whether mutants of a software application, created by intentionally altering source code of the software application, are successfully “killed” by test cases executed against the mutants. Mutation testing can be performed via parallel threads by, within each parallel thread, modifying individual source code class files and recompiling the modified class files to generate and test mutants. Individual mutation test results produced within each of the parallel threads can be aggregated to generate an aggregated test result report that indicates overall testing metrics associated with the mutation testing across the parallel threads.
    Type: Grant
    Filed: April 7, 2021
    Date of Patent: November 2, 2021
    Assignee: State Farm Mutual Automobile Insurance Company
    Inventors: Andrew L Pearson, Nate Shepherd
  • Patent number: 11150979
    Abstract: A method for handling load faults in an out-of-order processor is described. The method includes detecting, by a memory ordering buffer of the out-of-order processor, a load fault corresponding to a load instruction that was executed out-of-order by the out-of-order processor; determining, by the memory ordering buffer, whether instant reclamation is available for resolving the load fault of the load instruction; and performing, in response to determining that instant reclamation is available for resolving the load fault of the load instruction, instant reclamation to re-fetch the load instruction for execution prior to attempting to retire the load instruction.
    Type: Grant
    Filed: August 13, 2019
    Date of Patent: October 19, 2021
    Assignee: Intel Corporation
    Inventors: Zeev Sperber, Stanislav Shwartsman, Jared W. Stark, IV, Lihu Rappoport, Igor Yanover, George Leifman
  • Patent number: 11150902
    Abstract: Systems and methods of performing processor pipeline management include receiving an instruction for processing, determining that data in a first memory sub-group of a memory group needed to process the instruction is not available in a cache that ensures fixed latency access, and determining that the instruction should be put in a sleep state. The sleep state indicates that the instruction will not be reissued until the instruction is moved to a wakeup state. The methods also include associating the instruction with a ticket identifier (ID) that corresponds with a second memory sub-group of the memory group, and moving the instruction to the wakeup state based on the second memory sub-group of the memory group being moved into the cache.
    Type: Grant
    Filed: February 11, 2019
    Date of Patent: October 19, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Taylor J. Pritchard, Jonathan Hsieh, Michael Cadigan, Jr.
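    The sleep/wakeup flow above can be captured as a small table keyed by ticket ID: an instruction whose data is not yet in the fixed-latency cache sleeps under a ticket tied to a memory sub-group, and becomes re-issuable only when that sub-group is moved into the cache. A hedged Python sketch, simplified to a single sub-group per instruction and with invented names:

```python
class SleepWakeupQueue:
    def __init__(self):
        self.sleeping = {}   # ticket ID (memory sub-group) -> instructions asleep

    def issue(self, instr, subgroup, cache):
        if subgroup in cache:
            print(f"{instr}: data present, executes")
        else:
            # Put the instruction to sleep under the ticket for this sub-group.
            self.sleeping.setdefault(subgroup, []).append(instr)
            print(f"{instr}: sleeps on ticket {subgroup}")

    def subgroup_filled(self, subgroup, cache):
        cache.add(subgroup)
        for instr in self.sleeping.pop(subgroup, []):
            print(f"{instr}: wakes up, ready to reissue")

cache = {"subgroup-A"}
q = SleepWakeupQueue()
q.issue("load r5", "subgroup-A", cache)   # hits, executes
q.issue("load r6", "subgroup-B", cache)   # misses, sleeps on ticket subgroup-B
q.subgroup_filled("subgroup-B", cache)    # the fill moves subgroup-B into the cache
```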
  • Patent number: 11150907
    Abstract: An execution unit circuit for use in a processor core provides efficient use of area and energy by reducing the per-entry storage requirement of a load-store unit issue queue. The execution unit circuit includes a recirculation queue that stores the effective address of the load and store operations and the values to be stored by the store operations. A queue control logic controls the recirculation queue and issue queue so that, after the effective address of a load or store operation has been computed, the effective address of the load operation or the store operation is written to the recirculation queue and the operation is removed from the issue queue, so that address operands and other values that were in the issue queue entry no longer require storage. When a load or store operation is rejected by the cache unit, it is subsequently reissued from the recirculation queue.
    Type: Grant
    Filed: July 30, 2018
    Date of Patent: October 19, 2021
    Assignee: International Business Machines Corporation
    Inventors: Salma Ayub, Sundeep Chadha, Robert Allen Cordes, David Allen Hrusecky, Hung Qui Le, Dung Quoc Nguyen, Brian William Thompto
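    The space saving described above comes from moving an operation out of the issue queue once its effective address is computed, keeping only the address (and any store data) in a cheaper recirculation queue for possible reissue. A minimal Python sketch of that flow; the entry fields and function names are illustrative:

```python
from collections import deque

issue_queue = deque()          # full entries: opcode, register operands, etc.
recirculation_queue = deque()  # slim entries: opcode, effective address, store data

def dispatch(op, base, offset, data=None):
    issue_queue.append({"op": op, "base": base, "offset": offset, "data": data})

def issue_oldest(cache_accepts):
    entry = issue_queue.popleft()
    ea = entry["base"] + entry["offset"]          # address generation
    # The bulky issue-queue entry is released; only the EA (and store data)
    # is kept around in case the cache rejects the operation.
    slim = {"op": entry["op"], "ea": ea, "data": entry["data"]}
    if cache_accepts:
        print(f"{slim['op']} @ {ea:#x} accepted by cache")
    else:
        recirculation_queue.append(slim)
        print(f"{slim['op']} @ {ea:#x} rejected; queued for reissue")

dispatch("load", 0x1000, 0x20)
dispatch("store", 0x2000, 0x08, data=99)
issue_oldest(cache_accepts=True)
issue_oldest(cache_accepts=False)
print("recirculation queue:", list(recirculation_queue))
```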
  • Patent number: 11113055
    Abstract: A computer implemented method for marking a store instruction overlap in a processor pipeline is provided. A non-limiting example of the method includes detecting a second store instruction subsequent to a first store instruction in an instruction stream, in which there is a match between the operand address information of the first store instruction and a load instruction. The operand address information of the first store instruction is compared with the operand address information of the second store instruction to determine whether there is a match. In the event of a match, the second store instruction is delayed in the processor pipeline in response to determining that there is a memory image overlap between the operand address information of the second store instruction and the first store instruction.
    Type: Grant
    Filed: March 19, 2019
    Date of Patent: September 7, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Edward Malley, Jang-Soo Lee, Anthony Saporito, Chung-Lung K. Shum, Gregory William Alexander
  • Patent number: 11106431
    Abstract: A computing device to implement a fast floating-point adder tree for neural network applications is disclosed. The fast floating-point adder tree comprises a data preparation module, a fast fixed-point Carry-Save Adder (CSA) tree, and a normalization module. The floating-point input data comprises a sign bit, an exponent part and a fraction part. The data preparation module aligns the fraction part of the input data and prepares the input data for subsequent processing. The fast adder uses a signed fixed-point CSA tree to quickly add a large number of fixed-point data into 2 output values and then uses a normal adder to add the 2 output values into one output value. The fast adder for a large number of operands is based on multiple levels of fast adders for a small number of operands. The output from the signed fixed-point Carry-Save Adder tree is converted to a selected floating-point format.
    Type: Grant
    Filed: February 22, 2020
    Date of Patent: August 31, 2021
    Assignee: DINOPLUSAI HOLDINGS LIMITED
    Inventors: Yutian Feng, Yujie Hu
  • Patent number: 11086526
    Abstract: The present disclosure provides techniques for implementing a computing system that includes a processing sub-system, a memory sub-system, and one or more memory controllers. The processing sub-system includes processing circuitry that performs an operation based on a target data block, and a processor-side cache coupled between the processing circuitry and a system bus. The memory sub-system includes a memory that stores data blocks in a memory array and a memory-side cache coupled between the memory channel and the system bus. The one or more memory controllers control caching in the processor-side cache based at least in part on a temporal relationship between previous data block targeting by the processing circuitry, and control caching in the memory-side cache based at least in part on a spatial relationship between data block storage locations in the memory channel.
    Type: Grant
    Filed: August 2, 2018
    Date of Patent: August 10, 2021
    Assignee: Micron Technology, Inc.
    Inventors: Richard C. Murphy, Anton Korzh, Stephen S. Pawlowski
  • Patent number: 11080060
    Abstract: Managing application execution by receiving a store instruction, including a store instruction itag and store instruction address, creating a hash of the store instruction address, receiving a load instruction and matching a hash of a store instruction address associated with the load instruction with the hash of the store instruction address associated with the store instruction. The store instruction itag is sent to an instruction sequencing unit (ISU). The ISU delays execution of the load instruction according to the received itag.
    Type: Grant
    Filed: April 23, 2019
    Date of Patent: August 3, 2021
    Assignee: International Business Machines Corporation
    Inventors: Ehsan Fatehi, Brian W. Thompto, John B. Griswell, Jr.
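    The mechanism above boils down to: hash the store's address, remember the store's itag under that hash, and when a later load hashes to the same value, hand that store itag to the ISU so the load can be delayed. A sketch with an illustrative hash function and invented structure names:

```python
store_hashes = {}   # address hash -> itag of the youngest matching store

def addr_hash(addr):
    return (addr >> 3) & 0xFF        # illustrative hash, not the patent's

def observe_store(itag, addr):
    store_hashes[addr_hash(addr)] = itag

def observe_load(load_itag, addr):
    store_itag = store_hashes.get(addr_hash(addr))
    if store_itag is not None:
        # The ISU would delay this load until the matching store has executed.
        print(f"load itag {load_itag}: delay until store itag {store_itag} completes")
    else:
        print(f"load itag {load_itag}: no matching store hash, issue normally")

observe_store(itag=7, addr=0x4008)
observe_load(load_itag=12, addr=0x4008)   # same hash -> delayed behind store 7
observe_load(load_itag=13, addr=0x9000)   # different hash -> proceeds
```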
  • Patent number: 11048579
    Abstract: In one embodiment, the present invention includes a method for receiving incoming data in a processor and performing a checksum operation on the incoming data in the processor pursuant to a user-level instruction for the checksum operation. For example, a cyclic redundancy checksum may be computed in the processor itself responsive to the user-level instruction. Other embodiments are described and claimed.
    Type: Grant
    Filed: August 12, 2019
    Date of Patent: June 29, 2021
    Assignee: Intel Corporation
    Inventors: Steven R. King, Frank L. Berry, Michael E. Kounavis
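    The user-level checksum instruction above computes a cyclic redundancy checksum directly in the processor (this is the family behind the x86 CRC32 instruction). As a software reference of the same kind of computation, the sketch below uses Python's standard library; note that the hardware instruction uses the CRC-32C polynomial while zlib.crc32 uses plain CRC-32, so this illustrates the operation rather than reproducing the exact instruction result:

```python
import zlib

data = b"incoming packet payload"

# What a user-level CRC instruction does in hardware, done here in software:
# a CRC-32 checksum over the incoming data.
checksum = zlib.crc32(data)
print(f"CRC-32 of {len(data)} bytes: {checksum:#010x}")

# Incremental form: the running checksum is fed back in, the way a CRC
# instruction would be applied chunk by chunk as data arrives.
running = 0
for chunk in (data[:8], data[8:16], data[16:]):
    running = zlib.crc32(chunk, running)
assert running == checksum
```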
  • Patent number: 11023241
    Abstract: Systems and methods selectively bypass address-generation hardware in processor instruction pipelines. In an embodiment, a processor includes an address-generation stage and an address-generation-bypass-determination unit (ABDU). The ABDU receives a load/store instruction. If an effective address for the load/store instruction is not known at the ABDU, the ABDU routes the load/store instruction via the address-generation stage of the processor. If, however, the effective address of the load/store instruction is known at the ABDU, the ABDU routes the load/store instruction to bypass the address-generation stage of the processor.
    Type: Grant
    Filed: August 21, 2018
    Date of Patent: June 1, 2021
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Andrej Kocev, Jay Fleischman, Kai Troester, Johnny C. Chu, Tim J. Wilkens, Neil Marketkar, Michael W. Long
  • Patent number: 11010164
    Abstract: Predicting a Table of Contents (TOC) pointer value responsive to branching to a subroutine. A subroutine is called from a calling module executing on a processor. Based on calling the subroutine, a value of a pointer to a reference data structure, such as a TOC, is predicted. The predicting is performed prior to executing a sequence of one or more instructions in the subroutine to compute the value. The value that is predicted is used to access the reference data structure to obtain a variable value for a variable of the subroutine.
    Type: Grant
    Filed: October 2, 2019
    Date of Patent: May 18, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Michael K. Gschwind, Valentina Salapura
  • Patent number: 10969995
    Abstract: Systems and methods are disclosed for monitoring processor performance. Embodiments described herein relate to differentiating function performance by input parameters. In one embodiment, a method includes configuring a counter contained in a processor to count occurrences of an event in the processor and to overflow upon the count of occurrences reaching a specified value, configuring a precise event based sampling (PEBS) handler circuit to generate and store a PEBS record into a PEBS memory buffer after at least one overflow, the PEBS record containing at least one stack entry read from a stack after the at least one overflow, enabling the PEBS handler circuit to generate and store the PEBS record after the at least one overflow, generating and storing the PEBS record into the PEBS memory buffer after the at least one overflow, and storing contents of the PEBS memory buffer to a PEBS trace file in a memory.
    Type: Grant
    Filed: October 16, 2018
    Date of Patent: April 6, 2021
    Assignee: Intel Corporation
    Inventors: Ahmad Yasin, Stanislav Bratanov
  • Patent number: 10963389
    Abstract: An apparatus to facilitate data prefetching is disclosed. The apparatus includes a cache, one or more execution units (EUs) to execute program code, prefetch logic to maintain tracking information of memory instructions in the program code that trigger a cache miss and compiler logic to receive the tracking information, insert one or more pre-fetch instructions in updated program code to prefetch data from a memory for execution of one or more of the memory instructions that triggered a cache miss and download the updated program code for execution by the one or more EUs.
    Type: Grant
    Filed: February 11, 2020
    Date of Patent: March 30, 2021
    Assignee: Intel Corporation
    Inventors: Vasileios Porpodas, Guei-Yuan Lueh, Subramaniam Maiyuran, Wei-Yu Chen
  • Patent number: 10949205
    Abstract: A computer system includes a dispatch routing network to dispatch a plurality of instructions, and a processor in signal communication with the dispatch routing network. The processor determines a move instruction from the plurality of instructions to move data produced by an older second instruction, and copies a slice target file (STF) tag from a source register of the move instruction to a destination register of the move instruction without physically copying data in a slice target register and without assigning a new STF tag destination to the move instruction.
    Type: Grant
    Filed: December 20, 2018
    Date of Patent: March 16, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Joshua Bowman, Dung Q. Nguyen, Hung Le, Brian Thompto, Maureen A. Delaney, Cliff Kucharski, Steven J Battle
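    The move elimination described above avoids touching the data at all: the rename map simply points the move's destination register at the same STF tag as its source. A minimal Python sketch of a rename map doing this; names are invented for illustration:

```python
rename_map = {}      # architectural register -> STF tag (physical location)
next_tag = 0

def rename_producer(dest_reg):
    """An ordinary producing instruction gets a fresh physical tag."""
    global next_tag
    rename_map[dest_reg] = next_tag
    next_tag += 1

def rename_move(dest_reg, src_reg):
    """A register-to-register move: copy the source's STF tag to the
    destination instead of allocating a new tag or moving any data."""
    rename_map[dest_reg] = rename_map[src_reg]

rename_producer("r1")        # r1 -> tag 0 (data produced by an older instruction)
rename_move("r2", "r1")      # r2 now shares tag 0; no data is physically copied
print(rename_map)            # {'r1': 0, 'r2': 0}
```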
  • Patent number: 10922159
    Abstract: A method for performing a data dump includes detecting an error in a segmented application having an address space and a buffer. In response to detecting the error, the method quiesces the address space and copies content of the address space to another location while the address space is quiesced. The method reactivates the address space after the content of the address space is completely copied. The method suspends write access to the buffer and copies content of the buffer to another location while write access to the buffer is suspended. While write access to the buffer is suspended, the method redirects writes intended for the buffer to a temporary storage area, and directs reads intended for the buffer to one of the buffer and the temporary storage area, depending on where valid data is stored. A corresponding system and computer program product are also disclosed.
    Type: Grant
    Filed: April 16, 2019
    Date of Patent: February 16, 2021
    Assignee: International Business Machines Corporation
    Inventors: Thomas C. Reed, David C. Reed
  • Patent number: 10838733
    Abstract: A load request to restore a plurality of architected registers is obtained. Based on obtaining the load request, one or more architected registers of the plurality of architected registers are restored. The restoring uses a snapshot that maps architected registers to physical registers to replace one or more physical registers currently assigned to the one or more architected registers with one or more physical registers of the snapshot corresponding to the one or more architected registers.
    Type: Grant
    Filed: April 18, 2017
    Date of Patent: November 17, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Michael K. Gschwind, Valentina Salapura, Chung-Lung K. Shum, Timothy J. Slegel
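    The bulk restore above swaps mappings rather than data: the physical registers currently assigned to the architected registers are replaced with the physical registers recorded in a snapshot. A minimal sketch of that substitution; register names and numbering are illustrative:

```python
def restore_from_snapshot(rename_map, snapshot, arch_regs):
    """Replace the physical registers currently assigned to `arch_regs`
    with the physical registers recorded in `snapshot` (a saved
    architected-to-physical mapping), without copying any data values."""
    for reg in arch_regs:
        rename_map[reg] = snapshot[reg]
    return rename_map

current  = {"r1": 40, "r2": 41, "r3": 42}    # arch reg -> current physical reg
snapshot = {"r1": 10, "r2": 11, "r3": 12}    # mapping captured earlier
print(restore_from_snapshot(current, snapshot, ["r1", "r3"]))
# {'r1': 10, 'r2': 41, 'r3': 12}
```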
  • Patent number: 10838725
    Abstract: Techniques are disclosed relating to fetching items from a compute command stream that includes compute kernels. In some embodiments, stream fetch circuitry sequentially pre-fetches items from the stream and stores them in a buffer. In some embodiments, fetch parse circuitry iterates through items in the buffer using a fetch parse pointer to detect indirect-data-access items and/or redirect items in the buffer. The fetch parse circuitry may send detected indirect data accesses to indirect-fetch circuitry, which may buffer requests. In some embodiments, execute parse circuitry iterates through items in the buffer using an execute parse pointer (e.g., which may trail the fetch parse pointer) and outputs both item data from the buffer and indirect-fetch results from indirect-fetch circuitry for execution. In various embodiments, the disclosed techniques may reduce fetch latency for compute kernels.
    Type: Grant
    Filed: September 26, 2018
    Date of Patent: November 17, 2020
    Assignee: Apple Inc.
    Inventors: Andrew M. Havlir, Jeffrey T. Brady
  • Patent number: 10824430
    Abstract: Managing program instruction execution by receiving a first OSC (operand store compare) instruction, the first OSC instruction comprising a first itag and a first instruction address and creating a first OSC table entry according to the first itag and first instruction address. Further, receiving a second OSC instruction, the second OSC instruction comprising a second itag and a second instruction address and creating a second OSC table entry according to the second itag and an itag delta between the first itag and the second itag, then appending the second OSC table entry according to an itag delta between the second itag and a third itag, and providing an itag delta from the second OSC table entry to an instruction sequencing unit (ISU).
    Type: Grant
    Filed: April 25, 2019
    Date of Patent: November 3, 2020
    Assignee: International Business Machines Corporation
    Inventors: Ehsan Fatehi, Brian W. Thompto
  • Patent number: 10776897
    Abstract: Embodiments described herein provide an apparatus comprising a processor to configure a plurality of contexts of a command engine to execute a graphics workload comprising a plurality of walkers, allocate, from a pool of execution units of a graphics processor, a subset of execution units to each walker in the plurality of walkers based at least in part on the predetermined number of walkers configured for the context, for each context in the plurality of contexts, dispatch one or more walkers of the plurality of walkers to the execution units, and upon dispatch of the one or more walkers of the plurality of walkers, write an opcode to a computer-readable memory indicating that the dispatch of the walker is complete, wherein the opcode comprises dependency data for the one or more walkers of the plurality of walkers. Other embodiments may be described and claimed.
    Type: Grant
    Filed: March 8, 2019
    Date of Patent: September 15, 2020
    Assignee: INTEL CORPORATION
    Inventors: James Valerio, Vasanth Ranganathan, Joydeep Ray, Abhishek R. Appu, Ben J. Ashbaugh, Brandon Fliflet, Jeffery S. Boles, Srinivasan Embar Raghukrishnan, Rahul Kulkarni
  • Patent number: 10776125
    Abstract: In an embodiment, at least one CPU processor and at least one coprocessor are included in a system. The CPU processor may issue operations to the coprocessor to perform, including load/store operations. The CPU processor may generate the addresses that are accessed by the coprocessor load/store operations, as well as executing its own CPU load/store operations. The CPU processor may include a memory ordering table configured to track at least one memory region within which there are outstanding coprocessor load/store memory operations that have not yet completed. The CPU processor may delay CPU load/store operations until the outstanding coprocessor load/store operations are complete. In this fashion, the proper ordering of CPU load/store operations and coprocessor load/store operations may be maintained.
    Type: Grant
    Filed: December 5, 2018
    Date of Patent: September 15, 2020
    Assignee: Apple Inc.
    Inventors: Aditya Kesiraju, Brett S. Feero, Nikhil Gupta
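    The ordering rule in the abstract above is region-based: while any coprocessor load/store to a memory region is outstanding, CPU loads/stores to that region are held back. A toy Python sketch of such a memory ordering table; the region granularity and function names are assumptions:

```python
REGION_BITS = 12   # assume 4 KiB regions for this sketch

outstanding = {}   # region -> count of incomplete coprocessor load/store ops

def region_of(addr):
    return addr >> REGION_BITS

def coproc_op_issued(addr):
    outstanding[region_of(addr)] = outstanding.get(region_of(addr), 0) + 1

def coproc_op_completed(addr):
    r = region_of(addr)
    outstanding[r] -= 1
    if outstanding[r] == 0:
        del outstanding[r]

def cpu_mem_op(addr):
    if outstanding.get(region_of(addr), 0):
        print(f"CPU op to {addr:#x} delayed: coprocessor ops outstanding in region")
    else:
        print(f"CPU op to {addr:#x} proceeds")

coproc_op_issued(0x1234)
cpu_mem_op(0x1FF0)       # same 4 KiB region -> delayed
cpu_mem_op(0x5000)       # different region -> proceeds
coproc_op_completed(0x1234)
cpu_mem_op(0x1FF0)       # now proceeds
```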
  • Patent number: 10762458
    Abstract: Systems and methods are provided for scheduling objects having pair-wise and cumulative constraints. The systems and methods presented can utilize a directed acyclic graph to increase or maximize a utilization function. The objects can comprise satellites in a constellation of satellites. In some implementations, the satellites are imaging satellites, and the systems and methods for scheduling can use human collaboration to determine events of interest for acquisition of images. In some implementations, dominant edges are removed from the directed acyclic graph. In some implementations, dynamic weights are assigned to nodes associated with downlink events in the directed acyclic graph.
    Type: Grant
    Filed: October 24, 2014
    Date of Patent: September 1, 2020
    Assignee: Planet Labs, Inc.
    Inventor: Sean Augenstein
  • Patent number: 10761594
    Abstract: In an embodiment, a processor includes a first core and a power management agent (PMA), coupled to the first core, to include a static table that stores a list of operations, and a plurality of columns each to specify a corresponding flow that includes a corresponding subset of the operations. Execution of each flow is associated with a corresponding state of the first core. The PMA includes a control register (CR) that includes a plurality of storage elements to receive one of a first value and a second value. The processor includes execution logic, responsive to a command to place the first core into a first state, to execute an operation of a first flow when a corresponding storage element stores the first value and to refrain from execution of an operation of the first flow when the corresponding element stores the second value. Other embodiments are described and claimed.
    Type: Grant
    Filed: June 15, 2017
    Date of Patent: September 1, 2020
    Assignee: Intel Corporation
    Inventors: Israel Diamand, Asaf Rubinstein, Arik Gihon, Tal Kuzi, Tomer Ziv, Nadav Shulman
  • Patent number: 10761854
    Abstract: Preventing hazard flushes in an instruction sequencing unit of a multi-slice processor including receiving a load instruction in a load reorder queue, wherein the load instruction is an instruction to load data from a memory location; subsequent to receiving the load instruction, receiving a store instruction in a store reorder queue, wherein the store instruction is an instruction to store data in the memory location; determining that the store instruction causes a hazard against the load instruction; preventing a flush of the load reorder queue based on a state of the load instruction; and re-executing the load instruction.
    Type: Grant
    Filed: April 19, 2016
    Date of Patent: September 1, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Robert A. Cordes, David A. Hrusecky, Elizabeth A. McGlone
  • Patent number: 10747539
    Abstract: Systems, apparatuses, and methods for instruction next fetch prediction. A scan-on-fill target predictor in a processor generates a predicted next fetch address for the instruction fetch unit. When a group of instructions is used to fill an instruction cache but is not currently being retrieved from the instruction cache for processing by other pipeline stages, the group of instructions are scanned to identify exit points of basic blocks within the group. An entry of a table in the scan-on-fill target predictor is allocated for an instruction in a basic block in the group when the basic block has an exit point with a target address that can be resolved within a single clock cycle. The scan-on-fill target predictor may perform a lookup of the table with the current fetch address. The prediction may be compared to a main branch predictor at a later pipeline stage for training purposes.
    Type: Grant
    Filed: November 14, 2016
    Date of Patent: August 18, 2020
    Assignee: Apple Inc.
    Inventors: James Robert Howard Hakewill, Constantin Pistol
  • Patent number: 10740269
    Abstract: Arbitration circuitry is provided for allocating up to M resources to N requesters, where M ≥ 2. The arbitration circuitry comprises group allocation circuitry to control a group allocation in which the N requesters are allocated to M groups of requesters, with each requester allocated to one of the groups; and M arbiters each corresponding to a respective one of the M groups. Each arbiter selects a winning requester from the corresponding group, which is to be allocated a corresponding resource of the M resources. In response to a given requester being selected as the winning requester by the arbiter for a given group, the group allocation is changed so that in a subsequent arbitration cycle the given requester is in a different group to the given group.
    Type: Grant
    Filed: July 17, 2018
    Date of Patent: August 11, 2020
    Assignee: ARM Limited
    Inventor: Andrew David Tune
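    The scheme above splits N requesters into M groups, runs one arbiter per group, and moves each winning requester to a different group for the next cycle so no requester camps on one resource. A small Python sketch; the per-group arbiter policy (random choice) and the "move to the next group" rule are illustrative choices, not the patent's exact policy:

```python
import random

def arbitrate(groups, requesting):
    """One arbitration cycle: each of the M groups has its own arbiter that
    picks one requesting member; each winner is then moved to another group."""
    winners = []
    for gid, members in enumerate(groups):
        candidates = [r for r in members if r in requesting]
        if candidates:
            winner = random.choice(candidates)   # illustrative per-group arbiter
            winners.append((gid, winner))
    # Group re-allocation: each winner migrates to a different group.
    for gid, winner in winners:
        groups[gid].remove(winner)
        groups[(gid + 1) % len(groups)].append(winner)
    return winners

groups = [["A", "B"], ["C", "D"], ["E", "F"]]      # N=6 requesters, M=3 resources
print(arbitrate(groups, requesting={"A", "C", "D", "F"}))
print(groups)   # winners have rotated into neighbouring groups
```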
  • Patent number: 10740107
    Abstract: Operation of a multi-slice processor that includes a plurality of execution slices and an instruction sequencing unit. Operation of such a multi-slice processor includes: receiving, at the instruction sequencing unit, a load instruction indicating load address data and a load data length; determining a previous store instruction in an issue queue such that store address data for the previous store instruction corresponds to the load address data, wherein the previous store instruction corresponds to a store data length; and generating, in dependence upon the store data length matching the load data length, an indication in the issue queue that indicates a dependency between the load instruction and the previous store instruction.
    Type: Grant
    Filed: June 1, 2016
    Date of Patent: August 11, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Salma Ayub, Joshua W. Bowman, Jeffrey C. Brownscheidle, Kurt A. Feiste, Dung Q. Nguyen, Salim A. Shah, Brian W. Thompto
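    The rule in the entry above marks a load dependent on an earlier store in the issue queue only when the store's address data corresponds to the load's and the data lengths match. A compact Python sketch of that check; the entry fields are illustrative, not the patent's:

```python
def mark_dependencies(issue_queue, load):
    """Scan older store entries; record a dependency only when the store's
    address data matches the load's and the data lengths are equal."""
    deps = []
    for entry in issue_queue:
        if (entry["kind"] == "store"
                and entry["addr"] == load["addr"]
                and entry["length"] == load["length"]):
            deps.append(entry["id"])
    return deps

queue = [
    {"id": 1, "kind": "store", "addr": 0x100, "length": 8},
    {"id": 2, "kind": "store", "addr": 0x200, "length": 4},
]
load = {"id": 3, "kind": "load", "addr": 0x100, "length": 8}
print(mark_dependencies(queue, load))   # [1] -> the load waits on store 1
```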
  • Patent number: 10684834
    Abstract: Embodiments of the present invention disclose a method and an apparatus for detecting inter-instruction data dependency. The method comprises: comparing a thread number corresponding to a historical access operation with a thread number corresponding to a write access operation, if the thread number corresponding to the write access operation is less than the thread number corresponding to the historical access operation, which indicates existence of data dependency for a to-be-detected instruction, terminating the detection; or comparing a thread number corresponding to a historical write access operation with a thread number corresponding to a read access operation, if the thread number corresponding to the read access operation is less than the thread number corresponding to the historical write access operation, which indicates existence of data dependency for the to-be-detected instruction, terminating the detection.
    Type: Grant
    Filed: March 21, 2019
    Date of Patent: June 16, 2020
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Hongyuan Liu, Cho-Li Wang, KingTin Lam, Huanxin Lin, Bin Zhang, Junchao Ma
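    The detection rule above compares thread numbers: a write whose thread number is lower than that of a recorded historical access, or a read whose thread number is lower than that of a recorded historical write, signals a data dependency for the instruction under test and the detection terminates. A hedged Python sketch of that comparison; the data representation is invented for illustration:

```python
def check_dependency(history, access):
    """`history` and `access` are (kind, thread_number) pairs, kind being
    'read' or 'write'. Returns True if the access reveals a data dependency
    for the instruction being tested, per the comparison rule above."""
    hist_kind, hist_tid = history
    kind, tid = access
    if kind == "write" and tid < hist_tid:
        return True                       # a higher-numbered thread already accessed
    if kind == "read" and hist_kind == "write" and tid < hist_tid:
        return True                       # a higher-numbered thread already wrote
    return False

print(check_dependency(("read", 5),  ("write", 3)))   # True  -> terminate detection
print(check_dependency(("write", 2), ("read", 4)))    # False -> keep scanning
print(check_dependency(("write", 6), ("read", 1)))    # True  -> terminate detection
```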
  • Patent number: 10684859
    Abstract: Providing memory dependence prediction in block-atomic dataflow architectures is provided, in one aspect, by a memory dependence prediction circuit. The memory dependence prediction circuit comprises a predictor table configured to store multiple predictor table entries, each comprising a store instruction identifier, a block reach set, and a load set. Using this data, the memory dependence prediction circuit determines, upon a fetch of an instruction block by an execution pipeline, whether the instruction block contains store instructions that reach dependent load instructions. If so, the store instructions are marked as having dependent load instructions to wake. In some aspects, the memory dependence prediction circuit is configured to determine whether the instruction block contains dependent load instructions reached by store instructions. If so, the memory dependence prediction circuit delays execution of the dependent load instructions.
    Type: Grant
    Filed: September 19, 2016
    Date of Patent: June 16, 2020
    Assignee: QUALCOMM Incorporated
    Inventors: Chen-Han Ho, Gregory Michael Wright