Prefetching Patents (Class 712/207)
  • Patent number: 11080052
    Abstract: Effectiveness of prefetch instructions is determined. A prefetch instruction is executed to request that data be fetched into a cache of the computing environment. The effectiveness of the prefetch instruction is determined. This includes updating, based on executing the prefetch instruction, a cache directory of the cache. The updating includes, in the cache directory, effectiveness data relating to the data. The effectiveness data includes whether the data was installed in the cache based on the prefetch instruction. Additionally, the determining the effectiveness includes obtaining at least a portion of the effectiveness data from the cache directory, and using the at least a portion of effectiveness data to determine the effectiveness of the prefetch instruction.
    Type: Grant
    Filed: October 30, 2019
    Date of Patent: August 3, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Michael K. Gschwind, Christian Jacobi, Anthony Saporito, Chung-Lung K. Shum, Timothy J. Slegel
  • Patent number: 11068246
    Abstract: A method and systems generate a control flow graph including an edge of the control flow graph from a branch instruction to a target address of the branch instruction in an abstract interpretation for an assignment instruction to a branch target variable of a program. The program allocates a particular branch target variable to a branch instruction having a plurality of branch targets. The branch target address is loaded from the branch target variable upon branching, a branch address of a branch instruction having one branch target as well as the address assigned by the assignment instruction to the branch target variable being determined as certain constant values determined by compiling the program. The target address assigned by the assignment instruction is added to an object of the abstract interpretation. A current abstract interpretation is terminated if the abstract interpretation reaches an instruction already subjected to the abstract interpretation.
    Type: Grant
    Filed: January 17, 2018
    Date of Patent: July 20, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Reid T. Copeland, Toshihiko Koju
  • Patent number: 11055224
    Abstract: An area for prefetching is determined while accommodating an increase in a block address space. A prediction model predicts prefetch addresses for each of bit ranges into which block addresses are split by using a plurality of neural networks assuming charge of the different bit ranges having performed machine learning on I/O trace data, a prediction accuracy determination section determines a size of an area for prefetching on the basis of addresses in the bit range for which prediction accuracy in prefetch is lower than a predetermined value, a predicted value determination section determines addresses of the area for prefetching on the basis of addresses in the bit range for which the prediction accuracy in the prefetch is equal to or higher than the predetermined value, and a prefetch issuance section caches data in the area for prefetching in a storage class memory from a NAND flash memory.
    Type: Grant
    Filed: March 6, 2019
    Date of Patent: July 6, 2021
    Assignee: HITACHI, LTD.
    Inventors: Yuji Saeki, Takashige Baba
  • Patent number: 11010168
    Abstract: A method, system, and computer program product are provided for prioritizing prefetch instructions. The method includes a processor issuing a prefetch instruction and fetching elements from a cache that can include a memory or a higher level cache. The processor stores the elements in temporary storage and monitors for accesses by an instruction. The processor stores a record representing the prefetch instruction. The processor updates the record with an indicator and issues a new prefetch instruction by comparing the new prefetch instruction to the record, based on the new prefetch instruction matching the prefetch instruction, assigning the indicator to the new prefetch instruction as a priority value, based on the new prefetch instruction not matching the prefetch instruction, assigning a default value to the new prefetch instruction as the priority value, and determining whether to execute the new prefetch instruction, based on the priority value of the new prefetch instruction.
    Type: Grant
    Filed: June 18, 2019
    Date of Patent: May 18, 2021
    Assignee: International Business Machines Corporation
    Inventors: Michael K. Gschwind, Christian Jacobi, Anthony Saporito, Chung-Lung K. Shum
  • Patent number: 11010162
    Abstract: An electronic device can execute instructions, comprising: a processing circuit; a first storage device, coupled to the processing circuit, configured to store at least one instruction and first operation data; and a second storage device, coupled to the processing circuit. The processing circuit reads at least one of the instruction and the first operation data corresponding to the read instruction from the first storage device, and the second storage device does not store the first operation data corresponding to the read instruction, the processing circuit backs up the read first operation data to the second storage device.
    Type: Grant
    Filed: September 17, 2019
    Date of Patent: May 18, 2021
    Assignee: Realtek Semiconductor Corp.
    Inventor: Yen-Ju Lu
  • Patent number: 11003581
    Abstract: An arithmetic processing device includes circuitry configured to add an identifier of a request source that generates a prefetch request into the prefetch request, and output, in response to detecting a certain number of cache hits less than a first threshold, each of the cache hits occurring in a first cache memory provided at a lower hierarchical level than a second cache memory by each prefetch request into which a first identifier is added, a notification for suppressing a prefetch request issued for the lower hierarchical level of the first cache memory from a first request source identified by the first identifier.
    Type: Grant
    Filed: June 26, 2019
    Date of Patent: May 11, 2021
    Assignee: FUJITSU LIMITED
    Inventors: Masakazu Tanomoto, Hideki Okawara
  • Patent number: 10916280
    Abstract: Systems and methods for securely sharing a memory between an Embedded Controller (EC) and a Platform Controller Hub (PCH). In some embodiments, an IHS may include: a chipset; a flash device coupled to the chipset; and an EC coupled to the flash device via a first bus and to the chipset via a second bus, wherein the EC comprises a Read-Only Memory (ROM) portion and a Random Access Memory (RAM) portion, the EC configured to: retrieve EC firmware from the flash device via the first bus; store the retrieved EC firmware in the RAM portion; and prior to the execution of any instruction stored in the RAM portion, relinquish access to the flash device via the first bus.
    Type: Grant
    Filed: March 15, 2018
    Date of Patent: February 9, 2021
    Assignee: Dell Products, L.P.
    Inventors: Adolfo S. Montero, Mark Tracy Ellis
  • Patent number: 10884749
    Abstract: Aspects of the present disclosure relate to control of speculative demand loads. In some embodiments, the method includes receiving instructions for a branch in a program, detecting the branch load is in the cache, monitoring a number of completed loads for the program, determining a cache pollution ratio of executed loads to completed loads, providing the cache pollution ratio to a branch prediction unit, and altering load instructions for the branch based on the cache pollution ratio.
    Type: Grant
    Filed: March 26, 2019
    Date of Patent: January 5, 2021
    Assignee: International Business Machines Corporation
    Inventors: Satish Kumar Sadasivam, Puneeth A. H. Bhat, Shruti Saxena, Sangram Alapati
  • Patent number: 10887223
    Abstract: A provider edge device, capable of accessing a first type of memory and a second type of memory, may determine a network address associated with a customer edge device. The provider edge device may determine whether the customer edge device is categorized as a leaf device in an Ethernet Tree service provided by the provider edge device. The provider edge device may selectively store the network address in the first type of memory or the second type of memory based on determining whether the customer edge device is categorized as a leaf device in the Ethernet Tree service.
    Type: Grant
    Filed: September 14, 2018
    Date of Patent: January 5, 2021
    Assignee: Juniper Networks, Inc.
    Inventors: Manoj Sharma, Poorna Pushkala Balasubramanian, Nitin Singh, Xiaomin Wu
  • Patent number: 10877864
    Abstract: A processor memory is stress tested with a variable link stack depth using test code segments and link stack test segments on non-naturally aligned data boundaries. Link stack test segments are interspersed into test code segments of a processor memory test to change the link stack depth without changing results of the test code. The link stack test segments include branch to target, push/pop, push and pop segments. The depth of the link stack is varied independent of the memory test code by changing the number to branches in the branch to target segment and varying the number of the push/pop segments. The link stack test segments and test segments may be placed randomly with a recursive algorithm to intersperse the link stack test segments in the test code segments and to reduce the amount of data to be saved and restored for all subroutine calls, push and pop segments.
    Type: Grant
    Filed: December 20, 2018
    Date of Patent: December 29, 2020
    Assignee: International Business Machines Corporation
    Inventors: Shakti Kapoor, Manoj Dusanapudi
  • Patent number: 10838871
    Abstract: Aspects of the disclosure relate to a hardware processor architecture. The hardware processor architecture may include a processor cache to manage a set of instructions. The hardware processor architecture may include a hint cache to manage a set of hints associated with the set of instructions. Disclosed aspects relate to establishing a hint cache which has a set of hints associated with the set of instructions. The hint cache may be established with respect to the processor cache which has a set of instructions. The set of instructions may be accessed from the processor cache. From the hint cache, the set of hints associated with the set of instructions may be communicated. The set of instructions may be processed by the hardware processor using the set of hints associated with the set of instructions. In embodiments, static or dynamic hint cache bits can be utilized.
    Type: Grant
    Filed: November 7, 2016
    Date of Patent: November 17, 2020
    Assignee: International Business Machines Corporation
    Inventor: Satish K. Sadasivam
  • Patent number: 10769070
    Abstract: Apparatuses and methods for prefetch generation are disclosed. Prefetching circuitry receives addresses specified by load instructions and can cause retrieval of a data value from an address before that address is received. Stride determination circuitry determines stride values as a difference between a current address and a previously received address. Plural stride values corresponding to a sequence of received addresses are determined. Multiple stride storage circuitry stores the plurality of stride values determined by the stride determination circuitry. New address comparison circuitry determines whether a current address corresponds to a matching stride value based on the plurality of stride values stored in the multiple stride storage circuitry. Prefetch initiation circuitry can causes a data value to be retrieved from a further address, wherein the further address is the current address modified by the matching stride value of the plurality of stride values.
    Type: Grant
    Filed: September 25, 2018
    Date of Patent: September 8, 2020
    Assignee: Arm Limited
    Inventors: Joseph Michael Pusdesris, Miles Robert Dooley, Alexander Cole Shulyak, Krishnendra Nathella, Dam Sunwoo
  • Patent number: 10754655
    Abstract: A processing device includes a branch IP table and branch predication circuitry coupled to the branch IP table. The branch predication circuitry to: determine a dynamic convergence point in a conditional branch of set of instructions; store the dynamic convergence point in the branch IP table; fetch a first and second speculative path of the conditional branch; while determining which of the first speculative path and the second speculative path is a taken path of the conditional branch and determining whether a dynamic convergence point is fetched corresponding to the stored dynamic convergence point, stall scheduling of instructions of the first speculative path and the second speculative path; and in response to determining that one of the first speculative path and the second speculative path is the taken path and the fetched dynamic convergence point corresponds to the stored convergence point, resume scheduling of the instructions of the taken path.
    Type: Grant
    Filed: June 28, 2018
    Date of Patent: August 25, 2020
    Assignee: Intel Corporation
    Inventors: Adarsh Chauhan, Hong Wang, Jayesh Gaur, Zeev Sperber, Sumeet Bandishte, Lihu Rappoport, Stanislav Shwartsman, Kamil Garifullin, Sreenivas Subramoney, Adi Yoaz
  • Patent number: 10740105
    Abstract: A processor includes an execution unit and a subroutine cache. The execution unit is configured to execute instructions. The subroutine cache us configured to provide instructions of a subroutine to the execution unit for execution. The subroutine cache includes subroutine instruction storage, a subroutine address register, and subroutine cache control logic. The subroutine control logic is configured to: identify a subroutine call instruction provided to the execution unit; determine whether an instruction of a subroutine invoked by the subroutine call instruction is stored in the subroutine instruction storage by evaluating a subroutine validity indicator that indicates whether at least a portion of the subroutine is stored in the subroutine instruction storage; and provide the instruction of the subroutine to the execution unit based on the subroutine validity indicator indicating that at least a portion of the subroutine is stored in the subroutine instruction storage.
    Type: Grant
    Filed: April 4, 2014
    Date of Patent: August 11, 2020
    Assignee: TEXAS INSTRUMENTS INCORPORATED
    Inventor: Christian Wiencke
  • Patent number: 10740234
    Abstract: An approach is provided in which a first core broadcasts a cache line request in response to detecting a cache miss corresponding to a first virtual central processing unit (VCPU) executing on the first core. Next, the first core receives a cache line response from the second core responding to the cache line request that includes tag extension data. The first core determines a cache miss type of the cache miss based on the tag extension data and, in turn, sends the cache miss type to a hypervisor that utilizes the cache miss type during a future VCPU dispatch selection.
    Type: Grant
    Filed: September 4, 2018
    Date of Patent: August 11, 2020
    Assignee: International Business Machines Corporation
    Inventors: Bret R. Olszewski, Ram Raghavan, Maria Lorena Pesantez, Gayathri Mohan
  • Patent number: 10691460
    Abstract: A method includes a processor providing at least one line entry address tag in each line of a branch predictor; indexing into the branch predictor with a current line address to predict a taken branch's target address and a next line address; re-indexing into the branch predictor with one of a predicted next line address or a sequential next line address when the at least one line entry address tag does not match the current line address; using branch prediction content compared against a search address to predict a direction and targets of branches and determining when a new line address is generated; and re-indexing into the branch predictor with a corrected next line address when it is determined that one of the predicted next line address or the sequential next line address differs from the new line address.
    Type: Grant
    Filed: December 13, 2016
    Date of Patent: June 23, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: James J. Bonanno, Brian R. Prasky
  • Patent number: 10684857
    Abstract: A method includes storing a first address of a first instruction executed by a processor core in a first table, where the first instruction writes a value into a register for utilization in addressing memory. The method stores the first address of the first instruction executed by the processor core in a second table with multiple entries, where a register value loaded into the register is utilized as a second address by a second instruction executed by the processor core to access a main memory. The method determines whether an instruction address associated with an instruction executed by the processor core is present in the second table, where the instruction address is the second address. Responsive to determining the instruction address is present in the second table, the method prefetches data from the main memory, where the register value is utilized as the second address in the main memory.
    Type: Grant
    Filed: February 1, 2018
    Date of Patent: June 16, 2020
    Assignee: International Business Machines Corporation
    Inventors: Wolfgang Gellerich, Gerrit Koch, Peter M. Held, Martin Schwidefsky
  • Patent number: 10678479
    Abstract: Provided are integrated circuits and methods for operating integrated circuits. An integrated circuit can include a plurality of memory banks and an execution engine including a set of execution components. Each execution component can be associated with a respective memory bank, and can read from and write to only the respective memory bank. The integrated circuit can further include a set of registers each associated with a respective memory bank from the plurality of memory banks. The integrated circuit can further be operable to load to or store from the set of registers in parallel, and load to or store from the set of registers serially. A parallel operation followed by a serial operation enables data to be moved from many memory banks into one memory bank. A serial operation followed by a parallel operation enables data to be moved from one memory bank into many memory banks.
    Type: Grant
    Filed: November 29, 2018
    Date of Patent: June 9, 2020
    Assignee: Amazon Technologies, Inc.
    Inventors: Ron Diamant, Randy Renfu Huang, Sundeep Amirineni, Jeffrey T. Huynh
  • Patent number: 10664283
    Abstract: Computing system and controller thereof are disclosed for ensuring the correct logical relationship between multiple instructions during their parallel execution. The computing system comprises: a plurality of functional modules each performing a respective function in response to an instruction for the given functional module; and a controller for determining whether or not to send an instruction to a corresponding functional module according to dependency relationship between the plurality of instructions.
    Type: Grant
    Filed: June 12, 2017
    Date of Patent: May 26, 2020
    Assignee: BEIJING DEEPHI TECHNOLOGY CO., LTD.
    Inventors: Kaiyuan Guo, Song Yao
  • Patent number: 10652353
    Abstract: Technologies for communication with direct data placement include a number of computing nodes in communication over a network. Each computing node includes a many-core processor having an integrated host fabric interface (HFI) that maintains an association table (AT). In response to receiving a message from a remote device, the HFI determines whether the AT includes an entry associating one or more parameters of the message to a destination processor core. If so, the HFI causes a data transfer agent (DTA) of the destination core to receive the message data. The DTA may place the message data in a private cache of the destination core. Message parameters may include a destination process identifier or other network address and a virtual memory address range. The HFI may automatically update the AT based on communication operations generated by software executed by the processor cores. Other embodiments are described and claimed.
    Type: Grant
    Filed: September 24, 2015
    Date of Patent: May 12, 2020
    Assignee: Intel Corporation
    Inventors: James Dinan, Venkata Krishnan, Srinivas Sridharan, David A. Webb
  • Patent number: 10628163
    Abstract: A method and apparatus for controlling pre-fetching in a processor. A processor includes an execution pipeline and an instruction pre-fetch unit. The execution pipeline is configured to execute instructions. The instruction pre-fetch unit is coupled to the execution pipeline. The instruction pre-fetch unit includes instruction storage to store pre-fetched instructions, and pre-fetch control logic. The pre-fetch control logic is configured to fetch instructions from memory and store the fetched instructions in the instruction storage. The pre-fetch control logic is also configured to provide instructions stored in the instruction storage to the execution pipeline for execution. The pre-fetch control logic is further configured set a maximum number of instruction words to be pre-fetched for execution subsequent to execution of an instruction currently being executed in the execution pipeline.
    Type: Grant
    Filed: April 17, 2014
    Date of Patent: April 21, 2020
    Assignee: TEXAS INSTRUMENTS INCORPORATED
    Inventors: Christian Wiencke, Johann Zipperer
  • Patent number: 10620953
    Abstract: A data processing apparatus has prefetch circuitry for prefetching instructions from a data store into an instruction queue. Branch prediction circuitry is provided for predicting outcomes of branch instructions and the prefetch circuitry may prefetch instructions subsequent to the branch based on the predicted outcome. Instruction identifying circuitry identifies whether a given instruction prefetched from the data store is a predetermined type of program flow altering instruction and if so then controls the prefetch circuitry to halt prefetching of subsequent instructions into the instruction queue.
    Type: Grant
    Filed: February 14, 2017
    Date of Patent: April 14, 2020
    Assignee: ARM Limited
    Inventors: Kauser Yakub Johar, Antony John Penton
  • Patent number: 10612805
    Abstract: A method for executing computations in parallel for a building management system of a building includes receiving a computing job request to determine values for one or more particular properties, receiving a property model indicating dependencies between a plurality of properties, the plurality of properties including the one or more particular properties, wherein the plurality of properties include building data for the building, and generating one or more computing threads based on the property model, wherein each computing thread includes a sequence of computations for determining values for the plurality of properties. The method further includes executing the computing threads in parallel to determine the values for the particular properties.
    Type: Grant
    Filed: February 14, 2018
    Date of Patent: April 7, 2020
    Assignee: Johnson Controls Technology Company
    Inventor: Andrew J. Przybylski
  • Patent number: 10564966
    Abstract: A method of an aspect includes receiving a packed data operation mask shift instruction. The packed data operation mask shift instruction indicates a source having a packed data operation mask, indicates a shift count number of bits, and indicates a destination. The method further includes storing a result in the destination in response to the packed data operation mask shift instruction. The result includes a sequence of bits of the packed data operation mask that have been shifted by the shift count number of bits. Other methods, apparatus, systems, and instructions are disclosed.
    Type: Grant
    Filed: December 22, 2011
    Date of Patent: February 18, 2020
    Assignee: Intel Corporation
    Inventors: Bret L. Toll, Robert Valentine, Jesus Corbal San Adrian, Elmoustapha Ould-Ahmed Vall, Mark J. Charney
  • Patent number: 10558463
    Abstract: Embodiments of the present disclosure support hardware based thread switching in a multithreading environment. The thread switching is implemented on a multithread microprocessor by utilizing thread mailbox registers and other auxiliary registers that can be pre-programmed for hardware based thread switching. A set of mailbox registers can be allocated to each thread of a plurality of threads that can be executed in the microprocessor. A mailbox register in the set of mailbox registers comprises an identifier of a next thread of the plurality of threads to which an active thread switches based on a thread switch condition further indicated in the mailbox register. The auxiliary registers in the microprocessor can be used to configure a number of threads for simultaneous execution in the microprocessor, a priority for thread switching, and to store a program counter of each thread and states of registers of each thread.
    Type: Grant
    Filed: June 3, 2016
    Date of Patent: February 11, 2020
    Assignee: Synopsys, Inc.
    Inventor: Thang Tran
  • Patent number: 10540288
    Abstract: Techniques are described in which a system having multiple processing units processes a series of work units in a processing pipeline, where some or all of the work units access or manipulate data stored in non-coherent memory. In one example, this disclosure describes a method that includes identifying, prior to completing processing of a first work unit with a processing unit of a processor having multiple processing units, a second work unit that is expected to be processed by the processing unit after the first work unit. The method also includes processing the first work unit, and prefetching, from non-coherent memory, data associated with the second work unit into a second cache segment of the buffer cache, wherein prefetching the data associated with the second work unit occurs concurrently with at least a portion of the processing of the first work unit by the processing unit.
    Type: Grant
    Filed: April 10, 2018
    Date of Patent: January 21, 2020
    Assignee: Fungible, Inc.
    Inventors: Wael Noureddine, Jean-Marc Frailong, Felix A. Marti, Charles Edward Gray, Paul Kim
  • Patent number: 10534610
    Abstract: Techniques for processing instructions include receiving a plurality of instructions from a program counter (PC) operable to be fused into a PC-relative plus offset instruction. The technique also includes fusing the plurality of instructions into an internal operation (IOP) that specifies PC-relative addressing with an offset. The technique also includes computing a shared PC portion that includes one or more common upper bits of a PC address of each of the plurality of instructions. If the shared PC portion is different than a previously computed shared PC portion, the technique transmits the shared PC portion to one or more downstream components in the processor pipeline. The technique further includes transmitting the IOP with a representation of lower order bits of the PC address and processing the IOP.
    Type: Grant
    Filed: July 20, 2016
    Date of Patent: January 14, 2020
    Assignee: International Business Machines Corporation
    Inventor: Michael Karl Gschwind
  • Patent number: 10528352
    Abstract: Blocking instruction fetching in a computer processor, includes: receiving a non-branching instruction to be executed by the computer processor; determining whether executing the non-branching instruction will cause a flush; and responsive to determining that executing the non-branching instruction will cause a flush, disabling instruction fetching for the computer processor for a time, including recoding the instruction such that the recoded instruction will be interpreted by an instruction fetch unit as an unconditional branch instruction.
    Type: Grant
    Filed: March 8, 2016
    Date of Patent: January 7, 2020
    Assignee: International Business Machines Corporation
    Inventors: Bryan G. Hickerson, Sheldon Levenstein, David S. Levitan, Albert J. Van Norstrand, Jr.
  • Patent number: 10521350
    Abstract: Effectiveness of prefetch instructions is determined. A prefetch instruction is executed to request that data be fetched into a cache of the computing environment. The effectiveness of the prefetch instruction is determined. This includes updating, based on executing the prefetch instruction, a cache directory of the cache. The updating includes, in the cache directory, effectiveness data relating to the data. The effectiveness data includes whether the data was installed in the cache based on the prefetch instruction. Additionally, the determining the effectiveness includes obtaining at least a portion of the effectiveness data from the cache directory, and using the at least a portion of effectiveness data to determine the effectiveness of the prefetch instruction.
    Type: Grant
    Filed: July 20, 2016
    Date of Patent: December 31, 2019
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Michael K. Gschwind, Christian Jacobi, Anthony Saporito, Chung-Lung K. Shum, Timothy J. Slegel
  • Patent number: 10437602
    Abstract: Program exception conditions cause a transaction to abort and typically result in an interruption in which the operating system obtains control. A program interruption filtering control is provided to selectively present the interrupt. That is, the interrupt from the program exception condition may or may not be presented depending at least on the program interruption filtering control and a transaction class associated with the program exception condition. The program interruption filtering control is provided by a TRANSACTION BEGIN instruction.
    Type: Grant
    Filed: June 15, 2012
    Date of Patent: October 8, 2019
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Dan F. Greiner, Christian Jacobi, Marcel Mitran, Timothy J. Slegel
  • Patent number: 10430195
    Abstract: A computer-implemented method for predicting a taken branch that ends an instruction stream in a pipelined high frequency microprocessor includes receiving, by a processor, a first instruction within a first instruction stream, the first instruction including a first instruction address. The computer-implemented method further includes searching, by the processor, a stream-based index accelerator predictor one time for the stream; determining, by the processor, a prediction for a branch ending the branch stream; influencing, by the processor, a metadata prediction engine based on the prediction; and updating, by the processor, a stream-based index accelerator predictor with information indicative of the prediction.
    Type: Grant
    Filed: June 27, 2016
    Date of Patent: October 1, 2019
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: James J. Bonanno, Michael J. Cadigan, Jr., Adam B. Collura, Christian Jacobi, Daniel Lipetz, Anthony Saporito
  • Patent number: 10423420
    Abstract: A computer-implemented method for predicting a taken branch that ends an instruction stream in a pipelined high frequency microprocessor includes receiving, by a processor, a first instruction within a first instruction stream, the first instruction comprising a first instruction address; searching, by the processor, an index accelerator predictor one time for the stream; determining, by the processor, a prediction for a taken branch ending the branch stream; influencing, by the processor, a metadata prediction engine based on the prediction; observing a plurality of taken branches from the exit accelerator predictor; maintaining frequency information based on the observed taken branches; determining, based on the frequency information, an updated prediction of the observed plurality of taken branches; and updating, by the processor, the index accelerator predictor with the the updated prediction.
    Type: Grant
    Filed: March 1, 2017
    Date of Patent: September 24, 2019
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: James J. Bonanno, Michael J. Cadigan, Jr., Adam B. Collura, Daniel Lipetz
  • Patent number: 10423419
    Abstract: A computer-implemented method for predicting a taken branch that ends an instruction stream in a pipelined high frequency microprocessor includes receiving, by a processor, a first instruction within a first instruction stream, the first instruction comprising a first instruction address; searching, by the processor, an index accelerator predictor one time for the stream; determining, by the processor, a prediction for a taken branch ending the branch stream; influencing, by the processor, a metadata prediction engine based on the prediction; observing a plurality of taken branches from the exit accelerator predictor; maintaining frequency information based on the observed taken branches; determining, based on the frequency information, an updated prediction of the observed plurality of taken branches; and updating, by the processor, the index accelerator predictor with the updated prediction.
    Type: Grant
    Filed: June 27, 2016
    Date of Patent: September 24, 2019
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: James J. Bonanno, Michael J. Cadigan, Jr., Adam B. Collura, Daniel Lipetz
  • Patent number: 10394559
    Abstract: A computer-implemented method includes determining, by a stream-based index accelerator predictor of a processor, a predicted stream length between an instruction address and a taken branch ending an instruction stream. A first-level branch predictor of a hierarchical asynchronous lookahead branch predictor of the processor is searched for a branch prediction in one or more entries in a search range bounded by the instruction address and the predicted stream length. A search of a second-level branch predictor of the hierarchical asynchronous lookahead branch predictor is triggered based on failing to locate the branch prediction in the search range.
    Type: Grant
    Filed: December 13, 2016
    Date of Patent: August 27, 2019
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: James J. Bonanno, Michael J. Cadigan, Jr., Adam B. Collura, Daniel Lipetz
  • Patent number: 10379863
    Abstract: Systems and methods for constructing an instruction slice for prefetching data of a data-dependent load instruction include a slicer for identifying a load instruction in an instruction sequence as a first occurrence of a qualified load instruction which will miss in a last-level cache. A commit buffer stores information pertaining to the first occurrence of the qualified load instruction and shadow instructions which follow. For a second occurrence of the qualified load instruction, an instruction slice is constructed from the information in the commit buffer to form a slice payload. A pre-execution engine pre-executes the instruction slice based on the slice payload to determine an address from which data is to be fetched for execution of a third and any subsequent occurrences of the qualified load instruction. The data is prefetched from the determined address for the third and any subsequent occurrence of the qualified load instruction.
    Type: Grant
    Filed: September 21, 2017
    Date of Patent: August 13, 2019
    Assignee: QUALCOMM Incorporated
    Inventors: Shivam Priyadarshi, Rami Mohammad A. Al Sheikh, Brandon Dwiel, Derek Hower
  • Patent number: 10379852
    Abstract: An apparatus for integrating arithmetic with logic operations contains at least a calculation device and a post-logic unit. The calculation device calculates source data to generate and output first destination data. The post-logic unit, coupled to the calculation device, performs a comparison operation for comparing the first destination data with 0 and outputs a comparison result.
    Type: Grant
    Filed: August 24, 2017
    Date of Patent: August 13, 2019
    Assignee: VIA ALLIANCE SEMICONDUCTOR CO., LTD.
    Inventors: Huaisheng Zhang, Dacheng Liang, Boming Chen, Renyu Bian
  • Patent number: 10379862
    Abstract: A method, system, and computer program product are provided for prioritizing prefetch instructions. The method includes a processor issuing a prefetch instruction and fetching elements from a cache that can include a memory or a higher level cache. The processor stores the elements in temporary storage and monitors for accesses by an instruction. The processor stores a record representing the prefetch instruction. The processor updates the record with an indicator and issues a new prefetch instruction by comparing the new prefetch instruction to the record, based on the new prefetch instruction matching the prefetch instruction, assigning the indicator to the new prefetch instruction as a priority value, based on the new prefetch instruction not matching the prefetch instruction, assigning a default value to the new prefetch instruction as the priority value, and determining whether to execute the new prefetch instruction, based on the priority value of the new prefetch instruction.
    Type: Grant
    Filed: June 28, 2016
    Date of Patent: August 13, 2019
    Assignee: International Business Machines Corporation
    Inventors: Michael K. Gschwind, Christian Jacobi, Anthony Saporito, Chung-Lung K. Shum
  • Patent number: 10372457
    Abstract: A method, system, and computer program product are provided for prioritizing prefetch instructions. The method includes a processor issuing a prefetch instruction and fetching elements from a cache that can include a memory or a higher level cache. The processor stores the elements in temporary storage and monitors for accesses by an instruction. The processor stores a record representing the prefetch instruction. The processor updates the record with an indicator and issues a new prefetch instruction by comparing the new prefetch instruction to the record, based on the new prefetch instruction matching the prefetch instruction, assigning the indicator to the new prefetch instruction as a priority value, based on the new prefetch instruction not matching the prefetch instruction, assigning a default value to the new prefetch instruction as the priority value, and determining whether to execute the new prefetch instruction, based on the priority value of the new prefetch instruction.
    Type: Grant
    Filed: November 3, 2017
    Date of Patent: August 6, 2019
    Assignee: international Business Machines Corporation
    Inventors: Michael K. Gschwind, Christian Jacobi, Anthony Saporito, Chung-Lung K. Shum
  • Patent number: 10303483
    Abstract: An apparatus includes: a cache to retain an instruction; an instruction-control circuit to read out the instruction from the cache; and an instruction-execution circuit to execute the instruction read out from the cache, wherein the cache includes: a pipeline processing circuit including a plurality of selection stages in each of which, among a plurality of requests for causing the cache to operate, a request having a priority level higher than priority levels of other requests is outputted to a next stage and a plurality of processing stages in each of which processing based on a request outputted from a last stage among the plurality of selection stages is sequentially executed; and a cache-control circuit to input a request received from the instruction-control circuit to the selection stage in which processing order of the processing stage is reception order of the request.
    Type: Grant
    Filed: June 7, 2017
    Date of Patent: May 28, 2019
    Assignee: FUJITSU LIMITED
    Inventor: Yuji Shirahige
  • Patent number: 10268586
    Abstract: A processor including a programmable prefetcher for prefetching information from an external memory. The programmable prefetcher includes a load monitor, a programmable prefetch engine, and a prefetch requester. The load monitor tracks load requests issued by the processor to retrieve information from the external memory. The programmable prefetch engine is configured to be programmed by at least one prefetch program to operate as a programmed prefetcher, such that during operation of the processor, the programmed prefetcher generates at least one prefetch address based on the load requests issued by the processor. The requester uses each generated prefetch address to prefetch information from the external memory. A prefetch memory may store one or more prefetch programs and a prefetch programmer may be included to select from among stored prefetch programs to program the prefetcher based on an executing process. Each prefetch program may be configured according to a prefetch definition.
    Type: Grant
    Filed: October 28, 2016
    Date of Patent: April 23, 2019
    Assignee: VIA ALLIANCE SEMICONDUCTOR CO., LTD.
    Inventors: G. Glenn Henry, Rodney E. Hooker, Terry Parks, Douglas R. Reed
  • Patent number: 10268587
    Abstract: A processor including a front end, at least one load pipeline, and a memory system that further includes a programmable prefetcher for prefetching information from an external memory. The front end converts fetched program instructions into microinstructions including load microinstructions and dispatches microinstructions for execution. The load pipeline executes dispatched load microinstructions and provides load requests to the memory system. The programmable prefetcher includes a load monitor, a programmable prefetch engine, and a prefetch requester. The load monitor tracks the load requests. The prefetch engine is configured to be programmed by at least one prefetch program to operate as a programmed prefetcher, such that during operation of the processor, the programmed prefetcher generates at least one prefetch address based on the load requests issued by the processor. The prefetch requester submits the at least one prefetch address to prefetch information from the memory system.
    Type: Grant
    Filed: December 7, 2016
    Date of Patent: April 23, 2019
    Assignee: VIA ALLIANCE SEMICONDUCTOR CO., LTD.
    Inventors: G. Glenn Henry, Rodney E. Hooker, Terry Parks, Douglas R. Reed
  • Patent number: 10268630
    Abstract: A first node controller may include logic to direct the first node controller to: receive a noncoherent inter-processor communication from a source processor, remap the noncoherent inter-processor communication to a local address space of a destination processor and transmit the noncoherent inter-processor communication directly to a second node controller of the destination processor using an interconnect interface that also carries coherent communications.
    Type: Grant
    Filed: October 24, 2017
    Date of Patent: April 23, 2019
    Assignee: Hewlett Packard Enterprise Development LP
    Inventor: Frank R. Dropps
  • Patent number: 10261905
    Abstract: A method for accessing a cache including reading an access instruction for acquiring data; determining, according to a delay identifier carried by the access instruction, whether the access instruction produces a delay; accessing the cache and performing, according to a location identifier carried by the access instruction, a pre-fetch operation if a delay is produced; and modifying, according to a location where the data required by the access instruction is acquired, the delay identifier and the location identifier carried by the access instruction. The technical solutions solve the problem of a low hit rate upon cache access, reduce the probability of misses, and reduce an access delay caused by a level-by-level access to each level of cache upon target data acquisition, which correspondingly lowers the power consumption generated upon the cache access and improves the CPU performance.
    Type: Grant
    Filed: October 28, 2016
    Date of Patent: April 16, 2019
    Assignee: Alibaba Group Holding Limited
    Inventors: Ling Ma, Zhihong Wang, Lei Zhang
  • Patent number: 10237378
    Abstract: Provided are systems, methods, and integrated circuits for a low-latency, metadata-based packet rewriter. In various implementations, an integrated circuit may include a first pipeline stage operable to receive packet bytes for a packet and packet information. The first stage may further be operable to extract a first value from the packet bytes, and provide the packet bytes, packet information, and first value. The integrated circuit may further include a second stage, operable to receive the packet bytes and packet information. The second stage may further calculate a value using a value from the packet information, and provide the packet bytes, packet information, and second value. The integrated circuit may further include a third stage, operable to receive the packet bytes, packet information, and a third value. The third stage may further be operable to insert the third value into the packet bytes, and provide the packet bytes and packet information.
    Type: Grant
    Filed: June 29, 2016
    Date of Patent: March 19, 2019
    Assignee: Amazon Technologies, Inc.
    Inventors: Kiran Kalkunte Seshadri, Thomas A. Volpe
  • Patent number: 10223090
    Abstract: A method and system of the look-ahead branch prediction and instruction fetch are designed for hiding multi-cycle taken-branch prediction latency while providing accurate and timely instruction fetch and performing look-ahead branch prediction. The invention is designed for identifying the branches that require to be predicted and for reordering the branches in program to perform the look-ahead branch prediction operations during the invented compilation. The invention is also designed for delivering the branch look-ahead instructions comprising predictable branch instructions and the other instructions. In particular, the reordered branch look-ahead instructions are sequentially or concurrently fetched to a single or plurality of branch look-ahead microprocessors in an accurate and timely manner while dynamically recovering order of the branch look-ahead instructions to achieve compatibility of the original program.
    Type: Grant
    Filed: October 23, 2015
    Date of Patent: March 5, 2019
    Inventor: Yong-Kyu Jung
  • Patent number: 10216520
    Abstract: A compressing instruction queue for a microprocessor including a storage queue and a redirect logic circuit. The storage queue includes a matrix of storage locations including N rows and M columns for storing microinstructions of the microprocessor in sequential order. The redirect logic circuit is configured to receive and write multiple microinstructions per cycle of a clock signal into sequential storage locations of the storage queue without leaving unused storage locations and beginning at a first available storage location in the storage queue. The redirect logic circuit performs redirection and compression to eliminate empty locations or holes in the storage queue and to reduce the number of write ports interfaced with each storage location of the storage queue.
    Type: Grant
    Filed: December 12, 2014
    Date of Patent: February 26, 2019
    Assignee: VIA TECHNOLOGIES, INC.
    Inventors: Matthew Daniel Day, G. Glenn Henry, Terry Parks
  • Patent number: 10185568
    Abstract: A processor having an instruction cache for storing a plurality of instructions is provided. The processor further includes annotation logic configured to determine a lookahead distance associated with an instruction and annotate the at least one instruction cache with the lookahead distance. The lookahead distance may correspond to a number of instructions that separates an instruction that references a register from the most recent register definition. The lookahead distance may indicate the shortest distance to a later instruction that references a register that this instruction defines.
    Type: Grant
    Filed: April 22, 2016
    Date of Patent: January 22, 2019
    Assignee: Microsoft Technology Licensing, LLC
    Inventor: Burton J. Smith
  • Patent number: 10169245
    Abstract: A processor or system may include a memory controller to store, in a pre-allocated portion of bit-addressable, random access persistent memory (PM), a relationship between a group of addresses being stored in the PM according to a set of instructions when executed. The memory controller is further to retrieve the relationship when accessing an address from the groups of addresses.
    Type: Grant
    Filed: October 16, 2017
    Date of Patent: January 1, 2019
    Assignee: Intel Corporation
    Inventors: Karthik Kumar, Martin P. Dimitrov, Thomas Willhalm
  • Patent number: 10114558
    Abstract: System, method, and apparatus for integrated main memory (MM) and configurable coprocessor (CP) chip for processing subset of network functions. Chip supports external accesses to MM without additional latency from on-chip CP. On-chip memory scheduler resolves all bank conflicts and configurably load balances MM accesses. Instruction set and data on which the CP executes instructions are all disposed on-chip with no on-chip cache memory, thereby avoiding latency and coherency issues. Multiple independent and orthogonal threading domains used: a FIFO-based scheduling domain (SD) for the I/O; a multi-threaded processing domain for the CP. The CP is an array of independent, autonomous, unsequenced processing engines processing on-chip data tracked by SD of external CMD and reordered per FIFO CMD sequence before transmission.
    Type: Grant
    Filed: February 18, 2018
    Date of Patent: October 30, 2018
    Assignee: MOSYS, INC.
    Inventors: Michael J. Miller, Jay B Patel, Michael J Morrison
  • Patent number: 10108549
    Abstract: A method is described that includes creating a first data pattern access record for a region of system memory in response to a cache miss at a host side cache for a first memory access request. The first memory access request specifies an address within the region of system memory. The method includes fetching a previously existing data access pattern record for the region from the system memory in response to the cache miss. The previously existing data access pattern record identifies blocks of data within the region that have been previously accessed. The method includes pre-fetching the blocks from the system memory and storing the blocks in the cache.
    Type: Grant
    Filed: September 23, 2015
    Date of Patent: October 23, 2018
    Assignee: Intel Corporation
    Inventors: Zhe Wang, Christopher B. Wilkerson, Zeshan A. Chishti, Seth H. Pugsley, Alaa R. Alameldeen, Shih-Lien L. Lu