Instruction Data Cache Patents (Class 711/125)
  • Patent number: 11934342
    Abstract: Embodiments are generally directed to graphics processor data access and sharing. An embodiment of an apparatus includes a circuit element to produce a result in processing of an application; a load-store unit to receive the result and generate pre-fetch information for a cache utilizing the result; and a prefetch generator to produce prefetch addresses based at least in part on the pre-fetch information; wherein the load-store unit is to receive software assistance for prefetching, and wherein generation of the pre-fetch information is based at least in part on the software assistance.
    Type: Grant
    Filed: March 14, 2020
    Date of Patent: March 19, 2024
    Assignee: INTEL CORPORATION
    Inventors: Altug Koker, Varghese George, Aravindh Anantaraman, Valentin Andrei, Abhishek R. Appu, Niranjan Cooray, Nicolas Galoppo Von Borries, Mike MacPherson, Subramaniam Maiyuran, ElMoustapha Ould-Ahmed-Vall, David Puffer, Vasanth Ranganathan, Joydeep Ray, Ankur N. Shah, Lakshminarayanan Striramassarma, Prasoonkumar Surti, Saurabh Tangri
  • Patent number: 11886344
    Abstract: A cache system includes a computational cache and a computational cache miss-handler. The computational cache is configured to cache state vectors and perform read-modify-write (RMW) operations on the cached state vectors responsive to received RMW commands. The computational cache miss-handler is configured to perform RMW operations on state vectors stored in a memory responsive to cache misses in the computational cache. The memory is external to the cache system.
    Type: Grant
    Filed: June 29, 2022
    Date of Patent: January 30, 2024
    Assignee: Xilinx, Inc.
    Inventors: Noel J. Brady, Lars-Olof B Svensson
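As a rough illustration of the split described in this abstract, the sketch below models a cache that applies read-modify-write (RMW) commands to cached state vectors and falls back to a miss-handler that applies the same update in external memory. The class name, the `load` helper, and the choice not to allocate on a miss are assumptions made for illustration; the abstract does not specify them.

```python
class ComputationalCache:
    """Toy model: RMW on cached state vectors, with a miss path to memory."""

    def __init__(self, memory):
        self.memory = memory   # external memory: address -> state vector (list)
        self.cached = {}       # state vectors currently held in the cache

    def load(self, addr):
        # Hypothetical helper: bring a copy of a state vector into the cache.
        self.cached[addr] = list(self.memory[addr])

    def rmw(self, addr, update):
        """Apply `update` to the state vector at `addr`; report hit or miss."""
        if addr in self.cached:
            # Hit: the cache itself performs the read-modify-write.
            self.cached[addr] = update(self.cached[addr])
            return "hit"
        # Miss: the miss-handler performs the RMW on the vector in memory.
        # Assumption: the vector is not allocated into the cache on a miss.
        self.memory[addr] = update(self.memory[addr])
        return "miss"
```

A caller would pass the update as a function, for example `cache.rmw(0, lambda v: [x + 1 for x in v])`.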
  • Patent number: 11886744
    Abstract: A method, computer program product, apparatus, and system are provided. Some embodiments may include transmitting a request to make one or more writes associated with an identification tag. The request may include the identification tag, the one or more writes, a first instruction to make the one or more writes to one of a plurality of persistence levels of a memory, and a second instruction to respond with at least one first indication that the one or more writes associated with the identification tag have been written to at least one of the one of the plurality of persistence levels of the memory. Some embodiments may include receiving the at least one first indication that the one or more writes associated with the identification tag have been written to at least one of the one of the plurality of persistence levels of the memory.
    Type: Grant
    Filed: December 15, 2021
    Date of Patent: January 30, 2024
    Assignee: NVIDIA CORPORATION
    Inventor: Stephen David Glaser
  • Patent number: 11848980
    Abstract: In general, this disclosure describes techniques for applying a distributed pipeline model in a distributed computing system to cause processing nodes of the distributed computing system to process data according to a distributed pipeline having an execution topology, specified within a pipeline statement, to perform a task.
    Type: Grant
    Filed: February 18, 2021
    Date of Patent: December 19, 2023
    Assignee: BORAY DATA TECHNOLOGY CO. LTD.
    Inventors: Raymond John Huetter, Alka Yamarti, Craig Alexander McIntyre
  • Patent number: 11847048
    Abstract: A processing device and methods of controlling remote persistent writes are provided. Methods include receiving an instruction of a program to issue a persistent write to remote memory. The methods also include logging an entry in a local domain when the persistent write instruction is received and providing a first indication that the persistent write will be persisted to the remote memory. The methods also include executing the persistent write to the remote memory and providing a second indication that the persistent write to the remote memory is completed. The methods also include providing the first and second indications when it is determined not to execute the persistent write according to global ordering and providing the second indication without providing the first indication when it is determined to execute the persistent write to remote memory according to global ordering.
    Type: Grant
    Filed: September 24, 2020
    Date of Patent: December 19, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Nuwan Jayasena, Shaizeen Aga
  • Patent number: 11748622
    Abstract: A computing system is configured to access intermediate outputs of a neural network by augmenting a data flow graph generated for the neural network. The data flow graph includes a plurality of nodes interconnected by connections, each node representing an operation to be executed by the neural network. To access the intermediate output, the data flow graph is augmented by inserting a node representing an operation that saves the output of a node which produces the intermediate output. The node representing the save operation is inserted while maintaining all existing nodes and connections in the data flow graph, thereby preserving the behavior of the data flow graph. The augmenting can be performed using a compiler that generates the data flow graph from program code.
    Type: Grant
    Filed: March 4, 2019
    Date of Patent: September 5, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Drazen Borkovic, Se jong Oh
  • Patent number: 11726793
    Abstract: Embodiments described herein provide an apparatus comprising a plurality of processing resources including a first processing resource and a second processing resource, a memory communicatively coupled to the first processing resource and the second processing resource, and a processor to receive data dependencies for one or more tasks comprising one or more producer tasks executing on the first processing resource and one or more consumer tasks executing on the second processing resource and move a data output from one or more producer tasks executing on the first processing resource to a cache memory communicatively coupled to the second processing resource. Other embodiments may be described and claimed.
    Type: Grant
    Filed: November 11, 2020
    Date of Patent: August 15, 2023
    Assignee: INTEL CORPORATION
    Inventors: Christopher J. Hughes, Prasoonkumar Surti, Guei-Yuan Lueh, Adam T. Lake, Jill Boyce, Subramaniam Maiyuran, Lidong Xu, James M. Holland, Vasanth Ranganathan, Nikos Kaburlasos, Altug Koker, Abhishek R. Appu
  • Patent number: 11726912
    Abstract: Systems and methods are disclosed for performing wide memory operations for a wide data cache line. In some examples of the disclosed technology, a processor having two or more execution lanes includes a data cache coupled to memory, a wide memory load circuit that concurrently loads two or more words from a cache line of the data cache, and a writeback circuit situated to send a respective word of the concurrently-loaded words to a selected execution lane of the processor, either into an operand buffer or bypassing the operand buffer. In some examples, a sharding circuit is provided that allows bitwise, byte-wise, and/or word-wise manipulation of memory operation data. In some examples, wide cache loads allow for concurrent execution of plural execution lanes of the processor.
    Type: Grant
    Filed: March 29, 2021
    Date of Patent: August 15, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Douglas C. Burger, Aaron L. Smith, Gagan Gupta, David T. Harper
  • Patent number: 11693776
    Abstract: A processing unit includes a processor core and an associated cache memory. The cache memory establishes a reservation of a hardware thread of the processor core for a store target address and services a store-conditional request of the processor core by conditionally updating the shared memory with store data based on whether the hardware thread has a reservation for the store target address. The cache memory receives a hint associated with the store-conditional request indicating an intent of the store-conditional request. The cache memory protects the store target address against access by any conflicting memory access request during a protection window extension following servicing of the store-conditional request. The cache memory establishes a first duration for the protection window extension based on the hint having a first value and establishes a different second duration for the protection window extension based on the hint having a different second value.
    Type: Grant
    Filed: June 18, 2021
    Date of Patent: July 4, 2023
    Assignee: International Business Machines Corporation
    Inventors: Derek E. Williams, Guy L. Guthrie, Hugh Shen, Jeffrey A. Stuecheli
  • Patent number: 11681531
    Abstract: Apparatus and methods are disclosed for controlling execution of memory access instructions in a block-based processor architecture using a hardware structure that indicates a relative ordering of memory access instructions in an instruction block. In one example of the disclosed technology, a method of executing an instruction block having a plurality of memory load and/or memory store instructions includes selecting a next memory load or memory store instruction to execute based on dependencies encoded within the block, and on a store vector that stores data indicating which memory load and memory store instructions in the instruction block have executed. The store vector can be masked using a store mask. The store mask can be generated when decoding the instruction block, or copied from an instruction block header. Based on the encoded dependencies and the masked store vector, the next instruction can issue when its dependencies are available.
    Type: Grant
    Filed: October 23, 2015
    Date of Patent: June 20, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Douglas C. Burger, Aaron L. Smith
  • Patent number: 11675630
    Abstract: Methods, apparatus, systems and articles of manufacture are disclosed to configure heterogenous components in an accelerator. An example apparatus includes a graph compiler to identify a workload node in a workload and generate a selector for the workload node, and the selector to identify an input condition and an output condition of a compute building block, wherein the graph compiler is to, in response to obtaining the identified input condition and output condition from the selector, map the workload node to the compute building block.
    Type: Grant
    Filed: August 15, 2019
    Date of Patent: June 13, 2023
    Assignee: INTEL CORPORATION
    Inventors: Michael Behar, Moshe Maor, Ronen Gabbai, Roni Rosner, Zigi Walter, Oren Agam
  • Patent number: 11675594
    Abstract: Embodiments of instructions are detailed herein including one or more of 1) a branch fence instruction, prefix, or variants (BFENCE); 2) a predictor fence instruction, prefix, or variants (PFENCE); 3) an exception fence instruction, prefix, or variants (EFENCE); 4) an address computation fence instruction, prefix, or variants (AFENCE); 5) a register fence instruction, prefix, or variants (RFENCE); and, additionally, modes that apply the above semantics to some or all ordinary instructions.
    Type: Grant
    Filed: December 28, 2018
    Date of Patent: June 13, 2023
    Assignee: Intel Corporation
    Inventors: Robert S. Chappell, Jason W. Brandt, Alan Cox, Asit Mallick, Joseph Nuzman, Arjan Van De Ven
  • Patent number: 11586441
    Abstract: Systems, apparatuses, and methods for virtualizing a micro-operation cache are disclosed. A processor includes at least a micro-operation cache, a conventional cache subsystem, a decode unit, and control logic. The decode unit decodes instructions into micro-operations which are then stored in the micro-operation cache. The micro-operation cache has limited capacity for storing micro-operations. When new micro-operations are decoded from pending instructions, existing micro-operations are evicted from the micro-operation cache to make room for the new micro-operations. Rather than being discarded, micro-operations evicted from the micro-operation cache are stored in the conventional cache subsystem. This prevents the original instruction from having to be decoded again on subsequent executions.
    Type: Grant
    Filed: December 17, 2020
    Date of Patent: February 21, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: John Kalamatianos, Jagadish B. Kotra
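The eviction path in this abstract can be sketched as a small model in which micro-operations evicted from a limited-capacity micro-operation cache are kept in a backing store instead of being discarded, so a later fetch avoids decoding again. The class name, the LRU replacement policy, the dictionary standing in for the conventional cache subsystem, and the fake decoder are all assumptions made for illustration.

```python
from collections import OrderedDict


class VirtualizedUopCache:
    """Toy model of a micro-op cache whose victims spill to a backing cache."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.uop_cache = OrderedDict()  # pc -> micro-ops; LRU order is an assumption
        self.backing = {}               # stand-in for the conventional cache subsystem
        self.decodes = 0                # counts how often the decoder had to run

    def _decode(self, pc):
        # Hypothetical decoder: produces placeholder micro-ops for an instruction.
        self.decodes += 1
        return (f"uop_a@{pc:#x}", f"uop_b@{pc:#x}")

    def _insert(self, pc, uops):
        if len(self.uop_cache) >= self.capacity:
            # Evict the least recently used entry, but keep it in the backing store.
            victim_pc, victim_uops = self.uop_cache.popitem(last=False)
            self.backing[victim_pc] = victim_uops
        self.uop_cache[pc] = uops

    def fetch(self, pc):
        if pc in self.uop_cache:
            self.uop_cache.move_to_end(pc)  # refresh LRU position on a hit
            return self.uop_cache[pc]
        # Miss: reuse previously evicted micro-ops if present, otherwise decode.
        uops = self.backing.pop(pc, None)
        if uops is None:
            uops = self._decode(pc)
        self._insert(pc, uops)
        return uops
```

With a capacity of one entry, fetching addresses `0x100`, `0x200`, and `0x100` again triggers only two decodes in this model, because the third fetch is served from the backing store.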
  • Patent number: 11579873
    Abstract: An apparatus is described with support for transactional memory and load/store-exclusive instructions using an exclusive monitor indication to track exclusive access to a given address. In response to a predetermined type of load instruction specifying a load target address, which is executed within a given transaction, any exclusive monitor indication previously set for the load target address is cleared. In response to a load-exclusive instruction, an abort is triggered for a transaction for which the given address is specified as one of its working set of addresses. This helps to maintain mutual exclusion between transactional and non-transactional threads even if there is load speculation in the non-transactional thread.
    Type: Grant
    Filed: May 9, 2019
    Date of Patent: February 14, 2023
    Assignee: Arm Limited
    Inventors: Matthew James Horsnell, Grigorios Magklis, Richard Roy Grisenthwaite, Nathan Yong Seng Chong
  • Patent number: 11507640
    Abstract: Aspects for vector operations in a neural network are described herein. The aspects may include a vector caching unit configured to store a first vector and a second vector, wherein the first vector includes one or more first elements and the second vector includes one or more second elements. The aspects may further include one or more adders and a combiner. The one or more adders may be configured to respectively add each of the first elements to a corresponding one of the second elements to generate one or more addition results. The combiner may be configured to combine the one or more addition results into an output vector.
    Type: Grant
    Filed: October 26, 2018
    Date of Patent: November 22, 2022
    Assignee: CAMBRICON TECHNOLOGIES CORPORATION LIMITED
    Inventors: Jinhua Tao, Tian Zhi, Shaoli Liu, Tianshi Chen, Yunji Chen
  • Patent number: 11487545
    Abstract: A processor branch prediction circuit employs back-invalidation of prediction cache entries based on decoded branch instructions. The execution information of a previously executed branch instruction is obtained from a prediction cache entry and compared to generated decode information in an instruction decode circuit. Execution information of branch instructions stored in the prediction cache entry is updated in response to a mismatch of the execution information and the decode information of the branch instruction. Existing branch prediction circuits invalidate prediction cache entries of a block of instructions when the block of instructions is invalidated in an instruction cache. As a result, valid branch instruction execution information may be unnecessarily discarded. Updating prediction cache entries in response to a mismatch of the execution information and the decode information of the branch instruction maintains the execution information in the prediction cache.
    Type: Grant
    Filed: March 4, 2021
    Date of Patent: November 1, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Daren E. Streett, Rami Mohammad Al Sheikh, Michael Scott McIlvaine, Richard W. Doing, Robert Douglas Clancy
  • Patent number: 11481328
    Abstract: A technique includes, in response to a cache miss occurring with a given processing node of a plurality of processing nodes, using a directory-based coherence system for the plurality of processing nodes to regulate snooping of an address that is associated with the cache miss. Using the directory-based coherence system to regulate whether the address is included in a snooping domain is based at least in part on a number of cache misses associated with the address.
    Type: Grant
    Filed: July 10, 2020
    Date of Patent: October 25, 2022
    Assignee: Hewlett Packard Enterprise Development LP
    Inventors: Alexandros Daglis, Paolo Faraboschi, Qiong Cai, Gary Gostin
  • Patent number: 11481330
    Abstract: Methods, systems, and devices for cache architectures for memory devices are described. For example, a memory device may include a main array having a first set of memory cells, a cache having a second set of memory cells, and a cache delay register configured to store an indication of cache addresses associated with recently performed access operations. In some examples, the cache delay register may be operated as a first-in-first-out (FIFO) register of cache addresses, where a cache address associated with a performed access operation may be added to the beginning of the FIFO register, and a cache address at the end of the FIFO register may be purged. Information associated with access operations on the main array may be maintained in the cache, and accessed directly (e.g., without another accessing of the main array), at least as long as the cache address is present in the cache delay register.
    Type: Grant
    Filed: June 3, 2020
    Date of Patent: October 25, 2022
    Assignee: Micron Technology, Inc.
    Inventor: Nicola Del Gatto
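The FIFO behavior of the cache delay register described in this abstract can be sketched as follows. This is a minimal model, not the patented design: the class and method names and the fixed `depth` parameter are hypothetical, and repeated accesses to the same address are simply recorded again, which is an assumption.

```python
from collections import deque


class CacheDelayRegister:
    """Toy FIFO of cache addresses for recently performed access operations."""

    def __init__(self, depth):
        self.depth = depth   # hypothetical number of entries the register holds
        self.fifo = deque()  # left end = beginning (newest), right end = end (oldest)

    def record_access(self, cache_addr):
        """Add an address at the beginning; return the purged address, if any."""
        self.fifo.appendleft(cache_addr)
        if len(self.fifo) > self.depth:
            return self.fifo.pop()  # purge the address at the end of the FIFO
        return None

    def can_serve_from_cache(self, cache_addr):
        """Data may be accessed directly from the cache while its address is present."""
        return cache_addr in self.fifo
```

In this model, an address stays eligible for direct cache access until `depth` newer accesses have pushed it out of the register.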
  • Patent number: 11409539
    Abstract: Devices and techniques for on-demand programmable atomic kernel loading are described herein. A programmable atomic unit (PAU) of a memory controller can receive an invocation of a programmable atomic operator by the memory controller. The PAU can then perform a verification on a programmable atomic operator partition for the programmable atomic operator. Here, the programmable atomic operator partition is located in a memory of the PAU. The PAU can then signal a trap in response to the verification indicating that the programmable atomic operator partition is not prepared.
    Type: Grant
    Filed: October 20, 2020
    Date of Patent: August 9, 2022
    Assignee: Micron Technology, Inc.
    Inventors: Dean E. Walker, Tony Brewer, Chris Baronne
  • Patent number: 11379381
    Abstract: A main memory device includes a first memory device; and a second memory device having an access latency different from that of the first memory device. The first memory device determines, based on an access count for at least one region of the first memory device, a hot page included in the at least one region.
    Type: Grant
    Filed: October 4, 2019
    Date of Patent: July 5, 2022
    Assignee: SK hynix Inc.
    Inventors: Mi Seon Han, Yun Jeong Mun, Young Pyo Joo
  • Patent number: 11341117
    Abstract: Systems and methods for evicting and inserting an entry for a deduplication table are described.
    Type: Grant
    Filed: January 9, 2020
    Date of Patent: May 24, 2022
    Assignee: Pure Storage, Inc.
    Inventors: John Colgrove, Joseph S. Hasbani, John Martin Hayes, Ethan L. Miller, Cary A. Sandvig
  • Patent number: 11334384
    Abstract: Systems, apparatuses, and methods for implementing scheduler queue assignment burst mode are disclosed. A scheduler queue assignment unit receives a dispatch packet with a plurality of operations from a decode unit in each clock cycle. The scheduler queue assignment unit determines if the number of operations in the dispatch packet for any class of operations is greater than a corresponding threshold for dispatching to the scheduler queues in a single cycle. If the number of operations for a given class is greater than the corresponding threshold, and if a burst mode counter is less than a burst mode window threshold, the scheduler queue assignment unit dispatches the extra number of operations for the given class in a single cycle. By operating in burst mode for a given operation class during a small number of cycles, processor throughput can be increased without starving the processor of other operation classes.
    Type: Grant
    Filed: December 10, 2019
    Date of Patent: May 17, 2022
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Alok Garg, Scott Andrew McLelland, Marius Evers, Matthew T. Sobel
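The burst-mode decision in this abstract reduces to two comparisons per operation class, sketched below. The function name and parameters are hypothetical, and the counter handling is an assumption: the counter is incremented on each burst cycle, and the way it is reset or aged is not modeled because the abstract does not describe it.

```python
def dispatch_count(num_ops, threshold, burst_counter, burst_window):
    """Return (operations dispatched this cycle, updated burst counter).

    num_ops       -- operations of one class in the dispatch packet
    threshold     -- normal per-cycle dispatch limit for that class
    burst_counter -- burst cycles already used (assumed bookkeeping)
    burst_window  -- burst mode window threshold
    """
    if num_ops <= threshold:
        # Within the normal limit: dispatch everything, no burst needed.
        return num_ops, burst_counter
    if burst_counter < burst_window:
        # Burst mode: dispatch the extra operations in the same cycle.
        return num_ops, burst_counter + 1
    # Burst window exhausted: fall back to the normal limit.
    return threshold, burst_counter
```

For example, with a threshold of 2 and a window of 2, a packet with 3 operations is fully dispatched while the counter is 0 or 1, and is capped at 2 operations once the counter reaches 2.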
  • Patent number: 11301251
    Abstract: Systems and methods are disclosed for fetch stage handling of indirect jumps in a processor pipeline. For example, a method includes detecting a sequence of instructions fetched by a processor core, wherein the sequence of instructions includes a first instruction, with a result that depends on an immediate field of the first instruction and a program counter value, followed by a second instruction that is an indirect jump instruction; responsive to detection of the sequence of instructions, preventing an indirect jump target predictor circuit from generating a target address prediction for the second instruction; and, responsive to detection of the sequence of instructions, determining a target address for the second instruction before the first instruction is issued to an execution stage of a pipeline of the processor core.
    Type: Grant
    Filed: April 23, 2020
    Date of Patent: April 12, 2022
    Assignee: SiFive, Inc.
    Inventors: Joshua Smith, Krste Asanovic, Andrew Waterman
  • Patent number: 11294678
    Abstract: Systems, apparatuses, and methods for implementing scheduler queue assignment logic are disclosed. A processor includes at least a decode unit, scheduler queue assignment logic, scheduler queues, pickers, and execution units. The assignment logic receives a plurality of operations from a decode unit in each clock cycle. The assignment logic includes a separate logical unit for each different type of operation which is executable by the different execution units of the processor. For each different type of operation, the assignment logic determines which of the possible assignment permutations are valid for assigning different numbers of operations to scheduler queues in a given clock cycle. The assignment logic receives an indication of how many operations to assign in the given clock cycle, and then the assignment logic selects one of the valid assignment permutations for the number of operations specified by the indication.
    Type: Grant
    Filed: May 29, 2018
    Date of Patent: April 5, 2022
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Matthew T. Sobel, Donald A. Priore, Alok Garg
  • Patent number: 11249762
    Abstract: An apparatus and method are provided for handling incorrect branch direction predictions. The apparatus has processing circuitry for executing instructions, branch prediction circuitry for making branch direction predictions in respect of branch instructions, and fetch circuitry for fetching instructions from an instruction cache in dependence on the branch direction predictions and for forwarding the fetched instructions to the processing circuitry for execution. A cache location buffer stores cache location information for a given branch instruction for which accuracy of the branch direction predictions made by the branch prediction circuitry is below a determined threshold. The cache location information identifies where within the instruction cache one or more instructions are stored that will need to be executed in the event that a subsequent branch direction prediction made for the given branch instruction is incorrect.
    Type: Grant
    Filed: October 24, 2019
    Date of Patent: February 15, 2022
    Assignee: Arm Limited
    Inventors: Houdhaifa Bouzguarrou, Guillaume Bolbenes, Thibaut Elie Lanois
  • Patent number: 11243767
    Abstract: A caching device, an instruction cache, a system for processing an instruction, a method and apparatus for processing data and a medium are provided. The caching device includes a first queue, a second queue, a write port group, a read port, a first pop-up port, a second pop-up port and a press-in port. The write port group is configured to write cache data into a set storage address in the first queue and/or the second queue; the read port is configured to read all cache data from the first queue and/or the second queue at one time; the press-in port is configured to press cache data into the first queue and/or the second queue; the first pop-up port is configured to pop up cache data from the first queue; and the second pop-up port is configured to pop up cache data from the second queue.
    Type: Grant
    Filed: September 11, 2020
    Date of Patent: February 8, 2022
    Assignee: Beijing Baidu Netcom Science and Technology Co., Ltd.
    Inventors: Chao Tang, Xueliang Du, Yingnan Xu, Kang An
  • Patent number: 11237730
    Abstract: A method for improving cache hit ratios for selected volumes within a storage system is disclosed. In one embodiment, such a method includes monitoring I/O to multiple volumes residing on a storage system. The storage system includes a cache to store data associated with the volumes. The method determines, from the I/O, which particular volumes of the multiple volumes would benefit the most if provided favored status in the cache. The favored status provides increased residency time in the cache to the particular volumes compared to volumes not having the favored status. The method generates a list of the particular volumes and transmits the list to the storage system. The storage system, in turn, provides increased residency time to the particular volumes in accordance with their favored status. A corresponding system and computer program product are also disclosed.
    Type: Grant
    Filed: May 12, 2019
    Date of Patent: February 1, 2022
    Assignee: International Business Machines Corporation
    Inventors: Lokesh M. Gupta, Beth A. Peterson, Kevin J. Ash, Kyler A. Anderson
  • Patent number: 11112774
    Abstract: A numerical controller determines whether or not block prefetching from a program is sufficient based on whether at least one block subsequent to a predetermined reference block, which exists after a running block of the program and is needed to execute the reference block, has been prefetched or not. If the determination result is that prefetching is not sufficient, block prefetching from the program is performed.
    Type: Grant
    Filed: March 16, 2020
    Date of Patent: September 7, 2021
    Assignee: Fanuc Corporation
    Inventor: Nobuhito Oonishi
  • Patent number: 11108833
    Abstract: A method and devices for handling crossed-invite situations in set-up of IP-based sessions. A local device receives an incoming session invite after sending an outgoing session invite and before the outgoing session invite has been accepted. It then determines that the incoming session invite was sent by the remote device to which the outgoing session invite is also addressed. The method includes determining a remote device priority value from identifying information contained in the incoming session invite, comparing the remote device priority value with a local device priority value to determine whether the remote device or the local device is higher priority and, if the remote device is higher priority, canceling the outgoing session invite and displaying an incoming call answer screen for the incoming session invite, and if the local device is higher priority, waiting for cancelation of the incoming session invite and acceptance of the outgoing session invite.
    Type: Grant
    Filed: June 6, 2016
    Date of Patent: August 31, 2021
    Assignee: BlackBerry Limited
    Inventors: Bechir Trabelsi, Andrew Michael Allen, Kevin N. Chen, Lawrence Edward Kuhl
  • Patent number: 11099995
    Abstract: Examples include techniques to prefetch data from a first level of memory of a hierarchical arrangement of memory to a second level of memory of the hierarchical arrangement of memory. Examples include circuitry for a processor receiving a prefetch request from a core of the processor to prefetch data from the first level to the second level. The prefetch request indicates an amount of data to prefetch that is greater than a data capacity of a cache line utilized by the core.
    Type: Grant
    Filed: March 28, 2018
    Date of Patent: August 24, 2021
    Assignee: Intel Corporation
    Inventors: Michael Klemm, Thomas Willhalm
  • Patent number: 11080195
    Abstract: The size of a cache is modestly increased so that a short pointer to a predicted next memory address in the same cache is added to each cache line in the cache. In response to a cache hit, the predicted next memory address identified by the short pointer in the cache line of the hit along with an associated entry are pushed to a next faster cache when a valid short pointer to the predicted next memory address is present in the cache line of the hit.
    Type: Grant
    Filed: September 10, 2019
    Date of Patent: August 3, 2021
    Assignee: Marvell Asia Pte, Ltd.
    Inventors: Shay Gal-On, Srilatha Manne, Edward McLellan, Alexander Rucker
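The hit path described in this abstract can be sketched with a cache whose lines each carry an optional pointer to a predicted next line in the same cache. On a hit, a valid pointer causes the predicted line to be pushed to the next faster cache. The class and function names are hypothetical, addresses stand in for the short pointer, and the faster cache is modeled as a plain dictionary; all of these are assumptions for illustration.

```python
class PointerCache:
    """Toy cache whose lines hold data plus a pointer to a predicted next line."""

    def __init__(self):
        self.lines = {}  # address -> (data, next_ptr or None)

    def fill(self, addr, data, next_ptr=None):
        self.lines[addr] = (data, next_ptr)


def access(slower, faster, addr):
    """Look up `addr` in `slower`; on a hit, push the predicted next line to `faster`.

    Returns the data on a hit, or None on a miss.
    """
    if addr not in slower.lines:
        return None
    data, next_ptr = slower.lines[addr]
    # Push only when the pointer is valid and the predicted line is present.
    if next_ptr is not None and next_ptr in slower.lines:
        faster[next_ptr] = slower.lines[next_ptr][0]
    return data
```

A hit on a line without a valid pointer leaves the faster cache untouched in this model.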
  • Patent number: 11048440
    Abstract: A memory system includes a memory device having a plurality of memory blocks and a subcommand storage circuit, and a memory controller for controlling the memory device, wherein the memory device is capable of being in one or more of a ready state, a first busy state, and a second busy state, and wherein the subcommand is stored in the subcommand storage circuit when the subcommand is received from the memory controller in the first busy state and the subcommand is executable after the first busy state is released, and the subcommand stored in the subcommand storage circuit is executed after the memory device is changed to the ready state.
    Type: Grant
    Filed: August 26, 2019
    Date of Patent: June 29, 2021
    Assignee: SK hynix Inc.
    Inventors: Sung-Won Bae, Jun-Hyuk Lee, Deung-Kak Yoo, Min-Kyu Lee
  • Patent number: 11042315
    Abstract: In a computer system, a multilevel memory includes a near memory device and a far memory device, which are byte addressable. The multilevel memory includes a controller that receives a data request including original tag information. The controller includes routing hardware to selectively provide alternate tag information for the data request to cause a cache hit or a cache miss to selectively direct the request to the near memory device or to the far memory device, respectively. The controller can include selection circuitry to select between the original tag information and the alternate tag information to control where the data request is sent.
    Type: Grant
    Filed: March 29, 2018
    Date of Patent: June 22, 2021
    Assignee: Intel Corporation
    Inventors: Lakshminarayana Pappu, Christopher E. Cox, Navneet Dour, Asaf Rubinstein, Israel Diamand
  • Patent number: 11042462
    Abstract: Identifying computer program execution characteristics for determining relevance of pattern instruction executions to determine characteristics of a computer program. Filters are utilized to determine which subsequent occurrences of execution of at least one computer instruction are relevant to a counter based on execution characteristics of the at least one computer instruction where the counter counts the subsequent occurrences of execution of at least one computer instruction following prior executions of the same at least one computer instruction.
    Type: Grant
    Filed: September 4, 2019
    Date of Patent: June 22, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Anthony Thomas Sofia, Peter Sutton, Robert W. St. John, Matthias Klein
  • Patent number: 11044140
    Abstract: Embodiments of the present disclosure provide a method and an apparatus for serialization and deserialization of a message structure. The method for serialization includes: acquiring a message structure, and pre-processing the message structure to generate a corresponding target version number, wherein message structures with different contents correspond to different version numbers, and the message structure is a structure of status information transmitted by a server to a client; serializing information to be transmitted to obtain a corresponding byte stream, wherein the information to be transmitted carries data of the status information and the target version number; and transmitting the byte stream to the client.
    Type: Grant
    Filed: November 26, 2019
    Date of Patent: June 22, 2021
    Assignee: MICROFUN CO., LTD
    Inventors: Chi Gao, Guangrong Su, Yingjie Han
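The versioned serialization described in the abstract above can be sketched as follows. This is a minimal illustration, not the patented method: deriving the version number from a SHA-256 hash of the field list, the 4-byte big-endian header, the JSON payload, and the names `structure_version`, `serialize`, and `deserialize` are all assumptions made for the example. A truncated hash only makes collisions between different structures very unlikely, not impossible.

```python
import hashlib
import json
import struct


def structure_version(fields):
    """Derive a version number from the message structure (field names and types).

    Assumption: the version is the first 4 bytes of a SHA-256 hash of the
    canonical field list, so a changed structure almost surely changes it.
    """
    canonical = json.dumps(fields, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def serialize(fields, status):
    """Pack the target version number followed by the status values."""
    version = structure_version(fields)
    payload = json.dumps([status[name] for name, _ in fields]).encode("utf-8")
    return struct.pack(">I", version) + payload


def deserialize(fields, data):
    """Reject the byte stream if it was produced for a different structure."""
    (version,) = struct.unpack(">I", data[:4])
    if version != structure_version(fields):
        raise ValueError("message structure mismatch")
    values = json.loads(data[4:].decode("utf-8"))
    return {name: value for (name, _), value in zip(fields, values)}
```

A client holding a different field list than the server fails fast with `ValueError` instead of misreading the payload.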
  • Patent number: 11030073
    Abstract: Techniques are provided for redundant execution by a better processor for intensive dynamic profiling after initial execution by a constrained processor. In an embodiment, a system of computer(s) receives a request to profile particular runtime aspects of an original binary executable. Based on the particular runtime aspects and without accessing source logic, the system statically rewrites the original binary executable into a rewritten binary executable that invokes telemetry instrumentation that makes observations of the particular runtime aspects and emits traces of those observations. A first processing core having low power (capacity) performs a first execution of the rewritten binary executable to make first observations and emit first traces of the first observations. Afterwards, a second processing core performs a second (redundant) execution of the original binary executable based on the first traces.
    Type: Grant
    Filed: October 31, 2019
    Date of Patent: June 8, 2021
    Assignee: Oracle International Corporation
    Inventors: Sam Idicula, Kirtikar Kashyap, Arun Raghavan, Evangelos Vlachos, Venkatraman Govindaraju
  • Patent number: 10997066
    Abstract: A storage device includes a descramble module configured to descramble at least a portion of a read command, which includes logical block address (LBA) information and first meta information, into first signature information and first physical address (PA) information, for accessing a flash memory. A compare module is provided, which is configured to compare the first signature information against stored signature information to thereby determine an equivalency or discrepancy therebetween. An access module is provided, which is configured to use the first PA information to access a data region of the flash memory, upon determination of the equivalency by said compare module.
    Type: Grant
    Filed: September 18, 2018
    Date of Patent: May 4, 2021
    Inventors: Dong-Woo Kim, Jae Sun No, Song Ho Yoon, Kyoung Back Lee, Wook Han Jeong
  • Patent number: 10990159
    Abstract: Systems, apparatuses, and methods for retaining architected state for relatively frequent switching between sleep and active operating states are described. A processor receives an indication to transition from an active state to a sleep state. The processor stores a copy of a first subset of the architected state information in on-die storage elements capable of retaining storage after power is turned off. The processor supports programmable input/output (PIO) access of particular stored information during the sleep state. When a wakeup event is detected, circuitry within the processor is powered up again. A boot sequence and recovery of architected state from off-chip memory are not performed. Rather than fetch from a memory location pointed to by a reset base address register, the processor instead fetches an instruction from a memory location pointed to by a restored program counter of the retained subset of the architected state information.
    Type: Grant
    Filed: April 25, 2017
    Date of Patent: April 27, 2021
    Assignee: Apple Inc.
    Inventors: Bernard Joseph Semeria, John H. Mylius, Pradeep Kanapathipillai, Richard F. Russo, Shih-Chieh Wen, Richard H. Larson
  • Patent number: 10963399
    Abstract: A memory system may include a storage device and a controller. The storage device may include a non-volatile memory device. The controller may include a device memory. The controller may control operations of the non-volatile memory device in accordance with a request of a host device. The controller includes a map data management circuit configured to: cache one or more segments from a plurality of map segment groups stored in the storage device, each segment having information including a reference count and mapping relationships between logical addresses and physical addresses; detect, among the one or more cached segments, an upload target segment in which the reference count is greater than a predetermined count; and transmit, when a predetermined number or greater of upload target segments are detected within a first map segment group, the predetermined number or greater of upload target segments to the host device.
    Type: Grant
    Filed: October 3, 2019
    Date of Patent: March 30, 2021
    Assignee: SK hynix Inc.
    Inventor: Eu Joon Byun
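The upload decision in the abstract above has two thresholds: a per-segment reference count and a per-group number of qualifying segments. A minimal sketch of that selection follows, assuming cached segments are plain dictionaries; the function name `segments_to_upload` and the keys `group`, `segment`, and `ref_count` are hypothetical.

```python
def segments_to_upload(cached, ref_threshold, min_targets):
    """Return {group: [segment ids]} for groups with enough hot segments.

    cached: list of dicts with hypothetical keys 'group', 'segment', 'ref_count'.
    A segment is an upload target when ref_count > ref_threshold; a group is
    uploaded only when it holds at least min_targets such segments.
    """
    by_group = {}
    for seg in cached:
        if seg["ref_count"] > ref_threshold:
            by_group.setdefault(seg["group"], []).append(seg["segment"])
    return {g: segs for g, segs in by_group.items() if len(segs) >= min_targets}
```

Batching by group means a single hot segment does not trigger a transfer to the host on its own.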
  • Patent number: 10896044
    Abstract: The techniques described herein provide an instruction fetch and decode unit having an operation cache with low latency in switching between fetching decoded operations from the operation cache and fetching and decoding instructions using a decode unit. This low latency is accomplished through a synchronization mechanism that allows work to flow through both the operation cache path and the instruction cache path until that work is stopped due to needing to wait on output from the opposite path. The existence of decoupling buffers in the operation cache path and the instruction cache path allows work to be held until that work is cleared to proceed. Other improvements, such as a specially configured operation cache tag array that allows for detection of multiple hits in a single cycle, also improve latency by, for example, improving the speed at which entries are consumed from a prediction queue that stores predicted address blocks.
    Type: Grant
    Filed: June 21, 2018
    Date of Patent: January 19, 2021
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Marius Evers, Dhanaraj Bapurao Tavare, Ashok Tirupathy Venkatachar, Arunachalam Annamalai, Donald A. Priore, Douglas R. Williams
  • Patent number: 10891135
    Abstract: A system and a method are disclosed to process instructions in an execution unit (EU) that includes an operand cache (OC). The OC stores a copy of at least one frequently used operand stored in a physical register file (PRF). The EU may process instructions using operands obtained from the PRF or from the OC. In the first mode, an OC renaming unit (OC-REN) indicates to the EU to process instructions using operands obtained from the OC if processing the instructions using operands obtained from the OC uses less power than using operands obtained from the PRF. In the second mode, the OC-REN indicates to the EU to process the instructions using operands obtained from the PRF if processing the instructions using operands obtained from the PRF uses less power than using operands obtained from the OC.
    Type: Grant
    Filed: March 6, 2019
    Date of Patent: January 12, 2021
    Inventors: Paul E. Kitchin, Nicholas Humphries, Ken Yu Lim, Ryan Hensley
  • Patent number: 10860324
    Abstract: An apparatus and method are provided for making predictions for branch instructions. The apparatus has a prediction queue for identifying instructions to be fetched for execution, and branch prediction circuitry for making predictions in respect of branch instructions, and for controlling which instructions are identified in the prediction queue in dependence on the predictions. During each prediction iteration, the branch prediction circuitry makes a prediction for a predict block comprising a sequence of M instructions. The branch prediction circuitry comprises a target prediction storage having a plurality of entries that are used to identify target addresses for branch instructions that are predicted as taken, the target prediction storage being arranged as an N-way set associative storage structure comprising a plurality of sets. Each predict block has an associated set within the target prediction storage.
    Type: Grant
    Filed: June 5, 2019
    Date of Patent: December 8, 2020
    Assignee: Arm Limited
    Inventors: Houdhaifa Bouzguarrou, Guillaume Bolbenes, Eddy Lapeyre, Luc Orion
  • Patent number: 10853224
    Abstract: Indexing and searching a bit-accurate trace for arbitrary length/arbitrary alignment values in traced thread(s). Indexing includes, while replaying a plurality of trace segments, identifying a set of n-grams for each trace segment that exist in processor data influx(es) and/or store(s) to a processor cache that resulted from replay of the trace segment. An index data structure, which associates each identified n-gram with trace location(s) at or in which the n-gram was found, is then generated. The index data structure thus associates unique n-grams with prior execution time(s) at or during which the traced thread(s) read or wrote the n-gram. Searching an indexed trace includes identifying n-grams in a query and using the index data structure to determine trace location(s) where these n-grams were seen during indexing. A query response is generated after using trace replay to locate particular execution time(s) and memory location(s) at which the n-grams occurred.
    Type: Grant
    Filed: November 30, 2018
    Date of Patent: December 1, 2020
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventor: Jordi Mola
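The n-gram index in the abstract above can be illustrated with a small in-memory sketch. It assumes each trace segment is already reduced to a list of byte strings observed during replay; the names `ngrams`, `build_index`, and `candidate_segments` and the default `n=2` are choices made for the example. The search returns candidate segments only; as the abstract notes, trace replay is still needed to pin down exact execution times and memory locations.

```python
from collections import defaultdict


def ngrams(data: bytes, n: int):
    """Return the set of all n-byte substrings of data."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def build_index(segments, n=2):
    """Map each n-gram to the ids of the trace segments in which it was seen."""
    index = defaultdict(set)
    for seg_id, values in enumerate(segments):
        for value in values:
            for gram in ngrams(value, n):
                index[gram].add(seg_id)
    return index


def candidate_segments(index, query: bytes, n=2):
    """Segments containing every n-gram of the query (a superset of true hits)."""
    grams = ngrams(query, n)
    if not grams:
        return set()  # query shorter than n: nothing to look up
    return set.intersection(*(index.get(g, set()) for g in grams))
```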
  • Patent number: 10776261
    Abstract: A storage apparatus managing method, applied to a first storage apparatus and a second storage apparatus coupled to an electronic apparatus, wherein the first storage apparatus comprises a local registering region and a global registering region, comprising: (a) receiving a read request indicating reading a target data unit from the second storage apparatus; (b) confirming whether the global registering region has the target data unit; (c) if yes, reading the target data unit from the global registering region, if not, confirming whether the local registering region has the target data unit; and (d) reading the target data unit from the local registering region if the local registering region has the target data unit, reading the target data unit from the second storage apparatus if the local registering region does not have the target data unit.
    Type: Grant
    Filed: July 5, 2018
    Date of Patent: September 15, 2020
    Assignee: Silicon Motion, Inc.
    Inventor: Chao-Yu Lin
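Steps (a) through (d) of the abstract above describe a fixed lookup order: global registering region, then local registering region, then the second storage apparatus. A minimal sketch, assuming each region is modeled as a dictionary; the function name `read_unit` and the returned source label are illustrative only.

```python
def read_unit(key, global_region, local_region, second_storage):
    """Read a target data unit, returning (value, where it was found)."""
    if key in global_region:          # step (b)/(c): global region first
        return global_region[key], "global"
    if key in local_region:           # step (c)/(d): then the local region
        return local_region[key], "local"
    return second_storage[key], "second"  # step (d): fall back to second storage
```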
  • Patent number: 10754775
    Abstract: A memory management unit responds to an invalidate by class command by identifying a marker for a class of cache entries that the invalidate by class command is meant to invalidate. The memory management unit stores the active marker as a retired marker and then sets the active marker to the next available marker. Thereafter, the memory management unit sends an acknowledgement signal (e.g., to the operating system) while invalidating the cache entries having the class and the retired marker in the background. By correlating markers with classes of cache entries, the memory management unit can more quickly respond to class invalidation requests.
    Type: Grant
    Filed: December 17, 2018
    Date of Patent: August 25, 2020
    Assignee: NVIDIA Corporation
    Inventors: Jay Gupta, Gosagan Padmanabhan, Devesh Mittal, Kaushal Agarwal
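The marker scheme in the abstract above lets the acknowledgement return before any entry is physically removed. The sketch below models that idea in software. It makes several assumptions not stated in the abstract: markers come from an unbounded counter, a lookup treats an entry with a stale marker as a miss, and the class `MarkerTLB` with its method names is hypothetical.

```python
class MarkerTLB:
    """Toy cache whose entries carry a (class, marker) pair."""

    def __init__(self):
        self.entries = {}     # key -> (cls, marker, value)
        self.active = {}      # cls -> currently active marker
        self.retired = []     # (cls, marker) pairs awaiting background cleanup
        self.next_marker = 0  # assumption: markers never run out

    def _active_marker(self, cls):
        if cls not in self.active:
            self.active[cls] = self.next_marker
            self.next_marker += 1
        return self.active[cls]

    def insert(self, key, cls, value):
        self.entries[key] = (cls, self._active_marker(cls), value)

    def lookup(self, key):
        entry = self.entries.get(key)
        if entry is None:
            return None
        cls, marker, value = entry
        # An entry tagged with a retired marker is logically invalid already.
        return value if marker == self.active.get(cls) else None

    def invalidate_class(self, cls):
        """Retire the active marker, pick a new one, acknowledge immediately."""
        if cls in self.active:
            self.retired.append((cls, self.active[cls]))
        self.active[cls] = self.next_marker
        self.next_marker += 1
        return "ack"

    def background_sweep(self):
        """Physically drop entries whose (class, marker) has been retired."""
        retired = set(self.retired)
        self.entries = {k: e for k, e in self.entries.items()
                        if (e[0], e[1]) not in retired}
        self.retired.clear()
```

The cost of `invalidate_class` is independent of the number of cached entries; the per-entry work is deferred to `background_sweep`.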
  • Patent number: 10747668
    Abstract: A shared cache memory can be logically partitioned among different workloads to provide isolation between workloads and avoid excessive resource contention. Each logical partition is apportioned a share of the cache memory, and is exclusive to a respective one of the workloads. Each partition has an initial size allocation. Historical data can be collected and processed for each partition and used to periodically update its size allocation.
    Type: Grant
    Filed: November 1, 2018
    Date of Patent: August 18, 2020
    Assignee: VMWARE, INC.
    Inventors: Zhihao Yao, Tan Li, Sunil Satnur, Kiran Joshi
  • Patent number: 10719442
    Abstract: An apparatus and method for prioritizing transactional memory regions. For example, one embodiment of a processor comprises: a plurality of cores to execute threads comprising sequences of instructions, at least some of the instructions specifying a transactional memory region; a cache of each core to store a plurality of cache lines; transactional memory circuitry of each core to manage execution of the transactional memory (TM) regions based on priorities associated with each of the TM regions; and wherein the transactional memory circuitry, upon detecting a conflict between a first TM region having a first priority value and a second TM region having a second priority value, is to determine which of the first TM region or the second TM region is permitted to continue executing and which is to be aborted based, at least in part, on the first and second priority values.
    Type: Grant
    Filed: September 10, 2018
    Date of Patent: July 21, 2020
    Assignee: Intel Corporation
    Inventors: Ren Wang, Raanan Sade, Yipeng Wang, Tsung-Yuan Tai, Sameh Gobriel
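The conflict rule in the abstract above, where priority values decide which transactional memory region continues, can be sketched in a few lines. The dictionary representation, the function name `resolve_conflict`, the convention that a larger value means higher priority, and the tie-break in favor of the first region are all assumptions; the abstract only says the decision is based at least in part on the two priority values.

```python
def resolve_conflict(first, second):
    """Return (winner, aborted) for two conflicting TM regions.

    Each region is a dict with a hypothetical 'priority' key.
    Assumption: higher value wins, and ties favor the first region.
    """
    if second["priority"] > first["priority"]:
        return second, first
    return first, second
```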
  • Patent number: 10719448
    Abstract: A cache is presented. The cache comprises a tag array configured to store one or more tag addresses, a data array configured to store data acquired from a dynamic random access memory device, and a cache controller. The cache controller is configured to: receive a cache access request; determine, based on an indication associated with the cache access request, a cache access policy; and perform an operation to the tag array and to the data array based on the determined cache access policy.
    Type: Grant
    Filed: June 13, 2017
    Date of Patent: July 21, 2020
    Assignee: ALIBABA GROUP HOLDING LIMITED
    Inventor: Xiaowei Jiang
  • Patent number: 10713054
    Abstract: A processor includes two or more branch target buffer (BTB) tables for branch prediction, each BTB table storing entries of a different target size or width or storing entries of a different branch type. Each BTB entry includes at least a tag and a target address. For certain branch types that only require a few target address bits, the respective BTB tables are narrower, thereby allowing for more BTB entries in the processor, separated into respective BTB tables by branch instruction type. An increased number of available BTB entries are stored in the same or less space in the processor, thereby increasing the speed of instruction processing. BTB tables can be defined that do not store any target address and rely on a decode unit to provide it. High value BTB entries have dedicated storage and are therefore less likely to be evicted than low value BTB entries.
    Type: Grant
    Filed: July 9, 2018
    Date of Patent: July 14, 2020
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Thomas Cloqueur, Anthony Jarvis
  • Patent number: 10657063
    Abstract: The present invention discloses a data access device and method applicable to a processor. An embodiment of the data access device comprises: an instruction cache memory; a data cache memory; a processor circuit configured to read specific data from the instruction cache memory for the Nth time and read the specific data from the data cache memory for the Mth time, in which both N and M are positive integers and M is greater than N; a duplication circuit configured to copy the specific data from the instruction cache memory to the data cache memory when the processor circuit reads the specific data for the Nth time; and a decision circuit configured to determine whether data requested by a read request from the processor circuit are stored in the data cache memory according to the read request.
    Type: Grant
    Filed: July 13, 2018
    Date of Patent: May 19, 2020
    Assignee: REALTEK SEMICONDUCTOR CORPORATION
    Inventors: Yen-Ju Lu, Chao-Wei Huang
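The read path in the abstract above (copy on the Nth read from the instruction cache, serve the later Mth read from the data cache) can be modeled with two dictionaries. This sketch fixes N = 1, meaning the copy happens on the first read; that choice, the class name `DualCache`, and the returned source label are assumptions for illustration.

```python
class DualCache:
    """Toy model: data first read via the instruction cache is duplicated
    into the data cache, so later reads are served from the data cache."""

    def __init__(self, icache):
        self.icache = dict(icache)
        self.dcache = {}

    def read(self, addr):
        # Decision step: is the requested data already in the data cache?
        if addr in self.dcache:
            return self.dcache[addr], "dcache"
        # Nth read (N = 1 here): fetch from the instruction cache and
        # duplicate into the data cache for subsequent reads.
        value = self.icache[addr]
        self.dcache[addr] = value
        return value, "icache"
```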