Instruction Data Cache Patents (Class 711/125)
-
Patent number: 12135606
Abstract: An electronic apparatus including a memory; and a processor including at least one core, among a plurality of cores, that is configured to execute an instruction corresponding to at least one safety function. The processor is further configured to, based on at least one instruction being executed in the at least one core while the electronic apparatus operates in a first state, identify whether the at least one instruction corresponds to the safety function based on an output of a trained neural network model; and based on a result of the identification, determine an operation state of the electronic apparatus as one of the first state or a second state.
Type: Grant
Filed: December 4, 2020
Date of Patent: November 5, 2024
Assignee: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Sangyoung Hwang, Woomok Kim
-
Patent number: 12008252
Abstract: Various illustrative aspects are directed to a data storage device, comprising one or more disks; an actuator mechanism configured to position heads proximate to a recording medium of the disks; and one or more processing devices. The processing devices are configured to detect a criterion for inserting padding on the recording medium proximate to data containers to be written to the recording medium, the containers configured for assigning logic blocks to the containers, the logic blocks configured to store data to be written in an interleaved pattern across sectors based on a distributed sector encoding scheme, wherein detecting the criterion comprises detecting a mismatch in size between at least a portion of a zone and an integer number of containers in which to write the at least a portion of the zone; and insert mapping indicators to a mapping to indicate padding blocks proximate to the containers.
Type: Grant
Filed: June 27, 2022
Date of Patent: June 11, 2024
Assignee: WESTERN DIGITAL TECHNOLOGIES, INC.
Inventors: Scott Burton, Daniel J. Wade, Eric B. Smith
-
Patent number: 11934342
Abstract: Embodiments are generally directed to graphics processor data access and sharing. An embodiment of an apparatus includes a circuit element to produce a result in processing of an application; a load-store unit to receive the result and generate pre-fetch information for a cache utilizing the result; and a prefetch generator to produce prefetch addresses based at least in part on the pre-fetch information; wherein the load-store unit is to receive software assistance for prefetching, and wherein generation of the pre-fetch information is based at least in part on the software assistance.
Type: Grant
Filed: March 14, 2020
Date of Patent: March 19, 2024
Assignee: INTEL CORPORATION
Inventors: Altug Koker, Varghese George, Aravindh Anantaraman, Valentin Andrei, Abhishek R. Appu, Niranjan Cooray, Nicolas Galoppo Von Borries, Mike MacPherson, Subramaniam Maiyuran, ElMoustapha Ould-Ahmed-Vall, David Puffer, Vasanth Ranganathan, Joydeep Ray, Ankur N. Shah, Lakshminarayanan Striramassarma, Prasoonkumar Surti, Saurabh Tangri
-
Patent number: 11886344
Abstract: A cache system includes a computational cache and a computational cache miss-handler. The computational cache is configured to cache state vectors and perform read-modify-write (RMW) operations on the cached state vectors responsive to received RMW commands. The computational cache miss-handler is configured to perform RMW operations on state vectors stored in a memory responsive to cache misses in the computational cache. The memory is external to the cache system.
Type: Grant
Filed: June 29, 2022
Date of Patent: January 30, 2024
Assignee: Xilinx, Inc.
Inventors: Noel J. Brady, Lars-Olof B Svensson
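A minimal functional sketch of the computational-cache idea described above: a read-modify-write command is applied in the cache on a hit, while the miss-handler applies it directly to the external memory on a miss. The class and method names are illustrative assumptions, not taken from the patent.

```python
# Sketch only: a hit applies the RMW in the cache; a miss lets the
# miss-handler apply it in the external memory. Interfaces are invented.
class ComputationalCache:
    def __init__(self, external_memory):
        self.cached = {}               # address -> cached state vector
        self.memory = external_memory  # memory external to the cache system

    def rmw(self, address, modify):
        if address in self.cached:     # hit: RMW on the cached copy
            self.cached[address] = modify(self.cached[address])
        else:                          # miss: miss-handler RMW in memory
            self.memory[address] = modify(self.memory[address])
```

The point of the split is that misses need not first fill the cache: the update can be performed in place in memory.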
-
Patent number: 11886744
Abstract: A method, computer program product, apparatus, and system are provided. Some embodiments may include transmitting a request to make one or more writes associated with an identification tag. The request may include the identification tag, the one or more writes, a first instruction to make the one or more writes to one of a plurality of persistence levels of a memory, and a second instruction to respond with at least one first indication that the one or more writes associated with the identification tag have been written to at least one of the one of the plurality of persistence levels of the memory. Some embodiments may include receiving the at least one first indication that the one or more writes associated with the identification tag have been written to at least one of the one of the plurality of persistence levels of the memory.
Type: Grant
Filed: December 15, 2021
Date of Patent: January 30, 2024
Assignee: NVIDIA CORPORATION
Inventor: Stephen David Glaser
-
Patent number: 11847048
Abstract: A processing device and methods of controlling remote persistent writes are provided. Methods include receiving an instruction of a program to issue a persistent write to remote memory. The methods also include logging an entry in a local domain when the persistent write instruction is received and providing a first indication that the persistent write will be persisted to the remote memory. The methods also include executing the persistent write to the remote memory and providing a second indication that the persistent write to the remote memory is completed. The methods also include providing the first and second indications when it is determined not to execute the persistent write according to global ordering and providing the second indication without providing the first indication when it is determined to execute the persistent write to remote memory according to global ordering.
Type: Grant
Filed: September 24, 2020
Date of Patent: December 19, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: Nuwan Jayasena, Shaizeen Aga
-
Patent number: 11848980
Abstract: In general, this disclosure describes techniques for applying a distributed pipeline model in a distributed computing system to cause processing nodes of the distributed computing system to process data according to a distributed pipeline having an execution topology, specified within a pipeline statement, to perform a task.
Type: Grant
Filed: February 18, 2021
Date of Patent: December 19, 2023
Assignee: BORAY DATA TECHNOLOGY CO. LTD.
Inventors: Raymond John Huetter, Alka Yamarti, Craig Alexander McIntyre
-
Patent number: 11748622
Abstract: A computing system is configured to access intermediate outputs of a neural network by augmenting a data flow graph generated for the neural network. The data flow graph includes a plurality of nodes interconnected by connections, each node representing an operation to be executed by the neural network. To access the intermediate output, the data flow graph is augmented by inserting a node representing an operation that saves the output of a node which produces the intermediate output. The node representing the save operation is inserted while maintaining all existing nodes and connections in the data flow graph, thereby preserving the behavior of the data flow graph. The augmenting can be performed using a compiler that generates the data flow graph from program code.
Type: Grant
Filed: March 4, 2019
Date of Patent: September 5, 2023
Assignee: Amazon Technologies, Inc.
Inventors: Drazen Borkovic, Se jong Oh
-
Patent number: 11726793
Abstract: Embodiments described herein provide an apparatus comprising a plurality of processing resources including a first processing resource and a second processing resource, a memory communicatively coupled to the first processing resource and the second processing resource, and a processor to receive data dependencies for one or more tasks comprising one or more producer tasks executing on the first processing resource and one or more consumer tasks executing on the second processing resource and move a data output from one or more producer tasks executing on the first processing resource to a cache memory communicatively coupled to the second processing resource. Other embodiments may be described and claimed.
Type: Grant
Filed: November 11, 2020
Date of Patent: August 15, 2023
Assignee: INTEL CORPORATION
Inventors: Christopher J. Hughes, Prasoonkumar Surti, Guei-Yuan Lueh, Adam T. Lake, Jill Boyce, Subramaniam Maiyuran, Lidong Xu, James M. Holland, Vasanth Ranganathan, Nikos Kaburlasos, Altug Koker, Abhishek R. Appu
-
Patent number: 11726912
Abstract: Systems and methods are disclosed for performing wide memory operations for a wide data cache line. In some examples of the disclosed technology, a processor having two or more execution lanes includes a data cache coupled to memory, a wide memory load circuit that concurrently loads two or more words from a cache line of the data cache, and a writeback circuit situated to send a respective word of the concurrently-loaded words to a selected execution lane of the processor, either into an operand buffer or bypassing the operand buffer. In some examples, a sharding circuit is provided that allows bitwise, byte-wise, and/or word-wise manipulation of memory operation data. In some examples, wide cache loads allow for concurrent execution of plural execution lanes of the processor.
Type: Grant
Filed: March 29, 2021
Date of Patent: August 15, 2023
Assignee: Microsoft Technology Licensing, LLC
Inventors: Douglas C. Burger, Aaron L. Smith, Gagan Gupta, David T. Harper
-
Patent number: 11693776
Abstract: A processing unit includes a processor core and an associated cache memory. The cache memory establishes a reservation of a hardware thread of the processor core for a store target address and services a store-conditional request of the processor core by conditionally updating the shared memory with store data based on whether the hardware thread has a reservation for the store target address. The cache memory receives a hint associated with the store-conditional request indicating an intent of the store-conditional request. The cache memory protects the store target address against access by any conflicting memory access request during a protection window extension following servicing of the store-conditional request. The cache memory establishes a first duration for the protection window extension based on the hint having a first value and establishes a different second duration for the protection window extension based on the hint having a different second value.
Type: Grant
Filed: June 18, 2021
Date of Patent: July 4, 2023
Assignee: International Business Machines Corporation
Inventors: Derek E. Williams, Guy L. Guthrie, Hugh Shen, Jeffrey A. Stuecheli
-
Patent number: 11681531
Abstract: Apparatus and methods are disclosed for controlling execution of memory access instructions in a block-based processor architecture using a hardware structure that indicates a relative ordering of memory access instructions in an instruction block. In one example of the disclosed technology, a method of executing an instruction block having a plurality of memory load and/or memory store instructions includes selecting a next memory load or memory store instruction to execute based on dependencies encoded within the block, and on a store vector that stores data indicating which memory load and memory store instructions in the instruction block have executed. The store vector can be masked using a store mask. The store mask can be generated when decoding the instruction block, or copied from an instruction block header. Based on the encoded dependencies and the masked store vector, the next instruction can issue when its dependencies are available.
Type: Grant
Filed: October 23, 2015
Date of Patent: June 20, 2023
Assignee: Microsoft Technology Licensing, LLC
Inventors: Douglas C. Burger, Aaron L. Smith
-
Patent number: 11675630
Abstract: Methods, apparatus, systems and articles of manufacture are disclosed to configure heterogeneous components in an accelerator. An example apparatus includes a graph compiler to identify a workload node in a workload and generate a selector for the workload node, and the selector to identify an input condition and an output condition of a compute building block, wherein the graph compiler is to, in response to obtaining the identified input condition and output condition from the selector, map the workload node to the compute building block.
Type: Grant
Filed: August 15, 2019
Date of Patent: June 13, 2023
Assignee: INTEL CORPORATION
Inventors: Michael Behar, Moshe Maor, Ronen Gabbai, Roni Rosner, Zigi Walter, Oren Agam
-
Patent number: 11675594
Abstract: Embodiments of instructions are detailed herein including one or more of 1) a branch fence instruction, prefix, or variants (BFENCE); 2) a predictor fence instruction, prefix, or variants (PFENCE); 3) an exception fence instruction, prefix, or variants (EFENCE); 4) an address computation fence instruction, prefix, or variants (AFENCE); 5) a register fence instruction, prefix, or variants (RFENCE); and, additionally, modes that apply the above semantics to some or all ordinary instructions.
Type: Grant
Filed: December 28, 2018
Date of Patent: June 13, 2023
Assignee: Intel Corporation
Inventors: Robert S. Chappell, Jason W. Brandt, Alan Cox, Asit Mallick, Joseph Nuzman, Arjan Van De Ven
-
Patent number: 11586441
Abstract: Systems, apparatuses, and methods for virtualizing a micro-operation cache are disclosed. A processor includes at least a micro-operation cache, a conventional cache subsystem, a decode unit, and control logic. The decode unit decodes instructions into micro-operations which are then stored in the micro-operation cache. The micro-operation cache has limited capacity for storing micro-operations. When new micro-operations are decoded from pending instructions, existing micro-operations are evicted from the micro-operation cache to make room for the new micro-operations. Rather than being discarded, micro-operations evicted from the micro-operation cache are stored in the conventional cache subsystem. This prevents the original instruction from having to be decoded again on subsequent executions.
Type: Grant
Filed: December 17, 2020
Date of Patent: February 21, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: John Kalamatianos, Jagadish B. Kotra
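The eviction flow described in this abstract can be sketched functionally: instead of discarding micro-operations evicted from the micro-op cache, park them in a backing (conventional) cache so a later miss can be refilled without re-decoding. The class, the LRU policy, and the decode stand-in are all illustrative assumptions.

```python
# Illustrative sketch, not the patented implementation: evicted micro-ops
# move to a backing cache; a later miss checks there before re-decoding.
from collections import OrderedDict

class VirtualizedUopCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.uop_cache = OrderedDict()  # pc -> micro-ops, in LRU order
        self.backing_cache = {}         # stands in for the cache subsystem
        self.decodes = 0                # full decodes actually performed

    def _decode(self, pc):
        self.decodes += 1
        return ("uops-for", pc)         # stand-in for a real decode unit

    def fetch(self, pc):
        if pc in self.uop_cache:        # micro-op cache hit
            self.uop_cache.move_to_end(pc)
            return self.uop_cache[pc]
        # miss: refill from the backing cache if the entry was parked there
        uops = self.backing_cache.pop(pc, None)
        if uops is None:
            uops = self._decode(pc)     # nowhere to refill from: decode again
        if len(self.uop_cache) >= self.capacity:
            old_pc, old_uops = self.uop_cache.popitem(last=False)
            self.backing_cache[old_pc] = old_uops  # park, don't discard
        self.uop_cache[pc] = uops
        return uops
```

With capacity 2, fetching addresses 0, 1, 2 evicts 0 into the backing cache; a later fetch of 0 refills from there without another decode.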
-
Patent number: 11579873
Abstract: An apparatus is described with support for transactional memory and load/store-exclusive instructions using an exclusive monitor indication to track exclusive access to a given address. In response to a predetermined type of load instruction specifying a load target address, which is executed within a given transaction, any exclusive monitor indication previously set for the load target address is cleared. In response to a load-exclusive instruction, an abort is triggered for a transaction for which the given address is specified as one of its working set of addresses. This helps to maintain mutual exclusion between transactional and non-transactional threads even if there is load speculation in the non-transactional thread.
Type: Grant
Filed: May 9, 2019
Date of Patent: February 14, 2023
Assignee: Arm Limited
Inventors: Matthew James Horsnell, Grigorios Magklis, Richard Roy Grisenthwaite, Nathan Yong Seng Chong
-
Patent number: 11507640
Abstract: Aspects for vector operations in neural network are described herein. The aspects may include a vector caching unit configured to store a first vector and a second vector, wherein the first vector includes one or more first elements and the second vector includes one or more second elements. The aspects may further include one or more adders and a combiner. The one or more adders may be configured to respectively add each of the first elements to a corresponding one of the second elements to generate one or more addition results. The combiner may be configured to combine the one or more addition results into an output vector.
Type: Grant
Filed: October 26, 2018
Date of Patent: November 22, 2022
Assignee: CAMBRICON TECHNOLOGIES CORPORATION LIMITED
Inventors: Jinhua Tao, Tian Zhi, Shaoli Liu, Tianshi Chen, Yunji Chen
-
Patent number: 11487545
Abstract: A processor branch prediction circuit employs back-invalidation of prediction cache entries based on decoded branch instructions. The execution information of a previously executed branch instruction is obtained from a prediction cache entry and compared to generated decode information in an instruction decode circuit. Execution information of branch instructions stored in the prediction cache entry is updated in response to a mismatch of the execution information and the decode information of the branch instruction. Existing branch prediction circuits invalidate prediction cache entries of a block of instructions when the block of instructions is invalidated in an instruction cache. As a result, valid branch instruction execution information may be unnecessarily discarded. Updating prediction cache entries in response to a mismatch of the execution information and the decode information of the branch instruction maintains the execution information in the prediction cache.
Type: Grant
Filed: March 4, 2021
Date of Patent: November 1, 2022
Assignee: Microsoft Technology Licensing, LLC
Inventors: Daren E. Streett, Rami Mohammad Al Sheikh, Michael Scott McIlvaine, Richard W. Doing, Robert Douglas Clancy
-
Patent number: 11481330
Abstract: Methods, systems, and devices for cache architectures for memory devices are described. For example, a memory device may include a main array having a first set of memory cells, a cache having a second set of memory cells, and a cache delay register configured to store an indication of cache addresses associated with recently performed access operations. In some examples, the cache delay register may be operated as a first-in-first-out (FIFO) register of cache addresses, where a cache address associated with a performed access operation may be added to the beginning of the FIFO register, and a cache address at the end of the FIFO register may be purged. Information associated with access operations on the main array may be maintained in the cache, and accessed directly (e.g., without another accessing of the main array), at least as long as the cache address is present in the cache delay register.
Type: Grant
Filed: June 3, 2020
Date of Patent: October 25, 2022
Assignee: Micron Technology, Inc.
Inventor: Nicola Del Gatto
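The cache delay register described here behaves like a bounded FIFO of recently accessed cache addresses: new addresses enter at the beginning, the oldest is purged from the end, and an access can bypass the main array only while its address is still present. A toy model under those assumptions (the depth and the lookup interface are invented for illustration):

```python
# Toy FIFO model of the cache delay register; not the device's actual logic.
from collections import deque

class CacheDelayRegister:
    def __init__(self, depth):
        # maxlen makes the deque purge the oldest entry automatically
        self.fifo = deque(maxlen=depth)

    def record_access(self, cache_address):
        self.fifo.appendleft(cache_address)  # newest at the beginning

    def serve_from_cache(self, cache_address):
        # address still tracked -> skip another access of the main array
        return cache_address in self.fifo
```

Once a newer access pushes an address off the end of the FIFO, the next access to that address would go back to the main array.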
-
Patent number: 11481328
Abstract: A technique includes, in response to a cache miss occurring with a given processing node of a plurality of processing nodes, using a directory-based coherence system for the plurality of processing nodes to regulate snooping of an address that is associated with the cache miss. Using the directory-based coherence system to regulate whether the address is included in a snooping domain is based at least in part on a number of cache misses associated with the address.
Type: Grant
Filed: July 10, 2020
Date of Patent: October 25, 2022
Assignee: Hewlett Packard Enterprise Development LP
Inventors: Alexandros Daglis, Paolo Faraboschi, Qiong Cai, Gary Gostin
-
Patent number: 11409539
Abstract: Devices and techniques for on-demand programmable atomic kernel loading are described herein. A programmable atomic unit (PAU) of a memory controller can receive an invocation of a programmable atomic operator by the memory controller. The PAU can then perform a verification on a programmable atomic operator partition for the programmable atomic operator. Here, the programmable atomic operator partition is located in a memory of the PAU. The PAU can then signal a trap in response to the verification indicating that the programmable atomic operator partition is not prepared.
Type: Grant
Filed: October 20, 2020
Date of Patent: August 9, 2022
Assignee: Micron Technology, Inc.
Inventors: Dean E. Walker, Tony Brewer, Chris Baronne
-
Patent number: 11379381
Abstract: A main memory device includes a first memory device; and a second memory device having an access latency different from that of the first memory device. The first memory device determines, based on an access count for at least one region of the first memory device, a hot page included in the at least one region.
Type: Grant
Filed: October 4, 2019
Date of Patent: July 5, 2022
Assignee: SK hynix Inc.
Inventors: Mi Seon Han, Yun Jeong Mun, Young Pyo Joo
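The hot-page determination above reduces to counting accesses per page within a region and flagging pages whose count crosses a threshold. A minimal sketch under that reading, with the threshold and counting granularity assumed for illustration:

```python
# Illustrative only: flag "hot" pages by per-page access count.
from collections import Counter

def find_hot_pages(accessed_pages, threshold):
    """Return the set of pages accessed at least `threshold` times."""
    counts = Counter(accessed_pages)
    return {page for page, n in counts.items() if n >= threshold}
```

In a tiered memory, pages flagged this way would be candidates to keep in (or migrate to) the lower-latency device.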
-
Patent number: 11341117
Abstract: Systems and methods for evicting and inserting an entry for a deduplication table are described.
Type: Grant
Filed: January 9, 2020
Date of Patent: May 24, 2022
Assignee: Pure Storage, Inc.
Inventors: John Colgrove, Joseph S. Hasbani, John Martin Hayes, Ethan L. Miller, Cary A. Sandvig
-
Patent number: 11334384
Abstract: Systems, apparatuses, and methods for implementing scheduler queue assignment burst mode are disclosed. A scheduler queue assignment unit receives a dispatch packet with a plurality of operations from a decode unit in each clock cycle. The scheduler queue assignment unit determines if the number of operations in the dispatch packet for any class of operations is greater than a corresponding threshold for dispatching to the scheduler queues in a single cycle. If the number of operations for a given class is greater than the corresponding threshold, and if a burst mode counter is less than a burst mode window threshold, the scheduler queue assignment unit dispatches the extra number of operations for the given class in a single cycle. By operating in burst mode for a given operation class during a small number of cycles, processor throughput can be increased without starving the processor of other operation classes.
Type: Grant
Filed: December 10, 2019
Date of Patent: May 17, 2022
Assignee: Advanced Micro Devices, Inc.
Inventors: Alok Garg, Scott Andrew McLelland, Marius Evers, Matthew T. Sobel
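The burst-mode decision in this abstract can be captured in a few lines: extras above the per-cycle threshold still dispatch in the current cycle, but only while a burst counter stays under the window threshold. The function below is a simplified sketch; the threshold values, the counter-update rule, and the names are assumptions.

```python
# Simplified sketch of the burst-mode dispatch decision for one op class.
def ops_to_dispatch(num_ops, per_cycle_threshold,
                    burst_counter, burst_window_threshold):
    """Return (ops dispatched this cycle, updated burst counter)."""
    if num_ops <= per_cycle_threshold:
        return num_ops, burst_counter            # normal dispatch
    if burst_counter < burst_window_threshold:
        return num_ops, burst_counter + 1        # burst: dispatch extras too
    return per_cycle_threshold, burst_counter    # burst budget exhausted
```

Capping the counter is what keeps bursts rare enough that other operation classes are not starved.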
-
Patent number: 11301251
Abstract: Systems and methods are disclosed for fetch stage handling of indirect jumps in a processor pipeline. For example, a method includes detecting a sequence of instructions fetched by a processor core, wherein the sequence of instructions includes a first instruction, with a result that depends on an immediate field of the first instruction and a program counter value, followed by a second instruction that is an indirect jump instruction; responsive to detection of the sequence of instructions, preventing an indirect jump target predictor circuit from generating a target address prediction for the second instruction; and, responsive to detection of the sequence of instructions, determining a target address for the second instruction before the first instruction is issued to an execution stage of a pipeline of the processor core.
Type: Grant
Filed: April 23, 2020
Date of Patent: April 12, 2022
Assignee: SiFive, Inc.
Inventors: Joshua Smith, Krste Asanovic, Andrew Waterman
-
Patent number: 11294678
Abstract: Systems, apparatuses, and methods for implementing scheduler queue assignment logic are disclosed. A processor includes at least a decode unit, scheduler queue assignment logic, scheduler queues, pickers, and execution units. The assignment logic receives a plurality of operations from a decode unit in each clock cycle. The assignment logic includes a separate logical unit for each different type of operation which is executable by the different execution units of the processor. For each different type of operation, the assignment logic determines which of the possible assignment permutations are valid for assigning different numbers of operations to scheduler queues in a given clock cycle. The assignment logic receives an indication of how many operations to assign in the given clock cycle, and then the assignment logic selects one of the valid assignment permutations for the number of operations specified by the indication.
Type: Grant
Filed: May 29, 2018
Date of Patent: April 5, 2022
Assignee: Advanced Micro Devices, Inc.
Inventors: Matthew T. Sobel, Donald A. Priore, Alok Garg
-
Patent number: 11249762
Abstract: An apparatus and method are provided for handling incorrect branch direction predictions. The apparatus has processing circuitry for executing instructions, branch prediction circuitry for making branch direction predictions in respect of branch instructions, and fetch circuitry for fetching instructions from an instruction cache in dependence on the branch direction predictions and for forwarding the fetched instructions to the processing circuitry for execution. A cache location buffer stores cache location information for a given branch instruction for which accuracy of the branch direction predictions made by the branch prediction circuitry is below a determined threshold. The cache location information identifies where within the instruction cache one or more instructions are stored that will need to be executed in the event that a subsequent branch direction prediction made for the given branch instruction is incorrect.
Type: Grant
Filed: October 24, 2019
Date of Patent: February 15, 2022
Assignee: Arm Limited
Inventors: Houdhaifa Bouzguarrou, Guillaume Bolbenes, Thibaut Elie Lanois
-
Patent number: 11243767
Abstract: A caching device, an instruction cache, a system for processing an instruction, a method and apparatus for processing data and a medium are provided. The caching device includes a first queue, a second queue, a write port group, a read port, a first pop-up port, a second pop-up port and a press-in port. The write port group is configured to write cache data into a set storage address in the first queue and/or the second queue; the read port is configured to read all cache data from the first queue and/or the second queue at one time; the press-in port is configured to press cache data into the first queue and/or the second queue; the first pop-up port is configured to pop up cache data from the first queue; and the second pop-up port is configured to pop up cache data from the second queue.
Type: Grant
Filed: September 11, 2020
Date of Patent: February 8, 2022
Assignee: Beijing Baidu Netcom Science and Technology Co., Ltd.
Inventors: Chao Tang, Xueliang Du, Yingnan Xu, Kang An
-
Patent number: 11237730
Abstract: A method for improving cache hit ratios for selected volumes within a storage system is disclosed. In one embodiment, such a method includes monitoring I/O to multiple volumes residing on a storage system. The storage system includes a cache to store data associated with the volumes. The method determines, from the I/O, which particular volumes of the multiple volumes would benefit the most if provided favored status in the cache. The favored status provides increased residency time in the cache to the particular volumes compared to volumes not having the favored status. The method generates a list of the particular volumes and transmits the list to the storage system. The storage system, in turn, provides increased residency time to the particular volumes in accordance with their favored status. A corresponding system and computer program product are also disclosed.
Type: Grant
Filed: May 12, 2019
Date of Patent: February 1, 2022
Assignee: International Business Machines Corporation
Inventors: Lokesh M. Gupta, Beth A. Peterson, Kevin J. Ash, Kyler A. Anderson
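One simple way to realize "increased residency time" for favored volumes is an eviction policy that skips favored entries until only non-favored candidates remain. The sketch below illustrates that idea only; the actual policy in the patent is not specified at this level, and all names here are invented.

```python
# Illustrative favored-volume eviction policy (an assumption, not IBM's):
# evict the least-recently-used NON-favored entry first.
from collections import OrderedDict

class FavoredVolumeCache:
    def __init__(self, capacity, favored_volumes):
        self.capacity = capacity
        self.favored = set(favored_volumes)
        self.entries = OrderedDict()   # (volume, block) -> data, LRU order

    def put(self, volume, block, data):
        key = (volume, block)
        if key in self.entries:
            self.entries.move_to_end(key)
        elif len(self.entries) >= self.capacity:
            self._evict_one()
        self.entries[key] = data

    def _evict_one(self):
        # scan from least- to most-recently-used, skipping favored volumes
        for key in self.entries:
            if key[0] not in self.favored:
                del self.entries[key]
                return
        self.entries.popitem(last=False)  # all favored: plain LRU fallback
```

The effect is that favored volumes' data outlives equally old data from non-favored volumes, which is the residency advantage the abstract describes.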
-
Patent number: 11112774
Abstract: A numerical controller determines whether or not block prefetching from a program is sufficient based on whether at least one block subsequent to a predetermined reference block, which exists after a running block of the program and is needed to execute the reference block, has been prefetched or not. If the determination result is that prefetching is not sufficient, block prefetching from the program is performed.
Type: Grant
Filed: March 16, 2020
Date of Patent: September 7, 2021
Assignee: Fanuc Corporation
Inventor: Nobuhito Oonishi
-
Patent number: 11108833
Abstract: A method and devices for handling crossed-invite situations in set-up of IP-based sessions. A local device receives an incoming session invite after sending an outgoing session invite before the outgoing session invite has been accepted. It then determines that the incoming session invite was sent by the remote device to which the outgoing session invite is also addressed. The method includes determining a remote device priority value from identifying information contained in the incoming session invite, comparing the remote device priority value with a local device priority value to determine whether the remote device or the local device is higher priority and, if the remote device is higher priority, canceling the outgoing session invite and displaying an incoming call answer screen for the incoming session invite, and if the local device is higher priority, waiting for cancelation of the incoming session invite and acceptance of the outgoing session invite.
Type: Grant
Filed: June 6, 2016
Date of Patent: August 31, 2021
Assignee: BlackBerry Limited
Inventors: Bechir Trabelsi, Andrew Michael Allen, Kevin N. Chen, Lawrence Edward Kuhl
-
Patent number: 11099995
Abstract: Examples include techniques to prefetch data from a first level of memory of a hierarchical arrangement of memory to a second level of memory of the hierarchical arrangement of memory. Examples include circuitry for a processor receiving a prefetch request from a core of the processor to prefetch data from the first level to the second level. The prefetch request indicates an amount of data to prefetch that is greater than the data capacity of a cache line utilized by the core.
Type: Grant
Filed: March 28, 2018
Date of Patent: August 24, 2021
Assignee: Intel Corporation
Inventors: Michael Klemm, Thomas Willhalm
-
Patent number: 11080195
Abstract: The size of a cache is modestly increased so that a short pointer to a predicted next memory address in the same cache is added to each cache line in the cache. In response to a cache hit, the predicted next memory address identified by the short pointer in the cache line of the hit along with an associated entry are pushed to a next faster cache when a valid short pointer to the predicted next memory address is present in the cache line of the hit.
Type: Grant
Filed: September 10, 2019
Date of Patent: August 3, 2021
Assignee: Marvell Asia Pte, Ltd.
Inventors: Shay Gal-On, Srilatha Manne, Edward McLellan, Alexander Rucker
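The mechanism above can be modeled in a few lines: each cache line carries an optional short pointer to a predicted next address in the same cache, and on a hit the pointed-to line is pushed into the next faster cache. The two-level dictionary model and all names are assumptions made for the sketch.

```python
# Toy model of pointer-chasing prefetch between cache levels (illustrative).
class PointerPrefetchCache:
    def __init__(self):
        self.lines = {}         # address -> (data, next_pointer or None)
        self.faster_cache = {}  # stands in for the next faster cache level

    def install(self, address, data, next_pointer=None):
        self.lines[address] = (data, next_pointer)

    def read(self, address):
        data, next_pointer = self.lines[address]  # cache hit
        if next_pointer is not None and next_pointer in self.lines:
            # valid short pointer: push the predicted line up a level
            self.faster_cache[next_pointer] = self.lines[next_pointer][0]
        return data
```

The "short" in short pointer matters for the hardware cost: because the target is in the same cache, only enough bits to index that cache are stored per line, not a full address.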
-
Patent number: 11048440
Abstract: A memory system includes a memory device having a plurality of memory blocks and a subcommand storage circuit, and a memory controller for controlling the memory device, wherein the memory device is capable of being in one or more of a ready state, a first busy state, and a second busy state, and wherein the subcommand is stored in the subcommand storage circuit when the subcommand is received from the memory controller in the first busy state and the subcommand is executable after the first busy state is released, and the subcommand stored in the subcommand storage circuit is executed after the memory device is changed to the ready state.
Type: Grant
Filed: August 26, 2019
Date of Patent: June 29, 2021
Assignee: SK hynix Inc.
Inventors: Sung-Won Bae, Jun-Hyuk Lee, Deung-Kak Yoo, Min-Kyu Lee
-
Patent number: 11042462
Abstract: Identifying computer program execution characteristics to determine relevance of pattern instruction executions to determine characteristics of a computer program. Filters are utilized to determine which subsequent occurrences of execution of at least one computer instruction are relevant to a counter based on execution characteristics of the at least one computer instruction, where the counter counts the subsequent occurrences of execution of at least one computer instruction following prior executions of the same at least one computer instruction.
Type: Grant
Filed: September 4, 2019
Date of Patent: June 22, 2021
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Anthony Thomas Sofia, Peter Sutton, Robert W. St. John, Matthias Klein
-
Patent number: 11042315
Abstract: In a computer system, a multilevel memory includes a near memory device and a far memory device, which are byte addressable. The multilevel memory includes a controller that receives a data request including original tag information. The controller includes routing hardware to selectively provide alternate tag information for the data request to cause a cache hit or a cache miss to selectively direct the request to the near memory device or to the far memory device, respectively. The controller can include selection circuitry to select between the original tag information and the alternate tag information to control where the data request is sent.
Type: Grant
Filed: March 29, 2018
Date of Patent: June 22, 2021
Assignee: Intel Corporation
Inventors: Lakshminarayana Pappu, Christopher E. Cox, Navneet Dour, Asaf Rubinstein, Israel Diamand
-
Patent number: 11044140
Abstract: Embodiments of the present disclosure provide a method and an apparatus for serialization and deserialization of a message structure. The method for serialization includes: acquiring a message structure, and pre-processing the message structure to generate a corresponding target version number, wherein message structures with different contents correspond to different version numbers, and the message structure is a structure of status information transmitted by a server to a client; serializing information to be transmitted to obtain a corresponding byte stream, wherein the information to be transmitted carries data of the status information and the target version number; and transmitting the byte stream to the client.
Type: Grant
Filed: November 26, 2019
Date of Patent: June 22, 2021
Assignee: MICROFUN CO., LTD
Inventors: Chi Gao, Guangrong Su, Yingjie Han
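The core idea, deriving a version number from the message structure itself so that client and server can detect mismatched structures, can be sketched as follows. The hash-based versioning, field list, and JSON payload here are all illustrative assumptions, not the patented method.

```python
import hashlib
import json
import struct

# Derive a 32-bit version number from the structure's field layout, so
# structures with different contents get different version numbers.
def structure_version(field_names):
    digest = hashlib.sha256(",".join(field_names).encode()).digest()
    return struct.unpack(">I", digest[:4])[0]

def serialize(status, field_names):
    # byte stream carries the target version number followed by the data
    payload = json.dumps(status).encode()
    return struct.pack(">I", structure_version(field_names)) + payload

def deserialize(stream, expected_fields):
    (version,) = struct.unpack(">I", stream[:4])
    if version != structure_version(expected_fields):
        raise ValueError("client and server message structures differ")
    return json.loads(stream[4:])

fields = ["hp", "level", "gold"]
blob = serialize({"hp": 10, "level": 2, "gold": 7}, fields)
```

A client compiled against a different structure fails fast at the version check instead of misparsing the byte stream.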
-
Patent number: 11030073
Abstract: Techniques are provided for redundant execution by a better processor for intensive dynamic profiling after initial execution by a constrained processor. In an embodiment, a system of computer(s) receives a request to profile particular runtime aspects of an original binary executable. Based on the particular runtime aspects and without accessing source logic, the system statically rewrites the original binary executable into a rewritten binary executable that invokes telemetry instrumentation that makes observations of the particular runtime aspects and emits traces of those observations. A first processing core having low power (capacity) performs a first execution of the rewritten binary executable to make first observations and emit first traces of the first observations. Afterwards, a second processing core performs a second (redundant) execution of the original binary executable based on the first traces.
Type: Grant
Filed: October 31, 2019
Date of Patent: June 8, 2021
Assignee: Oracle International Corporation
Inventors: Sam Idicula, Kirtikar Kashyap, Arun Raghavan, Evangelos Vlachos, Venkatraman Govindaraju
-
Patent number: 10997066
Abstract: A storage device includes a descramble module configured to descramble at least a portion of a read command, which includes logical block address (LBA) information and first meta information, into first signature information and first physical address (PA) information, for accessing a flash memory. A compare module is provided, which is configured to compare the first signature information against stored signature information to thereby determine an equivalency or discrepancy therebetween. An access module is provided, which is configured to use the first PA information to access a data region of the flash memory, upon determination of the equivalency by said compare module.
Type: Grant
Filed: September 18, 2018
Date of Patent: May 4, 2021
Inventors: Dong-Woo Kim, Jae Sun No, Song Ho Yoon, Kyoung Back Lee, Wook Han Jeong
-
Patent number: 10990159
Abstract: Systems, apparatuses, and methods for retaining architected state for relatively frequent switching between sleep and active operating states are described. A processor receives an indication to transition from an active state to a sleep state. The processor stores a copy of a first subset of the architected state information in on-die storage elements capable of retaining storage after power is turned off. The processor supports programmable input/output (PIO) access of particular stored information during the sleep state. When a wakeup event is detected, circuitry within the processor is powered up again. A boot sequence and recovery of architected state from off-chip memory are not performed. Rather than fetch from a memory location pointed to by a reset base address register, the processor instead fetches an instruction from a memory location pointed to by a restored program counter of the retained subset of the architected state information.
Type: Grant
Filed: April 25, 2017
Date of Patent: April 27, 2021
Assignee: Apple Inc.
Inventors: Bernard Joseph Semeria, John H. Mylius, Pradeep Kanapathipillai, Richard F. Russo, Shih-Chieh Wen, Richard H. Larson
-
Patent number: 10963399
Abstract: A memory system may include a storage device and a controller. The storage device may include a non-volatile memory device. The controller may include a device memory and may control operations of the non-volatile memory device in accordance with a request of a host device. The controller includes a map data management circuit configured to: cache one or more segments from a plurality of map segment groups stored in the storage device, each segment having information including a reference count and mapping relationships between logical addresses and physical addresses; detect, among the one or more cached segments, an upload target segment in which the reference count is greater than a predetermined count; and transmit, when a predetermined number or greater of upload target segments are detected within a first map segment group, those upload target segments to the host device.
Type: Grant
Filed: October 3, 2019
Date of Patent: March 30, 2021
Assignee: SK hynix Inc.
Inventor: Eu Joon Byun
-
Patent number: 10896044
Abstract: The techniques described herein provide an instruction fetch and decode unit having an operation cache with low latency in switching between fetching decoded operations from the operation cache and fetching and decoding instructions using a decode unit. This low latency is accomplished through a synchronization mechanism that allows work to flow through both the operation cache path and the instruction cache path until that work is stopped due to needing to wait on output from the opposite path. The existence of decoupling buffers in the operation cache path and the instruction cache path allows work to be held until that work is cleared to proceed. Other improvements, such as a specially configured operation cache tag array that allows for detection of multiple hits in a single cycle, also improve latency by, for example, improving the speed at which entries are consumed from a prediction queue that stores predicted address blocks.
Type: Grant
Filed: June 21, 2018
Date of Patent: January 19, 2021
Assignee: Advanced Micro Devices, Inc.
Inventors: Marius Evers, Dhanaraj Bapurao Tavare, Ashok Tirupathy Venkatachar, Arunachalam Annamalai, Donald A. Priore, Douglas R. Williams
-
Patent number: 10891135
Abstract: A system and a method are disclosed to process instructions in an execution unit (EU) that includes an operand cache (OC). The OC stores a copy of at least one frequently used operand stored in a physical register file (PRF). The EU may process instructions using operands obtained from the PRF or from the OC. In a first mode, an OC renaming unit (OC-REN) indicates to the EU to process instructions using operands obtained from the OC if processing the instructions using operands obtained from the OC uses less power than using operands obtained from the PRF. In a second mode, the OC-REN indicates to the EU to process the instructions using operands obtained from the PRF if processing the instructions using operands obtained from the PRF uses less power than using operands obtained from the OC.
Type: Grant
Filed: March 6, 2019
Date of Patent: January 12, 2021
Inventors: Paul E. Kitchin, Nicholas Humphries, Ken Yu Lim, Ryan Hensley
-
Patent number: 10860324
Abstract: An apparatus and method are provided for making predictions for branch instructions. The apparatus has a prediction queue for identifying instructions to be fetched for execution, and branch prediction circuitry for making predictions in respect of branch instructions, and for controlling which instructions are identified in the prediction queue in dependence on the predictions. During each prediction iteration, the branch prediction circuitry makes a prediction for a predict block comprising a sequence of M instructions. The branch prediction circuitry comprises a target prediction storage having a plurality of entries that are used to identify target addresses for branch instructions that are predicted as taken, the target prediction storage being arranged as an N-way set associative storage structure comprising a plurality of sets. Each predict block has an associated set within the target prediction storage.
Type: Grant
Filed: June 5, 2019
Date of Patent: December 8, 2020
Assignee: Arm Limited
Inventors: Houdhaifa Bouzguarrou, Guillaume Bolbenes, Eddy Lapeyre, Luc Orion
-
Patent number: 10853224
Abstract: Indexing and searching a bit-accurate trace for arbitrary length/arbitrary alignment values in traced thread(s). Indexing includes, while replaying a plurality of trace segments, identifying a set of n-grams for each trace segment that exist in processor data influx(es) and/or store(s) to a processor cache that resulted from replay of the trace segment. An index data structure, which associates each identified n-gram with trace location(s) at or in which the n-gram was found, is then generated. The index data structure thus associates unique n-grams with prior execution time(s) at or during which the traced thread(s) read or wrote the n-gram. Searching an indexed trace includes identifying n-grams in a query and using the index data structure to determine trace location(s) where these n-grams were seen during indexing. A query response is generated after using trace replay to locate particular execution time(s) and memory location(s) at which the n-grams occurred.
Type: Grant
Filed: November 30, 2018
Date of Patent: December 1, 2020
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventor: Jordi Mola
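The index-then-intersect idea reads naturally as code: record every n-gram seen during replay along with where it appeared, then answer a query by intersecting the segments that contain each of the query's n-grams. The byte-bigram granularity and function names below are assumptions for illustration, not Microsoft's implementation.

```python
from collections import defaultdict

# Build an index mapping each n-gram (here a byte bigram) to the trace
# locations -- (segment id, offset) pairs -- where it was observed.
def build_index(trace_segments, n=2):
    index = defaultdict(set)
    for seg_id, data in enumerate(trace_segments):
        for off in range(len(data) - n + 1):
            index[data[off:off + n]].add((seg_id, off))
    return index

# Search: split the query into n-grams and intersect the segment sets in
# which every n-gram was seen during indexing.
def search(index, query, n=2):
    grams = [query[i:i + n] for i in range(len(query) - n + 1)]
    return set.intersection(
        *({seg for seg, _ in index.get(g, set())} for g in grams))

segments = [b"flush the cache", b"read miss", b"cache line fill"]
idx = build_index(segments)
hits = search(idx, b"cache")  # segments 0 and 2 contain all bigrams of "cache"
```

The intersection only narrows the search; as the abstract notes, the actual query response still replays the candidate locations to confirm where the value occurred.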
-
Patent number: 10776261
Abstract: A storage apparatus managing method is applied to a first storage apparatus and a second storage apparatus coupled to an electronic apparatus, wherein the first storage apparatus comprises a local registering region and a global registering region. The method comprises: (a) receiving a read request indicating reading a target data unit from the second storage apparatus; (b) confirming whether the global registering region has the target data unit; (c) if it does, reading the target data unit from the global registering region; if not, confirming whether the local registering region has the target data unit; and (d) reading the target data unit from the local registering region if the local registering region has the target data unit, or from the second storage apparatus if it does not.
Type: Grant
Filed: July 5, 2018
Date of Patent: September 15, 2020
Assignee: Silicon Motion, Inc.
Inventor: Chao-Yu Lin
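The (a)-(d) lookup order is a simple two-level fallback: global registering region first, then the local registering region, and only then the second storage apparatus. A minimal sketch, with dictionaries standing in for the regions and all names illustrative:

```python
# Hypothetical sketch of the lookup order from the abstract: global
# registering region -> local registering region -> second storage.
def read_unit(target, global_region, local_region, second_storage):
    if target in global_region:
        return global_region[target], "global"
    if target in local_region:
        return local_region[target], "local"
    return second_storage[target], "second"

glob = {"u1": b"aa"}
loc = {"u2": b"bb"}
disk = {"u1": b"aa", "u2": b"bb", "u3": b"cc"}
```

Only a miss in both registering regions costs a read from the slower second storage apparatus.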
-
Patent number: 10754775
Abstract: A memory management unit responds to an invalidate by class command by identifying a marker for a class of cache entries that the invalidate by class command is meant to invalidate. The memory management unit stores the active marker as a retired marker and then sets the active marker to the next available marker. Thereafter, the memory management unit sends an acknowledgement signal (e.g., to the operating system) while invalidating the cache entries having the class and the retired marker in the background. By correlating markers with classes of cache entries, the memory management unit can more quickly respond to class invalidation requests.
Type: Grant
Filed: December 17, 2018
Date of Patent: August 25, 2020
Assignee: NVIDIA Corporation
Inventors: Jay Gupta, Gosagan Padmanabhan, Devesh Mittal, Kaushal Agarwal
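The marker trick is that invalidation becomes a pointer bump: retire the class's active marker, acknowledge immediately, and let entries carrying a retired marker be purged later. A sketch under illustrative names (the class strings, marker counters, and purge routine are assumptions):

```python
# Illustrative model of marker-based class invalidation.
class Mmu:
    def __init__(self):
        self.active_marker = {}  # class -> current marker value
        self.retired = set()     # (class, marker) pairs awaiting purge
        self.entries = []        # (class, marker, mapping) cache entries

    def insert(self, cls, mapping):
        marker = self.active_marker.setdefault(cls, 0)
        self.entries.append((cls, marker, mapping))

    def invalidate_by_class(self, cls):
        marker = self.active_marker.get(cls, 0)
        self.retired.add((cls, marker))       # retire the active marker
        self.active_marker[cls] = marker + 1  # next available marker
        return "ack"                          # acknowledge before any purge

    def background_purge(self):
        # lazily drop entries whose (class, marker) has been retired
        self.entries = [e for e in self.entries
                        if (e[0], e[1]) not in self.retired]

mmu = Mmu()
mmu.insert("A", 0x1000)
mmu.insert("B", 0x2000)
ack = mmu.invalidate_by_class("A")  # returns immediately
mmu.background_purge()              # class-A entries removed later
```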
-
Patent number: 10747668
Abstract: A shared cache memory can be logically partitioned among different workloads to provide isolation between workloads and avoid excessive resource contention. Each logical partition is apportioned a share of the cache memory, and is exclusive to a respective one of the workloads. Each partition has an initial size allocation. Historical data can be collected and processed for each partition and used to periodically update its size allocation.
Type: Grant
Filed: November 1, 2018
Date of Patent: August 18, 2020
Assignee: VMWARE, INC.
Inventors: Zhihao Yao, Tan Li, Sunil Satnur, Kiran Joshi
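One way to picture the periodic resizing is a policy that reallocates cache blocks in proportion to each partition's recent hit history. The proportional rule and the one-block floor below are assumptions chosen for illustration, not VMware's actual policy.

```python
# Hypothetical resizing policy: share cache blocks proportionally to each
# workload's historical hit count, with a 1-block floor so no partition
# starves entirely.
def update_allocations(total_blocks, hits):
    total_hits = sum(hits.values()) or 1
    return {workload: max(1, total_blocks * h // total_hits)
            for workload, h in hits.items()}

alloc = update_allocations(100, {"db": 300, "web": 100, "batch": 0})
```

Because of the floor, allocations can sum to slightly more than the total; a real policy would trim the overage from the largest partitions.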
-
Patent number: 10719448
Abstract: A cache is presented. The cache comprises a tag array configured to store one or more tag addresses, a data array configured to store data acquired from a dynamic random access memory device, and a cache controller. The cache controller is configured to: receive a cache access request; determine, based on an indication associated with the cache access request, a cache access policy; and perform an operation on the tag array and the data array based on the determined cache access policy.
Type: Grant
Filed: June 13, 2017
Date of Patent: July 21, 2020
Assignee: ALIBABA GROUP HOLDING LIMITED
Inventor: Xiaowei Jiang
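Selecting a per-request policy from an indication is a small dispatch. The specific policies named below (serial tag-then-data, parallel probe, bypass) are illustrative assumptions; the abstract only says the policy is chosen from the request's indication.

```python
# Hypothetical mapping from a request's indication bits to an access policy.
POLICIES = {
    0: "tag-then-data",  # serial: check the tag array before the data array
    1: "tag-and-data",   # parallel: probe both arrays together
    2: "bypass",         # skip the cache, e.g. for streaming accesses
}

def choose_policy(request_indication):
    # unknown indications fall back to the conservative serial policy
    return POLICIES.get(request_indication, "tag-then-data")
```

Carrying the hint in the request lets the controller trade latency (parallel probe) against power (serial probe) per access.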
-
Patent number: 10719442
Abstract: An apparatus and method for prioritizing transactional memory regions. For example, one embodiment of a processor comprises: a plurality of cores to execute threads comprising sequences of instructions, at least some of the instructions specifying a transactional memory region; a cache of each core to store a plurality of cache lines; transactional memory circuitry of each core to manage execution of the transactional memory (TM) regions based on priorities associated with each of the TM regions; and wherein the transactional memory circuitry, upon detecting a conflict between a first TM region having a first priority value and a second TM region having a second priority value, is to determine which of the first TM region or the second TM region is permitted to continue executing and which is to be aborted based, at least in part, on the first and second priority values.
Type: Grant
Filed: September 10, 2018
Date of Patent: July 21, 2020
Assignee: Intel Corporation
Inventors: Ren Wang, Raanan Sade, Yipeng Wang, Tsung-Yuan Tai, Sameh Gobriel
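The conflict-resolution step reduces to comparing the two regions' priority values and aborting the loser. A minimal sketch with illustrative names and a simple highest-priority-wins rule (the abstract leaves the exact rule open, saying only that the decision is based at least in part on the priority values):

```python
# Hypothetical resolver: on a conflict between two TM regions, the region
# with the higher priority value continues; the other is aborted.
def resolve_conflict(region_a, region_b):
    """Return (winner, aborted) names; ties favor the first region."""
    if region_a["priority"] >= region_b["priority"]:
        return region_a["name"], region_b["name"]
    return region_b["name"], region_a["name"]

winner, aborted = resolve_conflict({"name": "tx_log", "priority": 7},
                                   {"name": "tx_stats", "priority": 2})
```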