Cache Pipelining Patents (Class 711/140)
  • Patent number: 12061677
    Abstract: A secure processor, comprising a logic execution unit configured to process data based on instructions; a communication interface unit configured to transfer the instructions, the data, and metadata tags accompanying respective instructions and data; a metadata processing unit configured to enforce specific restrictions with respect to at least execution of instructions, access to resources, and manipulation of data, selectively dependent on the received metadata tags; and a control transfer processing unit configured to validate a branch instruction execution and an entry point instruction of each control transfer, selectively dependent on the respective metadata tags.
    Type: Grant
    Filed: August 25, 2023
    Date of Patent: August 13, 2024
    Assignee: The Research Foundation for The State University of New York
    Inventor: Kanad Ghose
  • Patent number: 12038843
    Abstract: A joint scheduler adapted for dispatching prefetch and demand accesses of data relating to a plurality of instructions loaded in an execution pipeline of processing circuit(s). Each prefetch access comprises checking whether a respective data is cached in a cache entry and each demand access comprises accessing a respective data. The joint scheduler is adapted to, responsive to each hit prefetch access dispatched for a respective data relating to a respective instruction, associate the respective instruction with a valid indication and a pointer to a respective cache entry storing the respective data such that the demand access relating to the respective instruction uses the associated pointer to access the respective data in the cache, and responsive to each missed prefetch access dispatched for a respective data relating to a respective instruction, initiate a read cycle for loading the respective data from next-level memory and caching it in the cache.
    Type: Grant
    Filed: December 13, 2023
    Date of Patent: July 16, 2024
    Assignee: Next Silicon Ltd
    Inventors: Yiftach Gilad, Liron Zur
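    The hit-prefetch bookkeeping in the abstract above can be sketched in a few lines. This is a minimal model under assumed names (`JointScheduler`, a dict standing in for next-level memory), not the patented circuit: a prefetch records a valid indication plus a pointer to the cache entry, and the matching demand access dereferences that pointer instead of repeating the tag check.

    ```python
    class JointScheduler:
        def __init__(self, next_level_memory, num_entries=8):
            self.next_level = next_level_memory     # models next-level memory
            self.entries = [None] * num_entries     # each entry: (address, data)
            self.hints = {}                         # instruction id -> (valid, entry index)

        def _index(self, addr):
            return addr % len(self.entries)

        def prefetch(self, instr_id, addr):
            """Check whether the data is cached; record a hint either way."""
            idx = self._index(addr)
            entry = self.entries[idx]
            if entry is None or entry[0] != addr:   # miss: read cycle to next level
                self.entries[idx] = (addr, self.next_level[addr])
            self.hints[instr_id] = (True, idx)      # valid indication + pointer

        def demand(self, instr_id, addr):
            """Use the prefetch hint's pointer when valid; otherwise probe normally."""
            valid, idx = self.hints.get(instr_id, (False, None))
            if valid and self.entries[idx] and self.entries[idx][0] == addr:
                return self.entries[idx][1]         # pointer access, no tag search
            idx = self._index(addr)                 # fallback: normal lookup
            entry = self.entries[idx]
            return entry[1] if entry and entry[0] == addr else None
    ```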
  • Patent number: 11755497
    Abstract: Memory management apparatus comprises input circuitry to receive a translation request defining a first memory address within a first memory address space; prediction circuitry to generate a predicted second memory address within a second memory address space as a predicted translation of the first memory address, the predicted second memory address being a predetermined function of the first memory address; control circuitry to initiate processing of the predicted second memory address; translation and permission circuitry to perform an operation to generate a translated second memory address for the first memory address associated with permission information to indicate whether memory access is permitted to the translated second memory address; and output circuitry to provide the translated second memory address as a response to the translation request when the permission information indicates that access is permitted to the translated second memory address.
    Type: Grant
    Filed: March 10, 2021
    Date of Patent: September 12, 2023
    Assignee: Arm Limited
    Inventor: Andrew Brookfield Swaine
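    The predict-then-verify flow above can be illustrated with a toy model. The names (`TranslationUnit`, `predicted_translation`) and the particular predetermined function (a fixed offset) are assumptions for illustration; the patent only requires the prediction to be some predetermined function of the first address, with the result released only after the permission check passes.

    ```python
    PAGE_SHIFT = 12

    def predicted_translation(va, fixed_offset=0x100000):
        # Predetermined function of the input address (assumed: a fixed offset)
        return va + fixed_offset

    class TranslationUnit:
        def __init__(self, page_table, fixed_offset=0x100000):
            self.page_table = page_table          # va_page -> (pa_page, permitted)
            self.fixed_offset = fixed_offset
            self.speculatively_started = []       # work begun before translation completes

        def translate(self, va):
            # Control circuitry: begin processing the predicted address immediately
            self.speculatively_started.append(
                predicted_translation(va, self.fixed_offset))
            # Translation + permission circuitry: the authoritative result
            pa_page, permitted = self.page_table[va >> PAGE_SHIFT]
            if not permitted:
                return None                       # output withheld when access is denied
            return (pa_page << PAGE_SHIFT) | (va & ((1 << PAGE_SHIFT) - 1))
    ```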
  • Patent number: 11741196
    Abstract: A secure processor, comprising a logic execution unit configured to process data based on instructions; a communication interface unit configured to transfer the instructions, the data, and metadata tags accompanying respective instructions and data; a metadata processing unit configured to enforce specific restrictions with respect to at least execution of instructions, access to resources, and manipulation of data, selectively dependent on the received metadata tags; and a control transfer processing unit configured to validate a branch instruction execution and an entry point instruction of each control transfer, selectively dependent on the respective metadata tags.
    Type: Grant
    Filed: November 14, 2019
    Date of Patent: August 29, 2023
    Assignee: The Research Foundation for The State University of New York
    Inventor: Kanad Ghose
  • Patent number: 11625295
    Abstract: A memory device is set to a performance mode. A data item is received and stored in a page of a logical unit of the memory device associated with a fault-tolerant stripe. A redundancy metadata update for the fault-tolerant stripe is delayed until a subsequent media management operation.
    Type: Grant
    Filed: May 10, 2021
    Date of Patent: April 11, 2023
    Assignee: Micron Technology, Inc.
    Inventors: Seungjune Jeon, Zhenming Zhou, Jiangli Zhu
  • Patent number: 11545209
    Abstract: Systems and methods for injecting a toggling signal into a command pipeline configured to receive multiple command types for the memory device. Toggling circuitry is configured to inject the toggling signal into at least a portion of the command pipeline when the memory device is in a power saving mode and the command pipeline is clear of valid commands. The toggling signal is blocked from causing writes by disabling a data strobe when a command that is invalid in the power saving mode is asserted during the power saving mode.
    Type: Grant
    Filed: May 28, 2021
    Date of Patent: January 3, 2023
    Assignee: Micron Technology, Inc.
    Inventors: Parthasarathy Gajapathy, Kallol Mazumder
  • Patent number: 11531550
    Abstract: Techniques are disclosed relating to an apparatus that includes a plurality of execution pipelines including first and second execution pipelines, a shared circuit that is shared by the first and second execution pipelines, and a decode circuit. The first and second execution pipelines are configured to concurrently perform operations for respective instructions. The decode circuit is configured to assign a first program thread to the first execution pipeline and a second program thread to the second execution pipeline. In response to determining that respective instructions from the first and second program threads that utilize the shared circuit are concurrently available for dispatch, the decode circuit is further configured to select between the first program thread and the second program thread.
    Type: Grant
    Filed: February 10, 2021
    Date of Patent: December 20, 2022
    Assignee: Cadence Design Systems, Inc.
    Inventors: Robert T. Golla, Christopher Olson
  • Patent number: 11507498
    Abstract: An apparatus including a memory structure comprising non-volatile memory cells and a microcontroller. The microcontroller is configured to output Core Timing Control (CTC) signals that are used to control voltages applied in the memory structure. In one aspect, information from which the CTC signals may be generated is pre-computed and stored. This pre-computation may be performed in a power on phase of the memory system. When a request to perform a memory operation is received, the stored information may be accessed and used to generate the CTC signals to control the memory operation. Thus, considerable time and/or power is saved. Note that this time savings occurs each time the memory operation is performed. Also, power is saved due to not having to repeatedly perform the computation.
    Type: Grant
    Filed: March 5, 2020
    Date of Patent: November 22, 2022
    Assignee: SanDisk Technologies LLC
    Inventors: Yuheng Zhang, Yan Li
  • Patent number: 11445020
    Abstract: Circuitry comprises a set of data handling nodes comprising: two or more master nodes each having respective storage circuitry to hold copies of data items from a main memory, each copy of a data item being associated with indicator information to indicate a coherency state of the respective copy, the indicator information being configured to indicate at least whether that copy has been updated more recently than the data item held by the main memory; a home node to serialise data access operations and to control coherency amongst data items held by the set of data handling nodes so that data written to a memory address is consistent with data read from that memory address in response to a subsequent access request; and one or more slave nodes including the main memory; in which: a requesting node of the set of data handling nodes is configured to communicate a conditional request to a target node of the set of data handling nodes in respect of a copy of a given data item at a given memory address, the condit
    Type: Grant
    Filed: March 24, 2020
    Date of Patent: September 13, 2022
    Assignee: Arm Limited
    Inventors: Jonathan Curtis Beard, Jamshed Jalal, Curtis Glenn Dunham, Roxana Rusitoru
  • Patent number: 11237965
    Abstract: A cache coherent system includes a directory with more than one snoop filter, each of which stores information in a different set of snoop filter entries. Each snoop filter is associated with a subset of all caching agents within the system. Each snoop filter uses an algorithm chosen for best performance on the caching agents associated with the snoop filter. The number of snoop filter entries in each snoop filter is primarily chosen based on the caching capacity of just the caching agents associated with the snoop filter. The type of information stored in each snoop filter entry of each snoop filter is chosen to meet the desired filtering function of the specific snoop filter.
    Type: Grant
    Filed: December 31, 2014
    Date of Patent: February 1, 2022
    Assignee: ARTERIS, INC.
    Inventors: Craig Stephen Forrest, David A. Kruckemyer
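    The idea of a directory holding multiple snoop filters, each covering its own subset of caching agents and sized for just that subset's capacity, can be sketched as follows. Class names and the naive eviction policy are illustrative assumptions, not the patented design.

    ```python
    class SnoopFilter:
        # One filter covering a subset of caching agents, sized for that subset.
        def __init__(self, agents, num_entries):
            self.agents = frozenset(agents)
            self.capacity = num_entries
            self.entries = {}                     # address -> set of sharing agents

        def record_access(self, agent, addr):
            if agent not in self.agents:
                return                            # this filter tracks other agents
            if addr not in self.entries and len(self.entries) >= self.capacity:
                self.entries.pop(next(iter(self.entries)))   # naive eviction
            self.entries.setdefault(addr, set()).add(agent)

        def sharers(self, addr):
            return self.entries.get(addr, set())

    class Directory:
        # A directory combining more than one snoop filter.
        def __init__(self, *filters):
            self.filters = filters

        def record_access(self, agent, addr):
            for f in self.filters:
                f.record_access(agent, addr)      # only the owning filter records it

        def snoop_targets(self, addr, requester):
            targets = set()
            for f in self.filters:
                targets |= f.sharers(addr)
            targets.discard(requester)            # never snoop the requester itself
            return targets
    ```

    Sizing each filter independently is the point: a small GPU-side filter does not have to carry entries for the much larger CPU caches.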
  • Patent number: 11080191
    Abstract: A cache coherent system includes a directory with more than one snoop filter, each of which stores information in a different set of snoop filter entries. Each snoop filter is associated with a subset of all caching agents within the system. Each snoop filter uses an algorithm chosen for best performance on the caching agents associated with the snoop filter. The number of snoop filter entries in each snoop filter is primarily chosen based on the caching capacity of just the caching agents associated with the snoop filter. The type of information stored in each snoop filter entry of each snoop filter is chosen to meet the desired filtering function of the specific snoop filter.
    Type: Grant
    Filed: March 18, 2020
    Date of Patent: August 3, 2021
    Assignee: ARTERIS, INC.
    Inventors: Craig Stephen Forrest, David A. Kruckemyer
  • Patent number: 10956161
    Abstract: Provided is a method for predicting a target address using a set of Indirect Target TAgged GEometric (ITTAGE) tables and a target address pattern table. A branch instruction that is to be executed may be identified. A first tag for the branch instruction may be determined. The first tag may be a unique identifier that corresponds to the branch instruction. Using the first tag, the branch instruction may be determined to be in a target address pattern table, and an index may be generated. A predicted target address for the branch instruction may be determined using the generated index and the largest ITTAGE table. Instructions associated with the predicted target address may be fetched.
    Type: Grant
    Filed: January 15, 2019
    Date of Patent: March 23, 2021
    Assignee: International Business Machines Corporation
    Inventors: Satish Kumar Sadasivam, Puneeth A. H. Bhat, Shruti Saxena
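    The ITTAGE core that the patent builds on is worth sketching: tables with geometrically increasing history lengths, where the prediction comes from the longest-history table whose tag matches. The hashing, table sizes, and class names below are illustrative assumptions; the patented pattern-table mechanism on top of it is not modeled.

    ```python
    def fold_history(history, length):
        # Fold the low `length` bits of the branch history into one byte.
        h = history & ((1 << length) - 1)
        folded = 0
        while h:
            folded ^= h & 0xFF
            h >>= 8
        return folded

    class ITTAGE:
        # Tables with geometric history lengths; the prediction comes from
        # the matching table that uses the longest history.
        def __init__(self, history_lengths=(4, 8, 16), table_bits=8):
            self.lengths = history_lengths
            self.mask = (1 << table_bits) - 1
            self.tables = [dict() for _ in history_lengths]  # index -> (tag, target)

        def _index_tag(self, pc, history, length):
            index = (pc ^ fold_history(history, length)) & self.mask
            tag = (pc >> 1) & self.mask
            return index, tag

        def train(self, pc, history, target):
            for table, length in zip(self.tables, self.lengths):
                index, tag = self._index_tag(pc, history, length)
                table[index] = (tag, target)

        def predict(self, pc, history):
            for table, length in zip(reversed(self.tables), reversed(self.lengths)):
                index, tag = self._index_tag(pc, history, length)
                entry = table.get(index)
                if entry is not None and entry[0] == tag:
                    return entry[1]               # longest matching history wins
            return None
    ```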
  • Patent number: 10901900
    Abstract: A cache coherence management system includes: a set of directories distributed between nodes of a network for interconnecting processors including cache memories, each directory including a correspondence table between cache lines and information fields on the cache lines; and a mechanism updating the directories by adding, modifying, or deleting cache lines in the correspondence tables. In each correspondence table and for each cache line identified, at least one field is provided for indicating a possible blocking of a transaction relative to the cache line considered, when the blocking occurs in the node associated with the correspondence table considered. The system further includes a mechanism detecting fields indicating a transaction blocking and restarting each transaction detected as blocked from the node in which it is indicated as blocked.
    Type: Grant
    Filed: April 12, 2013
    Date of Patent: January 26, 2021
    Assignees: COMMISSARIAT A L'ENERGIE ATOMIQUE ET AUX ENERGIES ALTERNATIVES, BULL SAS
    Inventors: Christian Bernard, Eric Guthmuller, Huy Nam Nguyen
  • Patent number: 10831660
    Abstract: A processing unit for a multiprocessor data processing system includes a processor core having an upper level cache and a lower level cache coupled to the processor core. The lower level cache includes one or more state machines for handling requests snooped from the system interconnect. The processing unit includes an interrupt unit configured to, based on receipt of an interrupt request while the processor core is in a powered up state, record which of the one or more state machines are active processing a prior snooped request that can invalidate a cache line in the upper level cache and present an interrupt to the processor core based on determining that each state machine that was active processing a prior snooped request that can invalidate a cache line in the upper level cache has completed processing of its respective prior snooped request.
    Type: Grant
    Filed: June 27, 2019
    Date of Patent: November 10, 2020
    Assignee: International Business Machines Corporation
    Inventors: Derek E. Williams, Guy L. Guthrie, Hugh Shen
  • Patent number: 10657055
    Abstract: An apparatus and method are provided for managing snoop operations. The apparatus has an interface for receiving access requests from any of N master devices that have associated cache storage, each access request specifying a memory address within memory associated with the apparatus. Snoop filter storage is provided that has a plurality of snoop filter entries, where each snoop filter entry identifies a memory portion and snoop control information indicative of the master devices that have accessed that memory portion. When an access request received at the interface specifies a memory address that is within the memory portion associated with a snoop filter entry, snoop control circuitry uses the snoop control information in that snoop filter entry to determine which master devices to subject to a snoop operation.
    Type: Grant
    Filed: December 13, 2018
    Date of Patent: May 19, 2020
    Assignee: Arm Limited
    Inventors: Jamshed Jalal, Mark David Werkheiser, Gurunath Ramagiri, Mukesh Patel
  • Patent number: 10635591
    Abstract: Systems and methods selectively filter, buffer, and process cache coherency probes. A processor includes a probe buffering unit that includes a cache coherency probe buffer. The probe buffering unit receives cache coherency probes and memory access requests for a cache. The probe buffering unit identifies and discards any of the probes that are directed to a memory block that is not cached in the cache, and buffers at least a subset of the remaining probes in the probe buffer. The probe buffering unit submits to the cache, in descending order of priority, one or more of: any buffered probes that are directed to the memory block to which a current memory access request is also directed; any current memory access requests that are directed to a memory block to which there is not a buffered probe also directed; and any buffered probes when there is not a current memory access request.
    Type: Grant
    Filed: December 5, 2018
    Date of Patent: April 28, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Ashok T. Venkatachar, Anthony Jarvis
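    The three-level priority scheme in the abstract above maps naturally onto a small arbiter. This is a behavioral sketch under assumed names (`ProbeBufferingUnit`, a set standing in for the cache's tag state), not the AMD implementation.

    ```python
    from collections import deque

    class ProbeBufferingUnit:
        # Discard probes for blocks not in the cache, buffer the rest, and
        # pick the next cache operation in descending priority order.
        def __init__(self, cached_blocks):
            self.cached = set(cached_blocks)
            self.probe_buffer = deque()

        def receive_probe(self, block):
            if block in self.cached:              # probes to uncached blocks are dropped
                self.probe_buffer.append(block)

        def next_submission(self, current_request=None):
            # Priority 1: a buffered probe aimed at the current request's block
            if current_request is not None and current_request in self.probe_buffer:
                self.probe_buffer.remove(current_request)
                return ('probe', current_request)
            # Priority 2: a request whose block has no buffered probe
            if current_request is not None:
                return ('request', current_request)
            # Priority 3: drain buffered probes when the cache is otherwise idle
            if self.probe_buffer:
                return ('probe', self.probe_buffer.popleft())
            return None
    ```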
  • Patent number: 10585803
    Abstract: Cache memory mapping techniques are presented. A cache may contain an index configuration register. The register may configure the locations of an upper index portion and a lower index portion of a memory address. The portions may be combined to create a combined index. The configurable split-index addressing structure may be used, among other applications, to reduce the rate of cache conflicts occurring between multiple processors decoding a video frame in parallel.
    Type: Grant
    Filed: February 4, 2019
    Date of Patent: March 10, 2020
    Assignee: Movidius Limited
    Inventor: Richard Richmond
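    The split-index combination is simple bit manipulation. A sketch, with the bit positions standing in for what the index configuration register would hold:

    ```python
    def combined_index(addr, lo_pos, lo_bits, hi_pos, hi_bits):
        # Extract a lower and an upper index portion from configurable bit
        # positions and concatenate them into the cache set index.
        lo = (addr >> lo_pos) & ((1 << lo_bits) - 1)
        hi = (addr >> hi_pos) & ((1 << hi_bits) - 1)
        return (hi << lo_bits) | lo
    ```

    Taking the upper portion from address bits that differ between the frame regions assigned to different processors spreads their working sets across distinct cache sets, which is the stated conflict-reduction goal.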
  • Patent number: 10534712
    Abstract: A method for service level agreement (SLA) allocation of resources of a cache memory of a storage system, the method may include monitoring, by a control layer of the storage system, actual performances of the storage system that are related to multiple logical volumes; calculating actual-to-required relationships between the actual performances and SLA defined performances of the multiple logical volumes; assigning caching priorities, to different logical volumes of the multiple logical volumes; wherein the assigning is based on, at least, the actual-to-required relationships; and managing, based on at least the caching priorities, a pre-cache memory module that is upstream to the cache module and is configured to store write requests that (i) are associated with one or more logical volumes of the different logical volumes and (ii) are received by the pre-cache memory module at points in time when the cache memory is full; wherein the managing comprises transferring one or more write requests from the pre-ca
    Type: Grant
    Filed: August 29, 2016
    Date of Patent: January 14, 2020
    Assignee: INFINIDAT LTD.
    Inventors: Qun Fan, Venu Nayar, Haim Helman
  • Patent number: 10430193
    Abstract: A processor includes a first mode where the processor is not to use packed data operation masking, and a second mode where the processor is to use packed data operation masking. A decode unit to decode an unmasked packed data instruction for a given packed data operation in the first mode, and to decode a masked packed data instruction for a masked version of the given packed data operation in the second mode. The instructions have a same instruction length. The masked instruction has bit(s) to specify a mask. Execution unit(s) are coupled with the decode unit. The execution unit(s), in response to the decode unit decoding the unmasked instruction in the first mode, to perform the given packed data operation. The execution unit(s), in response to the decode unit decoding the masked instruction in the second mode, to perform the masked version of the given packed data operation.
    Type: Grant
    Filed: June 1, 2018
    Date of Patent: October 1, 2019
    Assignee: Intel Corporation
    Inventors: Bret L. Toll, Buford M. Guy, Ronak Singhal, Mishali Naik
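    The two modes can be modeled lane by lane. The merge-masking semantics below (masked-off lanes pass the first source through) are an assumption for illustration; the patent covers masked versions of packed operations generally.

    ```python
    def packed_add(a, b, mask=None):
        # Unmasked mode: plain lane-wise add.
        if mask is None:
            return [x + y for x, y in zip(a, b)]
        # Masked mode (assumed merge-masking): lanes whose mask bit is
        # clear keep the first source's value unchanged.
        return [x + y if m else x for x, y, m in zip(a, b, mask)]
    ```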
  • Patent number: 10424393
    Abstract: Dynamic redundancy buffers for use with a device are disclosed. The dynamic redundancy buffers allow a memory array of the device to be operated with a high write error rate (WER). A first level redundancy buffer (e1 buffer) is coupled to the memory array. The e1 buffer may store data words that have failed verification or have not been verified. The e1 buffer may transfer data words to another dynamic redundancy buffer (e2 buffer). The e1 buffer may transfer data words that have failed to write to a memory array after a predetermined number of re-write attempts. The e1 buffer may also transfer data words upon power down.
    Type: Grant
    Filed: December 20, 2017
    Date of Patent: September 24, 2019
    Assignee: SPIN MEMORY, INC.
    Inventors: Mourad El Baraji, Neal Berger, Benjamin Stanley Louie, Lester M. Crudele, Daniel L. Hillman, Barry Hoberman
  • Patent number: 10353826
    Abstract: A data processing system includes a memory system, a first processing element, a first address translator that maps virtual addresses to system addresses, a second address translator that maps system addresses to physical addresses, and a task management unit. A first program task uses a first virtual memory space that is mapped to a first system address range using a first table. The context of the first program task includes an address of the first table and is cloned by creating a second table indicative of a mapping from a second virtual address space to a second range of system addresses, where the second range is mapped to the same physical addresses as the first range until a write occurs, at which time memory is allocated and the mapping of the second range is updated. The cloned context includes an address of the second table.
    Type: Grant
    Filed: July 14, 2017
    Date of Patent: July 16, 2019
    Assignee: Arm Limited
    Inventors: Jonathan Curtis Beard, Roxana Rusitoru, Curtis Glenn Dunham
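    The clone-until-write behavior is classic copy-on-write, sketched here at the table level. The single-level table and the `(frame, owned)` ownership flag are simplifying assumptions; the patent's two-level system/physical translation is collapsed into one mapping.

    ```python
    class Memory:
        # Frame store backing both contexts' translation tables.
        def __init__(self):
            self.frames = {}
            self._next = 0

        def allocate(self, initial=0):
            frame = self._next
            self._next += 1
            self.frames[frame] = initial
            return frame

    def clone_table(table):
        # The clone maps its addresses to the same frames, marked not-yet-owned.
        return {va: (frame, False) for va, (frame, _) in table.items()}

    def write(mem, table, va, value):
        frame, owned = table[va]
        if not owned:                             # first write: allocate and remap
            frame = mem.allocate(mem.frames[frame])
            table[va] = (frame, True)
        mem.frames[frame] = value

    def read(mem, table, va):
        return mem.frames[table[va][0]]
    ```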
  • Patent number: 10275251
    Abstract: A processor includes a first level register file, second level register file, and register file mapper. The first and second level register files are comprised of physical registers, with the first level register file more efficiently accessed relative to the second level register file. The register file mapper is coupled with the first and second level register files. The register file mapper comprises a mapping structure and register file mapper controller. The mapping structure hosts mappings between logical registers and physical registers of the first level register file. The register file mapper controller determines whether to map a destination logical register of an instruction to a physical register in the first level register file. The register file mapper controller also determines, based on metadata associated with the instruction, whether to write data associated with the destination logical register to one of the physical registers of the second level register file.
    Type: Grant
    Filed: October 31, 2012
    Date of Patent: April 30, 2019
    Assignee: International Business Machines Corporation
    Inventors: Christopher M. Abernathy, Mary D. Brown, Dung Q. Nguyen
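    A behavioral sketch of the two-level mapping may help. The boolean `metadata_prefers_second` flag is an assumed stand-in for the instruction metadata the patent describes; the real controller's placement policy is unspecified here.

    ```python
    class RegisterFileMapper:
        # Logical registers normally map into the small, fast first-level file;
        # per-instruction metadata can steer a result into the larger,
        # slower second level instead.
        def __init__(self, first_level_size):
            self.first = {}                       # logical reg -> value
            self.second = {}
            self.first_level_size = first_level_size

        def write(self, logical, value, metadata_prefers_second=False):
            first_is_full = (logical not in self.first
                             and len(self.first) >= self.first_level_size)
            if metadata_prefers_second or first_is_full:
                self.second[logical] = value
                self.first.pop(logical, None)
            else:
                self.first[logical] = value
                self.second.pop(logical, None)

        def read(self, logical):
            if logical in self.first:
                return self.first[logical], 'first'
            return self.second[logical], 'second'
    ```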
  • Patent number: 10248576
    Abstract: The present invention provides a DRAM/NVM hierarchical heterogeneous memory system with software-hardware cooperative management schemes. In the system, NVM is used as large-capacity main memory, and DRAM is used as a cache to the NVM. Some reserved bits in the data structure of TLB and last-level page table are employed effectively to eliminate hardware costs in the conventional hardware-managed hierarchical memory architecture. The cache management in such a heterogeneous memory system is pushed to the software level. Moreover, the invention is able to reduce memory access latency in case of last-level cache misses. Considering that many applications have relatively poor data locality in big data application environments, the conventional demand-based data fetching policy for DRAM cache can aggravate cache pollution.
    Type: Grant
    Filed: October 6, 2016
    Date of Patent: April 2, 2019
    Assignee: HUAZHONG UNIVERSITY OF SCIENCE AND TECHNOLOGY
    Inventors: Hai Jin, Xiaofei Liao, Haikun Liu, Yujie Chen, Rentong Guo
  • Patent number: 10216448
    Abstract: The storage system has one or more storage drives and one or more controllers for receiving processing requests from a superior device. Each controller has a processor for executing the processing request and an accelerator, and the accelerator has multiple internal data memories and an internal control memory. If the processing request is a read I/O request, the accelerator stores control information regarding the request in the internal control memory, reads the data targeted by the request from at least one of the storage drives, temporarily stores it in one or more of the internal data memories, and transfers it sequentially, in order, from whichever internal data memory already holds data, to the superior device.
    Type: Grant
    Filed: September 11, 2014
    Date of Patent: February 26, 2019
    Assignee: Hitachi, Ltd.
    Inventors: Kazushi Nakagawa, Masanori Takada, Norio Simozono
  • Patent number: 10191748
    Abstract: In one embodiment, a processor includes a decode logic, an issue logic to issue decoded instructions, and at least one execution logic to execute issued instructions of a program. The at least one execution logic is to execute at least some instructions of the program out-of-order, and the decode logic is to decode and provide a first in-order memory instruction of the program to the issue logic. In turn, the issue logic is to order the first in-order memory instruction ahead of a second in-order memory instruction of the program. Other embodiments are described and claimed.
    Type: Grant
    Filed: November 30, 2015
    Date of Patent: January 29, 2019
    Assignee: Intel IP Corporation
    Inventor: Jacob Mathew
  • Patent number: 10146690
    Abstract: In an embodiment, a processor includes a plurality of cores and synchronization logic. The synchronization logic includes circuitry to: receive a first memory request and a second memory request; determine whether the second memory request is in contention with the first memory request; and in response to a determination that the second memory request is in contention with the first memory request, process the second memory request using a non-blocking cache coherence protocol. Other embodiments are described and claimed.
    Type: Grant
    Filed: June 13, 2016
    Date of Patent: December 4, 2018
    Assignee: Intel Corporation
    Inventors: Samantika S. Sury, Robert G. Blankenship, Simon C. Steely, Jr.
  • Patent number: 10127161
    Abstract: A method for the coexistence of software having different safety levels in a multicore processor which has at least two processor cores (2, 3). A memory range (4, 5) is associated with each processor core (2, 3), and software (SW1, SW2) having a predefined safety level is processed only on the processor core (2, 3) with which the same safety level is associated. During the processing of the software (SW1, SW2), the processor core (2, 3) accesses only the protected memory range (4, 5) which is permanently associated with that processor core (2, 3).
    Type: Grant
    Filed: January 23, 2015
    Date of Patent: November 13, 2018
    Assignee: Robert Bosch GmbH
    Inventors: Peter Wegner, Jochen Ulrich Haenger, Markus Schweizer, Carsten Gebauer, Bernd Mueller, Thomas Heinz
  • Patent number: 10095524
    Abstract: A processing system and method includes a predecoder configured to identify instructions that are combinable to form a single, executable internal instruction. Instruction storage is configured to merge instructions that are combinable. An instruction execution unit is configured to execute the single, executable internal instruction on a wide hardware datapath.
    Type: Grant
    Filed: November 18, 2014
    Date of Patent: October 9, 2018
    Assignee: International Business Machines Corporation
    Inventors: Michael Gschwind, Balaram Sinharoy
  • Patent number: 10095629
    Abstract: Generally discussed herein are systems, devices, and methods for local and remote dual address decoding. According to an example a node can include one or more processors to generate a first memory request, the first memory request including a first address and a node identification, a caching agent coupled to the one or more processors, the caching agent to determine that the first address is homed to a remote node remote to the local node, a network interface controller (NIC) coupled to the caching agent, the NIC to produce a second memory request based on the first memory request, and the one or more processors further to receive a response to the second memory request, the response generated by a switch coupled to the NIC, the switch includes a remote system address decoder to determine a node identification to which the second memory request is homed.
    Type: Grant
    Filed: September 28, 2016
    Date of Patent: October 9, 2018
    Assignee: Intel Corporation
    Inventors: Francesc Cesc Guim Bernat, Kshitij A. Doshi, Steen Larsen, Mark A Schmisseur, Raj K. Ramanujan
  • Patent number: 10048899
    Abstract: A storage device includes a storage medium and a controller configured to control the storage medium. The controller includes an interface unit configured to interface with a host, a processing unit connected to the interface unit via a first signal line and configured to process a direct load operation and a direct store operation between the host and the controller, and at least one memory connected to the interface unit via a second signal line. The at least one memory is configured to temporarily store data read from the storage medium or data received from the host, and is configured to be directly accessed by the host.
    Type: Grant
    Filed: February 4, 2015
    Date of Patent: August 14, 2018
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Myeong-Eun Hwang, Ki-Jo Jung, Tae-Hack Lee, Kwang-Ho Choi, Sang-Kyoo Jeong
  • Patent number: 9823932
    Abstract: A tagged geometric length (TAGE) branch predictor incorporates multiple prediction tables. Each of these prediction tables has prediction storage lines which store a common stored TAG value and a plurality of branch predictions in respect of different offset positions within a block of program instructions read in parallel. Each of the branch predictions has an associated validity indicator. Updates to stored predictions may be made by a partial allocation mechanism, in which a TAG match occurs and a branch storage line is partially overwritten, or by full allocation, in which no storage line with a matching TAG can be identified and instead a whole victim prediction storage line is cleared and the new prediction stored therein.
    Type: Grant
    Filed: April 20, 2015
    Date of Patent: November 21, 2017
    Assignee: ARM Limited
    Inventor: Houdhaifa Bouzguarrou
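    The line layout and the partial/full allocation split can be sketched directly. The indexing scheme and class names are assumptions; the point is one shared tag per line, per-offset predictions with validity bits, and a whole-line clear only when no matching-tag line exists.

    ```python
    class PredictionLine:
        # One storage line: a shared tag plus a prediction and validity bit
        # for each offset in a block of instructions read in parallel.
        def __init__(self, tag, block_size):
            self.tag = tag
            self.valid = [False] * block_size
            self.taken = [False] * block_size

    class BlockPredictionTable:
        def __init__(self, num_lines=16, block_size=4):
            self.lines = [None] * num_lines
            self.block_size = block_size

        def _index_tag(self, block_pc):
            return block_pc % len(self.lines), block_pc // len(self.lines)

        def update(self, block_pc, offset, taken):
            index, tag = self._index_tag(block_pc)
            line = self.lines[index]
            if line is None or line.tag != tag:
                # Full allocation: clear the whole line for the new tag.
                line = PredictionLine(tag, self.block_size)
                self.lines[index] = line
            # Partial allocation: overwrite just this offset's prediction.
            line.valid[offset] = True
            line.taken[offset] = taken

        def predict(self, block_pc, offset):
            index, tag = self._index_tag(block_pc)
            line = self.lines[index]
            if line is not None and line.tag == tag and line.valid[offset]:
                return line.taken[offset]
            return None
    ```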
  • Patent number: 9727466
    Abstract: An interconnect and method of managing a snoop filter within such an interconnect are provided. The interconnect is used to connect a plurality of devices, including a plurality of master devices where one or more of the master devices has an associated cache storage. The interconnect comprises coherency control circuitry to perform coherency control operations for data access transactions received by the interconnect from the master devices. In performing those operations, the coherency control circuitry has access to snoop filter circuitry that maintains address-dependent caching indication data, and is responsive to a data access transaction specifying a target address to produce snoop control data providing an indication of which master devices have cached data for the target address in their associated cache storage.
    Type: Grant
    Filed: August 11, 2015
    Date of Patent: August 8, 2017
    Assignee: ARM Limited
    Inventors: Andrew David Tune, Sean James Salisbury
  • Patent number: 9727469
    Abstract: According to one aspect of the present disclosure, a method and technique for performance-driven cache line memory access is disclosed. The method includes: receiving, by a memory controller of a data processing system, a request for a cache line; dividing the request into a plurality of cache subline requests, wherein at least one of the cache subline requests comprises a high priority data request and at least one of the cache subline requests comprises a low priority data request; servicing the high priority data request; and delaying servicing of the low priority data request until a low priority condition has been satisfied.
    Type: Grant
    Filed: February 15, 2013
    Date of Patent: August 8, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Robert H. Bell, Jr., Men-Chow Chiang, Hong L. Hua, Mysore S. Srinivas
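The divide-and-prioritize flow above can be sketched as follows. This is an illustrative model with assumed names (`split_request`, `MemoryController`, `SUBLINES`); the "low priority condition" is modeled simply as the controller going idle.

```python
from collections import deque

SUBLINES = 4  # sublines per cache line

def split_request(line_addr, critical_subline):
    """Return (high_priority, low_priority_list) subline requests for a line."""
    sublines = [(line_addr, s) for s in range(SUBLINES)]
    high = sublines[critical_subline]
    low = [s for s in sublines if s != high]
    return high, low

class MemoryController:
    def __init__(self):
        self.serviced = []
        self.deferred = deque()

    def handle(self, line_addr, critical_subline):
        high, low = split_request(line_addr, critical_subline)
        self.serviced.append(high)   # high priority: serviced immediately
        self.deferred.extend(low)    # low priority: wait for the condition

    def drain_when_idle(self):
        # Low priority condition satisfied: service the deferred sublines.
        while self.deferred:
            self.serviced.append(self.deferred.popleft())
```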
  • Patent number: 9690591
    Abstract: A technique to enable efficient instruction fusion within a computer system is disclosed. In one embodiment, processor logic delays the processing of a first instruction for a threshold amount of time if the first instruction within an instruction queue is fusible with a second instruction.
    Type: Grant
    Filed: October 30, 2008
    Date of Patent: June 27, 2017
    Assignee: Intel Corporation
    Inventors: Ido Ouziel, Lihu Rappoport, Robert Valentine, Ron Gabor, Pankaj Raghuvanshi
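The delay-for-fusion policy above can be sketched as a small scheduler step. Everything here is assumed for illustration: the fusible pair (`cmp`/`jcc`), the threshold value, and the function names.

```python
from collections import deque

FUSIBLE_PAIRS = {("cmp", "jcc")}  # assumed fusible opcode pair
THRESHOLD = 2                     # max cycles to hold a fusible instruction

def step(queue, wait_count):
    """One scheduler cycle. Returns (issued_op_or_None, new_wait_count)."""
    if not queue:
        return None, 0
    first = queue[0]
    # If a fusible partner is already behind the first instruction, fuse.
    if len(queue) >= 2 and (first, queue[1]) in FUSIBLE_PAIRS:
        a = queue.popleft()
        b = queue.popleft()
        return f"{a}+{b}", 0
    # First instruction could still fuse: hold it up to THRESHOLD cycles.
    if any(first == p[0] for p in FUSIBLE_PAIRS) and wait_count < THRESHOLD:
        return None, wait_count + 1
    return queue.popleft(), 0
```

The threshold bounds the cost of waiting: a fusible instruction whose partner never arrives is issued alone after at most `THRESHOLD` cycles.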
  • Patent number: 9652396
    Abstract: A method for accessing a cache memory structure includes dividing multiple cache elements of a cache memory structure into multiple groups. A serial probing process of the multiple groups is performed. Upon a tag hit resulting from the serial probing process, the probing process exits for the remaining groups.
    Type: Grant
    Filed: December 27, 2013
    Date of Patent: May 16, 2017
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Michael G. Butler, Magnus Ekman
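The serial grouped probe above is straightforward to sketch. This is an illustrative model (function and parameter names assumed): the ways of a set are split into groups, groups are probed in order, and the sequence exits as soon as one group yields a tag hit, so later groups are never touched.

```python
def grouped_lookup(way_tags, tag, group_size):
    """Probe way_tags in serial groups. Returns (hit_way_or_None, groups_probed)."""
    groups_probed = 0
    for start in range(0, len(way_tags), group_size):
        groups_probed += 1
        group = way_tags[start:start + group_size]
        for i, t in enumerate(group):
            if t == tag:
                return start + i, groups_probed  # hit: skip remaining groups
    return None, groups_probed  # miss: all groups were probed
```

The energy saving comes from the early exit: a hit in an early group means the remaining tag comparisons are never performed, at the cost of extra latency on later-group hits and misses.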
  • Patent number: 9626294
    Abstract: According to one aspect of the present disclosure, a system and technique for performance-driven cache line memory access is disclosed. The system includes: a processor, a cache hierarchy coupled to the processor, and a memory coupled to the cache hierarchy. The system also includes logic executable to, responsive to receiving a request for a cache line: divide the request into a plurality of cache subline requests, wherein at least one of the cache subline requests comprises a high priority data request and at least one of the cache subline requests comprises a low priority data request; service the high priority data request; and delay servicing of the low priority data request until a low priority condition has been satisfied.
    Type: Grant
    Filed: October 3, 2012
    Date of Patent: April 18, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Robert H. Bell, Jr., Men-Chow Chiang, Hong L. Hua, Mysore S. Srinivas
  • Patent number: 9525890
    Abstract: A decoding method for an audio video coding standard (AVS) system is provided. According to a stop-fetching criterion, a stop-fetching flag is set to an enabled status or a disabled status. In an offset fetching procedure, it is determined whether an offset value is smaller than a threshold and whether the stop-fetching flag is in the disabled status. When the determination result is affirmative, one subsequent bit is fetched for the offset value, an offset shift value is correspondingly increased, and the determination step is iterated. When the determination result is negative, the offset fetching procedure is terminated. Next, it is determined whether a decoding result is a least probable symbol (LPS) or a most probable symbol (MPS).
    Type: Grant
    Filed: June 18, 2014
    Date of Patent: December 20, 2016
    Assignee: MSTAR SEMICONDUCTOR, INC.
    Inventors: He-Yuan Lin, Yi-Shin Tung
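The offset fetching loop described above can be sketched directly from its two termination conditions. This is a hedged reading of the abstract, not MStar's implementation; the bit-accumulation order and names are assumptions.

```python
def fetch_offset(bits, threshold, stop_fetching):
    """Run the offset fetching procedure; returns (offset, offset_shift)."""
    offset, shift = 0, 0
    it = iter(bits)
    # Iterate while the offset is below the threshold and the
    # stop-fetching flag is in the disabled status.
    while offset < threshold and not stop_fetching:
        try:
            bit = next(it)
        except StopIteration:
            break
        offset = (offset << 1) | bit   # fetch one subsequent bit
        shift += 1                     # offset shift value is increased
    return offset, shift
```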
  • Patent number: 9507716
    Abstract: An interconnect has coherency control circuitry for performing coherency control operations and a snoop filter for identifying which devices coupled to the interconnect have cached data from a given address. When an address is looked up in the snoop filter and misses, and there is no spare snoop filter entry available, then the snoop filter selects a victim entry corresponding to a victim address, and issues an invalidate transaction for invalidating locally cached copies of the data identified by the victim address. The coherency control circuitry for performing coherency checking operations for data access transactions is reused for performing coherency control operations for the invalidate transaction issued by the snoop filter. This greatly reduces the circuitry complexity of the snoop filter.
    Type: Grant
    Filed: March 6, 2015
    Date of Patent: November 29, 2016
    Assignee: ARM Limited
    Inventors: Sean James Salisbury, Andrew David Tune, Jamshed Jalal, Mark David Werkheiser, Arthur Laughton, George Robert Scott Lloyd, Peter Andrew Riocreux, Daniel Sara
  • Patent number: 9323679
    Abstract: A system, method, and computer program product are provided for managing miss requests. In use, a miss request is received at a unified miss handler from one of a plurality of distributed local caches. Additionally, the miss request is managed, utilizing the unified miss handler.
    Type: Grant
    Filed: August 14, 2012
    Date of Patent: April 26, 2016
    Assignee: NVIDIA Corporation
    Inventors: Brucek Kurdo Khailany, Ronny Meir Krashinsky, James David Balfour
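A common benefit of a unified miss handler is merging duplicate misses from the distributed local caches; the sketch below illustrates that idea under assumed names (`UnifiedMissHandler`, `miss`, `fill`) — the patent's handler may manage misses differently.

```python
class UnifiedMissHandler:
    def __init__(self):
        self.pending = {}  # line address -> list of requesting local caches

    def miss(self, cache_id, line_addr):
        """Register a miss request. Returns True when a new fetch must issue."""
        waiters = self.pending.setdefault(line_addr, [])
        waiters.append(cache_id)
        return len(waiters) == 1  # only the first miss triggers a fetch

    def fill(self, line_addr):
        """Data returned: report every cache that missed on this line."""
        return self.pending.pop(line_addr, [])
```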
  • Patent number: 9304926
    Abstract: A coherent memory system includes a plurality of level 1 cache memories 6 connected via interconnect circuitry 18 to a level 2 cache memory 8. Coherency control circuitry 10 manages coherency between lines of data. Evict messages from the level 1 cache memories to the coherency control circuitry 10 are sent via the read address channel AR. Read messages are also sent via the read address channel AR. The read address channel AR is configured such that a read message may not be reordered relative to an evict message. The coherency control circuitry 10 is configured such that a read message will not be processed ahead of an evict message. The level 1 cache memories 6 do not track in-flight evict messages. No acknowledgement of an evict message is sent from the coherency control circuitry 10 back to the level 1 cache memory 6.
    Type: Grant
    Filed: July 23, 2013
    Date of Patent: April 5, 2016
    Assignee: ARM Limited
    Inventors: Ian Bratt, Mladen Wilder, Ole Henrik Jahren
  • Patent number: 9304714
    Abstract: A system and method is described for operating a computer memory system having a plurality of controllers capable of accessing a common set of memory modules. Access to the physical storage of the memory modules may be managed by configuring logical units (LUNs) addressable by the users. The amount of memory associated with each LUN may be managed in units of memory (LMA) from a same free LMA table maintained in each controller of the plurality of controllers. A request for maintenance of a LUN may be received from any user through any controller and results in the association of a free memory area with the LUN, and the remaining controllers perform the same operation. A test for misallocation of a free memory area is performed and when such misallocation occurs, the situation is corrected in accordance with a policy.
    Type: Grant
    Filed: April 18, 2013
    Date of Patent: April 5, 2016
    Assignee: VIOLIN MEMORY INC
    Inventor: Jon C. R. Bennett
  • Patent number: 9286068
    Abstract: A processor includes an execution unit, a first level register file, a second level register file, a plurality of storage locations and a register file bypass controller. The first and second level register files comprise physical registers, with the first level register file accessed more efficiently than the second level register file. The register file bypass controller is coupled with the execution unit and second level register file. The register file bypass controller determines whether an instruction indicates a logical register is unmapped from a physical register in the first level register file. The register file bypass controller also loads data into one of the storage locations and selects one of the storage locations as input to the execution unit, without mapping the logical register to one of the physical registers in the first level register file.
    Type: Grant
    Filed: October 31, 2012
    Date of Patent: March 15, 2016
    Assignee: International Business Machines Corporation
    Inventors: Christopher M. Abernathy, Mary D. Brown, Sundeep Chadha, Dung Q. Nguyen
  • Patent number: 9262080
    Abstract: In a read processing storage system using a pool of CPU cores, the CPU cores are assigned to process write operations, read operations, or both read and write operations that are scheduled for processing. A minimal number of the CPU cores is allocated for processing the write operations, thereby increasing write latency. Upon reaching a throughput limit for the write operations that causes the minimal number of CPU cores to reach a busy status, the number of CPU cores allocated for processing the write operations is increased.
    Type: Grant
    Filed: January 6, 2015
    Date of Patent: February 16, 2016
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Jonathan Amit, Amir Lidor, Sergey Marenkov, Rostislav Raikhman
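The allocation policy above — keep write cores minimal, grow only when the write-throughput limit is hit — reduces to a small amount of bookkeeping. The sketch below is illustrative; the class and method names are assumptions, and a real system would also shrink the write allocation again.

```python
class CorePool:
    def __init__(self, total, min_write):
        self.write_cores = min_write        # minimal write allocation
        self.read_cores = total - min_write # remaining cores favor reads

    def on_write_throughput_limit(self):
        # Write throughput limit reached: every write core is busy,
        # so move one core from the read side to the write side.
        if self.read_cores > 0:
            self.read_cores -= 1
            self.write_cores += 1
```

Keeping the write allocation minimal deliberately trades write latency for read throughput, which suits a read-dominated workload.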
  • Patent number: 9244688
    Abstract: Embodiments relate to using a branch target buffer preload table. An aspect includes receiving a search request to locate branch prediction information associated with a branch instruction. Searching is performed for an entry corresponding to the search request in a branch target buffer and a branch target buffer preload table in parallel. Based on locating a matching entry in the branch target buffer preload table corresponding to the search request and failing to locate the matching entry in the branch target buffer, a victim entry is selected to overwrite in the branch target buffer. Branch prediction information of the matching entry is received from the branch target buffer preload table at the branch target buffer. The victim entry in the branch target buffer is overwritten with the branch prediction information of the matching entry.
    Type: Grant
    Filed: November 25, 2013
    Date of Patent: January 26, 2016
    Assignee: International Business Machines Corporation
    Inventors: James J. Bonanno, Ulrich Mayer, Brian R. Prasky
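The BTB-plus-preload-table search flow above can be sketched as follows. This is an illustrative model: the names are assumptions, the parallel hardware search is serialized into two dictionary lookups, and the victim is chosen by simple insertion order rather than the patent's selection logic.

```python
class BTBWithPreload:
    def __init__(self, btb_size):
        self.btb = {}        # branch address -> branch prediction info
        self.preload = {}    # larger branch target buffer preload table
        self.btb_size = btb_size

    def search(self, addr):
        in_btb = self.btb.get(addr)          # both structures searched
        in_preload = self.preload.get(addr)  # "in parallel" in hardware
        if in_btb is not None:
            return in_btb
        if in_preload is not None:
            # Preload-table hit but BTB miss: pick a victim and overwrite
            # it with the preload table's prediction information.
            if len(self.btb) >= self.btb_size:
                victim = next(iter(self.btb))  # oldest entry as victim
                del self.btb[victim]
            self.btb[addr] = in_preload
            return in_preload
        return None
```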
  • Patent number: 9229745
    Abstract: A computing device identifies a load instruction and store instruction pair that causes a load-hit-store conflict. A processor tags a first load instruction that instructs the processor to load a first data set from memory. The processor stores an address at which the first load instruction is located in memory in a special purpose register. The processor determines whether the first load instruction has a load-hit-store conflict with a first store instruction. If the processor determines the first load instruction has a load-hit-store conflict with the first store instruction, the processor stores an address at which the first data set is located in memory in a second special purpose register, tags the first data set being stored by the first store instruction, stores an address at which the first store instruction is located in memory in a third special purpose register, and increases a conflict counter.
    Type: Grant
    Filed: September 12, 2012
    Date of Patent: January 5, 2016
    Assignee: International Business Machines Corporation
    Inventors: Venkat R. Indukuru, Alexander E. Mericas, Satish K. Sadasivam, Madhavi G. Valluri
  • Patent number: 9229746
    Abstract: A computing device identifies a load instruction and store instruction pair that causes a load-hit-store conflict. A processor tags a first load instruction that instructs the processor to load a first data set from memory. The processor stores an address at which the first load instruction is located in memory in a special purpose register. The processor determines whether the first load instruction has a load-hit-store conflict with a first store instruction. If the processor determines the first load instruction has a load-hit-store conflict with the first store instruction, the processor stores an address at which the first data set is located in memory in a second special purpose register, tags the first data set being stored by the first store instruction, stores an address at which the first store instruction is located in memory in a third special purpose register, and increases a conflict counter.
    Type: Grant
    Filed: December 18, 2013
    Date of Patent: January 5, 2016
    Assignee: International Business Machines Corporation
    Inventors: Venkat R. Indukuru, Alexander E. Mericas, Satish K. Sadasivam, Madhavi G. Valluri
  • Patent number: 9164912
    Abstract: According to an embodiment, a computer system for cache management includes a processor and a cache, the computer system configured to perform a method including receiving a first store request for a first address in the cache and receiving a first fetch request for the first address in the cache. The method also includes executing the first store request and the first fetch request, latching the first store request in a store write-back pipeline in the cache, detecting, in the processor, a conflict following execution of the first store request and the first fetch request, receiving the first store request from a recycle path including the store write-back pipeline, and executing the first store request a second time.
    Type: Grant
    Filed: June 13, 2012
    Date of Patent: October 20, 2015
    Assignee: International Business Machines Corporation
    Inventors: Khary J. Alexander, David A. Webber, Patrick M. West, Jr.
  • Patent number: 9158696
    Abstract: This disclosure provides techniques and apparatuses to enable early, run-ahead handling of instruction cache (IC) and instruction TLB (ITLB) misses by decoupling the ITLB and IC tag lookups from the IC data (instruction bytes) accesses, and making the ITLB and IC tag lookups run ahead of the IC data accesses.
    Type: Grant
    Filed: December 29, 2011
    Date of Patent: October 13, 2015
    Assignee: Intel Corporation
    Inventors: Ilhyun Kim, Alexandre J. Farcy, Choon Wei Khor, Robert L. Hinton
  • Patent number: 9106592
    Abstract: Controlling a buffered data transfer between a source and a destination by loading a source count value and a destination count value from a buffered data transfer device. A source delta value is computed by subtracting a source previous value from the source count value. The destination count value is adjusted on the buffered data transfer device by adding the source delta value to the destination count value. A destination delta value is computed by subtracting a destination previous value from the destination count value. The source count value is adjusted on the buffered data transfer device by adding the destination delta value to the source count value. A new value for the source previous value is computed by adding the source count value and the destination delta value. A new value for the destination previous value is computed by adding the destination count value and the source delta value.
    Type: Grant
    Filed: May 18, 2008
    Date of Patent: August 11, 2015
    Assignee: Western Digital Technologies, Inc.
    Inventors: Kenneth K. Arimura, Gregory B. Thelin, Rebekah A. Wilson
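The abstract above spells out its count-synchronization arithmetic step by step, which makes it easy to work through. The sketch below is one hedged reading of that arithmetic (assuming the deltas are computed in the order the abstract lists them, against the already-adjusted counts), with assumed names; note that with this reading the new "previous" values coincide with the adjusted counts.

```python
def sync_counts(src_count, dst_count, src_prev, dst_prev):
    """One synchronization pass over the loaded source/destination counts.
    Returns (src_count, dst_count, src_prev, dst_prev) after the pass."""
    src_delta = src_count - src_prev   # source delta: progress since last pass
    dst_count = dst_count + src_delta  # adjust destination count on the device
    dst_delta = dst_count - dst_prev   # destination delta
    src_count = src_count + dst_delta  # adjust source count on the device
    # New previous values: loaded count plus the other side's delta, which
    # equals the adjusted count computed above.
    return src_count, dst_count, src_count, dst_count
```

A useful sanity check on this reading: once both sides have been synchronized, running another pass with no new progress leaves every value unchanged.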
  • Patent number: 9086889
    Abstract: Techniques are disclosed relating to reducing the latency of restarting a pipeline in a processor that implements scouting. In one embodiment, the processor may reduce pipeline restart latency using two instruction fetch units that are configured to fetch and re-fetch instructions in parallel with one another. In some embodiments, the processor may reduce pipeline restart latency by initiating re-fetching instructions in response to determining that a commit operation is to be attempted with respect to one or more deferred instructions. In other embodiments, the processor may reduce pipeline restart latency by initiating re-fetching instructions in response to receiving an indication that a request for a set of data has been received by a cache, where the indication is sent by the cache before determining whether the data is present in the cache or not.
    Type: Grant
    Filed: April 27, 2010
    Date of Patent: July 21, 2015
    Assignee: Oracle International Corporation
    Inventors: Martin Karlsson, Sherman H. Yip, Shailender Chaudhry