Cache Pipelining Patents (Class 711/140)
-
Patent number: 12061677
Abstract: A secure processor, comprising a logic execution unit configured to process data based on instructions; a communication interface unit, configured to transfer the instructions and the data, and metadata tags accompanying respective instructions and data; a metadata processing unit, configured to enforce specific restrictions with respect to at least execution of instructions, access to resources, and manipulation of data, selectively dependent on the received metadata tags; and a control transfer processing unit, configured to validate a branch instruction execution and an entry point instruction of each control transfer, selectively dependent on the respective metadata tags.
Type: Grant
Filed: August 25, 2023
Date of Patent: August 13, 2024
Assignee: The Research Foundation for The State University of New York
Inventor: Kanad Ghose
-
Patent number: 12038843
Abstract: A joint scheduler adapted for dispatching prefetch and demand accesses of data relating to a plurality of instructions loaded in an execution pipeline of processing circuit(s). Each prefetch access comprises checking whether a respective data is cached in a cache entry, and each demand access comprises accessing a respective data. The joint scheduler is adapted to, responsive to each hit prefetch access dispatched for a respective data relating to a respective instruction, associate the respective instruction with a valid indication and a pointer to a respective cache entry storing the respective data, such that the demand access relating to the respective instruction uses the associated pointer to access the respective data in the cache, and, responsive to each missed prefetch access dispatched for a respective data relating to a respective instruction, initiate a read cycle for loading the respective data from next-level memory and caching it in the cache.
Type: Grant
Filed: December 13, 2023
Date of Patent: July 16, 2024
Assignee: Next Silicon Ltd
Inventors: Yiftach Gilad, Liron Zur
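A minimal Python sketch of the hit-pointer idea this abstract describes, under assumptions: the cache is modeled as plain dicts, and `JointScheduler`, `prefetch`, `demand`, and `next_level` are invented names for illustration, not the patent's terminology.

```python
class JointScheduler:
    def __init__(self):
        self.tags = {}        # cache tag (address) -> cache entry index
        self.entries = {}     # cache entry index -> data
        self.valid_ptr = {}   # instruction id -> entry index (valid indication)
        self.next_entry = 0

    def prefetch(self, instr_id, addr, next_level):
        """Dispatch a prefetch access for addr on behalf of instr_id."""
        if addr in self.tags:                    # hit: just associate a pointer
            self.valid_ptr[instr_id] = self.tags[addr]
        else:                                    # miss: initiate a read cycle
            data = next_level(addr)              # load from next-level memory
            idx, self.next_entry = self.next_entry, self.next_entry + 1
            self.tags[addr] = idx
            self.entries[idx] = data             # cache it in the cache
            self.valid_ptr[instr_id] = idx

    def demand(self, instr_id):
        """Demand access reuses the pointer instead of re-probing the tags."""
        return self.entries[self.valid_ptr[instr_id]]

sched = JointScheduler()
sched.prefetch("ld1", 0x40, next_level=lambda a: a * 2)  # miss -> fill
sched.prefetch("ld2", 0x40, next_level=lambda a: a * 2)  # hit -> pointer only
assert sched.demand("ld1") == sched.demand("ld2") == 0x80
```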
-
Patent number: 11755497
Abstract: Memory management apparatus comprises input circuitry to receive a translation request defining a first memory address within a first memory address space; prediction circuitry to generate a predicted second memory address within a second memory address space as a predicted translation of the first memory address, the predicted second memory address being a predetermined function of the first memory address; control circuitry to initiate processing of the predicted second memory address; translation and permission circuitry to perform an operation to generate a translated second memory address for the first memory address, associated with permission information to indicate whether memory access is permitted to the translated second memory address; and output circuitry to provide the translated second memory address as a response to the translation request when the permission information indicates that access is permitted to the translated second memory address.
Type: Grant
Filed: March 10, 2021
Date of Patent: September 12, 2023
Assignee: Arm Limited
Inventor: Andrew Brookfield Swaine
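A hedged sketch of the flow, assuming the "predetermined function" is a fixed linear offset; `start_speculative_access` and `PREDICTED_OFFSET` are hypothetical stand-ins, not from the patent.

```python
PREDICTED_OFFSET = 0x1000_0000   # assumed fixed offset for the prediction

def start_speculative_access(pa):
    pass  # placeholder: hardware would begin downstream work at this address

def predict_translation(va):
    return va + PREDICTED_OFFSET          # predetermined function of the VA

def translate(va, page_table, permitted):
    start_speculative_access(predict_translation(va))  # initiate processing early
    pa = page_table[va]                   # authoritative translation
    if pa not in permitted:
        raise PermissionError("memory access not permitted")
    return pa                             # output only once permission is known

page_table = {0x4000: 0x1000_4000}
print(hex(translate(0x4000, page_table, permitted={0x1000_4000})))
# Here the prediction matched; on a mismatch the speculative work is discarded.
```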
-
Patent number: 11741196
Abstract: A secure processor, comprising a logic execution unit configured to process data based on instructions; a communication interface unit, configured to transfer the instructions and the data, and metadata tags accompanying respective instructions and data; a metadata processing unit, configured to enforce specific restrictions with respect to at least execution of instructions, access to resources, and manipulation of data, selectively dependent on the received metadata tags; and a control transfer processing unit, configured to validate a branch instruction execution and an entry point instruction of each control transfer, selectively dependent on the respective metadata tags.
Type: Grant
Filed: November 14, 2019
Date of Patent: August 29, 2023
Assignee: The Research Foundation for The State University of New York
Inventor: Kanad Ghose
-
Patent number: 11625295
Abstract: A memory device is set to a performance mode. A data item is received. The data item is stored in a page of a logical unit of the memory device associated with a fault tolerant stripe. A redundancy metadata update for the fault tolerant stripe is delayed until a subsequent media management operation.
Type: Grant
Filed: May 10, 2021
Date of Patent: April 11, 2023
Assignee: Micron Technology, Inc.
Inventors: Seungjune Jeon, Zhenming Zhou, Jiangli Zhu
-
Patent number: 11545209
Abstract: Systems and methods for injecting a toggling signal in a command pipeline configured to receive multiple command types for the memory device. Toggling circuitry is configured to inject the toggling signal into at least a portion of the command pipeline when the memory device is in a power saving mode and the command pipeline is clear of valid commands. The toggling is blocked from causing writes by disabling a data strobe when a command that is invalid in the power saving mode is asserted during the power saving mode.
Type: Grant
Filed: May 28, 2021
Date of Patent: January 3, 2023
Assignee: Micron Technology, Inc.
Inventors: Parthasarathy Gajapathy, Kallol Mazumder
-
Patent number: 11531550
Abstract: Techniques are disclosed relating to an apparatus that includes a plurality of execution pipelines including first and second execution pipelines, a shared circuit that is shared by the first and second execution pipelines, and a decode circuit. The first and second execution pipelines are configured to concurrently perform operations for respective instructions. The decode circuit is configured to assign a first program thread to the first execution pipeline and a second program thread to the second execution pipeline. In response to determining that respective instructions from the first and second program threads that utilize the shared circuit are concurrently available for dispatch, the decode circuit is further configured to select between the first program thread and the second program thread.
Type: Grant
Filed: February 10, 2021
Date of Patent: December 20, 2022
Assignee: Cadence Design Systems, Inc.
Inventors: Robert T. Golla, Christopher Olson
-
Patent number: 11507498
Abstract: An apparatus including a memory structure comprising non-volatile memory cells and a microcontroller. The microcontroller is configured to output Core Timing Control (CTC) signals that are used to control voltages applied in the memory structure. In one aspect, information from which the CTC signals may be generated is pre-computed and stored. This pre-computation may be performed in a power on phase of the memory system. When a request to perform a memory operation is received, the stored information may be accessed and used to generate the CTC signals to control the memory operation. Thus, considerable time and/or power is saved. Note that this time savings occurs each time the memory operation is performed. Also, power is saved due to not having to repeatedly perform the computation.
Type: Grant
Filed: March 5, 2020
Date of Patent: November 22, 2022
Assignee: SanDisk Technologies LLC
Inventors: Yuheng Zhang, Yan Li
-
Patent number: 11445020
Abstract: Circuitry comprises a set of data handling nodes comprising: two or more master nodes each having respective storage circuitry to hold copies of data items from a main memory, each copy of a data item being associated with indicator information to indicate a coherency state of the respective copy, the indicator information being configured to indicate at least whether that copy has been updated more recently than the data item held by the main memory; a home node to serialise data access operations and to control coherency amongst data items held by the set of data handling nodes so that data written to a memory address is consistent with data read from that memory address in response to a subsequent access request; and one or more slave nodes including the main memory; in which: a requesting node of the set of data handling nodes is configured to communicate a conditional request to a target node of the set of data handling nodes in respect of a copy of a given data item at a given memory address, the condit…
Type: Grant
Filed: March 24, 2020
Date of Patent: September 13, 2022
Assignee: Arm Limited
Inventors: Jonathan Curtis Beard, Jamshed Jalal, Curtis Glenn Dunham, Roxana Rusitoru
-
Patent number: 11237965
Abstract: A cache coherent system includes a directory with more than one snoop filter, each of which stores information in a different set of snoop filter entries. Each snoop filter is associated with a subset of all caching agents within the system. Each snoop filter uses an algorithm chosen for best performance on the caching agents associated with the snoop filter. The number of snoop filter entries in each snoop filter is primarily chosen based on the caching capacity of just the caching agents associated with the snoop filter. The type of information stored in each snoop filter entry of each snoop filter is chosen to meet the desired filtering function of the specific snoop filter.
Type: Grant
Filed: December 31, 2014
Date of Patent: February 1, 2022
Assignee: ARTERIS, INC.
Inventors: Craig Stephen Forrest, David A. Kruckemyer
-
Patent number: 11080191
Abstract: A cache coherent system includes a directory with more than one snoop filter, each of which stores information in a different set of snoop filter entries. Each snoop filter is associated with a subset of all caching agents within the system. Each snoop filter uses an algorithm chosen for best performance on the caching agents associated with the snoop filter. The number of snoop filter entries in each snoop filter is primarily chosen based on the caching capacity of just the caching agents associated with the snoop filter. The type of information stored in each snoop filter entry of each snoop filter is chosen to meet the desired filtering function of the specific snoop filter.
Type: Grant
Filed: March 18, 2020
Date of Patent: August 3, 2021
Assignee: ARTERIS, INC.
Inventors: Craig Stephen Forrest, David A. Kruckemyer
-
Patent number: 10956161
Abstract: Provided is a method for predicting a target address using a set of Indirect Target TAgged GEometric (ITTAGE) tables and a target address pattern table. A branch instruction that is to be executed may be identified. A first tag for the branch instruction may be determined. The first tag may be a unique identifier that corresponds to the branch instruction. Using the tag, the branch instruction may be determined to be in a target address pattern table, and an index may be generated. A predicted target address for the branch instruction may be determined using the generated index and the largest ITTAGE table. Instructions associated with the predicted target address may be fetched.
Type: Grant
Filed: January 15, 2019
Date of Patent: March 23, 2021
Assignee: International Business Machines Corporation
Inventors: Satish Kumar Sadasivam, Puneeth A. H. Bhat, Shruti Saxena
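An illustrative sketch only, with invented structures: `pattern_table` maps a per-branch tag to a short recent-target pattern, and the "largest ITTAGE table" is modeled as a dict probed with an index derived from the tag and the pattern; the real indexing and tag functions are not specified here.

```python
import hashlib

def branch_tag(pc):
    """A unique identifier corresponding to the branch instruction."""
    return hashlib.blake2b(pc.to_bytes(8, "little"), digest_size=4).hexdigest()

pattern_table = {}    # tag -> tuple of recent targets (the pattern)
largest_table = {}    # (tag, index) -> predicted target address

def generate_index(tag):
    return hash((tag, pattern_table.get(tag, ()))) & 0x3FF

def predict(pc):
    tag = branch_tag(pc)
    if tag not in pattern_table:
        return None                       # tag missing from the pattern table
    return largest_table.get((tag, generate_index(tag)))

def train(pc, actual_target):
    tag = branch_tag(pc)
    history = pattern_table.get(tag, ())
    pattern_table[tag] = (history + (actual_target,))[-4:]  # bounded pattern
    largest_table[(tag, generate_index(tag))] = actual_target

train(0x4000, 0x5000)         # one resolved outcome trains both tables
print(hex(predict(0x4000)))   # 0x5000: index regenerated from the same pattern
```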
-
Patent number: 10901900
Abstract: A cache coherence management system includes: a set of directories distributed between nodes of a network for interconnecting processors including cache memories, each directory including a correspondence table between cache lines and information fields on the cache lines; and a mechanism updating the directories by adding, modifying, or deleting cache lines in the correspondence tables. In each correspondence table and for each cache line identified, at least one field is provided for indicating a possible blocking of a transaction relative to the cache line considered, when the blocking occurs in the node associated with the correspondence table considered. The system further includes a mechanism detecting fields indicating a transaction blocking and restarting each transaction detected as blocked from the node in which it is indicated as blocked.
Type: Grant
Filed: April 12, 2013
Date of Patent: January 26, 2021
Assignees: COMMISSARIAT A L'ENERGIE ATOMIQUE ET AUX ENERGIES ALTERNATIVES, BULL SAS
Inventors: Christian Bernard, Eric Guthmuller, Huy Nam Nguyen
-
Patent number: 10831660
Abstract: A processing unit for a multiprocessor data processing system includes a processor core having an upper level cache and a lower level cache coupled to the processor core. The lower level cache includes one or more state machines for handling requests snooped from the system interconnect. The processing unit includes an interrupt unit configured to, based on receipt of an interrupt request while the processor core is in a powered up state, record which of the one or more state machines are active processing a prior snooped request that can invalidate a cache line in the upper level cache, and present an interrupt to the processor core based on determining that each state machine that was active processing a prior snooped request that can invalidate a cache line in the upper level cache has completed processing of its respective prior snooped request.
Type: Grant
Filed: June 27, 2019
Date of Patent: November 10, 2020
Assignee: International Business Machines Corporation
Inventors: Derek E. Williams, Guy L. Guthrie, Hugh Shen
-
Patent number: 10657055
Abstract: An apparatus and method are provided for managing snoop operations. The apparatus has an interface for receiving access requests from any of N master devices that have associated cache storage, each access request specifying a memory address within memory associated with the apparatus. Snoop filter storage is provided that has a plurality of snoop filter entries, where each snoop filter entry identifies a memory portion and snoop control information indicative of the master devices that have accessed that memory portion. When an access request received at the interface specifies a memory address that is within the memory portion associated with a snoop filter entry, snoop control circuitry uses the snoop control information in that snoop filter entry to determine which master devices to subject to a snoop operation.
Type: Grant
Filed: December 13, 2018
Date of Patent: May 19, 2020
Assignee: Arm Limited
Inventors: Jamshed Jalal, Mark David Werkheiser, Gurunath Ramagiri, Mukesh Patel
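A minimal sketch of the lookup described above, assuming one entry per aligned memory portion and a set of master ids as the snoop control information; `SnoopFilter` and the 64-byte portion size are assumptions for illustration.

```python
class SnoopFilter:
    def __init__(self, portion_size=64):
        self.portion_size = portion_size  # bytes per snoop filter entry (assumed)
        self.entries = {}                 # portion base address -> master ids

    def record_access(self, master_id, addr):
        """Snoop control information: which masters accessed this portion."""
        base = addr - (addr % self.portion_size)
        self.entries.setdefault(base, set()).add(master_id)

    def masters_to_snoop(self, requester_id, addr):
        """Snoop only the masters the matching entry indicates, not all N."""
        base = addr - (addr % self.portion_size)
        return self.entries.get(base, set()) - {requester_id}

sf = SnoopFilter()
sf.record_access(0, 0x1000)
sf.record_access(2, 0x1008)              # same 64-byte portion as 0x1000
print(sf.masters_to_snoop(0, 0x1010))    # {2}: master 2 alone is snooped
```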
-
Patent number: 10635591
Abstract: Systems and methods selectively filter, buffer, and process cache coherency probes. A processor includes a probe buffering unit that includes a cache coherency probe buffer. The probe buffering unit receives cache coherency probes and memory access requests for a cache. The probe buffering unit identifies and discards any of the probes that are directed to a memory block that is not cached in the cache, and buffers at least a subset of the remaining probes in the probe buffer. The probe buffering unit submits to the cache, in descending order of priority, one or more of: any buffered probes that are directed to the memory block to which a current memory access request is also directed; any current memory access requests that are directed to a memory block to which there is not a buffered probe also directed; and any buffered probes when there is not a current memory access request.
Type: Grant
Filed: December 5, 2018
Date of Patent: April 28, 2020
Assignee: Advanced Micro Devices, Inc.
Inventors: Ashok T. Venkatachar, Anthony Jarvis
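A hedged sketch of the stated priority order; function and variable names are invented, and probes to uncached blocks are discarded before buffering, as the abstract describes.

```python
from collections import deque

def filter_probe(probe_addr, cached_blocks, probe_buffer):
    """Discard probes for blocks the cache does not hold; buffer the rest."""
    if probe_addr in cached_blocks:
        probe_buffer.append(probe_addr)

def next_operation(probe_buffer, current_request):
    if current_request is not None and current_request in probe_buffer:
        probe_buffer.remove(current_request)
        return ("probe", current_request)         # 1) probe colliding with request
    if current_request is not None:
        return ("access", current_request)        # 2) non-colliding access request
    if probe_buffer:
        return ("probe", probe_buffer.popleft())  # 3) drain probes when idle
    return None

buf, cached = deque(), {0x20, 0x30}
for p in (0x10, 0x20, 0x30):
    filter_probe(p, cached, buf)          # 0x10 is discarded (not cached)
print(next_operation(buf, 0x20))          # ('probe', 0x20): serviced first
print(next_operation(buf, 0x20))          # ('access', 0x20)
```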
-
Patent number: 10585803
Abstract: Cache memory mapping techniques are presented. A cache may contain an index configuration register. The register may configure the locations of an upper index portion and a lower index portion of a memory address. The portions may be combined to create a combined index. The configurable split-index addressing structure may be used, among other applications, to reduce the rate of cache conflicts occurring between multiple processors decoding a video frame in parallel.
Type: Grant
Filed: February 4, 2019
Date of Patent: March 10, 2020
Assignee: Movidius Limited
Inventor: Richard Richmond
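The bit manipulation is concrete enough for a worked sketch. Assumption: the index configuration register is modeled as explicit (position, width) parameters for the two index fields.

```python
def combined_index(addr, lower_pos, lower_bits, upper_pos, upper_bits):
    """Extract two address fields and concatenate them into one cache index."""
    lower = (addr >> lower_pos) & ((1 << lower_bits) - 1)
    upper = (addr >> upper_pos) & ((1 << upper_bits) - 1)
    return (upper << lower_bits) | lower

# Two addresses that share their low-order bits (e.g., the same offset in
# neighbouring video tiles) can still map to different sets via upper bits:
a, b = 0x123440, 0x567440
print(combined_index(a, 6, 4, 20, 4))   # 17: upper field contributes bit 20..23
print(combined_index(b, 6, 4, 20, 4))   # 81: different set despite equal low bits
```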
-
Patent number: 10534712
Abstract: A method for service level agreement (SLA) allocation of resources of a cache memory of a storage system, the method may include monitoring, by a control layer of the storage system, actual performances of the storage system that are related to multiple logical volumes; calculating actual-to-required relationships between the actual performances and SLA defined performances of the multiple logical volumes; assigning caching priorities to different logical volumes of the multiple logical volumes, wherein the assigning is based on, at least, the actual-to-required relationships; and managing, based on at least the caching priorities, a pre-cache memory module that is upstream to the cache module and is configured to store write requests that (i) are associated with one or more logical volumes of the different logical volumes and (ii) are received by the pre-cache memory module at points in time when the cache memory is full; wherein the managing comprises transferring one or more write requests from the pre-ca…
Type: Grant
Filed: August 29, 2016
Date of Patent: January 14, 2020
Assignee: INFINIDAT LTD.
Inventors: Qun Fan, Venu Nayar, Haim Helman
-
Patent number: 10430193
Abstract: A processor includes a first mode where the processor is not to use packed data operation masking, and a second mode where the processor is to use packed data operation masking. A decode unit is to decode an unmasked packed data instruction for a given packed data operation in the first mode, and to decode a masked packed data instruction for a masked version of the given packed data operation in the second mode. The instructions have a same instruction length. The masked instruction has bit(s) to specify a mask. Execution unit(s) are coupled with the decode unit. The execution unit(s), in response to the decode unit decoding the unmasked instruction in the first mode, are to perform the given packed data operation. The execution unit(s), in response to the decode unit decoding the masked instruction in the second mode, are to perform the masked version of the given packed data operation.
Type: Grant
Filed: June 1, 2018
Date of Patent: October 1, 2019
Assignee: Intel Corporation
Inventors: Bret L. Toll, Buford M. Guy, Ronak Singhal, Mishali Naik
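A hedged illustration of masked versus unmasked packed (SIMD-style) operation semantics, as pure Python stand-ins; this models the general masking behavior, not Intel's encoding.

```python
def packed_add(a, b):
    """Unmasked mode: the operation applies to every element."""
    return [x + y for x, y in zip(a, b)]

def masked_packed_add(a, b, mask, old):
    """Masked mode: mask bits select which elements are computed;
    unselected elements keep their prior destination value."""
    return [x + y if m else o for x, y, m, o in zip(a, b, mask, old)]

src1, src2 = [1, 2, 3, 4], [10, 20, 30, 40]
dest = [0, 0, 0, 0]
print(packed_add(src1, src2))                             # [11, 22, 33, 44]
print(masked_packed_add(src1, src2, [1, 0, 1, 0], dest))  # [11, 0, 33, 0]
```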
-
Patent number: 10424393
Abstract: Dynamic redundancy buffers for use with a device are disclosed. The dynamic redundancy buffers allow a memory array of the device to be operated with a high write error rate (WER). A first level redundancy buffer (e1 buffer) is coupled to the memory array. The e1 buffer may store data words that have failed verification or have not been verified. The e1 buffer may transfer data words to another dynamic redundancy buffer (e2 buffer). The e1 buffer may transfer data words that have failed to write to a memory array after a predetermined number of re-write attempts. The e1 buffer may also transfer data words upon power down.
Type: Grant
Filed: December 20, 2017
Date of Patent: September 24, 2019
Assignee: SPIN MEMORY, INC.
Inventors: Mourad El Baraji, Neal Berger, Benjamin Stanley Louie, Lester M. Crudele, Daniel L. Hillman, Barry Hoberman
-
Patent number: 10353826
Abstract: A data processing system includes a memory system, a first processing element, a first address translator that maps virtual addresses to system addresses, a second address translator that maps system addresses to physical addresses, and a task management unit. A first program task uses a first virtual memory space that is mapped to a first system address range using a first table. The context of the first program task includes an address of the first table and is cloned by creating a second table indicative of a mapping from a second virtual address space to a second range of system addresses, where the second range is mapped to the same physical addresses as the first range until a write occurs, at which time memory is allocated and the mapping of the second range is updated. The cloned context includes an address of the second table.
Type: Grant
Filed: July 14, 2017
Date of Patent: July 16, 2019
Assignee: Arm Limited
Inventors: Jonathan Curtis Beard, Roxana Rusitoru, Curtis Glenn Dunham
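A simplified copy-on-write sketch of the cloning step; the two-level (virtual to system to physical) indirection is collapsed into one table for illustration, and `RangeTable` and `allocate_frame` are invented names.

```python
class RangeTable:
    def __init__(self, mapping):
        self.mapping = mapping          # range id -> physical frame

    def clone(self):
        """Cloned context gets its own table, initially sharing all frames."""
        return RangeTable(dict(self.mapping))

    def write(self, range_id, allocate_frame):
        """First write to a shared range allocates memory and remaps."""
        self.mapping[range_id] = allocate_frame()  # allocation deferred to here
        return self.mapping[range_id]

frames = iter(range(100, 200))
parent = RangeTable({0: 7})
child = parent.clone()
assert child.mapping[0] == parent.mapping[0]   # shared until a write occurs
child.write(0, lambda: next(frames))
assert child.mapping[0] != parent.mapping[0]   # now backed by its own frame
```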
-
Patent number: 10275251
Abstract: A processor includes a first level register file, second level register file, and register file mapper. The first and second level register files are comprised of physical registers, with the first level register file more efficiently accessed relative to the second level register file. The register file mapper is coupled with the first and second level register files. The register file mapper comprises a mapping structure and register file mapper controller. The mapping structure hosts mappings between logical registers and physical registers of the first level register file. The register file mapper controller determines whether to map a destination logical register of an instruction to a physical register in the first level register file. The register file mapper controller also determines, based on metadata associated with the instruction, whether to write data associated with the destination logical register to one of the physical registers of the second level register file.
Type: Grant
Filed: October 31, 2012
Date of Patent: April 30, 2019
Assignee: International Business Machines Corporation
Inventors: Christopher M. Abernathy, Mary D. Brown, Dung Q. Nguyen
-
Patent number: 10248576
Abstract: The present invention provides a DRAM/NVM hierarchical heterogeneous memory system with software-hardware cooperative management schemes. In the system, NVM is used as large-capacity main memory, and DRAM is used as a cache to the NVM. Some reserved bits in the data structure of the TLB and last-level page table are employed effectively to eliminate hardware costs in the conventional hardware-managed hierarchical memory architecture. The cache management in such a heterogeneous memory system is pushed to the software level. Moreover, the invention is able to reduce memory access latency in case of last-level cache misses. Considering that many applications have relatively poor data locality in big data application environments, the conventional demand-based data fetching policy for DRAM cache can aggravate cache pollution.
Type: Grant
Filed: October 6, 2016
Date of Patent: April 2, 2019
Assignee: HUAZHONG UNIVERSITY OF SCIENCE AND TECHNOLOGY
Inventors: Hai Jin, Xiaofei Liao, Haikun Liu, Yujie Chen, Rentong Guo
-
Patent number: 10216448
Abstract: The storage system has one or more storage drives, and one or more controllers for receiving processing requests from a superior device, wherein each of the one or more controllers has a processor for executing the processing request and an accelerator, and the accelerator has multiple internal data memories and an internal control memory. If the processing request is a read I/O request, the accelerator stores control information regarding the request in the internal control memory, and reads data being the target of the relevant request from at least one storage drive out of the multiple storage drives, which is temporarily stored in the one or more internal data memories and transferred sequentially, in order, from the internal data memory already storing data to the superior device.
Type: Grant
Filed: September 11, 2014
Date of Patent: February 26, 2019
Assignee: Hitachi, Ltd.
Inventors: Kazushi Nakagawa, Masanori Takada, Norio Simozono
-
Patent number: 10191748
Abstract: In one embodiment, a processor includes a decode logic, an issue logic to issue decoded instructions, and at least one execution logic to execute issued instructions of a program. The at least one execution logic is to execute at least some instructions of the program out-of-order, and the decode logic is to decode and provide a first in-order memory instruction of the program to the issue logic. In turn, the issue logic is to order the first in-order memory instruction ahead of a second in-order memory instruction of the program. Other embodiments are described and claimed.
Type: Grant
Filed: November 30, 2015
Date of Patent: January 29, 2019
Assignee: Intel IP Corporation
Inventor: Jacob Mathew
-
Patent number: 10146690
Abstract: In an embodiment, a processor includes a plurality of cores and synchronization logic. The synchronization logic includes circuitry to: receive a first memory request and a second memory request; determine whether the second memory request is in contention with the first memory request; and in response to a determination that the second memory request is in contention with the first memory request, process the second memory request using a non-blocking cache coherence protocol. Other embodiments are described and claimed.
Type: Grant
Filed: June 13, 2016
Date of Patent: December 4, 2018
Assignee: Intel Corporation
Inventors: Samantika S. Sury, Robert G. Blankenship, Simon C. Steely, Jr.
-
Patent number: 10127161
Abstract: A method for the coexistence of software having different safety levels in a multicore processor which has at least two processor cores (2, 3). A memory range (4, 5) is associated with each processor core (2, 3), and a plurality of software (SW1, SW2), each having a predefined safety level, is processed on the processor cores (2, 3). Software having a predefined safety level is processed only on the processor core (2, 3) with which the same safety level is associated, and during the processing of the software (SW1, SW2), the processor core (2, 3) accesses only the protected memory range (4, 5) which is permanently associated with this processor core (2, 3).
Type: Grant
Filed: January 23, 2015
Date of Patent: November 13, 2018
Assignee: Robert Bosch GmbH
Inventors: Peter Wegner, Jochen Ulrich Haenger, Markus Schweizer, Carsten Gebauer, Bernd Mueller, Thomas Heinz
-
Patent number: 10095524
Abstract: A processing system and method includes a predecoder configured to identify instructions that are combinable to form a single, executable internal instruction. Instruction storage is configured to merge instructions that are combinable. An instruction execution unit is configured to execute the single, executable internal instruction on a wide hardware datapath.
Type: Grant
Filed: November 18, 2014
Date of Patent: October 9, 2018
Assignee: International Business Machines Corporation
Inventors: Michael Gschwind, Balaram Sinharoy
-
Patent number: 10095629
Abstract: Generally discussed herein are systems, devices, and methods for local and remote dual address decoding. According to an example, a node can include one or more processors to generate a first memory request, the first memory request including a first address and a node identification; a caching agent coupled to the one or more processors, the caching agent to determine that the first address is homed to a remote node remote to the local node; a network interface controller (NIC) coupled to the caching agent, the NIC to produce a second memory request based on the first memory request; and the one or more processors further to receive a response to the second memory request, the response generated by a switch coupled to the NIC, where the switch includes a remote system address decoder to determine a node identification to which the second memory request is homed.
Type: Grant
Filed: September 28, 2016
Date of Patent: October 9, 2018
Assignee: Intel Corporation
Inventors: Francesc Cesc Guim Bernat, Kshitij A. Doshi, Steen Larsen, Mark A Schmisseur, Raj K. Ramanujan
-
Patent number: 10048899
Abstract: A storage device includes a storage medium and a controller configured to control the storage medium. The controller includes an interface unit configured to interface with a host, a processing unit connected to the interface unit via a first signal line and configured to process a direct load operation and a direct store operation between the host and the controller, and at least one memory connected to the interface unit via a second signal line. The at least one memory is configured to temporarily store data read from the storage medium or data received from the host, and is configured to be directly accessed by the host.
Type: Grant
Filed: February 4, 2015
Date of Patent: August 14, 2018
Assignee: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Myeong-Eun Hwang, Ki-Jo Jung, Tae-Hack Lee, Kwang-Ho Choi, Sang-Kyoo Jeong
-
Patent number: 9823932
Abstract: A tagged geometric length (TAGE) branch predictor incorporates multiple prediction tables. Each of these prediction tables has prediction storage lines which store a common stored TAG value and a plurality of branch predictions in respect of different offset positions within a block of program instructions read in parallel. Each of the branch predictions has an associated validity indicator. Updates to stored predictions may be made by a partial allocation mechanism, in which a TAG match occurs and a branch storage line is partially overwritten, or by full allocation, in which no victim storage line with a matching TAG can be identified and instead a whole prediction storage line is cleared and the new prediction stored therein.
Type: Grant
Filed: April 20, 2015
Date of Patent: November 21, 2017
Assignee: ARM Limited
Inventor: Houdhaifa Bouzguarrou
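A hedged sketch of one prediction-storage line as described: a shared tag plus per-offset predictions with validity bits, updated by partial or full allocation. `PredictionLine`, `BLOCK_OFFSETS`, and the table size are invented for illustration.

```python
BLOCK_OFFSETS = 4   # predictions per line (instructions fetched in parallel)

class PredictionLine:
    def __init__(self):
        self.tag = None
        self.valid = [False] * BLOCK_OFFSETS
        self.taken = [False] * BLOCK_OFFSETS

def lookup(table, index, tag, offset):
    line = table[index]
    if line.tag == tag and line.valid[offset]:
        return line.taken[offset]       # prediction for this slot in the block
    return None                         # no prediction from this table

def update(table, index, tag, offset, taken):
    line = table[index]
    if line.tag == tag:                 # partial allocation: overwrite one slot
        line.valid[offset] = True
        line.taken[offset] = taken
    else:                               # full allocation: clear the whole line
        line.tag = tag
        line.valid = [False] * BLOCK_OFFSETS
        line.taken = [False] * BLOCK_OFFSETS
        line.valid[offset], line.taken[offset] = True, taken

table = [PredictionLine() for _ in range(16)]
update(table, 3, tag=0xAB, offset=1, taken=True)
print(lookup(table, 3, 0xAB, 1))   # True
print(lookup(table, 3, 0xAB, 0))   # None: other offsets still invalid
```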
-
Patent number: 9727466
Abstract: An interconnect and method of managing a snoop filter within such an interconnect are provided. The interconnect is used to connect a plurality of devices, including a plurality of master devices where one or more of the master devices has an associated cache storage. The interconnect comprises coherency control circuitry to perform coherency control operations for data access transactions received by the interconnect from the master devices. In performing those operations, the coherency control circuitry has access to snoop filter circuitry that maintains address-dependent caching indication data, and is responsive to a data access transaction specifying a target address to produce snoop control data providing an indication of which master devices have cached data for the target address in their associated cache storage.
Type: Grant
Filed: August 11, 2015
Date of Patent: August 8, 2017
Assignee: ARM Limited
Inventors: Andrew David Tune, Sean James Salisbury
-
Patent number: 9727469
Abstract: According to one aspect of the present disclosure, a method and technique for performance-driven cache line memory access is disclosed. The method includes: receiving, by a memory controller of a data processing system, a request for a cache line; dividing the request into a plurality of cache subline requests, wherein at least one of the cache subline requests comprises a high priority data request and at least one of the cache subline requests comprises a low priority data request; servicing the high priority data request; and delaying servicing of the low priority data request until a low priority condition has been satisfied.
Type: Grant
Filed: February 15, 2013
Date of Patent: August 8, 2017
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Robert H. Bell, Jr., Men-Chow Chiang, Hong L. Hua, Mysore S. Srinivas
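A sketch of splitting one cache-line request into prioritized subline requests. The line/subline sizes, the critical-word heuristic, and the bus-idle "low priority condition" are assumptions, not taken from the patent.

```python
LINE_BYTES, SUBLINE_BYTES = 128, 32

def split_request(line_addr, critical_offset):
    """The subline holding the critical word is high priority; rest are low."""
    sublines = []
    for off in range(0, LINE_BYTES, SUBLINE_BYTES):
        hi = off <= critical_offset < off + SUBLINE_BYTES
        sublines.append({"addr": line_addr + off, "high_priority": hi})
    return sublines

def service(sublines, bus_idle):
    served = [s for s in sublines if s["high_priority"]]   # serviced immediately
    if bus_idle:                                           # low priority condition
        served += [s for s in sublines if not s["high_priority"]]
    return served

reqs = split_request(0x8000, critical_offset=70)
print([hex(s["addr"]) for s in service(reqs, bus_idle=False)])  # ['0x8040']
```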
-
Patent number: 9690591
Abstract: A technique to enable efficient instruction fusion within a computer system is disclosed. In one embodiment, processor logic delays the processing of a first instruction for a threshold amount of time if the first instruction within an instruction queue is fusible with a second instruction.
Type: Grant
Filed: October 30, 2008
Date of Patent: June 27, 2017
Assignee: Intel Corporation
Inventors: Ido Ouziel, Lihu Rappoport, Robert Valentine, Ron Gabor, Pankaj Raghuvanshi
-
Patent number: 9652396
Abstract: A method for accessing a cache memory structure includes dividing multiple cache elements of a cache memory structure into multiple groups. A serial probing process of the multiple groups is performed. Upon a tag hit during the serial probing process, probing of the remaining groups is skipped.
Type: Grant
Filed: December 27, 2013
Date of Patent: May 16, 2017
Assignee: Samsung Electronics Co., Ltd.
Inventors: Michael G. Butler, Magnus Ekman
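An illustrative serial probe, under the assumption that "cache elements" are ways split into fixed-size groups probed one at a time, stopping early on a hit (which is where the power/latency saving comes from).

```python
def serial_probe(ways, addr_tag, group_size=2):
    """ways: list of (tag, data). Returns (data, groups_probed) or (None, n)."""
    groups = [ways[i:i + group_size] for i in range(0, len(ways), group_size)]
    for n, group in enumerate(groups, start=1):
        for tag, data in group:          # one group's tags compared in parallel
            if tag == addr_tag:
                return data, n           # hit: exit before the remaining groups
    return None, len(groups)

ways = [(0x1, "a"), (0x2, "b"), (0x3, "c"), (0x4, "d")]
print(serial_probe(ways, 0x2))   # ('b', 1): the second group is never probed
print(serial_probe(ways, 0x9))   # (None, 2): a full miss costs all groups
```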
-
Patent number: 9626294
Abstract: According to one aspect of the present disclosure, a system and technique for performance-driven cache line memory access is disclosed. The system includes: a processor, a cache hierarchy coupled to the processor, and a memory coupled to the cache hierarchy. The system also includes logic executable to, responsive to receiving a request for a cache line: divide the request into a plurality of cache subline requests, wherein at least one of the cache subline requests comprises a high priority data request and at least one of the cache subline requests comprises a low priority data request; service the high priority data request; and delay servicing of the low priority data request until a low priority condition has been satisfied.
Type: Grant
Filed: October 3, 2012
Date of Patent: April 18, 2017
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Robert H. Bell, Jr., Men-Chow Chiang, Hong L. Hua, Mysore S. Srinivas
-
Patent number: 9525890
Abstract: A decoding method for an audio video coding standard (AVS) system is provided. According to a stop-fetching criterion, a stop-fetching flag is set to an enabled status or a disabled status. In an offset fetching procedure, it is determined whether an offset value is smaller than a threshold and whether the stop-fetching flag is in the disabled status. When a determination result is affirmative, one subsequent bit is fetched for the offset value, an offset shift value is correspondingly increased, and the determination step is iterated. When the determination result is negative, the offset fetching procedure is terminated. Next, it is determined whether a decoding result is a least probable symbol (LPS) or a most probable symbol (MPS).
Type: Grant
Filed: June 18, 2014
Date of Patent: December 20, 2016
Assignee: MSTAR SEMICONDUCTOR, INC.
Inventors: He-Yuan Lin, Yi-Shin Tung
-
Patent number: 9507716
Abstract: An interconnect has coherency control circuitry for performing coherency control operations and a snoop filter for identifying which devices coupled to the interconnect have cached data from a given address. When an address is looked up in the snoop filter and misses, and there is no spare snoop filter entry available, then the snoop filter selects a victim entry corresponding to a victim address, and issues an invalidate transaction for invalidating locally cached copies of the data identified by the victim. The coherency control circuitry for performing coherency checking operations for data access transactions is reused for performing coherency control operations for the invalidate transaction issued by the snoop filter. This greatly reduces the circuitry complexity of the snoop filter.
Type: Grant
Filed: March 6, 2015
Date of Patent: November 29, 2016
Assignee: ARM Limited
Inventors: Sean James Salisbury, Andrew David Tune, Jamshed Jalal, Mark David Werkheiser, Arthur Laughton, George Robert Scott Lloyd, Peter Andrew Riocreux, Daniel Sara
-
Patent number: 9323679
Abstract: A system, method, and computer program product are provided for managing miss requests. In use, a miss request is received at a unified miss handler from one of a plurality of distributed local caches. Additionally, the miss request is managed, utilizing the unified miss handler.
Type: Grant
Filed: August 14, 2012
Date of Patent: April 26, 2016
Assignee: NVIDIA Corporation
Inventors: Brucek Kurdo Khailany, Ronny Meir Krashinsky, James David Balfour
-
Patent number: 9304926
Abstract: A coherent memory system includes a plurality of level 1 cache memories 6 connected via interconnect circuitry 18 to a level 2 cache memory 8. Coherency control circuitry 10 manages coherency between lines of data. Evict messages from the level 1 cache memories to the coherency control circuitry 10 are sent via the read address channel AR. Read messages are also sent via the read address channel AR. The read address channel AR is configured such that a read message may not be reordered relative to an evict message. The coherency control circuitry 10 is configured such that a read message will not be processed ahead of an evict message. The level 1 cache memories 6 do not track in-flight evict messages. No acknowledgement of an evict message is sent from the coherency control circuitry 10 back to the level 1 cache memory 6.
Type: Grant
Filed: July 23, 2013
Date of Patent: April 5, 2016
Assignee: ARM Limited
Inventors: Ian Bratt, Mladen Wilder, Ole Henrik Jahren
-
Patent number: 9304714
Abstract: A system and method is described for operating a computer memory system having a plurality of controllers capable of accessing a common set of memory modules. Access to the physical storage of the memory modules may be managed by configuration logical units (LUNs) addressable by the users. The amount of memory associated with each LUN may be managed in units of memory (LMA) from a same free LMA table maintained in each controller of the plurality of controllers. A request for maintenance of a LUN may be received from any user through any controller and results in the association of a free memory area with the LUN, and the remaining controllers perform the same operation. A test for misallocation of a free memory area is performed, and when such misallocation occurs, the situation is corrected in accordance with a policy.
Type: Grant
Filed: April 18, 2013
Date of Patent: April 5, 2016
Assignee: VIOLIN MEMORY INC
Inventor: Jon C. R. Bennett
-
Patent number: 9286068
Abstract: A processor includes an execution unit, a first level register file, a second level register file, a plurality of storage locations and a register file bypass controller. The first and second level register files are comprised of physical registers, with the first level register file more efficiently accessed relative to the second level register file. The register file bypass controller is coupled with the execution unit and second level register file. The register file bypass controller determines whether an instruction indicates a logical register is unmapped from a physical register in the first level register file. The register file controller also loads data into one of the storage locations and selects one of the storage locations as input to the execution unit, without mapping the logical register to one of the physical registers in the first level register file.
Type: Grant
Filed: October 31, 2012
Date of Patent: March 15, 2016
Assignee: International Business Machines Corporation
Inventors: Christopher M. Abernathy, Mary D. Brown, Sundeep Chadha, Dung Q. Nguyen
-
Patent number: 9262080
Abstract: In a read processing storage system, using a pool of CPU cores, the CPU cores are assigned to process write operations, read operations, or both read and write operations, as scheduled for processing. A minimal number of the CPU cores are allocated for processing the write operations, thereby increasing write latency. Upon reaching a throughput limit for the write operations that causes the minimal number of the plurality of CPU cores to reach a busy status, the minimal number of the plurality of CPU cores for processing the write operations is increased.
Type: Grant
Filed: January 6, 2015
Date of Patent: February 16, 2016
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Jonathan Amit, Amir Lidor, Sergey Marenkov, Rostislav Raikhman
-
Patent number: 9244688
Abstract: Embodiments relate to using a branch target buffer preload table. An aspect includes receiving a search request to locate branch prediction information associated with a branch instruction. Searching is performed for an entry corresponding to the search request in a branch target buffer and a branch target buffer preload table in parallel. Based on locating a matching entry in the branch target buffer preload table corresponding to the search request and failing to locate the matching entry in the branch target buffer, a victim entry is selected to overwrite in the branch target buffer. Branch prediction information of the matching entry is received from the branch target buffer preload table at the branch target buffer. The victim entry in the branch target buffer is overwritten with the branch prediction information of the matching entry.
Type: Grant
Filed: November 25, 2013
Date of Patent: January 26, 2016
Assignee: International Business Machines Corporation
Inventors: James J. Bonanno, Ulrich Mayer, Brian R. Prasky
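A hedged model of the parallel search: both structures are probed, and a preload-table hit that misses in the BTB promotes the prediction into the BTB over a victim entry. The dict structures and `pick_victim` policy are assumptions.

```python
btb = {}            # small, fast structure: branch addr -> prediction
preload_table = {}  # larger preload structure: branch addr -> prediction

def search(branch_addr, pick_victim):
    hit_btb = btb.get(branch_addr)                 # both searched in parallel
    hit_pre = preload_table.get(branch_addr)
    if hit_btb is not None:
        return hit_btb
    if hit_pre is not None:
        victim = pick_victim(btb)                  # select an entry to overwrite
        if victim is not None:
            del btb[victim]
        btb[branch_addr] = hit_pre                 # promote the preloaded entry
        return hit_pre
    return None

preload_table[0x400] = ("taken", 0x500)
print(search(0x400, pick_victim=lambda b: next(iter(b), None)))  # promoted
print(0x400 in btb)   # True: future lookups hit in the BTB directly
```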
-
Patent number: 9229745
Abstract: A computing device identifies a load instruction and store instruction pair that causes a load-hit-store conflict. A processor tags a first load instruction that instructs the processor to load a first data set from memory. The processor stores an address at which the first load instruction is located in memory in a special purpose register. The processor determines whether the first load instruction has a load-hit-store conflict with a first store instruction. If the processor determines the first load instruction has a load-hit-store conflict with the first store instruction, the processor stores an address at which the first data set is located in memory in a second special purpose register, tags the first data set being stored by the first store instruction, stores an address at which the first store instruction is located in memory in a third special purpose register, and increases a conflict counter.
Type: Grant
Filed: September 12, 2012
Date of Patent: January 5, 2016
Assignee: International Business Machines Corporation
Inventors: Venkat R. Indukuru, Alexander E. Mericas, Satish K. Sadasivam, Madhavi G. Valluri
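A simplified model of the detection bookkeeping: three special-purpose registers capture the load PC, the data address, and the store PC when a conflict is found, and a counter is bumped. `LHSMonitor` and its store-queue model are invented for illustration.

```python
class LHSMonitor:
    def __init__(self):
        self.spr_load_pc = None    # SPR 1: address of the tagged load
        self.spr_data_addr = None  # SPR 2: address of the conflicting data
        self.spr_store_pc = None   # SPR 3: address of the conflicting store
        self.conflict_count = 0
        self.pending_stores = {}   # data addr -> store pc (not yet drained)

    def store(self, pc, data_addr):
        self.pending_stores[data_addr] = pc

    def tagged_load(self, pc, data_addr):
        """A load hitting a not-yet-written store queue entry is a conflict."""
        if data_addr in self.pending_stores:
            self.spr_load_pc = pc
            self.spr_data_addr = data_addr
            self.spr_store_pc = self.pending_stores[data_addr]
            self.conflict_count += 1

m = LHSMonitor()
m.store(pc=0x100, data_addr=0xBEEF)
m.tagged_load(pc=0x104, data_addr=0xBEEF)
print(m.conflict_count, hex(m.spr_store_pc))   # 1 0x100
```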
-
Patent number: 9229746
Abstract: A computing device identifies a load instruction and store instruction pair that causes a load-hit-store conflict. A processor tags a first load instruction that instructs the processor to load a first data set from memory. The processor stores an address at which the first load instruction is located in memory in a special purpose register. The processor determines whether the first load instruction has a load-hit-store conflict with a first store instruction. If the processor determines the first load instruction has a load-hit-store conflict with the first store instruction, the processor stores an address at which the first data set is located in memory in a second special purpose register, tags the first data set being stored by the first store instruction, stores an address at which the first store instruction is located in memory in a third special purpose register, and increases a conflict counter.
Type: Grant
Filed: December 18, 2013
Date of Patent: January 5, 2016
Assignee: International Business Machines Corporation
Inventors: Venkat R. Indukuru, Alexander E. Mericas, Satish K. Sadasivam, Madhavi G. Valluri
-
Patent number: 9164912
Abstract: According to an embodiment, a computer system for cache management includes a processor and a cache, the computer system configured to perform a method including receiving a first store request for a first address in the cache and receiving a first fetch request for the first address in the cache. The method also includes executing the first store request and the first fetch request, latching the first store request in a store write-back pipeline in the cache, detecting, in the processor, a conflict following execution of the first store request and the first fetch request, and receiving the first store request from a recycle path including the store write-back pipeline and executing the first store request a second time.
Type: Grant
Filed: June 13, 2012
Date of Patent: October 20, 2015
Assignee: International Business Machines Corporation
Inventors: Khary J. Alexander, David A. Webber, Patrick M. West, Jr.
-
Patent number: 9158696
Abstract: This disclosure provides techniques and apparatuses to enable early, run-ahead handling of instruction cache (IC) and instruction TLB (ITLB) misses by decoupling the ITLB and IC tag lookups from the IC data (instruction bytes) accesses, and making ITLB and IC tag lookups run ahead of the IC data accesses.
Type: Grant
Filed: December 29, 2011
Date of Patent: October 13, 2015
Assignee: Intel Corporation
Inventors: Ilhyun Kim, Alexandre J. Farcy, Choon Wei Khor, Robert L. Hinton
-
Patent number: 9106592
Abstract: Controlling a buffered data transfer between a source and a destination by loading a source count value and a destination count value from a buffered data transfer device. A source delta value is computed by subtracting a source previous value from the source count value. The destination count value is adjusted on the buffered data transfer device by adding the source delta value to the destination count value. A destination delta value is computed by subtracting a destination previous value from the destination count value. The source count value is adjusted on the buffered data transfer device by adding the destination delta value to the source count value. A new value for the source previous value is computed by adding the source count value and the destination delta value. A new value for the destination previous value is computed by adding the destination count value and the source delta value.
Type: Grant
Filed: May 18, 2008
Date of Patent: August 11, 2015
Assignee: Western Digital Technologies, Inc.
Inventors: Kenneth K. Arimura, Gregory B. Thelin, Rebekah A. Wilson
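The abstract spells out its arithmetic step by step, so here is a direct transcription as a worked sketch; `Device` is a stand-in for the buffered data transfer device's two count registers.

```python
class Device:
    """Stand-in for the buffered data transfer device's count registers."""
    def __init__(self):
        self.source_count = 0
        self.dest_count = 0

def sync(dev, src_prev, dst_prev):
    src_count = dev.source_count              # load source count value
    dst_count = dev.dest_count                # load destination count value
    src_delta = src_count - src_prev          # source delta
    dev.dest_count = dst_count + src_delta    # adjust destination count
    dst_delta = dst_count - dst_prev          # destination delta
    dev.source_count = src_count + dst_delta  # adjust source count
    return src_count + dst_delta, dst_count + src_delta  # new previous values

dev = Device()
dev.source_count = 5                        # source side produced 5 units
src_prev, dst_prev = sync(dev, 0, 0)
print(dev.dest_count, src_prev, dst_prev)   # 5 5 5
```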
-
Patent number: 9086889
Abstract: Techniques are disclosed relating to reducing the latency of restarting a pipeline in a processor that implements scouting. In one embodiment, the processor may reduce pipeline restart latency using two instruction fetch units that are configured to fetch and re-fetch instructions in parallel with one another. In some embodiments, the processor may reduce pipeline restart latency by initiating re-fetching instructions in response to determining that a commit operation is to be attempted with respect to one or more deferred instructions. In other embodiments, the processor may reduce pipeline restart latency by initiating re-fetching instructions in response to receiving an indication that a request for a set of data has been received by a cache, where the indication is sent by the cache before determining whether the data is present in the cache or not.
Type: Grant
Filed: April 27, 2010
Date of Patent: July 21, 2015
Assignee: Oracle International Corporation
Inventors: Martin Karlsson, Sherman H. Yip, Shailender Chaudhry