Patents by Inventor Per H. Hammarlund

Per H. Hammarlund has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11544193
    Abstract: A scalable cache coherency protocol for a system including a plurality of coherent agents coupled to one or more memory controllers is described. The memory controller may implement a precise directory for cache blocks from the memory to which the memory controller is coupled. Multiple requests to a cache block may be outstanding, and snoops and completions for requests may include an expected cache state at the receiving agent, as indicated by a directory in the memory controller when the request was processed, to allow the receiving agent to detect race conditions. In an embodiment, the cache states may include a primary shared and a secondary shared state. The primary shared state may apply to a coherent agent that bears responsibility for transmitting a copy of the cache block to a requesting agent. In an embodiment, at least two types of snoops may be supported: snoop forward and snoop back.
    Type: Grant
    Filed: May 10, 2021
    Date of Patent: January 3, 2023
    Assignee: Apple Inc.
    Inventors: James Vash, Gaurav Garg, Brian P. Lilly, Ramesh B. Gunna, Steven R. Hutsell, Lital Levy-Rubin, Per H. Hammarlund, Harshavardhan Kaushikkar
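To make the expected-state mechanism concrete, here is a minimal C sketch of a snoop handler that compares the directory's expected state against the local state to detect a race; the state names, the snoop structure, and the deferral policy are assumptions for illustration, not the claimed protocol:
```c
/* Hypothetical sketch of race detection via an expected-state field in a
 * snoop. All names and the deferral policy are assumptions. */
#include <stdio.h>
#include <stdbool.h>

typedef enum {
    INVALID, SECONDARY_SHARED, PRIMARY_SHARED, EXCLUSIVE, MODIFIED
} cache_state_t;

typedef enum { SNOOP_FORWARD, SNOOP_BACK } snoop_kind_t;

typedef struct {
    snoop_kind_t  kind;      /* forward data to requester, or back to memory */
    cache_state_t expected;  /* state the directory believed we were in */
    unsigned long block;     /* cache block address */
} snoop_t;

/* Returns true if the snoop was absorbed now; false if it must be deferred
 * because a racing transaction is still in flight at this agent. */
bool handle_snoop(cache_state_t *local, const snoop_t *s)
{
    if (*local != s->expected) {
        /* Directory state and local state disagree: a completion for an
         * outstanding request has not arrived yet. Defer the snoop. */
        printf("block %#lx: race detected (have %d, expected %d), deferring\n",
               s->block, (int)*local, (int)s->expected);
        return false;
    }
    if (s->kind == SNOOP_FORWARD && *local == PRIMARY_SHARED) {
        /* The primary-shared holder forwards its copy to the requester
         * and keeps a secondary-shared copy. */
        *local = SECONDARY_SHARED;
    } else if (s->kind == SNOOP_BACK) {
        /* Dirty data is written back toward the memory controller. */
        *local = INVALID;
    }
    return true;
}

int main(void)
{
    cache_state_t line = PRIMARY_SHARED;
    snoop_t s = { SNOOP_FORWARD, PRIMARY_SHARED, 0x1000 };
    handle_snoop(&line, &s);           /* states agree: absorbed */
    s.expected = EXCLUSIVE;
    handle_snoop(&line, &s);           /* mismatch: deferred */
    return 0;
}
```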
  • Patent number: 11513848
    Abstract: In an embodiment, a system includes rate limiter circuits corresponding to various agents that issue transactions in a virtual channel. At least one agent may be identified as a critical agent, and different rate limits may be selected for the other agents depending on whether the critical agent is on (e.g., lower limits) or off (e.g., higher limits).
    Type: Grant
    Filed: April 1, 2021
    Date of Patent: November 29, 2022
    Assignee: Apple Inc.
    Inventors: Per H. Hammarlund, Liran Fishel, Roman Gindin
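A minimal sketch of the mode-switched limiting described above, assuming a simple per-window credit scheme; the agent count, credit values, and window mechanism are invented for illustration:
```c
/* Mode-switched rate limiting: when the critical agent is active, the
 * other agents get lower transaction-rate limits. */
#include <stdio.h>
#include <stdbool.h>

#define NUM_AGENTS 3

/* Credits replenished per window: column 0 = critical agent off,
 * column 1 = critical agent on. Values are illustrative. */
static const int rate_limit[NUM_AGENTS][2] = {
    { 8, 8 },   /* the critical agent keeps its full rate */
    { 6, 2 },   /* others are throttled while the critical agent is on */
    { 6, 2 },
};

static int credits[NUM_AGENTS];

void new_window(bool critical_on)
{
    for (int a = 0; a < NUM_AGENTS; a++)
        credits[a] = rate_limit[a][critical_on ? 1 : 0];
}

bool may_issue(int agent)
{
    if (credits[agent] <= 0)
        return false;      /* over the limit: hold the transaction */
    credits[agent]--;
    return true;
}

int main(void)
{
    new_window(true);      /* critical agent is on: agent 1 gets 2 credits */
    for (int i = 0; i < 4; i++)
        printf("agent 1 issue %d: %s\n", i, may_issue(1) ? "ok" : "held");
    return 0;
}
```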
  • Publication number: 20220342806
    Abstract: In an embodiment, a system may support programmable hashing of address bits at a plurality of levels of granularity to map memory addresses to memory controllers and ultimately at least to memory devices. The hashing may be programmed to distribute pages of memory across the memory controllers, and consecutive blocks of the page may be mapped to physically distant memory controllers. In an embodiment, address bits may be dropped from each level of granularity, forming a compacted pipe address to save power within the memory controller. In an embodiment, a memory folding scheme may be employed to reduce the number of active memory devices and/or memory controllers in the system when the full complement of memory is not needed.
    Type: Application
    Filed: November 4, 2021
    Publication date: October 27, 2022
    Inventors: Steven Fishwick, Jeffry E. Gonion, Per H. Hammarlund, Eran Tamari, Lior Zimet, Gerard R. Williams, III
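A rough illustration of hashing address bits to pick a memory controller and then dropping the consumed bits to form a compacted pipe address; the masks, the drop positions, and the XOR-reduction are assumptions, since the real scheme is programmable per level of granularity:
```c
/* Speculative sketch of multi-level address-bit hashing. */
#include <stdio.h>
#include <stdint.h>

/* XOR-reduce the address bits selected by mask to a single hash bit. */
static unsigned hash_bit(uint64_t addr, uint64_t mask)
{
    return (unsigned)__builtin_parityll(addr & mask);
}

/* Drop bit `pos` from addr, shifting the higher bits down. Dropping the
 * consumed bits yields a compacted pipe address as in the abstract. */
static uint64_t drop_bit(uint64_t addr, unsigned pos)
{
    uint64_t low  = addr & ((1ULL << pos) - 1);
    uint64_t high = (addr >> (pos + 1)) << pos;
    return high | low;
}

int main(void)
{
    uint64_t addr = 0x12345680;

    /* Two levels of granularity -> a 2-bit memory-controller index.
     * Each mask spreads consecutive blocks across distant controllers. */
    uint64_t mask0 = 0x4040;  /* level 0: bits 6 and 14 */
    uint64_t mask1 = 0x8080;  /* level 1: bits 7 and 15 */
    unsigned mc = (hash_bit(addr, mask1) << 1) | hash_bit(addr, mask0);

    /* Drop one designated bit per level (higher position first so the
     * lower position stays valid). */
    uint64_t pipe = drop_bit(drop_bit(addr, 7), 6);

    printf("addr %#llx -> controller %u, pipe addr %#llx\n",
           (unsigned long long)addr, mc, (unsigned long long)pipe);
    return 0;
}
```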
  • Publication number: 20220334997
    Abstract: In an embodiment, a system on a chip (SOC) comprises a semiconductor die on which circuitry is formed, wherein the circuitry comprises a plurality of agents and a plurality of network switches coupled to the plurality of agents. The plurality of network switches are interconnected to form a plurality of physically and logically independent networks. A first network of the plurality of physically and logically independent networks is constructed according to a first topology and a second network of the plurality of physically and logically independent networks is constructed according to a second topology that is different from the first topology. For example, the first topology may be a ring topology and the second topology may be a mesh topology. In an embodiment, coherency may be enforced on the first network and the second network may be a relaxed order network.
    Type: Application
    Filed: June 3, 2021
    Publication date: October 20, 2022
    Inventors: Sergio Kolor, Sergio V. Tota, Tzach Zemer, Sagi Lahav, Jonathan M. Redshaw, Per H. Hammarlund, Eran Tamari, James Vash, Gaurav Garg
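For a sense of why distinct topologies suit distinct traffic, a toy next-hop comparison of a ring and a dimension-ordered mesh; both routing functions are textbook illustrations, not the SOC's actual networks:
```c
/* Toy next-hop routing for the two topologies named in the abstract. */
#include <stdio.h>

/* Ring of n switches: step toward the destination along the shorter arc. */
static int ring_next(int cur, int dst, int n)
{
    int fwd = (dst - cur + n) % n;          /* hops going "clockwise" */
    if (fwd == 0) return cur;
    return (fwd <= n - fwd) ? (cur + 1) % n : (cur + n - 1) % n;
}

/* w-wide mesh, switches numbered row-major: route X first, then Y. */
static int mesh_next(int cur, int dst, int w)
{
    int cx = cur % w, cy = cur / w, dx = dst % w, dy = dst / w;
    if (cx != dx) return cur + (dx > cx ? 1 : -1);
    if (cy != dy) return cur + (dy > cy ? w : -w);
    return cur;
}

int main(void)
{
    /* Trace a packet from switch 0 to switch 5 on each network. */
    for (int hop = 0; hop != 5; hop = ring_next(hop, 5, 8))
        printf("ring hop: %d\n", hop);
    for (int hop = 0; hop != 5; hop = mesh_next(hop, 5, 4))
        printf("mesh hop: %d\n", hop);
    return 0;
}
```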
  • Publication number: 20220318136
    Abstract: Techniques are disclosed relating to an I/O agent circuit of a computer system. The I/O agent circuit may receive, from a peripheral component, a set of transaction requests to perform a set of read transactions that are directed to one or more of a plurality of cache lines. The I/O agent circuit may issue, to a first memory controller circuit configured to manage access to a first one of the plurality of cache lines, a request for exclusive read ownership of the first cache line such that the data of the first cache line is not cached in a valid state outside of the memory and the I/O agent circuit. The I/O agent circuit may receive exclusive read ownership of the first cache line, including receiving the data of the first cache line. The I/O agent circuit may then perform the set of read transactions with respect to the data.
    Type: Application
    Filed: January 14, 2022
    Publication date: October 6, 2022
    Inventors: Gaurav Garg, Sagi Lahav, Lital Levy-Rubin, Gerard Williams, III, Samer Nassar, Per H. Hammarlund, Harshavardhan Kaushikkar, Srinivasa Rangan Sridharan, Jeff Gonion, James Vash
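A simplified sketch of the flow described above: the I/O agent acquires exclusive read ownership of a line and then satisfies a batch of peripheral reads from its local copy; the structures and the ownership request are hypothetical stand-ins:
```c
/* Hypothetical I/O-agent read flow with exclusive read ownership. */
#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>

typedef struct {
    uint64_t line_addr;   /* cache-line address this copy shadows */
    bool     owned;       /* exclusive read ownership held? */
    uint8_t  data[64];    /* line payload received with ownership */
} io_agent_line_t;

/* Stand-in for a message to the memory controller managing this line.
 * Exclusive read ownership means no other valid copy exists outside the
 * memory and the I/O agent, so the reads below cannot see stale data. */
static void acquire_exclusive_read(io_agent_line_t *l, uint64_t addr)
{
    l->line_addr = addr;
    l->owned = true;
    for (int i = 0; i < 64; i++)
        l->data[i] = (uint8_t)i;    /* pretend the controller sent the line */
}

static int io_read(io_agent_line_t *l, uint64_t addr)
{
    if (!l->owned || (addr & ~63ULL) != l->line_addr)
        acquire_exclusive_read(l, addr & ~63ULL);
    return l->data[addr & 63];      /* serve the read locally */
}

int main(void)
{
    io_agent_line_t line = {0};
    /* A peripheral issues a set of reads hitting one cache line: only the
     * first read triggers the ownership request. */
    for (uint64_t a = 0x2000; a < 0x2004; a++)
        printf("read %#llx = %d\n", (unsigned long long)a, io_read(&line, a));
    return 0;
}
```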
  • Publication number: 20220107836
    Abstract: In an embodiment, a system includes rate limiter circuits corresponding to various agents that issue transactions in a virtual channel. At least one agent may be identified as a critical agent, and different rate limits may be selected for the other agents depending on whether the critical agent is on (e.g., lower limits) or off (e.g., higher limits).
    Type: Application
    Filed: April 1, 2021
    Publication date: April 7, 2022
    Inventors: Per H. Hammarlund, Liran Fishel, Roman Gindin
  • Publication number: 20220083472
    Abstract: A scalable cache coherency protocol for a system including a plurality of coherent agents coupled to one or more memory controllers is described. The memory controller may implement a precise directory for cache blocks from the memory to which the memory controller is coupled. Multiple requests to a cache block may be outstanding, and snoops and completions for requests may include an expected cache state at the receiving agent, as indicated by a directory in the memory controller when the request was processed, to allow the receiving agent to detect race conditions. In an embodiment, the cache states may include a primary shared and a secondary shared state. The primary shared state may apply to a coherent agent that bears responsibility for transmitting a copy of the cache block to a requesting agent. In an embodiment, at least two types of snoops may be supported: snoop forward and snoop back.
    Type: Application
    Filed: May 10, 2021
    Publication date: March 17, 2022
    Inventors: James Vash, Gaurav Garg, Brian P. Lilly, Ramesh B. Gunna, Steven R. Hutsell, Lital Levy-Rubin, Per H. Hammarlund
  • Patent number: 10795818
    Abstract: Various systems and methods for ensuring real-time snoop latency are disclosed. A system includes a processor and a cache controller. The cache controller receives, via a channel, cache snoop requests from the processor, the snoop requests including latency-sensitive and non-latency-sensitive requests. Requests are not prioritized by type within the channel. The cache controller limits the number of non-latency-sensitive snoop requests that can be processed ahead of an incoming latency-sensitive snoop request. Limiting this number includes the cache controller determining that the number of received non-latency-sensitive snoop requests has reached a predetermined value and responsively prioritizing latency-sensitive requests over non-latency-sensitive requests.
    Type: Grant
    Filed: May 21, 2019
    Date of Patent: October 6, 2020
    Assignee: Apple Inc.
    Inventors: Harshavardhan Kaushikkar, Per H. Hammarlund, Brian P. Lilly, Michael Bekerman, James Vash, Manu Gulati, Benjamin K. Dodge
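A small sketch of the bounded-bypass policy described above, assuming a software FIFO and an invented threshold; once the counter of serviced non-latency-sensitive snoops reaches the limit, a pending latency-sensitive snoop is taken first:
```c
/* Bounded bypass: count non-latency-sensitive (NLS) snoops serviced, and
 * once the count hits a threshold, service a latency-sensitive (LS) snoop
 * first. Queue layout and threshold are assumptions. */
#include <stdio.h>
#include <stdbool.h>

#define NLS_LIMIT 2
#define QCAP 16

typedef struct { int id; bool latency_sensitive; } snoop_req_t;

static snoop_req_t q[QCAP];
static int qlen;
static int nls_processed;   /* NLS snoops serviced since the last LS snoop */

static void enqueue(int id, bool ls) { q[qlen++] = (snoop_req_t){ id, ls }; }

static snoop_req_t dequeue_at(int i)
{
    snoop_req_t r = q[i];
    for (; i + 1 < qlen; i++) q[i] = q[i + 1];
    qlen--;
    return r;
}

/* Within the channel requests are FIFO, but once NLS_LIMIT NLS snoops have
 * been processed, a pending LS snoop jumps ahead. */
static snoop_req_t next_snoop(void)
{
    if (nls_processed >= NLS_LIMIT)
        for (int i = 0; i < qlen; i++)
            if (q[i].latency_sensitive) { nls_processed = 0; return dequeue_at(i); }
    snoop_req_t r = dequeue_at(0);
    if (r.latency_sensitive) nls_processed = 0; else nls_processed++;
    return r;
}

int main(void)
{
    enqueue(0, false); enqueue(1, false); enqueue(2, false); enqueue(3, true);
    while (qlen) {
        snoop_req_t r = next_snoop();
        printf("snoop %d (%s)\n", r.id, r.latency_sensitive ? "LS" : "NLS");
    }
    return 0;   /* prints 0, 1, then LS snoop 3 bypasses 2 */
}
```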
  • Patent number: 7797683
    Abstract: Systems and methods of managing threads provide for supporting a plurality of logical threads with a plurality of simultaneous physical threads in which the number of logical threads may be greater than or less than the number of physical threads. In one approach, each of the plurality of logical threads is maintained in one of a wait state, an active state, a drain state, and a stall state. A state machine and hardware sequencer can be used to transition the logical threads between states based on triggering events and whether or not an interruptible point has been encountered in the logical threads. The logical threads are scheduled on the physical threads to meet, for example, priority, performance or fairness goals. It is also possible to specify the resources that are available to each logical thread in order to meet these and other goals. In one example, a single logical thread can speculatively use more than one physical thread, pending a selection of which physical thread should be committed.
    Type: Grant
    Filed: December 29, 2003
    Date of Patent: September 14, 2010
    Assignee: Intel Corporation
    Inventors: Per H. Hammarlund, Stephan J. Jourdan, Pierre Michaud, Alexandre J. Farcy, Morris Marden, Robert L. Hinton, Douglas M. Carmean
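A guessed-at rendering of the four-state machine named in the abstract; the events and transitions are plausible but illustrative, since the patent drives them from triggering events and interruptible points:
```c
/* Illustrative state machine over the four logical-thread states. */
#include <stdio.h>

typedef enum { WAIT, ACTIVE, DRAIN, STALL } lstate_t;
typedef enum {
    EV_SCHEDULED,            /* scheduler assigns a physical thread */
    EV_TRIGGER,              /* triggering event: start vacating the thread */
    EV_INTERRUPTIBLE_POINT,  /* safe point reached while draining */
    EV_RESOURCE_BLOCKED,     /* needed resource unavailable */
    EV_RESOURCE_READY
} event_t;

static lstate_t step(lstate_t s, event_t ev)
{
    switch (s) {
    case WAIT:   return ev == EV_SCHEDULED ? ACTIVE : WAIT;
    case ACTIVE:
        if (ev == EV_TRIGGER)          return DRAIN;  /* finish in-flight work */
        if (ev == EV_RESOURCE_BLOCKED) return STALL;
        return ACTIVE;
    case DRAIN:  return ev == EV_INTERRUPTIBLE_POINT ? WAIT : DRAIN;
    case STALL:  return ev == EV_RESOURCE_READY ? ACTIVE : STALL;
    }
    return s;
}

int main(void)
{
    static const char *name[] = { "wait", "active", "drain", "stall" };
    lstate_t s = WAIT;
    event_t script[] = { EV_SCHEDULED, EV_TRIGGER, EV_INTERRUPTIBLE_POINT };
    for (unsigned i = 0; i < sizeof script / sizeof script[0]; i++) {
        s = step(s, script[i]);
        printf("-> %s\n", name[s]);
    }
    return 0;
}
```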
  • Patent number: 7688746
    Abstract: Embodiments of the present invention provide a dynamic resource allocator to allocate resources for performance optimization in, for example, a computer system. The dynamic resource allocator allocates a resource to one or more threads associated with an application based on a performance rate. Embodiments of the present invention may further include a performance monitor to monitor the performance rate of the one or more threads. The dynamic resource allocator allocates an additional resource to the one or more threads if they are performing above a performance threshold. In embodiments of the present invention, the dynamic resource allocation strategy may be decided based on, for example, optimizing the overall system throughput, minimizing power consumption, meeting system performance goals (e.g., real-time requirements), user-specified performance priorities, and/or application-specified performance priorities.
    Type: Grant
    Filed: December 29, 2003
    Date of Patent: March 30, 2010
    Assignee: Intel Corporation
    Inventors: Per H. Hammarlund, Melih Ozgul
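A minimal sketch of the feedback loop, assuming a monitor that reports a per-thread performance rate and a spare-resource pool; the numbers and the greedy policy are invented:
```c
/* Feedback allocation: grant an extra resource unit to threads running
 * above a performance threshold. */
#include <stdio.h>

#define NUM_THREADS 3
#define PERF_THRESHOLD 0.8

int main(void)
{
    double perf_rate[NUM_THREADS] = { 0.95, 0.40, 0.85 }; /* monitor output */
    int resources[NUM_THREADS]    = { 2, 2, 2 };
    int free_pool = 2;                                    /* spare units */

    for (int t = 0; t < NUM_THREADS && free_pool > 0; t++) {
        if (perf_rate[t] > PERF_THRESHOLD) {
            /* The thread makes good use of what it has: give it more.
             * Other strategies from the abstract (throughput, power,
             * real-time goals, priorities) would change this test. */
            resources[t]++;
            free_pool--;
        }
    }
    for (int t = 0; t < NUM_THREADS; t++)
        printf("thread %d: rate %.2f, resources %d\n",
               t, perf_rate[t], resources[t]);
    return 0;
}
```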
  • Patent number: 7640419
    Abstract: Embodiments of the present invention relate to a memory management scheme and apparatus that enables efficient memory renaming. The method includes computing a store address, writing the store address in a first storage, writing data associated with the store address to a memory, de-allocating the store address from the first storage, allocating the store address in a second storage, predicting a load instruction to be memory renamed, computing a load store source index, computing a load address, disambiguating the memory renamed load instruction, and retiring the memory renamed load instruction if the store instruction is still allocated in at least one of the first storage and the second storage and should have effectively provided the full data to the load. The method may also include re-executing the load instruction without memory renaming if the store instruction is in neither the first storage nor the second storage.
    Type: Grant
    Filed: December 23, 2003
    Date of Patent: December 29, 2009
    Assignee: Intel Corporation
    Inventors: Sebastien Hily, Per H. Hammarlund
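A toy model of the retirement check implied above: the renamed load may retire only if its source store's address is still present in the first storage (pre-writeback) or the second storage (post-writeback); both tables are simplified stand-ins:
```c
/* Two-storage retirement check for a memory-renamed load. */
#include <stdio.h>
#include <stdbool.h>
#include <stdint.h>

#define SLOTS 4

static uint64_t first_storage[SLOTS];   /* store addresses awaiting writeback */
static uint64_t second_storage[SLOTS];  /* store addresses already written */

static bool in_table(const uint64_t *t, uint64_t addr)
{
    for (int i = 0; i < SLOTS; i++)
        if (t[i] == addr) return true;
    return false;
}

/* Move a store's address from the first to the second storage when its
 * data is written to memory (de-allocate + allocate, in the abstract's
 * terms). */
static void writeback(int slot)
{
    second_storage[slot] = first_storage[slot];
    first_storage[slot] = 0;
}

static bool renamed_load_may_retire(uint64_t store_addr)
{
    return in_table(first_storage, store_addr) ||
           in_table(second_storage, store_addr);
}

int main(void)
{
    first_storage[0] = 0x1000;          /* store address computed */
    writeback(0);                       /* data reaches memory */
    printf("retire renamed load? %s\n",
           renamed_load_may_retire(0x1000) ? "yes" : "no, re-execute");
    return 0;
}
```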
  • Patent number: 7529913
    Abstract: Embodiments of the present invention relate to a method and system for providing virtual identifiers corresponding to physical registers in a computer processor. According to the embodiments, the virtual identifiers may be used to represent the physical registers during operations in a pipeline of the processor.
    Type: Grant
    Filed: December 23, 2003
    Date of Patent: May 5, 2009
    Assignee: Intel Corporation
    Inventors: Avinash Sodani, Per H. Hammarlund, Stephan J. Jourdan
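A bare-bones illustration of carrying a small virtual identifier through the pipeline in place of a physical register number, with a lookup table mapping each virtual ID to its physical register; the table sizes are arbitrary:
```c
/* Virtual identifiers standing in for physical registers in the pipeline. */
#include <stdio.h>

#define NUM_VIRTUAL 8      /* virtual IDs that live in the pipeline */
#define NUM_PHYSICAL 64    /* actual register file entries */

static int virt_to_phys[NUM_VIRTUAL];
static int next_virt;

/* Allocate a virtual ID for a newly renamed destination register. */
static int alloc_virtual(int phys_reg)
{
    int v = next_virt++ % NUM_VIRTUAL;
    virt_to_phys[v] = phys_reg;
    return v;
}

int main(void)
{
    int v = alloc_virtual(37);  /* a uop writes physical register 37 */
    /* Later pipeline stages carry only the narrow `v`; the physical
     * register number is recovered by a table lookup when needed. */
    printf("virtual %d -> physical %d (of %d)\n",
           v, virt_to_phys[v], NUM_PHYSICAL);
    return 0;
}
```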
  • Patent number: 7502892
    Abstract: Embodiments of the present invention relate to cache coherency. In an embodiment of the invention, a cache includes one or more cache lines. A store pipeline may retrieve a tag associated with one of the cache lines. The data associated with the cache line may not be retrieved, and the cache line may be updated if, based on the tag, the cache line is determined to be in a modified or exclusive state.
    Type: Grant
    Filed: December 30, 2003
    Date of Patent: March 10, 2009
    Assignee: Intel Corporation
    Inventors: Per H. Hammarlund, Hermann W. Gartler
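A sketch of the tag-only fast path, assuming a toy single-line cache model: the store reads just the tag and state, and writes in place when the line is Modified or Exclusive:
```c
/* Tag-only store fast path: write hits on M/E lines without a data read. */
#include <stdio.h>
#include <stdbool.h>
#include <stdint.h>

typedef enum { INVALID, SHARED, EXCLUSIVE, MODIFIED } mesi_t;

typedef struct {
    uint64_t tag;
    mesi_t   state;
    uint8_t  data[64];
} cache_line_t;

/* Returns true if the store completed via the tag-only path. */
static bool store_tag_only(cache_line_t *l, uint64_t addr, uint8_t byte)
{
    if (l->tag != (addr >> 6))
        return false;                 /* miss: take the slow path */
    if (l->state != MODIFIED && l->state != EXCLUSIVE)
        return false;                 /* shared/invalid: coherence work first */
    l->data[addr & 63] = byte;        /* write without a prior data read */
    l->state = MODIFIED;
    return true;
}

int main(void)
{
    cache_line_t line = { .tag = 0x40, .state = EXCLUSIVE };
    printf("fast store: %s\n",
           store_tag_only(&line, 0x1005, 0xAB) ? "hit E/M line" : "slow path");
    return 0;
}
```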
  • Patent number: 7502912
    Abstract: A method and apparatus for rescheduling operations in a processor. More particularly, the present invention relates to optimally using a scheduler resource in a processor by analyzing, predicting, and sorting the write order of instructions into the scheduler so that the time the instructions sit idle in the scheduler is minimized. The analysis, prediction, and sorting may be done between an instruction queue and a scheduler by using delay units. The prediction can be based on history (latency, dependency, and resource) or on a general prediction scheme.
    Type: Grant
    Filed: December 30, 2003
    Date of Patent: March 10, 2009
    Assignee: Intel Corporation
    Inventors: Avinash Sodani, Per H. Hammarlund, Stephan J. Jourdan
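An illustration of the delay-unit idea: each micro-op carries a predicted ready time, and the delay units hold it so it enters the scheduler near its issue time rather than sitting idle there; the predictions are fabricated:
```c
/* Delay units reorder the write order into the scheduler by predicted
 * ready time. */
#include <stdio.h>

typedef struct { int id; int predicted_ready; } uop_t;

int main(void)
{
    /* Program order out of the instruction queue; ready times would come
     * from latency/dependency/resource history in the abstract's scheme. */
    uop_t iq[] = { {0, 5}, {1, 1}, {2, 3} };
    int n = sizeof iq / sizeof iq[0];

    /* Each cycle, release only the uops whose predicted ready time has
     * arrived; uop 1 enters first despite being second in program order. */
    for (int cycle = 0; cycle <= 5; cycle++)
        for (int i = 0; i < n; i++)
            if (iq[i].predicted_ready == cycle)
                printf("cycle %d: uop %d enters scheduler\n", cycle, iq[i].id);
    return 0;
}
```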
  • Patent number: 7174428
    Abstract: Embodiments of the present invention provide a method, apparatus and system for memory renaming. In one embodiment, a decode unit may decode a load instruction. If the load instruction is predicted to be memory renamed, the load instruction may have a predicted store identifier associated with the load instruction. The decode unit may transform the load instruction that is predicted to be memory renamed into a data move instruction and a load check instruction. The data move instruction may read data from the cache based on the predicted store identifier, and the load check instruction may compare an identifier associated with an identified source store with the predicted store identifier. A retirement unit may retire the load instruction if the predicted store identifier matches the identifier associated with the identified source store.
    Type: Grant
    Filed: December 29, 2003
    Date of Patent: February 6, 2007
    Assignee: Intel Corporation
    Inventors: Sebastien Hily, Per H. Hammarlund, Avinash Sodani
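A sketch of the decode-time split, assuming a small store buffer indexed by store ID: the data-move half forwards data using only the predicted ID, and the load-check half compares the prediction with the store found by disambiguation; the IDs and tables are invented:
```c
/* Decode-time split of a memory-renamed load into move + check. */
#include <stdio.h>
#include <stdbool.h>
#include <stdint.h>

#define SB_SLOTS 4
static uint64_t sb_data[SB_SLOTS];   /* store buffer payloads by store ID */

/* Data-move half: forward the store's data using only the predicted ID. */
static uint64_t data_move(int predicted_store_id)
{
    return sb_data[predicted_store_id];
}

/* Load-check half: the load retires only if the prediction named the store
 * that disambiguation actually identified as the source. */
static bool load_check(int predicted_store_id, int actual_store_id)
{
    return predicted_store_id == actual_store_id;
}

int main(void)
{
    sb_data[2] = 0xDEADBEEF;
    int predicted = 2;
    uint64_t v = data_move(predicted);
    int actual = 2;                        /* from address disambiguation */
    printf("value %#llx, %s\n", (unsigned long long)v,
           load_check(predicted, actual) ? "retire load"
                                         : "flush and re-execute");
    return 0;
}
```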
  • Patent number: 7130965
    Abstract: Embodiments of the present invention relate to a memory management scheme and apparatus that enables efficient cache memory management. The method includes writing an entry to a store buffer at execute time; determining if the entry's address is in a first-level cache associated with the store buffer before retirement; and setting a status bit associated with the entry in said store buffer if the address is in the cache in either exclusive or modified state. The method further includes immediately writing the entry to the first-level cache at or after retirement when the status bit is set; and de-allocating the entry from said store buffer at retirement. The method may further comprise resetting the status bit if the cache line is allocated over or is evicted from the cache before the store buffer entry attempts to write to the cache.
    Type: Grant
    Filed: December 23, 2003
    Date of Patent: October 31, 2006
    Assignee: Intel Corporation
    Inventors: Per H. Hammarlund, Stephan Jourdan, Sebastien Hily, Aravindh Baktha, Hermann Gartler
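A toy rendering of the status-bit scheme: the execute-time cache probe sets a bit on the store-buffer entry when the line is already Exclusive or Modified, eviction clears it, and retirement consults it; all structures are illustrative:
```c
/* Execute-time status bit on a store-buffer entry. */
#include <stdio.h>
#include <stdbool.h>
#include <stdint.h>

typedef enum { INVALID, SHARED, EXCLUSIVE, MODIFIED } mesi_t;

typedef struct {
    uint64_t addr;
    uint8_t  value;
    bool     line_ready;   /* status bit: line held in E or M at execute */
} sb_entry_t;

static mesi_t l1_probe(uint64_t addr)
{
    (void)addr;
    return EXCLUSIVE;      /* pretend the tag lookup hit an E line */
}

static void execute_store(sb_entry_t *e, uint64_t addr, uint8_t v)
{
    e->addr = addr;
    e->value = v;
    mesi_t s = l1_probe(addr);
    e->line_ready = (s == EXCLUSIVE || s == MODIFIED);
}

/* If the line is evicted or allocated over before retirement, the bit is
 * cleared so retirement falls back to the normal ownership path. */
static void on_eviction(sb_entry_t *e, uint64_t evicted_line)
{
    if ((e->addr >> 6) == (evicted_line >> 6))
        e->line_ready = false;
}

int main(void)
{
    sb_entry_t e;
    execute_store(&e, 0x3000, 42);
    printf("retire: %s\n", e.line_ready ? "write immediately" : "wait");
    on_eviction(&e, 0x3000);   /* the line is evicted before retirement */
    printf("retire: %s\n", e.line_ready ? "write immediately" : "wait");
    return 0;
}
```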
  • Patent number: 6990551
    Abstract: A system and method for reducing linear address aliasing is described. In one embodiment, a portion of a linear address is combined with a process identifier, e.g., a page directory base pointer, to form an adjusted-linear address. The page directory base pointer is unique to a process, and combining it with a portion of the linear address produces an adjusted-linear address that provides a high probability of no aliasing. A portion of the adjusted-linear address is used to search an adjusted-linear-addressed cache memory for a data block specified by the linear address. If the data block does not reside in the adjusted-linear-addressed cache memory, then a replacement policy selects one of the cache lines in the adjusted-linear-addressed cache memory and replaces the data block of the selected cache line with a data block located at a physical address produced from translating the linear address.
    Type: Grant
    Filed: August 13, 2004
    Date of Patent: January 24, 2006
    Assignee: Intel Corporation
    Inventors: Herbert H. J. Hum, Stephan J. Jourdan, Per H. Hammarlund
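A sketch of indexing a cache with an adjusted-linear address, using XOR as an assumed combining function (the abstract only says a portion of the linear address is combined with a process identifier such as the page directory base pointer):
```c
/* Adjusted-linear cache indexing: mix a process-unique value into the
 * linear address so identical linear addresses in different processes
 * rarely alias. */
#include <stdio.h>
#include <stdint.h>

#define CACHE_SETS 256   /* toy cache: 256 sets of 64-byte lines */

static unsigned cache_index(uint64_t linear, uint64_t pdbr)
{
    uint64_t adjusted = linear ^ pdbr;       /* adjusted-linear address */
    return (unsigned)((adjusted >> 6) % CACHE_SETS);
}

int main(void)
{
    uint64_t linear = 0x7f0000401000;
    /* Two processes using the same linear address land in different sets. */
    printf("process A: set %u\n", cache_index(linear, 0x111000));
    printf("process B: set %u\n", cache_index(linear, 0x25e000));
    return 0;
}
```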
  • Patent number: 6981129
    Abstract: Breaking replay dependency loops in a processor using a rescheduled replay queue. The processor comprises a replay queue to receive a plurality of instructions, and an execution unit to execute the plurality of instructions. A scheduler is coupled between the replay queue and the execution unit. The scheduler speculatively schedules instructions for execution and increments a counter for each of the plurality of instructions to reflect the number of times each of the plurality of instructions has been executed. The scheduler also dispatches each instruction to the execution unit either when the counter does not exceed a maximum number of replays or, if the counter exceeds the maximum number of replays, when the instruction is safe to execute. A checker is coupled to the execution unit to determine whether each instruction has executed successfully. The checker is also coupled to the replay queue to communicate to the replay queue each instruction that has not executed successfully.
    Type: Grant
    Filed: November 2, 2000
    Date of Patent: December 27, 2005
    Assignee: Intel Corporation
    Inventors: Darrell D. Boggs, Douglas M. Carmean, Per H. Hammarlund, Francis X. McKeen, David J. Sager, Ronak Singhal
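A sketch of the replay-budget policy, with a stand-in checker and safety test: the instruction is dispatched speculatively until its replay counter exceeds the maximum, after which it is held until known safe, breaking the replay loop:
```c
/* Replay counter with a speculation budget. */
#include <stdio.h>
#include <stdbool.h>

#define MAX_REPLAYS 3

/* Stand-in checker: pretend execution keeps failing for the first 4 tries. */
static bool executed_ok(int replays) { return replays >= 4; }

/* Stand-in safety test: pretend the instruction's sources become known
 * good (e.g., a long-latency load returns) from cycle 6 onward. */
static bool safe_to_execute(int cycle) { return cycle >= 6; }

int main(void)
{
    int replays = 0;
    for (int cycle = 0; cycle < 16; cycle++) {
        if (replays > MAX_REPLAYS && !safe_to_execute(cycle)) {
            printf("cycle %d: held in replay queue until safe\n", cycle);
            continue;   /* replay budget spent: no more speculation */
        }
        if (executed_ok(replays)) {
            printf("cycle %d: retires after %d replays\n", cycle, replays);
            break;
        }
        replays++;      /* checker returns it to the replay queue */
        printf("cycle %d: replay #%d\n", cycle, replays);
    }
    return 0;
}
```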
  • Patent number: 6877086
    Abstract: Rescheduling multiple micro-operations in a processor using a replay queue. The processor comprises a replay queue to receive a plurality of instructions and an execution unit to execute the plurality of instructions. A scheduler is coupled between the replay queue and the execution unit. The scheduler speculatively schedules instructions for execution and dispatches each instruction to the execution unit. A checker is coupled to the execution unit to determine whether each instruction has executed successfully. The checker is also coupled to the replay queue to communicate to the replay queue each instruction that has not executed successfully.
    Type: Grant
    Filed: November 2, 2000
    Date of Patent: April 5, 2005
    Assignee: Intel Corporation
    Inventors: Darrell D. Boggs, Douglas M. Carmean, Per H. Hammarlund, Francis X. McKeen, David J. Sager, Ronak Singhal
  • Patent number: 6798364
    Abstract: A method and apparatus for variable length coding is described. A method comprises receiving a group of data having a group of set values, identifying a group of positions of the group of set values within the group of data without branching, and, for each of the group of positions, encoding a run of non-set values preceding that position.
    Type: Grant
    Filed: February 5, 2002
    Date of Patent: September 28, 2004
    Assignee: Intel Corporation
    Inventors: Yen-Kuang Chen, Matthew J. Holliman, Herbert Hum, Per H. Hammarlund, Thomas Huff, William W. Macy
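A branch-free way to recover set-bit positions and the zero-runs preceding them, in the spirit of the abstract; count-trailing-zeros replaces a per-bit conditional scan, though the (run, position) output shown is illustrative rather than the patent's coding:
```c
/* Branch-free run extraction: find each set bit with ctz and emit the run
 * of zeros since the previous set bit. */
#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint32_t bits = 0x8212;   /* set bits at positions 1, 4, 9, 15 */
    int prev = -1;

    while (bits) {
        int pos = __builtin_ctz(bits);   /* lowest set position, no scan loop */
        int run = pos - prev - 1;        /* zeros since the previous set bit */
        printf("run of %d zeros, set bit at %d\n", run, pos);
        prev = pos;
        bits &= bits - 1;                /* clear the lowest set bit */
    }
    return 0;
}
```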