Patents by Inventor Brian P. Lilly
Brian P. Lilly has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12332792Abstract: A system may include multiple coherent agents, where a given coherent agent includes one or more caches configured to cache data. Memory controller circuitry may control one or more memory circuits from which the one or more caches are configured to cache data and maintain a directory that tracks which of the multiple coherent agent circuits is caching copies of a plurality of cache blocks and states of the cached copies in the multiple coherent agent circuits. A first agent may transmit a first request for a first cache block. The first agent may store, in request buffer circuitry, information corresponding to the first request then detect a second snoop from a second agent circuit to the first cache block. The first agent may absorb the second snoop, including to store information corresponding to the second snoop with the information corresponding to the first request in the request buffer circuitry.Type: GrantFiled: February 20, 2024Date of Patent: June 17, 2025Assignee: Apple Inc.Inventors: James Vash, Gaurav Garg, Brian P. Lilly, Ramesh B. Gunna, Steven R. Hutsell, Lital Levy-Rubin, Per H. Hammarlund, Harshavardhan Kaushikkar
-
Publication number: 20250103520Abstract: A memory controller circuit receives memory access requests from a network of a computer system. Entries are reserved for these requests in a retry queue circuit. An arbitration circuit of the memory controller circuit issues those requests to a tag pipeline circuit that determines whether the received memory access requests hit in a memory cache. As a memory access request passes through the tag pipeline circuit, it may require another pass through this pipeline—for example, if resources such as certain storage circuits needed to complete the memory access request are unavailable (for example a snoop queue circuit). The reservation that has been made in the retry queue circuit thus keeps the request from having to be returned to the network for resubmission to the memory controller circuit if initial processing of the memory access request cannot be completed.Type: ApplicationFiled: August 29, 2024Publication date: March 27, 2025Inventors: Ilya Granovsky, Jurgen M. Schulz, Tom Greenshtein, Elli Bagelman, Brian P. Lilly, John H. Kelm, Rohit K. Gupta, Sandeep Gupta, Anwar Q. Rohillah
-
Patent number: 12216578Abstract: A cache may include multiple request handling pipes, each of which may further include multiple request buffers, for storing device requests from one or more processors to one or more devices. Some of the device requests may require to be sent to the devices according to an order. For a given one of such device requests, the cache may select a request handling pipe, based on an address indicated by the device request, and select a request buffer, based on the available entries of the request buffers of the selected request handling pipe, to store the device request. The cache may further use a first-level and a second-level token stores to track and maintain the device requests in order when transmitting the device requests to the devices.Type: GrantFiled: July 17, 2023Date of Patent: February 4, 2025Assignee: Apple Inc.Inventors: Sandeep Gupta, Brian P Lilly, Krishna C Potnuru
-
Publication number: 20240273024Abstract: A scalable cache coherency protocol for system including a plurality of coherent agents coupled to one or more memory controllers is described. The memory controller may implement a precise directory for cache blocks from the memory to which the memory controller is coupled. Multiple requests to a cache block may be outstanding, and snoops and completions for requests may include an expected cache state at the receiving agent, as indicated by a directory in the memory controller when the request was processed, to allow the receiving agent to detect race conditions. In an embodiment, the cache states may include a primary shared and a secondary shared state. The primary shared state may apply to a coherent agent that bears responsibility for transmitting a copy of the cache block to a requesting agent. In an embodiment, at least two types of snoops may be supported: snoop forward and snoop back.Type: ApplicationFiled: February 20, 2024Publication date: August 15, 2024Inventors: James Vash, Gaurav Garg, Brian P. Lilly, Ramesh B. Gunna, Steven R. Hutsell, Lital Levy-Rubin, Per H. Hammarlund, Harshavardhan Kaushikkar
-
Patent number: 11947457Abstract: A scalable cache coherency protocol for system including a plurality of coherent agents coupled to one or more memory controllers is described. The memory controller may implement a precise directory for cache blocks from the memory to which the memory controller is coupled. Multiple requests to a cache block may be outstanding, and snoops and completions for requests may include an expected cache state at the receiving agent, as indicated by a directory in the memory controller when the request was processed, to allow the receiving agent to detect race conditions. In an embodiment, the cache states may include a primary shared and a secondary shared state. The primary shared state may apply to a coherent agent that bears responsibility for transmitting a copy of the cache block to a requesting agent. In an embodiment, at least two types of snoops may be supported: snoop forward and snoop back.Type: GrantFiled: November 22, 2022Date of Patent: April 2, 2024Assignee: Apple Inc.Inventors: James Vash, Gaurav Garg, Brian P. Lilly, Ramesh B. Gunna, Steven R. Hutsell, Lital Levy-Rubin, Per H. Hammarlund, Harshavardhan Kaushikkar
-
Patent number: 11893241Abstract: A variable latency cache memory is disclosed. A cache subsystem includes a pipeline control circuit configured to initiate cache memory accesses for data. The cache subsystem further includes a cache memory circuit having a data array arranged into a plurality of groups, wherein different ones of the plurality of groups have different minimum access latencies due to different distances from the pipeline control circuit. A plurality of latency control circuits configured to ensure a latency is bounded to a maximum value for a given access to the data array, wherein a given latency control circuit is associated with a corresponding group of the plurality of groups. The latency for a given access may thus vary between a minimum access latency for a group closest to the pipeline control circuit to a maximum latency for an access to the group furthest from the pipeline control circuit.Type: GrantFiled: August 31, 2022Date of Patent: February 6, 2024Assignee: Apple Inc.Inventors: Brian P. Lilly, Sandeep Gupta, Chandan Shantharaj, Krishna C. Potnuru, Sahil Kapoor
-
Patent number: 11868258Abstract: A scalable cache coherency protocol for system including a plurality of coherent agents coupled to one or more memory controllers is described. The memory controller may implement a precise directory for cache blocks from the memory to which the memory controller is coupled. Multiple requests to a cache block may be outstanding, and snoops and completions for requests may include an expected cache state at the receiving agent, as indicated by a directory in the memory controller when the request was processed, to allow the receiving agent to detect race conditions. In an embodiment, the cache states may include a primary shared and a secondary shared state. The primary shared state may apply to a coherent agent that bears responsibility for transmitting a copy of the cache block to a requesting agent. In an embodiment, at least two types of snoops may be supported: snoop forward and snoop back.Type: GrantFiled: January 27, 2023Date of Patent: January 9, 2024Assignee: Apple Inc.Inventors: James Vash, Gaurav Garg, Brian P. Lilly, Ramesh B. Gunna, Steven R. Hutsell, Lital Levy-Rubin, Per H. Hammarlund, Harshavardhan Kaushikkar
-
Publication number: 20230359557Abstract: A cache may include multiple request handling pipes, each of which may further include multiple request buffers, for storing device requests from one or more processors to one or more devices. Some of the device requests may require to be sent to the devices according to an order. For a given one of such device requests, the cache may select a request handling pipe, based on an address indicated by the device request, and select a request buffer, based on the available entries of the request buffers of the selected request handling pipe, to store the device request. The cache may further use a first-level and a second-level token stores to track and maintain the device requests in order when transmitting the device requests to the devices.Type: ApplicationFiled: July 17, 2023Publication date: November 9, 2023Applicant: Apple Inc.Inventors: Sandeep Gupta, Brian P. Lilly, Krishna C. Potnuru
-
Patent number: 11768690Abstract: A system may include a plurality of processors and a coprocessor. A plurality of coprocessor context priority registers corresponding to a plurality of contexts supported by the coprocessor may be included. The plurality of processors may use the plurality of contexts, and may program the coprocessor context priority register corresponding to a context with a value specifying a priority of the context relative to other contexts. An arbiter may arbitrate among instructions issued by the plurality of processors based on the priorities in the plurality of coprocessor context priority registers. In one embodiment, real-time threads may be assigned higher priorities than bulk processing tasks, improving bandwidth allocated to the real-time threads as compared to the bulk tasks.Type: GrantFiled: November 22, 2021Date of Patent: September 26, 2023Assignee: Apple Inc.Inventors: Aditya Kesiraju, Andrew J. Beaumont-Smith, Brian P. Lilly, James Vash, Jason M. Kassoff, Krishna C. Potnuru, Rajdeep L. Bhuyar, Ran A. Chachick, Tyler J. Huberty, Derek R. Kumar
-
Patent number: 11741009Abstract: A cache may include multiple request handling pipes, each of which may further include multiple request buffers, for storing device requests from one or more processors to one or more devices. Some of the device requests may require to be sent to the devices according to an order. For a given one of such device requests, the cache may select a request handling pipe, based on an address indicated by the device request, and select a request buffer, based on the available entries of the request buffers of the selected request handling pipe, to store the device request. The cache may further use a first-level and a second-level token stores to track and maintain the device requests in order when transmitting the device requests to the devices.Type: GrantFiled: November 15, 2021Date of Patent: August 29, 2023Assignee: Apple Inc.Inventors: Sandeep Gupta, Brian P Lilly, Krishna C Potnuru
-
Publication number: 20230169003Abstract: A scalable cache coherency protocol for system including a plurality of coherent agents coupled to one or more memory controllers is described. The memory controller may implement a precise directory for cache blocks from the memory to which the memory controller is coupled. Multiple requests to a cache block may be outstanding, and snoops and completions for requests may include an expected cache state at the receiving agent, as indicated by a directory in the memory controller when the request was processed, to allow the receiving agent to detect race conditions. In an embodiment, the cache states may include a primary shared and a secondary shared state. The primary shared state may apply to a coherent agent that bears responsibility for transmitting a copy of the cache block to a requesting agent. In an embodiment, at least two types of snoops may be supported: snoop forward and snoop back.Type: ApplicationFiled: January 27, 2023Publication date: June 1, 2023Inventors: James Vash, Gaurav Garg, Brian P. Lilly, Ramesh B. Gunna, Steven R. Hutsell, Lital Levy-Rubin, Per H. Hammarlund, Harshavardhan Kaushikkar
-
Publication number: 20230083397Abstract: A scalable cache coherency protocol for system including a plurality of coherent agents coupled to one or more memory controllers is described. The memory controller may implement a precise directory for cache blocks from the memory to which the memory controller is coupled. Multiple requests to a cache block may be outstanding, and snoops and completions for requests may include an expected cache state at the receiving agent, as indicated by a directory in the memory controller when the request was processed, to allow the receiving agent to detect race conditions. In an embodiment, the cache states may include a primary shared and a secondary shared state. The primary shared state may apply to a coherent agent that bears responsibility for transmitting a copy of the cache block to a requesting agent. In an embodiment, at least two types of snoops may be supported: snoop forward and snoop back.Type: ApplicationFiled: November 22, 2022Publication date: March 16, 2023Inventors: James Vash, Gaurav Garg, Brian P. Lilly, Ramesh B. Gunna, Steven R. Hutsell, Lital Levy-Rubin, Per H. Hammarlund, Harshavardhan Kaushikkar
-
Patent number: 11544193Abstract: A scalable cache coherency protocol for system including a plurality of coherent agents coupled to one or more memory controllers is described. The memory controller may implement a precise directory for cache blocks from the memory to which the memory controller is coupled. Multiple requests to a cache block may be outstanding, and snoops and completions for requests may include an expected cache state at the receiving agent, as indicated by a directory in the memory controller when the request was processed, to allow the receiving agent to detect race conditions. In an embodiment, the cache states may include a primary shared and a secondary shared state. The primary shared state may apply to a coherent agent that bears responsibility for transmitting a copy of the cache block to a requesting agent. In an embodiment, at least two types of snoops may be supported: snoop forward and snoop back.Type: GrantFiled: May 10, 2021Date of Patent: January 3, 2023Assignee: Apple Inc.Inventors: James Vash, Gaurav Garg, Brian P. Lilly, Ramesh B. Gunna, Steven R. Hutsell, Lital Levy-Rubin, Per H. Hammarlund, Harshavardhan Kaushikkar
-
Publication number: 20220083472Abstract: A scalable cache coherency protocol for system including a plurality of coherent agents coupled to one or more memory controllers is described. The memory controller may implement a precise directory for cache blocks from the memory to which the memory controller is coupled. Multiple requests to a cache block may be outstanding, and snoops and completions for requests may include an expected cache state at the receiving agent, as indicated by a directory in the memory controller when the request was processed, to allow the receiving agent to detect race conditions. In an embodiment, the cache states may include a primary shared and a secondary shared state. The primary shared state may apply to a coherent agent that bears responsibility for transmitting a copy of the cache block to a requesting agent. In an embodiment, at least two types of snoops may be supported: snoop forward and snoop back.Type: ApplicationFiled: May 10, 2021Publication date: March 17, 2022Inventors: James Vash, Gaurav Garg, Brian P. Lilly, Ramesh B. Gunna, Steven R. Hutsell, Lital Levy-Rubin, Per H. Hammarlund
-
Publication number: 20220083343Abstract: A system may include a plurality of processors and a coprocessor. A plurality of coprocessor context priority registers corresponding to a plurality of contexts supported by the coprocessor may be included. The plurality of processors may use the plurality of contexts, and may program the coprocessor context priority register corresponding to a context with a value specifying a priority of the context relative to other contexts. An arbiter may arbitrate among instructions issued by the plurality of processors based on the priorities in the plurality of coprocessor context priority registers. In one embodiment, real-time threads may be assigned higher priorities than bulk processing tasks, improving bandwidth allocated to the real-time threads as compared to the bulk tasks.Type: ApplicationFiled: November 22, 2021Publication date: March 17, 2022Inventors: Aditya Kesiraju, Andrew J. Beaumont-Smith, Brian P. Lilly, James Vash, Jason M. Kassoff, Krishna C. Potnuru, Rajdeep L. Bhuyar, Ran A. Chachick, Tyler J. Huberty, Derek R. Kumar
-
Patent number: 11210104Abstract: A system may include a plurality of processors and a coprocessor. A plurality of coprocessor context priority registers corresponding to a plurality of contexts supported by the coprocessor may be included. The plurality of processors may use the plurality of contexts, and may program the coprocessor context priority register corresponding to a context with a value specifying a priority of the context relative to other contexts. An arbiter may arbitrate among instructions issued by the plurality of processors based on the priorities in the plurality of coprocessor context priority registers. In one embodiment, real-time threads may be assigned higher priorities than bulk processing tasks, improving bandwidth allocated to the real-time threads as compared to the bulk tasks.Type: GrantFiled: September 11, 2020Date of Patent: December 28, 2021Assignee: Apple Inc.Inventors: Aditya Kesiraju, Andrew J. Beaumont-Smith, Brian P. Lilly, James Vash, Jason M. Kassoff, Krishna C. Potnuru, Rajdeep L. Bhuyar, Ran A. Chachick, Tyler J. Huberty, Derek R. Kumar
-
Patent number: 11138111Abstract: Systems, apparatuses, and methods for performing coherence processing and memory cache processing in parallel are disclosed. A system includes a communication fabric and a plurality of dual-processing pipelines. Each dual-processing pipeline includes a coherence processing pipeline and a memory cache processing pipeline. The communication fabric forwards a transaction to a given dual-processing pipeline, with the communication fabric selecting the given dual-processing pipeline, from the plurality of dual-processing pipelines, based on a hash of the address of the transaction. The given dual-processing pipeline performs a duplicate tag lookup in parallel with a memory cache tag lookup for the transaction. By performing the duplicate tag lookup and the memory cache tag lookup in a parallel fashion rather than in a serial fashion, latency and power consumption are reduced while performance is enhanced.Type: GrantFiled: September 12, 2018Date of Patent: October 5, 2021Assignee: Apple Inc.Inventors: Muditha Kanchana, Srinivasa Rangan Sridharan, Harshavardhan Kaushikkar, Sridhar Kotha, Brian P. Lilly
-
Patent number: 10795818Abstract: Various systems and methods for ensuring real-time snoop latency are disclosed. A system includes a processor and a cache controller. The cache controller receives, via a channel, cache snoop requests from the processor, the snoop requests including latency-sensitive and non-latency sensitive requests. Requests are not prioritized by type within the channel. The cache controller limits a number of non-latency sensitive snoop requests that can be processed ahead of an incoming latency-sensitive snoop requests. Limiting the number of non-latency sensitive snoop requests that can be processed ahead of an incoming latency-sensitive snoop request includes the cache controller determining that the number of received non-latency sensitive snoop requests has reached a predetermined value and responsively prioritizing latency-sensitive requests over non-latency sensitive requests.Type: GrantFiled: May 21, 2019Date of Patent: October 6, 2020Assignee: Apple Inc.Inventors: Harshavardhan Kaushikkar, Per H. Hammarlund, Brian P. Lilly, Michael Bekerman, James Vash, Manu Gulati, Benjamin K. Dodge
-
Publication number: 20200081838Abstract: Systems, apparatuses, and methods for performing coherence processing and memory cache processing in parallel are disclosed. A system includes a communication fabric and a plurality of dual-processing pipelines. Each dual-processing pipeline includes a coherence processing pipeline and a memory cache processing pipeline. The communication fabric forwards a transaction to a given dual-processing pipeline, with the communication fabric selecting the given dual-processing pipeline, from the plurality of dual-processing pipelines, based on a hash of the address of the transaction. The given dual-processing pipeline performs a duplicate tag lookup in parallel with a memory cache tag lookup for the transaction. By performing the duplicate tag lookup and the memory cache tag lookup in a parallel fashion rather than in a serial fashion, latency and power consumption are reduced while performance is enhanced.Type: ApplicationFiled: September 12, 2018Publication date: March 12, 2020Inventors: Muditha Kanchana, Srinivasa Rangan Sridharan, Harshavardhan Kaushikkar, Sridhar Kotha, Brian P. Lilly
-
Patent number: 9563567Abstract: A method and apparatus for selectively powering down a portion of a cache memory includes determining a power down condition dependent upon a number of accesses to the cache memory. In response to the detection of the power down condition, selecting a group of cache ways included in the cache memory dependent upon a number of cache lines in each cache way that are also included in another cache memory. The method further includes locking and flushing the selected group of cache ways, and then activating a low power mode for the selected group of cache ways.Type: GrantFiled: April 28, 2014Date of Patent: February 7, 2017Assignee: Apple Inc.Inventors: Mahnaz P Sadoughi-Yarandi, Perumal R. Subramonium, Brian P. Lilly, Hari S Kannan