Patents by Inventor Mridul Agarwal

Mridul Agarwal has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20250258670
    Abstract: A reservation station that includes a storage circuit with multiple full and partial entries is disclosed. A given store-data entry of the multiple partial entries may store less data than a given full entry of the multiple full entries. A control circuit may receive a load/store operation and, in response to a determination that the load/store operation includes only a single source, store the load/store operation in a particular partial entry of the multiple partial entries.
    Type: Application
    Filed: June 13, 2024
    Publication date: August 14, 2025
    Inventors: Amit KUMAR, Deepankar DUGGAL, Haoyan JIA, Manjunath SHEVGOOR, Mridul AGARWAL, Nikhil GUPTA
  • Publication number: 20250258622
    Abstract: A reservation station that includes primary and secondary storage circuits is disclosed. The primary storage circuit may include multiple full entries, while the secondary storage circuit includes multiple store-data entries. A given store-data entry of the multiple store-data entries stores a subset of the information stored in a given full entry of the multiple full entries. In response to a determination that a store address associated with a particular store operation, stored in particular full entry of the multiple full entries, has been available for use for a threshold number of cycles without store data associated with the particular store operation being available for use, a control circuit may transfer the particular store operation to a particular store-data entry of the multiple store-data entries.
    Type: Application
    Filed: June 13, 2024
    Publication date: August 14, 2025
    Inventors: Amit KUMAR, Brian R. MESTAN, Mridul AGARWAL, Nikhil GUPTA
  • Patent number: 12321746
    Abstract: Techniques are disclosed relating to data synchronization barrier operations. A system includes a first processor that may receive a data barrier operation request from a second processor include in the system. Based on receiving that data barrier operation request from the second processor, the first processor may ensure that outstanding load/store operations executed by the first processor that are directed to addresses outside of an exclusion region have been completed. The first processor may respond to the second processor that the data barrier operation request is complete at the first processor, even in the case that one or more load/store operations that are directed to addresses within the exclusion region are outstanding and not complete when the first processor responds that the data barrier operation request is complete.
    Type: Grant
    Filed: June 16, 2023
    Date of Patent: June 3, 2025
    Assignee: Apple Inc.
    Inventors: Jeff Gonion, John H. Kelm, James Vash, Pradeep Kanapathipillai, Mridul Agarwal, Gideon N. Levinsky, Richard F. Russo, Christopher M. Tsay
  • Patent number: 12314200
    Abstract: An interrupt delivery mechanism for a system includes and interrupt controller and a plurality of cluster interrupt controllers coupled to respective pluralities of processors in an embodiment. The interrupt controller may serially transmit an interrupt request to respective cluster interrupt controllers, which may acknowledge (Ack) or non-acknowledge (Nack) the interrupt based on attempting to deliver the interrupt to processors to which the cluster interrupt controller is coupled. In a soft iteration, the cluster interrupt controller may attempt to deliver the interrupt to processors that are powered on, without attempting to power on processors that are powered off. If the soft iteration does not result in an Ack response from one of the plurality of cluster interrupt controllers, a hard iteration may be performed in which the powered-off processors may be powered on.
    Type: Grant
    Filed: May 24, 2024
    Date of Patent: May 27, 2025
    Assignee: Apple Inc.
    Inventors: Jeffrey E. Gonion, Charles E. Tucker, Tal Kuzi, Richard F. Russo, Mridul Agarwal, Christopher M. Tsay, Gideon N. Levinsky, Shih-Chieh Wen, Lior Zimet
  • Publication number: 20250147767
    Abstract: A system may include multiple processors. One of the processors may receive an indication of a data synchronization barrier (DSB) instruction in another processor that follows a translation look-ahead buffer invalidate (TLBI) instruction to invalidate an entry of a translation look-ahead buffer. The processor may determine whether instructions are pending in the processor for which the virtual addresses used for memory accesses have been translated to physical addresses before receiving the DSB indication. If there are such pending instructions, the processor may provide, after these instructions retire, an indication to the other processor as a response to the DSB indication.
    Type: Application
    Filed: January 3, 2025
    Publication date: May 8, 2025
    Applicant: Apple Inc.
    Inventors: Madhu Sudan Hari, Mridul Agarwal, Kulin N. Kothari, John D. Pape, Niket K. Choudhary
  • Patent number: 12229561
    Abstract: A system may include multiple processors. One of the processors may receive an indication of a data synchronization barrier (DSB) instruction in another processor that follows a translation look-ahead buffer invalidate (TLBI) instruction to invalidate an entry of a translation look-ahead buffer. The processor may determine whether instructions are pending in the processor for which the virtual addresses used for memory accesses have been translated to physical addresses before receiving the DSB indication. If there are such pending instructions, the processor may provide, after these instructions retire, an indication to the other processor as a response to the DSB indication.
    Type: Grant
    Filed: September 16, 2022
    Date of Patent: February 18, 2025
    Assignee: Apple Inc.
    Inventors: Madhu Sudan Hari, Mridul Agarwal, Kulin N Kothari, John D Pape, Niket K Choudhary
  • Publication number: 20240362027
    Abstract: Techniques are disclosed relating to load value prediction. In some embodiments, a processor includes load address prediction circuitry and load value prediction circuitry. Training circuitry may train loads in a given entry, and may include a first entry configured to store first predicted load address information and a confidence indication of confidence that the first predicted load address information is correct and a second entry configured to store first predicted load value information and a confidence indication of confidence that the first predicted load value information is correct (note a given entry may be configured to load or value prediction at different times). Control circuitry may, in response to an entry in the training circuitry reaching a threshold level of confidence, allocate a corresponding entry in either the load value prediction circuitry or the load address prediction circuitry.
    Type: Application
    Filed: July 5, 2024
    Publication date: October 31, 2024
    Inventors: Yuan C. Chou, Debasish Chandra, Mridul Agarwal, Haoyan Jia
  • Publication number: 20240329990
    Abstract: A system, e.g., a system on a chip (SOC), may include one or more processors. A processor may execute an instruction synchronization barrier (ISB) instruction to enforce an ordering constraint on instructions. To execute the ISB instruction, the processor may determine whether contexts of the processor required for execution of instructions older than the ISB instruction are consumed for the older instructions. Responsive to determining that the contexts are consumed for the older instructions, the processor may initiate fetching of an instruction younger than the ISB instruction, without waiting for the older instructions to retire.
    Type: Application
    Filed: June 11, 2024
    Publication date: October 3, 2024
    Applicant: Apple Inc.
    Inventors: Deepankar Duggal, Kulin N Kothari, Mridul Agarwal, Chang Xu, Yanran Yang, Richard F Russo, Yuan C Chou, Douglas C Holman
  • Publication number: 20240311319
    Abstract: An interrupt delivery mechanism for a system includes and interrupt controller and a plurality of cluster interrupt controllers coupled to respective pluralities of processors in an embodiment. The interrupt controller may serially transmit an interrupt request to respective cluster interrupt controllers, which may acknowledge (Ack) or non-acknowledge (Nack) the interrupt based on attempting to deliver the interrupt to processors to which the cluster interrupt controller is coupled. In a soft iteration, the cluster interrupt controller may attempt to deliver the interrupt to processors that are powered on, without attempting to power on processors that are powered off. If the soft iteration does not result in an Ack response from one of the plurality of cluster interrupt controllers, a hard iteration may be performed in which the powered-off processors may be powered on.
    Type: Application
    Filed: May 24, 2024
    Publication date: September 19, 2024
    Inventors: Jeffrey E. Gonion, Charles E. Tucker, Tal Kuzi, Richard F. Russo, Mridul Agarwal, Christopher M. Tsay, Gideon N. Levinsky, Shih-Chieh Wen, Lior Zimet
  • Patent number: 12067398
    Abstract: Techniques are disclosed relating to load value prediction. In some embodiments, a processor includes learning table circuitry that is shared for both address and value prediction. Loads may be trained for value prediction when they are eligible for both value and address prediction. Entries in the learning table may be promoted to an address prediction table or a load value prediction table for prediction, e.g., when they reach a threshold confidence level in the training table. In some embodiments, the learning table stores a hash of a predicted load value and control circuitry uses a probing load to retrieve the actual predicted load value for the value prediction table.
    Type: Grant
    Filed: April 29, 2022
    Date of Patent: August 20, 2024
    Assignee: Apple Inc.
    Inventors: Yuan C. Chou, Debasish Chandra, Mridul Agarwal, Haoyan Jia
  • Publication number: 20240248844
    Abstract: In an embodiment, a processor implements a different atomicity size (for memory consistency order) than the operation size. More particularly, the processor may implement a smaller atomicity size than the operation size. For example, for multiple register loads, the atomicity size may be the register size. In another example, the vector element size may be the atomicity size for vector load instructions. In yet another example, multiple contiguous vector elements, but fewer than all the vector elements in a vector register, may be the atomicity size for vector load instructions.
    Type: Application
    Filed: February 26, 2024
    Publication date: July 25, 2024
    Inventors: Francesco Spadini, Gideon Levinsky, Mridul Agarwal
  • Patent number: 12045615
    Abstract: A system, e.g., a system on a chip (SOC), may include one or more processors. A processor may execute an instruction synchronization barrier (ISB) instruction to enforce an ordering constraint on instructions. To execute the ISB instruction, the processor may determine whether contexts of the processor required for execution of instructions older than the ISB instruction are consumed for the older instructions. Responsive to determining that the contexts are consumed for the older instructions, the processor may initiate fetching of an instruction younger than the ISB instruction, without waiting for the older instructions to retire.
    Type: Grant
    Filed: September 16, 2022
    Date of Patent: July 23, 2024
    Assignee: Apple Inc.
    Inventors: Deepankar Duggal, Kulin N Kothari, Mridul Agarwal, Chang Xu, Yanran Yang, Richard F Russo, Yuan C Chou, Douglas C Holman
  • Patent number: 12007920
    Abstract: An interrupt delivery mechanism for a system includes and interrupt controller and a plurality of cluster interrupt controllers coupled to respective pluralities of processors in an embodiment. The interrupt controller may serially transmit an interrupt request to respective cluster interrupt controllers, which may acknowledge (Ack) or non-acknowledge (Nack) the interrupt based on attempting to deliver the interrupt to processors to which the cluster interrupt controller is coupled. In a soft iteration, the cluster interrupt controller may attempt to deliver the interrupt to processors that are powered on, without attempting to power on processors that are powered off. If the soft iteration does not result in an Ack response from one of the plurality of cluster interrupt controllers, a hard iteration may be performed in which the powered-off processors may be powered on.
    Type: Grant
    Filed: April 17, 2023
    Date of Patent: June 11, 2024
    Assignee: Apple Inc.
    Inventors: Jeffrey E. Gonion, Charles E. Tucker, Tal Kuzi, Richard F. Russo, Mridul Agarwal, Christopher M. Tsay, Gideon N. Levinsky, Shih-Chieh Wen, Lior Zimet
  • Patent number: 11914511
    Abstract: In an embodiment, a processor implements a different atomicity size (for memory consistency order) than the operation size. More particularly, the processor may implement a smaller atomicity size than the operation size. For example, for multiple register loads, the atomicity size may be the register size. In another example, the vector element size may be the atomicity size for vector load instructions. In yet another example, multiple contiguous vector elements, but fewer than all the vector elements in a vector register, may be the atomicity size for vector load instructions.
    Type: Grant
    Filed: June 22, 2020
    Date of Patent: February 27, 2024
    Assignee: Apple Inc.
    Inventors: Francesco Spadini, Gideon Levinsky, Mridul Agarwal
  • Patent number: 11829763
    Abstract: A system and method for efficiently reducing the latency of load operations. In various embodiments, logic of a processor accesses a prediction table after fetching instructions. For a prediction table hit, the logic executes a load instruction with a retrieved predicted address from the prediction table. For a prediction table miss, when the logic determines the address of the load instruction and hits in a learning table, the logic updates a level of confidence indication to indicate a higher level of confidence when a stored address matches the determined address. When the logic determines the level of confidence indication stored in a given table entry of the learning table meets a threshold, the logic allocates, in the prediction table, information stored in the given entry. Therefore, the predicted address is available during the next lookup of the prediction table.
    Type: Grant
    Filed: August 13, 2019
    Date of Patent: November 28, 2023
    Assignee: Apple Inc.
    Inventors: Yuan C. Chou, Viney Gautam, Wei-Han Lien, Kulin N. Kothari, Mridul Agarwal
  • Publication number: 20230333851
    Abstract: Techniques are disclosed relating to data synchronization barrier operations. A system includes a first processor that may receive a data barrier operation request from a second processor include in the system. Based on receiving that data barrier operation request from the second processor, the first processor may ensure that outstanding load/store operations executed by the first processor that are directed to addresses outside of an exclusion region have been completed. The first processor may respond to the second processor that the data barrier operation request is complete at the first processor, even in the case that one or more load/store operations that are directed to addresses within the exclusion region are outstanding and not complete when the first processor responds that the data barrier operation request is complete.
    Type: Application
    Filed: June 16, 2023
    Publication date: October 19, 2023
    Inventors: Jeff Gonion, John H. Kelm, James Vash, Pradeep Kanapathipillai, Mridul Agarwal, Gideon N. Levinsky, Richard F. Russo, Christopher M. Tsay
  • Publication number: 20230251985
    Abstract: An interrupt delivery mechanism for a system includes and interrupt controller and a plurality of cluster interrupt controllers coupled to respective pluralities of processors in an embodiment. The interrupt controller may serially transmit an interrupt request to respective cluster interrupt controllers, which may acknowledge (Ack) or non-acknowledge (Nack) the interrupt based on attempting to deliver the interrupt to processors to which the cluster interrupt controller is coupled. In a soft iteration, the cluster interrupt controller may attempt to deliver the interrupt to processors that are powered on, without attempting to power on processors that are powered off. If the soft iteration does not result in an Ack response from one of the plurality of cluster interrupt controllers, a hard iteration may be performed in which the powered-off processors may be powered on.
    Type: Application
    Filed: April 17, 2023
    Publication date: August 10, 2023
    Inventors: Jeffrey E. Gonion, Charles E. Tucker, Tal Kuzi, Richard F. Russo, Mridul Agarwal, Christopher M. Tsay, Gideon N. Levinsky, Shih-Chieh Wen, Lior Zimet
  • Patent number: 11720360
    Abstract: Techniques are disclosed relating to data synchronization barrier operations. A system includes a first processor that may receive a data barrier operation request from a second processor include in the system. Based on receiving that data barrier operation request from the second processor, the first processor may ensure that outstanding load/store operations executed by the first processor that are directed to addresses outside of an exclusion region have been completed. The first processor may respond to the second processor that the data barrier operation request is complete at the first processor, even in the case that one or more load/store operations that are directed to addresses within the exclusion region are outstanding and not complete when the first processor responds that the data barrier operation request is complete.
    Type: Grant
    Filed: September 8, 2021
    Date of Patent: August 8, 2023
    Assignee: Apple Inc.
    Inventors: Jeff Gonion, John H. Kelm, James Vash, Pradeep Kanapathipillai, Mridul Agarwal, Gideon N. Levinsky, Richard F. Russo, Christopher M. Tsay
  • Patent number: 11630789
    Abstract: An interrupt delivery mechanism for a system includes and interrupt controller and a plurality of cluster interrupt controllers coupled to respective pluralities of processors in an embodiment. The interrupt controller may serially transmit an interrupt request to respective cluster interrupt controllers, which may acknowledge (Ack) or non-acknowledge (Nack) the interrupt based on attempting to deliver the interrupt to processors to which the cluster interrupt controller is coupled. In a soft iteration, the cluster interrupt controller may attempt to deliver the interrupt to processors that are powered on, without attempting to power on processors that are powered off. If the soft iteration does not result in an Ack response from one of the plurality of cluster interrupt controllers, a hard iteration may be performed in which the powered-off processors may be powered on.
    Type: Grant
    Filed: April 30, 2021
    Date of Patent: April 18, 2023
    Assignee: Apple Inc.
    Inventors: Jeffrey E. Gonion, Charles E. Tucker, Tal Kuzi, Richard F. Russo, Mridul Agarwal, Christopher M. Tsay, Gideon N. Levinsky, Shih-Chieh Wen, Lior Zimet
  • Patent number: 11500638
    Abstract: A method and system for compressing and decompressing data is disclosed. A compression command may initiate the prefetching of first data, which may be stored in a first buffer. Multiple words of the first data may be read from the first buffer and used to generate a plurality of compressed packets, each of which includes a command specifying a type of packet. The compressed packets may be combined into a group and multiple groups may be combined and stored in a second buffer. A decompression command may initiate the prefetching of second data, which is stored in the first buffer. A portion of the second data may be read from the first buffer and used to generate a group of compressed packets. Multiple output words may be generated dependent upon the group of compressed packets.
    Type: Grant
    Filed: January 10, 2020
    Date of Patent: November 15, 2022
    Assignee: Apple Inc.
    Inventors: Aditya Kesiraju, James Vash, Pradeep Kanapathipillai, Mridul Agarwal, Zhaoming Hu, Tyler Huberty, Charles Tucker