Patents by Inventor Pradeep Kanapathipillai

Pradeep Kanapathipillai has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11914524
    Abstract: An electronic device includes one or more processors for executing one or more virtual machines. In response to a request for initiating a synchronization event, a processor identifies a subset of speculative memory access requests in one or more memory access request queues. Automatically and in accordance with the identifying, the processor purges translations associated with the subset of speculative memory access requests. Subsequent to the purging, the processor initiates the synchronization event. In some implementations, memory access completion is forced in response to a context synchronization event that corresponds to a termination of a first application, a termination of a first virtual machine, or a system call for updating a system register. Alternatively, in some implementations, memory access completion is forced in an operating system level or an application level in response to a data synchronization event that is initiated on a hypervisor layer or a firmware layer.
    Type: Grant
    Filed: March 1, 2022
    Date of Patent: February 27, 2024
    Assignee: QUALCOMM Incorporated
    Inventors: Adrian Montero, Huzefa Sanjeliwala, Paul Kitchin, Prarthna Santhanakrishnan, Conrado Blasco, Pradeep Kanapathipillai
  • Patent number: 11797045
    Abstract: An electronic system has a plurality of processing clusters including a first processing cluster. The first processing cluster further includes a plurality of processors and a power management processor. The power management processor obtains performance information about the plurality of processors, executes power instructions to transition a first processor of the plurality of processors from a first performance state to a second performance state different from the first performance state, and executes one or more debug instructions to perform debugging of a respective processor of the plurality of processors. The power instructions are executed in accordance with the obtained performance information and independently of respective performance states of other processors in the plurality of processors of the first processing cluster.
    Type: Grant
    Filed: February 7, 2022
    Date of Patent: October 24, 2023
    Assignee: QUALCOMM Incorporated
    Inventors: Jonathan Masters, Pradeep Kanapathipillai, Manu Gulati, Nitin Makhija
  • Publication number: 20230333851
    Abstract: Techniques are disclosed relating to data synchronization barrier operations. A system includes a first processor that may receive a data barrier operation request from a second processor include in the system. Based on receiving that data barrier operation request from the second processor, the first processor may ensure that outstanding load/store operations executed by the first processor that are directed to addresses outside of an exclusion region have been completed. The first processor may respond to the second processor that the data barrier operation request is complete at the first processor, even in the case that one or more load/store operations that are directed to addresses within the exclusion region are outstanding and not complete when the first processor responds that the data barrier operation request is complete.
    Type: Application
    Filed: June 16, 2023
    Publication date: October 19, 2023
    Inventors: Jeff Gonion, John H. Kelm, James Vash, Pradeep Kanapathipillai, Mridul Agarwal, Gideon N. Levinsky, Richard F. Russo, Christopher M. Tsay
  • Publication number: 20230281133
    Abstract: An electronic device includes one or more processors for executing one or more virtual machines. In response to a request for initiating a synchronization event, a processor identifies a subset of speculative memory access requests in one or more memory access request queues. Automatically and in accordance with the identifying, the processor purges translations associated with the subset of speculative memory access requests. Subsequent to the purging, the processor initiates the synchronization event. In some implementations, memory access completion is forced in response to a context synchronization event that corresponds to a termination of a first application, a termination of a first virtual machine, or a system call for updating a system register. Alternatively, in some implementations, memory access completion is forced in an operating system level or an application level in response to a data synchronization event that is initiated on a hypervisor layer or a firmware layer.
    Type: Application
    Filed: March 1, 2022
    Publication date: September 7, 2023
    Inventors: Adrian MONTERO, Huzefa SANJELIWALA, Paul KITCHIN, Prarthna SANTHANAKRISHNAN, Conrado BLASCO, Pradeep KANAPATHIPILLAI
  • Patent number: 11720360
    Abstract: Techniques are disclosed relating to data synchronization barrier operations. A system includes a first processor that may receive a data barrier operation request from a second processor include in the system. Based on receiving that data barrier operation request from the second processor, the first processor may ensure that outstanding load/store operations executed by the first processor that are directed to addresses outside of an exclusion region have been completed. The first processor may respond to the second processor that the data barrier operation request is complete at the first processor, even in the case that one or more load/store operations that are directed to addresses within the exclusion region are outstanding and not complete when the first processor responds that the data barrier operation request is complete.
    Type: Grant
    Filed: September 8, 2021
    Date of Patent: August 8, 2023
    Assignee: Apple Inc.
    Inventors: Jeff Gonion, John H. Kelm, James Vash, Pradeep Kanapathipillai, Mridul Agarwal, Gideon N. Levinsky, Richard F. Russo, Christopher M. Tsay
  • Publication number: 20230093426
    Abstract: An electronic system has a plurality of processing clusters including a first processing cluster. The first processing cluster further includes a plurality of processors and a power management processor. The power management processor obtains performance information about the plurality of processors, executes power instructions to transition a first processor of the plurality of processors from a first performance state to a second performance state different from the first performance state, and executes one or more debug instructions to perform debugging of a respective processor of the plurality of processors. The power instructions are executed in accordance with the obtained performance information and independently of respective performance states of other processors in the plurality of processors of the first processing cluster.
    Type: Application
    Filed: February 7, 2022
    Publication date: March 23, 2023
    Inventors: Jonathan MASTERS, Pradeep KANAPATHIPILLAI, Manu GULATI, Nitin MAKHIJA
  • Publication number: 20230064603
    Abstract: An electronic device includes a plurality of processors for executing one or more virtual machines. A processor of the plurality of processors is associated with a translation cache and one or more filters corresponding to the translation cache. The one or more filters include a virtual machine identifier filter, and the processor is configured to receive a translation invalidation instruction to invalidate one or more entries in the translation cache. In accordance with a determination that the translation invalidation specifies a respective virtual machine identifier, the processor queries the virtual machine identifier filter associated with the translation cache to determine whether the respective virtual machine identifier is stored in the virtual machine identifier filter.
    Type: Application
    Filed: February 18, 2022
    Publication date: March 2, 2023
    Inventors: Conrado BLASCO, Adrian MONTERO, Paul KITCHIN, Pradeep KANAPATHIPILLAI, Huzefa SANJELIWALA
  • Patent number: 11500638
    Abstract: A method and system for compressing and decompressing data is disclosed. A compression command may initiate the prefetching of first data, which may be stored in a first buffer. Multiple words of the first data may be read from the first buffer and used to generate a plurality of compressed packets, each of which includes a command specifying a type of packet. The compressed packets may be combined into a group and multiple groups may be combined and stored in a second buffer. A decompression command may initiate the prefetching of second data, which is stored in the first buffer. A portion of the second data may be read from the first buffer and used to generate a group of compressed packets. Multiple output words may be generated dependent upon the group of compressed packets.
    Type: Grant
    Filed: January 10, 2020
    Date of Patent: November 15, 2022
    Assignee: Apple Inc.
    Inventors: Aditya Kesiraju, James Vash, Pradeep Kanapathipillai, Mridul Agarwal, Zhaoming Hu, Tyler Huberty, Charles Tucker
  • Publication number: 20220358082
    Abstract: In an embodiment, a coprocessor may include a bypass indication which identifies execution circuitry that is not used by a given processor instruction, and thus may be bypassed. The corresponding circuitry may be disabled during execution, preventing evaluation when the output of the circuitry will not be used for the instruction. In another embodiment, the coprocessor may implement a grid of processing elements in rows and columns, where a given coprocessor instruction may specify an operation that causes up to all of the processing elements to operate on vectors of input operands to produce results. Implementations of the coprocessor may implement a portion of the processing elements. The coprocessor control circuitry may be designed to operate with the full grid or partial grid, reissuing instructions in the partial grid case to perform the requested operation. In still another embodiment, the coprocessor may be able to fuse vector mode operations.
    Type: Application
    Filed: July 20, 2022
    Publication date: November 10, 2022
    Inventors: Aditya Kesiraju, Andrew J. Beaumont-Smith, Boris S. Alvarez-Heredia, Pradeep Kanapathipillai, Ran A. Chachick
  • Publication number: 20220083338
    Abstract: Techniques are disclosed relating to data synchronization barrier operations. A system includes a first processor that may receive a data barrier operation request from a second processor include in the system. Based on receiving that data barrier operation request from the second processor, the first processor may ensure that outstanding load/store operations executed by the first processor that are directed to addresses outside of an exclusion region have been completed. The first processor may respond to the second processor that the data barrier operation request is complete at the first processor, even in the case that one or more load/store operations that are directed to addresses within the exclusion region are outstanding and not complete when the first processor responds that the data barrier operation request is complete.
    Type: Application
    Filed: September 8, 2021
    Publication date: March 17, 2022
    Inventors: Jeff Gonion, John H. Kelm, James Vash, Pradeep Kanapathipillai, Mridul Agarwal, Gideon N. Levinsky, Richard F. Russo, Christopher M. Tsay
  • Patent number: 11221962
    Abstract: A system and method for efficiently transferring address mappings and data access permissions corresponding to the address mappings. A computing system includes at least one processor and memory for storing a page table. In response to receiving a memory access operation comprising a first address, the address translation unit is configured to identify a data access permission based on a permission index corresponding to the first address, and access data stored in a memory location of the memory identified by a second address in a manner defined by the retrieved data access permission. The address translation unit is configured to access a table to identify the data access permission, and is configured to determine the permission index and the second address based on the first address. A single permission index may correspond to different permissions for different entities within the system.
    Type: Grant
    Filed: May 15, 2020
    Date of Patent: January 11, 2022
    Assignee: Apple Inc.
    Inventors: Jeffry E. Gonion, Bernard Joseph Semeria, Michael J. Swift, Pradeep Kanapathipillai, David J. Williamson
  • Patent number: 10990159
    Abstract: Systems, apparatuses, and methods for retaining architected state for relatively frequent switching between sleep and active operating states are described. A processor receives an indication to transition from an active state to a sleep state. The processor stores a copy of a first subset of the architected state information in on-die storage elements capable of retaining storage after power is turned off. The processor supports programmable input/output (PIO) access of particular stored information during the sleep state. When a wakeup event is detected, circuitry within the processor is powered up again. A boot sequence and recovery of architected state from off-chip memory are not performed. Rather than fetch from a memory location pointed to by a reset base address register, the processor instead fetches an instruction from a memory location pointed to by a restored program counter of the retained subset of the architected state information.
    Type: Grant
    Filed: April 25, 2017
    Date of Patent: April 27, 2021
    Assignee: Apple Inc.
    Inventors: Bernard Joseph Semeria, John H. Mylius, Pradeep Kanapathipillai, Richard F. Russo, Shih-Chieh Wen, Richard H. Larson
  • Publication number: 20210064539
    Abstract: A system and method for efficiently transferring address mappings and data access permissions corresponding to the address mappings. A computing system includes at least one processor and memory for storing a page table. In response to receiving a memory access operation comprising a first address, the address translation unit is configured to identify a data access permission based on a permission index corresponding to the first address, and access data stored in a memory location of the memory identified by a second address in a manner defined by the retrieved data access permission. The address translation unit is configured to access a table to identify the data access permission, and is configured to determine the permission index and the second address based on the first address. A single permission index may correspond to different permissions for different entities within the system.
    Type: Application
    Filed: May 15, 2020
    Publication date: March 4, 2021
    Inventors: Jeffry E. Gonion, Bernard Joseph Semeria, Michael J. Swift, Pradeep Kanapathipillai, David J. Williamson
  • Publication number: 20200272597
    Abstract: In an embodiment, a coprocessor may include a bypass indication which identifies execution circuitry that is not used by a given processor instruction, and thus may be bypassed. The corresponding circuitry may be disabled during execution, preventing evaluation when the output of the circuitry will not be used for the instruction. In another embodiment, the coprocessor may implement a grid of processing elements in rows and columns, where a given coprocessor instruction may specify an operation that causes up to all of the processing elements to operate on vectors of input operands to produce results. Implementations of the coprocessor may implement a portion of the processing elements. The coprocessor control circuitry may be designed to operate with the full grid or partial grid, reissuing instructions in the partial grid case to perform the requested operation. In still another embodiment, the coprocessor may be able to fuse vector mode operations.
    Type: Application
    Filed: February 26, 2019
    Publication date: August 27, 2020
    Inventors: Aditya Kesiraju, Andrew J. Beaumont-Smith, Boris S. Alvarez-Heredia, Pradeep Kanapathipillai, Ran A. Chachick, Srikanth Balasubramanian
  • Patent number: 10725928
    Abstract: A system and method for efficiently performing maintenance on a cache. In various embodiments, control logic in a cache controller or elsewhere receives an indication for invalidating a range of virtual-to-physical mappings in a given translation lookaside buffer (TLB). The logic determines a first latency to invalidate entries of the TLB based on a number of addresses in the range and a number of supported page sizes simultaneously stored in the TLB. The logic determines a second latency based on a number of entries in the TLB. If the first latency is greater, then the logic traverses through each TLB entry and invalidates TLB entries storing a virtual address within the range. If the first latency is smaller, then the logic traverses through each address in the range and invalidates TLB entries storing a virtual address within the range.
    Type: Grant
    Filed: January 9, 2019
    Date of Patent: July 28, 2020
    Assignee: Apple Inc.
    Inventors: Brian R. Mestan, Pradeep Kanapathipillai, Joshua William Smith
  • Publication number: 20200218663
    Abstract: A system and method for efficiently performing maintenance on a cache. In various embodiments, control logic in a cache controller or elsewhere receives an indication for invalidating a range of virtual-to-physical mappings in a given translation lookaside buffer (TLB). The logic determines a first latency to invalidate entries of the TLB based on a number of addresses in the range and a number of supported page sizes simultaneously stored in the TLB. The logic determines a second latency based on a number of entries in the TLB. If the first latency is greater, then the logic traverses through each TLB entry and invalidates TLB entries storing a virtual address within the range. If the first latency is smaller, then the logic traverses through each address in the range and invalidates TLB entries storing a virtual address within the range.
    Type: Application
    Filed: January 9, 2019
    Publication date: July 9, 2020
    Inventors: Brian R. Mestan, Pradeep Kanapathipillai, Joshua William Smith
  • Patent number: 10621100
    Abstract: In an embodiment, a processor may implement an access map-pattern match (AMPM)-based prefetch circuit for a multi-level cache system. The access patterns that are matched to the access maps may include prefetches for different cache levels. Centralizing the generation of prefetches into one prefetch circuit may provide better observability and controllability of prefetching at various levels of the cache hierarchy, in an embodiment. Prefetches at different levels may be controlled individually based on the accuracy of those prefetches, in an embodiment. Additionally, in an embodiment, access patterns that are longer that a given threshold may have the granularity of the prefetches change so that more data is prefetched and the prefetches occur farther in advance, in some embodiments.
    Type: Grant
    Filed: December 5, 2018
    Date of Patent: April 14, 2020
    Assignee: Apple Inc.
    Inventors: Stephan G. Meier, Tyler J. Huberty, Gerard R. Williams, III, Pradeep Kanapathipillai
  • Patent number: 10437595
    Abstract: Systems, apparatuses, and methods for optimizing a load-store dependency predictor (LSDP). When a younger load instruction is issued before an older store instruction and the younger load is dependent on the older store, the LSDP is trained on this ordering violation. A replay/flush indicator is stored in a corresponding entry in the LSDP to indicate whether the ordering violation resulted in a flush or replay. On subsequent executions, a dependency may be enforced for the load-store pair if a confidence counter is above a threshold, with the threshold varying based on the status of the replay/flush indicator. If a given load matches on multiple entries in the LSDP, and if at least one of the entries has a flush indicator, then the given load may be marked as a multimatch case and forced to wait to issue until all older stores have issued.
    Type: Grant
    Filed: March 15, 2016
    Date of Patent: October 8, 2019
    Assignee: Apple Inc.
    Inventors: Pradeep Kanapathipillai, Stephan G. Meier, Gerard R. Williams, III, Mridul Agarwal, Kulin N. Kothari
  • Patent number: 10228951
    Abstract: Systems, apparatuses, and methods for committing store instructions out of order from a store queue are described. A processor may store a first store instruction and a second store instruction in the store queue, wherein the first store instruction is older than the second store instruction. In response to determining the second store instruction is ready to commit to the memory hierarchy, the processor may allow the second store instruction to commit before the first store instruction, in response to determining that all store instructions in the store queue older than the second store instruction are non-speculative. However, if it is determined that at least one store instruction in the store queue older than the second store instruction is speculative, the processor may prevent the second store instruction from committing to the memory hierarchy before the first store instruction.
    Type: Grant
    Filed: August 20, 2015
    Date of Patent: March 12, 2019
    Assignee: Apple Inc.
    Inventors: Kulin N. Kothari, Mridul Agarwal, Pradeep Kanapathipillai
  • Patent number: 10180905
    Abstract: In an embodiment, a processor may implement an access map-pattern match (AMPM)-based prefetch circuit for a multi-level cache system. The access patterns that are matched to the access maps may include prefetches for different cache levels. Centralizing the generation of prefetches into one prefetch circuit may provide better observability and controllability of prefetching at various levels of the cache hierarchy, in an embodiment. Prefetches at different levels may be controlled individually based on the accuracy of those prefetches, in an embodiment. Additionally, in an embodiment, access patterns that are longer that a given threshold may have the granularity of the prefetches change so that more data is prefetched and the prefetches occur farther in advance, in some embodiments.
    Type: Grant
    Filed: April 7, 2016
    Date of Patent: January 15, 2019
    Assignee: Apple Inc.
    Inventors: Stephan G. Meier, Tyler J. Huberty, Gerard R. Williams, III, Pradeep Kanapathipillai