Patents by Inventor Sudarshan Kadambi

Sudarshan Kadambi has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 8892841
    Abstract: In one embodiment, a processor may be configured to write ECC granular stores into the data cache, while non-ECC granular stores may be merged with cache data in a memory request buffer. In one embodiment, a processor may be configured to detect that a victim block writeback hits one or more stores in a memory request buffer (or vice versa) and may convert the victim block writeback to a fill. In one embodiment, a processor may speculatively issue stores that are subsequent to a load from a load/store queue, but prevent the update for the stores in response to a snoop hit on the load.
    Type: Grant
    Filed: July 9, 2012
    Date of Patent: November 18, 2014
    Assignee: Apple Inc.
    Inventors: Ramesh Gunna, Po-Yung Chang, Sudarshan Kadambi
  • Patent number: 8364907
    Abstract: In one embodiment, a processor may be configured to write ECC granular stores into the data cache, while non-ECC granular stores may be merged with cache data in a memory request buffer. In one embodiment, a processor may be configured to detect that a victim block writeback hits one or more stores in a memory request buffer (or vice versa) and may convert the victim block writeback to a fill. In one embodiment, a processor may speculatively issue stores that are subsequent to a load from a load/store queue, but prevent the update for the stores in response to a snoop hit on the load.
    Type: Grant
    Filed: January 27, 2012
    Date of Patent: January 29, 2013
    Assignee: Apple Inc.
    Inventors: Ramesh Gunna, Sudarshan Kadambi
  • Patent number: 8316188
    Abstract: In one embodiment, a processor comprises a prefetch unit coupled to a data cache. The prefetch unit is configured to concurrently maintain a plurality of separate, active prefetch streams. Each prefetch stream is either software initiated via execution by the processor of a dedicated prefetch instruction or hardware initiated via detection of a data cache miss by one or more load/store memory operations. The prefetch unit is further configured to generate prefetch requests responsive to the plurality of prefetch streams to prefetch data in to the data cache. In an embodiment, the prefetch unit is configured to check for a cache hit for a prefetch request by checking a duplicate cache tags.
    Type: Grant
    Filed: June 21, 2011
    Date of Patent: November 20, 2012
    Assignee: Apple Inc.
    Inventors: Sudarshan Kadambi, Puneet Kumar, Po-Yung Chang
  • Publication number: 20120278685
    Abstract: In one embodiment, a processor may be configured to write ECC granular stores into the data cache, while non-ECC granular stores may be merged with cache data in a memory request buffer. In one embodiment, a processor may be configured to detect that a victim block writeback hits one or more stores in a memory request buffer (or vice versa) and may convert the victim block writeback to a fill. In one embodiment, a processor may speculatively issue stores that are subsequent to a load from a load/store queue, but prevent the update for the stores in response to a snoop hit on the load.
    Type: Application
    Filed: July 9, 2012
    Publication date: November 1, 2012
    Inventors: Ramesh Gunna, Po-Yung Chang, Sudarshan Kadambi
  • Patent number: 8301843
    Abstract: In one embodiment, a processor comprises a core configured to execute a data cache block write instruction and an interface unit coupled to the core and to an interconnect on which the processor is configured to communicate. The core is configured to transmit a request to the interface unit in response to the data cache block write instruction. If the request is speculative, the interface unit is configured to issue a first transaction on the interconnect. On the other hand, if the request is non-speculative, the interface unit is configured to issue a second transaction on the interconnect. The second transaction is different from the first transaction. For example, the second transaction may be an invalidate transaction and the first transaction may be a probe transaction. In some embodiments, the processor may be in a system including the interconnect and one or more caching agents.
    Type: Grant
    Filed: December 30, 2009
    Date of Patent: October 30, 2012
    Assignee: Apple Inc.
    Inventors: Ramesh Gunna, Sudarshan Kadambi, Peter J. Bannon
  • Patent number: 8239638
    Abstract: In one embodiment, a processor may be configured to write ECC granular stores into the data cache, while non-ECC granular stores may be merged with cache data in a memory request buffer. In one embodiment, a processor may be configured to detect that a victim block writeback hits one or more stores in a memory request buffer (or vice versa) and may convert the victim block writeback to a fill. In one embodiment, a processor may speculatively issue stores that are subsequent to a load from a load/store queue, but prevent the update for the stores in response to a snoop hit on the load.
    Type: Grant
    Filed: June 5, 2007
    Date of Patent: August 7, 2012
    Assignee: Apple Inc.
    Inventors: Ramesh Gunna, Po-Yung Chang, Sudarshan Kadambi
  • Publication number: 20120131281
    Abstract: In one embodiment, a processor may be configured to write ECC granular stores into the data cache, while non-ECC granular stores may be merged with cache data in a memory request buffer. In one embodiment, a processor may be configured to detect that a victim block writeback hits one or more stores in a memory request buffer (or vice versa) and may convert the victim block writeback to a fill. In one embodiment, a processor may speculatively issue stores that are subsequent to a load from a load/store queue, but prevent the update for the stores in response to a snoop hit on the load.
    Type: Application
    Filed: January 27, 2012
    Publication date: May 24, 2012
    Inventors: Ramesh Gunna, Sudarshan Kadambi
  • Patent number: 8131946
    Abstract: In one embodiment, a processor may be configured to write ECC granular stores into the data cache, while non-ECC granular stores may be merged with cache data in a memory request buffer. In one embodiment, a processor may be configured to detect that a victim block writeback hits one or more stores in a memory request buffer (or vice versa) and may convert the victim block writeback to a fill. In one embodiment, a processor may speculatively issue stores that are subsequent to a load from a load/store queue, but prevent the update for the stores in response to a snoop hit on the load.
    Type: Grant
    Filed: October 20, 2010
    Date of Patent: March 6, 2012
    Assignee: Apple Inc.
    Inventors: Ramesh Gunna, Sudarshan Kadambi
  • Publication number: 20110264864
    Abstract: In one embodiment, a processor comprises a prefetch unit coupled to a data cache. The prefetch unit is configured to concurrently maintain a plurality of separate, active prefetch streams. Each prefetch stream is either software initiated via execution by the processor of a dedicated prefetch instruction or hardware initiated via detection of a data cache miss by one or more load/store memory operations. The prefetch unit is further configured to generate prefetch requests responsive to the plurality of prefetch streams to prefetch data in to the data cache.
    Type: Application
    Filed: June 21, 2011
    Publication date: October 27, 2011
    Inventors: Sudarshan Kadambi, Puneet Kumar, Po-Yung Chang
  • Patent number: 7996624
    Abstract: In one embodiment, a processor comprises a prefetch unit coupled to a data cache. The prefetch unit is configured to concurrently maintain a plurality of separate, active prefetch streams. Each prefetch stream is either software initiated via execution by the processor of a dedicated prefetch instruction or hardware initiated via detection of a data cache miss by one or more load/store memory operations. The prefetch unit is further configured to generate prefetch requests responsive to the plurality of prefetch streams to prefetch data in to the data cache.
    Type: Grant
    Filed: July 6, 2010
    Date of Patent: August 9, 2011
    Assignee: Apple Inc.
    Inventors: Sudarshan Kadambi, Puneet Kumar, Po-Yung Chang
  • Patent number: 7984274
    Abstract: In one embodiment, a processor comprises a prediction circuit and another circuit coupled to the prediction circuit. The prediction circuit is configured to predict whether or not a first load instruction will experience a partial store to load forward (PSTLF) event during execution. A PSTLF event occurs if a plurality of bytes, accessed responsive to the first load instruction during execution, include at least a first byte updated responsive to a previous uncommitted store operation and also include at least a second byte not updated responsive to the previous uncommitted store operation. Coupled to receive the first load instruction, the circuit is configured to generate one or more load operations responsive to the first load instruction. The load operations are to be executed in the processor to execute the first load instruction, and a number of the load operations is dependent on the prediction by the prediction circuit.
    Type: Grant
    Filed: June 18, 2009
    Date of Patent: July 19, 2011
    Assignee: Apple Inc.
    Inventors: Sudarshan Kadambi, Po-Yung Chang, Eric Hao
  • Publication number: 20110047336
    Abstract: In one embodiment, a processor may be configured to write ECC granular stores into the data cache, while non-ECC granular stores may be merged with cache data in a memory request buffer. In one embodiment, a processor may be configured to detect that a victim block writeback hits one or more stores in a memory request buffer (or vice versa) and may convert the victim block writeback to a fill. In one embodiment, a processor may speculatively issue stores that are subsequent to a load from a load/store queue, but prevent the update for the stores in response to a snoop hit on the load.
    Type: Application
    Filed: October 20, 2010
    Publication date: February 24, 2011
    Inventors: Ramesh Gunna, Sudarshan Kadambi
  • Patent number: 7836262
    Abstract: In one embodiment, a processor may be configured to write ECC granular stores into the data cache, while non-ECC granular stores may be merged with cache data in a memory request buffer. In one embodiment, a processor may be configured to detect that a victim block writeback hits one or more stores in a memory request buffer (or vice versa) and may convert the victim block writeback to a fill. In one embodiment, a processor may speculatively issue stores that are subsequent to a load from a load/store queue, but prevent the update for the stores in response to a snoop hit on the load.
    Type: Grant
    Filed: June 5, 2007
    Date of Patent: November 16, 2010
    Assignee: Apple Inc.
    Inventors: Ramesh Gunna, Sudarshan Kadambi
  • Publication number: 20100268894
    Abstract: In one embodiment, a processor comprises a prefetch unit coupled to a data cache. The prefetch unit is configured to concurrently maintain a plurality of separate, active prefetch streams. Each prefetch stream is either software initiated via execution by the processor of a dedicated prefetch instruction or hardware initiated via detection of a data cache miss by one or more load/store memory operations. The prefetch unit is further configured to generate prefetch requests responsive to the plurality of prefetch streams to prefetch data in to the data cache.
    Type: Application
    Filed: July 6, 2010
    Publication date: October 21, 2010
    Inventors: Sudarshan Kadambi, Puneet Kumar, Po-Yung Chang
  • Patent number: 7779208
    Abstract: In one embodiment, a processor comprises a prefetch unit coupled to a data cache. The prefetch unit is configured to concurrently maintain a plurality of separate, active prefetch streams. Each prefetch stream is either software initiated via execution by the processor of a dedicated prefetch instruction or hardware initiated via detection of a data cache miss by one or more load/store memory operations. The prefetch unit is further configured to generate prefetch requests responsive to the plurality of prefetch streams to prefetch data in to the data cache.
    Type: Grant
    Filed: January 7, 2009
    Date of Patent: August 17, 2010
    Assignee: Apple Inc.
    Inventors: Sudarshan Kadambi, Puneet Kumar, Po-Yung Chang
  • Publication number: 20100106916
    Abstract: In one embodiment, a processor comprises a core configured to execute a data cache block write instruction and an interface unit coupled to the core and to an interconnect on which the processor is configured to communicate. The core is configured to transmit a request to the interface unit in response to the data cache block write instruction. If the request is speculative, the interface unit is configured to issue a first transaction on the interconnect. On the other hand, if the request is non-speculative, the interface unit is configured to issue a second transaction on the interconnect. The second transaction is different from the first transaction. For example, the second transaction may be an invalidate transaction and the first transaction may be a probe transaction. In some embodiments, the processor may be in a system including the interconnect and one or more caching agents.
    Type: Application
    Filed: December 30, 2009
    Publication date: April 29, 2010
    Inventors: Ramesh Gunna, Sudarshan Kadambi, Peter J. Bannon
  • Patent number: 7707361
    Abstract: In one embodiment, a processor comprises a core configured to execute a data cache block write instruction and an interface unit coupled to the core and to an interconnect on which the processor is configured to communicate. The core is configured to transmit a request to the interface unit in response to the data cache block write instruction. If the request is speculative, the interface unit is configured to issue a first transaction on the interconnect. On the other hand, if the request is non-speculative, the interface unit is configured to issue a second transaction on the interconnect. The second transaction is different from the first transaction. For example, the second transaction may be an invalidate transaction and the first transaction may be a probe transaction. In some embodiments, the processor may be in a system including the interconnect and one or more caching agents.
    Type: Grant
    Filed: November 17, 2005
    Date of Patent: April 27, 2010
    Assignee: Apple Inc.
    Inventors: Ramesh Gunna, Sudarshan Kadambi, Peter J. Bannon
  • Publication number: 20090254734
    Abstract: In one embodiment, a processor comprises a prediction circuit and another circuit coupled to the prediction circuit. The prediction circuit is configured to predict whether or not a first load instruction will experience a partial store to load forward (PSTLF) event during execution. A PSTLF event occurs if a plurality of bytes, accessed responsive to the first load instruction during execution, include at least a first byte updated responsive to a previous uncommitted store operation and also include at least a second byte not updated responsive to the previous uncommitted store operation. Coupled to receive the first load instruction, the circuit is configured to generate one or more load operations responsive to the first load instruction. The load operations are to be executed in the processor to execute the first load instruction, and a number of the load operations is dependent on the prediction by the prediction circuit.
    Type: Application
    Filed: June 18, 2009
    Publication date: October 8, 2009
    Inventors: Sudarshan Kadambi, Po-Yung Chang, Eric Hao
  • Patent number: 7568087
    Abstract: In one embodiment, a processor comprises a prediction circuit and another circuit coupled to the prediction circuit. The prediction circuit is configured to predict whether or not a first load instruction will experience a partial store to load forward (PSTLF) event during execution. A PSTLF event occurs if a plurality of bytes, accessed responsive to the first load instruction during execution, include at least a first byte updated responsive to a previous uncommitted store operation and also include at least a second byte not updated responsive to the previous uncommitted store operation. Coupled to receive the first load instruction, the circuit is configured to generate one or more load operations responsive to the first load instruction. The load operations are to be executed in the processor to execute the first load instruction, and a number of the load operations is dependent on the prediction by the prediction circuit.
    Type: Grant
    Filed: March 25, 2008
    Date of Patent: July 28, 2009
    Assignee: Apple Inc.
    Inventors: Sudarshan Kadambi, Po-Yung Chang, Eric Hao
  • Publication number: 20090119488
    Abstract: In one embodiment, a processor comprises a prefetch unit coupled to a data cache. The prefetch unit is configured to concurrently maintain a plurality of separate, active prefetch streams. Each prefetch stream is either software initiated via execution by the processor of a dedicated prefetch instruction or hardware initiated via detection of a data cache miss by one or more load/store memory operations. The prefetch unit is further configured to generate prefetch requests responsive to the plurality of prefetch streams to prefetch data in to the data cache.
    Type: Application
    Filed: January 7, 2009
    Publication date: May 7, 2009
    Inventors: Sudarshan Kadambi, Puneet Kumar, Po-Yung Chang