Patents by Inventor Ramesh Gunna
Ramesh Gunna has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 8892841Abstract: In one embodiment, a processor may be configured to write ECC granular stores into the data cache, while non-ECC granular stores may be merged with cache data in a memory request buffer. In one embodiment, a processor may be configured to detect that a victim block writeback hits one or more stores in a memory request buffer (or vice versa) and may convert the victim block writeback to a fill. In one embodiment, a processor may speculatively issue stores that are subsequent to a load from a load/store queue, but prevent the update for the stores in response to a snoop hit on the load.Type: GrantFiled: July 9, 2012Date of Patent: November 18, 2014Assignee: Apple Inc.Inventors: Ramesh Gunna, Po-Yung Chang, Sudarshan Kadambi
-
Patent number: 8555040Abstract: In one embodiment, a processor implements an indirect branch target predictor to predict target addresses of indirect branch instructions. The indirect branch target predictor may store target addresses generated during previous executions of indirect branches, and may use the stored target addresses as predictions for current indirect branches. The indirect branch target predictor may also store a validation tag corresponding to each stored target address. The validation tag may be compared to similar data corresponding to the current indirect branch being predicted. If the validation tag does not match, the indirect branch is presumed to be mispredicted (since the branch target address actually belongs to a different instruction). The indirect branch target predictor may inhibit speculative execution subsequent to the mispredicted indirect branch until the redirect is signalled for the mispredicted indirect branch.Type: GrantFiled: May 24, 2010Date of Patent: October 8, 2013Assignee: Apple Inc.Inventors: Andrew J. Beaumont-Smith, Ramesh Gunna
-
Patent number: 8364907Abstract: In one embodiment, a processor may be configured to write ECC granular stores into the data cache, while non-ECC granular stores may be merged with cache data in a memory request buffer. In one embodiment, a processor may be configured to detect that a victim block writeback hits one or more stores in a memory request buffer (or vice versa) and may convert the victim block writeback to a fill. In one embodiment, a processor may speculatively issue stores that are subsequent to a load from a load/store queue, but prevent the update for the stores in response to a snoop hit on the load.Type: GrantFiled: January 27, 2012Date of Patent: January 29, 2013Assignee: Apple Inc.Inventors: Ramesh Gunna, Sudarshan Kadambi
-
Patent number: 8359414Abstract: An interface unit may comprise a buffer configured to store requests that are to be transmitted on an interconnect and a control unit coupled to the buffer. In one embodiment, the control unit is coupled to receive a retry response from the interconnect during a response phase of a first transaction for a first request stored in the buffer. The control unit is configured to record an identifier supplied on the interconnect with the retry response that identifies a second transaction that is in progress on the interconnect. The control unit is configured to inhibit reinitiation of the first transaction at least until detecting a second transmission of the identifier. In another embodiment, the control unit is configured to assert a retry response during a response phase of a first transaction responsive to a snoop hit of the first transaction on a first request stored in the buffer for which a second transaction is in progress on the interconnect.Type: GrantFiled: June 21, 2011Date of Patent: January 22, 2013Assignee: Apple Inc.Inventors: James B. Keller, Sridhar P. Subramanian, Ramesh Gunna
-
Patent number: 8352685Abstract: In an embodiment, a combining write buffer is configured to maintain one or more flush metrics to determine when to transmit write operations from buffer entries. The combining write buffer may be configured to dynamically modify the flush metrics in response to activity in the write buffer, modifying the conditions under which write operations are transmitted from the write buffer to the next lower level of memory. For example, in one implementation, the flush metrics may include categorizing write buffer entries as “collapsed.” A collapsed write buffer entry, and the collapsed write operations therein, may include at least one write operation that has overwritten data that was written by a previous write operation in the buffer entry. In another implementation, the combining write buffer may maintain the threshold of buffer fullness as a flush metric and may adjust it over time based on the actual buffer fullness.Type: GrantFiled: August 20, 2010Date of Patent: January 8, 2013Assignee: Apple Inc.Inventors: Peter J. Bannon, Andrew J. Beaumont-Smith, Ramesh Gunna, Wei-han Lien, Brian P. Lilly, Jaidev P. Patwardhan, Shih-Chieh R. Wen, Tse-Yu Yeh
-
Patent number: 8347040Abstract: In one embodiment, a system comprises a plurality of agents coupled to an interconnect and a cache coupled to the interconnect. The plurality of agents are configured to cache data. A first agent of the plurality of agents is configured to initiate a transaction on the interconnect by transmitting a memory request, and other agents of the plurality of agents are configured to snoop the memory request from the interconnect. The other agents provide a response in a response phase of the transaction on the interconnect. The cache is configured to detect a hit for the memory request and to provide data for the transaction to the first agent prior to the response phase and independent of the response.Type: GrantFiled: April 18, 2011Date of Patent: January 1, 2013Assignee: Apple Inc.Inventors: Brian P. Lilly, Sridhar P. Subramanian, Ramesh Gunna
-
Patent number: 8341379Abstract: In one embodiment, a processor comprises a memory management unit (MMU) and an interface unit coupled to the MMU and to an interface unit of the processor. The MMU comprises a queue configured to store pending hardware-generated page table entry (PTE) updates. The interface unit is configured to receive a synchronization operation on the interface that is defined to cause the pending hardware-generated PTE updates, if any, to be written to memory. The MMU is configured to accept a subsequent hardware-generated PTE update generated subsequent to receiving the synchronization operation even if the synchronization operation has not completed on the interface. In some embodiments, the MMU may accept the subsequent PTE update responsive to transmitting the pending PTE updates from the queue. In other embodiments, the pending PTE updates may be identified in the queue and subsequent updates may be received.Type: GrantFiled: May 5, 2010Date of Patent: December 25, 2012Assignee: Apple Inc.Inventors: Jesse Pan, Ramesh Gunna
-
Publication number: 20120278685Abstract: In one embodiment, a processor may be configured to write ECC granular stores into the data cache, while non-ECC granular stores may be merged with cache data in a memory request buffer. In one embodiment, a processor may be configured to detect that a victim block writeback hits one or more stores in a memory request buffer (or vice versa) and may convert the victim block writeback to a fill. In one embodiment, a processor may speculatively issue stores that are subsequent to a load from a load/store queue, but prevent the update for the stores in response to a snoop hit on the load.Type: ApplicationFiled: July 9, 2012Publication date: November 1, 2012Inventors: Ramesh Gunna, Po-Yung Chang, Sudarshan Kadambi
-
Patent number: 8301843Abstract: In one embodiment, a processor comprises a core configured to execute a data cache block write instruction and an interface unit coupled to the core and to an interconnect on which the processor is configured to communicate. The core is configured to transmit a request to the interface unit in response to the data cache block write instruction. If the request is speculative, the interface unit is configured to issue a first transaction on the interconnect. On the other hand, if the request is non-speculative, the interface unit is configured to issue a second transaction on the interconnect. The second transaction is different from the first transaction. For example, the second transaction may be an invalidate transaction and the first transaction may be a probe transaction. In some embodiments, the processor may be in a system including the interconnect and one or more caching agents.Type: GrantFiled: December 30, 2009Date of Patent: October 30, 2012Assignee: Apple Inc.Inventors: Ramesh Gunna, Sudarshan Kadambi, Peter J. Bannon
-
Patent number: 8255670Abstract: In one embodiment, a processor comprises a scheduler configured to issue a first instruction operation to be executed and an execution core coupled to the scheduler. Configured to execute the first instruction operation, the execution core comprises a plurality of replay sources configured to cause a replay of the first instruction operation responsive to detecting at least one of a plurality of replay cases. The scheduler is configured to inhibit issuance of the first instruction operation subsequent to the replay for a subset of the plurality of replay cases. The scheduler is coupled to receive an acknowledgement indication corresponding to each of the plurality of replay cases in the subset, and is configured to inhibit issuance of the first instruction operation until the acknowledgement indication is asserted that corresponds to an identified replay case of the subset.Type: GrantFiled: November 17, 2009Date of Patent: August 28, 2012Assignee: Apple Inc.Inventors: Po-Yung Chang, Wei-Han Lien, Jesse Pan, Ramesh Gunna, Tse-Yu Yeh, James B. Keller
-
Patent number: 8239638Abstract: In one embodiment, a processor may be configured to write ECC granular stores into the data cache, while non-ECC granular stores may be merged with cache data in a memory request buffer. In one embodiment, a processor may be configured to detect that a victim block writeback hits one or more stores in a memory request buffer (or vice versa) and may convert the victim block writeback to a fill. In one embodiment, a processor may speculatively issue stores that are subsequent to a load from a load/store queue, but prevent the update for the stores in response to a snoop hit on the load.Type: GrantFiled: June 5, 2007Date of Patent: August 7, 2012Assignee: Apple Inc.Inventors: Ramesh Gunna, Po-Yung Chang, Sudarshan Kadambi
-
Publication number: 20120131281Abstract: In one embodiment, a processor may be configured to write ECC granular stores into the data cache, while non-ECC granular stores may be merged with cache data in a memory request buffer. In one embodiment, a processor may be configured to detect that a victim block writeback hits one or more stores in a memory request buffer (or vice versa) and may convert the victim block writeback to a fill. In one embodiment, a processor may speculatively issue stores that are subsequent to a load from a load/store queue, but prevent the update for the stores in response to a snoop hit on the load.Type: ApplicationFiled: January 27, 2012Publication date: May 24, 2012Inventors: Ramesh Gunna, Sudarshan Kadambi
-
Patent number: 8171326Abstract: In one embodiment, a processor comprises a data cache configured to store a plurality of cache blocks and a control unit coupled to the data cache. The control unit is configured to flush the plurality of cache blocks from the data cache responsive to an indication that the processor is to transition to a low power state in which one or more clocks for the processor are inhibited.Type: GrantFiled: May 24, 2010Date of Patent: May 1, 2012Assignee: Apple Inc.Inventors: James B. Keller, Tse-Yu Yeh, Ramesh Gunna, Brian J. Campbell
-
Patent number: 8131946Abstract: In one embodiment, a processor may be configured to write ECC granular stores into the data cache, while non-ECC granular stores may be merged with cache data in a memory request buffer. In one embodiment, a processor may be configured to detect that a victim block writeback hits one or more stores in a memory request buffer (or vice versa) and may convert the victim block writeback to a fill. In one embodiment, a processor may speculatively issue stores that are subsequent to a load from a load/store queue, but prevent the update for the stores in response to a snoop hit on the load.Type: GrantFiled: October 20, 2010Date of Patent: March 6, 2012Assignee: Apple Inc.Inventors: Ramesh Gunna, Sudarshan Kadambi
-
Publication number: 20120047332Abstract: In an embodiment, a combining write buffer is configured to maintain one or more flush metrics to determine when to transmit write operations from buffer entries. The combining write buffer may be configured to dynamically modify the flush metrics in response to activity in the write buffer, modifying the conditions under which write operations are transmitted from the write buffer to the next lower level of memory. For example, in one implementation, the flush metrics may include categorizing write buffer entries as “collapsed.” A collapsed write buffer entry, and the collapsed write operations therein, may include at least one write operation that has overwritten data that was written by a previous write operation in the buffer entry. In another implementation, the combining write buffer may maintain the threshold of buffer fullness as a flush metric and may adjust it over time based on the actual buffer fullness.Type: ApplicationFiled: August 20, 2010Publication date: February 23, 2012Inventors: Peter J. Bannon, Andrew J. Beaumont-Smith, Ramesh Gunna, Wei-han Lien, Brian P. Lilly, Jaidev P. Patwardhan, Shih-Chieh R. Wen, Tse-Yu Yeh
-
Publication number: 20110289300Abstract: In one embodiment, a processor implements an indirect branch target predictor to predict target addresses of indirect branch instructions. The indirect branch target predictor may store target addresses generated during previous executions of indirect branches, and may use the stored target addresses as predictions for current indirect branches. The indirect branch target predictor may also store a validation tag corresponding to each stored target address. The validation tag may be compared to similar data corresponding to the current indirect branch being predicted. If the validation tag does not match, the indirect branch is presumed to be mispredicted (since the branch target address actually belongs to a different instruction). The indirect branch target predictor may inhibit speculative execution subsequent to the mispredicted indirect branch until the redirect is signalled for the mispredicted indirect branch.Type: ApplicationFiled: May 24, 2010Publication date: November 24, 2011Inventors: Andrew J. Beaumont-Smith, Ramesh Gunna
-
Publication number: 20110252165Abstract: An interface unit may comprise a buffer configured to store requests that are to be transmitted on an interconnect and a control unit coupled to the buffer. In one embodiment, the control unit is coupled to receive a retry response from the interconnect during a response phase of a first transaction for a first request stored in the buffer. The control unit is configured to record an identifier supplied on the interconnect with the retry response that identifies a second transaction that is in progress on the interconnect. The control unit is configured to inhibit reinitiation of the first transaction at least until detecting a second transmission of the identifier. In another embodiment, the control unit is configured to assert a retry response during a response phase of a first transaction responsive to a snoop hit of the first transaction on a first request stored in the buffer for which a second transaction is in progress on the interconnect.Type: ApplicationFiled: June 21, 2011Publication date: October 13, 2011Inventors: James B. Keller, Sridhar P. Subramanian, Ramesh Gunna
-
Publication number: 20110197030Abstract: In one embodiment, a system comprises a plurality of agents coupled to an interconnect and a cache coupled to the interconnect. The plurality of agents are configured to cache data. A first agent of the plurality of agents is configured to initiate a transaction on the interconnect by transmitting a memory request, and other agents of the plurality of agents are configured to snoop the memory request from the interconnect. The other agents provide a response in a response phase of the transaction on the interconnect. The cache is configured to detect a hit for the memory request and to provide data for the transaction to the first agent prior to the response phase and independent of the response.Type: ApplicationFiled: April 18, 2011Publication date: August 11, 2011Inventors: Brian P. Lilly, Sridhar P. Subramanian, Ramesh Gunna
-
Patent number: 7991928Abstract: An interface unit may comprise a buffer configured to store requests that are to be transmitted on an interconnect and a control unit coupled to the buffer. In one embodiment, the control unit is coupled to receive a retry response from the interconnect during a response phase of a first transaction for a first request stored in the buffer. The control unit is configured to record an identifier supplied on the interconnect with the retry response that identifies a second transaction that is in progress on the interconnect. The control unit is configured to inhibit reinitiation of the first transaction at least until detecting a second transmission of the identifier. In another embodiment, the control unit is configured to assert a retry response during a response phase of a first transaction responsive to a snoop hit of the first transaction on a first request stored in the buffer for which a second transaction is in progress on the interconnect.Type: GrantFiled: March 20, 2009Date of Patent: August 2, 2011Assignee: Apple Inc.Inventors: James B. Keller, Sridhar P. Subramanian, Ramesh Gunna
-
Patent number: 7970970Abstract: In one embodiment, a switch is configured to be coupled to an interconnect. The switch comprises a plurality of storage locations and an arbiter control circuit coupled to the plurality of storage locations. The plurality of storage locations are configured to store a plurality of requests transmitted by a plurality of agents. The arbiter control circuit is configured to arbitrate among the plurality of requests stored in the plurality of storage locations. A selected request is the winner of the arbitration, and the switch is configured to transmit the selected request from one of the plurality of storage locations onto the interconnect. In another embodiment, a system comprises a plurality of agents, an interconnect, and the switch coupled to the plurality of agents and the interconnect. In another embodiment, a method is contemplated.Type: GrantFiled: May 26, 2010Date of Patent: June 28, 2011Assignee: Apple Inc.Inventors: Sridhar P. Subramanian, James B. Keller, Ruchi Wadhawan, George Kong Yiu, Ramesh Gunna