Patents by Inventor Peter J. Bannon

Peter J. Bannon has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Maintaining the integrity of an execution return address stack

Patent number: 9354886

Abstract: A processor and method for maintaining the integrity of an execution return address stack (RAS). The execution RAS is maintained in an accurate state by storing information regarding branch instructions in a branch information table. The first time a branch instruction is executed, an entry is allocated and populated in the table. If the branch instruction is re-executed, a pointer address is retrieved from the corresponding table entry and the execution RAS pointer is repositioned to the retrieved pointer address. The execution RAS can also be used to restore a speculative RAS due to a mis-speculation.

Type: Grant

Filed: November 28, 2011

Date of Patent: May 31, 2016

Assignee: Apple Inc.

Inventors: Ramesh B. Gunna, Peter J. Bannon, Andrew J. Beaumont-Smith
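To illustrate the mechanism summarized in the abstract above, here is a minimal Python sketch of a branch information table that repositions a return address stack pointer when a branch is re-executed. The names `ExecutionRAS`, `branch_info`, and `tag`, and the fixed stack depth, are assumptions made for this sketch; it is a toy software model, not the claimed hardware design.

```python
class ExecutionRAS:
    """Toy model of an execution return address stack (RAS)."""

    def __init__(self, depth=8):
        self.depth = depth
        self.stack = [None] * depth
        self.top = 0            # index of the next free slot
        self.branch_info = {}   # hypothetical table: branch tag -> saved RAS pointer

    def execute(self, tag, kind, return_addr=None):
        if tag in self.branch_info:
            # Re-execution: reposition the pointer to the value recorded
            # the first time this branch was executed.
            self.top = self.branch_info[tag]
        else:
            # First execution: allocate and populate a table entry.
            self.branch_info[tag] = self.top
        if kind == "call":
            self.stack[self.top % self.depth] = return_addr
            self.top += 1
            return None
        if kind == "return":
            self.top -= 1
            return self.stack[self.top % self.depth]
        return None


ras = ExecutionRAS()
ras.execute(tag=1, kind="call", return_addr=0x100)
ras.execute(tag=2, kind="call", return_addr=0x200)
ras.execute(tag=1, kind="call", return_addr=0x100)   # replay of branch 1
top_after_replay = ras.top
```

In this model, replaying branch 1 does not push a third entry; the pointer is first moved back to where it stood when branch 1 originally executed.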
Lookahead scanning and cracking of microcode instructions in a dispatch queue

Patent number: 9280352

Abstract: An apparatus and method for avoiding bubbles and maintaining a maximum instruction throughput rate when cracking microcode instructions. A lookahead pointer scans the newest entries of a dispatch queue for microcode instructions. A detected microcode instruction is conveyed to a microcode engine to be cracked into a sequence of micro-ops. Then, the sequence of micro-ops is placed in a queue, and when the original microcode instruction entry in the dispatch queue is selected for dispatch, the sequence of micro-ops is dispatched to the next stage of the processor pipeline.

Type: Grant

Filed: November 30, 2011

Date of Patent: March 8, 2016

Assignee: Apple Inc.

Inventors: Ramesh B. Gunna, Peter J. Bannon, Rajat Goel

Instruction type issue throttling upon reaching threshold by adjusting counter increment amount for issued cycle and decrement amount for not issued cycle

Patent number: 9009451

Abstract: A system and method for reducing power consumption through issue throttling of selected problematic instructions. A power throttle unit within a processor maintains instruction issue counts for associated instruction types. The instruction types may be a subset of supported instruction types executed by an execution core within the processor. The instruction types may be chosen based on high power consumption estimates for processing instructions of these types. The power throttle unit may determine a given instruction issue count exceeds a given threshold. In response, the power throttle unit may select given instruction types to limit a respective issue rate. The power throttle unit may choose an issue rate for each one of the selected given instruction types and limit an associated issue rate to a chosen issue rate. The selection of given instruction types and associated issue rate limits is programmable.

Type: Grant

Filed: October 31, 2011

Date of Patent: April 14, 2015

Assignee: Apple Inc.

Inventors: Daniel C. Murray, Andrew J. Beaumont-Smith, John H. Mylius, Peter J. Bannon, Toshi Takayanagi, Jung Wook Cho
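One plausible reading of the title and abstract above is a per-instruction-type counter that rises on cycles where the instruction issues, falls on cycles where it does not, and blocks issue once a threshold is reached. The Python sketch below models only that idea. The class name `IssueThrottle`, the parameter names, and the exact comparison against the threshold are assumptions, not details taken from the patent.

```python
class IssueThrottle:
    """Toy issue throttle for one instruction type."""

    def __init__(self, threshold, inc, dec):
        self.threshold = threshold  # programmable limit (assumed semantics)
        self.inc = inc              # added on a cycle where the type issues
        self.dec = dec              # subtracted on a cycle where it does not
        self.count = 0

    def tick(self, wants_issue):
        """Advance one cycle; return True if the instruction issued."""
        issued = wants_issue and self.count < self.threshold
        if issued:
            self.count += self.inc
        else:
            self.count = max(0, self.count - self.dec)
        return issued


throttle = IssueThrottle(threshold=4, inc=2, dec=1)
pattern = [throttle.tick(wants_issue=True) for _ in range(8)]
```

With `inc=2` and `dec=1`, this toy model settles into one issue every three cycles under constant demand, which shows how the two amounts together set the sustained issue rate.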
Combining write buffer with dynamically adjustable flush metrics

Patent number: 8566528

Abstract: In an embodiment, a combining write buffer is configured to maintain one or more flush metrics to determine when to transmit write operations from buffer entries. The combining write buffer may be configured to dynamically modify the flush metrics in response to activity in the write buffer, modifying the conditions under which write operations are transmitted from the write buffer to the next lower level of memory. For example, in one implementation, the flush metrics may include categorizing write buffer entries as “collapsed.” A collapsed write buffer entry, and the collapsed write operations therein, may include at least one write operation that has overwritten data that was written by a previous write operation in the buffer entry. In another implementation, the combining write buffer may maintain the threshold of buffer fullness as a flush metric and may adjust it over time based on the actual buffer fullness.

Type: Grant

Filed: December 10, 2012

Date of Patent: October 22, 2013

Assignee: Apple Inc.

Inventors: Peter J. Bannon, Andrew J. Beaumont-Smith, Ramesh B. Gunna, Wei-han Lien, Brian P. Lilly, Jaidev P. Patwardhan, Shih-Chieh R. Wen, Tse-Yu Yeh
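The "collapsed" classification described in the abstract above can be sketched in a few lines of Python. In the model below an entry is marked collapsed when a write overwrites an offset already written in the same entry, as the abstract describes. The flush policy (flush when the number of entries reaches a fixed threshold, preferring the oldest non-collapsed entry) is an assumption chosen for illustration, as are the names `CombiningWriteBuffer`, `flush_threshold`, and `block_size`; the dynamic adjustment of the threshold is not modeled.

```python
class CombiningWriteBuffer:
    """Toy combining write buffer keyed by block number."""

    def __init__(self, flush_threshold=3, block_size=16):
        self.flush_threshold = flush_threshold  # assumed: entry count that triggers a flush
        self.block_size = block_size
        self.entries = {}  # block -> {"data": {offset: value}, "collapsed": bool}

    def write(self, addr, value):
        """Merge a write into its block entry; return any entries flushed."""
        block, offset = divmod(addr, self.block_size)
        entry = self.entries.setdefault(block, {"data": {}, "collapsed": False})
        if offset in entry["data"]:
            # This write overwrites data from an earlier write in the entry.
            entry["collapsed"] = True
        entry["data"][offset] = value
        flushed = []
        while len(self.entries) >= self.flush_threshold:
            flushed.append(self._flush_one())
        return flushed

    def _flush_one(self):
        # Assumed policy: keep collapsed entries longer, since they have
        # already absorbed an overwrite; otherwise flush the oldest entry.
        victim = next(
            (b for b, e in self.entries.items() if not e["collapsed"]),
            next(iter(self.entries)),
        )
        return victim, self.entries.pop(victim)["data"]


buf = CombiningWriteBuffer()
buf.write(0, 1)
buf.write(0, 2)            # same offset again: block 0 becomes collapsed
buf.write(16, 3)
flushed = buf.write(32, 4)  # third entry reaches the threshold
```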
EFFICIENT MICROCODE INSTRUCTION DISPATCH

Publication number: 20130138924

Abstract: An apparatus and method for avoiding bubbles and maintaining a maximum instruction throughput rate when cracking microcode instructions. A lookahead pointer scans the newest entries of a dispatch queue for microcode instructions. A detected microcode instruction is conveyed to a microcode engine to be cracked into a sequence of micro-ops. Then, the sequence of micro-ops is placed in a queue, and when the original microcode instruction entry in the dispatch queue is selected for dispatch, the sequence of micro-ops is dispatched to the next stage of the processor pipeline.

Type: Application

Filed: November 30, 2011

Publication date: May 30, 2013

Inventors: Ramesh B. Gunna, Peter J. Bannon, Rajat Goel

MAINTAINING THE INTEGRITY OF AN EXECUTION RETURN ADDRESS STACK

Publication number: 20130138931

Abstract: A processor and method for maintaining the integrity of an execution return address stack (RAS). The execution RAS is maintained in an accurate state by storing information regarding branch instructions in a branch information table. The first time a branch instruction is executed, an entry is allocated and populated in the table. If the branch instruction is re-executed, a pointer address is retrieved from the corresponding table entry and the execution RAS pointer is repositioned to the retrieved pointer address. The execution RAS can also be used to restore a speculative RAS due to a mis-speculation.

Type: Application

Filed: November 28, 2011

Publication date: May 30, 2013

Inventors: Ramesh B. Gunna, Peter J. Bannon, Andrew J. Beaumont-Smith

PROCESSOR INSTRUCTION ISSUE THROTTLING

Publication number: 20130111191

Abstract: A system and method for reducing power consumption through issue throttling of selected problematic instructions. A power throttle unit within a processor maintains instruction issue counts for associated instruction types. The instruction types may be a subset of supported instruction types executed by an execution core within the processor. The instruction types may be chosen based on high power consumption estimates for processing instructions of these types. The power throttle unit may determine a given instruction issue count exceeds a given threshold. In response, the power throttle unit may select given instruction types to limit a respective issue rate. The power throttle unit may choose an issue rate for each one of the selected given instruction types and limit an associated issue rate to a chosen issue rate. The selection of given instruction types and associated issue rate limits is programmable.

Type: Application

Filed: October 31, 2011

Publication date: May 2, 2013

Inventors: Daniel C. Murray, Andrew J. Beaumont-Smith, John H. Mylius, Peter J. Bannon, Toshi Takayanagi, Jung Wook Cho

Processor employing split scheduler in which near, low latency operation dependencies are tracked separate from other operation dependencies

Patent number: 8364936

Abstract: In an embodiment, a scheduler implements a first dependency array that tracks dependencies on instruction operations (ops) within a distance N of a given op and which are short execution latency ops. Other dependencies are tracked in a second dependency array. The first dependency array may evaluate quickly, to support back-to-back issuance of short execution latency ops and their dependent ops. The second array may evaluate more slowly than the first dependency array.

Type: Grant

Filed: July 25, 2012

Date of Patent: January 29, 2013

Assignee: Apple Inc.

Inventors: Andrew J. Beaumont-Smith, Honkai Tam, Daniel C. Murray, John H. Mylius, Peter J. Bannon, Pradeep Kanapathipillai
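The split described in the abstract above (dependencies on short-latency ops within a distance N go to one array, all other dependencies to a second) can be illustrated with a small Python function. The representation of an op as a `(is_short_latency, sources)` pair, the function name `split_dependencies`, and the use of sets in place of hardware dependency arrays are assumptions made for this sketch.

```python
def split_dependencies(ops, n):
    """Classify each op's producer dependencies as near or far.

    ops: list of (is_short_latency, sources), where sources holds the
         indices of earlier ops that this op depends on.
    n:   maximum distance tracked by the first (fast) array.
    """
    near, far = [], []
    for i, (_, sources) in enumerate(ops):
        near_i, far_i = set(), set()
        for p in sources:
            producer_is_short = ops[p][0]
            if i - p <= n and producer_is_short:
                near_i.add(p)   # tracked in the first dependency array
            else:
                far_i.add(p)    # tracked in the second dependency array
        near.append(near_i)
        far.append(far_i)
    return near, far


ops = [
    (True, []),       # op 0: short latency
    (False, []),      # op 1: long latency
    (True, [0, 1]),   # op 2 depends on ops 0 and 1
    (True, [0]),      # op 3 depends on op 0, three ops back
]
near, far = split_dependencies(ops, n=2)
```

Op 2's dependency on op 1 lands in the second array because op 1 is not a short-latency op, and op 3's dependency on op 0 lands there because the distance exceeds N.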
Combining write buffer with dynamically adjustable flush metrics

Patent number: 8352685

Abstract: In an embodiment, a combining write buffer is configured to maintain one or more flush metrics to determine when to transmit write operations from buffer entries. The combining write buffer may be configured to dynamically modify the flush metrics in response to activity in the write buffer, modifying the conditions under which write operations are transmitted from the write buffer to the next lower level of memory. For example, in one implementation, the flush metrics may include categorizing write buffer entries as “collapsed.” A collapsed write buffer entry, and the collapsed write operations therein, may include at least one write operation that has overwritten data that was written by a previous write operation in the buffer entry. In another implementation, the combining write buffer may maintain the threshold of buffer fullness as a flush metric and may adjust it over time based on the actual buffer fullness.

Type: Grant

Filed: August 20, 2010

Date of Patent: January 8, 2013

Assignee: Apple Inc.

Inventors: Peter J. Bannon, Andrew J. Beaumont-Smith, Ramesh Gunna, Wei-han Lien, Brian P. Lilly, Jaidev P. Patwardhan, Shih-Chieh R. Wen, Tse-Yu Yeh

Split Scheduler

Publication number: 20120290818

Abstract: In an embodiment, a scheduler implements a first dependency array that tracks dependencies on instruction operations (ops) within a distance N of a given op and which are short execution latency ops. Other dependencies are tracked in a second dependency array. The first dependency array may evaluate quickly, to support back-to-back issuance of short execution latency ops and their dependent ops. The second array may evaluate more slowly than the first dependency array.

Type: Application

Filed: July 25, 2012

Publication date: November 15, 2012

Inventors: Andrew J. Beaumont-Smith, Honkai Tam, Daniel C. Murray, John H. Mylius, Peter J. Bannon, Pradeep Kanapathipillai

Data cache block zero implementation

Patent number: 8301843

Abstract: In one embodiment, a processor comprises a core configured to execute a data cache block write instruction and an interface unit coupled to the core and to an interconnect on which the processor is configured to communicate. The core is configured to transmit a request to the interface unit in response to the data cache block write instruction. If the request is speculative, the interface unit is configured to issue a first transaction on the interconnect. On the other hand, if the request is non-speculative, the interface unit is configured to issue a second transaction on the interconnect. The second transaction is different from the first transaction. For example, the second transaction may be an invalidate transaction and the first transaction may be a probe transaction. In some embodiments, the processor may be in a system including the interconnect and one or more caching agents.

Type: Grant

Filed: December 30, 2009

Date of Patent: October 30, 2012

Assignee: Apple Inc.

Inventors: Ramesh Gunna, Sudarshan Kadambi, Peter J. Bannon

Fused store exclusive/memory barrier operation

Patent number: 8285937

Abstract: In an embodiment, a processor may be configured to detect a store exclusive operation followed by a memory barrier operation in a speculative instruction stream being executed by the processor. The processor may fuse the store exclusive operation and the memory barrier operation, creating a fused operation. The fused operation may be transmitted and globally ordered, and the processor may complete both the store exclusive operation and the memory barrier operation in response to the fused operation. As the fused operation progresses through the processor and one or more other components (e.g. caches in the cache hierarchy) to the ordering point in the system, the fused operation may push previous memory operations to effect the memory barrier operation. In some embodiments, the latency for completing the store exclusive operation and the subsequent data memory barrier operation may be reduced if the store exclusive operation is successful at the ordering point.

Type: Grant

Filed: February 24, 2010

Date of Patent: October 9, 2012

Assignee: Apple Inc.

Inventors: Peter J. Bannon, Po-Yung Chang

Processor employing split scheduler in which near, low latency operation dependencies are tracked separate from other operation dependencies

Patent number: 8255671

Abstract: In an embodiment, a scheduler implements a first dependency array that tracks dependencies on instruction operations (ops) within a distance N of a given op and which are short execution latency ops. Other dependencies are tracked in a second dependency array. The first dependency array may evaluate quickly, to support back-to-back issuance of short execution latency ops and their dependent ops. The second array may evaluate more slowly than the first dependency array.

Type: Grant

Filed: December 18, 2008

Date of Patent: August 28, 2012

Assignee: Apple Inc.

Inventors: Andrew J. Beaumont-Smith, Honkai Tam, Daniel C. Murray, John H. Mylius, Peter J. Bannon, Pradeep Kanapathipillai

Combining Write Buffer with Dynamically Adjustable Flush Metrics

Publication number: 20120047332

Abstract: In an embodiment, a combining write buffer is configured to maintain one or more flush metrics to determine when to transmit write operations from buffer entries. The combining write buffer may be configured to dynamically modify the flush metrics in response to activity in the write buffer, modifying the conditions under which write operations are transmitted from the write buffer to the next lower level of memory. For example, in one implementation, the flush metrics may include categorizing write buffer entries as “collapsed.” A collapsed write buffer entry, and the collapsed write operations therein, may include at least one write operation that has overwritten data that was written by a previous write operation in the buffer entry. In another implementation, the combining write buffer may maintain the threshold of buffer fullness as a flush metric and may adjust it over time based on the actual buffer fullness.

Type: Application

Filed: August 20, 2010

Publication date: February 23, 2012

Inventors: Peter J. Bannon, Andrew J. Beaumont-Smith, Ramesh Gunna, Wei-han Lien, Brian P. Lilly, Jaidev P. Patwardhan, Shih-Chieh R. Wen, Tse-Yu Yeh

Fused Store Exclusive/Memory Barrier Operation

Publication number: 20110208915

Abstract: In an embodiment, a processor may be configured to detect a store exclusive operation followed by a memory barrier operation in a speculative instruction stream being executed by the processor. The processor may fuse the store exclusive operation and the memory barrier operation, creating a fused operation. The fused operation may be transmitted and globally ordered, and the processor may complete both the store exclusive operation and the memory barrier operation in response to the fused operation. As the fused operation progresses through the processor and one or more other components (e.g. caches in the cache hierarchy) to the ordering point in the system, the fused operation may push previous memory operations to effect the memory barrier operation. In some embodiments, the latency for completing the store exclusive operation and the subsequent data memory barrier operation may be reduced if the store exclusive operation is successful at the ordering point.

Type: Application

Filed: February 24, 2010

Publication date: August 25, 2011

Inventors: Peter J. Bannon, Po-Yung Chang

Split Scheduler

Publication number: 20100162262

Abstract: In an embodiment, a scheduler implements a first dependency array that tracks dependencies on instruction operations (ops) within a distance N of a given op and which are short execution latency ops. Other dependencies are tracked in a second dependency array. The first dependency array may evaluate quickly, to support back-to-back issuance of short execution latency ops and their dependent ops. The second array may evaluate more slowly than the first dependency array.

Type: Application

Filed: December 18, 2008

Publication date: June 24, 2010

Inventors: Andrew J. Beaumont-Smith, Honkai Tam, Daniel C. Murray, John H. Mylius, Peter J. Bannon, Pradeep Kanapathipillai

Data Cache Block Zero Implementation

Publication number: 20100106916

Abstract: In one embodiment, a processor comprises a core configured to execute a data cache block write instruction and an interface unit coupled to the core and to an interconnect on which the processor is configured to communicate. The core is configured to transmit a request to the interface unit in response to the data cache block write instruction. If the request is speculative, the interface unit is configured to issue a first transaction on the interconnect. On the other hand, if the request is non-speculative, the interface unit is configured to issue a second transaction on the interconnect. The second transaction is different from the first transaction. For example, the second transaction may be an invalidate transaction and the first transaction may be a probe transaction. In some embodiments, the processor may be in a system including the interconnect and one or more caching agents.

Type: Application

Filed: December 30, 2009

Publication date: April 29, 2010

Inventors: Ramesh Gunna, Sudarshan Kadambi, Peter J. Bannon

Data cache block zero implementation

Patent number: 7707361

Abstract: In one embodiment, a processor comprises a core configured to execute a data cache block write instruction and an interface unit coupled to the core and to an interconnect on which the processor is configured to communicate. The core is configured to transmit a request to the interface unit in response to the data cache block write instruction. If the request is speculative, the interface unit is configured to issue a first transaction on the interconnect. On the other hand, if the request is non-speculative, the interface unit is configured to issue a second transaction on the interconnect. The second transaction is different from the first transaction. For example, the second transaction may be an invalidate transaction and the first transaction may be a probe transaction. In some embodiments, the processor may be in a system including the interconnect and one or more caching agents.

Type: Grant

Filed: November 17, 2005

Date of Patent: April 27, 2010

Assignee: Apple Inc.

Inventors: Ramesh Gunna, Sudarshan Kadambi, Peter J. Bannon

Fault containment and error recovery in a scalable multiprocessor

Patent number: 7152191

Abstract: A multi-processor computer system permits various types of partitions to be implemented to contain and isolate hardware failures. The various types of partitions include hard, semi-hard, firm, and soft partitions. Each partition can include one or more processors. Upon detecting a failure associated with a processor, the connection to adjacent processors in the system can be severed, thereby precluding corrupted data from contaminating the rest of the system. If an inter-processor connection is severed, message traffic in the system can become congested as messages become backed up in other processors. Accordingly, each processor includes various timers to monitor for traffic congestion that may be due to a severed connection. Rather than letting the processor continue to wait to be able to transmit its messages, the timers will expire at preprogrammed time periods and the processor will take appropriate action, such as simply dropping queued messages, to keep the system from locking up.

Type: Grant

Filed: October 23, 2003

Date of Patent: December 19, 2006

Assignee: Hewlett-Packard Development Company, L.P.

Inventors: Richard E. Kessler, Peter J. Bannon, Kourosh Gharachorloo, Thukalan V. Verghese

Mechanism for synchronizing multiple skewed source-synchronous data channels with automatic initialization feature

Patent number: 7024533

Abstract: A computer system has a memory controller that includes read buffers coupled to a plurality of memory channels. The memory controller advantageously eliminates the inter-channel skew caused by memory modules being located at different distances from the memory controller. The memory controller preferably includes a channel interface and synchronization logic circuit for each memory channel. This circuit includes read and write buffers and load and unload pointers for the read buffer. Unload pointer logic generates the unload pointer and load pointer logic generates the load pointer. The pointers preferably are free-running pointers that increment in accordance with two different clock signals. The load pointer increments in accordance with a clock generated by the memory controller but that has been routed out to and back from the memory modules.

Type: Grant

Filed: May 20, 2003

Date of Patent: April 4, 2006

Assignee: Hewlett-Packard Development Company, L.P.

Inventors: Richard E. Kessler, Peter J. Bannon, Maurice B. Steinman, Scott E. Breach, Allen J. Baum, Gregg A. Bouchard
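The load-pointer and unload-pointer arrangement in the abstract above can be approximated in software as one circular read buffer per channel. In the Python sketch below each channel writes at its own load pointer as data arrives, and a single unload pointer reads the same slot from every channel once the slowest channel has delivered its first word. This is a simplified, cycle-level model: the function name `deskew`, the per-channel `delays`, and the buffer `depth` are assumptions, and the free-running clocks and automatic initialization of the patent are not modeled.

```python
def deskew(channels, delays, depth=8):
    """Align words from channels that arrive with different delays.

    channels: list of equal-length word lists, one per memory channel.
    delays:   arrival delay of each channel, in controller cycles.
    depth:    slots per circular read buffer (must exceed the skew).
    """
    n = len(channels[0])
    max_delay = max(delays)
    assert max_delay - min(delays) < depth, "buffer too shallow for this skew"
    buffers = [[None] * depth for _ in channels]
    load = [0] * len(channels)   # one load pointer per channel
    unload = 0                   # single unload pointer shared by all channels
    aligned = []
    for cycle in range(n + max_delay):
        # Each channel loads when its (delayed) word arrives.
        for c, (words, d) in enumerate(zip(channels, delays)):
            k = cycle - d
            if 0 <= k < n:
                buffers[c][load[c] % depth] = words[k]
                load[c] += 1
        # Unloading starts once the slowest channel has loaded word 0.
        if cycle >= max_delay and unload < n:
            aligned.append(tuple(buf[unload % depth] for buf in buffers))
            unload += 1
    return aligned


aligned = deskew([[1, 2, 3], [10, 20, 30]], delays=[0, 2])
```

Even though the second channel lags the first by two cycles, each unloaded tuple pairs words that were sent together.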
