Patents by Inventor Ian D. Kountanis
Ian D. Kountanis has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20190286218
Abstract: A processor includes a mechanism for disabling a memory array of a branch prediction unit. The processor may include a next fetch prediction unit that may include a number of entries. Each entry may correspond to a next instruction fetch group and may store an indication of whether or not the corresponding next fetch group includes a conditional branch instruction. In response to an indication that the next fetch group does not include a conditional branch instruction, the fetch prediction unit may be configured to disable, in a next instruction execution cycle, the memory array of the branch prediction unit.
Type: Application
Filed: March 25, 2019
Publication date: September 19, 2019
Inventors: Conrado Blasco, Ronald P. Hall, Ramesh B. Gunna, Ian D. Kountanis, Shyam Sundar, André Seznec
-
Patent number: 10346309
Abstract: In an embodiment, a prefetch circuit may implement prefetch “boosting” to reduce the cost of cold (compulsory) misses and thus potentially improve performance. When a demand miss occurs, the prefetch circuit may generate one or more prefetch requests. The prefetch circuit may monitor the progress of the demand miss (and optionally the previously-generated prefetch requests as well) through the cache hierarchy to memory. At various progress points, if the demand miss remains a miss, additional prefetch requests may be launched. For example, if the demand miss accesses a lower level cache and misses, additional prefetch requests may be launched because the latency avoided in prefetching the additional cache blocks is higher, which may override the potential that the additional cache blocks are incorrectly prefetched.
Type: Grant
Filed: April 26, 2017
Date of Patent: July 9, 2019
Assignee: Apple Inc.
Inventors: James R. Hakewill, Ian D. Kountanis, Douglas C. Holman
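The boosting idea in this abstract can be illustrated with a small model. This is only a sketch under stated assumptions: the class name `BoostingPrefetcher`, the `initial` and `boost` counts, and the choice of prefetching the next sequential blocks are all hypothetical and are not taken from the patent.

```python
class BoostingPrefetcher:
    """Toy model: launch more prefetches each time the demand miss misses again."""

    def __init__(self, initial=1, boost=2):
        self.initial = initial   # prefetches launched on the first miss (assumed value)
        self.boost = boost       # extra prefetches per later progress point (assumed value)
        self.next_block = None   # next block number that has not been prefetched yet

    def on_demand_miss(self, block):
        # A demand miss occurred: start prefetching just past the missing block.
        self.next_block = block + 1
        return self._issue(self.initial)

    def on_progress_miss(self):
        # The demand miss also missed in a lower-level cache: boost.
        return self._issue(self.boost)

    def _issue(self, count):
        blocks = list(range(self.next_block, self.next_block + count))
        self.next_block += count
        return blocks


prefetcher = BoostingPrefetcher()
first = prefetcher.on_demand_miss(100)   # miss in the first cache level
second = prefetcher.on_progress_miss()   # still a miss one level down
```

The deeper the miss travels, the more blocks are requested, reflecting the abstract's point that the latency saved grows with each level missed.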
-
Publication number: 20190196834
Abstract: In an embodiment, an apparatus includes a plurality of memories configured to store respective data in a plurality of branch prediction entries. Each branch prediction entry corresponds to at least one of a plurality of branch instructions. The apparatus also includes a control circuit configured to store first data associated with a first branch instruction into a corresponding branch prediction entry in at least one memory of the plurality of memories. The control circuit is further configured to select a first memory of the plurality of memories, to disconnect the first memory from a power supply in response to a detection of a first power mode signal, and to cease storing data in the plurality of memories in response to the detection of the first power mode signal.
Type: Application
Filed: March 4, 2019
Publication date: June 27, 2019
Inventors: Conrado Blasco, Brett S. Feero, David Williamson, Ian D. Kountanis, Shih-Chieh Wen
-
Patent number: 10241557
Abstract: A processor includes a mechanism for disabling a memory array of a branch prediction unit. The processor may include a next fetch prediction unit that may include a number of entries. Each entry may correspond to a next instruction fetch group and may store an indication of whether or not the corresponding next fetch group includes a conditional branch instruction. In response to an indication that the next fetch group does not include a conditional branch instruction, the fetch prediction unit may be configured to disable, in a next instruction execution cycle, the memory array of the branch prediction unit.
Type: Grant
Filed: December 12, 2013
Date of Patent: March 26, 2019
Assignee: Apple Inc.
Inventors: Conrado Blasco, Ronald P. Hall, Ramesh B. Gunna, Ian D. Kountanis, Shyam Sundar, André Seznec
-
Patent number: 10223123
Abstract: In an embodiment, an apparatus includes a plurality of memories configured to store respective data in a plurality of branch prediction entries. Each branch prediction entry corresponds to at least one of a plurality of branch instructions. The apparatus also includes a control circuit configured to store first data associated with a first branch instruction into a corresponding branch prediction entry in at least one memory of the plurality of memories. The control circuit is further configured to select a first memory of the plurality of memories, to disconnect the first memory from a power supply in response to a detection of a first power mode signal, and to cease storing data in the plurality of memories in response to the detection of the first power mode signal.
Type: Grant
Filed: April 20, 2016
Date of Patent: March 5, 2019
Assignee: Apple Inc.
Inventors: Conrado Blasco, Brett S. Feero, David Williamson, Ian D. Kountanis, Shih-Chieh Wen
-
Patent number: 10175982
Abstract: A method and system for storing branch information is disclosed. First data may be stored in a first entry of a first table in response to a determination that a fetched instruction is a branch instruction. Second data that is dependent upon at least one previously taken branch may be stored in a second entry in a second table in response to a determination that a branch associated with the instruction is predicted to be taken. The first data may be updated to include an index to the second data in response to the determination that the branch is predicted to be taken.
Type: Grant
Filed: June 16, 2016
Date of Patent: January 8, 2019
Assignee: Apple Inc.
Inventors: Conrado Blasco, Ian D. Kountanis
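The two-table arrangement described in this abstract might be pictured roughly as follows. This is a hedged sketch only: the names `first_table`, `second_table`, and `record_branch`, the keying by program counter, and the dictionary layout are hypothetical illustrations, not details from the patent.

```python
first_table = {}    # one entry per fetched branch, keyed by PC (assumed keying)
second_table = []   # entries that depend on previously taken branches


def record_branch(pc, predicted_taken, taken_history):
    """Store first data for a branch; add history-dependent second data if predicted taken."""
    entry = {"pc": pc, "second_index": None}
    first_table[pc] = entry
    if predicted_taken:
        # Second data depends on at least one previously taken branch.
        second_table.append({"pc": pc, "history": tuple(taken_history)})
        # Update the first data to include an index to the second data.
        entry["second_index"] = len(second_table) - 1
    return entry


not_taken = record_branch(0x40, predicted_taken=False, taken_history=[0x10])
taken = record_branch(0x80, predicted_taken=True, taken_history=[0x10, 0x20])
```

Only branches predicted taken consume space in the second table, and the index stored in the first table links the two records.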
-
Patent number: 9753733
Abstract: Methods, apparatuses, and processors for packing multiple iterations of a loop in a loop buffer. A loop candidate that meets the criteria for buffering is detected in the instruction stream being executed by a processor. When the loop is being written to the loop buffer and the end of the loop is detected, another iteration of the loop is written to the loop buffer if the loop buffer is not yet halfway full. In this way, short loops are written to the loop buffer multiple times to maximize the instruction operations per cycle throughput out of the loop buffer when the processor is in loop buffer mode.
Type: Grant
Filed: June 15, 2012
Date of Patent: September 5, 2017
Assignee: Apple Inc.
Inventors: Conrado Blasco-Allue, Ian D. Kountanis
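The halfway-full rule from this abstract can be sketched in a few lines. The function name `pack_loop` and the representation of the loop buffer as a Python list are assumptions made for illustration; real hardware would track entries rather than build a list.

```python
def pack_loop(loop_ops, buffer_size):
    """Write whole loop iterations until the buffer is at least half full."""
    if not loop_ops:
        raise ValueError("loop must contain at least one operation")
    if len(loop_ops) > buffer_size:
        raise ValueError("loop does not fit in the loop buffer")
    buffer = list(loop_ops)  # the first iteration is always written
    # At the end of each iteration, write another one if the buffer is not
    # yet halfway full. The next copy always fits, because one iteration is
    # no longer than the current contents, which are under half the buffer.
    while len(buffer) < buffer_size / 2:
        buffer.extend(loop_ops)
    return buffer


short_loop = pack_loop(["a", "b", "c"], buffer_size=16)
long_loop = pack_loop(list(range(10)), buffer_size=16)
```

With a 16-entry buffer, the three-operation loop is written three times (nine entries), while the ten-operation loop is written once because it already passes the halfway mark.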
-
Patent number: 9632791
Abstract: Techniques are disclosed relating to a cache for patterns of instructions. In some embodiments, an apparatus includes an instruction cache and is configured to detect a pattern of execution of instructions by an instruction processing pipeline. The pattern of execution may involve execution of only instructions in a particular group of instructions. The instructions may include multiple backward control transfers and/or a control transfer instruction that is taken in one iteration of the pattern and not taken in another iteration of the pattern. The apparatus may be configured to store the instructions in the instruction cache and fetch and execute the instructions from the instruction cache. The apparatus may include a branch predictor dedicated to predicting the direction of control transfer instructions for the instruction cache. Various embodiments may reduce power consumption associated with instruction processing.
Type: Grant
Filed: January 21, 2014
Date of Patent: April 25, 2017
Assignee: Apple Inc.
Inventors: Muawya M. Al-Otoom, Ian D. Kountanis, Ronald P. Hall, Michael L. Karm
-
Patent number: 9626185
Abstract: Various techniques for processing and pre-decoding branches within an IT instruction block. Instructions are fetched and cached in an instruction cache, and pre-decode bits are generated to indicate the presence of an IT instruction and the likely boundaries of the IT instruction block. If an unconditional branch is detected within the likely boundaries of an IT instruction block, the unconditional branch is treated as if it were a conditional branch. The unconditional branch is sent to the branch direction predictor and the predictor generates a branch direction prediction for the unconditional branch.
Type: Grant
Filed: February 22, 2013
Date of Patent: April 18, 2017
Assignee: Apple Inc.
Inventors: Shyam Sundar, Ian D. Kountanis, Conrado Blasco-Allue, Gerard R. Williams, III, Wei-Han Lien, Ramesh B. Gunna
-
Patent number: 9557999
Abstract: Methods, apparatuses, and processors for tracking loop candidates in an instruction stream. A load buffer control unit detects a backwards taken branch and starts tracking the loop candidate. The control unit tracks taken branches of the loop candidate, and keeps track of the distance to each taken branch from the start of the loop. If the distance to each taken branch stays the same over multiple iterations of the loop, then the loop is stored in a loop buffer. The loop is then dispatched from the loop buffer, and the front-end of the processor is powered down until the loop terminates.
Type: Grant
Filed: June 15, 2012
Date of Patent: January 31, 2017
Assignee: Apple Inc.
Inventors: Conrado Blasco-Allue, Ian D. Kountanis
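The stability check in this abstract, where the distance to each taken branch must stay the same across iterations, could be modeled as below. The function name `is_stable_loop` and the `required` count of three matching iterations are hypothetical; the abstract only says "multiple iterations".

```python
def is_stable_loop(iterations, required=3):
    """Decide whether a loop candidate qualifies for the loop buffer.

    `iterations` is a list of tuples. Each tuple holds the distances, counted
    from the start of the loop, to every taken branch seen in one iteration.
    """
    if len(iterations) < required:
        return False
    recent = iterations[-required:]
    # Every tracked iteration must show the same taken-branch distances.
    return all(distances == recent[0] for distances in recent)


stable = is_stable_loop([(4, 9), (4, 9), (4, 9)])
unstable = is_stable_loop([(4, 9), (4, 7), (4, 9)])
too_few = is_stable_loop([(4, 9), (4, 9)])
```

A candidate whose internal branches move around between iterations is rejected, since the buffered copy would not match later iterations.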
-
Publication number: 20170024205
Abstract: Systems, apparatuses, and methods for implementing a non-shifting reservation station. A dispatch unit may write an operation into any entry of a reservation station. The reservation station may include an age matrix for determining the relative ages of the operations stored in the entries of the reservation station. The reservation station may include selection logic which is configured to pick the oldest ready operation from the reservation station based on the values stored in the age matrix. The selection logic may utilize control logic to mask off columns of an age matrix corresponding to non-ready operations so as to determine which operation is the oldest ready operation in the reservation station. Also, the reservation station may be configured to dequeue operations early when these operations do not have load dependency.
Type: Application
Filed: July 24, 2015
Publication date: January 26, 2017
Inventors: Ian D. Kountanis, Mahesh K. Reddy
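An age matrix of the kind this abstract mentions can be sketched in software. The class below is an illustrative model under assumptions: the name `ReservationStation`, the bit convention (`older[i][j]` set when entry `j` is older than entry `i`), and the method names are hypothetical and may differ from the claimed design.

```python
class ReservationStation:
    """Non-shifting reservation station with an age matrix (toy model)."""

    def __init__(self, size):
        self.size = size
        self.valid = [False] * size
        self.ready = [False] * size
        # older[i][j] is True when entry j holds an operation older than entry i.
        self.older = [[False] * size for _ in range(size)]

    def dispatch(self, index, ready=False):
        """Write a new (youngest) operation into any free entry."""
        if self.valid[index]:
            raise ValueError("entry is occupied")
        for j in range(self.size):
            self.older[index][j] = self.valid[j]  # every valid entry is older
            self.older[j][index] = False          # the new entry is older than none
        self.valid[index] = True
        self.ready[index] = ready

    def pick_oldest_ready(self):
        """Return the index of the oldest ready operation, or None."""
        for i in range(self.size):
            if not (self.valid[i] and self.ready[i]):
                continue
            # Mask off columns of non-ready entries; if no bit survives in
            # row i, no ready operation is older than entry i.
            if not any(self.older[i][j] and self.valid[j] and self.ready[j]
                       for j in range(self.size)):
                return i
        return None

    def issue(self, index):
        """Free an entry after its operation has been picked."""
        self.valid[index] = False
        self.ready[index] = False


rs = ReservationStation(4)
rs.dispatch(2)               # oldest operation, not ready yet
rs.dispatch(0, ready=True)
rs.dispatch(3, ready=True)
first_pick = rs.pick_oldest_ready()
rs.ready[2] = True           # the oldest operation becomes ready
second_pick = rs.pick_oldest_ready()
```

Because age lives in the matrix rather than in entry position, operations never need to shift when an older one is removed.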
-
Patent number: 9524011
Abstract: Techniques are disclosed relating to power reduction during execution of instruction loops. Multiple different power saving modes may be used by a processor, such as a first power saving mode after only a few loop iterations (e.g., 2-3) and a second, deeper power saving mode after a greater number of loop iterations. The first power saving mode may include keeping a branch predictor and/or other structures active, but the second power saving mode may include reducing power to the branch predictor and/or other structures. An observation mode and an instruction capture mode may also be used by a processor prior to entering a power saving mode for loop execution. Power saving modes may also be achieved during execution of complex loops having multiple backward branches (e.g., nested loops).
Type: Grant
Filed: April 11, 2014
Date of Patent: December 20, 2016
Assignee: Apple Inc.
Inventors: Ronald P. Hall, Michael L. Karm, Ian D. Kountanis, David J. Williamson
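The tiered power saving described in this abstract amounts to a mapping from iteration count to mode. In the sketch below, the function name `loop_power_mode`, the mode labels, and the `deep_after` threshold of 16 are hypothetical; only the "2-3 iterations" figure for the first mode comes from the abstract.

```python
def loop_power_mode(iterations, shallow_after=2, deep_after=16):
    """Pick a power mode from the number of loop iterations observed so far."""
    if iterations >= deep_after:
        return "deep"      # power to the branch predictor is reduced
    if iterations >= shallow_after:
        return "shallow"   # branch predictor and other structures stay active
    return "normal"


modes = [loop_power_mode(n) for n in (1, 3, 40)]
```

Entering the shallow mode early captures savings on short loops, while the deeper mode is reserved for loops that have run long enough to justify powering down more structures.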
-
Patent number: 9471322
Abstract: Systems, processors, and methods for determining when to enter loop buffer mode early for loops in an instruction stream. A processor waits until a branch history register has saturated before entering loop buffer mode for a loop if the processor has not yet determined the loop has an unpredictable exit. However, if the loop has an unpredictable exit, then the loop is allowed to enter loop buffer mode early. While in loop buffer mode, the loop is dispatched from a loop buffer, and the front-end of the processor is powered down until the loop terminates.
Type: Grant
Filed: February 12, 2014
Date of Patent: October 18, 2016
Assignee: Apple Inc.
Inventors: Conrado Blasco, Ian D. Kountanis
-
Publication number: 20160048395
Abstract: In an embodiment, a processor may be configured to fetch N instruction bytes from an instruction cache (a “fetch group”), even if the fetch group crosses a cache line boundary. A branch predictor may be configured to produce branch predictions for up to M branches in the fetch group, where M is a maximum number of branches that may be included in the fetch group. In an embodiment, a branch direction predictor may be updated responsive to a misprediction and also responsive to the branch prediction being within a threshold of transitioning between predictions. To avoid a lookup to determine if the threshold update is to be performed, the branch predictor may detect the threshold update during prediction, and may transmit an indication with the branch.
Type: Application
Filed: October 27, 2015
Publication date: February 18, 2016
Inventors: Ian D. Kountanis, Gerard R. Williams, III, James B. Keller
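The threshold update in this abstract can be illustrated with a 2-bit saturating counter. That counter scheme, the function names, and the definition of "near the threshold" as the two weak states are assumptions for this sketch; the abstract does not say which predictor structure is used.

```python
def predict(counter):
    """Return (predicted_taken, near_threshold) for a 2-bit counter in 0..3."""
    predicted_taken = counter >= 2
    # States 1 and 2 sit next to the taken/not-taken transition.
    near_threshold = counter in (1, 2)
    return predicted_taken, near_threshold


def needs_update(predicted_taken, near_threshold, actual_taken):
    # The near_threshold flag travels with the branch, so no second
    # predictor lookup is needed when the branch resolves.
    return predicted_taken != actual_taken or near_threshold


def update(counter, actual_taken):
    return min(3, counter + 1) if actual_taken else max(0, counter - 1)


counter = 2
taken, near = predict(counter)                 # weakly taken
if needs_update(taken, near, actual_taken=True):
    counter = update(counter, actual_taken=True)
strong_taken, strong_near = predict(counter)   # now strongly taken
skip_update = not needs_update(strong_taken, strong_near, actual_taken=True)
```

A correct prediction from a weak state still triggers training, while a correct prediction from a strong state does not.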
-
Patent number: 9201658
Abstract: In an embodiment, a processor may be configured to fetch N instruction bytes from an instruction cache (a “fetch group”), even if the fetch group crosses a cache line boundary. A branch predictor may be configured to produce branch predictions for up to M branches in the fetch group, where M is a maximum number of branches that may be included in the fetch group. In an embodiment, branch prediction values from multiple entries in each table may be read and respective branch prediction values may be combined to form branch predictions for up to M branches in the fetch group.
Type: Grant
Filed: September 24, 2012
Date of Patent: December 1, 2015
Assignee: Apple Inc.
Inventors: Ian D. Kountanis, Gerard R. Williams, III, James B. Keller
-
Publication number: 20150293577
Abstract: Techniques are disclosed relating to power reduction during execution of instruction loops. Multiple different power saving modes may be used by a processor, such as a first power saving mode after only a few loop iterations (e.g., 2-3) and a second, deeper power saving mode after a greater number of loop iterations. The first power saving mode may include keeping a branch predictor and/or other structures active, but the second power saving mode may include reducing power to the branch predictor and/or other structures. An observation mode and an instruction capture mode may also be used by a processor prior to entering a power saving mode for loop execution. Power saving modes may also be achieved during execution of complex loops having multiple backward branches (e.g., nested loops).
Type: Application
Filed: April 11, 2014
Publication date: October 15, 2015
Applicant: Apple Inc.
Inventors: Ronald P. Hall, Michael L. Karm, Ian D. Kountanis, David J. Williamson
-
Publication number: 20150227374
Abstract: Systems, processors, and methods for determining when to enter loop buffer mode early for loops in an instruction stream. A processor waits until a branch history register has saturated before entering loop buffer mode for a loop if the processor has not yet determined the loop has an unpredictable exit. However, if the loop has an unpredictable exit, then the loop is allowed to enter loop buffer mode early. While in loop buffer mode, the loop is dispatched from a loop buffer, and the front-end of the processor is powered down until the loop terminates.
Type: Application
Filed: February 12, 2014
Publication date: August 13, 2015
Applicant: Apple Inc.
Inventors: Conrado Blasco, Ian D. Kountanis
-
Publication number: 20150205725
Abstract: Techniques are disclosed relating to a cache for patterns of instructions. In some embodiments, an apparatus includes an instruction cache and is configured to detect a pattern of execution of instructions by an instruction processing pipeline. The pattern of execution may involve execution of only instructions in a particular group of instructions. The instructions may include multiple backward control transfers and/or a control transfer instruction that is taken in one iteration of the pattern and not taken in another iteration of the pattern. The apparatus may be configured to store the instructions in the instruction cache and fetch and execute the instructions from the instruction cache. The apparatus may include a branch predictor dedicated to predicting the direction of control transfer instructions for the instruction cache. Various embodiments may reduce power consumption associated with instruction processing.
Type: Application
Filed: January 21, 2014
Publication date: July 23, 2015
Applicant: APPLE INC.
Inventors: Muawya M. Al-Otoom, Ian D. Kountanis, Ronald P. Hall, Michael L. Karm
-
Publication number: 20150169041
Abstract: A processor includes a mechanism for disabling a memory array of a branch prediction unit. The processor may include a next fetch prediction unit that may include a number of entries. Each entry may correspond to a next instruction fetch group and may store an indication of whether or not the corresponding next fetch group includes a conditional branch instruction. In response to an indication that the next fetch group does not include a conditional branch instruction, the fetch prediction unit may be configured to disable, in a next instruction execution cycle, the memory array of the branch prediction unit.
Type: Application
Filed: December 12, 2013
Publication date: June 18, 2015
Applicant: Apple Inc.
Inventors: Conrado Blasco, Ronald P. Hall, Ramesh B. Gunna, Ian D. Kountanis, Shyam Sundar, André Seznec
-
Patent number: 8914580
Abstract: In some embodiments, a cache may include a tag array and a data array, as well as circuitry that detects whether accesses to the cache are sequential (e.g., occupying the same cache line). For example, a cache may include a tag array and a data array that stores data, such as multiple bundles of instructions per cache line. During operation, it may be determined that successive cache requests are sequential and do not cross a cache line boundary. Responsively, various cache operations may be inhibited to conserve power. For example, access to the tag array and/or data array, or portions thereof, may be inhibited.
Type: Grant
Filed: August 23, 2010
Date of Patent: December 16, 2014
Assignee: Apple Inc.
Inventors: Rajat Goel, Ian D. Kountanis
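The power-saving check in this abstract can be approximated by comparing the cache line of each request with that of the previous one. This is a simplified sketch: the class name `SequentialAccessFilter`, the 64-byte line size, and the use of a lookup counter as a stand-in for tag-array activity are all assumptions.

```python
LINE_BYTES = 64  # assumed cache line size


class SequentialAccessFilter:
    """Skip the tag lookup when a request stays in the previous request's line."""

    def __init__(self):
        self.last_line = None
        self.tag_lookups = 0  # counts tag-array accesses that were not inhibited

    def access(self, addr):
        line = addr // LINE_BYTES
        inhibited = line == self.last_line
        if not inhibited:
            self.tag_lookups += 1  # new line: the tag array must be read
        self.last_line = line
        return inhibited


cache_filter = SequentialAccessFilter()
results = [cache_filter.access(addr) for addr in (0, 16, 32, 48, 64)]
```

Five sequential 16-byte fetches need only two tag lookups in this model, one for each cache line touched.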