Patents by Inventor Ian D. Kountanis

Ian D. Kountanis has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20190286218
    Abstract: A processor includes a mechanism for disabling a memory array of a branch prediction unit. The processor may include a next fetch prediction unit that may include a number of entries. Each entry may correspond to a next instruction fetch group and may store an indication of whether or not the corresponding next fetch group includes a conditional branch instruction. In response to an indication that the next fetch group does not include a conditional branch instruction, the fetch prediction unit may be configured to disable, in a next instruction execution cycle, the memory array of the branch prediction unit.
    Type: Application
    Filed: March 25, 2019
    Publication date: September 19, 2019
    Inventors: Conrado Blasco, Ronald P. Hall, Ramesh B. Gunna, Ian D. Kountanis, Shyam Sundar, André Seznec
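
The next-fetch prediction mechanism described in the entry above can be illustrated with a small software model. This is a minimal sketch, not the patented design: the NextFetchPredictor class, the 64-entry direct-mapped table, and the 32-byte fetch-group size are all assumptions made for illustration.

```cpp
#include <array>
#include <cstdint>
#include <cstdio>

// Hypothetical next-fetch-prediction entry: one per fetch group.
struct NextFetchEntry {
    uint64_t fetch_group_tag = 0;        // identifies the next fetch group
    bool has_conditional_branch = false; // set at fill time by pre-decode
    bool valid = false;
};

// Illustrative next-fetch predictor with a small direct-mapped table.
class NextFetchPredictor {
public:
    // Returns true if the branch-prediction memory array should be enabled
    // for the next cycle, i.e. only when the predicted next fetch group is
    // known (or not yet known) to contain a conditional branch.
    bool branch_array_enable(uint64_t next_fetch_pc) const {
        const NextFetchEntry& e = table_[index(next_fetch_pc)];
        if (e.valid && e.fetch_group_tag == tag(next_fetch_pc))
            return e.has_conditional_branch;  // gate the array if none present
        return true;  // no information: keep the array powered to be safe
    }

    void train(uint64_t fetch_pc, bool saw_conditional_branch) {
        NextFetchEntry& e = table_[index(fetch_pc)];
        e.fetch_group_tag = tag(fetch_pc);
        e.has_conditional_branch = saw_conditional_branch;
        e.valid = true;
    }

private:
    static constexpr size_t kEntries = 64;  // assumed table size
    size_t index(uint64_t pc) const { return (pc >> 5) % kEntries; }  // 32-byte fetch groups assumed
    uint64_t tag(uint64_t pc) const { return pc >> 5; }
    std::array<NextFetchEntry, kEntries> table_;
};

int main() {
    NextFetchPredictor nfp;
    nfp.train(0x1000, /*saw_conditional_branch=*/false);
    nfp.train(0x1020, /*saw_conditional_branch=*/true);
    std::printf("enable @0x1000: %d\n", nfp.branch_array_enable(0x1000)); // 0: array gated
    std::printf("enable @0x1020: %d\n", nfp.branch_array_enable(0x1020)); // 1: array clocked
}
```

The key point is the conservative default: when the predictor has no information about the next fetch group, the branch-prediction array stays enabled, so power is only saved when an entry explicitly records that no conditional branch is present.
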
  • Patent number: 10346309
    Abstract: In an embodiment, a prefetch circuit may implement prefetch “boosting” to reduce the cost of cold (compulsory) misses and thus potentially improve performance. When a demand miss occurs, the prefetch circuit may generate one or more prefetch requests. The prefetch circuit may monitor the progress of the demand miss (and optionally the previously-generated prefetch requests as well) through the cache hierarchy to memory. At various progress points, if the demand miss remains a miss, additional prefetch requests may be launched. For example, if the demand miss accesses a lower level cache and misses, additional prefetch requests may be launched because the latency avoided by prefetching the additional cache blocks is higher, which may outweigh the potential that the additional cache blocks are incorrectly prefetched.
    Type: Grant
    Filed: April 26, 2017
    Date of Patent: July 9, 2019
    Assignee: Apple Inc.
    Inventors: James R. Hakewill, Ian D. Kountanis, Douglas C. Holman
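
As a rough illustration of the prefetch boosting described above, the sketch below launches more follow-on prefetches the deeper a demand miss has to travel. It is only a sketch under assumed parameters: the cache-level names, the 64-byte block size, the next-line prefetch pattern, and the boost counts per level are not taken from the patent.

```cpp
#include <cstdint>
#include <cstdio>
#include <vector>

// Illustrative prefetch "boosting": the deeper a demand miss travels in the
// cache hierarchy, the more follow-on prefetches are launched, because the
// latency saved by each prefetch grows.
enum class Level { L1, L2, LLC, Memory };

std::vector<uint64_t> boost_prefetches(uint64_t miss_addr, Level missed_at) {
    constexpr uint64_t kBlock = 64;  // assumed cache block size
    int extra = 0;
    switch (missed_at) {
        case Level::L1:     extra = 1; break;  // initial prefetch on the demand miss
        case Level::L2:     extra = 2; break;  // still missing: boost
        case Level::LLC:    extra = 4; break;  // going to memory: boost further
        case Level::Memory: extra = 0; break;
    }
    std::vector<uint64_t> requests;
    for (int i = 1; i <= extra; ++i)
        requests.push_back(miss_addr + i * kBlock);  // next-line style prefetches
    return requests;
}

int main() {
    for (uint64_t a : boost_prefetches(0x4000, Level::LLC))
        std::printf("prefetch 0x%llx\n", static_cast<unsigned long long>(a));
}
```
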
  • Publication number: 20190196834
    Abstract: In an embodiment, an apparatus includes a plurality of memories configured to store respective data in a plurality of branch prediction entries. Each branch prediction entry corresponds to at least one of a plurality of branch instructions. The apparatus also includes a control circuit configured to store first data associated with a first branch instruction into a corresponding branch prediction entry in at least one memory of the plurality of memories. The control circuit is further configured to select a first memory of the plurality of memories, to disconnect the first memory from a power supply in response to a detection of a first power mode signal, and to cease storing data in the plurality of memories in response to the detection of the first power mode signal.
    Type: Application
    Filed: March 4, 2019
    Publication date: June 27, 2019
    Inventors: Conrado Blasco, Brett S. Feero, David Williamson, Ian D. Kountanis, Shih-Chieh Wen
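
A minimal software sketch of the power-gating control described above follows. The four-memory organization, the fewest-live-entries victim policy, and the BranchPredictorPower interface are assumptions for illustration, not the claimed circuit.

```cpp
#include <array>
#include <cstdio>

// Illustrative control logic for a set of branch-prediction memories that
// can be individually power-gated.
struct PredictionMemory {
    bool powered = true;
    int  live_entries = 0;
};

class BranchPredictorPower {
public:
    void on_power_mode(bool low_power) {
        low_power_ = low_power;
        if (low_power_) {
            // Select one memory (here: the one with the fewest live entries,
            // an assumed policy) and disconnect it from the power supply.
            size_t victim = 0;
            for (size_t i = 1; i < mems_.size(); ++i)
                if (mems_[i].live_entries < mems_[victim].live_entries) victim = i;
            mems_[victim].powered = false;
        }
    }

    // New branch-prediction data is only written when not in the low-power mode.
    bool try_store(size_t mem, int /*entry_data*/) {
        if (low_power_ || !mems_[mem].powered) return false;  // cease storing
        ++mems_[mem].live_entries;
        return true;
    }

private:
    std::array<PredictionMemory, 4> mems_;  // assumed: four banked memories
    bool low_power_ = false;
};

int main() {
    BranchPredictorPower bp;
    bp.try_store(0, 1);
    bp.on_power_mode(true);
    std::printf("store accepted in low power? %d\n", bp.try_store(1, 2));  // 0
}
```
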
  • Patent number: 10241557
    Abstract: A processor includes a mechanism for disabling a memory array of a branch prediction unit. The processor may include a next fetch prediction unit that may include a number of entries. Each entry may correspond to a next instruction fetch group and may store an indication of whether or not the corresponding next fetch group includes a conditional branch instruction. In response to an indication that the next fetch group does not include a conditional branch instruction, the fetch prediction unit may be configured to disable, in a next instruction execution cycle, the memory array of the branch prediction unit.
    Type: Grant
    Filed: December 12, 2013
    Date of Patent: March 26, 2019
    Assignee: Apple Inc.
    Inventors: Conrado Blasco, Ronald P. Hall, Ramesh B. Gunna, Ian D. Kountanis, Shyam Sundar, André Seznec
  • Patent number: 10223123
    Abstract: In an embodiment, an apparatus includes a plurality of memories configured to store respective data in a plurality of branch prediction entries. Each branch prediction entry corresponds to at least one of a plurality of branch instructions. The apparatus also includes a control circuit configured to store first data associated with a first branch instruction into a corresponding branch prediction entry in at least one memory of the plurality of memories. The control circuit is further configured to select a first memory of the plurality of memories, to disconnect the first memory from a power supply in response to a detection of a first power mode signal, and to cease storing data in the plurality of memories in response to the detection of the first power mode signal.
    Type: Grant
    Filed: April 20, 2016
    Date of Patent: March 5, 2019
    Assignee: Apple Inc.
    Inventors: Conrado Blasco, Brett S. Feero, David Williamson, Ian D. Kountanis, Shih-Chieh Wen
  • Patent number: 10175982
    Abstract: A method and system for storing branch information is disclosed. First data may be stored in a first entry of a first table in response to a determination that a fetched instruction is a branch instruction. Second data that is dependent upon at least one previously taken branch may be stored in a second entry in a second table in response to a determination that a branch associated with the instruction is predicted to be taken. The first data may be updated to include an index to the second data in response to the determination that the branch is predicted to be taken.
    Type: Grant
    Filed: June 16, 2016
    Date of Patent: January 8, 2019
    Assignee: Apple Inc.
    Inventors: Conrado Blasco, Ian D. Kountanis
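
The two-table arrangement described above can be sketched as follows. The field names (taken_index, target) and the use of growable vectors are illustrative assumptions; real hardware would use fixed-size tables.

```cpp
#include <cstdint>
#include <cstdio>
#include <vector>

// Illustrative two-table branch-info store: every fetched branch gets an
// entry in a first table; only branches predicted taken get an entry in a
// second table, and the first entry records an index pointing at it.
struct BranchEntry {
    uint64_t pc = 0;
    int taken_index = -1;  // -1: no taken-path data recorded
};

struct TakenEntry {
    uint64_t target = 0;   // data dependent on previously taken branches
};

struct BranchInfoStore {
    std::vector<BranchEntry> first;
    std::vector<TakenEntry>  second;

    // Called when a fetched instruction is identified as a branch.
    size_t record_branch(uint64_t pc) {
        first.push_back({pc, -1});
        return first.size() - 1;
    }

    // Called when that branch is predicted taken: allocate taken-path data
    // and link it from the first-table entry.
    void record_taken(size_t first_idx, uint64_t target) {
        second.push_back({target});
        first[first_idx].taken_index = static_cast<int>(second.size() - 1);
    }
};

int main() {
    BranchInfoStore s;
    size_t b0 = s.record_branch(0x1000);   // not-taken branch: first table only
    size_t b1 = s.record_branch(0x1010);
    s.record_taken(b1, 0x2000);            // taken branch: linked second entry
    std::printf("b0 link=%d b1 link=%d\n", s.first[b0].taken_index, s.first[b1].taken_index);
}
```
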
  • Patent number: 9753733
    Abstract: Methods, apparatuses, and processors for packing multiple iterations of a loop in a loop buffer. A loop candidate that meets the criteria for buffering is detected in the instruction stream being executed by a processor. When the loop is being written to the loop buffer and the end of the loop is detected, another iteration of the loop is written to the loop buffer if the loop buffer is not yet halfway full. In this way, short loops are written to the loop buffer multiple times to maximize the instruction operations per cycle throughput out of the loop buffer when the processor is in loop buffer mode.
    Type: Grant
    Filed: June 15, 2012
    Date of Patent: September 5, 2017
    Assignee: Apple Inc.
    Inventors: Conrado Blasco-Allue, Ian D. Kountanis
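
A minimal sketch of the loop-packing rule described above: keep appending whole copies of the loop body while the buffer is less than half full. The 32-entry buffer size and the Instruction type are assumptions for illustration.

```cpp
#include <cstdint>
#include <cstdio>
#include <vector>

// Illustrative loop-buffer packing: when the end of a short loop is reached
// and less than half the buffer is used, another copy of the loop body is
// appended so more instruction operations can issue per cycle out of the
// buffer while the processor is in loop buffer mode.
using Instruction = uint32_t;

std::vector<Instruction> pack_loop(const std::vector<Instruction>& body,
                                   size_t buffer_capacity) {
    std::vector<Instruction> buffer;
    // Keep appending whole iterations while the buffer is not yet half full.
    while (buffer.size() < buffer_capacity / 2 &&
           buffer.size() + body.size() <= buffer_capacity) {
        buffer.insert(buffer.end(), body.begin(), body.end());
    }
    return buffer;
}

int main() {
    std::vector<Instruction> body = {0xA, 0xB, 0xC};     // a 3-op loop body
    std::vector<Instruction> buf = pack_loop(body, 32);  // assumed 32-entry buffer
    std::printf("packed %zu iterations (%zu ops)\n", buf.size() / body.size(), buf.size());
}
```
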
  • Patent number: 9632791
    Abstract: Techniques are disclosed relating to a cache for patterns of instructions. In some embodiments, an apparatus includes an instruction cache and is configured to detect a pattern of execution of instructions by an instruction processing pipeline. The pattern of execution may involve execution of only instructions in a particular group of instructions. The instructions may include multiple backward control transfers and/or a control transfer instruction that is taken in one iteration of the pattern and not taken in another iteration of the pattern. The apparatus may be configured to store the instructions in the instruction cache and fetch and execute the instructions from the instruction cache. The apparatus may include a branch predictor dedicated to predicting the direction of control transfer instructions for the instruction cache. Various embodiments may reduce power consumption associated with instruction processing.
    Type: Grant
    Filed: January 21, 2014
    Date of Patent: April 25, 2017
    Assignee: Apple Inc.
    Inventors: Muawya M. Al-Otoom, Ian D. Kountanis, Ronald P. Hall, Michael L. Karm
  • Patent number: 9626185
    Abstract: Various techniques are disclosed for processing and pre-decoding branches within an IT instruction block. Instructions are fetched and cached in an instruction cache, and pre-decode bits are generated to indicate the presence of an IT instruction and the likely boundaries of the IT instruction block. If an unconditional branch is detected within the likely boundaries of an IT instruction block, the unconditional branch is treated as if it were a conditional branch. The unconditional branch is sent to the branch direction predictor, and the predictor generates a branch direction prediction for the unconditional branch.
    Type: Grant
    Filed: February 22, 2013
    Date of Patent: April 18, 2017
    Assignee: Apple Inc.
    Inventors: Shyam Sundar, Ian D. Kountanis, Conrado Blasco-Allue, Gerard R. Williams, III, Wei-Han Lien, Ramesh B. Gunna
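
The IT-block handling described above can be sketched as a simple pre-decode check. This is only an illustration: the PredecodeInfo fields, the assumption that the covered instructions are 2-byte Thumb encodings, and the boundary arithmetic are not taken from the patent.

```cpp
#include <cstdint>
#include <cstdio>

// Illustrative pre-decode rule: an ARM Thumb IT instruction makes up to the
// next four instructions conditional, so an "unconditional" branch that
// falls inside the likely IT block boundaries is routed to the branch
// direction predictor as if it were conditional.
struct PredecodeInfo {
    bool     is_it_instruction = false;
    uint32_t it_block_length = 0;   // likely number of instructions covered
    uint64_t pc = 0;
};

// Returns true if the branch at branch_pc should be treated as conditional.
bool treat_as_conditional(const PredecodeInfo& it, uint64_t branch_pc,
                          bool branch_is_unconditional) {
    if (!branch_is_unconditional) return true;  // already conditional
    if (!it.is_it_instruction) return false;
    // Covered instructions assumed 2 bytes wide for the boundary estimate.
    uint64_t block_end = it.pc + 2 * (it.it_block_length + 1);
    return branch_pc > it.pc && branch_pc < block_end;  // inside likely IT block
}

int main() {
    PredecodeInfo it{true, 3, 0x1000};
    std::printf("branch at 0x1004 -> conditional? %d\n",
                treat_as_conditional(it, 0x1004, /*branch_is_unconditional=*/true));  // 1
    std::printf("branch at 0x1020 -> conditional? %d\n",
                treat_as_conditional(it, 0x1020, /*branch_is_unconditional=*/true));  // 0
}
```
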
  • Patent number: 9557999
    Abstract: Methods, apparatuses, and processors for tracking loop candidates in an instruction stream. A loop buffer control unit detects a backwards taken branch and starts tracking the loop candidate. The control unit tracks taken branches of the loop candidate and records the distance to each taken branch from the start of the loop. If the distance to each taken branch stays the same over multiple iterations of the loop, then the loop is stored in a loop buffer. The loop is then dispatched from the loop buffer, and the front-end of the processor is powered down until the loop terminates.
    Type: Grant
    Filed: June 15, 2012
    Date of Patent: January 31, 2017
    Assignee: Apple Inc.
    Inventors: Conrado Blasco-Allue, Ian D. Kountanis
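
The loop-candidate tracking described above (distances to taken branches staying constant across iterations) can be modeled roughly as below. The three-iteration stability threshold and the tracker interface are assumptions for the sketch.

```cpp
#include <cstdint>
#include <cstdio>
#include <vector>

// Illustrative loop-candidate tracker: after a backward taken branch starts
// a candidate, record the distance from the loop start to each taken branch.
// If the same sequence of distances repeats for several iterations, the loop
// is considered stable enough to capture in a loop buffer.
class LoopCandidateTracker {
public:
    void start_candidate(uint64_t loop_start_pc) {
        loop_start_ = loop_start_pc;
        current_.clear();
        reference_.clear();
        stable_iterations_ = 0;
    }

    // Called for each taken branch while tracking the candidate.
    void taken_branch(uint64_t branch_pc, bool closes_iteration) {
        current_.push_back(branch_pc - loop_start_);  // distance from loop start
        if (!closes_iteration) return;
        if (reference_.empty())          reference_ = current_;
        else if (current_ == reference_) ++stable_iterations_;
        else { reference_ = current_; stable_iterations_ = 0; }  // shape changed
        current_.clear();
    }

    // True once the distances have stayed the same long enough to buffer.
    bool ready_for_loop_buffer() const { return stable_iterations_ >= 3; }

private:
    uint64_t loop_start_ = 0;
    std::vector<uint64_t> current_, reference_;
    int stable_iterations_ = 0;
};

int main() {
    LoopCandidateTracker t;
    t.start_candidate(0x1000);
    for (int i = 0; i < 5; ++i) {        // same branch shape each iteration
        t.taken_branch(0x1010, false);
        t.taken_branch(0x1040, true);    // backward branch closing the loop
    }
    std::printf("capture loop? %d\n", t.ready_for_loop_buffer());  // 1
}
```
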
  • Publication number: 20170024205
    Abstract: Systems, apparatuses, and methods for implementing a non-shifting reservation station. A dispatch unit may write an operation into any entry of a reservation station. The reservation station may include an age matrix for determining the relative ages of the operations stored in the entries of the reservation station. The reservation station may include selection logic which is configured to pick the oldest ready operation from the reservation station based on the values stored in the age matrix. The selection logic may utilize control logic to mask off columns of the age matrix corresponding to non-ready operations so as to determine which operation is the oldest ready operation in the reservation station. Also, the reservation station may be configured to dequeue operations early when these operations do not have a load dependency.
    Type: Application
    Filed: July 24, 2015
    Publication date: January 26, 2017
    Inventors: Ian D. Kountanis, Mahesh K. Reddy
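
The age-matrix selection described above can be sketched in software. Here older_than[i][j] means entry i is older than entry j, and columns belonging to non-ready entries are effectively masked off so only ready, older entries can block a pick. The 8-entry station size is an assumption, and the matrix encoding is one common convention rather than necessarily the patented one.

```cpp
#include <array>
#include <bitset>
#include <cstdio>

// Illustrative age-matrix pick for a non-shifting reservation station:
// row i has bit j set when entry i is older than entry j. The oldest ready
// entry is the ready entry that no other *ready* entry is older than.
constexpr size_t kEntries = 8;

int pick_oldest_ready(const std::array<std::bitset<kEntries>, kEntries>& older_than,
                      std::bitset<kEntries> ready) {
    for (size_t i = 0; i < kEntries; ++i) {
        if (!ready[i]) continue;
        bool blocked = false;
        for (size_t j = 0; j < kEntries; ++j) {
            // Mask off columns of non-ready entries: only a ready, older
            // entry j can block entry i from being picked.
            if (ready[j] && older_than[j][i]) { blocked = true; break; }
        }
        if (!blocked) return static_cast<int>(i);
    }
    return -1;  // nothing ready
}

int main() {
    std::array<std::bitset<kEntries>, kEntries> older_than{};
    // Allocation order 3, then 5, then 1 (entries may land in any free slot).
    older_than[3][5] = true;
    older_than[3][1] = true;   // entry 3 is oldest
    older_than[5][1] = true;
    std::bitset<kEntries> ready;
    ready[5] = true;
    ready[1] = true;           // entry 3 is not ready yet
    std::printf("picked entry %d\n", pick_oldest_ready(older_than, ready));  // 5
}
```
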
  • Patent number: 9524011
    Abstract: Techniques are disclosed relating to power reduction during execution of instruction loops. Multiple different power saving modes may be used by a processor, such as a first power saving mode after only a few loop iterations (e.g., 2-3) and a second, deeper power saving mode after a greater number of loop iterations. The first power saving mode may include keeping a branch predictor and/or other structures active, but the second power saving mode may include reducing power to the branch predictor and/or other structures. An observation mode and an instruction capture mode may also be used by a processor prior to entering a power saving mode for loop execution. Power saving modes may also be achieved during execution of complex loops having multiple backward branches (e.g., nested loops).
    Type: Grant
    Filed: April 11, 2014
    Date of Patent: December 20, 2016
    Assignee: Apple Inc.
    Inventors: Ronald P. Hall, Michael L. Karm, Ian D. Kountanis, David J. Williamson
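
As a rough model of the staged power saving described above, the sketch below maps an iteration count to a mode. The specific thresholds (3 and 32 iterations) and the mode names are assumptions for illustration only.

```cpp
#include <cstdio>

// Illustrative two-level power-saving state machine for loop execution:
// a light mode after a few iterations (branch predictor stays active) and
// a deeper mode after many more (branch predictor power is reduced).
enum class LoopPowerMode { Observe, Capture, Light, Deep };

LoopPowerMode mode_for_iterations(int iterations, bool loop_captured) {
    if (!loop_captured)  return LoopPowerMode::Observe;  // still watching the stream
    if (iterations < 3)  return LoopPowerMode::Capture;  // instructions being captured
    if (iterations < 32) return LoopPowerMode::Light;    // front end idle, predictor on
    return LoopPowerMode::Deep;                          // predictor power reduced too
}

int main() {
    std::printf("%d %d %d\n",
                static_cast<int>(mode_for_iterations(2, true)),
                static_cast<int>(mode_for_iterations(10, true)),
                static_cast<int>(mode_for_iterations(100, true)));
}
```
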
  • Patent number: 9471322
    Abstract: Systems, processors, and methods for determining when to enter loop buffer mode early for loops in an instruction stream. A processor waits until a branch history register has saturated before entering loop buffer mode for a loop if the processor has not yet determined the loop has an unpredictable exit. However, if the loop has an unpredictable exit, then the loop is allowed to enter loop buffer mode early. While in loop buffer mode, the loop is dispatched from a loop buffer, and the front-end of the processor is powered down until the loop terminates.
    Type: Grant
    Filed: February 12, 2014
    Date of Patent: October 18, 2016
    Assignee: Apple Inc.
    Inventors: Conrado Blasco, Ian D. Kountanis
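
A minimal sketch of the entry condition described above: enter loop buffer mode early only for loops already known to have an unpredictable exit, otherwise wait for the branch history register to saturate. The saturation test and the 32-bit history width are assumptions.

```cpp
#include <cstdio>

// Illustrative gate for entering loop buffer mode: normally the processor
// waits until the branch history register has saturated with the loop's own
// history, but a loop known to have an unpredictable exit is allowed in early.
struct LoopBufferGate {
    int history_bits_from_loop = 0;   // how much of the history register the loop fills
    int history_register_width = 32;  // assumed width
    bool unpredictable_exit = false;  // learned from prior mispredicted exits

    bool may_enter_loop_buffer_mode() const {
        if (unpredictable_exit) return true;  // enter early: prediction will not help anyway
        return history_bits_from_loop >= history_register_width;  // history saturated
    }
};

int main() {
    LoopBufferGate a{12, 32, false}, b{12, 32, true};
    std::printf("normal loop: %d, unpredictable-exit loop: %d\n",
                a.may_enter_loop_buffer_mode(), b.may_enter_loop_buffer_mode());  // 0 1
}
```
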
  • Publication number: 20160048395
    Abstract: In an embodiment, a processor may be configured to fetch N instruction bytes from an instruction cache (a “fetch group”), even if the fetch group crosses a cache line boundary. A branch predictor may be configured to produce branch predictions for up to M branches in the fetch group, where M is a maximum number of branches that may be included in the fetch group. In an embodiment, a branch direction predictor may be updated responsive to a misprediction and also responsive to the branch prediction being within a threshold of transitioning between predictions. To avoid a lookup to determine if the threshold update is to be performed, the branch predictor may detect the threshold update during prediction, and may transmit an indication with the branch.
    Type: Application
    Filed: October 27, 2015
    Publication date: February 18, 2016
    Inventors: Ian D. Kountanis, Gerard R. Williams, III, James B. Keller
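
The threshold-update idea described above can be sketched with a saturating counter: the predictor notes at prediction time whether the counter is one step away from changing direction and sends that flag along with the branch, so the update stage never has to re-read the table just to check. The 2-bit counter and one-step threshold are assumptions for the sketch.

```cpp
#include <cstdint>
#include <cstdio>

// Illustrative direction predictor entry that flags, at prediction time,
// whether its counter is within one step of flipping its prediction.
struct Prediction {
    bool taken;
    bool near_threshold;  // counter would change direction after one update
};

struct DirectionCounter {
    uint8_t value = 1;  // 2-bit counter: 0..1 predict not taken, 2..3 predict taken

    Prediction predict() const {
        bool taken = value >= 2;
        bool near = (value == 1) || (value == 2);  // one step from transitioning
        return {taken, near};
    }

    // Update on a misprediction, or when the prediction-time flag said the
    // counter was near the threshold (to keep it strongly biased).
    void update(const Prediction& p, bool actually_taken) {
        bool mispredicted = (p.taken != actually_taken);
        if (!mispredicted && !p.near_threshold) return;
        if (actually_taken && value < 3) ++value;
        else if (!actually_taken && value > 0) --value;
    }
};

int main() {
    DirectionCounter c;
    Prediction p = c.predict();            // weakly not-taken, near threshold
    c.update(p, /*actually_taken=*/true);  // strengthens toward taken
    std::printf("counter=%d taken=%d near=%d\n",
                static_cast<int>(c.value), p.taken, p.near_threshold);
}
```
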
  • Patent number: 9201658
    Abstract: In an embodiment, a processor may be configured to fetch N instruction bytes from an instruction cache (a “fetch group”), even if the fetch group crosses a cache line boundary. A branch predictor may be configured to produce branch predictions for up to M branches in the fetch group, where M is a maximum number of branches that may be included in the fetch group. In an embodiment, branch prediction values from multiple entries in each table may be read and respective branch prediction values may be combined to form branch predictions for up to M branches in the fetch group.
    Type: Grant
    Filed: September 24, 2012
    Date of Patent: December 1, 2015
    Assignee: Apple Inc.
    Inventors: Ian D. Kountanis, Gerard R. Williams, III, James B. Keller
  • Publication number: 20150293577
    Abstract: Techniques are disclosed relating to power reduction during execution of instruction loops. Multiple different power saving modes may be used by a processor, such as a first power saving mode after only a few loop iterations (e.g., 2-3) and a second, deeper power saving mode after a greater number of loop iterations. The first power saving mode may include keeping a branch predictor and/or other structures active, but the second power saving mode may include reducing power to the branch predictor and/or other structures. An observation mode and an instruction capture mode may also be used by a processor prior to entering a power saving mode for loop execution. Power saving modes may also be achieved during execution of complex loops having multiple backward branches (e.g., nested loops).
    Type: Application
    Filed: April 11, 2014
    Publication date: October 15, 2015
    Applicant: Apple Inc.
    Inventors: Ronald P. Hall, Michael L. Karm, Ian D. Kountanis, David J. Williamson
  • Publication number: 20150227374
    Abstract: Systems, processors, and methods for determining when to enter loop buffer mode early for loops in an instruction stream. A processor waits until a branch history register has saturated before entering loop buffer mode for a loop if the processor has not yet determined the loop has an unpredictable exit. However, if the loop has an unpredictable exit, then the loop is allowed to enter loop buffer mode early. While in loop buffer mode, the loop is dispatched from a loop buffer, and the front-end of the processor is powered down until the loop terminates.
    Type: Application
    Filed: February 12, 2014
    Publication date: August 13, 2015
    Applicant: Apple Inc.
    Inventors: Conrado Blasco, Ian D. Kountanis
  • Publication number: 20150205725
    Abstract: Techniques are disclosed relating to a cache for patterns of instructions. In some embodiments, an apparatus includes an instruction cache and is configured to detect a pattern of execution of instructions by an instruction processing pipeline. The pattern of execution may involve execution of only instructions in a particular group of instructions. The instructions may include multiple backward control transfers and/or a control transfer instruction that is taken in one iteration of the pattern and not taken in another iteration of the pattern. The apparatus may be configured to store the instructions in the instruction cache and fetch and execute the instructions from the instruction cache. The apparatus may include a branch predictor dedicated to predicting the direction of control transfer instructions for the instruction cache. Various embodiments may reduce power consumption associated with instruction processing.
    Type: Application
    Filed: January 21, 2014
    Publication date: July 23, 2015
    Applicant: Apple Inc.
    Inventors: Muawya M. Al-Otoom, Ian D. Kountanis, Ronald P. Hall, Michael L. Karm
  • Publication number: 20150169041
    Abstract: A processor includes a mechanism for disabling a memory array of a branch prediction unit. The processor may include a next fetch prediction unit that may include a number of entries. Each entry may correspond to a next instruction fetch group and may store an indication of whether or not the corresponding next fetch group includes a conditional branch instruction. In response to an indication that the next fetch group does not include a conditional branch instruction, the fetch prediction unit may be configured to disable, in a next instruction execution cycle, the memory array of the branch prediction unit.
    Type: Application
    Filed: December 12, 2013
    Publication date: June 18, 2015
    Applicant: Apple Inc.
    Inventors: Conrado Blasco, Ronald P. Hall, Ramesh B. Gunna, Ian D. Kountanis, Shyam Sundar, André Seznec
  • Patent number: 8914580
    Abstract: In some embodiments, a cache may include a tag array and a data array, as well as circuitry that detects whether accesses to the cache are sequential (e.g., occupying the same cache line). For example, a cache may include a tag array and a data array that stores data, such as multiple bundles of instructions per cache line. During operation, it may be determined that successive cache requests are sequential and do not cross a cache line boundary. Responsively, various cache operations may be inhibited to conserve power. For example, access to the tag array and/or data array, or portions thereof, may be inhibited.
    Type: Grant
    Filed: August 23, 2010
    Date of Patent: December 16, 2014
    Assignee: Apple Inc.
    Inventors: Rajat Goel, Ian D. Kountanis
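
The sequential-access detection described above can be sketched as a small filter in front of the tag lookup: if the next fetch stays within the previously fetched cache line, the tag array (and untouched data banks) can be left idle. The 64-byte line size and the filter interface are assumptions; a real design would also clear the state on invalidations and fills.

```cpp
#include <cstdint>
#include <cstdio>

// Illustrative sequential-fetch filter: a full tag/data lookup is only
// required when the fetch crosses into a new cache line.
class SequentialFetchFilter {
public:
    // Returns true when a full tag/data lookup is required.
    bool needs_lookup(uint64_t fetch_addr) {
        constexpr uint64_t kLineMask = ~uint64_t{63};  // assumed 64-byte lines
        bool same_line = have_last_ && ((fetch_addr & kLineMask) == last_line_);
        last_line_ = fetch_addr & kLineMask;
        have_last_ = true;
        return !same_line;  // sequential within a line: inhibit the arrays
    }

private:
    uint64_t last_line_ = 0;
    bool have_last_ = false;
};

int main() {
    SequentialFetchFilter f;
    std::printf("%d %d %d\n",
                f.needs_lookup(0x1000),   // 1: new line, full lookup
                f.needs_lookup(0x1010),   // 0: same line, arrays gated
                f.needs_lookup(0x1040));  // 1: crossed the line boundary
}
```
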