Patents by Inventor Michael L. Karm
Michael L. Karm has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11928467Abstract: In an embodiment, a processor comprises an atomic predictor circuit to predict whether or not an atomic operation will complete successfully. The prediction may be used when a subsequent load operation to the same memory location as the atomic operation is executed, to determine whether or not to forward store data from the atomic operation to the subsequent load operation. If the prediction is successful, the store data may be forwarded. If the prediction is unsuccessful, the store data may not be forwarded. In cases where an atomic operation has been failing (not successfully performing the store operation), the prediction may prevent the forwarding of the store data and thus may prevent a subsequent flush of the load.Type: GrantFiled: September 13, 2021Date of Patent: March 12, 2024Assignee: Apple Inc.Inventors: Brian R. Mestan, Gideon N. Levinsky, Michael L. Karm
-
Publication number: 20220091846Abstract: In an embodiment, a processor comprises an atomic predictor circuit to predict whether or not an atomic operation will complete successfully. The prediction may be used when a subsequent load operation to the same memory location as the atomic operation is executed, to determine whether or not to forward store data from the atomic operation to the subsequent load operation. If the prediction is successful, the store data may be forwarded. If the prediction is unsuccessful, the store data may not be forwarded. In cases where an atomic operation has been failing (not successfully performing the store operation), the prediction may prevent the forwarding of the store data and thus may prevent a subsequent flush of the load.Type: ApplicationFiled: September 13, 2021Publication date: March 24, 2022Inventors: Brian R. Mestan, Gideon N. Levinsky, Michael L. Karm
-
Patent number: 11256622Abstract: In one embodiment, a processor includes a write combining buffer that includes a memory having a plurality of entries. The entries may be allocated to committed store operations transmitted by a load/store unit in the processor, and subsequent committed store operations may merge data with previous store memory operations in the buffer if the subsequent committed store operations are to addresses that match addresses of the previous committed store operations within a predefined granularity (e.g. the width of a cache port). The write combining buffer may be configured to retain up to N entries of committed store operations, but may also be configured to write one or more of the entries to the data cache responsive to receiving more than a threshold amount of non-merging committed store operations in the write combining buffer.Type: GrantFiled: May 8, 2020Date of Patent: February 22, 2022Assignee: Apple Inc.Inventors: Michael L. Karm, Gideon N. Levinsky
-
Publication number: 20210349823Abstract: In one embodiment, a processor includes a write combining buffer that includes a memory having a plurality of entries. The entries may be allocated to committed store operations transmitted by a load/store unit in the processor, and subsequent committed store operations may merge data with previous store memory operations in the buffer if the subsequent committed store operations are to addresses that match addresses of the previous committed store operations within a predefined granularity (e.g. the width of a cache port). The write combining buffer may be configured to retain up to N entries of committed store operations, but may also be configured to write one or more of the entries to the data cache responsive to receiving more than a threshold amount of non-merging committed store operations in the write combining buffer.Type: ApplicationFiled: May 8, 2020Publication date: November 11, 2021Inventors: Michael L. Karm, Gideon N. Levinsky
-
Patent number: 11119767Abstract: In an embodiment, a processor comprises an atomic predictor circuit to predict whether or not an atomic operation will complete successfully. The prediction may be used when a subsequent load operation to the same memory location as the atomic operation is executed, to determine whether or not to forward store data from the atomic operation to the subsequent load operation. If the prediction is successful, the store data may be forwarded. If the prediction is unsuccessful, the store data may not be forwarded. In cases where an atomic operation has been failing (not successfully performing the store operation), the prediction may prevent the forwarding of the store data and thus may prevent a subsequent flush of the load.Type: GrantFiled: June 19, 2020Date of Patent: September 14, 2021Assignee: Apple Inc.Inventors: Brian R. Mestan, Gideon N. Levinsky, Michael L. Karm
-
Patent number: 9772851Abstract: A method, system, and computer program product for instruction fetching within a processor instruction unit, utilizing a loop buffer, one or more virtual loop buffers, and/or an instruction buffer. During instruction fetch, modified instruction buffers coupled to an instruction cache (I-cache) temporarily store instructions from a single branch, backwards short loop. The modified instruction buffers may be a loop buffer, one or more virtual loop buffers, and/or an instruction buffer. The instruction fetch within the instruction unit of a processor retrieves the instructions for the short loop from the modified buffers during the loop cycles of the single branch, backwards short loop, rather than from the instruction cache.Type: GrantFiled: October 25, 2007Date of Patent: September 26, 2017Assignee: International Business Machines CorporationInventors: Ronald Hall, Michael L. Karm, Brian R. Mestan, David Mui
-
Patent number: 9632791Abstract: Techniques are disclosed relating to a cache for patterns of instructions. In some embodiments, an apparatus includes an instruction cache and is configured to detect a pattern of execution of instructions by an instruction processing pipeline. The pattern of execution may involve execution of only instructions in a particular group of instructions. The instructions may include multiple backward control transfers and/or a control transfer instruction that is taken in one iteration of the pattern and not taken in another iteration of the pattern. The apparatus may be configured to store the instructions in the instruction cache and fetch and execute the instructions from the instruction cache. The apparatus may include a branch predictor dedicated to predicting the direction of control transfer instructions for the instruction cache. Various embodiments may reduce power consumption associated with instruction processing.Type: GrantFiled: January 21, 2014Date of Patent: April 25, 2017Assignee: Apple Inc.Inventors: Muawya M. Al-Otoom, Ian D. Kountanis, Ronald P. Hall, Michael L. Karm
-
Patent number: 9632788Abstract: A method and system for instruction fetching within a processor instruction unit, utilizing a loop buffer, one or more virtual loop buffers, and/or an instruction buffer. During instruction fetch, modified instruction buffers coupled to an instruction cache (I-cache) temporarily store instructions from a single branch, backwards short loop. The modified instruction buffers may be a loop buffer, one or more virtual loop buffers, and/or an instruction buffer. The instruction fetch within the instruction unit of a processor retrieves the instructions for the short loop from the modified buffers during the loop cycles of the single branch, backwards short loop, rather than from the instruction cache.Type: GrantFiled: April 5, 2016Date of Patent: April 25, 2017Assignee: International Business Machines CorporationInventors: Ronald Hall, Michael L. Karm, Brian R. Mestan, David Mui
-
Patent number: 9524011Abstract: Techniques are disclosed relating to power reduction during execution of instruction loops. Multiple different power saving modes may be used by a processor, such as a first power saving mode after only a few loop iterations (e.g., 2-3) and a second, deeper power saving mode after a greater number of loop iterations. The first power saving mode may include keeping a branch predictor and/or other structures active, but the second power saving mode may include reducing power to the branch predictor and/or other structures. An observation mode and an instruction capture mode may also be used by a processor prior to entering a power saving mode for loop execution. Power saving modes may also be achieved during execution of complex loops having multiple backward branches (e.g., nested loops).Type: GrantFiled: April 11, 2014Date of Patent: December 20, 2016Assignee: Apple Inc.Inventors: Ronald P. Hall, Michael L. Karm, Ian D. Kountanis, David J. Williamson
-
Publication number: 20160216970Abstract: A method and system for instruction fetching within a processor instruction unit, utilizing a loop buffer, one or more virtual loop buffers, and/or an instruction buffer. During instruction fetch, modified instruction buffers coupled to an instruction cache (I-cache) temporarily store instructions from a single branch, backwards short loop. The modified instruction buffers may be a loop buffer, one or more virtual loop buffers, and/or an instruction buffer. The instruction fetch within the instruction unit of a processor retrieves the instructions for the short loop from the modified buffers during the loop cycles of the single branch, backwards short loop, rather than from the instruction cache.Type: ApplicationFiled: April 5, 2016Publication date: July 28, 2016Inventors: Ronald Hall, Michael L. Karm, Brian R. Mestan, David Mui
-
Patent number: 9395995Abstract: A method, system, and computer program product for instruction fetching within a processor instruction unit, utilizing a loop buffer, one or more virtual loop buffers, and/or an instruction buffer. During instruction fetch, modified instruction buffers coupled to an instruction cache (I-cache) temporarily store instructions from a single branch, backwards short loop. The modified instruction buffers may be a loop buffer, one or more virtual loop buffers, and/or an instruction buffer. The instruction fetch within the instruction unit of a processor retrieves the instructions for the short loop from the modified buffers during the loop cycles of the single branch, backwards short loop, rather than from the instruction cache.Type: GrantFiled: February 29, 2012Date of Patent: July 19, 2016Assignee: International Business Machines CorporationInventors: Ronald Hall, Michael L Karm, Brian R Mestan, David Mui
-
Publication number: 20150293577Abstract: Techniques are disclosed relating to power reduction during execution of instruction loops. Multiple different power saving modes may be used by a processor, such as a first power saving mode after only a few loop iterations (e.g., 2-3) and a second, deeper power saving mode after a greater number of loop iterations. The first power saving mode may include keeping a branch predictor and/or other structures active, but the second power saving mode may include reducing power to the branch predictor and/or other structures. An observation mode and an instruction capture mode may also be used by a processor prior to entering a power saving mode for loop execution. Power saving modes may also be achieved during execution of complex loops having multiple backward branches (e.g., nested loops).Type: ApplicationFiled: April 11, 2014Publication date: October 15, 2015Applicant: Apple Inc.Inventors: Ronald P. Hall, Michael L. Karm, Ian D. Kountanis, David J. Williamson
-
Publication number: 20150205725Abstract: Techniques are disclosed relating to a cache for patterns of instructions. In some embodiments, an apparatus includes an instruction cache and is configured to detect a pattern of execution of instructions by an instruction processing pipeline. The pattern of execution may involve execution of only instructions in a particular group of instructions. The instructions may include multiple backward control transfers and/or a control transfer instruction that is taken in one iteration of the pattern and not taken in another iteration of the pattern. The apparatus may be configured to store the instructions in the instruction cache and fetch and execute the instructions from the instruction cache. The apparatus may include a branch predictor dedicated to predicting the direction of control transfer instructions for the instruction cache. Various embodiments may reduce power consumption associated with instruction processing.Type: ApplicationFiled: January 21, 2014Publication date: July 23, 2015Applicant: APPLE INC.Inventors: Muawya M. Al-Otoom, Ian D. Kountanis, Ronald P. Hall, Michael L. Karm
-
Patent number: 9052910Abstract: A design structure provides instruction fetching within a processor instruction unit, utilizing a loop buffer, one or more virtual loop buffers, and/or an instruction buffer. During instruction fetch, modified instruction buffers coupled to an instruction cache (I-cache) temporarily store instructions from a single branch, backwards short loop. The modified instruction buffers may be a loop buffer, one or more virtual loop buffers, and/or an instruction buffer. Instructions are stored in the modified instruction buffers for the length of the loop cycle. The instruction fetch within the instruction unit of a processor retrieves the instructions for the short loop from the modified buffers during the loop cycle, rather than from the instruction cache.Type: GrantFiled: June 3, 2008Date of Patent: June 9, 2015Assignee: International Business Machines CorporationInventors: Ronald Hall, Michael L. Karm, Brian R. Mestan, David Mui
-
Publication number: 20120159125Abstract: A method, system and computer program product for instruction fetching within a processor instruction unit, utilizing a loop buffer, one or more virtual loop buffers, and/or an instruction buffer. During instruction fetch, modified instruction buffers coupled to an instruction cache (I-cache) temporarily store instructions from a single branch, backwards short loop. The modified instruction buffers may be a loop buffer, one or more virtual loop buffers, and/or an instruction buffer. Instructions are stored in the modified instruction buffers for the length of the loop cycle. The instruction fetch within the instruction unit of a processor retrieves the instructions for the short loop from the modified buffers during the loop cycle, rather than from the instruction cache.Type: ApplicationFiled: February 29, 2012Publication date: June 21, 2012Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Ronald D. Hall, Michael L. Karm, Brian R. Mestan, David Mui
-
Patent number: 8131980Abstract: A design structure for resolving the occurrence of livelock at the interface between the processor core and memory subsystem controller. Livelock is resolved by introducing a livelock detection mechanism (which includes livelock detection utility or logic) within the processor to detect a livelock condition and dynamically change the duration of the delay stage(s) in order to alter the “harmonic” fixed-cycle loop behavior. The livelock detection logic (LDL) counts the number of flushes a particular instruction takes or the number of times an instruction re-issues without completing. The LDL then compares that number to a preset threshold number. Based on the result of the comparison, the LDL triggers the implementation of one of two different livelock resolution processes.Type: GrantFiled: June 3, 2008Date of Patent: March 6, 2012Assignee: International Business Machines CorporationInventors: Ronald Hall, Michael L. Karm, Alvan W. Ng, Todd A. Venton
-
Publication number: 20100037288Abstract: A method for access authorization via inheritance to information of a first registered user on a social network comprises defining authorization criteria for the first registered user; receiving first verification data from a requester, wherein the requester comprises one of a second registered user or a non-registered user; determining if the first verification data satisfies the authorization criteria, and in the event the first verification data satisfies the authorization criteria, extending inherited access authorization to the requester in the event the requester is the non-registered user, and extending inherited access authorization to a contact of the requestor in the event the requestor is the second registered user.Type: ApplicationFiled: January 22, 2009Publication date: February 11, 2010Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Theodore R. Carraher, Jason A. Cox, Lydia M. Do, Michael L. Karm
-
Patent number: 7653848Abstract: An on-chip trace engine stores trace data in on-chip trace arrays and routes the trace data to output pins. An external trace capture device captures the trace data. The on-chip trace engine streams the trace data through the debug output pins at a slower rate that can be supported by external trace capture device. If compression is insufficient for the required data rate reduction, the on-chip trace engine includes selectable data reduction mechanisms. Responsive to an overflow condition, meaning trace data is captured in on-chip trace arrays faster than it can be routed off chip, the on-chip trace engine enters an overflow mode in which one or more of the data reduction mechanisms are selected. The data reduction mechanisms may include, for example, a data width reduction component, a pattern match data elimination component, a priority source select component, an under-sampling component, or various combinations thereof.Type: GrantFiled: July 14, 2006Date of Patent: January 26, 2010Assignee: International Business Machines CorporationInventors: Christopher M. Abernathy, Lydia M. Do, Ronald P. Hall, Michael L. Karm
-
Publication number: 20090113192Abstract: A design structure provides instruction fetching within a processor instruction unit, utilizing a loop buffer, one or more virtual loop buffers, and/or an instruction buffer. During instruction fetch, modified instruction buffers coupled to an instruction cache (I-cache) temporarily store instructions from a single branch, backwards short loop. The modified instruction buffers may be a loop buffer, one or more virtual loop buffers, and/or an instruction buffer. Instructions are stored in the modified instruction buffers for the length of the loop cycle. The instruction fetch within the instruction unit of a processor retrieves the instructions for the short loop from the modified buffers during the loop cycle, rather than from the instruction cache.Type: ApplicationFiled: June 3, 2008Publication date: April 30, 2009Inventors: RONALD HALL, Michael L. Karm, Brian R. Mestan, David Mui
-
Publication number: 20090113191Abstract: A method, system and computer program product for instruction fetching within a processor instruction unit, utilizing a loop buffer, one or more virtual loop buffers, and/or an instruction buffer. During instruction fetch, modified instruction buffers coupled to an instruction cache (I-cache) temporarily store instructions from a single branch, backwards short loop. The modified instruction buffers may be a loop buffer, one or more virtual loop buffers, and/or an instruction buffer. Instructions are stored in the modified instruction buffers for the length of the loop cycle. The instruction fetch within the instruction unit of a processor retrieves the instructions for the short loop from the modified buffers during the loop cycle, rather than from the instruction cache.Type: ApplicationFiled: October 25, 2007Publication date: April 30, 2009Inventors: Ronald Hall, Michael L. Karm, Brian R. Mestan, David Mui