Patents by Inventor Edward J. McLellan
Edward J. McLellan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 10776119
Abstract: An example embodiment combines use of a branch predictor with cache-like storage of previously executed branch targets to improve processor performance while minimizing hardware cost. The branch predictor is configured to predict both conditional branch and indirect branch targets and includes a combined predictor table configured to store at least one tagged conditional branch prediction in combination with at least one tagged indirect branch target prediction. The at least one tagged indirect branch target prediction is configured to include a predicted partial target address of a complete target address, the complete target address associated with an indirect branch instruction of a processor. The predictor includes prediction logic configured to use the predicted partial target address to produce a predicted complete target address of the complete target address for use by the processor prior to execution of the indirect branch instruction.
Type: Grant
Filed: June 15, 2018
Date of Patent: September 15, 2020
Assignee: MARVELL ASIA PTE, LTD.
Inventors: Edward J. McLellan, David A. Carlson, Rohit P. Thakar
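As a rough illustration of how a predicted partial target might be expanded into a complete target, the sketch below assumes (the abstract does not say this) that the partial target holds the low-order bits and that the high-order bits are taken from the branch's own address. The function name `complete_target` and the 16-bit width are hypothetical.

```python
def complete_target(branch_pc: int, partial_target: int, partial_bits: int = 16) -> int:
    """Combine a predicted partial target with the upper bits of the branch PC.

    Assumption (not from the abstract): the partial target is the low
    `partial_bits` bits of the complete target, and the remaining upper
    bits match those of the branch instruction's address.
    """
    low_mask = (1 << partial_bits) - 1
    return (branch_pc & ~low_mask) | (partial_target & low_mask)


# A branch at 0x40001234 whose predicted low bits are 0xBEEF.
predicted = complete_target(0x40001234, 0xBEEF)
print(hex(predicted))
```

Storing only part of each target is one way a table could hold more predictions in the same area, which fits the abstract's stated goal of minimizing hardware cost.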
-
Patent number: 10747541
Abstract: Instructions are executed in a pipeline. Storage accessible to the pipeline stores branch prediction information characterizing results of branch instructions previously executed. A predicted branch result is provided, for at least some branch instructions, based on a selected predictor of multiple predictors. An actual branch result is provided based on an executed branch instruction, and the branch prediction information is updated based on the actual branch result. The predictors include: a first predictor that determines the predicted branch result based on at least a portion of the branch prediction information; and a second predictor that determines the predicted branch result independently from the branch prediction information.
Type: Grant
Filed: January 25, 2018
Date of Patent: August 18, 2020
Assignee: Marvell Asia Pte, Ltd.
Inventors: Shubhendu Sekhar Mukherjee, David Kravitz, Edward J. McLellan
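The two kinds of predictor described above can be sketched as follows. This is a minimal model, not the patented design: the 2-bit counter table, the backward-taken static rule, and the confidence-based selection in `choose_prediction` are all assumptions chosen for illustration.

```python
class HistoryPredictor:
    """First predictor: uses stored 2-bit saturating counters (the stored prediction information)."""

    def __init__(self, entries: int = 1024):
        self.counters = [1] * entries  # values 0..3; 2 or more means "taken"

    def _index(self, pc: int) -> int:
        return pc % len(self.counters)

    def is_confident(self, pc: int) -> bool:
        # Hypothetical selection rule: trust the table only when a counter is saturated.
        return self.counters[self._index(pc)] in (0, 3)

    def predict(self, pc: int) -> bool:
        return self.counters[self._index(pc)] >= 2

    def update(self, pc: int, taken: bool) -> None:
        # Called with the actual branch result once the branch has executed.
        i = self._index(pc)
        self.counters[i] = min(3, self.counters[i] + 1) if taken else max(0, self.counters[i] - 1)


def static_predict(pc: int, target: int) -> bool:
    """Second predictor: ignores stored information; backward branches are predicted taken."""
    return target < pc


def choose_prediction(history: HistoryPredictor, pc: int, target: int) -> bool:
    """Select one of the two predictors for this branch."""
    if history.is_confident(pc):
        return history.predict(pc)
    return static_predict(pc, target)
```

With a fresh table, `choose_prediction` falls back to the static rule; after enough calls to `update` for the same branch, the stored counter takes over.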
-
Publication number: 20190384609
Abstract: An example embodiment combines use of a branch predictor with cache-like storage of previously executed branch targets to improve processor performance while minimizing hardware cost. The branch predictor is configured to predict both conditional branch and indirect branch targets and includes a combined predictor table configured to store at least one tagged conditional branch prediction in combination with at least one tagged indirect branch target prediction. The at least one tagged indirect branch target prediction is configured to include a predicted partial target address of a complete target address, the complete target address associated with an indirect branch instruction of a processor. The predictor includes prediction logic configured to use the predicted partial target address to produce a predicted complete target address of the complete target address for use by the processor prior to execution of the indirect branch instruction.
Type: Application
Filed: June 15, 2018
Publication date: December 19, 2019
Inventors: Edward J. McLellan, David A. Carlson, Rohit P. Thakar
-
Publication number: 20190227804
Abstract: Instructions are executed in a pipeline. Storage accessible to the pipeline stores branch prediction information characterizing results of branch instructions previously executed. A predicted branch result is provided, for at least some branch instructions, based on a selected predictor of multiple predictors. An actual branch result is provided based on an executed branch instruction, and the branch prediction information is updated based on the actual branch result. The predictors include: a first predictor that determines the predicted branch result based on at least a portion of the branch prediction information; and a second predictor that determines the predicted branch result independently from the branch prediction information.
Type: Application
Filed: January 25, 2018
Publication date: July 25, 2019
Inventors: Shubhendu Sekhar Mukherjee, David Kravitz, Edward J. McLellan
-
Patent number: 9575553
Abstract: A processor employs a set of replica paths at a processor to determine an operating frequency and voltage for the processor. The replica paths each represent one or more circuit paths at a functional module of the processor. The delays at the replica paths are normalized to increase the likelihood that the replica paths accurately represent the behavior of the circuit paths of the functional module. After normalization, a distribution of delay values is generated by varying, at each replica path, the delay at an output node of the replica path until a mismatch is detected between a signal at the output node of the replica path and the delayed representation of the signal. The resulting distribution of delay values can then be adjusted based on variations in reference voltages at the replica paths to account for potential distribution errors resulting from the reference voltage variations.
Type: Grant
Filed: December 19, 2014
Date of Patent: February 21, 2017
Assignee: Advanced Micro Devices, Inc.
Inventors: Seng Oon Toh, Edward J. McLellan, Stephen V. Kosonocky, Michael Leonard Golden, Samuel D. Naffziger
-
Publication number: 20160179186
Abstract: A processor employs a set of replica paths at a processor to determine an operating frequency and voltage for the processor. The replica paths each represent one or more circuit paths at a functional module of the processor. The delays at the replica paths are normalized to increase the likelihood that the replica paths accurately represent the behavior of the circuit paths of the functional module. After normalization, a distribution of delay values is generated by varying, at each replica path, the delay at an output node of the replica path until a mismatch is detected between a signal at the output node of the replica path and the delayed representation of the signal. The resulting distribution of delay values can then be adjusted based on variations in reference voltages at the replica paths to account for potential distribution errors resulting from the reference voltage variations.
Type: Application
Filed: December 19, 2014
Publication date: June 23, 2016
Inventors: Seng Oon Toh, Edward J. McLellan, Stephen V. Kosonocky, Michael Leonard Golden, Samuel D. Naffziger
-
Patent number: 9021207
Abstract: In response to a processor core exiting a low-power state, a cache is set to a minimum size so that fewer than all of the cache's entries are available to store data, thus reducing the cache's power consumption. Over time, the size of the cache can be increased to account for heightened processor activity, thus ensuring that processing efficiency is not significantly impacted by a reduced cache size. In some embodiments, the cache size is increased based on a measured processor performance metric, such as an eviction rate of the cache. In some embodiments, the cache size is increased at regular intervals until a maximum size is reached.
Type: Grant
Filed: December 20, 2012
Date of Patent: April 28, 2015
Assignee: Advanced Micro Devices, Inc.
Inventors: John Kalamatianos, Edward J. McLellan, Paul Keltcher, Srilatha Manne, Richard E. Klass, James M. O'Connor
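The eviction-rate embodiment can be sketched as a small sizing policy. The class name `CacheSizer`, the doubling step, the 25% threshold, and the entry counts are all hypothetical; the abstract states only that the cache starts at a minimum size and grows based on a metric such as eviction rate.

```python
class CacheSizer:
    """Tracks how many cache entries are enabled (a sketch, not the patented controller)."""

    def __init__(self, min_entries: int = 64, max_entries: int = 1024,
                 eviction_threshold: float = 0.25):
        self.min_entries = min_entries
        self.max_entries = max_entries
        self.eviction_threshold = eviction_threshold
        self.active_entries = max_entries

    def on_low_power_exit(self) -> None:
        # On wake-up, enable only the minimum number of entries to save power.
        self.active_entries = self.min_entries

    def on_interval(self, evictions: int, accesses: int) -> int:
        # Grow the cache when the measured eviction rate suggests it is too small.
        rate = evictions / accesses if accesses else 0.0
        if rate > self.eviction_threshold:
            self.active_entries = min(self.max_entries, self.active_entries * 2)
        return self.active_entries


sizer = CacheSizer()
sizer.on_low_power_exit()
print(sizer.on_interval(evictions=50, accesses=100))
```

The interval-based embodiment would drop the rate check and grow on every call until `max_entries` is reached.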
-
Publication number: 20150026406
Abstract: A size of a cache of a processing system is adjusted by ways, such that each set of the cache has the same number of ways. The cache is a set-associative cache, whereby each set includes a number of ways. In response to defined events at the processing system, a cache controller changes the number of ways of each set of the cache. For example, in response to a processor core indicating that it is entering a period of reduced activity, the cache controller can reduce the number of ways available in each set of the cache.
Type: Application
Filed: July 19, 2013
Publication date: January 22, 2015
Applicant: Advanced Micro Devices, Inc.
Inventors: Edward J. McLellan, Sudha Thiruvengadam, Douglas R. Beard, Carl D. Dietz, Stephen V. Kosonocky
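A toy model of way-based resizing, in which every set always has the same number of enabled ways. The event names, the halving and doubling policy, and the class `WayResizableCache` are illustrative assumptions, not details from the application.

```python
class WayResizableCache:
    """Set-associative cache model whose size changes by enabling or disabling ways."""

    def __init__(self, num_sets: int = 4, max_ways: int = 8):
        self.max_ways = max_ways
        self.active_ways = max_ways
        # sets[s][w] holds the line stored in way w of set s, or None if empty.
        self.sets = [[None] * max_ways for _ in range(num_sets)]

    def set_ways(self, ways: int) -> None:
        if not 1 <= ways <= self.max_ways:
            raise ValueError("ways out of range")
        # Apply the same way count to every set. Lines in disabled ways are
        # simply dropped here; a real design would write back dirty data first.
        for cache_set in self.sets:
            for w in range(ways, self.max_ways):
                cache_set[w] = None
        self.active_ways = ways

    def on_event(self, event: str) -> None:
        # Hypothetical event names and policy.
        if event == "reduced_activity":
            self.set_ways(max(1, self.active_ways // 2))
        elif event == "increased_activity":
            self.set_ways(min(self.max_ways, self.active_ways * 2))
```

Resizing by ways rather than by sets leaves the set index of every address unchanged, so lines in the ways that stay enabled remain valid.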
-
Publication number: 20150026407
Abstract: As a processor enters selected low-power modes, a cache is flushed of data by writing data stored at the cache to other levels of a memory hierarchy. The flushing of the cache allows the size of the cache to be reduced without suffering an additional performance penalty of writing the data at the reduced cache locations to the memory hierarchy. Accordingly, when the cache exits the selected low-power modes, it is sized to a minimum size by setting the number of ways of the cache to a minimum number. In response to defined events at the processing system, a cache controller changes the number of ways of each set of the cache.
Type: Application
Filed: July 19, 2013
Publication date: January 22, 2015
Applicant: Advanced Micro Devices, Inc.
Inventors: Edward J. McLellan, Sudha Thiruvengadam, Douglas R. Beard, Carl D. Dietz, Stephen V. Kosonocky
-
Publication number: 20140181410
Abstract: In response to a processor core exiting a low-power state, a cache is set to a minimum size so that fewer than all of the cache's entries are available to store data, thus reducing the cache's power consumption. Over time, the size of the cache can be increased to account for heightened processor activity, thus ensuring that processing efficiency is not significantly impacted by a reduced cache size. In some embodiments, the cache size is increased based on a measured processor performance metric, such as an eviction rate of the cache. In some embodiments, the cache size is increased at regular intervals until a maximum size is reached.
Type: Application
Filed: December 20, 2012
Publication date: June 26, 2014
Applicant: Advanced Micro Devices, Inc.
Inventors: John Kalamatianos, Edward J. McLellan, Paul Keltcher, Srilatha Manne, Richard E. Klass, James M. O'Connor
-
Publication number: 20120166777
Abstract: Techniques for switching or parking threads in a processor including a plurality of processor cores that share a microcode engine are disclosed. In a dual-core or multi-core system, a front end (e.g., a microcode engine) of the processor cores may be shared by the two or more active threads in order to reduce the area, cost, or the like. A currently running thread may be put to a sleep state and execution of another thread may be initiated when a yield microcode command issues while the current thread is running. The thread may be resumed on a condition that the second thread goes to a sleep state, yields, exits the processing, etc. Alternatively, a thread may be put to a sleep state when a sleep microcode command issues which is programmed to occur when the thread needs to wait for an event to occur.
Type: Application
Filed: December 22, 2010
Publication date: June 28, 2012
Applicant: ADVANCED MICRO DEVICES, INC.
Inventors: Edward J. McLellan, Magiting M. Talisayon, Donald A. Priore
-
Patent number: 7647472
Abstract: An integrated circuit (203) for use in processing streams of data generally and streams of packets in particular. The integrated circuit (203) includes a number of packet processors (307, 313, 303), a table look up engine (301), a queue management engine (305) and a buffer management engine (315). The packet processors (307, 313, 303) include a receive processor (421), a transmit processor (427) and a RISC core processor (401), all of which are programmable. The receive processor (421) and the core processor (401) cooperate to receive and route packets being received and the core processor (401) and the transmit processor (427) cooperate to transmit packets. Routing is done by using information from the table look up engine (301) to determine a queue (215) in the queue management engine (305) which is to receive a descriptor (217) describing the received packet's payload.
Type: Grant
Filed: August 25, 2006
Date of Patent: January 12, 2010
Assignee: Freescale Semiconductor, Inc.
Inventors: Thomas B. Brightman, Andrew D. Funk, David J. Husak, Edward J. McLellan, Andrew T. Brown, John F. Brown, James A. Farrell, Donald A. Priore, Mark A. Sankey, Paul Schmitt
-
Patent number: 7100020
Abstract: An integrated circuit (203) for use in processing streams of data generally and streams of packets in particular. The integrated circuit (203) includes a number of packet processors (307, 313, 303), a table look up engine (301), a queue management engine (305) and a buffer management engine (315). The packet processors (307, 313, 303) include a receive processor (421), a transmit processor (427) and a RISC core processor (401), all of which are programmable. The receive processor (421) and the core processor (401) cooperate to receive and route packets being received and the core processor (401) and the transmit processor (427) cooperate to transmit packets. Routing is done by using information from the table look up engine (301) to determine a queue (215) in the queue management engine (305) which is to receive a descriptor (217) describing the received packet's payload.
Type: Grant
Filed: May 7, 1999
Date of Patent: August 29, 2006
Assignee: Freescale Semiconductor, Inc.
Inventors: Thomas B. Brightman, Andrew T. Brown, John F. Brown, James A. Farrell, Andrew D. Funk, David J. Husak, Edward J. McLellan, Mark A. Sankey, Paul Schmitt, Donald A. Priore
-
Patent number: 6449713
Abstract: A technique for handling a conditional move instruction in an out-of-order data processor. The technique involves detecting a conditional move instruction within an instruction stream, and generating multiple instructions according to the detected conditional move instruction. The technique further involves replacing the conditional move instruction within the instruction stream with the generated multiple instructions. The generated multiple instructions are generated such that each of the generated multiple instructions executes using no more than two input ports of an execution unit. The generated multiple instructions include a first generated instruction that produces a condition result indicating whether a condition exists, and a second generated instruction that inputs the condition result as a portion of an operand which identifies a register of the out-of-order data processor.
Type: Grant
Filed: November 18, 1998
Date of Patent: September 10, 2002
Assignee: Compaq Information Technologies Group, L.P.
Inventors: Joel Springer Emer, Bruce Edwards, Daniel Lawrence Leibholz, Edward J. McLellan, Derrick R. Meyer
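The replacement step can be sketched as a rewrite over an instruction stream. A conditional move reads three values (the condition register, the source, and the old destination), so it is split into two operations that each read two. The opcode names `cmov1` and `cmov2`, the temporary register `t0`, and the tuple encoding are hypothetical; how the condition result is carried in the operand is left abstract here.

```python
from typing import List, NamedTuple, Tuple


class Instr(NamedTuple):
    op: str
    srcs: Tuple[str, ...]
    dest: str


def split_cmov(instr: Instr, temp: str = "t0") -> List[Instr]:
    """Replace `cmov cond, src -> dest` with two instructions of at most two inputs each."""
    if instr.op != "cmov":
        return [instr]
    cond, src = instr.srcs
    return [
        # Reads cond and the old dest; writes the old value plus the condition result to temp.
        Instr("cmov1", (cond, instr.dest), temp),
        # Reads src and temp; writes src if the condition held, otherwise the old value.
        Instr("cmov2", (src, temp), instr.dest),
    ]


def expand(stream: List[Instr]) -> List[Instr]:
    """Rewrite a whole instruction stream, leaving other instructions untouched."""
    return [out for instr in stream for out in split_cmov(instr)]


program = [Instr("add", ("r1", "r2"), "r3"), Instr("cmov", ("r4", "r5"), "r6")]
for instr in expand(program):
    print(instr)
```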
-
Publication number: 20020112142
Abstract: A technique for handling a conditional move instruction in an out-of-order data processor. The technique involves detecting a conditional move instruction within an instruction stream, and generating multiple instructions according to the detected conditional move instruction. The technique further involves replacing the conditional move instruction within the instruction stream with the generated multiple instructions. The generated multiple instructions are generated such that each of the generated multiple instructions executes using no more than two input ports of an execution unit. The generated multiple instructions include a first generated instruction that produces a condition result indicating whether a condition exists, and a second generated instruction that inputs the condition result as a portion of an operand which identifies a register of the out-of-order data processor.
Type: Application
Filed: November 18, 1998
Publication date: August 15, 2002
Inventors: Joel Springer Emer, Bruce Edwards, Daniel Lawrence Leibholz, Edward J. McLellan, Derrick R. Meyer
-
Patent number: 6195748
Abstract: An apparatus is provided for sampling instructions in a processor pipeline of a computer system. The pipeline has a plurality of processing stages. Instructions are fetched into a first stage of the pipeline. A subset of the fetched instructions are identified as selected instructions. Event, latency, and state information of the system is sampled while any of the selected instructions are in any stage of the pipeline. Software is informed whenever any of the selected instructions leaves the pipeline to read the event and latency information.
Type: Grant
Filed: November 26, 1997
Date of Patent: February 27, 2001
Assignee: Compaq Computer Corporation
Inventors: George Z. Chrysos, Jeffrey Dean, James E. Hicks, Carl A. Waldspurger, William E. Weihl, Daniel L. Leibholz, Edward J. McLellan
-
Patent number: 6163840
Abstract: An apparatus is provided for sampling multiple concurrently executing instructions in a processor pipeline of a system. The pipeline has a plurality of processing stages. The apparatus identifies multiple selected instructions when the instructions are fetched into a first stage of the pipeline. A subset of the multiple selected instructions execute concurrently in the pipeline. State information of the system is sampled while any of the multiple selected instructions are in any stage of the pipeline. Software is informed whenever all of the selected instructions leave the pipeline so that the software can read any of the state information.
Type: Grant
Filed: November 26, 1997
Date of Patent: December 19, 2000
Assignee: Compaq Computer Corporation
Inventors: George Z. Chrysos, Jeffrey Dean, James E. Hicks, Daniel L. Leibholz, Edward J. McLellan, Carl A. Waldspurger, William E. Weihl
-
Patent number: 6081887
Abstract: A technique for predicting the result of a conditional branch instruction for use with a processor having an instruction pipeline. A stored predictor is connected to the front end of the pipeline and is trained from a truth based predictor connected to the back end of the pipeline. The stored predictor is accessible in one instruction cycle, and therefore provides minimum predictor latency. Update latency is minimized by storing multiple predictions in the front end stored predictor which are indexed by an index counter. The multiple predictions, as provided by the back end, are indexed by the index counter to select a particular one as the current prediction on a given instruction pipeline cycle. The front end stored predictor also passes along to the back end predictor, such as through the instruction pipeline, a position value used to generate the predictions. This further structure accommodates ghost branch instructions that turn out to be flushed out of the pipeline when it must be backed up.
Type: Grant
Filed: November 12, 1998
Date of Patent: June 27, 2000
Assignee: Compaq Computer Corporation
Inventors: Simon C. Steely, Jr., Edward J. McLellan, Joel S. Emer
-
Patent number: 6000044
Abstract: An apparatus is provided for sampling instructions in a processor pipeline of a system. The pipeline has a plurality of processing stages. The apparatus includes a fetch unit for fetching instructions into a first stage of the pipeline. Certain randomly selected instructions are identified, and state information of the system is sampled while a particular selected instruction is in any stage of the pipeline. Software is informed when the particular selected instruction leaves the pipeline so that the software can read any of the sampled state information.
Type: Grant
Filed: November 26, 1997
Date of Patent: December 7, 1999
Assignee: Digital Equipment Corporation
Inventors: George Z. Chrysos, Jeffrey Dean, James E. Hicks, Daniel L. Leibholz, Edward J. McLellan, Carl A. Waldspurger, William E. Weihl
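A highly simplified software model of the sampling flow: instructions are randomly selected at fetch, state is recorded at each stage, and a callback stands in for informing software when a selected instruction leaves the pipeline. The stage names, the `sample_rate` parameter, the record layout, and the one-instruction-at-a-time pipeline are assumptions made to keep the sketch short.

```python
import random
from typing import Callable, Optional, Sequence


def run_pipeline(instructions: Sequence[str],
                 on_sample: Callable[[dict], None],
                 sample_rate: float = 0.1,
                 stages: Sequence[str] = ("fetch", "map", "issue", "execute", "retire"),
                 rng: Optional[random.Random] = None) -> None:
    """Push instructions through the stages, sampling a random subset of them."""
    rng = rng or random.Random()
    for pc, instr in enumerate(instructions):
        # Selection happens as the instruction is fetched into the first stage.
        selected = rng.random() < sample_rate
        record = {"pc": pc, "instr": instr, "stages": []} if selected else None
        for stage in stages:
            if record is not None:
                # Stand-in for sampling system state while the instruction is in this stage.
                record["stages"].append(stage)
        if record is not None:
            # The instruction has left the pipeline: hand the sampled state to software.
            on_sample(record)


samples = []
run_pipeline(["ld", "add", "br", "st"], samples.append, sample_rate=0.5)
print(len(samples), "instruction(s) sampled")
```

Because selection is random, the number of samples varies from run to run unless a seeded `random.Random` is passed as `rng`.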
-
Patent number: 5933860
Abstract: A computer system including an instruction cache (I-cache) having a plurality of banks for storing a subset of data from memory is shown to include a prediction mechanism for predicting which bank of the I-cache contains the required data. A prediction value, including a sequential prediction hint and a branch prediction hint, is associated with each instruction stored in the I-cache. The prediction value may either be stored with the I-cache data, or in a separate memory included before the I-cache. If the predicted value is incorrect, the predicted hint is 'trained' to provide a higher degree of accuracy for repetitive instruction stream operation. Processor performance is additionally improved by providing a branch hint that allows for smoother transition between changing instruction streams.
Type: Grant
Filed: July 29, 1997
Date of Patent: August 3, 1999
Assignee: Digital Equipment Corporation
Inventors: Joel S. Emer, Simon Steely, Edward J. McLellan