Patents by Inventor Edward J. McLellan

Edward J. McLellan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10776119
    Abstract: An example embodiment combines use of a branch predictor with cache-like storage of previously executed branch targets to improve processor performance while minimizing hardware cost. The branch predictor is configured to predict both conditional branch and indirect branch targets and includes a combined predictor table configured to store at least one tagged conditional branch prediction in combination with at least one tagged indirect branch target prediction. The at least one tagged indirect branch target prediction is configured to include a predicted partial target address of a complete target address, the complete target address associated with an indirect branch instruction of a processor. The predictor includes prediction logic configured to use the predicted partial target address to produce a predicted complete target address of the complete target address for use by the processor prior to execution of the indirect branch instruction.
    Type: Grant
    Filed: June 15, 2018
    Date of Patent: September 15, 2020
    Assignee: MARVELL ASIA PTE, LTD.
    Inventors: Edward J. McLellan, David A. Carlson, Rohit P. Thakar
  • Patent number: 10747541
    Abstract: Instructions are executed in a pipeline. Storage accessible to the pipeline stores branch prediction information characterizing results of branch instructions previously executed. A predicted branch result is provided, for at least some branch instructions, based on a selected predictor of multiple predictors. An actual branch result is provided based on an executed branch instruction, and the branch prediction information is updated based on the actual branch result. The predictors include: a first predictor that determines the predicted branch result based on at least a portion of the branch prediction information; and a second predictor that determines the predicted branch result independently from the branch prediction information.
    Type: Grant
    Filed: January 25, 2018
    Date of Patent: August 18, 2020
    Assignee: Marvell Asia Pte, Ltd.
    Inventors: Shubhendu Sekhar Mukherjee, David Kravitz, Edward J. McLellan
  • Publication number: 20190384609
    Abstract: An example embodiment combines use of a branch predictor with cache-like storage of previously executed branch targets to improve processor performance while minimizing hardware cost. The branch predictor is configured to predict both conditional branch and indirect branch targets and includes a combined predictor table configured to store at least one tagged conditional branch prediction in combination with at least one tagged indirect branch target prediction. The at least one tagged indirect branch target prediction is configured to include a predicted partial target address of a complete target address, the complete target address associated with an indirect branch instruction of a processor. The predictor includes prediction logic configured to use the predicted partial target address to produce a predicted complete target address of the complete target address for use by the processor prior to execution of the indirect branch instruction.
    Type: Application
    Filed: June 15, 2018
    Publication date: December 19, 2019
    Inventors: Edward J. McLellan, David A. Carlson, Rohit P. Thakar
  • Publication number: 20190227804
    Abstract: Instructions are executed in a pipeline. Storage accessible to the pipeline stores branch prediction information characterizing results of branch instructions previously executed. A predicted branch result is provided, for at least some branch instructions, based on a selected predictor of multiple predictors. An actual branch result is provided based on an executed branch instruction, and the branch prediction information is updated based on the actual branch result. The predictors include: a first predictor that determines the predicted branch result based on at least a portion of the branch prediction information; and a second predictor that determines the predicted branch result independently from the branch prediction information.
    Type: Application
    Filed: January 25, 2018
    Publication date: July 25, 2019
    Inventors: Shubhendu Sekhar Mukherjee, David Kravitz, Edward J. McLellan
  • Patent number: 9575553
    Abstract: A processor employs a set of replica paths at a processor to determine an operating frequency and voltage for the processor. The replica paths each represent one or more circuit paths at a functional module of the processor. The delays at the replica paths are normalized to increase the likelihood that the replica paths accurately represent the behavior of the circuit paths of the functional module. After normalization, a distribution of delay values is generated by varying, at each replica path, the delay at an output node of the replica path until a mismatch is detected between a signal at the output node of the replica path and the delayed representation of the signal. The resulting distribution of delay values can then be adjusted based on variations in reference voltages at the replica paths to account for potential distribution errors resulting from the reference voltage variations.
    Type: Grant
    Filed: December 19, 2014
    Date of Patent: February 21, 2017
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Seng Oon Toh, Edward J. McLellan, Stephen V. Kosonocky, Michael Leonard Golden, Samuel D. Naffziger
  • Publication number: 20160179186
    Abstract: A processor employs a set of replica paths at a processor to determine an operating frequency and voltage for the processor. The replica paths each represent one or more circuit paths at a functional module of the processor. The delays at the replica paths are normalized to increase the likelihood that the replica paths accurately represent the behavior of the circuit paths of the functional module. After normalization, a distribution of delay values is generated by varying, at each replica path, the delay at an output node of the replica path until a mismatch is detected between a signal at the output node of the replica path and the delayed representation of the signal. The resulting distribution of delay values can then be adjusted based on variations in reference voltages at the replica paths to account for potential distribution errors resulting from the reference voltage variations.
    Type: Application
    Filed: December 19, 2014
    Publication date: June 23, 2016
    Inventors: Seng Oon Toh, Edward J. McLellan, Stephen V. Kosonocky, Michael Leonard Golden, Samuel D. Naffziger
  • Patent number: 9021207
    Abstract: In response to a processor core exiting a low-power state, a cache is set to a minimum size so that fewer than all of the cache's entries are available to store data, thus reducing the cache's power consumption. Over time, the size of the cache can be increased to account for heightened processor activity, thus ensuring that processing efficiency is not significantly impacted by a reduced cache size. In some embodiments, the cache size is increased based on a measured processor performance metric, such as an eviction rate of the cache. In some embodiments, the cache size is increased at regular intervals until a maximum size is reached.
    Type: Grant
    Filed: December 20, 2012
    Date of Patent: April 28, 2015
    Assignee: Advanced Micro Devices, Inc.
    Inventors: John Kalamatianos, Edward J. McLellan, Paul Keltcher, Srilatha Manne, Richard E. Klass, James M. O'Connor
  • Publication number: 20150026407
    Abstract: As a processor enters selected low-power modes, a cache is flushed of data by writing data stored at the cache to other levels of a memory hierarchy. The flushing of the cache allows the size of the cache to be reduced without suffering an additional performance penalty of writing the data at the reduced cache locations to the memory hierarchy. Accordingly, when the cache exits the selected low-power modes, it is sized to a minimum size by setting the number of ways of the cache to a minimum number. In response to defined events at the processing system, a cache controller changes the number of ways of each set of the cache.
    Type: Application
    Filed: July 19, 2013
    Publication date: January 22, 2015
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Edward J. McLellan, Sudha Thiruvengadam, Douglas R. Beard, Carl D. Dietz, Stephen V. Kosonocky
  • Publication number: 20150026406
    Abstract: A size of a cache of a processing system is adjusted by ways, such that each set of the cache has the same number of ways. The cache is a set-associative cache, whereby each set includes a number of ways. In response to defined events at the processing system, a cache controller changes the number of ways of each set of the cache. For example, in response to a processor core indicating that it is entering a period of reduced activity, the cache controller can reduce the number of ways available in each set of the cache.
    Type: Application
    Filed: July 19, 2013
    Publication date: January 22, 2015
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Edward J. McLellan, Sudha Thiruvengadam, Douglas R. Beard, Carl D. Dietz, Stephen V. Kosonocky
  • Publication number: 20140181410
    Abstract: In response to a processor core exiting a low-power state, a cache is set to a minimum size so that fewer than all of the cache's entries are available to store data, thus reducing the cache's power consumption. Over time, the size of the cache can be increased to account for heightened processor activity, thus ensuring that processing efficiency is not significantly impacted by a reduced cache size. In some embodiments, the cache size is increased based on a measured processor performance metric, such as an eviction rate of the cache. In some embodiments, the cache size is increased at regular intervals until a maximum size is reached.
    Type: Application
    Filed: December 20, 2012
    Publication date: June 26, 2014
    Applicant: Advanced Micro Devices, Inc.
    Inventors: John Kalamatianos, Edward J. McLellan, Paul Keltcher, Srilatha Manne, Richard E. Klass, James M. O'Connor
  • Publication number: 20120166777
    Abstract: Techniques for switching or parking threads in a processor including a plurality of processor cores that share a microcode engine are disclosed. In a dual-core or multi-core system, a front end, (e.g., microcode engine), of the processor cores may be shared by the two or more active threads in order to reduce the area, cost, or the like. A currently running thread may be put to a sleep state and execution of another thread may be initiated when a yield microcode command issues while the currently thread is running. The thread may be resumed on a condition that the second thread goes to a sleep state, yields, exits the processing, etc. Alternatively, a thread may be put to a sleep state when a sleep microcode command issues which is programmed to occur when the thread needs to wait for an event to occur.
    Type: Application
    Filed: December 22, 2010
    Publication date: June 28, 2012
    Applicant: ADVANCED MICRO DEVICES, INC.
    Inventors: Edward J. McLellan, Magiting M. Talisayon, Donald A. Priore
  • Patent number: 7647472
    Abstract: An integrated circuit (203) for use in processing streams of data generally and streams of packets in particular. The integrated circuit (203) includes a number of packet processors (307, 313, 303), a table look up engine (301), a queue management engine (305) and a buffer management engine (315). The packet processors (307, 313, 303) include a receive processor (421), a transmit processor (427) and a risc core processor (401), all of which are programmable. The receive processor (421) and the core processor (401) cooperate to receive and route packets being received and the core processor (401) and the transmit processor (427) cooperate to transmit packets. Routing is done by using information from the table look up engine (301) to determine a queue (215) in the queue management engine (305) which is to receive a descriptor (217) describing the received packet's payload.
    Type: Grant
    Filed: August 25, 2006
    Date of Patent: January 12, 2010
    Assignee: Freescale Semiconductor, Inc.
    Inventors: Thomas B. Brightman, Andrew D. Funk, David J. Husak, Edward J. McLellan, Andrew T. Brown, John F. Brown, James A. Farrell, Donald A. Priore, Mark A. Sankey, Paul Schmitt
  • Patent number: 7100020
    Abstract: An integrated circuit (203) for use in processing streams of data generally and streams of packets in particular. The integrated circuit (203) includes a number of packet processors (307, 313, 303), a table look up engine (301), a queue management engine (305) and a buffer management engine (315). The packet processors (307, 313, 303) include a receive processor (421), a transmit processor (427) and a risc core processor (401), all of which are programmable. The receive processor (421) and the core processor (401) cooperate to receive and route packets being received and the core processor (401) and the transmit processor (427) cooperate to transmit packets. Routing is done by using information from the table look up engine (301) to determine a queue (215) in the queue management engine (305) which is to receive a descriptor (217) describing the received packet's payload.
    Type: Grant
    Filed: May 7, 1999
    Date of Patent: August 29, 2006
    Assignee: Freescale Semiconductor, Inc.
    Inventors: Thomas B. Brightman, Andrew T. Brown, John F. Brown, James A. Farrell, Andrew D. Funk, David J. Husak, Edward J. McLellan, Mark A. Sankey, Paul Schmitt, Donald A. Priore
  • Patent number: 6449713
    Abstract: A technique for handling a conditional move instruction in an out-of-order data processor. The technique involves detecting a conditional move instruction within an instruction stream, and generating multiple instructions according to the detected conditional move instruction. The technique further involves replacing the conditional move instruction within the instruction stream with the generated multiple instructions. The generated multiple instructions are generated such that each of the generated multiple instructions executes using no more than two input ports of an execution unit. The generated multiple instructions include a first generated instruction that produces a condition result indicating whether a condition exists, and a second generated instruction that inputs the condition result as a portion of an operand which identifies a register of the out-of-order data processor.
    Type: Grant
    Filed: November 18, 1998
    Date of Patent: September 10, 2002
    Assignee: Compaq Information Technologies Group, L.P.
    Inventors: Joel Springer Emer, Bruce Edwards, Daniel Lawrence Leibholz, Edward J. McLellan, Derrick R. Meyer
  • Publication number: 20020112142
    Abstract: A technique for handling a conditional move instruction in an out-of-order data processor. The technique involves detecting a conditional move instruction within an instruction stream, and generating multiple instructions according to the detected conditional move instruction. The technique further involves replacing the conditional move instruction within the instruction stream with the generated multiple instructions. The generated multiple instructions are generated such that each of the generated multiple instructions executes using no more than two input ports of an execution unit. The generated multiple instructions include a first generated instruction that produces a condition result indicating whether a condition exists, and a second generated instruction that inputs the condition result as a portion of an operand which identifies a register of the out-of-order data processor.
    Type: Application
    Filed: November 18, 1998
    Publication date: August 15, 2002
    Inventors: JOEL SPRINGER EMER, BRUCE EDWARDS, DANIEL LAWRENCE LEIBHOLZ, EDWARD J. MCLELLAN, DERRICK R. MEYER
  • Patent number: 6195748
    Abstract: An apparatus is provided for sampling instructions in a processor pipeline of a computer system. The pipeline has a plurality of processing stages. Instructions are fetched into a first stage of the pipeline. A subset of the fetched instructions are identified as selected instructions. Event, latency, and state information of the system is sampled while any of the selected instructions are in any stage of the pipeline. Software is informed whenever any of the selected instructions leaves the pipeline to read the event and latency information.
    Type: Grant
    Filed: November 26, 1997
    Date of Patent: February 27, 2001
    Assignee: Compaq Computer Corporation
    Inventors: George Z. Chrysos, Jeffrey Dean, James E. Hicks, Carl A. Waldspurger, William E. Weihl, Daniel L. Leibholz, Edward J. McLellan
  • Patent number: 6163840
    Abstract: An apparatus is provided for sampling multiple concurretly executing instructions in a processor pipeline of a system. The pipeline has a plurality of processing stages. The apparatus identifies multiple selected when the instructions are fetched into a first stage of the pipeline. A subset of the the multiple selected instructions to execute concurrently in the pipeline. State information of the system is sampled while any of the multiple selected instructions are in any stage of the pipeline. Software is informed whenever all of the selected instructions leave the pipeline so that the software can read any of the state information.
    Type: Grant
    Filed: November 26, 1997
    Date of Patent: December 19, 2000
    Assignee: Compaq Computer Corporation
    Inventors: George Z. Chrysos, Jeffrey Dean, James E. Hicks, Daniel L. Leibholz, Edward J. McLellan, Carl A. Waldspurger, William E. Weihl
  • Patent number: 6081887
    Abstract: A technique for predicting the result of a conditional branch instruction for use with a processor having instruction pipeline. A stored predictor is connected to the front end of the pipeline and is trained from a truth based predictor connected to the back end of the pipeline. The stored predictor is accessible in one instruction cycle, and therefore provides minimum predictor latency. Update latency is minimized by storing multiple predictions in the front end stored predictor which are indexed by an index counter. The multiple predictions, as provided by the back end, are indexed by the index counter to select a particular one as current prediction on a given instruction pipeline cycle. The front end stored predictor also passes along to the back end predictor, such as through the instruction pipeline, a position value used to generate the predictions. This further structure accommodates ghost branch instructions that turn out to be flushed out of the pipeline when it must be backed up.
    Type: Grant
    Filed: November 12, 1998
    Date of Patent: June 27, 2000
    Assignee: Compaq Computer Corporation
    Inventors: Simon C. Steely, Jr., Edward J. McLellan, Joel S. Emer
  • Patent number: 6000044
    Abstract: An apparatus is provided for sampling instructions in a processor pipeline of a system. The pipeline has a plurality of processing stages. The apparatus includes a fetch unit for fetching instructions into a first stage of the pipeline. Certain randomly selected instructions are identified, and state information of the system is sampled while a particular selected instruction is in any stage of the pipeline. Software is informed when the particular selected instruction leaves the pipeline so that the software can read any of the sampled state information.
    Type: Grant
    Filed: November 26, 1997
    Date of Patent: December 7, 1999
    Assignee: Digital Equipment Corporation
    Inventors: George Z. Chrysos, Jeffrey Dean, James E. Hicks, Daniel L. Leibholz, Edward J. McLellan, Carl A. Waldspurger, William E. Weihl
  • Patent number: 5933860
    Abstract: A computer system including an instruction cache (I-cache) having a plurality of banks for storing a subset of data from memory is shown to include a prediction mechanism for predicting which bank of the I-cache contains the required data. A prediction value, including a sequential prediction hint and a branch prediction hint, is associated with each instruction stored in the I-cache. The prediction value may either be stored with the I-cache data, or in a separate memory included before the I-cache. If the predicted value is incorrect, the predicted hint is `trained` to provide a higher degree of accuracy for repetitive instruction stream operation. Processor performance is additionally improved by providing a branch hint that allows for smoother transition between changing instruction streams.
    Type: Grant
    Filed: July 29, 1997
    Date of Patent: August 3, 1999
    Assignee: Digital Equipment Corporation
    Inventors: Joel S. Emer, Simon Steely, Edward J. McLellan