Patents by Inventor Emil TALPES

Emil TALPES has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 9483273
    Abstract: A method includes suppressing execution of an operation portion of a load-operation instruction in a processor responsive to an invalid status of a load portion of load-operation instruction. A processor includes an instruction pipeline including an execution unit operable to execute instructions and a scheduler unit. The scheduler unit includes a scheduler queue and is operable to store a load-operation in the scheduler queue. The load-operation instruction includes a load portion and an operation portion. The scheduler unit schedules the load portion for execution in the execution unit, marks the operation portion in the scheduler queue as eligible for execution responsive to scheduling the load portion, receives an indication of an invalid status of the load portion, and suppresses execution of the operation portion responsive to the indication of the invalid status.
    Type: Grant
    Filed: July 16, 2013
    Date of Patent: November 1, 2016
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Francesco Spadini, Michael Achenbach, Emil Talpes, Ganesh Venkataramanan
  • Patent number: 9268575
    Abstract: Methods and apparatuses are provided for flush operations in a processor. The apparatus comprises an out-of-order execution unit for processing instructions issued in-order from an instruction decoder for first and second threads and being configured to identify an errored instruction in a first thread. A retire unit includes a retire queue for receiving completed instructions from the out-of-order execution unit, the retire unit being configured retire older in-order first thread instructions until the errored instruction would be the next instruction to be retired, and then flushing the errored instruction and all later in-order first thread instructions from the retire queue. The method comprises determining that an errored instruction is being processed by an out-of-order execution unit of a processor and continuing to process to completion instructions earlier in-order from the errored instruction until the completion of the errored instruction.
    Type: Grant
    Filed: June 30, 2011
    Date of Patent: February 23, 2016
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Jay E. Fleischman, Emil Talpes, Debjit DasSarma
  • Publication number: 20160041853
    Abstract: A processor includes an execution unit to execute instructions and a scheduler unit to store a queue of instructions for execution by the execution unit. The scheduler unit includes a wake array including a plurality of source slots to store source identifiers for sources associated with the instructions, a picker to schedule a particular instruction for execution in the execution unit, broadcast a destination identifier associated with the particular instruction to a first subset of the source slots, and a delay element to receive the destination identifier broadcast by the picker and communicate a delayed version of the destination identifier to a second subset of the source slots different from the first subset.
    Type: Application
    Filed: August 6, 2014
    Publication date: February 11, 2016
    Inventors: Srikanth Arekapudi, Emil Talpes, Sahil Arora
  • Patent number: 9176738
    Abstract: Method and apparatus for fast decoding of microinstructions are disclosed. An integrated circuit is disclosed wherein microinstructions are queued for execution in an execution unit having multiple pipelines where each pipeline is configured to execute a set of supported microinstructions. The execution unit receives microinstruction data including an operation code (opcode) or a complex opcode. The execution unit executes the microinstruction multiple times wherein the microinstruction is executed at least once to get an address value and at least once to get a result of an operation. The execution unit processes complex opcodes by utilizing both a load/store support and a simple opcode support by splitting the complex opcode into load/store and simple opcode components and creating an internal source/destination between the two components.
    Type: Grant
    Filed: January 12, 2011
    Date of Patent: November 3, 2015
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Ganesh Venkataramanan, Emil Talpes
  • Patent number: 9170638
    Abstract: A method and apparatus are described for reducing power consumption in a processor. A micro-operation is selected for execution, and a destination physical register tag of the selected micro-operation is compared to a plurality of source physical register tags of micro-operations dependent upon the selected micro-operation. If there is a match between the destination physical register tag and one of the source physical register tags, a corresponding physical register file (PRF) read operation is disabled. The comparison may be performed by a wakeup content-addressable memory (CAM) of a scheduler. The wakeup CAM may send a read control signal to the PRF to disable the read operation. Disabling the corresponding PRF read operation may include shutting off power in the PRF and related logic.
    Type: Grant
    Filed: December 16, 2010
    Date of Patent: October 27, 2015
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Ganesh Venkataramanan, Emil Talpes
  • Publication number: 20150121040
    Abstract: Methods, devices, and systems for accessing packed registers are presented. A state of the packed registers may be tracked and it may be determined whether the register is directly accessible based on the state. If the register is not directly accessible, an action may be performed which allows the register to be accessed directly. The action may include injecting at least one uop for reorganizing the physical storage of the register such that it is directly accessible. The action may include aligning the data with the least significant bit of a physical register or otherwise aligning the data with the datapath. The action may also include changing the state of the packed registers.
    Type: Application
    Filed: October 24, 2014
    Publication date: April 30, 2015
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Robert E. Weidner, Jay E. Fleischman, Michael C. Sedmak, Michael Estlick, Richard McGowen, II, Emil Talpes
  • Publication number: 20150121041
    Abstract: Described herein are methods and processors for flag renaming in groups to eliminate dependencies of instructions. Decoder and execution units in the processor may be configured to rename flags into groups that allow each group to be treated separately as appropriate. This flag renaming eliminates flag dependencies with respect to instructions. This allows an instruction to write exactly the flags that the instruction wants without having to create merge dependencies. Methods and processors are provided for handling immediate values embedded in instructions. A 16 bit immediate bus and a 4 bit encoding/control bus are added at the interface between decode and execution units. For an 8 or 12 bit immediate, the upper 4 bits of the immediate bus contain the encoding bits. For a 16 bit immediate, the encoding/control bus contains the encoding bits. The encoding/control bus indicates when to look at the top four bits of the immediate bus.
    Type: Application
    Filed: October 24, 2014
    Publication date: April 30, 2015
    Applicant: ADVANCED MICRO DEVICES, INC.
    Inventors: Ashok Venkatachar, Karthik Punukollu, Srikanth Arekapudi, Samir A. Chitnis, Emil Talpes
  • Publication number: 20150026686
    Abstract: A method includes suppressing execution of an operation portion of a load-operation instruction in a processor responsive to an invalid status of a load portion of load-operation instruction. A processor includes an instruction pipeline including an execution unit operable to execute instructions and a scheduler unit. The scheduler unit includes a scheduler queue and is operable to store a load-operation in the scheduler queue. The load-operation instruction includes a load portion and an operation portion. The scheduler unit schedules the load portion for execution in the execution unit, marks the operation portion in the scheduler queue as eligible for execution responsive to scheduling the load portion, receives an indication of an invalid status of the load portion, and suppresses execution of the operation portion responsive to the indication of the invalid status.
    Type: Application
    Filed: July 16, 2013
    Publication date: January 22, 2015
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Francesco Spadini, Michael Achenbach, Emil Talpes, Ganesh Venkataramanan
  • Publication number: 20150026436
    Abstract: The present invention provides a method and apparatus for scheduling based on tags of different types. Some embodiments of the method include broadcasting a first tag to entries in a queue of a scheduler. The first tag is broadcast in response to a first instruction associated with a first entry in the queue being picked for execution. The first tag includes information identifying the first entry and information indicating a type of the first tag. Some embodiments of the method also include marking at least one second entry in the queue is ready to be picked for execution in response to at least one second tag associated with at least one second entry in the queue matching the first tag.
    Type: Application
    Filed: July 17, 2013
    Publication date: January 22, 2015
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Michael Achenbach, Teik Tan, Gregory W. Smaus, Ganesh Venkataramanan, Emil Talpes
  • Publication number: 20150026685
    Abstract: A method includes suppressing execution of at least one dependent instruction of a first instruction by a processor responsive to an invalid status of an ancestor load instruction associated with the first instruction. A processor includes an instruction pipeline having an execution unit to execute instructions, a load store unit for retrieving data from a memory hierarchy, and a scheduler unit. The scheduler unit selects for execution in the execution unit a first load instruction having at least one dependent instruction linked to the first load instruction for data forwarding from the load store unit and suppresses execution of a second dependent instruction of the first dependent instruction responsive to an invalid status of the first load instruction.
    Type: Application
    Filed: July 16, 2013
    Publication date: January 22, 2015
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Francesco Spadini, Michael Achenbach, Emil Talpes, Ganesh Venkataramanan
  • Patent number: 8819397
    Abstract: Methods and apparatuses are provided for increased efficiency in a processor via control word prediction. The apparatus comprises an operational unit capable of determining whether an instruction will change a first control word to a second control word for processing dependent instructions. Execution units process the dependent instructions using a predicted control word and compare the second control word to the predicted control word. A scheduling unit causes the execution units to reprocess the dependent instructions when the predicted control word does not match the second control word. The method comprises determining that an instruction will change a first control word to a second control word and processing the dependent instructions using a predicted control word. The second control word is compared to the predicted control word and the dependent instructions are reprocessed using the second control word when the predicted control word does not match the second control word.
    Type: Grant
    Filed: March 1, 2011
    Date of Patent: August 26, 2014
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Michael D. Estlick, Jay Fleischman, Debjit Das Sarma, Emil Talpes, Krishnan V. Ramani, Chun Liu
  • Publication number: 20140181482
    Abstract: An arithmetic unit performs store-to-load forwarding based on predicted dependencies between store instructions and load instructions. In some embodiments, the arithmetic unit maintains a table of store instructions that are awaiting movement to a load/store unit of the instruction pipeline. In response to receiving a load instruction that is predicted to be dependent on a store instruction stored at the table, the arithmetic unit causes the data associated with the store instruction to be placed into the physical register targeted by the load instruction. In some embodiments, the arithmetic unit performs the forwarding by mapping the physical register targeted by the load instruction to the physical register where the data associated with the store instruction is located.
    Type: Application
    Filed: December 20, 2012
    Publication date: June 26, 2014
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Gregory W. Smaus, Francesco Spadini, Matthew A. Rafacz, Michael Achenbach, Christopher J. Burke, Emil Talpes, Matthew M. Crum
  • Publication number: 20130007418
    Abstract: Methods and apparatuses are provided for flush operations in a processor. The apparatus comprises an out-of-order execution unit for processing instructions issued in-order from an instruction decoder for first and second threads and being configured to identify an errored instruction in a first thread. A retire unit includes a retire queue for receiving completed instructions from the out-of-order execution unit, the retire unit being configured retire older in-order first thread instructions until the errored instruction would be the next instruction to be retired, and then flushing the errored instruction and all later in-order first thread instructions from the retire queue. The method comprises determining that an errored instruction is being processed by an out-of-order execution unit of a processor and continuing to process to completion instructions earlier in-order from the errored instruction until the completion of the errored instruction.
    Type: Application
    Filed: June 30, 2011
    Publication date: January 3, 2013
    Applicant: ADVANCED MICRO DEVICES, INC.
    Inventors: Jay E. Fleischman, Emil Talpes, Debjit DasSarma
  • Publication number: 20120226891
    Abstract: Methods and apparatuses are provided for increased efficiency in a processor via control word prediction. The apparatus comprises an operational unit capable of determining whether an instruction will change a first control word to a second control word for processing dependent instructions. Execution units process the dependent instructions using a predicted control word and compare the second control word to the predicted control word. A scheduling unit causes the execution units to reprocess the dependent instructions when the predicted control word does not match the second control word. The method comprises determining that an instruction will change a first control word to a second control word and processing the dependent instructions using a predicted control word. The second control word is compared to the predicted control word and the dependent instructions are reprocessed using the second control word when the predicted control word does not match the second control word.
    Type: Application
    Filed: March 1, 2011
    Publication date: September 6, 2012
    Applicant: ADVANCED MICRO DEVICES, INC.
    Inventors: Michael D. ESTLICK, Jay FLEISCHMAN, Debjit Das Sarma, Emil TALPES, Krishnan V. RAMANI, Chun LIU
  • Publication number: 20120179895
    Abstract: Method and apparatus for fast decoding of microinstructions are disclosed. An integrated circuit is disclosed wherein microinstructions are queued for execution in an execution unit having multiple pipelines where each pipeline is configured to execute a set of supported microinstructions. The execution unit receives microinstruction data including an operation code (opcode) or a complex opcode. The execution unit executes the microinstruction multiple times wherein the microinstruction is executed at least once to get an address value and at least once to get a result of an operation. The execution unit processes complex opcodes by utilizing both a load/store support and a simple opcode support by splitting the complex opcode into load/store and simple opcode components and creating an internal source/destination between the two components.
    Type: Application
    Filed: January 12, 2011
    Publication date: July 12, 2012
    Applicant: ADVANCED MICRO DEVICES, INC.
    Inventors: Ganesh Venkataramanan, Emil Talpes
  • Publication number: 20120159217
    Abstract: A method and apparatus are described for reducing power consumption in a processor. A micro-operation is selected for execution, and a destination physical register tag of the selected micro-operation is compared to a plurality of source physical register tags of micro-operations dependent upon the selected micro-operation. If there is a match between the destination physical register tag and one of the source physical register tags, a corresponding physical register file (PRF) read operation is disabled. The comparison may be performed by a wakeup content-addressable memory (CAM) of a scheduler. The wakeup CAM may send a read control signal to the PRF to disable the read operation. Disabling the corresponding PRF read operation may include shutting off power in the PRF and related logic.
    Type: Application
    Filed: December 16, 2010
    Publication date: June 21, 2012
    Applicant: ADVANCED MICRO DEVICES, INC.
    Inventors: Ganesh Venkataramanan, Emil Talpes
  • Publication number: 20120144175
    Abstract: An integrated circuit is disclosed wherein microinstructions are selectively queued for execution in an execution unit having multiple pipelines where each pipeline is configured to execute a selected subset of a set of supported microinstructions. The execution unit receives microinstruction data including an operation code OpCode and an operation type OpType. The OpType data being at least one bit less that a minimum binary size of an OpCode required to uniquely identify the microinstruction. The OpType data selected to indicate a category of microinstructions having common execution requirement characteristics. The microinstructions are selectively queued for pipeline processing by the execution unit pipelines based on the OpType without decoding the OpCode of the microinstruction.
    Type: Application
    Filed: December 2, 2010
    Publication date: June 7, 2012
    Applicant: ADVANCED MICRO DEVICES, INC.
    Inventors: Ganesh Venkataramanan, Emil Talpes
  • Publication number: 20120143885
    Abstract: A method and apparatus for maintaining source ready information are disclosed. A first copy of the source ready information is stored in an Architectural Register Name (ARN)-indexed structure and a second copy of the source ready information is stored in a Physical Register Number (PRN)-indexed structure. As new instructions become available that require at least one source, the ARN-indexed structure is accessed. If at least one new source becomes available, the ARN-indexed structure and the PRN-indexed structure are updated to include information regarding the new sources.
    Type: Application
    Filed: December 1, 2010
    Publication date: June 7, 2012
    Applicant: ADVANCED MICRO DEVICES, INC.
    Inventors: Emil Talpes, Ganesh Venkataramanan
  • Publication number: 20120144174
    Abstract: A method and apparatus for utilizing scheduling resources in a processor are disclosed. A complex operation is assigned for execution as two micro-operations; a first micro-operation and a second micro-operation. The first micro-operation, which may be an address-generation operation, is executed using at least one of a first processing unit or a load and store unit and the second micro-operation, which may be an execution operation, is executed using a second processing unit, where at least one operand of the second micro-operation is an outcome of the first micro-operation.
    Type: Application
    Filed: December 1, 2010
    Publication date: June 7, 2012
    Applicant: ADVANCED MICRO DEVICES, INC.
    Inventors: Emil Talpes, Ganesh Venkataramanan
  • Publication number: 20120110594
    Abstract: A method and apparatus for assigning operations in a processor are provided. An incoming instruction is received. The incoming instruction is capable of being processed: only by a first processing unit (PU), only by a second PU or by either first and second PUs. The processing of first and second PUs is load balanced by assigning the received instructions capable of being processed by either the first and the second PUs based on a metric representing differential loads placed on the first and the second PUs.
    Type: Application
    Filed: October 28, 2010
    Publication date: May 3, 2012
    Applicant: ADVANCED MICRO DEVICES, INC.
    Inventors: Emil Talpes, Ganesh Venkataramanan, Sean Lie