Patents by Inventor Michael Estlick

Michael Estlick has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11960897
    Abstract: In some implementations, a processor includes a plurality of parallel instruction pipes, a register file includes at least one shared read port configured to be shared across multiple pipes of the plurality of parallel instruction pipes. Control logic controls multiple parallel instruction pipes to read from the at least one shared read port. In certain examples, the at least one shared register file read port is coupled as a single read port for one of the parallel instruction pipes and as a shared register file read port for a plurality of other parallel instruction pipes.
    Type: Grant
    Filed: July 30, 2021
    Date of Patent: April 16, 2024
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Michael Estlick, Erik Swanson, Eric Dixon, Todd Baumgartner
  • Publication number: 20240111489
    Abstract: A processing unit includes a plurality of adders and a plurality of carry bit generation circuits. The plurality of adders add first and second X bit binary portion values of a first Y bit binary value and a second Y bit binary value. Y is a multiple of X. The plurality of adders further generate first carry bits. The plurality of carry bit generation circuits is coupled to the plurality of adders, respectively, and receive the first carry bits. The plurality of carry bit generation circuits generate second carry bits based on the first carry bits. The plurality of adders use the second carry bits to add the first and second X bit binary portions of the first and second Y bit binary values, respectively.
    Type: Application
    Filed: September 29, 2022
    Publication date: April 4, 2024
    Inventors: Onur Kayiran, Michael Estlick, Masab Ahmad, Gabriel H. Loh
  • Publication number: 20240111529
    Abstract: An integrated circuit includes a vector data processing unit that employs a cross-lane shuffle unit including multiplexing logic that programmably shuffles packed source lane values, each corresponding to one of a plurality of vector lanes, to different output vector result lane positions over multiple cycles. In certain implementations, in a first cycle, control logic in the cross-shuffle unit controls the multiplexing logic to select source lane values to be placed in a first group of output vector result lane positions for a vector result register; and in at least a second cycle, the same multiplexing logic is reused to select source lane values to be placed in a second group of output vector result lane positions for the vector result register wherein at least one of the selected source lane values is moved to a different result lane position. Associated methods are also presented.
    Type: Application
    Filed: September 30, 2022
    Publication date: April 4, 2024
    Inventors: ERIC DIXON, MICHAEL ESTLICK, ERIK D. SWANSON
  • Publication number: 20240111526
    Abstract: A data processing system includes a vector data processing unit that includes a shared scheduler queue configured to store in a same queue, at least one entry that includes at least a mask type instruction and another entry that includes at least a vector type instruction. Shared pipeline control logic controls a vector data path or a mask data path, based a type of instruction picked from the same queue. In some examples, at least one mask type instruction and the at least one vector type instruction each include a source operand having a corresponding shared source register bit field that indexes into both a mask register file and a vector register file. The shared pipeline control logic uses a mask register file or a vector register file depending on whether bits of the shared source register bit field identify a mask source register or a vector source register.
    Type: Application
    Filed: September 30, 2022
    Publication date: April 4, 2024
    Inventors: MICHAEL ESTLICK, ERIC DIXON, THEODORE CARLSON, ERIK D. SWANSON
  • Publication number: 20240095180
    Abstract: The disclosed computer-implemented method for interpolating register-based lookup tables can include identifying, within a set of registers, a lookup table that has been encoded for storage within the set of registers. The method can also include receiving a request to look up a value in the lookup table and responding to the request by interpolating, from the encoded lookup table stored in the set of registers, a representation of the requested value. Various other methods, systems, and computer-readable media are also disclosed.
    Type: Application
    Filed: December 23, 2022
    Publication date: March 21, 2024
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Gabriel H. Loh, Michael Estlick, Jay Fleischman, Michael J. Schulte, Bradford Beckmann, Yasuko Eckert
  • Patent number: 11907070
    Abstract: An integrated circuit includes one or more processing units that execute instructions that employ a register file, control logic creates a pre-startup register free list, prior to normal operation of at least one of the processing units, that includes a list of registers devoid of defective registers. In some implementations, no column and row repair information is provided to register file repair logic. In certain examples, the register file is configured as a repair-less register file. During normal operation of the one or more processing units, the integrated circuit employs the pre-startup register free list to select registers in a register file for the executing instructions. Associated methods are also presented.
    Type: Grant
    Filed: July 30, 2021
    Date of Patent: February 20, 2024
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Eric Busta, Michael L. Golden, Sean M. O′Mullan, James Wingfield, Keith A. Kasprak, Russell Schreiber, Michael Estlick
  • Publication number: 20240045694
    Abstract: A coprocessor such as a floating-point unit includes a pipeline that is partitioned into a first portion and a second portion. A controller is configured to provide control signals to the first portion and the second portion of the pipeline. A first physical distance traversed by control signals propagating from the controller to the first portion of the pipeline is shorter than a second physical distance traversed by control signals propagating from the controller to the second portion of the pipeline. A scheduler is configured to cause a physical register file to provide a first subset of bits of an instruction to the first portion at a first time. The physical register file provides a second subset of the bits of the instruction to the second portion at a second time subsequent to the first time.
    Type: Application
    Filed: June 16, 2023
    Publication date: February 8, 2024
    Inventors: Jay FLEISCHMAN, Michael ESTLICK, Michael Christopher SEDMAK, Erik SWANSON, Sneha V. DESAI
  • Publication number: 20240004664
    Abstract: The disclosed system may include a processor configured to detect that a data unit size for an instruction is smaller than a register. The processor may allocate a first portion of the register to the instruction in a manner that leaves a second portion of the register available for allocating to an additional instruction. The processor may also track the register as a split register. Various other methods, systems, and computer-readable media are also disclosed.
    Type: Application
    Filed: June 30, 2022
    Publication date: January 4, 2024
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Sree Harsha Kosuru, Eric Dixon, Erik Swanson, Michael Estlick, Patrick Michael Lowry
  • Patent number: 11847463
    Abstract: A processor includes a load/store unit and an execution pipeline to execute an instruction that represents a single-instruction-multiple-data (SIMD) operation, and which references a memory block storing operand data for one or more lanes of a plurality of lanes and a mask vector indicating which lanes of a plurality of lanes are enabled and which are disabled for the operation. The execution pipeline executes an instruction in a first execution mode unless a memory fault is generated during execution of the instruction in the first execution mode. In response to the memory fault, the execution pipeline re-executes the instruction in a second execution mode. In the first execution mode, a single load operation is attempted to access the memory block via the load/store unit. In the second execution mode, a separate load operation is performed by the load/store unit for each enabled lane of the plurality of lanes prior to executing the SIMD operation.
    Type: Grant
    Filed: September 27, 2019
    Date of Patent: December 19, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Kai Troester, Scott Thomas Bingham, John M. King, Michael Estlick, Erik Swanson, Robert Weidner
  • Patent number: 11842200
    Abstract: An apparatus includes a plurality of load buses and a load store unit that includes a plurality of load ports to access the plurality of load buses. The load store unit performs a gather operation to concurrently gather a plurality of subsets of data from a memory via the plurality of load buses in a first mode. The apparatus also includes a register that is partitioned into a plurality of portions to hold the plurality of subsets of data provided by the load store unit. The load store unit ignores exceptions or faults while performing the gather operation in the first mode and transitions to a second mode in response to an exception or fault. Two lanes are dispatched to concurrently perform the gather operation per clock cycle in the first mode and a single lane is dispatched to perform the gather operation per clock cycle in the second mode.
    Type: Grant
    Filed: September 27, 2019
    Date of Patent: December 12, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: John M. King, Magiting Talisayon, Michael Estlick
  • Publication number: 20230393855
    Abstract: An approach is provided for implementing register based single instruction, multiple data (SIMD) lookup table operations. According to the approach, an instruction set architecture (ISA) can support one or more SIMD instructions that enable vectors or multiple values in source data registers to be processed in parallel using a lookup table or truth table stored in one or more function registers. The SIMD instructions can be flexibly configured to support functions with inputs and outputs of various sizes and data formats. Various approaches are also described for supporting very large lookup tables that span multiple registers.
    Type: Application
    Filed: June 6, 2022
    Publication date: December 7, 2023
    Inventors: Gabriel H. Loh, Yasuko Eckert, Bradford Beckmann, Michael Estlick, Jay Fleischman
  • Patent number: 11709681
    Abstract: A coprocessor such as a floating-point unit includes a pipeline that is partitioned into a first portion and a second portion. A controller is configured to provide control signals to the first portion and the second portion of the pipeline. A first physical distance traversed by control signals propagating from the controller to the first portion of the pipeline is shorter than a second physical distance traversed by control signals propagating from the controller to the second portion of the pipeline. A scheduler is configured to cause a physical register file to provide a first subset of bits of an instruction to the first portion at a first time. The physical register file provides a second subset of the bits of the instruction to the second portion at a second time subsequent to the first time.
    Type: Grant
    Filed: December 11, 2017
    Date of Patent: July 25, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Jay Fleischman, Michael Estlick, Michael Christopher Sedmak, Erik Swanson, Sneha V. Desai
  • Patent number: 11573801
    Abstract: A processor includes a register file and control logic that detects multiple different sets of sequential zero bits of a register in the register file, wherein each of the multiple different sets has a bit length that corresponds to a partial instruction width and operates at a first partial instruction width or a second partial instruction width with the register file depending on number of sets of zero bits detected in the register. In certain examples, the control logic causes operating at first instruction width that avoids merging of a first bit length of data in the register and operating at the second instruction width that avoids merging of a second bit length of data in the register. In some examples, a register rename map table incudes multiple zero bits that identify the detected multiple different sets of bits of sequential zeros.
    Type: Grant
    Filed: September 29, 2021
    Date of Patent: February 7, 2023
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Eric Dixon, Erik Swanson, Theodore Carlson, Ruchir Dalal, Michael Estlick
  • Publication number: 20230032375
    Abstract: An integrated circuit includes one or more processing units that execute instructions that employ a register file, control logic creates a pre-startup register free list, prior to normal operation of at least one of the processing units, that includes a list of registers devoid of undefective registers. In some implementations, no column and row repair information is provided to register file repair logic. In certain examples, the register file is configured as a repair-less register file. During normal operation of the one or more processing units, the integrated circuit employs the pre-startup register free list to select registers in a register file for the executing instructions. Associated methods are also presented.
    Type: Application
    Filed: July 30, 2021
    Publication date: February 2, 2023
    Inventors: Eric Busta, Michael L. Golden, Sean M. O'Mullan, James Wingfield, Keith A. Kasprak, Russell Schreiber, Michael Estlick
  • Publication number: 20230034933
    Abstract: Methods, systems, and apparatuses provide support for allowing thread forward progress in a processing system and that improves quality of service. One system includes a processor; a bus coupled to the processor; a memory coupled to the processor via the bus; and a floating point unit coupled to the processor via the bus, wherein floating point unit comprises hardware control logic operative to: store for each thread, by a scheduler of the floating point unit, a counter; increase, by the scheduler, a value of the counter for each thread corresponding to a thread when at least one source ready operation exist for the thread; compare, by the scheduler, the value of the counter to a predetermined threshold; and make other threads ineligible to be picked by the scheduler when the counter is greater than or equal to the predetermined threshold.
    Type: Application
    Filed: July 30, 2021
    Publication date: February 2, 2023
    Inventors: Michael Estlick, Erik Swanson, Eric Dixon
  • Publication number: 20230034072
    Abstract: In some implementations, a processor includes a plurality of parallel instruction pipes, a register file includes at least one shared read port configured to be shared across multiple pipes of the plurality of parallel instruction pipes. Control logic controls multiple parallel instruction pipes to read from the at least one shared read port. In certain examples, the at least one shared register file read port is coupled as a single read port for one of the parallel instruction pipes and as a shared register file read port for a plurality of other parallel instruction pipes.
    Type: Application
    Filed: July 30, 2021
    Publication date: February 2, 2023
    Inventors: Michael Estlick, Erik Swanson, Eric Dixon, Todd Baumgartner
  • Patent number: 11567554
    Abstract: A pipeline includes a first portion configured to process a first subset of bits of an instruction and a second portion configured to process a second subset of the bits of the instruction. A first clock mesh is configured to provide a first clock signal to the first portion of the pipeline. A second clock mesh is configured to provide a second clock signal to the second portion of the pipeline. The first and second clock meshes selectively provide the first and second clock signals based on characteristics of in-flight instructions that have been dispatched to the pipeline but not yet retired. In some cases, a physical register file is configured to store values of bits representative of instructions. Only the first subset is stored in the physical register file in response to the value of the zero high bit indicating that the second subset is equal to zero.
    Type: Grant
    Filed: December 11, 2017
    Date of Patent: January 31, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Jay Fleischman, Michael Estlick, Michael Christopher Sedmak, Erik Swanson, Sneha V. Desai
  • Patent number: 11544065
    Abstract: A processor includes a front-end with an instruction set that operates at a first bit width and a floating point unit coupled to receive the instruction set in the processor that operates at the first bit width. The floating point unit operates at a second bit width and, based upon a bit width assessment of the instruction set provided to the floating point unit, the floating point unit employs a shadow-latch configured floating point register file to perform bit width reconfiguration. The shadow-latch configured floating point register file includes a plurality of regular latches and a plurality of shadow latches for storing data that is to be either read from or written to the shadow latches. The bit width reconfiguration enables the floating point unit that operates at the second bit width to operate on the instruction set received at the first bit width.
    Type: Grant
    Filed: September 27, 2019
    Date of Patent: January 3, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Arun A. Nair, Todd Baumgartner, Michael Estlick, Erik Swanson
  • Patent number: 11451241
    Abstract: A processor employs a set of bits to indicate values of portions of registers of a register file. In response to a specified instruction indicating an expected change of instruction types to be executed, the processor sets one or more of the bits and, for subsequent instructions, interprets corresponding portions of the registers as having a specified value (e.g., zero). By employing the set of bits to set the values of the register portions, rather than setting the individual portions of the registers to the specified value, the processor conserves processor resources (e.g., power) when the processor transitions between executing instructions of different types.
    Type: Grant
    Filed: December 14, 2017
    Date of Patent: September 20, 2022
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Erik Swanson, Sneha V. Desai, Michael Estlick
  • Patent number: 11281466
    Abstract: A floating point unit includes a non-pickable scheduler queue (NSQ) that offers a load operation concurrently with a load store unit retrieving load data for an operand that is to be loaded by the load operation. The floating point unit also includes a renamer that renames architectural registers used by the load operation and allocates physical register numbers to the load operation in response to receiving the load operation from the NSQ. The floating point unit further includes a set of pickable scheduler queues that receive the load operation from the renamer and store the load operation prior to execution. A physical register file is implemented in the floating point unit and a free list is used to store physical register numbers of entries in the physical register file that are available for allocation.
    Type: Grant
    Filed: October 22, 2019
    Date of Patent: March 22, 2022
    Assignees: Advanced Micro Devices, Inc., ATI TECHNOLOGIES ULC
    Inventors: Arun A. Nair, Michael Estlick, Erik Swanson, Sneha V. Desai, Donglin Ji