Patents by Inventor Mark M. Leather

Mark M. Leather has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230097097
    Abstract: Systems, apparatuses, and methods for preemptively reserving buffer space for primitives and positions in a graphics pipeline are disclosed. A system includes a graphics pipeline frontend with any number of geometry engines coupled to corresponding shader engines. Each geometry engine launches shader wavefronts to execute on a corresponding shader engine. The geometry engine preemptively reserves buffer space for each wavefront prior to the wavefront being launched on the shader engine. When the shader engine executes a wavefront, the shader engine exports primitive and position data to the reserved buffer space. Multiple scan converters will consume the primitive and position data, with each scan converter consuming primitive and position data based on the screen coverage of the scan converter. After consuming the primitive and position data, the scan converters mark the buffer space as freed so that the geometry engine can then allocate the freed buffer space to subsequent shader wavefronts.
    Type: Application
    Filed: September 29, 2021
    Publication date: March 30, 2023
    Inventors: Todd Martin, Tad Robert Litwiller, Nishank Pathak, Randy Wayne Ramsey, Michael J. Mantor, Christopher J. Brennan, Mark M. Leather, Ryan James Cash
  • Patent number: 10817302
    Abstract: Systems, apparatuses, and methods for implementing a high bandwidth, low power vector register file for use by a parallel processor are disclosed. In one embodiment, a system includes at least a parallel processing unit with a plurality of processing pipeline. The parallel processing unit includes a vector arithmetic logic unit and a high bandwidth, low power, vector register file. The vector register file includes multi-bank high density random-access memories (RAMs) to satisfy register bandwidth requirements. The parallel processing unit also includes an instruction request queue and an instruction operand buffer to provide enough local bandwidth for VALU instructions and vector I/O instructions. Also, the parallel processing unit is configured to leverage the RAM's output flops as a last level cache to reduce duplicate operand requests between multiple instructions. The parallel processing unit includes a vector destination cache to provide additional R/W bandwidth for the vector register file.
    Type: Grant
    Filed: July 7, 2017
    Date of Patent: October 27, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Jiasheng Chen, Bin He, Mark M. Leather, Michael J. Mantor, Yunxiao Zou
  • Patent number: 10474468
    Abstract: Systems, apparatuses, and methods for processing variable wavefront sizes on a processor are disclosed. In one embodiment, a processor includes at least a scheduler, cache, and multiple execution units. When operating in a first mode, the processor executes the same instruction on multiple portions of a wavefront before proceeding to the next instruction of the shader program. When operating in a second mode, the processor executes a set of instructions on a first portion of a wavefront. In the second mode, when the processor finishes executing the set of instructions on the first portion of the wavefront, the processor executes the set of instructions on a second portion of the wavefront, and so on until all portions of the wavefront have been processed. The processor determines the operating mode based on one or more conditions.
    Type: Grant
    Filed: February 22, 2017
    Date of Patent: November 12, 2019
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Michael J. Mantor, Brian D. Emberling, Mark Fowler, Mark M. Leather
  • Publication number: 20180357064
    Abstract: Systems, apparatuses, and methods for implementing a high bandwidth, low power vector register file for use by a parallel processor are disclosed. In one embodiment, a system includes at least a parallel processing unit with a plurality of processing pipeline. The parallel processing unit includes a vector arithmetic logic unit and a high bandwidth, low power, vector register file. The vector register file includes multi-bank high density random-access memories (RAMs) to satisfy register bandwidth requirements. The parallel processing unit also includes an instruction request queue and an instruction operand buffer to provide enough local bandwidth for VALU instructions and vector I/O instructions. Also, the parallel processing unit is configured to leverage the RAM's output flops as a last level cache to reduce duplicate operand requests between multiple instructions. The parallel processing unit includes a vector destination cache to provide additional R/W bandwidth for the vector register file.
    Type: Application
    Filed: July 7, 2017
    Publication date: December 13, 2018
    Inventors: Jiasheng Chen, Bin He, Mark M. Leather, Michael J. Mantor, Yunxiao Zou
  • Publication number: 20180239606
    Abstract: Systems, apparatuses, and methods for processing variable wavefront sizes on a processor are disclosed. In one embodiment, a processor includes at least a scheduler, cache, and multiple execution units. When operating in a first mode, the processor executes the same instruction on multiple portions of a wavefront before proceeding to the next instruction of the shader program. When operating in a second mode, the processor executes a set of instructions on a first portion of a wavefront. In the second mode, when the processor finishes executing the set of instructions on the first portion of the wavefront, the processor executes the set of instructions on a second portion of the wavefront, and so on until all portions of the wavefront have been processed. The processor determines the operating mode based on one or more conditions.
    Type: Application
    Filed: February 22, 2017
    Publication date: August 23, 2018
    Inventors: Michael J. Mantor, Brian D. Emberling, Mark Fowler, Mark M. Leather
  • Patent number: 8933945
    Abstract: A graphics processing circuit includes at least two pipelines operative to process data in a corresponding set of tiles of a repeating tile pattern, a respective one of the at least two pipelines operative to process data in a dedicated tile, wherein the repeating tile pattern includes a horizontally and vertically repeating pattern of square regions. A graphics processing method includes receiving vertex data for a primitive to be rendered; generating pixel data in response to the vertex data; determining the pixels within a set of tiles of a repeating tile pattern to be processed by a corresponding one of at least two graphics pipelines in response to the pixel data, the repeating tile pattern including a horizontally and vertically repeating pattern of square regions; and performing pixel operations on the pixels within the determined set of tiles by the corresponding one of the at least two graphics pipelines.
    Type: Grant
    Filed: June 12, 2003
    Date of Patent: January 13, 2015
    Assignee: ATI Technologies ULC
    Inventors: Mark M. Leather, Eric Demers
  • Patent number: 8593465
    Abstract: The present invention provides a system for handling extra contexts for shader constants, and applications thereof. In an embodiment there is provided a computer-based method for executing a series of compute packets in an execution pipeline. The execution pipeline includes a first plurality of registers configured to store state-updates of a first type and a second plurality of registers configured to store state-updates of a second type. A first number of state-updates of the first type and a second number of state-updates of the second type are respectively identified and stored in the first and second plurality of registers. A compute packet is sent to the execution pipeline responsive to the first number and the second number. Then, the compute packet is executed by the execution pipeline.
    Type: Grant
    Filed: June 13, 2007
    Date of Patent: November 26, 2013
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Mark M. Leather, Brian D. Emberling
  • Patent number: 8156314
    Abstract: A system and method are described that manage incremental state updates in such a way that multiple threads within a processor can each operate, in effect, on their own set of state data. The system and method are applicable to any processor in which multiple threads require access to sets of state information which differ from one another by a relatively small number of state changes.
    Type: Grant
    Filed: October 25, 2007
    Date of Patent: April 10, 2012
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Mark M. Leather, Brian D. Emberling
  • Patent number: 7920141
    Abstract: The present invention relates to a rasterizer interpolator. In one embodiment, a setup unit is used to distribute graphics primitive instructions to multiple parallel rasterizers. To increase efficiency, the setup unit calculates the polygon data and checks it against one or more tiles prior to distribution. An output screen is divided into a number of regions, with a number of assignment configurations possible for various number of rasterizer pipelines. For instance, the screen is sub-divided into four regions and one of four rasterizers is granted ownership of one quarter of the screen. To reduce time spent on processing empty times, a problem in prior art implementations, the present invention reduces empty tiles by the process of coarse grain tiling. This process occurs by a series of iterations performed in parallel. Each region undergoes an iterative calculation/tiling process where coverage of the primitive is deduced at a successively more detailed level.
    Type: Grant
    Filed: February 28, 2006
    Date of Patent: April 5, 2011
    Assignee: ATI Technologies ULC
    Inventor: Mark M. Leather
  • Patent number: 7796133
    Abstract: The present invention is a unified shader unit used in texture processing in graphics processing device. Unlike the conventional method of using one shader for texture coordinate shading and another for color shading, the present shader performs both operations. The unified shader uses the same precision for both texture coordinate and color shading, thus simplifying the complexity of programming for two separate conventional shaders with different levels of precision. Furthermore, the present invention uses enhanced scheduling logic to perform indirect texture and bump mapping in a single first-in, first-out (FIFO) memory structure and avoids the problems associated with large FIFOs with buffer registers found in conventional shaders. In one embodiment, a plurality of ALU-memory pairs are synchronized to form a plurality of pipelines to execution shading instructions. In another embodiment, a plurality of unified shaders are synchronized and connected together to processing shading operations concurrently.
    Type: Grant
    Filed: December 8, 2003
    Date of Patent: September 14, 2010
    Assignee: ATI Technologies ULC
    Inventors: Mark M. Leather, Eric Demers
  • Publication number: 20100110084
    Abstract: The present invention relates to a parallel pipeline graphics system. The parallel pipeline graphics system includes a back-end configured to receive primitives and combinations of primitives (i.e., geometry) and process the geometry to produce values to place in a frame buffer for rendering on screen. Unlike prior single pipeline implementation, some embodiments use two or four parallel pipelines, though other configurations having 2?n pipelines may be used. When geometry data is sent to the back-end, it is divided up and provided to one of the parallel pipelines. Each pipeline is a component of a raster back-end, where the display screen is divided into tiles and a defined portion of the screen is sent through a pipeline that owns that portion of the screen's tiles. In one embodiment, each pipeline comprises a scan converter, a hierarchical-Z unit, a z buffer logic, a rasterizer, a shader, and a color buffer logic.
    Type: Application
    Filed: November 4, 2009
    Publication date: May 6, 2010
    Applicant: ATI Technologies ULC
    Inventors: Mark M. Leather, Eric Demers
  • Patent number: 7633506
    Abstract: The present invention relates to a parallel pipeline graphics system. The parallel pipeline graphics system includes a back-end configured to receive primitives and combinations of primitives (i.e., geometry) and process the geometry to produce values to place in a frame buffer for rendering on screen. Unlike prior single pipeline implementation, some embodiments use two or four parallel pipelines, though other configurations having 2^n pipelines may be used. When geometry data is sent to the back-end, it is divided up and provided to one of the parallel pipelines. Each pipeline is a component of a raster back-end, where the display screen is divided into tiles and a defined portion of the screen is sent through a pipeline that owns that portion of the screen's tiles. In one embodiment, each pipeline comprises a scan converter, a hierarchical-Z unit, a z buffer logic, a rasterizer, a shader, and a color buffer logic.
    Type: Grant
    Filed: November 26, 2003
    Date of Patent: December 15, 2009
    Assignee: ATI Technologies ULC
    Inventors: Mark M. Leather, Eric Demers
  • Publication number: 20090276563
    Abstract: A system and method are described that manage incremental state updates in such a way that multiple threads within a processor can each operate, in effect, on their own set of state data. The system and method are applicable to any processor in which multiple threads require access to sets of state information which differ from one another by a relatively small number of state changes.
    Type: Application
    Filed: October 25, 2007
    Publication date: November 5, 2009
    Inventors: Mark M. Leather, Brian D. Emberling
  • Patent number: 7545387
    Abstract: The embodiments of the present invention are a method and apparatus to perform anti-aliasing using multi-sampling on a non-power-of-two pixel grid. Using the present invention with 6 sample multisampling gives the same visual antialiasing quality as 8 samples using a prior art technique but uses less memory. A non-power-of-two equally spaced sample from a conventional grid of size N×N, where N is 12 can be chosen using the present invention. A scan conversion to determine the set of pixels covered by a polygon is performed in two parts. According to one embodiment, the present invention can multiply and divide by “N” in order to multisample an image using samples per pixel chosen from a N×N sub-sample grid, where “N” is not necessarily a power of 2. The present invention performs the divide by “N” step, where the step is achieved using a quick divide by 3 or 12 technique.
    Type: Grant
    Filed: September 4, 2007
    Date of Patent: June 9, 2009
    Assignee: ATI Technologies ULC
    Inventors: Mark M. Leather, Eric Demers
  • Publication number: 20080313436
    Abstract: The present invention provides a system for handling extra contexts for shader constants, and applications thereof. In an embodiment there is provided a computer-based method for executing a series of compute packets in an execution pipeline. The execution pipeline includes a first plurality of registers configured to store state-updates of a first type and a second plurality of registers configured to store state-updates of a second type. A first number of state-updates of the first type and a second number of state-updates of the second type are respectively identified and stored in the first and second plurality of registers. A compute packet is sent to the execution pipeline responsive to the first number and the second number. Then, the compute packet is executed by the execution pipeline.
    Type: Application
    Filed: June 13, 2007
    Publication date: December 18, 2008
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Mark M. Leather, Brian D. Emberling
  • Patent number: 7317459
    Abstract: A graphics processor includes an embedded frame buffer for storing frame data prior to sending the frame data to an external location, such as main memory. A copy pipeline is provided which converts the data from one format to another format prior to writing the data to the external location. The conversion may be from one RGB color format to another RGB color format, from one YUV format to another YUV format, from an RGB color format to a YUV color format, or from a YUV color format to an RGB color format. MPEG image data initially stored in main memory in a YUV format as a texture is transferred to the embedded frame buffer prior to initiating a copy-out process via the copy pipeline from the embedded frame buffer to an external frame buffer in main memory. During the copy-out process, pixels are converted from YUV format to an RGB format.
    Type: Grant
    Filed: November 27, 2006
    Date of Patent: January 8, 2008
    Assignee: Nintendo Co., Ltd.
    Inventors: Farhad Fouladi, Mark M. Leather, Robert Moore, Howard Cheng, Timothy J. Van Hook
  • Patent number: 7307640
    Abstract: A graphics system including a custom graphics and audio processor produces exciting 2D and 3D graphics and surround sound. The system includes a graphics and audio processor including a 3D graphics pipeline and an audio digital signal processor. Emboss style effects are created using fully pipelined hardware including two distinct dot-product computation units that perform a scaled model view matrix multiply without requiring the Normal input vector and which also compute dot-products between the Binormal and Tangent vectors and a light direction vector in parallel. The resulting texture coordinate displacements are provided to texture mapping hardware that performs a texture mapping operation providing texture combining in one pass. The disclosed pipelined arrangement efficiently provides interesting embossed style image effects such as raised and lowered patterns on surfaces.
    Type: Grant
    Filed: April 15, 2005
    Date of Patent: December 11, 2007
    Assignee: Nintendo Co., Ltd.
    Inventors: Eric Demers, Mark M. Leather, Mark G. Segal
  • Patent number: 7307638
    Abstract: A graphics system including a custom graphics and audio processor produces exciting 2D and 3D graphics and surround sound. The system includes a graphics and audio processor including a 3D graphics pipeline and an audio digital signal processor. The graphics pipeline renders and prepares images for display at least in part in response to polygon vertex attribute data and texel color data stored as a texture images in an associated memory. An efficient texturing pipeline arrangement achieves a relatively low chip-footprint by utilizing a single texture coordinate/data processing unit that interleaves the processing of logical direct and indirect texture coordinate data and a texture lookup data feedback path for “recirculating” indirect texture lookup data retrieved from a single texture retrieval unit back to the texture coordinate/data processing unit.
    Type: Grant
    Filed: June 15, 2005
    Date of Patent: December 11, 2007
    Assignee: Nintendo Co., Ltd.
    Inventors: Mark M. Leather, Robert A. Drebin, Timothy J. Van Hook
  • Patent number: 7280119
    Abstract: The embodiments of the present invention are a method and apparatus to perform anti-aliasing using multi-sampling on a non-power-of-two pixel grid. Using the present invention with 6 sample multisampling gives the same visual antialiasing quality as 8 samples using a prior art technique but uses less memory. A non-power-of-two equally spaced sample from a conventional grid of size N×N, where N is 12 can be chosen using the present invention. A scan conversion to determine the set of pixels covered by a polygon is performed in two parts. According to one embodiment, the present invention can multiply and divide by “N” in order to multisample an image using samples per pixel chosen from a N×N sub-sample grid, where “N” is not necessarily a power of 2. The present invention performs the divide by “N” step, where the step is achieved using a quick divide by 3 or 12 technique.
    Type: Grant
    Filed: February 13, 2004
    Date of Patent: October 9, 2007
    Assignee: ATI Technologies Inc.
    Inventors: Mark M. Leather, Eric Demers
  • Patent number: 7205999
    Abstract: Apparatus and example methods for environment-mapped style of bump-mapping (EMBM) are provided that use a pre-computed bump-map texture accessed as an indirect texture along with pre-computed object surface normals (i.e., the Normal, Tangent and Binormal vectors) from each vertex of rendered polygons to effectively generate a new perturbed Normal vector per vertex. The perturbed new Normal vectors are then used to look up texels in an environment map. A specialized bump map texture data/coordinate processing “bump unit” is provided in the graphics pipeline for performing predetermined matrix multiplication operations on retrieved lookup data from the indirect-texture bump map.
    Type: Grant
    Filed: September 30, 2004
    Date of Patent: April 17, 2007
    Assignee: Nintendo Co., Ltd.
    Inventor: Mark M. Leather