Patents by Inventor Mark M. Leather
Mark M. Leather has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20230097097
Abstract: Systems, apparatuses, and methods for preemptively reserving buffer space for primitives and positions in a graphics pipeline are disclosed. A system includes a graphics pipeline frontend with any number of geometry engines coupled to corresponding shader engines. Each geometry engine launches shader wavefronts to execute on a corresponding shader engine. The geometry engine preemptively reserves buffer space for each wavefront prior to the wavefront being launched on the shader engine. When the shader engine executes a wavefront, the shader engine exports primitive and position data to the reserved buffer space. Multiple scan converters will consume the primitive and position data, with each scan converter consuming primitive and position data based on the screen coverage of the scan converter. After consuming the primitive and position data, the scan converters mark the buffer space as freed so that the geometry engine can then allocate the freed buffer space to subsequent shader wavefronts.
Type: Application
Filed: September 29, 2021
Publication date: March 30, 2023
Inventors: Todd Martin, Tad Robert Litwiller, Nishank Pathak, Randy Wayne Ramsey, Michael J. Mantor, Christopher J. Brennan, Mark M. Leather, Ryan James Cash
-
Patent number: 10817302
Abstract: Systems, apparatuses, and methods for implementing a high bandwidth, low power vector register file for use by a parallel processor are disclosed. In one embodiment, a system includes at least a parallel processing unit with a plurality of processing pipelines. The parallel processing unit includes a vector arithmetic logic unit and a high bandwidth, low power, vector register file. The vector register file includes multi-bank high density random-access memories (RAMs) to satisfy register bandwidth requirements. The parallel processing unit also includes an instruction request queue and an instruction operand buffer to provide enough local bandwidth for VALU instructions and vector I/O instructions. Also, the parallel processing unit is configured to leverage the RAM's output flops as a last level cache to reduce duplicate operand requests between multiple instructions. The parallel processing unit includes a vector destination cache to provide additional R/W bandwidth for the vector register file.
Type: Grant
Filed: July 7, 2017
Date of Patent: October 27, 2020
Assignee: Advanced Micro Devices, Inc.
Inventors: Jiasheng Chen, Bin He, Mark M. Leather, Michael J. Mantor, Yunxiao Zou
-
Patent number: 10474468
Abstract: Systems, apparatuses, and methods for processing variable wavefront sizes on a processor are disclosed. In one embodiment, a processor includes at least a scheduler, cache, and multiple execution units. When operating in a first mode, the processor executes the same instruction on multiple portions of a wavefront before proceeding to the next instruction of the shader program. When operating in a second mode, the processor executes a set of instructions on a first portion of a wavefront. In the second mode, when the processor finishes executing the set of instructions on the first portion of the wavefront, the processor executes the set of instructions on a second portion of the wavefront, and so on until all portions of the wavefront have been processed. The processor determines the operating mode based on one or more conditions.
Type: Grant
Filed: February 22, 2017
Date of Patent: November 12, 2019
Assignee: Advanced Micro Devices, Inc.
Inventors: Michael J. Mantor, Brian D. Emberling, Mark Fowler, Mark M. Leather
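The difference between the two operating modes in the abstract above comes down to loop order. The following is a minimal sketch of that ordering only, not the patented implementation: the function name `execution_order`, the mode labels, and the representation of a wavefront as numbered portions are all assumptions made for illustration.

```python
def execution_order(num_instructions, num_portions, mode):
    """Return the (instruction, portion) pairs in the order they would issue.

    Hypothetical model: instructions and wavefront portions are just indices.
    """
    if mode == "first":
        # First mode: run one instruction across every portion of the
        # wavefront before moving on to the next instruction.
        return [(i, p)
                for i in range(num_instructions)
                for p in range(num_portions)]
    if mode == "second":
        # Second mode: run the whole set of instructions on one portion,
        # then repeat for the next portion.
        return [(i, p)
                for p in range(num_portions)
                for i in range(num_instructions)]
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    print(execution_order(2, 2, "first"))
    print(execution_order(2, 2, "second"))
```

Both modes visit the same set of pairs; only the traversal order changes, which is what makes the choice a scheduling decision.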
-
Publication number: 20180357064
Abstract: Systems, apparatuses, and methods for implementing a high bandwidth, low power vector register file for use by a parallel processor are disclosed. In one embodiment, a system includes at least a parallel processing unit with a plurality of processing pipelines. The parallel processing unit includes a vector arithmetic logic unit and a high bandwidth, low power, vector register file. The vector register file includes multi-bank high density random-access memories (RAMs) to satisfy register bandwidth requirements. The parallel processing unit also includes an instruction request queue and an instruction operand buffer to provide enough local bandwidth for VALU instructions and vector I/O instructions. Also, the parallel processing unit is configured to leverage the RAM's output flops as a last level cache to reduce duplicate operand requests between multiple instructions. The parallel processing unit includes a vector destination cache to provide additional R/W bandwidth for the vector register file.
Type: Application
Filed: July 7, 2017
Publication date: December 13, 2018
Inventors: Jiasheng Chen, Bin He, Mark M. Leather, Michael J. Mantor, Yunxiao Zou
-
Publication number: 20180239606
Abstract: Systems, apparatuses, and methods for processing variable wavefront sizes on a processor are disclosed. In one embodiment, a processor includes at least a scheduler, cache, and multiple execution units. When operating in a first mode, the processor executes the same instruction on multiple portions of a wavefront before proceeding to the next instruction of the shader program. When operating in a second mode, the processor executes a set of instructions on a first portion of a wavefront. In the second mode, when the processor finishes executing the set of instructions on the first portion of the wavefront, the processor executes the set of instructions on a second portion of the wavefront, and so on until all portions of the wavefront have been processed. The processor determines the operating mode based on one or more conditions.
Type: Application
Filed: February 22, 2017
Publication date: August 23, 2018
Inventors: Michael J. Mantor, Brian D. Emberling, Mark Fowler, Mark M. Leather
-
Patent number: 8933945
Abstract: A graphics processing circuit includes at least two pipelines operative to process data in a corresponding set of tiles of a repeating tile pattern, a respective one of the at least two pipelines operative to process data in a dedicated tile, wherein the repeating tile pattern includes a horizontally and vertically repeating pattern of square regions. A graphics processing method includes receiving vertex data for a primitive to be rendered; generating pixel data in response to the vertex data; determining the pixels within a set of tiles of a repeating tile pattern to be processed by a corresponding one of at least two graphics pipelines in response to the pixel data, the repeating tile pattern including a horizontally and vertically repeating pattern of square regions; and performing pixel operations on the pixels within the determined set of tiles by the corresponding one of the at least two graphics pipelines.
Type: Grant
Filed: June 12, 2003
Date of Patent: January 13, 2015
Assignee: ATI Technologies ULC
Inventors: Mark M. Leather, Eric Demers
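One way to picture a horizontally and vertically repeating pattern of square tiles, as described in the abstract above, is a checkerboard-style mapping from pixel coordinates to an owning pipeline. The sketch below is illustrative only: the tile size of 16 pixels, the function name `owning_pipeline`, and the `(tile_x + tile_y) % num_pipelines` interleave are assumptions, since the abstract does not specify any of them.

```python
def owning_pipeline(x, y, tile_size=16, num_pipelines=2):
    """Map a pixel to the pipeline that owns its tile.

    Assumed layout: square tiles of tile_size pixels, assigned so that the
    pattern repeats both horizontally and vertically.
    """
    tile_x = x // tile_size  # tile column containing the pixel
    tile_y = y // tile_size  # tile row containing the pixel
    return (tile_x + tile_y) % num_pipelines


if __name__ == "__main__":
    # Print the owner of each tile in a 4x4 block of tiles.
    for ty in range(4):
        print([owning_pipeline(tx * 16, ty * 16) for tx in range(4)])
```

With two pipelines this yields an alternating checkerboard, so a primitive covering several tiles tends to spread its pixels across both pipelines.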
-
Patent number: 8593465
Abstract: The present invention provides a system for handling extra contexts for shader constants, and applications thereof. In an embodiment there is provided a computer-based method for executing a series of compute packets in an execution pipeline. The execution pipeline includes a first plurality of registers configured to store state-updates of a first type and a second plurality of registers configured to store state-updates of a second type. A first number of state-updates of the first type and a second number of state-updates of the second type are respectively identified and stored in the first and second plurality of registers. A compute packet is sent to the execution pipeline responsive to the first number and the second number. Then, the compute packet is executed by the execution pipeline.
Type: Grant
Filed: June 13, 2007
Date of Patent: November 26, 2013
Assignee: Advanced Micro Devices, Inc.
Inventors: Mark M. Leather, Brian D. Emberling
-
Patent number: 8156314
Abstract: A system and method are described that manage incremental state updates in such a way that multiple threads within a processor can each operate, in effect, on their own set of state data. The system and method are applicable to any processor in which multiple threads require access to sets of state information which differ from one another by a relatively small number of state changes.
Type: Grant
Filed: October 25, 2007
Date of Patent: April 10, 2012
Assignee: Advanced Micro Devices, Inc.
Inventors: Mark M. Leather, Brian D. Emberling
-
Patent number: 7920141
Abstract: The present invention relates to a rasterizer interpolator. In one embodiment, a setup unit is used to distribute graphics primitive instructions to multiple parallel rasterizers. To increase efficiency, the setup unit calculates the polygon data and checks it against one or more tiles prior to distribution. An output screen is divided into a number of regions, with a number of assignment configurations possible for various numbers of rasterizer pipelines. For instance, the screen is sub-divided into four regions and one of four rasterizers is granted ownership of one quarter of the screen. To reduce time spent on processing empty tiles, a problem in prior art implementations, the present invention reduces empty tiles by the process of coarse grain tiling. This process occurs by a series of iterations performed in parallel. Each region undergoes an iterative calculation/tiling process where coverage of the primitive is deduced at a successively more detailed level.
Type: Grant
Filed: February 28, 2006
Date of Patent: April 5, 2011
Assignee: ATI Technologies ULC
Inventor: Mark M. Leather
-
Patent number: 7796133
Abstract: The present invention is a unified shader unit used in texture processing in a graphics processing device. Unlike the conventional method of using one shader for texture coordinate shading and another for color shading, the present shader performs both operations. The unified shader uses the same precision for both texture coordinate and color shading, thus simplifying the complexity of programming for two separate conventional shaders with different levels of precision. Furthermore, the present invention uses enhanced scheduling logic to perform indirect texture and bump mapping in a single first-in, first-out (FIFO) memory structure and avoids the problems associated with large FIFOs with buffer registers found in conventional shaders. In one embodiment, a plurality of ALU-memory pairs are synchronized to form a plurality of pipelines to execute shading instructions. In another embodiment, a plurality of unified shaders are synchronized and connected together to process shading operations concurrently.
Type: Grant
Filed: December 8, 2003
Date of Patent: September 14, 2010
Assignee: ATI Technologies ULC
Inventors: Mark M. Leather, Eric Demers
-
Publication number: 20100110084
Abstract: The present invention relates to a parallel pipeline graphics system. The parallel pipeline graphics system includes a back-end configured to receive primitives and combinations of primitives (i.e., geometry) and process the geometry to produce values to place in a frame buffer for rendering on screen. Unlike prior single pipeline implementations, some embodiments use two or four parallel pipelines, though other configurations having 2^n pipelines may be used. When geometry data is sent to the back-end, it is divided up and provided to one of the parallel pipelines. Each pipeline is a component of a raster back-end, where the display screen is divided into tiles and a defined portion of the screen is sent through a pipeline that owns that portion of the screen's tiles. In one embodiment, each pipeline comprises a scan converter, a hierarchical-Z unit, a z buffer logic, a rasterizer, a shader, and a color buffer logic.
Type: Application
Filed: November 4, 2009
Publication date: May 6, 2010
Applicant: ATI Technologies ULC
Inventors: Mark M. Leather, Eric Demers
-
Patent number: 7633506
Abstract: The present invention relates to a parallel pipeline graphics system. The parallel pipeline graphics system includes a back-end configured to receive primitives and combinations of primitives (i.e., geometry) and process the geometry to produce values to place in a frame buffer for rendering on screen. Unlike prior single pipeline implementations, some embodiments use two or four parallel pipelines, though other configurations having 2^n pipelines may be used. When geometry data is sent to the back-end, it is divided up and provided to one of the parallel pipelines. Each pipeline is a component of a raster back-end, where the display screen is divided into tiles and a defined portion of the screen is sent through a pipeline that owns that portion of the screen's tiles. In one embodiment, each pipeline comprises a scan converter, a hierarchical-Z unit, a z buffer logic, a rasterizer, a shader, and a color buffer logic.
Type: Grant
Filed: November 26, 2003
Date of Patent: December 15, 2009
Assignee: ATI Technologies ULC
Inventors: Mark M. Leather, Eric Demers
-
Publication number: 20090276563
Abstract: A system and method are described that manage incremental state updates in such a way that multiple threads within a processor can each operate, in effect, on their own set of state data. The system and method are applicable to any processor in which multiple threads require access to sets of state information which differ from one another by a relatively small number of state changes.
Type: Application
Filed: October 25, 2007
Publication date: November 5, 2009
Inventors: Mark M. Leather, Brian D. Emberling
-
Patent number: 7545387
Abstract: The embodiments of the present invention are a method and apparatus to perform anti-aliasing using multi-sampling on a non-power-of-two pixel grid. Using the present invention with 6 sample multisampling gives the same visual antialiasing quality as 8 samples using a prior art technique but uses less memory. A non-power-of-two equally spaced sample from a conventional grid of size N×N, where N is 12, can be chosen using the present invention. A scan conversion to determine the set of pixels covered by a polygon is performed in two parts. According to one embodiment, the present invention can multiply and divide by “N” in order to multisample an image using samples per pixel chosen from an N×N sub-sample grid, where “N” is not necessarily a power of 2. The present invention performs the divide by “N” step, where the step is achieved using a quick divide by 3 or 12 technique.
Type: Grant
Filed: September 4, 2007
Date of Patent: June 9, 2009
Assignee: ATI Technologies ULC
Inventors: Mark M. Leather, Eric Demers
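The abstract above mentions a quick divide by 3 or 12 without describing it. For orientation, the sketch below shows a well-known multiply-and-shift substitute for integer division by a constant; it is a generic technique, not necessarily the one claimed in the patent. The function names, the 16-bit input range, and the constant `0xAAAB` (chosen because 3 × 0xAAAB = 2^17 + 1) are assumptions of this example.

```python
def div3_u16(x):
    """Divide an unsigned 16-bit integer by 3 without a division.

    Uses the reciprocal approximation 0xAAAB / 2**17, which is exact
    for every x in [0, 65535].
    """
    if not 0 <= x < (1 << 16):
        raise ValueError("x must fit in 16 unsigned bits")
    return (x * 0xAAAB) >> 17


def div12_u16(x):
    """Divide by 12 as a shift by 2 (divide by 4) followed by divide by 3."""
    return div3_u16(x >> 2)


if __name__ == "__main__":
    # Exhaustively compare against ordinary integer division.
    assert all(div3_u16(x) == x // 3 for x in range(1 << 16))
    assert all(div12_u16(x) == x // 12 for x in range(1 << 16))
    print("multiply-and-shift matches // for all 16-bit inputs")
```

In hardware the multiply by a fixed constant reduces to a few shifts and adds, which is why this form is cheaper than a general divider.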
-
Publication number: 20080313436
Abstract: The present invention provides a system for handling extra contexts for shader constants, and applications thereof. In an embodiment there is provided a computer-based method for executing a series of compute packets in an execution pipeline. The execution pipeline includes a first plurality of registers configured to store state-updates of a first type and a second plurality of registers configured to store state-updates of a second type. A first number of state-updates of the first type and a second number of state-updates of the second type are respectively identified and stored in the first and second plurality of registers. A compute packet is sent to the execution pipeline responsive to the first number and the second number. Then, the compute packet is executed by the execution pipeline.
Type: Application
Filed: June 13, 2007
Publication date: December 18, 2008
Applicant: Advanced Micro Devices, Inc.
Inventors: Mark M. Leather, Brian D. Emberling
-
Patent number: 7317459
Abstract: A graphics processor includes an embedded frame buffer for storing frame data prior to sending the frame data to an external location, such as main memory. A copy pipeline is provided which converts the data from one format to another format prior to writing the data to the external location. The conversion may be from one RGB color format to another RGB color format, from one YUV format to another YUV format, from an RGB color format to a YUV color format, or from a YUV color format to an RGB color format. MPEG image data initially stored in main memory in a YUV format as a texture is transferred to the embedded frame buffer prior to initiating a copy-out process via the copy pipeline from the embedded frame buffer to an external frame buffer in main memory. During the copy-out process, pixels are converted from YUV format to an RGB format.
Type: Grant
Filed: November 27, 2006
Date of Patent: January 8, 2008
Assignee: Nintendo Co., Ltd.
Inventors: Farhad Fouladi, Mark M. Leather, Robert Moore, Howard Cheng, Timothy J. Van Hook
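The abstract above describes converting pixels from a YUV format to an RGB format but gives no coefficients. As a point of reference, the sketch below uses the common full-range BT.601 (JFIF-style) equations for 8-bit samples; this choice, along with the function name `yuv_to_rgb` and the clamping behavior, is an assumption for illustration and may differ from what the patented copy pipeline does.

```python
def _clamp8(value):
    """Round to the nearest integer and clamp into the 8-bit range."""
    return max(0, min(255, int(round(value))))


def yuv_to_rgb(y, u, v):
    """Convert one 8-bit YUV pixel to 8-bit RGB.

    Assumes full-range BT.601 coefficients with U and V centered on 128.
    """
    cb = u - 128  # blue-difference chroma, recentered on zero
    cr = v - 128  # red-difference chroma, recentered on zero
    r = y + 1.402 * cr
    g = y - 0.344136 * cb - 0.714136 * cr
    b = y + 1.772 * cb
    return _clamp8(r), _clamp8(g), _clamp8(b)


if __name__ == "__main__":
    # Neutral chroma (U = V = 128) maps luma straight to a gray level.
    print(yuv_to_rgb(128, 128, 128))
```

Pixels with neutral chroma pass through as grays, which makes a convenient sanity check for any coefficient set.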
-
Patent number: 7307640
Abstract: A graphics system including a custom graphics and audio processor produces exciting 2D and 3D graphics and surround sound. The system includes a graphics and audio processor including a 3D graphics pipeline and an audio digital signal processor. Emboss style effects are created using fully pipelined hardware including two distinct dot-product computation units that perform a scaled model view matrix multiply without requiring the Normal input vector and which also compute dot-products between the Binormal and Tangent vectors and a light direction vector in parallel. The resulting texture coordinate displacements are provided to texture mapping hardware that performs a texture mapping operation providing texture combining in one pass. The disclosed pipelined arrangement efficiently provides interesting embossed style image effects such as raised and lowered patterns on surfaces.
Type: Grant
Filed: April 15, 2005
Date of Patent: December 11, 2007
Assignee: Nintendo Co., Ltd.
Inventors: Eric Demers, Mark M. Leather, Mark G. Segal
-
Patent number: 7307638
Abstract: A graphics system including a custom graphics and audio processor produces exciting 2D and 3D graphics and surround sound. The system includes a graphics and audio processor including a 3D graphics pipeline and an audio digital signal processor. The graphics pipeline renders and prepares images for display at least in part in response to polygon vertex attribute data and texel color data stored as texture images in an associated memory. An efficient texturing pipeline arrangement achieves a relatively low chip-footprint by utilizing a single texture coordinate/data processing unit that interleaves the processing of logical direct and indirect texture coordinate data and a texture lookup data feedback path for “recirculating” indirect texture lookup data retrieved from a single texture retrieval unit back to the texture coordinate/data processing unit.
Type: Grant
Filed: June 15, 2005
Date of Patent: December 11, 2007
Assignee: Nintendo Co., Ltd.
Inventors: Mark M. Leather, Robert A. Drebin, Timothy J. Van Hook
-
Patent number: 7280119
Abstract: The embodiments of the present invention are a method and apparatus to perform anti-aliasing using multi-sampling on a non-power-of-two pixel grid. Using the present invention with 6 sample multisampling gives the same visual antialiasing quality as 8 samples using a prior art technique but uses less memory. A non-power-of-two equally spaced sample from a conventional grid of size N×N, where N is 12, can be chosen using the present invention. A scan conversion to determine the set of pixels covered by a polygon is performed in two parts. According to one embodiment, the present invention can multiply and divide by “N” in order to multisample an image using samples per pixel chosen from an N×N sub-sample grid, where “N” is not necessarily a power of 2. The present invention performs the divide by “N” step, where the step is achieved using a quick divide by 3 or 12 technique.
Type: Grant
Filed: February 13, 2004
Date of Patent: October 9, 2007
Assignee: ATI Technologies Inc.
Inventors: Mark M. Leather, Eric Demers
-
Patent number: 7205999
Abstract: Apparatus and example methods for environment-mapped style of bump-mapping (EMBM) are provided that use a pre-computed bump-map texture accessed as an indirect texture along with pre-computed object surface normals (i.e., the Normal, Tangent and Binormal vectors) from each vertex of rendered polygons to effectively generate a new perturbed Normal vector per vertex. The perturbed new Normal vectors are then used to look up texels in an environment map. A specialized bump map texture data/coordinate processing “bump unit” is provided in the graphics pipeline for performing predetermined matrix multiplication operations on retrieved lookup data from the indirect-texture bump map.
Type: Grant
Filed: September 30, 2004
Date of Patent: April 17, 2007
Assignee: Nintendo Co., Ltd.
Inventor: Mark M. Leather