Patents by Inventor Mark M. Leather

Mark M. Leather has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

GRAPHICS PRIMITIVES AND POSITIONS THROUGH MEMORY BUFFERS

Publication number: 20230097097

Abstract: Systems, apparatuses, and methods for preemptively reserving buffer space for primitives and positions in a graphics pipeline are disclosed. A system includes a graphics pipeline frontend with any number of geometry engines coupled to corresponding shader engines. Each geometry engine launches shader wavefronts to execute on a corresponding shader engine. The geometry engine preemptively reserves buffer space for each wavefront prior to the wavefront being launched on the shader engine. When the shader engine executes a wavefront, the shader engine exports primitive and position data to the reserved buffer space. Multiple scan converters will consume the primitive and position data, with each scan converter consuming primitive and position data based on the screen coverage of the scan converter. After consuming the primitive and position data, the scan converters mark the buffer space as freed so that the geometry engine can then allocate the freed buffer space to subsequent shader wavefronts.

Type: Application

Filed: September 29, 2021

Publication date: March 30, 2023

Inventors: Todd Martin, Tad Robert Litwiller, Nishank Pathak, Randy Wayne Ramsey, Michael J. Mantor, Christopher J. Brennan, Mark M. Leather, Ryan James Cash
Processor support for bypassing vector source operands

Patent number: 10817302

Abstract: Systems, apparatuses, and methods for implementing a high bandwidth, low power vector register file for use by a parallel processor are disclosed. In one embodiment, a system includes at least a parallel processing unit with a plurality of processing pipeline. The parallel processing unit includes a vector arithmetic logic unit and a high bandwidth, low power, vector register file. The vector register file includes multi-bank high density random-access memories (RAMs) to satisfy register bandwidth requirements. The parallel processing unit also includes an instruction request queue and an instruction operand buffer to provide enough local bandwidth for VALU instructions and vector I/O instructions. Also, the parallel processing unit is configured to leverage the RAM's output flops as a last level cache to reduce duplicate operand requests between multiple instructions. The parallel processing unit includes a vector destination cache to provide additional R/W bandwidth for the vector register file.

Type: Grant

Filed: July 7, 2017

Date of Patent: October 27, 2020

Assignee: Advanced Micro Devices, Inc.

Inventors: Jiasheng Chen, Bin He, Mark M. Leather, Michael J. Mantor, Yunxiao Zou
Indicating instruction scheduling mode for processing wavefront portions

Patent number: 10474468

Abstract: Systems, apparatuses, and methods for processing variable wavefront sizes on a processor are disclosed. In one embodiment, a processor includes at least a scheduler, cache, and multiple execution units. When operating in a first mode, the processor executes the same instruction on multiple portions of a wavefront before proceeding to the next instruction of the shader program. When operating in a second mode, the processor executes a set of instructions on a first portion of a wavefront. In the second mode, when the processor finishes executing the set of instructions on the first portion of the wavefront, the processor executes the set of instructions on a second portion of the wavefront, and so on until all portions of the wavefront have been processed. The processor determines the operating mode based on one or more conditions.

Type: Grant

Filed: February 22, 2017

Date of Patent: November 12, 2019

Assignee: Advanced Micro Devices, Inc.

Inventors: Michael J. Mantor, Brian D. Emberling, Mark Fowler, Mark M. Leather
STREAM PROCESSOR WITH HIGH BANDWIDTH AND LOW POWER VECTOR REGISTER FILE

Publication number: 20180357064

Abstract: Systems, apparatuses, and methods for implementing a high bandwidth, low power vector register file for use by a parallel processor are disclosed. In one embodiment, a system includes at least a parallel processing unit with a plurality of processing pipeline. The parallel processing unit includes a vector arithmetic logic unit and a high bandwidth, low power, vector register file. The vector register file includes multi-bank high density random-access memories (RAMs) to satisfy register bandwidth requirements. The parallel processing unit also includes an instruction request queue and an instruction operand buffer to provide enough local bandwidth for VALU instructions and vector I/O instructions. Also, the parallel processing unit is configured to leverage the RAM's output flops as a last level cache to reduce duplicate operand requests between multiple instructions. The parallel processing unit includes a vector destination cache to provide additional R/W bandwidth for the vector register file.

Type: Application

Filed: July 7, 2017

Publication date: December 13, 2018

Inventors: Jiasheng Chen, Bin He, Mark M. Leather, Michael J. Mantor, Yunxiao Zou
VARIABLE WAVEFRONT SIZE

Publication number: 20180239606

Abstract: Systems, apparatuses, and methods for processing variable wavefront sizes on a processor are disclosed. In one embodiment, a processor includes at least a scheduler, cache, and multiple execution units. When operating in a first mode, the processor executes the same instruction on multiple portions of a wavefront before proceeding to the next instruction of the shader program. When operating in a second mode, the processor executes a set of instructions on a first portion of a wavefront. In the second mode, when the processor finishes executing the set of instructions on the first portion of the wavefront, the processor executes the set of instructions on a second portion of the wavefront, and so on until all portions of the wavefront have been processed. The processor determines the operating mode based on one or more conditions.

Type: Application

Filed: February 22, 2017

Publication date: August 23, 2018

Inventors: Michael J. Mantor, Brian D. Emberling, Mark Fowler, Mark M. Leather
Dividing work among multiple graphics pipelines using a super-tiling technique

Patent number: 8933945

Abstract: A graphics processing circuit includes at least two pipelines operative to process data in a corresponding set of tiles of a repeating tile pattern, a respective one of the at least two pipelines operative to process data in a dedicated tile, wherein the repeating tile pattern includes a horizontally and vertically repeating pattern of square regions. A graphics processing method includes receiving vertex data for a primitive to be rendered; generating pixel data in response to the vertex data; determining the pixels within a set of tiles of a repeating tile pattern to be processed by a corresponding one of at least two graphics pipelines in response to the pixel data, the repeating tile pattern including a horizontally and vertically repeating pattern of square regions; and performing pixel operations on the pixels within the determined set of tiles by the corresponding one of the at least two graphics pipelines.

Type: Grant

Filed: June 12, 2003

Date of Patent: January 13, 2015

Assignee: ATI Technologies ULC

Inventors: Mark M. Leather, Eric Demers
Handling of extra contexts for shader constants

Patent number: 8593465

Abstract: The present invention provides a system for handling extra contexts for shader constants, and applications thereof. In an embodiment there is provided a computer-based method for executing a series of compute packets in an execution pipeline. The execution pipeline includes a first plurality of registers configured to store state-updates of a first type and a second plurality of registers configured to store state-updates of a second type. A first number of state-updates of the first type and a second number of state-updates of the second type are respectively identified and stored in the first and second plurality of registers. A compute packet is sent to the execution pipeline responsive to the first number and the second number. Then, the compute packet is executed by the execution pipeline.

Type: Grant

Filed: June 13, 2007

Date of Patent: November 26, 2013

Assignee: Advanced Micro Devices, Inc.

Inventors: Mark M. Leather, Brian D. Emberling
Incremental state updates

Patent number: 8156314

Abstract: A system and method are described that manage incremental state updates in such a way that multiple threads within a processor can each operate, in effect, on their own set of state data. The system and method are applicable to any processor in which multiple threads require access to sets of state information which differ from one another by a relatively small number of state changes.

Type: Grant

Filed: October 25, 2007

Date of Patent: April 10, 2012

Assignee: Advanced Micro Devices, Inc.

Inventors: Mark M. Leather, Brian D. Emberling
Method and apparatus for rasterizer interpolation

Patent number: 7920141

Abstract: The present invention relates to a rasterizer interpolator. In one embodiment, a setup unit is used to distribute graphics primitive instructions to multiple parallel rasterizers. To increase efficiency, the setup unit calculates the polygon data and checks it against one or more tiles prior to distribution. An output screen is divided into a number of regions, with a number of assignment configurations possible for various number of rasterizer pipelines. For instance, the screen is sub-divided into four regions and one of four rasterizers is granted ownership of one quarter of the screen. To reduce time spent on processing empty times, a problem in prior art implementations, the present invention reduces empty tiles by the process of coarse grain tiling. This process occurs by a series of iterations performed in parallel. Each region undergoes an iterative calculation/tiling process where coverage of the primitive is deduced at a successively more detailed level.

Type: Grant

Filed: February 28, 2006

Date of Patent: April 5, 2011

Assignee: ATI Technologies ULC

Inventor: Mark M. Leather
Unified shader

Patent number: 7796133

Abstract: The present invention is a unified shader unit used in texture processing in graphics processing device. Unlike the conventional method of using one shader for texture coordinate shading and another for color shading, the present shader performs both operations. The unified shader uses the same precision for both texture coordinate and color shading, thus simplifying the complexity of programming for two separate conventional shaders with different levels of precision. Furthermore, the present invention uses enhanced scheduling logic to perform indirect texture and bump mapping in a single first-in, first-out (FIFO) memory structure and avoids the problems associated with large FIFOs with buffer registers found in conventional shaders. In one embodiment, a plurality of ALU-memory pairs are synchronized to form a plurality of pipelines to execution shading instructions. In another embodiment, a plurality of unified shaders are synchronized and connected together to processing shading operations concurrently.

Type: Grant

Filed: December 8, 2003

Date of Patent: September 14, 2010

Assignee: ATI Technologies ULC

Inventors: Mark M. Leather, Eric Demers
PARALLEL PIPELINE GRAPHICS SYSTEM

Publication number: 20100110084

Abstract: The present invention relates to a parallel pipeline graphics system. The parallel pipeline graphics system includes a back-end configured to receive primitives and combinations of primitives (i.e., geometry) and process the geometry to produce values to place in a frame buffer for rendering on screen. Unlike prior single pipeline implementation, some embodiments use two or four parallel pipelines, though other configurations having 2?n pipelines may be used. When geometry data is sent to the back-end, it is divided up and provided to one of the parallel pipelines. Each pipeline is a component of a raster back-end, where the display screen is divided into tiles and a defined portion of the screen is sent through a pipeline that owns that portion of the screen's tiles. In one embodiment, each pipeline comprises a scan converter, a hierarchical-Z unit, a z buffer logic, a rasterizer, a shader, and a color buffer logic.

Type: Application

Filed: November 4, 2009

Publication date: May 6, 2010

Applicant: ATI Technologies ULC

Inventors: Mark M. Leather, Eric Demers
Parallel pipeline graphics system

Patent number: 7633506

Abstract: The present invention relates to a parallel pipeline graphics system. The parallel pipeline graphics system includes a back-end configured to receive primitives and combinations of primitives (i.e., geometry) and process the geometry to produce values to place in a frame buffer for rendering on screen. Unlike prior single pipeline implementation, some embodiments use two or four parallel pipelines, though other configurations having 2^n pipelines may be used. When geometry data is sent to the back-end, it is divided up and provided to one of the parallel pipelines. Each pipeline is a component of a raster back-end, where the display screen is divided into tiles and a defined portion of the screen is sent through a pipeline that owns that portion of the screen's tiles. In one embodiment, each pipeline comprises a scan converter, a hierarchical-Z unit, a z buffer logic, a rasterizer, a shader, and a color buffer logic.

Type: Grant

Filed: November 26, 2003

Date of Patent: December 15, 2009

Assignee: ATI Technologies ULC

Inventors: Mark M. Leather, Eric Demers
Incremental State Updates

Publication number: 20090276563

Abstract: A system and method are described that manage incremental state updates in such a way that multiple threads within a processor can each operate, in effect, on their own set of state data. The system and method are applicable to any processor in which multiple threads require access to sets of state information which differ from one another by a relatively small number of state changes.

Type: Application

Filed: October 25, 2007

Publication date: November 5, 2009

Inventors: Mark M. Leather, Brian D. Emberling
Method and apparatus for sampling on a non-power-of-two pixel grid

Patent number: 7545387

Abstract: The embodiments of the present invention are a method and apparatus to perform anti-aliasing using multi-sampling on a non-power-of-two pixel grid. Using the present invention with 6 sample multisampling gives the same visual antialiasing quality as 8 samples using a prior art technique but uses less memory. A non-power-of-two equally spaced sample from a conventional grid of size N×N, where N is 12 can be chosen using the present invention. A scan conversion to determine the set of pixels covered by a polygon is performed in two parts. According to one embodiment, the present invention can multiply and divide by “N” in order to multisample an image using samples per pixel chosen from a N×N sub-sample grid, where “N” is not necessarily a power of 2. The present invention performs the divide by “N” step, where the step is achieved using a quick divide by 3 or 12 technique.

Type: Grant

Filed: September 4, 2007

Date of Patent: June 9, 2009

Assignee: ATI Technologies ULC

Inventors: Mark M. Leather, Eric Demers
Handling of extra contexts for shader constants

Publication number: 20080313436

Abstract: The present invention provides a system for handling extra contexts for shader constants, and applications thereof. In an embodiment there is provided a computer-based method for executing a series of compute packets in an execution pipeline. The execution pipeline includes a first plurality of registers configured to store state-updates of a first type and a second plurality of registers configured to store state-updates of a second type. A first number of state-updates of the first type and a second number of state-updates of the second type are respectively identified and stored in the first and second plurality of registers. A compute packet is sent to the execution pipeline responsive to the first number and the second number. Then, the compute packet is executed by the execution pipeline.

Type: Application

Filed: June 13, 2007

Publication date: December 18, 2008

Applicant: Advanced Micro Devices, Inc.

Inventors: Mark M. Leather, Brian D. Emberling
Graphics system with copy out conversions between embedded frame buffer and main memory for producing a streaming video image as a texture on a displayed object image

Patent number: 7317459

Abstract: A graphics processor includes an embedded frame buffer for storing frame data prior to sending the frame data to an external location, such as main memory. A copy pipeline is provided which converts the data from one format to another format prior to writing the data to the external location. The conversion may be from one RGB color format to another RGB color format, from one YUV format to another YUV format, from an RGB color format to a YUV color format, or from a YUV color format to an RGB color format. MPEG image data initially stored in main memory in a YUV format as a texture is transferred to the embedded frame buffer prior to initiating a copy-out process via the copy pipeline from the embedded frame buffer to an external frame buffer in main memory. During the copy-out process, pixels are converted from YUV format to an RGB format.

Type: Grant

Filed: November 27, 2006

Date of Patent: January 8, 2008

Assignee: Nintendo Co., Ltd.

Inventors: Farhad Fouladi, Mark M. Leather, Robert Moore, Howard Cheng, Timothy J. Van Hook
Method and apparatus for efficient generation of texture coordinate displacements for implementing emboss-style bump mapping in a graphics rendering system

Patent number: 7307640

Abstract: A graphics system including a custom graphics and audio processor produces exciting 2D and 3D graphics and surround sound. The system includes a graphics and audio processor including a 3D graphics pipeline and an audio digital signal processor. Emboss style effects are created using fully pipelined hardware including two distinct dot-product computation units that perform a scaled model view matrix multiply without requiring the Normal input vector and which also compute dot-products between the Binormal and Tangent vectors and a light direction vector in parallel. The resulting texture coordinate displacements are provided to texture mapping hardware that performs a texture mapping operation providing texture combining in one pass. The disclosed pipelined arrangement efficiently provides interesting embossed style image effects such as raised and lowered patterns on surfaces.

Type: Grant

Filed: April 15, 2005

Date of Patent: December 11, 2007

Assignee: Nintendo Co., Ltd.

Inventors: Eric Demers, Mark M. Leather, Mark G. Segal
Method and apparatus for interleaved processing of direct and indirect texture coordinates in a graphics system

Patent number: 7307638

Abstract: A graphics system including a custom graphics and audio processor produces exciting 2D and 3D graphics and surround sound. The system includes a graphics and audio processor including a 3D graphics pipeline and an audio digital signal processor. The graphics pipeline renders and prepares images for display at least in part in response to polygon vertex attribute data and texel color data stored as a texture images in an associated memory. An efficient texturing pipeline arrangement achieves a relatively low chip-footprint by utilizing a single texture coordinate/data processing unit that interleaves the processing of logical direct and indirect texture coordinate data and a texture lookup data feedback path for “recirculating” indirect texture lookup data retrieved from a single texture retrieval unit back to the texture coordinate/data processing unit.

Type: Grant

Filed: June 15, 2005

Date of Patent: December 11, 2007

Assignee: Nintendo Co., Ltd.

Inventors: Mark M. Leather, Robert A. Drebin, Timothy J. Van Hook
Method and apparatus for sampling on a non-power-of-two pixel grid

Patent number: 7280119

Abstract: The embodiments of the present invention are a method and apparatus to perform anti-aliasing using multi-sampling on a non-power-of-two pixel grid. Using the present invention with 6 sample multisampling gives the same visual antialiasing quality as 8 samples using a prior art technique but uses less memory. A non-power-of-two equally spaced sample from a conventional grid of size N×N, where N is 12 can be chosen using the present invention. A scan conversion to determine the set of pixels covered by a polygon is performed in two parts. According to one embodiment, the present invention can multiply and divide by “N” in order to multisample an image using samples per pixel chosen from a N×N sub-sample grid, where “N” is not necessarily a power of 2. The present invention performs the divide by “N” step, where the step is achieved using a quick divide by 3 or 12 technique.

Type: Grant

Filed: February 13, 2004

Date of Patent: October 9, 2007

Assignee: ATI Technologies Inc.

Inventors: Mark M. Leather, Eric Demers
Method and apparatus for environment-mapped bump-mapping in a graphics system

Patent number: 7205999

Abstract: Apparatus and example methods for environment-mapped style of bump-mapping (EMBM) are provided that use a pre-computed bump-map texture accessed as an indirect texture along with pre-computed object surface normals (i.e., the Normal, Tangent and Binormal vectors) from each vertex of rendered polygons to effectively generate a new perturbed Normal vector per vertex. The perturbed new Normal vectors are then used to look up texels in an environment map. A specialized bump map texture data/coordinate processing “bump unit” is provided in the graphics pipeline for performing predetermined matrix multiplication operations on retrieved lookup data from the indirect-texture bump map.

Type: Grant

Filed: September 30, 2004

Date of Patent: April 17, 2007

Assignee: Nintendo Co., Ltd.

Inventor: Mark M. Leather

1 2 3 next