Patents by Inventor Christopher J. Brennan

Christopher J. Brennan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Synchronization Method for Low Latency Communication for Efficient Scheduling

Publication number: 20240111575

Abstract: Systems, apparatuses, and methods for implementing a message passing system to schedule work in a computing system. In various implementations, a processor includes a global scheduler, and a plurality of local schedulers with each of the local schedulers coupled to a plurality of processors. The processor further includes a shared cache that is shared by the plurality of local schedulers. Also, a plurality of mailboxes are implemented to enable communication between the local schedulers and the global scheduler. To schedule work items for execution, the global scheduler is configured to store one or more work items in the shared cache and store an indication in a mailbox for a first local scheduler of the plurality of local schedulers. Responsive to detecting the message in the mailbox, the first local scheduler identifies a location of the one or more work items in the shared cache and retrieves them for scheduling locally.

Type: Application

Filed: September 29, 2022

Publication date: April 4, 2024

Inventors: Matthäus G. Chajdas, Michael J. Mantor, Rex Eldon McCrary, Christopher J. Brennan, Robert Martin, Dominik Baumeister, Fabian Robert Sebastian Wildgrube
Work Graph Scheduler Implementation

Publication number: 20240111574

Abstract: Systems, apparatuses, and methods for implementing a hierarchical scheduler. In various implementations, a processor includes a global scheduler, and a plurality of independent local schedulers with each of the local schedulers coupled to a plurality of processors. In one implementation, the processor is a graphics processing unit and the processors are computation units. The processor further includes a shared cache that is shared by the plurality of local schedulers. Each of the local schedulers also includes a local cache used by the local scheduler and processors coupled to the local scheduler. To schedule work items for execution, the global scheduler is configured to store one or more work items in the shared cache and convey an indication to a first local scheduler of the plurality of local schedulers which causes the first local scheduler to retrieve the one or more work items from the shared cache.

Type: Application

Filed: September 29, 2022

Publication date: April 4, 2024

Inventors: Matthäus G. Chajdas, Michael J. Mantor, Rex Eldon McCrary, Christopher J. Brennan, Robert Martin, Dominik Baumeister, Fabian Robert Sebastian Wildgrube
HIERARCHICAL WORK SCHEDULING

Publication number: 20240111578

Abstract: A method for hierarchical work scheduling includes consuming a work item at a first scheduling domain having a local scheduler circuit and one or more workgroup processing elements. Consuming the work item produces a set of new work items. Subsequently, the local scheduler circuit distributes at least one new work item of the set of new work items to be executed locally at the first scheduling domain. If the local scheduler circuit of the first scheduling domain determines that the set of new work items includes one or more work items that would overload the first scheduling domain with work if scheduled for local execution, those work items are distributed to the next higher-level scheduler circuit in a scheduling domain hierarchy for redistribution to one or more other scheduling domains.

Type: Application

Filed: September 30, 2022

Publication date: April 4, 2024

Inventors: Matthaeus G. Chajdas, Christopher J. Brennan, Michael Mantor, Robert W. Martin, Nicolai Haehnle
SPATIAL TEST OF BOUNDING VOLUMES FOR RASTERIZATION

Publication number: 20240112397

Abstract: In response to receiving a scene description, a processing system generates a set of planes in the scene and a bounding volume representing a partition of the scene. Using the set of planes in the scene, a compute unit of an accelerated processing unit performs a spatial test on the bounding volume to determine whether the bounding volume intersects one or more planes of the set of planes in the scene. Based on the spatial test, the compute unit generates intersection data indicating whether the bounding volume intersects one or more planes of the set of planes in the scene. The accelerated processing unit then uses the intersection data to render the scene.

Type: Application

Filed: September 30, 2022

Publication date: April 4, 2024

Inventors: Christopher J. Brennan, Matthaeus G. Chajdas
DYNAMIC PERFORMANCE ADJUSTMENT

Publication number: 20240037031

Abstract: A technique for operating a device is disclosed. The technique includes recording log data for the device; analyzing the log data to determine one or more performance settings adjustments to apply to the device; and applying the one or more performance settings adjustments to the device.

Type: Application

Filed: September 28, 2023

Publication date: February 1, 2024

Applicant: Advanced Micro Devices, Inc.

Inventors: Christopher J. Brennan, Akshay Lahiry
Live profile-driven cache aging policies

Patent number: 11860784

Abstract: A technique for operating a cache is disclosed. The technique includes recording access data for a first set of memory accesses of a first frame; identifying parameters for a second set of memory accesses of a second frame subsequent to the first frame, based on the access data; and applying the parameters to the second set of memory accesses.

Type: Grant

Filed: June 27, 2022

Date of Patent: January 2, 2024

Assignee: Advanced Micro Devices, Inc.

Inventors: Christopher J. Brennan, Akshay Lahiry
LIVE PROFILE-DRIVEN CACHE AGING POLICIES

Publication number: 20230418744

Abstract: A technique for operating a cache is disclosed. The technique includes recording access data for a first set of memory accesses of a first frame; identifying parameters for a second set of memory accesses of a second frame subsequent to the first frame, based on the access data; and applying the parameters to the second set of memory accesses.

Type: Application

Filed: June 27, 2022

Publication date: December 28, 2023

Applicant: Advanced Micro Devices, Inc.

Inventors: Christopher J. Brennan, Akshay Lahiry
PARTIAL SORTING FOR COHERENCY RECOVERY

Publication number: 20230409337

Abstract: Devices and methods for partial sorting for coherence recovery are provided. The partial sorting is efficiently executed by utilizing existing hardware along the memory path (e.g., memory local to the compute unit). The devices include an accelerated processing device which comprises memory and a processor. The processor is, for example, a compute unit of a GPU which comprises a plurality of SIMD units and is configured to determine, for data entries each comprising a plurality of bits, a number of occurrences of different types of the data entries by storing the number of occurrences in one or more portions of the memory local to the processor, sort the data entries based on the determined number of occurrences stored in the one or more portions of the memory local to the processor and execute the sorted data entries.

Type: Application

Filed: June 21, 2022

Publication date: December 21, 2023

Applicant: Advanced Micro Devices, Inc.

Inventors: Matthäus G. Chajdas, Christopher J. Brennan
VRS RATE FEEDBACK

Publication number: 20230252713

Abstract: Techniques for performing shader operations are provided. The techniques include, performing pixel shading at a shading rate defined by pixel shader variable rate shading (“VRS”) data, and updating the pixel VRS data that indicates one or more shading rates for one or more tiles based on whether the tiles of the one or more tiles include triangle edges or do not include triangle edges, to generate updated VRS data.

Type: Application

Filed: April 20, 2023

Publication date: August 10, 2023

Applicant: Advanced Micro Devices, Inc.

Inventors: Skyler Jonathon Saleh, Vineet Goel, Pazhani Pillai, Ruijin Wu, Christopher J. Brennan, Andrew S. Pomianowski
Pixelation optimized delta color compression

Patent number: 11715253

Abstract: A technique for compressing an original image is disclosed. According to the technique, an original image is obtained and a delta-encoded image is generated based on the original image. Next, a segregated image is generated based on the delta-encoded image and then the segregated image is compressed to produce a compressed image. The segregated image is generated because the segregated image may be compressed more efficiently than the original image and the delta image.

Type: Grant

Filed: June 14, 2021

Date of Patent: August 1, 2023

Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC

Inventors: Ruijin Wu, Skyler Jonathon Saleh, Christopher J. Brennan, Kei Ming Kwong, Anthony Hung-Cheong Chan
Wave throttling based on a parameter buffer

Patent number: 11710207

Abstract: A graphics pipeline includes a first shader that generates first wave groups, a shader processor input (SPI) that launches the first wave groups for execution by shaders, and a scan converter that generates second waves for execution on the shaders based on results of processing the first wave groups the one or more shaders. The first wave groups are selectively throttled based on a comparison of in-flight first wave groups and second waves pending execution on the at least one second shader. A cache holds information that is written to the cache in response to the first wave groups finishing execution on the shaders. Information is read from the cache in response to read requests issued by the second waves. In some cases, the first wave groups are selectively throttled by comparing how many first wave groups are in-flight and how many read requests to the cache are pending.

Type: Grant

Filed: March 30, 2021

Date of Patent: July 25, 2023

Assignee: Advanced Micro Devices, Inc.

Inventors: Christopher J. Brennan, Nishank Pathak
GRAPHICS DISCARD ENGINE

Publication number: 20230206559

Abstract: Systems, apparatuses, and methods for implementing a discard engine in a graphics pipeline are disclosed. A system includes a graphics pipeline with a geometry engine launching shaders that generate attribute data for vertices of each primitive of a set of primitives. The attribute data is consumed by pixel shaders, with each pixel shader generating a deallocation message when the pixel shader no longer needs the attribute data. A discard engine gathers deallocations from multiple pixel shaders and determines when the attribute data is no longer needed. Once a block of attributes has been consumed by all potential pixel shader consumers, the discard engine deallocates the given block of attributes. The discard engine sends a discard command to the caches so that the attribute data can be invalidated and not written back to memory.

Type: Application

Filed: December 27, 2021

Publication date: June 29, 2023

Inventors: Christopher J. Brennan, Randy Wayne Ramsey, Nishank Pathak, Ricky Wai Yeung Iu, Jimshed Mirza, Anthony Chan
OPTIMIZING PARTIAL WRITES TO COMPRESSED BLOCKS

Publication number: 20230206380

Abstract: A processor for optimizing partial writes to compressed blocks is configured to identify that a write request targets less than an entirety of a compressed block of pixel data, identify, based on a compression key, a compressed segment of the compressed block of pixel data that includes a target of the write request, and decompress, responsive to the write request, only the identified compressed segment of the compressed block of pixel data.

Type: Application

Filed: December 28, 2021

Publication date: June 29, 2023

Inventors: ANTHONY HC CHAN, CHRISTOPHER J. BRENNAN, MARK FOWLER, DAVID CHUI, LEON K.N. LAI, JIMSHED MIRZA
COLOR CHANNEL CORRELATION DETECTION

Publication number: 20230206503

Abstract: Systems, apparatuses, and methods for performing color channel correlation detection are disclosed. A compression engine performs a color channel transform on an original set of pixel data to generate a channel transformed set of pixel data. An analysis unit determines whether to compress the channel transformed set of pixel data or the original set of pixel data based on performing a comparison of the two sets of pixel data. In one scenario, the channel transformed set of pixel data is generated by calculating the difference between a first pixel component and a second pixel component for each pixel of the set of pixel data. The difference is then compared to the original first pixel component for each pixel. If the difference is less than or equal to the original for a threshold number of pixels, then the analysis unit decides to apply the color channel transform prior to compression.

Type: Application

Filed: December 27, 2021

Publication date: June 29, 2023

Inventors: Anthony Chan, Christopher J. Brennan, Angel Serah
STOCHASTIC OPTIMIZATION OF SURFACE CACHEABILITY IN PARALLEL PROCESSING UNITS

Publication number: 20230195639

Abstract: A processing system selectively allocates storage at a local cache of a parallel processing unit for cache lines of a repeating pattern of data that exceeds the storage capacity of the cache. The processing system identifies repeating patterns of data having cache lines that have a reuse distance that exceeds the storage capacity of the cache. A cache controller allocates storage for only a subset of cache lines of the repeating pattern of data at the cache and excludes the remainder of cache lines of the repeating pattern of data from the cache. By restricting the cache to store only a subset of cache lines of the repeating pattern of data, the cache controller increases the hit rate at the cache for the subset of cache lines.

Type: Application

Filed: December 21, 2021

Publication date: June 22, 2023

Inventors: Saurabh Sharma, Jeremy Lukacs, Hashem Hashemi, Gianpaolo Tommasi, Christopher J. Brennan
METHOD AND SYSTEM FOR INTEGRATING COMPRESSION

Publication number: 20230186523

Abstract: A method and apparatus for integrating data compression in a computer system includes receiving first data at a first system level. Based upon a number of planes of the first data being less than or equal to a threshold, the data is compressed with a first data compression scheme, and transferred to a second system level for processing. Based upon the number of planes of the first data exceeding the threshold, the first data is transferred uncompressed to the second system level for processing. Based upon the received data at the second system level being compressed with the first compression scheme, the data is transferred to a third system level, and based upon the received data at the second system level being uncompressed with the first compression scheme, compressing the data with a second compression scheme, and transferring the compressed data to the third system level.

Type: Application

Filed: December 14, 2021

Publication date: June 15, 2023

Applicant: Advanced Micro Devices, Inc.

Inventors: Christopher J. Brennan, Pazhani Pillai
VRS rate feedback

Patent number: 11657560

Abstract: Techniques for performing shader operations are provided. The techniques include, performing pixel shading at a shading rate defined by pixel shader variable rate shading (“VRS”) data, and updating the pixel VRS data that indicates one or more shading rates for one or more tiles based on whether the tiles of the one or more tiles include triangle edges or do not include triangle edges, to generate updated VRS data.

Type: Grant

Filed: September 23, 2021

Date of Patent: May 23, 2023

Assignee: Advanced Micro Devices, Inc.

Inventors: Skyler Jonathon Saleh, Vineet Goel, Pazhani Pillai, Ruijin Wu, Christopher J. Brennan, Andrew S. Pomianowski
Graphics texture footprint discovery

Patent number: 11620788

Abstract: Accesses to a mipmap by a shader in a graphics pipeline are monitored. The mipmap is stored in a memory or cache associated with the shader and the mipmap represents a texture at a hierarchy of levels of detail. A footprint in the mipmap of the texture is marked based on the monitored accesses. The footprint indicates, on a per-tile, per-level-of-detail (LOD) basis, tiles of the mipmap that are expected to be accessed in subsequent shader operations. In some cases, the footprint is defined by a plurality of footprint indicators that indicate whether the tiles of the mipmap are expected to be accessed in subsequent shader operations. In that case, the plurality of footprint indicators are set to a first value to indicate that the tile was not access during the first frame or a second value to indicate that the tile was accessed during the first frame.

Type: Grant

Filed: December 16, 2020

Date of Patent: April 4, 2023

Assignee: Advanced Micro Devices, Inc.

Inventor: Christopher J. Brennan
GRAPHICS PRIMITIVES AND POSITIONS THROUGH MEMORY BUFFERS

Publication number: 20230097097

Abstract: Systems, apparatuses, and methods for preemptively reserving buffer space for primitives and positions in a graphics pipeline are disclosed. A system includes a graphics pipeline frontend with any number of geometry engines coupled to corresponding shader engines. Each geometry engine launches shader wavefronts to execute on a corresponding shader engine. The geometry engine preemptively reserves buffer space for each wavefront prior to the wavefront being launched on the shader engine. When the shader engine executes a wavefront, the shader engine exports primitive and position data to the reserved buffer space. Multiple scan converters will consume the primitive and position data, with each scan converter consuming primitive and position data based on the screen coverage of the scan converter. After consuming the primitive and position data, the scan converters mark the buffer space as freed so that the geometry engine can then allocate the freed buffer space to subsequent shader wavefronts.

Type: Application

Filed: September 29, 2021

Publication date: March 30, 2023

Inventors: Todd Martin, Tad Robert Litwiller, Nishank Pathak, Randy Wayne Ramsey, Michael J. Mantor, Christopher J. Brennan, Mark M. Leather, Ryan James Cash
Aggregating commands in a stream based on cache line addresses

Patent number: 11614889

Abstract: An operation combiner receives a series of commands with read addresses, a modification operation, and write addresses. In some cases, the commands have serial dependencies that limit the rate at which they can be processed. The operation combiner compares the addresses for compatibility, transforms the operations to break serial dependencies, and combines multiple source commands into a smaller number of aggregate commands that can be executed much faster than the source commands. Some embodiments of the operation combiner receive a first command including one or more first read addresses and a first write address. The operation combiner compares the first read addresses and the first write address to one or more second read addresses and a second write address of a second command stored in a buffer. The operation combiner selectively combines the first and second commands to form an aggregate command based on the comparison.

Type: Grant

Filed: November 29, 2018

Date of Patent: March 28, 2023

Assignee: Advanced Micro Devices, Inc.

Inventor: Christopher J. Brennan

1 2 3 4 next