Patents by Inventor Matthaeus G. Chajdas
Matthaeus G. Chajdas has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12293456Abstract: A method and a processing device for performing rendering are disclosed. The method comprises generating a base hierarchy tree comprising data representing a first object and generating a second hierarchy tree representing a second object comprising shared data of the base hierarchy tree and the second hierarchy tree and difference data. The method further comprises storing the difference data in the memory without storing the shared data, and generating an overlay hierarchy tree comprising the shared data, the difference data, and indication information indicating nodes of the overlay hierarchy tree that comprise the difference data. The method further comprises rendering the first object using the data stored for the base hierarchy tree, and rendering the second object using any one or a combination of the shared data, the difference data, and the indication information.Type: GrantFiled: November 10, 2023Date of Patent: May 6, 2025Assignee: Advanced Micro Devices, Inc.Inventors: Matthäus G. Chajdas, Konstantin I. Shkurko
-
Publication number: 20250110776Abstract: A processing system includes dispatch circuitry that sends elements to one or more processing circuits such as shader circuitry for execution. The dispatch circuitry includes a dispatch queue and an arbitration circuit. The dispatch queue stores the elements to be sent to the one or more processing circuits. The arbitration circuit schedules the elements of the dispatch queue for execution based on priority indicators corresponding to the elements. As a result, prioritization of the elements is implemented at the dispatch circuitry in hardware without changing a design of the dispatch queue to store the priority information.Type: ApplicationFiled: September 29, 2023Publication date: April 3, 2025Inventors: Fabian Robert Sebastian Wildgrube, Matthaeus G. Chajdas
-
Patent number: 12260494Abstract: In response to receiving a scene description, a processing system generates a set of planes in the scene and a bounding volume representing a partition of the scene. Using the set of planes in the scene, a compute unit of an accelerated processing unit performs a spatial test on the bounding volume to determine whether the bounding volume intersects one or more planes of the set of planes in the scene. Based on the spatial test, the compute unit generates intersection data indicating whether the bounding volume intersects one or more planes of the set of planes in the scene. The accelerated processing unit then uses the intersection data to render the scene.Type: GrantFiled: September 30, 2022Date of Patent: March 25, 2025Assignee: Advanced Micro Devices, Inc.Inventors: Christopher J. Brennan, Matthaeus G. Chajdas
-
Publication number: 20250068464Abstract: A method for hierarchical work scheduling includes consuming a work item at a first scheduling domain having a local scheduler circuit and one or more workgroup processing elements. Consuming the work item produces a set of new work items. Subsequently, the local scheduler circuit distributes at least one new work item of the set of new work items to be executed locally at the first scheduling domain. If the local scheduler circuit of the first scheduling domain determines that the set of new work items includes one or more work items that would overload the first scheduling domain with work if scheduled for local execution, those work items are distributed to the next higher-level scheduler circuit in a scheduling domain hierarchy for redistribution to one or more other scheduling domains.Type: ApplicationFiled: November 8, 2024Publication date: February 27, 2025Inventors: Matthaeus G. Chajdas, Christopher J. Brennan, Michael Mantor, Robert W. Martin, Nicolai Haehnle
-
Publication number: 20250068429Abstract: A Streaming Wave Coalescer (SWC) circuit stores a first set of state values associated with a first subset of threads of a first wave in a bin based on each of the first subset of threads including a first set of instructions to be executed. A second set of state values associated with a second subset of threads of a second wave is stored in the bin based on each of the second subset of threads including the first set of instructions to be executed and based on the first wave and the second wave both being associated with a hard key. A third wave is formed from the threads of the first subset and the second subset and is emitted for execution. As a result of reorganizing the threads and reconstituting a different wave, thread divergence of waves sent for execution is reduced.Type: ApplicationFiled: December 12, 2023Publication date: February 27, 2025Inventors: John Stephen Junkins, Christopher J. Brennan, Ian Richard Beaumont, Kellie Marks, Matthaeus G. Chajdas, Max Oberberger, Michael John Bedy, Michael Mantor, Sean Keely
-
Publication number: 20250004982Abstract: A semiconductor module comprises multiple non-homogeneous semiconductor dies disposed on the semiconductor module, with each semiconductor die having a set of circuitry modules that are common to all of the semiconductor dies and also a set of supporting circuitry modules that are distinct between the semiconductor dies. An interconnect communicatively couples the semiconductor dies together. Commands for processing by the semiconductor module may be routed to individual semiconductor dies based on capabilities of the particular circuitry modules disposed on those individual semiconductor dies.Type: ApplicationFiled: May 31, 2024Publication date: January 2, 2025Inventor: Matthaeus G. Chajdas
-
Patent number: 12153957Abstract: A method for hierarchical work scheduling includes consuming a work item at a first scheduling domain having a local scheduler circuit and one or more workgroup processing elements. Consuming the work item produces a set of new work items. Subsequently, the local scheduler circuit distributes at least one new work item of the set of new work items to be executed locally at the first scheduling domain. If the local scheduler circuit of the first scheduling domain determines that the set of new work items includes one or more work items that would overload the first scheduling domain with work if scheduled for local execution, those work items are distributed to the next higher-level scheduler circuit in a scheduling domain hierarchy for redistribution to one or more other scheduling domains.Type: GrantFiled: September 30, 2022Date of Patent: November 26, 2024Assignee: Advanced Micro Devices, Inc.Inventors: Matthaeus G. Chajdas, Christopher J. Brennan, Michael Mantor, Robert W. Martin, Nicolai Haehnle
-
Patent number: 12079919Abstract: Described herein is a technique for performing operations for a bounding volume hierarchy. The techniques include: for a bounding box with quantized orientation, the bounding box being part of a bounding volume hierarchy, rotating a ray according to the quantized orientation to generate a rotated ray; performing an intersection test against the bounding box with the rotated ray; and according to the results of the intersection test, continuing traversal of the bounding volume hierarchy.Type: GrantFiled: September 29, 2021Date of Patent: September 3, 2024Assignee: Advanced Micro Devices, Inc.Inventors: David Ronald Oldcorn, Matthäus G. Chajdas, Michael A. Kern
-
Patent number: 12032967Abstract: Devices and methods for partial sorting for coherence recovery are provided. The partial sorting is efficiently executed by utilizing existing hardware along the memory path (e.g., memory local to the compute unit). The devices include an accelerated processing device which comprises memory and a processor. The processor is, for example, a compute unit of a GPU which comprises a plurality of SIMD units and is configured to determine, for data entries each comprising a plurality of bits, a number of occurrences of different types of the data entries by storing the number of occurrences in one or more portions of the memory local to the processor, sort the data entries based on the determined number of occurrences stored in the one or more portions of the memory local to the processor and execute the sorted data entries.Type: GrantFiled: June 21, 2022Date of Patent: July 9, 2024Assignee: Advanced Micro Devices, Inc.Inventors: Matthäus G. Chajdas, Christopher J. Brennan
-
Publication number: 20240202003Abstract: Systems, apparatuses, and methods for implementing a hierarchical scheduling in fixed-function graphics pipeline are disclosed. In various implementations, a processor includes a pipeline comprising a plurality of fixed-function units and a scheduler. The scheduler is configured to schedule a first operation for execution by one or more fixed-function units of the pipeline by scheduling the first operation with a first unit of the pipeline, responsive to a first mode of operation and schedule a second operation for execution by a selected fixed-function unit of the pipeline by scheduling the second operation directly to the selected fixed-function unit, independent of a sequential arrangement of the one or more fixed-function units in the pipeline, responsive to a second mode of operation.Type: ApplicationFiled: December 14, 2022Publication date: June 20, 2024Inventors: Matthäus G. Chajdas, Michael J. Mantor, Rex Eldon McCrary, Christopher J. Brennan, Robert Martin, Brian Kenneth Bennett
-
Patent number: 12013810Abstract: A semiconductor module comprises multiple non-homogeneous semiconductor dies disposed on the semiconductor module, with each semiconductor die having a set of circuitry modules that are common to all of the semiconductor dies and also a set of supporting circuitry modules that are distinct between the semiconductor dies. An interconnect communicatively couples the semiconductor dies together. Commands for processing by the semiconductor module may be routed to individual semiconductor dies based on capabilities of the particular circuitry modules disposed on those individual semiconductor dies.Type: GrantFiled: September 29, 2022Date of Patent: June 18, 2024Assignee: Advanced Micro Devices, Inc.Inventor: Matthaeus G. Chajdas
-
Publication number: 20240112397Abstract: In response to receiving a scene description, a processing system generates a set of planes in the scene and a bounding volume representing a partition of the scene. Using the set of planes in the scene, a compute unit of an accelerated processing unit performs a spatial test on the bounding volume to determine whether the bounding volume intersects one or more planes of the set of planes in the scene. Based on the spatial test, the compute unit generates intersection data indicating whether the bounding volume intersects one or more planes of the set of planes in the scene. The accelerated processing unit then uses the intersection data to render the scene.Type: ApplicationFiled: September 30, 2022Publication date: April 4, 2024Inventors: Christopher J. Brennan, Matthaeus G. Chajdas
-
Publication number: 20240111710Abstract: A semiconductor module comprises multiple non-homogeneous semiconductor dies disposed on the semiconductor module, with each semiconductor die having a set of circuitry modules that are common to all of the semiconductor dies and also a set of supporting circuitry modules that are distinct between the semiconductor dies. An interconnect communicatively couples the semiconductor dies together. Commands for processing by the semiconductor module may be routed to individual semiconductor dies based on capabilities of the particular circuitry modules disposed on those individual semiconductor dies.Type: ApplicationFiled: September 29, 2022Publication date: April 4, 2024Inventor: Matthaeus G. Chajdas
-
Publication number: 20240111578Abstract: A method for hierarchical work scheduling includes consuming a work item at a first scheduling domain having a local scheduler circuit and one or more workgroup processing elements. Consuming the work item produces a set of new work items. Subsequently, the local scheduler circuit distributes at least one new work item of the set of new work items to be executed locally at the first scheduling domain. If the local scheduler circuit of the first scheduling domain determines that the set of new work items includes one or more work items that would overload the first scheduling domain with work if scheduled for local execution, those work items are distributed to the next higher-level scheduler circuit in a scheduling domain hierarchy for redistribution to one or more other scheduling domains.Type: ApplicationFiled: September 30, 2022Publication date: April 4, 2024Inventors: Matthaeus G. Chajdas, Christopher J. Brennan, Michael Mantor, Robert W. Martin, Nicolai Haehnle
-
Publication number: 20240111575Abstract: Systems, apparatuses, and methods for implementing a message passing system to schedule work in a computing system. In various implementations, a processor includes a global scheduler, and a plurality of local schedulers with each of the local schedulers coupled to a plurality of processors. The processor further includes a shared cache that is shared by the plurality of local schedulers. Also, a plurality of mailboxes are implemented to enable communication between the local schedulers and the global scheduler. To schedule work items for execution, the global scheduler is configured to store one or more work items in the shared cache and store an indication in a mailbox for a first local scheduler of the plurality of local schedulers. Responsive to detecting the message in the mailbox, the first local scheduler identifies a location of the one or more work items in the shared cache and retrieves them for scheduling locally.Type: ApplicationFiled: September 29, 2022Publication date: April 4, 2024Inventors: Matthäus G. Chajdas, Michael J. Mantor, Rex Eldon McCrary, Christopher J. Brennan, Robert Martin, Dominik Baumeister, Fabian Robert Sebastian Wildgrube
-
Publication number: 20240111574Abstract: Systems, apparatuses, and methods for implementing a hierarchical scheduler. In various implementations, a processor includes a global scheduler, and a plurality of independent local schedulers with each of the local schedulers coupled to a plurality of processors. In one implementation, the processor is a graphics processing unit and the processors are computation units. The processor further includes a shared cache that is shared by the plurality of local schedulers. Each of the local schedulers also includes a local cache used by the local scheduler and processors coupled to the local scheduler. To schedule work items for execution, the global scheduler is configured to store one or more work items in the shared cache and convey an indication to a first local scheduler of the plurality of local schedulers which causes the first local scheduler to retrieve the one or more work items from the shared cache.Type: ApplicationFiled: September 29, 2022Publication date: April 4, 2024Inventors: Matthäus G. Chajdas, Michael J. Mantor, Rex Eldon McCrary, Christopher J. Brennan, Robert Martin, Dominik Baumeister, Fabian Robert Sebastian Wildgrube
-
Publication number: 20240087223Abstract: A method and a processing device for performing rendering are disclosed. The method comprises generating a base hierarchy tree comprising data representing a first object and generating a second hierarchy tree representing a second object comprising shared data of the base hierarchy tree and the second hierarchy tree and difference data. The method further comprises storing the difference data in the memory without storing the shared data, and generating an overlay hierarchy tree comprising the shared data, the difference data, and indication information indicating nodes of the overlay hierarchy tree that comprise the difference data. The method further comprises rendering the first object using the data stored for the base hierarchy tree, and rendering the second object using any one or a combination of the shared data, the difference data, and the indication information.Type: ApplicationFiled: November 10, 2023Publication date: March 14, 2024Applicant: Advanced Micro Devices, Inc.Inventors: Matthäus G. Chajdas, Konstantin I. Shkurko
-
Patent number: 11915337Abstract: Systems, apparatuses, and methods for implementing a downsampler in a single compute shader pass are disclosed. A central processing unit (CPU) issues a single-pass compute shader kernel to perform downsampling of a texture on a graphics processing unit (GPU). The GPU includes a plurality of compute units for executing thread groups of the kernel. Each thread group fetches a patch of the texture, and each individual thread downsamples four quads of texels to compute mip levels 1 and 2 independently of the other threads. For mip level 3, texel data is written back over one of the local data share (LDS) entries from which the texel data was loaded. This eliminates the need for a barrier between loads and stores for computing mip level 3. The remaining mip levels are computed in a similar fashion by the thread groups of the single-pass kernel.Type: GrantFiled: February 23, 2021Date of Patent: February 27, 2024Assignee: Advanced Micro Devices, Inc.Inventors: Lou Isabelle Kramer, Matthäus G. Chajdas
-
Patent number: 11854138Abstract: Described herein is a technique for modifying a bounding volume hierarchy. The techniques include combining preferred orientations of child nodes of a first bounding box node to generate a first preferred orientation; based on the first preferred orientation, converting one or more child nodes of the first bounding box node into one or more oriented bounding box nodes; combining preferred orientations of child nodes of a second bounding box node to generate a second preferred orientation; and based on the second preferred orientation, maintaining one or more children of the second bounding box node as non-oriented bounding box nodes.Type: GrantFiled: July 23, 2021Date of Patent: December 26, 2023Assignee: Advanced Micro Devices, Inc.Inventors: Matthäus G Chajdas, Michael A. Kern, David Ronald Oldcorn
-
Publication number: 20230409337Abstract: Devices and methods for partial sorting for coherence recovery are provided. The partial sorting is efficiently executed by utilizing existing hardware along the memory path (e.g., memory local to the compute unit). The devices include an accelerated processing device which comprises memory and a processor. The processor is, for example, a compute unit of a GPU which comprises a plurality of SIMD units and is configured to determine, for data entries each comprising a plurality of bits, a number of occurrences of different types of the data entries by storing the number of occurrences in one or more portions of the memory local to the processor, sort the data entries based on the determined number of occurrences stored in the one or more portions of the memory local to the processor and execute the sorted data entries.Type: ApplicationFiled: June 21, 2022Publication date: December 21, 2023Applicant: Advanced Micro Devices, Inc.Inventors: Matthäus G. Chajdas, Christopher J. Brennan