Patents by Inventor Michael Mantor

Michael Mantor has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230097279
    Abstract: Methods and systems are disclosed for executing operations on single-instruction-multiple-data (SIMD) units. Techniques disclosed perform a dot product operation on input data during one computing cycle, including convolving the input data, generating intermediate data, and applying one or more transitional operations to the intermediate data to generate output data. Aspects are described in which the input data is an input to a layer of a convolutional neural network and the generated output data is the output of that layer.
    Type: Application
    Filed: September 29, 2021
    Publication date: March 30, 2023
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Brian Emberling, Michael Mantor, Michael Y. Chow, Bin He
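A minimal Python sketch of the idea in the abstract above: a per-pixel dot product convolves the input data, produces intermediate data, and applies a transitional operation (an assumed ReLU-style activation) to form a layer's output. The 1x1-convolution shape, function names, and sizes are illustrative assumptions, not taken from the filing.
```python
import numpy as np

def conv_layer_via_dot(inputs, kernels, bias, activation=lambda x: np.maximum(x, 0.0)):
    """Toy illustration: a 1x1 convolution layer expressed as per-pixel dot products,
    followed by a transitional (activation) step. Shapes and names are assumptions."""
    h, w, c_in = inputs.shape          # feature map: height x width x input channels
    c_out = kernels.shape[1]           # kernels: (c_in, c_out)
    out = np.empty((h, w, c_out))
    for y in range(h):
        for x in range(w):
            # the "dot product operation" a SIMD unit would perform in one cycle
            intermediate = inputs[y, x, :] @ kernels          # intermediate data
            out[y, x, :] = activation(intermediate + bias)    # transitional operation
    return out

# usage: a 4x4 map with 3 input channels and 2 output channels
feature_map = np.random.rand(4, 4, 3)
weights = np.random.rand(3, 2)
print(conv_layer_via_dot(feature_map, weights, bias=np.zeros(2)).shape)  # (4, 4, 2)
```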
  • Patent number: 11609791
    Abstract: A first workload is executed in a first subset of pipelines of a processing unit. A second workload is executed in a second subset of the pipelines of the processing unit. The second workload is dependent upon the first workload. The first and second workloads are suspended and state information for the first and second workloads is stored in a first memory in response to suspending the first and second workloads. In some cases, a third workload executes in a third subset of the pipelines of the processing unit concurrently with executing the first and second workloads. In some cases, a fourth workload is executed in the first and second pipelines after suspending the first and second workloads. The first and second pipelines are resumed on the basis of the stored state information in response to completion or suspension of the fourth workload.
    Type: Grant
    Filed: November 30, 2017
    Date of Patent: March 21, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Anirudh R. Acharya, Michael Mantor
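A hedged sketch of the suspend/store/resume flow described above, modeling the pipelines and the "first memory" as plain Python objects. All class and variable names are invented for illustration; the filing describes hardware pipelines, not software objects.
```python
class Pipeline:
    """Minimal stand-in for a subset of processing-unit pipelines (illustrative only)."""
    def __init__(self, name):
        self.name = name
        self.workload = None
        self.state = None          # architectural state of the workload currently running

    def run(self, workload, state=None):
        self.workload, self.state = workload, state or {"pc": 0, "regs": {}}

    def suspend(self):
        saved = {"workload": self.workload, "state": self.state}
        self.workload = self.state = None
        return saved

# Hypothetical flow mirroring the abstract: suspend the dependent workloads, store their
# state in a first memory (here, a plain dict), run a later workload, then resume.
first_memory = {}
p1, p2 = Pipeline("subset-1"), Pipeline("subset-2")
p1.run("first workload")
p2.run("second workload (depends on first)")

first_memory["ctx"] = [p1.suspend(), p2.suspend()]   # save state on suspension
p1.run("fourth workload")                            # reuse the freed pipelines
p1.suspend()                                         # fourth workload completes or suspends

for pipe, ctx in zip((p1, p2), first_memory["ctx"]): # resume from stored state
    pipe.run(ctx["workload"], ctx["state"])
```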
  • Publication number: 20230076872
    Abstract: Embodiments include methods, systems, and non-transitory computer-readable media including instructions for executing a prefetch kernel that includes memory accesses for prefetching data for a processing kernel into a memory, and, subsequent to executing at least a portion of the prefetch kernel, executing the processing kernel, where the processing kernel includes accesses to data that is stored into the memory as a result of executing the prefetch kernel.
    Type: Application
    Filed: November 11, 2022
    Publication date: March 9, 2023
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Nuwan S. Jayasena, James Michael O'Connor, Michael Mantor
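A rough sketch of the prefetch-kernel concept in this entry (and in the related granted patent 11500778 below): a stripped-down kernel issues only the memory accesses of the processing kernel so the data is resident before the processing kernel runs. The cache model, address list, and kernel bodies are assumptions for illustration.
```python
cache = set()
dram = {addr: addr * 2 for addr in range(64)}   # pretend backing memory

def load(addr):
    if addr not in cache:        # miss: fetch the line into the cache
        cache.add(addr)
    return dram[addr]

def prefetch_kernel(addresses):
    for addr in addresses:       # memory accesses only; no computation
        load(addr)

def processing_kernel(addresses):
    return sum(load(addr) for addr in addresses)   # now mostly cache hits

addrs = list(range(0, 64, 4))
prefetch_kernel(addrs)               # begins executing before the processing kernel
print(processing_kernel(addrs))      # consumes the prefetched data
```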
  • Patent number: 11500778
    Abstract: Embodiments include methods, systems, and non-transitory computer-readable media including instructions for executing a prefetch kernel with reduced intermediate state storage resource requirements. These include executing a prefetch kernel on a graphics processing unit (GPU), such that the prefetch kernel begins executing before a processing kernel. The prefetch kernel performs memory operations that are based upon at least a subset of memory operations in the processing kernel.
    Type: Grant
    Filed: March 9, 2020
    Date of Patent: November 15, 2022
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Nuwan S. Jayasena, James Michael O'Connor, Michael Mantor
  • Patent number: 11494192
    Abstract: A processing element is implemented in a stage of a pipeline and configured to execute an instruction. A first array of multiplexers is configured to provide information associated with the instruction to the processing element in response to the instruction being in a first set of instructions. A second array of multiplexers is configured to provide information associated with the instruction to the processing element in response to the instruction being in a second set of instructions. A control unit is configured to gate at least one of power or a clock signal provided to the first array of multiplexers in response to the instruction being in the second set.
    Type: Grant
    Filed: April 28, 2020
    Date of Patent: November 8, 2022
    Assignees: Advanced Micro Devices, Inc., ADVANCED MICRO DEVICES (SHANGHAI) CO., LTD.
    Inventors: Jiasheng Chen, YunXiao Zou, Bin He, Angel E. Socarras, QingCheng Wang, Wei Yuan, Michael Mantor
  • Publication number: 20220320042
    Abstract: A multi-die parallel processor semiconductor package includes a first base IC die including a first plurality of virtual compute dies 3D stacked on top of the first base IC die. A first subset of a parallel processing pipeline logic is positioned at the first plurality of virtual compute dies. Additionally, a second subset of the parallel processing pipeline logic is positioned at the first base IC die. The multi-die parallel processor semiconductor package also includes a second base IC die including a second plurality of virtual compute dies 3D stacked on top of the second base IC die. An active bridge chip communicably couples a first interconnect structure of the first base IC die to a first interconnect structure of the second base IC die.
    Type: Application
    Filed: March 30, 2021
    Publication date: October 6, 2022
    Inventor: Michael MANTOR
  • Publication number: 20220277508
    Abstract: A method, a computer system, and a non-transitory computer-readable storage medium for performing primitive batch binning are disclosed. The method, computer system, and non-transitory computer-readable storage medium include techniques for generating a primitive batch from a plurality of primitives, computing respective bin intercepts for each of the plurality of primitives in the primitive batch, and shading the primitive batch by iteratively processing each of the respective bin intercepts computed until all of the respective bin intercepts are processed.
    Type: Application
    Filed: May 16, 2022
    Publication date: September 1, 2022
    Applicants: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: Michael Mantor, Laurent Lefebvre, Mark Fowler, Timothy Kelley, Mikko Alho, Mika Tuomi, Kiia Kallio, Patrick Klas Rudolf Buss, Jari Antero Komppa, Kaj Tuomi
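A small Python sketch of the binning flow in the abstract above (and in the related granted patent 11335052 further down): build a primitive batch, compute the bin intercepts per primitive, then shade bin by bin until every intercept has been processed. The bin size, bounding-box primitive format, and walk order are illustrative assumptions.
```python
BIN = 16  # assumed 16x16-pixel bins

def bin_intercepts(prim):
    (x0, y0), (x1, y1) = prim            # primitive bounding box in pixels
    return {(bx, by)
            for bx in range(x0 // BIN, x1 // BIN + 1)
            for by in range(y0 // BIN, y1 // BIN + 1)}

def shade_batch(batch):
    intercepts = {i: bin_intercepts(p) for i, p in enumerate(batch)}
    all_bins = sorted(set().union(*intercepts.values()))
    for b in all_bins:                    # iterate until every intercept is processed
        for i, p in enumerate(batch):
            if b in intercepts[i]:
                print(f"shade primitive {i} in bin {b}")

shade_batch([((0, 0), (20, 10)), ((30, 30), (40, 47))])
```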
  • Publication number: 20220269620
    Abstract: A processor maintains an access log indicating a stream of cache misses at a cache of the processor. In response to each of at least a subset of cache misses at the cache, the processor records a corresponding entry in the access log, indicating a physical memory address of the memory access request that resulted in the corresponding miss. In addition, the processor maintains an address translation log that indicates a mapping of physical memory addresses to virtual memory addresses. In response to an address translation (e.g., a page walk) that translates a virtual address to a physical address, the processor stores a mapping of the physical address to the corresponding virtual address at an entry of the address translation log. Software executing at the processor can use the two logs for memory management.
    Type: Application
    Filed: February 8, 2022
    Publication date: August 25, 2022
    Inventors: Benjamin T. SANDER, Mark Fowler, Anthony Asaro, Gongxian Jeffrey Cheng, Michael Mantor
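A hedged sketch of the two logs described above: an access log of physical addresses that missed in the cache, and an address translation log filled on page walks, which software can join for memory management. The page size, log layout, and helper names are assumptions.
```python
access_log = []          # physical addresses that missed in the cache
translation_log = {}     # physical page -> virtual page, filled on page walks

PAGE = 4096

def record_miss(phys_addr):
    access_log.append(phys_addr)

def record_page_walk(virt_addr, phys_addr):
    translation_log[phys_addr // PAGE] = virt_addr // PAGE

def hot_virtual_pages():
    """Software-side memory management: map logged misses back to virtual pages."""
    return {translation_log[pa // PAGE] for pa in access_log
            if pa // PAGE in translation_log}

record_page_walk(virt_addr=0x7F0000, phys_addr=0x12000)
record_miss(0x12008)
record_miss(0x12080)
print(hot_virtual_pages())   # {2032}: the virtual page number with cache misses
```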
  • Patent number: 11409840
    Abstract: An array processor includes processor element arrays distributed in rows and columns. The processor element arrays perform operations on parameter values. The array processor also includes memory interfaces that are dynamically mapped to mutually exclusive subsets of the rows and columns of the processor element arrays based on dimensions of matrices that provide the parameter values to the processor element arrays. In some cases, the processor element arrays are vector arithmetic logic unit (ALU) processors and the memory interfaces are direct memory access (DMA) engines. The rows of the processor element arrays in the subsets are mutually exclusive to the rows in the other subsets and the columns of the processor element arrays in the subsets are mutually exclusive to the columns in the other subsets. The matrices can be symmetric or asymmetric, e.g., one of the matrices can be a vector having a single column.
    Type: Grant
    Filed: September 25, 2020
    Date of Patent: August 9, 2022
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Sateesh Lagudu, Allen H. Rush, Michael Mantor, Arun Vaidyanathan Ananthanarayan, Prasad Nagabhushanamgari
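A loose sketch of mapping memory interfaces (such as DMA engines) to mutually exclusive subsets of the rows and columns of the processor-element-array grid described above. The round-robin assignment policy, grid size, and interface count are invented for illustration; the filing derives the mapping dynamically from the matrix dimensions.
```python
def map_interfaces(num_rows, num_cols, num_interfaces):
    """Assign each row and each column to exactly one interface (mutually exclusive)."""
    mapping = {i: {"rows": [], "cols": []} for i in range(num_interfaces)}
    for r in range(num_rows):
        mapping[r % num_interfaces]["rows"].append(r)   # each row owned by one interface
    for c in range(num_cols):
        mapping[c % num_interfaces]["cols"].append(c)   # each column owned by one interface
    return mapping

# e.g., an 8x8 grid of processor element arrays fed by 4 memory interfaces
for iface, subset in map_interfaces(8, 8, 4).items():
    print(iface, subset)
```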
  • Patent number: 11409536
    Abstract: A method and apparatus for performing a multi-precision computation in a plurality of arithmetic logic units (ALUs) includes pairing a first Single Instruction/Multiple Data (SIMD) block channel device with a second SIMD block channel device to create a first block pair having one-level staggering between the first and second channel devices. A third SIMD block channel device is paired with a fourth SIMD block channel device to create a second block pair having one-level staggering between the third and fourth channel devices. A plurality of source inputs are received at the first block pair and the second block pair. The first block pair computes a first result, and the second block pair computes a second result.
    Type: Grant
    Filed: November 3, 2016
    Date of Patent: August 9, 2022
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Bin He, YunXiao Zou, Jiasheng Chen, Michael Mantor
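An illustrative software stand-in for the pairing idea above: two narrower "channel devices" cooperate, with the second staggered one step behind the first so it can consume the carry, to produce a wider result. A 32-bit add built from two 16-bit halves is used as the stand-in operation; the widths and helper name are assumptions, not the filing's arithmetic.
```python
MASK16 = 0xFFFF

def add32_via_two_16bit_channels(a, b):
    lo = (a & MASK16) + (b & MASK16)        # first channel: low halves
    carry = lo >> 16
    hi = (a >> 16) + (b >> 16) + carry      # second channel: staggered, consumes the carry
    return ((hi & MASK16) << 16) | (lo & MASK16)

assert add32_via_two_16bit_channels(0x0001FFFF, 0x00000001) == 0x00020000
print(hex(add32_via_two_16bit_channels(0xDEADBEEF, 0x00001111)))  # 0xdeadd000
```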
  • Publication number: 20220237851
    Abstract: A graphics processing unit (GPU) or other apparatus includes a plurality of shader engines. The apparatus also includes a first front end (FE) circuit and one or more second FE circuits. The first FE circuit is configured to schedule geometry workloads for the plurality of shader engines in a first mode. The first FE circuit is configured to schedule geometry workloads for a first subset of the plurality of shader engines and the one or more second FE circuits are configured to schedule geometry workloads for a second subset of the plurality of shader engines in a second mode. In some cases, a partition switch is configured to selectively connect the first FE circuit or the one or more second FE circuits to the second subset of the plurality of shader engines depending on whether the apparatus is in the first mode or the second mode.
    Type: Application
    Filed: March 29, 2022
    Publication date: July 28, 2022
    Inventors: Mark LEATHER, Michael MANTOR
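A rough sketch of the two scheduling modes described above (shared with granted patent 11295507 below): one front-end (FE) circuit feeds all shader engines in the first mode, while a partition switch splits the engines between two FE circuits in the second mode. Engine counts, names, and the round-robin policy are illustrative assumptions.
```python
SHADER_ENGINES = ["SE0", "SE1", "SE2", "SE3"]

def schedule(workloads, mode):
    assignments = []
    half = len(SHADER_ENGINES) // 2
    for i, work in enumerate(workloads):
        if mode == 1:
            fe = "FE0"                                       # first FE drives all engines
            engine = SHADER_ENGINES[i % len(SHADER_ENGINES)]
        elif i % 2 == 0:
            fe, engine = "FE0", SHADER_ENGINES[(i // 2) % half]          # first subset
        else:
            fe, engine = "FE1", SHADER_ENGINES[half + (i // 2) % half]   # second subset
        assignments.append((work, fe, engine))
    return assignments

print(schedule(["geomA", "geomB", "geomC", "geomD"], mode=2))
```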
  • Patent number: 11397578
    Abstract: An apparatus such as a graphics processing unit (GPU) includes a plurality of processing elements configured to concurrently execute a plurality of first waves and accumulators associated with the plurality of processing elements. The accumulators are configured to store accumulated values representative of behavioral characteristics of the plurality of first waves that are concurrently executing on the plurality of processing elements. The apparatus also includes a dispatcher configured to dispatch second waves to the plurality of processing elements based on comparisons of values representative of behavioral characteristics of the second waves and the accumulated values stored in the accumulators. In some cases, the behavioral characteristics of the plurality of first waves comprise at least one of fetch bandwidths, usage of an arithmetic logic unit (ALU), and number of export operations.
    Type: Grant
    Filed: August 30, 2019
    Date of Patent: July 26, 2022
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Randy Ramsey, William David Isenberg, Michael Mantor
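A minimal sketch of the dispatch decision above: accumulators track behavioral characteristics (fetch bandwidth, ALU usage, export count) of the waves already executing, and a new wave is dispatched only if comparing its own values against the accumulated totals allows it. The budget thresholds and dictionary representation are assumptions.
```python
BUDGETS = {"fetch_bw": 100, "alu_use": 100, "exports": 100}
accumulators = {k: 0 for k in BUDGETS}        # behavior of currently executing waves

def try_dispatch(wave):
    if all(accumulators[k] + wave[k] <= BUDGETS[k] for k in BUDGETS):
        for k in BUDGETS:                     # wave accepted: fold it into the totals
            accumulators[k] += wave[k]
        return True
    return False                              # would oversubscribe some resource

print(try_dispatch({"fetch_bw": 60, "alu_use": 30, "exports": 10}))  # True
print(try_dispatch({"fetch_bw": 50, "alu_use": 20, "exports": 5}))   # False: fetch_bw over budget
```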
  • Patent number: 11386518
    Abstract: The address of the draw or dispatch packet responsible for creating an exception is tied back from the shader/wavefront to the draw command from which it originated. In various embodiments, a method of operating a graphics pipeline and exception handling includes receiving, at a command processor of a graphics processing unit (GPU), an exception signal indicating an occurrence of a pipeline exception at a shader stage of a graphics pipeline. The shader stage generates an exception signal in response to a pipeline exception and transmits the exception signal to the command processor. The command processor determines, based on the exception signal, an address of a command packet responsible for the occurrence of the pipeline exception.
    Type: Grant
    Filed: September 24, 2019
    Date of Patent: July 12, 2022
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Michael Mantor, Alexander Fuad Ashkar, Randy Ramsey, Mangesh P. Nijasure, Brian Emberling
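A hedged sketch of the exception path above: the command processor records which command-packet address launched each draw or dispatch, and when a shader stage signals an exception it maps the signal back to the responsible packet's address. IDs, addresses, and the signal format are invented for illustration.
```python
packet_address_of_draw = {}      # draw/dispatch ID -> address of the command packet

def command_processor_launch(draw_id, packet_addr):
    packet_address_of_draw[draw_id] = packet_addr

def shader_stage_raise_exception(draw_id, reason):
    return {"draw_id": draw_id, "reason": reason}      # exception signal sent to the CP

def command_processor_handle(exception_signal):
    addr = packet_address_of_draw[exception_signal["draw_id"]]
    print(f"exception '{exception_signal['reason']}' caused by packet at {hex(addr)}")

command_processor_launch(draw_id=7, packet_addr=0x1000A0)
command_processor_handle(shader_stage_raise_exception(7, "page fault"))
```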
  • Patent number: 11379941
    Abstract: Improvements in the graphics processing pipeline are disclosed. More specifically, a new primitive shader stage performs tasks of the vertex shader stage or, if tessellation is enabled, the domain shader stage, of the geometry shader if enabled, and of the fixed-function primitive assembler. The primitive shader stage is compiled by a driver from user-provided vertex or domain shader code, geometry shader code, and from code that performs functions of the primitive assembler. Moving tasks of the fixed-function primitive assembler to a primitive shader that executes in programmable hardware provides many benefits, such as removal of a fixed-function crossbar, removal of dedicated parameter and position buffers that are unusable in general compute mode, and other benefits.
    Type: Grant
    Filed: January 25, 2017
    Date of Patent: July 5, 2022
    Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: Todd Martin, Mangesh P. Nijasure, Randy W. Ramsey, Michael Mantor, Laurent Lefebvre
  • Publication number: 20220197655
    Abstract: An array processor includes processor element arrays (PEAs) distributed in rows and columns. The PEAs are configured to perform operations on parameter values. A first sequencer receives a first direct memory access (DMA) instruction that includes a request to read data from at least one address in memory. A texture address (TA) engine requests the data from the memory based on the at least one address and a texture data (TD) engine provides the data to the PEAs. The PEAs provide first synchronization signals to the TD engine to indicate availability of registers for receiving the data. The TD engine provides second synchronization signals to the first sequencer in response to receiving acknowledgments that the PEAs have consumed the data.
    Type: Application
    Filed: December 10, 2021
    Publication date: June 23, 2022
    Inventors: Sateesh LAGUDU, Arun Vaidyanathan ANANTHANARAYAN, Michael Mantor, Allen H. Rush
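A software stand-in for the handshake described above: a sequencer issues a DMA-style read, a texture address (TA) stage fetches the data, and a texture data (TD) stage forwards it only when a processor element array (PEA) register slot is free, signalling back once the data is consumed. The queue-based model and all names are assumptions; the real engines are hardware blocks.
```python
from collections import deque

memory = {addr: addr * 10 for addr in range(16)}
free_register_slots = deque(["r0", "r1"])      # first sync signal: PEA register availability

def sequencer(addresses):
    for addr in addresses:                     # DMA-style read request
        ta_engine(addr)

def ta_engine(addr):
    td_engine(addr, memory[addr])              # request the data for this address

def td_engine(addr, data):
    slot = free_register_slots.popleft()       # wait for a free PEA register
    pea_consume(slot, data)
    free_register_slots.append(slot)           # second sync signal: data consumed

def pea_consume(slot, data):
    print(f"PEA consumed {data} via {slot}")

sequencer([3, 7, 11])
```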
  • Publication number: 20220197973
    Abstract: A processing system includes a first set and a second set of general-purpose registers (GPRs) and memory access circuitry that fetches nonzero values of a sparse matrix into consecutive slots in the first set. The memory access circuitry also fetches values of an expanded matrix into consecutive slots in the second set of GPRs. The expanded matrix is formed based on values of a vector and locations of the nonzero values in the sparse matrix. The processing system also includes a set of multipliers that concurrently perform multiplication of the nonzero values in slots of the first set of GPRs with the values of the vector in corresponding slots of the second set. Reduced sum circuitry accumulates results from the set of multipliers for rows of the sparse matrix.
    Type: Application
    Filed: December 17, 2020
    Publication date: June 23, 2022
    Inventors: Sateesh LAGUDU, Allen H. RUSH, Michael MANTOR
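A small Python sketch of the sparse multiply above: nonzero matrix values are packed into consecutive slots of one register set, the vector values aligned with those nonzeros form the "expanded matrix" in a second set, the two are multiplied slot by slot, and a reduced sum accumulates per-row results. The register files are modeled as plain lists and the data is illustrative.
```python
sparse_rows = [              # each row as (column, value) pairs
    [(0, 2.0), (3, 1.5)],
    [],
    [(1, 4.0)],
]
vector = [1.0, 2.0, 3.0, 4.0]

gpr_a, gpr_b, row_of_slot = [], [], []
for r, row in enumerate(sparse_rows):
    for col, val in row:
        gpr_a.append(val)            # first GPR set: nonzero values, consecutive slots
        gpr_b.append(vector[col])    # second GPR set: matching vector values (expanded matrix)
        row_of_slot.append(r)

products = [a * b for a, b in zip(gpr_a, gpr_b)]   # concurrent multipliers
result = [0.0] * len(sparse_rows)
for slot, p in enumerate(products):                # reduced-sum circuitry per row
    result[row_of_slot[slot]] += p

print(result)   # [2*1 + 1.5*4, 0, 4*2] -> [8.0, 0.0, 8.0]
```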
  • Publication number: 20220188076
    Abstract: A processing system executes wavefronts at multiple arithmetic logic unit (ALU) pipelines of a single instruction multiple data (SIMD) unit in a single execution cycle. The ALU pipelines each include a number of ALUs that execute instructions on wavefront operands that are collected from vector general-purpose register (VGPR) banks at a cache and output results of the instructions executed on the wavefronts at a buffer. By storing wavefronts supplied by the VGPR banks at the cache, a greater number of wavefronts can be made available to the SIMD unit without increasing the VGPR bandwidth, enabling multiple ALU pipelines to execute instructions during a single execution cycle.
    Type: Application
    Filed: December 14, 2020
    Publication date: June 16, 2022
    Inventors: Bin He, Brian Emberling, Mark Leather, Michael Mantor
  • Patent number: 11335052
    Abstract: A system, a method, and a non-transitory computer-readable storage medium are provided for hybrid rendering with deferred primitive batch binning. A primitive batch is generated from one or more primitives. A bin is identified for processing the primitive batch. At least a portion of each primitive intersecting the identified bin is processed and a next bin for processing the primitive batch is identified based on an intercept walk order. The processing is iteratively repeated for the one or more primitives in the primitive batch for successive bins until all primitives of the primitive batch are completely processed. Then, the one or more primitives in the primitive batch are further processed.
    Type: Grant
    Filed: November 2, 2018
    Date of Patent: May 17, 2022
    Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: Michael Mantor, Laurent Lefebvre, Mark Fowler, Timothy Kelley, Mikko Alho, Mika Tuomi, Kiia Kallio, Patrick Klas Rudolf Buss, Jari Antero Komppa, Kaj Tuomi
  • Patent number: 11295507
    Abstract: A graphics processing unit (GPU) or other apparatus includes a plurality of shader engines. The apparatus also includes a first front end (FE) circuit and one or more second FE circuits. The first FE circuit is configured to schedule geometry workloads for the plurality of shader engines in a first mode. The first FE circuit is configured to schedule geometry workloads for a first subset of the plurality of shader engines and the one or more second FE circuits are configured to schedule geometry workloads for a second subset of the plurality of shader engines in a second mode. In some cases, a partition switch is configured to selectively connect the first FE circuit or the one or more second FE circuits to the second subset of the plurality of shader engines depending on whether the apparatus is in the first mode or the second mode.
    Type: Grant
    Filed: November 6, 2020
    Date of Patent: April 5, 2022
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Mark Leather, Michael Mantor
  • Publication number: 20220100813
    Abstract: An array processor includes processor element arrays distributed in rows and columns. The processor element arrays perform operations on parameter values. The array processor also includes memory interfaces that are dynamically mapped to mutually exclusive subsets of the rows and columns of the processor element arrays based on dimensions of matrices that provide the parameter values to the processor element arrays. In some cases, the processor element arrays are vector arithmetic logic unit (ALU) processors and the memory interfaces are direct memory access (DMA) engines. The rows of the processor element arrays in the subsets are mutually exclusive to the rows in the other subsets and the columns of the processor element arrays in the subsets are mutually exclusive to the columns in the other subsets. The matrices can be symmetric or asymmetric, e.g., one of the matrices can be a vector having a single column.
    Type: Application
    Filed: September 25, 2020
    Publication date: March 31, 2022
    Inventors: Sateesh LAGUDU, Allen H. RUSH, Michael MANTOR, Arun Vaidyanathan ANANTHANARAYAN, Prasad NAGABHUSHANAMGARI