Patents by Inventor Michael Mantor

Michael Mantor has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Processing unit with small footprint arithmetic logic unit

Patent number: 11720328

Abstract: A parallel processing unit employs an arithmetic logic unit (ALU) having a relatively small footprint, thereby reducing the overall power consumption and circuit area of the processing unit. To support the smaller footprint, the ALU includes multiple stages to execute operations corresponding to a received instruction. The ALU executes at least one operation at a precision indicated by the received instruction, and then reduces the resulting data of the at least one operation to a smaller size before providing the results to another stage of the ALU to continue execution of the instruction.

Type: Grant

Filed: September 23, 2020

Date of Patent: August 8, 2023

Assignee: Advanced Micro Devices, Inc.

Inventors: Bin He, Shubh Shah, Michael Mantor
GRAPHICS PROCESSING UNIT TRAVERSAL ENGINE

Publication number: 20230206543

Abstract: A processing unit employs a hardware traversal engine to traverse an acceleration structure such as a ray tracing structure. The hardware traversal engine includes one or more memory modules to store state information and other data used for the structure traversal, and control logic to execute a traversal process based on the stored data and based on received information indicating a source node of the acceleration structure to be used for the traversal process. By employing a hardware traversal engine, the processing unit is able to execute the traversal process more quickly and efficiently, conserving processing resources and improving overall processing efficiency.

Type: Application

Filed: December 28, 2021

Publication date: June 29, 2023

Inventors: Konstantin Igorevich Shkurko, Michael Mantor
Dual vector arithmetic logic unit

Patent number: 11675568

Abstract: A processing system executes wavefronts at multiple arithmetic logic unit (ALU) pipelines of a single instruction multiple data (SIMD) unit in a single execution cycle. The ALU pipelines each include a number of ALUs that execute instructions on wavefront operands that are collected from vector general-purpose register (VGPR) banks at a cache and output results of the instructions executed on the wavefronts at a buffer. By storing wavefronts supplied by the VGPR banks at the cache, a greater number of wavefronts can be made available to the SIMD unit without increasing the VGPR bandwidth, enabling multiple ALU pipelines to execute instructions during a single execution cycle.

Type: Grant

Filed: December 14, 2020

Date of Patent: June 13, 2023

Assignee: Advanced Micro Devices, Inc.

Inventors: Bin He, Brian Emberling, Mark Leather, Michael Mantor
Vertical and horizontal broadcast of shared operands

Patent number: 11635967

Abstract: An array processor includes processor element arrays distributed in rows and columns. The processor element arrays perform operations on parameter values. The array processor also includes memory interfaces that broadcast sets of the parameter values to mutually exclusive subsets of the rows and columns of the processor element arrays. In some cases, the array processor includes single-instruction-multiple-data (SIMD) units including subsets of the processor element arrays in corresponding rows, workgroup processors (WGPs) including subsets of the SIMD units, and a memory fabric configured to interconnect with an external memory that stores the parameter values. The memory interfaces broadcast the parameter values to the SIMD units that include the processor element arrays in rows associated with the memory interfaces and columns of processor element arrays that are implemented across the SIMD units in the WGPs. The memory interfaces access the parameter values from the external memory via the memory fabric.

Type: Grant

Filed: September 25, 2020

Date of Patent: April 25, 2023

Assignee: Advanced Micro Devices, Inc.

Inventors: Sateesh Lagudu, Allen H. Rush, Michael Mantor, Arun Vaidyanathan Ananthanarayan, Prasad Nagabhushanamgari, Maxim V. Kazakov
Dedicated vector sub-processor system

Patent number: 11630667

Abstract: A processor includes a plurality of vector sub-processors (VSPs) and a plurality of memory banks dedicated to respective VSPs. A first memory bank corresponding to a first VSP includes a first plurality of high vector general purpose register (VGPR) banks and a first plurality of low VGPR banks corresponding to the first plurality of high VGPR banks. The first memory bank further includes a plurality of operand gathering components that store operands from respective high VGPR banks and low VGPR banks. The operand gathering components are assigned to individual threads while the threads are executed by the first VSP.

Type: Grant

Filed: November 27, 2019

Date of Patent: April 18, 2023

Assignee: Advanced Micro Devices, Inc.

Inventors: Jiasheng Chen, Bin He, Jian Huang, Michael Mantor
ACCELERATION STRUCTURES WITH DELTA INSTANCES

Publication number: 20230097562

Abstract: Described herein is a technique for performing ray tracing operations. The technique includes encountering, at a non-leaf node, a pointer to a bottom-level acceleration structure having one or more delta instances; identifying an index associated with the pointer, wherein the index identifies an instance within the bottom-level acceleration structure; and obtaining data for the instance based on the pointer and the index.

Type: Application

Filed: September 28, 2021

Publication date: March 30, 2023

Applicant: Advanced Micro Devices, Inc.

Inventors: Konstantin I. Shkurko, Matthäus G. Chajdas, Michael Mantor
CONVOLUTIONAL NEURAL NETWORK OPERATIONS

Publication number: 20230097279

Abstract: Methods and systems are disclosed for executing operations on single-instruction-multiple-data (SIMD) units. Techniques disclosed perform a dot product operation on input data during one computer cycle, including convolving the input data, generating intermediate data, and applying one or more transitional operations to the intermediate data to generate output data. In some aspects, the input data is an input to a layer of a convolutional neural network and the generated output data is the output of the layer.

Type: Application

Filed: September 29, 2021

Publication date: March 30, 2023

Applicant: Advanced Micro Devices, Inc.

Inventors: Brian Emberling, Michael Mantor, Michael Y. Chow, Bin He
Precise suspend and resume of workloads in a processing unit

Patent number: 11609791

Abstract: A first workload is executed in a first subset of pipelines of a processing unit. A second workload is executed in a second subset of the pipelines of the processing unit. The second workload is dependent upon the first workload. The first and second workloads are suspended and state information for the first and second workloads is stored in a first memory in response to suspending the first and second workloads. In some cases, a third workload executes in a third subset of the pipelines of the processing unit concurrently with executing the first and second workloads. In some cases, a fourth workload is executed in the first and second pipelines after suspending the first and second workloads. The first and second pipelines are resumed on the basis of the stored state information in response to completion or suspension of the fourth workload.

Type: Grant

Filed: November 30, 2017

Date of Patent: March 21, 2023

Assignee: Advanced Micro Devices, Inc.

Inventors: Anirudh R. Acharya, Michael Mantor
PREFETCH KERNELS ON DATA-PARALLEL PROCESSORS

Publication number: 20230076872

Abstract: Embodiments include methods, systems, and non-transitory computer-readable media including instructions for executing a prefetch kernel that includes memory accesses for prefetching data for a processing kernel into a memory, and, subsequent to executing at least a portion of the prefetch kernel, executing the processing kernel, where the processing kernel includes accesses to data that is stored into the memory resulting from execution of the prefetch kernel.

Type: Application

Filed: November 11, 2022

Publication date: March 9, 2023

Applicant: Advanced Micro Devices, Inc.

Inventors: Nuwan S. Jayasena, James Michael O'Connor, Michael Mantor
Prefetch kernels on data-parallel processors

Patent number: 11500778

Abstract: Embodiments include methods, systems, and non-transitory computer-readable media including instructions for executing a prefetch kernel with reduced intermediate state storage resource requirements. These include executing a prefetch kernel on a graphics processing unit (GPU), such that the prefetch kernel begins executing before a processing kernel. The prefetch kernel performs memory operations that are based upon at least a subset of memory operations in the processing kernel.

Type: Grant

Filed: March 9, 2020

Date of Patent: November 15, 2022

Assignee: Advanced Micro Devices, Inc.

Inventors: Nuwan S. Jayasena, James Michael O'Connor, Michael Mantor
Pipeline including separate hardware data paths for different instruction types

Patent number: 11494192

Abstract: A processing element is implemented in a stage of a pipeline and configured to execute an instruction. A first array of multiplexers is to provide information associated with the instruction to the processing element in response to the instruction being in a first set of instructions. A second array of multiplexers is to provide information associated with the instruction to the first processing element in response to the instruction being in a second set of instructions. A control unit is to gate at least one of power or a clock signal provided to the first array of multiplexers in response to the instruction being in the second set.

Type: Grant

Filed: April 28, 2020

Date of Patent: November 8, 2022

Assignees: Advanced Micro Devices, Inc., ADVANCED MICRO DEVICES (SHANGHAI) CO., LTD.

Inventors: Jiasheng Chen, YunXiao Zou, Bin He, Angel E. Socarras, QingCheng Wang, Wei Yuan, Michael Mantor
DIE STACKING FOR MODULAR PARALLEL PROCESSORS

Publication number: 20220320042

Abstract: A multi-die parallel processor semiconductor package includes a first base IC die including a first plurality of virtual compute dies 3D stacked on top of the first base IC die. A first subset of a parallel processing pipeline logic is positioned at the first plurality of virtual compute dies. Additionally, a second subset of the parallel processing pipeline logic is positioned at the first base IC die. The multi-die parallel processor semiconductor package also includes a second base IC die including a second plurality of virtual compute dies 3D stacked on top of the second base IC die. An active bridge chip communicably couples a first interconnect structure of the first base IC die to a first interconnect structure of the second base IC die.

Type: Application

Filed: March 30, 2021

Publication date: October 6, 2022

Inventor: Michael Mantor
HYBRID RENDER WITH DEFERRED PRIMITIVE BATCH BINNING

Publication number: 20220277508

Abstract: A method, computer system, and a non-transitory computer-readable storage medium for performing primitive batch binning are disclosed. The method, computer system, and non-transitory computer-readable storage medium include techniques for generating a primitive batch from a plurality of primitives, computing respective bin intercepts for each of the plurality of primitives in the primitive batch, and shading the primitive batch by iteratively processing each of the respective bin intercepts computed until all of the respective bin intercepts are processed.

Type: Application

Filed: May 16, 2022

Publication date: September 1, 2022

Applicants: Advanced Micro Devices, Inc., ATI Technologies ULC

Inventors: Michael Mantor, Laurent Lefebvre, Mark Fowler, Timothy Kelley, Mikko Alho, Mika Tuomi, Kiia Kallio, Patrick Klas Rudolf Buss, Jari Antero Komppa, Kaj Tuomi
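
The batch binning flow summarized in the abstract above (form a batch, compute bin intercepts per primitive, then process bin by bin) can be illustrated with a minimal Python sketch. This is not the patented method: it assumes primitives are represented by integer, inclusive screen-space bounding boxes and that bins are square tiles, and the names `bin_intercepts`, `process_batch`, and `bin_size` are hypothetical.

```python
def bin_intercepts(bbox, bin_size):
    """Return the set of (bin_x, bin_y) tiles overlapped by a bounding box.

    bbox is (x0, y0, x1, y1) in integer pixel coordinates, inclusive.
    Using a bounding box is a simplifying assumption for illustration.
    """
    x0, y0, x1, y1 = bbox
    return {
        (bx, by)
        for bx in range(x0 // bin_size, x1 // bin_size + 1)
        for by in range(y0 // bin_size, y1 // bin_size + 1)
    }


def process_batch(batch, bin_size):
    """Visit every intercepted bin once, listing the primitives that touch it.

    batch maps a primitive name to its bounding box. The returned list
    stands in for "shading": one (bin, primitives) entry per bin.
    """
    # Step 1: compute the bin intercepts for each primitive in the batch.
    intercepts = {name: bin_intercepts(bbox, bin_size) for name, bbox in batch.items()}

    # Step 2: iterate over the bins until all intercepts have been processed.
    all_bins = sorted(set().union(*intercepts.values()))
    schedule = []
    for b in all_bins:
        prims = [name for name in batch if b in intercepts[name]]
        schedule.append((b, prims))
    return schedule


batch = {"t0": (0, 0, 10, 10), "t1": (20, 5, 40, 10)}
schedule = process_batch(batch, bin_size=32)
```

Here `t1` straddles two 32-pixel tiles, so it appears in the work list of both bins, while `t0` is processed only in the first.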
ACCESS LOG AND ADDRESS TRANSLATION LOG FOR A PROCESSOR

Publication number: 20220269620

Abstract: A processor maintains an access log indicating a stream of cache misses at a cache of the processor. In response to each of at least a subset of cache misses at the cache, the processor records a corresponding entry in the access log, indicating a physical memory address of the memory access request that resulted in the corresponding miss. In addition, the processor maintains an address translation log that indicates a mapping of physical memory addresses to virtual memory addresses. In response to an address translation (e.g., a page walk) that translates a virtual address to a physical address, the processor stores a mapping of the physical address to the corresponding virtual address at an entry of the address translation log. Software executing at the processor can use the two logs for memory management.

Type: Application

Filed: February 8, 2022

Publication date: August 25, 2022

Inventors: Benjamin T. Sander, Mark Fowler, Anthony Asaro, Gongxian Jeffrey Cheng, Michael Mantor
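
As a rough illustration of how software might combine the two logs described in the abstract above, the following Python sketch models an access log of physical miss addresses and a translation log of physical-to-virtual mappings, then joins them to find the virtual pages that miss most often. The class and function names, the 4 KiB page size, and the "hottest page" policy are all assumptions made for illustration; the abstract does not specify them.

```python
from collections import Counter

PAGE_SIZE = 4096  # assumed page size, not taken from the abstract


class LoggingProcessorModel:
    """Toy software model of the two logs (hypothetical names)."""

    def __init__(self):
        self.access_log = []       # physical addresses of recorded cache misses
        self.translation_log = {}  # physical page number -> virtual page number

    def record_translation(self, virtual_addr, physical_addr):
        # Stored on an address translation such as a page walk.
        self.translation_log[physical_addr // PAGE_SIZE] = virtual_addr // PAGE_SIZE

    def record_miss(self, physical_addr):
        # Stored when a memory access request misses in the cache.
        self.access_log.append(physical_addr)


def hot_virtual_pages(model, top_n=1):
    """Join the two logs and return the virtual pages with the most misses."""
    counts = Counter()
    for paddr in model.access_log:
        vpage = model.translation_log.get(paddr // PAGE_SIZE)
        if vpage is not None:  # skip misses with no recorded translation
            counts[vpage] += 1
    return [vpage for vpage, _ in counts.most_common(top_n)]


model = LoggingProcessorModel()
model.record_translation(virtual_addr=0x10000, physical_addr=0x3000)
model.record_translation(virtual_addr=0x20000, physical_addr=0x7000)
for paddr in (0x3010, 0x3020, 0x7000):
    model.record_miss(paddr)
hottest = hot_virtual_pages(model, top_n=2)
```

A memory manager could use such a ranking to decide, for example, which virtual pages to migrate or keep resident, though that use is only one possibility.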
Dynamically adaptable arrays for vector and matrix operations

Patent number: 11409840

Abstract: An array processor includes processor element arrays distributed in rows and columns. The processor element arrays perform operations on parameter values. The array processor also includes memory interfaces that are dynamically mapped to mutually exclusive subsets of the rows and columns of the processor element arrays based on dimensions of matrices that provide the parameter values to the processor element arrays. In some cases, the processor element arrays are vector arithmetic logic unit (ALU) processors and the memory interfaces are direct memory access (DMA) engines. The rows of the processor element arrays in the subsets are mutually exclusive to the rows in the other subsets and the columns of the processor element arrays in the subsets are mutually exclusive to the columns in the other subsets. The matrices can be symmetric or asymmetric, e.g., one of the matrices can be a vector having a single column.

Type: Grant

Filed: September 25, 2020

Date of Patent: August 9, 2022

Assignee: Advanced Micro Devices, Inc.

Inventors: Sateesh Lagudu, Allen H. Rush, Michael Mantor, Arun Vaidyanathan Ananthanarayan, Prasad Nagabhushanamgari
Pairing SIMD lanes to perform double precision operations

Patent number: 11409536

Abstract: A method and apparatus for performing a multi-precision computation in a plurality of arithmetic logic units (ALUs) includes pairing a first Single Instruction/Multiple Data (SIMD) block channel device with a second SIMD block channel device to create a first block pair having one-level staggering between the first and second channel devices. A third SIMD block channel device is paired with a fourth SIMD block channel device to create a second block pair having one-level staggering between the third and fourth channel devices. A plurality of source inputs are received at the first block pair and the second block pair. The first block pair computes a first result, and the second block pair computes a second result.

Type: Grant

Filed: November 3, 2016

Date of Patent: August 9, 2022

Assignee: Advanced Micro Devices, Inc.

Inventors: Bin He, YunXiao Zou, Jiasheng Chen, Michael Mantor
SPATIAL PARTITIONING IN A MULTI-TENANCY GRAPHICS PROCESSING UNIT

Publication number: 20220237851

Abstract: A graphics processing unit (GPU) or other apparatus includes a plurality of shader engines. The apparatus also includes a first front end (FE) circuit and one or more second FE circuits. The first FE circuit is configured to schedule geometry workloads for the plurality of shader engines in a first mode. The first FE circuit is configured to schedule geometry workloads for a first subset of the plurality of shader engines and the one or more second FE circuits are configured to schedule geometry workloads for a second subset of the plurality of shader engines in a second mode. In some cases, a partition switch is configured to selectively connect the first FE circuit or the one or more second FE circuits to the second subset of the plurality of shader engines depending on whether the apparatus is in the first mode or the second mode.

Type: Application

Filed: March 29, 2022

Publication date: July 28, 2022

Inventors: Mark Leather, Michael Mantor
Selectively dispatching waves based on accumulators holding behavioral characteristics of waves currently executing

Patent number: 11397578

Abstract: An apparatus such as a graphics processing unit (GPU) includes a plurality of processing elements configured to concurrently execute a plurality of first waves and accumulators associated with the plurality of processing elements. The accumulators are configured to store accumulated values representative of behavioral characteristics of the plurality of first waves that are concurrently executing on the plurality of processing elements. The apparatus also includes a dispatcher configured to dispatch second waves to the plurality of processing elements based on comparisons of values representative of behavioral characteristics of the second waves and the accumulated values stored in the accumulators. In some cases, the behavioral characteristics of the plurality of first waves comprise at least one of fetch bandwidths, usage of an arithmetic logic unit (ALU), and number of export operations.

Type: Grant

Filed: August 30, 2019

Date of Patent: July 26, 2022

Assignee: Advanced Micro Devices, Inc.

Inventors: Randy Ramsey, William David Isenberg, Michael Mantor
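
The accumulator-based dispatch described in the abstract above can be sketched in Python under one assumed policy: a pending wave is dispatched only if adding its characteristics to the accumulated values keeps every accumulator within a per-resource limit. The limits, the `Wave` fields, and the admission rule are hypothetical; the abstract states only that dispatch is based on comparing the waves' characteristics with the accumulated values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Wave:
    name: str
    fetch_bandwidth: int  # relative units, assumed
    alu_usage: int
    exports: int


class Dispatcher:
    """Toy dispatcher that tracks accumulated characteristics of running waves."""

    def __init__(self, limits):
        self.limits = dict(limits)                  # assumed per-resource budgets
        self.accumulators = {k: 0 for k in limits}  # sums over running waves
        self.running = []

    def _fits(self, wave):
        # Compare the wave's characteristics against the accumulated values.
        return all(
            self.accumulators[k] + getattr(wave, k) <= self.limits[k]
            for k in self.limits
        )

    def dispatch(self, pending):
        """Dispatch every pending wave that fits; return the dispatched names."""
        dispatched = []
        for wave in pending:
            if self._fits(wave):
                for k in self.limits:
                    self.accumulators[k] += getattr(wave, k)
                self.running.append(wave)
                dispatched.append(wave.name)
        return dispatched

    def retire(self, wave):
        """Remove a finished wave and release its share of each accumulator."""
        self.running.remove(wave)
        for k in self.limits:
            self.accumulators[k] -= getattr(wave, k)


dispatcher = Dispatcher({"fetch_bandwidth": 10, "alu_usage": 10, "exports": 4})
pending = [Wave("A", 6, 2, 1), Wave("B", 6, 2, 1), Wave("C", 2, 6, 1)]
first_round = dispatcher.dispatch(pending)
```

In this run the fetch-heavy wave `B` is held back because `A` already consumes most of the fetch budget, while the ALU-heavy wave `C` is allowed to run alongside `A`.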
Exception handler for sampling draw dispatch identifiers

Patent number: 11386518

Abstract: A shader/wavefront that raises an exception is tied back to the address of the draw or dispatch packet, and thus to the draw command, from which it originated. In various embodiments, a method of operating a graphics pipeline and exception handling includes receiving, at a command processor of a graphics processing unit (GPU), an exception signal indicating an occurrence of a pipeline exception at a shader stage of a graphics pipeline. The shader stage generates an exception signal in response to a pipeline exception and transmits the exception signal to the command processor. The command processor determines, based on the exception signal, an address of a command packet responsible for the occurrence of the pipeline exception.

Type: Grant

Filed: September 24, 2019

Date of Patent: July 12, 2022

Assignee: Advanced Micro Devices, Inc.

Inventors: Michael Mantor, Alexander Fuad Ashkar, Randy Ramsey, Mangesh P. Nijasure, Brian Emberling
Primitive shader

Patent number: 11379941

Abstract: Improvements in the graphics processing pipeline are disclosed. More specifically, a new primitive shader stage performs tasks of the vertex shader stage or a domain shader stage if tessellation is enabled, a geometry shader if enabled, and a fixed function primitive assembler. The primitive shader stage is compiled by a driver from user-provided vertex or domain shader code, geometry shader code, and from code that performs functions of the primitive assembler. Moving tasks of the fixed function primitive assembler to a primitive shader that executes in programmable hardware provides many benefits, such as removal of a fixed function crossbar, removal of dedicated parameter and position buffers that are unusable in general compute mode, and other benefits.

Type: Grant

Filed: January 25, 2017

Date of Patent: July 5, 2022

Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC

Inventors: Todd Martin, Mangesh P. Nijasure, Randy W. Ramsey, Michael Mantor, Laurent Lefebvre
