Patents by Inventor Michael Mantor

Michael Mantor has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20210049729
    Abstract: A graphics processing unit (GPU) includes a plurality of programmable processing cores configured to process graphics primitives and corresponding data and a plurality of fixed-function hardware units. The plurality of processing cores and the plurality of fixed-function hardware units are configured to implement a configurable number of virtual pipelines to concurrently process different command flows. Each virtual pipeline includes a configurable number of fragments and an operational state of each virtual pipeline is specified by a different context. The configurable number of virtual pipelines can be modified from a first number to a second number that is different than the first number. An emulation of a fixed-function hardware unit can be instantiated on one or more of the graphics processing cores in response to detection of a bottleneck in a fixed-function hardware unit. One or more of the virtual pipelines can then be reconfigured to utilize the emulation instead of the fixed-function hardware unit.
    Type: Application
    Filed: May 21, 2020
    Publication date: February 18, 2021
    Inventors: Timour T. PALTASHEV, Michael MANTOR, Rex Eldon MCCRARY
  • Patent number: 10922868
    Abstract: Improvements in the graphics processing pipeline that allow multiple pipelines to cooperate to render a single frame are disclosed. Two approaches are provided. In a first approach, world-space pipelines for the different graphics processing pipelines process all work for draw calls received from a central processing unit (CPU). In a second approach, the world-space pipelines divide up the work. Work that is divided is synchronized and redistributed at various points in the world-space pipeline. In either approach, the triangles output by the world-space pipelines are distributed to the screen-space pipelines based on the portions of the render surface overlapped by the triangles. Triangles are rendered by screen-space pipelines associated with the render surface portions overlapped by those triangles.
    Type: Grant
    Filed: June 26, 2019
    Date of Patent: February 16, 2021
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Mangesh P. Nijasure, Todd Martin, Michael Mantor
  • Patent number: 10860418
    Abstract: A system and method for protecting memory instructions against faults are described. The system and method include converting the slave instructions to dummy operations, modifying memory arbiter to issue up to N master and N slave global/shared memory instructions per cycle, sending master memory requests to memory system, using slave requests for error checking, entering master requests to the GM/LM FIFO, storing slave requests in a register, and comparing the entered master requests with the stored slave requests.
    Type: Grant
    Filed: April 8, 2019
    Date of Patent: December 8, 2020
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: John Kalamatianos, Michael Mantor, Sudhanva Gurumurthi
  • Publication number: 20200379767
    Abstract: A method of context bouncing includes receiving, at a command processor of a graphics processing unit (GPU), a conditional execute packet providing a hash identifier corresponding to an encapsulated state. The encapsulated state includes one or more context state packets following the conditional execute packet. A command packet following the encapsulated state is executed based at least in part on determining whether the hash identifier of the encapsulated state matches one of a plurality of hash identifiers of active context states currently stored at the GPU.
    Type: Application
    Filed: May 30, 2019
    Publication date: December 3, 2020
    Inventors: Rex Eldon MCCRARY, Yi LUO, Harry J. WISE, Alexander Fuad ASHKAR, Michael MANTOR
  • Publication number: 20200293286
    Abstract: A graphics processing unit (GPU) implements operations, with associated op codes, to perform mixed precision mathematical operations. The GPU includes an arithmetic logic unit (ALU) with different execution paths, wherein each execution path executes a different mixed precision operation. By implementing mixed precision operations at the ALU in response to designate op codes that delineate the operations, the GPU efficiently increases the precision of specified mathematical operations while reducing execution overhead.
    Type: Application
    Filed: October 2, 2019
    Publication date: September 17, 2020
    Inventors: Bin HE, Michael MANTOR, Jiasheng CHEN
  • Publication number: 20200293329
    Abstract: A processing element is implemented in a stage of a pipeline and configured to execute an instruction. A first array of multiplexers is to provide information associated with the instruction to the processing element in response to the instruction being in a first set of instructions. A second array of multiplexers is to provide information associated with the instruction to the first processing element in response to the instruction being in a second set of instructions. A control unit is to gate at least one of power or a clock signal provided to the first array of multiplexers in response to the instruction being in the second set.
    Type: Application
    Filed: April 28, 2020
    Publication date: September 17, 2020
    Inventors: Jiasheng CHEN, YunXiao ZOU, Bin HE, Angel E. SOCARRAS, QingCheng WANG, Wei YUAN, Michael MANTOR
  • Publication number: 20200210341
    Abstract: Embodiments include methods, systems and non-transitory computer-readable computer readable media including instructions for executing a prefetch kernel with reduced intermediate state storage resource requirements. These include executing a prefetch kernel on a graphics processing unit (GPU), such that the prefetch kernel begins executing before a processing kernel. The prefetch kernel performs memory operations that are based upon at least a subset of memory operations in the processing kernel.
    Type: Application
    Filed: March 9, 2020
    Publication date: July 2, 2020
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Nuwan S. Jayasena, James Michael O'Connor, Michael Mantor
  • Patent number: 10664942
    Abstract: A graphics processing unit (GPU) includes a plurality of programmable processing cores configured to process graphics primitives and corresponding data and a plurality of fixed-function hardware units. The plurality of processing cores and the plurality of fixed-function hardware units are configured to implement a configurable number of virtual pipelines to concurrently process different command flows. Each virtual pipeline includes a configurable number of fragments and an operational state of each virtual pipeline is specified by a different context. The configurable number of virtual pipelines can be modified from a first number to a second number that is different than the first number. An emulation of a fixed-function hardware unit can be instantiated on one or more of the graphics processing cores in response to detection of a bottleneck in a fixed-function hardware unit. One or more of the virtual pipelines can then be reconfigured to utilize the emulation instead of the fixed-function hardware unit.
    Type: Grant
    Filed: October 21, 2016
    Date of Patent: May 26, 2020
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Timour T. Paltashev, Michael Mantor, Rex Eldon McCrary
  • Patent number: 10656951
    Abstract: A processing element is implemented in a stage of a pipeline and configured to execute an instruction. A first array of multiplexers is to provide information associated with the instruction to the processing element in response to the instruction being in a first set of instructions. A second array of multiplexers is to provide information associated with the instruction to the first processing element in response to the instruction being in a second set of instructions. A control unit is to gate at least one of power or a clock signal provided to the first array of multiplexers in response to the instruction being in the second set.
    Type: Grant
    Filed: October 20, 2017
    Date of Patent: May 19, 2020
    Assignees: ADVANCED MICRO DEVICES, INC., ADVANCED MICRO DEVICES (SHANGHAI) CO., LTD.
    Inventors: Jiasheng Chen, YunXiao Zou, Bin He, Angel E. Socarras, QingCheng Wang, Wei Yuan, Michael Mantor
  • Patent number: 10585801
    Abstract: Embodiments include methods, systems and computer readable media configured to execute a first kernel (e.g. compute or graphics kernel) with reduced intermediate state storage resource requirements. These include executing a first and second (e.g. prefetch) kernel on a data-parallel processor, such that the second kernel begins executing before the first kernel. The second kernel performs memory operations that are based upon at least a subset of memory operations in the first kernel.
    Type: Grant
    Filed: November 26, 2012
    Date of Patent: March 10, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Nuwan S. Jayasena, James Michael O'Connor, Michael Mantor
  • Patent number: 10579388
    Abstract: A method for use in a processor for arbitrating between multiple processes to select wavefronts for execution on a shader core is provided. The processor includes a compute pipeline configured to issue wavefronts to the shader core for execution, a hardware queue descriptor associated with the compute pipeline, and the shader core. The shader core is configured to execute work for the compute pipeline corresponding to a first memory queue descriptor executed using data for the first memory queue descriptor that is loaded into a first hardware queue descriptor. The processor is configured to detect a context switch condition, and, responsive to the context switch condition, perform a context switch operation including loading data for a second memory queue descriptor into the first hardware queue descriptor. The shader core is configured to execute work corresponding to the second memory queue descriptor that is loaded into the first hardware queue descriptor.
    Type: Grant
    Filed: July 19, 2018
    Date of Patent: March 3, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Robert Scott Hartog, Mark Leather, Michael Mantor, Rex McCrary, Sebastien Nussbaum, Philip J. Rogers, Ralph Clay Taylor, Thomas Woller
  • Patent number: 10558499
    Abstract: Footprints, or resource allocations, of waves within resources that are shared by processor cores in a multithreaded processor are measured concurrently with the waves executing on the processor cores. The footprints are averaged over a time interval. A number of waves are spawned and dispatched for execution in the multithreaded processor based on the average footprint. In some cases, the waves are spawned at a rate that is determined based on the average value of the footprints of waves within the resources. The rate of spawning waves is modified in response to a change in the average value of the footprints of the waves within the resources.
    Type: Grant
    Filed: October 26, 2017
    Date of Patent: February 11, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Maxim V. Kazakov, Michael Mantor
  • Patent number: 10546365
    Abstract: An apparatus, such as a head mounted device (HMD), includes one or more processors configured to implement a graphics pipeline that renders pixels in window space with a nonuniform pixel spacing. The apparatus also includes a first distortion function that maps the non-uniformly spaced pixels in window space to uniformly spaced pixels in raster space. The apparatus further includes a scan converter configured to sample the pixels in window space through the first distortion function. The scan converter is configured to render display pixels used to generate an image for display to a user based on the uniformly spaced pixels in raster space. In some cases, the pixels in the window space are rendered such that a pixel density per subtended area is constant across the user's field of view.
    Type: Grant
    Filed: December 15, 2017
    Date of Patent: January 28, 2020
    Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: Michael Mantor, Laurent Lefebvre, Mika Tuomi, Kiia Kallio
  • Patent number: 10453243
    Abstract: Processing of non-real-time and real-time workloads is performed using discrete pipelines. A first pipeline includes a first shader and one or more fixed function hardware blocks. A second pipeline includes a second shader that is configured to emulate the at least one fixed function hardware block. First and second memory elements store first state information for the first pipeline and second state information for the second pipeline, respectively. A non-real-time workload executing in the first pipeline is preempted at a primitive boundary in response to a real-time workload being dispatched for execution in the second pipeline. The first memory element retains the first state information in response to preemption of the non-real-time workload. The first pipeline is configured to resume processing the subsequent primitive on the basis of the first state information stored in the first memory element.
    Type: Grant
    Filed: January 3, 2019
    Date of Patent: October 22, 2019
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Anirudh R. Acharya, Swapnil Sakharshete, Michael Mantor, Mangesh P. Nijasure, Todd Martin, Vineet Goel
  • Publication number: 20190318527
    Abstract: Improvements in the graphics processing pipeline that allow multiple pipelines to cooperate to render a single frame are disclosed. Two approaches are provided. In a first approach, world-space pipelines for the different graphics processing pipelines process all work for draw calls received from a central processing unit (CPU). In a second approach, the world-space pipelines divide up the work. Work that is divided is synchronized and redistributed at various points in the world-space pipeline. In either approach, the triangles output by the world-space pipelines are distributed to the screen-space pipelines based on the portions of the render surface overlapped by the triangles. Triangles are rendered by screen-space pipelines associated with the render surface portions overlapped by those triangles.
    Type: Application
    Filed: June 26, 2019
    Publication date: October 17, 2019
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Mangesh P. NIJASURE, Todd MARTIN, Michael MANTOR
  • Publication number: 20190278605
    Abstract: A system includes a processor configured to operate in at least a first mode and a second mode. In the first mode the first processor operates to execute an instruction for an entire wavefront before executing a next instruction for the entire wavefront. In the second mode the processor operates to execute a set instructions for a portion of a wavefront before executing the set instructions for another portion of the same wavefront. The system further includes a memory coupled to the processor. The memory is configured to store a shader program for execution by the processor, wherein the shader program includes at least one indication associated with one of the first mode or the second mode. The processor is further to implement one of the first mode or the second mode while executing the shader program responsive to the at least one indication present in the first shader program.
    Type: Application
    Filed: May 29, 2019
    Publication date: September 12, 2019
    Inventors: Brian EMBERLING, Michael MANTOR
  • Patent number: 10388056
    Abstract: Improvements in the graphics processing pipeline that allow multiple pipelines to cooperate to render a single frame are disclosed. Two approaches are provided. In a first approach, world-space pipelines for the different graphics processing pipelines process all work for draw calls received from a central processing unit (CPU). In a second approach, the world-space pipelines divide up the work. Work that is divided is synchronized and redistributed at various points in the world-space pipeline. In either approach, the triangles output by the world-space pipelines are distributed to the screen-space pipelines based on the portions of the render surface overlapped by the triangles. Triangles are rendered by screen-space pipelines associated with the render surface portions overlapped by those triangles.
    Type: Grant
    Filed: January 26, 2017
    Date of Patent: August 20, 2019
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Mangesh P. Nijasure, Todd Martin, Michael Mantor
  • Patent number: 10372522
    Abstract: Techniques for handling memory errors are disclosed. Various memory units of an accelerated processing device (“APD”) include error units for detecting errors in data stored in the memory (e.g., using parity protection or error correcting code). Upon detecting an error considered to be an “initial uncorrectable error,” the error unit triggers transmission of an initial uncorrectable error interrupt (“IUE interrupt”) to a processor. This IUE interrupt includes information identifying the specific memory unit in which the error occurred (and possible other information about the error). A halt interrupt is generated and transmitted to the processor in response to the data having the error being consumed (i.e., used by an operation such as an instruction or command), which causes the APD to halt operations. If the data having the error is not consumed, then the halt interrupt is never generated (that the error occurred may remain logged, however).
    Type: Grant
    Filed: April 28, 2017
    Date of Patent: August 6, 2019
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Carlos Sampayo, Michael Mantor
  • Publication number: 20190235953
    Abstract: A system and method for protecting memory instructions against faults are described. The system and method include converting the slave instructions to dummy operations, modifying memory arbiter to issue up to N master and N slave global/shared memory instructions per cycle, sending master memory requests to memory system, using slave requests for error checking, entering master requests to the GM/LM FIFO, storing slave requests in a register, and comparing the entered master requests with the stored slave requests.
    Type: Application
    Filed: April 8, 2019
    Publication date: August 1, 2019
    Applicant: Advanced Micro Devices, Inc.
    Inventors: John Kalamatianos, Michael Mantor, Sudhanva Gurumurthi
  • Patent number: 10360177
    Abstract: Described is a method and processing apparatus to improve power efficiency by gating redundant threads processing. In particular, the method for gating redundant threads in a graphics processor includes determining if data for a thread and data for at least another thread are within a predetermined similarity threshold, gating execution of the at least another thread if the data for the thread and the data for the at least another thread are within the predetermined similarity threshold, and using an output data from the thread as an output data for the at least another thread.
    Type: Grant
    Filed: June 22, 2016
    Date of Patent: July 23, 2019
    Assignees: ATI Technologies ULC, Advanced Micro Devices, Inc.
    Inventors: Syed Zohaib M. Gilani, Jiasheng Chen, QingCheng Wang, YunXiao Zou, Michael Mantor, Bin He, Timour T. Paltashev