Patents by Inventor Michael Mantor

Michael Mantor has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

PRIMITIVE LEVEL PREEMPTION USING DISCRETE NON-REAL-TIME AND REAL TIME PIPELINES

Publication number: 20190164328

Abstract: Processing of non-real-time and real-time workloads is performed using discrete pipelines. A first pipeline includes a first shader and one or more fixed function hardware blocks. A second pipeline includes a second shader that is configured to emulate the at least one fixed function hardware block. First and second memory elements store first state information for the first pipeline and second state information for the second pipeline, respectively. A non-real-time workload executing in the first pipeline is preempted at a primitive boundary in response to a real-time workload being dispatched for execution in the second pipeline. The first memory element retains the first state information in response to preemption of the non-real-time workload. The first pipeline is configured to resume processing the subsequent primitive on the basis of the first state information stored in the first memory element.

Type: Application

Filed: January 3, 2019

Publication date: May 30, 2019

Inventors: Anirudh R. ACHARYA, Swapnil SAKHARSHETE, Michael MANTOR, Mangesh P. NIJASURE, Todd MARTIN, Vineet GOEL
PRECISE SUSPEND AND RESUME OF WORKLOADS IN A PROCESSING UNIT

Publication number: 20190163527

Abstract: A first workload is executed in a first subset of pipelines of a processing unit. A second workload is executed in a second subset of the pipelines of the processing unit. The second workload is dependent upon the first workload. The first and second workloads are suspended and state information for the first and second workloads is stored in a first memory in response to suspending the first and second workloads. In some cases, a third workload executes in a third subset of the pipelines of the processing unit concurrently with executing the first and second workloads. In some cases, a fourth workload is executed in the first and second pipelines after suspending the first and second workloads. The first and second pipelines are resumed on the basis of the stored state information in response to completion or suspension of the fourth workload.

Type: Application

Filed: November 30, 2017

Publication date: May 30, 2019

Inventors: Anirudh R. ACHARYA, Michael MANTOR
SELECTIVE PREFETCHING IN MULTITHREADED PROCESSING UNITS

Publication number: 20190155604

Abstract: A processing unit includes a plurality of processing elements and one or more caches. A first thread executes a program that includes one or more prefetch instructions to prefetch information into a first cache. Prefetching is selectively enabled when executing the first thread on a first processing element dependent upon whether one or more second threads previously executed the program on the first processing element. The first thread is then dispatched to execute the program on the first processing element. In some cases, a dispatcher receives the first thread for dispatching to the first processing element. The dispatcher modifies the prefetch instruction to disable prefetching into the first cache in response to the one or more second threads having previously executed the program on the first processing element.

Type: Application

Filed: November 20, 2017

Publication date: May 23, 2019

Inventors: Brian EMBERLING, Michael MANTOR
WAVE CREATION CONTROL WITH DYNAMIC RESOURCE ALLOCATION

Publication number: 20190129756

Abstract: Footprints, or resource allocations, of waves within resources that are shared by processor cores in a multithreaded processor are measured concurrently with the waves executing on the processor cores. The footprints are averaged over a time interval. A number of waves are spawned and dispatched for execution in the multithreaded processor based on the average footprint. In some cases, the waves are spawned at a rate that is determined based on the average value of the footprints of waves within the resources. The rate of spawning waves is modified in response to a change in the average value of the footprints of the waves within the resources.

Type: Application

Filed: October 26, 2017

Publication date: May 2, 2019

Inventors: Maxim V. KAZAKOV, Michael MANTOR
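The abstract for "Wave Creation Control with Dynamic Resource Allocation" describes averaging wave footprints over a time interval and spawning waves based on that average. A minimal sketch of that idea follows, assuming a single shared resource with a fixed capacity and a sliding window of samples. The class name `WaveSpawnController`, its parameters, and the "free capacity divided by average footprint" rule are illustrative assumptions, not taken from the application.

```python
from collections import deque


class WaveSpawnController:
    """Hypothetical controller: sizes the next batch of waves from the
    average measured footprint of recently executing waves."""

    def __init__(self, capacity, window):
        self.capacity = capacity             # total units of the shared resource (assumed)
        self.samples = deque(maxlen=window)  # most recent footprint measurements

    def record_footprint(self, footprint):
        # Called while waves execute; old samples fall out of the window.
        self.samples.append(footprint)

    def average_footprint(self):
        if not self.samples:
            return None
        return sum(self.samples) / len(self.samples)

    def waves_to_spawn(self, in_use):
        # Assumed policy: fill the free capacity with waves of average size.
        avg = self.average_footprint()
        if avg is None or avg <= 0:
            return 1  # no measurements yet: spawn conservatively
        free = max(self.capacity - in_use, 0)
        return int(free // avg)


controller = WaveSpawnController(capacity=1024, window=4)
for measured in (64, 64, 128, 128):
    controller.record_footprint(measured)
first_batch = controller.waves_to_spawn(in_use=256)

# Larger footprints arrive, the average rises, and fewer waves are spawned.
for measured in (256, 256):
    controller.record_footprint(measured)
second_batch = controller.waves_to_spawn(in_use=256)
```

In this sketch the spawn count falls as the average footprint grows, which mirrors the abstract's statement that the spawn rate is modified in response to a change in the average footprint.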
HYBRID RENDER WITH DEFERRED PRIMITIVE BATCH BINNING

Publication number: 20190122417

Abstract: A system, method and a non-transitory computer readable storage medium are provided for hybrid rendering with deferred primitive batch binning. A primitive batch is generated from one or more primitives. A bin is identified for processing the primitive batch. At least a portion of each primitive intersecting the identified bin is processed and a next bin for processing the primitive batch is identified based on an intercept walk order. The processing is iteratively repeated for the one or more primitives in the primitive batch for successive bins until all primitives of the primitive batch are completely processed. Then, the one or more primitives in the primitive batch are further processed.

Type: Application

Filed: November 2, 2018

Publication date: April 25, 2019

Applicants: Advanced Micro Devices, Inc., ATI Technologies ULC

Inventors: Michael Mantor, Laurent Lefebvre, Mark Fowler, Timothy Kelley, Mikko Alho, Mika Tuomi, Kiia Kallio, Patrick Klas Rudolf Buss, Jari Antero Komppa, Kaj Tuomi
System and method for protecting GPU memory instructions against faults

Patent number: 10255132

Abstract: A system and method for protecting memory instructions against faults are described. The system and method include converting the slave instructions to dummy operations, modifying a memory arbiter to issue up to N master and N slave global/shared memory instructions per cycle, sending master memory requests to the memory system, using slave requests for error checking, entering master requests to the GM/LM FIFO, storing slave requests in a register, and comparing the entered master requests with the stored slave requests.

Type: Grant

Filed: June 22, 2016

Date of Patent: April 9, 2019

Assignee: Advanced Micro Devices, Inc.

Inventors: John Kalamatianos, Michael Mantor, Sudhanva Gurumurthi
Preemptive context switching of processes on an accelerated processing device (APD) based on time quanta

Patent number: 10242420

Abstract: Methods and apparatus are described. A method includes an accelerated processing device running a process. When a maximum time interval during which the process is permitted to run expires before the process completes, the accelerated processing device receives an operating-system-initiated instruction to stop running the process. The accelerated processing device stops the process from running in response to the received operating-system-initiated instruction.

Type: Grant

Filed: November 28, 2016

Date of Patent: March 26, 2019

Assignee: Advanced Micro Devices, Inc.

Inventors: Robert Scott Hartog, Ralph Clayton Taylor, Michael Mantor, Kevin John McGrath, Sebastien Nussbaum, Nuwan Jayasena, Rex McCrary, Mark Leather, Philip J. Rogers, Thomas Woller
Multithreaded computing

Patent number: 10235220

Abstract: A system, method, and computer program product are provided for improving resource utilization of multithreaded applications. Rather than requiring threads to block while waiting for data from a channel or requiring context switching to minimize blocking, the techniques disclosed herein provide an event-driven approach to launch kernels only when needed to perform operations on channel data, and then terminate in order to free resources. These operations are handled efficiently in hardware, but are flexible enough to be implemented in all manner of programming models.

Type: Grant

Filed: September 7, 2012

Date of Patent: March 19, 2019

Assignee: Advanced Micro Devices, Inc.

Inventors: Lee W. Howes, Benedict R. Gaster, Michael Clair Houston, Michael Mantor
Primitive level preemption using discrete non-real-time and real time pipelines

Patent number: 10210650

Abstract: Processing of non-real-time and real-time workloads is performed using discrete pipelines. A first pipeline includes a first shader and one or more fixed function hardware blocks. A second pipeline includes a second shader that is configured to emulate the at least one fixed function hardware block. First and second memory elements store first state information for the first pipeline and second state information for the second pipeline, respectively. A non-real-time workload executing in the first pipeline is preempted at a primitive boundary in response to a real-time workload being dispatched for execution in the second pipeline. The first memory element retains the first state information in response to preemption of the non-real-time workload. The first pipeline is configured to resume processing the subsequent primitive on the basis of the first state information stored in the first memory element.

Type: Grant

Filed: November 30, 2017

Date of Patent: February 19, 2019

Assignee: Advanced Micro Devices, Inc.

Inventors: Anirudh R. Acharya, Swapnil Sakharshete, Michael Mantor, Mangesh P. Nijasure, Todd Martin, Vineet Goel
BIN STREAMOUT PREEMPTION IN A GRAPHICS PROCESSING PIPELINE

Publication number: 20190005604

Abstract: A stage of a graphics pipeline in a graphics processing unit (GPU) detects an interrupt concurrently with the stage processing primitives in a first bin that represents a first portion of a first frame generated by a first application. The stage forwards a completed portion of the primitives to a subsequent stage of the graphics pipeline in response to the interrupt. The stage diverts a second bin that represents a second portion of the first frame from the stage to a memory in response to the interrupt. The stage processes primitives in a third bin that represents a portion of a second frame generated by a second application subsequent to diverting the second bin to the memory. The stage can then retrieve the second bin from the memory in response to the stage completing processing of the primitives in the third bin for additional processing.

Type: Application

Filed: June 30, 2017

Publication date: January 3, 2019

Inventors: Anirudh R. ACHARYA, Michael MANTOR, Vineet GOEL, Swapnil SAKHARSHETE
Hybrid render with deferred primitive batch binning

Patent number: 10169906

Abstract: A system, method and a computer program product are provided for hybrid rendering with deferred primitive batch binning. A primitive batch is generated from a sequence of primitives. Initial bin intercepts are identified for primitives in the primitive batch. A bin for processing is identified. The bin corresponds to a region of a screen space. Pixels of the primitives intercepting the identified bin are processed. Next bin intercepts are identified while the primitives intercepting the identified bin are processed.

Type: Grant

Filed: March 29, 2013

Date of Patent: January 1, 2019

Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC

Inventors: Michael Mantor, Laurent Lefebvre, Mark Fowler, Timothy Kelley, Mikko Alho, Mika Tuomi, Kiia Kallio, Patrick Klas Rudolf Buss, Jari Antero Komppa, Kaj Tuomi
Graphics processing hardware for using compute shaders as front end for vertex shaders

Patent number: 10134102

Abstract: A GPU is configured to read and process data produced by a compute shader via the one or more ring buffers and pass the resulting processed data to a vertex shader as input. The GPU is further configured to allow the compute shader and vertex shader to write through a cache. Each ring buffer is configured to synchronize the compute shader and the vertex shader to prevent processed data generated by the compute shader that is written to a particular ring buffer from being overwritten before the data is accessed by the vertex shader.

Type: Grant

Filed: June 5, 2014

Date of Patent: November 20, 2018

Assignees: SONY INTERACTIVE ENTERTAINMENT INC., ADVANCED MICRO DEVICES, INC.

Inventors: Mark Evan Cerny, David Simpson, Jason Scanlin, Michael Mantor
POLICIES FOR SHADER RESOURCE ALLOCATION IN A SHADER CORE

Publication number: 20180321946

Abstract: A method for use in a processor for arbitrating between multiple processes to select wavefronts for execution on a shader core is provided. The processor includes a compute pipeline configured to issue wavefronts to the shader core for execution, a hardware queue descriptor associated with the compute pipeline, and the shader core. The shader core is configured to execute work for the compute pipeline corresponding to a first memory queue descriptor executed using data for the first memory queue descriptor that is loaded into a first hardware queue descriptor. The processor is configured to detect a context switch condition, and, responsive to the context switch condition, perform a context switch operation including loading data for a second memory queue descriptor into the first hardware queue descriptor. The shader core is configured to execute work corresponding to the second memory queue descriptor that is loaded into the first hardware queue descriptor.

Type: Application

Filed: July 19, 2018

Publication date: November 8, 2018

Applicant: Advanced Micro Devices, Inc.

Inventors: Robert Scott Hartog, Mark Leather, Michael Mantor, Rex McCrary, Sebastien Nussbaum, Philip J. Rogers, Ralph Clay Taylor, Thomas Woller
MEMORY PROTECTION IN HIGHLY PARALLEL COMPUTING HARDWARE

Publication number: 20180314579

Abstract: Techniques for handling memory errors are disclosed. Various memory units of an accelerated processing device (“APD”) include error units for detecting errors in data stored in the memory (e.g., using parity protection or error correcting code). Upon detecting an error considered to be an “initial uncorrectable error,” the error unit triggers transmission of an initial uncorrectable error interrupt (“IUE interrupt”) to a processor. This IUE interrupt includes information identifying the specific memory unit in which the error occurred (and possibly other information about the error). A halt interrupt is generated and transmitted to the processor in response to the data having the error being consumed (i.e., used by an operation such as an instruction or command), which causes the APD to halt operations. If the data having the error is not consumed, then the halt interrupt is never generated (that the error occurred may remain logged, however).

Type: Application

Filed: April 28, 2017

Publication date: November 1, 2018

Applicant: Advanced Micro Devices, Inc.

Inventors: Carlos Sampayo, Michael Mantor
SINGLE PASS FLEXIBLE SCREEN/SCALE RASTERIZATION

Publication number: 20180276790

Abstract: An apparatus, such as a head mounted device (HMD), includes one or more processors configured to implement a graphics pipeline that renders pixels in window space with a nonuniform pixel spacing. The apparatus also includes a first distortion function that maps the non-uniformly spaced pixels in window space to uniformly spaced pixels in raster space. The apparatus further includes a scan converter configured to sample the pixels in window space through the first distortion function. The scan converter is configured to render display pixels used to generate an image for display to a user based on the uniformly spaced pixels in raster space. In some cases, the pixels in the window space are rendered such that a pixel density per subtended area is constant across the user's field of view.

Type: Application

Filed: December 15, 2017

Publication date: September 27, 2018

Inventors: Michael MANTOR, Laurent LEFEBVRE, Mika TUOMI, Kiia KALLIO
STEREO RENDERING

Publication number: 20180211434

Abstract: Techniques for generating a stereo image from a single set of input geometry in a three-dimensional rendering pipeline are disclosed. Vertices are processed through the end of the world-space pipeline. In the primitive assembler, at the end of the world-space pipeline, before perspective division, each clip-space vertex is duplicated. The primitive assembler generates this duplicated clip-space vertex using the y, z, and w coordinates of the original vertex and based on an x coordinate that is offset in the x-direction in clip-space as compared with the x coordinate of the original vertex. Both the original clip-space vertex and the modified clip-space vertex are then sent through the rest of the pipeline for processing, including perspective division, viewport transform, rasterization, pixel shading, and other operations. The result is that a single set of input vertices is rendered into a stereo image.

Type: Application

Filed: January 25, 2017

Publication date: July 26, 2018

Applicant: Advanced Micro Devices, Inc.

Inventors: Mangesh P. Nijasure, Michael Mantor, Jeffrey M. Smith
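The vertex duplication step in the "Stereo Rendering" abstract can be illustrated with a short sketch. It assumes a clip-space vertex is a plain `(x, y, z, w)` tuple and that the per-eye offset is a single scalar; the function names and the offset value are hypothetical and are not drawn from the application.

```python
def duplicate_for_stereo(vertex, x_offset):
    """Return the original clip-space vertex and a copy whose x coordinate
    is shifted by x_offset; y, z, and w are reused unchanged."""
    x, y, z, w = vertex
    return vertex, (x + x_offset, y, z, w)


def perspective_divide(vertex):
    """Later pipeline step: convert clip space to normalized device coordinates."""
    x, y, z, w = vertex
    return (x / w, y / w, z / w)


# Duplication happens before perspective division, as the abstract states.
original, shifted = duplicate_for_stereo((1.0, 2.0, 3.0, 4.0), x_offset=0.5)
first_view = perspective_divide(original)
second_view = perspective_divide(shifted)
```

Because the offset is applied before the divide by `w`, the two views differ only in their x coordinate after projection, and the horizontal shift shrinks for vertices with larger `w`.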
SPLIT FRAME RENDERING

Publication number: 20180211435

Abstract: Improvements in the graphics processing pipeline that allow multiple pipelines to cooperate to render a single frame are disclosed. Two approaches are provided. In a first approach, world-space pipelines for the different graphics processing pipelines process all work for draw calls received from a central processing unit (CPU). In a second approach, the world-space pipelines divide up the work. Work that is divided is synchronized and redistributed at various points in the world-space pipeline. In either approach, the triangles output by the world-space pipelines are distributed to the screen-space pipelines based on the portions of the render surface overlapped by the triangles. Triangles are rendered by screen-space pipelines associated with the render surface portions overlapped by those triangles.

Type: Application

Filed: January 26, 2017

Publication date: July 26, 2018

Applicant: Advanced Micro Devices, Inc.

Inventors: Mangesh P. Nijasure, Todd Martin, Michael Mantor
REMOVING OR IDENTIFYING OVERLAPPING FRAGMENTS AFTER Z-CULLING

Publication number: 20180165872

Abstract: Techniques for removing or identifying overlapping fragments in a fragment stream after z-culling are disclosed. The techniques include maintaining a first-in-first-out buffer that stores post-z-cull fragments. Each time a new fragment is received at the buffer, the screen position of the fragment is checked against all other fragments in the buffer. If the screen position of the fragment matches the screen position of a fragment in the buffer, then the fragment in the buffer is removed or marked as overlapping. If the screen position of the fragment does not match the screen position of any fragment in the buffer, then no modification is performed to fragments already in the buffer. In either case, the fragment is added to the buffer. The contents of the buffer are transmitted to the pixel shader for pixel shading at a later time.

Type: Application

Filed: December 9, 2016

Publication date: June 14, 2018

Applicants: Advanced Micro Devices, Inc., ATI Technologies ULC

Inventors: Laurent Lefebvre, Michael Mantor, Mark Fowler, Mikko Alho, Mika Tuomi, Kiia Kallio, Patrick Klas Rudolf Buss, Jari Antero Komppa, Kaj Tuomi, Christopher J. Brennan
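The buffer behavior described in the "Removing or Identifying Overlapping Fragments after Z-Culling" abstract can be sketched in software as follows. This is only an illustration under stated assumptions: the `Fragment` fields, the `PostZCullBuffer` class, and the `remove_overlapped` switch are hypothetical names, and the sketch does not model a fixed buffer capacity or hardware timing.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Fragment:
    x: int                    # screen-space position
    y: int
    primitive_id: int         # hypothetical tag identifying the source primitive
    overlapped: bool = False  # set when a later fragment covers the same position


class PostZCullBuffer:
    """Hypothetical FIFO of post-z-cull fragments awaiting pixel shading."""

    def __init__(self, remove_overlapped=True):
        self.fifo = deque()
        self.remove_overlapped = remove_overlapped

    def push(self, frag):
        # Compare the new fragment's screen position with every buffered fragment.
        position = (frag.x, frag.y)
        if self.remove_overlapped:
            # Drop older fragments at the same position.
            self.fifo = deque(f for f in self.fifo if (f.x, f.y) != position)
        else:
            # Keep them, but flag them as overlapping.
            for f in self.fifo:
                if (f.x, f.y) == position:
                    f.overlapped = True
        # In either case the new fragment is added to the buffer.
        self.fifo.append(frag)

    def drain(self):
        # Hand the buffered fragments to the pixel shader stage, oldest first.
        out = list(self.fifo)
        self.fifo.clear()
        return out


removing = PostZCullBuffer(remove_overlapped=True)
marking = PostZCullBuffer(remove_overlapped=False)
for x, y, prim in ((0, 0, 1), (1, 0, 1), (0, 0, 2)):
    removing.push(Fragment(x, y, prim))
    marking.push(Fragment(x, y, prim))

kept = [(f.x, f.y, f.primitive_id) for f in removing.drain()]
flags = [f.overlapped for f in marking.drain()]
```

In removal mode the older fragment at position (0, 0) never reaches the shader; in marking mode all three fragments survive and only the first carries the overlap flag.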
SUPER SINGLE INSTRUCTION MULTIPLE DATA (SUPER-SIMD) FOR GRAPHICS PROCESSING UNIT (GPU) COMPUTING

Publication number: 20180121386

Abstract: A super single instruction, multiple data (SIMD) computing structure and a method of executing instructions in the super-SIMD are disclosed. The super-SIMD structure is capable of executing more than one instruction from a single thread or multiple threads and includes a plurality of vector general purpose registers (VGPRs), a first arithmetic logic unit (ALU) coupled to the plurality of VGPRs, a second ALU coupled to the plurality of VGPRs, and a destination cache (Do$) that is coupled via bypass and forwarding logic to the first ALU and the second ALU and receives an output of the first ALU and the second ALU. The Do$ holds multiple instruction results to extend an operand bypass network and save read and write transaction power. A compute unit (CU) and a small CU including a plurality of super-SIMDs are also disclosed.

Type: Application

Filed: November 17, 2016

Publication date: May 3, 2018

Applicant: Advanced Micro Devices, Inc.

Inventors: Jiasheng Chen, Angel E. Socarras, Michael Mantor, YunXiao Zou, Bin He
RECONFIGURABLE VIRTUAL GRAPHICS AND COMPUTE PROCESSOR PIPELINE

Publication number: 20180114290

Abstract: A graphics processing unit (GPU) includes a plurality of programmable processing cores configured to process graphics primitives and corresponding data and a plurality of fixed-function hardware units. The plurality of processing cores and the plurality of fixed-function hardware units are configured to implement a configurable number of virtual pipelines to concurrently process different command flows. Each virtual pipeline includes a configurable number of fragments and an operational state of each virtual pipeline is specified by a different context. The configurable number of virtual pipelines can be modified from a first number to a second number that is different than the first number. An emulation of a fixed-function hardware unit can be instantiated on one or more of the graphics processing cores in response to detection of a bottleneck in a fixed-function hardware unit. One or more of the virtual pipelines can then be reconfigured to utilize the emulation instead of the fixed-function hardware unit.

Type: Application

Filed: October 21, 2016

Publication date: April 26, 2018

Inventors: Timour T. Paltashev, Michael Mantor, Rex Eldon McCrary
