Patents by Inventor Michael Mantor

Michael Mantor has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20180211434
    Abstract: Techniques for generating a stereo image from a single set of input geometry in a three-dimensional rendering pipeline are disclosed. Vertices are processed through the end of the world-space pipeline. In the primitive assembler, at the end of the world-space pipeline, before perspective division, each clip-space vertex is duplicated. The primitive assembler generates this duplicated clip-space vertex using the y, z, and w coordinates of the original vertex and based on an x coordinate that is offset in the x-direction in clip-space as compared with the x coordinate of the original vertex. Both the original clip-space vertex and the modified clip-space vertex are then sent through the rest of the pipeline for processing, including perspective division, viewport transform, rasterization, pixel shading, and other operations. The result is that a single set of input vertices is rendered into a stereo image.
    Type: Application
    Filed: January 25, 2017
    Publication date: July 26, 2018
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Mangesh P. Nijasure, Michael Mantor, Jeffrey M. Smith
  • Publication number: 20180165872
    Abstract: Techniques for removing or identifying overlapping fragments in a fragment stream after z-culling are disclosed. The techniques include maintaining a first-in-first-out buffer that stores post-z-cull fragments. Each time a new fragment is received at the buffer, the screen position of the fragment is checked against all other fragments in the buffer. If the screen position of the fragment matches the screen position of a fragment in the buffer, then the fragment in the buffer is removed or marked as overlapping. If the screen position of the fragment does not match the screen position of any fragment in the buffer, then no modification is performed to fragments already in the buffer. In either case, the fragment is added to the buffer. The contents of the buffer are transmitted to the pixel shader for pixel shading at a later time.
    Type: Application
    Filed: December 9, 2016
    Publication date: June 14, 2018
    Applicants: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: Laurent Lefebvre, Michael Mantor, Mark Fowler, Mikko Alho, Mika Tuomi, Kiia Kallio, Patrick Klas Rudolf Buss, Jari Antero Komppa, Kaj Tuomi, Christopher J. Brennan
  • Publication number: 20180121386
    Abstract: A super single instruction, multiple data (SIMD) computing structure and a method of executing instructions in the super-SIMD are disclosed. The super-SIMD structure is capable of executing more than one instruction from a single thread or multiple threads and includes a plurality of vector general purpose registers (VGPRs), a first arithmetic logic unit (ALU), the first ALU coupled to the plurality of VGPRs, a second ALU, the second ALU coupled to the plurality of VGPRs, and a destination cache (Do$) that is coupled via bypass and forwarding logic to the first ALU, the second ALU and receiving an output of the first ALU and the second ALU. The Do$ holds multiple instruction results to extend an operand bypass network to save read and write transaction power. A compute unit (CU) and a small CU including a plurality of super-SIMDs are also disclosed.
    Type: Application
    Filed: November 17, 2016
    Publication date: May 3, 2018
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Jiasheng Chen, Angel E. Socarras, Michael Mantor, YunXiao Zou, Bin He
  • Publication number: 20180113709
    Abstract: A method and apparatus for performing a multi-precision computation in a plurality of arithmetic logic units (ALUs) includes pairing a first Single Instruction/Multiple Data (SIMD) block channel device with a second SIMD block channel device to create a first block pair having one-level staggering between the first and second channel devices. A third SIMD block channel device is paired with a fourth SIMD block channel device to create a second block pair having one-level staggering between the third and fourth channel devices. A plurality of source inputs are received at the first block pair and the second block pair. The first block pair computes a first result, and the second block pair computes a second result.
    Type: Application
    Filed: November 3, 2016
    Publication date: April 26, 2018
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Bin He, YunXiao Zou, Jiasheng Chen, Michael Mantor
  • Publication number: 20180113714
    Abstract: A processing element is implemented in a stage of a pipeline and configured to execute an instruction. A first array of multiplexers is to provide information associated with the instruction to the processing element in response to the instruction being in a first set of instructions. A second array of multiplexers is to provide information associated with the instruction to the first processing element in response to the instruction being in a second set of instructions. A control unit is to gate at least one of power or a clock signal provided to the first array of multiplexers in response to the instruction being in the second set.
    Type: Application
    Filed: October 20, 2017
    Publication date: April 26, 2018
    Inventors: Jiasheng Chen, YunXiao Zou, Bin He, Angel E. Socarras, QingCheng Wang, Wei Yuan, Michael Mantor
  • Publication number: 20180114290
    Abstract: A graphics processing unit (GPU) includes a plurality of programmable processing cores configured to process graphics primitives and corresponding data and a plurality of fixed-function hardware units. The plurality of processing cores and the plurality of fixed-function hardware units are configured to implement a configurable number of virtual pipelines to concurrently process different command flows. Each virtual pipeline includes a configurable number of fragments and an operational state of each virtual pipeline is specified by a different context. The configurable number of virtual pipelines can be modified from a first number to a second number that is different than the first number. An emulation of a fixed-function hardware unit can be instantiated on one or more of the graphics processing cores in response to detection of a bottleneck in a fixed-function hardware unit. One or more of the virtual pipelines can then be reconfigured to utilize the emulation instead of the fixed-function hardware unit.
    Type: Application
    Filed: October 21, 2016
    Publication date: April 26, 2018
    Inventors: Timour T. Paltashev, Michael Mantor, Rex Eldon McCrary
  • Publication number: 20180082399
    Abstract: Improvements in the graphics processing pipeline are disclosed. More specifically, a new primitive shader stage performs tasks of the vertex shader stage or a domain shader stage if tessellation is enabled, a geometry shader if enabled, and a fixed function primitive assembler. The primitive shader stage is compiled by a driver from user-provided vertex or domain shader code, geometry shader code, and from code that performs functions of the primitive assembler. Moving tasks of the fixed function primitive assembler to a primitive shader that executes in programmable hardware provides many benefits, such as removal of a fixed function crossbar, removal of dedicated parameter and position buffers that are unusable in general compute mode, and other benefits.
    Type: Application
    Filed: January 25, 2017
    Publication date: March 22, 2018
    Applicants: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: Todd Martin, Mangesh P. Nijasure, Randy W. Ramsey, Michael Mantor, Laurent Lefebvre
  • Publication number: 20170371743
    Abstract: A system and method for protecting memory instructions against faults are described. The system and method include converting the slave instructions to dummy operations, modifying a memory arbiter to issue up to N master and N slave global/shared memory instructions per cycle, sending master memory requests to the memory system, using slave requests for error checking, entering master requests to the GM/LM FIFO, storing slave requests in a register, and comparing the entered master requests with the stored slave requests.
    Type: Application
    Filed: June 22, 2016
    Publication date: December 28, 2017
    Applicant: Advanced Micro Devices, Inc.
    Inventors: John Kalamatianos, Michael Mantor, Sudhanva Gurumurthi
  • Publication number: 20170371654
    Abstract: Described is a system and method for using virtual vector register files. In particular, a graphics processor includes a logic unit, a virtual vector register file coupled to the logic unit, a vector register backing store coupled to the virtual vector register file, and a virtual vector register file controller coupled to the virtual vector register file. The virtual vector register file includes an N-deep vector register file and an M-deep vector register file, where N is less than M. The virtual vector register file controller performs eviction and allocation between the N-deep vector register file, the M-deep vector register file and the vector register backing store dependent on at least access requests for certain vector registers.
    Type: Application
    Filed: June 23, 2016
    Publication date: December 28, 2017
    Applicants: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: Ljubisa Bajic, Michael Mantor, Syed Zohaib M. Gilani, Rajabali M. Koduri
  • Publication number: 20170371393
    Abstract: Described is a method and processing apparatus to improve power efficiency by gating redundant threads processing. In particular, the method for gating redundant threads in a graphics processor includes determining if data for a thread and data for at least another thread are within a predetermined similarity threshold, gating execution of the at least another thread if the data for the thread and the data for the at least another thread are within the predetermined similarity threshold, and using an output data from the thread as an output data for the at least another thread.
    Type: Application
    Filed: June 22, 2016
    Publication date: December 28, 2017
    Applicants: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: Syed Zohaib M. Gilani, Jiasheng Chen, QingCheng Wang, YunXiao Zou, Michael Mantor, Bin He, Timour T. Paltashev
  • Publication number: 20170076421
    Abstract: Methods and apparatus are described. A method includes an accelerated processing device running a process. When a maximum time interval during which the process is permitted to run expires before the process completes, the accelerated processing device receives an operating-system-initiated instruction to stop running the process. The accelerated processing device stops the process from running in response to the received operating-system-initiated instruction.
    Type: Application
    Filed: November 28, 2016
    Publication date: March 16, 2017
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Robert Scott Hartog, Ralph Clayton Taylor, Michael Mantor, Kevin John McGrath, Sebastien Nussbaum, Nuwan Jayasena, Rex McCrary, Mark Leather, Philip J. Rogers, Thomas Woller
  • Patent number: 9529632
    Abstract: A method of allocating a memory to a plurality of concurrent threads is presented. The method includes dynamically determining writer threads each having at least one pending write to the memory; and dynamically allocating respective contiguous blocks in the memory for each of the writer threads. Another method of allocating a memory to a plurality of concurrent threads includes launching the plurality of threads as a plurality of wavefronts, dynamically determining a group of wavefronts each having at least one thread requiring a write to the memory, and dynamically allocating respective contiguous blocks in the memory for each wavefront from the group of wavefronts.
    Type: Grant
    Filed: September 3, 2009
    Date of Patent: December 27, 2016
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Michael Mantor, John McCardle, Marcos Zini, Brian Emberling
  • Publication number: 20160371873
    Abstract: A system, method and a computer program product are provided for hybrid rendering with deferred primitive batch binning. A primitive batch is generated from a sequence of primitives. Initial bin intercepts are identified for primitives in the primitive batch. A bin for processing is identified. The bin corresponds to a region of a screen space. Pixels of the primitives intercepting the identified bin are processed. Next bin intercepts are identified while the primitives intercepting the identified bin are processed.
    Type: Application
    Filed: August 29, 2016
    Publication date: December 22, 2016
    Applicants: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: Michael Mantor, Laurent Lefebvre, Mikko Alho, Mika Tuomi, Kiia Kallio
  • Patent number: 9507632
    Abstract: Methods, systems, and computer readable media for preemptive context-switching of processes on an accelerated processing device are based upon a comparison of the running time of the process and a threshold time quanta. A method includes preempting a process running on an accelerated processing device based upon a running time of the process and a threshold time quanta.
    Type: Grant
    Filed: November 4, 2011
    Date of Patent: November 29, 2016
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Robert Scott Hartog, Ralph Clay Taylor, Michael Mantor, Kevin McGrath, Sebastien Nussbaum, Nuwan Jayasena, Rex McCrary, Mark Leather, Philip Rogers, Thomas Woller
  • Publication number: 20160260246
    Abstract: A method, a non-transitory computer readable medium, and a processor for performing display shading for computer graphics are presented. Frame data is received by a display shader, the frame data including at least a portion of a rendered frame. Parameters for modifying the frame data are received by the display shader. The parameters are applied to the frame data by the display shader to create a modified frame. The modified frame is displayed on a display device.
    Type: Application
    Filed: March 2, 2015
    Publication date: September 8, 2016
    Applicant: Advanced Micro Devices, Inc.
    Inventors: David Oldcorn, Chris Brennan, Michael Mantor, Layla A. Mah
  • Patent number: 9424099
    Abstract: Disclosed method, system, and computer program product embodiments include synchronizing a group of workitems on a processor by storing a respective program counter associated with each of the workitems, selecting at least one first workitem from the group for execution, and executing the selected at least one first workitem on the processor. The selecting is based upon the respective stored program counter associated with the at least one first workitem.
    Type: Grant
    Filed: November 8, 2012
    Date of Patent: August 23, 2016
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Michael C. Houston, Benedict R. Gaster, Lee W. Howes, Michael Mantor, Dominik Behr
  • Patent number: 9329893
    Abstract: A method resumes an accelerated processing device (APD) wavefront in which a subset of elements have faulted. A restore command for a job including a wavefront is received. A list of context states for the wavefront is read from a memory associated with an APD. An empty shell wavefront is created for restoring the list of context states. A portion of not acknowledged data is masked over a portion of acknowledged data within the restored wavefronts.
    Type: Grant
    Filed: December 14, 2011
    Date of Patent: May 3, 2016
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Robert Scott Hartog, Ralph Clay Taylor, Michael Mantor, Thomas R. Woller, Kevin McGrath, Sebastien Nussbaum, Nuwan Jayasena, Rex McCrary, Philip J. Rogers, Mark Leather
  • Patent number: 9304772
    Abstract: A system and method are provided for improving efficiency, power, and bandwidth consumption in parallel processing. Rather than requiring memory polling to ensure ordered execution of processes or threads in wavefronts, the techniques disclosed herein provide a system and method to allow any process or thread in a wavefront to run out of order as long as needed, but ensure ordered execution of multiple ordered instructions when needed. These operations are handled efficiently in hardware, but are flexible enough to be implemented in all manner of programming models.
    Type: Grant
    Filed: March 29, 2012
    Date of Patent: April 5, 2016
    Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: Laurent Lefebvre, Michael Mantor
  • Patent number: 9299121
    Abstract: Methods, systems, and computer readable media embodiments are disclosed for preemptive context-switching of processes running on an accelerated processing device. Embodiments include detecting, by an accelerated processing device, a memory exception, and preempting a process from running on the accelerated processing device based upon the detected exception.
    Type: Grant
    Filed: November 4, 2011
    Date of Patent: March 29, 2016
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Robert Scott Hartog, Ralph Clay Taylor, Michael Mantor, Kevin McGrath, Sebastien Nussbaum, Nuwan Jayasena, Rex McCrary, Mark Leather, Philip J. Rogers, Thomas R. Woller
  • Patent number: 9256465
    Abstract: Methods, systems, and computer readable media embodiments are disclosed for preemptive context-switching of processes running on an accelerated processing device. A method includes determining, responsive to an exception upon access to a memory by a process running on an accelerated processing device, whether to preempt the process based on the exception, and preempting, based upon the determining, the process from running on the accelerated processing device.
    Type: Grant
    Filed: November 4, 2011
    Date of Patent: February 9, 2016
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Robert Scott Hartog, Ralph Clay Taylor, Michael Mantor, Kevin McGrath, Sebastien Nussbaum, Nuwan Jayasena, Rex McCrary, Mark Leather, Philip J. Rogers, Thomas R. Woller
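
Illustrative sketch for publication 20180165872 (overlapping fragments after z-culling), listed above. The abstract describes a first-in-first-out buffer in which each incoming fragment is compared by screen position against the fragments already buffered, a matching older fragment is removed or marked as overlapping, and the new fragment is appended in either case. The Python below is only a minimal software model of that description, not the claimed hardware design. The names Fragment, PostZCullFifo, prim_id, mark_only, and drain are hypothetical and do not come from the application.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Fragment:
    x: int                    # screen-space column
    y: int                    # screen-space row
    prim_id: int              # hypothetical tag for the source primitive
    overlapped: bool = False  # set when a later fragment covers this one


class PostZCullFifo:
    """Software model of a FIFO holding post-z-cull fragments (assumed design)."""

    def __init__(self, mark_only: bool = False):
        # mark_only=True marks older matches; False removes them.
        self.mark_only = mark_only
        self.buffer = deque()

    def push(self, frag: Fragment) -> None:
        pos = (frag.x, frag.y)
        if self.mark_only:
            # Mark every buffered fragment at the same screen position.
            for old in self.buffer:
                if (old.x, old.y) == pos:
                    old.overlapped = True
        else:
            # Drop every buffered fragment at the same screen position.
            self.buffer = deque(
                old for old in self.buffer if (old.x, old.y) != pos
            )
        # In either case, the new fragment is added to the buffer.
        self.buffer.append(frag)

    def drain(self) -> list:
        """Hand the buffered fragments on (e.g. to pixel shading) and empty the FIFO."""
        out = list(self.buffer)
        self.buffer.clear()
        return out


# Usage: the third fragment lands on the same pixel as the first.
fifo = PostZCullFifo()
for f in (Fragment(0, 0, 1), Fragment(1, 0, 1), Fragment(0, 0, 2)):
    fifo.push(f)
survivors = [(f.x, f.y, f.prim_id) for f in fifo.drain()]

marking = PostZCullFifo(mark_only=True)
for f in (Fragment(0, 0, 1), Fragment(1, 0, 1), Fragment(0, 0, 2)):
    marking.push(f)
flags = [f.overlapped for f in marking.drain()]
```

In removal mode only the fragments for pixel (1, 0) and the later fragment for pixel (0, 0) survive. In marking mode all three fragments remain, and only the first is flagged as overlapped. A real implementation would bound the buffer depth and decide when to flush, which the abstract leaves unspecified.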