Patents by Inventor Michael J. Mantor

Michael J. Mantor has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 9311102
    Abstract: Systems and methods to improve performance in a graphics processing unit are described herein. Embodiments achieve power saving in a graphics processing unit by dynamically activating/deactivating individual SIMDs in a shader complex that comprises multiple SIMD units. On-the-fly dynamic disabling and enabling of individual SIMDs provides flexibility in achieving a required performance and power level for a given processing application. Embodiments of the invention also achieve dynamic medium grain clock gating of SIMDs in a shader complex. Embodiments reduce switching power by shutting down clock trees to unused logic by providing a clock on demand mechanism. In this way, embodiments enhance clock gating to save more switching power for the duration of time when SIMDs are idle (or assigned no work). Embodiments can also save leakage power by power gating SIMDs for a duration when SIMDs are idle for an extended period of time.
    Type: Grant
    Filed: July 12, 2011
    Date of Patent: April 12, 2016
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Tushar K. Shah, Michael J. Mantor, Brian Emberling
  • Publication number: 20150332427
    Abstract: Methods, systems and non-transitory computer readable media are described. A system includes a shader pipe array, a redundant shader pipe array, a sequencer and a redundant shader switch. The shader pipe array includes multiple shader pipes, each of which performs rendering calculations on data provided thereto. The redundant shader pipe array also performs rendering calculations on data provided thereto. The sequencer identifies at least one defective shader pipe in the shader pipe array, and, in response, generates a signal. The redundant shader switch receives the generated signal, and, in response, transfers the data destined for each shader pipe identified as being defective independently to the redundant shader pipe array.
    Type: Application
    Filed: July 24, 2015
    Publication date: November 19, 2015
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Michael J. Mantor, Jeffrey T. Brady, Angel E. Socarras
  • Patent number: 9093040
    Abstract: A method and apparatus for shader data repair utilizing a Redundant Shader Switch (RSS). The RSS consists of an input and output section whereby when a defective shader pipe is detected, the RSS multiplexes shader pipe data destined to the defective shader pipe to a redundant shader pipe array for processing. Once processed, the shader pipe data is multiplexed back to the RSS where the processed shader pipe data is directed to the corresponding output column of the RSS. The RSS contains delay pipes used to re-align and synchronize the repaired shader pipe data with output export data.
    Type: Grant
    Filed: June 1, 2009
    Date of Patent: July 28, 2015
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Michael J. Mantor, Jeffrey T. Brady, Angel E. Socarras
  • Patent number: 8558836
    Abstract: A Scalable and Unified Compute System performs scalable, repairable general purpose and graphics shading operations, memory load/store operations and texture filtering. A Scalable and Unified Compute Unit Module comprises a shader pipe array, a texture mapping unit, and a level one texture cache system. It accepts ALU instructions, input/output instructions, and texture or memory requests for a specified set of pixels, vertices, primitives, surfaces, or general compute work items from a shader program and performs associated operations to compute the programmed output data. The texture mapping unit accepts source data addresses and instruction constants in order to fetch, format, and perform instructed filtering interpolations to generate formatted results based on the specific corresponding data stored in a level one texture cache system. The texture mapping unit consists of an address generating system, a pre-formatter module, interpolator module, accumulator module and a format module.
    Type: Grant
    Filed: June 1, 2009
    Date of Patent: October 15, 2013
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Michael J. Mantor, Jeffrey T. Brady, Mark C. Fowler, Marcos P. Zini
  • Patent number: 8473721
    Abstract: Disclosed herein is a processing unit configured to process video data, and applications thereof. In an embodiment, the processing unit includes a buffer and an execution unit. The buffer is configured to store a data word, wherein the data word comprises a plurality of bytes of video data. The execution unit is configured to execute a single instruction to (i) shift bytes of video data contained in the data word to align a desired byte of video data and (ii) process the desired byte of the video data to provide processed video data.
    Type: Grant
    Filed: April 16, 2010
    Date of Patent: June 25, 2013
    Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: Michael J. Mantor, Jeffrey T. Brady, Christopher L. Spencer, Daniel W. Wong, Andrew E. Gruber
  • Patent number: 8468191
    Abstract: Systems and methods for multi-precision computation are disclosed. One embodiment of the present invention includes a plurality of multiply-add units (MADDs) configured to perform one or more single precision operations and an arrangement generator to generate one or more mantissa arrangements using a plurality of double precision numbers. Each MADD is configured to receive and load said mantissa arrangements from the arrangement generator. The MADDs compute a result of a multi-precision computation using the mantissa arrangements. In an embodiment, the MADDs are configured to simultaneously perform operations that include single precision operations, double-precision additions and double-precision multiply and additions.
    Type: Grant
    Filed: June 10, 2010
    Date of Patent: June 18, 2013
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Michael J. Mantor, Jeffrey T. Brady, Daniel B. Clifton, Christopher Spencer
  • Patent number: 8316252
    Abstract: A method, computer program product, and system are provided for controlling a clock distribution network. For example, an embodiment of the method can include programming a predetermined delay time into a plurality of processing elements and controlling an activation and de-activation of these processing elements in a sequence based on the predetermined delay time. The processing elements are located in a system incorporating the clock distribution network, where the predetermined delay time can be programmed in a control register of a clock gate control circuit residing in the processing element. Further, when controlling the activation and de-activation of the processing elements, this activity can be controlled with a state machine based on the system's mode of operation.
    Type: Grant
    Filed: August 15, 2008
    Date of Patent: November 20, 2012
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Michael J. Mantor, Tushar K. Shah, Donald P. Lee
  • Patent number: 8195882
    Abstract: A shader pipe texture filter utilizes a level one cache system as a primary method of storage but with the ability to have the level one cache system read and write to a level two cache system when necessary. The level one cache system communicates with the level two cache system via a wide channel memory bus. In addition, the level one cache system can be configured to support dual shader pipe texture filters while maintaining access to the level two cache system. A method utilizing a level one cache system as a primary method of storage with the ability to have the level one cache system read and write a level two cache system when necessary is also presented. In addition, level one cache systems can allocate a defined area of memory to be sharable amongst other resources.
    Type: Grant
    Filed: June 1, 2009
    Date of Patent: June 5, 2012
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Anthony P. DeLaurier, Mark Leather, Robert S. Hartog, Michael J. Mantor, Mark C. Fowler, Marcos P. Zini
  • Publication number: 20120013627
    Abstract: Systems and methods to improve performance in a graphics processing unit are described herein. Embodiments achieve power saving in a graphics processing unit by dynamically activating/deactivating individual SIMDs in a shader complex that comprises multiple SIMD units. On-the-fly dynamic disabling and enabling of individual SIMDs provides flexibility in achieving a required performance and power level for a given processing application. Embodiments of the invention also achieve dynamic medium grain clock gating of SIMDs in a shader complex. Embodiments reduce switching power by shutting down clock trees to unused logic by providing a clock on demand mechanism. In this way, embodiments enhance clock gating to save more switching power for the duration of time when SIMDs are idle (or assigned no work). Embodiments can also save leakage power by power gating SIMDs for a duration when SIMDs are idle for an extended period of time.
    Type: Application
    Filed: July 12, 2011
    Publication date: January 19, 2012
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Tushar K. Shah, Michael J. Mantor, Brian Emberling
  • Publication number: 20110057940
    Abstract: Disclosed herein is a processing unit configured to process video data, and applications thereof. In an embodiment, the processing unit includes a buffer and an execution unit. The buffer is configured to store a data word, wherein the data word comprises a plurality of bytes of video data. The execution unit is configured to execute a single instruction to (i) shift bytes of video data contained in the data word to align a desired byte of video data and (ii) process the desired byte of the video data to provide processed video data.
    Type: Application
    Filed: April 16, 2010
    Publication date: March 10, 2011
    Applicants: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: Michael J. Mantor, Jeffrey T. Brady, Christopher L. Spencer, Daniel W. Wong, Andrew E. Gruber
  • Publication number: 20110055308
    Abstract: Systems and methods for multi-precision computation are disclosed. One embodiment of the present invention includes a plurality of multiply-add units (MADDs) configured to perform one or more single precision operations and an arrangement generator to generate one or more mantissa arrangements using a plurality of double precision numbers. Each MADD is configured to receive and load said mantissa arrangements from the arrangement generator. The MADDs compute a result of a multi-precision computation using the mantissa arrangements. In an embodiment, the MADDs are configured to simultaneously perform operations that include single precision operations, double-precision additions and double-precision multiply and additions.
    Type: Application
    Filed: June 10, 2010
    Publication date: March 3, 2011
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Michael J. Mantor, Jeffrey T. Brady, Daniel B. Clifton, Christopher Spencer
  • Publication number: 20100146211
    Abstract: A shader pipe texture filter utilizes a level one cache system as a primary method of storage but with the ability to have the level one cache system read and write to a level two cache system when necessary. The level one cache system communicates with the level two cache system via a wide channel memory bus. In addition, the level one cache system can be configured to support dual shader pipe texture filters while maintaining access to the level two cache system. A method utilizing a level one cache system as a primary method of storage with the ability to have the level one cache system read and write a level two cache system when necessary is also presented. In addition, level one cache systems can allocate a defined area of memory to be sharable amongst other resources.
    Type: Application
    Filed: June 1, 2009
    Publication date: June 10, 2010
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Anthony P. DeLaurier, Mark Leather, Robert S. Hartog, Michael J. Mantor, Mark C. Fowler, Marcos P. Zini
  • Publication number: 20090315909
    Abstract: Each row of a row based shader engine comprises a shader pipe array, a texture filter, and a level one texture cache system. The shader pipe array accepts texture requests for a specified pixel from a resource and performs associated rendering calculations, outputting texel data. The texture mapping unit receives texel data from a level one cache system and through formatting and bilinear filtering interpolations, generates a formatted bilinear result based on a specific pixel's corresponding four texels. Utilizing multiple rows of a row based shader engine within the shader engine allows for the parallel processing of multiple simultaneous resource requests. A method for texture filtering utilizing a row based shader engine is also presented.
    Type: Application
    Filed: June 1, 2009
    Publication date: December 24, 2009
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Anthony P. DeLaurier, Mark Leather, Robert S. Hartog, Michael J. Mantor, Jeffrey T. Brady, Mark C. Fowler, Marcos P. Zini
  • Publication number: 20090309896
    Abstract: Apparatus and systems utilizing multiple shader engines where each shader engine comprises multiple rows of shader engine filters combined with level one and level two cache systems. Each unified shader engine filter comprises a shader pipe array, and a texture mapping unit with access to a level one cache system and a level two cache. The shader pipe array accepts texture requests for a specified pixel from a resource and performs associated rendering calculations, outputting texel data. The texture mapping unit retrieves texel data stored in a level one cache system, with the ability to read and write to and from a level two cache system, and through formatting and bilinear filtering interpolations generates a formatted bilinear result based on the specific pixel's neighboring texels. Utilizing multiple rows of shader engine filters within a shader engine allows for the parallel processing of multiple simultaneous resource requests.
    Type: Application
    Filed: June 1, 2009
    Publication date: December 17, 2009
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Anthony P. DeLaurier, Mark Leather, Robert S. Hartog, Michael J. Mantor, Mark C. Fowler, Jeffrey T. Brady, Marcos P. Zini
  • Publication number: 20090300388
    Abstract: A method, computer program product, and system are provided for controlling a clock distribution network. For example, an embodiment of the method can include programming a predetermined delay time into a plurality of processing elements and controlling an activation and de-activation of these processing elements in a sequence based on the predetermined delay time. The processing elements are located in a system incorporating the clock distribution network, where the predetermined delay time can be programmed in a control register of a clock gate control circuit residing in the processing element. Further, when controlling the activation and de-activation of the processing elements, this activity can be controlled with a state machine based on the system's mode of operation.
    Type: Application
    Filed: August 15, 2008
    Publication date: December 3, 2009
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Michael J. Mantor, Tushar K. Shah, Donald P. Lee
  • Publication number: 20090295821
    Abstract: A Scalable and Unified Compute System performs scalable, repairable general purpose and graphics shading operations, memory load/store operations and texture filtering. A Scalable and Unified Compute Unit Module comprises a shader pipe array, a texture mapping unit, and a level one texture cache system. The Scalable and Unified Compute Unit Module accepts ALU instructions, input/output instructions, and texture or memory requests for a specified set of pixels, vertices, primitives, surfaces, or general compute work items from a shader program and performs associated operations to compute the programmed output data. The texture mapping unit accepts source data addresses and instruction constants in order to fetch, format, and perform instructed filtering interpolations to generate formatted results based on the specific corresponding data stored in a level one texture cache system.
    Type: Application
    Filed: June 1, 2009
    Publication date: December 3, 2009
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Michael J. Mantor, Jeffrey T. Brady, Mark C. Fowler, Marcos P. Zini
  • Publication number: 20090300293
    Abstract: Methods and systems for dynamically partitioning a cache and maintaining cache coherency are provided. In an embodiment, a system for processing memory requests includes a cache and a cache controller configured to compare a memory address and a type of a received memory request to a memory address and a type, respectively, corresponding to a cache line of the cache to determine whether the memory request hits on the cache line. In another embodiment, a method for processing fetch memory requests includes receiving a memory request and determining if the memory request hits on a cache line of a cache by determining if a memory address and a type of the memory request match a memory address and a type, respectively, corresponding to a cache line of the cache.
    Type: Application
    Filed: July 1, 2008
    Publication date: December 3, 2009
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Michael J. Mantor, Brian A. Buchner, John P. McCardle, II
  • Publication number: 20090295820
    Abstract: A method and apparatus for shader data repair utilizing a Redundant Shader Switch (RSS). The RSS consists of an input and output section whereby when a defective shader pipe is detected, the RSS multiplexes shader pipe data destined to the defective shader pipe to a redundant shader pipe array for processing. Once processed, the shader pipe data is multiplexed back to the RSS where the processed shader pipe data is directed to the corresponding output column of the RSS. The RSS contains delay pipes used to re-align and synchronize the repaired shader pipe data with output export data.
    Type: Application
    Filed: June 1, 2009
    Publication date: December 3, 2009
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Michael J. Mantor, Jeffrey T. Brady, Angel E. Socarras
  • Publication number: 20090300621
    Abstract: A graphics processing unit is disclosed, the graphics processing unit having a processor having one or more SIMD processing units, and a local data share corresponding to one of the one or more SIMD processing units, the local data share comprising one or more low latency accessible memory regions for each group of threads assigned to one or more execution wavefronts, and a global data share comprising one or more low latency memory regions for each group of threads.
    Type: Application
    Filed: June 1, 2009
    Publication date: December 3, 2009
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Michael J. Mantor, Brian Emberling
  • Patent number: 6943800
    Abstract: In a graphics processing circuit, up to N sets of state data are stored in a buffer such that a total length of the N sets of state data does not exceed the total length of the buffer. When a length of additional state data would exceed a length of available space in the buffer, storage of the additional set of state data in the buffer is delayed until at least M of the N sets of state data are no longer being used to process graphics primitives, wherein M is less than or equal to N. The buffer is preferably implemented as a ring buffer, thereby minimizing the impact of state data updates. To further prevent corruption of state data, additional sets of state data are prohibited from being added to the buffer if a maximum number of allowed states is already stored in the buffer.
    Type: Grant
    Filed: August 13, 2001
    Date of Patent: September 13, 2005
    Assignee: ATI Technologies, Inc.
    Inventors: Ralph C. Taylor, Michael J. Mantor
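
The abstract of patent 6943800 above describes storing variable-length sets of state data in a fixed-size buffer, delaying a new set when it would not fit or when the maximum number of states is already stored. The following is a minimal sketch of that admission rule only, not the patented circuit. The class name `StateRingBuffer`, its methods, and the word-count units are hypothetical, and the sketch tracks occupied length rather than modeling actual ring wrap-around addresses.

```python
from collections import deque


class StateRingBuffer:
    """Toy model of a fixed-size buffer holding variable-length state sets.

    Hypothetical illustration: names and units are assumptions, and only
    the "does it fit / are there too many states" checks are modeled.
    """

    def __init__(self, capacity, max_states):
        self.capacity = capacity      # total buffer length
        self.max_states = max_states  # maximum number of stored state sets
        self.used = 0                 # length currently occupied
        self.states = deque()         # (state_id, length), oldest first

    def try_store(self, state_id, length):
        """Store a state set if allowed; return False if it must be delayed."""
        if len(self.states) >= self.max_states:
            return False              # maximum number of states already stored
        if length > self.capacity - self.used:
            return False              # not enough available space yet
        self.states.append((state_id, length))
        self.used += length
        return True

    def retire_oldest(self):
        """Free the oldest state set once it is no longer in use."""
        state_id, length = self.states.popleft()
        self.used -= length
        return state_id


buf = StateRingBuffer(capacity=16, max_states=3)
buf.try_store("A", 8)   # True
buf.try_store("B", 6)   # True
buf.try_store("C", 4)   # False: only 2 units free, so storage is delayed
buf.retire_oldest()     # "A" is released
buf.try_store("C", 4)   # True: space is now available
```

Retiring sets oldest-first is what makes a ring layout natural here: freed space always opens up at the tail of the buffer, so a delayed set can be admitted as soon as enough older sets have been released.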