Patents by Inventor Michael J. Mantor
Michael J. Mantor has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20180143907Abstract: A system and method for efficiently processing access requests for a shared resource are described. Each of many requestors are assigned to a partition of a shared resource. When a controller determines no requestor generates an access request for an unassigned partition, the controller permits simultaneous access to the assigned partitions for active requestors. When the controller determines at least one active requestor generates an access request for an unassigned partition, the controller allows a single active requestor to gain exclusive access to the entire shared resource while stalling access for the other active requestors. The controller alternatives exclusive access among the active requestors. In various embodiments, the shared resource is a local data store in a graphics processing unit and each of the multiple requestors is a single instruction multiple data (SIMD) compute unit.Type: ApplicationFiled: November 23, 2016Publication date: May 24, 2018Inventors: Daniel Clifton, Michael J. Mantor, Hans Burton
-
Publication number: 20170212757Abstract: A graphics processing unit is disclosed, the graphics processing unit having a processor having one or more SIMD processing units, and a local data share corresponding to one of the one or more SIMD processing units, the local data share comprising one or more low latency accessible memory regions for each group of threads assigned to one or more execution wavefronts, and a global data share comprising one or more low latency memory regions for each group of threads.Type: ApplicationFiled: April 10, 2017Publication date: July 27, 2017Applicant: Advanced Micro Devices, Inc.Inventors: Michael J. Mantor, Brian Emberling
-
Patent number: 9619428Abstract: A graphics processing unit is disclosed, the graphics processing unit having a processor having one or more SIMD processing units, and a local data share corresponding to one of the one or more SIMD processing units, the local data share comprising one or more low latency accessible memory regions for each group of threads assigned to one or more execution wavefronts, and a global data share comprising one or more low latency memory regions for each group of threads.Type: GrantFiled: June 1, 2009Date of Patent: April 11, 2017Assignee: Advanced Micro Devices, Inc.Inventors: Michael J. Mantor, Brian Emberling
-
Publication number: 20160260192Abstract: Methods, systems and non-transitory computer readable media are described. A system includes a shader pipe array, a redundant shader pipe array, a sequencer and a redundant shader switch. The shader pipe array includes multiple shader pipes, each of which perform rendering calculations on data provided thereto. The redundant shader pipe array also performs rendering calculations on data provided thereto. The sequencer identifies at least one defective shader pipe in the shader pipe array, and, in response, generates a signal. The redundant shader switch receives the generated signal, and, in response, transfers the data destined for each shader pipe identified as being defective independently to the redundant shader pipe array.Type: ApplicationFiled: May 17, 2016Publication date: September 8, 2016Applicant: Advanced Micro Devices, Inc.Inventors: Michael J. Mantor, Jeffrey T. Brady, Angel E. Socarras
-
Patent number: 9367891Abstract: Methods, systems and non-transitory computer readable media are described. A system includes a shader pipe array, a redundant shader pipe array, a sequencer and a redundant shader switch. The shader pipe array includes multiple shader pipes, each of which perform rendering calculations on data provided thereto. The redundant shader pipe array also performs rendering calculations on data provided thereto. The sequencer identifies at least one defective shader pipe in the shader pipe array, and, in response, generates a signal. The redundant shader switch receives the generated signal, and, in response, transfers the data destined for each shader pipe identified as being defective independently to the redundant shader pipe array.Type: GrantFiled: July 24, 2015Date of Patent: June 14, 2016Assignee: Advanced Micro Devices, Inc.Inventors: Michael J. Mantor, Jeffrey T. Brady, Angel E. Socarras
-
Patent number: 9311102Abstract: Systems and methods to improve performance in a graphics processing unit are described herein. Embodiments achieve power saving in a graphics processing unit by dynamically activating/deactivating individual SIMDs in a shader complex that comprises multiple SIMD units. On-the-fly dynamic disabling and enabling of individual SIMDs provides flexibility in achieving a required performance and power level for a given processing application. Embodiments of the invention also achieve dynamic medium grain clock gating of SIMDs in a shader complex. Embodiments reduce switching power by shutting down clock trees to unused logic by providing a clock on demand mechanism. In this way, embodiments enhance clock gating to save more switching power for the duration of time when SIMDs are idle (or assigned no work). Embodiments can also save leakage power by power gating SIMDs for a duration when SIMDs are idle for an extended period of time.Type: GrantFiled: July 12, 2011Date of Patent: April 12, 2016Assignee: Advanced Micro Devices, Inc.Inventors: Tushar K. Shah, Michael J. Mantor, Brian Emberling
-
Publication number: 20150332427Abstract: Methods, systems and non-transitory computer readable media are described. A system includes a shader pipe array, a redundant shader pipe array, a sequencer and a redundant shader switch. The shader pipe array includes multiple shader pipes, each of which perform rendering calculations on data provided thereto. The redundant shader pipe array also performs rendering calculations on data provided thereto. The sequencer identifies at least one defective shader pipe in the shader pipe array, and, in response, generates a signal. The redundant shader switch receives the generated signal, and, in response, transfers the data destined for each shader pipe identified as being defective independently to the redundant shader pipe array.Type: ApplicationFiled: July 24, 2015Publication date: November 19, 2015Applicant: ADVANCED MICRO DEVICES, INC.Inventors: Michael J. Mantor, Jeffrey T. Brady, Angel E. Socarras
-
Patent number: 9093040Abstract: A method and apparatus for shader data repair utilizing a Redundant Shader Switch (RSS). The RSS consists of an input and output section whereby when a defective shader pipe is detected, the RSS multiplexes shader pipe data destined to the defective shader pipe to a redundant shader pipe array for processing. Once processed, the shader pipe data is multiplexed back to the RSS where the processed shader pipe data is directed to the corresponding output column of the RSS. The RSS contains delay pipes used to re-align and synchronize the repaired shader pipe data with output export data.Type: GrantFiled: June 1, 2009Date of Patent: July 28, 2015Assignee: Advanced Micro Devices, Inc.Inventors: Michael J. Mantor, Jeffrey T. Brady, Angel E. Socarras
-
Patent number: 8558836Abstract: A Scalable and Unified Compute System performs scalable, repairable general purpose and graphics shading operations, memory load/store operations and texture filtering. A Scalable and Unified Compute. Unit Module comprises a shader pipe array, a texture mapping unit, and a level one texture cache system. It accepts ALU instructions, input/output instructions, and texture or memory requests for a specified set of pixels, vertices, primitives, surfaces, or general compute work items from a shader program and performs associated operations to compute the programmed output data. The texture mapping unit accepts source data addresses and instruction constants in order to fetch, format, and perform instructed filtering interpolations to generate formatted results based on the specific corresponding data stored in a level one texture cache system. The texture mapping unit consists of an address generating system, a pre-formatter module, interpolator module, accumulator module and a format module.Type: GrantFiled: June 1, 2009Date of Patent: October 15, 2013Assignee: Advanced Micro Devices, Inc.Inventors: Michael J. Mantor, Jeffrey T. Brady, Mark C. Fowler, Marcos P. Zini
-
Patent number: 8473721Abstract: Disclosed herein is a processing unit configured to process video data, and applications thereof. In an embodiment, the processing unit includes a buffer and an execution unit. The buffer is configured to store a data word, wherein the data word comprises a plurality of bytes of video data. The execution unit is configured to execute a single instruction to (i) shift bytes of video data contained in the data word to align a desired byte of video data and (ii) process the desired byte of the video data to provide processed video data.Type: GrantFiled: April 16, 2010Date of Patent: June 25, 2013Assignees: Advanced Micro Devices, Inc., ATI Technologies ULCInventors: Michael J. Mantor, Jeffrey T. Brady, Christopher L. Spencer, Daniel W. Wong, Andrew E. Gruber
-
Patent number: 8468191Abstract: Systems and methods for multi-precision computation are disclosed. One embodiment of the present invention includes a plurality of multiply-add units (MADDs) configured to perform one or more single precision operations and an arrangement generator to generate one or more mantissa arrangements using a plurality of double precision numbers. Each MADD is configured to receive and load said mantissa arrangements from the arrangement generator. The MADDs compute a result of a multi-precision computation using the mantissa arrangements. In an embodiment, the MADDs are configured to simultaneously perform operations that include, single precision operations, double-precision additions and double-precision multiply and additions.Type: GrantFiled: June 10, 2010Date of Patent: June 18, 2013Assignee: Advanced Micro Devices, Inc.Inventors: Michael J. Mantor, Jeffrey T. Brady, Daniel B. Clifton, Christopher Spencer
-
Patent number: 8316252Abstract: A method, computer program product, and system are provided for controlling a clock distribution network. For example, an embodiment of the method can include programming a predetermined delay time into a plurality of processing elements and controlling an activation and de-activation of these processing elements in a sequence based on the predetermined delay time. The processing elements are located in a system incorporating the clock distribution network, where the predetermined delay time can be programmed in a control register of a clock gate control circuit residing in the processing element. Further, when controlling the activation and de-activation of the processing elements, this activity can be controlled with a state machine based on the system's mode of operation.Type: GrantFiled: August 15, 2008Date of Patent: November 20, 2012Assignee: Advanced Micro Devices, Inc.Inventors: Michael J. Mantor, Tushar K. Shah, Donald P. Lee
-
Patent number: 8195882Abstract: A shader pipe texture filter utilizes a level one cache system as a primary method of storage but with the ability to have the level one cache system read and write to a level two cache system when necessary. The level one cache system communicates with the level two cache system via a wide channel memory bus. In addition, the level one cache system can be configured to support dual shader pipe texture filters while maintaining access to the level two cache system. A method utilizing a level one cache system as a primary method of storage with the ability to have the level one cache system read and write a level two cache system when necessary is also presented. In addition, level one cache systems can allocate a defined area of memory to be sharable amongst other resources.Type: GrantFiled: June 1, 2009Date of Patent: June 5, 2012Assignee: Advanced Micro Devices, Inc.Inventors: Anthony P. DeLaurier, Mark Leather, Robert S. Hartog, Michael J. Mantor, Mark C. Fowler, Marcos P. Zini
-
Publication number: 20120013627Abstract: Systems and methods to improve performance in a graphics processing unit are described herein. Embodiments achieve power saving in a graphics processing unit by dynamically activating/deactivating individual SIMDs in a shader complex that comprises multiple SIMD units. On-the-fly dynamic disabling and enabling of individual SIMDs provides flexibility in achieving a required performance and power level for a given processing application. Embodiments of the invention also achieve dynamic medium grain clock gating of SIMDs in a shader complex. Embodiments reduce switching power by shutting down clock trees to unused logic by providing a clock on demand mechanism. In this way, embodiments enhance clock gating to save more switching power for the duration of time when SIMDs are idle (or assigned no work). Embodiments can also save leakage power by power gating SIMDs for a duration when SIMDs are idle for an extended period of time.Type: ApplicationFiled: July 12, 2011Publication date: January 19, 2012Applicant: Advanced Micro Devices, Inc.Inventors: Tushar K. Shah, Michael J. Mantor, Brian Emberling
-
Publication number: 20110057940Abstract: Disclosed herein is a processing unit configured to process video data, and applications thereof. In an embodiment, the processing unit includes a buffer and an execution unit. The buffer is configured to store a data word, wherein the data word comprises a plurality of bytes of video data. The execution unit is configured to execute a single instruction to (i) shift bytes of video data contained in the data word to align a desired byte of video data and (ii) process the desired byte of the video data to provide processed video data.Type: ApplicationFiled: April 16, 2010Publication date: March 10, 2011Applicants: Advanced Micro Devices, Inc., ATI Technologies ULCInventors: Michael J. MANTOR, Jeffrey T. Brady, Christopher L. Spencer, Daniel W. Wong, Andrew E. Gruber
-
Publication number: 20110055308Abstract: Systems and methods for multi-precision computation are disclosed. One embodiment of the present invention includes a plurality of multiply-add units (MADDs) configured to perform one or more single precision operations and an arrangement generator to generate one or more mantissa arrangements using a plurality of double precision numbers. Each MADD is configured to receive and load said mantissa arrangements from the arrangement generator. The MADDs compute a result of a multi-precision computation using the mantissa arrangements. In an embodiment, the MADDs are configured to simultaneously perform operations that include, single precision operations, double-precision additions and double-precision multiply and additions.Type: ApplicationFiled: June 10, 2010Publication date: March 3, 2011Applicant: Advanced Micro Devices, Inc.Inventors: Michael J. Mantor, Jeffrey T. Brady, Daniel B. Clifton, Christopher Spencer
-
Publication number: 20100146211Abstract: A shader pipe texture filter utilizes a level one cache system as a primary method of storage but with the ability to have the level one cache system read and write to a level two cache system when necessary. The level one cache system communicates with the level two cache system via a wide channel memory bus. In addition, the level one cache system can be configured to support dual shader pipe texture filters while maintaining access to the level two cache system. A method utilizing a level one cache system as a primary method of storage with the ability to have the level one cache system read and write a level two cache system when necessary is also presented. In addition, level one cache systems can allocate a defined area of memory to be sharable amongst other resources.Type: ApplicationFiled: June 1, 2009Publication date: June 10, 2010Applicant: Advanced Micro Devices, Inc.Inventors: Anthony P. DeLaurier, Mark Leather, Robert S. Hartog, Michael J. Mantor, Mark C. Fowler, Marcos P. Zini
-
Publication number: 20090315909Abstract: Each row of a row based shader engine comprises a shader pipe array, a texture filter, and a level one texture cache system. The shader pipe array accepts texture requests for a specified pixel from a resource and performs associated rendering calculations, outputting texel data. The texture mapping unit receives texel data from a level one cache system and through formatting and bilinear filtering interpolations, generates a formatted bilinear result based on a specific pixel's corresponding four texels. Utilizing multiple rows of a row based shader engine within the shader engine allows for the parallel processing of multiple simultaneous resource requests. A method for texture filtering utilizing a row based shader engine is also presented.Type: ApplicationFiled: June 1, 2009Publication date: December 24, 2009Applicant: Advanced Micro Devices, Inc.Inventors: Anthony P. DeLaurier, Mark Leather, Robert S. Hartog, Michael J. Mantor, Jeffrey T. Brady, Mark C. Fowler, Marcos P. Zini
-
Publication number: 20090309896Abstract: Apparatus and systems utilizing multiple shader engines where each shader engine comprises multiple rows of shader engine filters combined with level one and level two cache systems. Each unified shader engine filter comprises a shader pipe array, and a texture mapping unit with access to a level one cache system and a level two cache. The shader pipe array accepts texture requests for a specified pixel from a resource and performs associated rendering calculations, outputting texel data. The texture mapping unit retrieves texel data stored in a level one cache system, with the ability to read and write to and from a level two cache system, and through formatting and bilinear filtering interpolations generates a formatted bilinear result based on the specific pixel's neighboring texels. Utilizing multiple rows of shader engine filters within a shader engine allows for the parallel processing of multiple simultaneous resource requests.Type: ApplicationFiled: June 1, 2009Publication date: December 17, 2009Applicant: Advanced Micro Devices, Inc.Inventors: Anthony P. DeLaurier, Mark Leather, Robert S. Hartog, Michael J. Mantor, Mark C. Fowler, Jeffrey T. Brady, Marcos P. Zini
-
Publication number: 20090300293Abstract: Methods and systems for dynamically partitioning a cache and maintaining cache coherency are provided. In an embodiment, a system for processing memory requests includes a cache and a cache controller configured to compare a memory address and a type of a received memory request to a memory address and a type, respectively, corresponding to a cache line of the cache to determine whether the memory request hits on the cache line. In another embodiment, a method for processing fetch memory requests includes receiving a memory request and determining if the memory request hits on a cache line of a cache by determining if a memory address and a type of the memory request match a memory address and a type, respectively, corresponding to a cache line of the cache.Type: ApplicationFiled: July 1, 2008Publication date: December 3, 2009Applicant: Advanced Micro Devices, Inc.Inventors: Michael J. MANTOR, Brian A. Buchner, John P. McCardle, II