Patents by Inventor Michael J. Mantor

Michael J. Mantor has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

DUAL MODE LOCAL DATA STORE

Publication number: 20180143907

Abstract: A system and method for efficiently processing access requests for a shared resource are described. Each of many requestors is assigned to a partition of a shared resource. When a controller determines no requestor generates an access request for an unassigned partition, the controller permits simultaneous access to the assigned partitions for active requestors. When the controller determines at least one active requestor generates an access request for an unassigned partition, the controller allows a single active requestor to gain exclusive access to the entire shared resource while stalling access for the other active requestors. The controller alternates exclusive access among the active requestors. In various embodiments, the shared resource is a local data store in a graphics processing unit and each of the multiple requestors is a single instruction multiple data (SIMD) compute unit.

Type: Application

Filed: November 23, 2016

Publication date: May 24, 2018

Inventors: Daniel Clifton, Michael J. Mantor, Hans Burton
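
The two access modes described in the abstract above can be illustrated with a small sketch. This is only a toy model under stated assumptions, not the patented controller: the function name `arbitrate`, the integer requestor and partition ids, and the round-robin choice of the exclusive owner are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def arbitrate(
    requests: Dict[int, int],
    assigned: Dict[int, int],
    last_exclusive: Optional[int],
) -> Tuple[List[int], Optional[int]]:
    """Decide which requestors are granted access in one cycle.

    requests maps each active requestor id to the partition it wants.
    assigned maps each requestor id to its own partition.
    Returns the granted ids and the updated exclusive-owner marker.
    """
    if not requests:
        return [], last_exclusive

    # Shared mode: every active requestor stays inside its own partition,
    # so all of them can be served simultaneously.
    if all(assigned[r] == p for r, p in requests.items()):
        return sorted(requests), last_exclusive

    # Exclusive mode: someone targets an unassigned partition, so a single
    # requestor gets the whole resource and the others stall this cycle.
    # Assumption: ownership rotates round-robin by requestor id.
    active = sorted(requests)
    later = [r for r in active if last_exclusive is not None and r > last_exclusive]
    winner = later[0] if later else active[0]
    return [winner], winner
```

With two requestors assigned to partitions 0 and 1, a cycle in which both stay in their own partitions grants both at once, while a cycle in which requestor 0 reaches into partition 1 grants only one requestor and rotates the owner on the following cycle.
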
SIMD PROCESSING UNIT WITH LOCAL DATA SHARE AND ACCESS TO A GLOBAL DATA SHARE OF A GPU

Publication number: 20170212757

Abstract: A graphics processing unit is disclosed, the graphics processing unit having a processor having one or more SIMD processing units, and a local data share corresponding to one of the one or more SIMD processing units, the local data share comprising one or more low latency accessible memory regions for each group of threads assigned to one or more execution wavefronts, and a global data share comprising one or more low latency memory regions for each group of threads.

Type: Application

Filed: April 10, 2017

Publication date: July 27, 2017

Applicant: Advanced Micro Devices, Inc.

Inventors: Michael J. Mantor, Brian Emberling
SIMD processing unit with local data share and access to a global data share of a GPU

Patent number: 9619428

Abstract: A graphics processing unit is disclosed, the graphics processing unit having a processor having one or more SIMD processing units, and a local data share corresponding to one of the one or more SIMD processing units, the local data share comprising one or more low latency accessible memory regions for each group of threads assigned to one or more execution wavefronts, and a global data share comprising one or more low latency memory regions for each group of threads.

Type: Grant

Filed: June 1, 2009

Date of Patent: April 11, 2017

Assignee: Advanced Micro Devices, Inc.

Inventors: Michael J. Mantor, Brian Emberling
REDUNDANCY METHOD AND APPARATUS FOR SHADER COLUMN REPAIR

Publication number: 20160260192

Abstract: Methods, systems and non-transitory computer readable media are described. A system includes a shader pipe array, a redundant shader pipe array, a sequencer and a redundant shader switch. The shader pipe array includes multiple shader pipes, each of which performs rendering calculations on data provided thereto. The redundant shader pipe array also performs rendering calculations on data provided thereto. The sequencer identifies at least one defective shader pipe in the shader pipe array, and, in response, generates a signal. The redundant shader switch receives the generated signal, and, in response, transfers the data destined for each shader pipe identified as being defective independently to the redundant shader pipe array.

Type: Application

Filed: May 17, 2016

Publication date: September 8, 2016

Applicant: Advanced Micro Devices, Inc.

Inventors: Michael J. Mantor, Jeffrey T. Brady, Angel E. Socarras
Redundancy method and apparatus for shader column repair

Patent number: 9367891

Abstract: Methods, systems and non-transitory computer readable media are described. A system includes a shader pipe array, a redundant shader pipe array, a sequencer and a redundant shader switch. The shader pipe array includes multiple shader pipes, each of which performs rendering calculations on data provided thereto. The redundant shader pipe array also performs rendering calculations on data provided thereto. The sequencer identifies at least one defective shader pipe in the shader pipe array, and, in response, generates a signal. The redundant shader switch receives the generated signal, and, in response, transfers the data destined for each shader pipe identified as being defective independently to the redundant shader pipe array.

Type: Grant

Filed: July 24, 2015

Date of Patent: June 14, 2016

Assignee: Advanced Micro Devices, Inc.

Inventors: Michael J. Mantor, Jeffrey T. Brady, Angel E. Socarras
Dynamic control of SIMDs

Patent number: 9311102

Abstract: Systems and methods to improve performance in a graphics processing unit are described herein. Embodiments achieve power saving in a graphics processing unit by dynamically activating/deactivating individual SIMDs in a shader complex that comprises multiple SIMD units. On-the-fly dynamic disabling and enabling of individual SIMDs provides flexibility in achieving a required performance and power level for a given processing application. Embodiments of the invention also achieve dynamic medium grain clock gating of SIMDs in a shader complex. Embodiments reduce switching power by shutting down clock trees to unused logic by providing a clock on demand mechanism. In this way, embodiments enhance clock gating to save more switching power for the duration of time when SIMDs are idle (or assigned no work). Embodiments can also save leakage power by power gating SIMDs for a duration when SIMDs are idle for an extended period of time.

Type: Grant

Filed: July 12, 2011

Date of Patent: April 12, 2016

Assignee: Advanced Micro Devices, Inc.

Inventors: Tushar K. Shah, Michael J. Mantor, Brian Emberling
REDUNDANCY METHOD AND APPARATUS FOR SHADER COLUMN REPAIR

Publication number: 20150332427

Abstract: Methods, systems and non-transitory computer readable media are described. A system includes a shader pipe array, a redundant shader pipe array, a sequencer and a redundant shader switch. The shader pipe array includes multiple shader pipes, each of which performs rendering calculations on data provided thereto. The redundant shader pipe array also performs rendering calculations on data provided thereto. The sequencer identifies at least one defective shader pipe in the shader pipe array, and, in response, generates a signal. The redundant shader switch receives the generated signal, and, in response, transfers the data destined for each shader pipe identified as being defective independently to the redundant shader pipe array.

Type: Application

Filed: July 24, 2015

Publication date: November 19, 2015

Applicant: Advanced Micro Devices, Inc.

Inventors: Michael J. Mantor, Jeffrey T. Brady, Angel E. Socarras
Redundancy method and apparatus for shader column repair

Patent number: 9093040

Abstract: A method and apparatus for shader data repair utilizing a Redundant Shader Switch (RSS). The RSS consists of an input and output section whereby when a defective shader pipe is detected, the RSS multiplexes shader pipe data destined to the defective shader pipe to a redundant shader pipe array for processing. Once processed, the shader pipe data is multiplexed back to the RSS where the processed shader pipe data is directed to the corresponding output column of the RSS. The RSS contains delay pipes used to re-align and synchronize the repaired shader pipe data with output export data.

Type: Grant

Filed: June 1, 2009

Date of Patent: July 28, 2015

Assignee: Advanced Micro Devices, Inc.

Inventors: Michael J. Mantor, Jeffrey T. Brady, Angel E. Socarras
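
The routing idea in the abstract above (data for a defective shader pipe is diverted to a redundant pipe and the result is returned to the original output column) can be sketched in software. This is a simplified illustration only: `repair_columns` and its parameters are hypothetical names, the pipes are modeled as plain callables, and the delay pipes used for re-alignment are not modeled; keeping results in column order stands in for that step.

```python
from typing import Callable, List, Set, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def repair_columns(
    data: List[T],
    defective: Set[int],
    shader_pipe: Callable[[T], R],
    redundant_pipe: Callable[[T], R],
) -> List[R]:
    """Process one item per column, diverting defective columns."""
    results: List[R] = []
    for column, item in enumerate(data):
        # Input side of the switch: pick the redundant pipe when the
        # column's own pipe has been flagged as defective.
        unit = redundant_pipe if column in defective else shader_pipe
        # Output side: the result lands back in the same column slot.
        results.append(unit(item))
    return results
```
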
Scalable and unified compute system

Patent number: 8558836

Abstract: A Scalable and Unified Compute System performs scalable, repairable general purpose and graphics shading operations, memory load/store operations and texture filtering. A Scalable and Unified Compute Unit Module comprises a shader pipe array, a texture mapping unit, and a level one texture cache system. It accepts ALU instructions, input/output instructions, and texture or memory requests for a specified set of pixels, vertices, primitives, surfaces, or general compute work items from a shader program and performs associated operations to compute the programmed output data. The texture mapping unit accepts source data addresses and instruction constants in order to fetch, format, and perform instructed filtering interpolations to generate formatted results based on the specific corresponding data stored in a level one texture cache system. The texture mapping unit consists of an address generating system, a pre-formatter module, interpolator module, accumulator module and a format module.

Type: Grant

Filed: June 1, 2009

Date of Patent: October 15, 2013

Assignee: Advanced Micro Devices, Inc.

Inventors: Michael J. Mantor, Jeffrey T. Brady, Mark C. Fowler, Marcos P. Zini
Video instruction processing of desired bytes in multi-byte buffers by shifting to matching byte location

Patent number: 8473721

Abstract: Disclosed herein is a processing unit configured to process video data, and applications thereof. In an embodiment, the processing unit includes a buffer and an execution unit. The buffer is configured to store a data word, wherein the data word comprises a plurality of bytes of video data. The execution unit is configured to execute a single instruction to (i) shift bytes of video data contained in the data word to align a desired byte of video data and (ii) process the desired byte of the video data to provide processed video data.

Type: Grant

Filed: April 16, 2010

Date of Patent: June 25, 2013

Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC

Inventors: Michael J. Mantor, Jeffrey T. Brady, Christopher L. Spencer, Daniel W. Wong, Andrew E. Gruber
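
The shift-then-process step in the abstract above can be mimicked in a few lines. This is a software analogy under assumptions, not the patented instruction: the name `shift_and_process`, the 4-byte word width, the convention that byte 0 is the least significant byte, and the `process` callback are all hypothetical.

```python
from typing import Callable


def shift_and_process(
    word: int,
    byte_index: int,
    process: Callable[[int], int],
    word_bytes: int = 4,
) -> int:
    """Align one byte of a multi-byte word, then process it."""
    if not 0 <= byte_index < word_bytes:
        raise ValueError("byte_index out of range")
    # (i) Shift so the desired byte sits in the lowest byte position,
    # then mask off everything else.
    aligned = (word >> (8 * byte_index)) & 0xFF
    # (ii) Process the aligned byte; the operation itself is a placeholder.
    return process(aligned)
```

For example, `shift_and_process(0x11223344, 3, lambda b: b)` selects the most significant byte of the word under the byte-order assumption stated above.
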
Method and system for multi-precision computation

Patent number: 8468191

Abstract: Systems and methods for multi-precision computation are disclosed. One embodiment of the present invention includes a plurality of multiply-add units (MADDs) configured to perform one or more single precision operations and an arrangement generator to generate one or more mantissa arrangements using a plurality of double precision numbers. Each MADD is configured to receive and load said mantissa arrangements from the arrangement generator. The MADDs compute a result of a multi-precision computation using the mantissa arrangements. In an embodiment, the MADDs are configured to simultaneously perform operations that include single precision operations, double-precision additions and double-precision multiply and additions.

Type: Grant

Filed: June 10, 2010

Date of Patent: June 18, 2013

Assignee: Advanced Micro Devices, Inc.

Inventors: Michael J. Mantor, Jeffrey T. Brady, Daniel B. Clifton, Christopher Spencer
Distributed clock gating with centralized state machine control

Patent number: 8316252

Abstract: A method, computer program product, and system are provided for controlling a clock distribution network. For example, an embodiment of the method can include programming a predetermined delay time into a plurality of processing elements and controlling an activation and de-activation of these processing elements in a sequence based on the predetermined delay time. The processing elements are located in a system incorporating the clock distribution network, where the predetermined delay time can be programmed in a control register of a clock gate control circuit residing in the processing element. Further, when controlling the activation and de-activation of the processing elements, this activity can be controlled with a state machine based on the system's mode of operation.

Type: Grant

Filed: August 15, 2008

Date of Patent: November 20, 2012

Assignee: Advanced Micro Devices, Inc.

Inventors: Michael J. Mantor, Tushar K. Shah, Donald P. Lee
Shader complex with distributed level one cache system and centralized level two cache

Patent number: 8195882

Abstract: A shader pipe texture filter utilizes a level one cache system as a primary method of storage but with the ability to have the level one cache system read and write to a level two cache system when necessary. The level one cache system communicates with the level two cache system via a wide channel memory bus. In addition, the level one cache system can be configured to support dual shader pipe texture filters while maintaining access to the level two cache system. A method utilizing a level one cache system as a primary method of storage with the ability to have the level one cache system read and write a level two cache system when necessary is also presented. In addition, level one cache systems can allocate a defined area of memory to be sharable amongst other resources.

Type: Grant

Filed: June 1, 2009

Date of Patent: June 5, 2012

Assignee: Advanced Micro Devices, Inc.

Inventors: Anthony P. DeLaurier, Mark Leather, Robert S. Hartog, Michael J. Mantor, Mark C. Fowler, Marcos P. Zini
DYNAMIC CONTROL OF SIMDs

Publication number: 20120013627

Abstract: Systems and methods to improve performance in a graphics processing unit are described herein. Embodiments achieve power saving in a graphics processing unit by dynamically activating/deactivating individual SIMDs in a shader complex that comprises multiple SIMD units. On-the-fly dynamic disabling and enabling of individual SIMDs provides flexibility in achieving a required performance and power level for a given processing application. Embodiments of the invention also achieve dynamic medium grain clock gating of SIMDs in a shader complex. Embodiments reduce switching power by shutting down clock trees to unused logic by providing a clock on demand mechanism. In this way, embodiments enhance clock gating to save more switching power for the duration of time when SIMDs are idle (or assigned no work). Embodiments can also save leakage power by power gating SIMDs for a duration when SIMDs are idle for an extended period of time.

Type: Application

Filed: July 12, 2011

Publication date: January 19, 2012

Applicant: Advanced Micro Devices, Inc.

Inventors: Tushar K. Shah, Michael J. Mantor, Brian Emberling
Processing Unit to Implement Video Instructions and Applications Thereof

Publication number: 20110057940

Abstract: Disclosed herein is a processing unit configured to process video data, and applications thereof. In an embodiment, the processing unit includes a buffer and an execution unit. The buffer is configured to store a data word, wherein the data word comprises a plurality of bytes of video data. The execution unit is configured to execute a single instruction to (i) shift bytes of video data contained in the data word to align a desired byte of video data and (ii) process the desired byte of the video data to provide processed video data.

Type: Application

Filed: April 16, 2010

Publication date: March 10, 2011

Applicants: Advanced Micro Devices, Inc., ATI Technologies ULC

Inventors: Michael J. Mantor, Jeffrey T. Brady, Christopher L. Spencer, Daniel W. Wong, Andrew E. Gruber
Method And System For Multi-Precision Computation

Publication number: 20110055308

Abstract: Systems and methods for multi-precision computation are disclosed. One embodiment of the present invention includes a plurality of multiply-add units (MADDs) configured to perform one or more single precision operations and an arrangement generator to generate one or more mantissa arrangements using a plurality of double precision numbers. Each MADD is configured to receive and load said mantissa arrangements from the arrangement generator. The MADDs compute a result of a multi-precision computation using the mantissa arrangements. In an embodiment, the MADDs are configured to simultaneously perform operations that include single precision operations, double-precision additions and double-precision multiply and additions.

Type: Application

Filed: June 10, 2010

Publication date: March 3, 2011

Applicant: Advanced Micro Devices, Inc.

Inventors: Michael J. Mantor, Jeffrey T. Brady, Daniel B. Clifton, Christopher Spencer
Shader Complex with Distributed Level One Cache System and Centralized Level Two Cache

Publication number: 20100146211

Abstract: A shader pipe texture filter utilizes a level one cache system as a primary method of storage but with the ability to have the level one cache system read and write to a level two cache system when necessary. The level one cache system communicates with the level two cache system via a wide channel memory bus. In addition, the level one cache system can be configured to support dual shader pipe texture filters while maintaining access to the level two cache system. A method utilizing a level one cache system as a primary method of storage with the ability to have the level one cache system read and write a level two cache system when necessary is also presented. In addition, level one cache systems can allocate a defined area of memory to be sharable amongst other resources.

Type: Application

Filed: June 1, 2009

Publication date: June 10, 2010

Applicant: Advanced Micro Devices, Inc.

Inventors: Anthony P. DeLaurier, Mark Leather, Robert S. Hartog, Michael J. Mantor, Mark C. Fowler, Marcos P. Zini
Unified Shader Engine Filtering System

Publication number: 20090315909

Abstract: Each row of a row based shader engine comprises a shader pipe array, a texture filter, and a level one texture cache system. The shader pipe array accepts texture requests for a specified pixel from a resource and performs associated rendering calculations, outputting texel data. The texture mapping unit receives texel data from a level one cache system and through formatting and bilinear filtering interpolations, generates a formatted bilinear result based on a specific pixel's corresponding four texels. Utilizing multiple rows of a row based shader engine within the shader engine allows for the parallel processing of multiple simultaneous resource requests. A method for texture filtering utilizing a row based shader engine is also presented.

Type: Application

Filed: June 1, 2009

Publication date: December 24, 2009

Applicant: Advanced Micro Devices, Inc.

Inventors: Anthony P. DeLaurier, Mark Leather, Robert S. Hartog, Michael J. Mantor, Jeffrey T. Brady, Mark C. Fowler, Marcos P. Zini
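
The abstract above refers to a bilinear result computed from a pixel's four corresponding texels. The standard bilinear interpolation formula is shown below as a minimal sketch; the function names, the scalar texel values, and the fractional coordinates `fx` and `fy` are illustrative assumptions, and the hardware formatting steps are not modeled.

```python
def lerp(a: float, b: float, t: float) -> float:
    """Linear interpolation between a and b for t in [0, 1]."""
    return a + (b - a) * t


def bilinear(t00: float, t10: float, t01: float, t11: float,
             fx: float, fy: float) -> float:
    """Blend four neighboring texels.

    t00 and t10 are the top-left and top-right texels, t01 and t11 the
    bottom-left and bottom-right ones. fx and fy are the fractional
    position of the sample inside that 2x2 block.
    """
    top = lerp(t00, t10, fx)        # blend along x on the top row
    bottom = lerp(t01, t11, fx)     # blend along x on the bottom row
    return lerp(top, bottom, fy)    # blend the two rows along y
```
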
Multi Instance Unified Shader Engine Filtering System With Level One and Level Two Cache

Publication number: 20090309896

Abstract: Apparatus and systems utilizing multiple shader engines where each shader engine comprises multiple rows of shader engine filters combined with level one and level two cache systems. Each unified shader engine filter comprises a shader pipe array, and a texture mapping unit with access to a level one cache system and a level two cache. The shader pipe array accepts texture requests for a specified pixel from a resource and performs associated rendering calculations, outputting texel data. The texture mapping unit retrieves texel data stored in a level one cache system, with the ability to read and write to and from a level two cache system, and through formatting and bilinear filtering interpolations generates a formatted bilinear result based on the specific pixel's neighboring texels. Utilizing multiple rows of shader engine filters within a shader engine allows for the parallel processing of multiple simultaneous resource requests.

Type: Application

Filed: June 1, 2009

Publication date: December 17, 2009

Applicant: Advanced Micro Devices, Inc.

Inventors: Anthony P. DeLaurier, Mark Leather, Robert S. Hartog, Michael J. Mantor, Mark C. Fowler, Jeffrey T. Brady, Marcos P. Zini
Dynamically Partitionable Cache

Publication number: 20090300293

Abstract: Methods and systems for dynamically partitioning a cache and maintaining cache coherency are provided. In an embodiment, a system for processing memory requests includes a cache and a cache controller configured to compare a memory address and a type of a received memory request to a memory address and a type, respectively, corresponding to a cache line of the cache to determine whether the memory request hits on the cache line. In another embodiment, a method for processing fetch memory requests includes receiving a memory request and determining if the memory request hits on a cache line of a cache by determining if a memory address and a type of the memory request match a memory address and a type, respectively, corresponding to a cache line of the cache.

Type: Application

Filed: July 1, 2008

Publication date: December 3, 2009

Applicant: Advanced Micro Devices, Inc.

Inventors: Michael J. Mantor, Brian A. Buchner, John P. McCardle, II
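
The hit test described in the abstract above compares both the memory address and the request type against a cache line. A minimal sketch of that comparison follows; the `CacheLine` class, the string-valued `kind` field, and the example type names are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheLine:
    address: int   # memory address the line currently holds
    kind: str      # request type the line was allocated for


def hits(line: CacheLine, address: int, kind: str) -> bool:
    """A request hits only when both address and type match the line."""
    return line.address == address and line.kind == kind
```

Under this model, a request for the same address but a different type misses, which is what lets lines of different types share one cache without being confused with each other.
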