Patents by Inventor David Gotwalt
David Gotwalt has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12190164
Abstract: Disclosed embodiments relate to controlling sets of graphics work (e.g., kicks) assigned to graphics processor circuitry. In some embodiments, tracking slot circuitry implements entries for multiple tracking slots. Slot manager circuitry may store, using an entry of the tracking slot circuitry, software-specified information for a set of graphics work, where the information includes: type of work, dependencies on other sets of graphics work, and location of data for the set of graphics work. The slot manager circuitry may prefetch, from the location and prior to allocating shader core resources for the set of graphics work, configuration register data for the set of graphics work. Control circuitry may program configuration registers for the set of graphics work using the prefetched data and initiate processing of the set of graphics work by the graphics processor circuitry according to the dependencies. Disclosed techniques may reduce kick-to-kick transition time, in some embodiments.
Type: Grant
Filed: August 11, 2021
Date of Patent: January 7, 2025
Assignee: Apple Inc.
Inventors: Steven Fishwick, Fergus W. MacGarry, Jonathan M. Redshaw, David A. Gotwalt, Ali Rabbani Rankouhi, Benjamin Bowman
-
Patent number: 12086644
Abstract: Disclosed techniques relate to work distribution in graphics processors. In some embodiments, an apparatus includes circuitry that implements a plurality of logical slots and a set of graphics processor sub-units that each implement multiple distributed hardware slots. The circuitry may determine different distribution rules for first and second sets of graphics work and map logical slots to distributed hardware slots based on the distribution rules. In various embodiments, disclosed techniques may advantageously distribute work efficiently across distributed shader processors for graphics kicks of various sizes.
Type: Grant
Filed: August 11, 2021
Date of Patent: September 10, 2024
Assignee: Apple Inc.
Inventors: Andrew M. Havlir, Steven Fishwick, David A. Gotwalt, Benjamin Bowman, Ralph C. Taylor, Melissa L. Velez, Mladen Wilder, Ali Rabbani Rankouhi, Fergus W. MacGarry
-
Publication number: 20240273666
Abstract: Disclosed techniques relate to scheduling sets of graphics work using queues. In some embodiments, tracking circuitry implements entries for multiple tracking slots for a graphics processor. Queue access circuitry may access a data structure in memory that specifies multiple queues, where each queue enqueues control information for multiple sets of graphics work. Queue select circuitry may select sets of graphics work from the data structure based on one or more selection parameters and store control information for selected sets of graphics work in tracking slots of the tracking slot circuitry. Distribution circuitry may assign portions of respective sets of graphics work from the tracking slots to graphics processor circuitry for execution.
Type: Application
Filed: August 16, 2023
Publication date: August 15, 2024
Inventors: Steven Fishwick, David A. Gotwalt, Pratik Chandresh Shah, Jackson Dsouza, Subodh Asthana, Jairaj Dave, Piotr A. Dittrich, David E. Roberts
-
Patent number: 12026098
Abstract: Techniques are disclosed relating to updating page pools in the context of cached page pool descriptors. In some embodiments, a processor is configured to assign a set of processing work to a first page pool of memory pages. Page manager circuitry may cache page pool descriptor entries in cache circuitry, where a given page pool descriptor entry indicates a set of pages assigned to a page pool. In response to a determination to grow the first page pool, the processor may communicate a grow list to the page manager circuitry, that identifies a set of memory blocks from the memory to be added to the first page pool. The page manager circuitry may then update a cached page pool descriptor entry for the first page pool to indicate the added memory blocks and generate a signal to inform the processor that the cached page pool descriptor entry is updated.
Type: Grant
Filed: April 21, 2022
Date of Patent: July 2, 2024
Assignee: Apple Inc.
Inventors: Arjun Thottappilly, David A. Gotwalt, Frank W. Liljeros
-
Publication number: 20230050061
Abstract: Disclosed techniques relate to work distribution in graphics processors. In some embodiments, an apparatus includes circuitry that implements a plurality of logical slots and a set of graphics processor sub-units that each implement multiple distributed hardware slots. The circuitry may determine different distribution rules for first and second sets of graphics work and map logical slots to distributed hardware slots based on the distribution rules. In various embodiments, disclosed techniques may advantageously distribute work efficiently across distributed shader processors for graphics kicks of various sizes.
Type: Application
Filed: August 11, 2021
Publication date: February 16, 2023
Inventors: Andrew M. Havlir, Steven Fishwick, David A. Gotwalt, Benjamin Bowman, Ralph C. Taylor, Melissa L. Velez, Mladen Wilder, Ali Rabbani Rankouhi, Fergus W. MacGarry
-
Publication number: 20230048951
Abstract: Disclosed embodiments relate to controlling sets of graphics work (e.g., kicks) assigned to graphics processor circuitry. In some embodiments, tracking slot circuitry implements entries for multiple tracking slots. Slot manager circuitry may store, using an entry of the tracking slot circuitry, software-specified information for a set of graphics work, where the information includes: type of work, dependencies on other sets of graphics work, and location of data for the set of graphics work. The slot manager circuitry may prefetch, from the location and prior to allocating shader core resources for the set of graphics work, configuration register data for the set of graphics work. Control circuitry may program configuration registers for the set of graphics work using the prefetched data and initiate processing of the set of graphics work by the graphics processor circuitry according to the dependencies. Disclosed techniques may reduce kick-to-kick transition time, in some embodiments.
Type: Application
Filed: August 11, 2021
Publication date: February 16, 2023
Inventors: Steven Fishwick, Fergus W. MacGarry, Jonathan M. Redshaw, David A. Gotwalt, Ali Rabbani Rankouhi, Benjamin Bowman
-
Patent number: 10970223
Abstract: Systems, apparatuses, and methods for efficiently allocating data in a cache are described. In various embodiments, a processor decodes an indication in a software application identifying a temporal data set. The data set is flagged with a data set identifier (DSID) indicating temporal data to drop after consumption. When the data set is allocated in a cache, the data set is stored with a non-replaceable attribute to prevent a cache replacement policy from evicting the data set before it is dropped. A drop command with an indication of the DSID of the data set is later issued after the data set is read (consumed). A copy of the data set is not written back to the lower-level memory although the data set is removed from the cache. An interrupt is generated to notify firmware or other software of the completion of the drop command.
Type: Grant
Filed: May 13, 2019
Date of Patent: April 6, 2021
Assignee: Apple Inc.
Inventors: Wolfgang H. Klingauf, Kenneth C. Dyke, Karthik Ramani, Winnie W. Yeung, Anthony P. DeLaurier, Luc R. Semeria, David A. Gotwalt, Srinivasa Rangan Sridharan, Muditha Kanchana
-
Publication number: 20190266102
Abstract: Systems, apparatuses, and methods for efficiently allocating data in a cache are described. In various embodiments, a processor decodes an indication in a software application identifying a temporal data set. The data set is flagged with a data set identifier (DSID) indicating temporal data to drop after consumption. When the data set is allocated in a cache, the data set is stored with a non-replaceable attribute to prevent a cache replacement policy from evicting the data set before it is dropped. A drop command with an indication of the DSID of the data set is later issued after the data set is read (consumed). A copy of the data set is not written back to the lower-level memory although the data set is removed from the cache. An interrupt is generated to notify firmware or other software of the completion of the drop command.
Type: Application
Filed: May 13, 2019
Publication date: August 29, 2019
Inventors: Wolfgang H. Klingauf, Kenneth C. Dyke, Karthik Ramani, Winnie W. Yeung, Anthony P. DeLaurier, Luc R. Semeria, David A. Gotwalt, Srinivasa Rangan Sridharan, Muditha Kanchana
-
Patent number: 10289565
Abstract: Systems, apparatuses, and methods for efficiently allocating data in a cache are described. In various embodiments, a processor decodes an indication in a software application identifying a temporal data set. The data set is flagged with a data set identifier (DSID) indicating temporal data to drop after consumption. When the data set is allocated in a cache, the data set is stored with a non-replaceable attribute to prevent a cache replacement policy from evicting the data set before it is dropped. A drop command with an indication of the DSID of the data set is later issued after the data set is read (consumed). A copy of the data set is not written back to the lower-level memory although the data set is removed from the cache. An interrupt is generated to notify firmware or other software of the completion of the drop command.
Type: Grant
Filed: May 31, 2017
Date of Patent: May 14, 2019
Assignee: Apple Inc.
Inventors: Wolfgang H. Klingauf, Kenneth C. Dyke, Karthik Ramani, Winnie W. Yeung, Anthony P. DeLaurier, Luc R. Semeria, David A. Gotwalt, Srinivasa Rangan Sridharan, Muditha Kanchana
-
Publication number: 20180349291
Abstract: Systems, apparatuses, and methods for efficiently allocating data in a cache are described. In various embodiments, a processor decodes an indication in a software application identifying a temporal data set. The data set is flagged with a data set identifier (DSID) indicating temporal data to drop after consumption. When the data set is allocated in a cache, the data set is stored with a non-replaceable attribute to prevent a cache replacement policy from evicting the data set before it is dropped. A drop command with an indication of the DSID of the data set is later issued after the data set is read (consumed). A copy of the data set is not written back to the lower-level memory although the data set is removed from the cache. An interrupt is generated to notify firmware or other software of the completion of the drop command.
Type: Application
Filed: May 31, 2017
Publication date: December 6, 2018
Inventors: Wolfgang H. Klingauf, Kenneth C. Dyke, Karthik Ramani, Winnie W. Yeung, Anthony P. DeLaurier, Luc R. Semeria, David A. Gotwalt, Srinivasa Rangan Sridharan, Muditha Kanchana
-
Patent number: 10078882
Abstract: A method of processing commands is provided. The method includes holding commands in queues and executing the commands in an order based on their respective priority. Commands having the same priority are held in the same queue.
Type: Grant
Filed: November 8, 2016
Date of Patent: September 18, 2018
Assignee: Advanced Micro Devices, Inc.
Inventors: Philip J. Rogers, David Gotwalt, Tom Frisinger, Rex McCrary
-
Publication number: 20180181491
Abstract: Techniques are disclosed relating to flushing cache lines. In some embodiments, a graphics processing unit includes a cache and one or more storage elements configured to store a plurality of command buffers that include instructions executable to manipulate data stored in the cache. In some embodiments, ones of the cache lines in the cache are configured to store data to be operated on by instructions in the command buffers and a first tag portion that identifies a command buffer that has stored data in the cache line. In some embodiments, the graphics processing unit is configured to receive a request to flush cache lines that store data of a particular command buffer, and to flush ones of the cache lines having first tag portions indicating the particular command buffer as having data stored in the cache lines while maintaining data stored in other ones of the cache lines as valid.
Type: Application
Filed: December 22, 2016
Publication date: June 28, 2018
Inventors: Anthony P. DeLaurier, Luc R. Semeria, Gokhan Avkarogullari, David A. Gotwalt, Robert S. Hartog, Michael J. Swift
-
Publication number: 20170053377
Abstract: A method of processing commands is provided. The method includes holding commands in queues and executing the commands in an order based on their respective priority. Commands having the same priority are held in the same queue.
Type: Application
Filed: November 8, 2016
Publication date: February 23, 2017
Applicant: Advanced Micro Devices, Inc.
Inventors: Philip J. Rogers, David Gotwalt, Tom Frisinger, Rex McCrary
-
Patent number: 9519943
Abstract: A method of processing commands is provided. The method includes holding commands in queues and executing the commands in an order based on their respective priority. Commands having the same priority are held in the same queue.
Type: Grant
Filed: June 29, 2011
Date of Patent: December 13, 2016
Assignee: Advanced Micro Devices, Inc.
Inventors: Philip J. Rogers, David Gotwalt, Tom Frisinger, Rex McCrary
-
Patent number: 9489313
Abstract: The present disclosure provides for systems and methods to process a non-resident page that may include attempting to access the non-resident page, an address for the non-resident page pointing to a memory page containing default values, determining that the non-resident page should not cause a page fault based on an indicator indicating that a particular non-resident page should not generate a page fault, returning an indication that a memory read did not translate and returning the default value when the access of the non-resident page is a read and the non-resident page should not cause a page fault. Another example may discontinue a write when the access of the non-resident page is a write and the non-resident page should not cause a page fault.
Type: Grant
Filed: September 24, 2013
Date of Patent: November 8, 2016
Assignee: QUALCOMM Incorporated
Inventors: David A. Gotwalt, Thomas Edwin Frisinger, Andrew Evan Gruber, Eric Demers, Colin Christopher Sharp
-
Publication number: 20150089146
Abstract: The present disclosure provides for systems and methods to process a non-resident page that may include attempting to access the non-resident page, an address for the non-resident page pointing to a memory page containing default values, determining that the non-resident page should not cause a page fault based on an indicator indicating that a particular non-resident page should not generate a page fault, returning an indication that a memory read did not translate and returning the default value when the access of the non-resident page is a read and the non-resident page should not cause a page fault. Another example may discontinue a write when the access of the non-resident page is a write and the non-resident page should not cause a page fault.
Type: Application
Filed: September 24, 2013
Publication date: March 26, 2015
Applicant: QUALCOMM Incorporated
Inventors: David A. Gotwalt, Thomas Edwin Frisinger, Andrew Evan Gruber, Eric Demers, Colin Christopher Sharp
-
Patent number: 8427486
Abstract: A multiprocessor system includes a plurality of special purpose processors that perform different portions of a related processing task. A set of commands that cause each of the processors to perform the portions of the related task are distributed, and the set of commands includes a predicated execution command that precedes other commands within the set of commands. It is determined whether commands subsequent to the predicated execution command are intended to be executed by a first processor or a second processor based on information in the predicated execution command and the set of commands includes all commands to be executed by each processor.
Type: Grant
Filed: September 23, 2011
Date of Patent: April 23, 2013
Assignee: ATI Technologies ULC
Inventors: Timothy M. Kelley, Jonathan L. Campbell, David A. Gotwalt
-
Patent number: 8305380
Abstract: A method of managing resources is provided. The method includes identifying a resource associated with a processor responsive to an impending transition, and copying the identified resource from a memory associated with the GPU or to the memory associated with the GPU.
Type: Grant
Filed: January 28, 2010
Date of Patent: November 6, 2012
Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC
Inventors: David Gotwalt, Oleksandr Khodorkovsky
-
Patent number: 8217950
Abstract: A processing unit, method, and graphics processing system are provided for processing a plurality of frames of graphics data. For instance, the processing unit can include a first plurality of graphics processing units (GPUs), a second plurality of GPUs, and a plurality of compositors. The first plurality of GPUs can be configured to process a first frame of graphics data. Likewise, the second plurality of GPUs can be configured to process a second frame of graphics data. Further, each compositor in the plurality of compositors can be coupled to a respective GPU from the first and second pluralities of GPUs, where the plurality of compositors is configured to sequentially pass the first and second frames of graphics data to a display module.
Type: Grant
Filed: September 2, 2009
Date of Patent: July 10, 2012
Assignee: Advanced Micro Devices, Inc.
Inventors: Rajabali M. Koduri, David Gotwalt, Andrew Pomianowski
-
Publication number: 20120147015
Abstract: A method, computer program product, and computing system are provided for processing a graphics operation. For instance, the method can include receiving the graphics operation from an application. The method can also include allocating a first portion of the graphics operation to a first processing unit and a second portion of the graphics operation to a second processing unit. This allocation between the first and second processing units can be based on at least one of a performance profile and a functionality profile of the first and second processing units.
Type: Application
Filed: December 13, 2011
Publication date: June 14, 2012
Applicant: Advanced Micro Devices, Inc.
Inventors: Philip J. Rogers, David A. Gotwalt