Patents by Inventor Brian L. Sumner
Brian L. Sumner has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11875197Abstract: Systems, apparatuses, and methods for managing a number of wavefronts permitted to concurrently execute in a processing system. An apparatus includes a register file with a plurality of registers and a plurality of compute units configured to execute wavefronts. A control unit of the apparatus is configured to allow a first number of wavefronts to execute concurrently on the plurality of compute units. The control unit is configured to allow no more than a second number of wavefronts to execute concurrently on the plurality of compute units, wherein the second number is less than the first number, in response to detection that thrashing of the register file is above a threshold. The control unit is configured to detect said thrashing based at least in part on a number of registers in use by executing wavefronts that spill to memory.Type: GrantFiled: December 29, 2020Date of Patent: January 16, 2024Assignee: Advanced Micro Devices, Inc.Inventors: Bradford Michael Beckmann, Steven Tony Tye, Brian L. Sumner, Nicolai Hähnle
-
Publication number: 20230153149Abstract: Systems, apparatuses, and methods for dynamic graphics processing unit (GPU) register allocation are disclosed. A GPU includes at least a plurality of compute units (CUs), a control unit, and a plurality of registers for each CU. If a new wavefront requests more registers than are currently available on the CU, the control unit spills registers associated with stack frames at the bottom of a stack since they will not likely be used in the near future. The control unit has complete flexibility determining how many registers to spill based on dynamic demands and can prefetch the upcoming necessary fills without software involvement. Effectively, the control unit manages the physical register file as a cache. This allows younger workgroups to be dynamically descheduled so that older workgroups can allocate additional registers when needed to ensure improved fairness and better forward progress guarantees.Type: ApplicationFiled: January 12, 2023Publication date: May 18, 2023Inventors: Bradford Michael Beckmann, Steven Tony Tye, Brian L. Sumner, Nicolai Hähnle
-
Patent number: 11579922Abstract: Systems, apparatuses, and methods for dynamic graphics processing unit (GPU) register allocation are disclosed. A GPU includes at least a plurality of compute units (CUs), a control unit, and a plurality of registers for each CU. If a new wavefront requests more registers than are currently available on the CU, the control unit spills registers associated with stack frames at the bottom of a stack since they will not likely be used in the near future. The control unit has complete flexibility determining how many registers to spill based on dynamic demands and can prefetch the upcoming necessary fills without software involvement. Effectively, the control unit manages the physical register file as a cache. This allows younger workgroups to be dynamically descheduled so that older workgroups can allocate additional registers when needed to ensure improved fairness and better forward progress guarantees.Type: GrantFiled: December 29, 2020Date of Patent: February 14, 2023Assignee: Advanced Micro Devices, Inc.Inventors: Bradford Michael Beckmann, Steven Tony Tye, Brian L. Sumner, Nicolai Hähnle
-
Patent number: 11544106Abstract: Systems, apparatuses, and methods for implementing continuation analysis tasks (CATs) are disclosed. In one embodiment, a system implements hardware acceleration of CATs to manage the dependencies and scheduling of an application composed of multiple tasks. In one embodiment, a continuation packet is referenced directly by a first task. When the first task completes, the first task enqueues a continuation packet on a first queue. The first task can specify on which queue to place the continuation packet. The agent responsible for the first queue dequeues and executes the continuation packet which invokes an analysis phase which is performed prior to determining which dependent tasks to enqueue. If it is determined during the analysis phase that a second task is now ready to be launched, the second task is enqueued on one of the queues. Then, an agent responsible for this queue dequeues and executes the second task.Type: GrantFiled: April 13, 2020Date of Patent: January 3, 2023Assignee: Advanced Micro Devices, Inc.Inventors: Steven Tony Tye, Brian L. Sumner, Bradford Michael Beckmann, Sooraj Puthoor
-
Publication number: 20220206876Abstract: Systems, apparatuses, and methods for managing a number of wavefronts permitted to concurrently execute in a processing system. An apparatus includes a register file with a plurality of registers and a plurality of compute units configured to execute wavefronts. A control unit of the apparatus is configured to allow a first number of wavefronts to execute concurrently on the plurality of compute units. The control unit is configured to allow no more than a second number of wavefronts to execute concurrently on the plurality of compute units, wherein the second number is less than the first number, in response to detection that thrashing of the register file is above a threshold.Type: ApplicationFiled: December 29, 2020Publication date: June 30, 2022Inventors: Bradford Michael Beckmann, Steven Tony Tye, Brian L. Sumner, Nicolai Hähnle
-
Publication number: 20220206841Abstract: Systems, apparatuses, and methods for dynamic graphics processing unit (GPU) register allocation are disclosed. A GPU includes at least a plurality of compute units (CUs), a control unit, and a plurality of registers for each CU. If a new wavefront requests more registers than are currently available on the CU, the control unit spills registers associated with stack frames at the bottom of a stack since they will not likely be used in the near future. The control unit has complete flexibility determining how many registers to spill based on dynamic demands and can prefetch the upcoming necessary fills without software involvement. Effectively, the control unit manages the physical register file as a cache. This allows younger workgroups to be dynamically descheduled so that older workgroups can allocate additional registers when needed to ensure improved fairness and better forward progress guarantees.Type: ApplicationFiled: December 29, 2020Publication date: June 30, 2022Inventors: Bradford Michael Beckmann, Steven Tony Tye, Brian L. Sumner, Nicolai Hähnle
-
Publication number: 20200379802Abstract: Systems, apparatuses, and methods for implementing continuation analysis tasks (CATs) are disclosed. In one embodiment, a system implements hardware acceleration of CATs to manage the dependencies and scheduling of an application composed of multiple tasks. In one embodiment, a continuation packet is referenced directly by a first task. When the first task completes, the first task enqueues a continuation packet on a first queue. The first task can specify on which queue to place the continuation packet. The agent responsible for the first queue dequeues and executes the continuation packet which invokes an analysis phase which is performed prior to determining which dependent tasks to enqueue. If it is determined during the analysis phase that a second task is now ready to be launched, the second task is enqueued on one of the queues. Then, an agent responsible for this queue dequeues and executes the second task.Type: ApplicationFiled: April 13, 2020Publication date: December 3, 2020Inventors: Steven Tony Tye, Brian L. Sumner, Bradford Michael Beckmann, Sooraj Puthoor
-
Patent number: 10620994Abstract: Systems, apparatuses, and methods for implementing continuation analysis tasks (CATs) are disclosed. In one embodiment, a system implements hardware acceleration of CATs to manage the dependencies and scheduling of an application composed of multiple tasks. In one embodiment, a continuation packet is referenced directly by a first task. When the first task completes, the first task enqueues a continuation packet on a first queue. The first task can specify on which queue to place the continuation packet. The agent responsible for the first queue dequeues and executes the continuation packet which invokes an analysis phase which is performed prior to determining which dependent tasks to enqueue. If it is determined during the analysis phase that a second task is now ready to be launched, the second task is enqueued on one of the queues. Then, an agent responsible for this queue dequeues and executes the second task.Type: GrantFiled: May 30, 2017Date of Patent: April 14, 2020Assignee: Advanced Micro Devices, Inc.Inventors: Steven Tony Tye, Brian L. Sumner, Bradford Michael Beckmann, Sooraj Puthoor
-
Publication number: 20180349145Abstract: Systems, apparatuses, and methods for implementing continuation analysis tasks (CATs) are disclosed. In one embodiment, a system implements hardware acceleration of CATs to manage the dependencies and scheduling of an application composed of multiple tasks. In one embodiment, a continuation packet is referenced directly by a first task. When the first task completes, the first task enqueues a continuation packet on a first queue. The first task can specify on which queue to place the continuation packet. The agent responsible for the first queue dequeues and executes the continuation packet which invokes an analysis phase which is performed prior to determining which dependent tasks to enqueue. If it is determined during the analysis phase that a second task is now ready to be launched, the second task is enqueued on one of the queues. Then, an agent responsible for this queue dequeues and executes the second task.Type: ApplicationFiled: May 30, 2017Publication date: December 6, 2018Inventors: Steven Tony Tye, Brian L. Sumner, Bradford Michael Beckmann, Sooraj Puthoor
-
Patent number: 9563402Abstract: A method and apparatus for additive range reduction are disclosed. A constant may be pre-stored in a look-up table (LUT), and at least one section of the constant may be retrieved from the LUT for generating a product of an input argument and the constant such that a precision of the product may be controlled in any granularity. For a trigonometric function, 2/? is stored in the LUT, and at least one section of 2/? may be retrieved from the LUT. The argument is multiplied with the retrieved sections of 2/?. The retrieved sections are determined to correctly generate the two least significant bits (LSBs) of an integer portion and a scalable number of most significant bits of the multiplication result. An output of the trigonometric function is generated for the argument with a fractional portion of the multiplication result based on two LSBs of the integer portion of the multiplication result.Type: GrantFiled: September 1, 2011Date of Patent: February 7, 2017Assignee: ADVANCED MICRO DEVICES, INC.Inventors: Christopher L. Spencer, Yun-Xiao Zou, Brian L. Sumner
-
Publication number: 20130060829Abstract: A method and apparatus for additive range reduction are disclosed. A constant may be pre-stored in a look-up table (LUT), and at least one section of the constant may be retrieved from the LUT for generating a product of an input argument and the constant such that a precision of the product may be controlled in any granularity. For a trigonometric function, 2/? is stored in the LUT, and at least one section of 2/? may be retrieved from the LUT. The argument is multiplied with the retrieved sections of 2/?. The retrieved sections are determined to correctly generate the two least significant bits (LSBs) of an integer portion and a scalable number of most significant bits of the multiplication result. An output of the trigonometric function is generated for the argument with a fractional portion of the multiplication result based on two LSBs of the integer portion of the multiplication result.Type: ApplicationFiled: September 1, 2011Publication date: March 7, 2013Applicant: ADVANCED MICRO DEVICES, INC.Inventors: Christopher L. Spencer, Yun-Xiao Zou, Brian L. Sumner