Patents by Inventor Karl D. Mann

Karl D. Mann has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

On-demand memory allocation

Patent number: 12265474

Abstract: Techniques are disclosed relating to dynamically allocating and mapping private memory for requesting circuitry. Disclosed circuitry may receive a private address and translate the private address to a virtual address (which an MMU may then translate to a physical address to actually access a storage element). In some embodiments, private memory allocation circuitry is configured to generate page table information and map private memory pages for requests if the page table information is not already set up. In various embodiments, this may advantageously allow dynamic private memory allocation, e.g., to efficiently allocate memory for graphics shaders with different types of workloads. Disclosed caching techniques for page table information may improve performance relative to traditional techniques. Further, disclosed embodiments may facilitate memory consolidation across a device such as a graphics processor.

Type: Grant

Filed: October 19, 2023

Date of Patent: April 1, 2025

Assignee: Apple Inc.

Inventors: Justin A. Hensley, Karl D. Mann, Yoong Chert Foo, Terence M. Potter, Frank W. Liljeros, Ralph C. Taylor
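
The on-demand mapping described in the abstract above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented circuitry: the class name `PrivateMemoryAllocator`, the 4096-byte page size, and the free-page list are all hypothetical, and the second translation stage (virtual to physical, done by the MMU) is left out.

```python
PAGE_SIZE = 4096  # assumed page size in bytes (not stated in the abstract)


class PrivateMemoryAllocator:
    """Toy model: maps private pages to virtual pages on first touch."""

    def __init__(self, free_virtual_pages):
        # Hypothetical pool of virtual page numbers available for mapping.
        self.free_virtual_pages = list(free_virtual_pages)
        # Page table information: private page number -> virtual page number.
        self.page_table = {}

    def translate(self, private_addr):
        """Translate a private address to a virtual address."""
        page, offset = divmod(private_addr, PAGE_SIZE)
        if page not in self.page_table:
            # Page table information is not set up yet: map a page on demand.
            if not self.free_virtual_pages:
                raise MemoryError("no free virtual pages")
            self.page_table[page] = self.free_virtual_pages.pop(0)
        return self.page_table[page] * PAGE_SIZE + offset


allocator = PrivateMemoryAllocator(free_virtual_pages=[100, 101])
first = allocator.translate(10)              # first touch maps private page 0
second = allocator.translate(PAGE_SIZE + 4)  # first touch maps private page 1
again = allocator.translate(20)              # reuses the existing mapping
```

Only the first access to a private page allocates anything; later accesses reuse the stored mapping, which is the property the abstract calls dynamic private memory allocation.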

Page Management and Forward Progress for Ray Tracing

Publication number: 20250095273

Abstract: Techniques are disclosed relating to memory page allocation for a graphics processor. In some embodiments, a shader program includes a primary thread associated with ray tracing (that includes an instruction that directs the apparatus to launch one or more secondary threads). Memory resource allocator circuitry may receive a request to allocate a memory page in a page pool to a thread of the shader program, where the page pool includes a set of protected pages and a set of public pages. The allocator may allocate a page of the page pool to the requesting thread according to an allocation restriction, such that protected pages are allocable only to secondary threads that are launched based on a primary thread and public pages are allocable to both primary and secondary threads.

Type: Application

Filed: November 15, 2023

Publication date: March 20, 2025

Inventors: Frank W. Liljeros, Karl D. Mann, Per Christian Corneliussen
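
The allocation restriction in the abstract above can be sketched in software as follows. The class `PagePool` and the choice to hand out public pages before protected ones are assumptions made for illustration; the abstract only states which thread types may receive which pages.

```python
class PagePool:
    """Toy page pool with protected and public pages."""

    def __init__(self, protected, public):
        self.protected = list(protected)  # allocable to secondary threads only
        self.public = list(public)        # allocable to any thread

    def allocate(self, is_secondary):
        """Return a page id, or None if no permitted page is free."""
        # Assumed policy: try public pages first for every thread.
        if self.public:
            return self.public.pop()
        # Restriction from the abstract: protected pages go only to
        # secondary threads launched based on a primary thread.
        if is_secondary and self.protected:
            return self.protected.pop()
        return None


pool = PagePool(protected=[1], public=[2])
primary_page = pool.allocate(is_secondary=False)    # takes the public page
primary_denied = pool.allocate(is_secondary=False)  # only protected pages remain
secondary_page = pool.allocate(is_secondary=True)   # may take the protected page
```

Because primary threads can never drain the protected set, a secondary thread can still obtain a page after the public pages run out, which is one way to read the "forward progress" in the title.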

Compute Kernel Parsing with Limits in one or more Dimensions

Publication number: 20240345892

Abstract: Techniques are disclosed relating to dispatching compute work from a compute stream. In some embodiments, a graphics processor executes instructions of compute kernels. Workload parser circuitry may determine, for distribution to the graphics processor circuitry, a set of workgroups from a compute kernel that includes workgroups organized in multiple dimensions, including a first number of workgroups in a first dimension and a second number of workgroups in a second dimension. This may include determining multiple sub-kernels for the compute kernel, wherein a first sub-kernel includes, in the first dimension, a limited number of workgroups that is smaller than the first number of workgroups. The parser circuitry may iterate through workgroups in both the first and second dimensions to generate the set of workgroups, proceeding through the first sub-kernel before iterating through any of the other sub-kernels. Disclosed techniques may provide desirable shapes for batches of workgroups.

Type: Application

Filed: May 24, 2024

Publication date: October 17, 2024

Inventors: Andrew M. Havlir, Ajay Simha Modugala, Karl D. Mann
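
The iteration order described in the abstract above can be sketched for a two-dimensional kernel. The function name `iterate_workgroups` and the parameter `limit_x` are hypothetical, and the sketch assumes the limit applies only to the first dimension and that every sub-kernel spans the full second dimension.

```python
def iterate_workgroups(num_x, num_y, limit_x):
    """Yield (x, y) workgroup ids, finishing one sub-kernel before the next.

    Each sub-kernel covers at most `limit_x` workgroups in the first
    dimension and all `num_y` workgroups in the second dimension.
    """
    for start_x in range(0, num_x, limit_x):
        end_x = min(start_x + limit_x, num_x)
        for y in range(num_y):
            for x in range(start_x, end_x):
                yield (x, y)


order = list(iterate_workgroups(num_x=4, num_y=2, limit_x=2))
# order == [(0, 0), (1, 0), (0, 1), (1, 1), (2, 0), (3, 0), (2, 1), (3, 1)]
```

With a limit of 2 in the first dimension, consecutive workgroups form 2x2 tiles rather than 4x1 rows, which illustrates what the abstract means by shaping batches of workgroups.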

Compute kernel parsing with limits in one or more dimensions with iterating through workgroups in the one or more dimensions for execution

Patent number: 12020075

Abstract: Techniques are disclosed relating to dispatching compute work from a compute stream. In some embodiments, a graphics processor executes instructions of compute kernels. Workload parser circuitry may determine, for distribution to the graphics processor circuitry, a set of workgroups from a compute kernel that includes workgroups organized in multiple dimensions, including a first number of workgroups in a first dimension and a second number of workgroups in a second dimension. This may include determining multiple sub-kernels for the compute kernel, wherein a first sub-kernel includes, in the first dimension, a limited number of workgroups that is smaller than the first number of workgroups. The parser circuitry may iterate through workgroups in both the first and second dimensions to generate the set of workgroups, proceeding through the first sub-kernel before iterating through any of the other sub-kernels. Disclosed techniques may provide desirable shapes for batches of workgroups.

Type: Grant

Filed: September 11, 2020

Date of Patent: June 25, 2024

Assignee: Apple Inc.

Inventors: Andrew M. Havlir, Ajay Simha Modugala, Karl D. Mann

Preemption Techniques for Memory-Backed Registers

Publication number: 20240095176

Abstract: Techniques are disclosed relating to thread preemption in the context of memory-backed registers. In some embodiments, a memory hierarchy includes one or more cache levels and one or more memory circuits. Execution circuitry may operate on operands in architectural registers to execute instructions of threads, where data for the architectural registers is stored and backed by the memory hierarchy. Control circuitry may, in response to a context switch indication for a given thread: flush and invalidate a set of architectural register data from a first cache level and store memory page information (e.g., a page catalog base address) associated with the set of architectural register data.

Type: Application

Filed: November 10, 2022

Publication date: March 21, 2024

Inventors: Benjiman L. Goodman, Yoong Chert Foo, Karl D. Mann, Terence M. Potter, Frank W. Liljeros, Jeffrey T. Brady

On-demand Memory Allocation

Publication number: 20240045808

Abstract: Techniques are disclosed relating to dynamically allocating and mapping private memory for requesting circuitry. Disclosed circuitry may receive a private address and translate the private address to a virtual address (which an MMU may then translate to a physical address to actually access a storage element). In some embodiments, private memory allocation circuitry is configured to generate page table information and map private memory pages for requests if the page table information is not already set up. In various embodiments, this may advantageously allow dynamic private memory allocation, e.g., to efficiently allocate memory for graphics shaders with different types of workloads. Disclosed caching techniques for page table information may improve performance relative to traditional techniques. Further, disclosed embodiments may facilitate memory consolidation across a device such as a graphics processor.

Type: Application

Filed: October 19, 2023

Publication date: February 8, 2024

Inventors: Justin A. Hensley, Karl D. Mann, Yoong Chert Foo, Terence M. Potter, Frank W. Liljeros, Ralph C. Taylor

On-demand memory allocation

Patent number: 11829298

Abstract: Techniques are disclosed relating to dynamically allocating and mapping private memory for requesting circuitry. Disclosed circuitry may receive a private address and translate the private address to a virtual address (which an MMU may then translate to a physical address to actually access a storage element). In some embodiments, private memory allocation circuitry is configured to generate page table information and map private memory pages for requests if the page table information is not already set up. In various embodiments, this may advantageously allow dynamic private memory allocation, e.g., to efficiently allocate memory for graphics shaders with different types of workloads. Disclosed caching techniques for page table information may improve performance relative to traditional techniques. Further, disclosed embodiments may facilitate memory consolidation across a device such as a graphics processor.

Type: Grant

Filed: February 28, 2020

Date of Patent: November 28, 2023

Assignee: Apple Inc.

Inventors: Justin A. Hensley, Karl D. Mann, Yoong Chert Foo, Terence M. Potter, Frank W. Liljeros, Ralph C. Taylor

Compression techniques and hierarchical caching

Patent number: 11488350

Abstract: Techniques are disclosed relating to compression of data stored at different cache levels. In some embodiments, a memory system implements a storage hierarchy that includes first cache circuitry and second cache circuitry at different levels of the hierarchy. Processor circuitry generates write data to be written to the memory system. In some embodiments, first compression circuitry is configured to compress a first block of write data in response to full accumulation of the first block in the first cache circuitry and second compression circuitry is configured to compress a second block of write data in response to full accumulation of the second block in the second cache circuitry. Write circuitry may write the first and second compressed blocks of data in a single combined write to a higher level in the storage hierarchy.

Type: Grant

Filed: June 4, 2021

Date of Patent: November 1, 2022

Assignee: Apple Inc.

Inventors: Anthony P. DeLaurier, Karl D. Mann, Tyson J. Bergland, Winnie W. Yeung

Compute Kernel Parsing with Limits in one or more Dimensions

Publication number: 20220083377

Abstract: Techniques are disclosed relating to dispatching compute work from a compute stream. In some embodiments, a graphics processor executes instructions of compute kernels. Workload parser circuitry may determine, for distribution to the graphics processor circuitry, a set of workgroups from a compute kernel that includes workgroups organized in multiple dimensions, including a first number of workgroups in a first dimension and a second number of workgroups in a second dimension. This may include determining multiple sub-kernels for the compute kernel, wherein a first sub-kernel includes, in the first dimension, a limited number of workgroups that is smaller than the first number of workgroups. The parser circuitry may iterate through workgroups in both the first and second dimensions to generate the set of workgroups, proceeding through the first sub-kernel before iterating through any of the other sub-kernels. Disclosed techniques may provide desirable shapes for batches of workgroups.

Type: Application

Filed: September 11, 2020

Publication date: March 17, 2022

Inventors: Andrew M. Havlir, Ajay Simha Modugala, Karl D. Mann

Compression Techniques and Hierarchical Caching

Publication number: 20210295593

Abstract: Techniques are disclosed relating to compression of data stored at different cache levels. In some embodiments, a memory system implements a storage hierarchy that includes first cache circuitry and second cache circuitry at different levels of the hierarchy. Processor circuitry generates write data to be written to the memory system. In some embodiments, first compression circuitry is configured to compress a first block of write data in response to full accumulation of the first block in the first cache circuitry and second compression circuitry is configured to compress a second block of write data in response to full accumulation of the second block in the second cache circuitry. Write circuitry may write the first and second compressed blocks of data in a single combined write to a higher level in the storage hierarchy.

Type: Application

Filed: June 4, 2021

Publication date: September 23, 2021

Inventors: Anthony P. DeLaurier, Karl D. Mann, Tyson J. Bergland, Winnie W. Yeung

Multi-space rendering with configurable transformation parameters

Patent number: 11113788

Abstract: Techniques are disclosed relating to rendering graphics objects. In some embodiments, a graphics unit is configured to transform graphics objects from a virtual space into a second space according to different transformation parameters for different portions of the second space. This may result in sampling different portions of the virtual space at different sample rates, which may reduce the number of samples required in various stages of the rendering process. In the disclosed techniques, transformation may occur prior to rasterization and shading, which may further reduce computation and power consumption in a graphics unit, improve image quality as displayed to a user, and/or reduce bandwidth usage or latency of video content on a network. In some embodiments, a transformed image may be viewed through a distortion-compensating lens or resampled prior to display.

Type: Grant

Filed: August 24, 2020

Date of Patent: September 7, 2021

Assignee: Apple Inc.

Inventors: Justin A. Hensley, Karl D. Mann, Ralph C. Taylor, Randall R. Rauwendaal, Jonathan M. Redshaw

On-demand Memory Allocation

Publication number: 20210271606

Abstract: Techniques are disclosed relating to dynamically allocating and mapping private memory for requesting circuitry. Disclosed circuitry may receive a private address and translate the private address to a virtual address (which an MMU may then translate to a physical address to actually access a storage element). In some embodiments, private memory allocation circuitry is configured to generate page table information and map private memory pages for requests if the page table information is not already set up. In various embodiments, this may advantageously allow dynamic private memory allocation, e.g., to efficiently allocate memory for graphics shaders with different types of workloads. Disclosed caching techniques for page table information may improve performance relative to traditional techniques. Further, disclosed embodiments may facilitate memory consolidation across a device such as a graphics processor.

Type: Application

Filed: February 28, 2020

Publication date: September 2, 2021

Inventors: Justin A. Hensley, Karl D. Mann, Yoong Chert Foo, Terence M. Potter, Frank W. Liljeros, Ralph C. Taylor

Dependency scheduling for control stream in parallel processor

Patent number: 11080101

Abstract: Techniques are disclosed relating to processing a control stream such as a compute control stream. In some embodiments, the control stream includes kernels and commands for multiple substreams. In some embodiments, multiple substream processors are each configured to: fetch and parse portions of the control stream corresponding to an assigned substream and, in response to a neighbor barrier command in the assigned substream that identifies another substream, communicate the identified other substream to barrier clearing circuitry. In some embodiments, the barrier clearing circuitry is configured to determine whether to allow the assigned substream to proceed past the neighbor barrier command based on communication of a most-recently-completed command from a substream processor to which the other substream is assigned (e.g., based on whether the most-recently-completed command meets a command identifier communicated in the neighbor barrier command).

Type: Grant

Filed: March 22, 2019

Date of Patent: August 3, 2021

Assignee: Apple Inc.

Inventors: Andrew M. Havlir, Jason D. Carroll, Karl D. Mann
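
The neighbor barrier check in the abstract above can be modeled in a few lines. This is a sketch under two assumptions that the abstract does not confirm: command identifiers within a substream increase monotonically, and a completed command "meets" the barrier's identifier when it is greater than or equal to it. The class and method names are hypothetical.

```python
class BarrierClearing:
    """Toy model of barrier clearing between substreams."""

    def __init__(self):
        # substream id -> most recently completed command id
        self.last_completed = {}

    def report_completion(self, substream, command_id):
        """Called by a substream processor when a command completes."""
        self.last_completed[substream] = command_id

    def may_proceed(self, other_substream, required_command_id):
        """Decide whether a neighbor barrier naming `other_substream` clears."""
        done = self.last_completed.get(other_substream)
        # Assumption: ids are monotonic, so >= means the command has finished.
        return done is not None and done >= required_command_id


clearing = BarrierClearing()
blocked_before = clearing.may_proceed(other_substream=1, required_command_id=5)
clearing.report_completion(substream=1, command_id=4)
blocked_at_4 = clearing.may_proceed(other_substream=1, required_command_id=5)
clearing.report_completion(substream=1, command_id=5)
cleared_at_5 = clearing.may_proceed(other_substream=1, required_command_id=5)
```

The waiting substream only needs the other substream's latest completed identifier, so the check is a single comparison rather than a scan of outstanding commands.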

Compression techniques for pixel write data

Patent number: 11062507

Abstract: Techniques are disclosed relating to compression of data stored at different cache levels. In some embodiments, programmable shader circuitry is configured to execute program instructions of compute kernels that write pixel data. In some embodiments, a first cache is configured to store pixel write data from the programmable shader circuitry and first compression circuitry is configured to compress a first block of pixel write data in response to full accumulation of the first block in the first cache circuitry. In some embodiments, second cache circuitry is configured to store pixel write data from the programmable shader circuitry at a higher level in a storage hierarchy than the first cache circuitry and second compression circuitry is configured to compress a second block of pixel write data in response to full accumulation of the second block in the second cache circuitry.

Type: Grant

Filed: November 4, 2019

Date of Patent: July 13, 2021

Assignee: Apple Inc.

Inventors: Anthony P. DeLaurier, Karl D. Mann, Tyson J. Bergland, Winnie W. Yeung

Compression Techniques for Pixel Write Data

Publication number: 20210134052

Abstract: Techniques are disclosed relating to compression of data stored at different cache levels. In some embodiments, programmable shader circuitry is configured to execute program instructions of compute kernels that write pixel data. In some embodiments, a first cache is configured to store pixel write data from the programmable shader circuitry and first compression circuitry is configured to compress a first block of pixel write data in response to full accumulation of the first block in the first cache circuitry. In some embodiments, second cache circuitry is configured to store pixel write data from the programmable shader circuitry at a higher level in a storage hierarchy than the first cache circuitry and second compression circuitry is configured to compress a second block of pixel write data in response to full accumulation of the second block in the second cache circuitry.

Type: Application

Filed: November 4, 2019

Publication date: May 6, 2021

Inventors: Anthony P. DeLaurier, Karl D. Mann, Tyson J. Bergland, Winnie W. Yeung

Multi-Space Rendering with Configurable Transformation Parameters

Publication number: 20200388007

Abstract: Techniques are disclosed relating to rendering graphics objects. In some embodiments, a graphics unit is configured to transform graphics objects from a virtual space into a second space according to different transformation parameters for different portions of the second space. This may result in sampling different portions of the virtual space at different sample rates, which may reduce the number of samples required in various stages of the rendering process. In the disclosed techniques, transformation may occur prior to rasterization and shading, which may further reduce computation and power consumption in a graphics unit, improve image quality as displayed to a user, and/or reduce bandwidth usage or latency of video content on a network. In some embodiments, a transformed image may be viewed through a distortion-compensating lens or resampled prior to display.

Type: Application

Filed: August 24, 2020

Publication date: December 10, 2020

Inventors: Justin A. Hensley, Karl D. Mann, Ralph C. Taylor, Randall R. Rauwendaal, Jonathan M. Redshaw

Dependency Scheduling for Control Stream in Parallel Processor

Publication number: 20200301753

Abstract: Techniques are disclosed relating to processing a control stream such as a compute control stream. In some embodiments, the control stream includes kernels and commands for multiple substreams. In some embodiments, multiple substream processors are each configured to: fetch and parse portions of the control stream corresponding to an assigned substream and, in response to a neighbor barrier command in the assigned substream that identifies another substream, communicate the identified other substream to barrier clearing circuitry. In some embodiments, the barrier clearing circuitry is configured to determine whether to allow the assigned substream to proceed past the neighbor barrier command based on communication of a most-recently-completed command from a substream processor to which the other substream is assigned (e.g., based on whether the most-recently-completed command meets a command identifier communicated in the neighbor barrier command).

Type: Application

Filed: March 22, 2019

Publication date: September 24, 2020

Inventors: Andrew M. Havlir, Jason D. Carroll, Karl D. Mann

Multi-space rendering with configurable transformation parameters

Patent number: 10755383

Abstract: Techniques are disclosed relating to rendering graphics objects. In some embodiments, a graphics unit is configured to transform graphics objects from a virtual space into a second space according to different transformation parameters for different portions of the second space. This may result in sampling different portions of the virtual space at different sample rates, which may reduce the number of samples required in various stages of the rendering process. In the disclosed techniques, transformation may occur prior to rasterization and shading, which may further reduce computation and power consumption in a graphics unit, improve image quality as displayed to a user, and/or reduce bandwidth usage or latency of video content on a network. In some embodiments, a transformed image may be viewed through a distortion-compensating lens or resampled prior to display.

Type: Grant

Filed: September 13, 2018

Date of Patent: August 25, 2020

Assignee: Apple Inc.

Inventors: Justin A. Hensley, Karl D. Mann, Ralph C. Taylor, Randall R. Rauwendaal, Jonathan M. Redshaw

Assigning resources for control signalling for a wireless device

Patent number: 10425948

Abstract: Assigning resources for control signalling for a wireless device is described wherein the resources can be used for user data when not used for control signalling. A network node obtains a collection of at least one configuration group, wherein each configuration group refers to a selection of the resources; determines a load for each one of the configuration groups; when no configuration group of the collection has a load which is less than a first threshold value, instantiates a new configuration group referring to a selection of the resources, and assigns at least part of the resources of the new configuration group to the wireless device; and when there is a first configuration group in the collection having a load which is less than the first threshold value, assigns at least part of the resources of the first configuration group to the wireless device.

Type: Grant

Filed: November 20, 2015

Date of Patent: September 24, 2019

Assignee: Telefonaktiebolaget L M Ericsson (publ)

Inventors: Anders Johansson, Ying Sun, Karl D. Mann
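
The threshold rule in the abstract above reduces to a short selection routine. The sketch below is illustrative only: representing a configuration group as a dictionary with a `load` field, the function name `assign_group`, and the `new_group_factory` callback are all assumptions, and how load is measured is not specified in the abstract.

```python
def assign_group(groups, threshold, new_group_factory):
    """Pick a configuration group for a wireless device.

    Returns the first group whose load is below `threshold`. If none
    qualifies, instantiates a new group, adds it to the collection,
    and returns it.
    """
    for group in groups:
        if group["load"] < threshold:
            return group
    new_group = new_group_factory()
    groups.append(new_group)
    return new_group


def make_group():
    # Hypothetical factory: a fresh group starts with zero load.
    return {"name": "new", "load": 0.0}


groups = [{"name": "a", "load": 0.9}, {"name": "b", "load": 0.4}]
reused = assign_group(groups, threshold=0.5, new_group_factory=make_group)
created = assign_group(groups, threshold=0.3, new_group_factory=make_group)
```

In the first call group "b" is under the threshold and is reused; in the second call no existing group qualifies, so a new one is instantiated, matching the two branches in the abstract.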

Multi-Space Rendering with Configurable Transformation Parameters

Publication number: 20190102865

Abstract: Techniques are disclosed relating to rendering graphics objects. In some embodiments, a graphics unit is configured to transform graphics objects from a virtual space into a second space according to different transformation parameters for different portions of the second space. This may result in sampling different portions of the virtual space at different sample rates, which may reduce the number of samples required in various stages of the rendering process. In the disclosed techniques, transformation may occur prior to rasterization and shading, which may further reduce computation and power consumption in a graphics unit, improve image quality as displayed to a user, and/or reduce bandwidth usage or latency of video content on a network. In some embodiments, a transformed image may be viewed through a distortion-compensating lens or resampled prior to display.

Type: Application

Filed: September 13, 2018

Publication date: April 4, 2019

Inventors: Justin A. Hensley, Karl D. Mann, Ralph C. Taylor, Randall R. Rauwendaal, Jonathan M. Redshaw
