Patents by Inventor Ziyad Hakura

Ziyad Hakura has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230360305
    Abstract: In embodiments, a graphics pipeline includes logic that can assess whether to enable or disable tiled rendering for sets of graphics primitives. The logic applies one or more rules or heuristics to a set of graphics primitives associated with a frame, and determines to enable tiled rendering for that set of graphics primitives if the one or more rules or heuristics are satisfied. Otherwise, the logic determines to disable tiled rendering for that set of graphics primitives. As further graphics primitives are received for the frame, the logic may make additional decisions as to whether or not to render the further graphics primitives using tiled rendering.
    Type: Application
    Filed: May 9, 2022
    Publication date: November 9, 2023
    Inventors: Ziyad Hakura, Sriram Venkateshan, Sharad Raj
  • Patent number: 11468630
    Abstract: The disclosure provides a cloud-based renderer and methods of rendering a scene on a computing system using a combination of raytracing and rasterization. In one example, a method of rendering a scene includes: (1) generating at least one raytracing acceleration structure from scene data of the scene, (2) selecting raytracing and rasterization algorithms for rendering the scene based on the scene data, and (3) rendering the scene utilizing a combination of the raytracing algorithms and the rasterization algorithms, wherein the rasterization algorithms utilize primitive cluster data from the raytracing acceleration structures.
    Type: Grant
    Filed: December 1, 2020
    Date of Patent: October 11, 2022
    Assignee: NVIDIA Corporation
    Inventors: Christoph Kubisch, Ziyad Hakura, Manuel Kraemer
  • Patent number: 11016802
    Abstract: In various embodiments, an ordered atomic operation enables a parallel processing subsystem to execute an atomic operation associated with a memory location in a specified order relative to other ordered atomic operations associated with the memory location. A level 2 (L2) cache slice includes an atomic processing circuit and a content-addressable memory (CAM). The CAM stores an ordered atomic operation specifying at least a memory address, an atomic operation, and an ordering number. In operation, the atomic processing circuit performs a look-up operation on the CAM, where the look-up operation specifies the memory address. After the atomic processing circuit determines that the ordering number is equal to a current ordering number associated with the memory address, the atomic processing circuit executes the atomic operation and returns the result to a processor executing an algorithm.
    Type: Grant
    Filed: January 26, 2018
    Date of Patent: May 25, 2021
    Assignee: NVIDIA Corporation
    Inventors: Ziyad Hakura, Olivier Giroux, Wishwesh Gandhi
  • Publication number: 20210082177
    Abstract: The disclosure provides a cloud-based renderer and methods of rendering a scene on a computing system using a combination of raytracing and rasterization. In one example, a method of rendering a scene includes: (1) generating at least one raytracing acceleration structure from scene data of the scene, (2) selecting raytracing and rasterization algorithms for rendering the scene based on the scene data, and (3) rendering the scene utilizing a combination of the raytracing algorithms and the rasterization algorithms, wherein the rasterization algorithms utilize primitive cluster data from the raytracing acceleration structures.
    Type: Application
    Filed: December 1, 2020
    Publication date: March 18, 2021
    Inventors: Christoph Kubisch, Ziyad Hakura, Manuel Kraemer
  • Patent number: 10909739
    Abstract: In various embodiments, a parallel processor implements a graphics processing pipeline that generates rendered images. In operation, the parallel processor causes execution threads to execute a task shading program on an input mesh to generate a task shader output specifying a mesh shader count. The parallel processor then generates mesh shader identifiers, where the total number of the mesh shader identifiers equals the mesh shader count. For each mesh shader identifier, the parallel processor invokes a mesh shader based on the mesh shader identifier and the task shader output to generate geometry associated with the mesh shader identifier. Subsequently, the parallel processor performs operations on the geometries associated with the mesh shader identifiers to generate a rendered image. Advantageously, unlike conventional graphics processing pipelines, the performance of the graphics processing pipeline is not limited by a primitive distributor.
    Type: Grant
    Filed: January 26, 2018
    Date of Patent: February 2, 2021
    Assignee: NVIDIA Corporation
    Inventors: Ziyad Hakura, Yury Uralsky, Christoph Kubisch, Pierre Boudier, Henry Moreton
  • Patent number: 10878611
    Abstract: In various embodiments, a deduplication application pre-processes index buffers for a graphics processing pipeline that generates rendered images via a shading program. In operation, the deduplication application causes execution threads to identify a set of unique vertices specified in an index buffer based on an instruction. The deduplication application then generates a vertex buffer and an indirect index buffer based on the set of unique vertices. The vertex buffer and the indirect index buffer are associated with a portion of an input mesh. The graphics processing pipeline then renders a first frame and a second frame based on the vertex buffer, the indirect index buffer, and the shading program. Advantageously, the graphics processing pipeline may re-use the vertex buffer and indirect index buffer until the topology of the input mesh changes.
    Type: Grant
    Filed: January 26, 2018
    Date of Patent: December 29, 2020
    Assignee: NVIDIA Corporation
    Inventors: Ziyad Hakura, Yury Uralsky, Christoph Kubisch, Pierre Boudier, Henry Moreton
  • Patent number: 10853994
    Abstract: The disclosure is directed to methods and processes of rendering a complex scene using a combination of raytracing and rasterization. The methods and processes can be implemented in a video driver or software library. A developer of an application can provide information to an application programming interface (API) call as if a conventional raytrace API is being called. The methods and processes can analyze the scene using a variety of parameters to determine a grouping of objects within the scene. The rasterization algorithm can use as input primitive cluster data retrieved from raytracing acceleration structures. Each group of objects can be rendered using its own balance of raytracing and rasterization to improve rendering performance while maintaining a visual quality target level.
    Type: Grant
    Filed: May 23, 2019
    Date of Patent: December 1, 2020
    Assignee: NVIDIA Corporation
    Inventors: Christoph Kubisch, Ziyad Hakura, Manuel Kraemer
  • Publication number: 20200372703
    Abstract: The disclosure is directed to methods and processes of rendering a complex scene using a combination of raytracing and rasterization. The methods and processes can be implemented in a video driver or software library. A developer of an application can provide information to an application programming interface (API) call as if a conventional raytrace API is being called. The methods and processes can analyze the scene using a variety of parameters to determine a grouping of objects within the scene. The rasterization algorithm can use as input primitive cluster data retrieved from raytracing acceleration structures. Each group of objects can be rendered using its own balance of raytracing and rasterization to improve rendering performance while maintaining a visual quality target level.
    Type: Application
    Filed: May 23, 2019
    Publication date: November 26, 2020
    Inventors: Christoph Kubisch, Ziyad Hakura, Manuel Kraemer
  • Patent number: 10600229
    Abstract: In various embodiments, a parallel processor implements a graphics processing pipeline that generates rendered images via a shading program. In operation, the parallel processor causes a first set of execution threads to execute the shading program on a first portion of the input mesh to generate first geometry stored in an on-chip memory. The parallel processor also causes a second set of execution threads to execute the mesh shading program on a second portion of the input mesh to generate second geometry stored in the on-chip memory. Subsequently, the parallel processor reads the first geometry and the second geometry from the on-chip memory, and performs operations on the first geometry and the second geometry to generate a rendered image derived from the input mesh. Advantageously, unlike conventional graphics processing pipelines, the performance of the graphics processing pipeline is not limited by a primitive distributor.
    Type: Grant
    Filed: January 26, 2018
    Date of Patent: March 24, 2020
    Assignee: NVIDIA Corporation
    Inventors: Ziyad Hakura, Yury Uralsky, Christoph Kubisch, Pierre Boudier, Henry Moreton
  • Patent number: 10453168
    Abstract: A tile coalescer within a graphics processing pipeline coalesces coverage data into tiles. The coverage data indicates, for a set of XY positions, whether a graphics primitive covers those XY positions. The tile indicates, for a larger set of XY positions, whether one or more graphics primitives cover those XY positions. The tile coalescer includes coverage data in the tile only once for each XY position, thereby allowing the API ordering of the graphics primitives covering each XY position to be preserved. The tile is then distributed to a set of streaming multiprocessors for shading and blending operations. The different streaming multiprocessors execute thread groups to process the tile. In doing so, those thread groups may perform read-modify-write operations with data stored in memory. Each such thread group is scheduled to execute via atomic operations, and according to the API order of the associated graphics primitives.
    Type: Grant
    Filed: August 17, 2018
    Date of Patent: October 22, 2019
    Assignee: NVIDIA Corporation
    Inventors: Ziyad Hakura, Eric Lum, Dale Kirkland, Jack Choquette, Patrick R. Brown, Yury Y. Uralsky, Jeffrey Bolz
  • Patent number: 10430989
    Abstract: A multi-pass unit interoperates with a device driver to configure a screen space pipeline to perform multiple processing passes with buffered graphics primitives. The multi-pass unit receives primitive data and state bundles from the device driver. The primitive data includes a graphics primitive and a primitive mask. The primitive mask indicates the specific passes when the graphics primitive should be processed. The state bundles include one or more state settings and a state mask. The state mask indicates the specific passes where the state settings should be applied. The primitives and state settings are interleaved. For a given pass, the multi-pass unit extracts the interleaved state settings for that pass and configures the screen space pipeline according to those state settings. The multi-pass unit also extracts the interleaved graphics primitives to be processed in that pass. Then, the multi-pass unit causes the screen space pipeline to process those graphics primitives.
    Type: Grant
    Filed: November 25, 2015
    Date of Patent: October 1, 2019
    Assignee: NVIDIA Corporation
    Inventors: Ziyad Hakura, Cynthia Allison, Dale Kirkland, Jeffrey Bolz, Yury Uralsky, Jonah Alben
  • Publication number: 20190236829
    Abstract: In various embodiments, a deduplication application pre-processes index buffers for a graphics processing pipeline that generates rendered images via a shading program. In operation, the deduplication application causes execution threads to identify a set of unique vertices specified in an index buffer based on an instruction. The deduplication application then generates a vertex buffer and an indirect index buffer based on the set of unique vertices. The vertex buffer and the indirect index buffer are associated with a portion of an input mesh. The graphics processing pipeline then renders a first frame and a second frame based on the vertex buffer, the indirect index buffer, and the shading program. Advantageously, the graphics processing pipeline may re-use the vertex buffer and indirect index buffer until the topology of the input mesh changes.
    Type: Application
    Filed: January 26, 2018
    Publication date: August 1, 2019
    Inventors: Ziyad Hakura, Yury Uralsky, Christoph Kubisch, Pierre Boudier, Henry Moreton
  • Publication number: 20190235915
    Abstract: In various embodiments, an ordered atomic operation enables a parallel processing subsystem to execute an atomic operation associated with a memory location in a specified order relative to other ordered atomic operations associated with the memory location. A level 2 (L2) cache slice includes an atomic processing circuit and a content-addressable memory (CAM). The CAM stores an ordered atomic operation specifying at least a memory address, an atomic operation, and an ordering number. In operation, the atomic processing circuit performs a look-up operation on the CAM, where the look-up operation specifies the memory address. After the atomic processing circuit determines that the ordering number is equal to a current ordering number associated with the memory address, the atomic processing circuit executes the atomic operation and returns the result to a processor executing an algorithm.
    Type: Application
    Filed: January 26, 2018
    Publication date: August 1, 2019
    Inventors: Ziyad Hakura, Olivier Giroux, Wishwesh Gandhi
  • Publication number: 20190236827
    Abstract: In various embodiments, a parallel processor implements a graphics processing pipeline that generates rendered images via a shading program. In operation, the parallel processor causes a first set of execution threads to execute the shading program on a first portion of the input mesh to generate first geometry stored in an on-chip memory. The parallel processor also causes a second set of execution threads to execute the mesh shading program on a second portion of the input mesh to generate second geometry stored in the on-chip memory. Subsequently, the parallel processor reads the first geometry and the second geometry from the on-chip memory, and performs operations on the first geometry and the second geometry to generate a rendered image derived from the input mesh. Advantageously, unlike conventional graphics processing pipelines, the performance of the graphics processing pipeline is not limited by a primitive distributor.
    Type: Application
    Filed: January 26, 2018
    Publication date: August 1, 2019
    Inventors: Ziyad Hakura, Yury Uralsky, Christoph Kubisch, Pierre Boudier, Henry Moreton
  • Publication number: 20190236828
    Abstract: In various embodiments, a parallel processor implements a graphics processing pipeline that generates rendered images. In operation, the parallel processor causes execution threads to execute a task shading program on an input mesh to generate a task shader output specifying a mesh shader count. The parallel processor then generates mesh shader identifiers, where the total number of the mesh shader identifiers equals the mesh shader count. For each mesh shader identifier, the parallel processor invokes a mesh shader based on the mesh shader identifier and the task shader output to generate geometry associated with the mesh shader identifier. Subsequently, the parallel processor performs operations on the geometries associated with the mesh shader identifiers to generate a rendered image. Advantageously, unlike conventional graphics processing pipelines, the performance of the graphics processing pipeline is not limited by a primitive distributor.
    Type: Application
    Filed: January 26, 2018
    Publication date: August 1, 2019
    Inventors: Ziyad Hakura, Yury Uralsky, Christoph Kubisch, Pierre Boudier, Henry Moreton
  • Patent number: 10332310
    Abstract: One embodiment of the present invention includes a technique for distributing work slices associated with a graphics processing unit for processing. A primitive distribution system receives a draw command related to a graphics object associated with a plurality of indices. The primitive distribution system creates a plurality of work slices, where each work slice is associated with a different subset of the indices included in the plurality of indices. The primitive distribution system scans a first subset of indices to identify a first set of characteristics that is needed to process a second subset of indices. The primitive distribution system processes the second subset of indices based at least in part on the one or more characteristics.
    Type: Grant
    Filed: December 22, 2015
    Date of Patent: June 25, 2019
    Assignee: NVIDIA Corporation
    Inventors: Niket Agrawal, Amit Jain, Dale Kirkland, Karim Abdalla, Ziyad Hakura, Haren Kethareswaran
  • Publication number: 20180374185
    Abstract: A tile coalescer within a graphics processing pipeline coalesces coverage data into tiles. The coverage data indicates, for a set of XY positions, whether a graphics primitive covers those XY positions. The tile indicates, for a larger set of XY positions, whether one or more graphics primitives cover those XY positions. The tile coalescer includes coverage data in the tile only once for each XY position, thereby allowing the API ordering of the graphics primitives covering each XY position to be preserved. The tile is then distributed to a set of streaming multiprocessors for shading and blending operations. The different streaming multiprocessors execute thread groups to process the tile. In doing so, those thread groups may perform read-modify-write operations with data stored in memory. Each such thread group is scheduled to execute via atomic operations, and according to the API order of the associated graphics primitives.
    Type: Application
    Filed: August 17, 2018
    Publication date: December 27, 2018
    Inventors: Ziyad Hakura, Eric Lum, Dale Kirkland, Jack Choquette, Patrick R. Brown, Yury Y. Uralsky, Jeffrey Bolz
  • Patent number: 10147222
    Abstract: A multi-pass unit interoperates with a device driver to configure a screen space pipeline to perform multiple processing passes with buffered graphics primitives. The multi-pass unit receives primitive data and state bundles from the device driver. The primitive data includes a graphics primitive and a primitive mask. The primitive mask indicates the specific passes when the graphics primitive should be processed. The state bundles include one or more state settings and a state mask. The state mask indicates the specific passes where the state settings should be applied. The primitives and state settings are interleaved. For a given pass, the multi-pass unit extracts the interleaved state settings for that pass and configures the screen space pipeline according to those state settings. The multi-pass unit also extracts the interleaved graphics primitives to be processed in that pass. Then, the multi-pass unit causes the screen space pipeline to process those graphics primitives.
    Type: Grant
    Filed: November 25, 2015
    Date of Patent: December 4, 2018
    Assignee: NVIDIA Corporation
    Inventors: Ziyad Hakura, Cynthia Allison, Dale Kirkland, Jeffrey Bolz, Yury Uralsky, Jonah Alben
  • Patent number: 10055806
    Abstract: A tile coalescer within a graphics processing pipeline coalesces coverage data into tiles. The coverage data indicates, for a set of XY positions, whether a graphics primitive covers those XY positions. The tile indicates, for a larger set of XY positions, whether one or more graphics primitives cover those XY positions. The tile coalescer includes coverage data in the tile only once for each XY position, thereby allowing the API ordering of the graphics primitives covering each XY position to be preserved. The tile is then distributed to a set of streaming multiprocessors for shading and blending operations. The different streaming multiprocessors execute thread groups to process the tile. In doing so, those thread groups may perform read-modify-write operations with data stored in memory. Each such thread group is scheduled to execute via atomic operations, and according to the API order of the associated graphics primitives.
    Type: Grant
    Filed: October 27, 2015
    Date of Patent: August 21, 2018
    Assignee: NVIDIA Corporation
    Inventors: Ziyad Hakura, Eric Lum, Dale Kirkland, Jack Choquette, Patrick R. Brown, Yury Y. Uralsky, Jeffrey Bolz
  • Patent number: 10032245
    Abstract: A tile coalescer within a graphics processing pipeline coalesces coverage data into tiles. The coverage data indicates, for a set of XY positions, whether a graphics primitive covers those XY positions. The tile indicates, for a larger set of XY positions, whether one or more graphics primitives cover those XY positions. The tile coalescer includes coverage data in the tile only once for each XY position, thereby allowing the API ordering of the graphics primitives covering each XY position to be preserved. The tile is then distributed to a set of streaming multiprocessors for shading and blending operations. The different streaming multiprocessors execute thread groups to process the tile. In doing so, those thread groups may perform read-modify-write operations with data stored in memory. Each such thread group is scheduled to execute via atomic operations, and according to the API order of the associated graphics primitives.
    Type: Grant
    Filed: October 27, 2015
    Date of Patent: July 24, 2018
    Assignee: NVIDIA Corporation
    Inventors: Ziyad Hakura, Eric Lum, Dale Kirkland, Jack Choquette, Patrick R. Brown, Yury Y. Uralsky, Jeffrey Bolz
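
The abstract for patent 10878611 (and its application, 20190236829) describes pre-processing an index buffer into a set of unique vertices, from which a vertex buffer and an indirect index buffer are generated and then reused across frames. The Python sketch below is only an illustration of that general idea and is not the patented method: it runs sequentially rather than across execution threads, and the function name `deduplicate_indices`, the variable names, and the choice to store original vertex indices in the "vertex buffer" are all assumptions made for this example.

```python
from typing import Dict, List, Sequence, Tuple


def deduplicate_indices(index_buffer: Sequence[int]) -> Tuple[List[int], List[int]]:
    """Split an index buffer into unique vertices plus an indirect index buffer.

    Hypothetical helper for illustration only. Returns:
      unique_vertices  - each distinct vertex index, in order of first use
      indirect_indices - for every entry of index_buffer, the position of
                         that vertex inside unique_vertices
    """
    slot_of: Dict[int, int] = {}       # vertex index -> slot in unique_vertices
    unique_vertices: List[int] = []
    indirect_indices: List[int] = []

    for vertex_index in index_buffer:
        slot = slot_of.get(vertex_index)
        if slot is None:
            # First time this vertex is referenced: give it a new slot.
            slot = len(unique_vertices)
            slot_of[vertex_index] = slot
            unique_vertices.append(vertex_index)
        indirect_indices.append(slot)

    return unique_vertices, indirect_indices


# Two triangles that share an edge (vertices 3 and 9).
unique, indirect = deduplicate_indices([7, 3, 9, 3, 9, 4])

# The original index buffer can be recovered from the two outputs.
reconstructed = [unique[i] for i in indirect]
```

Because the two outputs depend only on which vertices the index buffer references, they stay valid for as long as the mesh connectivity is unchanged, which matches the abstract's point that the buffers may be reused until the topology of the input mesh changes.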