Patents by Inventor Ignacio Llamas

Ignacio Llamas has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20200051316
    Abstract: A hardware-based traversal coprocessor provides acceleration of tree traversal operations searching for intersections between primitives represented in a tree data structure and a ray. The primitives may include opaque and alpha triangles used in generating a virtual scene. The hardware-based traversal coprocessor is configured to determine primitives intersected by the ray, and return intersection information to a streaming multiprocessor for further processing. The hardware-based traversal coprocessor is configured to provide a deterministic result of intersected triangles regardless of the order that the memory subsystem returns triangle range blocks for processing, while opportunistically eliminating alpha intersections that lie further along the length of the ray than closer opaque intersections.
    Type: Application
    Filed: August 10, 2018
    Publication date: February 13, 2020
    Inventors: Samuli LAINE, Tero KARRAS, Greg MUTHLER, William Parsons NEWHALL, JR., Ronald Charles BABICH, Ignacio LLAMAS, John BURGESS
  • Publication number: 20200050451
    Abstract: Systems and methods for an efficient and robust multiprocessor-coprocessor interface that may be used between a streaming multiprocessor and an acceleration coprocessor in a GPU are provided. According to an example implementation, in order to perform an acceleration of a particular operation using the coprocessor, the multiprocessor: issues a series of write instructions to write input data for the operation into coprocessor-accessible storage locations, issues an operation instruction to cause the coprocessor to execute the particular operation; and then issues a series of read instructions to read result data of the operation from coprocessor-accessible storage locations to multiprocessor-accessible storage locations.
    Type: Application
    Filed: August 10, 2018
    Publication date: February 13, 2020
    Inventors: Ronald Babich, John BURGESS, Jack CHOQUETTE, Tero KARRAS, Samuli LAINE, Ignacio LLAMAS, Gregory MUTHLER, William Parsons NEWHALL, JR.
  • Publication number: 20200051315
    Abstract: Methods and systems are described in some examples for changing the traversal of an acceleration data structure in a highly dynamic query-specific manner, with each query specifying test parameters, a test opcode and a mapping of test results to actions. In an example ray tracing implementation, traversal of a bounding volume hierarchy by a ray is performed with the default behavior of the traversal being changed in accordance with results of a test performed using the test opcode and test parameters specified in the ray data structure and another test parameter specified in a node of the bounding volume hierarchy. In an example implementation a traversal coprocessor is configured to perform the traversal of the bounding volume hierarchy.
    Type: Application
    Filed: August 10, 2018
    Publication date: February 13, 2020
    Inventors: Samuli Laine, Timo AILA, Tero KARRAS, Gregory MUTHLER, William Parsons NEWHALL, JR., Ronald Charles BABICH, JR., Craig KOLB, Ignacio LLAMAS
  • Publication number: 20190311531
    Abstract: In various examples, shader bindings may be recorded in a shader binding table that includes shader records. Geometry of a 3D scene may be instantiated using object instances, and each may be associated with a respective set of the shader records using a location identifier of the set of shader records in memory. The set of shader records may represent shader bindings for an object instance under various predefined conditions. One or more of these predefined conditions may be implicit in the way the shader records are arranged in memory (e.g., indexed by ray type, by sub-geometry, etc.). For example, a section selector value (e.g., a section index) may be computed to locate and select a shader record based at least in part on a result of a ray tracing query (e.g., what sub-geometry was hit, what ray type was traced, etc.).
    Type: Application
    Filed: April 5, 2019
    Publication date: October 10, 2019
    Inventors: Martin Stich, Ignacio Llamas, Steven Parker
  • Publication number: 20190287294
    Abstract: Disclosed approaches may leverage the actual spatial and reflective properties of a virtual environment—such as the size, shape, and orientation of a bidirectional reflectance distribution function (BRDF) lobe of a light path and its position relative to a reflection surface, a virtual screen, and a virtual camera—to produce, for a pixel, an anisotropic kernel filter having dimensions and weights that accurately reflect the spatial characteristics of the virtual environment as well as the reflective properties of the surface. In order to accomplish this, geometry may be computed that corresponds to a projection of a reflection of the BRDF lobe below the surface along a view vector to the pixel. Using this approach, the dimensions of the anisotropic filter kernel may correspond to the BRDF lobe to accurately reflect the spatial characteristics of the virtual environment as well as the reflective properties of the surface.
    Type: Application
    Filed: March 15, 2019
    Publication date: September 19, 2019
    Inventors: Shiqiu Liu, Christopher Ryan Wyman, Jon Hasselgren, Jacob Munkberg, Ignacio Llamas
  • Patent number: 10002031
    Abstract: A first thread is placed into a blocked state by causing the thread to perform a blocking pop operation on a hardware-accelerated, single-entry queue. When a synchronization event completes, a second thread may release the first thread from the blocked state pushing a data value onto the hardware accelerated, single-entry queue. The push operation satisfies the blocking pop operation, and the first thread is released.
    Type: Grant
    Filed: May 8, 2013
    Date of Patent: June 19, 2018
    Assignee: NVIDIA CORPORATION
    Inventors: Ignacio Llamas, James David Balfour
  • Patent number: 9928104
    Abstract: A system, method, and computer program product are provided for accessing a queue. The method includes receiving a first request to reserve a data record entry in a queue, updating a queue state block based on the first request, and returning a response to the request. A second request is received to commit the data record entry and the queue state block is updated based on the second request.
    Type: Grant
    Filed: June 19, 2013
    Date of Patent: March 27, 2018
    Assignee: NVIDIA Corporation
    Inventors: William J. Dally, James David Balfour, Ignacio Llamas Ubieto
  • Patent number: 9760968
    Abstract: A method for using a graphics processor by an electronic device for subdividing an input image into multiple sub-regions. For each particular sub-region, a data structure is created that identifies one or more primitives that are visible in each quad of the particular sub-region. Existing coverage of one or more quads is erased based on graphics state (GState) information resulting in surviving coverage for one or more quads.
    Type: Grant
    Filed: October 31, 2014
    Date of Patent: September 12, 2017
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Derek Lentz, Michael Shebanow, Ignacio Llamas
  • Patent number: 9697044
    Abstract: An application programming interface (API) provides various software constructs that allow a developer to assemble a processing pipeline having arbitrary structure and complexity. Once assembled, the processing pipeline is configured to include a set of interconnected pipestages. Those pipestages are associated with one or more different CTAs that may execute in parallel with one another on a parallel processing unit. The developer specifies the configuration of the pipestages, including the configuration of the different CTAs across all pipestages, as well as the different processing operations performed by each different CTA.
    Type: Grant
    Filed: May 21, 2013
    Date of Patent: July 4, 2017
    Assignee: NVIDIA Corporation
    Inventor: Ignacio Llamas
  • Publication number: 20170083373
    Abstract: One embodiment of the present invention sets forth a technique for performing nested kernel execution within a parallel processing subsystem. The technique involves enabling a parent thread to launch a nested child grid on the parallel processing subsystem, and enabling the parent thread to perform a thread synchronization barrier on the child grid for proper execution semantics between the parent thread and the child grid. This technique advantageously enables the parallel processing subsystem to perform a richer set of programming constructs, such as conditionally executed and nested operations and externally defined library functions without the additional complexity of CPU involvement.
    Type: Application
    Filed: December 2, 2016
    Publication date: March 23, 2017
    Inventors: Stephen Jones, Philip Alexander Cuadra, Daniel Elliot Wexler, Ignacio Llamas, Lacky V. Shah, Jerome F. Duluk, Christopher Lamb
  • Patent number: 9513975
    Abstract: One embodiment of the present invention sets forth a technique for performing nested kernel execution within a parallel processing subsystem. The technique involves enabling a parent thread to launch a nested child grid on the parallel processing subsystem, and enabling the parent thread to perform a thread synchronization barrier on the child grid for proper execution semantics between the parent thread and the child grid. This technique advantageously enables the parallel processing subsystem to perform a richer set of programming constructs, such as conditionally executed and nested operations and externally defined library functions without the additional complexity of CPU involvement.
    Type: Grant
    Filed: May 2, 2012
    Date of Patent: December 6, 2016
    Assignee: NVIDIA Corporation
    Inventors: Stephen Jones, Philip Alexander Cuadra, Daniel Elliot Wexler, Ignacio Llamas, Lacky V. Shah, Jerome F. Duluk, Jr., Christopher Lamb
  • Patent number: 9489245
    Abstract: One embodiment of the present invention enables threads executing on a processor to locally generate and execute work within that processor by way of work queues and command blocks. A device driver, as an initialization procedure for establishing memory objects that enable the threads to locally generate and execute work, generates a work queue, and sets a GP_GET pointer of the work queue to the first entry in the work queue. The device driver also, during the initialization procedure, sets a GP_PUT pointer of the work queue to the last free entry included in the work queue, thereby establishing a range of entries in the work queue into which new work generated by the threads can be loaded and subsequently executed by the processor. The threads then populate command blocks with generated work and point entries in the work queue to the command blocks to effect processor execution of the work stored in the command blocks.
    Type: Grant
    Filed: October 26, 2012
    Date of Patent: November 8, 2016
    Assignee: NVIDIA Corporation
    Inventors: Ignacio Llamas, Craig Ross Duttweiler, Jeffrey A. Bolz, Daniel Elliot Wexler
  • Patent number: 9355483
    Abstract: A system, method, and computer program product are provided for shading primitive fragments. A target buffer may be recast when shaded samples that are covered by a primitive fragment are generated at a first shading rate using a first sampling mode, the shaded samples are stored in the target buffer that is associated with the first sampling mode and the first shading rate, a second sampling mode is determined, and the target buffer is associated with the second sampling mode. A sampling mode and/or shading rate may be changed for a primitive. A primitive fragment that is associated with a first sampling mode and a first shading rate is received and a second sampling mode is determined for the primitive fragment. Shaded samples corresponding to the primitive fragment are generated, at a second shading rate, using the second sampling mode and the shaded samples are stored in a target buffer.
    Type: Grant
    Filed: July 19, 2013
    Date of Patent: May 31, 2016
    Assignee: NVIDIA Corporation
    Inventors: Eric B. Lum, Rouslan L. Dimitrov, Ignacio Llamas Ubieto, Patrick James Neill, Yury Uralsky, Albert Meixner
  • Patent number: 9268601
    Abstract: One embodiment of the present invention sets forth a technique for launching work on a processor. The method includes the steps of initializing a first state object within a memory region accessible to a program executing on the processor, populating the first state object with data associated with a first workload that is generated by the program, and triggering the processing of the first workload on the processor according to the data within the first state object.
    Type: Grant
    Filed: March 31, 2011
    Date of Patent: February 23, 2016
    Assignee: NVIDIA Corporation
    Inventors: Timothy Paul Lottes Farrar, Ignacio Llamas, Daniel Elliot Wexler, Craig Ross Duttweiler
  • Patent number: 9135081
    Abstract: One embodiment of the present invention enables threads executing on a processor to locally generate and execute work within that processor by way of work queues and command blocks. A device driver, as an initialization procedure for establishing memory objects that enable the threads to locally generate and execute work, generates a work queue, and sets a GP_GET pointer of the work queue to the first entry in the work queue. The device driver also, during the initialization procedure, sets a GP_PUT pointer of the work queue to the last free entry included in the work queue, thereby establishing a range of entries in the work queue into which new work generated by the threads can be loaded and subsequently executed by the processor. The threads then populate command blocks with generated work and point entries in the work queue to the command blocks to effect processor execution of the work stored in the command blocks.
    Type: Grant
    Filed: October 26, 2012
    Date of Patent: September 15, 2015
    Assignee: NVIDIA Corporation
    Inventors: Ignacio Llamas, Craig Ross Duttweiler, Jeffrey A. Bolz, Daniel Elliot Wexler
  • Publication number: 20150022537
    Abstract: A system, method, and computer program product are provided for shading primitive fragments. A target buffer may be recast when shaded samples that are covered by a primitive fragment are generated at a first shading rate using a first sampling mode, the shaded samples are stored in the target buffer that is associated with the first sampling mode and the first shading rate, a second sampling mode is determined, and the target buffer is associated with the second sampling mode. A sampling mode and/or shading rate may be changed for a primitive. A primitive fragment that is associated with a first sampling mode and a first shading rate is received and a second sampling mode is determined for the primitive fragment. Shaded samples corresponding to the primitive fragment are generated, at a second shading rate, using the second sampling mode and the shaded samples are stored in a target buffer.
    Type: Application
    Filed: July 19, 2013
    Publication date: January 22, 2015
    Inventors: Eric B. Lum, Rouslan L. Dimitrov, Ignacio Llamas, Patrick James Neill, Yury Uralsky, Albert Meixner
  • Patent number: 8928677
    Abstract: One embodiment of the present invention sets forth a technique for performing low latency computation on a parallel processing subsystem. A low latency functional node is exposed to an operating system. The low latency functional node and a generic functional node are configured to target the same underlying processor resource within the parallel processing subsystem. The operating system stores low latency tasks generated by a user application within a low latency command buffer associated with the low latency functional node. The parallel processing subsystem advantageously executes tasks from the low latency command buffer prior to completing execution of tasks in the generic command buffer, thereby reducing completion latency for the low latency tasks.
    Type: Grant
    Filed: January 24, 2012
    Date of Patent: January 6, 2015
    Assignee: NVIDIA Corporation
    Inventors: Daniel Elliot Wexler, Jeffrey A. Bolz, Jesse David Hall, Philip Alexander Cuadra, Naveen Leekha, Ignacio Llamas
  • Publication number: 20140380002
    Abstract: A system, method, and computer program product are provided for accessing a queue. The method includes receiving a first request to reserve a data record entry in a queue, updating a queue state block based on the first request, and returning a response to the request. A second request is received to commit the data record entry and the queue state block is updated based on the second request.
    Type: Application
    Filed: June 19, 2013
    Publication date: December 25, 2014
    Inventors: William J. Dally, James David Balfour, Ignacio Llamas Ubieto
  • Publication number: 20140351826
    Abstract: An application programming interface (API) provides various software constructs that allow a developer to assemble a processing pipeline having arbitrary structure and complexity. Once assembled, the processing pipeline is configured to include a set of interconnected pipestages. Those pipestages are associated with one or more different CTAs that may execute in parallel with one another on a parallel processing unit. The developer specifies the configuration of the pipestages, including the configuration of the different CTAs across all pipestages, as well as the different processing operations performed by each different CTA.
    Type: Application
    Filed: May 21, 2013
    Publication date: November 27, 2014
    Applicant: NVIDIA CORPORATION
    Inventor: Ignacio LLAMAS
  • Publication number: 20140351827
    Abstract: An application programming interface (API) provides various software constructs that allow a developer to assemble a processing pipeline having arbitrary structure and complexity. Once assembled, the processing pipeline is configured to include a set of interconnected pipestages. Those pipestages are associated with one or more different CTAs that may execute in parallel with one another on a parallel processing unit. The developer specifies the configuration of the pipestages, including the configuration of the different CTAs across all pipestages, as well as the different processing operations performed by each different CTA.
    Type: Application
    Filed: May 21, 2013
    Publication date: November 27, 2014
    Applicant: NVIDIA CORPORATION
    Inventor: Ignacio LLAMAS