Patents by Inventor Ignacio Llamas
Ignacio Llamas has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20200051316
Abstract: A hardware-based traversal coprocessor provides acceleration of tree traversal operations searching for intersections between primitives represented in a tree data structure and a ray. The primitives may include opaque and alpha triangles used in generating a virtual scene. The hardware-based traversal coprocessor is configured to determine primitives intersected by the ray, and return intersection information to a streaming multiprocessor for further processing. The hardware-based traversal coprocessor is configured to provide a deterministic result of intersected triangles regardless of the order that the memory subsystem returns triangle range blocks for processing, while opportunistically eliminating alpha intersections that lie further along the length of the ray than closer opaque intersections.
Type: Application
Filed: August 10, 2018
Publication date: February 13, 2020
Inventors: Samuli LAINE, Tero KARRAS, Greg MUTHLER, William Parsons NEWHALL, JR., Ronald Charles BABICH, Ignacio LLAMAS, John BURGESS
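The order-independent result described in this abstract can be illustrated in software. The following is a minimal sketch, not the claimed hardware logic: the `Hit` record, the `resolve_hits` function, and the tie-break on a primitive id are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Hit:
    t: float        # distance along the ray
    prim_id: int    # hypothetical primitive identifier
    opaque: bool    # True for opaque triangles, False for alpha triangles


def resolve_hits(hits: List[Hit]) -> List[Hit]:
    """Keep the closest opaque hit plus every alpha hit in front of it."""
    opaque_hits = [h for h in hits if h.opaque]
    # Tie-break on prim_id (an assumption) so arrival order never matters.
    closest = min(opaque_hits, key=lambda h: (h.t, h.prim_id), default=None)
    limit = closest.t if closest is not None else float("inf")
    # Alpha hits at or beyond the closest opaque hit are eliminated.
    kept = [h for h in hits if not h.opaque and h.t < limit]
    if closest is not None:
        kept.append(closest)
    return sorted(kept, key=lambda h: (h.t, h.prim_id))
```

Feeding the same hits in any order yields the same list, which is the property the abstract calls a deterministic result.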
-
Publication number: 20200050451
Abstract: Systems and methods for an efficient and robust multiprocessor-coprocessor interface that may be used between a streaming multiprocessor and an acceleration coprocessor in a GPU are provided. According to an example implementation, in order to perform an acceleration of a particular operation using the coprocessor, the multiprocessor: issues a series of write instructions to write input data for the operation into coprocessor-accessible storage locations, issues an operation instruction to cause the coprocessor to execute the particular operation; and then issues a series of read instructions to read result data of the operation from coprocessor-accessible storage locations to multiprocessor-accessible storage locations.
Type: Application
Filed: August 10, 2018
Publication date: February 13, 2020
Inventors: Ronald Babich, John BURGESS, Jack CHOQUETTE, Tero KARRAS, Samuli LAINE, Ignacio LLAMAS, Gregory MUTHLER, William Parsons NEWHALL, JR.
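The write, operate, read sequence in this abstract can be sketched as a toy model. Everything here is hypothetical: `ToyCoprocessor`, its slot layout, and the `"minmax"` opcode are stand-ins chosen for illustration and do not reflect the actual instruction set.

```python
class ToyCoprocessor:
    """Software stand-in for an acceleration coprocessor (hypothetical)."""

    def __init__(self, slots):
        self.inputs = [0] * slots    # coprocessor-accessible input storage
        self.results = [0, 0]        # coprocessor-accessible result storage

    def write(self, slot, value):
        self.inputs[slot] = value

    def execute(self, opcode):
        # Only one made-up operation is modeled; all input slots are used.
        if opcode != "minmax":
            raise ValueError("unsupported opcode: " + opcode)
        self.results[0] = min(self.inputs)
        self.results[1] = max(self.inputs)

    def read(self, slot):
        return self.results[slot]


def offload_minmax(cop, values):
    for slot, value in enumerate(values):
        cop.write(slot, value)           # 1. series of write instructions
    cop.execute("minmax")                # 2. single operation instruction
    return [cop.read(0), cop.read(1)]    # 3. series of read instructions
```

For example, `offload_minmax(ToyCoprocessor(slots=3), [7, 2, 9])` copies three inputs in, triggers the operation once, and copies two results back out.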
-
Publication number: 20200051315
Abstract: Methods and systems are described in some examples for changing the traversal of an acceleration data structure in a highly dynamic query-specific manner, with each query specifying test parameters, a test opcode and a mapping of test results to actions. In an example ray tracing implementation, traversal of a bounding volume hierarchy by a ray is performed with the default behavior of the traversal being changed in accordance with results of a test performed using the test opcode and test parameters specified in the ray data structure and another test parameter specified in a node of the bounding volume hierarchy. In an example implementation a traversal coprocessor is configured to perform the traversal of the bounding volume hierarchy.
Type: Application
Filed: August 10, 2018
Publication date: February 13, 2020
Inventors: Samuli Laine, Timo AILA, Tero KARRAS, Gregory MUTHLER, William Parsons NEWHALL, JR., Ronald Charles BABICH, JR., Craig KOLB, Ignacio LLAMAS
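A per-query test of this kind can be sketched as follows. This is only an illustration under assumptions: the opcode names, the action names (`"CULL"`, `"DEFAULT"`), and the `RayQuery` fields are invented here and are not taken from the application.

```python
import operator
from dataclasses import dataclass

# Hypothetical test opcodes; each compares the node parameter to the ray parameter.
TEST_OPS = {
    "ALWAYS": lambda node_value, ray_value: True,
    "LESS": operator.lt,
    "EQUAL": operator.eq,
    "GREATER": operator.gt,
}


@dataclass
class RayQuery:
    opcode: str    # test opcode carried by the ray
    param: int     # test parameter carried by the ray
    on_pass: str   # action applied when the test passes
    on_fail: str   # action applied when the test fails


def node_action(ray: RayQuery, node_param: int) -> str:
    """Run the ray's test against a node parameter and map the result to an action."""
    passed = TEST_OPS[ray.opcode](node_param, ray.param)
    return ray.on_pass if passed else ray.on_fail
```

With `RayQuery("LESS", 5, "CULL", "DEFAULT")`, a node whose parameter is 3 would be culled while a node whose parameter is 8 would be traversed normally, so two rays can walk the same hierarchy with different behavior.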
-
Publication number: 20190311531
Abstract: In various examples, shader bindings may be recorded in a shader binding table that includes shader records. Geometry of a 3D scene may be instantiated using object instances, and each may be associated with a respective set of the shader records using a location identifier of the set of shader records in memory. The set of shader records may represent shader bindings for an object instance under various predefined conditions. One or more of these predefined conditions may be implicit in the way the shader records are arranged in memory (e.g., indexed by ray type, by sub-geometry, etc.). For example, a section selector value (e.g., a section index) may be computed to locate and select a shader record based at least in part on a result of a ray tracing query (e.g., what sub-geometry was hit, what ray type was traced, etc.).
Type: Application
Filed: April 5, 2019
Publication date: October 10, 2019
Inventors: Martin Stich, Ignacio Llamas, Steven Parker
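One way to picture the implicit indexing is a flat table in which each instance owns a contiguous run of records, ordered by sub-geometry and then by ray type. That layout, the `select_record` function, and all parameter names are assumptions for this sketch; the application does not prescribe this exact formula.

```python
def select_record(table, instance_offset, sub_geometry, ray_type, num_ray_types):
    """Pick a shader record from a flat table (assumed layout, see above)."""
    if not 0 <= ray_type < num_ray_types:
        raise ValueError("ray_type out of range")
    # Section index within the instance's run of records.
    section = sub_geometry * num_ray_types + ray_type
    # instance_offset plays the role of the instance's location identifier.
    return table[instance_offset + section]


# Two ray types (primary, shadow); instance A has two sub-geometries, B has one.
table = [
    "A.g0.primary", "A.g0.shadow",
    "A.g1.primary", "A.g1.shadow",
    "B.g0.primary", "B.g0.shadow",
]
```

Here a shadow ray hitting instance B resolves to `select_record(table, 4, 0, 1, 2)`, so the ray type and the hit sub-geometry choose the record without any per-hit lookup structure.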
-
Publication number: 20190287294
Abstract: Disclosed approaches may leverage the actual spatial and reflective properties of a virtual environment (such as the size, shape, and orientation of a bidirectional reflectance distribution function (BRDF) lobe of a light path and its position relative to a reflection surface, a virtual screen, and a virtual camera) to produce, for a pixel, an anisotropic kernel filter having dimensions and weights that accurately reflect the spatial characteristics of the virtual environment as well as the reflective properties of the surface. In order to accomplish this, geometry may be computed that corresponds to a projection of a reflection of the BRDF lobe below the surface along a view vector to the pixel. Using this approach, the dimensions of the anisotropic filter kernel may correspond to the BRDF lobe to accurately reflect the spatial characteristics of the virtual environment as well as the reflective properties of the surface.
Type: Application
Filed: March 15, 2019
Publication date: September 19, 2019
Inventors: Shiqiu Liu, Christopher Ryan Wyman, Jon Hasselgren, Jacob Munkberg, Ignacio Llamas
-
Patent number: 10002031
Abstract: A first thread is placed into a blocked state by causing the thread to perform a blocking pop operation on a hardware-accelerated, single-entry queue. When a synchronization event completes, a second thread may release the first thread from the blocked state by pushing a data value onto the hardware-accelerated, single-entry queue. The push operation satisfies the blocking pop operation, and the first thread is released.
Type: Grant
Filed: May 8, 2013
Date of Patent: June 19, 2018
Assignee: NVIDIA CORPORATION
Inventors: Ignacio Llamas, James David Balfour
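The blocking-pop pattern has a close software analogue. The sketch below uses Python's standard `queue.Queue` with `maxsize=1` as a stand-in for the hardware-accelerated single-entry queue; the function name `wait_for_event` is an assumption, and the patent itself concerns a hardware mechanism rather than this library.

```python
import queue
import threading


def wait_for_event(value):
    slot = queue.Queue(maxsize=1)   # software stand-in for the single-entry queue
    received = []

    def first_thread():
        # Blocking pop: this thread sleeps until something is pushed.
        received.append(slot.get())

    waiter = threading.Thread(target=first_thread)
    waiter.start()
    # ... the synchronization event completes on the second thread here ...
    slot.put(value)                 # the push satisfies the pop and releases the waiter
    waiter.join()
    return received
```

The waiting thread never polls; it is released by the single push, which also hands it a data value.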
-
Patent number: 9928104
Abstract: A system, method, and computer program product are provided for accessing a queue. The method includes receiving a first request to reserve a data record entry in a queue, updating a queue state block based on the first request, and returning a response to the request. A second request is received to commit the data record entry and the queue state block is updated based on the second request.
Type: Grant
Filed: June 19, 2013
Date of Patent: March 27, 2018
Assignee: NVIDIA Corporation
Inventors: William J. Dally, James David Balfour, Ignacio Llamas Ubieto
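The two-step reserve and commit protocol can be sketched in a few lines. This is an illustrative, single-threaded model under assumptions: the class name, the fields standing in for the queue state block, and the rule that consumers only see a contiguous committed prefix are not taken from the patent.

```python
class ReserveCommitQueue:
    def __init__(self, capacity):
        self.entries = [None] * capacity
        # The next two fields stand in for the "queue state block".
        self.reserved = 0
        self.committed = [False] * capacity

    def reserve(self):
        """First request: claim an entry and return its index (None if full)."""
        if self.reserved == len(self.entries):
            return None
        index = self.reserved
        self.reserved += 1
        return index

    def commit(self, index):
        """Second request: mark a previously reserved entry as fully written."""
        self.committed[index] = True

    def readable(self):
        """Records a consumer may read: the contiguous committed prefix (assumption)."""
        out = []
        for i in range(self.reserved):
            if not self.committed[i]:
                break
            out.append(self.entries[i])
        return out
```

A producer calls `reserve()`, writes its record into `entries[index]`, and then calls `commit(index)`. Splitting the steps lets several producers fill their entries concurrently without exposing half-written records.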
-
Patent number: 9760968
Abstract: A method for using a graphics processor by an electronic device for subdividing an input image into multiple sub-regions. For each particular sub-region, a data structure is created that identifies one or more primitives that are visible in each quad of the particular sub-region. Existing coverage of one or more quads is erased based on graphics state (GState) information resulting in surviving coverage for one or more quads.
Type: Grant
Filed: October 31, 2014
Date of Patent: September 12, 2017
Assignee: Samsung Electronics Co., Ltd.
Inventors: Derek Lentz, Michael Shebanow, Ignacio Llamas
-
Patent number: 9697044
Abstract: An application programming interface (API) provides various software constructs that allow a developer to assemble a processing pipeline having arbitrary structure and complexity. Once assembled, the processing pipeline is configured to include a set of interconnected pipestages. Those pipestages are associated with one or more different CTAs that may execute in parallel with one another on a parallel processing unit. The developer specifies the configuration of the pipestages, including the configuration of the different CTAs across all pipestages, as well as the different processing operations performed by each different CTA.
Type: Grant
Filed: May 21, 2013
Date of Patent: July 4, 2017
Assignee: NVIDIA Corporation
Inventor: Ignacio Llamas
-
Publication number: 20170083373
Abstract: One embodiment of the present invention sets forth a technique for performing nested kernel execution within a parallel processing subsystem. The technique involves enabling a parent thread to launch a nested child grid on the parallel processing subsystem, and enabling the parent thread to perform a thread synchronization barrier on the child grid for proper execution semantics between the parent thread and the child grid. This technique advantageously enables the parallel processing subsystem to perform a richer set of programming constructs, such as conditionally executed and nested operations and externally defined library functions without the additional complexity of CPU involvement.
Type: Application
Filed: December 2, 2016
Publication date: March 23, 2017
Inventors: Stephen Jones, Philip Alexander Cuadra, Daniel Elliot Wexler, Ignacio Llamas, Lacky V. Shah, Jerome F. Duluk, Christopher Lamb
-
Patent number: 9513975
Abstract: One embodiment of the present invention sets forth a technique for performing nested kernel execution within a parallel processing subsystem. The technique involves enabling a parent thread to launch a nested child grid on the parallel processing subsystem, and enabling the parent thread to perform a thread synchronization barrier on the child grid for proper execution semantics between the parent thread and the child grid. This technique advantageously enables the parallel processing subsystem to perform a richer set of programming constructs, such as conditionally executed and nested operations and externally defined library functions without the additional complexity of CPU involvement.
Type: Grant
Filed: May 2, 2012
Date of Patent: December 6, 2016
Assignee: NVIDIA Corporation
Inventors: Stephen Jones, Philip Alexander Cuadra, Daniel Elliot Wexler, Ignacio Llamas, Lacky V. Shah, Jerome F. Duluk, Jr., Christopher Lamb
-
Patent number: 9489245
Abstract: One embodiment of the present invention enables threads executing on a processor to locally generate and execute work within that processor by way of work queues and command blocks. A device driver, as an initialization procedure for establishing memory objects that enable the threads to locally generate and execute work, generates a work queue, and sets a GP_GET pointer of the work queue to the first entry in the work queue. The device driver also, during the initialization procedure, sets a GP_PUT pointer of the work queue to the last free entry included in the work queue, thereby establishing a range of entries in the work queue into which new work generated by the threads can be loaded and subsequently executed by the processor. The threads then populate command blocks with generated work and point entries in the work queue to the command blocks to effect processor execution of the work stored in the command blocks.
Type: Grant
Filed: October 26, 2012
Date of Patent: November 8, 2016
Assignee: NVIDIA Corporation
Inventors: Ignacio Llamas, Craig Ross Duttweiler, Jeffrey A. Bolz, Daniel Elliot Wexler
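The pointer setup in this abstract can be modeled very loosely in software. The sketch below is an assumption-heavy simplification: it does not wrap around, the `next_free` cursor and the `drain` loop are invented for illustration, and real GP_GET/GP_PUT semantics in hardware may differ.

```python
class WorkQueue:
    def __init__(self, size):
        self.entries = [None] * size
        # "Driver" initialization, following the abstract:
        self.gp_get = 0           # first entry in the work queue
        self.gp_put = size - 1    # last free entry in the work queue
        self.next_free = 0        # hypothetical cursor for thread-side allocation

    def submit(self, command_block):
        """A thread points the next free entry at a populated command block."""
        if self.next_free > self.gp_put:
            raise RuntimeError("no free work queue entries")
        self.entries[self.next_free] = command_block
        self.next_free += 1

    def drain(self):
        """Processor side: consume filled entries between GP_GET and GP_PUT."""
        executed = []
        while self.gp_get <= self.gp_put and self.entries[self.gp_get] is not None:
            executed.append(self.entries[self.gp_get])
            self.gp_get += 1
        return executed
```

The point of the pre-established range is that threads on the processor can enqueue work by filling entries directly, without a round trip to the CPU-side driver for each submission.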
-
Patent number: 9355483
Abstract: A system, method, and computer program product are provided for shading primitive fragments. A target buffer may be recast when shaded samples that are covered by a primitive fragment are generated at a first shading rate using a first sampling mode, the shaded samples are stored in the target buffer that is associated with the first sampling mode and the first shading rate, a second sampling mode is determined, and the target buffer is associated with the second sampling mode. A sampling mode and/or shading rate may be changed for a primitive. A primitive fragment that is associated with a first sampling mode and a first shading rate is received and a second sampling mode is determined for the primitive fragment. Shaded samples corresponding to the primitive fragment are generated, at a second shading rate, using the second sampling mode and the shaded samples are stored in a target buffer.
Type: Grant
Filed: July 19, 2013
Date of Patent: May 31, 2016
Assignee: NVIDIA Corporation
Inventors: Eric B. Lum, Rouslan L. Dimitrov, Ignacio Llamas Ubieto, Patrick James Neill, Yury Uralsky, Albert Meixner
-
Patent number: 9268601
Abstract: One embodiment of the present invention sets forth a technique for launching work on a processor. The method includes the steps of initializing a first state object within a memory region accessible to a program executing on the processor, populating the first state object with data associated with a first workload that is generated by the program, and triggering the processing of the first workload on the processor according to the data within the first state object.
Type: Grant
Filed: March 31, 2011
Date of Patent: February 23, 2016
Assignee: NVIDIA Corporation
Inventors: Timothy Paul Lottes Farrar, Ignacio Llamas, Daniel Elliot Wexler, Craig Ross Duttweiler
-
Patent number: 9135081
Abstract: One embodiment of the present invention enables threads executing on a processor to locally generate and execute work within that processor by way of work queues and command blocks. A device driver, as an initialization procedure for establishing memory objects that enable the threads to locally generate and execute work, generates a work queue, and sets a GP_GET pointer of the work queue to the first entry in the work queue. The device driver also, during the initialization procedure, sets a GP_PUT pointer of the work queue to the last free entry included in the work queue, thereby establishing a range of entries in the work queue into which new work generated by the threads can be loaded and subsequently executed by the processor. The threads then populate command blocks with generated work and point entries in the work queue to the command blocks to effect processor execution of the work stored in the command blocks.
Type: Grant
Filed: October 26, 2012
Date of Patent: September 15, 2015
Assignee: NVIDIA Corporation
Inventors: Ignacio Llamas, Craig Ross Duttweiler, Jeffrey A. Bolz, Daniel Elliot Wexler
-
Publication number: 20150022537
Abstract: A system, method, and computer program product are provided for shading primitive fragments. A target buffer may be recast when shaded samples that are covered by a primitive fragment are generated at a first shading rate using a first sampling mode, the shaded samples are stored in the target buffer that is associated with the first sampling mode and the first shading rate, a second sampling mode is determined, and the target buffer is associated with the second sampling mode. A sampling mode and/or shading rate may be changed for a primitive. A primitive fragment that is associated with a first sampling mode and a first shading rate is received and a second sampling mode is determined for the primitive fragment. Shaded samples corresponding to the primitive fragment are generated, at a second shading rate, using the second sampling mode and the shaded samples are stored in a target buffer.
Type: Application
Filed: July 19, 2013
Publication date: January 22, 2015
Inventors: Eric B. Lum, Rouslan L. Dimitrov, Ignacio Llamas, Patrick James Neill, Yury Uralsky, Albert Meixner
-
Patent number: 8928677
Abstract: One embodiment of the present invention sets forth a technique for performing low latency computation on a parallel processing subsystem. A low latency functional node is exposed to an operating system. The low latency functional node and a generic functional node are configured to target the same underlying processor resource within the parallel processing subsystem. The operating system stores low latency tasks generated by a user application within a low latency command buffer associated with the low latency functional node. The parallel processing subsystem advantageously executes tasks from the low latency command buffer prior to completing execution of tasks in the generic command buffer, thereby reducing completion latency for the low latency tasks.
Type: Grant
Filed: January 24, 2012
Date of Patent: January 6, 2015
Assignee: NVIDIA Corporation
Inventors: Daniel Elliot Wexler, Jeffrey A. Bolz, Jesse David Hall, Philip Alexander Cuadra, Naveen Leekha, Ignacio Llamas
-
Publication number: 20140380002
Abstract: A system, method, and computer program product are provided for accessing a queue. The method includes receiving a first request to reserve a data record entry in a queue, updating a queue state block based on the first request, and returning a response to the request. A second request is received to commit the data record entry and the queue state block is updated based on the second request.
Type: Application
Filed: June 19, 2013
Publication date: December 25, 2014
Inventors: William J. Dally, James David Balfour, Ignacio Llamas Ubieto
-
Publication number: 20140351826
Abstract: An application programming interface (API) provides various software constructs that allow a developer to assemble a processing pipeline having arbitrary structure and complexity. Once assembled, the processing pipeline is configured to include a set of interconnected pipestages. Those pipestages are associated with one or more different CTAs that may execute in parallel with one another on a parallel processing unit. The developer specifies the configuration of the pipestages, including the configuration of the different CTAs across all pipestages, as well as the different processing operations performed by each different CTA.
Type: Application
Filed: May 21, 2013
Publication date: November 27, 2014
Applicant: NVIDIA CORPORATION
Inventor: Ignacio LLAMAS
-
Publication number: 20140351827
Abstract: An application programming interface (API) provides various software constructs that allow a developer to assemble a processing pipeline having arbitrary structure and complexity. Once assembled, the processing pipeline is configured to include a set of interconnected pipestages. Those pipestages are associated with one or more different CTAs that may execute in parallel with one another on a parallel processing unit. The developer specifies the configuration of the pipestages, including the configuration of the different CTAs across all pipestages, as well as the different processing operations performed by each different CTA.
Type: Application
Filed: May 21, 2013
Publication date: November 27, 2014
Applicant: NVIDIA CORPORATION
Inventor: Ignacio LLAMAS