Patents by Inventor Jerome F. Duluk, Jr.

Jerome F. Duluk, Jr. has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Migration directives in a unified virtual memory system architecture

Patent number: 9430400

Abstract: One embodiment of the present invention sets forth a computer-implemented method for altering migration rules for a unified virtual memory system. The method includes detecting that a migration rule trigger has been satisfied. The method also includes identifying a migration rule action that is associated with the migration rule trigger. The method further includes executing the migration rule action. Other embodiments of the present invention include a computer-readable medium, a computing device, and a unified virtual memory subsystem. One advantage of the disclosed approach is that various settings of the unified virtual memory system may be modified during program execution. This ability to alter the settings allows for an application to vary the manner in which memory pages are migrated and otherwise manipulated, which provides the application the ability to optimize the unified virtual memory system for efficient execution.

Type: Grant

Filed: December 17, 2013

Date of Patent: August 30, 2016

Assignee: NVIDIA Corporation

Inventor: Jerome F. Duluk, Jr.
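
The trigger-then-action flow in the abstract for patent 9430400 can be illustrated with a small sketch. This is not the patented implementation: the class `MigrationRules`, the `prefetch_pages` setting, and the `fault_count` event field are all hypothetical names chosen for illustration.

```python
class MigrationRules:
    """Toy rule table: each rule pairs a trigger with an action (hypothetical design)."""

    def __init__(self):
        # Hypothetical tunable setting of the memory system.
        self.settings = {"prefetch_pages": 1}
        self.rules = []  # list of (trigger, action) pairs

    def add_rule(self, trigger, action):
        self.rules.append((trigger, action))

    def on_event(self, event):
        """Detect satisfied triggers, look up their actions, and execute them."""
        fired = 0
        for trigger, action in self.rules:
            if trigger(event):          # step 1: the trigger has been satisfied
                action(self.settings)   # steps 2 and 3: identify and execute the action
                fired += 1
        return fired


rules = MigrationRules()
rules.add_rule(
    trigger=lambda event: event.get("fault_count", 0) > 100,
    action=lambda settings: settings.update(prefetch_pages=8),
)
quiet = rules.on_event({"fault_count": 5})
busy = rules.on_event({"fault_count": 250})
```

After the second event the setting has changed while the program keeps running, which is the kind of runtime adjustment the abstract describes.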

Migrating pages of different sizes between heterogeneous processors

Patent number: 9424201

Abstract: One embodiment of the present invention sets forth a computer-implemented method for migrating a memory page from a first memory to a second memory. The method includes determining a first page size supported by the first memory. The method also includes determining a second page size supported by the second memory. The method further includes determining a use history of the memory page based on an entry in a page state directory associated with the memory page. The method also includes migrating the memory page between the first memory and the second memory based on the first page size, the second page size, and the use history.

Type: Grant

Filed: December 19, 2013

Date of Patent: August 23, 2016

Assignee: NVIDIA Corporation

Inventors: Jerome F. Duluk, Jr., Cameron Buschardt, James Leroy Deming, Lucien Dunning, Brian Fahs, Mark Hairgrove, Chenghuan Jia, John Mashey, James M. Van Dyke

Technique for storing shared vertices

Patent number: 9418616

Abstract: A graphics processing unit includes a set of geometry processing units each configured to process graphics primitives in parallel with one another. A given geometry processing unit generates one or more graphics primitives or geometry objects and buffers the associated vertex data locally. The geometry processing unit also buffers different sets of indices to those vertices, where each such set represents a different graphics primitive or geometry object. The geometry processing units may then stream the buffered vertices and indices to global buffers in parallel with one another. A stream output synchronization unit coordinates the parallel streaming of vertices and indices by providing each geometry processing unit with a different base address within a global vertex buffer where vertices may be written. The stream output synchronization unit also provides each geometry processing unit with a different base address within a global index buffer where indices may be written.

Type: Grant

Filed: December 20, 2012

Date of Patent: August 16, 2016

Assignee: NVIDIA CORPORATION

Inventors: Jerome F. Duluk, Jr., Ziyad S. Hakura, Henry Packard Moreton

Rendering using multiple render target sample masks

Patent number: 9396515

Abstract: One embodiment sets forth a method for transforming 3-D images into 2-D rendered images using render target sample masks. A software application creates multiple render targets associated with a surface. For each render target, the software application also creates an associated render target sample mask configured to select one or more samples included in each pixel. Within the graphics pipeline, a pixel shader processes each pixel individually and outputs multiple render target-specific color values. For each render target, a ROP unit uses the associated render target sample mask to select covered samples included in the pixel. Subsequently, the ROP unit uses the render target-specific color value to update the selected samples in the render target, thereby achieving sample-level color granularity.

Type: Grant

Filed: August 16, 2013

Date of Patent: July 19, 2016

Assignee: NVIDIA CORPORATION

Inventors: Eric B. Lum, Jerome F. Duluk, Jr., Yury Y. Uralsky, Rouslan Dimitrov, Rui M. Bastos

System, method, and computer program product for low latency scheduling and launch of memory defined tasks

Patent number: 9378139

Abstract: A system, method, and computer program product for low-latency scheduling and launch of memory defined tasks. The method includes the steps of receiving a task metadata data structure to be stored in a memory associated with a processor, transmitting the task metadata data structure to a scheduling unit of the processor, storing the task metadata data structure in a cache unit included in the scheduling unit, and copying the task metadata data structure from the cache unit to the memory.

Type: Grant

Filed: May 8, 2013

Date of Patent: June 28, 2016

Assignee: NVIDIA Corporation

Inventors: Scott Ricketts, Brian Scott Pharris, Nicholas Wang, Luke David Durant, Philip Alexander Cuadra, Jerome F. Duluk, Jr.

Frame buffer access tracking via a sliding window in a unified virtual memory system

Patent number: 9355041

Abstract: One embodiment of the present invention is a memory subsystem that includes a sliding window tracker that tracks memory accesses associated with a sliding window of memory page groups. When the sliding window tracker detects an access operation associated with a memory page group within the sliding window, the sliding window tracker sets a reference bit that is associated with the memory page group and is included in a reference vector that represents accesses to the memory page groups within the sliding window. Based on the values of the reference bits, the sliding window tracker causes the selection of a memory page in a memory page group that has fallen into disuse for migration from a first memory to a second memory. Because the sliding window tracker tunes the memory pages that are resident in the first memory to reflect memory access patterns, the overall performance of the memory subsystem is improved.

Type: Grant

Filed: December 12, 2013

Date of Patent: May 31, 2016

Assignee: NVIDIA Corporation

Inventors: John Mashey, Cameron Buschardt, James Leroy Deming, Jerome F. Duluk, Jr., Brian Fahs
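
The reference-vector bookkeeping described in the abstract for patent 9355041 can be sketched as follows. This is a minimal software model under stated assumptions, not the hardware design: the class name, the integer used as a bit vector, and the choice to treat any group with a clear bit as "disused" are all illustrative.

```python
class SlidingWindowTracker:
    """Tracks accesses to page groups in the window [base, base + size)."""

    def __init__(self, base, size):
        self.base = base
        self.size = size
        self.reference_vector = 0  # bit i corresponds to page group base + i

    def record_access(self, group):
        """Set the reference bit for a group, ignoring groups outside the window."""
        offset = group - self.base
        if 0 <= offset < self.size:
            self.reference_vector |= 1 << offset

    def disused_groups(self):
        """Groups whose reference bit is still clear (assumed eviction candidates)."""
        return [
            self.base + i
            for i in range(self.size)
            if not (self.reference_vector >> i) & 1
        ]

    def slide(self, new_base):
        """Move the window and start a fresh observation period."""
        self.base = new_base
        self.reference_vector = 0


tracker = SlidingWindowTracker(base=8, size=4)
for group in (8, 10, 10, 42):   # 42 lies outside the window and is ignored
    tracker.record_access(group)
candidates = tracker.disused_groups()
```

Pages in the candidate groups would then be the ones considered for movement from the first memory to the second.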

Techniques for interleaving surfaces

Patent number: 9355430

Abstract: One embodiment sets forth a method for allocating memory to surfaces. A software application specifies surface data, including interleaving state data. Based on the interleaving state data, a surface access unit bloats addresses derived from discrete coordinates associated with the surface, creating a bloated virtual address space with a predictable pattern of addresses that do not correspond to data. Advantageously, by creating predictable regions of addresses that do not correspond to data, the software application program may configure the surface to share physical memory space with one or more other surfaces. In particular, the software application may map the virtual address space together with one or more virtual address spaces corresponding to complementary data patterns to the same physical base address. And, by overlapping the virtual address spaces onto the same pages in physical address space, the physical memory may be more densely packed than by using prior-art allocation techniques.

Type: Grant

Filed: September 20, 2013

Date of Patent: May 31, 2016

Assignee: NVIDIA Corporation

Inventors: Eric B. Lum, Cass W. Everitt, Henry Packard Moreton, Yury Y. Uralsky, Cyril Crassin, Jerome F. Duluk, Jr.

Managing per-tile event count reports in a tile-based architecture

Patent number: 9311097

Abstract: A graphics processing system configured to track per-tile event counts in a tile-based architecture. A tiling unit in the graphics processing system is configured to cause a screen-space pipeline to load a count value associated with a first cache tile into a count memory and to cause the screen-space pipeline to process a first set of primitives that intersect the first cache tile. The tiling unit is further configured to cause the screen-space pipeline to store a second count value in a report memory location. The tiling unit is also configured to cause the screen-space pipeline to process a second set of primitives that intersect the first cache tile and to cause the screen-space pipeline to store a third count value in the first accumulating memory. Conditional rendering operations may be performed on a per-cache tile basis, based on the per-tile event count.

Type: Grant

Filed: October 23, 2013

Date of Patent: April 12, 2016

Assignee: NVIDIA Corporation

Inventors: Ziyad S. Hakura, Jerome F. Duluk, Jr.

Technique for storing shared vertices

Patent number: 9293109

Abstract: A graphics processing unit includes a set of geometry processing units each configured to process graphics primitives in parallel with one another. A given geometry processing unit generates one or more graphics primitives or geometry objects and buffers the associated vertex data locally. The geometry processing unit also buffers different sets of indices to those vertices, where each such set represents a different graphics primitive or geometry object. The geometry processing units may then stream the buffered vertices and indices to global buffers in parallel with one another. A stream output synchronization unit coordinates the parallel streaming of vertices and indices by providing each geometry processing unit with a different base address within a global vertex buffer where vertices may be written. The stream output synchronization unit also provides each geometry processing unit with a different base address within a global index buffer where indices may be written.

Type: Grant

Filed: December 20, 2012

Date of Patent: March 22, 2016

Assignee: NVIDIA Corporation

Inventors: Jerome F. Duluk, Jr., Ziyad S. Hakura, Henry Packard Moreton

GPU work creation and stateless graphics in OPENGL

Patent number: 9275491

Abstract: One embodiment of the present invention sets forth a method for generating work to be processed by a graphics pipeline residing within a graphics processor. The method includes the steps of receiving an indication that a first graphics workload is to be submitted to a command queue associated with the graphics processor, allocating a first portion of shader accessible memory for one or more units of state information that are necessary for processing the first graphics workload, populating the first portion of shader accessible memory with the one or more units of state information, and transmitting to the command queue of the graphics processor the one or more units of state information stored within the first portion of shader accessible memory, wherein the first graphics workload is processed within the graphics pipeline based on the one or more units of state information.

Type: Grant

Filed: April 1, 2011

Date of Patent: March 1, 2016

Assignee: NVIDIA Corporation

Inventors: Jeffrey A. Bolz, Jesse David Hall, Jerome F. Duluk, Jr., Patrick R. Brown, Gregory Scott Palmer

TECHNIQUES FOR OPTIMIZING STENCIL BUFFERS

Publication number: 20150339799

Abstract: One embodiment sets forth a method for associating each stencil value included in a stencil buffer with multiple fragments. Components within a graphics processing pipeline use a set of stencil masks to partition the bits of each stencil value. Each stencil mask selects a different subset of bits, and each fragment is strategically associated with both a stencil value and a stencil mask. Before performing stencil actions associated with a fragment, the raster operations unit performs stencil mask operations on the operands. No fragments are associated with both the same stencil mask and the same stencil value. Consequently, no fragments are associated with the same stencil bits included in the stencil buffer. Advantageously, by reducing the number of stencil bits associated with each fragment, certain classes of software applications may reduce the wasted memory associated with stencil buffers in which each stencil value is associated with a single fragment.

Type: Application

Filed: August 3, 2015

Publication date: November 26, 2015

Inventors: Eric B. LUM, Jerome F. DULUK, JR.
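
The bit-partitioning idea in the stencil buffer abstract above can be illustrated with a short sketch. It assumes an 8-bit stencil value split between two fragments by the masks `0x0F` and `0xF0`; the function names and the specific masks are illustrative and are not taken from the application.

```python
def stencil_test(stored, mask, ref):
    """Pass when the stored bits selected by mask equal the masked reference."""
    return (stored & mask) == (ref & mask)


def stencil_write(stored, mask, new):
    """Replace only the bits selected by mask; all other bits are preserved."""
    return (stored & ~mask & 0xFF) | (new & mask)


LOW, HIGH = 0x0F, 0xF0   # two disjoint stencil masks over one 8-bit value

value = 0x00
value = stencil_write(value, LOW, 0x05)    # fragment A owns the low nibble
value = stencil_write(value, HIGH, 0x30)   # fragment B owns the high nibble

a_passes = stencil_test(value, LOW, 0x05)
b_passes = stencil_test(value, HIGH, 0x30)
b_fails = stencil_test(value, HIGH, 0x05)
```

Because the masks do not overlap, neither fragment disturbs the other's bits, so one stored stencil value serves both.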

Programmable blending in multi-threaded processing units

Patent number: 9183609

Abstract: A technique for efficiently rendering content reduces each complex blend mode to a series of basic blend operations. The series of basic blend operations are executed within a recirculating pipeline until a final blended value is computed. The recirculating pipeline is positioned within a color raster operations unit of a graphics processing unit for efficient access to image buffer data.

Type: Grant

Filed: December 20, 2012

Date of Patent: November 10, 2015

Assignee: NVIDIA Corporation

Inventors: Rui Bastos, Mark J. Kilgard, William Craig McKnight, Jerome F. Duluk, Jr., Pierre Souillot, Dale L. Kirkland, Christian Amsinck, Joseph Detmer, Christian Rouet, Don Bittel
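
As an illustration of reducing a complex blend mode to a series of basic operations, as the abstract for patent 9183609 describes, the sketch below decomposes the well-known "screen" blend, src + dst - src * dst, into three basic steps that are executed in a loop. The loop loosely stands in for the recirculating pipeline; the instruction format, register names, and choice of blend mode are assumptions made for this example only.

```python
# Basic blend operations available to each pass (illustrative set).
BASIC_OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
}

# "Screen" blend expressed as a series of basic operations:
# out = (src + dst) - (src * dst)
SCREEN = [
    ("mul", "src", "dst", "t0"),
    ("add", "src", "dst", "t1"),
    ("sub", "t1", "t0", "out"),
]


def run_blend(program, src, dst):
    """Execute one basic operation per pass until the final value is computed."""
    regs = {"src": src, "dst": dst}
    for op, a, b, result in program:
        regs[result] = BASIC_OPS[op](regs[a], regs[b])
    return regs["out"]


blended = run_blend(SCREEN, 0.5, 0.5)
```

Adding a new blend mode in this model means writing a new instruction list rather than a new operation.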

Techniques for optimizing stencil buffers

Patent number: 9098925

Abstract: One embodiment sets forth a method for associating each stencil value included in a stencil buffer with multiple fragments. Components within a graphics processing pipeline use a set of stencil masks to partition the bits of each stencil value. Each stencil mask selects a different subset of bits, and each fragment is strategically associated with both a stencil value and a stencil mask. Before performing stencil actions associated with a fragment, the raster operations unit performs stencil mask operations on the operands. No fragments are associated with both the same stencil mask and the same stencil value. Consequently, no fragments are associated with the same stencil bits included in the stencil buffer. Advantageously, by reducing the number of stencil bits associated with each fragment, certain classes of software applications may reduce the wasted memory associated with stencil buffers in which each stencil value is associated with a single fragment.

Type: Grant

Filed: July 15, 2013

Date of Patent: August 4, 2015

Assignee: NVIDIA Corporation

Inventors: Eric B. Lum, Jerome F. Duluk, Jr.

Techniques for optimizing stencil buffers

Patent number: 9098924

Abstract: One embodiment sets forth a method for associating each stencil value included in a stencil buffer with multiple fragments. Components within a graphics processing pipeline use a set of stencil masks to partition the bits of each stencil value. Each stencil mask selects a different subset of bits, and each fragment is strategically associated with both a stencil value and a stencil mask. Before performing stencil actions associated with a fragment, the raster operations unit performs stencil mask operations on the operands. No fragments are associated with both the same stencil mask and the same stencil value. Consequently, no fragments are associated with the same stencil bits included in the stencil buffer. Advantageously, by reducing the number of stencil bits associated with each fragment, certain classes of software applications may reduce the wasted memory associated with stencil buffers in which each stencil value is associated with a single fragment.

Type: Grant

Filed: July 15, 2013

Date of Patent: August 4, 2015

Assignee: NVIDIA CORPORATION

Inventors: Eric B. Lum, Jerome F. Duluk, Jr.

Programmable blending via multiple pixel shader dispatches

Patent number: 9082212

Abstract: Techniques are disclosed for dispatching pixel information in a graphics processing pipeline. A fragment processing unit in the graphics processing pipeline generates a pixel that includes multiple samples based on a portion of a graphics primitive received by a thread. The fragment processing unit calculates a set of source values, where each source value corresponds to a different sample of the pixel. The fragment processing unit retrieves a set of destination values from a render target, where each destination value corresponds to a different source value. The fragment processing unit blends each source value with a corresponding destination value to create a set of final values, and creates one or more dispatch messages to store the set of final values in a set of output registers. One advantage of the disclosed techniques is that pixel shader programs perform per-sample operations with increased efficiency.

Type: Grant

Filed: December 21, 2012

Date of Patent: July 14, 2015

Assignee: NVIDIA Corporation

Inventors: Jerome F. Duluk, Jr., Jesse David Hall

Scheduling and execution of compute tasks

Patent number: 9069609

Abstract: One embodiment of the present invention sets forth a technique for assigning a compute task to a first processor included in a plurality of processors. The technique involves analyzing each compute task in a plurality of compute tasks to identify one or more compute tasks that are eligible for assignment to the first processor, where each compute task is listed in a first table and is associated with a priority value and an allocation order that indicates the relative time at which the compute task was added to the first table. The technique further involves selecting a first compute task from the identified one or more compute tasks based on at least one of the priority value and the allocation order, and assigning the first compute task to the first processor for execution.

Type: Grant

Filed: January 18, 2012

Date of Patent: June 30, 2015

Assignee: NVIDIA CORPORATION

Inventors: Karim M. Abdalla, Lacky V. Shah, Jerome F. Duluk, Jr., Timothy John Purcell, Tanmoy Mandal, Gentaro Hirota
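
One way to read the selection step in the abstract for patent 9069609 is sketched below. The abstract only says the choice is based on at least one of priority and allocation order; the specific policy here (highest priority first, earliest allocation as the tie-breaker) and all field names are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class ComputeTask:
    name: str
    priority: int          # assumption: larger value means more urgent
    allocation_order: int  # assumption: smaller value means added to the table earlier
    eligible: bool         # whether the task may run on the target processor


def select_task(table):
    """Pick an eligible task by priority, breaking ties by allocation order."""
    candidates = [task for task in table if task.eligible]
    if not candidates:
        return None
    return min(candidates, key=lambda t: (-t.priority, t.allocation_order))


table = [
    ComputeTask("a", priority=1, allocation_order=0, eligible=True),
    ComputeTask("b", priority=3, allocation_order=1, eligible=False),
    ComputeTask("c", priority=2, allocation_order=2, eligible=True),
    ComputeTask("d", priority=2, allocation_order=3, eligible=True),
]
chosen = select_task(table)
```

Task "b" has the highest priority but is not eligible, so the choice falls to "c", which ties with "d" on priority and was added first.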

SYSTEM, METHOD, AND COMPUTER PROGRAM PRODUCT FOR SIMULTANEOUS EXECUTION OF COMPUTE AND GRAPHICS WORKLOADS

Publication number: 20150178879

Abstract: A system, method, and computer program product are provided for allocating processor resources to process compute workloads and graphics workloads substantially simultaneously. The method includes the steps of allocating a plurality of processing units to process tasks associated with a graphics pipeline, receiving a request to allocate at least one processing unit in the plurality of processing units to process tasks associated with a compute pipeline, and reallocating the at least one processing unit to process tasks associated with the compute pipeline.

Type: Application

Filed: December 20, 2013

Publication date: June 25, 2015

Applicant: NVIDIA CORPORATION

Inventors: Gregory S. Palmer, Jerome F. Duluk, JR., Karim Maher Abdalla, Jonathon S. Evans, Adam Clark Weitkemper, Lacky Vasant Shah, Philip Browning Johnson, Gentaro Hirota

Determining a working set of texture maps

Patent number: 9013498

Abstract: A system and method for tracking and reporting texture map levels of detail that are computed during graphics processing allows for efficient management of texture map storage. Minimum and/or maximum pre-clamped texture map levels of detail values are tracked by a graphics processor and an array stored in memory is updated to report the minimum and/or maximum values for use by an application program. The minimum and/or maximum values may be used to determine the active set of texture map levels of detail that is loaded into graphics memory.

Type: Grant

Filed: December 19, 2008

Date of Patent: April 21, 2015

Assignee: NVIDIA Corporation

Inventors: John S. Montrym, Andrew J. Tao, Henry P. Moreton, Emmett M. Kilgariff, Cass W. Everitt, Alexander L. Minkin, Eric Anderson, Yan Yan Tang, Jerome F. Duluk, Jr.

MANAGING MEMORY REGIONS TO SUPPORT SPARSE MAPPINGS

Publication number: 20150097847

Abstract: One embodiment of the present invention includes a memory management unit (MMU) that is configured to manage sparse mappings. The MMU processes requests to translate virtual addresses to physical addresses based on page table entries (PTEs) that indicate a sparse status. If the MMU determines that the PTE does not include a mapping from a virtual address to a physical address, then the MMU responds to the request based on the sparse status. If the sparse status is active, then the MMU determines the physical address based on whether the type of the request is a write operation and, subsequently, generates an acknowledgement of the request. By contrast, if the sparse status is not active, then the MMU generates a page fault. Advantageously, the disclosed embodiments enable the computer system to manage sparse mappings without incurring the performance degradation associated with both page faults and conventional software-based sparse mapping management.

Type: Application

Filed: October 4, 2013

Publication date: April 9, 2015

Applicant: NVIDIA CORPORATION

Inventors: Jonathan DUNAISKY, Henry Packard MORETON, Jeffrey A. BOLZ, Yury Y. URALSKY, James Leroy DEMING, Rui M. BASTOS, Patrick R. BROWN, Amanpreet GREWAL, Christian AMSINCK, Poornachandra RAO, Jerome F. DULUK, JR., Andrew J. TAO
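
The decision flow in the sparse mapping abstract above can be modeled in a few lines. The abstract does not say which physical address is used for an unmapped sparse page; this sketch assumes, purely for illustration, that reads are directed to a zero-filled page and writes to a discard page. `ZERO_PAGE`, `DISCARD_PAGE`, and the dictionary-based page table are hypothetical.

```python
class PageFault(Exception):
    """Raised when an unmapped, non-sparse page is accessed."""


ZERO_PAGE = 0x0000     # hypothetical page that always reads as zeros
DISCARD_PAGE = 0x1000  # hypothetical sink where writes are dropped


def translate(page_table, virtual_page, is_write):
    """Return a physical page for the request, or raise PageFault."""
    pte = page_table.get(virtual_page, {"physical": None, "sparse": False})
    if pte["physical"] is not None:
        return pte["physical"]            # ordinary mapped page
    if pte["sparse"]:
        # Sparse and unmapped: the request is acknowledged instead of faulting.
        # The address depends on whether the request is a write.
        return DISCARD_PAGE if is_write else ZERO_PAGE
    raise PageFault(f"no mapping for virtual page {virtual_page}")


page_table = {
    1: {"physical": 0x8000, "sparse": False},
    2: {"physical": None, "sparse": True},
    3: {"physical": None, "sparse": False},
}
mapped = translate(page_table, 1, is_write=False)
sparse_read = translate(page_table, 2, is_write=False)
sparse_write = translate(page_table, 2, is_write=True)
```

Only virtual page 3 in this table would raise `PageFault`, which matches the abstract's contrast between active and inactive sparse status.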

TECHNIQUES FOR INTERLEAVING SURFACES

Publication number: 20150084974

Abstract: One embodiment sets forth a method for allocating memory to surfaces. A software application specifies surface data, including interleaving state data. Based on the interleaving state data, a surface access unit bloats addresses derived from discrete coordinates associated with the surface, creating a bloated virtual address space with a predictable pattern of addresses that do not correspond to data. Advantageously, by creating predictable regions of addresses that do not correspond to data, the software application program may configure the surface to share physical memory space with one or more other surfaces. In particular, the software application may map the virtual address space together with one or more virtual address spaces corresponding to complementary data patterns to the same physical base address. And, by overlapping the virtual address spaces onto the same pages in physical address space, the physical memory may be more densely packed than by using prior-art allocation techniques.

Type: Application

Filed: September 20, 2013

Publication date: March 26, 2015

Applicant: NVIDIA CORPORATION

Inventors: Eric B. LUM, Cass W. EVERITT, Henry Packard MORETON, Yury Y. URALSKY, Cyril CRASSIN, Jerome F. DULUK, Jr.
