Patents by Inventor Jerome F. Duluk, Jr.
Jerome F. Duluk, Jr. has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 9430400Abstract: One embodiment of the present invention sets forth a computer-implemented method for altering migration rules for a unified virtual memory system. The method includes detecting that a migration rule trigger has been satisfied. The method also includes identifying a migration rule action that is associated with the migration rule trigger. The method further includes executing the migration rule action. Other embodiments of the present invention include a computer-readable medium, a computing device, and a unified virtual memory subsystem. One advantage of the disclosed approach is that various settings of the unified virtual memory system may be modified during program execution. This ability to alter the settings allows for an application to vary the manner in which memory pages are migrated and otherwise manipulated, which provides the application the ability to optimize the unified virtual memory system for efficient execution.Type: GrantFiled: December 17, 2013Date of Patent: August 30, 2016Assignee: NVIDIA CorporationInventor: Jerome F. Duluk, Jr.
-
Patent number: 9424201Abstract: One embodiment of the present invention sets forth a computer-implemented method for migrating a memory page from a first memory to a second memory. The method includes determining a first page size supported by the first memory. The method also includes determining a second page size supported by the second memory. The method further includes determining a use history of the memory page based on an entry in a page state directory associated with the memory page. The method also includes migrating the memory page between the first memory and the second memory based on the first page size, the second page size, and the use history.Type: GrantFiled: December 19, 2013Date of Patent: August 23, 2016Assignee: NVIDIA CorporationInventors: Jerome F. Duluk, Jr., Cameron Buschardt, James Leroy Deming, Lucien Dunning, Brian Fahs, Mark Hairgrove, Chenghuan Jia, John Mashey, James M. Van Dyke
-
Patent number: 9418616Abstract: A graphics processing unit includes a set of geometry processing units each configured to process graphics primitives in parallel with one another. A given geometry processing unit generates one or more graphics primitives or geometry objects and buffers the associated vertex data locally. The geometry processing unit also buffers different sets of indices to those vertices, where each such set represents a different graphics primitive or geometry object. The geometry processing units may then stream the buffered vertices and indices to global buffers in parallel with one another. A stream output synchronization unit coordinates the parallel streaming of vertices and indices by providing each geometry processing unit with a different base address within a global vertex buffer where vertices may be written. The stream output synchronization unit also provides each geometry processing unit with a different base address within a global index buffer where indices may be written.Type: GrantFiled: December 20, 2012Date of Patent: August 16, 2016Assignee: NVIDIA CORPORATIONInventors: Jerome F. Duluk, Jr., Ziyad S. Hakura, Henry Packard Moreton
-
Patent number: 9396515Abstract: One embodiment sets forth a method for transforming 3-D images into 2-D rendered images using render target sample masks. A software application creates multiple render targets associated with a surface. For each render target, the software application also creates an associated render target sample mask configured to select one or more samples included in each pixel. Within the graphics pipeline, a pixel shader processes each pixel individually and outputs multiple render target-specific color values. For each render target, a ROP unit uses the associated render target sample mask to select covered samples included in the pixel. Subsequently, the ROP unit uses the render target-specific color value to update the selected samples in the render target, thereby achieving sample-level color granularity.Type: GrantFiled: August 16, 2013Date of Patent: July 19, 2016Assignee: NVIDIA CORPORATIONInventors: Eric B. Lum, Jerome F. Duluk, Jr., Yury Y. Uralsky, Rouslan Dimitrov, Rui M. Bastos
-
Patent number: 9378139Abstract: A system, method, and computer program product for low-latency scheduling and launch of memory defined tasks. The method includes the steps of receiving a task metadata data structure to be stored in a memory associated with a processor, transmitting the task metadata data structure to a scheduling unit of the processor, storing the task metadata data structure in a cache unit included in the scheduling unit, and copying the task metadata data structure from the cache unit to the memory.Type: GrantFiled: May 8, 2013Date of Patent: June 28, 2016Assignee: NVIDIA CorporationInventors: Scott Ricketts, Brian Scott Pharris, Nicholas Wang, Luke David Durant, Philip Alexander Cuadra, Jerome F. Duluk, Jr.
-
Patent number: 9355041Abstract: One embodiment of the present invention is a memory subsystem that includes a sliding window tracker that tracks memory accesses associated with a sliding window of memory page groups. When the sliding window tracker detects an access operation associated with a memory page group within the sliding window, the sliding window tracker sets a reference bit that is associated with the memory page group and is included in a reference vector that represents accesses to the memory page groups within the sliding window. Based on the values of the reference bits, the sliding window tracker causes the selection a memory page in a memory page group that has fallen into disuse from a first memory to a second memory. Because the sliding window tracker tunes the memory pages that are resident in the first memory to reflect memory access patterns, the overall performance of the memory subsystem is improved.Type: GrantFiled: December 12, 2013Date of Patent: May 31, 2016Assignee: NVIDIA CorporationInventors: John Mashey, Cameron Buschardt, James Leroy Deming, Jerome F. Duluk, Jr., Brian Fahs
-
Patent number: 9355430Abstract: One embodiment sets forth a method for allocating memory to surfaces. A software application specifies surface data, including interleaving state data. Based on the interleaving state data, a surface access unit bloats addressees derived from discrete coordinates associated with the surface, creating a bloated virtual address space with a predictable pattern of addresses that do not correspond to data. Advantageously, by creating predictable regions of addresses that do not correspond to data, the software application program may configure the surface to share physical memory space with one or more other surfaces. In particular, the software application may map the virtual address space together with one or more virtual address spaces corresponding to complementary data patterns to the same physical base address. And, by overlapping the virtual address spaces onto the same pages in physical address space, the physical memory may be more densely packed than by using prior-art allocation techniques.Type: GrantFiled: September 20, 2013Date of Patent: May 31, 2016Assignee: NVIDIA CorporationInventors: Eric B. Lum, Cass W. Everitt, Henry Packard Moreton, Yury Y. Uralsky, Cyril Crassin, Jerome F. Duluk, Jr.
-
Patent number: 9311097Abstract: A graphics processing system configured to track per-tile event counts in a tile-based architecture. A tiling unit in the graphics processing system is configured to cause a screen-space pipeline to load a count value associated with a first cache tile into a count memory and to cause the screen-space pipeline to process a first set of primitives that intersect the first cache tile. The tiling unit is further configured to cause the screen-space pipeline to store a second count value in a report memory location. The tiling unit is also configured to cause the screen-space pipeline to process a second set of primitives that intersect the first cache tile and to cause the screen-space pipeline to store a third count value in the first accumulating memory. Conditional rendering operations may be performed on a per-cache tile basis, based on the per-tile event count.Type: GrantFiled: October 23, 2013Date of Patent: April 12, 2016Assignee: NVIDIA CorporationInventors: Ziyad S. Hakura, Jerome F. Duluk, Jr.
-
Patent number: 9293109Abstract: A graphics processing unit includes a set of geometry processing units each configured to process graphics primitives in parallel with one another. A given geometry processing unit generates one or more graphics primitives or geometry objects and buffers the associated vertex data locally. The geometry processing unit also buffers different sets of indices to those vertices, where each such set represents a different graphics primitive or geometry object. The geometry processing units may then stream the buffered vertices and indices to global buffers in parallel with one another. A stream output synchronization unit coordinates the parallel streaming of vertices and indices by providing each geometry processing unit with a different base address within a global vertex buffer where vertices may be written. The stream output synchronization unit also provides each geometry processing unit with a different base address within a global index buffer where indices may be written.Type: GrantFiled: December 20, 2012Date of Patent: March 22, 2016Assignee: NVIDIA CorporationInventors: Jerome F. Duluk, Jr., Ziyad S. Hakura, Henry Packard Moreton
-
Patent number: 9275491Abstract: One embodiment of the present invention sets forth a method for generating work to be processed by a graphics pipeline residing within a graphics processor. The method includes the steps of receiving an indication that a first graphics workload is to be submitted to a command queue associated with the graphics processor, allocating a first portion of shader accessible memory for one or more units of state information that are necessary for processing the first graphics workload, populating the first portion of shader accessible memory with the one or more units of state information, and transmitting to the command queue of the graphics processor the one or more units of state information stored within the first portion of shader accessible memory, wherein the first graphics workload is processed within the graphics pipeline based on the one or more units of state information.Type: GrantFiled: April 1, 2011Date of Patent: March 1, 2016Assignee: NVIDIA CorporationInventors: Jeffrey A. Bolz, Jesse David Hall, Jerome F. Duluk, Jr., Patrick R. Brown, Gregory Scott Palmer
-
Publication number: 20150339799Abstract: One embodiment sets forth a method for associating each stencil value included in a stencil buffer with multiple fragments. Components within a graphics processing pipeline use a set of stencil masks to partition the bits of each stencil value. Each stencil mask selects a different subset of bits, and each fragment is strategically associated with both a stencil value and a stencil mask. Before performing stencil actions associated with a fragment, the raster operations unit performs stencil mask operations on the operands. No fragments are associated with both the same stencil mask and the same stencil value. Consequently, no fragments are associated with the same stencil bits included in the stencil buffer. Advantageously, by reducing the number of stencil bits associated with each fragment, certain classes of software applications may reduce the wasted memory associated with stencil buffers in which each stencil value is associated with a single fragment.Type: ApplicationFiled: August 3, 2015Publication date: November 26, 2015Inventors: Eric B. LUM, Jerome F. DULUK, JR.
-
Patent number: 9183609Abstract: A technique for efficiently rendering content reduces each complex blend mode to a series of basic blend operations. The series of basic blend operations are executed within a recirculating pipeline until a final blended value is computed. The recirculating pipeline is positioned within a color raster operations unit of a graphics processing unit for efficient access to image buffer data.Type: GrantFiled: December 20, 2012Date of Patent: November 10, 2015Assignee: NVIDIA CorporationInventors: Rui Bastos, Mark J. Kilgard, William Craig McKnight, Jerome F. Duluk, Jr., Pierre Souillot, Dale L. Kirkland, Christian Amsinck, Joseph Detmer, Christian Rouet, Don Bittel
-
Patent number: 9098925Abstract: One embodiment sets forth a method for associating each stencil value included in a stencil buffer with multiple fragments. Components within a graphics processing pipeline use a set of stencil masks to partition the bits of each stencil value. Each stencil mask selects a different subset of bits, and each fragment is strategically associated with both a stencil value and a stencil mask. Before performing stencil actions associated with a fragment, the raster operations unit performs stencil mask operations on the operands. No fragments are associated with both the same stencil mask and the same stencil value. Consequently, no fragments are associated with the same stencil bits included in the stencil buffer. Advantageously, by reducing the number of stencil bits associated with each fragment, certain classes of software applications may reduce the wasted memory associated with stencil buffers in which each stencil value is associated with a single fragment.Type: GrantFiled: July 15, 2013Date of Patent: August 4, 2015Assignee: NVIDIA CorporationInventors: Eric B. Lum, Jerome F. Duluk, Jr.
-
Patent number: 9098924Abstract: One embodiment sets forth a method for associating each stencil value included in a stencil buffer with multiple fragments. Components within a graphics processing pipeline use a set of stencil masks to partition the bits of each stencil value. Each stencil mask selects a different subset of bits, and each fragment is strategically associated with both a stencil value and a stencil mask. Before performing stencil actions associated with a fragment, the raster operations unit performs stencil mask operations on the operands. No fragments are associated with both the same stencil mask and the same stencil value. Consequently, no fragments are associated with the same stencil bits included in the stencil buffer. Advantageously, by reducing the number of stencil bits associated with each fragment, certain classes of software applications may reduce the wasted memory associated with stencil buffers in which each stencil value is associated with a single fragment.Type: GrantFiled: July 15, 2013Date of Patent: August 4, 2015Assignee: NVIDIA CORPORATIONInventors: Eric B. Lum, Jerome F. Duluk, Jr.
-
Patent number: 9082212Abstract: Techniques are disclosed for dispatching pixel information in a graphics processing pipeline. A fragment processing unit in the graphics processing pipeline generates a pixel that includes multiple samples based on a portion of a graphics primitive received by a thread. The fragment processing unit calculates a set of source values, where each source value corresponds to a different sample of the pixel. The fragment processing unit retrieves a set of destination values from a render target, where each destination value corresponds to a different source value. The fragment processing unit blends each source value with a corresponding destination value to create a set of final values, and creates one or more dispatch messages to store the set of final values in a set of output registers. One advantage of the disclosed techniques is that pixel shader programs perform per-sample operations with increased efficiency.Type: GrantFiled: December 21, 2012Date of Patent: July 14, 2015Assignee: NVIDIA CorporationInventors: Jerome F. Duluk, Jr., Jesse David Hall
-
Patent number: 9069609Abstract: One embodiment of the present invention sets forth a technique for assigning a compute task to a first processor included in a plurality of processors. The technique involves analyzing each compute task in a plurality of compute tasks to identify one or more compute tasks that are eligible for assignment to the first processor, where each compute task is listed in a first table and is associated with a priority value and an allocation order that indicates relative time at which the compute task was added to the first table. The technique further involves selecting a first task compute from the identified one or more compute tasks based on at least one of the priority value and the allocation order, and assigning the first compute task to the first processor for execution.Type: GrantFiled: January 18, 2012Date of Patent: June 30, 2015Assignee: NVIDIA CORPORATIONInventors: Karim M. Abdalla, Lacky V. Shah, Jerome F. Duluk, Jr., Timothy John Purcell, Tanmoy Mandal, Gentaro Hirota
-
Publication number: 20150178879Abstract: A system, method, and computer program product are provided for allocating processor resources to process compute workloads and graphics workloads substantially simultaneously. The method includes the steps of allocating a plurality of processing units to process tasks associated with a graphics pipeline, receiving a request to allocate at least one processing unit in the plurality of processing units to process tasks associated with a compute pipeline, and reallocating the at least one processing unit to process tasks associated with the compute pipeline.Type: ApplicationFiled: December 20, 2013Publication date: June 25, 2015Applicant: NVIDIA CORPORATIONInventors: Gregory S. Palmer, Jerome F. Duluk, JR., Karim Maher Abdalla, Jonathon S. Evans, Adam Clark Weitkemper, Lacky Vasant Shah, Philip Browning Johnson, Gentaro Hirota
-
Patent number: 9013498Abstract: A system and method for tracking and reporting texture map levels of detail that are computed during graphics processing allows for efficient management of texture map storage. Minimum and/or maximum pre-clamped texture map levels of detail values are tracked by a graphics processor and an array stored in memory is updated to report the minimum and/or maximum values for use by an application program. The minimum and/or maximum values may be used to determine the active set of texture map levels of detail that is loaded into graphics memory.Type: GrantFiled: December 19, 2008Date of Patent: April 21, 2015Assignee: NVIDIA CorporationInventors: John S. Montrym, Andrew J. Tao, Henry P. Moreton, Emmett M. Kilgariff, Cass W. Everitt, Alexander L. Minkin, Eric Anderson, Yan Yan Tang, Jerome F. Duluk, Jr.
-
Publication number: 20150097847Abstract: One embodiment of the present invention includes a memory management unit (MMU) that is configured to manage sparse mappings. The MMU processes requests to translate virtual addresses to physical addresses based on page table entries (PTEs) that indicate a sparse status. If the MMU determines that the PTE does not include a mapping from a virtual address to a physical address, then the MMU responds to the request based on the sparse status. If the sparse status is active, then the MMU determines the physical address based on whether the type of the request is a write operation and, subsequently, generates an acknowledgement of the request. By contrast, if the sparse status is not active, then the MMU generates a page fault. Advantageously, the disclosed embodiments enable the computer system to manage sparse mappings without incurring the performance degradation associated with both page faults and conventional software-based sparse mapping management.Type: ApplicationFiled: October 4, 2013Publication date: April 9, 2015Applicant: NVIDIA CORPORATIONInventors: Jonathan DUNAISKY, Henry Packard MORETON, Jeffrey A. BOLZ, Yury Y. URALSKY, James Leroy DEMING, Rui M. BASTOS, Patrick R. BROWN, Amanpreet GREWAL, Christian AMSINCK, Poornachandra RAO, Jerome F. DULUK, JR., Andrew J. TAO
-
Publication number: 20150084974Abstract: One embodiment sets forth a method for allocating memory to surfaces. A software application specifies surface data, including interleaving state data. Based on the interleaving state data, a surface access unit bloats addressees derived from discrete coordinates associated with the surface, creating a bloated virtual address space with a predictable pattern of addresses that do not correspond to data. Advantageously, by creating predictable regions of addresses that do not correspond to data, the software application program may configure the surface to share physical memory space with one or more other surfaces. In particular, the software application may map the virtual address space together with one or more virtual address spaces corresponding to complementary data patterns to the same physical base address. And, by overlapping the virtual address spaces onto the same pages in physical address space, the physical memory may be more densely packed than by using prior-art allocation techniques.Type: ApplicationFiled: September 20, 2013Publication date: March 26, 2015Applicant: NVIDIA CORPORATIONInventors: Eric B. LUM, Cass W. EVERITT, Henry Packard MORETON, Yury Y. URALSKY, Cyril CRASSIN, Jerome F. DULUK, Jr.