Patents by Inventor Jerome F. Duluk, Jr.

Jerome F. Duluk, Jr. has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20150082001
    Abstract: One embodiment of the present invention includes techniques to support demand paging across a processing unit. Before a host unit transmits a command to an engine that does not tolerate page faults, the host unit ensures that the virtual memory addresses associated with the command are appropriately mapped to physical memory addresses. In particular, if the virtual memory addresses are not appropriately mapped, then the processing unit performs actions to map the virtual memory address to appropriate locations in physical memory. Further, the processing unit ensures that the access permissions required for successful execution of the command are established. Because the virtual memory address mappings associated with the command are valid when the engine receives the command, the engine does not encounter page faults upon executing the command. Consequently, in contrast to prior-art techniques, the engine supports demand paging regardless of whether the engine is involved in remedying page faults.
    Type: Application
    Filed: September 13, 2013
    Publication date: March 19, 2015
    Applicant: NVIDIA CORPORATION
    Inventors: Samuel H. DUNCAN, Jerome F. DULUK, JR., Jonathon Stuart Ramsay EVANS, James Leroy DEMING
  • Patent number: 8984183
    Abstract: One embodiment of the present invention sets forth a technique for enabling the insertion of generated tasks into a scheduling pipeline of a multiple processor system allows a compute task that is being executed to dynamically generate a dynamic task and notify a scheduling unit of the multiple processor system without intervention by a CPU. A reflected notification signal is generated in response to a write request when data for the dynamic task is written to a queue. Additional reflected notification signals are generated for other events that occur during execution of a compute task, e.g., to invalidate cache entries storing data for the compute task and to enable scheduling of another compute task.
    Type: Grant
    Filed: December 16, 2011
    Date of Patent: March 17, 2015
    Assignee: Nvidia Corporation
    Inventors: Timothy John Purcell, Lacky V. Shah, Jerome F. Duluk, Jr., Sean J. Treichler, Karim M. Abdalla, Philip Alexander Cuadra, Brian Pharris
  • Patent number: 8970608
    Abstract: One embodiment of the present invention sets forth a technique for transmitting state information associated with at least one graphics command to a graphics processor. The method includes the steps of generating a state object that specifies a set of properties that is needed to execute a first graphics command within the graphics processor, storing in the state object a value associated with a first property included in the set of properties, marking a second property included in the set of properties as a dynamic property, where a value associated with the second property is not stored in the state object and can be updated without having to modify the state object, and transmitting the state object to the graphics processor in order to execute the first graphics command.
    Type: Grant
    Filed: April 1, 2011
    Date of Patent: March 3, 2015
    Assignee: NVIDIA Corporation
    Inventors: Jeffrey A. Bolz, Eric S. Werness, Jerome F. Duluk, Jr.
  • Publication number: 20150049110
    Abstract: One embodiment sets forth a method for transforming 3-D images into 2-D rendered images using render target sample masks. A software application creates multiple render targets associated with a surface. For each render target, the software application also creates an associated render target sample mask configured to select one or more samples included in each pixel. Within the graphics pipeline, a pixel shader processes each pixel individually and outputs multiple render target-specific color values. For each render target, a ROP unit uses the associated render target sample mask to select covered samples included in the pixel. Subsequently, the ROP unit uses the render target-specific color value to update the selected samples in the render target, thereby achieving sample-level color granularity.
    Type: Application
    Filed: August 16, 2013
    Publication date: February 19, 2015
    Applicant: NVIDIA CORPORATION
    Inventors: Eric B. LUM, Jerome F. DULUK, JR., Yury Y. URALSKY, Rouslan DIMITROV, Rui M. BASTOS
  • Publication number: 20150038205
    Abstract: One embodiment of the present invention includes distributing bonus awards associated with a spelling game. A spelling game server system defines a first subset of letters included in a plurality of letters. A spelling game server system defines a set of allowed words that includes a bonus word. A spelling game server system receives one or more letters from the first subset of letters, where the subset of letters includes at least one newly spelled word, and the at least one newly spelled word is included in the set of allowed words. A spelling game server system awarding a first reward based on the at least one newly spelled word, wherein the first reward includes a bonus reward when the at least one newly spelled word includes at least a portion of the bonus word.
    Type: Application
    Filed: June 12, 2014
    Publication date: February 5, 2015
    Inventors: Veronica Wiechers DULUK, Jerome F. DULUK, Jr.
  • Publication number: 20150015595
    Abstract: One embodiment sets forth a method for associating each stencil value included in a stencil buffer with multiple fragments. Components within a graphics processing pipeline use a set of stencil masks to partition the bits of each stencil value. Each stencil mask selects a different subset of bits, and each fragment is strategically associated with both a stencil value and a stencil mask. Before performing stencil actions associated with a fragment, the raster operations unit performs stencil mask operations on the operands. No fragments are associated with both the same stencil mask and the same stencil value. Consequently, no fragments are associated with the same stencil bits included in the stencil buffer. Advantageously, by reducing the number of stencil bits associated with each fragment, certain classes of software applications may reduce the wasted memory associated with stencil buffers in which each stencil value is associated with a single fragment.
    Type: Application
    Filed: July 15, 2013
    Publication date: January 15, 2015
    Applicant: NVIDIA CORPORATION
    Inventors: Eric B. LUM, Jerome F. DULUK, Jr.
  • Publication number: 20150015594
    Abstract: One embodiment sets forth a method for associating each stencil value included in a stencil buffer with multiple fragments. Components within a graphics processing pipeline use a set of stencil masks to partition the bits of each stencil value. Each stencil mask selects a different subset of bits, and each fragment is strategically associated with both a stencil value and a stencil mask. Before performing stencil actions associated with a fragment, the raster operations unit performs stencil mask operations on the operands. No fragments are associated with both the same stencil mask and the same stencil value. Consequently, no fragments are associated with the same stencil bits included in the stencil buffer. Advantageously, by reducing the number of stencil bits associated with each fragment, certain classes of software applications may reduce the wasted memory associated with stencil buffers in which each stencil value is associated with a single fragment.
    Type: Application
    Filed: July 15, 2013
    Publication date: January 15, 2015
    Inventors: Eric B. LUM, Jerome F. DULUK, JR.
  • Patent number: 8922555
    Abstract: One embodiment of the present invention sets forth a technique for storing only the enabled components for each enabled vector and writing only enabled components to one or more specified render targets. A shader program header (SPH) file provides per-component mask bits for each render target. Each enabled mask bit indicates that the pixel shader generates the corresponding component as an output to the raster operations unit. In the hardware, the per-component mask bits are combined with the applications programming interface (API)-level per-component write masks to determine the components that are updated by the shader program. The combined mask is used as the write enable bits for components in one or more render targets. One advantage of the combined mask is that the components that are not updated are not forwarded from the pixel shader to the ROP, thereby saving bandwidth between those processing units.
    Type: Grant
    Filed: October 6, 2010
    Date of Patent: December 30, 2014
    Assignee: NVIDIA Corporation
    Inventors: Jerome F. Duluk, Jr., Jesse David Hall, Patrick R. Brown, Mark Dennis Stadler
  • Publication number: 20140337569
    Abstract: A system, method, and computer program product for low-latency scheduling and launch of memory defined tasks. The method includes the steps of receiving a task metadata data structure to be stored in a memory associated with a processor, transmitting the task metadata data structure to a scheduling unit of the processor, storing the task metadata data structure in a cache unit included in the scheduling unit, and copying the task metadata data structure from the cache unit to the memory.
    Type: Application
    Filed: May 8, 2013
    Publication date: November 13, 2014
    Applicant: Nvidia Corporation
    Inventors: Scott Ricketts, Brian Scott Pharris, Nicholas Wang, Luke David Durant, Philip Alexander Cuadra, Jerome F. Duluk, Jr.
  • Patent number: 8860743
    Abstract: Systems and methods for texture processing are presented. In one embodiment a texture method includes creating a sparse texture residency translation map; performing a probe process utilizing the sparse texture residency translation map information to return a finest LOD that contains the texels for a texture lookup operation; and performing the texture lookup operation utilizing the finest LOD. In one exemplary implementation, the finest LOD is utilized as a minimum LOD clamp during the texture lookup operation. A finest LOD number indicates a minimum resident LOD and a sparse texture residency translation map includes one finest LOD number per tile of a sparse texture. The sparse texture residency translation can indicate a minimum resident LOD.
    Type: Grant
    Filed: December 31, 2009
    Date of Patent: October 14, 2014
    Assignee: Nvidia Corporation
    Inventors: Andrew Tao, Jerome F. Duluk, Jr., Jesse D. Hall, Henry Moreton
  • Publication number: 20140281296
    Abstract: A system for managing virtual memory. The system includes a first processing unit configured to execute a first operation that references a first virtual memory address. The system also includes a first memory management unit (MMU) associated with the first processing unit and configured to generate a first page fault upon determining that a first page table that is stored in a first memory unit associated with the first processing unit does not include a mapping corresponding to the first virtual memory address. The system further includes a first copy engine associated with the first processing unit. The first copy engine is configured to read a first command queue to determine a first mapping that corresponds to the first virtual memory address and is included in a first page state directory. The first copy engine is also configured to update the first page table to include the first mapping.
    Type: Application
    Filed: October 16, 2013
    Publication date: September 18, 2014
    Applicant: NVIDIA CORPORATION
    Inventors: Jerome F. DULUK, JR., Cameron BUSCHARDT, Sherry CHEUNG, James Leroy DEMING, Samuel H. DUNCAN, Lucien DUNNING, Robert GEORGE, Arvind GOPALAKRISHNAN, Mark HAIRGROVE, Chenghuan JIA, John MASHEY
  • Publication number: 20140281357
    Abstract: A system for managing virtual memory. The system includes a first processing unit configured to execute a first operation that references a first virtual memory address. The system also includes a first memory management unit (MMU) associated with the first processing unit and configured to generate a first page fault upon determining that a first page table that is stored in a first memory unit associated with the first processing unit does not include a mapping corresponding to the first virtual memory address. The system further includes a first copy engine associated with the first processing unit. The first copy engine is configured to read a first command queue to determine a first mapping that corresponds to the first virtual memory address and is included in a first page state directory. The first copy engine is also configured to update the first page table to include the first mapping.
    Type: Application
    Filed: October 16, 2013
    Publication date: September 18, 2014
    Applicant: NVIDIA Corporation
    Inventors: Jerome F. DULUK, JR., Cameron BUSCHARDT, Sherry CHEUNG, James Leroy DEMING, Samuel H. DUNCAN, Lucien DUNNING, Robert GEORGE, Arvind GOPALAKRISHNAN, Mark HAIRGROVE, Chenghuan JIA, John MASHEY
  • Publication number: 20140281263
    Abstract: One embodiment of the present invention is a parallel processing unit (PPU) that includes one or more streaming multiprocessors (SMs) and implements a replay unit per SM. Upon detecting a page fault associated with a memory transaction issued by a particular SM, the corresponding replay unit causes the SM, but not any unaffected SMs, to cease issuing new memory transactions. The replay unit then stores the faulting memory transaction and any faulting in-flight memory transaction in a replay buffer. As page faults are resolved, the replay unit replays the memory transactions in the replay buffer—removing successful memory transactions from the replay buffer—until all of the stored memory transactions have successfully executed. Advantageously, the overall performance of the PPU is improved compared to conventional PPUs that, upon detecting a page fault, stop performing memory transactions across all SMs included in the PPU until the fault is resolved.
    Type: Application
    Filed: December 17, 2013
    Publication date: September 18, 2014
    Applicant: NVIDIA CORPORATION
    Inventors: James Leroy DEMING, Jerome F. DULUK, Jr., John MASHEY, Mark HAIRGROVE, Lucien DUNNING, Jonathon Stuart Ramsay EVANS, Samuel H. DUNCAN, Cameron BUSCHARDT, Brian FAHS
  • Publication number: 20140267264
    Abstract: One embodiment of the present invention sets forth a technique for performing voxelization. The technique involves identifying a voxel that is intersected by a first graphics primitive that has a front side and a back side and selecting a plurality of sample points within the voxel. The technique further involves determining, for each sample point included in the plurality of sample points, whether the sample point is located on the front side of the first graphics primitive or on the back side of the first graphics primitive. Finally, the technique involves storing, for at least a first sample point included in the plurality of sample points, a first result in a voxel mask reflecting whether the first sample point is located on the front side of the first graphics primitive or on the back side of the first graphics primitive.
    Type: Application
    Filed: March 14, 2013
    Publication date: September 18, 2014
    Applicant: NVIDIA CORPORATION
    Inventors: Cyril CRASSIN, Yury Y. URALSKY, Eric ENDERTON, Eric B. LUM, Jerome F. DULUK, JR., Henry Packard MORETON, David LUEBKE
  • Publication number: 20140281110
    Abstract: Techniques are disclosed for tracking memory page accesses in a unified virtual memory system. An access tracking unit detects a memory page access generated by a first processor for accessing a memory page in a memory system of a second processor. The access tracking unit determines whether a cache memory includes an entry for the memory page. If so, then the access tracking unit increments an associated access counter. Otherwise, the access tracking unit attempts to find an unused entry in the cache memory that is available for allocation. If so, then the access tracking unit associates the second entry with the memory page, and sets an access counter associated with the second entry to an initial value. Otherwise, the access tracking unit selects a valid entry in the cache memory; clears an associated valid bit; associates the entry with the memory page; and initializes an associated access counter.
    Type: Application
    Filed: December 9, 2013
    Publication date: September 18, 2014
    Applicant: NVIDIA CORPORATION
    Inventors: Jerome F. DULUK, Jr., Cameron BUSCHARDT, James Leroy DEMING, Brian FAHS, Mark HAIRGROVE, John MASHEY
  • Publication number: 20140281365
    Abstract: One embodiment of the present invention is a memory subsystem that includes a sliding window tracker that tracks memory accesses associated with a sliding window of memory page groups. When the sliding window tracker detects an access operation associated with a memory page group within the sliding window, the sliding window tracker sets a reference bit that is associated with the memory page group and is included in a reference vector that represents accesses to the memory page groups within the sliding window. Based on the values of the reference bits, the sliding window tracker causes the selection a memory page in a memory page group that has fallen into disuse from a first memory to a second memory. Because the sliding window tracker tunes the memory pages that are resident in the first memory to reflect memory access patterns, the overall performance of the memory subsystem is improved.
    Type: Application
    Filed: December 12, 2013
    Publication date: September 18, 2014
    Applicant: NVIDIA CORPORATION
    Inventors: John MASHEY, Cameron BUSCHARDT, James Leroy DEMING, Jerome F. DULUK, JR., Brian FAHS
  • Publication number: 20140281324
    Abstract: One embodiment of the present invention sets forth a computer-implemented method for migrating a memory page from a first memory to a second memory. The method includes determining a first page size supported by the first memory. The method also includes determining a second page size supported by the second memory. The method further includes determining a use history of the memory page based on an entry in a page state directory associated with the memory page. The method also includes migrating the memory page between the first memory and the second memory based on the first page size, the second page size, and the use history.
    Type: Application
    Filed: December 19, 2013
    Publication date: September 18, 2014
    Applicant: NVIDIA CORPORATION
    Inventors: Jerome F. DULUK, JR., Cameron BUSCHARDT, James Leroy DEMING, Lucien DUNNING, Brian FAHS, Mark HAIRGROVE, Chenghuan JIA, John MASHEY, James M. VAN DYKE
  • Publication number: 20140281256
    Abstract: A system for managing virtual memory. The system includes a first processing unit configured to execute a first operation that references a first virtual memory address. The system also includes a first memory management unit (MMU) associated with the first processing unit and configured to generate a first page fault upon determining that a first page table that is stored in a first memory unit associated with the first processing unit does not include a mapping corresponding to the first virtual memory address. The system further includes a first copy engine associated with the first processing unit. The first copy engine is configured to read a first command queue to determine a first mapping that corresponds to the first virtual memory address and is included in a first page state directory. The first copy engine is also configured to update the first page table to include the first mapping.
    Type: Application
    Filed: October 16, 2013
    Publication date: September 18, 2014
    Applicant: NVIDIA CORPORATION
    Inventors: Jerome F. DULUK, JR., Cameron BUSCHARDT, Sherry CHEUNG, James Leroy DEMING, Samuel H. DUNCAN, Lucien DUNNING, Robert GEORGE, Arvind GOPALAKRISHNAN, Mark HAIRGROVE, Chenghuan JIA, John MASHEY
  • Publication number: 20140281297
    Abstract: Techniques are provided by which memory pages may be migrated among PPU memories in a multi-PPU system. According to the techniques, a UVM driver determines that a particular memory page should change ownership state and/or be migrated between one PPU memory and another PPU memory. In response to this determination, the UVM driver initiates a peer transition sequence to cause the ownership state and/or location of the memory page to change. Various peer transition sequences involve modifying mappings for one or more PPU, and copying a memory page from one PPU memory to another PPU memory. Several steps in peer transition sequences may be performed in parallel for increased processing speed.
    Type: Application
    Filed: December 19, 2013
    Publication date: September 18, 2014
    Applicant: NVIDIA CORPORATION
    Inventors: Jerome F. DULUK, JR., John MASHEY, Mark HAIRGROVE, Chenghuan JIA, Cameron BUSCHARDT, Lucien DUNNING, Brian FAHS
  • Publication number: 20140281299
    Abstract: Techniques are disclosed for transitioning a memory page between memories in a virtual memory subsystem. A unified virtual memory (UVM) driver detects a page fault in response to a memory access request associated with a first memory page, where a local page table does not include an entry corresponding to a virtual memory address included in the memory access request. The UVM driver, in response to the page fault, executes a page fault sequence. The page fault sequence includes modifying the ownership state associated with the first memory page to be central-processing-unit-shared. The page fault sequence further includes scheduling the first memory page for migration from a system memory associated with a central processing unit (CPU) to a local memory associated with a parallel processing unit (PPU). One advantage of the disclosed approach is that the PPU accesses memory pages with greater efficiency.
    Type: Application
    Filed: December 18, 2013
    Publication date: September 18, 2014
    Applicant: NVIDIA CORPORATION
    Inventors: Jerome F. DULUK, JR., Cameron BUSCHARDT, James Leroy DEMING, Lucien DUNNING, Brian FAHS, Mark HAIRGROVE, John MASHEY