Patents by Inventor Cass W. Everitt

Cass W. Everitt has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 8212825
    Abstract: One embodiment of the present invention sets forth a technique for more effectively utilizing graphics hardware by allowing the developer to exploit parallelism at the primitive-level. In this technique, an algorithm is analyzed to break the total work associated with processing one primitive into discrete portions of work. The results of this analysis are used to program a geometry shader group that includes multiple geometry shaders. Upon receiving a single input primitive, the geometry shader group launches multiple parallel threads, one thread in each geometry shader in the group corresponding to each discrete portion of work. As each thread completes, the output of the thread is stored in on-chip GPU memory for processing by the next stage in the graphics pipeline. Since the overall work associated with a given input primitive is distributed across multiple threads, the output of each thread is smaller and, thus, the total memory required to implement the algorithm is reduced.
    Type: Grant
    Filed: November 27, 2007
    Date of Patent: July 3, 2012
    Assignee: NVIDIA Corporation
    Inventors: Cass W. Everitt, Henry Packard Moreton
  • Publication number: 20120147027
    Abstract: A method and system for improving data coherency in a parallel rendering system is disclosed. Specifically, one embodiment of the present invention sets forth a method for managing a plurality of independently processed texture streams in a parallel rendering system that includes the steps of maintaining a time stamp for a group of tiles of work that are associated with each of the plurality of the texture streams and are associated with a specified area in screen space, and utilizing the time stamps to counter divergences in the independent processing of the plurality of texture streams.
    Type: Application
    Filed: February 17, 2012
    Publication date: June 14, 2012
    Inventors: Steven E. Molnar, Cass W. Everitt, Roger L. Allen, Gary M. Tarolli, John M. Danskin
  • Patent number: 8179394
    Abstract: One embodiment of the present invention sets forth a technique to perform fine-grained rendering predication using an IGPU and a DGPU. A graphics driver divides a 3D object into batches of triangles. The IGPU processes each batch of triangles through a modified rendering pipeline to determine if the batch is culled. The IGPU writes bits into a bitstream corresponding to the visibility of the batches. The DGPU reads bits from the bitstream and performs full-blown rendering, including shading, but only on the batches of triangles whose bit indicates that the batch is visible. Advantageously, this approach to rendering predication provides fine-grained culling without adding unnecessary overhead, thereby optimizing both hardware resources and performance.
    Type: Grant
    Filed: December 13, 2007
    Date of Patent: May 15, 2012
    Assignee: NVIDIA Corporation
    Inventors: Cass W. Everitt, Franck R. Diard
  • Patent number: 8171461
    Abstract: Systems and methods for compiling high-level primitive programs are used to generate primitive program micro-code for execution by a primitive processor. A compiler is configured to produce micro-code for a specific target primitive processor based on the target primitive processor's capabilities. The compiler supports features of the high-level primitive program by providing conversions for different applications programming interface conventions, determining output primitive types, initializing attribute arrays based on primitive input profile modifiers, and determining vertex set lengths from specified primitive input types.
    Type: Grant
    Filed: February 24, 2006
    Date of Patent: May 1, 2012
    Assignee: NVIDIA Corporation
    Inventors: Mark J. Kilgard, Cass W. Everitt, Christopher T. Dodd, Robert Steven Glanville
  • Patent number: 8139069
    Abstract: A method and system for improving data coherency in a parallel rendering system is disclosed. Specifically, one embodiment of the present invention sets forth a method for managing a plurality of independently processed texture streams in a parallel rendering system that includes the steps of maintaining a time stamp for a group of tiles of work that are associated with each of the plurality of the texture streams and are associated with a specified area in screen space, and utilizing the time stamps to counter divergences in the independent processing of the plurality of texture streams.
    Type: Grant
    Filed: November 3, 2006
    Date of Patent: March 20, 2012
    Assignee: NVIDIA Corporation
    Inventors: Steven E. Molnar, Cass W. Everitt, Roger L. Allen, Gary M. Tarolli, John M. Danskin
  • Patent number: 8102393
    Abstract: One embodiment of the present invention sets forth a technique to perform fine-grained rendering predication using an IGPU and a DGPU. A graphics driver divides a 3D object into batches of triangles. The IGPU processes each batch of triangles through a modified rendering pipeline to determine if the batch is culled. The IGPU writes bits into a bitstream corresponding to the visibility of the batches. The DGPU reads bits from the bitstream and performs full-blown rendering, including shading, but only on the batches of triangles whose bit indicates that the batch is visible. Advantageously, this approach to rendering predication provides fine-grained culling without adding unnecessary overhead, thereby optimizing both hardware resources and performance.
    Type: Grant
    Filed: December 13, 2007
    Date of Patent: January 24, 2012
    Assignee: NVIDIA Corporation
    Inventors: Cass W. Everitt, Franck R. Diard
  • Patent number: 8094158
    Abstract: Systems and methods for using multiple versions of programmable constants within a multi-threaded processor allow a programmable constant to be changed before a program using the constants has completed execution. Processing performance may be improved since programs using different values for a programmable constant may execute simultaneously. The programmable constants are stored in a constant buffer and an entry of a constant buffer table is bound to the constant buffer. When a programmable constant is changed it is copied to an entry in a page pool and address translation for the page pool is updated to correspond to the old version (copy) of the programmable constant. An advantage is that the constant buffer stores the newest version of the programmable constant.
    Type: Grant
    Filed: January 31, 2006
    Date of Patent: January 10, 2012
    Assignee: NVIDIA Corporation
    Inventors: Roger L. Allen, Cass W. Everitt, Henry P. Moreton, Thomas H. Kong
  • Patent number: 8085272
    Abstract: A method and system for improving data coherency in a parallel rendering system is disclosed. Specifically, one embodiment of the present invention sets forth a method, which includes the steps of receiving a common input stream, tracking a periodic event associated with the common input stream, generating a plurality of fragment streams from the common input stream, inserting a marker based on an occurrence of the periodic event in a first fragment stream in the multiple fragment streams, and utilizing the marker to influence the processing of the first fragment stream so that a plurality of raster operation (ROP) request streams maintains substantially the same coherence as the common input stream. Each fragment stream is independently processed and corresponds to one of the ROP request streams.
    Type: Grant
    Filed: November 3, 2006
    Date of Patent: December 27, 2011
    Assignee: NVIDIA Corporation
    Inventors: Steven E. Molnar, Cass W. Everitt, Roger L. Allen, Gary M. Tarolli, John M. Danskin, Adam Clark Weitkemper, Mark J. French
  • Patent number: 8010944
    Abstract: One embodiment of the invention includes a method for extending an object-oriented programming language to include support for a shading language vector data type. The method generally includes defining a template class for a shading language vector, defining a template class for a swizzled vector, and partially specializing the vector template class for vectors of one, two, three, and four elements. The partial specialization includes a union of instances of the vector swizzle template, where each instance represents a desired vector swizzle. In addition to defining the vector and vector swizzle data types, the template classes may overload operators provided by the object-oriented programming language to perform operations corresponding to operations of the operators in the shading language.
    Type: Grant
    Filed: December 8, 2006
    Date of Patent: August 30, 2011
    Assignee: NVIDIA Corporation
    Inventors: Mark J. Kilgard, Cass W. Everitt
  • Patent number: 8010945
    Abstract: One embodiment of the invention includes a method for extending an object-oriented programming language to include support for a shading language vector data type. The method generally includes defining a template class for a shading language vector, defining a template class for a swizzled vector, and partially specializing the vector template class for vectors of one, two, three, and four elements. The partial specialization includes a union of instances of the vector swizzle template, where each instance represents a desired vector swizzle. In addition to defining the vector and vector swizzle data types, the template classes may overload operators provided by the object-oriented programming language to perform operations corresponding to operations of the operators in the shading language.
    Type: Grant
    Filed: December 8, 2006
    Date of Patent: August 30, 2011
    Assignee: NVIDIA Corporation
    Inventors: Mark J. Kilgard, Cass W. Everitt
  • Patent number: 8006236
    Abstract: Systems and methods for compiling high-level primitive programs are used to generate primitive program micro-code for execution by a primitive processor. A compiler is configured to produce micro-code for a specific target primitive processor based on the target primitive processor's capabilities. The compiler supports features of the high-level primitive program by providing conversions for different applications programming interface conventions, determining output primitive types, initializing attribute arrays based on primitive input profile modifiers, and determining vertex set lengths from specified primitive input types.
    Type: Grant
    Filed: February 24, 2006
    Date of Patent: August 23, 2011
    Assignee: NVIDIA Corporation
    Inventors: Mark J. Kilgard, Cass W. Everitt, Christopher T. Dodd, Robert Steven Glanville
  • Patent number: 7999820
    Abstract: Methods and systems for reusing memory addresses in a graphics system are disclosed, so that instances of address translation hardware can be reduced. One embodiment of the present invention sets forth a method, which includes mapping a footprint on a display screen to a group of contiguous physical memory locations in a memory system, determining an anchor physical memory address from a first transaction associated with the footprint, wherein the anchor physical memory address corresponds to an anchor in the group of contiguous physical memory locations, determining a second transaction that is also associated with the footprint, determining a set of least significant bits (LSBs) associated with the second transaction, and combining the anchor physical memory address with the set of LSBs associated with the second transaction to generate a second physical memory address for the second transaction, thereby avoiding a second full address translation.
    Type: Grant
    Filed: December 10, 2007
    Date of Patent: August 16, 2011
    Assignee: NVIDIA Corporation
    Inventors: Adam Clark Weitkemper, Steven E. Molnar, Mark J. French, Cass W. Everitt
  • Patent number: 7944452
    Abstract: Methods and systems for reusing memory addresses in a graphics system are disclosed, so that instances of address translation hardware can be reduced. One embodiment of the present invention sets forth a method, which includes mapping a footprint in screen space to a group of contiguous physical memory locations in a memory system, determining a first physical memory address for a first transaction associated with the footprint, wherein the first physical memory address is within the group of contiguous physical memory locations, determining a second transaction that is also associated with the footprint, determining a set of least significant bits associated with the second transaction, and combining a portion of the first physical memory address with the set of least significant bits associated with the second transaction to generate a second physical memory address for the second transaction, thereby avoiding a second full address translation.
    Type: Grant
    Filed: October 23, 2006
    Date of Patent: May 17, 2011
    Assignee: NVIDIA Corporation
    Inventors: Adam Clark Weitkemper, Steven E. Molnar, Mark J. French, Cass W. Everitt
  • Publication number: 20110087840
    Abstract: One embodiment of the present invention sets forth a technique for performing a memory access request to compressed data within a virtually mapped memory system comprising an arbitrary number of partitions. A virtual address is mapped to a linear physical address, specified by a page table entry (PTE). The PTE is configured to store compression attributes, which are used to locate compression status for a corresponding physical memory page within a compression status bit cache. The compression status bit cache operates in conjunction with a compression status bit backing store. If compression status is available from the compression status bit cache, then the memory access request proceeds using the compression status. If the compression status bit cache misses, then the miss triggers a fill operation from the backing store. After the fill completes, memory access proceeds using the newly filled compression status information.
    Type: Application
    Filed: October 8, 2010
    Publication date: April 14, 2011
    Inventors: David B. Glasco, Peter B. Holmqvist, George R. Lynch, Patrick R. Marchand, Karan Mehra, James Roberts, Cass W. Everitt, Steven E. Molnar
  • Patent number: 7886116
    Abstract: Embodiments of the present invention set forth systems and methods for compressing thread group data written to frame buffer memory to increase overall memory performance. A compression/decompression engine within the frame buffer memory interface includes logic configured to identify situations where the threads of a thread group are writing similar scalar values to memory. Upon recognizing such a situation, the engine is configured to compress the scalar data into a form that allows all of the scalar data to be written to or read from the frame buffer memory in fewer clock cycles than would be required to transmit the data in uncompressed form to or from memory. Consequently, the disclosed systems and methods are able to effectively increase memory performance when executing thread group STORE and LOAD operations.
    Type: Grant
    Filed: July 30, 2007
    Date of Patent: February 8, 2011
    Assignee: NVIDIA Corporation
    Inventor: Cass W. Everitt
  • Patent number: 7877565
    Abstract: Systems and methods for using multiple versions of programmable constants within a multi-threaded processor allow a programmable constant to be changed before a program using the constants has completed execution. Processing performance may be improved since programs using different values for a programmable constant may execute simultaneously. The programmable constants are stored in a constant buffer and an entry of a constant buffer table is bound to the constant buffer. When a programmable constant is changed it is copied to an entry in a page pool and address translation for the page pool is updated to correspond to the old version (copy) of the programmable constant. An advantage is that the constant buffer stores the newest version of the programmable constant.
    Type: Grant
    Filed: January 31, 2006
    Date of Patent: January 25, 2011
    Assignee: NVIDIA Corporation
    Inventors: Roger L. Allen, Cass W. Everitt, Henry Packard Moreton, Thomas H. Kong, Simon S. Moy
  • Patent number: 7868891
    Abstract: Embodiments of methods, apparatuses, devices, and/or systems for load balancing two processors, such as for graphics and/or video processing, for example, are described.
    Type: Grant
    Filed: September 16, 2005
    Date of Patent: January 11, 2011
    Assignee: NVIDIA Corporation
    Inventors: Daniel Elliot Wexler, Larry I. Gritz, Eric B. Enderton, Cass W. Everitt
  • Patent number: 7825933
    Abstract: Systems and methods for compiling high-level primitive programs are used to generate primitive program micro-code for execution by a primitive processor. A compiler is configured to produce micro-code for a specific target primitive processor based on the target primitive processor's capabilities. The compiler supports features of the high-level primitive program by providing conversions for different applications programming interface conventions, determining output primitive types, initializing attribute arrays based on primitive input profile modifiers, and determining vertex set lengths from specified primitive input types.
    Type: Grant
    Filed: February 24, 2006
    Date of Patent: November 2, 2010
    Assignee: NVIDIA Corporation
    Inventors: Mark J. Kilgard, Cass W. Everitt, Christopher T. Dodd, Robert Steven Glanville
  • Patent number: 7746352
    Abstract: A virtually-addressed local texture memory stores selected regions (a sparse representation) of a texture for use by a graphics processor. The graphics processor requests a texel of the texture by referencing a virtual address of the texel. A memory interface references an address map to determine whether the requested texel is in one of the regions of the texture that is resident in the local texture memory. If so, the texel is retrieved from the local memory and used in the rendering operation; if not, an alternative texel that is resident in the local memory is retrieved and used in the rendering operation. Non-resident regions that include requested texels are retrieved from a primary texture data store at regular intervals (e.g., once per frame) and stored in local texture memory for use in a subsequent rendering operation.
    Type: Grant
    Filed: December 14, 2006
    Date of Patent: June 29, 2010
    Assignee: NVIDIA Corporation
    Inventor: Cass W. Everitt
  • Publication number: 20100002000
    Abstract: A system and method for dynamically adjusting the pixel sampling rate during primitive shading can improve image quality or increase shading performance. Hybrid antialiasing is performed by selecting a number of shaded samples per pixel fragment. A combination of supersample and multisample antialiasing is used where a cluster of sub-pixel samples (multisamples) is processed for each pass through a fragment shader pipeline. The number of shader passes and multisamples in each cluster can be determined dynamically for each primitive based on rendering state.
    Type: Application
    Filed: July 3, 2008
    Publication date: January 7, 2010
    Inventors: Cass W. Everitt, Steven E. Molnar