Patents by Inventor Cass W. Everitt
Cass W. Everitt has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 8212825
Abstract: One embodiment of the present invention sets forth a technique for more effectively utilizing graphics hardware by allowing the developer to exploit parallelism at the primitive-level. In this technique, an algorithm is analyzed to break the total work associated with processing one primitive into discrete portions of work. The results of this analysis are used to program a geometry shader group that includes multiple geometry shaders. Upon receiving a single input primitive, the geometry shader group launches multiple parallel threads, one thread in each geometry shader in the group corresponding to each discrete portion of work. As each thread completes, the output of the thread is stored in on-chip GPU memory for processing by the next stage in the graphics pipeline. Since the overall work associated with a given input primitive is distributed across multiple threads, the output of each thread is smaller and, thus, the total memory required to implement the algorithm is reduced.
Type: Grant
Filed: November 27, 2007
Date of Patent: July 3, 2012
Assignee: NVIDIA Corporation
Inventors: Cass W. Everitt, Henry Packard Moreton
-
Publication number: 20120147027
Abstract: A method and system for improving data coherency in a parallel rendering system is disclosed. Specifically, one embodiment of the present invention sets forth a method for managing a plurality of independently processed texture streams in a parallel rendering system that includes the steps of maintaining a time stamp for a group of tiles of work that are associated with each of the plurality of the texture streams and are associated with a specified area in screen space, and utilizing the time stamps to counter divergences in the independent processing of the plurality of texture streams.
Type: Application
Filed: February 17, 2012
Publication date: June 14, 2012
Inventors: Steven E. Molnar, Cass W. Everitt, Roger L. Allen, Gary M. Tarolli, John M. Danskin
-
Patent number: 8179394
Abstract: One embodiment of the present invention sets forth a technique to perform fine-grained rendering predication using an IGPU and a DGPU. A graphics driver divides a 3D object into batches of triangles. The IGPU processes each batch of triangles through a modified rendering pipeline to determine if the batch is culled. The IGPU writes bits into a bitstream corresponding to the visibility of the batches. The DGPU reads bits from the bitstream and performs full-blown rendering, including shading, but only on the batches of triangles whose bit indicates that the batch is visible. Advantageously, this approach to rendering predication provides fine-grained culling without adding unnecessary overhead, thereby optimizing both hardware resources and performance.
Type: Grant
Filed: December 13, 2007
Date of Patent: May 15, 2012
Assignee: NVIDIA Corporation
Inventors: Cass W. Everitt, Franck R. Diard
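The visibility bitstream described in the abstract for patent 8179394 can be pictured with a small CPU-side sketch. This is only an illustration under stated assumptions, not the patented implementation: the class `VisibilityBitstream` and the function `batches_to_render` are hypothetical names, the producer stands in for the IGPU culling pass, and the consumer stands in for the DGPU deciding which batches receive full rendering.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the visibility bitstream: one bit per batch of triangles.
class VisibilityBitstream {
public:
    explicit VisibilityBitstream(std::size_t batch_count)
        : bits_((batch_count + 7) / 8, 0), count_(batch_count) {}

    // Producer side (the culling pass): record whether a batch survived culling.
    void set_visible(std::size_t batch, bool visible) {
        assert(batch < count_);
        const std::uint8_t mask = static_cast<std::uint8_t>(1u << (batch % 8));
        if (visible) {
            bits_[batch / 8] |= mask;
        } else {
            bits_[batch / 8] &= static_cast<std::uint8_t>(~mask);
        }
    }

    // Consumer side (the full rendering pass): read the bit for one batch.
    bool is_visible(std::size_t batch) const {
        assert(batch < count_);
        return ((bits_[batch / 8] >> (batch % 8)) & 1u) != 0;
    }

    std::size_t size() const { return count_; }

private:
    std::vector<std::uint8_t> bits_;  // packed visibility bits, initially all "culled"
    std::size_t count_;               // number of batches tracked
};

// Returns the indices of the batches that would receive full rendering.
std::vector<std::size_t> batches_to_render(const VisibilityBitstream& stream) {
    std::vector<std::size_t> visible;
    for (std::size_t i = 0; i < stream.size(); ++i) {
        if (stream.is_visible(i)) {
            visible.push_back(i);
        }
    }
    return visible;
}
```

Packing one bit per batch keeps the data exchanged between the two passes small, which is the property the abstract relies on for fine-grained culling with little overhead.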
-
Patent number: 8171461
Abstract: Systems and methods for compiling high-level primitive programs are used to generate primitive program micro-code for execution by a primitive processor. A compiler is configured to produce micro-code for a specific target primitive processor based on the target primitive processor's capabilities. The compiler supports features of the high-level primitive program by providing conversions for different applications programming interface conventions, determining output primitive types, initializing attribute arrays based on primitive input profile modifiers, and determining vertex set lengths from specified primitive input types.
Type: Grant
Filed: February 24, 2006
Date of Patent: May 1, 2012
Assignee: NVIDIA Corporation
Inventors: Mark J. Kilgard, Cass W. Everitt, Christopher T. Dodd, Robert Steven Glanville
-
Patent number: 8139069
Abstract: A method and system for improving data coherency in a parallel rendering system is disclosed. Specifically, one embodiment of the present invention sets forth a method for managing a plurality of independently processed texture streams in a parallel rendering system that includes the steps of maintaining a time stamp for a group of tiles of work that are associated with each of the plurality of the texture streams and are associated with a specified area in screen space, and utilizing the time stamps to counter divergences in the independent processing of the plurality of texture streams.
Type: Grant
Filed: November 3, 2006
Date of Patent: March 20, 2012
Assignee: NVIDIA Corporation
Inventors: Steven E. Molnar, Cass W. Everitt, Roger L. Allen, Gary M. Tarolli, John M. Danskin
-
Patent number: 8102393
Abstract: One embodiment of the present invention sets forth a technique to perform fine-grained rendering predication using an IGPU and a DGPU. A graphics driver divides a 3D object into batches of triangles. The IGPU processes each batch of triangles through a modified rendering pipeline to determine if the batch is culled. The IGPU writes bits into a bitstream corresponding to the visibility of the batches. The DGPU reads bits from the bitstream and performs full-blown rendering, including shading, but only on the batches of triangles whose bit indicates that the batch is visible. Advantageously, this approach to rendering predication provides fine-grained culling without adding unnecessary overhead, thereby optimizing both hardware resources and performance.
Type: Grant
Filed: December 13, 2007
Date of Patent: January 24, 2012
Assignee: NVIDIA Corporation
Inventors: Cass W. Everitt, Franck R. Diard
-
Patent number: 8094158
Abstract: Systems and methods for using multiple versions of programmable constants within a multi-threaded processor allow a programmable constant to be changed before a program using the constants has completed execution. Processing performance may be improved since programs using different values for a programmable constant may execute simultaneously. The programmable constants are stored in a constant buffer and an entry of a constant buffer table is bound to the constant buffer. When a programmable constant is changed it is copied to an entry in a page pool and address translation for the page pool is updated to correspond to the old version (copy) of the programmable constant. An advantage is that the constant buffer stores the newest version of the programmable constant.
Type: Grant
Filed: January 31, 2006
Date of Patent: January 10, 2012
Assignee: NVIDIA Corporation
Inventors: Roger L. Allen, Cass W. Everitt, Henry P. Moreton, Thomas H. Kong
-
Patent number: 8085272
Abstract: A method and system for improving data coherency in a parallel rendering system is disclosed. Specifically, one embodiment of the present invention sets forth a method, which includes the steps of receiving a common input stream, tracking a periodic event associated with the common input stream, generating a plurality of fragment streams from the common input stream, inserting a marker based on an occurrence of the periodic event in a first fragment stream in the multiple fragment streams, and utilizing the marker to influence the processing of the first fragment stream so that a plurality of raster operation (ROP) request streams maintains substantially the same coherence as the common input stream. Each fragment stream is independently processed and corresponds to one of the ROP request streams.
Type: Grant
Filed: November 3, 2006
Date of Patent: December 27, 2011
Assignee: NVIDIA Corporation
Inventors: Steven E. Molnar, Cass W. Everitt, Roger L. Allen, Gary M. Tarolli, John M. Danskin, Adam Clark Weitkemper, Mark J. French
-
Patent number: 8010944
Abstract: One embodiment of the invention includes a method for extending an object-oriented programming language to include support for a shading language vector data type. The method generally includes defining a template class for a shading language vector, defining a template class for a swizzled vector, and partially specializing the vector template class for vectors of one, two, three, and four elements. The partial specialization includes a union of instances of the vector swizzle template, where each instance represents a desired vector swizzle. In addition to defining the vector and vector swizzle data types, the template classes may overload operators provided by the object-oriented programming language to perform operations corresponding to operations of the operators in the shading language.
Type: Grant
Filed: December 8, 2006
Date of Patent: August 30, 2011
Assignee: NVIDIA Corporation
Inventors: Mark J. Kilgard, Cass W. Everitt
-
Patent number: 8010945
Abstract: One embodiment of the invention includes a method for extending an object-oriented programming language to include support for a shading language vector data type. The method generally includes defining a template class for a shading language vector, defining a template class for a swizzled vector, and partially specializing the vector template class for vectors of one, two, three, and four elements. The partial specialization includes a union of instances of the vector swizzle template, where each instance represents a desired vector swizzle. In addition to defining the vector and vector swizzle data types, the template classes may overload operators provided by the object-oriented programming language to perform operations corresponding to operations of the operators in the shading language.
Type: Grant
Filed: December 8, 2006
Date of Patent: August 30, 2011
Assignee: NVIDIA Corporation
Inventors: Mark J. Kilgard, Cass W. Everitt
-
Patent number: 8006236
Abstract: Systems and methods for compiling high-level primitive programs are used to generate primitive program micro-code for execution by a primitive processor. A compiler is configured to produce micro-code for a specific target primitive processor based on the target primitive processor's capabilities. The compiler supports features of the high-level primitive program by providing conversions for different applications programming interface conventions, determining output primitive types, initializing attribute arrays based on primitive input profile modifiers, and determining vertex set lengths from specified primitive input types.
Type: Grant
Filed: February 24, 2006
Date of Patent: August 23, 2011
Assignee: NVIDIA Corporation
Inventors: Mark J. Kilgard, Cass W. Everitt, Christopher T. Dodd, Robert Steven Glanville
-
Patent number: 7999820
Abstract: Methods and systems for reusing memory addresses in a graphics system are disclosed, so that instances of address translation hardware can be reduced. One embodiment of the present invention sets forth a method, which includes mapping a footprint on a display screen to a group of contiguous physical memory locations in a memory system, determining an anchor physical memory address from a first transaction associated with the footprint, wherein the anchor physical memory address corresponds to an anchor in the group of contiguous physical memory locations, determining a second transaction that is also associated with the footprint, determining a set of least significant bits (LSBs) associated with the second transaction, and combining the anchor physical memory address with the set of LSBs associated with the second transaction to generate a second physical memory address for the second transaction, thereby avoiding a second full address translation.
Type: Grant
Filed: December 10, 2007
Date of Patent: August 16, 2011
Assignee: NVIDIA Corporation
Inventors: Adam Clark Weitkemper, Steven E. Molnar, Mark J. French, Cass W. Everitt
-
Patent number: 7944452
Abstract: Methods and systems for reusing memory addresses in a graphics system are disclosed, so that instances of address translation hardware can be reduced. One embodiment of the present invention sets forth a method, which includes mapping a footprint in screen space to a group of contiguous physical memory locations in a memory system, determining a first physical memory address for a first transaction associated with the footprint, wherein the first physical memory address is within the group of contiguous physical memory locations, determining a second transaction that is also associated with the footprint, determining a set of least significant bits associated with the second transaction, and combining a portion of the first physical memory address with the set of least significant bits associated with the second transaction to generate a second physical memory address for the second transaction, thereby avoiding a second full address translation.
Type: Grant
Filed: October 23, 2006
Date of Patent: May 17, 2011
Assignee: NVIDIA Corporation
Inventors: Adam Clark Weitkemper, Steven E. Molnar, Mark J. French, Cass W. Everitt
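The address-combining step in the abstract for patent 7944452 amounts to simple bit arithmetic, sketched below. This is an illustration only: the function name `reuse_translation` and the parameter `lsb_bits` (how many low bits index within the footprint's contiguous block) are assumptions, since the abstract does not say how the bit split is chosen.

```cpp
#include <cassert>
#include <cstdint>

// Combine the upper portion of an already translated physical address with the
// least significant bits of a second transaction that falls in the same footprint.
// lsb_bits is a hypothetical parameter: the number of low bits that select a
// location inside the footprint's contiguous block of physical memory.
std::uint64_t reuse_translation(std::uint64_t first_physical,
                                std::uint64_t second_transaction_bits,
                                unsigned lsb_bits) {
    assert(lsb_bits < 64);
    const std::uint64_t lsb_mask = (std::uint64_t{1} << lsb_bits) - 1;
    // Keep the upper bits from the first translation, take the low bits from the second.
    return (first_physical & ~lsb_mask) | (second_transaction_bits & lsb_mask);
}
```

For example, with `lsb_bits` set to 8, a first address of `0x12345678` and second-transaction bits of `0xABC` combine to `0x123456BC`, with no second page-table lookup.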
-
Publication number: 20110087840
Abstract: One embodiment of the present invention sets forth a technique for performing a memory access request to compressed data within a virtually mapped memory system comprising an arbitrary number of partitions. A virtual address is mapped to a linear physical address, specified by a page table entry (PTE). The PTE is configured to store compression attributes, which are used to locate compression status for a corresponding physical memory page within a compression status bit cache. The compression status bit cache operates in conjunction with a compression status bit backing store. If compression status is available from the compression status bit cache, then the memory access request proceeds using the compression status. If the compression status bit cache misses, then the miss triggers a fill operation from the backing store. After the fill completes, memory access proceeds using the newly filled compression status information.
Type: Application
Filed: October 8, 2010
Publication date: April 14, 2011
Inventors: David B. Glasco, Peter B. Holmqvist, George R. Lynch, Patrick R. Marchand, Karan Mehra, James Roberts, Cass W. Everitt, Steven E. Molnar
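The hit, miss, and fill behavior described in the abstract for publication 20110087840 follows the usual cache-with-backing-store pattern, sketched below. The class `CompressionStatusCache`, the page-number key, the one-byte status value, and the default status of 0 for unknown pages are all assumptions made for illustration. The abstract does not specify these details, and the sketch omits eviction entirely.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical model of a compression status bit cache in front of a backing store.
class CompressionStatusCache {
public:
    using Store = std::unordered_map<std::uint64_t, std::uint8_t>;

    // The backing store is copied in so the sketch has no lifetime dependencies.
    explicit CompressionStatusCache(const Store& backing_store)
        : backing_(backing_store), fills_(0) {}

    // Returns the compression status for a physical page.
    // On a miss, the status is first filled from the backing store.
    std::uint8_t lookup(std::uint64_t page) {
        const auto hit = cache_.find(page);
        if (hit != cache_.end()) {
            return hit->second;  // hit: proceed with the cached status
        }
        ++fills_;                // miss: triggers a fill operation
        const auto source = backing_.find(page);
        // Assumption: pages absent from the backing store are treated as status 0.
        const std::uint8_t status = (source != backing_.end()) ? source->second : 0;
        cache_[page] = status;
        return status;           // proceed with the newly filled status
    }

    unsigned fills() const { return fills_; }

private:
    Store backing_;   // stand-in for the compression status bit backing store
    Store cache_;     // stand-in for the compression status bit cache
    unsigned fills_;  // number of fill operations performed so far
};
```

The `fills()` counter is there only to make the miss path observable: a second lookup of the same page should not increase it.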
-
Patent number: 7886116
Abstract: Embodiments of the present invention set forth systems and methods for compressing thread group data written to frame buffer memory to increase overall memory performance. A compression/decompression engine within the frame buffer memory interface includes logic configured to identify situations where the threads of a thread group are writing similar scalar values to memory. Upon recognizing such a situation, the engine is configured to compress the scalar data into a form that allows all of the scalar data to be written to or read from the frame buffer memory in fewer clock cycles than would be required to transmit the data in uncompressed form to or from memory. Consequently, the disclosed systems and methods are able to effectively increase memory performance when executing thread group STORE and LOAD operations.
Type: Grant
Filed: July 30, 2007
Date of Patent: February 8, 2011
Assignee: NVIDIA Corporation
Inventor: Cass W. Everitt
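One simple case of the "similar scalar values" idea in the abstract for patent 7886116 is a thread group in which every thread writes the same value. The sketch below handles only that uniform case and is an assumption about what "similar" might cover, not a description of the patented engine. The function names `try_compress_uniform` and `decompress_uniform` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// If every thread in the group writes the same scalar, store that single value in
// `compressed` and return true. Otherwise return false and leave `compressed` untouched.
bool try_compress_uniform(const std::vector<std::uint32_t>& per_thread_values,
                          std::uint32_t& compressed) {
    if (per_thread_values.empty()) {
        return false;
    }
    const std::uint32_t first = per_thread_values.front();
    for (const std::uint32_t value : per_thread_values) {
        if (value != first) {
            return false;  // values differ: fall back to the uncompressed path
        }
    }
    compressed = first;
    return true;
}

// Expand a compressed uniform value back into one scalar per thread (the LOAD side).
std::vector<std::uint32_t> decompress_uniform(std::uint32_t compressed,
                                              std::size_t thread_count) {
    return std::vector<std::uint32_t>(thread_count, compressed);
}
```

In the uniform case a whole group's STORE shrinks to one scalar, which is the kind of saving in transferred data that the abstract attributes to compression.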
-
Patent number: 7877565
Abstract: Systems and methods for using multiple versions of programmable constants within a multi-threaded processor allow a programmable constant to be changed before a program using the constants has completed execution. Processing performance may be improved since programs using different values for a programmable constant may execute simultaneously. The programmable constants are stored in a constant buffer and an entry of a constant buffer table is bound to the constant buffer. When a programmable constant is changed it is copied to an entry in a page pool and address translation for the page pool is updated to correspond to the old version (copy) of the programmable constant. An advantage is that the constant buffer stores the newest version of the programmable constant.
Type: Grant
Filed: January 31, 2006
Date of Patent: January 25, 2011
Assignee: NVIDIA Corporation
Inventors: Roger L. Allen, Cass W. Everitt, Henry Packard Moreton, Thomas H. Kong, Simon S. Moy
-
Patent number: 7868891
Abstract: Embodiments of methods, apparatuses, devices, and/or systems for load balancing two processors, such as for graphics and/or video processing, for example, are described.
Type: Grant
Filed: September 16, 2005
Date of Patent: January 11, 2011
Assignee: NVIDIA Corporation
Inventors: Daniel Elliot Wexler, Larry I. Gritz, Eric B. Enderton, Cass W. Everitt
-
Patent number: 7825933
Abstract: Systems and methods for compiling high-level primitive programs are used to generate primitive program micro-code for execution by a primitive processor. A compiler is configured to produce micro-code for a specific target primitive processor based on the target primitive processor's capabilities. The compiler supports features of the high-level primitive program by providing conversions for different applications programming interface conventions, determining output primitive types, initializing attribute arrays based on primitive input profile modifiers, and determining vertex set lengths from specified primitive input types.
Type: Grant
Filed: February 24, 2006
Date of Patent: November 2, 2010
Assignee: NVIDIA Corporation
Inventors: Mark J. Kilgard, Cass W. Everitt, Christopher T. Dodd, Robert Steven Glanville
-
Patent number: 7746352
Abstract: A virtually-addressed local texture memory stores selected regions (a sparse representation) of a texture for use by a graphics processor. The graphics processor requests a texel of the texture by referencing a virtual address of the texel. A memory interface references an address map to determine whether the requested texel is in one of the regions of the texture that is resident in the local texture memory. If so, the texel is retrieved from the local memory and used in the rendering operation; if not, an alternative texel that is resident in the local memory is retrieved and used in the rendering operation. Non-resident regions that include requested texels are retrieved from a primary texture data store at regular intervals (e.g., once per frame) and stored in local texture memory for use in a subsequent rendering operation.
Type: Grant
Filed: December 14, 2006
Date of Patent: June 29, 2010
Assignee: NVIDIA Corporation
Inventor: Cass W. Everitt
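The residency check and fallback described in the abstract for patent 7746352 can be sketched as a lookup against an address map. Everything named here is hypothetical: `SparseTextureCache`, the fixed `region_size`, the single fallback texel, and the simplification that a region holds one texel value. The abstract does not say how the alternative texel is chosen, so a constant fallback is used purely for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <unordered_map>

using Texel = std::uint32_t;

// Hypothetical model of a sparse, virtually addressed local texture memory.
class SparseTextureCache {
public:
    SparseTextureCache(std::uint64_t region_size, Texel fallback)
        : region_size_(region_size), fallback_(fallback) {
        assert(region_size > 0);
    }

    // Make a region resident, e.g. during a once-per-frame update.
    // Simplification: each region is modeled as a single texel value.
    void load_region(std::uint64_t region, Texel value) { resident_[region] = value; }

    // Fetch by virtual address: use the resident texel if present, otherwise
    // return an alternative resident texel and remember the missing region.
    Texel fetch(std::uint64_t virtual_address) {
        const std::uint64_t region = virtual_address / region_size_;
        const auto it = resident_.find(region);
        if (it != resident_.end()) {
            return it->second;
        }
        missing_.insert(region);
        return fallback_;
    }

    // Hand back the regions that were requested but not resident and clear the
    // list, so a later pass can load them from the primary texture data store.
    std::set<std::uint64_t> take_missing() {
        std::set<std::uint64_t> out;
        out.swap(missing_);
        return out;
    }

private:
    std::uint64_t region_size_;                       // texels per region (assumed fixed)
    Texel fallback_;                                  // stand-in for the alternative texel
    std::unordered_map<std::uint64_t, Texel> resident_;  // address map of resident regions
    std::set<std::uint64_t> missing_;                 // regions to fetch at the next interval
};
```

Deferring the load of missing regions to `take_missing()` mirrors the abstract's point that non-resident regions are retrieved at regular intervals and not at the moment of the miss, so rendering never stalls on a fetch.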
-
Publication number: 20100002000
Abstract: A system and method for dynamically adjusting the pixel sampling rate during primitive shading can improve image quality or increase shading performance. Hybrid antialiasing is performed by selecting a number of shaded samples per pixel fragment. A combination of supersample and multisample antialiasing is used where a cluster of sub-pixel samples (multisamples) is processed for each pass through a fragment shader pipeline. The number of shader passes and multisamples in each cluster can be determined dynamically for each primitive based on rendering state.
Type: Application
Filed: July 3, 2008
Publication date: January 7, 2010
Inventors: Cass W. Everitt, Steven E. Molnar